Regex struct in Javascript [duplicate] - javascript

This question already has answers here:
How to convert unicode in JavaScript?
(4 answers)
Closed 9 years ago.
I am getting data returned in a JSON file with Unicode characters. How to replace Unicode characters as per my example?
\u003cli class=\"channels-content-item\"\u003e\n
\n
\u003cdiv class=\"shmoovie-content-cell\"\u003e\n
\u003ca href=\"\/movie\/the-makeover\" class=\"ux-thumb-wrap contains-addto yt-uix-sessionlink\" data-sessionlink=\"ei=Oo21UdLqM8aDhgHc_IHYCA\"\u003e
After replacing with regex:
\u003c must be replaced by <
\u003e must be replaced by >
\/ must be replaced by /
\" must be replaced by "
How to do that?

Using the bit of the string you posted I put this fiddle together that shows how to just use the string value you have (like in this SO answer I mentioned in the comments).
HTML
<div id="content"></div>
JS
var s = "\u003cli class=\"channels-content-item\"\u003e\n\n\u003cdiv class=\"shmoovie-content-cell\"\u003e\n\u003ca href=\"\/movie\/the-makeover\" class=\"ux-thumb-wrap contains-addto yt-uix-sessionlink\" data-sessionlink=\"ei=Oo21UdLqM8aDhgHc_IHYCA\"\u003e";
var div = document.getElementById('content');
div.innerHTML = s;
console.log(s);
Which sets the HTML content for the div with the elements:
<li class="channels-content-item">
<div class="shmoovie-content-cell">
<a href="/movie/the-makeover" class="ux-thumb-wrap contains-addto yt-uix-sessionlink" data-sessionlink="ei=Oo21UdLqM8aDhgHc_IHYCA">
Although it's not valid HTML, the browser's HTML parser seems to figure it out, at least it does in Chrome.

Not sure that regex is the best idea... The code below solves the problem using the string replacement functions built into JavaScript.
DEMO: http://jsfiddle.net/abc123/5DFUb/1/
var str = "\u003cli class=\"channels-content-item\"\u003e\n \n \u003cdiv class=\"shmoovie-content-cell\"\u003e\n \u003ca href=\"/movie/the-makeover\" class=\"ux-thumb-wrap contains-addto yt-uix-sessionlink\" data-sessionlink=\"ei=Oo21UdLqM8aDhgHc_IHYCA\"\u003e";
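// replace() with a string pattern changes only the first occurrence; replaceAll (ES2021) changes every one.
// Note: in the string literal above the parser has already decoded the escapes, so these calls
// only matter when str holds literal backslash sequences (for example, raw response text).
var newStr = str.replaceAll("\\u003c", "<").replaceAll("\\u003e", ">").replaceAll("\\/", "/").replaceAll("\\\"", "\"");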
alert(newStr);
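Since the data arrives as JSON, another option is to let the JSON parser do the unescaping: \u003c, \/, \" and \n are all standard JSON string escapes. The following is only a minimal sketch, assuming the raw text is the body of a JSON string value; the helper name decodeJsonString and the shortened sample are made up for illustration.

```javascript
// Hypothetical helper: decode the body of a JSON string (without its surrounding quotes).
function decodeJsonString(rawBody) {
  // Wrapping the body in quotes makes it a complete JSON string literal,
  // so JSON.parse resolves \uXXXX, \/, \" and \n in one pass.
  return JSON.parse('"' + rawBody + '"');
}

// String.raw keeps the backslashes as literal characters, the way they
// would appear in the response text before any parsing.
var raw = String.raw`\u003ca href=\"\/movie\/the-makeover\"\u003e`;
console.log(decodeJsonString(raw)); // <a href="/movie/the-makeover">
```

If the whole response is JSON, calling JSON.parse once on the entire response text is simpler still, and no manual replacing is needed.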

Related

js Regex not working as expected. Newline not getting detected [duplicate]

This question already has answers here:
Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
(10 answers)
XML parsing of a variable string in JavaScript
(10 answers)
The best node module for XML parsing [closed]
(2 answers)
Closed 4 years ago.
I have a string as follows:
<abc name = "foo">
<child>bar</child>
</abc>
<xyz>1</xyz>
<abc name = "foo2">
<child>bar2</child>
</abc>
<xyz>5</xyz>
I have created a regex as follows:
var regexapi = /<abc\s*name\s*=\s*"(.*?)"[\s\S]*?<\/abc>\n*<xyz>/gim;
while ( (resApi = regexapi.exec(data))) {
array1.push(resApi[0]);
}
console.log(array1[0]);
Now if I don't have the tag <xyz>1</xyz> printing array1[0] should show undefined but it is printing as follows:
<abc name = "foo">
<child>bar</child>
</abc>
<abc name = "foo2">
<child>bar2</child>
</abc>
<xyz>
I think there is some problem in \n* since I'm giving the multiline flag. Not sure about this though.
Note that this is without <xyz>1</xyz> tag. I want it to print undefined.
Thanks.
Regex:
<\/abc>\n(?:<xyz>(.*)(?=<\/xyz))*
This matches a </abc> followed by an <xyz> tag and its value. If the <xyz> tag is missing, array[0] will return an empty string (not undefined).
You would be better off using an XML parser here. If you insist on using regex, here is one option:
var input = "<abc name = \"foo\">\n\t<child>bar</child>\n</abc>\n<xyz>\n\n<abc name = \"foo2\">\t\n<child>bar2</child>\n</abc>\n<xyz>35</xyz>";
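// The negative lookahead is repeated before every character, so a match cannot run past a closing tag.
var regex = /<abc[^>]*>(?:(?!<\/abc>)[\s\S])*<\/abc>\s*<xyz>((?:(?!<\/?xyz>)[\s\S])*)<\/xyz>/g;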
var match = regex.exec(input);
console.log(match[1]); // 35
This matches an <abc> element followed by optional whitespace and then immediately by an <xyz> element. Should that element be empty, then nothing would be captured in the first capture group match[1].
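If regex is kept, one way to get undefined for a missing <xyz> is to make that tag an optional group, so that each <abc> block is matched on its own and the capture is simply left unset. This is a rough sketch, assuming the flat structure shown in the question; pairAbcWithXyz is a hypothetical helper name.

```javascript
// Hypothetical helper: pair each <abc> block with the value of the <xyz>
// tag that directly follows it (undefined when there is none).
function pairAbcWithXyz(data) {
  // (?:(?!<\/abc>)[\s\S])* consumes one character at a time and stops at the
  // first </abc>, so a match can never run into the next <abc> block.
  var re = /<abc\s+name\s*=\s*"([^"]*)"\s*>(?:(?!<\/abc>)[\s\S])*<\/abc>(?:\s*<xyz>([^<]*)<\/xyz>)?/g;
  var results = [];
  var m;
  while ((m = re.exec(data)) !== null) {
    // m[2] is undefined when the optional <xyz> group did not take part in the match.
    results.push({ name: m[1], xyz: m[2] });
  }
  return results;
}

var data = '<abc name = "foo">\n<child>bar</child>\n</abc>\n' +
           '<abc name = "foo2">\n<child>bar2</child>\n</abc>\n<xyz>5</xyz>';
console.log(pairAbcWithXyz(data));
// [ { name: 'foo', xyz: undefined }, { name: 'foo2', xyz: '5' } ]
```

For anything beyond this flat shape (nested elements, attributes in a different order), an XML parser remains the safer choice.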

Remove all HTML tags *AND* content from json response [duplicate]

This question already has answers here:
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 5 years ago.
I am aware of this solution - str.replace(/<\/?[^>]+>/gi, '') - which removes the HTML tags from json response.
{"fdfd":4}<p>fdfdf</p> -> {"fdfd":4} fdfdf
However, I also want to remove all content within the HTML and script tags, and I am seeking a solution for that.
Requirement -
{"fdfd":4}<p>fdfdf</p> -> {"fdfd":4}
Use string match and keep only those characters which are inside {}
var str = '{"fdfd":4}<p>fdfdf</p>'
var m = str.match(/{(.*)}/g)[0]
console.log(m)
To do that you have to use this regex
var str='{"fdfd":4}<p>fdfdf</p>';
console.log(str.replace(/<[^>]+>[^\n]+<\/?[^>]+>/gi, ''));
Added runnable code:
var str = '{"fdfd":4}<p>fdfdf</p>';
str = str.replace(/<\/?.*[^>*]+>/gi, '');
console.log(str);
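An alternative that avoids matching the markup at all is to cut out the JSON part and check it with JSON.parse. This is only a minimal sketch, assuming the response starts with a single JSON object and that the trailing markup contains no } characters; extractLeadingJson is a hypothetical name.

```javascript
// Hypothetical helper: return the parsed JSON object at the start of the
// response, or null when no valid object can be found.
function extractLeadingJson(response) {
  var start = response.indexOf('{');
  var end = response.lastIndexOf('}');
  if (start === -1 || end < start) {
    return null; // no object-shaped text found
  }
  try {
    // JSON.parse throws on anything that is not valid JSON.
    return JSON.parse(response.slice(start, end + 1));
  } catch (e) {
    return null;
  }
}

var parsed = extractLeadingJson('{"fdfd":4}<p>fdfdf</p>');
console.log(JSON.stringify(parsed)); // {"fdfd":4}
```

Validating with JSON.parse means a malformed or unexpected response yields null instead of a silently mangled string.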

Regex to remove a whitespace from href link [duplicate]

This question already has answers here:
Remove ALL white spaces from text
(14 answers)
Closed 6 years ago.
Can someone help me create a regular expression in JavaScript that removes the whitespace from an href link in my content and replaces it with hyphens?
For example:
<a class="card" href=http://www.eee.com/sffsd/sdfs/Aks's Reb outsider/4234234234324>
it should convert it into
<a class="card" href=http://www.eee.com/sffsd/sdfs/Aks's-Reb-outsider/4234234234324>
A couple of things, like I said in the comment, replacing the spaces with dashes should be as easy as:
link.href = link.href.replace(/ /g, '-');
//or in php:
$href = preg_replace('/ /', '-', $href);
(JS only): using the g flag ensures the entire string is searched for spaces, and replaces them all with dashes.
The better question to ask is: how did the spaces get there in the first place?
My first port of call would be to look at the code generating the markup, and fix the problem there. You shouldn't be writing code fixing the output of code that is broken. Fix the bug, don't accommodate it.
Arguably, the URLs should be properly escaped rather than patched with regex: the URI should've been passed through a function like encodeURI to convert all spaces to %20, etc.
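To illustrate the difference between escaping and replacing, here is a small sketch using the URL from the question. It is only meant as a comparison of the two outputs, not as a recommendation for either.

```javascript
var href = "http://www.eee.com/sffsd/sdfs/Aks's Reb outsider/4234234234324";

// encodeURI percent-encodes characters that are not allowed in a URI
// (such as spaces) and leaves separators like : and / alone.
var encoded = encodeURI(href);
console.log(encoded);
// http://www.eee.com/sffsd/sdfs/Aks's%20Reb%20outsider/4234234234324

// The hyphen variant asked for in the question, for comparison.
var hyphenated = href.replace(/\s+/g, '-');
console.log(hyphenated);
// http://www.eee.com/sffsd/sdfs/Aks's-Reb-outsider/4234234234324
```

The encoded form still points at the original resource, while the hyphenated form is a different URL that the server would have to recognise.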
Use this regex, which matches any whitespace character rather than only spaces:
link.href = link.href.replace(/\s/g, '-');

How to add Unicode to HTML? [duplicate]

This question already has answers here:
How do I correctly insert unicode in an HTML title using JavaScript?
(2 answers)
Closed 7 years ago.
help me pls.
In Unicode, the symbol & is written as the entity "&" + "#38;". If I add it to HTML like this
<div id="div" title="&#38;"></div>
ALL OK! I get the symbol in the title, like this
<div id="div" title="&"></div>
But when I add Unicode to HTML via JavaScript:
var div = document.getElementById('div');
div.setAttribute('title', "&#38;");
I get a bad result:
<div id="div" title="&amp;#38;"></div>
How can I add Unicode to an HTML attribute via JavaScript and get the correct result, like this:
<div id="div" title="&"></div>
Thanks for help!
Also, if I do this:
var code = 26;
div.setAttribute('title', "\x" + code);
I get an error. How can I fix it?
HTML entity escaping (e.g. &#38;) is, as the name implies, only necessary in HTML. It's not necessary in JavaScript; you can set the character literally:
div.setAttribute("title", "&");
If you need to escape a character, you can do so using a hexadecimal character escape:
div.setAttribute("title", "\x26");
or a Unicode character escape:
div.setAttribute("title", "\u0026");
For the & character in JavaScript, you do not have to escape it. Otherwise, you can escape Unicode characters in JavaScript with \u, but most of the time you will not have to.
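The error in the question comes from "\x" + code: an escape sequence is resolved when the source is parsed, so \x must be followed directly by two hex digits and cannot be completed at runtime. To build the character from a variable, a sketch along these lines should work; charFromHex is a hypothetical helper name.

```javascript
// Hypothetical helper: turn a string of hex digits into the matching character.
function charFromHex(hex) {
  // parseInt with radix 16 turns "26" into the number 38 (0x26);
  // String.fromCodePoint turns that code point into the character.
  return String.fromCodePoint(parseInt(hex, 16));
}

var code = 26; // hex digits, as in the question
console.log(charFromHex(String(code))); // &
// In a page: div.setAttribute("title", charFromHex("26"));
```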

Replacing a colon using string replace using Javascript and jQuery [duplicate]

This question already has answers here:
Replace method doesn't work
(4 answers)
Closed 4 years ago.
I have a simple string that I'm trying to manipulate:
Your order will be processed soon:
I grab the string using:
var html = jQuery('.checkout td h4').html();
I then try to replace the ':' using:
html.replace(":", ".");
When I print it out to the console, the string is the same as the original string. I've also tried making sure that the html variable is of type "string" by doing the following:
html = html + "";
That doesn't do anything. In searching around, it seems that the replace function does a RegEx search and that the ":" character might have a special meaning. I do not know how to fix this. Can someone help me get rid of this stinkin' colon?
Slightly related...
I couldn't get these answers to work for replacing all ":" in a string with the URL-encoded character %3a, so I modified this answer by 'xdazz' to make it work: Javascript: Replace colon and comma characters to get...
str = str.replace(/:\s*/g, "%3a");
In your case it would be
str = str.replace(/:\s*/g, ".");
if you wanted to replace all colons with periods in a longer string.
Hope this helps somebody else.
The replace function returns a new string with the replacements made.
JavaScript strings are immutable, so replace cannot modify the original string.
You need to write html = html.replace(":", ".");
Strings are immutable in JavaScript, unlike in languages such as C++ where a string can be modified in place. This means that replace cannot modify the string it operates on and so must return a new string instead.
Try the following instead
var element = jQuery('.checkout td h4');
element.html(element.html().replace(":", "."));
Or, perhaps more correctly (since you may have multiple elements):
jQuery('.checkout td h4').html(function (index, oldHtml) {
    return oldHtml.replace(":", ".");
});
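The two points made in these answers (replace returns a new string, and a string pattern only touches the first match) can be checked without jQuery. A small sketch, using the text from the question as a plain string:

```javascript
var html = "Your order will be processed soon:";

html.replace(":", ".");             // return value discarded, html is unchanged
console.log(html);                  // Your order will be processed soon:

var fixed = html.replace(":", "."); // keep the returned string
console.log(fixed);                 // Your order will be processed soon.

// A string pattern only replaces the first match; a regex with the g flag replaces all.
console.log("a:b:c".replace(":", "."));  // a.b:c
console.log("a:b:c".replace(/:/g, ".")); // a.b.c
```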
