I'm trying to get some specific title text to display via JavaScript, but I'm having some issues getting the entire string to show up.
The text I'm trying to display:
mechanical : Failed to copy
And here's what shows up in HTML:
`<td title="mechanical" :="" failed="" to="" copy="">mechanical : Failed to copy</td>`
The actual title displayed afterwards is just mechanical.
In Javascript:
var copyResult = json_obj[i].CopyResult; //variable that contains the text
copyResult = copyResult.replace(/["{}]/g, " "); //regex that removes some characters and replaces them with spaces
The copyResult variable is then added to the element I want.
It looks like having spaces "ends" the title attribute, so the browser tries to make more attributes with the remaining text.
What's the best way to fix this?
I was able to create a workaround. Since any space would end the title attribute, I simply used a regex to replace every space character in the copyResult variable with a non-breaking space.
copyResult = copyResult.replace(/[ ]/g, "\u00a0");
\u00a0 is the Unicode character NO-BREAK SPACE.
It's not the spaces ending the attribute, it's the quotation marks... try escaping them with backslashes, like \"
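Here is a minimal sketch of the quoting approach, assuming the cell is built by string concatenation (the question does not show that part). The helper names escapeAttribute and buildCell are made up for illustration. The idea is that a quoted attribute value can contain spaces, as long as any quote characters inside it are escaped as HTML entities.

```javascript
// Hypothetical helper: escape characters that are special inside a
// double-quoted HTML attribute value.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build the cell with the title wrapped in quotes,
// so spaces no longer end the attribute.
function buildCell(text) {
  const safe = escapeAttribute(text);
  return '<td title="' + safe + '">' + safe + "</td>";
}

console.log(buildCell("mechanical : Failed to copy"));
```

If the cell is created through the DOM instead, setting the title through setAttribute or the title property avoids the quoting problem entirely, because no HTML string is parsed.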
Related
I am trying to parse a page using javascript this is part of page:
<div class="title">
<h1>
Affect and Engagement in Game-BasedLearning Environments
</h1>
</div>
This is the link to the page source: view-source:http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6645369?tp=&arnumber=6645369
I am using this:
$(data).find('h1').each(function() {
    console.log($(this).text());
});
Now I am able to get the value inside the header, but the value displayed has lots of space in front and at the back. I tried to remove the whitespace using the replace function, but the replacement isn't happening. I don't understand what is in front of and behind the header value. I want to remove the extra space.
With a string pattern, replace only replaces the first instance found, so it might have removed only one space. Try this instead, using regular expression syntax:
text.replace(/ /g, '');
This removes all spaces, even the ones inside your text. To avoid that, you may want to replace only double spaces instead:
text.replace(/ {2}/g, '');
Also you may want to remove new lines:
text.replace(/\n/g, '');
If you know for sure that your string is only surrounded on either end by spaces, but you want to preserve everything inside, you can use trim:
text.trim();
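To make the difference concrete, here is a small sketch comparing the three approaches on a made-up sample string (the value of sample is an assumption, not the text from the page in question).

```javascript
// Hypothetical sample: two spaces, "a", two spaces, "b", two spaces.
const sample = "  a  b  ";

// A string pattern replaces only the first match.
const firstOnly = sample.replace(" ", "");

// A regex with the g flag replaces every match, including inner spaces.
const allSpaces = sample.replace(/ /g, "");

// trim() removes only leading and trailing whitespace.
const trimmed = sample.trim();

console.log(JSON.stringify([firstOnly, allSpaces, trimmed]));
```

JSON.stringify is used only so the remaining spaces are visible in the console output.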
Since you're already using jQuery, you can take advantage of its $.trim function, which removes leading and trailing whitespace.
$(data).find('h1').each(function() {
    console.log($.trim($(this).text()));
});
Reference: $.trim()
Try using JavaScript's trim() to get rid of the whitespace that's present on both sides.
Function's Reference:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/Trim
Your HTML actually contains these spaces within your <h1> element, so it is to be expected that they are present in the result of .text().
Normally you'd just use .trim(). However, you'll likely want to replace line breaks inside the text as well.
$(data).find('h1').each(function() {
    var text = $(this).text();
    // Replaces any sequence of whitespace characters with a single space,
    // even inside the text.
    // E.g. " \r\n \t" -> " "
    text = text.replace(/\s+/g, " ");
    // Trim leading/trailing whitespace.
    text = text.trim();
    console.log(text);
});
I need to remove tab characters within text inputted into particular fields of a web interface. The problem seems to be that when this happens the resulting text now contains spaces where the tabs were.
I tried using the regex: vVal = vVal.replace(/(\s+)/, ""); but with the example input 11111[tab], the value becomes 11111[space].
I don't know how this could be.
\s matches any whitespace character, including spaces, tabs, and line breaks.
To remove only tabs, use \t with the global flag:
vVal = vVal.replace(/\t+/g, "");
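A short sketch of the difference, using a made-up helper name (stripTabs) and made-up inputs. It assumes the goal is to drop tabs while leaving ordinary spaces alone.

```javascript
// Hypothetical helper: remove every tab character, keep other whitespace.
function stripTabs(value) {
  return value.replace(/\t+/g, "");
}

const input = "11\t111\t";

// Without the g flag only the first run of whitespace is removed,
// so the trailing tab is still there.
const withoutGlobal = input.replace(/\s+/, "");

console.log(JSON.stringify(withoutGlobal));
console.log(JSON.stringify(stripTabs(input)));
console.log(JSON.stringify(stripTabs("a b\tc")));
```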
We have text like:
this is a test :rep more text more more :rep2 another text text qweqweqwe.
or
this is a test :rep:rep2 more text more more :rep2:rep another text text qweqweqwe. (without space)
We should replace :rep with TEXT1 and :rep2 with TEXT2.
Problem:
When we try to replace using something like:
rgobj = new RegExp(":rep","gi");
txt = txt.replace(rgobj,"TEXT1");
rgobj = new RegExp(":rep2","gi");
txt = txt.replace(rgobj,"TEXT2");
we get TEXT1 in both places, because :rep2 starts with :rep and :rep is processed first.
If you require that :rep always end with a word boundary, make it explicit in the regex:
new RegExp(":rep\\b","gi");
(If you don't require a word boundary, you can't distinguish what is meant by "hello I got :rep24 eggs" -- is that :rep, :rep2, or :rep24?)
EDIT:
Based on the new information that the match strings are provided by the user, the best solution is to sort the match strings by length and perform the replacements in that order. That way the longest strings get replaced first, eliminating the risk that the beginning of a long string will be partially replaced by a shorter substring match included in that long string. Thus, :replongeststr is replaced before :replong which is replaced before :rep .
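One way to sketch the longest-first idea is below. The function names replaceTokens and escapeRegExp are made up for illustration, and the sketch assumes the tokens and their replacements arrive as a plain object. Instead of running one replace per token, it sorts the tokens by length, joins them into a single alternation, and replaces in one pass, so replacement text is never scanned again.

```javascript
// Hypothetical helper: escape regex metacharacters in user-supplied tokens.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replace each token with its value, longest token first.
function replaceTokens(text, replacements) {
  const lookup = new Map();
  for (const [token, value] of Object.entries(replacements)) {
    lookup.set(token.toLowerCase(), value);
  }
  // Longest tokens first, so :rep2 is tried before :rep.
  const keys = [...lookup.keys()].sort((a, b) => b.length - a.length);
  if (keys.length === 0) return text;
  const pattern = new RegExp(keys.map(escapeRegExp).join("|"), "gi");
  return text.replace(pattern, (match) => lookup.get(match.toLowerCase()));
}

console.log(
  replaceTokens("this is a test :rep:rep2 more :rep2:rep end", {
    ":rep": "TEXT1",
    ":rep2": "TEXT2",
  })
);
```

The lookup is lowercased because the original code uses the i flag. Drop that step if matching should be case-sensitive.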
If your data is always consistent, replace :rep2 before :rep.
Otherwise, you could search for :rep\s, searching for the space after the keyword. Just make sure you replace the space as well.
I have a string that contains HTML image elements that is stored in a var.
I want to remove the image elements from the string.
I have tried: var content = content.replace(/<img.+>/,"");
and: var content = content.find("img").remove(); but had no luck.
Can anyone help me out at all?
Thanks
var content = content.replace(/<img[^>]*>/g,"");
[^>]* means any number of characters other than >. If you use .+ instead and there are multiple tags on the same line, the replace removes everything from the first <img to the last > in one go, including any content between the tags. Quantifiers are greedy by default, meaning they take the largest possible valid match.
/g at the end means replace all occurrences (by default, it only removes the first occurrence).
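A quick sketch of the greedy behaviour on a made-up string (the html value is an assumption for illustration):

```javascript
// Hypothetical input with two image tags and text between them.
const html = '<p>a<img src="x">b<img src="y">c</p>';

// Greedy .+ runs from the first <img to the last > in the string,
// so the text between the tags and the closing </p> are lost too.
const greedy = html.replace(/<img.+>/, "");

// [^>]* stops at the first >, and the g flag handles every tag.
const perTag = html.replace(/<img[^>]*>/g, "");

console.log(greedy);
console.log(perTag);
```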
$('<p>').html(content).find('img').remove().end().html()
The following Regex should do the trick:
var content = content.replace(/<img[^>"']*((("[^"]*")|('[^']*'))[^"'>]*)*>/g,"");
It first matches the <img. Then [^>"']* matches any character except for >, " and ' any number of times. Then (("[^"]*")|('[^']*')) matches two " with any character in between (except " itself, which is this part [^"]*) or the same thing, but with two ' characters.
An example of this would be "asf<>!('" or 'akl>"<?'.
This is again followed by any character except for >, " and ' any number of times. The Regex concludes when it finds a > outside a set of single or double quotes.
This would then account for having > characters inside attribute strings, as pointed out by #Derek 朕會功夫 and would therefore match and remove all four image tags in the following test scenario:
<img src="blah.png" title=">:(" alt=">:)" /> Some text between <img src="blah.png" title="<img" /> More text between <img /><img src='asdf>' title="sf>">
This is of course inspired by #Matt Coughlin's answer.
Use the text() function, it will remove all HTML tags!
var content = $("<p>"+content+"</p>").text();
I'm in IE right now...this worked great, but my tags come out in upper case (after using innerHTML, i think) ... so I added "i" to make it case insensitive. Now Chrome and IE are happy.
var content = content.replace(/<img[^>]*>/gi,"");
Does this work for you?:
var content = content.replace(/<img[^>]*>/g, '')
You could load the text as a DOM element, then use jQuery to find all images and remove them. I generally try to treat XML (html in this case) as XML and not try to parse through the strings.
var element = $('<p>My paragraph has images like this <img src="foo"/> and this <img src="bar"/></p>');
element.find('img').remove();
newText = element.html();
console.log(newText);
To do this without regex or libraries (read jQuery), you could use DOMParser to parse your string, then use plain JS to do any manipulations and re-serialize to get back your string.
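A minimal sketch of that approach, assuming it runs in a browser where DOMParser is available (it is not part of plain Node.js). The function name removeImages is made up for illustration.

```javascript
// Hypothetical helper: parse the string as HTML, drop every <img>,
// and serialize the body back to a string. Browser-only.
function removeImages(htmlString) {
  const doc = new DOMParser().parseFromString(htmlString, "text/html");
  doc.querySelectorAll("img").forEach((img) => img.remove());
  return doc.body.innerHTML;
}
```

Usage would look like content = removeImages(content);. Note that the parser normalizes the markup, so the returned string may differ from the input in small ways beyond the removed images.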
I have a scenario like this:
In HTML tags, if an attribute value is not surrounded by either single or double quotes, I want to put double quotes around it.
How do I write a regex for that?
If you repeat this regex as many times as there might be attributes in an element, that should work so long as the text is fairly normal and does not contain lots of special characters that might give false positives.
"<a href=www.google.com title = link >".replace(/(<[^>]+?=)([^"'\s][^\s>]+)/g,"$1'$2'")
Regex says: open tag (<) followed by one or more not close tags ([^>]+) ungreedily (?) followed by equals (=) all captured as the first group ((...)) and followed by second group ((...)) capturing not single or double quote or space ([^"'\s]) followed by not space or close tag ([^\s>]) one or more times (+) and then replace that with first captured group ($1) followed by second captured group in single quotes ('$2')
For example with looping:
var html = "<a href=www.google.com another=something title = link >";
var newhtml = null;
while (html != newhtml) {
    if (newhtml)
        html = newhtml;
    newhtml = html.replace(/(<[^>]+?=)([^"'\s][^\s>]+)/, "$1'$2'");
}
alert(html);
But this is a bad way to go about your problem. It is better to use an HTML parser to parse, then re-format the HTML as you want it. That would ensure well-formatted HTML, whereas regular expressions can only ensure well-formatted HTML if the input is exactly as expected.
Very helpful! I made a slight change to allow it to match attributes with a single character value:
/(<[^>]+?=)([^"'\s>][^\s>]*)/g (changed one or more + to zero or more * and added > to the first match in second group).
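For reference, here is a sketch that wraps the adjusted regex in a function and emits double quotes, as the question asked. The name quoteAttributes and the iteration cap are assumptions added for illustration. It carries the same caveats as above: it can mangle input where a quoted value contains =, or where a value runs straight into />, and it leaves attributes written with spaces around the = untouched.

```javascript
// Hypothetical helper: quote one unquoted attribute value per pass and
// repeat until the string stops changing.
function quoteAttributes(html) {
  const unquoted = /(<[^>]+?=)([^"'\s>][^\s>]*)/;
  let previous = null;
  let current = html;
  // The cap is only a safety net against unexpected input.
  for (let i = 0; i < 1000 && current !== previous; i++) {
    previous = current;
    current = previous.replace(unquoted, '$1"$2"');
  }
  return current;
}

console.log(quoteAttributes("<a href=www.google.com another=something>"));
console.log(quoteAttributes("<td colspan=2>"));
```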