How to delete string from string with regex and jQuery/JS? - javascript

I wonder, how to delete:
<span>blablabla</span>
from:
<p>Text wanted <span>blablabla</span></p>
I'm getting the text from p using:
var text = $('p').text();
And for some reason I'd like to remove from var text the span and its content.
How can I do it?

It's impossible to remove the <span> from the variable text, because it doesn't exist there — text is just text, without any trace of elements.
You have to remove the span earlier, while there is still some structure:
$('p').find('span').remove();
Or serialize the element using HTML (.html()) rather than plain text.
Don't edit HTML using regular expressions — HTML is not a regular language and regular expressions will fail in non-trivial cases.
var html = $('p').html();
var tmp = $('<p>').html(html);
tmp.find('span').remove();
var text = tmp.text();
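To illustrate why the regex approach breaks down on anything but flat markup, here is a minimal sketch in plain JavaScript (no DOM or jQuery needed). The stripSpans helper and the nested and attribute samples are made up for illustration; only the first sample comes from the question.

```javascript
// Hypothetical helper: the naive regex approach to removing <span> elements.
function stripSpans(html) {
  return html.replace(/<span>.*?<\/span>/g, '');
}

// Flat markup: works as hoped.
const flat = stripSpans('<p>Text wanted <span>blablabla</span></p>');

// Nested spans: the lazy match stops at the first </span>,
// so the tail of the outer span (and a stray </span>) is left behind.
const nested = stripSpans('<p>Text <span>a <span>b</span> c</span></p>');

// A span carrying an attribute is not matched at all.
const withAttr = stripSpans('<p>Text <span class="x">blablabla</span></p>');

console.log(flat);     // <p>Text wanted </p>
console.log(nested);   // still contains " c</span>"
console.log(withAttr); // unchanged
```

The pattern can be patched for each of these cases, but every patch opens a new edge case, which is why removing the element through the DOM is the more robust route.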

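If text holds the element's HTML (for example from .html()) rather than the result of .text(), a regular expression can strip the span:
text = text.replace(/<span>.*?<\/span>/g, '');

To also remove the unwanted whitespace before the <span>, use
text = text.replace(/\s*<span>.*?<\/span>/g, '');
which, applied to the HTML of the whole paragraph, leaves you with
<p>Text wanted</p>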

Related

How to remove content within the &lt; and &gt; javascript

I have content that contains a string of elements along with images, e.g.:
var str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
The text within the &lt; and &gt; is a dirty tag, and I would like to remove it along with the content inside it. The tag is generated dynamically and hence could be any tag, i.e. <div>, <a>, <h1>, etc.
The expected output: <p></p><p><br></p><p><br></p>
However, with this code I'm only able to remove the tags and not the content inside them:
str.replaceAll(/&lt;.*?&gt;/g, "");
It renders like this, which is not what I'm looking for:
<p>fwefwefw</p><p><br></p><p><br></p><p><br></p>
How can I remove the &lt; &gt; tags along with their content, so that I get rid of the dirty tags and the text inside them?
fiddle: https://jsfiddle.net/3rozjn8m/
thanks
A safe way is to use a DOM parser and visit each text node, so that each text can be cleaned separately. This way you are certain the DOM structure is not altered, only the texts:
let str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
let doc = new DOMParser().parseFromString(str, "text/html");
// Walk over text nodes only
let walk = doc.createTreeWalker(doc.body, NodeFilter.SHOW_TEXT);
let node = walk.nextNode();
while (node) {
    // In a text node the entities are already decoded, so match literal < and >
    node.nodeValue = node.nodeValue.replace(/<.*>/gs, "");
    node = walk.nextNode();
}
let clean = doc.body.innerHTML;
console.log(clean);
This also works when more than one <p> element has such content.
Remove the question mark, so that the match becomes greedy.
var str = "<p>&lt;img src=\"v\"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>";
console.log(str.replaceAll(/&lt;.*&gt;/g, ""));
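The difference the question mark makes can be checked in isolation. A small sketch, assuming the dirty tag is entity-encoded in the string as in the question; the variable names lazy and greedy are illustrative only.

```javascript
const str = '<p>&lt;img src="v"&gt;fwefwefw&lt;/img&gt;</p><p><br></p><p><br></p>';

// Lazy (.*?): each match ends at the nearest &gt;, so only the two
// encoded tags are removed and the text between them survives.
const lazy = str.replace(/&lt;.*?&gt;/g, '');

// Greedy (.*): the match runs from the first &lt; to the last &gt;,
// taking the text in between with it.
const greedy = str.replace(/&lt;.*&gt;/g, '');

console.log(lazy);   // <p>fwefwefw</p><p><br></p><p><br></p>
console.log(greedy); // <p></p><p><br></p><p><br></p>
```

Note that the greedy form only gives the wanted result when the string holds a single dirty tag. With several, everything between the first &lt; and the last &gt; is removed, including any real markup in between.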

Finding a div in a text variable with jQuery

I have a text variable in Javascript. Its name is text. It contains a whole HTML document. I've tried to find a jQuery selector that matches a contained div with id "mainContent":
var innerText = text.find('div[id=mainContent]');
Unfortunately, this does not work. The JavaScript somehow breaks at this point.
I've also tried it with:
var innerText = $(text).find('div[id=mainContent]');
But this also breaks the JavaScript flow.
Does anybody have an idea?
If text is a string, you should parse it first. You can do so using jQuery.parseHTML().
Demo:
var text = `<div><div id="mainContent">Test Container</div></div>`;
text = $.parseHTML(text);
console.log($(text).find('div#mainContent'));
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>

get all words without html tags with javascript regex

I am trying to use a regular expression to get all words without HTML tags.
The goal is to wrap every word in a span tag, so that I can get the word when my mouse is over it, while keeping the initial HTML tags.
for example this code
<p>hello i'm <b>jesus</b></p>
should become
<p><span>hello</span> <span>i'm</span><b><span>jesus</span></b></p>
So the first step for me is to get all words, without HTML tags, and then wrap them in spans.
This is my regex in javascript
([^\r\n\t\f>< /]+(?!>))
But i have some problems with some tags like
Live example here
Finally, once my regex is OK, I will be able to replace all words with
$(this).html($(this).html().replace(reg, "$1"));
Thanks for your help.
Maybe there is another way to do this ...
Use .split() to split the textContent of the element, Array#forEach to iterate over the resulting array, and appendChild to append each new element.
var ELEMENT = document.getElementsByTagName('p')[0];
var text = ELEMENT.textContent;
ELEMENT.innerHTML = '';
text.split(' ').forEach(function(elem) {
    var span = document.createElement('span');
    // elem is plain text, so assign it as text rather than as HTML
    span.textContent = elem;
    ELEMENT.appendChild(span);
});
span {
margin-left: 10px;
}
<p>hello i'm <b>jesus</b>
</p>
Fiddle Demo
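The snippet above works from textContent, so the <b> element is lost. If the original tags must survive, as the question asks, one possible sketch is to let a single regex match either a tag or a word and wrap only the words. wrapWords is a hypothetical helper name, and the pattern assumes that no attribute value contains < or >.

```javascript
// Hypothetical helper: wrap every word outside of tags in a <span>.
function wrapWords(html) {
  // First alternative matches a whole tag, second a run of non-space text.
  return html.replace(/<[^>]+>|[^\s<>]+/g, function (match) {
    // Tags are returned untouched; words are wrapped.
    return match.charAt(0) === '<' ? match : '<span>' + match + '</span>';
  });
}

const wrapped = wrapWords("<p>hello i'm <b>jesus</b></p>");
console.log(wrapped);
// <p><span>hello</span> <span>i'm</span> <b><span>jesus</span></b></p>
```

Whitespace between words is never matched, so it is copied through unchanged. For markup that is less predictable than this example, walking the text nodes of the DOM is the safer choice.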

Select single word between any HTML tag?

I am looking to build a regular expression that will select a single word out of all text between HTML tags. I am looking for the occurrence of the word anywhere but inside HTML tags. The issue is that the word I am looking to match may occur in the class or id of a tag - I would only like to match it when it is between the tags.
Here is further clarification from my comment:
I am looking for a regex to use in a loop that will find a string in another string that contains HTML. The large string will contain something like this:
<div class="a-class"><span class="some-class" data-content="some words containing target">some other text containing target</span>
I want the regex to match the word "target" only between the tags, not within the tag in the data-content attribute. I can use:
/(\btarget)\b/ig
to find every instance of target.
http://jsfiddle.net/techsin/xt1j2cj8/3/
Here is one way to do it.
var cont = $(".cont"),
    html = cont.html(),
    word = "Lorem";
// Allow tags to sit around the first whitespace run of a multi-word search term
word = word.replace(/(\s+)/, "(<[^>]+>)*$1(<[^>]+>)*");
var pattern = new RegExp("(" + word + ")", "gi");
html = html.replace(pattern, "<mark>$1</mark>");
// If a marked match spans tags, close <mark> before them and reopen it after
html = html.replace(/(<mark>[^<>]*)((<[^>]+>)+)([^<>]*<\/mark>)/, "$1</mark>$2<mark>$4");
$(".cont").html(html);
If the word can be present anywhere, i.e. even as a class name or id, then here is what you can do.
Take <html> as the parent element and access all of its contents using innerHTML. You can then find any word as follows:
<html id="main">
<div>
<p class="yourword">
</p>
</div>
</html>
var str = document.getElementById("main").innerHTML;
var res = str.match(/yourword/gi);
alert(res);
The above regex matches the word "yourword" anywhere in the entire document, including inside tags.
Here is a demo which selects the string "sub".

Replacing HTML String & Avoiding Tags (regex)

I'm trying to use JS to replace a specific string within a string that contains html tags+attributes and styles while avoiding the inner side of the tags to be read or matched (and keep the original tags in the text).
For example, I want <span> this is span text </span> to become: <span> this is s<span class="found">pan</span> text </span> when the keyword is "pan".
I tried using regex with that ..
My regex so far:
$(this).html($(this).html().replace(new RegExp("([^<\"][a-zA-Z0-9\"'\=;:]*)(" + search + ")([a-zA-Z0-9\"'\=;:]*[^>\"])", 'ig'), "$1<span class='found'>$2</span>$3"));
This regex only fails in cases like <span class="myclass"> span text </span> when the search="p", the result:
<s<span class="found">p</span>an class="myclass"> s<span class="found">p</span>an text</s<span class="found">p</span>an>
*this topic should help anyone who seeks to find a match and replace the matched string while avoiding strings surrounded by specific characters to be replaced.
As thg435 says, the right way to deal with HTML content is to use the DOM.
But if you want to avoid something in a replace, you can match what you want to avoid first and replace it with itself.
Example to avoid HTML tags:
var text = '<span class="myclass"> span text </span>';
// match: the whole match; word: capture group 1, set only when (p) matched
function callback(match, word) {
    return (word == undefined || word == '') ? match : '<span class="found">' + match + '</span>';
}
var result = text.replace(/<[^>]+>|(p)/g, callback);
alert(result);
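The same idea can be wrapped in a reusable function that takes the keyword as a parameter. This is a sketch, not the answerer's code: highlight and escapeRegExp are hypothetical names, the keyword is escaped so that characters such as . or ( are matched literally, and tags are assumed not to contain > inside attribute values.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every occurrence of `keyword` found outside
// of tags in <span class="found">, leaving the tags themselves untouched.
function highlight(html, keyword) {
  if (!keyword) return html; // nothing to search for
  var pattern = new RegExp('<[^>]+>|(' + escapeRegExp(keyword) + ')', 'gi');
  return html.replace(pattern, function (match, word) {
    // `word` is only defined when the keyword alternative matched.
    return word === undefined ? match : '<span class="found">' + word + '</span>';
  });
}

var out = highlight('<span class="myclass"> span text </span>', 'p');
console.log(out);
// <span class="myclass"> s<span class="found">p</span>an text </span>
```

Because the tag alternative comes first in the pattern, a keyword that also occurs in a tag name or attribute is consumed as part of the tag and is never wrapped.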
