Replace anything in certain position in string using javascript regular expression - javascript

I am trying to remove all empty paragraph tags from a string, regardless of what style attributes the p tag might have. For example, I want to replace all of these with an empty string:
<p style="margin-left:0px"></p>
<p></p>
<p style="margin-left:1cm; margin-right:1cm"></p>
So far, to deal with one situation I have, I am doing this:
str = str.replace(/<p style=\"margin:0cm 0cm 10pt\"><\/p>/g,'')
which is working in that particular situation. How can I write it so it removes
<p AnythingHereInThisTag></p>
and replaces it with an empty string?
Edit - further to answer below - if I do this:
str = str.replace(/<p(.*)><\/p>/g,'')
it is replacing the whole string which might look like
<p>Hello</p><p>Some text in the middle</p><p>Goodbye</p>
It needs to look at each pair of tags

Any character other than a double quote is matched by [^\"], so each attribute value can be matched as a quoted run of such characters:
var reg=/\<p( [a-zA-Z]*=((\"[^\"]*\")|(\'[^\']*\')))*\>\<\/p\>/g;
console.log('<p style=\'margin:0cm 0cm 10pt\'></p>'.replace(reg,''));
console.log('<p style=\"margin:0cm 0cm 10pt\"></p>'.replace(reg,''));
console.log('<p style=\"margin:0cm 0cm 10pt\" class=\"test\"></p>'.replace(reg,''));
console.log('<p></p>'.replace(reg,''));

Something like this?
str = 'asd<p style="margin-left:0px"></p><p></p><p style="margin-left:1cm; margin-right:1cm"></p>'
str.replace(/<p(.*)><\/p>/g,'') // "asd"
Reading the question again, it is unclear if you wanted to remove only the attributes within the tag, or the tag completely. Please clarify.
You can read more about regular expression here.
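The greedy (.*) in the pattern above runs from the first <p to the last ></p> in the string, which is why it swallows non-empty paragraphs too. A minimal sketch of a per-tag alternative, assuming no attribute value contains a literal > character (the sample string is made up for illustration):

```javascript
// Sample input: two empty paragraphs between two non-empty ones.
var sample = '<p>Hello</p><p style="margin-left:0px"></p><p></p><p>Goodbye</p>';

// Greedy: (.*) backtracks only as far as the LAST "></p>" in the string.
var greedy = sample.replace(/<p(.*)><\/p>/g, '');

// Per tag: [^>]* cannot cross the end of the opening tag, and \b keeps
// the pattern from matching other tag names that start with "p".
var perTag = sample.replace(/<p\b[^>]*><\/p>/g, '');

console.log(greedy); // <p>Goodbye</p>
console.log(perTag); // <p>Hello</p><p>Goodbye</p>
```

If paragraphs containing only whitespace should count as empty as well, \s* can be added between the opening and closing tag.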

Related

How do I take a block of text (with `\n`) from an object property and parse it as paragraphs?

The property is like this.
key: "paragraph.\n More text.\n Another sentence."
How would I show it like...
paragraph.
More text.
Another sentence.
without iterating or split()ting the text?
Number of paragraphs will be unknown at time of read. I have access to the object to rewrite the text in some other format, but it needs to stay as a single property.
I've tried
<p>{item["instruction"]}</p>
<p>{item.instruction}</p>
which both return solid blocks.
You can use for example css 'white-space':
<p style="white-space: pre-line;">{item.instruction}</p>
Or, depending on which template library you use, replace each \n with a <br /> tag (but note that most template libraries escape HTML when rendering a value).
You can replace every \n in your string with a <br /> element.
To replace all occurrences rather than just the first, pass replace() a RegExp with the g flag.
var key = "paragraph.\n More text.\n Another sentence."
// put result of this command in your html element
key.replace(/\n/g,'<br />')
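If the result is going to be inserted as raw HTML, it is probably safer to escape the text first so that only the <br /> tags are treated as markup. A rough sketch; escapeHtml and toHtmlLines are hypothetical helper names, not part of any library:

```javascript
// Hypothetical helper: escape the characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: escape first, then turn each newline into <br />.
function toHtmlLines(text) {
  return escapeHtml(text).replace(/\n/g, '<br />');
}

var key = "paragraph.\n More text.\n Another sentence.";
console.log(toHtmlLines(key));
// paragraph.<br /> More text.<br /> Another sentence.
```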

Contenteditable regex whitespace not working

I am trying to check whether the contenteditable value contains only whitespace. In my example, a value that is only whitespace should not match my regex, but it is not working as intended: it keeps matching when I enter nothing but spaces.
edit: the black space is where you can enter text.
https://jsfiddle.net/j1kcer26/5/
JS
var checkTitle = function() {
var titleinput = document.getElementById("artwork-title").innerHTML;
var titleRegexp = new RegExp("^(?!\s*$).+"); //no blank spaces allowed
if (!titleRegexp.test(titleinput)) {
$('.start').removeClass('active-upload-btn');
console.log('no match')
} else if (titleRegexp.test(titleinput)) {
$('.start').addClass('active-upload-btn');
console.log('match')
}
};
$('#artwork-title').on('keyup change input', function() {
checkTitle();
});
HTML
<div class="post-title-header">
<span class="user-title-input title-contenteditable maxlength-contenteditable" placeholder="enter text here" contenteditable="true" name="artwork-title" id="artwork-title" autocomplete="off" type="text" spellcheck="false">
</span>
</div>
<div class="start">
turn red if match
</div>
If you look at the actual inner HTML, you'll see things like <br> elements or entities. Your regex doesn't look equipped to handle these.
You may want to consider using textContent instead of innerHTML if you just care about the text, not the HTML. Or alternatively, if you really want plain text, use a <textarea/> instead of a content-editable div, which is for rich-text-style editing that produces HTML.
Edit:
Your regex is not quite right either. Because you're using the RegExp constructor with new RegExp("^(?!\s*$).+"), the \s in your string literal turns into a plain s; you have to write \\s if you want the regex to contain an actual \s. IMO, it's always better to use a regexp literal, like /^(?!\s*$).+/, unless you're building the pattern dynamically. I also find this a simpler way to tell whether a string is entirely whitespace: /^\s+$/ (or /^\s*$/ if an empty string should count as blank too).
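To make the escaping point concrete, here is a small sketch comparing the three ways of writing the pattern. The isBlank helper is a hypothetical name, and it assumes you pass it plain text (for example textContent) rather than innerHTML:

```javascript
// In a string literal "\s" is an unknown escape, so it collapses to "s".
var fromString = new RegExp("^(?!\s*$).+");
// Doubling the backslash keeps the \s in the resulting pattern.
var fromEscapedString = new RegExp("^(?!\\s*$).+");
// A regex literal needs no extra escaping.
var fromLiteral = /^(?!\s*$).+/;

console.log(fromString.source);             // ^(?!s*$).+
console.log(fromString.test("   "));        // true  (blank input wrongly accepted)
console.log(fromEscapedString.test("   ")); // false
console.log(fromLiteral.test("   "));       // false

// Hypothetical helper: true for empty or whitespace-only text.
function isBlank(text) {
  return /^\s*$/.test(text);
}
```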

regex to select all URL by certain pattern

I'm trying to use a regex to select everything between the two double quotes ("...") in all tags, but only where the value follows a certain pattern, for example where it starts with /desktop/content.
I'm sure this is fairly simple but couldn't make it on my own, can someone help?
Example:
<img src="/desktop/content/img/illustrations/small-flower2.svg" width="138"/>
selected part should be: /desktop/content/img/illustrations/small-flower2.svg
you mean a regex like /"someQuotedString([^"]*)"/gm ?
var str = '<img src="/desktop/content/img/illustrations/small-flower2.svg" width="138"/>';
console.dir(str.match(/"\/desktop\/content([^"]*)"/gm));
console.log(str.match(/"\/desktop\/content([^"]*)"/gm)[0]);
https://regex101.com/r/ahjdCZ/1
...if you really want to make sure it's inside an <img ...> tag you could also use:
/<img\b[^>]*?"(\/desktop\/content[^"]*)"/
or within any < > tag:
/<.*"(\/desktop\/content[^"]+)".*>/
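Note that match() with the g flag returns the whole matches, quotes included, and ignores capture groups. If you want only the path itself, one option is to loop with exec() and read group 1. A sketch, with a made-up multi-tag input for illustration:

```javascript
// Sample input (invented): two matching URLs and one that should be skipped.
var html = '<img src="/desktop/content/img/illustrations/small-flower2.svg" width="138"/>' +
  '<a href="/mobile/other.html">x</a>' +
  '<img src="/desktop/content/img/logo.png"/>';

// Group 1 captures the quoted value without the surrounding quotes.
var pattern = /"(\/desktop\/content[^"]*)"/g;

var paths = [];
var m;
while ((m = pattern.exec(html)) !== null) {
  paths.push(m[1]);
}

console.log(paths);
// [ '/desktop/content/img/illustrations/small-flower2.svg',
//   '/desktop/content/img/logo.png' ]
```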

Replacing HTML String & Avoiding Tags (regex)

I'm trying to use JS to replace a specific string within a string that contains html tags+attributes and styles while avoiding the inner side of the tags to be read or matched (and keep the original tags in the text).
For example, I want <span> this is span text </span> to become: <span> this is s<span class="found">pan</span> text </span> when the keyword is "pan".
I tried using regex with that ..
My regex so far:
$(this).html($(this).html().replace(new RegExp("([^<\"][a-zA-Z0-9\"'\=;:]*)(" + search + ")([a-zA-Z0-9\"'\=;:]*[^>\"])", 'ig'), "$1<span class='found'>$2</span>$3"));
This regex only fails in cases like <span class="myclass"> span text </span> when the search="p", the result:
<s<span class="found">p</span>an class="myclass"> s<span class="found">p</span>an text</s<span class="found">p</span>an>
*this topic should help anyone who seeks to find a match and replace the matched string while avoiding strings surrounded by specific characters to be replaced.
As thg435 says, the proper way to deal with HTML content is to use the DOM.
But if you want to exclude something from a replace, you can match what you want to avoid first and replace it with itself.
Example to avoid html tags:
var text = '<span class="myclass"> span text </span>';
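function callback(match, keyword) {
  // keyword is undefined when a tag matched: return the tag unchanged
  return (keyword === undefined || keyword === '') ? match : '<span class="found">' + keyword + '</span>';
}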
var result = text.replace(/<[^>]+>|(p)/g, callback);
alert(result);
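The same idea can be wrapped up for an arbitrary search term. This is only a sketch: highlight and escapeRegExp are hypothetical names, and it assumes tags never contain a literal > inside an attribute value.

```javascript
// Hypothetical helper: escape regex metacharacters in user input.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap each occurrence of `search` that lies
// outside of tags in <span class="found">.
function highlight(html, search) {
  if (!search) return html; // nothing to look for
  var re = new RegExp('<[^>]+>|(' + escapeRegExp(search) + ')', 'gi');
  return html.replace(re, function (match, keyword) {
    // Tags match the first alternative, so keyword stays undefined.
    return keyword === undefined
      ? match
      : '<span class="found">' + keyword + '</span>';
  });
}

console.log(highlight('<span class="myclass"> span text </span>', 'p'));
// <span class="myclass"> s<span class="found">p</span>an text </span>
```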

Javascript regex not working as intended

I have the HTML from a page in a variable as just plain text. Now I need to remove some parts of the text. This is a part of the HTML that I need to change:
<div class="post"><a name="6188729"></a>
<div class="igmline small" style="height: 20px; padding-top: 1px;">
<span class="postheader_left">
RuneRifle
op 24.08.2012 om 21:41 uur
</span>
<span class="postheader_right">
Citaat Bewerken
</span>
<div style="clear:both;"></div>
</div>
<div class="text">Testforum</div>
<!-- Begin Thank -->
<!-- Thank End -->
</div>
These replaces work:
pageData = pageData.replace(/href=\".*?\"/g, "href=\"#\"");
pageData = pageData.replace(/target=\".*?\"/g, "");
But this replace does not work at all:
pageData = pageData.replace(
/<span class=\"postheader_right\">(.*?)<\/span>/g, "");
I need to remove every span with the class postheader_right and everything in it, but it just doesn't work. My knowledge of regex isn't that great so I'd appreciate if you would tell me how you came to your answer and a small explanation of how it works.
The dot doesn't match newlines. Use [\s\S] instead of the dot as it will match all whitespace characters or non-whitespace characters (i.e., anything).
As Mike Samuel says regular expressions are not really the best way to go given the complexity allowed in HTML (e.g., if say there is a line break after <a), especially if you have to look for attributes which may occur in different orders, but that's the way you can do it to match the case in your example HTML.
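For the string-based approach, the difference between the dot and [\s\S] can be seen on a trimmed-down copy of the HTML from the question (shortened here for illustration):

```javascript
// Shortened sample: the span content begins with a newline.
var pageData = '<span class="postheader_left">\nRuneRifle\n</span>\n' +
  '<span class="postheader_right">\nCitaat Bewerken\n</span>\n' +
  '<div class="text">Testforum</div>';

// The dot stops at the first newline, so nothing is replaced.
var withDot = pageData.replace(/<span class="postheader_right">(.*?)<\/span>/g, '');

// [\s\S] matches any character, newlines included; *? keeps the match
// lazy so it ends at the first closing </span>.
var cleaned = pageData.replace(/<span class="postheader_right">[\s\S]*?<\/span>/g, '');

console.log(withDot === pageData);                  // true
console.log(cleaned.indexOf('postheader_right'));   // -1
```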
I need to remove every span with the class postheader_right and everything in it, but it just doesn't work.
Don't use regular expressions to find the spans. Using regular expressions to parse HTML: why not?
var allSpans = document.getElementsByTagName('span');
for (var i = allSpans.length; --i >= 0;) {
var span = allSpans[i];
if (/\bpostheader_right\b/.test(span.className)) {
span.parentNode.removeChild(span);
}
}
should do it.
If you only need to work on newer browsers then getElementsByClassName makes it even easier:
// Find all div elements that have a class of 'test'
var tests = Array.prototype.filter.call(document.getElementsByClassName('test'), function (elem) {
  return elem.nodeName == 'DIV';
});
