selecting text between round brackets in javascript - javascript

I need to select a text using javascript that is between round brackets, and wrap it all in a span:
<p>Just some text (with some text between brackets) and some more text</p>
should become:
<p>Just some text <span class="some-class">(with some text between brackets)</span> and some more text</p>
I think something like this should be possible using regex, but I'm totally unfamiliar with using regex in JavaScript. Can someone please help? Thanks!

This should do the trick (str is the string holding the text you want to manipulate):
str.replace(/(\([^()<>]*\))/g, "<span class=\"some-class\">$1</span>");
It disallows (, ), < or > within the parentheses. This avoids nesting issues and HTML tags falling in the middle of the parentheses. You might need to adapt it to meet your exact requirements.
Since you're new to regular expressions, I recommend reading http://www.regular-expressions.info/ if you want to learn more.
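For reference, here is a minimal, self-contained sketch of that replacement wrapped in a helper. The function name wrapParenthesized and its className parameter are made up for illustration; the pattern is the one described above, with the g flag so that every bracketed run is wrapped. Note that replace() returns a new string rather than changing the original.

```javascript
// Hypothetical helper: wraps each "(...)" run in a span.
// Assumes className is a trusted value (it is not escaped here).
function wrapParenthesized(str, className) {
  // \( and \) match literal brackets; [^()<>]* refuses nested brackets and tags.
  return str.replace(/(\([^()<>]*\))/g, '<span class="' + className + '">$1</span>');
}

var input = '<p>Just some text (with some text between brackets) and some more text</p>';
var output = wrapParenthesized(input, 'some-class');
console.log(output);
```

The helper leaves the input untouched when it contains no brackets, so it should be safe to run over text that may or may not need wrapping.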

var oldString = '<p>Just some text (with some text between brackets) and some more text</p>';
var newString = oldString.replace(/\((.*?)\)/g, '<span class="some-class">($1)</span>');

Try this:
<p id="para">Just some text (with some text between brackets) and some more text</p>
<input type="button" value="Change Text" onclick="ChangeText()"/>
<script>
function ChangeText()
{
var para = document.getElementById("para");
var text = para.innerHTML;
para.innerHTML = text.replace(/(.*)(\(.*\))(.*)/g, "$1<span class=\"some-class\">$2</span>$3")
}
</script>

Using RegExp object:
var str = "<p>Just some text (with some text between brackets) and some more text</p>";
// Backslashes must be doubled inside a string passed to the RegExp constructor.
var re = new RegExp("\\(+(.*)\\)+", "g");
var myArray = str.replace(re, '<span class="some-class">($1)</span>');
Using literal:
var myArray = str.replace(/\(+(.*)\)+/g, '<span class="some-class">($1)</span>');

Related

regex to replace "<p><br/></p>" string with empty string- Javascript [duplicate]

I have some HTML as a string
var str= "<p><br/></p>"
How do I strip the p tags from this string using JS.
here is what I have tried so far:
str.replace(/<p[^>]*>(?:\s|&nbsp;)*<\/p>/, "") // o/p: <p><br></p>'
str.replace("/<p[^>]*><\\/p[^>]*>/", "")// o/p: <p><br></p>'
str.replace(/<p><br><\/p>/g, "")// o/p: <p><br></p>'
All of them return the same str as above. The expected output is:
str should be ""
What am I doing wrong here?
Thanks
You probably should not be using RegExp to parse HTML - it's not particularly useful with (X)HTML-style markup as there are way too many edge cases.
Instead, parse the HTML as you would an element in the DOM, then compare the trim()med innerText value of each <p> with a blank string, and remove those that are equal:
var str = "<p><br/></p><p>This paragraph has text</p>"
var ele = document.createElement('body');
ele.innerHTML = str;
[...ele.querySelectorAll('p')].forEach(para => {
if (para.innerText.trim() === "") para.remove(); // remove() also works when the <p> is nested deeper
});
console.log(ele.innerHTML);
You should be able to use the following expression: <p[^>]*>(&nbsp;|\s+|<br\s*\/?>)*<\/p>
The expression above looks at content enclosed in <p>...</p> and matches it against &nbsp;, whitespace (\s+) and <br> (and its / variations).
I think you were mostly there with /<p[^>]*>(?:\s|&nbsp;)*<\/p>/; you just needed to add a case for <br>. (The ?: only makes the group non-capturing, so it can stay or be removed.)
const str = `
<p><br></p>
<p><br/></p>
<p><br /></p>
<p> <br/> </p>
<p> </p>
<p> </p>
<p><br/> </p>
<p>
<br>
</p><!-- multiline -->
<p><br/> don't replace me</p>
<p>don't replace me</p>
`;
const exp = /<p[^>]*>(&nbsp;|\s+|<br\s*\/?>)*<\/p>/g;
console.log(str.replace(exp, ''));
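A likely reason all three attempts in the question seemed to return the same string: replace() never changes str in place; it returns a new string that has to be assigned. The second attempt also passes a quoted string rather than a regex literal, so it is searched for as literal text. A small sketch of both effects (the variable names are illustrative):

```javascript
var str = "<p><br/></p>";

// Strings are immutable: the result is discarded, so str stays the same.
str.replace(/<p[^>]*>(?:\s|&nbsp;|<br\s*\/?>)*<\/p>/g, "");
var unchanged = str;

// Assigning the return value keeps the result.
var cleaned = str.replace(/<p[^>]*>(?:\s|&nbsp;|<br\s*\/?>)*<\/p>/g, "");

// A quoted pattern is treated as literal text, not as a regex, so nothing matches.
var literal = str.replace("/<p><br\\/><\\/p>/", "");

console.log(JSON.stringify(unchanged));
console.log(JSON.stringify(cleaned));
console.log(JSON.stringify(literal));
```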

UBB Code [textarea] - do not replace \n by <br> within tags [textarea][/textarea]

I currently load a value from my database straight into a hidden textarea.
<textarea name="text" id="text" style="visibility:hidden">
[textarea]Content showing raw [b]HTML[/b] or any other code
Including line breaks </a>[/textarea]
</textarea>
From there I pick up the textarea's content and run it through several replace calls with a simple JavaScript, like
<script type="text/javascript">
document.addEventListener('DOMContentLoaded', function parser() {
post_text=post_text.replace(/\r?\n/g, "<br>");
post_text=post_text.replace(/\[size=1\]/g, "<span style=\"font-size:80%\">");
post_text=post_text.replace(/\[url=(.+?)\](.+?)\[\/url\]/g, "$2 <img src=\"images/link.gif\" style=\"border:0px\">");
post_text=post_text.replace(/\[url\](.+?)\[\/url\]/g, "$1 <img src=\"images/link.gif\" style=\"border:0px\">");
document.getElementById('vorschau').innerHTML = post_text;
}, false);
</script>
<div id="vorschau"></div>
to render it into HTML which is then parsed by the Browser, so I do all the formatting of the entries on the Frontend/client side.
However, the textarea may also contain such an UBB tag:
[textarea]Content showing raw [b]HTML[/b] or any other code
Including line breaks </a>[/textarea]
I currently just replace the textarea UBB elements like any other content
post_text=post_text.replace(/\[textarea\]/g, "<textarea id=\"codeblock\" style=\"width:100%;min-height:200px;\">");
post_text=post_text.replace(/\[\/textarea\]/g, "</textarea>");
The issue with this is that my other code
post_text=post_text.replace(/\r?\n/g, "<br>");
post_text=post_text.replace(/\</g, "&lt;");
post_text=post_text.replace(/\>/g, "&gt;");
Does not skip the content within the [textarea][/textarea] elements resulting in a textarea filled with this:
Content showing raw <b>HTML</b> or any other code<br>Including line breaks </a>
Above example
So how do I prevent anything within [textarea][/textarea] (which can occur more than once in id="text") from being replaced?
What you might do, is use a dynamic pattern that captures from [textarea] till [/textarea] in group 1, and use an alternation to match what you want to replace.
Then use a callback function for replace. Check if group 1 exists, and if it does return it unmodified. If it does not, we have a match outside of the text area.
An example of the pattern with the alternation and match for <
(\[textarea][^]*\[\/textarea])|<
(\[textarea][^]*\[\/textarea]) Capture group 1, match from [textarea] till [/textarea]
| Or
< Match literally
Note that backslashes must be double-escaped in the RegExp constructor.
(Assuming this is the right order of replacements:)
const replacer = (text, find, replace) => text.replace(
new RegExp(`(\\[textarea][^]*\\[\\/textarea])|${find}`, "g"),
(m, g1) => g1 ? g1 : replace
);
document.addEventListener('DOMContentLoaded', function parser() {
let post_text = document.getElementById('text').value;
post_text = post_text.replace(/\[size=1]/g, "<span style=\"font-size:80%\">");
post_text = post_text.replace(/\[url=(.+?)](.+?)\[\/url\]/g, "$2 <img src=\"images/link.gif\" style=\"border:0px\">");
post_text = post_text.replace(/\[url](.+?)\[\/url]/g, "$1 <img src=\"images/link.gif\" style=\"border:0px\">");
post_text = replacer(post_text, "\\r?\\n", "<br>");
post_text = replacer(post_text, "<", "&lt;");
post_text = replacer(post_text, ">", "&gt;");
post_text = post_text.replace(/\[textarea]/g, "<textarea id=\"codeblock\" style=\"width:100%;min-height:200px;\">");
post_text = post_text.replace(/\[\/textarea]/g, "</textarea>");
document.getElementById('vorschau').innerHTML = post_text;
}, false);
<textarea name="text" id="text" rows="10" cols="60">
[textarea]Content showing raw [b]HTML[/b] or any other code
Including line breaks </a>[/textarea]
< here and > here and
</textarea>
<div id="vorschau"></div>
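Because the question mentions that [textarea] can occur more than once, it may be worth making the protected group lazy: a greedy [^]* runs from the first [textarea] to the last [/textarea] and would also shield everything in between. Below is a sketch of that variant; the name replaceOutsideTextarea is hypothetical, and find is assumed to be a regex source string, as in the replacer above.

```javascript
// Hypothetical variant of the replacer with a lazy quantifier ([^]*?),
// so each [textarea]...[/textarea] block is protected separately.
function replaceOutsideTextarea(text, find, replacement) {
  var pattern = new RegExp("(\\[textarea\\][^]*?\\[\\/textarea\\])|" + find, "g");
  return text.replace(pattern, function (match, protectedBlock) {
    // Group 1 is set only when a whole [textarea] block matched: keep it as-is.
    return protectedBlock !== undefined ? protectedBlock : replacement;
  });
}

var sample = "a<b [textarea]x<y[/textarea] c<d [textarea]p<q[/textarea] e<f";
var escaped = replaceOutsideTextarea(sample, "<", "&lt;");
console.log(escaped);
```

Using a callback for the replacement also means the replacement text is inserted verbatim, without any $1-style substitution.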

Why my regex is not working in react but working anywhere else (e.g. regex tester online)? [duplicate]

I am trying to remove all the html tags out of a string in Javascript.
Here's what I have... I can't figure out why it's not working. Does anyone know what I am doing wrong?
<script type="text/javascript">
var regex = "/<(.|\n)*?>/";
var body = "<p>test</p>";
var result = body.replace(regex, "");
alert(result);
</script>
Thanks a lot!
Try this, noting that the grammar of HTML is too complex for regular expressions to be correct 100% of the time:
var regex = /(<([^>]+)>)/ig
, body = "<p>test</p>"
, result = body.replace(regex, "");
console.log(result);
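As for why the original attempt did nothing: the pattern there is wrapped in quotes, which makes it an ordinary string, and replace() then looks for that exact text instead of treating it as a regular expression. A short side-by-side sketch (the variable names are illustrative):

```javascript
var body = "<p>test</p>";

// Quoted: replace() searches for the literal characters of the string,
// slashes included, and that text never occurs in body.
var asString = body.replace("/<(.|\n)*?>/", "");

// Regex literal with the g flag: every tag is removed.
var asRegex = body.replace(/<(.|\n)*?>/g, "");

console.log(asString);
console.log(asRegex);
```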
If you're willing to use a library such as jQuery, you could simply do this:
console.log($('<p>test</p>').text());
This is an old question, but I stumbled across it and thought I'd share the method I used:
var body = '<div id="anid">some text</div> and some more text';
var temp = document.createElement("div");
temp.innerHTML = body;
var sanitized = temp.textContent || temp.innerText;
sanitized will now contain: "some text and some more text"
Simple, no jQuery needed, and it shouldn't let you down even in more complex cases.
Warning
This can't safely deal with user content, because it's vulnerable to script injections. For example, running this:
var body = '<img src=fake onerror=alert("dangerous")> Hello';
var temp = document.createElement("div");
temp.innerHTML = body;
var sanitized = temp.textContent || temp.innerText;
Leads to an alert being emitted.
This worked for me.
var regex = /(&nbsp;|<([^>]+)>)/ig
, body = tt
, result = body.replace(regex, "");
alert(result);
This is a solution for HTML tags, &nbsp; and similar entities. You can remove or add alternatives
to get the text without HTML, and you can replace the matches with anything you like.
function convertHtmlToText(passHtmlBlock)
{
// Work on a string copy of the argument.
var str = passHtmlBlock.toString();
return str.replace(/<[^>]*(>|$)|&nbsp;|&zwnj;|&raquo;|&laquo;|&gt;/g, 'ReplaceIfYouWantOtherWiseKeepItEmpty');
}
Here is how TextAngular (WYSIWYG editor) does it. I also found this to be the most consistent answer, which is NO REGEX.
#license textAngular
Author : Austin Anderson
License : 2013 MIT
Version 1.5.16
// turn html into pure text that shows visiblity
function stripHtmlToText(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
var res = tmp.textContent || tmp.innerText || '';
res.replace('\u200B', ''); // zero width space
res = res.trim();
return res;
}
You can use underscore.string.js, a powerful library for string manipulation:
_('a <a href="#">link</a>').stripTags()
=> 'a link'
_('a <a href="#">link</a><script>alert("hello world!")</script>').stripTags()
=> 'a linkalert("hello world!")'
Don't forget to import the library as follows:
<script src="underscore.js" type="text/javascript"></script>
<script src="underscore.string.js" type="text/javascript"></script>
<script type="text/javascript"> _.mixin(_.str.exports())</script>
My simple JavaScript library called FuncJS has a function called "strip_tags()" which does the task for you, without requiring you to enter any regular expressions.
For example, say that you want to remove tags from a sentence - with this function, you can do it simply like this:
strip_tags("This string <em>contains</em> <strong>a lot</strong> of tags!");
This will produce "This string contains a lot of tags!".
For a better understanding, please do read the documentation at
GitHub FuncJS.
Additionally, if you'd like, please provide some feedback through the form. It would be very helpful to me!
For a proper HTML sanitizer in JS, see http://code.google.com/p/google-caja/wiki/JsHtmlSanitizer
<html>
<head>
<script type="text/javascript">
function striptag(){
var html = /(<([^>]+)>)/gi;
for (var i=0; i < arguments.length; i++)
arguments[i].value=arguments[i].value.replace(html, "")
}
</script>
</head>
<body>
<form name="myform">
<textarea class="comment" title="comment" name=comment rows=4 cols=40></textarea><br>
<input type="button" value="Remove HTML Tags" onClick="striptag(this.form.comment)">
</form>
</body>
</html>
The selected answer doesn't always ensure that HTML is stripped, as it's still possible to construct an invalid HTML string through it by crafting a string like the following.
"<<h1>h1>foo<<//</h1>h1/>"
This input will ensure that the stripping assembles a set of tags for you and will result in:
"<h1>foo</h1>"
Additionally, jQuery's text function will strip text that is not surrounded by tags.
Here's a function that uses jQuery but should be more robust against both of these cases:
var stripHTML = function(s) {
var lastString;
do {
s = $('<div>').html(lastString = s).text();
} while(lastString !== s)
return s;
};
The way I do it is practically a one-liner.
The function creates a Range object and then creates a DocumentFragment in the Range with the string as the child content.
Then it grabs the text of the fragment, removes any "invisible"/zero-width characters, and trims it of any leading/trailing white space.
I realize this question is old, I just thought my solution was unique and wanted to share. :)
function getTextFromString(htmlString) {
return document
.createRange()
// Creates a fragment and turns the supplied string into HTML nodes
.createContextualFragment(htmlString)
// Gets the text from the fragment
.textContent
// Removes the Zero-Width Space, Zero-Width Joiner, Zero-Width No-Break Space, Left-To-Right Mark, and Right-To-Left Mark characters
.replace(/[\u200B-\u200D\uFEFF\u200E\u200F]/g, '')
// Trims off any extra space on either end of the string
.trim();
}
var cleanString = getTextFromString('<p>Hello world! I <em>love</em> <strong>JavaScript</strong>!!!</p>');
alert(cleanString);
If you want to do this with a library and are not using JQuery, the best JS library specifically for this purpose is striptags.
It is heavier than a regex (17.9kb), but if you need greater security than a regex can provide and don't mind the extra size, then it's the best solution.
Like others have stated, regex will not work. Take a moment to read my article about why you cannot and should not try to parse html with regex, which is what you're doing when you're attempting to strip html from your source string.

How to replace a text plus brackets

Hi there how can I replace from this to this
var str = document.getElementById('bos').innerHTML.replace('col_nr', "");
document.getElementById('bos').innerHTML = str;
<div id="bos">
col_nr[504]
</div>
I want to be able to take only the number without brackets
You can chain further replace() calls to achieve your goal, as demonstrated below. Alternatively, you can use a regular expression for the same task.
var str = document.getElementById('bos').innerHTML.replace('col_nr[', '').replace(']', '');
document.getElementById('bos').innerHTML = str;
<div id="bos">
col_nr[504]
</div>
You could remove all non-digit characters.
var element = document.getElementById('bos');
element.innerHTML = element.innerHTML.replace(/\D/g, "");
<div id="bos">
col_nr[504]
</div>
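If the surrounding text can vary, capturing the digits between the brackets may be more predictable than deleting everything else. A sketch of that idea; extractBracketNumber is a hypothetical helper name, and it assumes the value of interest is the first run of digits enclosed in square brackets.

```javascript
// Hypothetical helper: returns the digits inside the first [...] pair, or null.
function extractBracketNumber(text) {
  var match = /\[(\d+)\]/.exec(text);
  return match ? match[1] : null;
}

var found = extractBracketNumber("\ncol_nr[504]\n");
var missing = extractBracketNumber("no brackets here");
console.log(found);
console.log(missing);
```

In the page from the question, the argument would be the element's innerHTML (or textContent), and the returned string could be assigned back to it.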

Html - insert html into <p> </p> tags

Let's say I have a text :
<p> hello world! </p>
and I am using a function that cuts the text after 5 words and adds " ...show more"
I want the result to be like this :
hello ... show more
Because of the <p> tags what I get is this output :
hello ...show more
what I see when I inspect the element is this :
<p> hello </p> ...show more
I must mention that the text can be with <p> or without.
Is there a way to solve this problem ?
Is there a way to insert the added text inside the <p> tag ?
I need to mention that I need the <p> tags, I can't use strip tags function.
Thanks,
Yami
Do you mean this?
var text = "<p>hello world</p>";
var res = "<p>" + text.substring(3, 8) + " ...show more</p>";
It results in:
<p>hello ...show more</p>
The way I see it, you have two options:
.split() the string by spaces (assuming a space separates words), then slice the first (up to) 5 elements. If there are more than 5 elements, add "...read more"; if not, it's unnecessary.
You can use a regex replace and (with a negative lookahead) ignore the first 5 words, but replace all other text with your "...read more". (I personally find this one has more overhead, but you could probably use (?!(?:[^\b]+?[\b\s]+?){5})(.*)$ as a pattern.)
Having said that, here's what I mean with a string split:
function readMore(el){
var ary = el.innerHTML.split(' ');
el.innerHTML = (ary.length > 5 ? ary.slice(0,5).join(' ') + '... read more' : ary.join(' '));
}
var p = document.getElementById('foo');
readMore(p);
Assuming of course, for the purposes of this demo, <p id="foo">Hello, world! How are you today?</p> (which would result in <p id="foo">Hello, world! How are you... read more</p>)
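Since the question says the text may or may not be wrapped in <p>, another option is to work on the string before it reaches the DOM: peel off an outer <p>...</p> if present, truncate the inner text, and put the wrapper back around the result. A rough sketch, assuming at most one outer <p> and no nested markup inside it; truncateWords and its parameters are illustrative names, not part of any library.

```javascript
// Hypothetical helper: keeps the first `limit` words and appends `suffix`
// inside the outer <p> wrapper when there is one.
function truncateWords(html, limit, suffix) {
  // Capture an optional outer <p ...> wrapper and its content.
  var wrapped = /^\s*(<p(?:\s[^>]*)?>)([^]*?)<\/p>\s*$/.exec(html);
  var open = wrapped ? wrapped[1] : "";
  var close = wrapped ? "</p>" : "";
  var inner = wrapped ? wrapped[2] : html;

  var words = inner.trim().split(/\s+/);
  if (words.length <= limit) {
    // Short enough: return the text without a suffix.
    return open + words.join(" ") + close;
  }
  return open + words.slice(0, limit).join(" ") + suffix + close;
}

var withTag = truncateWords("<p> hello world! </p>", 1, " ...show more");
var withoutTag = truncateWords("one two three four five six", 5, " ...show more");
console.log(withTag);
console.log(withoutTag);
```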
$('p').text($('p').text().replace('world!', '... show more'));
