javascript replace issue - javascript

I have built a small highlight script with result tags: you type words into an input field and each word is displayed as a clickable tag. Tags are split on whitespace (type a space and a new tag forms). Clicking a tag removes that word from the input and from the highlighted text.
The issue: if I enter a single letter and click its tag to remove it, the letter is removed from all of the search words as well (click on a single "a" and all of the a's are removed from the search input).
The code:
$('a').live('click',function(){
var searchPhrase = $(this).text();
$('input').val(
$('input').val().replace(searchPhrase,'')
);
})
I use this piece of code to simply remove the matched text from the input.
What I need is for a tag to be removed only when it matches a whole word, so I think I need a regex that anchors the match, such as a beginning-of-string pattern.
Found the solution:
var reg = new RegExp("\\b"+ searchPhrase +"\\b", "g");

Use a JavaScript regular expression instead of a plain string search.
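The word-boundary fix above can be sketched as a small helper. This is a minimal sketch, not the original script: removeTag and escapeRegExp are hypothetical names, and the whitespace clean-up at the end is an added assumption. Escaping matters because the tag text comes from user input and may contain regex metacharacters. Note that \b only recognises ASCII word characters, so tags that start or end with punctuation or non-ASCII letters may need a different boundary test.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so a tag such as "a.b" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove every whole-word occurrence of `tag` from `input`.
// (removeTag is a hypothetical helper name, not part of the original script.)
function removeTag(input, tag) {
  var reg = new RegExp("\\b" + escapeRegExp(tag) + "\\b", "g");
  // Collapse the gaps left behind by the removed words.
  return input.replace(reg, "").replace(/\s+/g, " ").trim();
}

// Only the standalone "a" tags are removed; the letters inside
// "banana", "and" and "grape" are left alone.
console.log(removeTag("a banana and a grape", "a"));
```

In the click handler from the question, the input value would be set to removeTag($('input').val(), searchPhrase).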

Related

Convert raw html to text with javascript and regex

I have raw HTML with link tags. My goal is to extract the href attribute from each link tag and keep all the text between tags, dropping the tags themselves.
For example:
<br>#EXTINF:-1 tvg-name="1377",Страшное HD<br>
<a title="Ссылка" rel="nofollow" href="http://4pda.ru/pages/go/?u=http%3A%2F%2F46.61.226.18%2Fhls%2FCH_C01_STRASHNOEHD%2Fbw3000000%2Fvariant.m3u8%3Fversion%3D2" target="_blank">http://46.61.226.18/hl…variant.m3u8?version=2</a>
<br>#EXTINF:-1 tvg-name="983" ,Первый канал HD<br>
<a title="Ссылка" rel="nofollow" href="http://4pda.ru/pages/go/?u=http%3A%2F%2F46.61.226.18%2Fhls%2FCH_C06_1TVHD%2Fbw3000000%2Fvariant.m3u8%3Fversion%3D2" target="_blank">http://46.61.226.18/hl…variant.m3u8?version=2</a>
This has to be converted to:
#EXTINF:-1 tvg-name="1377",Страшное HD
http://4pda.ru/pages/go/?u=http%3A%2F%2F46.61.226.18%2Fhls%2FCH_C01_STRASHNOEHD%2Fbw3000000%2Fvariant.m3u8%3Fversion%3D2
#EXTINF:-1 tvg-name="983" ,Первый канал HD
http://4pda.ru/pages/go/?u=http%3A%2F%2F46.61.226.18%2Fhls%2FCH_C06_1TVHD%2Fbw3000000%2Fvariant.m3u8%3Fversion%3D2
I tried different regexes.
Here is what I did:
var source_text = $("#source").val();
var delete_start_of_link_tag = source_text.replace(/<a(.+?)href="/gi, "");
This deletes the beginning of the tag, up to and including the opening quote of the href attribute.
var delete_tags = delete_start_of_link_tag.replace(/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/gi, "");
This deletes all remaining tags, such as </a> and <br>.
After that, I want to delete all the text that follows the href value, up to the end of the line.
What regex should I use in the replace method, or is there a different way to do this conversion?
Formatting Anchor Tags
In your example, you are not removing the "> part from the HTML.
Use this code to remove everything after the closing quote of the href (' or "):
var delete_tags = delete_start_of_link_tag.replace(/".*/gi, "");
A few things to note:
1. The value of href may be enclosed in single quotes (') or double quotes ("); both are valid.
2. A regex to match every href in a given string is href=["'].*?["'] (inside a character class, | is a literal pipe, so ["|'] would also accept a | character).
3. Some patterns of href values I have come across are listed below.
http://www.so.com
https://www.so.com
www.so.com
//so.com
/socom.html
javascript*
mailto*
tel*
So if you want to format URLs, you have to consider the above cases, and I may have missed some.
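Putting the points above together, the whole conversion can be sketched as one pass that swaps each anchor for its href value. This is a sketch under assumptions: linksToText is a hypothetical helper name, the sample URL is a placeholder, and the pattern assumes no attribute value contains a > character. It also avoids a pitfall of a pattern like /".*/, which would truncate the #EXTINF lines as well, since they contain quotes.

```javascript
// Convert the markup to plain lines: each <a> becomes its href value,
// each <br> becomes a line break. linksToText is a hypothetical helper name.
function linksToText(html) {
  return html
    // Replace the whole anchor (opening tag, link text, closing tag) with the href.
    // (["']) captures the quote character so the value ends at the matching quote.
    .replace(/<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1[^>]*>[\s\S]*?<\/a>/gi, "$2")
    // Turn <br> and <br/> into newlines.
    .replace(/<br\s*\/?>/gi, "\n")
    // Trim each line and drop the empty ones.
    .split("\n")
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line !== ""; })
    .join("\n");
}

// Shortened sample in the shape of the question's input (placeholder URL).
var sample =
  '<br>#EXTINF:-1 tvg-name="1377",Channel HD<br>\n' +
  '<a title="Link" rel="nofollow" href="http://example.com/go/?u=a%2Fb" target="_blank">http://short</a>';

console.log(linksToText(sample));
```

If the markup is already in the page, reading the href through the DOM is likely more robust than any regex.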
Looks like you're already using jQuery.
Get the href of each anchor
$('a').each(function(){
var href = $(this).attr('href');
});
Get the text of each anchor:
$('a').each(function(){
var text = $(this).text();
});
You haven't shown a wrapper element around these but you can get the text (without tags) of any selection.
var text = $('#some_id').text();

Select single word between any HTML tag?

I am looking to build a regular expression that will select a single word out of all text between HTML tags. I am looking for the occurrence of the word anywhere but inside HTML tags. The issue is that the word I am looking to match may occur in the class or id of a tag - I would only like to match it when it is between the tags.
Here is further clarification from my comment:
I am looking for a regex to use in a loop that will find a string in another string that contains HTML. The large string will contain something like this:
<div class="a-class"><span class="some-class" data-content="some words containing target">some other text containing target</span>
I want the regex to match the word "target" only between the tags, not within the tag in the data-content attribute. I can use:
/(\btarget)\b/ig
to find every instance of target.
Here is one way to do it: http://jsfiddle.net/techsin/xt1j2cj8/3/
var cont = $(".cont"),
html = cont.html(),
word = "Lorem";
word = word.replace(/(\s+)/, "(<[^>]+>)*$1(<[^>]+>)*");
var pattern = new RegExp("(" + word + ")", "gi");
html = html.replace(pattern, "<mark>$1</mark>");
html = html.replace(/(<mark>[^<>]*)((<[^>]+>)+)([^<>]*<\/mark>)/, "$1</mark>$2<mark>$4");
$(".cont").html(html);
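For the narrower requirement in the question (match the word between tags, but not inside an attribute), a negative lookahead is one option. This is a sketch, not the code from the fiddle: outsideTagsRegex is a hypothetical helper name, and the approach assumes that text content never contains a literal > and that attribute values never contain < or >.

```javascript
// Build a regex that matches `word` as a whole word, but only in text outside of tags.
// The negative lookahead (?![^<>]*>) rejects a match when the next angle bracket
// after it is ">", which means the match sits inside a tag such as
// <span data-content="...">.
function outsideTagsRegex(word) {
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("\\b" + escaped + "\\b(?![^<>]*>)", "gi");
}

var html =
  '<span class="some-class" data-content="some words containing target">' +
  'some other text containing target</span>';

// Wrap only the occurrence in the text; the one in data-content is left alone.
var marked = html.replace(outsideTagsRegex("target"), "<mark>$&</mark>");
console.log(marked);
```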
If the word can be present anywhere, i.e. even as a class name or an id, here is what you can do.
Take <html> as the parent element and read all of its contents with innerHTML; you can then find any word as follows:
<html id="main">
<div>
<p class="yourword">
</p>
</div>
</html>
var str = document.getElementById("main").innerHTML;
var res = str.match(/yourword/gi);
alert(res);
The above matches the word "yourword" anywhere in the document, including class names and ids.
Here is a demo which selects the string "sub".

How can I Strip all regular html tags except <a></a>, <img>(attributes inside) and <br> with javascript?

When a user creates a message, there is a multibox connected to a design panel that lets users change the font, color, size, etc. When the message is submitted, it is displayed with HTML tags if the user has changed the color, size, etc. of the font.
Note: I need the design panel. I know it's possible to remove it, but that is not the case here :)
It's a SharePoint standard. The only solution I have is to use JavaScript to strip these tags when the message is displayed. The user should only be able to insert links, images and line breaks.
Which means that all HTML tags should be stripped except <a></a>, <img> and <br> tags.
It's also important that the attributes inside the <img> tag are not removed. It could be displayed like this:
<img src="/image/Penguins.jpg" alt="Penguins.jpg" style="margin:5px;width:331px;">
How can I accomplish this with javascript?
I used to use the following C# code-behind, which worked perfectly, but it strips all HTML tags except <br>.
public string Strip(string text)
{
return Regex.Replace(text, @"<(?!br[\x20/>])[^<>]+>", string.Empty);
}
Any kind of help is much appreciated.
Does this do what you want? http://jsfiddle.net/smerny/r7vhd/
$("body").find("*").not("a,img,br").each(function() {
$(this).replaceWith(this.innerHTML);
});
Basically select everything except a, img, br and replace them with their content.
Smerny's answer works well, except when the HTML is structured like this:
var s = '<div><div>Link<span> Span</span><li></li></div></div>';
var $s = $(s);
$s.find("*").not("a,img,br").each(function() {
$(this).replaceWith(this.innerHTML);
});
console.log($s.html());
The live code is here: http://jsfiddle.net/btvuut55/1/
This happens when there are two or more wrappers on the outside (two divs in the example above).
jQuery reaches the outermost matched div first and replaces it with its innerHTML. The span inside that markup is re-created as a new element that the loop never visits, so it is retained.
The answer $('#container').find('*:not(br,a,img)').contents().unwrap() fails to deal with tags that have empty content.
A working solution is simple: loop from the innermost element outward:
var $elements = $s.find("*").not("a,img,br");
for (var i = $elements.length - 1; i >= 0; i--) {
var e = $elements[i];
$(e).replaceWith(e.innerHTML);
}
The working copy is: http://jsfiddle.net/btvuut55/3/
With jQuery you can find all the elements you don't want, then use unwrap to strip the tags:
$('#container').find('*:not(br,a,img)').contents().unwrap()
I think it would be better to extract the good tags. It is easier to match a few tags than to remove all the other elements and every HTML possibility. Try something like this; I tested it and it works fine:
// the following regex matches the good tags with their attributes and inner content
var ptt = new RegExp("<(?:img|a|br){1}.*/?>(?:(?:.|\n)*</(?:img|a|br){1}>)?", "g");
var input = "<this string would contain the html input to clean>";
var result = "";
var match = ptt.exec(input);
while (match) {
result += match[0]; // match is an array; element 0 is the matched text
match = ptt.exec(input);
}
// result will contain the clean HTML with only the good tags
console.log(result);
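As an alternative to the approaches above, the C# pattern from the question can be ported to JavaScript almost directly by widening its negative lookahead to the three allowed tags. This is a sketch only: stripExceptAllowed is a hypothetical helper name, the pattern assumes attribute values contain no < or > characters, and it keeps every attribute on the allowed tags, so it is not a security sanitizer.

```javascript
// Remove every tag except <a>, </a>, <img> and <br>, keeping the text between tags.
// (?!\/?(?:a|img|br)\b) skips the allowed tags, including their closing forms;
// \b stops "a" from also matching tags such as <abbr> or <article>.
function stripExceptAllowed(html) {
  return html.replace(/<(?!\/?(?:a|img|br)\b)[^<>]+>/gi, "");
}

var message =
  '<div><span style="color:red">Hi</span> <a href="/x">link</a><br>' +
  '<img src="/image/Penguins.jpg" alt="Penguins.jpg" style="margin:5px;width:331px;">' +
  '<abbr>x</abbr></div>';

// The div, span and abbr tags are dropped; a, br and img survive with their attributes.
console.log(stripExceptAllowed(message));
```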

regex to replace a string from the inner html of a div

I have a div that contains some HTML and text (the HTML is added dynamically). The structure looks like this:
<div id="contentContainer">
<span>ProductA</span> ; <span>ProductB</span>; prod
</div>
I want to remove the last incomplete text (prod) from the inner HTML of the div contentContainer when the submit button is clicked.
For this I was using the regex returnText.replace(/\w+$/, ''); and it works fine,
as I cannot trim the text to the last index of ';'.
But now the issue is that when the user puts special characters in the incomplete text, such as pr\od,
the regex fails.
Is there any way to trim the last appended text from the inner HTML of the div?
Or can I trim the text to the last HTML tag and place ; after that?
Please suggest a solution.
If you are using jQuery, you can pull out all the span elements and replace the inner HTML with them.
$("#contentContainer").html( $("#contentContainer span") );
That should clean up the rest. Maybe not the best, but I think it's better than a regexp on the content.
This solution looks at the last DOM node in the div; if it is a text node, it changes the text to a semicolon:
var lastNode = $('#contentContainer').contents().last()[0]
if (lastNode.nodeType == 3) {
lastNode.textContent=';'
}
demo: http://jsfiddle.net/EhcLh/
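If a string-only fix is still preferred, the original regex can be loosened so that it does not depend on \w. This is a sketch under assumptions: trimIncomplete is a hypothetical helper name, and the pattern assumes the unfinished text contains no semicolon of its own (an HTML entity such as &amp; would break that assumption).

```javascript
// Drop whatever follows the last semicolon (the unfinished entry), keeping the semicolon.
// [^;<>]* accepts any characters except ";" and angle brackets, so "pr\od" is handled
// the same way as "prod". trimIncomplete is a hypothetical helper name.
function trimIncomplete(html) {
  return html.replace(/;[^;<>]*$/, ";");
}

var content = "<span>ProductA</span> ; <span>ProductB</span>; pr\\od";
console.log(trimIncomplete(content));
```

The DOM-based answers above avoid the entity problem, because they work on nodes rather than on serialized markup.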

string search in body.html() not working

Hi, here is my work so far to search for a string in HTML and highlight it if it is found in the document.
The problem is here:
var SearchItems = text.split(/\r\n|\r|\n/);
var replaced = body.html();
for(var i=0;i<SearchItems.length;i++)
{
var tempRep= '<span class="highlight" style="background-color: yellow">';
tempRep = tempRep + SearchItems[i];
tempRep = tempRep + '</span>';
replaced = replaced.replace(SearchItems[i],tempRep); // It is trying to match along with html tags...
// As the <b> tags will not be there in search text, it is not matching...
}
$("body").html(replaced);
The HTML I'm using is as follows;
<div>
The clipboardData object is reserved for editing actions performed through the Edit menu, shortcut menus, and shortcut keys. It transfers information using the system clipboard, and retains it until data from the next editing operation replace s it. This form of data transfer is particularly suited to multiple pastes of the same data.
<br><br>
This object is available in script as of <b>Microsoft Internet Explorer 5.</b>
</div>
<div class='b'></div>
If I search a page that is plain text without any HTML tags, it matches. However, if there are any tags in the HTML, this does not work, because I am taking the body's html() as the target text, so it tries to match the search text against the markup, tags included.
In the fiddle, the second paragraph does not match.
First of all, to ignore the HTML tags of the element to look within, use the .text() method.
Secondly, in your fiddle, it wasn't working because you weren't calling the SearchQueue function on load.
Try this amended fiddle
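The difference between searching the markup and searching the text can be illustrated without a browser. This is a rough sketch: visibleText is a hypothetical helper that only approximates what .text() returns, by stripping tags with a regex and collapsing whitespace.

```javascript
// Approximate what jQuery's .text() returns for a fragment: remove the tags and
// collapse runs of whitespace. The regex is only a stand-in here; in the browser,
// .text() or textContent is the more reliable way to get the text.
function visibleText(html) {
  return html.replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
}

var html = "This object is available in script as of <b>Microsoft Internet Explorer 5.</b>";
var phrase = "This object is available in script as of Microsoft Internet Explorer 5.";

var foundInHtml = html.indexOf(phrase) !== -1;              // searches the raw markup
var foundInText = visibleText(html).indexOf(phrase) !== -1; // searches the text only
console.log(foundInHtml, foundInText);
```

Finding the phrase is only half of the job: a match that spans a tag boundary, as here, cannot simply be wrapped in one span without producing badly nested markup.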
