Regex: Don't urlify links that were code-prettified before - javascript

I have a regex that first prettifies any text between back-ticks (``) as code. Then I use another regex that detects URLs and generates anchor tags, just like Stack Overflow's editor. The problem is that I don't want to urlify links inside the code-prettified part.
For example the first url in the following example should be urlified but the second one shouldn't:
To generate a link to http://stackoverflow.com you should write
StackOverflow
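One way to approach this, sketched under assumptions: handle both steps in a single pass by splitting the text on back-ticked spans, so the URL regex only ever sees the non-code parts. The helper names escapeHtml and prettifyAndUrlify, the simple URL pattern, and the sample input are made up for illustration and are not from the question.

```javascript
// Minimal sketch: prettify back-ticked code and urlify only the remaining text.
// escapeHtml and prettifyAndUrlify are hypothetical names, not from the question.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function prettifyAndUrlify(text) {
  // Splitting on a regex with a capturing group keeps the captured text:
  // even indices are plain text, odd indices are the back-ticked code.
  return text
    .split(/`([^`]*)`/)
    .map(function (part, i) {
      if (i % 2 === 1) {
        // Code segment: escape it and never run the URL regex on it.
        return '<code>' + escapeHtml(part) + '</code>';
      }
      // Plain segment: deliberately simple URL pattern (an assumption).
      return escapeHtml(part).replace(/https?:\/\/[^\s<]+/g, function (url) {
        return '<a href="' + url + '">' + url + '</a>';
      });
    })
    .join('');
}

console.log(prettifyAndUrlify('See http://example.com and `http://example.org`'));
// See <a href="http://example.com">http://example.com</a> and <code>http://example.org</code>
```

Because the code segments are wrapped and set aside before any URL matching happens, there is no need for a lookahead that tries to detect "already inside code".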

Related

Jump to a string on a website without an anchor

Is it possible, perhaps using JavaScript, to jump to a particular sentence (string) via a link on a website?
Like an anchor, only without the anchor in HTML.
This is an example from the search results, searchstring was "content directory":
Result: planets/ on line 20: <br>
If you create a folder within the content directory (e.g. <code class="hljs lua">content/<span class="hljs-built_in">sub</span></code>) and...
After the link has been opened, the browser should jump to this line (the line number is of course only the one from the searched text file) and ideally highlight the search string.
I found a useful script: http://www.seabreezecomputers.com/tips/find6.htm
It offers an on-site search button.
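One option that needs no script on the target page, assuming the visitor's browser supports URL text fragments (the #:~:text= syntax), is to build the link so that the browser itself scrolls to and highlights the string. The sketch below only constructs such a URL; the function name textFragmentUrl and the example address are hypothetical.

```javascript
// Minimal sketch: build a link that asks the browser to scroll to and
// highlight a string, using the URL text-fragment syntax (#:~:text=...).
// textFragmentUrl is a hypothetical helper name.
function textFragmentUrl(pageUrl, searchString) {
  // encodeURIComponent leaves "-" untouched, but "-" has a special meaning
  // inside a text fragment, so it is percent-encoded by hand here.
  var encoded = encodeURIComponent(searchString).replace(/-/g, '%2D');
  // Drop any existing fragment before appending the text directive.
  return pageUrl.split('#')[0] + '#:~:text=' + encoded;
}

console.log(textFragmentUrl('https://example.com/planets/', 'content directory'));
// https://example.com/planets/#:~:text=content%20directory
```

In browsers without text-fragment support the link still opens the page, just without the jump, so a script-based search like the one linked above would remain the fallback.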

RegEx for matching style tag

I have HTML code that contains CSS inside a <style> tag within the <head> tag. I want to use a regex to extract only the plain text from the HTML (the text between HTML tags). I tried:
console.log(HTML_TEXT.replace(/(<([^>]+)>)/g, ""))
which replaces everything between < and > with an empty string. The problem is that the CSS code inside the STYLE tag is still there, so I want to know how to write a regular expression that also removes the CSS code inside those tags.
How do I solve this problem?
This RegEx might help you to do so:
(\>)(.+)(<\/style>)
It creates a right boundary in a capturing group: (<\/style>)
It has a left boundary in another capturing group: (\>), to which you can add further boundaries if necessary
Then, it has a no-boundary middle capturing group, (.+), where your target is located, and you can call it using $2 and replace it with an empty string, or otherwise.
I'm not sure, as I did not test it, but your code might look something like:
console.log(HTML_TEXT.replace(/(\>)(.+)(<\/style>)/g, '$1$3'))
This post explains how to do a string replace in JavaScript.
Edit:
Based on the comment, this RegEx might help you to filter your tags using $1:
(\<style type=\"text\/css\"\>)([\s\S]*)(\<\/style\>)
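A minimal sketch of the two-step idea, under some assumptions: the style block is matched with a lazy [\s\S]*? so that it can span several lines and stops at the first closing tag, and the remaining tags are then removed with the pattern from the question. The function name extractText and the sample HTML are made up for illustration; a regex like this is not a full HTML parser.

```javascript
// Minimal sketch: drop <style> blocks (including their CSS) first,
// then strip the remaining tags. extractText is a hypothetical name.
function extractText(html) {
  return html
    // [\s\S]*? matches across newlines and is lazy, so each <style> block
    // ends at its own </style> rather than at the last one in the document.
    .replace(/<style\b[^>]*>[\s\S]*?<\/style\s*>/gi, '')
    // The tag-stripping pattern from the question.
    .replace(/<[^>]+>/g, '');
}

var HTML_TEXT =
  '<head><style type="text/css">p { color: red; }\n</style></head>' +
  '<body><p>Hello</p></body>';

console.log(extractText(HTML_TEXT)); // Hello
```

The (.+) variant above behaves differently on multi-line input because . does not match line breaks and .+ is greedy, which is why the lazy [\s\S]*? form is used here.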

How do I allow <img> and <a> tags for innerHTML, but no others? (Making a forum)

I am currently programming a forum using only JavaScript (no jQuery, please). It is going well, but there is one issue I would love help with.
Currently I am getting the post from a database, assigning it to variable MainPost, and then attaching it to a div via a text node:
var theDiv = document.getElementById("MainBody");
var content = document.createTextNode(MainPost);
theDiv.appendChild(content);
This is working quite well, however, I would LOVE to be able to do this:
document.getElementById("MainBody").innerHTML += MainPost;
But I know this would allow people to use ANY html tag they want, even something like "script" followed by javascript code. This would be bad for business, obviously, but I do like the idea of allowing posters to use the "img" tag as well as the "a href" tags. Is there a way to somehow disable all tags except these two for the innerHTML?
Thank you all so much for any help you can offer.
Ok, the first thought that came to my mind when I read this question was to find a regular expression to exclude a specific string in a word. Simple search gave a lot of results from SO.
Starting point - To remove all the HTML tags from a string (from this answer):
var regex = /(<([^>]+)>)/ig;
var body = "<p>test</p>";
var result = body.replace(regex, "");
console.log(result);
To exclude a string you would do something like this (again from all the source mentioned above):
(?!StringToBeExcluded)
Since you want to exclude the <a href and <img tags, a suitable regex in your case could be:
(<(?![\/]?a)(?![\/]?img)([^>]+)>)
Explanation:
Think of it as three parts in succession:
(?![\/]?a) : A negative lookahead asserting that the tag does not start with the string "a", optionally preceded by one forward slash (this takes care of the <a href> and </a> tags)
(?![\/]?img) : Same as the first part, except that it looks for the string "img". I don't know why I allowed the </img> tag, since <img> doesn't have a closing tag. You could remove the [\/]? bit from it to fix this.
([^>]+) : Matches one or more characters that are not >, i.e. the rest of the tag up to its closing bracket.
All three parts lie between < and >. You might want to try a regex demo that I've created incorporating them to take care of ignoring all HTML elements except the image and link tags.
Sidenote - I haven't thoroughly given this regex a try. Feel free to play around with it and tweak it according to your needs. In any case, I hope this gets you started in the right direction.
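To illustrate the lookahead idea, here is a minimal sketch under stated assumptions. It differs from the pattern above in one way: a \b word boundary is added after the tag name, because (?![\/]?a) on its own also spares every tag whose name merely starts with "a" (for example <abbr>). The function name stripExceptLinksAndImages and the sample post are hypothetical. The sketch only shows the regex technique; it does not inspect attributes, so things like onclick handlers or javascript: URLs inside an allowed tag pass through untouched.

```javascript
// Minimal sketch: remove every tag except <a ...>, </a> and <img ...>.
// stripExceptLinksAndImages is a hypothetical helper name.
function stripExceptLinksAndImages(html) {
  // (?!\/?(?:a|img)\b) : do not match if the tag name is exactly "a" or "img",
  //                      optionally preceded by a forward slash.
  // [^>]+              : otherwise consume the rest of the tag.
  return html.replace(/<(?!\/?(?:a|img)\b)[^>]+>/gi, '');
}

var MainPost =
  '<p>Hi <a href="x">link</a> <img src="y.png"> ' +
  '<script>alert(1)</script><abbr>z</abbr></p>';

console.log(stripExceptLinksAndImages(MainPost));
// Hi <a href="x">link</a> <img src="y.png"> alert(1)z
```

Note that the text content of a removed element (here the body of the script tag) stays behind as plain text; only the tags themselves are dropped.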

Javascript match regex links

Hi, I have the following text in a div:
[http://www.google.com Google this link] some other random text [http://example.com and another website] but don't parse [this is not a link, leave me alone].
What I tried to do was to convert the links into normal HTML links. The format is always like this: it opens with a [, then the URL, followed by the link text and then a closing ]. But I only want to match links, not all text in square brackets.
I want to use the .match() function in JavaScript to do this task, but I wasn't able to figure out the regular expression (I only need the text parts that are links; the rest should be a simple split).
Any help would be appreciated.
Why .match and not .replace?
string.replace(/\[(https?:\/\/[^\]\s]+)(?: ([^\]]*))?\]/g, "<a href='$1'>$2</a>");
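As a runnable sketch of the one-liner above, applied to the sample text from the question: linkify wraps the .replace call, and findLinks shows how the same pattern could be used with matchAll if only the URL and label pairs are wanted, as the question originally asked. Both function names are hypothetical.

```javascript
// Sample text from the question.
var text =
  "[http://www.google.com Google this link] some other random text " +
  "[http://example.com and another website] but don't parse " +
  "[this is not a link, leave me alone].";

// Replace every "[url label]" with an anchor; other brackets are left alone
// because the pattern requires the bracket to start with http:// or https://.
function linkify(s) {
  return s.replace(/\[(https?:\/\/[^\]\s]+)(?: ([^\]]*))?\]/g, "<a href='$1'>$2</a>");
}

// Collect only the links as { url, label } objects (matchAll needs the g flag).
function findLinks(s) {
  return Array.from(
    s.matchAll(/\[(https?:\/\/[^\]\s]+)(?: ([^\]]*))?\]/g),
    function (m) { return { url: m[1], label: m[2] }; }
  );
}

console.log(linkify(text));
console.log(findLinks(text));
```

With the sample text, linkify turns the first two bracketed items into anchors and leaves "[this is not a link, leave me alone]" as it is.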

Whitelist javascript to strip html tags

I have modified a whitelist JavaScript regex that strips unwanted tags.
I am trying to allow this code:
<span style="color: #000000"></span>
but I am unable to do it in regex.
Below is what I have so far:
(/<(?!(br|\/br|p|\/p|b|\/b|u|\/u|ol|\/ol|ul|\/ul|li|\/li))([^>])+>/gi
Thanks
Works for me as well - unless there is more that you are trying to do - e.g. if there is any content between the tags, or if you want to match the opening and closing tag in the same run - then post the example in your question.
BTW: the regex can be simplified a little in the following way:
<(?!((?:\/\s*)?(?:br|p|b|u|[ou]l|li)))([^>])+>
(?:\/\s*)? - an optional slash
(?:br|p|b|u|[ou]l|li) - followed by any of these tags
UPDATE:
Here's my last try:
If you want to match all the other tags, use this:
<(?!(?:\/\s*)?(?:br|p|b|[ou]l|li|span)(?:\s*style='color: #[A-Fa-f0-9]+'))([^>])*>
If you want to match the tags with color, use this:
<((?:\/\s*)?(?:br|p|b|[ou]l|li|span)(?:\s*style='color: #[A-Fa-f0-9]+'))([^>])*>
this works for me (no parenthesis at the beginning):
/<(?!(br|\/br|p|\/p|b|\/b|u|\/u|ol|\/ol|ul|\/ul|li|\/li))([^>])+>/gi
I have developed a tool with source code. It strips all tags except those in an exception list provided by the user. Try this HTML Tag Stripper.
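Pulling the pieces of this thread together, here is a minimal sketch of a whitelist that also lets the colored span from the question through. It rests on several assumptions: the style attribute uses double quotes as in the question, only a hex color is accepted, and each whitelisted tag name must be followed directly by the end of the tag, so that <b> is allowed but <body> or <button> is not (the prefix-only lookahead in the question would spare those too). The names disallowedTag and stripUnlisted are hypothetical.

```javascript
// Minimal sketch: strip every tag that is not explicitly whitelisted.
// The lookahead lists three allowed shapes:
//   1. <br>, </p>, <li>, ... : a listed name, optional slash, then ">"
//   2. </span>               : closing span
//   3. <span style="color: #rrggbb"> : span with only a hex color style
var disallowedTag = new RegExp(
  '<(?!' +
    '\\/?(?:br|p|b|u|ol|ul|li)\\s*\\/?>' + '|' +
    '\\/span\\s*>' + '|' +
    'span\\s+style="color:\\s*#[0-9a-f]{3,8}"\\s*>' +
  ')[^>]+>',
  'gi'
);

function stripUnlisted(html) {
  return html.replace(disallowedTag, '');
}

console.log(stripUnlisted(
  '<div><p>Hi <span style="color: #000000">x</span> <i>y</i><br></p></div>'
));
// <p>Hi <span style="color: #000000">x</span> y<br></p>
```

One limitation of this approach: a closing </span> is always kept, even when its opening span was stripped for carrying other attributes, because a regex cannot pair closing tags with their openers.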
