Javascript function that gets a string with URLs [duplicate]

This question already has answers here:
How to replace plain URLs with links?
(25 answers)
Closed 8 years ago.
I have a JavaScript function that receives a string.
The string may contain URLs, with or without a leading http:// or https://.
Example string:
This is an example string with www.cnn.com and http://microsoft.com
My goal is to wrap each URL in an anchor tag so that the URLs are clickable when the string is injected as HTML into a document.
Is there anything I can use that does this? Some jQuery function?

There is no native jQuery function for this, but a regular expression can do it.
Let me point you to this blog post, which will get you started. It contains a regular expression (it may look complicated, but it is not that bad) that detects links and email addresses in strings, so you can find them and replace them with anchors that point to the extracted URLs.
A possibly better expression (one I have used in the past) was published by Stack Overflow's own Jeff Atwood. He also describes the whole URL situation in detail, so you can decide whether and how much of it applies to your problem.

If you still want a regex:
/((www|http:\/\/)\S*)/ig
This matches www.cnn.com and http://www.microsoft.com in the following example:
This is an example string with www.cnn.com and http://www.microsoft.com
If you use $1 in the replacement, you'll get:
www.cnn.com
http://www.microsoft.com
Remember that this regex doesn't validate the URLs. It just looks for www or http:// and then takes everything up to the next whitespace character, so it does not recognize an https:// prefix and it will include trailing punctuation.
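For what it's worth, here is a minimal sketch of how such a pattern could be used to inject the anchor tags. The function name linkifyText is made up for the example, the pattern is widened to also accept https:// (an assumption, not part of the regex above), and the input is assumed to be already HTML-escaped.

```javascript
// Hypothetical helper: wraps anything that looks like a URL in an anchor tag.
// Assumption: `text` is already HTML-escaped; this sketch does no escaping.
function linkifyText(text) {
  // Widened variant of the pattern above: http://, https:// or a bare "www." prefix.
  const urlPattern = /((?:https?:\/\/|www\.)\S+)/gi;
  return text.replace(urlPattern, function (match) {
    // Bare "www." links need a scheme, otherwise the browser treats them as relative paths.
    const href = /^https?:\/\//i.test(match) ? match : "http://" + match;
    return '<a href="' + href + '">' + match + "</a>";
  });
}

console.log(linkifyText("This is an example string with www.cnn.com and http://microsoft.com"));
```

Like the regex above, this does not validate anything and will swallow trailing punctuation, so treat it as a starting point only.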

Related

Regex to process only certain files [duplicate]

This question already has answers here:
Regex: match everything but a specific pattern
(6 answers)
Closed 3 years ago.
I need a regex (JavaScript) that will process all files except for two.
The file names look like this:
Epic_IKDH_Appt_Phone_Reminders_20191030.txt
Epic_NAMMI_Appt_Phone_Reminders_20191031.txt
QCNIL-Recall_Phone_Reminders_20191029.txt
Epic_SNA_Appt_No_Show_Reminders_20191029.txt
I want to process all files that don't start with QCNIL or Epic_SNA.
I tried this regex, but it doesn't seem to work:
^((?!QCNIL).)*$|^((?!Epic_SNA).)*$
One or the other seems to work, but not both together.
Then I tried this:
^((?!Epic_SNA)(?!QCNIL).)*$
This seems to work, but with my limited knowledge of regular expressions I'm afraid I might be missing something. Basically, if new file names are generated, I want them to be processed as well. The only files I don't want to process are the SNA and QCNIL files.

The second pattern, ^((?!Epic_SNA)(?!QCNIL).)*$, would work, but it is a tempered greedy token: it runs two assertions before matching every single character, which can be costly in the number of steps. It also rejects names that contain QCNIL or Epic_SNA anywhere, not only at the start.
You can simplify the pattern by using a single negative lookahead at the start, asserting that what is directly to the right is not QCNIL or Epic_SNA.
Then match any character except a newline one or more times, which prevents matching an empty string.
^(?!QCNIL|Epic_SNA).+$
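As a quick check, the simplified pattern could be applied to the sample names from the question like this. It is only a sketch; the names processPattern and filesToProcess are invented for the example.

```javascript
// Matches any non-empty name that does NOT start with QCNIL or Epic_SNA.
const processPattern = /^(?!QCNIL|Epic_SNA).+$/;

// Hypothetical helper: keeps only the file names that should be processed.
function filesToProcess(fileNames) {
  return fileNames.filter(function (name) {
    return processPattern.test(name);
  });
}

const sampleNames = [
  "Epic_IKDH_Appt_Phone_Reminders_20191030.txt",
  "Epic_NAMMI_Appt_Phone_Reminders_20191031.txt",
  "QCNIL-Recall_Phone_Reminders_20191029.txt",
  "Epic_SNA_Appt_No_Show_Reminders_20191029.txt"
];

console.log(filesToProcess(sampleNames));
```

With the four sample names, only the Epic_IKDH and Epic_NAMMI files pass the filter, and any new prefix would pass as well.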

You need to perform the "or" inside the exclusion.
Otherwise you are saying "it is not this" or "it is not that", which holds for any name that lacks at least one of the two. You need to say it is not "this or that".
Also, did you intend the whole expression to be repeated by *, or only the .? Try the following, though it has more parentheses than necessary:
^(?!((Epic_SNA)|(QCNIL))).*$

(Discord.js) Bot to delete all URLs posted

I want my bot to delete URLs posted by members.
I'm not sure how to detect that a URL has been posted, as it could start with https or www, or something else altogether.
Any insight is appreciated!
EDIT:
Found a question similar to mine:
Regular expression for URL validation (in JavaScript)
Successfully tackled this using an npm package called Linkify.
Consider this closed.

I'd recommend looking into regular expressions (regex) for URL validation.
Here's another SO question you can use as a starting point: Regular expression for URL validation (in JavaScript)
One recommended approach is a regex like this (from an answer to that question):
/^(ftp|http|https):\/\/[^ "]+$/.test(url)
This expression is anchored with ^ and $, so it only matches when the whole string is a URL. You'll have to either edit the regex so it matches anywhere inside a string, or split the message content (for example, into words) before testing each piece.
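One way to adapt it is to drop the anchors so the pattern can match in the middle of a message. The sketch below is only that: containsUrl is a made-up name, the extra "www." alternative is an assumption added to cover the case from the question, and the bot wiring is left out (inside a message handler you would call something like message.delete() when it returns true).

```javascript
// Unanchored variant: a scheme (ftp/http/https) or a bare "www." followed by
// anything that is not whitespace or a double quote.
// No "g" flag, so test() keeps no state between calls.
const urlInText = /(?:(?:ftp|https?):\/\/|www\.)[^\s"]+/i;

// Hypothetical helper: true if the message text appears to contain a URL.
function containsUrl(content) {
  return urlInText.test(content);
}

console.log(containsUrl("check out https://example.com please"));
console.log(containsUrl("nothing to see here"));
```

This is deliberately loose and will not catch links without a scheme or www prefix; a dedicated package such as the one mentioned in the edit above is likely more thorough.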

Regular expression matching multiple entries, spanning multiple lines [duplicate]

This question already has answers here:
Regular expression to stop at first match
(9 answers)
Closed 2 years ago.
I have this gigantic ugly string:
J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM
J0000010: Project name: E:\foo.pf
J0000011: Job name: MBiek Direct Mail Test
J0000020: Document 1 - Completed successfully
I'm trying to extract pieces from it using regex. In this case, I want to grab everything after Project Name up to the part where it says J0000011: (the 11 is going to be a different number every time).
Here's the regex I've been playing with:
Project name:\s+(.*)\s+J[0-9]{7}:
The problem is that it doesn't stop until it hits the J0000020: at the end.
How do I make the regex stop at the first occurrence of J[0-9]{7}?

Make .* non-greedy by adding '?' after it:
Project name:\s+(.*?)\s+J[0-9]{7}:
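To see the difference, here is a minimal JavaScript sketch. It assumes the log is one line with single spaces between the entries (the exact layout of the original string is a guess), and the firstGroup helper is made up for the example.

```javascript
// Assumption: the whole log is a single line, entries separated by spaces.
const log =
  "J0000000: Transaction A0001401 started on 8/22/2008 9:49:29 AM " +
  "J0000010: Project name: E:\\foo.pf " +
  "J0000011: Job name: MBiek Direct Mail Test " +
  "J0000020: Document 1 - Completed successfully";

const greedy = /Project name:\s+(.*)\s+J[0-9]{7}:/;
const lazy = /Project name:\s+(.*?)\s+J[0-9]{7}:/;

// Hypothetical helper: returns capture group 1, or null when nothing matches.
function firstGroup(pattern, text) {
  const match = pattern.exec(text);
  return match ? match[1] : null;
}

console.log(firstGroup(greedy, log)); // runs on to the last J-number
console.log(firstGroup(lazy, log));   // stops at the first J-number
```

The greedy version captures everything up to the last J-number (J0000020), while the lazy one stops at J0000011 and returns just the path.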

Using a non-greedy quantifier here is probably the best solution, also because it is more efficient than the greedy alternative: greedy matches generally go as far as they can (here, until the end of the text!) and then backtrack character by character to try to match the part that comes afterwards.
However, consider using a negated character class instead:
Project name:\s+(\S*)\s+J[0-9]{7}:
\S means “everything except whitespace”, which is exactly what you want as long as the project path contains no spaces.

Well, ".*" is a greedy selector. You make it non-greedy by using ".*?". With the latter construct, each time the regex engine is about to consume another character with the ".", it first attempts to match whatever may come after the ".*?". This means that if, for instance, nothing comes after the ".*?", it matches nothing.
Here's what I used. s contains your original string. This code is .NET specific, but most flavors of regex will have something similar.
string m = Regex.Match(s, @"Project name: (?<name>.*?) J\d+").Groups["name"].Value;
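For comparison, roughly the same thing in JavaScript, which supports named groups since ES2018. This is a sketch, not a drop-in translation: extractProjectName is an invented name, and like the .NET line it assumes single spaces around the project path.

```javascript
// Hypothetical helper mirroring the .NET one-liner above.
// Returns the "name" group, or null when the pattern does not match.
function extractProjectName(s) {
  const match = /Project name: (?<name>.*?) J\d+/.exec(s);
  return match ? match.groups.name : null;
}

console.log(extractProjectName("J0000010: Project name: E:\\foo.pf J0000011: Job name: MBiek Direct Mail Test"));
```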

I would also recommend experimenting with regular expressions using "Expresso", a great (and free) utility for regex editing and testing.
One of its upsides is that its UI exposes a lot of regex functionality that people inexperienced with regex might not be familiar with, in a way that makes these concepts easy to learn.
For example, when you build a regex in the UI and choose "*", you can tick the "As few as possible" checkbox and see the resulting regex, as well as test its behavior, even if you were unfamiliar with non-greedy expressions before.
Available for download at their site:
http://www.ultrapico.com/Expresso.htm
Express download:
http://www.ultrapico.com/ExpressoDownload.htm

(Project name:\s+[A-Z]:(?:\\\w+)+\.[a-zA-Z]+\s+J[0-9]{7})(?=:)
This will work for you.
Using (?:\\\w+)+\.[a-zA-Z]+ instead of .* is more restrictive: it only matches a drive-letter path that ends in a file extension.

Why concatenate HTML tags when appending to DOM [duplicate]

This question already has answers here:
Javascript - breaking up string literals... why?
(3 answers)
Closed 9 years ago.
I've seen this a few times before, finally decided to find out why.
Given this line of code:
$("body").append('<ifr'+'ame src="foo.html"></ifr'+'ame>');
Why are they concatenating '<ifr'+'ame' and '</ifr'+'ame>'?

I'm a bit unsure whether I should answer this question, since there could be many reasons and only the writer would know. But here goes anyway...
The only reason I can think of (apart from just being annoying) is to prevent something that reads the code from seeing the complete <iframe> tag. For example, if I search the code for "iframe", I will not get any matches.
Now for the speculative part of this answer: why would anybody want to prevent something from finding a matching iframe tag?
Here are a couple of options, as provided by myself and Kevin B via comments:
To prevent a process from scanning the code file and detecting an iframe tag. For example, imagine a content management system that allows HTML/JavaScript but doesn't allow iframes. Part of the upload process might be to analyze the code and look for the tag. The concatenation technique used here would prevent that scan from detecting the iframe, and thus allow the upload.
Parsing software, such as a syntax highlighter, may have problems with HTML mixed into JavaScript, so the concatenation would likely help the parser identify the string as the string it is actually meant to be.
In all of these cases, it is fair to say the technique may also apply to other tags that appear within a JavaScript string, not just the iframe shown in this example.
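To illustrate the scanning scenario, here is a small sketch. The mentionsIframe check is a made-up stand-in for whatever a real scanner would do, which is an assumption on my part.

```javascript
// The line from the question, as a scanner reading the source file would see it.
const sourceText = "$(\"body\").append('<ifr'+'ame src=\"foo.html\"></ifr'+'ame>');";

// The string that is actually built at run time.
const builtMarkup = '<ifr' + 'ame src="foo.html"></ifr' + 'ame>';

// Hypothetical naive scanner: looks for an opening or closing iframe tag.
function mentionsIframe(text) {
  return /<\/?iframe\b/i.test(text);
}

console.log(mentionsIframe(sourceText));  // false: the tag name is split in the source
console.log(mentionsIframe(builtMarkup)); // true: the concatenated string is a normal tag
```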
