On my site, there is text that may contain a mention of an account, like #Melon or #Banana. How can I use JavaScript to automatically link these to their respective accounts (e.g. #Banana -> example.com/users/Banana)? I also want to make sure that if someone writes something like "#Banana's", only "#Banana" is linked. Please comment if this is confusing.
Thanks!
Here you go, using a regex, but it basically depends on what a username can look like:
// Assumes usernames consist of letters, digits, underscores and hyphens
const regex = /#([\w-]+)/g;
const testCases = ["#Banana<br/>'s", "#Fruit", "#MOnkeys-12342<br>", "#International_Alien234"];
console.log(testCases.map(str => ({original: str, replace: str.replaceAll(regex, '<a href="example.com/users/$1">$&</a>')})));
You can match them with a regex:
text.replace(/(^|\s)#([^\s]+)(\s|$)/g, '$1<a href="example.com/users/$2">#$2</a>$3')
For the case #Banana's, you'll need to define which characters are part of the link and which characters terminate the #mention.
This regex assumes the link starts with # and ends with a whitespace.
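One way to make "which characters are part of the link" concrete is to allow only word characters in the name, so the apostrophe in #Banana's ends the mention. This is a minimal sketch, not a definitive implementation: the helper name linkMentions, the https://example.com/users/ base URL, and the rule that usernames are letters, digits and underscores are all assumptions.

```javascript
// Hypothetical helper: wraps each #mention in a link to its profile page.
// Assumption: a username is one or more word characters (letters, digits, underscore).
function linkMentions(text) {
  return text.replace(
    // (^|\s) keeps the mention at the start of the text or after whitespace;
    // the trailing whitespace is not consumed, so adjacent mentions still match.
    /(^|\s)#(\w+)/g,
    '$1<a href="https://example.com/users/$2">#$2</a>'
  );
}

console.log(linkMentions("Ask #Banana's friend #Melon."));
```

Because the pattern stops at the first non-word character, the "'s" and the final period stay outside the link.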
I am using Sublime Text to write some JavaScript and need to do a simple text replacement in the editor in order to set code up. I can do it manually, but I figured there must be a way to have the replacement occur automatically with RegEx. I've used RegEx a bunch before, but I have never used it to grab data from one part of the code to reference and edit another part of the code. For example, I have this:
var example_1 = 836;
var example_2 = 837;
var example_3 = 838;
var example_4 = 846;
And then I have this:
SELECT_122=836
SELECT_143=837
SELECT_144=838
SELECT_145=846
I want these to use the corresponding values and format them like this:
SELECT_122: example_1,
SELECT_143: example_2,
SELECT_144: example_3,
SELECT_145: example_4
Note that I'm updating the equal signs to colons with spaces so I figured doing all these changes could be done with some sort of search and replace. I have a large amount of these so I figured it would be best to learn how to do this if it's possible.
I don't have SublimeText, but you said in a comment that you want to do it through a text editor. Here is what works for me in EditPad Pro, it may work in Sublime.
Search:
(?s)(var (example_\d++) = (\d++).*?SELECT_\d++)=\3
Replace:
\1: \2,
Then I click "Replace". This will replace the first instance (SELECT_122=836) with "SELECT_122: example_1,"
Then I click "Replace Next" multiple times, and the SELECT_ strings are left looking like this:
SELECT_122: example_1,
SELECT_143: example_2,
SELECT_144: example_3,
SELECT_145: example_4,
Is this what you want?
Hope the regex and replacement string at least get you started. :)
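If the editor's regex engine does not cooperate, the same mapping can be scripted. The following is a rough JavaScript sketch under the assumption that both blocks are available as strings; the variable names declarations, selects and nameByValue are made up for illustration.

```javascript
// Sample input copied from the question.
const declarations = `var example_1 = 836;
var example_2 = 837;
var example_3 = 838;
var example_4 = 846;`;

const selects = `SELECT_122=836
SELECT_143=837
SELECT_144=838
SELECT_145=846`;

// Map each numeric value to the variable name that holds it.
const nameByValue = new Map();
for (const [, name, value] of declarations.matchAll(/var (\w+) = (\d+);/g)) {
  nameByValue.set(value, name);
}

// Rewrite "SELECT_x=value" as "SELECT_x: variableName".
// If no variable holds the value, the raw value is kept.
const result = selects
  .split('\n')
  .map(line => {
    const [key, value] = line.split('=');
    return `${key}: ${nameByValue.get(value) || value}`;
  })
  .join(',\n');

console.log(result);
```

Joining with ",\n" leaves the last line without a trailing comma, which matches the format asked for in the question.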
Given a regex like this:
http://rubular.com/r/ai1LFT5jvK
I want to use string.replace to replace "subdir" with a string of my choosing.
Doing myStr.replace(/^.*\/\/.*\.net\/.*\/(.*)\/.*\z/,otherStr)
only returns the same string, as shown here: http://jsfiddle.net/nLmbV/
If you view the Rubular, it appears to capture what I want it to capture, but on the Fiddle, it doesn't replace it.
I'd like to know why this happens, and what I'm doing wrong. A correct regex or a correct implementation of the replace call would be nice, but most of all, I want to understand what I'm doing wrong so that I can avoid it in the future.
EDIT
I've updated the fiddle to change my regex from:
/^.*\/\/.*\.net\/.*\/(.*)\/.*\z/
to
/^.*\/\/.*\.net\/.*\/(.*)\/.*$/
And according to the fiddle, it just returns hello instead of https://xxxxxxxxxxx.cloudfront.net/dir/hello/Slide1_v2.PNG
It's that little \z in your regex.
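You probably forgot to replace it with a $ sign. JavaScript has no \z anchor: in a JavaScript regex (without the u flag), \z simply matches a literal z, so the pattern never matches your URL. JavaScript uses ^ and $ as anchors, while Ruby uses \A and \z.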
To answer your edit:
The match is always replaced as a whole. You'll want to group both the left side and the right side of the to-be-replaced part and reinsert it in the replacement:
url.replace(/^(.*\/\/.*\.net\/.*\/).*(\/.*)$/,"$1hello$2")
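The three behaviours discussed here can be compared side by side. This is only an illustrative sketch: the input URL with "subdir" in it is an assumption reconstructed from the question, and the variable names are made up.

```javascript
// Assumed input: the question's URL with "subdir" as the directory to replace.
const url = "https://xxxxxxxxxxx.cloudfront.net/dir/subdir/Slide1_v2.PNG";

// 1. With \z: JavaScript treats \z as a literal "z", so nothing matches
//    and replace() returns the string unchanged.
const withZ = url.replace(/^.*\/\/.*\.net\/.*\/(.*)\/.*\z/, "hello");

// 2. With $: the regex matches the whole URL, and the whole match
//    (not just the capture group) is replaced.
const wholeMatch = url.replace(/^.*\/\/.*\.net\/.*\/(.*)\/.*$/, "hello");

// 3. Capture what surrounds the target and put it back in the replacement.
const grouped = url.replace(/^(.*\/\/.*\.net\/.*\/).*(\/.*)$/, "$1hello$2");

console.log(withZ);      // same as url
console.log(wholeMatch); // hello
console.log(grouped);    // https://xxxxxxxxxxx.cloudfront.net/dir/hello/Slide1_v2.PNG
```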
Before I get marked down: I know the question asks about regexps. The reason for this answer is that URLs are nearly impossible to process reliably with a regexp without writing fiendishly complex patterns. It can be done, but it makes your head hurt!
If you are doing this in a browser, you can use an A tag in your script to make things much simpler. The A tag knows how to parse them into pieces, and it lets you modify the pieces independently, so you only need to deal with the pathname:
//make a temporary a tag
var a = document.createElement('a');
//set the href property to the url you want to process
a.href = "scheme://host.domain/path/to/the/file?querystring"
//grab the path part of the url, and chop up into an array of directories
var dirs = a.pathname.split('/');
//set 2nd dir name - array is ['','path','to','the','file']
dirs[2]='hello';
//put the path back together
a.pathname = dirs.join('/');
a.href now contains the URL you want.
More lines, but also more hair left when you come back to change the code later.
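Outside a page (or if you prefer not to create an element), the standard URL class gives the same kind of parsed access to the path. This is a sketch under assumptions: replacePathSegment is a hypothetical helper name, and the sample URL is the one from the question with "subdir" as the directory to change.

```javascript
// Hypothetical helper: replaces one segment of a URL's path.
// index counts the pieces of pathname.split('/'), so index 0 is the
// empty string before the leading slash.
function replacePathSegment(urlString, index, newName) {
  const url = new URL(urlString);
  const parts = url.pathname.split('/');
  parts[index] = newName;
  url.pathname = parts.join('/');
  return url.href;
}

const updated = replacePathSegment(
  "https://xxxxxxxxxxx.cloudfront.net/dir/subdir/Slide1_v2.PNG",
  2,
  "hello"
);
console.log(updated);
```

As with the A tag, the query string and host are left alone because only pathname is touched.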
I have the following code that is used to turn http URLs in text into anchor tags. It's looking for anything starting with http, surrounded by white space (or the beginning/end of input)
function linkify (str) {
var regex = /(^|\s)(https?:\/\/\S+)($|\s)/ig;
return str.replace(regex,'$1<a href="$2">$2</a>$3')
}
// This works
linkify("Go to http://www.google.com and http://yahoo.com");
// This doesn't, yahoo.com doesn't become a link
linkify("Go to http://www.google.com http://yahoo.com");
The case where it doesn't work is when there is only a single space between two links. I'm assuming that's because the space between the two links can't be used to match both URLs: after the first match, the space after the URL has already been consumed.
To play with: http://jsfiddle.net/NgMw8/
Can somebody suggest a regex way of doing this? I could scan the string myself, but I'm looking for a regex way of doing it (or some way that doesn't require scanning the string myself and building a new string on my own).
Don't capture the final \s. This way, the second url will match the preceding \s, as required:
function linkify (str) {
var regex = /(^|\s)(https?:\/\/\S+)/ig;
return str.replace(regex,'$1<a href="$2">$2</a>')
}
http://jsfiddle.net/NgMw8/3/
Just use a positive lookahead when matching your final $|\s, like so:
var regex = /(^|\s)(https?:\/\/\S+)(?=($|\s))/ig;
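A quick check of the lookahead version against the failing input from the question. The anchor markup in the replacement string is an assumption about what the link should look like; the lookahead is written as (?=\s|$), which behaves the same as (?=($|\s)) without the extra capture group.

```javascript
function linkify(str) {
  // (?=\s|$) requires whitespace or end of input after the URL
  // without consuming it, so the next URL can still match its leading space.
  const regex = /(^|\s)(https?:\/\/\S+)(?=\s|$)/ig;
  return str.replace(regex, '$1<a href="$2">$2</a>');
}

console.log(linkify("Go to http://www.google.com http://yahoo.com"));
```

Both URLs come out wrapped, because the single space between them is only looked at by the first match and is still available as the (^|\s) prefix of the second.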
None of these will work if an HTML element is stuck to the URL.
See the similar question and its answers.
Some solutions can handle URLs like "test.com/anothertest?get=letsgo" and prepend http://
A workaround may be needed to handle https and miscellaneous TLDs.
I can't post the exact data I'm trying to extract, but here's a basic scenario with the same outcome. I'm grabbing the body of a page and trying to extract a bit.ly link from it. So let's say, for example, this is the chunk of data I'm trying to grab the link from.
String:
http://bit.ly/Pq8AkS</div><div class="shareUnit"><div class="-cx-PRIVATE-fbTimelineExternalShareUnit__wrapper"><div><div class="-cx-PRIVATE-fbTimelineExternalShareUnit__root -cx-PRIVATE-fbTimelineExternalShareUnit__hasImage"><a class="-cx-PRIVATE-fbTimelineExternalShareUnit__video -cx-PRIVATE-fbTimelineExternalShareUnit__image -cx-PRIVATE-fbTimelineExternalShareUnit__content" ajaxify="/ajax/flash/expand_inline.php?target_div=uikk85_59&share_id=271663136271285&max_width=403&max_height=403&context=timelineSingle" rel="async" href="#" onclick="CSS.addClass(this, "-cx-PRIVATE-fbTimelineExternalShareUnit__loading");CSS.removeClass(this, "-cx-PRIVATE-fbTimelineExternalShareUnit__video");"><i class="-cx-PRIVATE-fbTimelineExternalShareUnit__play"></i><img class="img" src="http://external.ak.fbcdn.net/safe_image.php?d=AQDoyY7_wjAyUtX2&w=155&h=114&url=http%3A%2F%2Fi1.ytimg.com%2Fvi%2FDre21lBu2zU%2Fmqdefault.jpg" alt="" /></a>
Now, I can get what I'm looking for with the following code, but the link isn't always going to be exactly 6 characters long. So this causes an issue...
Body = document.getElementsByTagName("body")[0].innerHTML;
regex = /2Fbit.ly%2F(.{6})&h/g;
Matches = regex.exec(Body);
Here's what I was originally trying, but the problem is that it grabs too much data. It goes all the way to the last "&h" in the string above instead of stopping at the first one it hits.
Body = document.getElementsByTagName("body")[0].innerHTML;
regex = /2Fbit.ly%2F(.*)&h/g;
Matches = regex.exec(Body);
So basically the main part of the string I'm trying to focus on is "%2Fbit.ly%2FPq8AkS&h", so that I can get the "Pq8AkS" out of it. When I use (.*), it grabs everything between "%2F" and the very last "&h" in the large string above.
You should not be using a regex on HTML. Use DOM functions to get the desired link object, then get the href attribute from that, then you can use a regex on just the href.
By default, .* is greedy, meaning it matches as much as it can while still allowing the overall match to succeed. If you want it to be non-greedy (match as little as possible), use .*? instead, like this:
regex = /2Fbit.ly%2F(.*?)&h/;
I also don't think you want the g flag on the regex as there should only be one match in the right URL.
If you show the rest of your HTML, we could offer advice on finding the right link object rather than trying to match the entire body HTML.
FYI, another trick for a non-greedy match is to do something like this:
regex = /2Fbit.ly%2F([^&]*)&h/;
Which matches a series of characters that are not & followed by &h which accomplishes the same goal as long as & can't be in the matched sequence.
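The difference between the three patterns shows up on any string that contains more than one "&h". The sample string below is made up for illustration (it is not the page from the question), and the dot in bit.ly is escaped here so it only matches a literal dot.

```javascript
// Made-up sample with two "&h" occurrences after the bit.ly code.
const sample = "url=http%3A%2F%2Fbit.ly%2FPq8AkS&h=114&w=155&h=200";

// Greedy: (.*) runs to the LAST "&h".
const greedy = /2Fbit\.ly%2F(.*)&h/.exec(sample)[1];

// Lazy: (.*?) stops at the FIRST "&h".
const lazy = /2Fbit\.ly%2F(.*?)&h/.exec(sample)[1];

// Negated class: [^&]* cannot cross an "&", so it also stops at the first one.
const negated = /2Fbit\.ly%2F([^&]*)&h/.exec(sample)[1];

console.log(greedy);  // Pq8AkS&h=114&w=155
console.log(lazy);    // Pq8AkS
console.log(negated); // Pq8AkS
```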
By default, + and * are greedy and match as much as possible. You need a non-greedy match for your (.*). A quick search gives the solution as
? directly following a quantifier makes the quantifier non-greedy (makes it match minimum instead of maximum of the interval defined).
So try changing your regex= line to
regex = /2Fbit.ly%2F(.*?)&h/g;
Edit: #jfriend00's answer below is more complete.
I have an existing replace that matches http within a text string and creates a working URL from the text.
Working Example:
var Text = "Visit Gmail at http://gmail.com"
var linkText = Text.replace(/http:\/\/\S+/gi, '<a href="$&">$&</a>');
document.write(linkText);
Output:
Visit Gmail at http://gmail.com
Problem:
The problem arises when the link appears at the end of a sentence and the punctuation incorrectly becomes appended to the end of the URL.
Can someone advise on a way of extending my regex (or maybe adding a second replacement after this has been transformed) to overcome this?
I think the right answer will include adding something along the lines of /\W$/g to my original regex, but I can't see how this can be applied to just one word within the whole string.
As always, very grateful for any help.
Thanks,
Pete
Examples of problem links
http://gmail.com/.
http://gmail.com,
http://gmail.com/?
http://gmail.com!
All of these should resolve the link to http://gmail.com
Note how some could end in a slash then punctuation and others with punctuation directly after the domain name.
Try
/http:\/\/(.(?![.?] |$))*/
My logic is, if the last char is a dot, or question mark followed by either a space or end of string, you don't need it.
var Text = "Visit Gmail at http://gmail.com"
var linkText = Text.replace(/http:\/\/(.(?![.?](?:\s|$)))*./gi, '<a href="$&">$&</a>');
document.write(linkText);
Gives
"Visit Gmail at http://gmail.com"
Edit:
This may be better (it doesn't match white space now)
http:\/\/(.(?!(?:[.?](?: |$))))*.
Why not just use a negated character class?
/http:\/\/\S+[^.,?!]/gi
You could account for trailing unwanted characters, whether stripping them or not.
The replacement for both is capture buffer 1: <a href="$1">$1<\/a>
This also assumes you can do lookbehind, though I'm not sure whether client-side JS supports lookbehind assertions.
Strip unwanted chars
/(http:\/\/\S+)(?<![\/.,?!])[\/.,?!]*/
Or, leave unwanted characters
/(http:\/\/\S+)(?<![\/.,?!])/
Alternate, using lookahead
Strip
/(http:\/\/\S+?(?=[\/.,?!]+(?:\s|$)|\s|$))[\/.,?!]*/
Leave
/(http:\/\/\S+?(?=[\/.,?!]+(?:\s|$)|\s|$))/
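To see how the two lookahead variants behave, they can be run over the problem links from the question. This is just a sketch: the g flag is added so every URL in a text is handled, and the names leaveRegex, stripRegex and anchor are made up for illustration.

```javascript
// "Leave" variant: the URL stops before trailing punctuation,
// and the punctuation stays in the text after the link.
const leaveRegex = /(http:\/\/\S+?(?=[\/.,?!]+(?:\s|$)|\s|$))/g;

// "Strip" variant: the trailing punctuation is part of the match
// but not of capture group 1, so it is dropped from the output.
const stripRegex = /(http:\/\/\S+?(?=[\/.,?!]+(?:\s|$)|\s|$))[\/.,?!]*/g;

// Replacement using capture group 1, as described above.
const anchor = '<a href="$1">$1</a>';

const samples = ["http://gmail.com/.", "http://gmail.com,", "http://gmail.com/?", "http://gmail.com!"];
for (const sample of samples) {
  console.log(sample.replace(leaveRegex, anchor), "|", sample.replace(stripRegex, anchor));
}
```

In all four cases the href is http://gmail.com; the only difference is whether the punctuation survives after the closing tag.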