Javascript regex to find a string and extract it from whole string - javascript

I have a JavaScript array of strings that contains URLs like:
http://www.example.com.tr/?first=DSPN47ZTE1BGMR&second=NECEFT8RYD
http://www.example.com.tr/?first=RTR22414242144&second=YUUSADASFF
http://www.example.com.tr/?first=KOSDFASEWQESAS&second=VERERQWWFA
http://www.example.com.tr/?first=POLUJYUSD41234&second=13F241DASD
http://www.example.com.tr/?first=54SADFD14242RD&second=TYY42412DD
I want to extract the "first" query parameter values from these URLs.
I mean I need the values DSPN47ZTE1BGMR, RTR22414242144, KOSDFASEWQESAS, POLUJYUSD41234, 54SADFD14242RD
Because I am not good with regex, I couldn't find a way to extract these values from the array. Any help will be appreciated.

Instead of using regex, why not just create a URL object out of the string and extract the parameters natively?
let url = new URL("http://www.example.com.tr/?first=54SADFD14242RD&second=TYY42412DD");
console.log(url.searchParams.get("first")); // -> "54SADFD14242RD"
If you don't know the name of the first parameter, you can still search the query string exposed by the URL object manually.
let url = new URL("http://www.example.com.tr/?first=54SADFD14242RD&second=TYY42412DD");
console.log(url.search.match(/\?[^=&]+=([^&]*)/)[1]); // -> "54SADFD14242RD"
Index 1 of the result holds the first capture group (index zero is the whole matched string). Note that .match returns null when nothing matches, so the code above throws an error if the URL has no parameters.
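Applied to the whole array from the question, the URL approach might look like the sketch below. The helper name extractParam and the sample urls array (including the deliberately invalid entry) are illustrative and not part of the original answer.

```javascript
// Hypothetical helper: returns the value of `name` from a URL string,
// or null if the string is not a valid URL or the parameter is absent.
function extractParam(urlString, name) {
  try {
    return new URL(urlString).searchParams.get(name);
  } catch (err) {
    return null; // new URL() throws a TypeError on invalid input
  }
}

const urls = [
  "http://www.example.com.tr/?first=DSPN47ZTE1BGMR&second=NECEFT8RYD",
  "http://www.example.com.tr/?first=RTR22414242144&second=YUUSADASFF",
  "not a url", // illustrative bad input
];

const firstValues = urls.map((u) => extractParam(u, "first"));
console.log(firstValues);
```

Returning null instead of throwing keeps the map call from aborting on one bad entry; whether that is the right trade-off depends on how the array is produced.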

Does it have to use regex? Would something like the following work:
var x = 'http://www.example.com.tr/?first=DSPN47ZTE1BGMR&second=NECEFT8RYD';
x.split('?first=')[1].split('&second')[0];

Try this regex:
first=([^&]*)
Capture the contents of Group 1
Explanation:
first= - matches first=
([^&]*) - matches 0+ occurrences of any character that is not a & and stores them in Group 1
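In JavaScript, reading Group 1 from that pattern could be sketched as follows. The function name firstParam is hypothetical, and the leading [?&] is an added assumption (not in the answer's regex) so that a parameter such as notfirst= is not picked up by accident.

```javascript
// Non-global regex, so exec() keeps no lastIndex state between calls.
// The [?&] prefix is an assumption added here to anchor on a parameter boundary.
const pattern = /[?&]first=([^&]*)/;

// Hypothetical helper: returns the captured value or null when there is no match.
function firstParam(url) {
  const m = pattern.exec(url);
  return m ? m[1] : null;
}

console.log(firstParam("http://www.example.com.tr/?first=KOSDFASEWQESAS&second=VERERQWWFA"));
console.log(firstParam("http://www.example.com.tr/?second=VERERQWWFA"));
```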

You can use
(?<=\?first=)[^&]+
(?<=\?first=) - positive lookbehind that requires ?first= immediately before the match
[^&]+ - matches one or more characters up to the next & (a lazy +? here would stop after a single character)
Without a positive lookbehind, you can do it like this:
let str = `http://www.example.com.tr/?first=DSPN47ZTE1BGMR&second=NECEFT8RYD
http://www.example.com.tr/?first=RTR22414242144&second=YUUSADASFF
http://www.example.com.tr/?first=KOSDFASEWQESAS&second=VERERQWWFA
http://www.example.com.tr/?first=POLUJYUSD41234&second=13F241DASD
http://www.example.com.tr/?first=54SADFD14242RD&second=TYY42412DD`
let op = str.match(/\?first=([^&]+)/g).map(e=> e.split('=')[1])
console.log(op)
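A variant that reads the capture group directly instead of splitting each match on = is String.prototype.matchAll. This is a sketch only; the variable names text and values are illustrative, and \s is excluded from the character class as an added assumption so a match cannot run across a line break.

```javascript
const text = `http://www.example.com.tr/?first=DSPN47ZTE1BGMR&second=NECEFT8RYD
http://www.example.com.tr/?first=RTR22414242144&second=YUUSADASFF`;

// matchAll requires the g flag and yields one match array per occurrence;
// index 1 of each match array is the first capture group.
const values = Array.from(text.matchAll(/\?first=([^&\s]+)/g), (m) => m[1]);
console.log(values);
```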

Related

Whats wrong with this regex logic

I am trying to fetch the value after the equals sign. It works, but I am getting duplicated values. Any idea what's wrong here?
// Regex for finding a word after "=" sign
var myregexpNew = /=(\S*)/g;
// Regex for finding a word before "=" sign
var mytype = /(\S*)=/g;
//Setting data from Grid Column
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
var newtype = mytype.exec(strNew);
alert(matchNew);
https://jsfiddle.net/6vjjv0hv/
exec returns an array, the first element is the global match, the following ones are the submatches, that's why you get ["=20", "20"] (using console.log here instead of alert would make it clearer what you get).
When looking for submatches and using exec, you're usually interested in the elements starting at index 1.
As for the parsing as a whole, there are clearly better solutions, such as using a single regex with two submatches, but it depends on the real goal.
You can try without using Regex like this:
var val = 'QCById=20';
var myString = val.slice(val.indexOf("=") + 1); // slice instead of the deprecated substr
alert(myString);
At present, exec is also returning the whole matched value to you.
REGEXP.exec(SOMETHING) returns an array (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec).
The first item in the array is the full match and the remaining items are the parenthesized substring matches.
You do not get duplicated values, you just get an array of a matched value and the captured text #1.
See RegExp#exec() help:
If the match succeeds, the exec() method returns an array and updates properties of the regular expression object. The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched containing the text that was captured.
Just use the [1] index to get the captured text only.
var myregexpNew = /=(\S*)/g;
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
if (matchNew) {
console.log(matchNew[1]);
}
To get values on both sides of =, you can use /(\S*)=(\S*)/g regex:
var myregexpNew = /(\S*)=(\S*)/g;
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
if (matchNew) {
console.log(matchNew[1]);
console.log(matchNew[2]);
}
Also, you may want to check that the captured values are not undefined or empty, since \S* may capture an empty string. Alternatively, use the /(\S+)=(\S+)/g regex, which requires at least one non-whitespace character before and after the = sign.
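If the input can hold several key=value pairs, both sides can be collected in one pass. The sketch below assumes whitespace-separated pairs; the sample input string and the name pairs are made up for illustration, and [^\s=]+ is used for the key so that it cannot swallow an = sign.

```javascript
const input = "QCById=20 Status=Open"; // illustrative input, not from the question

// Each match array holds the key in group 1 and the value in group 2.
const pairs = Object.fromEntries(
  Array.from(input.matchAll(/([^\s=]+)=(\S+)/g), (m) => [m[1], m[2]])
);
console.log(pairs);
```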

How to extract a particular text from url in JavaScript

I have a url like http://www.somedotcom.com/all/~childrens-day/pr?sid=all.
I want to extract childrens-day. How can I get that? Right now I am doing it like this:
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
url.match('~.+\/');
But what I am getting is ["~childrens-day/"].
Is there a short and sweet way (there surely is) to get the above text without the ["~ and /"], i.e. just childrens-day?
Thanks
You could use a negated character class and a capture group ( ) and refer to capture group #1. The caret (^) inside of a character class [ ] is considered the negation operator.
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
var result = url.match(/~([^~]+)\//);
console.log(result[1]); // "childrens-day"
Note: If you have many URLs inside one string, you may want to add the ? quantifier for a non-greedy match.
var result = url.match(/~([^~]+?)\//);
Like so:
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
var matches = url.match(/~(.+?)\//);
console.log(matches[1]);
Working example: http://regex101.com/r/xU4nZ6
Note that you passed your pattern as a string rather than as a delimited regex literal; match() converts a string argument to a RegExp, which is why it still produced a result.
Use non-capturing groups with a captured group then access the [1] element of the matches array:
(?:~)(.+)(?:/)
Keep in mind that you will need to escape your / if using it also as your RegEx delimiter.
Yes, it is.
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
url.match('~(.+)\/')[1];
Just wrap what you need in a parenthesized group. No other modifications to your code are needed.
References: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
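You could also do a string replace. Note that replace returns a new string rather than modifying url, and a string pattern replaces only the first occurrence, so stripping everything around the wanted text takes a regex:
var result = url.replace(/^.*~|\/.*$/g, ''); // "childrens-day"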
http://www.w3schools.com/jsref/jsref_replace.asp
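A regex-free alternative is to let the URL constructor isolate the path and then pick the segment that starts with ~. This is a sketch under the assumption that exactly one path segment begins with ~; the names pageUrl, segment and name are illustrative.

```javascript
const pageUrl = new URL("http://www.somedotcom.com/all/~childrens-day/pr?sid=all");

// pathname excludes the query string, so "?sid=all" never gets in the way.
const segment = pageUrl.pathname.split("/").find((part) => part.startsWith("~"));

// Drop the leading "~"; null signals that no such segment exists.
const name = segment ? segment.slice(1) : null;
console.log(name);
```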

RegEx - Get All Characters After Last Slash in URL

I'm working with a Google API that returns IDs in the format below, which I've saved as a string. How can I write a regular expression in JavaScript to trim the string to only the characters after the last slash in the URL?
var id = 'http://www.google.com/m8/feeds/contacts/myemail%40gmail.com/base/nabb80191e23b7d9'
Don't write a regex! This is trivial to do with string functions instead:
var final = id.substr(id.lastIndexOf('/') + 1);
It's even easier if you know that the final part will always be 16 characters:
var final = id.substr(-16);
A slightly different regex approach:
var afterSlashChars = id.match(/\/([^\/]+)\/?$/)[1];
Breaking down this regex:
\/ match a slash
( start of a captured group within the match
[^\/] match a non-slash character
+ match one or more of the non-slash characters
) end of the captured group
\/? allow one optional / at the end of the string
$ match to the end of the string
The [1] then retrieves the first captured group within the match
Working snippet:
var id = 'http://www.google.com/m8/feeds/contacts/myemail%40gmail.com/base/nabb80191e23b7d9';
var afterSlashChars = id.match(/\/([^\/]+)\/?$/)[1];
// display result
document.write(afterSlashChars);
Just in case someone else comes across this thread and is looking for a simple JS solution:
id.split('/').pop()
This one is easy to understand: (?!.*/).+
Let me explain.
First, consider everything up to and including a slash.
That's the part we don't want.
.*/ matches everything up to the last slash
Then we wrap it in a negative lookahead, (?!...), to say "no match may start where a slash still lies ahead".
(?!.*/) is the negative lookahead
Now we can take whatever follows the part we don't want with this:
.+
You may need to escape the / so that it becomes:
(?!.*\/).+
this regexp: [^\/]+$ - works like a champ:
var id = ".../base/nabb80191e23b7d9"
result = id.match(/[^\/]+$/)[0];
// results -> "nabb80191e23b7d9"
This should work:
last = id.match(/\/([^/]*)$/)[1];
//=> nabb80191e23b7d9
I don't know JS, so this borrows from the other examples (and a guess):
id = id.match(/[^\/]*$/)[0]; // match returns an array, so [0] is needed to get the string
Why not use replace?
"http://google.com/aaa".replace(/(.*\/)*/,"")
yields "aaa"
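The string-function approach can be wrapped in a small helper that also tolerates a single trailing slash, which the plain lastIndexOf version does not. The function name lastSegment is hypothetical, and handling the trailing slash is an added assumption about the input.

```javascript
// Hypothetical helper: returns the text after the last "/" in str,
// ignoring one trailing slash if present.
function lastSegment(str) {
  const trimmed = str.endsWith("/") ? str.slice(0, -1) : str;
  // lastIndexOf returns -1 when there is no slash, so slice(0) yields the whole string.
  return trimmed.slice(trimmed.lastIndexOf("/") + 1);
}

const id = "http://www.google.com/m8/feeds/contacts/myemail%40gmail.com/base/nabb80191e23b7d9";
console.log(lastSegment(id));
console.log(lastSegment(id + "/"));
```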

Get Second to last character position from string using jQuery

I have a dynamically formed string like - part1.abc.part2.abc.part3.abc
In this string I want to get the substring that starts after the second-to-last occurrence of ".", so that I get part3.abc
Is there any direct method available to get this?
You could use:
'part1.abc.part2.abc.part3.abc'.split('.').slice(-2).join('.'); // 'part3.abc'
You don't need jQuery for this.
This has nothing to do with jQuery. You can use a regular expression:
var s = 'part1.abc.part2.abc.part3.abc';
var re = /[^.]+\.[^.]+$/;
var match = s.match(re);
if (match) {
alert(match[0]);
}
or
'part1.abc.part2.abc.part3.abc'.match(/[^.]+\.[^.]+$/)[0];
but the first is more robust.
You could also use split and get the last two elements from the resulting array (if they exist).
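The split-based idea, with the "if they exist" check made explicit, might be sketched like this. The function name lastTwoParts and the choice to return null for inputs with fewer than two parts are illustrative assumptions.

```javascript
// Hypothetical helper: returns the last two dot-separated parts joined by ".",
// or null when the string has fewer than two parts.
function lastTwoParts(s) {
  const parts = s.split(".");
  return parts.length >= 2 ? parts.slice(-2).join(".") : null;
}

console.log(lastTwoParts("part1.abc.part2.abc.part3.abc"));
console.log(lastTwoParts("single"));
```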

Extract text from HTML with Javascript regex

I am trying to parse a webpage and to get the number reference after <li>YM#. For example I need to get 1234-234234 in a variable from the HTML that contains
<li>YM# 1234-234234 </li>
Many thanks for your help someone!
Rich
Currently, your regex only matches if there is a single digit before the dash and a single digit after it. This version accepts one or more digits in each place, and \s* allows for the space after YM# in your HTML:
/YM#\s*[0-9]+-[0-9]+/g
Then you also need to capture the number, so we use a capture group:
/YM#\s*([0-9]+-[0-9]+)/g
Then we need to refer to the capture group, so we use the following code instead of String.match:
var regex = /YM#\s*([0-9]+-[0-9]+)/g;
var match = regex.exec(text);
var id = match ? match[1] : null; // exec returns null when nothing matches
// 0: match of entire regex
// after that, each of the groups gets a number
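(?<=<li>YM#\s)([\d-]+)
http://regexr.com?30ng5
This will match the numbers. Note that it needs a lookbehind, (?<=...), to require the <li>YM# prefix before the match; a negative lookahead, (?!...), would not.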
Try this:
(<li>[^#<>]*?# *)([\d\-]+)\b
and get the result in $2.
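Put together, a guarded extraction could look like the sketch below. The html sample string and the name ymId are illustrative, and \d+-\d+ assumes the reference always has the digits-dash-digits shape shown in the question.

```javascript
const html = "<ul><li>YM# 1234-234234 </li></ul>"; // sample markup for illustration

// \s* tolerates the space after "YM#"; group 1 holds the reference itself.
const found = /YM#\s*(\d+-\d+)/.exec(html);

// exec returns null when nothing matches, so guard before indexing.
const ymId = found ? found[1] : null;
console.log(ymId);
```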
