I have a question about a simple regex. I need to get the text between these characters: - and ~
My string: Champions tour - To Win1 - To Win2 ~JIM FURYK
When I use \-([^)]+\~), it gives this match:
To Win1 - To Win2 ~
But I need this:
To Win2 ~JIM FURYK
Is it possible to do this?
My regex is here: https://regex101.com/r/fJBLXb/1/
Just add a dash to the negated character class so the match cannot cross another hyphen: \-([^-)]+\~)
Your \-([^)]+\~) regex matches the leftmost - that is directly followed by one or more chars other than ) (so it matches -, a, §, etc.) and then a ~ char. It does not stop at - chars and thus can match across any number of hyphens.
To match the value after the last hyphen you can use
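To see the difference concretely, here is a minimal sketch that runs both patterns against the sample string from the question. The variable names are illustrative only and do not come from the original posts.

```javascript
// Sample string from the question.
const sample = 'Champions tour - To Win1 - To Win2 ~JIM FURYK';

// Original pattern: [^)]+ may run across hyphens, so the match starts at the first "-".
const greedyCapture = sample.match(/\-([^)]+\~)/)[1];

// With "-" added to the negated class, the match can only start at the last "-" before "~".
const boundedCapture = sample.match(/\-([^-)]+\~)/)[1];

console.log(JSON.stringify(greedyCapture));  // " To Win1 - To Win2 ~"
console.log(JSON.stringify(boundedCapture)); // " To Win2 ~"
```

In both cases the capture stops at ~, so the text after it (JIM FURYK) is not included.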
[^\s-][^-]*$
See the regex demo and the regex graph. Details:
[^\s-] - a char other than whitespace and -
[^-]* - zero or more chars other than -
$ - end of string.
See the JavaScript demo:
const text = 'Champions tour - To Win1 - To Win2 ~JIM FURYK';
const match = text.match(/[^\s-][^-]*$/);
if (match) {
console.log(match[0]);
}
You could use match as follows:
var input = "Champions tour - To Win1 - To Win2 ~JIM FURYK";
var output = input.match(/- ([^-]+~.*)$/)[1];
console.log(output);
The regex pattern used above says to match:
- a hyphen
[ ] a single space
( capture what follows
[^-]+ match all content WITHOUT crossing another hyphen
~ a literal ~
.* all remaining content
) stop capture
$ end of the string
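Indexing [1] directly on the result of match() throws a TypeError when the input does not match, because match() returns null in that case. A minimal sketch of a null-safe wrapper around the same pattern follows; the function name lastSegmentWithTilde is hypothetical.

```javascript
// Hypothetical helper: returns the text captured by the pattern above,
// or null when the input does not match at all.
function lastSegmentWithTilde(input) {
  const m = input.match(/- ([^-]+~.*)$/);
  return m ? m[1] : null;
}

console.log(lastSegmentWithTilde('Champions tour - To Win1 - To Win2 ~JIM FURYK')); // "To Win2 ~JIM FURYK"
console.log(lastSegmentWithTilde('no separator here')); // null
```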
If a string contains three or more hyphens, I need to get the substring after the second hyphen.
For example, I have this string:
"someStr1 - someStr2 - someStr3 - someStr4"
As you can see it has 3 hyphens, I need to get from string above substring:
"someStr3 - someStr4"
I know that I need to get the index of the second hyphen and then I can use the substring function.
But I don't know how to check whether there are more than 3 hyphens, or how to find the position of the second hyphen.
You can use the RegEx (?<=([^-]*-){2}).*
(?<=([^-]*-){2}) makes sure there are two - before your match
(?<= ... ) is a positive lookbehind
[^-]* matches anything but a -, 0 or more times
- matches - literally
.* matches anything after those 2 dashes.
Demo.
const data = "someStr1 - someStr2 - someStr3 - someStr4";
console.log(/(?<=([^-]*-){2}).*/.exec(data)[0]);
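Lookbehind assertions only became part of JavaScript with ES2018, so older engines reject the pattern above. As a sketch of an alternative (not the answer's own method), the same idea can be written with an anchored prefix and a capture group. The helper name afterNthHyphen and its n parameter are assumptions made for illustration.

```javascript
// Hypothetical helper: returns the trimmed text after the n-th hyphen,
// or null when the string has fewer than n hyphens.
function afterNthHyphen(str, n) {
  // ^(?:[^-]*-){n} consumes the first n hyphen-terminated chunks;
  // ([^]*) then captures everything that remains, including line breaks.
  const re = new RegExp('^(?:[^-]*-){' + n + '}([^]*)$');
  const m = re.exec(str);
  return m ? m[1].trim() : null;
}

const data = 'someStr1 - someStr2 - someStr3 - someStr4';
console.log(afterNthHyphen(data, 2)); // "someStr3 - someStr4"
console.log(afterNthHyphen('a - b', 2)); // null
```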
Split the string into an array on - and check whether array.length > 3, which means there are at least three - in the string. If so, join the array from index 2 to the end with - and trim the result.
var text = "someStr1 - someStr2 - someStr3 - someStr4"
var textArray = text.split('-')
if(textArray.length>3){
console.log(textArray.slice(2).join('-').trim())
}
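How about something like this:
var testStr = "someStr1 - someStr2 - someStr3 - someStr4";
// match() returns null when there is no hyphen, so fall back to an empty array
var hyphenCount = (testStr.match(/-/g) || []).length;
if(hyphenCount > 2){
// slice(2) keeps everything after the second hyphen; trim() drops the leading space
var reqStr = testStr.split('-').slice(2).join('-').trim();
console.log(reqStr) // logs "someStr3 - someStr4"
}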
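The question mentions finding the index of the second hyphen and then taking a substring. A minimal sketch of that approach without any regex could look like the following; the function name afterSecondHyphen is hypothetical, and returning null for fewer than three hyphens is an assumption about the desired behavior.

```javascript
// Hypothetical helper: walks the string with indexOf to locate the first three hyphens.
function afterSecondHyphen(str) {
  const first = str.indexOf('-');
  const second = first === -1 ? -1 : str.indexOf('-', first + 1);
  const third = second === -1 ? -1 : str.indexOf('-', second + 1);
  // Assumption: the substring is only wanted when there are at least three hyphens.
  if (third === -1) return null;
  // Everything after the second hyphen, without surrounding whitespace.
  return str.slice(second + 1).trim();
}

console.log(afterSecondHyphen('someStr1 - someStr2 - someStr3 - someStr4')); // "someStr3 - someStr4"
console.log(afterSecondHyphen('a - b - c')); // null (only two hyphens)
```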
Criteria:
Match any word that starts with a, ends with b, and has a single digit in the middle. The word must not be on a line that starts with '#'.
Given string:
a1b a2b a3b
#a4b a5b a6b
a7b a8b a9b
Expected output:
a1b
a2b
a3b
a7b
a8b
a9b
Regex: ? I need it for JavaScript.
So far I have tried the following:
var text_content = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"; // the string given above
var reg_exp = /^[^#]?a[0-9]b/gmi;
var matched_text = text_content.match(reg_exp);
console.log(matched_text);
I get the output below:
[ 'a1b', ' a7b' ]
Your /^[^#]?a[0-9]b/gmi will match multiple occurrences of a pattern that matches the start of a line, then 1 or 0 chars other than #, then a, a digit, and b. It does not check for whole words, and it cannot match words anywhere other than at the beginning of a line.
You may use a regex that will match lines starting with # and match and capture the words you need in other contexts:
var s = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b";
var res = [];
s.replace(/^[^\S\r\n]*#.*|\b(a\db)\b/gm, function($0,$1) {
if ($1) res.push($1);
});
console.log(res);
Pattern details:
^ - start of a line (as m multiline modifier makes ^ match the line start)
[^\S\r\n]* - 0+ horizontal whitespaces
#.* - a # and any 0+ chars up to the end of a line
| - or
\b - a leading word boundary
(a\db) - Group 1 capturing a, a digit, a b
\b - a trailing word boundary.
Inside the replace() method, a callback is used where the res array is populated with the contents of Group 1 only.
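As a sketch of the same technique without using replace() for its side effect, the identical pattern can be iterated with String.prototype.matchAll (available since ES2020). The function name wordsOutsideHashLines is hypothetical.

```javascript
// Hypothetical helper: collects Group 1 matches, skipping lines that start with "#".
function wordsOutsideHashLines(text) {
  const pattern = /^[^\S\r\n]*#.*|\b(a\db)\b/gm;
  const words = [];
  for (const m of text.matchAll(pattern)) {
    // Group 1 is undefined when the first alternative (a "#" line) matched.
    if (m[1] !== undefined) words.push(m[1]);
  }
  return words;
}

console.log(wordsOutsideHashLines("a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"));
// [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```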
I would suggest using two regexes.
The first regex, applied with the g and m flags, fetches the non-hashed lines:
^[^#][a\db\s]+
and then another regex, applied with the g flag, fetches the individual words from each line:
\ba\db\b
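A two-step approach along these lines can be sketched as follows. This is an illustration rather than the answer's exact regexes: it assumes lines are split on line breaks, that a line counts as hashed when # is its first non-whitespace character, and that individual words are matched with word boundaries. The function name twoStepMatch is hypothetical.

```javascript
// Hypothetical helper illustrating the two-step idea.
function twoStepMatch(text) {
  return text
    .split(/\r?\n/)                                  // step 0: break the input into lines
    .filter(line => !/^\s*#/.test(line))             // step 1: keep only non-hashed lines
    .flatMap(line => line.match(/\ba\db\b/g) || []); // step 2: collect the words per line
}

console.log(twoStepMatch("a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"));
// [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```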
var url = document.referrer;
var a=document.createElement('a');
a.href=url;
var path = a.pathname;
Let's say path is this:
/cat-dog-fish/
I want to remove leading and trailing slashes, if they exist, else do nothing.
I can do this (removes trailing slash):
a.pathname.replace(/\/$/,'')
Or this (removes leading slash)
a.pathname.replace(/^\//,'')
But how do I remove both at once, in one go, if they exist?
A regex literal like /^\/|\/$/g can be used to replace the matches with an empty string, or you may use /^\/([^]*)\/$/ (match / at the start, then any 0+ chars up to the final / at the end, capturing what is in between the slashes) and replace with $1:
var s = "/cat-dog-fish/";
console.log(s.replace(/^\/|\/$/g, ''));
console.log(s.replace(/^\/([^]*)\/$/, '$1'));
Note:
^\/ - matches the start of string and a / right there
| - means OR
\/$ - matches a / at the end of string
([^]*) - is a capturing group (...) that captures 0 or more (*) of any characters, as [^] means "not nothing", i.e. any character including line breaks.
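The two patterns differ when only one of the slashes is present: the alternation strips whichever slash exists, while the capturing version requires both. A small sketch that makes this visible follows; the helper names trimSlashes and unwrapSlashes are hypothetical.

```javascript
// Hypothetical helpers wrapping the two patterns discussed above.
function trimSlashes(path) {
  // Removes a leading slash, a trailing slash, or both.
  return path.replace(/^\/|\/$/g, '');
}

function unwrapSlashes(path) {
  // Only changes the string when it both starts and ends with a slash.
  return path.replace(/^\/([^]*)\/$/, '$1');
}

console.log(trimSlashes('/cat-dog-fish/'));   // "cat-dog-fish"
console.log(trimSlashes('/cat-dog-fish'));    // "cat-dog-fish"
console.log(unwrapSlashes('/cat-dog-fish/')); // "cat-dog-fish"
console.log(unwrapSlashes('/cat-dog-fish'));  // "/cat-dog-fish" (unchanged)
```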
var a="/cat-dog-fish/";
var d = a.replace(new RegExp("(^\/|\/$)",'g'),'');
console.log(d);
a.pathname.replace(new RegExp("(^\/|\/$)",'g'),'');
When I parse Amazon products I get strings like this:
"#19 in Home Improvements (See top 100)"
I figured out how to retrieve the BSR number, which is /#\d*/
But I have no idea how to retrieve the category, which comes after "in" and ends before the parenthesized part (See top 100).
I suggest
#(\d+)\s+in\s+([^(]+?)\s*\(
See the regex demo
var re = /#(\d+)\s+in\s+([^(]+?)\s*\(/;
var str = '#19 in Home Improvements (See top 100)';
var m = re.exec(str);
if (m) {
console.log(m[1]);
console.log(m[2]);
}
Pattern details:
# - a hash
(\d+) - Group 1 capturing 1 or more digits
\s+in\s+ - in enclosed with 1 or more whitespaces
([^(]+?) - Group 2 capturing 1 or more chars other than ( as few as possible before the first...
\s*\( - 0+ whitespaces and a literal (.
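For readability, the same pattern can be written with named capture groups (an ES2018 feature). This is only a sketch: the function name parseBestSellerRank and the group names rank and category are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: returns { rank, category } or null when the string does not match.
function parseBestSellerRank(str) {
  const re = /#(?<rank>\d+)\s+in\s+(?<category>[^(]+?)\s*\(/;
  const m = re.exec(str);
  if (!m) return null;
  return { rank: Number(m.groups.rank), category: m.groups.category };
}

console.log(parseBestSellerRank('#19 in Home Improvements (See top 100)'));
// { rank: 19, category: 'Home Improvements' }
```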
I'm trying to match all the words starting with # and words between 2 # (see example)
var str = "#The test# rain in #SPAIN stays mainly in the #plain";
var res = str.match(/(#)[^\s]+/gi);
The result will be ["#The", "#SPAIN", "#plain"] but it should be ["#The test#", "#SPAIN", "#plain"]
Extra: would be nice if the result would be without the #.
Does anyone have a solution for this?
You can use
/#\w+(?:(?: +\w+)*#)?/g
See the demo here
The regex matches:
# - a hash symbol
\w+ - one or more alphanumeric and underscore characters
(?:(?: +\w+)*#)? - one or zero occurrence of:
(?: +\w+)* - zero or more occurrences of one or more spaces followed with one or more word characters followed with
# - a hash symbol
NOTE: If there can be characters other than word characters (those in the [A-Za-z0-9_] range), you can replace \w with [^ #]:
/#[^ #]+(?:(?: +[^ #]+)*#)?/g
See another demo
var re = /#[^ #]+(?:(?: +[^ #]+)*#)?/g;
var str = '#The test-mode# rain in #SPAIN stays mainly in the #plain #SPAIN has #the test# and more #here';
var m = str.match(re);
if (m) {
// Using ES6 Arrow functions
m = m.map(s => s.replace(/#$/g, ''));
// ES5 Equivalent
/*m = m.map(function(s) {
return s.replace(/#$/g, '');
});*/ // getting rid of the trailing #
document.body.innerHTML = "<pre>" + JSON.stringify(m, 0, 4) + "</pre>";
}
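The snippet above only removes the trailing #. To return the results without any # at all, as the question's extra request asks, one option is to strip the leading and trailing # in a single replace. A minimal sketch, with a hypothetical function name extractTags:

```javascript
// Hypothetical helper: matches with the pattern above, then strips "#" from both ends.
function extractTags(str) {
  const re = /#[^ #]+(?:(?: +[^ #]+)*#)?/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return (str.match(re) || []).map(s => s.replace(/^#|#$/g, ''));
}

console.log(extractTags('#The test# rain in #SPAIN stays mainly in the #plain'));
// [ 'The test', 'SPAIN', 'plain' ]
```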
You can also try this regex.
#(?:\b[\s\S]*?\b#|\w+)
(?: opens a non-capturing group for alternation
\b matches a word boundary
\w matches a word character
[\s\S] matches any character
See demo at regex101 (use with g global flag)
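A minimal sketch of how this pattern could be applied in JavaScript with the g flag, using the sample string from the question. The function name matchHashSpans is hypothetical.

```javascript
// Hypothetical helper: returns every match of the pattern, or an empty array.
function matchHashSpans(str) {
  return str.match(/#(?:\b[\s\S]*?\b#|\w+)/g) || [];
}

console.log(matchHashSpans('#The test# rain in #SPAIN stays mainly in the #plain'));
// [ '#The test#', '#SPAIN', '#plain' ]
```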