I need to match the text between two brackets. Many posts have been written about this, but none of the solutions work in JavaScript because they all use lookbehind.
The text is as follows:
"{Code} - {Description}"
I need Code and Description to be matched without the brackets.
The closest I have gotten is this:
/{([\s\S]*?)(?=})/g
which leaves me with "{Code" and "{Description", and I follow it up with a substring call.
So, is there a way to get lookbehind-like functionality in JavaScript?
You could simply try the below regex,
[^}{]+(?=})
Code:
> "{Code} - {Description}".match(/[^}{}]+(?=})/g)
[ 'Code', 'Description' ]
Use it as:
var input = '{Code} - {Description}';
var matches = [], match, re = /{([\s\S]*?)(?=})/g;
// exec() returns one match per call; match[1] is the capture group
while ((match = re.exec(input)) !== null) matches.push(match[1]);
console.log(matches);
["Code", "Description"]
Actually, in this particular case, the solution is quite easy:
s = "{Code} - {Description}"
result = s.match(/[^{}]+(?=})/g) // ["Code", "Description"]
Have you tried something like this, which doesn't need a lookahead or lookbehind:
{([^}]*)}
You would probably need to add the global flag, but it seems to work in the regex tester.
The real problem is that you need to specify what you want to capture, which you do with capture groups in regular expressions. The part of the matched regular expression inside of parentheses will be the value returned by that capture group. So in order to omit { and } from the results, you just don't include those inside of the parentheses. It is still necessary to match them in your regular expression, however.
You can see how to get the value of capture groups in JavaScript here.
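In engines that support String.prototype.matchAll (ES2020), the capture group can be read for every match without a manual loop. A minimal sketch, where the helper name betweenBraces is only illustrative:

```javascript
// Hypothetical helper: returns the text inside each {...} pair.
function betweenBraces(input) {
  // The braces are matched but sit outside the capture group,
  // so only the inner text ends up in m[1].
  const re = /\{([^}]*)\}/g;
  return Array.from(input.matchAll(re), (m) => m[1]);
}

console.log(betweenBraces("{Code} - {Description}")); // ["Code", "Description"]
```

matchAll requires the g flag and throws a TypeError without it.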
Let's say I have the following string: div.classOneA.classOneB#idOne
Trying to write a regexp which extracts the classes (classOneA, classOneB) from it. I was able to do this but with Lookbehind assertion only.
It looks like this:
'div.classOneA.classOneB#idOne'.match(/(?<=\.)([^.#]+)/g)
> (2) ["classOneA", "classOneB"]
Now I would like to achieve this without lookbehind, and I do not really understand why my solution is not working.
'div.classOneA.classOneB#idOne'.match(/\.([^.#]+)/g)
> (2) [".classOneA", ".classOneB"]
I thought the grouping would solve my problem, but every matched item still contains the dot.
String.prototype.match cannot both match multiple times (the /g flag) and return capture groups (the parts in parentheses). Use exec in a loop instead:
var input = "div.classOneA.classOneB#idOne";
var regex = /\.([^.#]+)/g;
var matches, output = [];
while (matches = regex.exec(input)) {
output.push(matches[1]);
}
This is because with the g modifier you get all matching substrings but not their capture groups (that is, as if (...) pairs worked just like (?:...) ones).
Look. Without the g modifier:
> 'div.classOneA.classOneB#idOne'.match(/\.([^.#]+)/)
[ '.classOneA',
'classOneA',
index: 3,
input: 'div.classOneA.classOneB#idOne',
groups: undefined ]
With g modifier:
> 'div.classOneA.classOneB#idOne'.match(/\.([^.#]+)/g)
[ '.classOneA', '.classOneB' ]
In other words: you obtain all matches, but only the whole match (item 0) of each.
There are many solutions:
Use LookBehind assertions as you pointed out yourself.
Fix each result later adding .map(x=>x.replace(/^\./, ""))
Or, if your input structure won't be much more complicated than the example you provide, simply use a cheaper approach:
> 'div.classOneA.classOneB#idOne'.replace(/#.*/, "").split(".").slice(1)
[ 'classOneA', 'classOneB' ]
Use .replace() + callback instead of .match() in order to be able to access capture groups of every match:
const str = 'div.classOneA.classOneB#idOne';
const matches = [];
str.replace(/\.([^.#]+)/g, (...args)=>matches.push(args[1]))
console.log(matches); // [ 'classOneA', 'classOneB' ]
I would recommend the third one (if there aren't other possible inputs that could eventually break it) because it is much more efficient (actual regular expressions are used only once to trim the '#idOne' part).
If you want to keep building on your regex, you can simply map over the results and replace the . with an empty string:
let op = 'div.classOneA.classOneB#idOne'.match(/\.([^.#]+)/g)
.map(e=> e.replace(/\./g,''))
console.log(op)
If you know you are searching for a text containing class, then you can use something like
'div.classOneA.classOneB#idOne'.match(/class[^.#]+/g)
If the only thing you know is that the text is preceded by a dot, then you need either a lookbehind or a capture group.
This regex will work without lookbehind assertion:
'div.classOneA.classOneB#idOne'.match(/\.[^\.#]+/g).map(item => item.substring(1));
Lookbehind assertions were only added to JavaScript in ES2018, so older engines do not support them.
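If the code has to run on engines with and without lookbehind, one option is to detect support at runtime and fall back to a capture group. This is only a sketch; the function name extractClasses is made up for illustration:

```javascript
// Hypothetical helper: extracts class names from a selector-like string.
function extractClasses(selector) {
  let re;
  let group = 0;
  try {
    // Building the pattern from a string lets older engines throw a
    // catchable SyntaxError instead of failing when the file is parsed.
    re = new RegExp("(?<=\\.)[^.#]+", "g");
  } catch (e) {
    // Fallback without lookbehind: match the dot, capture what follows.
    re = /\.([^.#]+)/g;
    group = 1;
  }
  const result = [];
  let m;
  while ((m = re.exec(selector)) !== null) {
    result.push(m[group]);
  }
  return result;
}

console.log(extractClasses("div.classOneA.classOneB#idOne"));
// ["classOneA", "classOneB"]
```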
I'm not an expert on using regex - particularly in Javascript - but after some research on MDN I've figured out why your attempt wasn't working, and how to fix it.
The problem is that using .match with a regexp with the /g flag will ignore capturing groups. So instead you have to use the .exec method on the regexp object, using a loop to execute it multiple times to get all the results.
So the following code is what works, and can be adapted for similar cases. (Note the grp[1] - this is because the first element of the array returned by .exec is the entire match, the groups are the subsequent elements.)
var regExp = /\.([^.#]+)/g
var result = [];
var grp;
while ((grp = regExp.exec('div.classOneA.classOneB#idOne')) !== null) {
result.push(grp[1]);
}
console.log(result)
My string:
AA,$,DESCRIPTION(Sink, clinical),$
Wanted matches:
AA
$
DESCRIPTION(Sink, clinical)
$
My regex so far:
\+d|[\w$:0-9`<>=&;?\|\!\#\+\%\-\s\*\(\)\.ÅÄÖåäö]+
This gives
AA
$
DESCRIPTION(Sink
clinical)
I want to keep the text between ( and ) together as part of a single match.
https://regex101.com/r/MqFUmk/3
Here's my attempt at the regex
\+d|[\w$:0-9`<>=&;?\|\!\#\+\%\-\s\*\.ÅÄÖåäö]+(\(.+\))?
I removed the parentheses from within the [ ] characters, and allowed capture elsewhere. It seems to satisfy the regex101 link you posted.
Depending on how arbitrary your input is, this regex might not be suitable for more complex strings.
Alternatively, here's an answer which could be more robust than mine, but may only work in Ruby.
((?>[^,(]+|(\((?>[^()]+|\g<-1>)*\)))+)
That one seems to work for me?
([^,\(\)]*(?:\([^\(\)]*\))?[^,\(\)]*)(?:,|$)
https://regex101.com/r/hLyJm5/2
Hope this helps!
Personally, I would first replace all commas within parentheses () with a character that will never occur (in my case I used # since I don't see it within your inclusions) and then I would split them by commas to keep it sweet and simple.
myStr = "AA,$,DESCRIPTION(Sink, clinical),$"; //Initial string
myStr = myStr.replace(/(\([^,]+),([^\)]+\))/g, "$1#$2"); //Replace , within parentheses with #
myArr = myStr.split(',').map(function(s) { return s.replace('#', ','); }); //Split string on ,
//myArr -> ["AA","$","DESCRIPTION(Sink, clinical)","$"]
Optionally, if you're using ES6, you can change that last line to:
myArr = myStr.split(',').map(s => s.replace('#', ',')); //Yay Arrow Functions!
Note: If you have nested parentheses, this answer will need a modification
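For nested parentheses, one possible modification is to drop the regex and track the nesting depth while scanning the string. A rough sketch, assuming parentheses are balanced; splitTopLevel is a name invented here:

```javascript
// Hypothetical helper: splits on separators that are not inside parentheses.
function splitTopLevel(input, separator = ",") {
  const parts = [];
  let depth = 0;     // current parenthesis nesting level
  let current = "";  // piece being accumulated
  for (const ch of input) {
    if (ch === "(") depth++;
    else if (ch === ")" && depth > 0) depth--;
    if (ch === separator && depth === 0) {
      // Top-level separator: close the current piece.
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

console.log(splitTopLevel("AA,$,DESCRIPTION(Sink, clinical),$"));
// ["AA", "$", "DESCRIPTION(Sink, clinical)", "$"]
```

This avoids having to pick a placeholder character that never occurs in the input.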
Finally, here is an approximation of what you need:
\w+(?:\(.*\))|\w+|\$
https://regex101.com/r/MqFUmk/4
I have a url like http://www.somedotcom.com/all/~childrens-day/pr?sid=all.
I want to extract childrens-day. How to get that? Right now I am doing it like this
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
url.match('~.+\/');
But what I am getting is ["~childrens-day/"].
Is there a short and sweet way (there must be one) to get the above text without the ~ and the /, i.e. just childrens-day?
Thanks
You could use a negated character class and a capture group ( ) and refer to capture group #1. A caret (^) at the start of a character class [ ] negates the class.
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
var result = url.match(/~([^~]+)\//);
console.log(result[1]); // "childrens-day"
See Working demo
Note: If you have many URLs inside of a string, you may want to add the ? quantifier for a non-greedy match.
var result = url.match(/~([^~]+?)\//);
Like so:
var url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all"
var matches = url.match(/~(.+?)\//);
console.log(matches[1]);
Working example: http://regex101.com/r/xU4nZ6
Note that your regular expression was passed as a string rather than as a /.../ literal; it still works because match() converts a string argument to a RegExp.
Use non-capturing groups with a captured group then access the [1] element of the matches array:
(?:~)(.+)(?:/)
Keep in mind that you will need to escape your / if using it also as your RegEx delimiter.
Yes, it is.
url = "http://www.somedotcom.com/all/~childrens-day/pr?sid=all";
url.match('~(.+)\/')[1];
Just wrap what you need in a parenthesized group. No other modification to your code is needed.
References: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
You could just do a string replace on the match.
var match = url.match('~.+\/')[0]; // "~childrens-day/"
match.replace('~', '').replace('/', ''); // "childrens-day"
http://www.w3schools.com/jsref/jsref_replace.asp
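Another option, not taken from the answers above, is to skip the regex and let the built-in URL parser split the path. A small sketch, assuming the wanted segment is the first path segment that starts with ~; tildeSegment is a made-up name:

```javascript
// Hypothetical helper: returns the first path segment that starts with "~",
// without the tilde, or null if there is none.
function tildeSegment(href) {
  const { pathname } = new URL(href); // e.g. "/all/~childrens-day/pr"
  const segment = pathname.split("/").find((part) => part.startsWith("~"));
  return segment === undefined ? null : segment.slice(1);
}

console.log(tildeSegment("http://www.somedotcom.com/all/~childrens-day/pr?sid=all"));
// "childrens-day"
```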
I have a string I would like to split using #, ., [], or {} characters, as in CSS. The desired functionality is:
- Input:
"div#foo[bar='value'].baz{text}"
- Output:
["div", "#foo", "[bar='value'", ".baz", "{text"]
This is easy enough, with this RegEx:
input.match(/([#.\[{]|^.*?)[^#.\[{\]}]*/g)
However, this doesn't ignore syntax characters inside quotes, as I would like it to (e.g. "div[bar='value.baz']" should ignore the .).
How can I make the second part of my RegEx (the [^#.\[{\]}]* portion) capture not only the negated character set, but also any character within quotes? In other words, how can I incorporate the RegEx (\"|').+?\1 into my current one?
Edit:
I've figured out a regex that works decently, but it can't handle escaped quotes inside quotes (for example: "stuff here \\" quote "). If someone knows how to do that, it would be extremely helpful:
str.match(/([#.\[{]|^.*?)((['"]).*?\3|[^.#\[\]{\}])*/g);
var str = "div#foo[bar='value.baz'].baz{text}";
str.match(/(^|[\.#[\]{}])(([^'\.#[\]{}]+)('[^']*')?)+/g)
// [ 'div', '#foo', '[bar=\'value.baz\'', '.baz', '{text' ]
var tokens = myCssString.match(/\/\*[\s\S]*?\*\/|"(?:[^"\\]|\\[\s\S])*"|'(?:[^'\\]|\\[\s\S])*'|[\{\}:;\(\)\[\]./#]|\s+|[^\s\{\}:;\(\)\[\]./'"#]+/g);
Given your string, it produces
div
#
foo
[
bar=
'value.foo'
]
.
baz
{
text
}
The RegExp above is loosely based on the CSS 2.1 lexical grammar.
Firstly, and I can't stress this enough: you shouldn't use regexps to parse CSS. You should use a real parser, for instance http://glazman.org/JSCSSP/ or similar. Many people have built them, so there is no need for you to reinvent the wheel.
That said, to solve your current problem do this:
var str = "div#foo[bar='value.foo'].baz{text}";
str.match(/([#.\[{]|^.*?)(?:[^#\[{\]}]*|\.*)/g);
//["div", "#foo", "[bar='value.foo'", ".baz", "{text"]
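To also cover the escaped-quote case from the question's edit, the pattern can treat a quoted string (including backslash escapes) as a single unit. The following is only a sketch for simple inputs like the ones in this thread, not a CSS parser; splitSelector is a name invented here:

```javascript
// Hypothetical helper: splits at #, ., [ and {, but never inside quotes.
function splitSelector(input) {
  // Each token starts at a delimiter (or at the start of the string) and
  // then consumes quoted strings as a whole, where \\[\s\S] lets an
  // escaped quote stay inside the string, or any non-syntax character.
  const re = /(?:[#.\[{]|^)(?:'(?:[^'\\]|\\[\s\S])*'|"(?:[^"\\]|\\[\s\S])*"|[^#.\[\]{}'"])*/g;
  // Drop the empty match that can occur at the very start of the input.
  return (input.match(re) || []).filter((token) => token !== "");
}

console.log(splitSelector("div#foo[bar='value.baz'].baz{text}"));
// ["div", "#foo", "[bar='value.baz'", ".baz", "{text"]
```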
I have a div that I am trying to run a regular expression on
<div class="module-header-content module-default">
I am using this replace operation, which used to work, but now that I have added the module-header-content class it has become problematic:
replace(/module-\w+/gi, ' ');
I need a regular expression that removes all instances of module- except for module-header-content
Any help would be appreciated.
Thanks
The entire call:
var $target = $(this).parent().parent().parent().parent();
//// Removes all module-xxxx classes
var classes = $target[0].className.replace(/module-\w+/gi, '');
You need a negative lookahead.
module-(?!header-content)\w+
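Applied to a class string, the negative lookahead could be used roughly like this. It is a sketch under a few assumptions: class names are separated by whitespace, names may contain hyphens, and the function name stripModuleClasses is made up:

```javascript
// Hypothetical helper: removes every module-* class except module-header-content.
function stripModuleClasses(className) {
  return className
    // (^|\s) anchors the match at the start of a class name; the lookahead
    // rejects exactly "header-content" followed by whitespace or the end.
    .replace(/(^|\s)module-(?!header-content(?:\s|$))[\w-]+/gi, "$1")
    .replace(/\s+/g, " ") // collapse the gaps left behind
    .trim();
}

console.log(stripModuleClasses("module-header-content module-default module-default-foo other"));
// "module-header-content other"
```

Using [\w-]+ instead of \w+ removes hyphenated names such as module-default-foo in one go.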
Try this:
str = "module-header-content module-default module-default-foo module-default-foo-bar";
str.replace(/module(?!-header)(-\w+)*/gi, '');
It'll get all classes except "module-header-content".
Expanding on masher's answer, lots of programmers know about using parentheses to capture matches within a regex, but the very useful non-capturing parentheses are not as well known.
/(foo)/ will match foo and store it in the matches array. But what if you don't want a match to be stored? In that case, you can use ?: inside the parentheses: /(?:foo)/ . This will match the pattern but not store it in the matches array.
You can also assert that something does not follow with ?! so /(?!foo)/ is a zero-width negative lookahead: it succeeds at any position that is not immediately followed by 'foo' and consumes no characters. Note that /[^(foo)]/ is not a capturing version of this; it is a negated character class that matches any single character other than (, f, o or ).
Yes, regular expressions are wonderful.
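A quick way to check what each of these constructs does is to run them. A small sketch with made-up sample strings:

```javascript
// Capturing group: the match array holds the whole match plus the group.
const capturing = "foobar".match(/(foo)bar/);
console.log(capturing.length); // 2

// Non-capturing group: same match, but nothing extra is stored.
const nonCapturing = "foobar".match(/(?:foo)bar/);
console.log(nonCapturing.length); // 1

// Negative lookahead: words that do not start with "foo".
const notFoo = "foo bar foobar".match(/\b(?!foo)\w+/g);
console.log(notFoo); // ["bar"]

// Negated character class: single characters other than (, f, o and ).
const charClass = "foo(x)".match(/[^(foo)]/g);
console.log(charClass); // ["x"]
```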