Same regex has different results in Java and JavaScript [duplicate]

This question already has answers here:
Difference between matches() and find() in Java Regex
(5 answers)
Closed 3 years ago.
Same regex, different results:
Java
String regex = "Windows(?=95|98|NT|2000)";
String str = "Windows2000";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
System.out.println(m.matches()); // prints false
JavaScript
var value = "Windows2000";
var reg = /Windows(?=95|98|NT|2000)/;
console.info(reg.test(value)); // prints true
I can't understand why this is the case.

From the documentation for Java's Matcher#matches() method:
Attempts to match the entire region against the pattern.
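Matcher#matches() succeeds only if the pattern consumes the entire input. Here it cannot, because (?=95|98|NT|2000) is a zero-width positive lookahead: it checks that one of the alternatives follows, but it consumes no characters. The match therefore ends after Windows and the 2000 portion is left unmatched, so matches() returns false. JavaScript's test(), by contrast, only looks for a match somewhere in the string, which is what Java's find() does.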
A better version of your Java code, to show that it isn't really "broken," would be this:
String regex = "Windows(?=95|98|NT|2000)";
String str = "Windows2000";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.println(m.group()); // prints "Windows"
}
Now we see Windows being printed, which is the actual content which was matched.
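For comparison, here is a minimal sketch of the same distinction on the JavaScript side. It assumes you want matches()-style behaviour there, which can be approximated by anchoring the pattern with ^ and $; the variable names are only illustrative.

```javascript
var value = "Windows2000";

// test() searches for a match anywhere in the string, like Java's find().
var found = /Windows(?=95|98|NT|2000)/.test(value);

// Anchoring with ^ and $ approximates Java's matches(): the lookahead consumes
// nothing, so the match ends after "Windows" and $ cannot be satisfied.
var wholeWithLookahead = /^Windows(?=95|98|NT|2000)$/.test(value);

// A normal (non-capturing) group consumes the version suffix,
// so the whole string can be matched.
var wholeWithGroup = /^Windows(?:95|98|NT|2000)$/.test(value);

console.log(found, wholeWithLookahead, wholeWithGroup); // true false true
```

If the intent is "the whole string must be Windows plus a known version", the non-capturing group is probably the form you want in both languages.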

Related

Pattern working on regex101 but not with Google Script [duplicate]

This question already has answers here:
Why do regex constructors need to be double escaped?
(5 answers)
Closed 4 years ago.
I'm trying to match some paragraphs in Google Docs but the pattern that I wanted to use for it doesn't match the string when run inside a Google Script. However, it works properly on regex101 so I guess I'm missing something. Do you know what?
This is a sample of what I have:
function test() {
    var str = "brown fox → jumps over the lazy dog";
    var definitionRe = new RegExp('([\w\s]+)\s+[\u2192]\s+(.+)', 'g');
    var definitionMatch = definitionRe.exec(str); // null
    var dummy = "asdf"; // makes the debugger happy to break here
}
When you build a regex from a string, as in new RegExp(...), every backslash has to be escaped, because the string literal processes backslashes before the regex engine ever sees them. So the following:
var definitionRe = new RegExp('([\w\s]+)\s+[\u2192]\s+(.+)', 'g');
becomes an escaped version like this:
var definitionRe = new RegExp('([\\w\\s]+)\\s+[\\u2192]\\s+(.+)', 'g');
Alternatively, use a regex literal, which needs no double escaping. The trade-off is that you can no longer build the pattern by concatenating strings, if that is something you need:
var definitionRe = /([\w\s]+)\s+[\u2192]\s+(.+)/g;
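To see what the string literal actually hands to RegExp, a small sketch like the following may help. It reuses the sample string from the question; the names broken and fixed are only illustrative.

```javascript
var str = "brown fox → jumps over the lazy dog";

// In a normal string literal, "\w" and "\s" are not recognised escapes,
// so the backslash is dropped before RegExp ever sees the pattern.
console.log('\w\s' === 'ws'); // true

var broken = new RegExp('([\w\s]+)\s+[\u2192]\s+(.+)');
var fixed = new RegExp('([\\w\\s]+)\\s+[\\u2192]\\s+(.+)');

// The broken pattern is really ([ws]+)s+[→]s+(.+), which does not match.
var brokenMatch = broken.exec(str);
var fixedMatch = fixed.exec(str);

console.log(brokenMatch);   // null
console.log(fixedMatch[1]); // "brown fox"
console.log(fixedMatch[2]); // "jumps over the lazy dog"
```

Note that \u2192 happens to survive either way: the string literal turns it into the arrow character itself, which the character class then matches literally.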

Looking to trim a string using javascript / regex [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 4 years ago.
I'm looking for some assistance with JavaScript/Regex when trying to format a string of text.
I have the following IDs:
00A1234/A12
0A1234/A12
A1234/A12
000A1234/A12
I'm looking for a way that I can trim all of these down to 1234/A12. In essence, it should find the first letter from the left, and remove it and any preceding numbers so the final format should be 0000/A00 or 0000/AA00.
Is there an efficient way this can be achieved in JavaScript? I'm looking at regex at the moment.
Instead of focussing on what you want to strip, look at what you want to get:
/\d{4}\/[A-Z]{1,2}\d{2}/
var str = 'fdfhfjkqhfjAZEA0123/A45GHJqffhdlh';
var match = str.match(/\d{4}\/[A-Z]{1,2}\d{2}/);
if (match) console.log(match[0]); // prints "0123/A45"
You could search for leading digits and a following letter.
var data = ['00A1234/A12', '0A1234/A12', 'A1234/A12', '000A1234/A12'],
    regex = /^\d*[a-z]/gi;
data.forEach(s => console.log(s.replace(regex, '')));
Or you could use String#slice for the last 8 characters.
var data = ['00A1234/A12', '0A1234/A12', 'A1234/A12', '000A1234/A12'];
data.forEach(s => console.log(s.slice(-8)));
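You could use this function. It uses a regex to find the index of the first letter, then returns the substring starting after that index.
function getCode(s) {
    // search() returns the index of the first letter, or -1 if there is none
    var firstLetterIndex = s.search(/[a-zA-Z]/);
    return s.substring(firstLetterIndex + 1);
}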
getCode("00A1234/A12");
getCode("0A1234/A12");
getCode("A1234/A12");
getCode("000A1234/A12");
A regex such as this will capture all of your examples, with a numbered capture group for the part you're interested in:
[0-9]*[A-Z]([0-9]{4}/[A-Z]{1,2}[0-9]{2})
var input = ["00A1234/A12","0A1234/A12","A1234/A12","000A1234/A12"];
var re = new RegExp("[0-9]*[A-Z]([0-9]{4}/[A-Z]{1,2}[0-9]{2})");
input.forEach(function(x){
console.log(re.exec(x)[1])
});
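The approaches above agree on the four sample IDs, but they can diverge on the 0000/AA00 shape the question also mentions. The sketch below compares two of them; the helper names stripPrefix and lastEight and the fifth ID are made up for illustration and are not from the question.

```javascript
// The last ID is a hypothetical example of the two-letter 0000/AA00 shape.
var ids = ['00A1234/A12', '0A1234/A12', 'A1234/A12', '000A1234/A12', '00A1234/AB12'];

// Strip optional leading digits plus the first letter, anchored at the start.
function stripPrefix(id) {
  return id.replace(/^\d*[a-z]/i, '');
}

// Take a fixed number of trailing characters; this assumes the 0000/A00 shape.
function lastEight(id) {
  return id.slice(-8);
}

ids.forEach(function (id) {
  console.log(id, '->', stripPrefix(id), '|', lastEight(id));
});
```

For the two-letter ID, the fixed-length slice drops the first digit, so the anchored replace is likely the safer choice if both shapes can occur.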

Nothing to repeat with regexp in Javascript [duplicate]

This question already has an answer here:
Javascript Regular Expression not matching
(1 answer)
Closed 7 years ago.
I'm trying to replace all occurrences of {0}, {1}, {2}, etc. in a string with JavaScript.
Example string:
var str = "Hello, my name is {0} and I'm {1} years.";
I tried the following to construct the regexp:
var regex1 = new RegExp("{" + i + "}", "g")
var regex2 = new RegExp("\{" + i + "\}", "g")
Both attempts throw the error:
Invalid regular expression: /{0}/: Nothing to repeat
I use replace like this:
str.replace(regex, "Inserted string");
I found all kinds of Stack Overflow posts with different solutions, but none that quite solves my case.
The string literal "\{" results in the string "{". If you need a backslash in there, you need to escape it:
"\\{"
This results in the regex \{..\}, which is the correct regex syntax.
Having said that, your approach is more than weird. Using a regex you should do something like this:
var substitutes = ['foo', 'bar'];
str = str.replace(/\{(\d+)\}/g, function (match, num) {
    return substitutes[num];
});
In other words, don't dynamically construct a regex for each value; do one regex which matches all values and lets you substitute them as needed.
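As a rough sketch of that single-regex idea, the replacement can be wrapped in a small helper. The name format and the choice to leave unknown placeholders untouched are assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper: replaces {0}, {1}, ... with values[0], values[1], ...
function format(template, values) {
  return template.replace(/\{(\d+)\}/g, function (match, num) {
    // Leave the placeholder untouched when no value exists for that index.
    return num in values ? String(values[num]) : match;
  });
}

var str = "Hello, my name is {0} and I'm {1} years.";
var result = format(str, ['Alice', 30]);
console.log(result); // "Hello, my name is Alice and I'm 30 years."

// If a per-index regex is really needed, the backslashes must be doubled
// so that the regex engine receives \{0\} rather than {0}.
var perIndex = new RegExp("\\{" + 0 + "\\}", "g");
console.log(perIndex.source); // \{0\}
```

The g flag matters here: without it, replace() stops after the first placeholder.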

Extract specific data from JavaScript .getAttribute() [duplicate]

This question already has answers here:
Parse query string in JavaScript [duplicate]
(11 answers)
Closed 8 years ago.
So let's say I have this HTML link.
<a id="avId" href="http://www.whatever.com/user=74853380">Link</a>
And I have this JavaScript
av = document.getElementById('avId').getAttribute('href')
Which returns:
"http://www.whatever.com/user=74853380"
How do I extract 74853380 specifically from the resulting string?
There are a couple of ways you could do this.
1.) Using substring and indexOf to extract it
var str = "www.something.com/user=123123123";
str.substring(str.indexOf('=') + 1); // "123123123"
2.) Using regex
var str = "www.something.com/user=123123123";
// You can make this more specific for your query string, hence the '=' and group
str.match(/=(\d+)/)[1]; // "123123123"
You could also split on the = character and take the second value in the resulting array. Regex is probably your best bet, since it keeps working when the URL becomes more complex. Splitting on a character or using substring and indexOf is likely to return the wrong result once the string contains more than one =. Regex can also capture multiple groups if you need it to.
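One thing the regex version above does not handle is a URL with no match, in which case match() returns null and indexing it throws. A minimal sketch with a guard follows; the helper name extractUserId and the second URL are hypothetical.

```javascript
// Hypothetical helper: returns the digits after "user=", or null if absent.
function extractUserId(href) {
  var match = /user=(\d+)/.exec(href);
  return match ? match[1] : null;
}

console.log(extractUserId("http://www.whatever.com/user=74853380")); // "74853380"
console.log(extractUserId("http://www.whatever.com/about"));         // null
```

The result is a string; pass it through parseInt(value, 10) if a number is needed.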
You can use regular expression:
var exp = /\d+/;
var str = "http://www.whatever.com/user=74853380";
console.log(str.match(exp));
Explanation:
/\d+/ - means "one or more digits"
Another case is when you need to find more than one number:
"http://www.whatever.com/user=74853380/question/123123123"
You can use g flag.
var exp = /\d+/g;
var str = "http://www.whatever.com/user=74853380/question/123123123";
console.log(str.match(exp));
You can play with regular expressions
Well, you could split() it for a one liner answer.
var x = parseInt(av.split("=")[1],10); //convert to int if needed

JavaScript Regular Expressions - g modifier doesn't work [duplicate]

This question already has answers here:
Why does a RegExp with global flag give wrong results?
(7 answers)
Closed 5 years ago.
I have the following code :
var str = "4 shnitzel,5 ducks";
var rgx = new RegExp("[0-9]+","g");
console.log( rgx.exec(str) );
The output in Chrome and Firefox is ["4"].
Why don't I get the result ["4","5"]?
exec only searches for the next match. You need to call it multiple times to get all matches:
If your regular expression uses the "g" flag, you can use the exec method multiple times to find successive matches in the same string.
You can do this to find all matches with exec:
var str = "4 shnitzel,5 ducks",
    re = new RegExp("[0-9]+","g"),
    match, matches = [];
while ((match = re.exec(str)) !== null) {
    matches.push(match[0]);
}
Or simply use the match method on the string str:
var str = "4 shnitzel,5 ducks",
    re = new RegExp("[0-9]+","g"),
    matches = str.match(re);
By the way: Using the RegExp literal syntax /…/ is probably more convenient: /[0-9]+/g.
exec() only ever returns one match. To get further matches, you need to call exec() repeatedly.
https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/RegExp/exec
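The reason repeated calls work is that a regex with the g flag remembers its position in lastIndex. The sketch below makes that visible and also shows String#matchAll as an alternative, assuming a runtime with ES2020 support; the variable names are only illustrative.

```javascript
var str = "4 shnitzel,5 ducks";
var re = /[0-9]+/g;

// Each exec() call resumes from re.lastIndex and returns one match.
var first = re.exec(str);   // match for "4"; lastIndex is now 1
var second = re.exec(str);  // match for "5"
var third = re.exec(str);   // null; lastIndex is reset to 0

// String#matchAll (ES2020) iterates over every match object in one go.
var all = Array.from(str.matchAll(/[0-9]+/g), function (m) { return m[0]; });

console.log(first[0], second[0], third, re.lastIndex, all);
```

Because of this stored position, reusing one global regex across different strings can give surprising results unless lastIndex is reset to 0 first.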
