Regular expression to match all ranges not between brackets or quotes - javascript

I have the following string:
'b1:b10 + sum(a1:a10, sum(b1:b21)) + a1 + "d23:d44" '
I want to extract all the ranges in the string (a range is b1:b10 or a1), so I use this regular expression:
var rxRanges = new RegExp('(([a-z]+[0-9]+[:][a-z]+[0-9]+)|([a-z]+[0-9]+))', 'gi');
This returns all my ranges: [b1:b10, a1:a10, b1:b21, a1, d23:d44].
I now want to modify this regular expression to match only the root ranges, in other words ranges that are not inside brackets or quotes. So I am looking for this: ["b1:b10","a1"]
I'm not sure how to approach this.

#Updated according to comments
You can achieve that using a negative lookahead:
/(?=[^"]*(?:"[^"]*"[^"]*)*$)(?![^(]*[,)])[a-z]\d+(:\w+)?/gi
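As a quick check, the pattern above can be run against the string from the question. This is only a minimal sketch: the helper name `findRootRanges` is made up for the illustration, and the pattern is a heuristic rather than a full parser (it excludes a range when a `,` or `)` appears before the next `(`), so it assumes formulas shaped like the example.

```javascript
// Pattern from the answer above:
//  - first lookahead: an even number of " remains after this position,
//    so the position is outside a quoted section
//  - second lookahead: no "," or ")" appears before the next "("
const rootRangeRx = /(?=[^"]*(?:"[^"]*"[^"]*)*$)(?![^(]*[,)])[a-z]\d+(:\w+)?/gi;

// Hypothetical helper name, used only for this illustration.
function findRootRanges(formula) {
  // match() with the g flag returns every full match, or null when there is none.
  return formula.match(rootRangeRx) || [];
}

console.log(findRootRanges('b1:b10 + sum(a1:a10, sum(b1:b21)) + a1 + "d23:d44" '));
// → [ 'b1:b10', 'a1' ]
```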

Get the matched group from index 2
(^|[^(\"])([a-z]+[0-9]+:[a-z]+[0-9]+)
Note: I don't think there is a need to check both ends. If needed, add ($|[^(\"]) at the end of the above regex pattern.
Pattern explanation:
( group and capture to \1:
^ the beginning of the line/string
| OR
[^(\"] any character except: '(', '"'
) end of \1
( group and capture to \2:
[a-z]+ any character of: 'a' to 'z' (1 or more times)
[0-9]+ any character of: '0' to '9' (1 or more times)
: ':'
[a-z]+ any character of: 'a' to 'z' (1 or more times)
[0-9]+ any character of: '0' to '9' (1 or more times)
) end of \2
Sample code:
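var str = 'b1:b10 + sum(a1:a10, sum(b1:b21)) + (a1) + "d23:d44" ';
var re = /(^|[^(\"])([a-z]+[0-9]+:[a-z]+[0-9]+)/gi;
var found = [], m;
// With the g flag, str.match(re) returns whole matches only, so loop with exec() and keep group 2.
while ((m = re.exec(str)) !== null) {
  found.push(m[2]);
}
alert(found);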
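If nesting depth matters, a small scanner can be easier to reason about than a single pattern. The following is a sketch under stated assumptions, not part of the answer above: the function name `rootRanges` is hypothetical, quotes are assumed to be plain `"` with no escaping, and an identifier such as `log10` at the top level would be mistaken for a range.

```javascript
// Hypothetical helper: collect ranges that are outside parentheses and quotes.
function rootRanges(formula) {
  const ranges = [];
  // Sticky (y) flag: the pattern must match exactly at lastIndex.
  const rangeRx = /[a-z]+[0-9]+(?::[a-z]+[0-9]+)?/iy;
  let depth = 0;        // current parenthesis nesting level
  let inQuotes = false; // true while between two " characters
  let i = 0;
  while (i < formula.length) {
    const ch = formula[i];
    if (ch === '"') {
      inQuotes = !inQuotes;
    } else if (!inQuotes && ch === '(') {
      depth++;
    } else if (!inQuotes && ch === ')') {
      depth = Math.max(0, depth - 1);
    } else if (!inQuotes && depth === 0) {
      rangeRx.lastIndex = i;
      const m = rangeRx.exec(formula);
      if (m) {
        ranges.push(m[0]);
        i += m[0].length; // jump past the range just collected
        continue;
      }
    }
    i++;
  }
  return ranges;
}

console.log(rootRanges('b1:b10 + sum(a1:a10, sum(b1:b21)) + a1 + "d23:d44" '));
// → [ 'b1:b10', 'a1' ]
```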


Regex to match character unless it is preceded by an odd number of another specific character

Another way to state my problem is to match a character always when it is preceded by an even number (0, 2, 4, ...) of another specific character.
In my case I want to match every ' character in a string unless it is preceded by an odd number (1, 3, 5 ...) of ? characters.
example:
- ?' => shouldn't match (preceded by one ?)
- ??' => Should match (preceded by 2 ?)
- ?????' => Shouldn't match (preceded by 5 ?)
Let's consider this scenario:
We have this string: ' ??' ????' ?' ??????' and the regex should match every ' character except the 4th one. For example, if I use String.split(regex), the result would be ['', '??', '????', "?' ??????"]
Currently I am using this regex: (?<!\?)', but the problem is that it matches only if there is no ? before '
You can use
/(?<=(?<!\?)(?:\?\?)*)'/g
Details:
(?<=(?<!\?)(?:\?\?)*) - a positive lookbehind that matches a location that is preceded with any zero or more occurrences of double ? not immediately preceded with another ?
' - a ' char.
Sample code:
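const texts = ["The ?' should not match","The ??' should match","?????' => The ?????' should not match"];
// No g flag here: with g, test() keeps lastIndex between calls and can miss matches in later strings.
const rx = /(?<=(?<!\?)(?:\?\?)*)'/;
for (const text of texts) {
  console.log(text, '=>', rx.test(text));
}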
If you need replacing, it is possible with
const texts = ["The ?' should not match","The ??' should match","?????' => The ?????' should not match"];
const rx = /(?<=(?<!\?)(?:\?\?)*)'/g;
for (const text of texts) {
  console.log(text, '=>', text.replace(rx, '<MATCH>$&</MATCH>'));
}
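The splitting scenario from the question can be tried the same way. This is a small sketch that assumes the leading and trailing ' belong to the string itself; the function name `splitOnEvenQuotes` is only illustrative.

```javascript
// Same lookbehind as above; split() does not need the g flag.
const evenQuoteRx = /(?<=(?<!\?)(?:\?\?)*)'/;

// Hypothetical helper: split on every ' preceded by an even number of ?.
function splitOnEvenQuotes(text) {
  return text.split(evenQuoteRx);
}

console.log(splitOnEvenQuotes("' ??' ????' ?' ??????'"));
// → [ '', ' ??', ' ????', " ?' ??????", '' ]
```

The fourth quote stays inside its chunk because it follows a single ?, and the final empty string comes from the trailing '.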
You may use this regex with a lookbehind condition:
(?<=([^?]|^)(?:\?\?)*)'
RegEx Explanation
(?<=: Start lookbehind condition
([^?]|^): Match a non-? character or start
(?:\?\?)*: Match 0 or more pairs of ?
): End lookbehind condition
': Match a '
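Lookbehind was only added to JavaScript in ES2018, so on older engines the same rule may have to be checked by hand. A minimal sketch of that idea, assuming a plain loop is acceptable; the helper name `findUnescapedQuotes` is hypothetical:

```javascript
// Hypothetical helper: returns the index of every ' that is preceded by an
// even number (0, 2, 4, ...) of consecutive ? characters.
function findUnescapedQuotes(text) {
  const indexes = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] !== "'") continue;
    // Walk backwards and count the run of ? directly before this quote.
    let run = 0;
    for (let j = i - 1; j >= 0 && text[j] === '?'; j--) run++;
    if (run % 2 === 0) indexes.push(i);
  }
  return indexes;
}

console.log(findUnescapedQuotes("?'"));     // → []
console.log(findUnescapedQuotes("??'"));    // → [ 2 ]
console.log(findUnescapedQuotes("?????'")); // → []
```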

Replace not numbers or words to underscore but leave dash and remove spaces around it

So I got this string
'word word - word word 24/03/21'
And I would like to convert it to
'word_word-word_word_24_03_21'
I have tried this
replace(/[^aA-zZ0-9]/g, '_')
But I get this instead
word_word___word_word_24_03_21
You can use 2 .replace() calls:
const s = 'word word - word word 24/03/21'
var r = s.replace(/\s*-\s*/g, '-').replace(/[^-\w]+/g, '_')
console.log(r)
//=> "word_word-word_word_24_03_21"
Explanation:
.replace(/\s*-\s*/g, '-'): Remove surrounding spaces of a hyphen
.replace(/[^-\w]+/g, '_'): Replace each run of characters that are neither a hyphen nor a word character with a single underscore
You can use
console.log(
'word word - word word 24/03/21'.replace(/\s*(-)\s*|[^\w-]+/g, (x,y) => y || "_")
)
Here,
/\s*(-)\s*|[^\w-]+/g - matches a - enclosed with zero or more whitespaces and captures the - into Group 1, or just matches one or more non-word chars other than -
(x,y) => y || "_" - replaces with Group 1 if it was matched, and if not, the replacement is a _ char.
With a function for replace and an alternation in the pattern, you could also match:
(\s*-\s*) Match a - between optional whitespace chars
| Or
[^a-zA-Z0-9-]+ Match 1+ times any char that is not in the listed ranges and is not a -
In the callback, check if group 1 exists. If it does, return only a -, else return _
Note that the notation [^aA-zZ0-9] is not the same as [^a-zA-Z0-9]: the range A-z also covers the characters [ \ ] ^ _ ` that sit between Z and a in ASCII.
let s = "word word - word word 24/03/21";
s = s.replace(/(\s*-\s*)|[^a-zA-Z0-9-]+/g, (_, g1) => g1 ? "-" : "_");
console.log(s);
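The note about [A-z] can be verified directly. The sketch below lists the ASCII characters that [A-z] accepts but [a-zA-Z] does not; the function name `extraCharsInAToZ` is made up for the illustration.

```javascript
// Hypothetical helper: ASCII characters matched by [A-z] but not by [a-zA-Z].
function extraCharsInAToZ() {
  const extras = [];
  for (let code = 0; code < 128; code++) {
    const ch = String.fromCharCode(code);
    if (/[A-z]/.test(ch) && !/[a-zA-Z]/.test(ch)) extras.push(ch);
  }
  return extras;
}

console.log(extraCharsInAToZ().join(' '));
// → [ \ ] ^ _ `
```

This is why a class such as [^aA-zZ0-9] leaves an existing _ or ^ untouched instead of replacing it.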
You can use the + regex operator to replace 1 or more continuous matches at once.
let s = 'word word - word word 24/03/21';
let r = s
  .replace(/[^a-zA-Z0-9]*-[^a-zA-Z0-9]*/g, '-')
  .replace(/[^a-zA-Z0-9-]+/g, '_');
console.log(r);
// 'word_word-word_word_24_03_21'

Recursive pattern in JS

I want to check that a text satisfies three rules.
1º: The whole string should be a sequence of numbers between 0 and 31 separated by dots.
Example: 1.23.5.12
2º: The string can't begin or end with a dot.
For example, this is not allowed:
.1.23.5.12.
3º: You can write a maximum of 51 digits (following the previous rules).
I tried to write a pattern for my JS function, but it doesn't work.
This is my function:
var str = document.getElementById("numero").value;
var patt1 = /^[0-9]+\./g;
var result = str.match(patt1);
document.getElementById("demo").innerHTML = result;
What is wrong in the pattern?
You may use
/^(?!(?:\D*\d){52})(?:[12]?\d|3[01])(?:\.(?:[12]?\d|3[01]))*$/
Details
^ - start of string
(?!(?:\D*\d){52}) - fail if there are 52 or more digits separated with any 0+ non-digits
(?:[12]?\d|3[01]) - 1 or 2 (optional) followed with any single digit or 3 followed with 0 or 1 (0 - 31)
(?:\.(?:[12]?\d|3[01]))* - zero or more consecutive repetitions of
\. - dot
(?:[12]?\d|3[01]) - see above (0 - 31)
$ - end of string.
Use it with test:
if (/^(?!(?:\D*\d){52})(?:[12]?\d|3[01])(?:\.(?:[12]?\d|3[01]))*$/.test(str)) {
  // Valid!
}
Test:
var rx = /^(?!(?:\D*\d){52})(?:[12]?\d|3[01])(?:\.(?:[12]?\d|3[01]))*$/;
var strs = [".12", "123", "1.23.5.12", "12345678"];
for (var s of strs) {
  console.log(s, "=>", rx.test(s));
}
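The same three rules can also be checked without one large pattern, which may be easier to maintain. This is a sketch, not the answer's method: the function name `isValidSequence` is hypothetical, and like the regex above it rejects leading zeros such as 01 and accepts a single number with no dot.

```javascript
// Hypothetical helper: validates dot-separated numbers 0-31, max 51 digits.
function isValidSequence(str) {
  let digitCount = 0;
  for (const part of str.split('.')) {
    // Each part must be one or two digits with no leading zero.
    // An empty part (leading, trailing or doubled dot) fails here too.
    if (!/^(?:0|[1-9]\d?)$/.test(part)) return false;
    if (Number(part) > 31) return false;
    digitCount += part.length;
  }
  return digitCount <= 51;
}

console.log(isValidSequence('1.23.5.12'));   // → true
console.log(isValidSequence('.1.23.5.12.')); // → false
console.log(isValidSequence('1.32'));        // → false
```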
The regex ^[0-9]+\. matches from the start of the string ^ one or more digits [0-9]+ followed by a dot \.
You might use:
^(?!(\.?\d){52})(?:[0-9]|[12][0-9]|3[01])(?:\.(?:[0-9]|[12][0-9]|3[01]))+$
Explanation
^ Assert the start of the string
(?!(\.?\d){52}) Negative lookahead to assert that what follows is not 52 repetitions of an optional dot followed by a single digit
(?:[0-9]|[12][0-9]|3[01]) Match a number 0 - 31
(?:\.(?:[0-9]|[12][0-9]|3[01]))+ Repeat a group matching a dot followed by a number 0 - 31, one or more times, so that a single number without a dot does not match
$ Assert the end of the string
const strings = [
  '1.23.5.12',
  '1.23.5.12.',
  '.1.23.5.12.',
  '1.23.5.12',
  '1',
  '1.23.5.12.1.23.5.1.23.5.12.1.23.5.1.23.5.12.1.23.5.1.23.5.12.1.23.5.1.23.5.12.1.23.5.2',
  '1.23.5.12.1.23.5.12.1.23.5.12.1.23.5.12.1.23.5.12.1.23.5.12.1.23.5.12.1.23.5.12.1.23.5.12'
];
let pattern = /^(?!(\.?\d){52})(?:[0-9]|[12][0-9]|3[01])(?:\.(?:[0-9]|[12][0-9]|3[01]))+$/;
strings.forEach((s) => {
  console.log(s + " ==> " + pattern.test(s));
});

Format and replace a string with a regular expression

I have a number that's at least 7 digits long.
Typical examples: 0000123, 00001234, 000012345
I want to transform them so that they become respectively:
01:23, 12:34, 23:45
This means replacing the whole string with its last 4 characters and putting a colon in the middle.
I can get the last 4 digits with (\d{4})$
And I can get 2 groups with this: (\d{2})(\d{2})$
With the last option, on a string 0000123 $1:$2 match gives me 00001:23
where I want 01:23
I replace the string like so:
newVal = val.replace(/regex/, '$1:$2');
You need to match the beginning digits with \d* (or with just .* if there can be anything):
var val = "0001235";
var newVal = val.replace(/^\d*(\d{2})(\d{2})$/, '$1:$2');
console.log(newVal);
Pattern details:
^ - start of string
\d* - 0+ digits (or .* will match any 0+ chars other than line break chars)
(\d{2}) - Group 1 capturing 2 digits
(\d{2}) - Group 2 capturing 2 digits
$ - end of string.
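One thing to keep in mind is that replace() returns the input unchanged when the pattern does not match. If that case should be detected, the same pattern can be used with exec() instead. A minimal sketch, with a hypothetical function name `toTime`:

```javascript
// Hypothetical helper: format the last four digits as "dd:dd",
// or return null when the input is not a digit string of length 4 or more.
function toTime(val) {
  const m = /^\d*(\d{2})(\d{2})$/.exec(val);
  return m ? m[1] + ':' + m[2] : null;
}

console.log(toTime('0000123'));   // → 01:23
console.log(toTime('00001234'));  // → 12:34
console.log(toTime('000012345')); // → 23:45
console.log(toTime('12a4'));      // → null
```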
As Alex K. said, no need for a regular expression, just extract the parts you need with substr:
val = val.substr(-4, 2) + ":" + val.substr(-2);
Note that when the starting index is negative, it is counted from the end of the string.
Example:
function update(val) {
  return val.substr(-4, 2) + ":" + val.substr(-2);
}
function test(val) {
  console.log(val + " => " + update(val));
}
test("0000123");
test("0001234");
test("000012345");
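Since substr is now documented as a deprecated legacy method, the same extraction can be written with slice, which also accepts negative indexes. A small sketch, assuming the input always has at least four characters; the name `lastFourAsTime` is only illustrative:

```javascript
// Hypothetical helper: same result as the substr version above, using slice.
// slice(-4, -2) takes the 4th and 3rd characters from the end,
// slice(-2) takes the last two.
function lastFourAsTime(val) {
  return val.slice(-4, -2) + ':' + val.slice(-2);
}

console.log(lastFourAsTime('0000123'));   // → 01:23
console.log(lastFourAsTime('000012345')); // → 23:45
```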
You could throw the first characters away and then replace only the last matched parts.
console.log('00000001234'.replace(/^(.*)(\d{2})(\d{2})$/, '$2:$3'));
Use this regex: ^(\d+?)(\d{2})(\d{2})$:
var newVal = "0000123".replace(/^(\d+?)(\d{2})(\d{2})$/, '$2:$3');
console.log(newVal);

JS regex return string from url

I have the following URL structure:
https://api.bestschool.com/student/1102003120009/tests/json
I want to cut the student ID from the URL. So far I've come up with this:
/(student\/.*[^\/]*)/
which returns
student/1102003120009/tests/json
I only want the ID.
Your regex (student\/.*[^\/]*) matches and captures into Group 1 a literal sequence student/, then matches any characters other than a newline, 0 or more occurrences (.*) - that can match the whole line at once! - and then 0 or more characters other than /. It does not work because of .*. Also, a capturing group should be moved to the [^\/]* pattern.
You can use the following regex and grab Group 1 value:
student\/([^\/]*)
The regex matches student/ literally, and then matches and captures into Group 1 zero or more symbols other than /.
Alternatively, if you want to avoid using capturing, and assuming that the ID is always numeric and is followed by /tests/, you can use the following regex:
\d+(?=\/tests\/)
The \d+ matches 1 or more digits, and (?=\/tests\/) checks if right after the digits there is a /tests/ character sequence.
var re = /student\/([^\/]*)/;
var str = 'https://api.bestschool.com/student/1102003120009/tests/json';
var m = str.match(re);
if (m !== null) {
  document.getElementById("r").innerHTML = "First method : " + m[1] + "<br/>";
}
var m2 = str.match(/\d+(?=\/tests\/)/);
if (m2 !== null) {
  document.getElementById("r").innerHTML += "Second method: " + m2[0];
}
<div id="r"></div>
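When the input is always a full URL, the built-in URL class offers a regex-free alternative. This is a sketch rather than part of the answer above: the function name `getStudentId` is hypothetical, and it assumes the ID is the path segment right after `student`.

```javascript
// Hypothetical helper: return the path segment that follows "student",
// or null when there is no such segment.
function getStudentId(url) {
  // pathname is "/student/1102003120009/tests/json" for the example URL.
  const segments = new URL(url).pathname.split('/');
  const i = segments.indexOf('student');
  return i !== -1 && i + 1 < segments.length ? segments[i + 1] : null;
}

console.log(getStudentId('https://api.bestschool.com/student/1102003120009/tests/json'));
// → 1102003120009
```

Note that `new URL()` throws on strings that are not absolute URLs, so relative paths would need a base argument or a different approach.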
