Is there a simple way to check whether the first and last characters of a string are the same, using only a regex?
I know you can check with charAt
var firstChar = str.charAt(0);
var lastChar = str.charAt(str.length - 1);
console.log(firstChar === lastChar);
I'm not asking for this : Regular Expression to match first and last character
You can use a regex with a capturing group and a backreference to it to assert that the string starts and ends with the same character: capture the first character, then require it again at the end. To test for a match, use the RegExp#test method.
var regex = /^(.).*\1$/;
console.log(
regex.test('abcdsa')
)
console.log(
regex.test('abcdsaasaw')
)
Regex explanation here :
^ asserts position at start of the string
1st Capturing Group (.)
.* matches any character (except newline) - between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\1 matches the same text as most recently matched by the 1st capturing group
$ asserts position at the end of the string
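One edge case follows from the breakdown above: the pattern needs both the captured character and a separate occurrence of \1, so a one-character string never matches. Here is a minimal sketch that also accepts single characters, assuming that is the behavior you want (the helper name sameFirstAndLast is made up for illustration):

```javascript
// Hypothetical helper: true when the string starts and ends with the same character.
// The second alternative (^.$) is an assumption: it treats a one-character
// string as "starting and ending with the same character".
function sameFirstAndLast(str) {
  return /^(.).*\1$|^.$/.test(str);
}

console.log(sameFirstAndLast('abcda')); // true
console.log(sameFirstAndLast('abcd'));  // false
console.log(sameFirstAndLast('a'));     // true (false with the original pattern)
console.log(sameFirstAndLast(''));      // false
```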
The . doesn't match newline characters; to handle strings that contain newlines, update the regex as follows.
var regex = /^([\s\S])[\s\S]*\1$/;
console.log(
regex.test(`abcd
sa`)
)
console.log(
regex.test(`ab
c
dsaasaw`)
)
Refer : How to use JavaScript regex over multiple lines?
Regex explanation here :
[.....] - Match a single character present
\s - matches any whitespace character (equal to [\r\n\t\f\v ])
\S - matches any non-whitespace character (equal to [^\r\n\t\f\v ])
Finally, [\s\S] matches any character.
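As an alternative to [\s\S], engines that implement ES2018 accept the s (dotAll) flag, which makes . match newlines too. A short sketch, assuming your runtime supports that flag (the variable names are illustrative):

```javascript
// The s (dotAll) flag makes . match line terminators as well.
const dotAllRegex = /^(.).*\1$/s;
// Same pattern without the flag, for comparison.
const plainRegex = /^(.).*\1$/;

const multiline = 'abcd\nsa';

console.log(dotAllRegex.test(multiline)); // true
console.log(plainRegex.test(multiline));  // false
```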
You can try it
const rg = /^([\w\W]+)[\w\W]*\1$/;
console.log(
rg.test(`abcda`)
)
console.log(
rg.test(`aebcdae`)
)
console.log(
rg.test(`aebcdac`)
)
var rg = /^([ab])([ab]*)\1$|^[ab]$/;
console.log(rg.test('aabbaa'))
console.log(rg.test('a'))
console.log(rg.test('b'))
console.log(rg.test('bab'))
console.log(rg.test('baba'))
This will make sure that characters are none other than a and b which have the same start and end.
It will also match single characters because they too start and end with same character.
I was trying to create a regex that matches numbers whether or not they are wrapped in brackets, for example:
(1.000.000,00) //match
(1.000,00) //match
(100,00) //match
(10) //match
(1) //match
(2.000.000,00 //don't match
(2.000,00 //don't match
(200,00 //don't match
(20 //don't match
(2 //don't match
3.000.000,00) //don't match
3.000,00) //don't match
300,00) //don't match
30) //don't match
3) //don't match
4.000.000,00 //should match
4.000,00 //should match
400,00 //should match
40 //should match
4 //should match
I need to match only numbers (in brackets or not), but only if they have both brackets or none.
At the moment this is what I came up with: \((\d+[\.,]?)+\d*\). It matches the "match" cases and rejects the "don't match" cases, but it should also match the "should match" cases.
I've added the javascript tag because I'm using this regex in JS, and not all regex tokens work in the JS regex constructor.
I'm also posting a regex101 link.
If supported, you can use a negative lookbehind to match either with or without parentheses:
\(\d+(?:[.,]\d+)*\)|(?<!\S)\d+(?:[.,]\d+)*(?!\S)
\( Match (
\d+(?:[.,]\d+)* Match 1+ digits and optionally repeat matching . or , and again 1+ digits
\) Match )
| Or
(?<!\S) Negative lookbehind, assert a whitespace boundary to the left
\d+(?:[.,]\d+)* Match 1+ digits and optionally repeat matching . or , and again 1+ digits
(?!\S) Negative lookahead, assert a whitespace boundary to the right
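To see how the pattern above behaves on a few of the question's samples, here is a small sketch. It assumes a runtime with lookbehind support (ES2018 or later); the names numberPattern, samples and accepted are illustrative only.

```javascript
// Pattern from the answer: a fully parenthesized number, or a bare number
// with whitespace boundaries on both sides.
const numberPattern = /\(\d+(?:[.,]\d+)*\)|(?<!\S)\d+(?:[.,]\d+)*(?!\S)/;

const samples = ['(1.000,00)', '(10)', '(2.000,00', '(2', '3.000,00)', '3)', '4.000,00', '4'];

// Keep only the samples in which the pattern finds a match.
const accepted = samples.filter((s) => numberPattern.test(s));

console.log(accepted); // [ '(1.000,00)', '(10)', '4.000,00', '4' ]
```

Note that the first alternative has no boundary checks, so a parenthesized number embedded in other text would still be found.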
Regex demo
Another option could be matching optional parenthesis at both sides, and only keep the ones that have either an opening and closing parenthesis, or none.
const regex = /\(?\d+(?:[.,]?\d+)*\)?/
const strings = ["(1.000.000,00)", "(1.000,00)", "(100,00)", "(10)", "(1)", "(2.000.000,00", "(2.000,00", "(200,00", "(20", "(2", "3.000.000,00)", "3.000,00)", "300,00)", "30)", "3)", "4.000.000,00", "4.000,00", "400,00", "40", "4"];
strings.forEach(s => {
const m = s.match(regex);
const firstChar = s.charAt(0);
const lastChar = s.charAt(s.length - 1);
if (
m && (
(firstChar !== '(' && lastChar !== ')') ||
(firstChar === '(' && lastChar === ')')
)
) {
console.log(s)
}
});
If you don't want to repeat the part matching numbers (which in this case is short, so maybe an exception to the DRY rule is warranted), you can reach for \(?((\d+[\.,]?)+\d*)\)?(?<=\(\1\)|(^|[^(\d.,])\1(?=($|[^\d.,)]))).
Edit: this is broken.
It will match numbers like (7 as it only matches the number and ignores the parentheses in that case.
Kept here for future reference.
It's usually easier to do regex in multiple passes, but here goes:
/(\((\d+[\.,]?)+\d*\))|(\d+[\.,]?\d*)/gm
You can test it on https://regex101.com/.
It's usually better to process the input in multiple passes; as you can see, a single regex quickly becomes unreadable.
I took your regex and just split it into two regexes: one that requires the parentheses and one that doesn't, then combined them with the or operator.
Note that this regex will allow things like "123.3,5.7" as one number, and the capturing groups will be nasty.
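If inputs like "123.3,5.7" should be rejected, one option is a stricter number pattern built once and reused for both alternatives. The sketch below is only an illustration under an assumed format (dot as thousands separator in groups of three, comma before the decimals, as in the question's samples); the names num and strictNumber are made up.

```javascript
// Assumed number format: groups of three digits separated by ".",
// optionally followed by "," and a decimal part (e.g. 1.000.000,00).
const num = String.raw`\d{1,3}(?:\.\d{3})*(?:,\d+)?`;

// Either "(" number ")" or a bare number, anchored to the whole string.
const strictNumber = new RegExp(`^(?:\\(${num}\\)|${num})$`);

console.log(strictNumber.test('(1.000.000,00)')); // true
console.log(strictNumber.test('400,00'));         // true
console.log(strictNumber.test('(20'));            // false
console.log(strictNumber.test('123.3,5.7'));      // false
```

Under this assumption an unseparated value such as 1000 is rejected as well, so loosen the digit-group part if your data contains such numbers.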
I'm trying to find a regular expression that will match the base string without the optional trailing number (_123). e.g.:
lorem_ipsum_test1_123 -> capture lorem_ipsum_test1
lorem_ipsum_test2 -> capture lorem_ipsum_test2
I tried using the following expression, but it would only work when there is a trailing _number.
/(.+)(?>_[0-9]+)/
/(.+)(?>_[0-9]+)?/
Similarly, adding the ? (zero or one) quantifier only worked when there is no trailing _number; otherwise, the trailing _number would just become part of the first capture.
Any suggestions?
You may use the following expression:
^(?:[^_]+_)+(?!\d+$)[^_]+
^ Anchor beginning of string.
(?:[^_]+_)+ Repeated non capturing group. Negated character set for anything other than a _, followed by a _.
(?!\d+$) Negative lookahead for digits at the end of the string.
[^_]+ Negated character set for anything other than a _.
Regex demo here.
Please note that the \n in the character sets in the Regex demo are only for demonstration purposes, and should by all means be removed when using as a pattern in Javascript.
Javascript demo:
var myString = "lorem_ipsum_test1_123";
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
var myString = "lorem_ipsum_test2"
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
You might match any character and use a negative lookahead that asserts that what follows is not an underscore, one or more digits and the end of the string:
^(?:(?!_\d+$).)*
Explanation
^ Assert start of the string
(?: Non capturing group
(?! Negative lookahead to assert what is on the right side is not
_\d+$ Match an underscore, one or more digits and assert end of the string
) Close negative lookahead
. Match any character except a line break
)* Close non capturing group and repeat zero or more times
Regex demo
const strings = [
"lorem_ipsum_test1_123",
"lorem_ipsum_test2"
];
let pattern = /^(?:(?!_\d+$).)*/;
strings.forEach((s) => {
console.log(s + " ==> " + s.match(pattern)[0]);
});
You are asking for
/^(.*?)(?:_\d+)?$/
See the regex demo. The point here is that the first dot pattern must be non-greedy and the _\d+ should be wrapped with an optional non-capturing group and the whole pattern (especially the end) must be enclosed with anchors.
Details
^ - start of string
(.*?) - Capturing group 1: any zero or more chars other than line break chars, as few as possible due to the non-greedy ("lazy") quantifier *?
(?:_\d+)? - an optional non-capturing group matching 1 or 0 occurrences of _ and then 1+ digits
$ - end of string.
However, it seems easier to use a mere replacing approach,
s = s.replace(/_\d+$/, '')
If the string ends with _ and 1+ digits, the substring will get removed, else, the string will not change.
See this regex demo.
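Both approaches from this answer can be compared side by side. This is a minimal sketch; the helper names baseByCapture and baseByReplace are hypothetical.

```javascript
// Capture approach: lazy group 1, then an optional trailing _digits, anchored.
function baseByCapture(s) {
  const m = /^(.*?)(?:_\d+)?$/.exec(s);
  // Fall back to the input if nothing matched (e.g. the string contains a line break).
  return m ? m[1] : s;
}

// Replace approach: strip a trailing _digits suffix if present.
function baseByReplace(s) {
  return s.replace(/_\d+$/, '');
}

console.log(baseByCapture('lorem_ipsum_test1_123')); // lorem_ipsum_test1
console.log(baseByCapture('lorem_ipsum_test2'));     // lorem_ipsum_test2
console.log(baseByReplace('lorem_ipsum_test1_123')); // lorem_ipsum_test1
console.log(baseByReplace('lorem_ipsum_test2'));     // lorem_ipsum_test2
```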
Try to check if the string contains the trailing number. If it does you get only the other part. Otherwise you get the whole string.
var str = "lorem_ipsum_test1_123"
if(/_[0-9]+$/.test(str)) {
console.log(str.match(/(.+)(?=_[0-9]+)/g))
} else {
console.log(str)
}
Or, a lot more concise:
str = str.replace(/_[0-9]+$/g, "")
I am trying to match a string of the type $word1.word2.word3, which contain dots inside, but should not end with a dot.
In other words:
$context.abc.value, $context.abc.value.random() - should match full string
$context.abc.value. - should match everything except the last character (dot).
My regex for now is:
(?:^|\s)\$(?!\d)[\w.\[\]\(\)]+
Here's a fiddle to play with: https://regex101.com/r/PxCtUv/1
How can I avoid matching the trailing dot character?
You may "decompose" the last [a.]+ pattern into [a.]*[a]:
(?:^|\s)\$(?!\d)[\w.[\]()]*[\w[\]()]
^^^^^^^^^^^^^^^^^^^^
See the regex demo.
Details
(?:^|\s) - a non-capturing group matching either start of string (^) or (|) a whitespace (\s)
\$ - a $ char
(?!\d) - a negative lookahead that fails the match if there is a digit right after the $ char
[\w.[\]()]* - zero or more word, ., [, ], ( or ) chars
[\w[\]()] - a word, [, ], ( or ) char (note that . is not allowed here, so the match cannot end with a dot).
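A quick way to check the decomposed pattern is to run it over a sample string. The sketch below assumes a runtime with String.prototype.matchAll (ES2020); the sample text and the variable names are made up for illustration.

```javascript
// Decomposed pattern: the final character class excludes ".",
// so a match can never end with a dot.
const tokenPattern = /(?:^|\s)\$(?!\d)[\w.[\]()]*[\w[\]()]/g;

const text = '$context.abc.value. and $context.abc.value.random()';

// The (?:^|\s) prefix may include a leading whitespace char, so trim each match.
const tokens = Array.from(text.matchAll(tokenPattern), (m) => m[0].trim());

console.log(tokens); // [ '$context.abc.value', '$context.abc.value.random()' ]
```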
Use the following one:
\$(?!\d)[\w]+([.]{0,1}[\w()]+)+
You can try it the following way: https://regex101.com/r/2ONUHj/1
This will not match the whitespaces before the $ sign.
It allows only one . between consecutive segments.
There could be a lot of other edge cases not defined here.
For further details use the explanation on https://regexr.com/.
You could do it without using regex, if you wanted to:
const data = [
'$context.abc.value',
'$context.abc.value.random()',
'$context.abc.value.'
];
const filtered = data.filter(item => Array.from(item).pop() !== '.');
// [ '$context.abc.value', '$context.abc.value.random()' ]
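The filter above drops entries that end with a dot. If the goal is instead to keep every entry and only cut off a trailing dot, as the question describes, a non-regex sketch could look like this (the names items and trimmed are illustrative):

```javascript
const items = [
  '$context.abc.value',
  '$context.abc.value.random()',
  '$context.abc.value.'
];

// Remove a single trailing "." if there is one; leave other strings untouched.
const trimmed = items.map((item) => (item.endsWith('.') ? item.slice(0, -1) : item));

console.log(trimmed);
// [ '$context.abc.value', '$context.abc.value.random()', '$context.abc.value' ]
```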
I am trying to replace in a formula all floating numbers that miss the preceding zero. Eg:
"4+.5" should become: "4+0.5"
Now I read look behinds are not supported in JavaScript, so how could I achieve that? The following code also replaces, when a digit is preceding:
var regex = /(\.\d*)/,
formula1 = '4+1.5',
formula2 = '4+.5';
console.log(formula1.replace(regex, '0$1')); //4+10.5
console.log(formula2.replace(regex, '0$1')); //4+0.5
Try this regex (\D)(\.\d*)
var regex = /(\D)(\.\d*)/,
formula1 = '4+1.5',
formula2 = '4+.5';
console.log(formula1.replace(regex, '$10$2'));
console.log(formula2.replace(regex, '$10$2'));
You may use
s = s.replace(/\B\.\d/g, '0$&')
See the regex demo.
Details
\B\. - matches a . that is either at the start of the string or is not preceded with a word char (letter, digit or _)
\d - a digit.
The 0$& replacement string is adding a 0 right in front of the whole match ($&).
JS demo:
var s = "4+1.5\n4+.5";
console.log(s.replace(/\B\.\d/g, '0$&'));
Another idea is by using an alternation group that matches either the start of the string or a non-digit char, capturing it and then using a backreference:
var s = ".4+1.5\n4+.5";
console.log(s.replace(/(^|\D)(\.\d)/g, '$10$2'));
The pattern will match
(^|\D) - Group 1 (referred to with $1 from the replacement pattern): start of string (^) or any non-digit char
(\.\d) - Group 2 (referred to with $2 from the replacement pattern): a . and then a digit
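Engines that implement ES2018 do support lookbehind, so the same idea can be written without capturing groups. A sketch under that assumption (the function name addLeadingZeros is hypothetical):

```javascript
// Insert "0" before any "." that is followed by a digit
// and not preceded by a digit. Requires lookbehind support (ES2018+).
function addLeadingZeros(formula) {
  return formula.replace(/(?<!\d)\.(?=\d)/g, '0.');
}

console.log(addLeadingZeros('4+1.5'));  // 4+1.5
console.log(addLeadingZeros('4+.5'));   // 4+0.5
console.log(addLeadingZeros('.4+1.5')); // 0.4+1.5
```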
I have spent the last couple of hours trying to figure out how to match all whitespace (\s) unless followed by AND\s or preceded by \sAND.
I have this so far
\s(?!AND\s)
but it is then matching the space after \sAND, but I don't want that.
Any help would be appreciated.
Often, when you want to split by a single character that appears in specific context, you can replace the approach with a matching one.
I suggest matching all sequences of non-whitespace characters joined with AND enclosed with whitespace ones before and then match any other non-whitespace sequences. Thus, we'll ensure we get an array of necessary substrings:
\S+\sAND\s\S+|\S+
See regex demo
I assume the \sAND\s pattern appears between some non-whitespace characters.
var re = /\S+\sAND\s\S+|\S+/g;
var str = 'split this but don\'t split this AND this';
var res = str.match(re);
document.write(JSON.stringify(res));
As Alan Moore suggests, the alternation can be unrolled into \S+(?:\sAND\s\S+)*:
\S+ - 1 or more non-whitespace characters
(?:\sAND\s\S+)* - 0 or more (thus, it is optional) sequences of...
\s - one whitespace (add + to match 1 or more)
AND - literal AND character sequence
\s - one whitespace (add + to match 1 or more)
\S+ - one or more non-whitespace symbols.
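The unrolled form also behaves differently from the alternation when several AND links are chained. A short sketch with a made-up input string shows the difference:

```javascript
const alternation = /\S+\sAND\s\S+|\S+/g;
const unrolled = /\S+(?:\sAND\s\S+)*/g;

const chained = 'a AND b AND c d';

// The alternation consumes only one "X AND Y" pair per match.
console.log(chained.match(alternation)); // [ 'a AND b', 'AND', 'c', 'd' ]

// The unrolled pattern keeps following "AND" links within one match.
console.log(chained.match(unrolled));    // [ 'a AND b AND c', 'd' ]
```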
Since JS doesn't support lookbehinds, you can use the following trick:
Match (\sAND\s)|\s
Throw away any match where $1 has a value
Here's a short example which replaces the spaces you want with an underscore:
var str = "split this but don't split this AND this";
str = str.replace(/(\sAND\s)|\s/g, function(m, a) {
return a ? m : "_";
});
document.write(str);
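On runtimes that do support lookbehind (ES2018 or later), the original requirement can also be expressed directly. This is a sketch under that assumption, splitting on every whitespace char that is neither followed by "AND" plus whitespace nor preceded by whitespace plus "AND":

```javascript
const sentence = "split this but don't split this AND this";

// (?<!\sAND) : the whitespace is not preceded by " AND"
// (?!AND\s)  : the whitespace is not followed by "AND "
const parts = sentence.split(/(?<!\sAND)\s(?!AND\s)/);

console.log(parts);
// [ 'split', 'this', 'but', "don't", 'split', 'this AND this' ]
```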