Here is my js Regex test.
'AAa\nbBB'.match(/AA[.\n]+BB/);//failed match
I thought [.\n]+ could match any characters. Am I wrong?
The dot matches a literal dot inside a character class.
Use 'AAa\nbBB'.match(/AA[\s\S]*BB/); instead.
In most regex flavors, you can set the s flag to let the dot match newlines (for a regex like /AA.*BB/s). JavaScript only gained that flag in ES2018, so in older engines you need to use [\s\S] to match any character.
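The three variants discussed above can be compared side by side. This is only a small sketch; the variable names are made up for illustration, and the last pattern assumes an engine that supports the ES2018 s (dotAll) flag.

```javascript
const input = 'AAa\nbBB';

// Inside a character class the dot is literal, so [.\n] only matches "." or a newline.
const literalDotClass = /AA[.\n]+BB/;

// [\s\S] matches whitespace or non-whitespace, i.e. any character at all.
const anyCharClass = /AA[\s\S]*BB/;

// Assumption: ES2018+ engine. With the s flag the dot also matches newlines.
const dotAll = new RegExp('AA.*BB', 's');

console.log(literalDotClass.test(input)); // false
console.log(anyCharClass.test(input));    // true
console.log(dotAll.test(input));          // true
```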
Related
I have this JavaScript regular expression that checks whether a URI is valid (RFC 3986):
/^(https?):\/\/((?:[a-z0-9.-]|%[0-9A-F]{2}){3,})(?::(\d+))?((?:\/(?:[a-z0-9-._~!$&'()*+,;=:#]|%[0-9A-F]{2})*)*)(?:\?((?:[a-z0-9-._~!$&'()*+,;=:\/?#]|%[0-9A-F]{2})*))?(?:#((?:[a-z0-9-._~!$&'()*+,;=:\/?#]|%[0-9A-F]{2})*))?$/i
Now I need to convert it for use in a MySQL query, using REGEXP.
Eg:
SELECT *
FROM table_name t
WHERE t.uri REGEXP '....'
Could you help me?
You need to
Double all ' chars inside '...' string literals
Replace all (?: with ( as MySQL legacy versions used a POSIX compliant regex engine that does not support non-capturing groups
Remove the leading / and the trailing /i, since MySQL takes the pattern as a string, not as a regex literal. The match is case insensitive by default, so there is no need to add i anywhere (or add A-Z manually in case some global settings are overridden)
Replace all \/ with / just to keep the regex clean
Replace \d with [0-9] (again, POSIX is not aware of shorthand character classes, although you may also use POSIX character classes, e.g. [[:digit:]] to match any digit)
Most likely, replace \? with \\?, or just use [?] to match a literal ? symbol
Always use a literal hyphen at the end or start of a character class (bracket expression in POSIX regex).
Use
WHERE t.uri REGEXP '^https?://(([A-Za-z0-9.-]|%[0-9A-Fa-f]{2}){3,})(:[0-9]+)?((/([A-Za-z0-9._~!$&''()*+,;=:#-]|%[0-9A-Fa-f]{2})*)*)([?](([A-Za-z0-9._~!$&''()*+,;=:/?#-]|%[0-9a-fA-F]{2})*))?(#(([A-Za-z0-9._~!$&''()*+,;=:/?#-]|%[0-9A-Fa-f]{2})*))?$'
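One way to catch transcription mistakes in a conversion like this is to run both patterns against the same inputs in JavaScript. This is only a rough sanity check under the assumption that the converted pattern uses no syntax that JavaScript reads differently; it says nothing about how a particular MySQL version will behave. The sample URIs and variable names are invented for illustration.

```javascript
// Original JavaScript pattern from the question.
const original = /^(https?):\/\/((?:[a-z0-9.-]|%[0-9A-F]{2}){3,})(?::(\d+))?((?:\/(?:[a-z0-9-._~!$&'()*+,;=:#]|%[0-9A-F]{2})*)*)(?:\?((?:[a-z0-9-._~!$&'()*+,;=:\/?#]|%[0-9A-F]{2})*))?(?:#((?:[a-z0-9-._~!$&'()*+,;=:\/?#]|%[0-9A-F]{2})*))?$/i;

// Converted pattern exactly as written inside the SQL string literal ('' is an escaped ').
const sqlLiteralBody = "^https?://(([A-Za-z0-9.-]|%[0-9A-Fa-f]{2}){3,})(:[0-9]+)?((/([A-Za-z0-9._~!$&''()*+,;=:#-]|%[0-9A-Fa-f]{2})*)*)([?](([A-Za-z0-9._~!$&''()*+,;=:/?#-]|%[0-9a-fA-F]{2})*))?(#(([A-Za-z0-9._~!$&''()*+,;=:/?#-]|%[0-9A-Fa-f]{2})*))?$";

// Undo the SQL quote doubling to get the pattern the regex engine would receive.
const converted = new RegExp(sqlLiteralBody.replace(/''/g, "'"));

// Hypothetical sample inputs, some valid and some not.
const samples = [
  'http://example.com/path?x=1#frag',
  'https://example.com:8080/a%20b',
  "http://example.com/it's",
  'ftp://example.com',
  'http://ab',
  'http://example.com/a b',
];

// Collect every input on which the two patterns disagree.
const mismatches = samples.filter((uri) => original.test(uri) !== converted.test(uri));

for (const uri of samples) {
  console.log(JSON.stringify(uri), converted.test(uri));
}
console.log('mismatches:', mismatches.length);
```

If the mismatch list is not empty, the differing inputs point at the part of the pattern that was changed by accident.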
I have a special requirement, where I need to achieve the following:
No special character is allowed except _, and only in between the string.
The string should not start or end with _, . or a numeric value.
An underscore should not be allowed before or after any numeric value.
I am able to achieve most of it, but my RegEx pattern is also allowing other special characters.
How can I modify the RegEx pattern below so that it does not allow any special character apart from an underscore, and that only in the middle of the string?
^[^0-9._]*[a-zA-Z0-9_]*[^0-9._]$
What you might do is use negative lookaheads to assert your requirements:
^(?![0-9._])(?!.*[0-9._]$)(?!.*\d_)(?!.*_\d)[a-zA-Z0-9_]+$
Explanation
^ Assert the start of the string
(?![0-9._]) Negative lookahead to assert that the string does not start with [0-9._]
(?!.*[0-9._]$) Negative lookahead to assert that the string does not end with [0-9._]
(?!.*\d_) Negative lookahead to assert that the string does not contain a digit followed by an underscore
(?!.*_\d) Negative lookahead to assert that the string does not contain an underscore followed by a digit
[a-zA-Z0-9_]+ Match what is specified in the character class one or more times. You can add to the character class what you would allow to match, for example also add a .
$ Assert the end of the string
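The lookahead pattern explained above can be tried against a few inputs. The test strings below are made up to hit each rule once; this is a quick check, not an exhaustive test.

```javascript
const pattern = /^(?![0-9._])(?!.*[0-9._]$)(?!.*\d_)(?!.*_\d)[a-zA-Z0-9_]+$/;

// Hypothetical inputs mapped to the result each rule should produce.
const cases = {
  abc_def: true,  // letters with an underscore in between
  a1b: true,      // digit in the middle, no underscore next to it
  _abc: false,    // starts with an underscore
  abc1: false,    // ends with a digit
  a1_b: false,    // underscore right after a digit
  a_1b: false,    // underscore right before a digit
  'ab$c': false,  // any other special character is rejected
};

for (const [input, expected] of Object.entries(cases)) {
  const actual = pattern.test(input);
  console.log(input, actual, actual === expected ? 'ok' : 'unexpected');
}
```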
Keep it simple. Only allow underscore and alphanumeric regex:
/^[a-zA-Z0-9_]+$/
JavaScript ES6 implementation (works for React):
const re = /^[a-zA-Z0-9_]+$/;
re.test(variable_to_test);
Your opening and closing sections, [^0-9._], say "match ANY character other than these", which includes every other special character.
So you need to change them to list what you do want to match.
/^[A-Z][A-Z0-9_]*[A-Z]$/i
And since you now said one character is valid:
/^[A-Z]([A-Z0-9_]*[A-Z])?$/i
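A short check of what this last pattern accepts, with invented inputs. Note that, as far as the pattern itself shows, it only constrains the first and last character, so an underscore next to a digit in the middle still passes.

```javascript
const letterBounded = /^[A-Z]([A-Z0-9_]*[A-Z])?$/i;

console.log(letterBounded.test('a'));    // true: a single letter is accepted
console.log(letterBounded.test('a_b'));  // true
console.log(letterBounded.test('_ab'));  // false: must start with a letter
console.log(letterBounded.test('ab1'));  // false: must end with a letter
console.log(letterBounded.test('a1_b')); // true: digit/underscore adjacency is not checked
```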
I am getting a string containing newlines (\n), tabs (\t) and lowercase letters [a-z], and I want to split it on the newlines and tabs. It is possible to do that by matching /\n|\t/. AFAIK the dot represents the wildcard.
Therefore I was wondering, why /\n|\t/ doesn't match the same things as /\\./
var text = 'test1 \ntest2';
text.split(/\n/) //['test1', 'test2']
text.split(/\./) //['test1 \ntest2']
text.split(/\\./) //['test1 \ntest2']
Shouldn't the \\. match the \n (newline)?
Let me try and answer all the points:
AFAIK the dot represents the wildcard.
No, in regex, we do not use the term "wildcard". It is a special regex (meta)character. A dot in JavaScript regex matches any character but a newline.
I was wondering, why /\n|\t/ doesn't match the same things as /\\./
Because /\n|\t/ matches 1 symbol, either a newline or tab, while the regex /\\./ matches a literal \ and a character other than a newline.
The \n and \t are escape sequences. That means the \ is not a literal backslash: together with the following symbol it forms a single character that cannot be written otherwise. Indeed, how can we write a line break on paper with a pen? No way!
See more about JavaScript character escape sequences here.
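The difference between an escape sequence and a literal backslash is easy to see by checking string lengths. A minimal sketch, with a made-up variable name:

```javascript
// In a string literal, \n is one character (a line feed), not a backslash plus "n".
console.log('\n'.length);  // 1
console.log('\\n'.length); // 2: a literal backslash followed by "n"

// /\\./ needs a literal backslash followed by any character other than a newline.
const backslashThenAny = /\\./;
console.log(backslashThenAny.test('test1 \ntest2'));  // false: the string contains no backslash
console.log(backslashThenAny.test('test1 \\ntest2')); // true: matches the two characters \ and n
```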
Now,
text.split(/\n/) //['test1', 'test2']
True, your input string contains a line break, thus, you get two elements in the resulting array
text.split(/\./) //['test1 \ntest2']
No match was found because \. matches a literal dot. A dot that is escaped (that has a literal \ before it) in the regex stops being a special regex metacharacter, and just matches its literal representation. Your string has no dot, thus, no matches.
text.split(/\\./) //['test1 \ntest2']
Again, no match is found, as /\\./ looks for a literal \ followed by any character but a newline.
A hint: use your expressions at regex101.com, it will tell you what your regex can match on the right.
Here you have a regex in literal notation (/.../). In literal notation the backslash is passed to the regex engine as written, so you do not have to escape it twice. If you used constructor notation (i.e. RegExp(....)), the pattern would be a string, and you would have to use double escaping. E.g.
var re = /\\./; // is equal to
var re = new RegExp("\\\\.");
See more about constructor and literal notations at MDN RegExp help page.
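The equivalence of the two notations can be checked through the source property. The variable names here are only for illustration.

```javascript
const literal = /\\./;

// In a string, each backslash must itself be escaped, so four backslashes
// in the source code produce the two that the regex engine needs.
const constructed = new RegExp('\\\\.');

console.log(literal.source === constructed.source); // true
console.log(constructed.test('a\\b'));               // true: a backslash followed by "b"

// With only two backslashes the string holds \. which is the regex /\./, a literal dot.
console.log(new RegExp('\\.').test('a\\b'));         // false
```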
\n gets evaluated to a newline, so you're essentially matching against a line break rather than the characters \ and n. If you do a quick console.log('\n'); you can see the output of that.
I am using the jQuery validator plugin in my implementation and need a regular expression that excludes special characters like , and &.
Is there a regular expression for this? If these special characters appear anywhere in the string, it should find them and throw an error.
You can use regular expressions like this:
[\,\&]
You can add as many characters as you want to this.
Try it out yourself on this site:
http://www.regexr.com/
/[,&]/g
matches , and &.
Demo: https://regex101.com/r/gY0mC3/2#javascript
If you want to search for every special character except letters, numbers and the underscore, use
/\W/g
Demo: https://regex101.com/r/gY0mC3/5#javascript
If you need to include spaces (e.g. a name) use
/[^\w\s]/g
Demo: https://regex101.com/r/gY0mC3/4#javascript
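The three patterns can be compared on a couple of made-up inputs. The g flag is left off in this sketch because test() on a global regex keeps its position in lastIndex between calls, which makes repeated checks harder to reason about.

```javascript
const hasCommaOrAmp = /[,&]/;          // only , and &
const hasNonWord = /\W/;               // anything except letters, digits and underscore
const hasNonWordNonSpace = /[^\w\s]/;  // like \W, but whitespace is allowed

console.log(hasCommaOrAmp.test('Tom & Jerry'));        // true
console.log(hasCommaOrAmp.test('Tom and Jerry'));      // false
console.log(hasNonWord.test('Tom and Jerry'));         // true: the spaces count as non-word
console.log(hasNonWordNonSpace.test('Tom and Jerry')); // false
console.log(hasNonWordNonSpace.test('Tom, Jerry'));    // true
```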
The brackets [] define custom regex classes.
To match only those characters, you can do [\,\&].
To match all except that, you can add a ^, such as [^\,\&].
To match any non-word character, you can use \W (any character not a-z, A-Z, 0-9, or _).
To include an underscore, you can do [\W_].
Keep in mind that whitespaces are represented by \s and that depending on your environment, you may need to escape (add an additional backslash to) your backslashes.
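A small sketch of the negated class, [\W_], and the extra backslash needed when the pattern is written as a string. The anchored form ^[^,&]*$ is an assumption about how one might validate a whole input; the names are illustrative.

```javascript
// Every character must be something other than , or & (assumed whole-string check).
const onlyAllowed = /^[^,&]*$/;
console.log(onlyAllowed.test('plain text')); // true
console.log(onlyAllowed.test('a,b'));        // false

// \W plus the underscore: anything that is not a letter or digit.
const nonAlphanumeric = /[\W_]/;
console.log(nonAlphanumeric.test('snake_case')); // true
console.log(nonAlphanumeric.test('camelCase'));  // false

// In a string the backslash has to be doubled to reach the regex engine.
const fromString = new RegExp('[\\W_]');
console.log(fromString.source === nonAlphanumeric.source); // true
```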
I do not know what I am doing wrong. I have this string that I want to replace
<?xml version="1.0" encoding="utf-8" ?>
<Sections>
<Section>
I am using regex to replace everything including <Section>, and leave the rest untouched.
arrayValues[index].replace("/[([.,\n,\s])*<Section>]/", "---");
What is wrong with my regex? Doesn't this mean replace every character, including new line and spaces, up to and including <Section> with ---?
First of all, you need to remove the quotes around your regex—if they're there, the argument won't be processed as a regex. JavaScript will see it as a string (because it is a string) and try to match it literally.
Now that that's taken care of, we can simplify your regex a bit:
arrayValues[index].replace(/[\s\S]*?<Section>/, "---");
[\s\S] gets around the lack of an s flag in JavaScript engines older than ES2018 (a handy option supported by most languages that enables . to match newlines). \s does match newlines (even without an s flag specified), so the character class [\s\S] tells the regex engine to match:
\s - a whitespace character, which could be a newline
OR
\S - a non-whitespace character
So you can think of [\s\S] as matching . (any character except a newline) or the literal \n (a newline). See Javascript regex multiline flag doesn't work for more.
? is used to make the initial [\s\S]* match non-greedy, so the regex engine will stop once it hits the first occurrence of <Section>.
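The effect of the lazy quantifier shows up once the input contains more than one <Section>. The sample below extends the three lines from the question with invented content, so treat it as an assumption about what the rest of the file might look like.

```javascript
// Hypothetical input: the question's header plus two made-up sections.
const xml = [
  '<?xml version="1.0" encoding="utf-8" ?>',
  '<Sections>',
  '<Section>',
  '<Name>first</Name>',
  '</Section>',
  '<Section>',
  '<Name>second</Name>',
].join('\n');

// Lazy: stops at the first <Section>.
const lazy = xml.replace(/[\s\S]*?<Section>/, '---');

// Greedy: runs on to the last <Section>.
const greedy = xml.replace(/[\s\S]*<Section>/, '---');

console.log(lazy);
console.log(greedy);
```

With the lazy version everything after the first <Section> is kept; the greedy version also swallows the first section.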
arrayValues[index].replace("/[([.,\n,\s])*<Section>]/", "---");
What is wrong with my regex?
It's not a regex, it's a string literal. replace() does not convert a string argument into a regex; it searches for it literally, slashes included. Use a regex literal instead:
arrayValues[index].replace(/[\S\s]*<Section>/, "---");
Also, you have too many unnecessary characters in it. The [] around the whole thing build a character class, which is not what you want. The capturing group () just wraps a character class which can be repeated itself. And a dot . inside a character class does match a literal dot, instead of all characters.
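Both points can be verified on a shortened, made-up input: the quoted pattern leaves the string untouched, and a dot inside brackets only matches a dot.

```javascript
// Hypothetical shortened input.
const text = 'a\n<Section>rest';

// A string pattern is searched for literally, slashes and brackets included,
// so nothing is found and the input comes back unchanged.
const unchanged = text.replace('/[([.,\n,\s])*<Section>]/', '---');
console.log(unchanged === text); // true

// A regex literal is interpreted as a pattern.
const replaced = text.replace(/[\S\s]*<Section>/, '---');
console.log(replaced); // ---rest

// Inside [...] the dot is a literal dot, not "any character".
console.log(/[.]/.test('x')); // false
console.log(/[.]/.test('.')); // true
```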