Could anyone explain the following JavaScript regex code?

Could anyone explain the following example code?
It's from the last example here.
I'm not sure why there is no '\' before the '.'; adding the '\' gives the same result.
JavaScript:
var url = "http://xxx.domain.com";
print(/[^.]+/.exec(url)[0].substr(7)); // prints "xxx"

Note the paragraph here regarding Metacharacters Inside Character Classes
Note that the only special characters or metacharacters inside a character class are the closing bracket (]), the backslash (\), the caret (^) and the hyphen (-). The usual metacharacters are normal characters inside a character class, and do not need to be escaped by a backslash.

It gets the characters up to the first period, then removes the first 7, which are http://, leaving the first part of the domain, which in this case is xxx.
[^.]+ means one or more characters that are not a period, so this matches http://xxx and stops before the first period. Note that the period does not need to be escaped inside the brackets to be treated as a normal character, as it has no special meaning there.
[0] means the entire match, which is http://xxx
.substr(7) means to get the characters after the first 7, which will be xxx
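The steps above can be checked with a minimal sketch. It uses console.log in place of the question's print and slice(7) in place of substr(7); both substitutions are mine, not part of the original snippet.

```javascript
const url = "http://xxx.domain.com";

// Inside a character class the dot is literal, with or without a backslash.
const unescaped = /[^.]+/.exec(url)[0];
const escaped = /[^\.]+/.exec(url)[0];
console.log(unescaped); // http://xxx
console.log(escaped);   // http://xxx

// Drop the 7 characters of "http://" to leave the first label of the host.
const firstLabel = unescaped.slice(7);
console.log(firstLabel); // xxx
```

Both patterns stop at the first period, so escaping the dot inside the brackets is harmless but unnecessary.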

Related

Regular Expression for Blocking a character in begining

I am facing an issue with a regular expression while trying to block any string that has a minus (-) at the beginning, followed by some whitelisted characters.
^(?!-.*$).([a-zA-Z0-9-:#\\,()\\/\\.]+)$
It blocks a minus (-) in the first position and allows it anywhere else in the character sequence, but this regex does not work if the passed string is a single character.
For example, A or 9.
Please help me out with this or give me a good regex for the task.
Your pattern requires at least 2 chars in the input string, because there is a dot after the first lookahead, followed by a character class with + after it (that is, at least 1 occurrence must be present in the string).
So, you need to remove the dot. Also, you do not need to escape most special chars inside a character class; here only the - needs care, and placing it at the end of the class keeps it literal. Besides, to avoid matching strings starting with -, a mere (?!-) will suffice; there is no need to add .*$ there. You may use
^(?!-)[a-zA-Z0-9:#,()/.-]+$
See the regex demo. Remember to escape / if the pattern is used in regex literal notation in JavaScript; there is no need to escape it in constructor notation or in a Java regex pattern.
Details
^ - start of a string
(?!-) - cannot start with -
[a-zA-Z0-9:#,()/.-]+ - 1 or more ASCII letters, digits and special chars defined in the character class (:, #, ,, (, ), /, ., -)
$ - end of string.
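As a rough check of the pattern described above, here is a small JavaScript sketch that builds it both as a regex literal (with the / escaped) and through the RegExp constructor (no escaping needed). The sample inputs are made up for illustration.

```javascript
// Regex literal: the / inside the character class is escaped.
const literal = /^(?!-)[a-zA-Z0-9:#,()\/.-]+$/;
// Constructor notation: the / needs no escaping inside a string.
const constructed = new RegExp("^(?!-)[a-zA-Z0-9:#,()/.-]+$");

// Hypothetical sample inputs, including the single-character cases.
const samples = ["A", "9", "a-b", "-abc", "a b", ""];
for (const s of samples) {
  console.log(JSON.stringify(s), literal.test(s), constructed.test(s));
}
```

Single characters such as A or 9 pass, a-b passes because the minus is not in first position, and -abc is rejected by the lookahead.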
If I understand correctly and you don't want a minus at the beginning, does ^[^-].* work as a regex for you? Java's "matches" would return false if the string starts with a minus.
The String class has a method that does exactly what you are asking for: startsWith(). You could use it in your code like this (read it as "if the given String doesn't start with -, doSomething; otherwise run the else part, which can contain some code or be empty if nothing should happen when the String starts with -"):
if (!yourString.startsWith("-")) {
    doSomething();
} else {
    doNothingOrProvideAnyInformationAboutWrongInput();
}
I think this can help you.
^(?!-).*[a-zA-Z0-9-:#\\,()\/\\.]+$

RegEx breaking when string contains square brackets

I've been using this regular expression to pull out mustached {{Hello}} content:
/{{\s*[\w\.]+\s*}}/g
It falls down when the mustached string contains square brackets. I've been fiddling with it for ages to no avail. Could anyone suggest an adjustment so that it also matches {{Hello[0]}}?
I'm your Huckleberry:
\{\{(.*?)\}\}
I always hack around with these using the excellent http://www.regexr.com/
So, to explain why this works for this situation:
First, consider \{\{: the backslash escapes the first character we are looking for (the curly brace), so the regex engine treats it as a literal character to find rather than as special syntax.
We then repeat that to get the second curly brace.
Next we open a parenthesis ( to make a 'group' that captures multiple tokens, so we can grab everything inside the braces.
The . matches any character except line breaks.
The * matches zero or more of the preceding token (in this case, any character except line breaks).
The ? makes the previous quantifier 'lazy', so it matches as few characters as possible.
Then we close the group with ).
Finally we close out with two more escaped characters: \}\}
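To see the difference between the two patterns, here is a short sketch that runs both against a made-up template string (the string and variable names are illustrative only). The g flag is added to the answer's pattern so that matchAll can walk every placeholder.

```javascript
// Hypothetical template containing three placeholders.
const template = "Hi {{Hello}}, {{Hello[0]}} and {{ user.name }}";

// The question's pattern: only word characters and dots between the braces.
const original = /{{\s*[\w\.]+\s*}}/g;
// The answer's pattern: lazily capture anything up to the next }}.
const lazy = /\{\{(.*?)\}\}/g;

const originalMatches = template.match(original);
const lazyCaptures = Array.from(template.matchAll(lazy), (m) => m[1].trim());

console.log(originalMatches); // [ '{{Hello}}', '{{ user.name }}' ]
console.log(lazyCaptures);    // [ 'Hello', 'Hello[0]', 'user.name' ]
```

The original pattern skips {{Hello[0]}} because [ and ] are not in its character class, while the lazy group accepts any content and stops at the first closing pair of braces.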

I want to ignore square brackets when using javascript regex [duplicate]

This question already has answers here:
Why is this regex allowing a caret?
(3 answers)
Closed 1 year ago.
I am using JavaScript regex to do some data validation and specify the characters that I want to accept (I want to accept any alphanumeric characters, spaces and the following !&,'\- and maybe a few more that I'll add later if needed). My code is:
var value = userInput;
var pattern = /[^A-z0-9 "!&,'\-]/;
if (pattern.test(value)) { /* do something */ }
It works fine and excludes the characters that I don't want the user to enter, except for the square brackets and the caret. According to all the JavaScript regex tutorials I have read, these are special characters: the brackets mean any character between them, and the caret in this instance means any character not between the square brackets. I have searched here and on Google for an explanation of why these characters are also accepted but can't find one.
So can anyone help: why does my input accept the square brackets and the caret?
The reason is that you are using A-z rather than A-Za-z. The ASCII range between Z (0x5a) and a (0x61) includes the square brackets, the backslash, the caret, the backquote, and the underscore.
Your regex is not in line with what you said:
I want to accept any alphanumeric characters, spaces and the following !&,'\- and maybe a few more that I'll add later if needed
If you want to accept only those characters, you need to remove the caret:
var pattern = /^[A-Za-z0-9 "!&,'\\-]+$/;
Notes:
A-z also includes the characters: [\]^_`.
Use A-Za-z, or use the i modifier, to match only letters:
var pattern = /^[a-z0-9 "!&,'\\-]+$/i;
\- is only the character -, because the backslash acts as the escape character. Use \\ to allow a backslash.
^ and $ are anchors that match the beginning and end of the string. This ensures that the whole string is matched against the regex.
+ is used after the character class to match one or more characters.
If you mean that you want to match characters other than the ones you accept and are using this to prevent the user from entering 'forbidden' characters, then the first note above describes your issue. Use A-Za-z instead of A-z (the second note is also relevant).
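A minimal sketch of the whole-string approach from the notes above, using the anchored A-Za-z pattern. The isValid helper and the sample inputs are hypothetical names introduced only for this illustration.

```javascript
// Anchored whitelist: every character must come from the class.
const allowed = /^[A-Za-z0-9 "!&,'\\-]+$/;

// Hypothetical helper wrapping the test.
function isValid(input) {
  return allowed.test(input);
}

console.log(isValid("Rock & Roll")); // true
console.log(isValid("a[b]^"));       // false
console.log(isValid("back\\slash")); // true (the class allows a backslash)
console.log(isValid(""));            // false (+ needs at least one character)
```

With the anchors in place the test answers "is the whole string acceptable?", which is the inverse of the question's original "does it contain a forbidden character?" check.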
I'm not sure what you want but I don't think your current regexp does what you think it does:
It tries to find one character that is not in A-z0-9 "!&,'\- (^ means not).
Also, A-z is probably not what you meant: it is a single range from A to z that covers more than letters. You want a-z or A-Z (or both).
So your current regexp matches strings like "." and "Hi." but not "Hi"
Try this: var pattern = /[^\w"!&,'\\-]/;
Note: \w also includes _, so if you want to avoid that then try
var pattern = /[^a-z0-9"!&,'\\-]/i;
I think the issue with your regex is that A-z is understood as all characters between 0x41 (65) and 0x7A (122), which includes the characters [\]^_` that sit between A-Z and a-z. (Z is 0x5A (90) and a is 0x61 (97), so those characters take up 0x5B through 0x60.)
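The range explanation can be verified directly. This sketch lists the characters between Z and a by code point, then compares the question's A-z class with an A-Za-z version on a made-up input.

```javascript
// Collect every character strictly between "Z" (0x5A) and "a" (0x61).
const between = [];
for (let code = "Z".charCodeAt(0) + 1; code < "a".charCodeAt(0); code++) {
  between.push(String.fromCharCode(code));
}
console.log(between.join("")); // [\]^_`

// Both patterns look for a character that is NOT in the accepted set.
const loose = /[^A-z0-9 "!&,'\-]/;     // the question's class
const strict = /[^A-Za-z0-9 "!&,'\-]/; // with the range split in two

const input = "a[b]^"; // hypothetical user input
console.log(loose.test(input));  // false: nothing looks forbidden
console.log(strict.test(input)); // true: [ is now reported as forbidden
```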

regex returning true even if comma is not enclosed in double quotes

Below is my regular expression.
/^\\"[a-zA-Z0-9!#\$%&\\'\*\+-\/=\?\^_`{\|}~;,:<>()#\[\]]*\\"$/
It works correctly, except that it returns true even if a comma is not enclosed in double quotes.
Why does it show odd behaviour for a comma?
E.g. a:b without quotes returns false, while a,b without quotes returns true.
Experts, can you please help?
Because you are creating a character range here:
/^\\"[a-zA-Z0-9!#\$%&\\'\*\+-\/=\?\^_`{\|}~;,:<>()#\[\]]*\\"$/
                          ^^^^^
This means all characters from + to /, which also includes the , (as well as - and .).
Inside a character class you don't need to escape the usual regex special characters, but another character gets a special meaning there: the -.
So the correct character class would be
/^\\"[a-zA-Z0-9!#$%&\\'*+\-\/=?^_`{\|}~;,:<>()#\[\]]*\\"$/
The alternative is to put the - at the start or the end of the character class; in those positions it does not create a range and does not need escaping.
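The effect of the accidental range is easier to see on a cut-down character class. The three patterns below are simplified stand-ins for the one in the question, not the full expression.

```javascript
// +-\/ forms a range from + (0x2B) to / (0x2F), which contains , - and .
const asRange = /^[a-z+-\/]*$/;
// Escaping the hyphen makes +, - and / three separate literals.
const escapedHyphen = /^[a-z+\-\/]*$/;
// Putting the hyphen last also keeps it literal.
const hyphenLast = /^[a-z+\/-]*$/;

console.log(asRange.test("a,b"));           // true: the comma slips in
console.log(escapedHyphen.test("a,b"));     // false
console.log(hyphenLast.test("a,b"));        // false
console.log(escapedHyphen.test("a-b+c/d")); // true: intended characters still pass
```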

can someone help to explain this regular expression in javascript?

This code is used to strip the MIME type from raw data, but I can't understand how it works:
content.replace(/^[^,]*,/ , '')
It seems quite different from Java... Any help will be appreciated.
Your MIME type is probably at the beginning of your raw data and separated from the rest by a comma.
This regex says: starting from the beginning (^), take everything that is NOT a comma ([^,]*) (the star extends the match over as many characters as possible until a comma is reached), then take the comma itself (,). Then replace it all with nothing ('').
It only removes the first occurrence, because the ^ anchor requires the match to start at the beginning of the string.
The first thing you need to know is that there are regex literals in JavaScript, constructed by pairs of slashes. So like "..." is a string, /.../ is a regex. That's actually the only difference your code shows as compared to a Java regex.
Then, [abc] within a regex is called a character class, meaning "one character out of a, b or c". Conversely, [^abc] is a negated character class, meaning "one character except a, b or c".
So your sample means:
/         # Start of regex literal
^         # Start the match at the start of the string
[^,]*     # Match any number of characters except commas
,         # Match a comma
/         # End of regex literal
The regular expression is the text between the two forward slashes. The first caret (^) means the beginning of the string. The brackets denote a character class, and the caret inside the brackets negates it, so the class means any character except a comma. The asterisk after the closing bracket means zero or more of the characters defined by the class (again, anything but a comma), and the final comma matches the comma that follows. The regex is then used in a replace call, so the matched text is replaced with the second parameter, in your case an empty string.
Basically, it matches everything up to and including the first comma in the 'content' variable and replaces it with an empty string.
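A short sketch of the replacement in action. The content value is an assumption: a data-URI-style string with extra commas in the payload, chosen to show that only the prefix up to the first comma is removed.

```javascript
// Hypothetical raw data: a MIME-type prefix, a comma, then the payload.
const content = "data:image/png;base64,AAAA,BBBB";

// Remove everything up to and including the first comma.
const stripped = content.replace(/^[^,]*,/, "");
console.log(stripped); // AAAA,BBBB

// With no comma at all the pattern cannot match, so the input is unchanged.
const untouched = "plain text".replace(/^[^,]*,/, "");
console.log(untouched); // plain text
```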
