I'm using the exact email regex pattern from the RFC:
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
However, when I paste it into VS:
var emailPattern = /[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i;
it is flagged with an error:
http://i.stack.imgur.com/UkJJE.jpg
Unterminated string constant
How can I remove this error?
I think you need to escape the / characters inside your regexp if you use / as the delimiter, so replace each / inside the pattern with \/.
The first and last / mark where the regexp starts and ends, so any / inside the pattern must be escaped for the parser to know where to stop.
This should work:
var emailPattern = /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i;
And as you pointed out in your comment, the constructor new RegExp("...") lets you build your regexp without escaping /. In that form, each backslash in the pattern has to be doubled inside the string instead.
Both constructs produce equivalent regexps.
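To make the difference between the two constructs concrete, here is a minimal sketch using a shortened stand-in pattern rather than the full email regex. The names literalForm and constructorForm are made up for this illustration.

```javascript
// Shortened stand-in pattern: one or more of a-z, '/' or '.', then "@x".

// Literal form: '/' is the delimiter, so the '/' inside is written as \/.
var literalForm = /^[a-z\/.]+@x$/i;

// Constructor form: there is no delimiter, so '/' needs no escape, but a
// regex backslash must be doubled inside the string ("\\." yields \.).
var constructorForm = new RegExp("^[a-z/\\.]+@x$", "i");

console.log(literalForm.test("a/b.c@x"));     // true
console.log(constructorForm.test("a/b.c@x")); // true
```

Both objects accept and reject the same inputs; only the way the pattern is written in source code differs.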
I am using Node.js to build an application in which I need to process certain strings, and I use the JS RegExp object for this purpose.
I want only part of my regex to be case-insensitive:
var key = '(?i)c(?-i)ustomParam';
var find = '\{(\\b' + key +'\\b:?.*?)\}';
var regex = new RegExp(find,"g");
But it breaks with the following error:
SyntaxError: Invalid regular expression: /{(\b(?i)c(?-i)ustomParam\b:?.*?)}/
I will get the key from an external source such as Redis, and the string to be matched from another external source. I want the first letter to be case-insensitive and the rest of the letters to be case-sensitive.
When I get the key from the external source, I will insert (?i) before the first letter and (?-i) after it.
I even tried this, just for starters, but it didn't work either:
var key ='customParam';
var find = '(?i)\{(\\b' + key +'\\b:?.*?)\}(?-i)';
var regex = new RegExp(find,"g");
I know I can use the "i" flag instead of the above, but that's not my use case. I only did it as a check.
JavaScript's built-in RegExp does not support inline modifiers such as (?im), let alone inline modifier groups that can be placed anywhere inside the pattern (such as (?i:...)).
Even XRegExp cannot offer this functionality: you can only apply (?i) to the whole pattern, by declaring it at the beginning of the pattern.
In XRegExp, you can define the regex ONLY as
var regex = XRegExp('(?i)\\{(\\b' + key +'\\b:?.*?)\\}', 'g');
As of May 27, 2020, neither native JavaScript RegExp nor XRegExp supports inline modifier groups (i.e. (?i:...)), nor (as far as XRegExp is concerned) placing modifiers in the middle of the pattern.
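One possible workaround, which is not part of the answer above, is to emulate the modifier by hand: turn the first letter of the key into a two-character class such as [cC] and leave the rest case-sensitive. The sketch below assumes that approach; escapeRegExp and firstLetterInsensitive are hypothetical helper names.

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern fragment whose first letter matches in either case
// and whose remaining characters are matched exactly.
function firstLetterInsensitive(key) {
  if (key.length === 0) return '';
  var first = key.charAt(0);
  var lower = first.toLowerCase();
  var upper = first.toUpperCase();
  // Characters without a case distinction are simply escaped.
  var head = lower === upper ? escapeRegExp(first) : '[' + lower + upper + ']';
  return head + escapeRegExp(key.slice(1));
}

// Same surrounding pattern as in the question, with backslashes doubled
// because it is built from a string.
var find = '\\{(\\b' + firstLetterInsensitive('customParam') + '\\b:?.*?)\\}';
var regex = new RegExp(find, 'g');

console.log('{CustomParam: 1} {customParam} {CUSTOMParam}'.match(regex));
// [ '{CustomParam: 1}', '{customParam}' ]
```

Because the key comes from an external source, escaping the remainder keeps characters such as . or ( from being interpreted as regex syntax.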
In .NET I am using the [Url(UrlOptions.DisallowProtocol)] data annotation attribute, which validates a URL against a regex (the http/https protocol and www are not mandatory).
The code of this attribute looks like this:
string const regex = new RegExp('^((https?|ftp):\/\/)?(((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|([a-zA-Z][\-a-zA-Z0-9]*)|((([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$');
I want to convert it to a JS validation, but I am facing a lot of difficulties because the regex is so long.
Is there any tool or any easy way to convert this regex to work in JS?
The regular expression seems to work in JavaScript to some extent without any modification - see regex101 demo. Just remember to escape all backslashes (\\ in place of \) and single quotes (\' instead of ') if defining it in a single-quoted JavaScript string:
var jsRegex = new RegExp('^((https?|ftp):\\/\\/)?(((([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(%[\\da-fA-F]{2})|[!\\$&\'\\(\\)\\*\\+,;=]|:)*#)?(((\\d|[1-9]\\d|1\\d\\d|2[0-4]\\d|25[0-5])\\.(\\d|[1-9]\\d|1\\d\\d|2[0-4]\\d|25[0-5])\\.(\\d|[1-9]\\d|1\\d\\d|2[0-4]\\d|25[0-5])\\.(\\d|[1-9]\\d|1\\d\\d|2[0-4]\\d|25[0-5]))|([a-zA-Z][\\-a-zA-Z0-9]*)|((([a-zA-Z]|\\d|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(([a-zA-Z]|\\d|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])*([a-zA-Z]|\\d|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])))\\.)+(([a-zA-Z]|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(([a-zA-Z]|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])*([a-zA-Z]|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])))\\.?)(:\\d*)?)(\\/((([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(%[\\da-fA-F]{2})|[!\\$&\'\\(\\)\\*\\+,;=]|:|#)+(\\/(([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(%[\\da-fA-F]{2})|[!\\$&\'\\(\\)\\*\\+,;=]|:|#)*)*)?)?(\\?((([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(%[\\da-fA-F]{2})|[!\\$&\'\\(\\)\\*\\+,;=]|:|#)|[\\uE000-\\uF8FF]|\\/|\\?)*)?(\\#((([a-zA-Z]|\\d|-|\\.|_|~|[\\u00A0-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFEF])|(%[\\da-fA-F]{2})|[!\\$&\'\\(\\)\\*\\+,;=]|:|#)|\\/|\\?)*)?$', 'i');
// ...or...
var jsRegex = /^((https?|ftp):\/\/)?(((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|([a-zA-Z][\-a-zA-Z0-9]*)|((([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i;
Note: The i modifier assumes that case-insensitive matching is wanted.
(If the above isn't sufficient, please be more specific about what isn't working.)
Is there any tool or any easy way to convert this regex to work in JS?
The only generic tool I'm aware of that supposedly converts between regex flavors is RegexBuddy - but it is paid for software (€29.95) - although if for any reason it didn't work you could get a refund.
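The escaping rules for the string form are easier to check on a short pattern than on the full URL regex. The sketch below uses a small stand-in fragment (a character class followed by a digit), not the real validation pattern; fromLiteral and fromString are illustrative names.

```javascript
// Stand-in fragment: the class [!\$&'\(\)] followed by one digit (\d),
// anchored at both ends.

// Regex literal: the pattern is written exactly as it reads.
var fromLiteral = /^[!\$&'\(\)]\d$/;

// Single-quoted string: every backslash is doubled and the quote is
// written as \' so that it does not end the string.
var fromString = new RegExp('^[!\\$&\'\\(\\)]\\d$');

console.log(fromLiteral.source === fromString.source); // true
console.log(fromString.test("'7"));                    // true
```

Comparing the .source properties is a quick way to confirm that a hand-converted string really describes the same pattern as the literal.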
var YourRegEx = @"^((https?|ftp):\/\/)?(((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|([a-zA-Z][\-a-zA-Z0-9]*)|((([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-zA-Z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-zA-Z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-zA-Z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-fA-F]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$";
var Ismatch = Regex.Match(input, YourRegEx, RegexOptions.IgnoreCase);
if (Ismatch.Success)
{
// does match
}
You can try this: it puts your regex in a C# verbatim string (@"...").
I'm trying to use the great regex presented here for grabbing a video ID from any YouTube-style URL:
parse youtube video id using preg_match
// getting our youtube url from an input field.
var yt_url = $('#yt_url').val();
var regexp = new RegExp('%(?:youtube(?:-nocookie)?\\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\\.be/)([^"&?/ ]{11})%','i');
var videoId = yt_url.match( regexp ) ;
console.log('vid: '+videoId);
My console always gives me a null videoId, though. Am I escaping something incorrectly in my regexp var? I added a second backslash to escape the single backslashes that were already there.
I'm scratching my head.
The % characters are delimiters for the PHP regex you got from the link; JavaScript does not expect delimiters when using new RegExp(). The doubled backslashes (\\.) are correct inside a string and should stay, because a single \. in a JavaScript string collapses to a plain dot. Try:
var regexp = new RegExp('(?:youtube(?:-nocookie)?\\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\\.be/)([^"&?/ ]{11})','i');
Also, you can create a regular expression literally by using JavaScript's /.../ delimiters, but then you'll need to escape all of your /s and use single backslashes:
var regexp = /(?:youtube(?:-nocookie)?\.com\/(?:[^\/]+\/.+\/|(?:v|e(?:mbed)?)\/|.*[?&]v=)|youtu\.be\/)([^"&?\/ ]{11})/i;
Documentation
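As a rough usage sketch: match() returns either null or an array, so the video ID is the first capture group rather than the return value itself. The helper name extractVideoId and the sample URLs are made up for illustration; the pattern is written as a regex literal with single backslashes.

```javascript
// Hypothetical helper wrapping the pattern in regex-literal form.
function extractVideoId(url) {
  var pattern = /(?:youtube(?:-nocookie)?\.com\/(?:[^\/]+\/.+\/|(?:v|e(?:mbed)?)\/|.*[?&]v=)|youtu\.be\/)([^"&?\/ ]{11})/i;
  var m = url.match(pattern);
  // m[0] is the whole match, m[1] is the captured 11-character ID.
  return m ? m[1] : null;
}

console.log(extractVideoId('https://www.youtube.com/watch?v=abcDEF12345')); // abcDEF12345
console.log(extractVideoId('https://youtu.be/abcDEF12345'));                // abcDEF12345
console.log(extractVideoId('https://example.com/'));                        // null
```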
Update:
A quick update to address the comment on efficiency for literal expressions (/ab+c/) vs. constructors (new RegExp("ab+c")). The documentation says:
Regular expression literals provide compilation of the regular expression when the script is loaded. When the regular expression will remain constant, use this for better performance.
And:
Using the constructor function provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.
Since your expression will always be static, I would say creating it literally (the second example) would be slightly faster since it is compiled when loaded (however, don't confuse this into thinking it won't be creating a RegExp object). This small difference is confirmed with a quick benchmark test.
Does anyone know how to find regular expression strings in JavaScript code?
e.g.
var pattern = /some regular expression/;
Is it possible to do this with a regular expression? :)
If I got your question right, and you need a regular expression which would find all the regular expressions in a JavaScript program, then I don't think it is possible. A regular expression in JavaScript does not have to use the // syntax, it can be defined as a string. Even a full-blown JavaScript parser would not be smart enough to detect a regular expression here, for instance:
var re = "abcde";
var regexClass = function() { return RegExp; }
var regex = new regexClass()(re);
So I would give up this idea unless you want to cover only a few very basic cases.
You want a regex to match a regex? Crazy. This might cover the simplest cases.
new RegExp("\/.+\/")
However, I peeked into the JavaScript TextMate bundle, and it has two regexes for finding the start and end of a regex.
begin = '(?<=[=(:]|^|return)\s*(/)(?![/*+{}?])'
end = '(/)[igm]*';
You could probably use these as inspiration toward your goal.
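Loosely following the same idea (a literal tends to come after =, (, :, a comma, return, or a line start), here is a rough sketch that finds simple regex literals. It is an approximation written for this illustration, not the TextMate grammar: it does not handle a / inside a character class, literals inside strings or comments, or regexes built from strings. literalFinder and findRegexLiterals are hypothetical names.

```javascript
// A '/' that follows =, (, :, a comma, "return" or a line start, is not the
// start of a comment or quantifier, and is closed by an unescaped '/' on the
// same line, optionally followed by flags.
var literalFinder = /(?:[=(:,]|^|return)\s*(\/(?![\/*+?])(?:\\.|[^\/\\\n])+\/[gimsuy]*)/gm;

function findRegexLiterals(source) {
  var found = [];
  var m;
  literalFinder.lastIndex = 0; // the regex is global, so reset its state
  while ((m = literalFinder.exec(source)) !== null) {
    found.push(m[1]);
  }
  return found;
}

var sample = [
  'var pattern = /ab+c/gi;',
  'var ratio = a / b / c;',
  '// a comment',
  'return /x\\/y/.test(s);'
].join('\n');

console.log(findRegexLiterals(sample)); // [ '/ab+c/gi', '/x\\/y/' ]
```

The division on the second sample line is skipped only because no / directly follows one of the trigger characters there; code such as x = /2 + 3/ would still fool this sketch.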
Thanks for the answers. I have also found that it is a nearly impossible task, but here is my regex, which parses source code just fine:
this.mainPattern = new RegExp(//single line comment
"(?://.*$)|"+
//multiline comment
"(/\\*.*?($|\\*/))"+
//single or double quote strings
"|(?:(?:\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\")|(?:'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'))"+
//regular expression literal in javascript code
"|(?:(?:[/].+[/])[img]?[\\s]?(?=[;]|[,]|[)]))"+
//brackets
"|([{]|[(]|[\[])|([}]|[)]|[\\]])", 'g');
I am in need of a regular expression that can remove the extension of a filename, returning only the name of the file.
Here are some examples of inputs and outputs:
myfile.png -> myfile
myfile.png.jpg -> myfile.png
I can obviously do this manually (i.e. by removing everything from the last dot onward), but I'm sure there is a regular expression that can do this by itself.
Just for the record, I am doing this in JavaScript.
Just for completeness: How could this be achieved without Regular Expressions?
var input = 'myfile.png';
var output = input.substr(0, input.lastIndexOf('.')) || input;
The || input handles the case where lastIndexOf() returns -1. As you can see, it's still a one-liner.
/(.*)\.[^.]+$/
Result will be in that first capture group. However, it's probably more efficient to just find the position of the rightmost period and then take everything before it, without using regex.
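A small sketch of reading that first capture group, with a fallback for names that have no extension. The function name stripExtension is hypothetical.

```javascript
// Hypothetical helper built on the pattern /(.*)\.[^.]+$/.
function stripExtension(name) {
  var m = name.match(/(.*)\.[^.]+$/);
  // m[1] is the first capture group; return the input unchanged if nothing matched.
  return m ? m[1] : name;
}

console.log(stripExtension('myfile.png'));     // myfile
console.log(stripExtension('myfile.png.jpg')); // myfile.png
console.log(stripExtension('myfile'));         // myfile
```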
The regular expression to match the pattern is:
/\.[^.]*$/
It finds a period character (\.), followed by 0 or more characters that are not periods ([^.]*), followed by the end of the string ($).
console.log(
"aaa.bbb.ccc".replace(/\.[^.]*$/,'')
)
/^(.+)(\.[^ .]+)?$/
Test cases where this works and others fail:
".htaccess" (leading period)
"file" (no file extension)
"send to mrs." (no extension, but ends in abbr.)
"version 1.2 of project" (no extension, yet still contains a period)
The common thread above is, of course, "malformed" file extensions. But you always have to think about those corner cases. :P
Test cases where this fails:
"version 1.2" (no file extension, but "appears" to have one)
"name.tar.gz" (if you view this as a "compound extension" and wanted it split into "name" and ".tar.gz")
How to handle these is problematic and best decided on a project-specific basis.
/^(.+)(\.[^ .]+)?$/
The above pattern is wrong: it will always include the extension too. This is because (.+) is greedy and the (\.[^ .]+) group is optional, so the engine successfully matches the entire string with (.+) alone.
http://cl.ly/image/3G1I3h3M2Q0M
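The greedy behaviour is easy to reproduce. The sketch below also tries a lazy variant, (.+?), which is an assumption of this example and not something proposed in the thread; with it the engine grows the name only as far as needed, so the optional extension group gets a chance to match.

```javascript
var greedy = /^(.+)(\.[^ .]+)?$/;
var lazy = /^(.+?)(\.[^ .]+)?$/; // assumed variant, not from the thread

// Greedy: group 1 swallows the whole string, extension included.
console.log('myfile.png'.match(greedy)[1]);           // myfile.png

// Lazy: group 1 stops before the last extension.
console.log('myfile.png'.match(lazy)[1]);             // myfile
console.log('myfile.png.jpg'.match(lazy)[1]);         // myfile.png
console.log('version 1.2 of project'.match(lazy)[1]); // version 1.2 of project
```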
Here's my tested regexp solution.
The pattern matches the file name, with or without an extension, at the end of a path, respecting both slash and backslash separators.
var path = "c:\\some.path/subfolder/file.ext";
var m = path.match(/([^:\\/]*?)(?:\.([^ :\\/.]*))?$/);
var fileName = (m === null) ? "" : m[1];                        // group 1: name without extension
var fileExt = (m === null || m[2] === undefined) ? "" : m[2];   // group 2: extension, if any
Dissection of the above pattern:
([^:\\/]*?) // match any character, except slashes and colon, 0-or-more times,
// make the token non-greedy so that the regex engine
// will try to match the next token (the file extension)
// capture the file name token to subpattern \1
(?:\. // match the '.' but don't capture it
([^ :\\/.]*) // match file extension
// ensure that the last element of the path is matched by prohibiting slashes
// capture the file extension token to subpattern \2
)?$ // the whole file extension is optional
http://cl.ly/image/3t3N413g3K09
http://www.gethifi.com/tools/regex
This will cover all the cases mentioned by @RogerPate, including full paths too.
Another no-regex way of doing it (the "opposite" of @Rahul's version, not using pop() to remove the extension).
It doesn't need to refer to the variable twice, so it's easier to inline:
filename.split('.').slice(0,-1).join('.')
This will do it as well :)
'myfile.png.jpg'.split('.').reverse().slice(1).reverse().join('.');
I'd stick to the regexp though... =P
var parts = filename.split('.');
parts.pop(); // drop the extension
return parts.join('.');
This will make your wish come true, but not in the regular expression way.
In JavaScript you can call the replace() method, which replaces based on a regular expression.
This regular expression matches everything from the beginning of the line to the end and captures what comes before the last period, so that the last period and anything after it can be removed.
/^(.*)\..*$/
How to implement the replace can be found in this Stack Overflow question:
Javascript regex question
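A minimal sketch of that replace() call, assuming the captured group is put back with '$1'. The function name removeExtension is made up for this example.

```javascript
// Replace the whole match with capture group 1, i.e. the text before the last period.
function removeExtension(filename) {
  return filename.replace(/^(.*)\..*$/, '$1');
}

console.log(removeExtension('myfile.png'));     // myfile
console.log(removeExtension('myfile.png.jpg')); // myfile.png
console.log(removeExtension('myfile'));         // myfile (no period, so nothing is replaced)
```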