Split string in JavaScript using a regular expression - javascript

I'm trying to write a regex for use in javascript.
var script = "function onclick() {loadArea('areaog_og_group_og_consumedservice', '\x26roleOrd\x3d1');}";
var match = new RegExp("'[^']*(\\.[^']*)*'").exec(script);
I would like split to contain two elements:
match[0] == "'areaog_og_group_og_consumedservice'";
match[1] == "'\x26roleOrd\x3d1'";
This regex matches correctly when I test it at gskinner.com/RegExr/, but it does not work in my JavaScript. The issue can be reproduced by testing it at http://www.regextester.com/.
I need the solution to work with Internet Explorer 6 and above.
Can any regex gurus help?

Judging by your regex, it looks like you're trying to match a single-quoted string that may contain escaped quotes. The correct form of that regex is:
'[^'\\]*(?:\\.[^'\\]*)*'
(If you don't need to allow for escaped quotes, /'[^']*'/ is all you need.) You also have to set the g flag if you want to get both strings. Here's the regex in its regex-literal form:
/'[^'\\]*(?:\\.[^'\\]*)*'/g
If you use the RegExp constructor instead of a regex literal, you have to double-escape the backslashes: once for the string literal and once for the regex. You also have to pass the flags (g, i, m) as a separate parameter:
var rgx = new RegExp("'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'", "g");
while (result = rgx.exec(script))
print(result[0]);
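As a minimal sketch, here is the same loop run against the string from the question, collecting the matches into an array instead of printing them. The findQuotedStrings name is only an illustration, not part of any library.

```javascript
// Hypothetical helper: collect every single-quoted string, allowing backslash escapes.
function findQuotedStrings(text) {
  var rgx = /'[^'\\]*(?:\\.[^'\\]*)*'/g;
  var found = [];
  var result;
  // With the g flag, each exec() call resumes where the previous match ended.
  while ((result = rgx.exec(text)) !== null) {
    found.push(result[0]);
  }
  return found;
}

var script = "function onclick() {loadArea('areaog_og_group_og_consumedservice', '\x26roleOrd\x3d1');}";
var quoted = findQuotedStrings(script);
console.log(quoted);
```

Note that \x26 and \x3d are string escapes for & and =, so the second element comes back as '&roleOrd=1' with its quotes.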

The regex you're looking for is .*?('[^']*')\s*,\s*('[^']*'). The catch here is that, as usual, match[0] is the entire matched text (this is very normal) so it's not particularly useful to you. match[1] and match[2] are the two matches you're looking for.
var script = "function onclick() {loadArea('areaog_og_group_og_consumedservice', '\x26roleOrd\x3d1');}";
var parameters = /.*?('[^']*')\s*,\s*('[^']*')/.exec(script);
alert("you've done: loadArea("+parameters[1]+", "+parameters[2]+");");
The only issue I have with this is that it's somewhat inflexible. You might want to spend a little time to match function calls with 2 or 3 parameters?
EDIT
In response to your request, here is the regex to match 1,2,3,...,n parameters. If you notice, I used a non-capturing group (the (?: ) part) to find many instances of the comma followed by the second parameter.
/.*?('[^']*')(?:\s*,\s*('[^']*'))*/
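One caveat worth checking before relying on this: in JavaScript, a capturing group that repeats keeps only its last iteration, so with three or more parameters the middle ones are not available from a single exec() call. A rough sketch, using a made-up three-argument call, with a global regex as one possible way to collect them all:

```javascript
var call = "loadArea('a', 'b', 'c');"; // made-up input for illustration

// The repeated group remembers only its final iteration.
var m = /.*?('[^']*')(?:\s*,\s*('[^']*'))*/.exec(call);
console.log(m[1], m[2]); // first parameter and the LAST parameter

// Alternative: walk the string with a global regex and collect every parameter.
var params = [];
var re = /'[^']*'/g;
var p;
while ((p = re.exec(call)) !== null) {
  params.push(p[0]);
}
console.log(params);
```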

Maybe this:
'([^']*)'\s*,\s*'([^']*)'

Related

Javascript: Difference in Regex String and Regex [duplicate]

This question already has answers here:
Differences between Javascript regexp literal and constructor
(2 answers)
Closed 7 years ago.
I have to put a given variable into a regular expression. When I do it with hard coded data it works. Here is my code for that
/^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-/:-?{-~!"^_`\[\]])+((?!ccarn).)*$/
This should (and does) look for a word (a password in this case) that is case sensitive, has at least one capital and one lowercase letter, and one number or symbol. It cannot, however, contain the word "ccarn". Again, when I put this in as my regex, it all works out. When I try to turn it into a string that gets passed in, it doesn't work. Here is my code for that
var regex = new RegExp('/^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-/:-?{-~!"^_`\[\]])+((?!' + $scope.username + ').)*$/');
I feel like I may just be missing something in translation/transition, but can't seem to get it right. TIA
When you use the new RegExp() constructor to construct a regex from a string, you shouldn't include the leading and trailing / within the string. The /.../ form is only to be used when specifying a regex literal, which isn't what you're doing here.
When you do, say, var r = new RegExp('/foo/'), the regex you're actually getting is equivalent to doing var r = /\/foo\//, which clearly isn't what you want. So your constructor should actually look like this:
var regex = new RegExp('^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-/:-?{-~!"^_`\[\]])+((?!' + $scope.username + ').)*$');
// ↑↑ ↑↑
// no "/" at the locations pointed to above
You probably also need to double your backslashes (since backslashes are escape characters in strings, but not in regex literals). So, [0-9##$-/:-?{-~!"^_`\[\]] needs to become [0-9##$-/:-?{-~!"^_`\\[\\]].
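Here is a rough sketch of both points. The pattern is deliberately shortened (the symbol class is swapped for a plain \d), username stands in for $scope.username, and escapeRegExp is a hypothetical helper, not a built-in, added on the assumption that the username could contain regex metacharacters.

```javascript
// With the slashes inside the string, they become literal characters in the pattern.
var wrong = new RegExp('/foo/');
console.log(wrong.test('foo'));   // the pattern now requires literal slashes
console.log(wrong.test('/foo/'));

// Hypothetical helper: escape regex metacharacters in text taken from a variable.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var username = 'ccarn'; // stands in for $scope.username
// Backslashes are doubled because this is a string literal, not a regex literal.
var regex = new RegExp(
  '^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d)((?!' + escapeRegExp(username) + ').)*$'
);

console.log(regex.test('Passw0rd'));
console.log(regex.test('Passccarn0rd'));
```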
If you look closely, the '/' characters are treated as literal characters when you put them inside the quotes, so for
var regex = new RegExp('/^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-/:-?{-~!"^_`\[\]])+((?!' + $scope.username + ').)*$/');
the resulting regular expression is equivalent to
/\/^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-\/:-?{-~!"^_`[]])+((?!ccarn).)*$\//
The right way to go is to remove the '/' characters from the string passed to RegExp, and it should work:
var regex = new RegExp('^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-/:-?{-~!"^_`\[\]])+((?!' + $scope.username + ').)*$');
The output for the above would be
/^(?=.*[a-z])+(?=.*[A-Z])+(?=.*[0-9##$-\/:-?{-~!"^_`[]])+((?!ccarn).)*$/
which is exactly what you need ?
Hope it helps
Please before doing anything else, read a regex tutorial!
Mistakes:
A lookahead is a zero-width assertion (in other words, it's only a test and doesn't consume anything), so putting a quantifier on a zero-width assertion doesn't make any sense: (?=.*[a-z])+ (it is like repeating something empty one or more times. Note that some regex engines will protest if you write something like this.)
When you use the object syntax to define a pattern (i.e. var pattern = new RegExp("...")), you don't need to add delimiters. You need to use double backslashes instead of single backslashes.
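To illustrate the zero-width point, here is a small sketch with made-up inputs: a successful lookahead on its own produces an empty match, and several lookaheads can simply be chained without quantifiers.

```javascript
// A lookahead only tests what follows; the match itself is empty.
var m = /(?=.*[a-z])/.exec('ABCd');
console.log(m[0].length, m.index);

// Chained lookaheads, no quantifiers, built with the constructor (note the doubled backslash).
var pattern = new RegExp('^(?=.*[a-z])(?=.*[A-Z])(?=.*\\d).*$');
console.log(pattern.test('abC1'));
console.log(pattern.test('abc1'));
```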

Regex trying to match characters before and after symbol

I'm trying to match characters before and after a symbol, in a string.
string: budgets-closed
To match the characters before the sign -, I do: ^[a-z]+
And to match the other characters, I try: \-(\w+) but, the problem is that my result is: -closed instead of closed.
Any ideas, how to fix it?
Update
This is the piece of code, where I was trying to apply the regex http://jsfiddle.net/trDFh/1/
I repeat: It's not that I don't want to use split; it's just I was really curious, and wanted to see, how can it be done the regex way. Hacking into things spirit
Update2
Well, using substring is a solution as well: http://jsfiddle.net/trDFh/2/ and it is the one I chose to use, since the if in question is actually an else if in a more complex if statement, and the chosen solution seems to be the best fit for now.
Use exec():
var result=/([^-]+)-([^-]+)/.exec(string);
result is an array, with result[1] being the first captured string and result[2] being the second captured string.
Live demo: http://jsfiddle.net/Pqntk/
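A short sketch of the same call with a null check added, since exec() returns null when the string has no hyphen. The variable names before and after are only for illustration.

```javascript
var string = 'budgets-closed';
var result = /([^-]+)-([^-]+)/.exec(string);

var before = null;
var after = null;
if (result !== null) {
  before = result[1]; // text captured by the first group
  after = result[2];  // text captured by the second group
}
console.log(before, after);

// No hyphen means no match at all.
var noMatch = /([^-]+)-([^-]+)/.exec('budgets');
```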
I think you'll have to match that. You can use grouping to get what you need, though.
var str = 'budgets-closed';
var matches = str.match( /([a-z]+)-([a-z]+)/ );
var before = matches[1];
var after = matches[2];
For that specific string, you could also use
var str = 'budgets-closed';
var before = str.match( /^\b[a-z]+/ )[0];
var after = str.match( /\b[a-z]+$/ )[0];
I'm sure there are better ways, but the above methods do work.
If the symbol is specifically -, then this should work:
\b([^-]+)-([^-]+)\b
You match a boundary, any "not -" characters, a - and then more "not -" characters until the next word boundary.
Also, there is no need to escape a hyphen, it only holds special properties when between two other characters inside a character class.
edit: And here is a jsfiddle that demonstrates it does work.
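For what it's worth, here is a sketch of both points against the string from the question: the escaped and unescaped hyphen behave the same outside a character class, and the "-closed" result in the question is just element [0] (the whole match), while the captured group sits in element [1].

```javascript
var str = 'budgets-closed';

var escaped = /\-(\w+)/.exec(str);
var unescaped = /-(\w+)/.exec(str);

// [0] is the whole match, including the hyphen; [1] is only the captured group.
console.log(escaped[0], escaped[1]);
console.log(unescaped[0], unescaped[1]);

// Inside a character class the hyphen between two characters forms a range.
console.log(/[a-z]/.test('m'));  // range a..z
console.log(/[a\-z]/.test('m')); // only "a", "-" or "z"
```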

javascript regex to require at least one special character

I've seen plenty of regex examples that will not allow any special characters. I need one that requires at least one special character.
I'm looking at a C# regex
var regexItem = new Regex("^[a-zA-Z0-9 ]*$");
Can this be converted to use with javascript? Do I need to escape any of the characters?
Based on an example, I have built this so far:
var regex = "^[a-zA-Z0-9 ]*$";
//Must have one special character
if (regex.exec(resetPassword)) {
isValid = false;
$('#vsResetPassword').append('Password must contain at least 1 special character.');
}
Can someone please identify my error, or guide me down a more efficient path? The error I'm currently getting is that regex has no 'exec' method
Your problem is that "^[a-zA-Z0-9 ]*$" is a string, and you need a regex:
var regex = /^[a-zA-Z0-9 ]*$/; // one way
var regex = new RegExp("^[a-zA-Z0-9 ]*$"); // another way
Other than that, your code looks fine.
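A possibly simpler way to express the same check is to test directly for a character outside the allowed set. This is only a sketch: hasSpecialCharacter is a made-up name, and "special" is assumed to mean anything that is not a letter, a digit or a space, as in the original pattern.

```javascript
// Hypothetical helper: true if the password contains at least one character
// that is not a letter, a digit or a space.
function hasSpecialCharacter(password) {
  return /[^a-zA-Z0-9 ]/.test(password);
}

console.log(hasSpecialCharacter('abc123'));
console.log(hasSpecialCharacter('abc!123'));

// Same result as negating the original "only plain characters" pattern.
var onlyPlain = /^[a-zA-Z0-9 ]*$/;
console.log(!onlyPlain.test('abc!123'));
```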
In JavaScript, regexes are formatted like this:
/^[a-zA-Z0-9 ]*$/
Note that there are no quotation marks and instead you use forward slashes at the beginning and end.
In javascript, you can create a regular expression object two ways.
1) You can use the constructor method with the RegExp object (note the different spelling than what you were using):
var regexItem = new RegExp("^[a-zA-Z0-9 ]*$");
2) You can use the literal syntax built into the language:
var regexItem = /^[a-zA-Z0-9 ]*$/;
The advantage of the second is that you only have to escape a forward slash, you don't have to worry about quotes. The advantage of the first is that you can programmatically construct a string from various parts and then pass it to the RegExp constructor.
Further, the optional flags for the regular expression are passed like this in the two forms:
var regexItem = new RegExp("^[A-Z0-9 ]*$", "i");
var regexItem = /^[A-Z0-9 ]*$/i;
In JavaScript, the more common convention seems to be to use the /regex/ literal syntax that is built into the parser, unless you are dynamically constructing the pattern string or the flags.

JavaScript RegEx Match Failing

I am having issues matching a string using regex in javascript. I am trying to get everything up to the word "at". I am using the following and while it doesn't return any errors, it also doesn't do anything either.
var str = "Team A at Team B";
var matches = str.match(/(.*?)(?=at|$)/);
I tried multiple regex patterns before coming across this SO post, Regex to capture everything before first optional string, but it doesn't seem to return what I want.
Remove the ? at your first capturing group, and |$ from your second, and add ^ to mark beginning of string:
str.match(/^(.*)(?=at)/)
Alternatively (I personally find below easier to read, but your call):
str.substr(0, str.search(/\bat\b/))
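Here is a sketch of the second approach wrapped in a function, with a guard for the case where "at" does not occur (search() returns -1 then). The beforeAt name and the trimming of the trailing space are my own assumptions about what is wanted.

```javascript
// Hypothetical helper: everything before the standalone word "at", trimmed.
function beforeAt(str) {
  var index = str.search(/\bat\b/);
  // search() returns -1 when there is no match; fall back to the whole string.
  return index === -1 ? str : str.slice(0, index).trim();
}

var str = "Team A at Team B";
console.log(beforeAt(str));

// The regex from the answer captures the same text, trailing space included.
var captured = str.match(/^(.*)(?=at)/)[1];
console.log(JSON.stringify(captured));
```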

JavaScript Regex to match a URL in a field of text

How can I set up my regex to test whether a URL is contained in a block of text in JavaScript? I can't quite figure out the pattern to use to accomplish this.
var urlpattern = new RegExp( "(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,#?^=%&:/~\+#]*[\w\-\#?^=%&/~\+#])?"
var txtfield = $('#msg').val() /*this is a textarea*/
if ( urlpattern.test(txtfield) ){
//do something about it
}
EDIT:
So the Pattern I have now works in regex testers for what I need it to do but chrome throws an error
"Invalid regular expression: /(http|ftp|https)://[w-_]+(.[w-_]+)+([w-.,#?^=%&:/~+#]*[w-#?^=%&/~+#])?/: Range out of order in character class"
for the following code:
var urlexp = new RegExp( '(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,#?^=%&:/~\+#]*[\w\-\#?^=%&/~\+#])?' );
Though escaping the dash characters (which can have a special meaning as character range specifiers when inside a character class) should work, one other method for taking away their special meaning is putting them at the beginning or the end of the class definition.
In addition, \+ and \# in a character class are indeed interpreted as + and # respectively by the JavaScript engine; however, the escapes are not necessary and may confuse someone trying to interpret the regex visually.
I would recommend the following regex for your purposes:
(http|ftp|https)://[\w-]+(\.[\w-]+)+([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?
This can be specified in JavaScript either by passing it into the RegExp constructor (like you did in your example), doubling each backslash because the pattern is a string literal:
var urlPattern = new RegExp("(http|ftp|https)://[\\w-]+(\\.[\\w-]+)+([\\w.,#?^=%&:/~+#-]*[\\w#?^=%&/~+#-])?")
or by directly specifying a regex literal, using the // quoting method:
var urlPattern = /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?/
The RegExp constructor is necessary if you accept a regex as a string (from user input or an AJAX call, for instance), and might be more readable (as it is in this case). I am fairly certain that the // quoting method is more efficient, and is at certain times more readable. Both work.
I tested your original and this modification using Chrome both on <JSFiddle> and on <RegexLib.com>, using the Client-Side regex engine (browser) and specifically selecting JavaScript. While the first one fails with the error you stated, my suggested modification succeeds. If I remove the h from the http in the source, it fails to match, as it should!
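For anyone who wants to repeat that check locally, here is a rough sketch comparing the two forms on a made-up sample sentence. Note the doubled backslashes in the constructor version, which a string literal requires.

```javascript
// Regex literal: forward slashes are escaped, backslashes are written once.
var literal = /(http|ftp|https):\/\/[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?/;

// Constructor: no delimiters, but every backslash is doubled inside the string.
var fromString = new RegExp("(http|ftp|https)://[\\w-]+(\\.[\\w-]+)+([\\w.,#?^=%&:/~+#-]*[\\w#?^=%&/~+#-])?");

var text = 'See http://example.com/page?x=1 for details'; // made-up sample text
console.log(literal.exec(text)[0]);
console.log(fromString.exec(text)[0]);

// Removing the leading "h" should make the match fail.
console.log(literal.test('See ttp://example.com/page?x=1 for details'));
```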
Edit
As noted by @noa in the comments, the expression above will not match local network (non-internet) servers or any other servers accessed with a single word (e.g. http://localhost/... or https://sharepoint-test-server/...). If matching this type of URL is desired (which it may or may not be), the following might be more appropriate:
(http|ftp|https)://[\w-]+(\.[\w-]+)*([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?
#------changed----here-------------^
<End Edit>
Finally, an excellent resource that taught me 90% of what I know about regex is Regular-Expressions.info - I highly recommend it if you want to learn regex (both what it can do and what it can't)!
Complete Multi URL Pattern.
UPDATED: Nov. 2020, April & June 2021 (Thanks commenters)
Matches all URI or URL in a string!
Also extracts the protocol, domain, path, query and hash.
([a-z0-9-]+\:\/+)([^\/\s]+)([a-z0-9\-#\^=%&;\/~\+]*)[\?]?([^ \#\r\n]*)#?([^ \#\r\n]*)
https://regex101.com/r/jO8bC4/56
Example JS code with output - every URL is turned into a 5-part array of its 'parts' (protocol, host, path, query, and hash)
var re = /([a-z0-9-]+\:\/+)([^\/\s]+)([a-z0-9\-#\^=%&;\/~\+]*)[\?]?([^ \#\r\n]*)#?([^ \#\r\n]*)/mig;
var str = 'Bob: Hey there, have you checked https://www.facebook.com ?\n(ignore) https://github.com/justsml?tab=activity#top (ignore this too)';
var m;
while ((m = re.exec(str)) !== null) {
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    console.log(m);
}
Will give you the following:
["https://www.facebook.com",
"https://",
"www.facebook.com",
"",
"",
""
]
["https://github.com/justsml?tab=activity#top",
"https://",
"github.com",
"/justsml",
"tab=activity",
"top"
]
You have to escape the backslash when you are using new RegExp.
Also you can put the dash - at the end of character class to avoid escaping it.
&amp; inside a character class means & or a or m or p or ;. You just need to put & and ; there, since a, m and p are already matched by \w.
So, your regex becomes:
var urlexp = new RegExp( '(http|ftp|https)://[\\w-]+(\\.[\\w-]+)+([\\w-.,#?^=%&:/~+#-]*[\\w#?^=%&;/~+#-])?' );
try (http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,#?^=%&:/~\+#]*[\w\-\#?^=%&/~\+#])?
I've cleaned up your regex:
var urlexp = new RegExp('(http|ftp|https)://[a-z0-9\\-_]+(\\.[a-z0-9\\-_]+)+([a-z0-9\\-\\.,#\\?^=%&;:/~\\+#]*[a-z0-9\\-#\\?^=%&;/~\\+#])?', 'i');
Tested and works just fine ;)
Try this general regex for many URL format
/(([A-Za-z]{3,9}):\/\/)?([-;:&=\+\$,\w]+#{1})?(([-A-Za-z0-9]+\.)+[A-Za-z]{2,3})(:\d+)?((\/[-\+~%\/\.\w]+)?\/?([&?][-\+=&;%#\.\w]+)?(#[\w]+)?)?/g
The trouble is that the "-" in the character class (the brackets) is being parsed as a range: [a-z] means "any character between a and z." As Vini-T suggested, you need to escape the "-" characters in the character classes, using a backslash.
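A small sketch of what seems to be happening with the string version: inside a quoted string, \w and \- lose their backslashes before the regex engine ever sees them, so [\w\-_] arrives as [w-_], and the range from "w" down to "_" is out of order. Doubling the backslashes, or moving the dash to the end of the class, avoids it.

```javascript
// The string literal drops the single backslashes, leaving the pattern "[w-_]+".
var threw = false;
try {
  new RegExp('[\w\-_]+');
} catch (e) {
  threw = e instanceof SyntaxError; // "Range out of order in character class"
}
console.log(threw);

// Doubled backslash, dash placed last so it cannot form a range.
var safe = new RegExp('[\\w-]+');
console.log(safe.exec('my-site_1.com')[0]);
```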
Try this; it worked for me:
/^((ftp|http[s]?):\/\/)?(www\.)([a-z0-9]+)\.[a-z]{2,5}(\.[a-z]{2})?$/
that is so simple and understandable
