I'm doing a string replace with a RegExp that has the global flag set. I only need one replacement, but I'm using global because there's a second set of pattern matching (a mathematical equation that determines acceptable indices for the start of the replacement) that I can't readily express as part of a regex.
var myString = //function-created string
myString = myString.replace(myRegex, function(){
if (/* this index is okay */){
//!! want to STOP searching now !!//
return //my return string
} else {
return arguments[0];
//return the string we matched (no change)
//continue on to the next match
}
}, "g");
If it's even possible, how do I break out of the global string search?
Thanks
Possible Solution
A solution (that doesn't work in my scenario for performance reasons, since I have very large strings with thousands of possible matches to very complex RegExp running hundreds or thousands of times):
var matched = false;
var myString = //function-created string
myString = myString.replace(myRegex, function(){
if (!matched && /* this index is okay */){
matched = true;
//!! want to STOP searching now !!//
return //my return string
} else {
return arguments[0];
//return the string we matched (no change)
//continue on to the next match
}
}, "g");
Use RegExp.exec() instead. Since you only do replacement once, I make use of that fact to simplify the replacement logic.
var myString = "some string";
// NOTE: The g flag is important!
var myRegex = /some_regex/g;
// Default value when no match is found
var result = myString;
var arr = null;
while ((arr = myRegex.exec(myString)) != null) {
// arr.index gives the starting index of the match
if (/* index is OK */) {
// Assign new value to result
result = myString.substring(0, arr.index) +
/* replacement */ +
myString.substring(myRegex.lastIndex);
break;
}
// Increment lastIndex of myRegex if the regex matches an empty string
// This is important to prevent infinite loop
if (arr[0].length == 0) {
myRegex.lastIndex++;
}
}
This code behaves like String.match() with the g flag: it also advances the index by 1 after an empty match to prevent an infinite loop.
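For reuse, the loop above could be wrapped in a function. This is only a minimal sketch: replaceFirstWhere and its isIndexOk callback are hypothetical names, and it assumes an engine that exposes RegExp.prototype.flags (ES2015 or later).

```javascript
// Hypothetical helper: replace only the first match whose start index
// is accepted by the caller-supplied isIndexOk(index) predicate.
function replaceFirstWhere(str, pattern, isIndexOk, replacement) {
  // Work on a global copy so the caller's regex and its lastIndex stay untouched.
  var flags = pattern.flags.indexOf("g") === -1 ? pattern.flags + "g" : pattern.flags;
  var re = new RegExp(pattern.source, flags);
  var m;
  while ((m = re.exec(str)) !== null) {
    if (isIndexOk(m.index)) {
      // Splice the replacement in and stop searching immediately.
      return str.slice(0, m.index) + replacement + str.slice(m.index + m[0].length);
    }
    // Step past empty matches so the loop always makes progress.
    if (m[0].length === 0) {
      re.lastIndex++;
    }
  }
  return str; // no acceptable match: return the input unchanged
}

var out = replaceFirstWhere("a1 a2 a3 a4", /a\d/, function (index) {
  return index >= 6;
}, "X");
console.log(out);
```

The search stops at the first accepted match, so later candidates are never evaluated.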
You can wrap the call in try-catch and reference an undeclared variable (which throws a ReferenceError) to exit the replace function:
var i = 0;
try{
"aaaaa".replace ( /./g, function( a, b ){
//Exit once three matches have been processed (i.e. on the fourth call)
if ( i === 3 ){
stop; //undeclared variable
}
//Increment i
i++
})
}
catch( err ){
}
alert ( "i = " + i ); //Shows 3
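A variation on the same trick is to throw a dedicated sentinel value instead of relying on an undeclared variable, so that genuine errors are not swallowed by the empty catch. The names STOP and countUntil below are hypothetical, and this is a sketch rather than a recommended pattern.

```javascript
// Hypothetical sentinel object used only to abort the replace early.
var STOP = {};

// Counts how many matches are processed before the callback bails out.
function countUntil(str, regex, limit) {
  var calls = 0;
  try {
    str.replace(regex, function (match) {
      if (calls === limit) {
        throw STOP; // abort the global search
      }
      calls++;
      return match;
    });
  } catch (err) {
    if (err !== STOP) {
      throw err; // rethrow anything that is not our sentinel
    }
  }
  return calls;
}

console.log(countUntil("aaaaa", /./g, 3));
```

Because the exception aborts replace, its return value is never produced, so any result you need has to be captured inside the callback.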
I question your logic about performance. I think some points made in the comments are valid. But, what do I know... ;)
However, this is one way of doing what you want. Again, I think this, performance wise, isn't the best...:
var myString = "This is the original string. Let's see if the original will change...";
var myRegex = new RegExp('original', 'g');
var matched=false;
document.write(myString+'<br>');
myString = myString.replace(myRegex, function (match) {
if ( !matched ) {
matched = true;
return 'replaced';
} else {
return match;
}
});
document.write(myString);
It's pretty much like your "Possible Solution". And it doesn't "abort" after the replace (hence my performance reservation). But it does what you asked for. It replaces the first instance, sets a flag and after that just returns the matched string.
Regards.
Related
I'm trying to manipulate a string that has tested as a positive match against my regex statement.
My regex statement is /\[table=\d](.*?)\[\/table]/gmi and an example of a positive match would be [table=1]Cell 1[c]Cell 2[/table]. I'm searching for matches within a certain div, which I'll call .foo in the code below.
However, once the search comes back saying it has found a match, I want to have the section that was identified as a match returned back to me so that I can start manipulating a specific section of it, namely count the number of times [c] appears and reference the number in [table=1].
(function(regexCheck) {
var regex = /\[table=\d](.*?)\[\/table]/gmi;
$('.foo').each(function() {
var html = $(this).html();
var change = false;
while (regex[0].test(html)) {
change = true;
//Somehow return string?
}
});
})(jQuery);
I'm quite new to javascript and especially new to RegEx, so I apologise if this code is crude.
Thanks for all of your help in advance.
Use exec instead of test and keep the resulting match object:
var match;
while ((match = regex.exec(html)) != null) {
change = true;
// use `match[0]` for the full match, or `match[1]` and onward for capture groups
}
Simple example (since your snippet isn't runnable, I've just created a simple one instead):
var str = "test 1 test 2 test 3";
var regex = /test (\d)/g;
var match;
while ((match = regex.exec(str)) !== null) {
console.log("match = " + JSON.stringify(match));
}
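Applied to the [table=...] markup from the question, the same exec loop might look like the sketch below. The function name summarizeTables is hypothetical, and the pattern is an assumed variant of the question's regex with an extra capture group around the digit so the table number can be read from match[1].

```javascript
// Hypothetical helper: for every [table=N]...[/table] block, report the
// table number and how many [c] separators its body contains.
function summarizeTables(html) {
  var regex = /\[table=(\d)](.*?)\[\/table]/gi;
  var tables = [];
  var match;
  while ((match = regex.exec(html)) !== null) {
    tables.push({
      id: Number(match[1]),                          // digit after "table="
      separators: match[2].split("[c]").length - 1   // occurrences of [c]
    });
  }
  return tables;
}

console.log(JSON.stringify(summarizeTables("[table=1]Cell 1[c]Cell 2[/table]")));
```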
I have created a JS fiddle https://jsfiddle.net/95r110s9/#&togetherjs=Emdw6ORNpc
HTML
<input id="landlordstreetaddress2" class="landlordinputs" onfocusout="validateinputentries()" />
JS
validateinputentries(){
landlordstreetaddress2 = document.getElementById('landlordstreetaddress2').value;
goodcharacters = "/^[a-zA-Z0-9#.,;:'\s]+$/gi";
for (var i = 0; i < landlordstreetaddress2.length; i++){
if (goodcharacters.indexOf(landlordstreetaddress2.charAt(i)) != -1){
console.log('Character is valid');
}
}
}
It's pulling the value from an input and running an indexOf regex expression with A-Z, a-z and 0-9, plus a few additional characters.
The problem is that it works with the entry of BCDEFG...etc and 12345...etc, but when I type "A" or "Z" or "0" or "1", it returns incorrectly.
I need it to return the same with 0123456789, ABCDEF...XYZ and abcdef...xyz
I should point out that the below does work as intended:
var badcharacters = "*|,\":<>[]`\';#?=+/\\";
badcharacter = false;
//firstname
for (var i = 0; i < landlordfirstname.value.length; i++){
if (badcharacters.indexOf(landlordfirstname.value.charAt(i)) != -1){
badcharacter = true;
break;
}
if(landlordfirstname.value.charAt(0) == " "){
badcharacter = true;
break;
}
}
String.prototype.indexOf()
The indexOf() method returns the index within the calling String object of the first occurrence of the specified value, starting the search at fromIndex. Returns -1 if the value is not found.
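So indexOf() is doing a plain substring search: each typed character is looked up in the literal text "/^[a-zA-Z0-9#.,;:'\s]+$/gi", which is never interpreted as a pattern. Only characters that literally appear in that text (such as a, z, A, Z, 0 and 9) are found, which is why the results look inconsistent.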
You actually want to test that regexp against the entered value.
/^[a-zA-Z0-9#.,;:'\s]+$/gi.test(landlordstreetaddress2)
function validateinputentries() {
var landlordstreetaddress2 = document.getElementById('landlordstreetaddress2').value;
if (/^[a-zA-Z0-9#.,;:'\s]+$/gi.test(landlordstreetaddress2)) {
console.log('Characters are valid');
} else {
console.log('Characters are invalid');
}
}
<input id="landlordstreetaddress2" class="landlordinputs" onfocusout="validateinputentries()" />
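To see the difference concretely, the sketch below looks characters up in the question's string and then runs the real regex against a sample value. The helper name isFoundLiterally and the sample address are made up for illustration.

```javascript
// The question's "regex" is really just a string of characters.
// Note that "\s" inside a string literal is simply the letter "s".
var goodcharacters = "/^[a-zA-Z0-9#.,;:'\s]+$/gi";

// Hypothetical helper: is this character literally present in that string?
function isFoundLiterally(ch) {
  return goodcharacters.indexOf(ch) !== -1;
}

console.log(isFoundLiterally("A")); // "A" appears in the text "A-Z"
console.log(isFoundLiterally("B")); // "B" is only implied by the range, never written out

// A real regular expression understands the ranges.
var pattern = /^[a-zA-Z0-9#.,;:'\s]+$/;
console.log(pattern.test("12 Baker St."));
```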
You're trying to combine two different methods of testing a string -- one way is with a regex; the other way is by checking each character against a list of allowed characters. What you've wound up with is checking each character against a list of what would have been a regex, if you hadn't declared it as a string.
Those methods conflict with each other; you need to pick one or the other.
Check each character:
This is closest to what you were attempting. You can't use character ranges here (like a-zA-Z) as you would in a regex; you have to spell out each allowed character individually:
var validateinputentries = function() {
var address = document.getElementById('landlordstreetaddress2').value;
var goodcharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789#.,;:' ";
var badcharactersfound = false;
for (var i = 0; i < address.length; i++) {
if (goodcharacters.indexOf(address.charAt(i)) == -1) {
badcharactersfound = true;
console.log("not allowed: ", address.charAt(i));
}
}
if (badcharactersfound) {
// Show validation error here
}
}
<input id="landlordstreetaddress2" class="landlordinputs" onfocusout="validateinputentries()" />
Regular Expressions
The regex version is much simpler, because the regular expression does most of the work. You don't need to step through the string; just test the whole string against the regex. In this case you're looking to see whether the input contains any characters that aren't allowed, so you want a negated character class: [^abc] matches any character that is not a, b, or c. You don't want to anchor the match to the beginning or the end of the string, as you were doing with the initial ^ and the trailing $, and you can leave out the + because you don't care whether there are sequential bad characters, only whether any exist at all.
var validateinputentries = function() {
var address = document.getElementById('landlordstreetaddress2').value;
var regex = new RegExp("[^a-zA-Z0-9#.,;:'\\s]","g")
var badcharactersfound = address.match(regex);
// or the above two lines could also have been written like this:
// var bad = address.match(/[^a-zA-Z0-9#.,;:'\s]/g)
// In either case the "g" flag could be omitted; then it would only return the first bad character.
if (badcharactersfound) {
console.log("Not allowed: ", badcharactersfound);
}
}
<input id="landlordstreetaddress2" class="landlordinputs" onfocusout="validateinputentries()" />
I have strings in which I want to find two words: 'start' and 'end'.
'start' and 'end' always come in pairs (there may be other characters between them, but if there is a 'start', there will be an 'end' too).
I'm trying to write a regex that finds the first 'start' and then its own matching 'end', and returns the substring between them.
Example strings: [in these examples I've added an index to every 'start'/'end' pair just for clarity (the real strings won't have these indexes); the answer is always between the pair with index (1)]
something start something_needed end something // print 'something_needed'
start(1) something start(2) something end(2) something end(1) start something end // print 'something start(2) something end(2) something'
start(1) something start(2) start(3) something end(3) something start(4) end(4) something end(2) something end(1) something start(5) something end(5) // print 'something start(2) start(3) something end(3) something start(4) end(4) something end(2) something'
This is my solution in JavaScript, but I would prefer an answer that uses regex only.
I find all the 'start' positions and then all the 'end' positions, and then, for every start: count++, for every end: count--. When count == 0, that is the position of the correct end.
function getStartEnd(str) {
str = " "+str+" ";
var start = matchPosArr(str, /[\d\s\r\n,\(\)\[\]\{\}]+START+(?=[\d\s\r\n,\(\)\[\]\{\}])/gi);
var end = matchPosArr(str, /[\d\s\r\n,\(\)\[\]\{\}]+END+(?=[\d\s\r\n,\(\)\[\]\{\}])/gi);
var count = 0; // counter
var si = 0; // index of start array
var ei = 0; // index of end array
var isStart = false;
while (true) {
if (ei >= end.length) {
alert('error');
break;
}
else if (si >= start.length) {
ei++;
count--;
if (count == 0) {
ei--;
}
}
else if (start[si] > end[ei]) {
ei++;
count--;
}
else if (start[si] < end[ei]) {
si++;
count++;
}
if (count == 0 && isStart==true) {
break;
}
isStart = true;
}
return str.substring(start[0]+("start ".length),end[ei]);
}
function matchPosArr(str, regEx) {
var pos = [];
var match;
while ((match = regEx.exec(str)) != null) {
pos.push(match.index);
}
return pos;
}
alert( getStartEnd(str) );
Here is a possible solution from Matching Nested Constructs in JavaScript, Part 2.
Example usage:
matchRecursiveRegExp("START text START text END text more END text", "START", "END");
// (c) 2007 Steven Levithan <stevenlevithan.com>
// MIT License
/*** matchRecursiveRegExp
Accepts a string to search, a left and right format delimiter
as regex patterns, and optional regex flags. Returns an array
of matches, allowing nested instances of left/right delimiters.
Use the "g" flag to return all matches, otherwise only the
first is returned. Be careful to ensure that the left and
right format delimiters produce mutually exclusive matches.
Backreferences are not supported within the right delimiter
due to how it is internally combined with the left delimiter.
When matching strings whose format delimiters are unbalanced
to the left or right, the output is intentionally as a
conventional regex library with recursion support would
produce, e.g. "<<x>" and "<x>>" both produce ["x"] when using
"<" and ">" as the delimiters (both strings contain a single,
balanced instance of "<x>").
examples:
matchRecursiveRegExp("test", "\\(", "\\)")
returns: []
matchRecursiveRegExp("<t<<e>><s>>t<>", "<", ">", "g")
returns: ["t<<e>><s>", ""]
matchRecursiveRegExp("<div id=\"x\">test</div>", "<div\\b[^>]*>", "</div>", "gi")
returns: ["test"]
*/
function matchRecursiveRegExp (str, left, right, flags) {
var f = flags || "",
g = f.indexOf("g") > -1,
x = new RegExp(left + "|" + right, "g" + f),
l = new RegExp(left, f.replace(/g/g, "")),
a = [],
t, s, m;
do {
t = 0;
while (m = x.exec(str)) {
if (l.test(m[0])) {
if (!t++) s = x.lastIndex;
} else if (t) {
if (!--t) {
a.push(str.slice(s, m.index));
if (!g) return a;
}
}
}
} while (t && (x.lastIndex = s));
return a;
}
document.write(matchRecursiveRegExp("something start something_needed end something", "start", "end") + "<br/>");
document.write(matchRecursiveRegExp("start something start something end something end start something end", "start", "end")+ "<br/>");
document.write(matchRecursiveRegExp("start something start start something end something start end something end something end something start something end", "start", "end")+ "<br/>");
What you are looking for is to find a 'start', count the number of times another 'start' is found, and then ignore an equal number of 'end's. This cannot be done with regex alone.
It's impossible to compare the number of times two strings match with pure regex.
Instead, here is a semi-regex solution to this problem:
var string = "start(1) something start(2) start(3) something end(3) something start(4) end(4) something end(2) something end(1) something start(5) something end(5)";
var stop;
do {
stop = true;
string = string.replace(/start((?:[^s]|s(?!tart))*?)end/, function($0, $1) {
stop = false;
var result = $1;
//do stuff with result here..
console.log(result);
return ""; //replaces the match with empty so it can continue processing
});
} while (!stop);
What's good about this method is that it is simple, and it handles any depth of nesting.
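If what you need is the content of the first outermost pair, a depth counter driven by a single tokenizing regex is another option. This is a minimal sketch under some assumptions: firstBalanced is a hypothetical name, the markers are the bare words 'start' and 'end' (without the illustrative indexes), and the result is trimmed.

```javascript
// Hypothetical helper: return the text between the first 'start' and the
// 'end' that balances it, or null if no balanced pair exists.
function firstBalanced(str) {
  var token = /\b(start|end)\b/gi; // visits every marker in order
  var depth = 0;
  var contentStart = -1;
  var m;
  while ((m = token.exec(str)) !== null) {
    if (m[1].toLowerCase() === "start") {
      if (depth === 0) {
        contentStart = token.lastIndex; // content begins right after the outer 'start'
      }
      depth++;
    } else if (depth > 0) {
      depth--;
      if (depth === 0) {
        return str.slice(contentStart, m.index).trim();
      }
    }
  }
  return null; // unbalanced input
}

console.log(firstBalanced("start something start something end something end start something end"));
```

The regex only finds the markers; the counting that regex cannot express is done in plain code.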
I'm having a hard time understanding exactly what you want, but if I understand correctly: you could not do this with pure regex in JavaScript, because lookbehind (positive (?<=...) and negative (?<!...)) was not supported before ES2018, so you would not be able to match the 'start(n)' before the match result.
Instead, you can use capture groups (read here through the callback passed to replace):
var string = "something start(1) something_needed end(1) something";
var regex = /start\((\d+)\)(.*)end\(\1\)/;
string.replace(regex, function($0, $1, $2) {
var result = $2;
console.log($2)
//do stuff with $2 here
});
$0 is the original match (start\((\d+)\)(.*)end\(\1\))
$1 and $2 are the groups that are outputted by the regex.
$1 refers to (\d+). It's already used to 'store' the number behind start (1 in this case). But here's where the magic happens: it gets loaded again and matched against with \1 inside the regex.
$2 is where the info you need is stored. it refers to (.*)
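The same groups can also be read without replace, by calling exec on the regex. The sketch below reuses the pattern from above; the second input string is an invented example to show the backreference pairing start(1) with end(1).

```javascript
var regex = /start\((\d+)\)(.*)end\(\1\)/;

// m[0] is the whole match, m[1] the number, m[2] the content in between.
var m = regex.exec("something start(1) something_needed end(1) something");
console.log(m[1], m[2].trim());

// With nesting, \1 forces the closing marker to carry the same number.
var nested = regex.exec("start(1) a start(2) b end(2) c end(1) d");
console.log(nested[2].trim());
```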
Can anyone tell me why this does not work for integers but works for characters? I really hate regular expressions since they are cryptic, but I will use them if I have to. Also, I want to include "-()" in the valid characters as well.
String.prototype.Contains = function (str) {
return this.indexOf(str) != -1;
};
var validChars = '0123456789';
var str = $("#textbox1").val().toString();
if (str.Contains(validChars)) {
alert("found");
} else {
alert("not found");
}
Review
String.prototype.Contains = function (str) {
return this.indexOf(str) != -1;
};
This String "method" returns true if str is contained within itself, e.g. 'hello world'.indexOf('world') != -1 would return true.
var validChars = '0123456789';
var str = $("#textbox1").val().toString();
The value of $('#textbox1').val() is already a string, so the .toString() isn't necessary here.
if (str.Contains(validChars)) {
alert("found");
} else {
alert("not found");
}
This is where it goes wrong; effectively, this executes '1234'.indexOf('0123456789') != -1; it will almost always return false unless you have a huge number like 10123456789.
What you could have done is test each character in str whether they're contained inside '0123456789', e.g. '0123456789'.indexOf(c) != -1 where c is a character in str. It can be done a lot easier though.
Solution
I know you don't like regular expressions, but they're pretty useful in these cases:
if ($("#textbox1").val().match(/^[0-9()]+$/)) {
alert("valid");
} else {
alert("not valid");
}
Explanation
[0-9()] is a character class, comprising the range 0-9 which is short for 0123456789 and the parentheses ().
[0-9()]+ matches at least one character that matches the above character class.
^[0-9()]+$ matches strings for which ALL characters match the character class; ^ and $ match the beginning and end of the string, respectively.
In the end, the whole expression is padded on both sides with /, which is the regular expression delimiter. It's short for new RegExp('^[0-9()]+$').
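Since the question also asks for "-" to be allowed, the character class can be extended. The following is a small sketch; the function name hasOnlyAllowedChars is hypothetical, and the hyphen is placed last in the class so it is read literally rather than as a range.

```javascript
// Hypothetical helper: true when value consists only of digits, "(", ")" and "-".
function hasOnlyAllowedChars(value) {
  return /^[0-9()-]+$/.test(value);
}

console.log(hasOnlyAllowedChars("(123)456-78-90"));
console.log(hasOnlyAllowedChars("12a"));
```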
Assuming you are looking for a function to validate your input, considering a validChars parameter:
String.prototype.validate = function (validChars) {
var mychar;
for(var i=0; i < this.length; i++) {
if(validChars.indexOf(this[i]) == -1) { // Loop through all characters of your string.
return false; // Return false if the current character is not found in 'validChars' string.
}
}
return true;
};
var validChars = '0123456789';
var str = $("#textbox1").val().toString();
if (str.validate(validChars)) {
alert("Only valid characters were found! String validates!");
} else {
alert("Invalid Char found! String doesn't validate.");
}
However, this is quite a lot of code for a string validation. I'd recommend looking into regexes instead. (Jack's got a nice answer up here)
You are passing the entire list of validChars to indexOf(). You need to loop through the characters and check them one-by-one.
Demo
String.prototype.Contains = function (str) {
var mychar;
// Walk every character of this string (not of str)
for(var i=0; i<this.length; i++)
{
mychar = this.substr(i, 1);
if(str.indexOf(mychar) == -1)
{
return false;
}
}
return this.length > 0;
};
To use this on integers, you can convert the integer to a string with String(), like this:
var myint = 33; // define integer
var strTest = String(myint); // convert to string
console.log(strTest.Contains("0123456789")); // validate against chars
I'm only guessing, but it looks like you are trying to check a phone number. One of the simple ways to change your function is to check string value with RegExp.
String.prototype.Contains = function(str) {
var reg = new RegExp("^[" + str +"]+$");
return reg.test(this);
};
But it does not check the sequence of symbols in string.
Checking phone number is more complicated, so RegExp is a good way to do this (even if you do not like it). It can look like:
String.prototype.ContainsPhone = function() {
var reg = new RegExp("^\\([0-9]{3}\\)[0-9]{3}-[0-9]{2}-[0-9]{2}$");
return reg.test(this);
};
This variant will check phones like "(123)456-78-90". It not only checks for a list of characters, but also checks their sequence in string.
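If the list of valid characters is passed in as a string, characters that are special inside a character class (such as "-", "]" or "^") should be escaped before the regex is built. This is a sketch with hypothetical helper names (escapeForCharClass, containsOnly), written as plain functions rather than String.prototype extensions.

```javascript
// Hypothetical helper: escape the characters that have a special meaning
// inside [...] so that each one is matched literally.
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]\[^-]/g, "\\$&");
}

// Hypothetical helper: true when value is non-empty and every character
// of value appears in validChars.
function containsOnly(value, validChars) {
  var reg = new RegExp("^[" + escapeForCharClass(validChars) + "]+$");
  return reg.test(value);
}

console.log(containsOnly("(12)-3", "0123456789-()"));
console.log(containsOnly("12a", "0123456789-()"));
```

Without the escaping step, a validChars string such as "0-9" would silently be treated as a range.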
Thank you all for your answers! Looks like I'll use regular expressions. I've tried all those solutions but really wanted to be able to pass in a string of validChars but instead I'll pass in a regex..
This works for words, letters, but not integers. I wanted to know why it doesn't work for integers. I wanted to be able to mimic the FilteredTextBoxExtender from the ajax control toolkit in MVC by using a custom Attribute on a textBox
Consider the following:
var params = location.search.match(/=([\w\d-]+)&?/g);
console.log(params);
The output is:
["=7&", "=31500&", "=1"]
I don't want any signs there, only digits or words, so I've added parentheses, but it doesn't work. So how do I do it?
Are you getting the querystring parameter? I think this is what you want (although it doesn't use regular expression).
<script type="text/javascript">
<!--
function querySt(ji) {
hu = window.location.search.substring(1);
gy = hu.split("&");
for (i=0;i<gy.length;i++) {
ft = gy[i].split("=");
if (ft[0] == ji) {
return ft[1];
}
}
}
var koko = querySt("koko");
document.write(koko);
document.write("<br>");
document.write(hu);
-->
</script>
Reference: http://ilovethecode.com/Javascript/Javascript-Tutorials-How_To-Easy/Get_Query_String_Using_Javascript.shtml
There's a nice javascript function called gup() which makes this sort of thing simple. Here's the function:
function gup( name )
{
name = name.replace(/[\[]/,"\\\[").replace(/[\]]/,"\\\]");
var regexS = "[\\?&]"+name+"=([^&#]*)";
var regex = new RegExp( regexS );
var results = regex.exec( window.location.href );
if( results == null )
return "";
else
return results[1];
}
and sample usage:
var myVar = gup('myVar');
So, if your querystring looks like this: ?myVar=asdf
myVar will be 'asdf'.
With the g flag, the .match method returns an array of the whole matched strings, not the groupings you have defined with parentheses.
If you want to return just a grouping in a regular expression, you'll have to use the .exec method multiple times, and extract the matched group from the resulting array:
var search = location.search,
param = /=([\w\d-]+)&?/g,
params = [],
match;
while ((match = param.exec(search)) != null) {
params.push(match[1]);
}
console.log(params);
This works because the g flag is used on the regular expression. Every time you call .exec on the param regular expression, its lastIndex property is set to the position just after the match, which makes the next call to .exec start searching from there. The resulting array contains the whole matched string at index 0, then every matched group at subsequent positions. Your group is thus returned at index 1 of the array.
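In engines that support String.prototype.matchAll (ES2020), the same loop can be written more compactly. This is a sketch only: captureValues is a hypothetical name, and the query string with parameter names a, b and c is invented to stand in for location.search.

```javascript
// Hypothetical helper: collect the first capture group of every match.
function captureValues(search) {
  // matchAll requires the g flag and yields one match array per match.
  return Array.from(search.matchAll(/=([\w-]+)&?/g), function (m) {
    return m[1];
  });
}

var values = captureValues("?a=7&b=31500&c=1");
console.log(values);
```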