I have a regex problem with validation for a region code.
My region code could be just one digit, but it could also be digits separated by '-'.
For example, my region code could be one of the following:
6
6-66
77-7
As you can see, I must have at least one digit, or digits separated by '-', and if they are separated there must be digits after the '-' sign (it does not matter how many). So 6- must not be validated as a legal region code. I have tried for 2 hours to solve this but couldn't, so please help me! Thank you!
/\d+(-\d+)?$/
This will match 6, 6-66, and 77-7, but not 6-.
If you need to match the whole string:
/^\d+(?:-\d+)?$/
or something like this:
if (parseInt(yourstring.split(/-/)[0])>=eval(yourstring)) alert('true');
else alert('false');
But it is more complicated :) and less efficient! And if the input is not a valid expression (6-, for example), eval will throw and your code will crash!
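A minimal sketch of how the anchored pattern above could be wrapped in a check. The helper name isRegionCode and the extra test values are only assumptions for illustration; they are not from the question.

```javascript
// Anchored pattern: one or more digits, optionally followed by "-" and more digits.
const REGION_CODE = /^\d+(?:-\d+)?$/;

// Hypothetical helper name, used only for this example.
function isRegionCode(value) {
  return REGION_CODE.test(value);
}

['6', '6-66', '77-7', '6-', '-6', '6-6-6'].forEach(function (code) {
  console.log(JSON.stringify(code), isRegionCode(code));
});
```

The first three values should print true and the last three false, because the anchors force the whole string to fit the pattern.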
var data = ['6', '6-66', '77-7', '6-'];
var len = data.length;
for(var i=0; i<len; ++i) {
    var current = data[i];
    var result = current.match(/^(\d+|\d+[-]\d+)$/);
    if(result != null) {
        console.log(current);
    }
}
--output:--
6
6-66
77-7
For a quick answer you can try the following:
/^(?:([0-9])|([0-9]-[0-9][0-9])|([0-9][0-9]-[0-9]))$/
or, in case your engine supports Perl-style character classes:
/^(?:(\d)|(\d-\d\d)|(\d\d-\d))$/
Here is what it does:
Between / and / resides a string defining a regular expression.
\d stands for one digit; it could also be written as [0-9].
() defines a sub-expression, so (\d) matches your first, one-digit variant, (\d-\d\d) the second, three-digit style, and (\d\d-\d) the third variant of the three-digit region code.
| works as "OR", like (A)|(B)|(C), so by combining the previous three we get:
/(\d)|(\d-\d\d)|(\d\d-\d)/
Finally, ^ means start of string and $ means end of string. Because | binds more loosely than the anchors, the three alternatives are wrapped in a non-capturing group (?:...) so that ^ and $ apply to all of them. Note that this pattern only accepts the exact shapes of the three examples (6, 6-66, 77-7), not arbitrary digit counts.
There is also a so-called BRE mode (in which you have to add a "\" symbol before each parenthesis), but I think that is not the case here. However, if you have some free time, please consider a quick tutorial like this one.
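One thing worth checking with patterns of this shape: | has the lowest precedence in a regex, so ^ and $ written outside an ungrouped alternation attach only to the first and last branch. A small sketch of the difference (the names loose and grouped are just for illustration):

```javascript
// Anchors outside an ungrouped alternation: ^ belongs to the first branch only,
// $ belongs to the last branch only.
const loose = /^(\d)|(\d-\d\d)|(\d\d-\d)$/;

// Same alternatives inside a non-capturing group: both anchors apply to every branch.
const grouped = /^(?:\d|\d-\d\d|\d\d-\d)$/;

['6', '6-66', '77-7', '6-', 'x77-7', '1234'].forEach(function (s) {
  console.log(JSON.stringify(s), 'loose:', loose.test(s), 'grouped:', grouped.test(s));
});
```

With the ungrouped version, 6-, x77-7 and 1234 are all accepted because one anchored branch matches a part of the string; the grouped version rejects them.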
I am trying to understand some code where a number is converted to a currency format. Thus, if you have 16.9 it converts to $16.90. The problem with the code is if you have an amount over $1,000, it just returns $1, an amount over $2,000 returns $2, etc. Amounts in the hundreds show up fine.
Here is the function:
var _formatCurrency = function(amount) {
    return "$" + parseFloat(amount).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,')
};
(The reason the semicolon is after the bracket is because this function is in itself a statement in another function. That function is not relevant to this discussion.)
I found out that the person who originally put the code in there found it somewhere but didn't fully understand it and didn't test this particular scenario. I myself have not dealt much with regular expressions. I am not only trying to fix it, but to understand how it is working as it is now.
Here's what I've found out. The code between the forward slash after the open parenthesis and the forward slash before the g is the pattern. The g means global search. The \d means digit, and the (?=(\d{3})+\.) appears to mean find 3 digits plus a decimal point. I'm not sure I have that right, though, because if that were correct, shouldn't it ignore numbers like 5.4? That works fine. Also, I'm not sure what the '$1,' is for. It looks to me like it is supposed to be placed where the digits are, but wouldn't that change all the numbers to $1? Also, why is there a comma after the 1?
Regarding your comment
I was hoping to just edit the regex so it would work properly.
The regex you are currently using is obviously not working for you so I think you should consider alternatives even if they are not too similar, and
Trying to keep the code change as small as possible
Understandable but sometimes it is better to use a code that is a little bit bigger and MORE READABLE than to go with compact and hieroglyphical.
Back to business:
I'm assuming you are getting a string as an argument, and that this string is composed only of digits and may or may not have a dot before the last 1 or 2 digits. Something like
//input //intended output
1 $1.00
20 $20.00
34.2 $34.20
23.1 $23.10
62516.16 $62,516.16
15.26 $15.26
4654656 $4,654,656.00
0.3 $0.30
I will let you do a pre-check of (assumed) non-valids like 1. | 2.2. | .6 | 4.8.1 | 4.856 | etc.
Proposed solution:
var _formatCurrency = function(amount) {
    amount = "$" + amount.replace(/(\d)(?=(\d{3})+(\.(\d){0,2})*$)/g, '$1,');
    if (amount.indexOf('.') === -1)
        return amount + '.00';
    var decimals = amount.split('.')[1];
    return decimals.length < 2 ? amount + '0' : amount;
};
Regex break down:
(\d): Matches one digit. Parentheses group things for referencing when needed.
(?=(\d{3})+(\.(\d){0,2})*$). Now this guy. From end to beginning:
$: Matches the end of the string. This is what allows you to match from the end instead of the beginning which is very handy for adding the commas.
(\.(\d){0,2})*: This part processes the dot and decimals. The \. matches the dot. (\d){0,2} matches 0, 1 or 2 digits (the decimals). The * implies that this whole group can be empty.
?=(\d{3})+: \d{3} matches 3 digits exactly. + means at least one occurrence. Finally ?= matches a group after the main expression without including it in the result. In this case it takes three digits at a time (from the end remember?) and leaves them out of the result for when replacing.
g: Match and replace globally, the whole string.
Replacing with $1,: This is how captured groups are referenced for replacing, in this case the wanted group is number 1. Since the pattern will match every digit in the position 3n+1 (starting from the end or the dot) and catch it in the group number 1 ((\d)), then replacing that catch with $1, will effectively add a comma after each capture.
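To make the breakdown above more concrete, here is a small sketch that lists which digits the pattern selects before any replacing happens. The helper name commaPositions is made up for this illustration.

```javascript
// Same pattern as in the proposed solution.
const pattern = /(\d)(?=(\d{3})+(\.(\d){0,2})*$)/g;

// Hypothetical helper: indexes of the digits that will get a comma after them.
function commaPositions(text) {
  return Array.from(text.matchAll(pattern), function (m) { return m.index; });
}

console.log(commaPositions('4654656'));          // [ 0, 3 ]
console.log('4654656'.replace(pattern, '$1,'));  // 4,654,656
console.log('62516.16'.replace(pattern, '$1,')); // 62,516.16
```

Only digits that are followed by whole groups of three digits up to the dot (or the end of the string) are matched, which is why the decimals never receive a comma.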
Try it and please give feedback.
Also if you haven't already you should (and SO has not provided me with a format to stress this enough) really really look into this site as suggested by Taplar
The pattern is invalid, and your understanding of the function is incorrect. This function formats a number in a standard US currency, and here is how it works:
The parseFloat() function converts a string value to a decimal number.
The toFixed(2) function rounds the decimal number to 2 digits after the decimal point.
The replace() function is used here to add the thousands separators (i.e. a comma after every 3 digits). The pattern is incorrect, so here is a suggested fix /(\d)(?=(\d{3})+\.)/g and this is how it works:
The (\d) captures a digit.
The (?=(\d{3})+\.) is called a look-ahead, and it ensures that the captured digit above is followed by one or more (+) sets of 3 digits (\d{3}) and then the decimal point (\.).
The g flag/modifier is to apply the pattern globally, that is on the entire amount.
The replacement $1, replaces the pattern with the first captured group $1, which is in our case the digit (\d) (so technically replacing the digit with itself to make sure we don't lose the digit in the replacement) followed by a comma ,. So like I said, this is just to add the thousands separator.
Here are some tests with the suggested fix. Note that it works fine with numbers and strings:
var _formatCurrency = function(amount) {
    return "$" + parseFloat(amount).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,');
};
console.log(_formatCurrency('1'));
console.log(_formatCurrency('100'));
console.log(_formatCurrency('1000'));
console.log(_formatCurrency('1000000.559'));
console.log(_formatCurrency('10000000000.559'));
console.log(_formatCurrency(1));
console.log(_formatCurrency(100));
console.log(_formatCurrency(1000));
console.log(_formatCurrency(1000000.559));
console.log(_formatCurrency(10000000000.559));
Okay, I want to apologize to everyone who answered. I did some further tracing and found out that the JSON call which was bringing in the amount did in fact have a comma in it, so parseFloat was only parsing the digits before that comma. I was looking in the wrong place in the code when I thought there was no comma in there already. I do appreciate everyone's input and hope you won't think too badly of me for not catching that before this whole exercise. If nothing else, at least I now know how that regex operates so I can make use of it in the future. Now I just have to go about removing that comma.
Have a great day!
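For anyone landing here with the same symptom, a rough sketch of the comma problem described in the follow-up above and one way to handle it. The function name formatCurrencyFromRaw is hypothetical, and the sketch assumes that any commas in the incoming value are thousands separators only.

```javascript
// parseFloat stops at the first character it cannot parse, so "1,234.5" becomes 1.
console.log(parseFloat('1,234.5')); // 1

// Hypothetical variant of the original function that strips commas first.
// Assumption: commas in the input are only ever thousands separators.
var formatCurrencyFromRaw = function (amount) {
  var cleaned = String(amount).replace(/,/g, '');
  return "$" + parseFloat(cleaned).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,');
};

console.log(formatCurrencyFromRaw('1,234.5')); // $1,234.50
console.log(formatCurrencyFromRaw(16.9));      // $16.90
```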
Assuming that you are working with USD only, then this should work for you as an alternative to Regular Expressions. I have also included a few tests to verify that it is working properly.
var test1 = '16.9';
var test2 = '2000.5';
var test3 = '300000.23';
var test4 = '3000000.23';
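function stringToUSD(inputString) {
    // Round to two decimals first, then split into whole and fractional parts.
    const splitValues = parseFloat(inputString).toFixed(2).split('.');
    const wholeNumber = splitValues[0].split('')
        .reverse()
        // Counting from the right, put a comma before every third digit that still has a digit to its left.
        .map((val, idx, arr) => (idx + 1) % 3 === 0 && arr[idx + 1] !== undefined ? `,${val}` : val)
        .reverse()
        .join('');
    // Build the result as a string; calling parseFloat on a value containing commas would cut it off at the first comma.
    return `$${wholeNumber}.${splitValues[1]}`;
}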
console.log(stringToUSD(test1));
console.log(stringToUSD(test2));
console.log(stringToUSD(test3));
console.log(stringToUSD(test4));
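If a regular expression is not required at all, the built-in Intl.NumberFormat API is another option. This is a different technique from the one in the answer above and is shown only for comparison; the variable name usd is just for this sketch.

```javascript
// Built-in formatter for US dollars (adds "$", thousands separators, two decimals).
const usd = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });

console.log(usd.format(16.9));        // $16.90
console.log(usd.format(2000.5));      // $2,000.50
console.log(usd.format(3000000.23));  // $3,000,000.23
```

Note that format expects a number, so a string from JSON would still need to be converted (and cleaned of commas) first.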
I'm using RPG Maker MV which is a game creator that uses JavaScript to create plugins. I have a plugin in JavaScript already, however I'm trying to edit a part of the plugin so that it basically checks if a certain string exists in a character in the game and if it does, then sets specific variables to numbers within that string.
for (var i = 0; i < page.list.length; i++) {
    if (page.list[i].code == 108 && page.list[i].parameters[0].contains("<post:" + (n) + "," + (n) + ">")) {
        var post = page.list[i].parameters[0];
        var array = post.split(',');
        this._origMovement.x = Number(array[1]);
        this._origMovement.y = Number(array[1]);
        break;
    };
};
So I know the first 2 lines work and contains works when I only put a specific string. However I can't figure out how to check for 2 numbers that are separated by a comma and wrapped in '<>' tags, without knowing what the numbers would be.
Then it needs to extract those numbers and assign one to this._origMovement.x and the other to this._origMovement.y.
Any help would be greatly appreciated.
This is one of those rare cases where I'd use a regular expression. If you haven't come across regular expressions before I suggest reading an introduction to them, such as this one: https://regexone.com/
In your case, you probably want something like this:
var myRegex = /<post:(\d+),(\d+)>/;
var matches = myParameter.match(myRegex);
if (matches) { // match() returns null when the tag is not found
    this._origMovement.x = Number(matches[1]); // the first number
    this._origMovement.y = Number(matches[2]); // the second number
}
The myRegex variable is a regular expression that looks for the pattern you describe, and has 2 capture groups which look for a string of one or more digits (\d+ means "one or more digits"). The result of the .match() call gives you an array containing the entire match and the results of the capture groups.
If you want to allow for decimal numbers, you'll need to use a different capture group that allows for a decimal point, such as ([\d\.]+), which means "a sequence of one or more digits and decimal points", or the more sophisticated (\d+\.?\d*), which is "a sequence of one or more digits, followed by an optional decimal point, followed by zero or more digits".
There are lots of good tutorials around to help you write good regular expressions, and sites that will help you live-test your expressions to make sure they work correctly. They're a powerful tool, but be careful not to over-use them!
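As a rough sketch of the decimal-friendly capture group described above, here is a standalone version that can be run outside RPG Maker MV. The helper name parsePost is hypothetical and not part of the engine or the plugin.

```javascript
// Hypothetical helper: extracts the two numbers from a "<post:x,y>" tag.
// Each group is (\d+\.?\d*): digits, an optional dot, then optional digits.
function parsePost(comment) {
  var match = /<post:(\d+\.?\d*),(\d+\.?\d*)>/.exec(comment);
  if (!match) {
    return null; // no tag found
  }
  // Captured groups are strings, so convert them to numbers.
  return { x: Number(match[1]), y: Number(match[2]) };
}

console.log(parsePost('<post:12,7>'));   // { x: 12, y: 7 }
console.log(parsePost('<post:1.5,20>')); // { x: 1.5, y: 20 }
console.log(parsePost('no tag here'));   // null
```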
Got it to work. For anyone who may ever be interested, the code is below.
for (var i = 0; i < page.list.length; i++) {
    if (page.list[i].code == 108 && page.list[i].parameters[0].contains("<post:")) {
        var myRegex = /<post:(\d+),(\d+)>/;
        var matches = page.list[i].parameters[0].match(myRegex);
        this._origMovement.x = matches[1]; //the first number
        this._origMovement.y = matches[2]; //the second number
        break;
    }
};
This regular expression looks for words with 3 or fewer characters so that a non-breaking space can be placed after them.
smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/gi, // words with 3 or less characters
Is there a way to make the expression only apply itself to 2 words in a row?
Example
Currently, the string:
Singapore, the USA and Vietnam.
will be turned into:
Singapore, the&nbsp;USA&nbsp;and&nbsp;Vietnam.
If the expression only applied to 2 words in a row, it would show:
Singapore, the&nbsp;USA and&nbsp;Vietnam.
here's the full script:
ragadjust = function (s, method) {
    if (document.querySelectorAll) {
        var eles = document.querySelectorAll(s),
            elescount = eles.length,
            smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/gi; // words with 3 or fewer characters
        while (elescount-- > 0) {
            var ele = eles[elescount],
                elehtml = ele.innerHTML;
            if (method == 'small-words' || method == 'all')
                // replace the whitespace after small words with non-breaking spaces
                elehtml = elehtml.replace(smallwords, function(contents, p1, p2) {
                    return p1 + p2.replace(/\s/g, '&nbsp;');
                });
            ele.innerHTML = elehtml;
        }
    }
};
This is from RagAdjust
I know that this is not what you are asking for, but I figured a code review wouldn't hurt:
I think the word boundary \b is better, in this case, than \s|^.
You have both the A-Z and a-z characters in your match, yet you are using the i (case-insensitive) flag.
{0,1}? is redundant: {0,1} already means zero or one, and the trailing ? only makes it lazy, which changes nothing here. Either use ? to make it optional, or use {0,1}.
If you are going to have a dash in your character set, put it at the end so that you don't have an ambiguous regex; for example, [a-z_-] is much better than [a-z-_].
If you don't need to capture a value, use the non-capturing parenthesis (?:).
So, here's your cleaned up regex:
/\b((?:[a-z_(-]{1,2}(?:'|’)*[a-z_,;-]?\s)+)/gi
I'm pretty sure the '|’ bit is some sort of typo when you pasted this in from your editor. Not sure what it is supposed to be.
This doesn't quite solve the issue the way you suggested, but it does reduce the number of non-breaking spaces that end up in the string, and it might give you some insight. Because you have the trailing g on both regex replacements, you're doing a global replace. If you instead loop it with some maximum number of fixes, things work out a little differently.
Try changing the max number of replacements. I think the other thing that happens here (in my modified code) is that after you make one replacement, the spaces around those small words are gone because you put in an &nbsp;, which may or may not solve the issue you're trying to get around.
Here's my replacement function (simplified from your original). The basic modification is to remove the g from the regexes and add the loop. You should check out the codepen to see the full deal.
var new_ragadjust = function (contents) {
    var MAX_NUMBER_OF_REPLACEMENTS = 5;
    var smallwords = /(\s|^)(([a-zA-Z-_(]{1,2}('|’)*[a-zA-Z-_,;]{0,1}?\s)+)/i; // words with 3 or fewer characters
    var c = contents;
    // Without the g flag, each pass replaces only the first remaining match.
    for (var ii = 0; ii < MAX_NUMBER_OF_REPLACEMENTS; ++ii) {
        c = c.replace(smallwords, function(match, p1, p2) {
            return p1 + p2.replace(/\s/, '&nbsp;');
        });
    }
    return c;
};
Codepen
http://cdpn.io/DKLtc
Also, to see the difference, you need to inspect elements to actually see where the nbsps end up (as you probably already knew).
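Coming back to the original question of limiting the effect to two words in a row: one possible approach, which is not from RagAdjust or the answers above, is to drop the repeating + and let each match consume the whitespace after a single small word. The next small word then has no leading whitespace left to match, so runs cannot chain. A rough sketch, assuming the cleaned-up character classes and with the apostrophe alternation written as a character class (the name joinSmallWords is hypothetical):

```javascript
// One small word (up to 3 characters) plus the single whitespace that follows it.
var smallWord = /(\s|^)([a-z_(-]{1,2}['’]*[a-z_,;-]?)\s/gi;

// Hypothetical helper: ties each small word to the word after it only.
function joinSmallWords(text) {
  return text.replace(smallWord, function (match, lead, word) {
    return lead + word + '&nbsp;';
  });
}

console.log(joinSmallWords('Singapore, the USA and Vietnam.'));
// Singapore, the&nbsp;USA and&nbsp;Vietnam.
```

Here "the" is joined to "USA" and "and" to "Vietnam", but "USA" is left alone because the space in front of it was already consumed by the previous match.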
Regular expressions are simply evil in my mind and no matter how many times I read any documentation I just cannot seem to grasp even the simplest of expressions!
I am trying to write what must be a very simple expression to query a variable in javascript but I just cannot get it to work properly.
I am trying to validate the following:-
The string must be 9 characters long, starting with SO- (case insensitive eg So-, so-, sO- and SO-) followed by 6 numbers.
So the following should all match
SO-123456,
So-123456,
sO-456789,
so-789123
but the following should fail
SO-12d456,
SO-1234567
etc etc
I have only managed to get this far so far
var _reg = /(SO-)\d{6}/i;
var _tests = new Array();
_tests[0] = "So-123456";
_tests[1] = "SO-123456";
_tests[2] = "sO-456789";
_tests[3] = "so-789123";
_tests[4] = "QR-123456";
_tests[5] = "SO-1234567";
_tests[6] = "SO-45k789";
for(var i = 0; i < _tests.length; i++){
    var _matches = _tests[i].match(_reg);
    if(_matches && _matches.length > 0)
        $('#matches').append(i+'. '+_matches[0] + '<br/>');
}
Please see http://jsfiddle.net/TzHKd/ for above example
Test number 5 is matching although it should fail as there are 7 numbers and not 6.
Any assistance would be greatly appreciated.
Cheers
use this regexp instead
/^(so-)\d{6}$/i;
Without ^ (string starts with) or $ (string ends with) you're looking for a generic substring match (that's the reason why your regexp returns true when you have 7 digits).
By using the anchors ^ and $ (matching the beginning and end of the line, respectively), you can make the regex match the whole line. Otherwise, the match will return true as soon as the characters in the regex are matched.
So, you will apply it like this:
var _reg = /^(so-)\d{6}$/i;
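A quick way to check the anchored pattern against the test values from the question, without jQuery. This is only a sketch; the capture group is dropped here because nothing reads it.

```javascript
// Anchored, case-insensitive pattern: "so-" followed by exactly six digits.
var _reg = /^so-\d{6}$/i;

var _tests = [
  "So-123456", "SO-123456", "sO-456789", "so-789123", // should match
  "QR-123456", "SO-1234567", "SO-45k789"              // should fail
];

_tests.forEach(function (value, i) {
  console.log(i + '. ' + value + ' -> ' + _reg.test(value));
});
```

Tests 0 to 3 should report true and tests 4 to 6 false, including the 7-digit case that matched before the anchors were added.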
I've spent a few hours on this and I can't seem to figure this one out.
In the code below, I'm trying to understand exactly what and how the regular expressions in the url.match are working.
As the code is below, it doesn't work. However if I remove (?:&toggle=|&ie=utf-8|&FORM=|&aq=|&x=|&gwp) it seems to give me the output that I want.
However, I don't want to remove this without understanding what it is doing.
I found a pretty useful resource, but after a few hours I still can't precisely determine what these expressions are doing:
https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Regular_Expressions#Using_Parenthesized_Substring_Matches
Could someone break this down for me and explain how exactly it is parsing the strings. The expressions themselves and the placement of the parentheses is not really clear to me and frankly very confusing.
Any help is appreciated.
(function($) {
    $(document).ready(function() {
        function parse_keywords(url){
            var matches = url.match(/.*(?:\?p=|\?q=|&q=|\?s=)([a-zA-Z0-9 +]*)(?:&toggle=|&ie=utf-8|&FORM=|&aq=|&x=|&gwp)/);
            return matches ? matches[1].split('+') : [];
        }
        myRefUrl = "http://www.google.com/url?sa=f&rct=j&url=https://www.mydomain.com/&q=my+keyword+from+google&ei=fUpnUaage8niAKeiICgCA&usg=AFQjCNFAlKg_w5pZzrhwopwgD12c_8z_23Q";
        myk1 = (parse_keywords(myRefUrl));
        kw="";
        for (i=0;i<myk1.length;i++) {
            if (i == (myk1.length - 1)) {
                kw = kw + myk1[i];
            }
            else {
                kw = kw + myk1[i] + '%20';
            }
        }
        console.log (kw);
        if (kw != null && kw != "" && kw != " " && kw != "%20") {
            orighref = $('a#applynlink').attr('href');
            $('a#applynlink').attr('href', orighref + '&scbi=' + kw);
        }
    });
})(jQuery);
Let's break this regex down.
/
Begin regex.
.*
Match zero or more anything - basically, we're willing to match this regex at any point into the string.
(?:\?p=
|\?q=
|&q=
|\?s=)
In this, the ?: means 'do not capture anything inside of this group'. See http://www.regular-expressions.info/refadv.html
The \? means take ? literally, which is normally a character meaning 'match 0 or 1 copies of the previous token' but we want to match an actual ?.
Other than that, it's just looking for a multitude of different options to select (| means 'the regex is valid if I match either what's before me or what's after me').
([a-zA-Z0-9 +]*)
Now we match zero or more of any of the following characters in any arrangement: a-zA-Z0-9 + (letters, digits, space, and plus). And since it is inside a () with no ?:, we DO capture it.
(?:&toggle=
|&ie=utf-8
|&FORM=
|&aq=
|&x=
|&gwp)
We see another ?: so this is another non-capturing group.
Other than that, it is just full of literal characters separated by |s, so it is not doing any fancy logic.
/
End regex.
In summary, this regex looks through the string for any instance of the first non-capturing group, captures the run of allowed characters that follows it, then looks for any instance of the second non-capturing group to 'cap' it off, and returns everything that was between those two non-capturing groups. (Think of it as a 'sandwich': we look for the header and footer and capture everything in between that we're interested in.)
After the regex runs, we do this:
return matches ? matches[1].split('+') : [];
Which grabs the captured group and splits it on + into an array of strings.
For situations like this, it's really helpful to visualize it with www.debuggex.com (which I built). It immediately shows you the structure of your regex and allows you to walk through step-by-step.
In this case, the reason it works when you remove the last part of your regex is because none of the strings &toggle=, &ie=utf-8, etc are in your sample url. To see this, drag the grey slider above the test string on debuggex and you'll see that it never makes it past the & in that last group.
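The same point can be checked in a few lines outside the browser. This sketch runs the pattern from the question with and without the trailing group against the sample URL; the names withTail, withoutTail and keywords are made up for the illustration.

```javascript
var url = "http://www.google.com/url?sa=f&rct=j&url=https://www.mydomain.com/&q=my+keyword+from+google&ei=fUpnUaage8niAKeiICgCA&usg=AFQjCNFAlKg_w5pZzrhwopwgD12c_8z_23Q";

// Pattern from the question: the keywords must be followed by one of the listed delimiters.
var withTail = /.*(?:\?p=|\?q=|&q=|\?s=)([a-zA-Z0-9 +]*)(?:&toggle=|&ie=utf-8|&FORM=|&aq=|&x=|&gwp)/;

// Same pattern with the trailing non-capturing group removed.
var withoutTail = /.*(?:\?p=|\?q=|&q=|\?s=)([a-zA-Z0-9 +]*)/;

// Hypothetical helper mirroring parse_keywords, with the pattern passed in.
function keywords(pattern, input) {
  var matches = input.match(pattern);
  return matches ? matches[1].split('+') : [];
}

console.log(keywords(withTail, url));    // []
console.log(keywords(withoutTail, url)); // [ 'my', 'keyword', 'from', 'google' ]
```

The keywords in this URL are followed by &ei=, which is not one of the listed delimiters, so the full pattern never matches and the function falls back to an empty array.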