Regular expression to strip thousand separator from numeral string?

I have strings that contain thousand separators, but no string-to-number function will consume them correctly (using JavaScript). I'm thinking about "preparing" the string by stripping all thousand separators, leaving everything else untouched, and letting the Number/parseInt/parseFloat functions (I'm satisfied with their behaviour otherwise) decide the rest. But it seems that I have no idea which RegExp can do that!
Better ideas are welcome too!
UPDATE:
Sorry, the answers enlightened me as to how badly formulated the question is. What I'm trying to achieve is: 1) to strip thousand separators only, if any, but 2) to not disturb the original string much, so that I still get NaN in the cases of invalid numerals.
MORE UPDATE:
JavaScript is limited to the English locale for parsing, so let's assume the thousand separator is ',' for simplicity (naturally, it never matches the decimal separator in any locale, so changing to any other locale should not pose a problem).
Now, on parsing functions:
parseFloat('1023.95BARGAIN BYTES!') // parseXXX functions just "give up" on invalid chars and return 1023.95
Number('1023.95BARGAIN BYTES!') // while the Number constructor behaves "strictly" and will return NaN
Sometimes I use the loose one, sometimes the strict one. I want to figure out the best approach for preparing the string for both functions.
On validity of numerals:
'1,023.99' is perfectly well-formed English number, and stripping all commas will lead to correct result.
'1,0,2,3.99' is broken, however generic comma stripping will give '1023.99' which is unlikely to be a correct result.

welp, I'll venture to throw my suggestion into the pot:
Note: Revised
stringWithNumbers = stringWithNumbers.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");
should turn
1,234,567.12
1,023.99
1,0,2,3.99
the dang thing costs $1,205!!
95,5,0,432
12345,0000
1,2345
into:
1234567.12
1023.99
1,0,2,3.99
the dang thing costs $1205!!
95,5,0432
12345,0000
1,2345
I hope that's useful!
EDIT:
There is an additional alteration that may be necessary, but is not without side effects:
(\b\d{1,3}),(?=\d{3}(\D|$))
This changes the "one or more" quantifier (+) for the first set of digits into a "one to three" quantifier ({1,3}) and adds a "word-boundary" assertion before it. It will prevent replacements like 1234,123 ==> 1234123. However, it will also prevent a replacement that might be desired (if it is preceded by a letter or underscore), such as A123,789 or _1,555 (which will remain unchanged).
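As a rough sketch of how the first pattern could be combined with the two parsers from the question (the helper name stripThousands is made up for illustration, it is not part of the answer):

```javascript
// Hypothetical helper: removes a comma only when exactly three digits
// (and then a non-digit or the end of the string) follow it.
function stripThousands(s) {
  return s.replace(/(\d+),(?=\d{3}(\D|$))/g, '$1');
}

// Well-formed input: the comma is removed and Number accepts the result.
const strictOk = Number(stripThousands('1,023.99'));
// Broken grouping: the commas survive, so the strict parser still rejects it.
const strictBad = Number(stripThousands('1,0,2,3.99'));
// The loose parser ignores the trailing text once the comma is gone.
const loose = parseFloat(stripThousands('1,023.95BARGAIN BYTES!'));

console.log(strictOk, strictBad, loose);
```

Because malformed groupings are left untouched, Number still yields NaN for them, which seems to be what the question asks for.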

A simple num.replace(/,/g, '') should be sufficient I think.

Depends on what your thousand separator is
myString = myString.replace(/[ ,]/g, "");
would remove spaces and commas.

This should work for you
var decimalCharacter = ".",
regex = new RegExp("[\\d" + decimalCharacter + "]+", "g"),
num = "10,0000,000,000.999";
+num.match(regex).join("");

To confirm that a numeral-string is well-formed, use:
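/^(\d{1,3}(,\d{3})+|\d*)($|[^\d,])/.test(numeral_string)
which will return true if the numeral-string is either (1) a sequence of digits with a comma before each set of three digits, or (2) just a sequence of zero or more digits, or (3) either of the above followed by a character that is neither a digit nor a comma, and who knows what else. (Case #3 is for floats, as well as your "BARGAIN BYTES!" examples.) The comma-grouped alternative has to come first and the trailing character must not be a comma; otherwise \d* would match the leading "1" of '1,0,2,3.99', the comma would count as the non-digit, and the broken numeral would pass.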
Once you've confirmed that, use:
numeral_string.replace(/,/g, '')
which will return a copy of the numeral-string with all commas excised.
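The two steps can be sketched as one helper. The names WELL_FORMED and parseGrouped are hypothetical, and the pattern used here is an assumption: it tries the comma-grouped form first and does not allow a digit or a comma directly after the number, so that '1,0,2,3.99' is rejected.

```javascript
// Assumed pattern: grouped form first, and no digit or comma may follow the number.
const WELL_FORMED = /^(\d{1,3}(,\d{3})+|\d*)($|[^\d,])/;

// Hypothetical helper: NaN for malformed grouping, otherwise strip the commas
// and hand the rest to the parser of your choice (Number by default).
function parseGrouped(s, parse = Number) {
  return WELL_FORMED.test(s) ? parse(s.replace(/,/g, '')) : NaN;
}

console.log(parseGrouped('1,023.99'));
console.log(parseGrouped('1,0,2,3.99'));
console.log(parseGrouped('1,023.95BARGAIN BYTES!', parseFloat));
```

Passing the parser as a parameter keeps the strict/loose decision with the caller, as the question wanted.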

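You can use s.replaceAll("(\\W)(?=\\d{3})","");
This regex matches every non-word character (\W) that is followed by three digits. Note that this is Java-style syntax, where the string is compiled as a regex; in JavaScript a string pattern is matched literally, so use a regex literal such as /\W(?=\d{3})/g instead.
Strings like 4.444.444.444,00 € will become 4444444444,00 €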

I have used the following in a commercial setting, and it has worked often:
numberStr = numberStr.replace(/[. ,](\d\d\d\D|\d\d\d$)/g,'$1');
In the above example, thousands can be marked with a period, a comma, or a space.
In some cases (like a price of 1000,5 Euros) the above doesn't work. If you need something more robust, this should work 100% of the time:
//convert a comma or space used as the cent placeholder to a decimal
$priceStr = $priceStr.replace(/[, ](\d\d$)/,'.$1');
$priceStr = $priceStr.replace(/[, ](\d$)/,'.$1');
//capture cents
var $hasCentsRegex = /[.]\d\d?$/;
if($hasCentsRegex.test($priceStr)) {
var $matchArray = $priceStr.match(/(.*)([.]\d\d?$)/);
var $priceBeforeCents = $matchArray[1];
var $cents = $matchArray[2];
} else{
var $priceBeforeCents = $priceStr;
var $cents = "";
}
//remove decimals, commas and whitespace from the pre-cent portion
$priceBeforeCents = $priceBeforeCents.replace(/[.\s,]/g,'');
//re-create the price by adding back the cents
$priceStr = $priceBeforeCents + $cents;

Related

Applying currency format using replace and a regular expression

I am trying to understand some code where a number is converted to a currency format. Thus, if you have 16.9 it converts to $16.90. The problem with the code is if you have an amount over $1,000, it just returns $1, an amount over $2,000 returns $2, etc. Amounts in the hundreds show up fine.
Here is the function:
var _formatCurrency = function(amount) {
return "$" + parseFloat(amount).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,')
};
(The reason the semicolon is after the bracket is because this function is in itself a statement in another function. That function is not relevant to this discussion.)
I found out that the person who originally put the code in there found it somewhere but didn't fully understand it and didn't test this particular scenario. I myself have not dealt much with regular expressions. I am not only trying to fix it, but to understand how it is working as it is now.
Here's what I've found out. The code between the slash after the open parenthesis and the slash before the g is the pattern. The g means global search. The \d means digit, and the (?=\d{3})+\. appears to mean find 3 digits plus a decimal point. I'm not sure I have that right, though, because if that was correct shouldn't it ignore numbers like 5.4? That works fine. Also, I'm not sure what the '$1,' is for. It looks to me like it is supposed to be placed where the digits are, but wouldn't that change all the numbers to $1? Also, why is there a comma after the 1?

Regarding your comment
I was hoping to just edit the regex so it would work properly.
The regex you are currently using is obviously not working for you so I think you should consider alternatives even if they are not too similar, and
Trying to keep the code change as small as possible
Understandable but sometimes it is better to use a code that is a little bit bigger and MORE READABLE than to go with compact and hieroglyphical.
Back to business:
I'm assuming you are getting a string as an argument, and that this string is composed only of digits and may or may not have a dot before the last 1 or 2 digits. Something like
//input      //intended output
1            $1.00
20           $20.00
34.2         $34.20
23.1         $23.10
62516.16     $62,516.16
15.26        $15.26
4654656      $4,654,656.00
0.3          $0.30
I will let you do a pre-check of (assumed) non-valids like 1. | 2.2. | .6 | 4.8.1 | 4.856 | etc.
Proposed solution:
var _formatCurrency = function(amount) {
amount = "$" + amount.replace(/(\d)(?=(\d{3})+(\.(\d){0,2})*$)/g, '$1,');
if(amount.indexOf('.') === -1)
return amount + '.00';
var decimals = amount.split('.')[1];
return decimals.length < 2 ? amount + '0' : amount;
};
Regex break down:
(\d): Matches one digit. Parentheses group things for referencing when needed.
(?=(\d{3})+(\.(\d){0,2})*$). Now this guy. From end to beginning:
$: Matches the end of the string. This is what allows you to match from the end instead of the beginning which is very handy for adding the commas.
(\.(\d){0,2})*: This part processes the dot and decimals. The \. matches the dot. (\d){0,2} matches 0, 1 or 2 digits (the decimals). The * implies that this whole group can be empty.
?=(\d{3})+: \d{3} matches 3 digits exactly. + means at least one occurrence. Finally ?= matches a group after the main expression without including it in the result. In this case it takes three digits at a time (from the end remember?) and leaves them out of the result for when replacing.
g: Match and replace globally, the whole string.
Replacing with $1,: This is how captured groups are referenced for replacing, in this case the wanted group is number 1. Since the pattern will match every digit in the position 3n+1 (starting from the end or the dot) and catch it in the group number 1 ((\d)), then replacing that catch with $1, will effectively add a comma after each capture.
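To see the "position 3n+1 from the end" behaviour in practice, here is a small sketch that lists which digits the pattern picks. The variable names are only for illustration.

```javascript
const re = /(\d)(?=(\d{3})+(\.(\d){0,2})*$)/g;
const input = '4654656';

// Indices of the digits that will get a comma appended
const hits = [...input.matchAll(re)].map((m) => m.index);

// The same pattern used for the actual replacement
const formatted = input.replace(re, '$1,');
const withDecimals = '62516.16'.replace(re, '$1,');

console.log(hits, formatted, withDecimals);
```

Only the digits at index 0 and 3 are followed by a whole number of three-digit groups up to the end, so only those two receive a comma.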
Try it and please give feedback.
Also if you haven't already you should (and SO has not provided me with a format to stress this enough) really really look into this site as suggested by Taplar

The pattern is invalid, and your understanding of the function is incorrect. This function formats a number in a standard US currency, and here is how it works:
The parseFloat() function converts a string value to a decimal number.
The toFixed(2) function rounds the decimal number to 2 digits after the decimal point.
The replace() function is used here to add the thousands separators (i.e. a comma after every 3 digits). The pattern is incorrect, so here is a suggested fix /(\d)(?=(\d{3})+\.)/g and this is how it works:
The (\d) captures a digit.
The (?=(\d{3})+\.) is called a look-ahead; it ensures that the captured digit above is followed by one or more (+) sets of 3 digits (\d{3}) and then the decimal point (\.).
The g flag/modifier is to apply the pattern globally, that is on the entire amount.
The replacement $1, replaces the pattern with the first captured group $1, which is in our case the digit (\d) (so technically replacing the digit with itself to make sure we don't lose the digit in the replacement) followed by a comma ,. So like I said, this is just to add the thousands separator.
Here are some tests with the suggested fix. Note that it works fine with numbers and strings:
var _formatCurrency = function(amount) {
return "$" + parseFloat(amount).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,');
};
console.log(_formatCurrency('1'));
console.log(_formatCurrency('100'));
console.log(_formatCurrency('1000'));
console.log(_formatCurrency('1000000.559'));
console.log(_formatCurrency('10000000000.559'));
console.log(_formatCurrency(1));
console.log(_formatCurrency(100));
console.log(_formatCurrency(1000));
console.log(_formatCurrency(1000000.559));
console.log(_formatCurrency(10000000000.559));

Okay, I want to apologize to everyone who answered. I did some further tracing and found out the JSON call which was bringing in the amount did in fact have a comma in it, so it is just parsing that first digit. I was looking in the wrong place in the code when I thought there was no comma in there already. I do appreciate everyone's input and hope you won't think too bad of me for not catching that before this whole exercise. If nothing else, at least I now know how that regex operates so I can make use of it in the future. Now I just have to go about removing that comma.
Have a great day!
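For anyone landing here with the same symptom, a minimal sketch of that last step might look like this. The name formatCurrency is just for illustration, and it assumes the incoming value uses ',' only as a thousands separator.

```javascript
function formatCurrency(amount) {
  // parseFloat stops at the first comma, so '1,000' would be read as 1.
  // Remove the commas before parsing (assumes ',' is never the decimal mark).
  const cleaned = String(amount).replace(/,/g, '');
  return '$' + parseFloat(cleaned).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,');
}

console.log(parseFloat('1,000'));       // shows why the original output was "$1.00"
console.log(formatCurrency('1,000'));
console.log(formatCurrency('2,500.5'));
console.log(formatCurrency(16.9));
```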

Assuming that you are working with USD only, then this should work for you as an alternative to Regular Expressions. I have also included a few tests to verify that it is working properly.
var test1 = '16.9';
var test2 = '2000.5';
var test3 = '300000.23';
var test4 = '3000000.23';
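function stringToUSD(inputString) {
    // Round to two decimals first; calling parseFloat after the commas are
    // inserted would stop at the first comma and return only the leading digits.
    const splitValues = parseFloat(inputString).toFixed(2).split('.');
    const wholeNumber = splitValues[0].split('')
        .reverse()
        // put a comma in front of every third digit, counted from the right
        .map((val, idx, arr) => idx !== 0 && (idx + 1) % 3 === 0 && arr[idx + 1] !== undefined ? `,${val}` : val)
        .reverse()
        .join('');
    return `${wholeNumber}.${splitValues[1]}`;
}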
console.log(stringToUSD(test1));
console.log(stringToUSD(test2));
console.log(stringToUSD(test3));
console.log(stringToUSD(test4));

Need a RegExp to filter out all but one decimal point

I'm using the following code to negate the characters in the regexp. By checking the inverse, I can determine if the value entered is correctly formatted. Essentially, any digit can be allowed but only one decimal point (placed anywhere in the string). The way I have it now, it catches all numerals, but allows for multiple decimal points (creating invalid floats). How can I adjust this to catch more than one decimal point (since I only want to allow for one)?
var regex = new RegExp(/[^0-9\.]/g);
var containsNonNumeric = this.value.match(regex);
if(containsNonNumeric){
this.value = this.value.replace(regex,'');
return false;
}
Here is what I'm expecting to happen:
First, valid input would be any number of numerals with the possibility of only one decimal point. The current behavior: the user enters characters one by one, and if they are valid characters they will show up. If the character is invalid (e.g. the letter A), the field will replace that character with '' (essentially behaving like a backspace immediately after filling the character in). What I need is the same behavior for the addition of one too many decimal points.

As I understand your question the code below might be what you are looking for:
var validatedStr=str.replace(/[^0-9.]|\.(?=.*\.)/g, "");
It removes all characters other than digits and the dot (.), and it also removes every dot that is followed, anywhere later in the string, by another dot.
EDIT based on first comment - the solution above erases all dots but the last, the author wants to erase all but the first one:
Since JS did not support "look behind" at the time (it was added in ES2018), the solution might be to reverse the string before applying the regex and then reverse it again, or to use this regex with a replacement callback:
var counter=0;
var validatedStr=str.replace(/[^0-9.]|\./g, function($0){
if( $0 == "." && !(counter++) ) // dot found and counter is not incremented
return "."; // that means we met first dot and we want to keep it
return ""; // if we find anything else, let's erase it
});
JFTR: counter++ only executes if the first part of condition is true, so it works even for strings beginning with letters
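One caveat with the snippet above is that counter lives outside the callback and has to be reset before each use. A possible way around that, sketched here with a made-up function name (keepFirstDot), is to keep the state inside a function:

```javascript
// Hypothetical wrapper around the callback approach: the "seen a dot" flag is
// local, so the function can be called repeatedly (e.g. on every keystroke).
function keepFirstDot(str) {
  let seenDot = false;
  return str.replace(/[^0-9.]|\./g, (ch) => {
    if (ch === '.' && !seenDot) {
      seenDot = true; // first dot: keep it
      return '.';
    }
    return ''; // any later dot or any non-digit: drop it
  });
}

console.log(keepFirstDot('12.3.4a'));
console.log(keepFirstDot('a.b.c'));
```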

Building upon the original regex from @Jan Legner with a pair of string reversals to work around the missing lookbehind. Succeeds at keeping the first decimal point.
Modified with an attempt to cover negatives as well. Can't handle negative signs that are out of place and special cases that should logically return zero.
let keep_first_decimal = function(s) {
return s.toString().split('').reverse().join('').replace(/[^-0-9.]|\.(?=.*\.)/g, '').split('').reverse().join('') * 1;
};
//filters as expected
console.log(keep_first_decimal("123.45.67"));
console.log(keep_first_decimal(123));
console.log(keep_first_decimal(123.45));
console.log(keep_first_decimal("123"));
console.log(keep_first_decimal("123.45"));
console.log(keep_first_decimal("a1b2c3d.e4f5g"));
console.log(keep_first_decimal("0.123"));
console.log(keep_first_decimal(".123"));
console.log(keep_first_decimal("0.123.45"));
console.log(keep_first_decimal("123."));
console.log(keep_first_decimal("123.0"));
console.log(keep_first_decimal("-123"));
console.log(keep_first_decimal("-123.45.67"));
console.log(keep_first_decimal("a-b123.45.67"));
console.log(keep_first_decimal("-ab123"));
console.log(keep_first_decimal(""));
//NaN, should return zero?
console.log(keep_first_decimal("."));
console.log(keep_first_decimal("-"));
//NaN, can't handle minus sign after first character
console.log(keep_first_decimal("-123.-45.67"));
console.log(keep_first_decimal("123.-45.67"));
console.log(keep_first_decimal("--123"));
console.log(keep_first_decimal("-a-b123"));

How to check if a string contains a number in JavaScript?

I don't get how hard it is to discern a string containing a number from other strings in JavaScript.
Number('') evaluates to 0, while '' is definitely not a number for humans.
parseFloat enforces numbers, but allows them to be trailed by arbitrary text.
isNaN evaluates to false for whitespace strings.
So what is the programmatic way of checking whether a string is a number, according to a simple and sane definition of what a number is?

Using the function below, we can test whether a JavaScript string contains a number. Pass your string as the parameter t, and the function will return either true or false.
function hasNumbers(t)
{
var regex = /\d/g;
return regex.test(t);
}

If you want something a little more complex regarding format, you could use regex, something like this:
var pattern = /^(0|[1-9][0-9]{0,2}(?:(,[0-9]{3})*|[0-9]*))(\.[0-9]+){0,1}$/;
Demo
I created this regex a while back while answering a different question (see here). It checks that the input is a number with at least one character that cannot start with 0 unless it is 0 (or 0.[othernumbers]). It cannot have a decimal point unless there are digits after it, and it may or may not have commas, but if it does, it makes sure they are 3 digits apart, etc. You could also add a -? at the beginning if you want to allow negative numbers, something like:
/^(-)?(0|[1-9][0-9]{0,2}(?:(,[0-9]{3})*|[0-9]*))(\.[0-9]+){0,1}$/;
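A quick way to get a feel for what the signed variant accepts is to run it over a few strings. The sample list below is my own choice, not taken from the linked demo:

```javascript
const pattern = /^(-)?(0|[1-9][0-9]{0,2}(?:(,[0-9]{3})*|[0-9]*))(\.[0-9]+){0,1}$/;

// Sample inputs chosen for illustration
const samples = ['1,234.56', '1234', '-12', '0.5', '01', '1,23', '.5', ''];

// Pair each sample with the result of the test
const results = samples.map((s) => [s, pattern.test(s)]);
console.log(results);
```

The first four samples pass; the last four fail (leading zero, misplaced comma, missing integer part, empty string).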

There's this simple solution :
var ok = parseFloat(s)==s;
If you need to consider "2 " as not a number, then you might use this one :
var ok = !!(+s==s && s.length && s.trim()==s);

You can always do:
function isNumber(n)
{
if (n.trim().length === 0)
return false;
return !isNaN(n);
}

Let's try
""+(+n)===n
which enforces a very rigid, canonical form of the number.
However, JS reliably produces such number strings itself via var n=''+some_number.
So this solution rejects '.01', rejects all simple numbers that JS would stringify with an exponent, and also rejects all exponential representations that JS would display with the mantissa only. But as long as we stay within integer and low float number ranges, it should work with numbers supplied from elsewhere too.
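A small sketch of how strict this check is (isCanonical is just a name picked for the example):

```javascript
// Round-trip check: the string must equal what JS itself would print
// for the number it parses to.
const isCanonical = (n) => '' + (+n) === n;

const accepted = ['42', '0.5', '-7'].map(isCanonical);
const rejected = ['.01', '1e3', '1.0', '', ' 42'].map(isCanonical);

console.log(accepted, rejected);
```

Everything in the second list parses to a valid number (or to 0 for the empty string), yet fails the round trip because JS would print it differently.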

No need to panic, just use this snippet to check whether the name string contains digits or only other characters.
Try below.
var pattern = /^([^0-9]*)$/;
if(!YourNiceVariable.value.match(pattern)) { /* runs when the value contains at least one digit */ }
if(YourNiceVariable.value.match(pattern)) { /* runs when the value contains no digits at all */ }

This might be insane depending on the length of your string, but you could split it into an array of individual characters and then test each character with isNaN to determine if it's a number or not.
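Roughly, that idea could look like the sketch below. The function name digitFlags is made up, and whitespace needs a special case because isNaN(' ') is false (a blank string coerces to 0).

```javascript
// Hypothetical helper: one boolean per character, true where the character is a digit.
function digitFlags(s) {
  return s.split('').map((ch) => ch.trim() !== '' && !isNaN(ch));
}

const flags = digitFlags('a1 2');
const allDigits = flags.length > 0 && flags.every(Boolean);

console.log(flags, allDigits);
```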

A very short, wrong but correctable answer was just deleted. I could not comment on it, and besides, it was very cool! So here is the corrected term again:
n!=='' && +n==n
seems good. The first term eliminates the empty-string case; the second one compares the numeric interpretation of the string with the string itself. Note that == converts the string back to a number for that comparison, so tolerated characters such as surrounding whitespace still pass (for example ' 42').

Regex x(?!n) with optional characters in the x selector

So I have a javascript program that solves for 1 variable. I'm coming to a roadblock when selecting numbers that DON'T have a variable associated with them.
Here is my current regex expression:
(\+|-)?([0-9]+)(\.[0-9]+)?(?![a-z])
takes input like 15000.53=1254b+21
and returns [15000.53, 125, +21], when it should return [15000.53, +21] (yes, the + is supposed to be there)
I know why it is happening. The number of digits is optional so the function can handle large numbers and floats, but they are optional, so it is hard to make sure the entire number is selected. The result of this is selecting all the digits of the number EXCEPT the one directly next to the variable.
Anyone know of a way for the number of digits to stay optional, yet still make sure a variable doesn't follow the number? Thanks!
var reg = /(\+|-)?([0-9]+)(\.[0-9]+)?(?![a-z])/g;
var numbers = [];
var equation = '15000.53=1254b+21';
while (aloneInt = reg.exec(side[0])) {
numbers.push(aloneInt[0]);
}

Try the following expression:
(?![+-]?[0-9.]+[a-z])(\+|-)?([0-9]+)(\.[0-9]+)?
The added negative lookahead (?![+-]?[0-9.]+[a-z]) makes sure that what starts at the current position is not an optionally signed floating-point number followed by a lowercase letter.
In other words, it makes sure there isn't a number followed by a variable name, then it matches the number.
Regex101 Demo
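Applied to the equation from the question, the expression can be tried like this (a minimal sketch; the g flag is added so that match returns every hit):

```javascript
const re = /(?![+-]?[0-9.]+[a-z])(\+|-)?([0-9]+)(\.[0-9]+)?/g;
const equation = '15000.53=1254b+21';

// With the g flag, match returns the full text of every match
const numbers = equation.match(re);
console.log(numbers);
```

The lookahead fails at every digit of 1254b, so no partial number such as 125 is picked up.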

Filter out second set of digits with regex

In my HTML markup, there will be a series of elements with the following naming scheme:
name="[].timeEntries[].Time"
Between both sets of brackets, there will be numbers with at least one possibly two digits. I need to filter out the second set of digits.
Disclaimer: This is my first time getting to know regex.
This is my pattern so far:
var re = /\[\d{1,2}\].timeEntries\[(\d{1,2})\]\.Time/;
I am not sure if I should use the * or + character to indicate two possible digits.
Is replace() the right method for this?
Do I need to escape the period '.' ?
Any other tips you can offer are appreciated.
For example, if I come across an element with
name="[10].timeEntries[9].Time"
I would like to put just the 9 into a variable.

I am not sure if I should use the * or + character to indicate two possible digits.
Neither, use {1,2}
\[\d{1,2}\]\.timeEntries\[(\d{1,2})\]\.Time
Example
This indicates explicitly 1 or 2 digits.
Also, yes, you should escape the .'s
You can use it like this:
var re = /\[\d{1,2}\]\.timeEntries\[(\d{1,2})\]\.Time/;
var myNumber = "[0].timeEntries[47].Time".match(re)[1];
Now myNumber will contain 47.
One final word of warning, myNumber contains the string "47". If your intention is to use it as a number you'll need to either use parseInt or use +:
var myNumber = +"[0].timeEntries[47].Time".match(re)[1];
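One thing to watch: match returns null when the name does not fit the pattern, so indexing [1] directly would throw. A guarded version could look like this sketch (secondIndex is a hypothetical name):

```javascript
const re = /\[\d{1,2}\]\.timeEntries\[(\d{1,2})\]\.Time/;

// Hypothetical helper: returns the second index as a number, or null if the
// name does not follow the expected scheme.
function secondIndex(name) {
  const m = name.match(re);
  return m ? parseInt(m[1], 10) : null;
}

console.log(secondIndex('[10].timeEntries[9].Time'));
console.log(secondIndex('somethingElse'));
```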

You're pretty close.
There are a lot of ways you could do this - especially depending on how solid the format of that text will be.
You could use replace:
var re = /\[\d+\]\.timeEntries\[([\d]+)\]\.Time/;
var digits = element_name.replace(re, '$1');
If you know it will always be the second set of digits, you could use match
You could also use indexOf and/or split and some other string functions... In some cases that can be faster (but I think in your case, the regex is fine and probably easier to follow)
