Regex - Delete up to second whitespace

I think what I'm trying to do is possible, but I haven't found an answer that achieves it yet.
I would like to format a string in JavaScript that will be similar to this:
+44 (0) 234-567-8901
The preceding international ( +44 ) and local ( (0) ) calling codes are subject to change, so the string could be longer or shorter. However, the format is always the same.
I would like to take this string, remove all characters up to and including the second whitespace, and remove the dashes.
e.g:
+44 (0) 234-567-8901 becomes 2345678901
Can anyone help?
Thanks

Going by your own requirements (remove everything up to and including the second whitespace, and remove dashes from the remainder):
Use this pattern ^([\S]+\s){2}|- with the global flag (so that every dash is matched) and replace with nothing. You should be left with a number similar to what you asked for (+44 (0) 234-567-8901 becomes 2345678901).
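As a minimal sketch of how that pattern might be applied in JavaScript (the helper name stripCallingCodes is hypothetical, and the second sample number is made up for illustration):

```javascript
// Hypothetical helper: removes the two leading space-delimited groups
// (e.g. "+44 " and "(0) ") and every dash from what is left.
function stripCallingCodes(phone) {
  // The g flag is what lets the "-" alternative match every dash,
  // while "^" still only matches at the start of the string.
  return phone.replace(/^([\S]+\s){2}|-/g, "");
}

console.log(stripCallingCodes("+44 (0) 234-567-8901"));   // "2345678901"
console.log(stripCallingCodes("+353 (01) 234-567-8901")); // "2345678901"
```

Without the g flag only the leading groups would be removed and the dashes would stay.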

If you can rely on the ending format of the string to be a consistent number of characters, you can grab the last 12 characters, and remove the dashes.
var result = str.slice(-12).replace(/-/g, "");
If you really just want a regex, you can do this...
var result = str.match(/(\d{3})-(\d{3})-(\d{4})$/).slice(1).join("");
Again, this relies on consistency of the end of the string. If there may be variation, you'll need to adjust to compensate.
To get everything after the last whitespace, change the slice() in the first example to this...
.slice(str.lastIndexOf(' ') + 1)
To get everything after the second whitespace, change the slice() to this...
.slice(str.indexOf(' ', str.indexOf(' ') + 1) + 1)
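Putting the "everything after the second whitespace" variant together, a rough sketch could look like this (afterSecondSpace is a hypothetical name, not part of the answer):

```javascript
// Hypothetical helper: keeps everything after the second space, then drops dashes.
function afterSecondSpace(str) {
  const first = str.indexOf(" ");
  const second = str.indexOf(" ", first + 1);
  // If there are fewer than two spaces, second is -1 and slice(0) keeps the whole string.
  return str.slice(second + 1).replace(/-/g, "");
}

console.log(afterSecondSpace("+44 (0) 234-567-8901")); // "2345678901"
```

Unlike slice(-12), this does not depend on the length of the trailing number, only on the two spaces being present.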

Related

Applying currency format using replace and a regular expression

I am trying to understand some code where a number is converted to a currency format. Thus, if you have 16.9 it converts to $16.90. The problem with the code is if you have an amount over $1,000, it just returns $1, an amount over $2,000 returns $2, etc. Amounts in the hundreds show up fine.
Here is the function:
var _formatCurrency = function(amount) {
return "$" + parseFloat(amount).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,')
};
(The reason the semicolon is after the bracket is because this function is in itself a statement in another function. That function is not relevant to this discussion.)
I found out that the person who originally put the code in there found it somewhere but didn't fully understand it and didn't test this particular scenario. I myself have not dealt much with regular expressions. I am not only trying to fix it, but to understand how it is working as it is now.
Here's what I've found out. The code between the forward slash after the open parenthesis and the forward slash before the g is the pattern. The g means global search. The \d means digit, and the (?=(\d{3})+\.) appears to mean find 3 digits plus a decimal point. I'm not sure I have that right, though, because if that was correct shouldn't it ignore numbers like 5.4? That works fine. Also, I'm not sure what the '$1,' is for. It looks to me like it is supposed to be placed where the digits are, but wouldn't that change all the numbers to $1? Also, why is there a comma after the 1?

Regarding your comment:
"I was hoping to just edit the regex so it would work properly."
The regex you are currently using is obviously not working for you, so I think you should consider alternatives even if they are not too similar. And regarding:
"Trying to keep the code change as small as possible"
Understandable, but sometimes it is better to use code that is a little longer and MORE READABLE than to go with something compact and hieroglyphic.
Back to business:
I'm assuming you are getting a string as an argument, and that this string is composed only of digits and may or may not have a dot before the last 1 or 2 digits. Something like:
//input        //intended output
1              $1.00
20             $20.00
34.2           $34.20
23.1           $23.10
62516.16       $62,516.16
15.26          $15.26
4654656        $4,654,656.00
0.3            $0.30
I will let you do a pre-check of (assumed) non-valids like 1. | 2.2. | .6 | 4.8.1 | 4.856 | etc.
Proposed solution:
var _formatCurrency = function(amount) {
amount = "$" + amount.replace(/(\d)(?=(\d{3})+(\.(\d){0,2})*$)/g, '$1,');
if(amount.indexOf('.') === -1)
return amount + '.00';
var decimals = amount.split('.')[1];
return decimals.length < 2 ? amount + '0' : amount;
};
Regex break down:
(\d): Matches one digit. Parentheses group things for referencing when needed.
(?=(\d{3})+(\.(\d){0,2})*$). Now this guy. From end to beginning:
$: Matches the end of the string. This is what allows you to match from the end instead of the beginning which is very handy for adding the commas.
(\.(\d){0,2})*: This part processes the dot and decimals. The \. matches the dot. (\d){0,2} matches 0, 1 or 2 digits (the decimals). The * implies that this whole group can be empty.
?=(\d{3})+: \d{3} matches 3 digits exactly. + means at least one occurrence. Finally ?= matches a group after the main expression without including it in the result. In this case it takes three digits at a time (from the end remember?) and leaves them out of the result for when replacing.
g: Match and replace globally, the whole string.
Replacing with $1,: This is how captured groups are referenced for replacing, in this case the wanted group is number 1. Since the pattern will match every digit in the position 3n+1 (starting from the end or the dot) and catch it in the group number 1 ((\d)), then replacing that catch with $1, will effectively add a comma after each capture.
Try it and please give feedback.
Also if you haven't already you should (and SO has not provided me with a format to stress this enough) really really look into this site as suggested by Taplar
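To see which digits the lookahead actually selects, here is a small sketch that lists each captured digit with its index before doing the replacement. The helper name captures is hypothetical; the pattern is the one from the proposed solution.

```javascript
const pattern = /(\d)(?=(\d{3})+(\.(\d){0,2})*$)/g;

// Hypothetical helper: lists each digit the pattern captures, as "digit@index".
function captures(input) {
  return [...input.matchAll(pattern)].map((m) => `${m[1]}@${m.index}`);
}

console.log(captures("4654656"));                // ["4@0", "4@3"]
console.log("4654656".replace(pattern, "$1,"));  // "4,654,656"
console.log(captures("62516.16"));               // ["2@1"]
console.log("62516.16".replace(pattern, "$1,")); // "62,516.16"
```

Only the digits that have a multiple of three digits between them and the dot (or the end) are captured, which is why each one gets a comma appended.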

The pattern is invalid, and your understanding of the function is incorrect. This function formats a number in a standard US currency, and here is how it works:
The parseFloat() function converts a string value to a decimal number.
The toFixed(2) function rounds the decimal number to 2 digits after the decimal point.
The replace() function is used here to add the thousands separators (i.e. a comma after every 3 digits). The pattern is incorrect, so here is a suggested fix /(\d)(?=(\d{3})+\.)/g and this is how it works:
The (\d) captures a digit.
The (?=(\d{3})+\.) is called a look-ahead. It ensures that the captured digit above is followed by one or more (+) sets of 3 digits (\d{3}) and then the decimal point (\.).
The g flag/modifier is to apply the pattern globally, that is on the entire amount.
The replacement $1, replaces the pattern with the first captured group $1, which is in our case the digit (\d) (so technically replacing the digit with itself to make sure we don't lose the digit in the replacement) followed by a comma ,. So like I said, this is just to add the thousands separator.
Here are some tests with the suggested fix. Note that it works fine with numbers and strings:
var _formatCurrency = function(amount) {
return "$" + parseFloat(amount).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, '$1,');
};
console.log(_formatCurrency('1'));
console.log(_formatCurrency('100'));
console.log(_formatCurrency('1000'));
console.log(_formatCurrency('1000000.559'));
console.log(_formatCurrency('10000000000.559'));
console.log(_formatCurrency(1));
console.log(_formatCurrency(100));
console.log(_formatCurrency(1000));
console.log(_formatCurrency(1000000.559));
console.log(_formatCurrency(10000000000.559));

Okay, I want to apologize to everyone who answered. I did some further tracing and found out the JSON call which was bringing in the amount did in fact have a comma in it, so parseFloat() stops at the comma and parses only the first digit. I was looking in the wrong place in the code when I thought there was no comma in there already. I do appreciate everyone's input and hope you won't think too badly of me for not catching that before this whole exercise. If nothing else, at least I now know how that regex operates so I can make use of it in the future. Now I just have to go about removing that comma.
Have a great day!
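For anyone landing here with the same symptom, a minimal sketch of the failure and one possible fix follows. It assumes the incoming amount may already contain commas as thousands separators; the name formatCurrency and the cleaning step are illustrative, not the asker's final code.

```javascript
// parseFloat stops at the first character it cannot parse, so a comma cuts the number short.
console.log(parseFloat("1,234.5")); // 1

// Assumption: any commas in the input are thousands separators and can be dropped.
const formatCurrency = function (amount) {
  const cleaned = String(amount).replace(/,/g, "");
  return "$" + parseFloat(cleaned).toFixed(2).replace(/(\d)(?=(\d{3})+\.)/g, "$1,");
};

console.log(formatCurrency("1,234.5")); // "$1,234.50"
console.log(formatCurrency(2000));      // "$2,000.00"
```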

Assuming that you are working with USD only, then this should work for you as an alternative to Regular Expressions. I have also included a few tests to verify that it is working properly.
var test1 = '16.9';
var test2 = '2000.5';
var test3 = '300000.23';
var test4 = '3000000.23';
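function stringToUSD(inputString) {
// Round to two decimals first, then split into whole and fractional parts.
const splitValues = parseFloat(inputString).toFixed(2).split('.');
const wholeNumber = splitValues[0].split('')
.map(val => parseInt(val, 10))
.reverse()
// Prefix every third digit (counting from the right) with a comma, unless it is the leading digit.
.map((val, idx, arr) => idx !== 0 && (idx + 1) % 3 === 0 && arr[idx + 1] !== undefined ? `,${val}` : val)
.reverse()
.join('');
// Build the result as a string: parseFloat() would stop at the first comma.
return `$${wholeNumber}.${splitValues[1]}`;
}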
console.log(stringToUSD(test1));
console.log(stringToUSD(test2));
console.log(stringToUSD(test3));
console.log(stringToUSD(test4));

Add a space to UK Postcode in correct place Javascript

I am trying to write a basic function that will allow me to add a space to UK postcodes where the spaces have been removed.
UK postcodes always have a space before the final digit of the postcode string.
Some examples with no spacing and with correct spacing:
CB30QB => CB3 0QB
N12NL => N1 2NL
OX144FB => OX14 4FB
To find the final digit in the string I am using the regex /\d(?=\D*$)/g, and the JavaScript I have in place currently is as follows:
// Set the Postcode
var postCode = "OX144FB";
// Find the index position of the final digit in the string (in this case '4')
var postcodeIndex = postCode.indexOf(postCode.match(/\d(?=\D*$)/g));
// Slice the final postcode at the index point, add a space and join back together.
var finalPostcode = [postCode.slice(0, postcodeIndex), ' ', postCode.slice(postcodeIndex)].join('');
return finalPostcode;
I am getting the following results when I change the set postcode:
CB30QB becomes CB3 0QB - Correct
N12NL becomes N1 2NL - Correct
CB249LQ becomes CB24 9LQ - Correct
OX144FB becomes OX1 44FB - Incorrect
OX145FB becomes OX14 5FB - Correct
It seems that the issue might be to do with having two digits of the same value as most other combinations seem to work.
Does anyone know how I can fix this?

I would use string.replace:
string.replace(/^(.*)(\d)/, "$1 $2");

You can use replace() with a regex; you need to place a space before the last 3 characters:
document.write('CB30QB'.replace(/^(.*)(.{3})$/,'$1 $2')+'<br>');
document.write('N12NL'.replace(/^(.*)(.{3})$/,'$1 $2')+'<br>');
document.write('CB249LQ'.replace(/^(.*)(.{3})$/,'$1 $2')+'<br>');
document.write('OX144FB'.replace(/^(.*)(.{3})$/,'$1 $2'));

As everyone else is answering, .replace() is easier. However, let me point out what's wrong in the code.
The problem is that you're using postCode.indexOf() to find the first occurrence of what has been matched. In this case:
Text:    OX144FB
Match:       ^ match is correct: "4"
Text:    OX144FB
IndexOf:    ^ first occurrence of "4"
To fix it, drop the g flag (with g, match() returns a plain array of strings that has no .index) and use the .index of the match object:
// Find the index position of the final digit in the string (in this case '4')
var postcodeIndex = postCode.match(/\d(?=\D*$)/).index;

var postCode = "OX144FB";
return postCode.replace(/^(.*)(\d)(.*)/, "$1 $2$3");

Using the String.prototype.replace method is obviously the easiest way:
return postCode.replace(/(?=\d\D*$)/, ' ');
or using the greediness:
return postCode.replace(/^(.*)(?=\d)/, '$1 ');
Your previous code doesn't work because you are using indexOf to search for the substring matched by the String.prototype.match() method (that is, the last digit before the end). But if this digit occurs several times in the string, indexOf will return the position of the first occurrence.
As an aside, when you want to find the position of a match in a string, use the String.prototype.search() method, which returns this position.
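A short sketch of the search() approach, keeping the asker's slice-and-join idea (addPostcodeSpace is a hypothetical name):

```javascript
// Hypothetical helper: inserts a space before the final digit of a postcode.
function addPostcodeSpace(postCode) {
  // search() returns the index of the regex match itself, not of the first equal character.
  const index = postCode.search(/\d(?=\D*$)/);
  if (index === -1) return postCode; // no digit found: leave the input unchanged
  return postCode.slice(0, index) + " " + postCode.slice(index);
}

console.log(addPostcodeSpace("OX144FB")); // "OX14 4FB"
console.log(addPostcodeSpace("N12NL"));   // "N1 2NL"
console.log(addPostcodeSpace("CB30QB"));  // "CB3 0QB"
```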

This is an old problem, but whilst Avinash Raj's solution works, it only works if all your postcodes are without spaces. If you have a mix, and you want to regularize them to having a single space, you can use this regex:
string.replace(/(\S*)\s*(\d)/, "$1 $2");
It even works with more than one space!
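A quick sketch of that regex applied to mixed input. The function name normalisePostcode and the trim() call are additions for illustration, not part of the answer:

```javascript
// Hypothetical helper: ensures exactly one space before the final digit.
function normalisePostcode(raw) {
  // trim() is an extra assumption here, to drop stray leading/trailing whitespace.
  return raw.trim().replace(/(\S*)\s*(\d)/, "$1 $2");
}

console.log(normalisePostcode("OX144FB"));    // "OX14 4FB"
console.log(normalisePostcode("OX14 4FB"));   // "OX14 4FB"
console.log(normalisePostcode("OX14   4FB")); // "OX14 4FB"
```

Note that this only inserts or collapses the separator; a postcode with the space already in the wrong place (for example "OX1 44FB") is left as it is.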

Javascript regex match returning a string with comma at the end

Just as the title says, I'm trying to parse a string, for example
2x + 3y
and I'm trying to get only the coefficients (i.e. 2 and 3).
I first tokenized it with the space character as delimiter, giving me "2x" "+" "3y",
then I passed each token to this statement to get only the coefficients:
var number = eqTokens[i].match(/(\-)?\d+/);
I tried printing the output but it gave me "2,".
Why is it printing like this and how do I fix it? I tried using:
number = number.replace(/[,]/, "");
but this just gives me an error that number.replace is not a function.

What's wrong with this?
> "2x + 3y".match(/-?\d+(?=[A-Za-z]+)/g)
[ '2', '3' ]
The above regex matches a number only if it is followed by one or more letters.

Without the g flag, match() returns an array containing the full match followed by one entry per capture group. Since you put the optional negative sign in parentheses, it's a capture group. That group is optional, so when there is no minus sign its entry is undefined. Because the result is an array rather than a string, number.replace is not a function.
Input 2x -> Your output: ["2", undefined], which prints out as "2,"
Input -2x -> Your output: ["-2", "-"], which prints out as "-2,-"
Remove the parentheses around the negative.
This is just for the sake of explaining why your case is breaking but personally I'd use Avinash's answer.
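The behaviour described above can be reproduced with a few lines (variable names are illustrative):

```javascript
// With a capture group, the match array has two entries: the full match and the group.
const withGroup = "2x".match(/(\-)?\d+/);
console.log(withGroup.length);                  // 2
console.log(String(withGroup));                 // "2,"  (undefined joins as an empty string)
console.log(String("-2x".match(/(\-)?\d+/)));   // "-2,-"

// Without the parentheses there is only the full match.
const noGroup = "2x".match(/-?\d+/);
console.log(String(noGroup));    // "2"
console.log(Number(noGroup[0])); // 2
```

Reading element [0] of the match array gives the matched text as a string regardless of how many groups the pattern has.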

Regular expression to strip thousand separator from numeral string?

I have strings which contain thousand separators, but no string-to-number function wants to consume them correctly (using JavaScript). I'm thinking about "preparing" the string by stripping all thousand separators, leaving everything else untouched, and letting the Number/parseInt/parseFloat functions (I'm satisfied with their behaviour otherwise) decide the rest. But it seems that I have no idea which RegExp can do that!
Better ideas are welcome too!
UPDATE:
Sorry, the answers showed me how badly formulated the question is. What I'm trying to achieve is: 1) to strip thousand separators only, if any, but 2) to not disturb the original string much, so I will get NaNs in the cases of invalid numerals.
MORE UPDATE:
JavaScript is limited to the English locale for parsing, so let's assume the thousand separator is ',' for simplicity (naturally, it never matches the decimal separator in any locale, so changing to any other locale should not pose a problem).
Now, on parsing functions:
parseFloat('1023.95BARGAIN BYTES!') // parseXXX functions just "give up" on invalid chars and return 1023.95
Number('1023.95BARGAIN BYTES!') // while the Number constructor behaves "strictly" and will return NaN
Sometimes I use the loose one, sometimes the strict one. I want to figure out the best approach for preparing the string for both functions.
On validity of numerals:
'1,023.99' is perfectly well-formed English number, and stripping all commas will lead to correct result.
'1,0,2,3.99' is broken, however generic comma stripping will give '1023.99' which is unlikely to be a correct result.

welp, I'll venture to throw my suggestion into the pot:
Note: Revised
stringWithNumbers = stringWithNumbers.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");
should turn
1,234,567.12
1,023.99
1,0,2,3.99
the dang thing costs $1,205!!
95,5,0,432
12345,0000
1,2345
into:
1234567.12
1023.99
1,0,2,3.99
the dang thing costs $1205!!
95,5,0432
12345,0000
1,2345
I hope that's useful!
EDIT:
There is an additional alteration that may be necessary, but is not without side effects:
(\b\d{1,3}),(?=\d{3}(\D|$))
This changes the "one or more" quantifier (+) for the first set of digits into a "one to three" quantifier ({1,3}) and adds a "word-boundary" assertion before it. It will prevent replacements like 1234,123 ==> 1234123. However, it will also prevent a replacement that might be desired (if it is preceded by a letter or underscore), such as A123,789 or _1,555 (which will remain unchanged).
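To compare the two variants side by side, here is a small sketch; the names loose, strict and strip are made up for this example:

```javascript
const loose = /(\d+),(?=\d{3}(\D|$))/g;
const strict = /(\b\d{1,3}),(?=\d{3}(\D|$))/g;

// Hypothetical helper: removes the separators that the given pattern accepts.
function strip(text, pattern) {
  return text.replace(pattern, "$1");
}

for (const s of ["1,234,567.12", "1,0,2,3.99", "1234,123", "A123,789"]) {
  console.log(s, "->", strip(s, loose), "|", strip(s, strict));
}
// 1,234,567.12 -> 1234567.12 | 1234567.12
// 1,0,2,3.99 -> 1,0,2,3.99 | 1,0,2,3.99
// 1234,123 -> 1234123 | 1234,123
// A123,789 -> A123789 | A123,789
```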

A simple num.replace(/,/g, '') should be sufficient I think.

Depends on what your thousand separator is
myString = myString.replace(/[ ,]/g, "");
would remove spaces and commas.

This should work for you
var decimalCharacter = ".",
regex = new RegExp("[\\d" + decimalCharacter + "]+", "g"),
num = "10,0000,000,000.999";
+num.match(regex).join("");

To confirm that a numeral-string is well-formed, use:
/^(\d*|\d{1,3}(,\d{3})+)($|[^\d,])/.test(numeral_string)
which will return true if the numeral-string is either (1) just a sequence of zero or more digits, or (2) a sequence of digits with a comma before each set of three digits, or (3) either of the above followed by a character that is neither a digit nor a comma and who knows what else. (Case #3 is for floats, as well as your "BARGAIN BYTES!" examples.) The comma has to be excluded from that final character class; otherwise '1,0,2,3.99' would pass as the digit '1' followed by a non-digit.
Once you've confirmed that, use:
numeral_string.replace(/,/g, '')
which will return a copy of the numeral-string with all commas excised.
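Combining the two steps, a sketch might look like the following. It assumes ',' is the thousands separator, and its check is a variant of the pattern above whose final character class excludes a comma as well as digits; parseNumeral and WELL_FORMED are hypothetical names.

```javascript
// Assumption: ',' is the thousands separator. The final class excludes digits and commas.
const WELL_FORMED = /^(\d*|\d{1,3}(,\d{3})+)($|[^\d,])/;

// Hypothetical helper: strict parse that rejects badly placed commas.
function parseNumeral(text) {
  if (!WELL_FORMED.test(text)) return NaN;
  return Number(text.replace(/,/g, ""));
}

console.log(parseNumeral("1,023.99"));              // 1023.99
console.log(parseNumeral("1,0,2,3.99"));            // NaN
console.log(parseNumeral("1023.95BARGAIN BYTES!")); // NaN (Number is strict)
```

Swapping Number for parseFloat in the last step would give the looser behaviour described in the question.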

You can use s.replaceAll("(\\W)(?=\\d{3})",""); (this is Java syntax; the JavaScript equivalent is s.replace(/\W(?=\d{3})/g, "")).
This regex matches every non-alphanumeric character that is followed by 3 digits.
Strings like 4.444.444.444,00 € will become 4444444444,00 €

I have used the following in a commercial setting, and it has worked often:
numberStr = numberStr.replace(/[. ,](\d\d\d\D|\d\d\d$)/g,'$1');
In the above example, thousands can be marked with a period, a comma, or a space.
In some cases (like a price of 1000,5 Euros) the above doesn't work. If you need something more robust, this should handle those cases as well:
//convert a comma or space used as the cent placeholder to a decimal
$priceStr = $priceStr.replace(/[, ](\d\d$)/,'.$1');
$priceStr = $priceStr.replace(/[, ](\d$)/,'.$1');
//capture cents
var $hasCentsRegex = /[.]\d\d?$/;
if($hasCentsRegex.test($priceStr)) {
var $matchArray = $priceStr.match(/(.*)([.]\d\d?$)/);
var $priceBeforeCents = $matchArray[1];
var $cents = $matchArray[2];
} else {
var $priceBeforeCents = $priceStr;
var $cents = "";
}
//remove decimals, commas and whitespace from the pre-cent portion
$priceBeforeCents = $priceBeforeCents.replace(/[.\s,]/g,'');
//re-create the price by adding back the cents
$priceStr = $priceBeforeCents + $cents;

Javascript Regex: replacing the last dot for a comma

I have the following code:
var x = "100.007"
x = String(parseFloat(x).toFixed(2));
return x
=> 100.01
This works awesomely just how I want it to work. I just want a tiny addition, which is something like:
var x = "100,007"
x.replace(",", ".")
x.replace
x = String(parseFloat(x).toFixed(2));
x.replace(".", ",")
return x
=> 100,01
However, this code will replace the first occurrence of the ",", where I want to catch the last one. Any help would be appreciated.

You can do it with a regular expression:
x = x.replace(/,([^,]*)$/, ".$1");
That regular expression matches a comma followed by any amount of text not including a comma, up to the end of the string. The replacement string is just a period followed by whatever it was that came after the original last comma. Other commas preceding it in the string won't be affected.
Now, if you're really converting numbers formatted in "European style" (for lack of a better term), you're also going to need to worry about the "." characters in places where a "U.S. style" number would have commas. I think you would probably just want to get rid of them:
x = x.replace(/\./g, '');
When you use the ".replace()" function on a string, you should understand that it returns the modified string. It does not modify the original string, however, so a statement like:
x.replace(/something/, "something else");
has no effect on the value of "x".
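Pulling those points together for the question's use case, a rough sketch follows. roundEuropean is a hypothetical name, and it assumes "." is only ever a thousands separator in the input:

```javascript
// Hypothetical helper: takes "1.234,567"-style text, rounds to two decimals,
// and returns the result with a decimal comma again.
function roundEuropean(text) {
  // Each replace() returns a new string, so the results are chained or reassigned.
  const normalised = text
    .replace(/\./g, "")           // drop "." thousands separators
    .replace(/,([^,]*)$/, ".$1"); // the last comma becomes the decimal point
  return parseFloat(normalised).toFixed(2).replace(".", ",");
}

console.log(roundEuropean("100,007"));   // "100,01"
console.log(roundEuropean("1.234,567")); // "1234,57"
```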

You can use a regexp. You want to replace the last ',', so the basic idea is to replace the ',' for which there's no ',' after.
x.replace(/,([^,]*)$/, ".$1");
Will return what you want :-).

You could do it using the lastIndexOf() function to find the last occurrence of the , and replace it.
The alternative is to use a regular expression with the end-of-string anchor ($):
myOldString.replace(/,([^,]*)$/, ".$1");

You can use lastIndexOf to find the last occurrence of ,. Then you can use slice to put the parts before and after the , together with a . in between.
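A minimal sketch of that lastIndexOf/slice approach (replaceLastComma is a hypothetical name):

```javascript
// Hypothetical helper: replaces only the last comma in the text with a period.
function replaceLastComma(text) {
  const i = text.lastIndexOf(",");
  if (i === -1) return text; // no comma: nothing to replace
  return text.slice(0, i) + "." + text.slice(i + 1);
}

console.log(replaceLastComma("100,007"));  // "100.007"
console.log(replaceLastComma("1,234,56")); // "1,234.56"
```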

You don't need to worry about whether or not it's the last ".", because there is only one. JavaScript doesn't store numbers internally with comma or dot-delimited sets.
