How do I use regex to pick the following price values?

I am trying to extract price values in Rs from strings in JavaScript. The inputs look like this:
The price was Rs.1000
The price was Rs 1000
The price was Rs.1000 - 5000
The price was Rs.1000 - Rs.5000
The price was Rs.50,000
The price was Rs 1,25,000 - Rs 2,45,000
Given how much the input varies, it's obviously not a good idea to write a single, very long, cumbersome regex.
Currently I have divided this task into 4 parts:
Part 1
// Extracts all Rs.1000 or Rs 1000
var regex = new RegExp(/\brs\W*?(\d{1,7})\b(?![,\d])/i)
Part 2
//Extracts all Rs.1000 - 2000 or Rs 1000 - Rs 2000 and any combinations of this
regex = new RegExp(/\brs\W*?(\d{1,7})\b(?![,\d])\s*?(?:-|to)\s*?(?:\brs\b\W*?)?(\d{1,7})\b(?![,\d])/i)
I need to capture the currency values, like 1000 and 2000, to store and process them.
A few questions right off the bat: my array in JS has around 3000 items, and I am stuck on Parts 3 and 4, which involve commas. Is this the right way to go about it?
How do I get the values in one stroke where commas are present?
This regex seems to capture both plain numbers and numbers with commas, which suits me since I only want the numeric values and don't care where the commas are placed:
\brs\W*?\d.,?.\d\b
I am trying to take this expression one step further to include the 1000 - 2000 type as well. Any ideas?

You can use a single regex for this task: the prices follow a regular pattern, so you can build the pattern dynamically from smaller blocks. There are 2 main alternatives: one matches prices glued to other words (so that this text can be skipped), and the other captures prices only in valid contexts.
The whole regex looks ugly and long:
/\Brs\W*(?:\d{1,7}(?:,\d+)*)\b(?:\s*(?:-|to)\s*(?:\brs\b\W*?)?(?:\d{1,7}(?:,\d+)*)\b)?|\brs\W*(\d{1,7}(?:,\d+)*)\b(?:\s*(?:-|to)\s*(?:\brs\b\W*?)?(\d{1,7}(?:,\d+)*)\b)?/gi
However, it is clear it consists of simple and easily editable building blocks:
(\\d{1,7}(?:,\\d+)*)\\b - the number part
rs\\W*${num}(?:\\s*(?:-|to)\\s*(?:\\brs\\b\\W*?)?${num})? - the price part
NOTE that the capturing groups are made non-capturing with .replace(/\((?!\?:)/g, '(?:') further in the RegExp constructor.
See the JS demo:
const num = "(\\d{1,7}(?:,\\d+)*)\\b";
const block = `rs\\W*${num}(?:\\s*(?:-|to)\\s*(?:\\brs\\b\\W*?)?${num})?`;
const regex = RegExp(`\\B${block.replace(/\((?!\?:)/g, '(?:')}|\\b${block}`, 'ig');
const str = `The price was Rs.1000
The price was Rs 1000
The price was Rs.1000 - 5000
The price was Rs.1000 - Rs.5000
The price was Rs.50,000
The price was Rs 1,25,000 - Rs 2,45,000
The price was dummytestRs 1,2665,000 - Rs 2,45,000`;
let m;
let result = [];
while ((m = regex.exec(str)) !== null) {
if (m[2]) {
result.push([m[1].replace(/,/g, ''), m[2].replace(/,/g, '')]); // strip commas from both ends of the range
} else if (m[1]) {
result.push([m[1].replace(/,/g, ''), ""]);
}
}
document.body.innerHTML = "<pre>" + JSON.stringify(result, 0, 4) + "</pre>";
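To see what the .replace(/\((?!\?:)/g, '(?:') step buys you, here is a minimal sketch that rebuilds the same pattern and returns the prices as numbers instead of writing to the page. The helper name extractPrices and the use of String.prototype.matchAll are my own choices for illustration, not part of the demo above.

```javascript
// Same building blocks as in the demo above.
const num = "(\\d{1,7}(?:,\\d+)*)\\b";
const block = `rs\\W*${num}(?:\\s*(?:-|to)\\s*(?:\\brs\\b\\W*?)?${num})?`;

// Every capturing "(" becomes "(?:", so the "glued" alternative has no groups
// and the two groups of the valid alternative are always numbered 1 and 2.
const nonCapturing = block.replace(/\((?!\?:)/g, "(?:");
const priceRegex = new RegExp(`\\B${nonCapturing}|\\b${block}`, "gi");

// Hypothetical helper: returns [low, high] pairs, with high = null for single prices.
function extractPrices(text) {
  const results = [];
  for (const m of text.matchAll(priceRegex)) {
    if (m[1] === undefined) continue; // matched the glued alternative: skip it
    const low = Number(m[1].replace(/,/g, ""));
    const high = m[2] === undefined ? null : Number(m[2].replace(/,/g, ""));
    results.push([low, high]);
  }
  return results;
}

console.log(extractPrices("The price was Rs 1,25,000 - Rs 2,45,000")); // [ [ 125000, 245000 ] ]
console.log(extractPrices("The price was dummytestRs 1,000"));         // []
```

Because the glued alternative is tried first and consumes the whole price, the valid alternative never gets a chance to match inside a word.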

Related

Regex validation for time tracking

I am trying to validate a string in JavaScript the way Jira does. I am guessing I could do this with a regex, but I am not sure how.
A user can type a string in the format "1d 6h 30m", which would mean 1 day, 6 hours, 30 minutes. I do not need weeks for my use case. I want to show an error if the user uses an invalid character (anything except 'd', 'h', 'm', or ' '). The string must also separate the time durations by spaces, and ideally I would like to force the user to enter the durations in descending order, meaning '6h 1d' would be invalid because the days should come first. The user does not have to enter all the information, so '30m' would be valid.
This is my code for getting the days, hours and minutes which seems to work. I just need help with the validation part.
let time = '12h 21d 30m'; //example
let times = time.split(' ');
let days = 0;
let hours = 0;
let min = 0;
for(let i = 0; i < times.length; i++) {
if (times[i].includes('d')){
days = times[i].split('d')[0];
}
if (times[i].includes('h')){
hours = times[i].split('h')[0];
}
if (times[i].includes('m')){
min = times[i].split('m')[0];
}
}
console.log(days);
console.log(hours);
console.log(min);

const INPUT = "12h 21d 30s";
checkTimespanFormat(INPUT);
if (checkTimespanKeysOrder(INPUT))
console.log(`${INPUT} keys order is valid`);
else console.log(`${INPUT} keys order is NOT valid`);
//******************************************************************
/**
* Ensures that time keys are:
* - Preceded by one or two digits
* - Separated by one or many spaces
*/
function checkTimespanFormat(timespanStr, multiSpacesSeparation = false) {
// timespan items may be separated by one or many spaces
if (multiSpacesSeparation) timespanStr = timespanStr.toLowerCase().split(/ +/);
// timespan items must be separated by exactly 1 space
else timespanStr = timespanStr.toLowerCase().split(" ");
// timespan items must be formatted correctly
timespanStr.forEach((item) => {
if (!/^\d{1,2}[dhms]$/.test(item)) console.log("invalid", item);
else console.log("valid", item);
});
}
/**
* Validates:
* - Keys order
* - Duplicate keys
*/
function checkTimespanKeysOrder(timespanStr) {
const ORDER = ["d", "h", "m", "s"];
let timeKeysOrder = timespanStr
.replace(/[^dhms]/g, "") // Removing non time keys characters
.split("") // toArray
.map((char) => {
return ORDER.indexOf(char); // Getting the order of time keys
});
for (let i = 0; i < timeKeysOrder.length - 1; i++)
if (timeKeysOrder.at(i) >= timeKeysOrder.at(i + 1)) return false;
return true;
}

Based on your comment, I have added a validation regex to run before the match regex.
For validation, you want
/^(?=\d)(\d+d(\s+|$))?(\d+h(\s+|$))?(\d+m(\s+|$))?(\d+s(\s+|$))?$/
For extracting values, you want
/\d+[dhms](?=\s|$)/g
You can then use String.match with this regular expression, iterating through all the matches and adding time based on the time letter at the end.
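A minimal sketch of this validate-then-extract flow. The exact patterns, the function name parseDuration and the shape of the returned object are assumptions made for illustration, so adjust them to your own rules.

```javascript
// Assumed validation pattern: units in d, h, m, s order, each optional,
// separated by whitespace, and at least one of them present.
const VALID = /^(?=\d)(?:\d+d(?:\s+|$))?(?:\d+h(?:\s+|$))?(?:\d+m(?:\s+|$))?(?:\d+s(?:\s+|$))?$/;
// Assumed extraction pattern: a number followed by its unit letter.
const PART = /(\d+)([dhms])/g;

// Hypothetical helper: returns null for invalid input, otherwise the totals per unit.
function parseDuration(input) {
  if (!VALID.test(input)) return null;
  const totals = { d: 0, h: 0, m: 0, s: 0 };
  for (const [, amount, unit] of input.matchAll(PART)) {
    totals[unit] += Number(amount);
  }
  return totals;
}

console.log(parseDuration("1d 6h 30m")); // { d: 1, h: 6, m: 30, s: 0 }
console.log(parseDuration("6h 1d"));     // null (wrong order)
```

Running the validation first means the extraction step can stay simple, because it only ever sees strings that are already well formed.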

Here's my take on the problem:
match minutes ([1-5]?[\d])m, with eligible values in range [0,59]
match hours ([1]?[\d]|2[0-3])h , with eligible values in range [0,23]
match days ([1-9]|[1-9][\d])d, with eligible values in range [1,99]
Then we can encapsulate regex for days in hours and regex for hours in minutes to make sure that we have formats like {dd}d {hh}h {mm}m, {hh}h {mm}m, {mm}m:
"(((([1-9]|[1-9][\d])d )?([1]?[\d]|2[0-3])h )*([1-5]?[\d])m)"
Corner cases include inputs with zeros like 00m, 00d 00m, 01h 00d 00m. In order to reject the former two and accept the last one, we can negate the values 00m and 00d when the preceding patterns are not found.
Final regex to use:
"(?!0m)(((?!0h)(([1-9]|[1-9][\d])d )?([1]?[\d]|2[0-3])h )*([1-5]?[\d])m)"
Check for:
days in Group 4
hours in Group 5
minutes in Group 6
Tested at https://regex101.com.
Does it solve your problem?
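For reference, a small sketch of how the groups can be read in JavaScript. The ^ and $ anchors are my addition (an assumption, so that the whole input has to match), and readGroups is a made-up helper name.

```javascript
// The final regex from above, anchored to the whole input (the anchors are an assumption).
const DURATION = /^(?!0m)(((?!0h)(([1-9]|[1-9][\d])d )?([1]?[\d]|2[0-3])h )*([1-5]?[\d])m)$/;

// Hypothetical helper: returns the captured days, hours and minutes as strings,
// or null when the input does not match. Missing parts come back as undefined.
function readGroups(input) {
  const m = DURATION.exec(input);
  if (m === null) return null;
  return { days: m[4], hours: m[5], minutes: m[6] };
}

console.log(readGroups("1d 6h 30m")); // { days: '1', hours: '6', minutes: '30' }
console.log(readGroups("6h 1d"));     // null
```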

Generated hashid size

Today I am generating the hashid as follows:
const Hashids = require('hashids');
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
let number = 1419856
let hash = new Hashids('Salto do Pedro', 6, ALPHABET).encode(number)
console.log("Hash:", hash, ". Number:", number, ". Size:", hash.length)
So what is printed on the console is:
[Running] node "c:\Users\pedro\Desktop\teste\testPedro.js"
Hash: YMMMMM . Number: 1419856 . Size: 6
[Done] exited with code=0 in 0.258 seconds
However, if I change the variable 'number' to number 1419857 the result is:
[Running] node "c:\Users\pedro\Desktop\teste\testPedro.js"
Hash: DRVVVVV . Number: 1419857 . Size: 7
[Done] exited with code=0 in 0.245 seconds
My question is:
The alphabet I am passing has 26 characters, and I set the minimum hashid size to 6 characters. Shouldn't the largest number I can encode in 6 characters be 308,915,776 (26 * 26 * 26 * 26 * 26 * 26)?
Why does the hashid already gain an extra character at the number 1,419,857?

Good question. This might look intimidating, but I will try to keep it as simple as possible by walking through the math behind the code.
The Hashids constructor takes the parameters (salt, minLength, alphabet, seps):
Salt - String value which makes your ids unique in your project
MinLength - Number value which is the minimum length of id string you need
alphabet - String value (Input string)
seps - String value to take care of curse words
With these set, it builds a buffer array from the salt and characters chosen based on the salt, seps and alphabet passed, and shuffles the position of each character.
Below is the code that encodes the value using that character array:
id = []
do {
id.unshift(alphabetChars[input % alphabetChars.length])
input = Math.floor(input / alphabetChars.length)
} while (input > 0)
Let's first take example 1 -
this.salt = 'Salto do Pedro'
this.minLength = 6
input = 1419856
// alphabetChars is the array which generates based on salt,alphabet and seps. (complex operations invol
alphabetChars = ["A","X","R","N","W","G","Q","O","L","D","V","Y","K","J","E","Z","M"]
The final array is then joined into a string, and a lottery character (computed by a separate calculation) is prepended. This is returned as the encoded string.
Now take example 2 -
this.salt = 'Salto do Pedro'
this.minLength = 6
input = 1419857
// alphabetChars is the array which generates based on salt,alphabet and seps. (complex operations invol
alphabetChars = ["V","R","Y","W","J","G","M","L","Z","K","O","D","E","A","Q","N","X"]
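This is why you get one extra character when the number changes: the loop ran one more time. The working alphabet shown above has 17 characters, not 26, so five characters cover only the inputs up to 17^5 - 1 = 1,419,856; the input 1,419,857 = 17^5 needs a sixth, and the lottery character brings the total to 7. Note that minLength only sets a minimum length, not a maximum, so you cannot assume the hash will always stay at that length.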
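The arithmetic can be checked with a small sketch that only counts how many times the encoding loop runs. The function name digitCount is made up, 17 is simply the length of the alphabetChars arrays shown above, and nothing here reproduces the library's shuffling or lottery character.

```javascript
// Counts the iterations of the do/while encoding loop for a given
// input and alphabet length (one output character per iteration).
function digitCount(input, alphabetLength) {
  let count = 0;
  do {
    count += 1;
    input = Math.floor(input / alphabetLength);
  } while (input > 0);
  return count;
}

console.log(17 ** 5);                 // 1419857
console.log(digitCount(1419856, 17)); // 5 (plus the lottery character = 6)
console.log(digitCount(1419857, 17)); // 6 (plus the lottery character = 7)
```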
Hope it helps. If you want to dig in more - Here's the code for the library - https://github.com/niieani/hashids.js/blob/master/dist/hashids.js#L197

How to find time stamp in the string and then convert to number of hours/minutes?

In the column where the hours/minutes are stored, the time stamps for some of the business facilities are presented in the format 0000-0000. The first two digits of each part represent hours and the other two minutes. Here are some example business hours:
0700-2300 M-F 0700-1700 S&S
0600-2200
0600-2300 Mon-Fri 0700-2200 Sat&Sun
Local 1 0000-2359 Local 2 0630-2230
0700-2100
0600-2345
My original solution was to convert the values in JavaScript, which was pretty simple. The problem arises when there is more than one set of hours/minutes in the string, as in the examples above where the hours differ at the weekend or between locations. The JS code example is here:
var time = calcDifference("0600-2345");
function calcDifference(time) {
var arr = time.split('-').map(function(str) {
var hours = parseInt(str.substr(0, 2), 10),
minutes = parseInt(str.substr(2, 4), 10);
var result = (hours * 60 + minutes) / 60;
return result.toFixed(2);
});
return arr[1] - arr[0];
}
console.log(time);
The code above works just fine if I pass an argument with one time stamp. I'm wondering how to handle the situation where I have two time stamps. What would be a good way to find out whether the string has more than one hours/minutes value, then convert each of them and return the results to the user?

Assuming the HHMM-HHMM format is consistent in the input, and you don't care about discarding the remaining information in the string, regex is probably the simplest approach (and much safer than your current approach of splitting on hyphens, which might easily occur in other parts of the string you don't care about.)
Note that you won't be able to distinguish between "weekend" and "weekday" times, because that information isn't in a consistent format in your input. (This looks like human input, which pretty much guarantees that your HHMM-HHMM format also won't be strictly consistent; consider allowing for optional whitespace around the hyphen for example, and logging strings which show no match so you can check them manually.)
var testinputs = [
"0700-2300 M-F 0700-1700 S&S",
"0600-2200",
"0600-2300 Mon-Fri 0700-2200 Sat&Sun",
"Local 1 0000-2359 Local 2 0630-2230",
"0700-2100",
"0600-2345"
]
var reg = /(\d\d)(\d\d)\-(\d\d)(\d\d)/g; // \d means any digit 0-9; \- matches a literal "-", parens capture the group for easier access later
for (var input of testinputs) {
console.log("Input: ", input)
var matches;
while ((matches = reg.exec(input)) !== null) { // loop through every matching instance in the string
// matches[0] is the full HHMM-HHMM string; the remainder is
// the HH and MM for each parenthetical in the regexp:
console.log(matches)
}
}
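Building on that loop, here is a rough sketch that turns every match into a duration in hours. The function name rangeDurations is hypothetical, and the pattern allows optional whitespace around the hyphen, as suggested above. It assumes the end time is later than the start time on the same day.

```javascript
// Hypothetical helper: returns one duration (in hours) per HHMM-HHMM range found.
function rangeDurations(text) {
  const pattern = /(\d\d)(\d\d)\s*-\s*(\d\d)(\d\d)/g;
  const hours = [];
  for (const [, h1, m1, h2, m2] of text.matchAll(pattern)) {
    const start = Number(h1) * 60 + Number(m1); // minutes since midnight
    const end = Number(h2) * 60 + Number(m2);
    hours.push((end - start) / 60);
  }
  return hours;
}

console.log(rangeDurations("0700-2300 M-F 0700-1700 S&S")); // [ 16, 10 ]
console.log(rangeDurations("0600-2345"));                   // [ 17.75 ]
```

An empty array tells you the string contained no recognizable range, which is a convenient signal for logging inputs to check by hand.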

There are plenty of ways to do this (depending on your point of view), but this is my favourite one. You can extract the time ranges from the text first, then pass them on individually.
var date = '0700-2300 M-F 0700-1700 S&S'.match( /[-\d]+/gi ).filter( e => ~e.search( /\d+/gi ) )
Now you have an array of the time ranges found in the string saved in your database, and you can pass them individually to your function.
date.forEach( each => calcDifference( each ) );

You can use a regex like /\d{4}\-\d{4}/g to extract all of the time ranges from the string and map them to their time differences, or replace them in the original text.
const calcDifference = range => {
const time = range.split`-`
.map(e => (+e.substr(0, 2) * 60 + (+e.substr(2))) / 60)
return time[1] - time[0];
};
const diffs = `0700-2300 M-F 0700-1700 S&S
0600-2200
0600-2300 Mon-Fri 0700-2200 Sat&Sun
Local 1 0000-2359 Local 2 0630-2230
0700-2100
0600-2345`.replace(/\d{4}\-\d{4}/g, calcDifference);
console.log(diffs);

How to create a unique value each time when ever I run the java-script code?

I am using Math.random to create a unique value.
However, it looks like after some days, if I run the same script, it produces a value that was already created earlier.
Is there any way to create a unique value every time I run the script?
Below is my code for the random method.
var RandomNo = function (Min,Max){
return Math.floor(Math.random() * (Max - Min + 1)) + Min;
}
module.exports = RandomNo;

A simple way to get a value that does not repeat is to use Date() as milliseconds. This increasing time representation will not repeat as long as no two values are generated within the same millisecond and the system clock is not set back.
Do it this way:
var RandomNo = new Date().getTime();
Done.
Edit
If you are bound to length restrictions, the solution above won't help you: the shorter an increasing number gets, the more predictably it repeats.
Then I'd suggest the following approach:
// turn an integer (0-255) into a 2-character base-36 string
function dec2string (dec) {
return ('0' + dec.toString(36)).substr(-2);
}
// generate a 20 * 2 characters long random string
function generateId () {
var arr = new Uint8Array(20);
window.crypto.getRandomValues(arr);
// return 5 characters of this string starting from position 8.
// here one can increase the quality of randomness by varying
// the position (currently static 8) by another random number <= 35
return Array.from(arr, dec2string).join('').substr(8,5);
}
// Test
console.log(generateId());
These two functions generate a 40-character random string consisting of lowercase letters and digits, from which a sequence of 5 consecutive characters is picked.
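Since the question's code uses module.exports, here is a hedged Node.js variant of the same idea, assuming the built-in crypto module is available. The function name generateId and the default of 16 bytes are arbitrary choices, and a random id like this makes repeats extremely unlikely rather than strictly impossible.

```javascript
// Assumes Node.js; crypto.randomBytes is part of the built-in crypto module.
const nodeCrypto = require("crypto");

// Hypothetical helper: returns byteLength random bytes as a lowercase hex string
// (two hex characters per byte, so 32 characters by default).
function generateId(byteLength = 16) {
  return nodeCrypto.randomBytes(byteLength).toString("hex");
}

console.log(generateId());
```

If the id has to be short, lowering byteLength raises the chance of a repeat, so it is worth checking new ids against the ones already stored.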

trim trailing zeros not quite working

I have a number of prices on a page that are shown to 4 decimal places. Some products are priced to 4 decimal places, but some only to 2.
At the moment our website is set to display 4 decimal places for every product, and I have to use JavaScript to trim the prices of those products that only need 2 decimal places.
So I have prices like so...
£0.1234
£1.1000
£10.9900
£100.0000
I have the following JavaScript, which works fine for prices that have a non-zero digit after the decimal point, but it fails on prices where there are only 0's after the decimal point...
$.each($("#mydiv"),function(){
var price = $(this).text().replace("£","");
var number = parseFloat(price);
var integerPart = number.toString().split(".")[0] == 0 ? 0: number.toString().split(".")[0].length;
var decimalPart = number.toString().split(".")[1].length;
if(decimalPart > 2){
$(this).text("£" + number.toPrecision(integerPart + decimalPart));
}else{
$(this).text("£" + number.toPrecision(integerPart + 2));
}
});
The ones it fails on are prices like £100.0000 - I would like the prices to appear as follows - no rounding...
£0.1234
£1.10
£10.99
£100.00

Just use a regexp to remove the two trailing zeroes when they are preceded by the decimal point and another two digits:
$('.myClass').text(function(_, t) {
return t.replace(/(\.\d\d)00$/, '$1');
});
NB: you can't use duplicate element IDs, so your $.each call should be moot. If there really are multiple fields that this needs doing to, mark them with a class, not an ID. The .text call in the code above will automatically cope with multiple elements.
EDIT if you really can't upgrade your jQuery:
$('.myClass').each(function() {
var $this = $(this);
var text = $this.text();
text = text.replace(/(\.\d\d)00$/, '$1');
$this.text(text);
});
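The same replacement can be tried without jQuery or a page. This is only a sketch, with trimPrice as a made-up name, run against the four prices from the question:

```javascript
// Removes the last two zeroes only when the text ends in ".dd00".
function trimPrice(text) {
  return text.replace(/(\.\d\d)00$/, "$1");
}

for (const price of ["£0.1234", "£1.1000", "£10.9900", "£100.0000"]) {
  console.log(trimPrice(price));
}
// £0.1234
// £1.10
// £10.99
// £100.00
```

Because it works on the text rather than on a parsed number, nothing gets rounded.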

In addition to Alnitak's answer, I would highly recommend storing the prices of your store's items as integers (or maybe, in your case, longs), so you don't have to worry about the imprecision of double or float.
Because you have 4 decimal places, the integers (or longs) cannot represent pence; they represent 1/100 of a penny.
Example:
Price on website | Price in database
-----------------|------------------
£0.1234          | 1234
£1.10            | 11000
£10.99           | 109900
£100.00          | 1000000
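Going the other way, from the stored integer back to the displayed price, could look roughly like this. formatPrice is a hypothetical name, and the sketch assumes non-negative integers in units of 1/100 of a penny, as in the table above.

```javascript
// Hypothetical helper: 10000 units = £1. Shows 2 decimals when the last two are 00.
function formatPrice(units) {
  const pounds = Math.floor(units / 10000);
  const fraction = String(units % 10000).padStart(4, "0");
  const shown = fraction.replace(/^(\d\d)00$/, "$1");
  return `£${pounds}.${shown}`;
}

console.log(formatPrice(1234));    // £0.1234
console.log(formatPrice(11000));   // £1.10
console.log(formatPrice(109900));  // £10.99
console.log(formatPrice(1000000)); // £100.00
```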
