I use match to split a mathematical expression into separate strings and save them in an array.
var STRING = ST.match(/\d*\.\d+|\d+|[()/*+-]/g);
But this method separates everything, including negative numbers inside parentheses.
For example, (-2+4) does not give me -2; instead it saves - at one index of the STRING array and 2 at the next index.
Is there any way to use match and keep negative numbers that are inside parentheses?
This is what I want:
(-2+4):
STRING[0] give me (
STRING[1] give me -2
STRING[2] give me +
STRING[3] give me 4
STRING[4] give me )
and if there is no negative number, it should work as normal:
(2+4):
STRING[0] give me (
STRING[1] give me 2
STRING[2] give me +
STRING[3] give me 4
STRING[4] give me )
I don't think it's possible to parse complex cases like "(-2+4*-(3.5--8))" with just a regex, especially where negative lookbehind isn't available (JavaScript only gained lookbehind in ES2018).
A solution would be to postprocess your match array by merging signs when they're between a separator and an unsigned expression.
In my opinion a regex is useful here, but only for the primary tokenization. Most of the work will be ahead of you as you'll build the binary expression tree (or any other formal representation you choose).
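The post-processing step described above could be sketched roughly as follows. The function name tokenize and the exact rule for what counts as a "separator" are assumptions made for illustration, not something the question specifies: here a minus is treated as a sign only when it starts the expression or follows an opening parenthesis or another operator, and is directly followed by an unsigned number.

```javascript
// Hypothetical helper: tokenize with the question's regex, then merge unary minus signs.
function tokenize(expression) {
  const raw = expression.match(/\d*\.\d+|\d+|[()\/*+-]/g) || [];
  const tokens = [];
  for (let i = 0; i < raw.length; i++) {
    const token = raw[i];
    const previous = tokens[tokens.length - 1];
    const next = raw[i + 1];
    // Assumption: a "-" is a sign at the start, or right after "(" or an operator.
    const isSignPosition = previous === undefined || /^[(\/*+-]$/.test(previous);
    if (token === '-' && isSignPosition && next !== undefined && /^[\d.]/.test(next)) {
      tokens.push('-' + next);
      i++; // skip the number that was just merged into the signed token
    } else {
      tokens.push(token);
    }
  }
  return tokens;
}

console.log(tokenize('(-2+4)')); // [ '(', '-2', '+', '4', ')' ]
console.log(tokenize('(4-2)'));  // [ '(', '4', '-', '2', ')' ]
```

A minus in front of a parenthesis, as in 4*-(3.5), is left as a separate token in this sketch, so whatever consumes the array still has to handle that case.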
Unfortunately, if what you're trying to do is parse a mathematical expression, regexps cannot be used.
Regular expressions can recognize languages that are describable by regular grammars, and arithmetic expressions are not among them: they are described by a context-free grammar (CFG). If you want to parse, and perhaps interpret the result, you'll certainly need some state machine with a stack.
You can look at something like this well known algorithm.
Hope this helps.
You can add an optional sign to the numbers, that would work with your example:
var STRING = ST.match(/-?\d*\.\d+|-?\d+|[()/*+-]/g);
However, that will also turn a minus operator into a sign. The expression (4-2) would give you { "(", "4", "-2", ")" }.
Also, it will happily "parse" an expression like +---((((*** without complaining. If you want a result that makes sense, you should parse it for real, not just split it with a regular expression.
I think you have a mistake in your RegExp. Try this, it works for me:
var STRING = ST.match(/(\d*)(\.)(\d+)|(\d+)|[()\/*+-]/g);
Suppose we have a string with some (astral) Unicode characters:
const s = 'Hi π Unicode!'
The [] operator and .charAt() method don't work for getting the 4th character, which should be "π":
> s[3]
'οΏ½'
> s.charAt(3)
'οΏ½'
The .codePointAt() does get the correct value for the 4th character, but unfortunately it's a number and has to be converted back to a string using String.fromCodePoint():
> String.fromCodePoint(s.codePointAt(3))
'π'
Similarly, converting the string into an array using splats yields valid Unicode characters, so that's another way of getting the 4th one:
> [...s][3]
'π'
But I can't believe that going from string to number and back to string, or having to split the string into an array, are the only ways of doing this seemingly trivial thing. Isn't there a simple method for doing this?
> s.simpleMethod(3)
'π'
Note: i know that the definition of "character" is somewhat fuzzy, but for the purpose of this question a character is simply the symbol that corresponds to a Unicode codepoint (no combining characters, no grapheme clusters, etc).
Update: the String.fromCodePoint(str.codePointAt(n)) method is not really viable, since the nth position there doesn't take previous astral symbols into account: String.fromCodePoint('ππ'.codePointAt(1)) // => 'οΏ½'
(I feel kinda dumb asking this; like i'm probably missing something obvious. But previous answers to this questions don't work on strings with Unicode simbols on astral planes.)
The string iterator is the only thing that iterates through code points rather than UCS-2/UTF-16 code units. So:
const string = 'Hi 😀 Unicode!';
for (const symbol of string) {
console.log(symbol);
}
So to get a specific code point based on its index from a string:
const string = 'Hi 😀 Unicode!';
// Note: The spread operator uses the string iterator under the hood.
const symbols = [...string];
symbols[3]; // 'π'
Still, this would break with grapheme clusters, or emoji sequences such as 👨‍👩‍👧‍👦 (👨 + U+200D ZERO WIDTH JOINER + 👩 + U+200D ZERO WIDTH JOINER + 👧 + U+200D ZERO WIDTH JOINER + 👦). Text segmentation helps with that.
Do you actually need to get the 4th code point in the string, though? What's your use case?
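If building the whole array feels wasteful, the same string iterator can be stopped as soon as the wanted index is reached. The helper name codePointCharAt below is made up for illustration; this is only a sketch of the idea, and it still counts code points, not grapheme clusters.

```javascript
// Hypothetical helper: return the symbol at a code point index, or undefined if out of range.
function codePointCharAt(string, index) {
  let position = 0;
  // for...of uses the string iterator, which steps through code points, not code units.
  for (const symbol of string) {
    if (position === index) return symbol;
    position++;
  }
  return undefined;
}

// '\u{1F600}' is an astral symbol (one code point, two UTF-16 code units).
console.log(codePointCharAt('Hi \u{1F600} Unicode!', 3) === '\u{1F600}'); // true
console.log(codePointCharAt('\u{1F600}\u{1F600}', 1) === '\u{1F600}');    // true
```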
You can use the new u flag on the regexp if it's available to you.
const chars = 'Hi 😀 Unicode!'.match(/./ug);
console.log(chars);
The accepted answer to this question is out of date.
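There is now a member of the String object called .at() which does exactly what you're hoping for. If you have shims, shams, a transcompiler like TypeScript or Babel, etc., just set whatever your local configuration is, and you should be good to go. (Do check what your environment actually implements: the String.prototype.at() that was standardized in ES2022 indexes UTF-16 code units, not code points.)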
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/at
Amusingly, the spec for this feature, as well as the most common implementation shim (the one that I use), is written by the person who authored the now out-of-date accepted answer here. So even when he's out of date, he's still up to date.
If shimming or transcompiling isn't appropriate for you, there's a library called jsesc that can handle it for you through simple escaping. I'll give you three guesses who wrote the library. First two don't count.
https://www.npmjs.com/package/jsesc
I stumbled onto this regex problem and I googled it. A lot. But unfortunately I didn't quite understand how regex works...
So to make this quick, since only a teeny-weeny part of my work requires it, I will be needing you guys again :))
So here it goes...
All I want is to retrieve a specific string with a format of 0000x0000. For example:
Input:NameName975x945NameName
Output:
975x945
Must also consider string like this:
NameNameName9751x9451NameNameName
(the integer and string are longer...)
Use regex in String.prototype.match() to get specific part of string.
str.match(/\d+x\d+/)[0]
var str = "NameName975x945NameName";
var match = str.match(/\d+x\d+/)[0];
console.log(match)
We need a bit more detail, but I'll go in order:
Assuming there can be any number of digits before and after the x, and these can be of different lengths:
[\d]+x[\d]+
Assuming the number of digits before the x needs to be equal to the number of digits after the x (as in your example) and this number is finite (and small enough so that your regex isn't obscenely long):
[\d]{1}x[\d]{1}|[\d]{2}x[\d]{2}|[\d]{3}x[\d]{3} (and so on)
Check out this related answer for more details on handling this as the length of the number gets longer.
Then you can use String.prototype.match() with your regex to grab the matches within your string.
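As an alternative to spelling out one alternation per length, the equal-length case can be checked in code after a single match. This is only a sketch: the function name extractDimensions and the requireEqualLength parameter are made up here, and the check applies to the first maximal digit runs around the x, which is not exactly what the alternation above matches.

```javascript
// Hypothetical helper: pull the first "<digits>x<digits>" out of a string.
function extractDimensions(input, requireEqualLength) {
  const match = input.match(/(\d+)x(\d+)/);
  if (match === null) return null; // no such pattern in the input
  // Optionally insist that both numbers have the same number of digits.
  if (requireEqualLength && match[1].length !== match[2].length) return null;
  return match[0];
}

console.log(extractDimensions('NameName975x945NameName', true));           // 975x945
console.log(extractDimensions('NameNameName9751x9451NameNameName', true)); // 9751x9451
```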
I'm not particularly strong with Regular Expressions. Basically, I have the following string:
Showing 1-20 of 748 results.
I want to extract the "748", convert it to a number, and use it for comparisons. As expected, "Showing", "of", and "results" are not expected to change, but the numbers could. I have a couple of solutions in mind. The first is using lookbehinds, but I do not believe JS supports them. The second is doing a more blunt approach, maybe finding all the numbers in the string using match() and taking the element at the third index in the returned array (which should be "748").
Any thoughts on the best way to do this?
I would use the regex:
Showing \d+-\d+ of (\d+) results\.
where \d+ in each case means to match 1 or more digits. The parentheses around the number you want to find are called a capture group.
So if the search string was in str, the resulting JavaScript might look like:
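var resultsRe = /Showing \d+-\d+ of (\d+) results\./;
var match = resultsRe.exec(str); // null if the text doesn't match
var numResults = match ? parseInt(match[1], 10) : NaN; // capture group 1 holds the total
console.log("There are " + numResults + " results.");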
For a simple approach you could do the following:
(\d+)\sresults
All it does is capture the integer directly before the word results.
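A minimal sketch of using that shorter pattern and then comparing the value as a number; the function name extractResultCount is made up for illustration.

```javascript
// Hypothetical helper: return the number directly before "results", or null if absent.
function extractResultCount(text) {
  const match = /(\d+)\sresults/.exec(text);
  return match === null ? null : Number(match[1]);
}

const count = extractResultCount('Showing 1-20 of 748 results.');
console.log(count);       // 748
console.log(count > 500); // true
```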
I'm a novice programmer making a simple calculator in JavaScript for a school project, and instead of using eval() to evaluate a string, I made my own function calculate(exp).
Essentially, my program uses order of operations (PEMDAS, or Parentheses, Exponents, Multiplication/Division, Addition/Subtraction) to evaluate a string expression. One of my regex patterns is like so ("mdi" for multiplication/division):
mdi = /(-?\d+(\.\d+)?)([\*\/])(-?\d+(\.\d+)?)/g; // line 36 on JSFiddle
What this does is:
-?\d+ finds an integer number
(\.\d+)? matches the decimal if there is one
[\*\/] matches the operator used (* or / for multiplication or division)
/g matches every occurrence in the string expression.
I loop through this regular expression's matches with the following code:
while((res = mdi.exec(exp)) !== null) { // line 69 on JSFiddle
exp = exp.replace(mdi,
function(match,$1,$3,$4,$5) {
if($4 == "*")
return parseFloat($1) * parseFloat($5);
else
return parseFloat($1) / parseFloat($5);
});
exp = exp.replace(doN,""); // this gets rid of double negatives
}
However, this does not work all the time. It only works with numbers with an absolute value less than 10. I cannot do any operations on numbers like 24 and -5232000321, even though the regex should match it with the + quantifier. It works with small numbers, but crashes and uses up most of my CPU when the numbers are larger than 10.
For example, when the expression 5*.5 is inputted, 2.5 is outputted, but when you input 75*.5 and press enter, the program stops.
I'm not really sure what's happening here, because I can't locate the source of the error for some reason - nothing is showing up even though I have console.log() all over my code for debugging, but I think it is something wrong with this regex. What is happening?
The full code (so far) is here at JSFiddle.net, but please be aware that it may crash. If you have any other suggestions, please tell me as well.
Thanks for any help.
The problem is
bzp = /^.\d/;
while((res = bzp.exec(result)) !== null) {
result = result.replace(bzp,
function($match) {
console.log($match + " -> 0 + " + $match);
return "0" + $match;
});
}
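It keeps prepending zeros with no limit: the unescaped . matches any character, so any result that starts with two digits (such as 37.5) matches, and it keeps matching after each 0 is prepended, so the loop never ends.
After removing that code, it works well.
I have also cleaned up your code, declared the variables, and made it more maintainable: Demo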
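If the leading zero is still wanted, one possible sketch is below. It assumes the original intent was to turn a result like .5 into 0.5, which is a guess based on the pattern; the function name addLeadingZero is made up, and negative results such as -.5 are not handled.

```javascript
// Hypothetical replacement for the loop: a single replace with an escaped dot.
function addLeadingZero(result) {
  // \. matches only a literal dot; the lookahead requires a digit right after it.
  return result.replace(/^\.(?=\d)/, '0.');
}

console.log(addLeadingZero('.5'));   // 0.5
console.log(addLeadingZero('37.5')); // 37.5 (unchanged, so nothing can loop)
```

Because the replacement no longer matches its own output, no while loop is needed.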
If you have any other suggestions, please tell me as well.
As pointed out in the comments, parsing your input by iteratively applying regular expressions is very ad-hoc. A better approach would be to actually construct a grammar for your input language and parse based on that. Here's an example grammar that basically matches your input language:
expr ::= term ( additiveOperator term )*
term ::= factor ( multiplicativeOperator factor )*
factor ::= number | '(' expr ')'
additiveOperator ::= '+' | '-'
multiplicativeOperator ::= '*' | '/'
The syntax here is pretty similar to regular expressions, where parentheses denote groups, * denotes zero-or-more repetitions, and | denotes alternatives. The symbols enclosed in single quotes are literals, whereas everything else is symbolic. Note that this grammar doesn't handle unary operators (based on your post it sounds like you assume a single negative sign for negative numbers, which can be parsed by the number parser).
There are several parser-generator libraries for JavaScript, but I prefer combinator-style parsers where the parser is built functionally at runtime rather than having to run a separate tool to generate the code for your parser. Parsimmon is a nice combinator parser for JavaScript, and the API is pretty easy to wrap your head around.
A parser usually returns some sort of a tree data structure corresponding to the parsed syntax (i.e. an abstract syntax tree). You then traverse this data structure in order to calculate the value of the arithmetic expression.
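To make the grammar concrete without pulling in a library, here is a hand-written recursive-descent sketch with one function per rule (expr, term, factor). It is not the Parsimmon approach from this answer, and for brevity it computes the value directly instead of building a tree first. The function names are made up, unary minus is not handled (as in the grammar), and characters the tokenizer does not recognize are silently skipped.

```javascript
// Hypothetical evaluator following expr / term / factor from the grammar above.
function evaluate(expression) {
  const tokens = expression.match(/\d*\.\d+|\d+|[()\/*+-]/g) || [];
  let position = 0;

  // expr ::= term ( additiveOperator term )*
  function parseExpr() {
    let value = parseTerm();
    while (tokens[position] === '+' || tokens[position] === '-') {
      const operator = tokens[position++];
      const right = parseTerm();
      value = operator === '+' ? value + right : value - right;
    }
    return value;
  }

  // term ::= factor ( multiplicativeOperator factor )*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[position] === '*' || tokens[position] === '/') {
      const operator = tokens[position++];
      const right = parseFactor();
      value = operator === '*' ? value * right : value / right;
    }
    return value;
  }

  // factor ::= number | '(' expr ')'
  function parseFactor() {
    const token = tokens[position++];
    if (token === '(') {
      const value = parseExpr();
      if (tokens[position++] !== ')') throw new SyntaxError('Expected ")"');
      return value;
    }
    if (token === undefined || !/^[\d.]/.test(token)) {
      throw new SyntaxError('Unexpected token: ' + token);
    }
    return parseFloat(token);
  }

  const result = parseExpr();
  if (position < tokens.length) throw new SyntaxError('Unexpected token: ' + tokens[position]);
  return result;
}

console.log(evaluate('2+3*4'));   // 14
console.log(evaluate('(2+3)*4')); // 20
console.log(evaluate('75*.5'));   // 37.5
```

Operator precedence falls out of the structure: parseExpr only ever combines results of parseTerm, so multiplication and division are always resolved first.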
I created a fiddle demonstrating parsing and evaluating of arithmetic expressions. I didn't integrate any of this into your existing calculator interface, but if you can understand how to use the parser
Mathematical expressions are not parsed and calculated with regular expressions, because of the number of permutations and combinations available. The fastest way so far is postfix notation, because other notations are not as fast as this one. As they mention on Wikipedia:
In comparison testing of reverse Polish notation with algebraic
notation, reverse Polish has been found to lead to faster
calculations, for two reasons. Because reverse Polish calculators do
not need expressions to be parenthesized, fewer operations need to be
entered to perform typical calculations. Additionally, users of
reverse Polish calculators made fewer mistakes than for other types of
calculator. Later research clarified that the increased speed
from reverse Polish notation may be attributed to the smaller number
of keystrokes needed to enter this notation, rather than to a smaller
cognitive load on its users. However, anecdotal evidence suggests
that reverse Polish notation is more difficult for users to learn than
algebraic notation.
Full article: Reverse Polish Notation
And here you can also see other notations that are still far better than regex.
Calculator Input Methods
I would therefore suggest you change your algorithm to a more efficient one; personally, I would prefer postfix.
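For reference, evaluating an expression that is already in postfix form only needs a stack. The sketch below assumes the input has already been split into tokens; converting infix input to postfix is a separate step not shown here, and the function name evaluatePostfix is made up.

```javascript
// Hypothetical evaluator for an array of postfix (reverse Polish) tokens.
function evaluatePostfix(tokens) {
  const stack = [];
  for (const token of tokens) {
    if (/^[+\-*\/]$/.test(token)) {
      // An operator consumes the two most recent operands.
      const right = stack.pop();
      const left = stack.pop();
      if (left === undefined || right === undefined) throw new SyntaxError('Missing operand');
      if (token === '+') stack.push(left + right);
      else if (token === '-') stack.push(left - right);
      else if (token === '*') stack.push(left * right);
      else stack.push(left / right);
    } else {
      stack.push(parseFloat(token)); // anything else is treated as a number
    }
  }
  if (stack.length !== 1) throw new SyntaxError('Malformed expression');
  return stack[0];
}

console.log(evaluatePostfix(['2', '3', '4', '*', '+'])); // 14, i.e. 2 + 3 * 4
console.log(evaluatePostfix(['75', '.5', '*']));         // 37.5
```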
I have searched StackOverflow and I can't find an answer as to how to check for regex of numeric inputs for a calculator app that will check for the following format with every keyup (jquery key up):
Any integer like: 34534
When a dot follows the integer when the user is about to enter a decimal number like this: 34534. Note that a dot can only be entered once.
Any float: 34534.093485
I don't plan to use commas to separate the thousands...but I would welcome if anyone can also provide a regex for that.
Is it possible to check the above conditions with just one regex? Thanks in advance.
Is a lone . a successful match or not? If it is then use:
\d+(\.\d*)?|\.\d*
If not then use:
\d+(\.\d*)?|\.\d+
Rather than incorporating commas into the regexes, I recommend stripping them out first: str = str.replace(/,/g, ''). Then check against the regex.
That wouldn't verify that digits are properly grouped into groups of three, but I don't see much value in such a check. If a user types 1,024 and then decides to add a digit (1,0246), you probably shouldn't force them to move the comma.
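Put together, a check that could run on each keyup might look like the sketch below. The ^ and $ anchors are an addition here so that the whole input has to match, the lone dot is accepted (the first variant above), and the name isValidPartialNumber is made up; wiring it to the jQuery keyup handler is left out.

```javascript
// First variant from above, anchored so the entire input must match.
const PARTIAL_NUMBER = /^(?:\d+(?:\.\d*)?|\.\d*)$/;

// Hypothetical helper: strip commas first, then test what is left.
function isValidPartialNumber(input) {
  return PARTIAL_NUMBER.test(input.replace(/,/g, ''));
}

console.log(isValidPartialNumber('34534'));        // true
console.log(isValidPartialNumber('34534.'));       // true
console.log(isValidPartialNumber('34534.093485')); // true
console.log(isValidPartialNumber('34.5.6'));       // false
```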
Let's write out your specifications, and develop from that.
Any integer: \d+
A dot, optionally followed by an integer: \.\d*
Combine the two and make the latter optional, and you get:
\d+\.?\d*
As for handling commas, I'd rather not go into it, as it gets very ugly very fast. You should simply strip all commas from input if you still care about them.
you can use in this way:
[/\d+./]
I think this can be used for any of your queries.
Whether it's 12445 or 1244. or 12445.43
I'm going to throw in a potentially downvoted answer here - this is a better solution:
function valid_float (num) {
var num = (num + '').replace(/,/g, ''), // don't care about commas, this turns `num` into a String
float_num = parseFloat(num);
return float_num == num || float_num + '.' == num; // allow for the decimal point, deliberately using == to ignore type as `num` is a String now
}
Any regex that does your job correctly will come with a big asterisk after it saying "probably", and if it's not spot on, it'll be an absolute pig to debug.
Sure, this answer isn't giving you the most awesomely cool one-liner that's going to make you go "Cool!", but in 6 months time when you realise it's going wrong somewhere, or you want to change it to do something slightly different, it's going to be a hell of a lot easier to see where, and to fix.
I'm using ^(\d)+(.(\d)+)+$ to capture each integer and to have an unlimited length, so long as the string begins and ends with integers and has dots between each integer group. I'm capturing the integer groups so that I can compare them.