Why doesn't this regular expression work properly? - javascript

I need to check whether some input strictly matches this format:
PEOPLE-123456 or PERSON-12345376 (it can be any combination of digits)
The number of digits following the - doesn't matter. It can be from 0 to N digits.
I've come up with the following expression:
/(PEOPLE-)|(PERSON-)?=^[0-9]+$/
The problem is, this will work even if the characters after the - are not numbers.
PEOPLE-123131 yields true
PERSON-123242 yields true
PERSON-23123.341 yields true
PEOPLE-.2341231 yields false
What am I doing wrong? I don't see any problems with the expression itself; maybe I'm too much of a noob to see it.

Try this:
^(PERSON|PEOPLE)-[0-9]{1,}$
This ensures the input starts with exactly either PERSON or PEOPLE, followed by -, and ends with at least one digit.

You need to put the grouping parentheses around both alternatives:
/^(PEOPLE|PERSON)-\d+$/
And you shouldn't mark it optional with ?. I have no idea why you put = and ^ after that part.
And if you want to allow decimal points in the number, use [0-9.] instead of \d.

This should work if the digits are optional. If at least one digit is required, replace * with +.
/^(PEOPLE|PERSON)-\d*$/
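As a quick check of the anchored pattern suggested in the answers above, here is a minimal sketch that runs it against the sample inputs from the question. The helper name isValidId is only illustrative and does not come from the original posts.

```javascript
// Anchored pattern: a literal PEOPLE or PERSON prefix, a hyphen, then only digits.
// Use \d+ instead of \d* if at least one digit must follow the hyphen.
const idPattern = /^(PEOPLE|PERSON)-\d*$/;

// Hypothetical helper name, not from the original question.
function isValidId(input) {
  return idPattern.test(input);
}

const samples = [
  'PEOPLE-123131',
  'PERSON-123242',
  'PERSON-23123.341',
  'PEOPLE-.2341231',
  'PERSON-',
];
for (const sample of samples) {
  console.log(sample, isValidId(sample));
}
```

Because of the ^ and $ anchors, the inputs containing a dot are rejected, while the bare prefix PERSON- is accepted by the \d* variant.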


charCodeAt is not behaving as expected

How can this be possible:
var string1 = "🌀", string2 = "🌀🌂";
//comparing the charCode
console.log(string1.charCodeAt(0) === string2.charCodeAt(0)); //true
//comparing the character
console.log(string1 === string2.substring(0,1)); //false
//This is giving me a headache.
http://jsfiddle.net/DerekL/B9Xdk/
If their char codes are the same in both strings, by comparing the character itself should return true. It is true when I put in a and ab. But when I put in these strings, it simply breaks.
Some said that it might be the encoding that is causing the problem. But since it works perfectly fine when there's only one character in the string literal, I assume encoding has nothing to do with it.
(This question addresses the core problem in my previous questions. Don't worry I deleted them already.)
In JavaScript, strings are sequences of 16-bit UTF-16 code units rather than bytes, so a character counts as a single unit only if its code point fits in 16 bits.
The majority of characters cause no issues, but the ones in this case don't "fit", so each occupies 2 units (a surrogate pair) as far as JavaScript is concerned.
In this case you need to do:
string2.substring(0, 2) // "🌀"
For more information on Unicode quirkiness, see UTF-8 Everywhere.
The parameters of substring are the start index and the end index, whereas the parameters of substr are the start index and the number of characters.
You can use the localeCompare method to compare two strings:
string1.localeCompare(string2);
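To make the surrogate-pair explanation above more concrete, the following sketch inspects the two strings from the question. It assumes an ES2015 or later environment, where codePointAt and code-point-aware string iteration are available.

```javascript
const string1 = "🌀";
const string2 = "🌀🌂";

// Each of these characters lies outside the Basic Multilingual Plane,
// so it is stored as two UTF-16 code units (a surrogate pair).
console.log(string1.length); // 2
console.log(string2.length); // 4

// charCodeAt(0) returns only the first half of the pair (the high surrogate),
// which is why both strings report the same value at index 0.
console.log(string1.charCodeAt(0) === string2.charCodeAt(0)); // true

// substring(0, 1) cuts the pair in half, so it cannot equal the full character.
console.log(string2.substring(0, 1) === string1); // false

// codePointAt reads the whole pair.
const firstCodePoint = string1.codePointAt(0);
console.log(firstCodePoint > 0xffff); // true

// Spreading a string iterates by code point, not by code unit.
const firstChar = [...string2][0];
console.log(firstChar === string1); // true
```

Iterating by code point avoids hard-coding the 2 in substring(0, 2), which would be wrong for characters that fit in a single code unit.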

Can't get number based javascript regex pattern to work

I know it should be simple, and yes, I've tested on online regex sites, but I just can't get this to work.
Input string: "w_(number from 1-99),h_(number from 1-99)", e.g: "w_34,h_34"
Expected Output: number replaced, e.g "w_50,h_50"
Test:
'w_34,h_34'.replace('w_[1-9][0-9],h_[1-9][0-9]', 'w_50,h_50')
But it just returns the original string. (w_34,h_34)
You need to use a regular expression literal to take advantage of the regexp syntax; a string pattern is matched literally:
'w_34,h_34'.replace(/w_[1-9][0-9]?,h_[1-9][0-9]?/, 'w_50,h_50')
This matches numbers of only one or two digits. An alternative would be to use the * quantifier:
'w_34,h_34'.replace(/w_[1-9][0-9]*,h_[1-9][0-9]*/, 'w_50,h_50')
Which would allow n-length numbers to match.
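The difference between a string pattern and a regex literal can be seen side by side in the sketch below. The setSize helper and its parameters are hypothetical, introduced only to show how the replacement values might be passed in.

```javascript
const input = 'w_34,h_34';

// A string pattern is searched for literally, so the brackets are never
// interpreted as character classes and nothing is replaced.
const literalAttempt = input.replace('w_[1-9][0-9],h_[1-9][0-9]', 'w_50,h_50');
console.log(literalAttempt); // "w_34,h_34"

// Hypothetical helper: a regex literal enables the character classes.
// [1-9][0-9]? accepts the numbers 1 to 99.
function setSize(str, width, height) {
  return str.replace(/w_[1-9][0-9]?,h_[1-9][0-9]?/, `w_${width},h_${height}`);
}

console.log(setSize(input, 50, 50)); // "w_50,h_50"
console.log(setSize('w_7,h_99', 50, 50)); // "w_50,h_50"
```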

negative number in parentheses using javascript

I use match to split a mathematical expression into separate strings and save them in an array.
var STRING = ST.match(/\d*\.\d+|\d+|[()/*+-]/g);
But this method separates everything, including negative numbers inside parentheses.
For example, (-2+4) does not give me -2; instead it saves - in one index of the STRING array and 2 in the next index.
Is there any way to use match and keep negative numbers that are inside parentheses?
This is what I want:
(-2+4):
STRING[0] give me (
STRING[1] give me -2
STRING[2] give me +
STRING[3] give me 4
STRING[4] give me )
and if there is no negative number, it should work as normal:
(2+4):
STRING[0] give me (
STRING[1] give me 2
STRING[2] give me +
STRING[3] give me 4
STRING[4] give me )
I don't think it's possible to parse complex cases like "(-2+4*-(3.5--8))" with just a regex, especially given that we don't have negative lookbehind in JavaScript.
A solution would be to postprocess your match array by merging signs when they're between a separator and an unsigned expression.
In my opinion a regex is useful here, but only for the primary tokenization. Most of the work will be ahead of you as you'll build the binary expression tree (or any other formal representation you choose).
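The post-processing idea above could be sketched roughly as follows. This is only one possible reading of "merging signs": a minus is treated as a sign when it follows nothing, an operator, or an opening parenthesis, and is directly followed by an unsigned number. The names tokenize, TOKEN_PATTERN, and NUMBER_PATTERN are illustrative, not from the original answer.

```javascript
// Primary tokenization, using the pattern from the question (slash escaped).
const TOKEN_PATTERN = /\d*\.\d+|\d+|[()\/*+-]/g;
// An unsigned integer or decimal number.
const NUMBER_PATTERN = /^(\d*\.\d+|\d+)$/;

function tokenize(expression) {
  const raw = expression.match(TOKEN_PATTERN) || [];
  const tokens = [];
  for (let i = 0; i < raw.length; i++) {
    const token = raw[i];
    const previous = tokens[tokens.length - 1];
    const next = raw[i + 1];
    // True when the minus cannot be a binary operator: it starts the
    // expression or follows an operator or an opening parenthesis.
    const afterSeparator = previous === undefined || /^[(\/*+-]$/.test(previous);
    if (token === '-' && afterSeparator && next !== undefined && NUMBER_PATTERN.test(next)) {
      tokens.push('-' + next);
      i++; // skip the number that was just merged into the sign
    } else {
      tokens.push(token);
    }
  }
  return tokens;
}

console.log(tokenize('(-2+4)')); // [ '(', '-2', '+', '4', ')' ]
console.log(tokenize('(4-2)'));  // [ '(', '4', '-', '2', ')' ]
```

A minus in front of a parenthesis, as in 4*-(3), is left as a separate token here; handling it would be the job of the later parsing stage.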
Unfortunately, if what you're trying to do is parse a mathematical expression, regexps alone cannot be used.
Regexps can be used for languages that are describable by regular grammars, and arithmetic expressions are not; they are described by a context-free grammar (CFG). If you want to parse, and perhaps interpret the result, you'll certainly need some kind of stack-based state machine.
You can look at something like this well known algorithm.
Hope this helps.
You can add an optional sign to the numbers, that would work with your example:
var STRING = ST.match(/-?\d*\.\d+|-?\d+|[()/*+-]/g);
However, that will also turn a minus operator into a sign. The expression (4-2) would give you [ "(", "4", "-2", ")" ].
Also, it will happily "parse" an expression like +---((((*** without complaining. If you want a result that makes sense, you should parse it for real, not just split it with a regular expression.
I think you have a mistake in your RegExp. Try this; it works for me:
var STRING = ST.match(/(\d*)(\.)(\d+)|(\d+)|[()\/*+-]/g);

Regex for integer, integer + dot, and decimals

I have searched StackOverflow and I can't find an answer as to how to check for regex of numeric inputs for a calculator app that will check for the following format with every keyup (jquery key up):
Any integer like: 34534
When a dot follows the integer when the user is about to enter a decimal number like this: 34534. Note that a dot can only be entered once.
Any float: 34534.093485
I don't plan to use commas to separate the thousands...but I would welcome if anyone can also provide a regex for that.
Is it possible to check the above conditions with just one regex? Thanks in advance.
Is a lone . a successful match or not? If it is then use:
\d+(\.\d*)?|\.\d*
If not then use:
\d+(\.\d*)?|\.\d+
Rather than incorporating commas into the regexes, I recommend stripping them out first: str = str.replace(/,/g, ''). Then check against the regex.
That wouldn't verify that digits are properly grouped into groups of three, but I don't see much value in such a check. If a user types 1,024 and then decides to add a digit (1,0246), you probably shouldn't force them to move the comma.
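The two patterns above are unanchored, so for validating a whole input field they would presumably need ^ and $ around them. The sketch below shows that, combined with the comma stripping; isNumericInput and the two constant names are hypothetical.

```javascript
// Anchored versions of the two patterns: the whole input must match.
const allowLoneDot = /^(\d+(\.\d*)?|\.\d*)$/;
const rejectLoneDot = /^(\d+(\.\d*)?|\.\d+)$/;

// Hypothetical helper: strip commas first, then validate against the pattern.
function isNumericInput(text, pattern = rejectLoneDot) {
  return pattern.test(text.replace(/,/g, ''));
}

console.log(isNumericInput('34534'));           // true
console.log(isNumericInput('34534.'));          // true
console.log(isNumericInput('34534.093485'));    // true
console.log(isNumericInput('1.2.3'));           // false
console.log(isNumericInput('.'));               // false
console.log(isNumericInput('.', allowLoneDot)); // true
```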
Let's write out your specifications and develop from that.
Any integer: \d+
A dot, optionally followed by an integer: \.\d*
Combine the two and make the latter optional, and you get:
\d+\.?\d*
As for handling commas, I'd rather not go into it, as it gets very ugly very fast. You should simply strip all commas from input if you still care about them.
You can use it this way:
[/\d+./]
I think this can be used for any of your queries.
Whether it's 12445 or 1244. or 12445.43
I'm going to throw in a potentially downvoted answer here - this is a better solution:
function valid_float (num) {
    var num = (num + '').replace(/,/g, ''), // don't care about commas, this turns `num` into a String
        float_num = parseFloat(num);
    return float_num == num || float_num + '.' == num; // allow for the decimal point, deliberately using == to ignore type as `num` is a String now
}
Any regex that does your job correctly will come with a big asterisk after it saying "probably", and if it's not spot on, it'll be an absolute pig to debug.
Sure, this answer isn't giving you the most awesomely cool one-liner that's going to make you go "Cool!", but in 6 months time when you realise it's going wrong somewhere, or you want to change it to do something slightly different, it's going to be a hell of a lot easier to see where, and to fix.
I'm using ^(\d)+(.(\d)+)+$ to capture each integer and to have an unlimited length, so long as the string begins and ends with integers and has dots between each integer group. I'm capturing the integer groups so that I can compare them.

Negative lookahead Regular Expression

I want to match all strings ending in ".htm" unless it ends in "foo.htm". I'm generally decent with regular expressions, but negative lookaheads have me stumped. Why doesn't this work?
/(?!foo)\.htm$/i.test("/foo.htm"); // returns true. I want false.
What should I be using instead? I think I need a "negative lookbehind" expression (if JavaScript supported such a thing, which I know it doesn't).
The problem is pretty simple really. This will do it:
/^(?!.*foo\.htm$).*\.htm$/i.test("/foo.htm"); // returns false
What you are describing (your intention) is a negative look-behind, and Javascript has no support for look-behinds.
Look-aheads look forward from the character at which they are placed — and you've placed it before the .. So, what you've got is actually saying "anything ending in .htm as long as the first three characters starting at that position (.ht) are not foo" which is always true.
Usually, the substitute for negative look-behinds is to match more than you need, and extract only the part you actually do need. This is hacky, and depending on your precise situation you can probably come up with something else, but something like this:
// Checks that the last 3 characters before the dot are not foo:
/(?!foo).{3}\.htm$/i.test("/foo.htm"); // returns false
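The position at which the lookahead is evaluated is the whole point here, so a small sketch may help. The variable name notFooHtm is illustrative only.

```javascript
// The lookahead is evaluated at the position of the dot, so it inspects
// ".ht" and never sees "foo". The test therefore succeeds.
console.log(/(?!foo)\.htm$/i.test("/foo.htm")); // true

// Moving the lookahead three characters before the dot makes it inspect
// the text that actually precedes ".htm".
const notFooHtm = /(?!foo).{3}\.htm$/i;
console.log(notFooHtm.test("/foo.htm")); // false
console.log(notFooHtm.test("/bar.htm")); // true

// Caveat: .{3} requires at least three characters before ".htm",
// so shorter names are rejected as well.
console.log(notFooHtm.test("a.htm")); // false
```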
As mentioned JavaScript does not support negative look-behind assertions.
But you could use a workaround:
/(foo)?\.htm$/i.test("/foo.htm") && RegExp.$1 != "foo";
This will match everything that ends with .htm but it will store "foo" into RegExp.$1 if it matches foo.htm, so you can handle it separately.
Like Renesis mentioned, "lookbehind" is not supported in JavaScript, so maybe just use two regexps in combination:
!/foo\.htm$/i.test(teststring) && /\.htm$/i.test(teststring)
This answer has probably arrived a little later than necessary, but I'll leave it here in case someone runs into the same issue now (7 years, 6 months after this question was asked).
Lookbehinds are now included in the ECMAScript 2018 standard and supported at least in the latest version of Chrome. However, you can solve the puzzle with or without them.
A solution with negative lookahead:
let testString = `html.htm app.htm foo.tm foo.htm bar.js 1to3.htm _.js _.htm`;
testString.match(/\b(?!foo)[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
A solution with negative lookbehind:
testString.match(/\b[\w-.]+(?<!foo)\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
A solution with (technically) positive lookahead:
testString.match(/\b(?=[^f])[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
etc.
All these RegExps tell the JS engine the same thing in different ways. The message they pass to the JS engine is something like the following.
Please find in this string all sequences of characters that:
Are separated from other text (like words);
Consist of one or more letters of the English alphabet, underscores, hyphens, dots, or digits;
End with ".htm";
Apart from that, the part of the sequence before ".htm" could be anything but "foo".
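For the original question (testing a single path rather than extracting file names from a longer text), a lookbehind can be applied directly at the end of the string. The sketch below assumes an engine with ES2018 lookbehind support and compares it with the anchored lookahead from an earlier answer; the constant names are illustrative.

```javascript
// Negative lookbehind (ES2018): the text immediately before ".htm"
// must not be "foo".
const withLookbehind = /(?<!foo)\.htm$/i;

// The same check without lookbehind, using an anchored negative lookahead.
const withLookahead = /^(?!.*foo\.htm$).*\.htm$/i;

const paths = ["/foo.htm", "/bar.htm", "/barfoo.htm", "/index.html", "a.htm"];
for (const path of paths) {
  // Both columns are expected to agree for every path in this list.
  console.log(path, withLookbehind.test(path), withLookahead.test(path));
}
```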
String.prototype.endsWith (ES6)
console.log( /* !(not)endsWith */
!"foo.html".endsWith("foo.htm"), // true
!"barfoo.htm".endsWith("foo.htm"), // false (here you go)
!"foo.htm".endsWith("foo.htm"), // false (here you go)
!"test.html".endsWith("foo.htm"), // true
!"test.htm".endsWith("foo.htm") // true
);
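Combining two endsWith calls gives the full condition from the question without any regular expression. This is a minimal sketch; isHtmButNotFoo is a hypothetical helper name, and the lowercasing is an assumption made to mirror the /i flag used in the regex answers.

```javascript
// Hypothetical helper: true for names ending in ".htm" but not in "foo.htm".
function isHtmButNotFoo(name) {
  // Lowercase first so the check is case-insensitive, like the /i flag.
  const lower = name.toLowerCase();
  return lower.endsWith(".htm") && !lower.endsWith("foo.htm");
}

console.log(isHtmButNotFoo("/foo.htm"));    // false
console.log(isHtmButNotFoo("/barfoo.htm")); // false
console.log(isHtmButNotFoo("/test.htm"));   // true
console.log(isHtmButNotFoo("/test.html"));  // false
```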
You could emulate the negative lookbehind with something like /^(.|..|.*[^f]..|.*f[^o].|.*fo[^o])\.htm$/ (the leading ^ is required; without it the single . alternative would still match "foo.htm"), but a programmatic approach would be better.
