JavaScript regex - null argument makes the regex match - javascript

I'm writing a regex to be used with JavaScript. When testing I came across some strange behavior and boiled it down to the following:
/^[a-z]/.test("abc"); // <-- returns true as expected
/^[a-z]/.test(null); // <-- returns true, but why?
I assumed the last case would return false since the input does not match the regex (the value is null and thus does not start with a character in the range). Can anyone explain to me why this is not the case?
If I do the same test in C#:
var regex = new Regex("^[a-z]");
var res = regex.IsMatch(null); // <-- ArgumentNullException
... I get an ArgumentNullException which makes sense. So, I guess when testing a regex in JavaScript, you have to manually do a null check?
I have tried searching for an explanation, but without any luck.

That's because test converts its argument to a string: null is converted to the string "null".
You can check that in the console:
/^null$/.test(null)
returns true.
The call to ToString(argument) is specified in the ECMAScript specification (see also ToString).
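A minimal sketch of that conversion, together with the manual null check the question asks about. The helper name startsWithLowercase is hypothetical and only used for illustration.

```javascript
// test() converts its argument to a string before matching.
const pattern = /^[a-z]/;

const nullAsString = String(null);                // "null"
const matchesNull = pattern.test(null);           // matches the "n" of "null"
const matchesUndefined = pattern.test(undefined); // matches the "u" of "undefined"

// Hypothetical helper: only run the regex on values that are real strings.
function startsWithLowercase(value) {
  return typeof value === "string" && pattern.test(value);
}

console.log(nullAsString, matchesNull, matchesUndefined);
console.log(startsWithLowercase("abc"), startsWithLowercase(null));
```

With the typeof guard in place, null and undefined are rejected before the regex ever sees them.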

Here null is converted to its string form, which is "null".
"null" starts with a lowercase letter, so it matches your regex, which is why the test evaluates to true.
In JavaScript, most values are objects (or are wrapped as objects) with a toString method that is called automatically when a string is needed. null and undefined have no such method; for them the specification's ToString operation produces "null" and "undefined" directly.

Related

query-string pick method returns the first match from a filter and corrupted URL

I am trying to clean a URL and keep only the parameters I need so that I can parse them.
I am using the pick method, passing it the URL and a filter function that calls the regex test method.
The filter checks whether the key of each query parameter matches the regular expression:
const groupRegex = new RegExp('^(GRP_)[a-zA-Z0-9/-]','g');
export const parseGroups= (url:string)=>{
let pickedURL = qs.pick(url,(key,value)=>groupRegex.test(key));
console.log(pickedURL);
}
var url=`http://localhost:3000/tat?GRP_Bob[]=SW&GRP_sa[]=QW&GRP_sa[]=AA&projects[]=MP,PM&releases[]=2021.4,2022.1`
parseGroups(url)
For example, given http://localhost:3000/tat?GRP_Bob[]=SW&GRP_sa[]=QW&GRP_sa[]=AA&projects[]=MP,PM&releases[]=2021.4,2022.1
it should return http://localhost:3000/tat?GRP_Bob=SW&GRP_sa=QW&GRP_sa=AA
Yet it only keeps the first query parameter and logs
http://localhost:3000/tat?GRP_Bob%5B%5D=SW
I am trying to strip from the URL any parameters that don't match my regular expression, so that I can parse the URL and extract an object like this, for example:
{
GRP_Bob:["SW"],
GRP_sa:["QW","AA"]
}
Instead of having other parameters parsed also which are not necessary. I know I can just parse the url normally, and then loop on the returned query object, and remove any key that doesn't match the regex, but is there anything wrong I am doing in the above snippet?
UPDATE:
I changed the filter function to (key,value)=>key.startsWith('GRP_'):
export const parseGroups= (url:string)=>{
let pickedURL = qs.pick(url,(key,value)=>key.startsWith('GRP_'));
console.log(pickedURL);
let parsedURL = qs.parseUrl(pickedURL)
console.log(parsedURL.query)
}
var url=`http://localhost:3000/tat?GRP_Bob[]=SW&GRP_sa[]=QW&GRP_sa[]=AA&projects[]=MP,PM&releases[]=2021.4,2022.1`
parseGroups(url)
and the pickedURL that was logged is http://localhost:3000/tat?GRP_Bob%5B%5D=SW&GRP_sa%5B%5D=QW&GRP_sa%5B%5D=AA which looks correct.
The parsed query came out like this:
GRP_Bob[]: "SW"
GRP_sa[]: (2) ['QW', 'AA']
So I am confused actually what's going on with the regular expression approach, and why the keys in the second approach have [] in it?
Ah yes! This one is a rare gotcha, and it is unexpected every time I see it. A RegExp created with the g flag has state: it remembers in lastIndex where the previous match ended, and the next test call resumes from there. See Why does JavaScript's RegExp maintain state between calls? and Why does Javascript's regex.exec() not always return the same value?.
In your case, you don't need the g flag, so removing it should fix your problem, since that makes the regex stateless.
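A small sketch of the effect, using Array.prototype.filter over a hand-written list of keys instead of qs.pick. That substitution is an assumption made so the example runs without the query-string package; the regex is the one from the question.

```javascript
// Hypothetical key list, mirroring the query parameters in the question.
const keys = ["GRP_Bob[]", "GRP_sa[]", "projects[]", "releases[]"];

// With the g flag, test() resumes from lastIndex, so each result
// depends on the calls that came before it.
const globalRegex = new RegExp("^(GRP_)[a-zA-Z0-9/-]", "g");
const pickedWithGlobal = keys.filter((key) => globalRegex.test(key));

// Without the g flag, every call starts matching at index 0.
const plainRegex = new RegExp("^(GRP_)[a-zA-Z0-9/-]");
const pickedWithoutGlobal = keys.filter((key) => plainRegex.test(key));

console.log(pickedWithGlobal);    // [ 'GRP_Bob[]' ]
console.log(pickedWithoutGlobal); // [ 'GRP_Bob[]', 'GRP_sa[]' ]
```

After the first successful match, lastIndex is 5, so the second key is searched from position 5, where the ^ anchor can no longer match.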

Why is there an isNaN() function in JavaScript but no isUndefined()?

Why is there an isNaN() function in JavaScript whilst isUndefined() must be written as:
typeof(...) != "undefined"
Is there a point I don't see?
In my opinion it's really ugly to write this instead of just isUndefined(testValue).
There is simply no need for an isUndefined() function. The reason behind this is explained in the ECMAScript specification:
(Note that the NaN value is produced by the program expression NaN.) In some implementations, external code might be able to detect a difference between various Not-a-Number values, but such behaviour is implementation-dependent; to ECMAScript code, all NaN values are indistinguishable from each other.
The isNaN() function acts as a way to detect whether something is NaN because equality operators do not work (as you'd expect, see below) on it. One NaN value is not equal to another NaN value:
NaN === NaN; // false
undefined on the other hand is different, and undefined values are distinguishable:
undefined === undefined; // true
If you're curious as to how the isNaN() function works, the ECMAScript specification also explains this for us too:
Let num be ToNumber(number).
ReturnIfAbrupt(num).
If num is NaN, return true.
Otherwise, return false.
A reliable way for ECMAScript code to test if a value X is a NaN is an expression of the form X !== X. The result will be true if and only if X is a NaN.
NaN !== NaN; // true
100 !== 100; // false
var foo = NaN;
foo !== foo; // true
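The ToNumber step in the algorithm above has a visible consequence: the global isNaN() reports true for anything that converts to NaN, not only for NaN itself. A short sketch comparing it with Number.isNaN() (standard since ES2015, which skips the conversion) and with the X !== X check; the helper name isReallyNaN is hypothetical.

```javascript
// Global isNaN() converts its argument with ToNumber first.
const coercedString = isNaN("abc");           // true: Number("abc") is NaN
const coercedUndefined = isNaN(undefined);    // true: Number(undefined) is NaN
const numericString = isNaN("100");           // false: Number("100") is 100

// Number.isNaN() does no conversion, so only the NaN value itself passes.
const strictString = Number.isNaN("abc");     // false
const strictNaN = Number.isNaN(NaN);          // true

// Hypothetical helper built on the X !== X check from the specification.
const isReallyNaN = (x) => x !== x;

console.log(coercedString, coercedUndefined, numericString);
console.log(strictString, strictNaN, isReallyNaN(NaN), isReallyNaN("abc"));
```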
The comparison x === undefined works almost everywhere, except for the cases covered by this answer, where either undefined has been assigned a value or x has never been declared.
The reason such a function cannot exist is clear from the latter case. If x has not been declared, then calling the supposed function isUndefined(x) results in a ReferenceError. However, introducing a new keyword, e.g. isundefined x, could address this issue.
But despite being valid, both of the above cases are poor uses of JavaScript. This is the reason I believe such a keyword does not exist.
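To illustrate the undeclared-identifier case, here is a minimal sketch. The isUndefined function and the identifier someUndeclaredName are hypothetical; the latter is deliberately never declared anywhere.

```javascript
// Hypothetical helper, written the way the question wishes it existed.
function isUndefined(value) {
  return value === undefined;
}

let declaredButUnset;
const viaFunction = isUndefined(declaredButUnset); // true

// typeof does not throw, even for an identifier that was never declared.
const viaTypeof = typeof someUndeclaredName === "undefined"; // true

// Passing the same identifier to a function throws before the call happens.
let thrown = null;
try {
  isUndefined(someUndeclaredName);
} catch (error) {
  thrown = error;
}

console.log(viaFunction, viaTypeof, thrown instanceof ReferenceError);
```

The error is raised while evaluating the argument, so no ordinary function can offer what typeof offers here.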
isUndefined could be written as
testValue === undefined
like for every other value.
This does not work with NaN, however, as NaN !== NaN. Without the ability to use a comparison, there was a need for an isNaN function to detect NaN values.
This isn't a direct answer to the question, as others have already answered it; it is meant to highlight libraries that contain an isUndefined() function, for anybody looking for a quick solution who is in a position to use them.
Underscore and Lodash both contain an isUndefined() function; Lodash has it because it was built upon Underscore.
http://underscorejs.org/#isUndefined
https://lodash.com/docs#isUndefined

Why does saving a JavaScript regular expression to a variable produce different results when invoking the .test() method?

Can someone explain why saving this regex to a variable produces alternating true and false values, but using the literal produces true every time? Am I missing something obvious here? I'm certainly not a regex expert, but this seems like odd behavior to me.
var exp = /[\^~\\><\|"]/g;
exp.test('<abc'); // true
exp.test('<abc'); // false
exp.test('<abc'); // true
exp.test('<abc'); // false
/[\^~\\><\|"]/g.test('<abc'); // true
/[\^~\\><\|"]/g.test('<abc'); // true
/[\^~\\><\|"]/g.test('<abc'); // true
/[\^~\\><\|"]/g.test('<abc'); // true
This is because regular expression objects save their state, so when you call test again on the same object it tries to find the next match and fails.
From the docs:
test called multiple times on the same global regular expression instance will advance past the previous match.
In the last examples, you are creating a new regular expression each time, so it matches every time.
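The saved state lives in the regex object's lastIndex property. A short sketch that records it after each call and then shows that resetting it by hand restores the "always true" behaviour; the variable names are only illustrative.

```javascript
const exp = /[\^~\\><\|"]/g;

// Record the result of test() and the lastIndex it leaves behind.
const results = [];
const positions = [];
for (let i = 0; i < 4; i++) {
  results.push(exp.test("<abc"));
  positions.push(exp.lastIndex);
}
console.log(results);   // [ true, false, true, false ]
console.log(positions); // [ 1, 0, 1, 0 ]

// Resetting lastIndex before each call makes the shared object behave
// like a freshly created literal.
const resetResults = [];
for (let i = 0; i < 4; i++) {
  exp.lastIndex = 0;
  resetResults.push(exp.test("<abc"));
}
console.log(resetResults); // [ true, true, true, true ]
```

A successful match leaves lastIndex just past the "<", so the next search starts at "abc" and fails, which in turn resets lastIndex to 0.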

Is it incorrect to use eval() within this function? Can I accomplish the same functionality without it somehow?

I'm trying to write a function I can use to test for all falsy values, keeping it concise since it will be run quite often server-side.
function is_falsy(val){
val = eval(String(val).toLowerCase());
return !!val;
}
I wonder if there's any way it could be done shorter, or what the possible negative implications of using eval() might be. JSBIN tells me it is "evil".
Assuming that val is a string that represents a JavaScript literal then we can take advantage of the fact that the only false-y values in JavaScript are:
0 (+ or -)
NaN
the empty string ('') or ("")
null
undefined
false
Thus, ignoring edge-cases (like 0.0) we could write it like so (a lower case can be performed as in the original code):
function is_falsey_literal (lit) {
if (['""', "''", "null", "undefined", "false", "0", "NaN"].indexOf(lit) >= 0) {
return true;
}
// Ideally there are more checks on numeric literals such as `-0` or `0.0`.
return false;
}
If you need to check a full expression, then eval may "work" and is likely more practical than writing a full JavaScript-in-JavaScript parser. For instance, the function above returns false (not falsy) for the input string (void 0), although that expression evaluates to undefined, which is definitely not a truthy value.
Of course, perhaps the original data can be written/consumed such that there is no need for such a construct at all ..
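One possible way to cover the numeric edge cases mentioned in the code comment above, without eval. This is a sketch under the assumption that the input is a single literal written in decimal notation; the name isFalsyLiteral and the regular expression are illustrative, and hex, exponent, or BigInt forms of zero are not handled.

```javascript
// Literal spellings that are falsy and not numeric.
const FALSY_KEYWORDS = new Set(['""', "''", "null", "undefined", "false", "NaN"]);

// Decimal spellings of zero: 0, -0, +0, 0.0, 0., .0 and so on.
const ZERO_LITERAL = /^[+-]?(?:0+(?:\.0*)?|\.0+)$/;

function isFalsyLiteral(lit) {
  const text = String(lit).trim();
  return FALSY_KEYWORDS.has(text) || ZERO_LITERAL.test(text);
}

console.log(isFalsyLiteral("0.0"));      // true
console.log(isFalsyLiteral("-0"));       // true
console.log(isFalsyLiteral("false"));    // true
console.log(isFalsyLiteral("0.5"));      // false
console.log(isFalsyLiteral("(void 0)")); // false: expressions are still not evaluated
```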
There should never be any need to treat a string containing false or undefined as falsy. Doing so is inviting false positives (or false negatives) on possibly completely unrelated data.
Imagine what else would be treated as "falsy":
!true
!1
!!true
It's begging for mysterious bugs in your application further down the line.
The program flow should make sure that an undefined value actually arrives at your testing script as a literal undefined, not a string "undefined".
If you only want to test whether a value is falsy, then the following is enough.
function is_falsy(val){
return !val;
}
If you want to test whether a string holds a falsy value such as 'false', then
function is_falsy(val){
try {
return !JSON.parse(String(val).toLowerCase());
} catch(e) {
return false;
}
}
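To see what the JSON.parse version accepts, the sketch below runs it over a few sample strings (the sample list is an assumption chosen for illustration). Note that "undefined" and "NaN" are not valid JSON, so JSON.parse throws for them and the function reports them as not falsy.

```javascript
// Same function as above, repeated so the example is self-contained.
function is_falsy(val) {
  try {
    return !JSON.parse(String(val).toLowerCase());
  } catch (e) {
    return false; // anything that is not valid JSON is treated as "not falsy"
  }
}

const samples = ["false", "FALSE", "0", "null", '""', "true", "undefined", "NaN", "hello"];
const outcomes = Object.fromEntries(samples.map((s) => [s, is_falsy(s)]));

console.log(outcomes);
// "false", "FALSE", "0", "null" and '""' map to true;
// "true", "undefined", "NaN" and "hello" map to false.
```

If "undefined" and "NaN" need to count as falsy too, they would have to be checked separately before the JSON.parse call.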

Casting to string in JavaScript

I found three ways to cast a variable to String in JavaScript.
I searched for those three options in the jQuery source code, and they are all in use.
I would like to know if there are any differences between them:
value.toString()
String(value)
value + ""
They all produce the same output, but is one of them better than the others?
I would say + "" has the advantage that it saves some characters, but that's not a big advantage. Anything else?
They do behave differently when the value is null.
null.toString() throws an error - Cannot call method 'toString' of null
String(null) returns - "null"
null + "" also returns - "null"
Very similar behaviour happens if value is undefined (see jbabey's answer).
Other than that, there is a negligible performance difference, which, unless you're using them in huge loops, isn't worth worrying about.
There are differences, but they are probably not relevant to your question. For example, the toString method does not exist on undefined values, but you can cast undefined to a string using the other two methods:
var foo;
var myString1 = String(foo); // "undefined" as a string
var myString2 = foo + ''; // "undefined" as a string
var myString3 = foo.toString(); // throws an exception
http://jsfiddle.net/f8YwA/
They behave the same, but toString also provides a way to convert a number to a binary, octal, or hexadecimal string (or any other radix up to 36):
Example:
var a = (50274).toString(16) // "c462"
var b = (76).toString(8) // "114"
var c = (7623).toString(36) // "5vr"
var d = (100).toString(2) // "1100100"
In addition to all the above, one should note that, for a defined value v:
String(v) calls v.toString()
'' + v calls v.valueOf() prior to any other type cast
So we could do something like:
var mixin = {
valueOf: function () { return false },
toString: function () { return 'true' }
};
mixin === false; // false
mixin == false; // true
'' + mixin; // "false"
String(mixin) // "true"
Tested in FF 34.0 and Node 0.10
According to this JSPerf test, they differ in speed. But unless you're going to use them in huge amounts, any of them should perform fine.
For completeness: As asawyer already mentioned, you can also use the .toString() method.
If you are OK with null, undefined, NaN, 0, and false all casting to '', then (s ? s+'' : '') is faster.
see http://jsperf.com/cast-to-string/8
note - there are significant differences across browsers at this time.
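A quick sketch of what that shortcut returns compared with String(). The function name toStringOrEmpty is hypothetical; the expression inside it is the one from the answer.

```javascript
// Hypothetical wrapper around the expression from the answer.
const toStringOrEmpty = (s) => (s ? s + "" : "");

// Every falsy input collapses to the empty string...
const falsyInputs = [null, undefined, NaN, 0, false, ""];
const collapsed = falsyInputs.map(toStringOrEmpty);

// ...whereas String() keeps them distinguishable.
const viaString = falsyInputs.map(String);

console.log(collapsed); // [ '', '', '', '', '', '' ]
console.log(viaString); // [ 'null', 'undefined', 'NaN', '0', 'false', '' ]
console.log(toStringOrEmpty(42), toStringOrEmpty("abc")); // 42 abc
```

Whether losing the distinction between 0 and '' is acceptable depends on what the resulting string is used for.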
Real world example: I've got a log function that can be called with an arbitrary number of parameters: log("foo is {} and bar is {}", param1, param2). If a DEBUG flag is set to true, the brackets get replaced by the given parameters and the string is passed to console.log(msg). Parameters can and will be Strings, Numbers and whatever may be returned by JSON / AJAX calls, maybe even null.
arguments[i].toString() is not an option, because of possible null values (see Connell Watkins' answer).
JSLint will complain about arguments[i] + "". This may or may not influence a decision on what to use. Some folks strictly adhere to JSLint.
In some browsers, concatenating an empty string is a little faster than using the String function or String constructor (see JSPerf test in Sammys S. answer). In Opera 12 and Firefox 19, concatenating an empty string is ridiculously faster (95% in Firefox 19), or at least JSPerf says so.
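A minimal sketch of such a log function, assuming String() is chosen because it tolerates null and does not trigger the JSLint complaint. The names formatMessage and log, and the choice to leave surplus {} placeholders untouched, are assumptions made for illustration and not the original author's code.

```javascript
const DEBUG = true; // assumed flag, as described above

// Replace each {} in the template with the next parameter, converted via String().
function formatMessage(template, ...params) {
  let next = 0;
  return template.replace(/\{\}/g, (placeholder) =>
    next < params.length ? String(params[next++]) : placeholder
  );
}

function log(template, ...params) {
  if (DEBUG) {
    console.log(formatMessage(template, ...params));
  }
}

log("foo is {} and bar is {}", null, 42); // prints "foo is null and bar is 42"
```

Because String(null) yields "null" rather than throwing, the function is safe for whatever a JSON or AJAX call may hand back.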
On this page you can test the performance of each method yourself :)
http://jsperf.com/cast-to-string/2
In that test, "" + str is the fastest on all machines and browsers, and String(str) is the slowest.
