I'm using the following code to compare the request's IP address against a configured list (using node-restify, which is similar to Express):
var checkIP = function (config, req) {
    var ip = req.connection.remoteAddress.split('.'),
        curIP,
        b,
        block = [];
    for (var i=0, z=config.ips.length-1; i<=z; i++) {
        curIP = config.ips[i].split('.');
        b = 0;
        // Compare each block
        while (b<=3) {
            block[b] = (curIP[b]===ip[b] || curIP[b]==='*');
            b++;
        }
        // Check all blocks
        if (block[0] && block[1] && block[2] && block[3]) {
            return true;
        }
    }
    return false;
};
config.ips contains an array which (as should be obvious from the code) can be specific or wildcarded IPs.
This works, but it seems like there is a more efficient way to do this. Just curious if anyone has any suggestions on a way to simplify this or make it more efficient. My request time nearly doubled when I introduced this and I'd like to squeeze out some load time if possible.
If my intuition is correct, you might be doing a bunch of extra work right now.
For each IP expression in your config.ips array, your code splits the expression, compares all four blocks, and only then runs this check:
if (block[0] && block[1] && block[2] && block[3]) {
return true;
}
Note that by the time this check runs, you have already evaluated (curIP[b]===ip[b] || curIP[b]==='*') four times per IP to fill all 4 blocks, so the ANDing above does not save any of that work.
I have 2 ideas for you:
Since IP addresses are strings anyway, the * notation lends itself to a regex. As a next step, you could implement the matching with a regex instead of .split() and compare, and test how that performs.
Or figure out how to avoid the overhead of comparing the parts every time: compare the whole strings when you can, and fall back to comparing the parts only when necessary.
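The regex idea could be sketched roughly as follows. This is only an illustration, not a drop-in replacement: compileIpMatcher and isListed are hypothetical names, and the sketch assumes every entry in the list is a dotted IPv4 pattern whose blocks are either digits or '*'. The point is that the list is compiled once at startup, so each request costs a single regex test.

```javascript
// Hypothetical helper: compile a list of patterns such as '192.168.*.*'
// into one anchored RegExp, once, instead of splitting on every request.
// Assumes each block is either plain digits or '*'.
function compileIpMatcher(patterns) {
    var alternatives = patterns.map(function (pattern) {
        return pattern.split('.').map(function (block) {
            // '*' matches any 1-3 digit block; other blocks are used literally
            return block === '*' ? '\\d{1,3}' : block;
        }).join('\\.');
    });
    var combined = new RegExp('^(?:' + alternatives.join('|') + ')$');
    return function (ip) {
        return combined.test(ip);
    };
}

// Build the matcher once (e.g. when the config is loaded)...
var isListed = compileIpMatcher(['127.0.0.1', '192.168.*.*']);

// ...then each check is a single call.
console.log(isListed('192.168.1.20'));
console.log(isListed('10.0.0.1'));
```

Whether this is actually faster than the split-and-compare loop depends on the size of the list and the engine, so it is worth measuring both.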
If you want to read some C code, here's how Apache does IP blacklisting behind the scenes. Look at the function named in_domain for some inspiration.
Good luck, hope this helps!
This is from Leetcode problem: Concatenated Words.
Below is a working solution. I added what I thought to be an optimization (see code comment), but it actually slows down the code. If I remove the wrapping if statement, it runs faster.
To me, the optimization helps avoid having to:
call an expensive O(n) substring()
check inside wordsSet
make an unnecessary function call to checkConcatenation
Surely if (!badStartIndices.has(end + 1)) isn't more expensive than all of the above, right? Maybe it has something to do with JavaScript JIT compilation? V8? Thoughts?
Use the following test input:
// Notice how the second string ends with a 'b'!
const words = [
'a',
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab',
];
// Function in question.
var findAllConcatenatedWordsInADict = function (words) {
    console.time();
    // 1) put all words in a set
    const wordsSet = new Set(words);
    let badStartIndices;
    // 2) iterate words, recursively check if it's a valid word
    const concatenatedWords = [];
    function checkConcatenation(word, startIdx = 0, matches = 0) {
        if (badStartIndices.has(startIdx)) {
            return false;
        }
        if (startIdx === word.length && matches >= 2) {
            concatenatedWords.push(word);
            return true;
        }
        for (let end = startIdx; end < word.length; end++) {
            // I ADDED THE IF STATEMENT AS AN OPTIMIZATION. BUT CODE RUNS FASTER WITHOUT IT.
            // NOTE: Code is correct with & without if statement.
            if (!badStartIndices.has(end + 1)) {
                const curWord = word.substring(startIdx, end + 1);
                if (wordsSet.has(curWord)) {
                    if (checkConcatenation(word, end + 1, matches + 1)) {
                        return true;
                    }
                }
            }
        }
        badStartIndices.add(startIdx);
        return false;
    }
    for (const word of words) {
        // reset memo at beginning of each word
        badStartIndices = new Set();
        checkConcatenation(word);
    }
    console.timeEnd();
    return concatenatedWords;
};
Turns out this depends entirely on the input data, not on JavaScript or V8. (And as of writing this, I don't know what data you used for benchmarking.)
With the example input from the Leetcode page you've linked, badStartIndices never does anything useful (both of the .has checks always return false); so it's fairly obvious that doing this fruitless check twice is a little slower than doing it just once. In that case, the "dynamic programming" mechanism of the solution never kicks in, so the effective behavior degenerates to brute force, which is good enough because the input data is well-behaved. (In fact, deleting badStartIndices entirely would be even faster for such a test case.)
If I construct "evil" input data that actually leads to exponential combinatorial blow-up, i.e. where the badStartIndices.has(...) checks actually have something to do, then adding the early check does have a (small) performance benefit. (And without either of the checks, the computation would take "forever" for such inputs.)
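To make the effect of "evil" input concrete, here is a small sketch that counts recursive calls with and without the memo of bad start indices. It is a simplified stand-in for the solution above, not the same code: countCalls, evilWord and evilPieces are hypothetical names, and the input (a run of 'a' characters ending in 'b', with short runs of 'a' as the dictionary) is just one assumed way to force the blow-up.

```javascript
// Hypothetical helper: count how many recursive calls are needed to decide
// whether `word` can be assembled from `pieces`, with or without memoization.
function countCalls(word, pieces, useMemo) {
    const set = new Set(pieces);
    const bad = new Set();
    let calls = 0;
    function check(start) {
        calls++;
        if (useMemo && bad.has(start)) return false;
        if (start === word.length) return true;
        for (let end = start; end < word.length; end++) {
            if (set.has(word.substring(start, end + 1)) && check(end + 1)) {
                return true;
            }
        }
        if (useMemo) bad.add(start); // remember that nothing works from here
        return false;
    }
    check(0);
    return calls;
}

// The trailing 'b' guarantees failure, so every way of splitting the 'a's is tried.
const evilWord = 'a'.repeat(18) + 'b';
const evilPieces = ['a', 'aa', 'aaa'];
const withMemo = countCalls(evilWord, evilPieces, true);
const withoutMemo = countCalls(evilWord, evilPieces, false);
console.log({ withMemo, withoutMemo });
```

On well-behaved input the two counts are close, which is why the extra check only shows up as overhead there.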
So, taking a step back, this is one more example to illustrate that benchmarking is difficult; in particular, in order to get useful results, care must be taken to select relevant/realistic input data.
If the tests are too simple, developers are likely to not build optimizations that would help (a little or a lot) in high-load situations.
If the tests are too demanding, developers are likely to waste time on overly complicated code that ends up being slower than it could be for its target use case.
And if the code must handle any input with maximum performance, then as the developer you have the extra challenge of avoiding overhead for simple inputs while still scaling well to tough inputs...
Basically, I was playing around with a Steam bot some time ago and made it auto-reply when you said things from an array, i.e. a 'hello-triggers' array containing things like "hi", "hello" and such. Whenever it received a message, it would check for matches using indexOf(), and everything worked fine until I noticed it would treat 'hiasodkaso' or 'hidemyass' as a "hi" trigger.
So it would match anything that contained the word even if it was in the middle of a word.
How would I go about making indexOf only notice it if it's the exact word, and not something else in the same word?
I don't have the actual script at hand, but here is an example that is pretty much like it:
var hiTriggers = ['hi', 'hello', 'yo'];
// here goes the receiving message function and what not, then:
for (var i = 0; i < hiTriggers.length; i++) {
    if (message.indexOf(hiTriggers[i]) >= 0) {
        // randomHelloMsg is already defined; pick one of its entries at random
        bot.sendMessage(SteamID, randomHelloMsg[Math.floor(Math.random() * randomHelloMsg.length)]);
    }
}
Regex wouldn't be used for this, right? As it is to be used for expressions or whatever. (my English isn't awesome, ikr)
Thanks in advance. If I wasn't clear enough on something, please let me know and I'll edit/formulate it in another way! :)
You can extend prototype:
String.prototype.regexIndexOf = function(regex, startpos) {
var indexOf = this.substring(startpos || 0).search(regex);
return (indexOf >= 0) ? (indexOf + (startpos || 0)) : indexOf;
}
and do:
var foo = "hia hi hello";
foo.regexIndexOf(/\bhi\b/); // \b on both sides, so "hi" must be a whole word
Or if you don't want to extend the string object:
foo.substring(startpos).search(/\bhi\b/); // startpos: index to start searching from
Both examples were taken from the top answers to "Is there a version of JavaScript's String.indexOf() that allows for regular expressions?"
"Regex wouldn't be used for this, right? As it is to be used for expressions or whatever. (my English isn't awesome, ikr)"
Actually, regex is for any old pattern matching. It's absolutely useful for this.
fmsf's answer should work for what you're trying to do. However, extending native objects' prototypes is generally frowned upon, AFAIK: you can easily break libraries by doing so, so I'd avoid it when possible. In this case you could use his regexIndexOf function by itself, or in concert with something like:
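// Takes a string and a word, and searches the string for that whole word.
// Assumes a standalone regexIndexOf(str, regex, startpos) version of the function above.
function regexIndexWord(str, word) {
    // Build a real RegExp; inside a string literal "\b" would mean backspace, so double the backslash
    return regexIndexOf(str, new RegExp("\\b" + word + "\\b"));
}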
Which would let you search based on your array of words without having to add the special symbols to each one.
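Putting the pieces together, a whole-word trigger check could look something like the sketch below. The helper names escapeRegExp and containsTrigger are made up for this example, and it assumes triggers are plain words made of letters and digits, matched case-insensitively, which may or may not be what the bot needs.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex,
// so a trigger is always treated as literal text.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: true if the message contains any trigger as a whole word.
function containsTrigger(message, triggers) {
    var pattern = new RegExp(
        '\\b(?:' + triggers.map(escapeRegExp).join('|') + ')\\b',
        'i' // assumption: matching should ignore case
    );
    return pattern.test(message);
}

var hiTriggers = ['hi', 'hello', 'yo'];
console.log(containsTrigger('hi there', hiTriggers));
console.log(containsTrigger('hidemyass', hiTriggers));
```

If the trigger list never changes, the RegExp can be built once outside the function instead of on every message.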
Today this question came up for the project I'm working on. The 'problem' is that I have some uncertainties in the data that is provided to me and on top of which I am building my application. This means it could be that some values sometimes are present but sometimes not. Because I want some consistency in my upper layers I wrote some 'sanitize' method that creates the consistency I want.
But which is better?
var myNewData = {};
myNewData['somevalue'] = (myOldData.somevalue) ? myOldData.somevalue : '';
or
var myNewData = {};
myNewData['somevalue'] = myOldData.somevalue || '';
And...why is it better? Is it performance? Readability?
Just curious.
EDIT:
To be clear: the 'somevalue' property does not necessarily have to be in the old data. Sometimes it is, sometimes it is not.
EDIT2:
Of course, if I know that the old data holds a non-string value (numeric, boolean, etc.), I will default to its appropriate value (0, true, etc.).
You should aim at readability, and for this case || clearly wins.
Performance in JavaScript is very hard to predict because it can vary wildly between implementations, and sometimes the results seem totally illogical (something that formally requires three lookups can be faster than something requiring one, perhaps because the runtime engine has been specialized for that code path).
I ran a performance test for these instructions:
myNewData['somevalue'] = (myOldData.somevalue) ? myOldData.somevalue : ''
myNewData['somevalue'] = (myOldData.somevalue) || ''
And, as a bonus, the plain old if:
if (myOldData.somevalue)
    myNewData['somevalue'] = myOldData.somevalue;
else
    myNewData['somevalue'] = '';
I ran each variant with myOldData.somevalue both empty and not empty, using a test like this:
for (i = 0; i < 10; ++i) {
    for (j = 0; j < 100000000; ++j) {
        result = empty || "";
    }
}
The outer loop is there to calculate an average (timing code is omitted). These are my results (lower is better):
Code |      Empty       |    Not Empty
     |  IE9     Chrome  |  IE9     Chrome
------------------------------------------
?:   | 1435.1   551.1   | 1636.1   706.1
||   | 1450.3   488     | 1623.7   706.4
if   | 1436.2   491     | 1642.6   653.6
------------------------------------------
So I guess performance isn't the point here (though a better test would also check what happens when the tested variable is something more complex).
Readability is very much a matter of opinion. Personally I prefer || because it's clear enough and shorter, but a C programmer may not like it, while a C# programmer will recognize it as similar to the ?? operator...
The problem with:
myNewData['somevalue'] = myOldData.somevalue || '';
is that, if myOldData.somevalue holds an acceptable falsy value, you'd still get the empty string.
So, with the first one you could at least make a strict check to have better control:
(myOldData.somevalue !== false) ? myOldData.somevalue : '';
Both will yield '' if myOldData.somevalue is falsy (null, undefined, 0, '', etc.), but the first one is very susceptible to copy/paste errors. Use the second form wherever possible, as long as coalescing all falsy values to your default is tolerable.
Note that if the base object -- myOldData -- could be null or undefined, it's a completely different ball game, and you'll need to do something like this:
myNewData.somevalue = ( myOldData && myOldData.somevalue ) || '';
This assumes that myOldData is an object. If it's a string or number, bad things could happen here. (And gratuitous parentheses are always a good idea.)
I think both are equally good.
However, take care that they work only for maps containing string values. If the old map contains the value false, null or 0, they would be transformed into an empty string.
Therefore, I tend to prefer the generic case:
myNewData['somevalue'] = (myOldData.somevalue != undefined) ? myOldData.somevalue : '';
However, if you are handling strings only, the short myOldData.somevalue || '' looks concise and straightforward to me.
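The difference between these forms only shows up with falsy values, which a short sketch can make visible. The sanitize function and its property names below are invented for this illustration. The third variant uses the ?? (nullish coalescing) operator, which is not mentioned in the answers above and is only available in ES2020-capable engines; it is included purely for comparison.

```javascript
// Hypothetical helper: apply three defaulting strategies to the same input.
function sanitize(oldData) {
    return {
        // falls back for ANY falsy value (undefined, null, 0, false, '')
        withOr: oldData.somevalue || '',
        // falls back only when the property is missing or undefined
        withCheck: oldData.somevalue !== undefined ? oldData.somevalue : '',
        // falls back only for null or undefined (requires ES2020)
        withNullish: oldData.somevalue ?? ''
    };
}

console.log(sanitize({ somevalue: 0 }));    // 0 is falsy but may be a valid value
console.log(sanitize({}));                  // property missing entirely
console.log(sanitize({ somevalue: null })); // explicit null
```

For string-only data all three behave the same, which is why the short || form is usually fine there.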
No difference.
Performance? Negligible.
Readability? That's for you and your colleagues to judge: whichever is easier for you to read. I'd prefer the shorter one.
The second one is better in that you don't need to repeat the same expression, so there is less chance of introducing an error. At the same time, it is unfamiliar to people who have previously used only statically typed languages.
I am currently working on a big project (by big I mean many processes) where every millisecond I save means a lot (in the long run), so I want to make sure I am doing it the right way.
So, what is the best way to make sure that splitting a string gives you an array with more than one element?
a) use indexOf(), then if result is different than -1, split()
b) split() (regardless of whether the characters exist), then do stuff ONLY if array.length is greater than 1
c) another not listed above
Using jsPerf, it appears that omitting .indexOf() is roughly 23% more efficient than including it over 500,000 iterations (11.67 vs. 8.95 operations per second):
Without indexOf():
var str = "test";
for (var i = 0; i < 500000; i++) {
    var test = str.split('.');
}
With .indexOf():
var str = "test";
for (var i = 0; i < 500000; i++) {
    if (str.indexOf('.')) {
        var test = str.split('.');
    } else {
        var test = str;
    }
}
http://jsperf.com/split-and-split-indexof
EDIT
Hmm... If the condition is instead written as:
if (str.indexOf('.') > -1)
http://jsperf.com/split-and-split-indexof-with-indexof-check
or as any other explicit comparison, it's seemingly quite a bit faster (by about 69%).
The only reason I can think of for this is that running .split() on every value performs two operations each time (find, then separate), instead of a single find with the separation done only when necessary. Note that this last part is just a guess.
We can see that even when there is something to split, the best results come from comparing the indexOf result against a value. Still, the improvement is smaller than in the case where 100% of the items don't need a split. Thus, the more items need to be split, the less benefit the test provides (as would be expected). So it really depends on the use case, since the extra code takes up memory and uses resources.
http://jsperf.com/split-and-split-indexof/2
(b) is obviously more efficient than (a) because split uses the same logic as indexOf, and that logic will not need to be repeated if there are indeed 2 or more elements. I cannot think of a more efficient way.
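For reference, options (a) and (b) from the question can be written as two small functions that return the same result, which makes them easy to drop into a benchmark. The function names are hypothetical, and returning null when there is nothing to split is just one assumed convention.

```javascript
// Option (a): test with indexOf first, split only when the separator is present.
function partsViaIndexOf(str, sep) {
    if (str.indexOf(sep) === -1) {
        return null; // nothing to split
    }
    return str.split(sep);
}

// Option (b): always split, then check the length of the result.
function partsViaSplit(str, sep) {
    var parts = str.split(sep);
    return parts.length > 1 ? parts : null;
}

console.log(partsViaIndexOf('a.b.c', '.'), partsViaSplit('a.b.c', '.'));
console.log(partsViaIndexOf('test', '.'), partsViaSplit('test', '.'));
```

Which one is faster depends on how often the separator is actually present in your data, as the measurements above suggest.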
I want to find the number of tabs at the beginning of a string (and of course I want it to be fast running code ;) ). This is my idea, but not sure if this is the best/fastest choice:
//The regular expression (no g flag, so repeated exec() calls always start at index 0)
var findBegTabs = /^\t+/;
//This string has 3 tabs and 2 spaces: "<tab><tab><space>something<space><tab>"
var str = "\t\t something \t";
//Look for the tabs at the beginning
var match = findBegTabs.exec( str );
//We found...
var numOfTabs = ( match ) ? match[ 0 ].length : 0;
Another possibility is to use a loop and charAt:
//This string has 3 tabs and 2 spaces: "<tab><tab><space>something<space><tab>"
var str = "\t\t something \t";
var numOfTabs = 0;
var start = 0;
//Loop and count number of tabs at beg
while ( str.charAt( start++ ) == "\t" ) numOfTabs++;
In general, if you can calculate the data by simply iterating through the string and doing a character check at every index, this will be faster than a regular expression, which must build up a more complex search engine. I encourage you to profile this, but I think you'll find the straight search is faster.
Note: Your search should use === instead of == here as you don't need to introduce conversions in the equality check.
function numberOfTabs(text) {
    var count = 0;
    var index = 0;
    while (text.charAt(index++) === "\t") {
        count++;
    }
    return count;
}
Try using a benchmarking tool (such as jsPerf) or one of the many available backend profilers to create and run benchmarks on your target systems (the browsers and/or interpreters you plan to support for your software).
It's useful to reason about which solution will perform best based on your expected data and target system(s); however, you may sometimes be surprised by which solution actually performs fastest, especially with regard to big-oh analysis and typical data sets.
In your specific case, iterating over characters in the string will likely be faster than regular expression operations.
One-liner (if you find smallest is best):
"\t\tsomething".split(/[^\t]/)[0].length;
i.e. splitting by all non-tab characters, then fetching the first element and obtaining its length.
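Before benchmarking, it helps to confirm that the approaches in this thread agree with each other. The sketch below wraps the regex, loop, and split variants in functions with made-up names (tabsViaRegex, tabsViaLoop, tabsViaSplit) and runs them on a few sample strings, including the edge cases of an empty string and a string with no leading tabs.

```javascript
// Regex variant: match a run of tabs anchored at the start.
function tabsViaRegex(text) {
    var match = /^\t+/.exec(text);
    return match ? match[0].length : 0;
}

// Loop variant: advance while the current character is a tab.
function tabsViaLoop(text) {
    var count = 0;
    while (text.charAt(count) === "\t") {
        count++;
    }
    return count;
}

// Split variant: everything before the first non-tab character.
function tabsViaSplit(text) {
    return text.split(/[^\t]/)[0].length;
}

var samples = ["\t\t something \t", "something", "", "\t\t\t"];
samples.forEach(function (sample) {
    console.log(tabsViaRegex(sample), tabsViaLoop(sample), tabsViaSplit(sample));
});
```

Note that the split variant does more work than it needs to on long strings, since it splits the entire string before taking the first element.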