I was clustering around 40000 points using kmean algorithm. In the first version of the program I wrote the euclidean distance function like this
var euclideanDistance = function( p1, p2 ) { // p1.length === p2.length == 3
var sum = 0;
for( var i in p1 ){
sum += Math.pow( p1[i] - p2[i], 2 );
}
return Math.sqrt( sum );
};
The overall program was quite slow taking on average 7sec to execute. After some profiling I rewrote the above function like this
var euclideanDistance = function( p1, p2 ) { // p1.length === p2.length == 3
var sum = 0;
for( var i = 0; i < p1.length; i++ ) {
sum += Math.pow( p1[i] - p2[i], 2 );
}
return Math.sqrt( sum );
};
Now the programs on average take around 400ms. That's a huge time difference just because of the way I wrote the for loop. I normally don't use for..in loop for arrays but for some reason I used it while writing this function.
Can someone explain why there is this huge difference in performance between these 2 styles?
Look at what's happening differently in each iteration:
for( var i = 0; i < p1.length; i++ )
Check if i < p1.length
Increment i by one
Very simple and fast.
Now look at what's happening in each iteration for this:
for( var i in p1 )
Repeat
Let P be the name of the next property of obj whose [[Enumerable]] attribute is true. If there is no such property, return (normal, V,
empty).
It has to find next property in the object that is enumerable. With your array you know that this can be achieved by a simple integer increment, where as the algorithm to find next enumerable is most likely not that simple because it has to work on arbitrary object and its prototype chain keys.
As a side note, if you cache the length of p1:
var plen = p1.length;
for( var i = 0; i < plen; i++ )
you will get a slight speed increase.
...And if you memoize the function, it will cache results, so if the user tries the same numbers you will see a massive speed increase.
var eDistance = memoize(euclideanDistance);
function memoize( fn ) {
return function () {
var args = Array.prototype.slice.call(arguments),
hash = "",
i = args.length;
currentArg = null;
while (i--) {
currentArg = args[i];
hash += (currentArg === Object(currentArg)) ?
JSON.stringify(currentArg) : currentArg;
fn.memoize || (fn.memoize = {});
}
return (hash in fn.memoize) ? fn.memoize[hash] :
fn.memoize[hash] = fn.apply(this, args);
};
}
eDistance([1,2,3],[1,2,3]);
eDistance([1,2,3],[1,2,3]); //Returns cached value
credit: http://addyosmani.com/blog/faster-javascript-memoization/
First You should be aware of this in the case of for/in and arrays. No big deal if You know what You are doing.
I run some very simple tests to show the difference in performance between different loops:
http://jsben.ch/#/BQhED
That is why prefer to use classic for loop for arrays.
The For/In loop, simply loops through all properties in an object. Since you are not specifying the number of iterations the loop needs to take, it simply 'guesses' at it, and continues on until there are no more objects.
With the second loop, you are specifying all possible variable... a)a starting point, b) the number of iterations the loop should take before stopping, c) increasing the count of the starting point.
You can think of it this way... For/In = guesses the number of iterations, For(a,b,c) you are specifying
Related
Is it safe to use this kind of loop in Javascript?
denseArray = [1,2,3,4,5, '...', 99999]
var x, i = 0
while (x = denseArray[i++]) {
document.write(x + '<br>')
console.log(x)
}
document.write('Used sentinel: ' + denseArray[i])
document.write('Size of array: ' + i)
It is shorter than a for-loop and maybe also more effective for big arrays, to use a built in sentinel. A sentinel flags the caller to the fact that something rather out-of-the-ordinary has happened.
The array has to be a dense array to work! That means there are no other undefined value except the value that come after the last element in the array. I nearly never use sparse arrays, only dense arrays so that's ok for me.
Another more important point to remember (thank to #Jack Bashford reminded) is that's not just undefined as a sentinel. If an array value is 0, false, or any other falsy value, the loop will stop. So, you must be sure that the data in the array does not have falsy values that is 0, "", '', ``, null, undefined and NaN.
Is there something as a "out of range" problem here, or can we consider arrays in Javascript as "infinite" as long memory is not full?
Does undefined mean browsers can set it to any value because it is undefined, or can we consider the conditional test always to work?
Arrays in Javascript is strange because "they are Objects" so better to ask.
I can't find the answer on Stackoverflow using these tags: [javascript] [sentinel] [while-loop] [arrays] . It gives zero result!
I have thought about this a while and used it enough to start to worry. But I want to use it because it is elegant, easy to see, short, maybe effective in big data. It is useful that i is the size of array.
UPDATES
#Barmar told: It's guaranteed by JS that an uninitialized array
element will return the value undefined.
MDN confirms: Using
an invalid index number returns undefined.
A note by #schu34: It is better to use denseArray.forEach((x)=>{...code}) optimized for it's use and known by devs. No need to encounter falsy values. It has good browser support.
Even if your code won't be viewed by others later on, it's a good idea to make it as readable and organized as possible. Value assignment in condition testing (except for the increment and decrement operators) is generally a bad idea.
Your check needs to be a bit more specific, too, as [0, ''] both evaluate to false.
denseArray = [1,2,3,4,5, '...', 99999]
for(let i = 0; i < denseArray.length; i++) {
let x = denseArray[i]
document.write(x + '<br>');
console.log(x);
if (/* check bad value */) break;
}
document.write('Used sentinel: ' + denseArray[i])
document.write('Size of array: ' + i)
From my experience it's usually not worth it to save a few lines if readability or even reliability is the cost.
Edit: here's the code I used to test the speed
const arr = [];
let i;
for (i = 0; i < 30000000; i++) arr.push(i.toString());
let x;
let start = new Date();
for(i = 0; i < arr.length; i++) {
x = arr[i];
if (typeof x !== 'string') break;
}
console.log('A');
console.log(new Date().getTime() - start.getTime());
start = new Date();
i = 0;
while (x = arr[i++]) {
}
console.log('B');
console.log(new Date().getTime() -start.getTime());
start = new Date();
for(i = 0; i < arr.length; i++) {
x = arr[i];
if (typeof x !== 'string') break;
}
console.log('A');
console.log(new Date().getTime() - start.getTime());
start = new Date();
i = 0;
while (x = arr[i++]) {
}
console.log('B');
console.log(new Date().getTime() -start.getTime());
start = new Date();
for(i = 0; i < arr.length; i++) {
x = arr[i];
if (typeof x !== 'string') break;
}
console.log('A');
console.log(new Date().getTime() - start.getTime());
start = new Date();
i = 0;
while (x = arr[i++]) {
}
console.log('B');
console.log(new Date().getTime() -start.getTime());
The for loop even has an extra if statement to check for bad values, and still is faster.
Searching for javascript assignment in while gave results:
Opinions vary from it looks like a common error where you try to compare values to If there is quirkiness in all of this, it's the for statement's wholesale divergence from the language's normal syntax. The for is syntactic sugar adding redundance. It has not outdated while together with if-goto.
The question in first place is if it is safe. MDN say: Using an invalid index number returns undefined in Array, so it is a safe to use. Test on assignments in condition is safe. Several assignments can be done in the same, but a declaration with var, let or const does not return as assign do, so the declaration has to be outside the condition. Have a comment abowe to explain to others or yourself in future that the array must remain dense without falsy values, because otherwise it can bug.
To allow false, 0 or "" (any falsy except undefined) then extend it to: while ((x = denseArray[i++]) !== undefined) ... but then it is not better than an ordinary array length comparision.
Is it useful? Yes:
while( var = GetNext() )
{
...do something with var
}
Which would otherwise have to be written
var = GetNext();
while( var )
{
...do something
var = GetNext();
}
In general it is best to use denseArray.forEach((x) => { ... }) that is well known by devs. No need to think about falsy values. It has good browser support. But it is slow!
I made a jsperf that showed forEach is 60% slower than while! The test also show the for is slightly faster than while, on my machine! See also #Albert answer with a test show that for is slightly faster than while.
While this use of while is safe it may not be bugfree. In time of coding you may know your data, but you don't know if someone copy-paste the code to use on other data.
I've been debugging this for hours to no avail.
The following code checks whether each number in a series of numbers has no repeating digits ( 111 should return false; 123 should return true) and return an array of all numbers in the series which contain no repeating digits.
The array should be populating with values which the helper function returns as true for each value in the array, but running noRepeats() causes an infinite loop or a long array of 1s. What is causing this?
// DO NOT RUN AS IS; POTENTIAL INFINITE LOOP
var noRepeatDigits = function (n) {
n = n.toString();
for ( i = 0 ; i < n.length ; i ++ ) {
for ( j = i + 1 ; j < n.length ; j ++ ) {
if ( n.charAt(i) === n.charAt(j) ) {
return false;
}
}
}
return true;
};
console.log( noRepeatDigits(113) ); // false
console.log( noRepeatDigits(123) ); // true
var noRepeats = function (n1, n2) {
var arr = [];
for ( i = n1 ; i <= n2 ; i ++ ) {
if ( noRepeatDigits(i) ) {
arr.push(i);
}
}
return arr;
};
console.log( noRepeats(1, 100) );
You forgot to var i, so the iterator is global and the two functions using it overwrite each other. This causes unexpected behaviour at best, an infinite loop at worst.
However you can simplify your noRepeatDigits function a lot:
noRepeatDigits = function(n) {
return !n.toString().match(/(.).*?\1/);
};
This effectively does what your original function did, but offloads the heavy work to built-in, lower-level functions which generally speaking are significantly faster.
This is an addon to Niet the Dark Absol's answer.
As pointed out by him the unexpected behavior was caused by the variables used in your loop. I would recommend you to put the "use strict"; hint at the top of your Javascript. That way you will not make the same kind of mistake again.
Example:
"use strict";
// DO NOT RUN AS IS; POTENTIAL INFINITE LOOP
var noRepeatDigits = function (n) {
n = n.toString();
for ( i = 0 ; i < n.length ; i ++ ) {
for ( var j = i + 1 ; j < n.length ; j ++ ) {
if ( n.charAt(i) === n.charAt(j) ) {
return false;
}
}
}
return true;
};
Quote:
Converting mistakes into errors
Strict mode changes some previously-accepted mistakes into errors.
JavaScript was designed to be easy for novice developers, and
sometimes it gives operations which should be errors non-error
semantics. Sometimes this fixes the immediate problem, but sometimes
this creates worse problems in the future. Strict mode treats these
mistakes as errors so that they're discovered and promptly fixed.
First, strict mode makes it impossible to accidentally create global
variables. In normal JavaScript mistyping a variable in an assignment
creates a new property on the global object and continues to "work"
(although future failure is possible: likely, in modern JavaScript).
Assignments which would accidentally create global variables instead
throw in strict mode:
See:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Strict_mode
You know that in Javascript you can access the length of an text/array with length property:
var obj = ["Robert", "Smith", "John", "Mary", "Susan"];
// obj.length returns 5;
I want to know how this is implemented. Does Javascript calculates the length property when it is called? Or it is just a static property which is changed whenever the array is changed. My question is asked due to the following confusion in best-practices with javascript:
for(var i = 0; i < obj.length; i++)
{
}
My Problem: If it is a static property, then accessing the length property in each iteration is nothing to be concerned, but if it is calculated on each iteration, then it cost some memory.
I have read the following definition given by ECMAScript but it doesn't give any clue on how it is implemented. I'm afraid it might give a whole instance of array with the length property calculated in run-time, that if turns out to be true, then the above for() is dangerous to memory and instead the following should be used:
var count = obj.length;
for(var i = 0; i < count; i++)
{
}
Array in JavaScript is not a real Array type but it's an real Object type.
[].length is not being recalculated every time, it is being operated by ++ or -- operators.
See below example which is behaving same like array.length property.
var ArrayLike = {
length: 0,
push: function(val){
this[this.length] = val;
this.length++;
},
pop: function(){
delete this[this.length-1];
this.length--;
},
display: function(){
for(var i = 0; i < this.length; i++){
console.log(this[i]);
}
}
}
// output
ArrayLike.length // length == 0
ArrayLike.push('value1') // length == 1
ArrayLike.push('value2') // length == 2
ArrayLike.push('value3') // length == 3
ArrayLike.pop() // length == 2
ArrayLike.length === 2 // true
var a = ["abc","def"];
a["pqr"] = "hello";
What is a.length?
2
Why?
a.length is updated only when the index of the array is a numeric value. When you write
var a = ["abc","def"];
It is internally stored as:
a["0"] = "abc"
a["1"] = "def"
Note that the indexes are really keys which are strings.
Few more examples:
1.)
var a = ["abc","def"];
a["1001"] = "hello";
What is a.length?
1002
2.) Okay, let's try again:
var a = ["abc","def"];
a[1001] = "hello";
What is a.length?
1002
Note here, internally array is stored as
a["0"] = "abc"
a["1"] = "def"
a["1001"] = "hello"
3.) Okay, last one:
var a = ["abc"];
a["0"] = "hello";
What is a[0]?
"hello"
What is a.length?
1
It's good to know what a.length actually means: Well now you know: a.length is one more than the last numerical key present in the array.
I want to know how this is implemented. Does Javascript calculates the length property when it is called? Or it is just a static property which is changed whenever the array is changed.
Actually, your question cannot be answered in general because all the ECMA specs say is this:
The length property of this Array object is a data property whose
value is always numerically greater than the name of every deletable
property whose name is an array index.
In other words, the specs define the invariant condition of the length property, but not it's implementation. This means that different JavaScript engines could, in principle, implement different behavior.
var avg = function()
{
var sum = 0;
for (var i = 0, j = arguments.length; i < j; i++)
{
sum += arguments[i];
}
return sum / arguments.length;
}
When I try to call this like:
var average = avg(2,3,5);
average; // It works fine;
But how do I call it without assigning to a variable?
If anybody can give any suggestion it will be delightful..Thanks.
You'd simply call it like this:
avg(2, 3, 5);
If you want to see the result, put it in an alert call:
alert( avg(2, 3, 5) );
You don't need to put the result from calling the function in a variable, you can do whatever you like with it.
For example, use it as the value in an alert:
alert(avg(2,3,5));
You can use it in another expression:
var message = "The average is " + avg(2,3,5);
You can use it directly in another call:
someFunction(avg(2,3,5));
You can even throw the result away by not doing anything with it, even if that's not useful in this specific situation:
avg(2,3,5);
If you don't put the result into a variable or in a compatible context, this function cannot output anything, which makes it difficult to use unless you make it output the result. Try this :
var avg = function()
{
var sum = 0;
for (var i = 0, j = arguments.length; i < j; i++)
{
sum += arguments[i];
}
var retvalue = sum / arguments.length;
consoloe.log("avg: "+retvalue);
return retvalue ;
}
Then it may help you to see whenever the function is called or not.
You need to understand the concept of expressions.
Every expression as a whole represents one value. An expression can be made up of multiple subexpressions that are combined in some manner (for example with operators) to yield a new value.
For instance:
3 is an expression (a literal, to be specific) that denotes the numeric value three.
3 + 4 is an expression, made up of two literal expressions, that as a whole yields the value 7
When you assign a value to a variable – as in var average = – the right hand side of the =-operator needs to be an expression, i.e. something that yields a value.
As you have observed, average will have been assigned the value five. It thus follows, that avg(2, 3, 5) must itself be an expression that evaluated to the value 5.
average itself is an expression, denoting the current value of said variable.
The most important thing to take away from this is, that avg() is in no way connected to var average =. avg() stands on its own, you can just think it like an ordinary value such as 5 (there are other differences of course).
If you have understood the concept of expressions and values, it should be clear that if you can do
var average = avg(2,3,5);
average; // It works fine;
You can also do
avg(2,3,5);
I have an array of arrays. The inner array is 16 slots, each with a number, 0..15. A simple permutation.
I want to check if any of the arrays contained in the outer array, have the same values as
a test array (a permutation of 16 values).
I can do this easily by something like so:
var containsArray = function (outer, inner) {
var len = inner.length;
for (var i=0; i<outer.length; i++) {
var n = outer[i];
var equal = true;
for (var x=0; x<len; x++) {
if (n[x] != inner[x]) {
equal = false;
break;
}
}
if (equal) return true;
}
return false;
}
But is there a faster way?
Can I assign each permutation an integral value - actually a 64-bit integer?
Each value in a slot is 0..15, meaning it can be represented in 4 bits. There are 16 slots, which implies 64 total bits of information.
In C# it would be easy to compute and store a hash of the inner array (or permutation) using this approach, using the Int64 type. Does Javascript have 64-bit integer math that will make this fast?
That's just about as fast as it gets, comparing arrays in javascript (as in other languages) is quite painful. I assume you can't get any speed benefits from comparing the lengths before doing the inner loop, as your arrays are of fixed size?
Only "optimizations" I can think of is simplifying the syntax, but it won't give you any speed benefits. You are already doing all you can by returning as early as possible.
Your suggestion of using 64-bit integers sounds interesting, but as javascript doesn't have a Int64 type (to my knowledge), that would require something more complicated and might actually be slower in actual use than your current method.
how about comparing the string values of myInnerArray.join('##') == myCompareArray.join('##'); (of course the latter join should be done once and stored in a variable, not for every iteration like that).
I don't know what the actual performance differences would be, but the code would be more terse. If you're doing the comparisons a lot of times, you could have these values saved away someplace, and the comparisons would probably be quicker at least the second time round.
The obvious problem here is that the comparison is prone to false positives, consider
var array1 = ["a", "b"];
var array2 = ["a##b"];
But if you can rely on your data well enough you might be able to disregard from that? Otherwise, if you always compare the join result and the lengths, this would not be an issue.
Are you really looking for a particular array instance within the outer array? That is, if inner is a match, would it share the same reference as the matched nested array? If so, you can skip the inner comparison loop, and simply do this:
var containsArray = function (outer, inner) {
var len = inner.length;
for (var i=0; i<outer.length; i++) {
if (outer[i] === inner) return true;
}
return false;
}
If you can't do this, you can still make some headway by not referencing the .length field on every loop iteration -- it's an expensive reference, because the length is recalculated each time it's referenced.
var containsArray = function (outer, inner) {
var innerLen = inner.length, outerLen = outer.length;
for (var i=0; i<outerLen; i++) {
var n = outer[i];
var equal = true;
for (var x=0; x<innerLen; x++) {
if (n[x] != inner[x]) {
equal = false;
}
}
if (equal) return true;
}
return false;
}
Also, I've seen claims that loops of this form are faster, though I haven't seen cases where it makes a measurable difference:
var i = 0;
while (i++ < outerLen) {
//...
}
EDIT: No, don't remove the equal variable; that was a bad idea on my part.
the only idea that comes to me is to push the loop into the implementation and trade some memory for (speculated, you'd have to test the assumption) speed gain, which also relies on non-portable Array.prototype.{toSource,map}:
var to_str = function (a) {
a.sort();
return a.toSource();
}
var containsString = function (outer, inner) {
var len = outer.length;
for (var i=0; i<len; ++i) {
if (outer[i] == inner)
return true;
}
return false;
}
var found = containsString(
outer.map(to_str)
, to_str(inner)
);
var containsArray = function (outer, inner) {
var innerLen = inner.length,
innerLast = inner.length-1,
outerLen = outer.length;
outerLoop: for (var i=0; i<outerLen; i++) {
var n = outer[i];
for (var x = 0; x < innerLen; x++) {
if (n[x] != inner[x]) {
continue outerLoop;
}
if (x == innerLast) return true;
}
}
return false;
}
Knuth–Morris–Pratt algorithm
Rumtime: O(n), n = size of the haystack
http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm