Why is Selection Sort so fast in Javascript? - javascript

I'm studying for a technical interview right now, and writing quick javascript implementations of different sorts. The random-array benchmark results for most of the elementary sorts makes sense but the selection sort is freakishly fast. And I don't know why.
Here is my implementation of the Selection Sort:
Array.prototype.selectionSort = function () {
for (var target = 0; target < this.length - 1; target++) {
var min = target;
for (var j = target + 1; j < this.length - 1; j++) {
if (this[min] > this[j]) {
min = j;
}
}
if (min !== target) {
this.swap(min, target);
}
}
}
Here are the results of the same randomly generated array with 10000 elements:
BubbleSort => 148ms
InsertionSort => 94ms
SelectionSort => 91ms
MergeSort => 45ms
All the sorts are using the same swap method. So why is Selection Sort faster? My only guess is that Javascript is really fast at array traversal but slow at value mutation, since SelectionSort uses the least in value mutation, it's faster.
** For Reference **
Here is my Bubble Sort implementation
Array.prototype.bubbleSort = function () {
for (var i = this.length - 1; i > 1; i--) {
var swapped = false;
for (var j = 0; j < i; j++) {
if (this[j + 1] < this[j]) {
this.swap(j, j+1);
swapped = true;
}
}
if ( ! swapped ) {
return;
}
}
}
Here is the swap Implementation
Array.prototype.swap = function (index1, index2) {
var val1 = this[index1],
val2 = this[index2];
this[index1] = val2;
this[index2] = val1;
};

First let me point out two flaws:
The code for your selection sort is faulty. The inner loop needs to be
for (var j = target + 1; j < this.length; j++) {
otherwise the last element is never selected.
Your jsperf tests sort, as you say, the "same randomly generated array" every time. That means that the successive runs in each test loop will try to sort an already sorted array, which would favour algorithms like bubblesort that have a linear best-case performance.
Luckily, your test array is so freakishly large that jsperf runs only a single iteration of its test loop at once, calling the setup code that initialises the array before every run. This would haunt you for smaller arrays, though. You need to shuffle the array inside the "timed code" itself.
Why is Selection Sort faster? My only guess is that Javascript is really fast at array traversal but slow at value mutation.
Yes. Writes are always slower than reads, and have negative effects on cached values as well.
SelectionSort uses the least in value mutation
Yes, and that is quite significant. Both selection and bubble sort do have an O(n²) runtime, which means that both execute about 100000000 loop condition checks, index increments, and comparisons of two array elements.
However, while selection sort does only O(n) swaps, bubble sort does O(n²) of them. That means not only mutating the array, but also the overhead of a method call. And that much much more often than the selection sort does it. Here are some example logs:
> swaps in .selectionSort() of 10000 element arrays
9989
9986
9992
9990
9987
9995
9989
9990
9988
9991
> swaps in .bubbleSort() of 10000 element arrays
24994720
25246566
24759007
24912175
24937357
25078458
24918266
24789670
25209063
24894328
Ooops.

Related

I Wrote a Custom In-Place Sorting Method for Integer Arrays (Javascript) - Time Complexity? Sorting Method Name?

I wrote a simple algorithm for sorting an array of integers in JS. I'd like to know what the Time and Space Complexity is and whether or not this is an efficient algorithm. I couldn't find this sorting method listed elsewhere online (although it seems similar to bubble sort). I realize that JS has a built in sort function but I wrote this for practice. Please let me know what you think:
function arraySort(array){
var i = 0;
//helper function to sort backwards
function leftSort(j){
if(array[j] < array[j-1]){
//swap in place
temp = array[j]
array[j] = array[j-1]
array[j-1] = temp
if(j-1 > 0){
//call recursively to swap further left
leftSort(j-1)
}
}
}
//sort forwards
while(i < array.length){
if(array[i] > array[i+1]){
//swap in place
var temp = array[i]
array[i] = array[i+1]
array[i+1] = temp
if(i>0){
//sort swapped value backwards to the left
leftSort(i)
}
}
i++
}
leftSort(i)
return array
}
This seems to essentially be a bubble sort with a peculiar order of execution.
Your algorithm travels the array forth and back (n)*(0.5n) times -> O(n^2). Space complexity is constant, since this is a in-place sort.

How to partition array of integers to even and odd?

I want to partition an array (eg [1,2,3,4,5,6,7,8]), first partition should keep even values, second odd values (example result: [2,4,6,8,1,3,5,7]).
I managed to resolve this problem twice with built-in Array.prototype methods. First solution uses map and sort, second only sort.
I would like to make a third solution which uses a sorting algorithm, but I don't know what algorithms are used to partition lists. I'm thinking about bubble sort, but I think it is used in my second solution (array.sort((el1, el2)=>(el1 % 2 - el2 % 2)))... I looked at quicksort, but I don't know where to apply a check if an integer is even or odd...
What is the best (linear scaling with array grow) algorithm to perform such task in-place with keeping order of elements?
You can do this in-place in O(n) time pretty easily. Start the even index at the front, and the odd index at the back. Then, go through the array, skipping over the first block of even numbers.
When you hit an odd number, move backwards from the end to find the first even number. Then swap the even and odd numbers.
The code looks something like this:
var i;
var odd = n-1;
for(i = 0; i < odd; i++)
{
if(arr[i] % 2 == 1)
{
// move the odd index backwards until you find the first even number.
while (odd > i && arr[odd] % 2 == 1)
{
odd--;
}
if (odd > i)
{
var temp = arr[i];
arr[i] = arr[odd];
arr[odd] = temp;
}
}
}
Pardon any syntax errors. Javascript isn't my strong suit.
Note that this won't keep the same relative order. That is, if you gave it the array [1,2,7,3,6,8], then the result would be [8,2,6,3,7,1]. The array is partitioned, but the odd numbers aren't in the same relative order as in the original array.
If you are insisting on an in-place approach instead of the trivial standard return [arr.filter(predicate), arr.filter(notPredicate)] approach, that can be easily and efficiently achieved using two indices, running from both sides of the array and swapping where necessary:
function partitionInplace(arr, predicate) {
var i=0, j=arr.length;
while (i<j) {
while (predicate(arr[i]) && ++i<j);
if (i==j) break;
while (i<--j && !predicate(arr[j]));
if (i==j) break;
[arr[i], arr[j]] = [arr[j], arr[i]];
i++;
}
return i; // the index of the first element not to fulfil the predicate
}
let evens = arr.filter(i=> i%2==0);
let odds = arr.filter(i=> i%2==1);
let result = evens.concat(odds);
I believe that's O(n). Have fun.
EDIT:
Or if you really care about efficiency:
let evens, odds = []
arr.forEach(i=> {
if(i%2==0) evens.push(i); else odds.push(i);
});
let result = evens.concat(odds);
Array.prototype.getEvenOdd= function (arr) {
var result = {even:[],odd:[]};
if(arr.length){
for(var i = 0; i < arr.length; i++){
if(arr[i] % 2 = 0)
result.odd.push(arr[i]);
else
result.even.push(arr[i]);
}
}
return result ;
};

What is the time complexity of this for loop nested in a while loop?

I am trying to optimize a function. I believe this nested for loop is quadratic, but I'm not positive. I have recreated the function below
const bucket = [["e","f"],[],["j"],[],["p","q"]]
let totalLettersIWantBack = 4;
//I'm starting at the end of the bucket
function produceLetterArray(bucket, limit){
let result = [];
let countOfLettersAccumulated = 0;
let i = bucket.length - 1;
while(i > 0){
if(bucket[i].length > 0){
bucket[i].forEach( (letter) =>{
if(countOfLettersAccumulated === totalLettersIWantBack){
return;
}
result.push(letter);
countOfLettersAccumulated++;
})
}
i--;
}
return result;
}
console.log(produceLetterArray(bucket, totalLettersIWantBack));
Here is a trick for such questions. For the code whose complexity you want to analyze, just write the time that it would take to execute each statement in the worst case assuming no other statement exists. Note the comments begining with #operations worst case:
For the given code:
while(i > 0){ //#operations worst case: bucket.length
if(bucket[i].length > 0){ //#operations worst case:: 1
bucket[i].forEach( (letter) =>{ //#operations worst case: max(len(bucket[i])) for all i
if(countOfLettersAccumulated === totalLettersIWantBack){ //#operations worst case:1
return;
}
result.push(letter); //#operations worst case:1
countOfLettersAccumulated++; //#operations worst case:1
})
}
i--; ////#operations worst case:: 1
}
We can now multiply all the worst case times (since they all can be achieved in the worst case, you can always set totalLettersIWantBack = 10^9) to get the O complexity of the snippet:
Complexity = O(bucket.length * 1 * max(len(bucket[i])) * 1 * 1 * 1 * 1)
= O(bucket.length * max(len(bucket[i]))
If the length of each of the bucket[i] was a constant, K, then your complexity reduces to:
O(K * bucket.length ) = O(bucket.length)
Note that the complexity of the push operation may not remain constant as the number of elements grow (ultimately, the runtime will need to allocate space for the added elements, and all the existing elements may have to be moved).
Whether or not this is quadratic depends on what you consider N and how bucket is organized. If N is the total number of letters, then the runtime is bound by either the number of bins in your bucket, if that is larger than N, or it is bound by the number of letters in the bucket, if N is larger. In either case, the search time increases linearly with the larger bound, if one would dominate the other the time complexity is O(N). This is effectively a linear search with "turns" in it, scrunching a linear search and spacing it out does not change the time complexity. The existence of multiple loops in a piece of code does not alone make it non linear. Take the linear search example again. We search a list until we've found the largest element.
//12 elements
var array = [0,1,2,3,4,5,6,7,8,9,10,11];
var rows = 3;
var cols = 4;
var largest = -1;
for(var i = 0; i < rows; ++i){
for(var j = 0; j < cols; ++j){
var checked = array[(i * cols) + j];
if (checked > largest){
largest = checked;
}
}
}
console.log("found largest number (eleven): " + largest.toString());
Despite this using two loops instead of one, the runtime complexity is still O(N) where N is the number of elements in the input. Scrunching this down so each index is actually an array to multiple elements, or separating relevant elements by empty bins doesn't change the fact the runtime complexity is bound linearly.
This is technically linear with n being the number of elements total in your matrix. This is because the exit condition is the length of bucket and for each array in bucket you check if countOfLettersAccumulated is equal to totalLettersIWantBack. Continually looking at values.
It gets a lot more complicated if you are looking for an answer matching the dimensions of your matrix because it looks like the dimensions of bucket are not fixed.
You can turn this bit of code into constant by adding an additional check outside the bucket foreach which if countOfLettersAccumulated is equal to
totalLettersIWantBack then you do a break.
I like #axiom's explanation of complexity analyze.
Just would like to add possible optimized solution.
UPD .push (O(1)) is faster that .concat (O(n^2))
also here is test Array push vs. concat
const bucket = [["e","f"],[],["j", 'm', 'b'],[],["p","q"]]
let totalLettersIWantBack = 4;
//I'm starting at the end of the bucket
function produceLetterArray(bucket, limit){
let result = [];
for(let i = bucket.length-1; i > 0 && result.length < totalLettersIWantBack; i--){
//previous version
//result = result.concat(bucket[i].slice(0, totalLettersIWantBack-result.length));
//faster version of merging array
Array.prototype.push.apply(result, bucket[i].slice(0, totalLettersIWantBack-result.length));
}
return result;
}
console.log(produceLetterArray(bucket, totalLettersIWantBack));

Speed differences in JavaScript functions finding the most common element in an array

I'm studying for an interview and have been working through some practice questions. The question is:
Find the most repeated integer in an array.
Here is the function I created and the one they created. They are appropriately named.
var arr = [3, 6, 6, 1, 5, 8, 9, 6, 6]
function mine(arr) {
arr.sort()
var count = 0;
var integer = 0;
var tempCount = 1;
var tempInteger = 0;
var prevInt = null
for (var i = 0; i < arr.length; i++) {
tempInteger = arr[i]
if (i > 0) {
prevInt = arr[i - 1]
}
if (prevInt == arr[i]) {
tempCount += 1
if (tempCount > count) {
count = tempCount
integer = tempInteger
}
} else {
tempCount = 1
}
}
console.log("most repeated is: " + integer)
}
function theirs(a) {
var count = 1,
tempCount;
var popular = a[0];
var temp = 0;
for (var i = 0; i < (a.length - 1); i++) {
temp = a[i];
tempCount = 0;
for (var j = 1; j < a.length; j++) {
if (temp == a[j])
tempCount++;
}
if (tempCount > count) {
popular = temp;
count = tempCount;
}
}
console.log("most repeated is: " + popular)
}
console.time("mine")
mine(arr)
console.timeEnd("mine")
console.time("theirs")
theirs(arr)
console.timeEnd("theirs")
These are the results:
most repeated is: 6
mine: 16.929ms
most repeated is: 6
theirs: 0.760ms
What makes my function slower than their?
My test results
I get the following results when I test (JSFiddle) it for a random array with 50 000 elements:
mine: 28.18 ms
theirs: 5374.69 ms
In other words, your algorithm seems to be much faster. That is expected.
Why is your algorithm faster?
You sort the array first, and then loop through it once. Firefox uses merge sort and Chrome uses a variant of quick sort (according to this question). Both take O(n*log(n)) time on average. Then you loop through the array, taking O(n) time. In total you get O(n*log(n)) + O(n), that can be simplified to just O(n*log(n)).
Their solution, on the other hand, have a nested loop where both the outer and inner loops itterate over all the elements. That should take O(n^2). In other words, it is slower.
Why does your test results differ?
So why does your test results differ from mine? I see a number of possibilities:
You used a to small sample. If you just used the nine numbers in your code, that is definately the case. When you use short arrays in the test, overheads (like running the console.log as suggested by Gundy in comments) dominate the time it takes. This can make the result appear completely random.
neuronaut suggests that it is related to the fact that their code operates on the array that is already sorted by your code. While that is a bad way of testing, I fail to see how it would affect the result.
Browser differences of some kind.
A note on .sort()
A further note: You should not use .sort() for sorting numbers, since it sorts things alphabetically. Instead, use .sort(function(a, b){return a-b}). Read more here.
A further note on the further note: In this particular case, just using .sort() might actually be smarter. Since you do not care about the sorting, only the grouping, it doesnt matter that it sort the numbers wrong. It will still group elements with the same value together. If it is faster without the comparison function (i suspect it is), then it makes sense to sort without one.
An even faster algorithm
You solved the problem in O(n*log(n)), but you can do it in just O(n). The algorithm to do that is quite intuitive. Loop through the array, and keep track of how many times each number appears. Then pick the number that appears the most times.
Lets say there are m different numbers in the array. Looping through the array takes O(n) and finding the max takes O(m). That gives you O(n) + O(m) that simplifies to O(n) since m < n.
This is the code:
function anders(arr) {
//Instead of an array we use an object and properties.
//It works like a dictionary in other languages.
var counts = new Object();
//Count how many of each number there is.
for(var i=0; i<arr.length; i++) {
//Make sure the property is defined.
if(typeof counts[arr[i]] === 'undefined')
counts[arr[i]] = 0;
//Increase the counter.
counts[arr[i]]++;
}
var max; //The number with the largest count.
var max_count = -1; //The largest count.
//Iterate through all of the properties of the counts object
//to find the number with the largerst count.
for (var num in counts) {
if (counts.hasOwnProperty(num)) {
if(counts[num] > max_count) {
max_count = counts[num];
max = num;
}
}
}
//Return the result.
return max;
}
Running this on a random array with 50 000 elements between 0 and 49 takes just 3.99 ms on my computer. In other words, it is the fastest. The backside is that you need O(m) memory to store how many time each number appears.
It looks like this isn't a fair test. When you run your function first, it sorts the array. This means their function ends up using already sorted data but doesn't suffer the time cost of performing the sort. I tried swapping the order in which the tests were run and got nearly identical timings:
console.time("theirs")
theirs(arr)
console.timeEnd("theirs")
console.time("mine")
mine(arr)
console.timeEnd("mine")
most repeated is: 6
theirs: 0.307ms
most repeated is: 6
mine: 0.366ms
Also, if you use two separate arrays you'll see that your function and theirs run in the same amount of time, approximately.
Lastly, see Anders' answer -- it demonstrates that larger data sets reveal your function's O(n*log(n)) + O(n) performance vs their function's O(n^2) performance.
Other answers here already do a great job of explaining why theirs is faster - and also how to optimize yours. Yours is actually better with large datasets (#Anders). I managed to optimize the theirs solution; maybe there's something useful here.
I can get consistently faster results by employing some basic JS micro-optimizations. These optimizations can also be applied to your original function, but I applied them to theirs.
Preincrementing is slightly faster than postincrementing, because the value does not need to be read into memory first
Reverse-while loops are massively faster (on my machine) than anything else I've tried, because JS is translated into opcodes, and guaranteeing >= 0 is very fast. For this test, my computer scored 514,271,438 ops/sec, while the next-fastest scored 198,959,074.
Cache the result of length - for larger arrays, this would make better more noticeably faster than theirs
Code:
function better(a) {
var top = a[0],
count = 0,
i = len = a.length - 1;
while (i--) {
var j = len,
temp = 0;
while (j--) {
if (a[j] == a[i]) ++temp;
}
if (temp > count) {
count = temp;
top = a[i];
}
}
console.log("most repeated is " + top);
}
[fiddle]
It's very similar, if not the same, to theirs, but with the above micro-optimizations.
Here are the results for running each function 500 times. The array is pre-sorted before any function is run, and the sort is removed from mine().
mine: 44.076ms
theirs: 35.473ms
better: 32.016ms

Alternatives to javascript function-based iteration (e.g. jQuery.each())

I've been watching Google Tech Talks' Speed Up Your Javascript and in talking about loops, the speaker mentions to stay away from function-based iterations such as jQuery.each() (among others, at about 24:05 in the video). He briefly explains why to avoid them which makes sense, but admittedly I don't quite understand what an alternative would be. Say, in the case I want to iterate through a column of table cells and use the value to manipulate the adjacent cell's value (just a quick example). Can anyone explain and give an example of an alternative to function-based iteration?
Just a simple for loop should be quicker if you need to loop.
var l = collection.length;
for (var i = 0; i<l; i++) {
//do stuff
}
But, just because it's quicker doesn't mean it's always important that it is so.
This runs at the client, not the server, so you don't need to worry about scaling with the number of users, and if it's quick with a .each(), then leave it. But, if that's slow, a for loop could speed it up.
Ye olde for-loop
It seems to me that it would be case that function-based iteration would be slightly slower because of the 1) the overhead of function itself, 2) the overhead of the callback function being created and executed N times, and 3) the extra depth in the scope chain. However, I thought I'd do a quick benchmark just for kicks. Turns out, at least in my simple test-case, that function-based iteration was faster. Here's the code and the findings
Test benchmark code
// Function based iteration method
var forEach = function(_a, callback) {
for ( var _i=0; _i<_a.length; _i++ ) {
callback(_a[_i], _i);
}
}
// Generate a big ass array with numbers 0..N
var a = [], LENGTH = 1024 * 10;
for ( var i=0; i<LENGTH; i++ ) { a.push(i); }
console.log("Array length: %d", LENGTH);
// Test 1: function-based iteration
console.info("function-base iteration");
var end1 = 0, start1 = new Date().getTime();
var sum1 = 0;
forEach(a, function(value, index) { sum1 += value; });
end1 = new Date().getTime();
console.log("Time: %sms; Sum: %d", end1 - start1, sum1);
// Test 2: normal for-loop iteration
console.info("Normal for-loop");
var end2 = 0, start2 = new Date().getTime();
var sum2 = 0;
for (var j=0; j<a.length; j++) { sum2 += a[j]; }
end2 = new Date().getTime();
console.log("Time: %sms; Sum: %d", end2 - start2, sum2);
Each test just sums the array which is simplistic, but something that can be realistically seen in some sort of real life scenario.
Results for FF 3.5
Array length: 10240
function-base iteration
Time: 9ms; Sum: 52423680
Normal for-loop
Time: 22ms; Sum: 52423680
Turns out that a basic for iteration was faster in this test case. I haven't watched the video yet, but I'll give it a look and see if he's differing somewhere that would make function-based iterations slower.
Edit: This is by no means the end-all, be-all and is only the results of one engine and one test-case. I fully expected the results to be the other way around (function-based iteration being slower), but it is interesting to see how certain browsers have made optimizations (which may or may not be specifically aimed at this style of JavaScript) so that the opposite is true.
The fastest possible way to iterate is to cut down on stuff you do within the loop. Bring stuff out of the iteration and minimise lookups/increments within the loop, e.g.
var i = arr.length;
while (i--) {
console.log("Item no "+i+" is "+arr[i]);
}
NB! By testing on the latest Safari (with WebKit nightly), Chrome and Firefox, you'll find that it really doesn't matter which kind of the loop you're choosing if it's not for each or for in (or even worse, any derived functions built upon them).
Also, what turns out, is that the following for loop is very slightly even faster than the above option:
var l = arr.length;
for (var i=l; i--;) {
console.log("Item no "+i+" is "+arr[i]);
}
If the order of looping doesn't matter, the following should be fastest as you only need a single local variable; also, decrementing the counter and bounds-checking are done with a single statement:
var i = foo.length;
if(i) do { // check for i != 0
// do stuff with `foo[i]`
} while(--i);
What I normally use is the following:
for(var i = foo.length; i--; ) {
// do stuff with `foo[i]`
}
It's potentially slower than the previous version (post- vs pre-decrement, for vs while), but more readable.

Categories