Sorting an array and indexOf() runtime - javascript

Is it safe to assume that array.indexOf() does a linear search from the beginning of the array?
So if I often search for the biggest value, will sorting the array before calling indexOf make it run even faster?
Notes:
I want to sort only once and search many times.
"biggest value" is actually the most popular search key which is a string.

Yes, indexOf searches linearly from the first element to the last. Whether sorting first pays off depends on the cost of the sort (typically O(N log N), e.g. quicksort) versus the O(N) linear searches you save. I would suggest writing a simple test with a random number of values and seeing how the performance behaves.
Of course it depends on your data structure. In Java's ArrayList, for example, indexOf is a plain linear scan:
public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i] == null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}
Perhaps a TreeSet can also help you, since it stays ordered all the time.
In the end I would say: it depends on your data container, on how the ordering or search is implemented, and on where in the whole process the performance is needed.
As a comment points out, with plain arrays you can use binary search once the array is sorted, which brings each lookup down to O(log N).
So in the end it is not only a question of the algorithm's performance, but also of where in the process you need that performance. For example, if you have lots of time when adding values and none when retrieving them, a sorted structure such as a tree collection can improve things very well. If you need more performance when adding things, it is perhaps the other way around.
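For the sort-once, search-many-times case described in the question, a minimal JavaScript sketch might look like this. Plain JS arrays have no built-in binary search, so the helper below is hand-rolled, and the index it returns refers to the sorted array, not the original one:
// Sort once up front: O(n log n)
var keys = ["delta", "alpha", "charlie", "bravo"].sort();

// Hand-rolled binary search: O(log n) per lookup
function binarySearch(sortedArr, target) {
    var lo = 0, hi = sortedArr.length - 1;
    while (lo <= hi) {
        var mid = (lo + hi) >> 1;
        if (sortedArr[mid] === target) return mid;
        if (sortedArr[mid] < target) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1; // not found, mirroring indexOf's convention
}

binarySearch(keys, "charlie"); // 2
If the original positions matter, store each value's index alongside it before sorting.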

Related

Time Complexity Hash Table

I implemented a simple algorithm in 2 ways. One using indexOf and other one using Hash Table.
The problem:
Given an arbitrary ransom note string and another string containing letters from all the magazines, write a function that will return true if the ransom note can be constructed from the magazines; otherwise, it will return false.
First one. Is it O(N^2) time complexity, because I have N letters in the ransomNote and indexOf can do N steps for each of them?
var canConstruct = function(ransomNote, magazine) {
    if (magazine.length < ransomNote.length) return false;
    const arr = magazine.split("");
    for (let i = 0; i < ransomNote.length; i++) {
        if (arr.indexOf(ransomNote[i]) < 0)
            return false;
        const index = arr.indexOf(ransomNote[i]);
        arr.splice(index, 1);
    }
    return true;
};
Second one. What is the time complexity? Does the hash table make it O(N)?
var canConstruct = function(ransomNote, magazine) {
    if (magazine.length < ransomNote.length) return false;
    const map = new Map();
    for (let i = 0; i < magazine.length; i++) {
        if (map.has(magazine[i]))
            map.set(magazine[i], map.get(magazine[i]) + 1);
        else
            map.set(magazine[i], 1);
    }
    for (let i = 0; i < ransomNote.length; i++) {
        if (!map.has(ransomNote[i]))
            return false;
        else {
            const x = map.get(ransomNote[i]) - 1;
            if (x > 0)
                map.set(ransomNote[i], x);
            else
                map.delete(ransomNote[i]);
        }
    }
    return true;
};
Thanks
FIRST solution
Well, one thing you have to take into consideration, especially in the first solution, is that the split, splice and indexOf methods all have their own time complexity.
Let's say you have m letters in magazine. When you split it into an array, you already spend O(m) time there (and O(m) space, since you store it all in a new array of size m).
Now you enter a for loop that runs n times (where n is the number of letters in ransomNote). Each iteration calls indexOf, which itself takes O(m) every time it's called, so right then and there you have O(m * n) time complexity. splice is O(m) per call as well. You can see how quickly your time complexity adds up.
All told it comes to roughly O(3 * m * n), which simplifies to O(m * n) time complexity. I would strongly suggest not calling indexOf twice for the same letter; call it once and store its result somewhere. You will either have an index or -1, meaning it wasn't found.
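A minimal sketch of that suggestion, with the result of indexOf stored and reused instead of computed twice (the asymptotic complexity stays O(m * n), it just avoids the duplicate scan):
var canConstruct = function(ransomNote, magazine) {
    if (magazine.length < ransomNote.length) return false;
    const arr = magazine.split("");
    for (let i = 0; i < ransomNote.length; i++) {
        const index = arr.indexOf(ransomNote[i]); // one scan instead of two
        if (index < 0) return false;
        arr.splice(index, 1); // O(m) shift of the remaining elements
    }
    return true;
};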
SECOND solution
Much better. Here you populate a hash map (so some extra memory is used, but considering the first solution also split the magazine and stored the result, it should be roughly the same).
Then you simply make one loop over the ransomNote and look each letter up in the hash map. Finding a letter in a map is O(1) time, so it's really useful for this type of algorithm.
I would argue the complexity ends up as O(n + m) (one pass over the magazine, one over the note), so much better than the first one.
Hope I made sense to you. If you want, we can dive a bit deeper into the space analysis in the comments if you respond.
The version with indexOf() is O(N*M), where N is the length of the ransom note and M is the length of the magazine. You perform up to N searches, and each search is linear through the magazine array, which holds M characters. Also, array.splice() is O(M) because it has to copy all the array elements after index down to the lower indexes.
Hash table access and updates are generally considered to be O(1). The second version performs M updates while building the map and then up to N lookups and updates, so the overall complexity is O(N + M).

Array contains anything other than 0

I have an array of numbers with 64 indexes (it's canvas image data).
I want to know if my array contains only zero's or anything other than zero.
We can return a boolean upon the first encounter of any number greater than zero (even if the very last index is non-zero and all the others are zero, we should return true).
What is the most efficient way to determine this?
Of course, we could loop over our array (focus on the testImageData function):
// Setup
var imgData = {
    data: new Array(64)
};
imgData.data.fill(0);

// Set last pixel to black
imgData.data[imgData.data.length - 1] = 255;

// The part in question...
function testImageData(img_data) {
    var retval = false;
    for (var i = 0; i < img_data.data.length; i++) {
        if (img_data.data[i] > 0) {
            retval = true;
            break;
        }
    }
    return retval;
}

var result = testImageData(imgData);
...but this could take a while if my array were bigger.
Is there a more efficient way to test if any index in the array is greater than zero?
I am open to answers using lodash, though I am not using lodash in this project. I would rather the answer be native JavaScript, either ES5 or ES6. I'm going to ignore any jQuery answers, just saying...
Update
I set up a test for various ways to check for a non-zero value in an array, and the results were interesting.
Here is the JSPerf Link
Note that the Array.some test was much slower than using for (index) and even for-in. The fastest, of course, was the indexed for loop: for (let i = 0; i < arr.length; i++)....
You should note that I also tested a Regex solution, just to see how it compared. If you run the tests, you will find that the Regex solution is much, much slower (not surprising), but still very interesting.
I would like to see if there is a solution that could be accomplished using bitwise operators. If you feel up to it, I would like to see your approach.
Your for loop is the fastest way on Chrome 64 with Windows 10.
I've tested against two other options, here is the link to the test so you can run them on your environment.
My results are:
// 10776 operations per second (the best)
for (let i = 0; i < arr.length; i++) {
    if (arr[i] !== 0) {
        break
    }
}

// 4131 operations per second
for (const n of arr) {
    if (n !== 0) {
        break
    }
}

// 821 operations per second (the worst)
arr.some(x => x)
There is no faster way than looping through the elements of the array. Logically, in the worst-case scenario the last pixel in your array is the only black one, so you have to check all of them. The best algorithm can therefore only have an O(n) runtime. The best you can do is write a loop that breaks early upon finding a non-zero pixel.
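For what it's worth, the same early-exit check can be written with an explicit predicate passed to some, though the benchmark above measured .some() as slower than the plain for loop:
// Stops at the first non-zero element, just like the early-breaking loop;
// x => x !== 0 is more explicit than relying on truthiness (x => x)
var hasNonZero = imgData.data.some(x => x !== 0);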

Is it possible to have quadratic time complexity without nested loops?

It was going so well. I thought I had my head around time complexity. I was having a play on codility and used the following algorithm to solve one of their problems. I am aware there are better solutions to this problem (permutation check) - but I simply don't understand how something without nested loops could have a time complexity of O(N^2). I was under the impression that the associative arrays in Javascript are like hashes and are very quick, and wouldn't be implemented as time-consuming loops.
Here is the example code
function solution(A) {
    // write your code in JavaScript (Node.js)
    var dict = {};
    for (var i = 1; i < A.length + 1; i++) {
        dict[i] = 1;
    }
    for (var j = 0; j < A.length; j++) {
        delete dict[A[j]];
    }
    var keyslength = Object.keys(dict).length;
    return keyslength === 0 ? 1 : 0;
}
and here is the verdict
There must be a bug in their tool that you should report: this code has a complexity of O(n).
Believe me I am someone on the Internet.
On my machine:
console.time(1000);
solution(new Array(1000));
console.timeEnd(1000);
//about 0.4ms
console.time(10000);
solution(new Array(10000));
console.timeEnd(10000);
// about 4ms
Update: To be pedantic (sic), I still need a third data point to show it's linear
console.time(100000);
solution(new Array(100000));
console.timeEnd(100000);
// about 45ms, well let's say 40ms, that is not a proof anyway
Is it possible to have quadratic time complexity without nested loops? Yes. Consider this:
function getTheLengthOfAListSquared(list) {
    for (var i = 0; i < list.length * list.length; i++) { }
    return i;
}
As for that particular code sample, it does seem to be O(n) as @floribon says, given that JavaScript object lookup should be constant time.
Remember that making an algorithm that takes an arbitrary function and determines whether that function will complete at all is provably impossible (halting problem), let alone determining complexity. Writing a tool to statically determine the complexity of anything but the most simple programs would be extremely difficult and this tool's result demonstrates that.
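For illustration, a single visible loop can also turn quadratic when a library call hides a linear scan inside it; a sketch (countDuplicates is a made-up example, not from the question):
// indexOf performs a hidden O(n) scan on every iteration,
// so this single loop is O(n^2) overall
function countDuplicates(list) {
    var dups = 0;
    for (var i = 0; i < list.length; i++) {
        if (list.indexOf(list[i]) !== i) dups++;
    }
    return dups;
}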

Why use sorting maps on arrays? How is it better in some instances?

I'm trying to learn about array sorting. It seems pretty straightforward. But on the mozilla site, I ran across a section discussing sorting maps (about three-quarters down the page).
The compareFunction can be invoked multiple times per element within the array. Depending on the compareFunction's nature, this may yield a high overhead. The more work a compareFunction does and the more elements there are to sort, the wiser it may be to consider using a map for sorting.
The example given is this:
// the array to be sorted
var list = ["Delta", "alpha", "CHARLIE", "bravo"];

// temporary holder of position and sort-value
var map = [];

// container for the resulting order
var result = [];

// walk original array to map values and positions
for (var i = 0, length = list.length; i < length; i++) {
    map.push({
        // remember the index within the original array
        index: i,
        // evaluate the value to sort
        value: list[i].toLowerCase()
    });
}

// sorting the map containing the reduced values
map.sort(function(a, b) {
    return a.value > b.value ? 1 : -1;
});

// copy values in right order
for (var i = 0, length = map.length; i < length; i++) {
    result.push(list[map[i].index]);
}

// print sorted list
print(result);
I don't understand a couple of things. To wit: What does it mean that "the compareFunction can be invoked multiple times per element within the array"? Can someone show me an example of that? Secondly, I understand what's being done in the example, but I don't understand the potential "high[er] overhead" of the compareFunction. The example shown here seems really straightforward, and at first glance I'd think that mapping the array into an object, sorting its values, then putting it back into an array would take much more overhead. I understand this is a simple example, probably intended only to show the procedure. But can someone give an example of when it would be lower overhead to map like this? It seems like a lot more work.
Thanks!
When sorting a list, an item isn't just compared to one other item, it may need to be compared to several other items. Some of the items may even have to be compared to all other items.
Let's see how many comparisons there actually are when sorting an array:
var list = ["Delta", "alpha", "CHARLIE", "bravo", "orch", "worm", "tower"];
var o = [];
for (var i = 0; i < list.length; i++) {
    o.push({
        value: list[i],
        cnt: 0
    });
}
o.sort(function(x, y) {
    x.cnt++;
    y.cnt++;
    return x.value == y.value ? 0 : x.value < y.value ? -1 : 1;
});
console.log(o);
Result:
[
    { value: "CHARLIE", cnt: 3 },
    { value: "Delta", cnt: 3 },
    { value: "alpha", cnt: 4 },
    { value: "bravo", cnt: 3 },
    { value: "orch", cnt: 3 },
    { value: "tower", cnt: 7 },
    { value: "worm", cnt: 3 }
]
(Fiddle: http://jsfiddle.net/Guffa/hC6rV/)
As you see, each item was compared to several other items. The string "tower" even had more comparisons than there are other strings, which means that it was compared to at least one other string at least twice.
If the comparison needs some calculation before the values can be compared (like the toLowerCase method in the example), then that calculation will be done several times. By caching the values after that calculation, it will be done only once for each item.
The primary time saving in that example comes from avoiding calls to toLowerCase() in the comparison function. The comparison function is called by the sort code each time a pair of elements needs to be compared, so that's a savings of a lot of function calls. The cost of building and un-building the map is worth it for large arrays.
That the comparison function may be called more than once per element is a natural implication of how sorting works. If only one comparison per element were necessary, it would be a linear-time process.
Edit: the number of comparisons made will be roughly proportional to the length of the array times the base-2 log of the length. For a 1000-element array, that's proportional to 10,000 comparisons (probably closer to 15,000, depending on the actual sort algorithm). Saving 20,000 unnecessary function calls is worth the 2,000 operations necessary to build and un-build the sort map.
This is called the “decorate - sort - undecorate” pattern (you can find a nice explanation on Wikipedia).
The idea is that a comparison-based sort will have to call the comparison function at least n - 1 times (where n is the number of items in the list), as that is the number of comparisons needed just to check that the array is already sorted. Usually the number of comparisons will be larger than that (O(n log n) if you are using a good algorithm), and by the pigeonhole principle at least one value will then be passed to the comparison function more than once.
If your comparison function does some expensive processing before comparing the two values, you can reduce the cost by doing the expensive part first and storing the result for each value (since you know that even in the best scenario you'll have to do that processing). Then, when sorting, you use a cheaper comparison function that only compares those cached outputs.
In this example, the "expensive" part is converting the string to lowercase.
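With newer JavaScript, the same decorate-sort-undecorate idea can be written more compactly than the Mozilla example; a sketch:
var list = ["Delta", "alpha", "CHARLIE", "bravo"];

// decorate: compute the expensive key once per element
var decorated = list.map(function(value) {
    return { value: value, key: value.toLowerCase() };
});

// sort: the comparison function only touches the cached keys
decorated.sort(function(a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
});

// undecorate: strip the keys back off
var result = decorated.map(function(item) {
    return item.value;
});
// result: ["alpha", "bravo", "CHARLIE", "Delta"]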
Think of this like caching. It's simply saying that you should not do lots of calculation in the compare function, because you will be calculating the same value over and over.
What does it mean, "The compareFunction can be invoked multiple times per element within the array"?
It means exactly what it says. Let's say you have three items: A, B and C. They need to be sorted by the result of an expensive computation on each one. The comparisons might be done like this:
compare(A) to compare(B)
compare(A) to compare(C)
compare(B) to compare(C)
So here we have 3 values, but the expensive compare() computation was executed 6 times, twice per comparison. Using a temporary array to cache things ensures we do the calculation only once per item and can compare those cached results.
Secondly, I understand what's being done in the example, but I don't understand the potential "high[er] overhead" of the compareFunction.
What if compare() does a database fetch (comparing the counts of matching rows)? Or a complex math calculation (a factorial, recursive Fibonacci, or iteration over a large number of items)? These are the sorts of things you don't want to do more than once.
I would say most of the time it's fine to leave really simple/fast calculations inline. Don't over-optimize. But if you need to do anything complex or slow in the comparison, you have to be smarter about it.
To respond to your first question: why would the compareFunction be called multiple times per element in the array?
Sorting an array almost always requires more than N comparisons, where N is the size of the array (unless the array is already sorted). Thus, each element in your array may be compared against other elements up to N times (bubble sort requires up to N^2 comparisons in total). The compareFunction you provide is used every time two elements need to be ranked as less/equal/greater, and thus will be called multiple times per element in the array.
A simple response to your second question: why would there be potentially higher overhead for a compareFunction?
Say your compareFunction does a lot of unnecessary work while comparing two elements of the array. That work is repeated on every comparison, which can make the sort much slower, and thus an expensive compareFunction causes higher overhead.

What's the most efficient way of filtering an array based on the contents of another array?

Say I have two arrays, items and removeItems and I wanted any values found in removeItems to be removed from items.
The brute force mechanism would probably be:
var animals = ["cow", "dog", "frog", "cat", "whale", "salmon", "zebra", "tuna"];
var nonMammals = ["salmon", "frog", "tuna", "spider"];
var mammals = [];
var isMammal;

for (var i = 0; i < animals.length; i++) {
    isMammal = true;
    for (var j = 0; j < nonMammals.length; j++) {
        if (nonMammals[j] === animals[i]) {
            isMammal = false;
            break;
        }
    }
    if (isMammal) {
        mammals.push(animals[i]);
    }
}
This is what? O(N^2)? Is there a more efficient way?
Basically what you want to do is efficiently compute the set difference S \ T. The (asymptotically) fastest way I know is to put T into a hashmap (that takes |T| steps), then go over each s in S (|S| steps), checking whether s is in T (which is O(1)). So you end up with O(|T| + |S|) steps.
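In modern JavaScript (ES6 and later, which may postdate this question) that hashmap can simply be a Set; a minimal sketch of the O(|T| + |S|) approach:
var animals = ["cow", "dog", "frog", "cat", "whale", "salmon", "zebra", "tuna"];
var nonMammals = ["salmon", "frog", "tuna", "spider"];

// build the lookup structure once: O(|T|)
var exclude = new Set(nonMammals);

// one pass over S with O(1) membership tests: O(|S|)
var mammals = animals.filter(function(a) { return !exclude.has(a); });
// ["cow", "dog", "cat", "whale", "zebra"]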
That's actually O(M * N).
Probably you could do better by sorting the animals array first, then doing a binary search for each item. You'll be able to reduce it to O(N log N), well, that is, if log N < M anyway.
Anyway, if you're working with JS and it runs client-side, just try to keep the amount of data to a minimum, or your users' browsers will yell at you with every request.
With jQuery it's pretty easy:
function is_mammal(animal)
{
    return $.inArray(animal, nonMammals) == -1;
}

mammals = $.grep(animals, is_mammal);
See docs for $.grep and $.inArray.
