I'm not good at determining time and memory complexities and would appreciate it if someone could help me out.
I have an algorithm that returns data from a cache, or fetches the data if it's not in the cache. I am not sure what its time and memory complexities would be.
What am I trying to figure out?
What its time and memory complexities are, and why.
What have I done before posting a question on SO?
I have read this, this, this and many more links.
What have I done so far?
As I understand from the articles and questions I have read, all my loop operations are linear. I have 3 loops, so the complexity is N + N + N, which simplifies to N. I therefore think the complexity of getData is O(n). Space complexity is harder. As I understand it, for simple data structures it is often equal to the time complexity, so I think the space complexity is also O(n). However, I have a cache object (a hash table) that saves every response from fetchData, and I don't understand how to account for it in the space complexity.
Function
https://jsfiddle.net/30k42hrf/9/
or
const cache = {};
const fetchData = (key, arrayOfKeys) => {
const result = [];
for (let i = 0; i < arrayOfKeys.length; i++) {
result.push({
isin: arrayOfKeys[i],
data: Math.random()
});
}
return result;
}
const getData = (key, arrayOfKeys) => {
if (arrayOfKeys.length < 1) return null;
const result = [];
const keysToFetch = [];
for (let i = 0; i < arrayOfKeys.length; i++) {
const isin = arrayOfKeys[i];
if (key in cache && isin in cache[key]) {
result.push({
isin,
data: cache[key][isin]
});
} else {
keysToFetch.push(isin);
}
}
if (keysToFetch.length > 0) {
const response = fetchData(key, keysToFetch);
for (let i = 0; i < response.length; i++) {
const { isin, data } = response[i];
if (cache[key]) {
cache[key][isin] = data;
} else {
cache[key] = { [isin]: data }
}
}
return [...result, ...response];
}
return result;
}
// getData('123', ['a', 'b'])
Thanks
Time and space complexity describe how much more time or space an algorithm needs as its input grows. An intuitive way to see it: imagine the input size is 10 and think about the time and space required, then think about it again for an input size of 20, and then for an input size of 100.
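To make the doubling intuition concrete, here is a minimal sketch that counts loop steps instead of measuring time. The helper names `countLinear` and `countQuadratic` are made up for illustration and do not come from the question's code.

```javascript
// Hypothetical helpers: count the steps of a single loop and of a nested loop.
const countLinear = (n) => {
  let steps = 0;
  for (let i = 0; i < n; i++) steps++;
  return steps;
};

const countQuadratic = (n) => {
  let steps = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) steps++;
  }
  return steps;
};

// Print the step counts for the input sizes mentioned above.
for (const n of [10, 20, 100]) {
  console.log(n, countLinear(n), countQuadratic(n));
}
```

Doubling n doubles the step count of the single loop and quadruples that of the nested loop, which is what O(n) and O(n^2) express.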
I am not entirely clear about your code, but for a cache in general, the average time complexity of a lookup is O(1): once an item is in the cache, retrieving it is always O(1). Consider retrieving the same item 1 million times; you only need to fetch and store it once.
The space complexity is O(n), because in the end the cache has to store every distinct item, which is n items.
In the worst case, the first retrieval of an item is slower, because it is not in the cache yet and has to be fetched.
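These points can be sketched with a small instrumented cache. This is an illustration under assumptions, not the code from the question: `slowLookup`, `cachedLookup`, `lookupCache`, and `fetchCount` are hypothetical names, and a Map stands in for the cache object.

```javascript
// Hypothetical stand-in for an expensive fetch; counts how often it runs.
let fetchCount = 0;
const slowLookup = (isin) => {
  fetchCount++;
  return isin.toUpperCase();
};

const lookupCache = new Map();
const cachedLookup = (isin) => {
  if (!lookupCache.has(isin)) {
    // First retrieval: pay for the fetch and store the result.
    lookupCache.set(isin, slowLookup(isin));
  }
  // Every later retrieval is an average O(1) Map lookup.
  return lookupCache.get(isin);
};

// Read the same key many times, then one other key.
for (let i = 0; i < 1000; i++) cachedLookup('a');
cachedLookup('b');

console.log(fetchCount);       // 2 (one real fetch per distinct key)
console.log(lookupCache.size); // 2 (one stored entry per distinct key)
```

The cache grows only with the number of distinct keys requested, no matter how many times each one is read.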
Related
I've been reading conflicting answers about modern javascript engines' time complexity when it comes to sets vs arrays in javascript.
I completed the demo task of codility, which is a simple assignment to find a solution for the following: given an array A of N integers, return the smallest positive integer (greater than 0) that does not occur in A.
For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.
My first solution was:
const solution = arr => {
for(let int = 1;;int++) {
if (!arr.includes(int)) {
return int;
}
}
}
Now, the weird thing is that Codility says this solution has a time complexity of O(n**2) (they prefer a solution of complexity O(n)). As far as I know, array.prototype.includes is a linear search (https://tc39.es/ecma262/#sec-array.prototype.includes), meaning it should have an O(n) time complexity.
If I enter a different solution, using a Set, I get the full score:
const solution = arr => {
const set = new Set(arr);
let i = 1;
while (set.has(i)) {
i++;
}
return i;
}
Codility says this apparently has a time complexity of O(N) or O(N * log(N)).
Is this correct? Is array.prototype.includes in fact O(n**2) instead of O(n)?
Lastly, I'm a bit confused as to why Set.has() is preferred: in my console performance tests, Array.includes() consistently outperforms the solution that first creates a Set and then looks values up in it, as can be seen in the following snippet.
const rand = (size) => [...Array(size)].map(() => Math.floor(Math.random() * size));
const small = rand(100);
const medium = rand(5000);
const large = rand(100000);
const solution1 = arr => {
console.time('Array.includes');
for(let int = 1;;int++) {
if (!arr.includes(int)) {
console.timeEnd('Array.includes');
return int;
}
}
}
const solution2 = arr => {
console.time('Set.has');
const set = new Set(arr);
let i = 1;
while (set.has(i)) {
i++;
}
console.timeEnd('Set.has');
return i;
}
console.log('Testing small array:');
solution1(small);
solution2(small);
console.log('Testing medium array:');
solution1(medium);
solution2(medium);
console.log('Testing large array:');
solution1(large);
solution2(large);
If a set lookup has better time complexity (if that's true) and is preferred by codility, why are my performance tests favoring the array.prototype.includes solution?
I know this is an old question, but I was double checking the data. I too assumed Set.has would be O(1) or O(log N), but in my first test, it appeared to be O(N). The specs for these functions hint as much, but are quite hard to decipher: https://tc39.es/ecma262/#sec-array.prototype.includes https://tc39.es/ecma262/#sec-set.prototype.has Elsewhere, though, they also say that Set.has must be sublinear, and I believe modern implementations are.
Empirically, Set.has demonstrated linear performance when I ran it in some code playgrounds, but in real environments like Node and Chrome there were no surprises. I'm not sure what the playground was running on the back end, but perhaps a Set polyfill was used. So be careful!
Here are my test cases, trimmed down to remove the randomness:
const makeArray = (size) => [...Array(size)].map(() => size);
const small = makeArray(1000000);
const medium = makeArray(10000000);
const large = makeArray(100000000);
const solution1 = arr => {
console.time('Array.includes');
arr.includes(arr.length - 1)
console.timeEnd('Array.includes');
}
const solution2 = arr => {
const set = new Set(arr)
console.time('Set.has');
set.has(arr.length-1)
console.timeEnd('Set.has');
}
console.log('** Testing small array:');
solution1(small);
solution2(small);
console.log('** Testing medium array:');
solution1(medium);
solution2(medium);
console.log('** Testing large array:');
solution1(large);
solution2(large);
In Chrome, though:
** Testing small array:
VM183:10 Array.includes: 1.371826171875 ms
VM183:17 Set.has: 0.005859375 ms
VM183:25 ** Testing medium array:
VM183:10 Array.includes: 14.32568359375 ms
VM183:17 Set.has: 0.009765625 ms
VM183:28 ** Testing large array:
VM183:10 Array.includes: 115.695068359375 ms
VM183:17 Set.has: 0.0048828125 ms
In Node 16.5:
Testing small array:
Array.includes: 1.223ms
Set.has: 0.01ms
Testing medium array:
Array.includes: 11.41ms
Set.has: 0.054ms
Testing large array:
Array.includes: 127.297ms
Set.has: 0.047ms
So, yeah, Arrays are definitely linear, and Sets are much faster.
That comparison is not entirely fair, because in the function where you use the Set, the Array needs to be converted to a Set first, which takes some time.
Have a look at the results below when the conversion is excluded from the timing. I have updated the solution2 function to receive a Set and changed the while loop to a for loop, for the sake of direct comparison.
You may notice that for a small array, Set might be slower. That difference is negligible, because time complexity only really comes into effect for a large (significant) n.
Also note that Array.includes is indeed O(n), but because it is called inside a for loop that in the worst case runs up to n times, the solution as a whole has a time complexity of O(n^2).
const rand = (size) => [...Array(size)].map(() => Math.floor(Math.random() * size));
const small = rand(100);
const medium = rand(5000);
const large = rand(100000);
const solution1 = arr => {
console.time('Array.includes');
for (let int = 1;;int++) {
if (!arr.includes(int)) {
console.timeEnd('Array.includes');
return int;
}
}
}
const solution2 = set => {
console.time('Set.has');
for (let i = 1;;i++) {
if (!set.has(i)) {
console.timeEnd('Set.has');
return i
}
}
}
console.log('Testing small array:');
solution1(small);
solution2(new Set(small));
console.log('Testing medium array:');
solution1(medium);
solution2(new Set(medium));
console.log('Testing large array:');
solution1(large);
solution2(new Set(large));
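The O(n^2) bound for the Array.includes solution can also be checked by counting comparisons rather than timing. The sketch below is an illustration only: `countingIncludes` is a hypothetical hand-written linear search standing in for Array.prototype.includes, and `firstMissing` mirrors the structure of the first solution.

```javascript
// Hypothetical linear search that records how many elements it inspects.
const countingIncludes = (arr, value, counter) => {
  for (const item of arr) {
    counter.comparisons++;
    if (item === value) return true;
  }
  return false;
};

// Same shape as the Array.includes solution, but also returns the comparison count.
const firstMissing = (arr) => {
  const counter = { comparisons: 0 };
  for (let int = 1; ; int++) {
    if (!countingIncludes(arr, int, counter)) {
      return { answer: int, comparisons: counter.comparisons };
    }
  }
};

// Worst case: the array holds 1..n in order, so every value up to n is found.
const worstCase = (n) => Array.from({ length: n }, (_, i) => i + 1);

for (const n of [100, 200, 400]) {
  console.log(n, firstMissing(worstCase(n)).comparisons);
}
```

For this input the count is n(n+1)/2 + n, so doubling n roughly quadruples the work.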
I am taking a course on algorithms and big O on Udemy.
I learned that nested loops are bad for performance. I wrote a solution to a LeetCode challenge before starting this course, and I wanted to try it again using some things I learned in the course. I was expecting it to be much faster than last time, but it ran at the same speed. Can someone explain to me where I'm going wrong and why there's no improvement in the performance of this function?
Challenge: function with array and target integer arguments, find the two integers from the array whose sum is the target.
New code: Time: 212ms
var twoSum = function(nums, target) {
let right = nums.length - 1;
let left = 0;
// loop as long as left is a valid index
while (left < nums.length) {
if (nums[left] + nums[right] === target) {
return [right, left];
}
if (right > left + 1) {
right--;
} else {
left++;
right = nums.length - 1;
}
}
};
Old code: Time: 204ms
var twoSum = function(nums, target) {
for (let i = 0; i < nums.length; i++) {
for (let ii = 0; ii < nums.length; ii++) {
if (i !== ii && nums[i] + nums[ii] === target) {
return [i, ii];
break;
}
}
}
};
Big-O is purely theoretical, whereas LeetCode's benchmarking is practical; its measurements are also highly inaccurate and unreliable, so you can safely ignore them. Note, too, that your new code still compares every pair of elements, so it is still O(n^2) even though it uses a single loop. A hash map brings this down to O(n) on average:
var twoSum = function(nums, target) {
let numsMap = {};
for(let index = 0; index < nums.length; index++) {
const num = nums[index];
if(numsMap[target - num] !== undefined) {
return [numsMap[target - num], index];
}
numsMap[num] = index;
}
return [];
}
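To see why collapsing two loops into one does not change the complexity, the sketch below replays the pointer movement of the question's single-loop version and counts iterations. `countSingleLoopSteps` is a hypothetical helper; it assumes no pair matches, so the loop never returns early.

```javascript
// Replays the left/right pointer movement of the single-loop twoSum
// for an array of length n and counts how many iterations it performs.
const countSingleLoopSteps = (n) => {
  let left = 0;
  let right = n - 1;
  let steps = 0;
  while (left < n) {
    steps++;
    if (right > left + 1) {
      right--;
    } else {
      left++;
      right = n - 1;
    }
  }
  return steps;
};

for (const n of [10, 100, 1000]) {
  console.log(n, countSingleLoopSteps(n));
}
```

The count grows as n(n-1)/2 + 1, which is quadratic, just like the nested loops.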
References
For additional details, you can see the Discussion Board. There are plenty of accepted solutions in a variety of languages, with explanations, efficient algorithms, and asymptotic time/space complexity analysis in there.
If you are preparing for interviews:
We would want to write bug-free and clean code based on standards and conventions (e.g., for C, C++, Java, C#, Python, JavaScript, Go, and Rust). Overall, we would like to avoid anything that might become controversial in interviews.
There are also other similar platforms, which you might have to become familiar with, in case you'd be interviewing with specific companies that use those platforms.
If you are practicing for contests:
Just code as fast as you can; almost everything else is trivial.
For easy questions, brute force algorithms usually get accepted. For interviews, brute force is less desirable, especially if the question is easy-level.
For medium and hard questions, about 90% of the time, brute force algorithms fail mostly with Time Limit Exceeded (TLE) and less with Memory Limit Exceeded (MLE) errors.
Contestants are ranked based on an algorithm explained here.
I've got a data set that's several hundred elements long. I need to loop through the arrays and objects and determine if the data in them is less than a certain number (in my case, 0). If it is, I need to remove all those data points which are less than zero from the data set.
I've tried .pop and .slice but I'm not implementing them correctly. I was trying to push the bad data into its own array, leaving me with only the good data left.
Here's my JS
for (var i = 0; i < data.length; i++) {
if (data[i].high < 0) {
console.log(data[i].high)
var badData = [];
badData.push(data.pop(data[i].high));
console.log(data[i].high)
}
}
I'd go with .filter():
const result = data.filter(row => row.high >= 0); // keep everything that is not below zero
In case you need the bad results too.
const { good, bad } = data.reduce((acc, row) => {
const identifier = row.high >= 0 ? 'good' : 'bad'; // only values below zero are bad
acc[identifier].push(row);
return acc;
}, { bad: [], good: [] });
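If the original array has to be modified in place, as the question's loop attempts, one option is to walk it backwards and remove entries with splice. This is a sketch assuming each row has a numeric `high` property; `removeNegatives` and the sample `rows` are hypothetical.

```javascript
// Removes rows whose `high` is below zero from `data` itself
// and returns the removed rows in their original order.
const removeNegatives = (data) => {
  const removed = [];
  // Iterate backwards so that splice does not shift indices still to be visited.
  for (let i = data.length - 1; i >= 0; i--) {
    if (data[i].high < 0) {
      removed.unshift(data.splice(i, 1)[0]);
    }
  }
  return removed;
};

const rows = [{ high: 3 }, { high: -1 }, { high: 0 }, { high: -7 }];
const badRows = removeNegatives(rows);

console.log(rows);    // [ { high: 3 }, { high: 0 } ]
console.log(badRows); // [ { high: -1 }, { high: -7 } ]
```

Each splice shifts the elements after it, so for large arrays .filter() is usually the simpler and faster choice.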
I'd like help optimising my solution to a problem. I have already solved the problem, but my code is not good enough to handle large arrays.
Codewars: Sum of Pairs problem
Here is my code:
var sum_pairs=function(e, sum){
var result=null;
var arrLen=e.length;
for(let i=0;i<arrLen-1;i++){
let nextIndex=e.slice(i+1,arrLen).indexOf(sum-e[i]);
if(nextIndex>=0){
result=[e[i],e[nextIndex+1+i]];
arrLen=nextIndex+1+i;
}
}
return result;
}
Well, I know this is not a good solution. It passes all the test cases, but it fails when it encounters a large array:
Result on Codewars
I want to know how to optimise this code, and also learn techniques for writing good code.
One solution is to use a Set to remember the numbers already iterated over. Then, for each element, we can check whether a number has been seen before that sums with it to s. A Set has average constant time complexity for insertion and lookup, making the algorithm linear in time (and space).
var sum_pairs=function(ints, s){
if (ints.length < 2) return undefined; //not enough numbers for pair.
let intSet = new Set()
intSet.add(ints[0]);
for (let i=1; i < ints.length; ++i){
let needed = s-ints[i];
if (intSet.has(needed)){//check if we have already seen the number needed to complete the pair.
return [needed,ints[i]];
}
intSet.add(ints[i]);//if not insert the number in set and continue.
}
return undefined;//No answer found
}
function sumPairs (ints, s) {
if (ints.length<2) return undefined
let inSet = new Set()
for (let i= 0;i<ints.length;i++){
let need = s-ints[i]
if( inSet.has(need)){
return [need,ints[i]]
}
inSet.add(ints[i])
}
return undefined
}
I'm studying for an interview and have been working through some practice questions. The question is:
Find the most repeated integer in an array.
Here is the function I created and the one they created. They are appropriately named.
var arr = [3, 6, 6, 1, 5, 8, 9, 6, 6]
function mine(arr) {
arr.sort()
var count = 0;
var integer = 0;
var tempCount = 1;
var tempInteger = 0;
var prevInt = null
for (var i = 0; i < arr.length; i++) {
tempInteger = arr[i]
if (i > 0) {
prevInt = arr[i - 1]
}
if (prevInt == arr[i]) {
tempCount += 1
if (tempCount > count) {
count = tempCount
integer = tempInteger
}
} else {
tempCount = 1
}
}
console.log("most repeated is: " + integer)
}
function theirs(a) {
var count = 1,
tempCount;
var popular = a[0];
var temp = 0;
for (var i = 0; i < (a.length - 1); i++) {
temp = a[i];
tempCount = 0;
for (var j = 1; j < a.length; j++) {
if (temp == a[j])
tempCount++;
}
if (tempCount > count) {
popular = temp;
count = tempCount;
}
}
console.log("most repeated is: " + popular)
}
console.time("mine")
mine(arr)
console.timeEnd("mine")
console.time("theirs")
theirs(arr)
console.timeEnd("theirs")
These are the results:
most repeated is: 6
mine: 16.929ms
most repeated is: 6
theirs: 0.760ms
What makes my function slower than theirs?
My test results
I get the following results when I test (JSFiddle) it for a random array with 50 000 elements:
mine: 28.18 ms
theirs: 5374.69 ms
In other words, your algorithm seems to be much faster. That is expected.
Why is your algorithm faster?
You sort the array first, and then loop through it once. Firefox uses merge sort and Chrome uses a variant of quick sort (according to this question). Both take O(n*log(n)) time on average. Then you loop through the array, taking O(n) time. In total you get O(n*log(n)) + O(n), that can be simplified to just O(n*log(n)).
Their solution, on the other hand, has a nested loop where both the outer and inner loops iterate over all the elements. That takes O(n^2). In other words, it is slower.
Why do your test results differ?
So why do your test results differ from mine? I see a number of possibilities:
You used too small a sample. If you just used the nine numbers in your code, that is definitely the case. When you use short arrays in the test, overheads (like running console.log, as suggested by Gundy in the comments) dominate the time it takes. This can make the results appear completely random.
neuronaut suggests that it is related to the fact that their code operates on the array that is already sorted by your code. While that is a bad way of testing, I fail to see how it would affect the result.
Browser differences of some kind.
A note on .sort()
A further note: You should not use .sort() for sorting numbers, since it sorts things alphabetically. Instead, use .sort(function(a, b){return a-b}). Read more here.
A further note on the further note: In this particular case, just using .sort() might actually be smarter. Since you do not care about the ordering, only the grouping, it doesn't matter that it sorts the numbers in the wrong order. It will still group elements with the same value together. If it is faster without the comparison function (I suspect it is), then it makes sense to sort without one.
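A small illustration of both notes on .sort(), using made-up sample values:

```javascript
const sample = [10, 9, 10, 1];

// The default sort compares elements as strings, so "10" sorts before "9".
const defaultSorted = [...sample].sort();

// A numeric comparator gives the expected numeric order.
const numericSorted = [...sample].sort((a, b) => a - b);

console.log(defaultSorted); // [ 1, 10, 10, 9 ]
console.log(numericSorted); // [ 1, 9, 10, 10 ]
```

In both results the two 10s end up next to each other, which is all a loop that counts runs of equal values needs.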
An even faster algorithm
You solved the problem in O(n*log(n)), but you can do it in just O(n). The algorithm to do that is quite intuitive. Loop through the array, and keep track of how many times each number appears. Then pick the number that appears the most times.
Let's say there are m different numbers in the array. Looping through the array takes O(n) and finding the max takes O(m). That gives you O(n) + O(m), which simplifies to O(n) since m <= n.
This is the code:
function anders(arr) {
//Instead of an array we use an object and properties.
//It works like a dictionary in other languages.
var counts = new Object();
//Count how many of each number there is.
for(var i=0; i<arr.length; i++) {
//Make sure the property is defined.
if(typeof counts[arr[i]] === 'undefined')
counts[arr[i]] = 0;
//Increase the counter.
counts[arr[i]]++;
}
var max; //The number with the largest count.
var max_count = -1; //The largest count.
//Iterate through all of the properties of the counts object
//to find the number with the largest count.
for (var num in counts) {
if (counts.hasOwnProperty(num)) {
if(counts[num] > max_count) {
max_count = counts[num];
max = Number(num); //Property names are strings, so convert back to a number.
}
}
}
//Return the result.
return max;
}
Running this on a random array with 50 000 elements between 0 and 49 takes just 3.99 ms on my computer. In other words, it is the fastest. The downside is that you need O(m) memory to store how many times each number appears.
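The same counting idea can also be written with a Map, sketched here as an alternative rather than as the answer's own code. A Map keeps its keys as numbers, whereas object property names are strings, and tracking the maximum during the first pass avoids the second loop. The name `mostFrequent` is hypothetical.

```javascript
// Returns the value that occurs most often in arr (undefined for an empty array).
const mostFrequent = (arr) => {
  const counts = new Map();
  let best;
  let bestCount = 0;
  for (const value of arr) {
    const next = (counts.get(value) || 0) + 1;
    counts.set(value, next);
    // Track the maximum as we go, so no second pass is needed.
    if (next > bestCount) {
      best = value;
      bestCount = next;
    }
  }
  return best;
};

console.log(mostFrequent([3, 6, 6, 1, 5, 8, 9, 6, 6])); // 6
```

This is still O(n) time and O(m) extra memory, with average constant-time Map operations assumed.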
It looks like this isn't a fair test. When you run your function first, it sorts the array. This means their function ends up using already sorted data but doesn't suffer the time cost of performing the sort. I tried swapping the order in which the tests were run and got nearly identical timings:
console.time("theirs")
theirs(arr)
console.timeEnd("theirs")
console.time("mine")
mine(arr)
console.timeEnd("mine")
most repeated is: 6
theirs: 0.307ms
most repeated is: 6
mine: 0.366ms
Also, if you use two separate arrays you'll see that your function and theirs run in the same amount of time, approximately.
Lastly, see Anders' answer -- it demonstrates that larger data sets reveal your function's O(n*log(n)) + O(n) performance vs their function's O(n^2) performance.
Other answers here already do a great job of explaining why theirs is faster, and also how to optimize yours. Yours is actually better with large datasets (see Anders' answer). I managed to optimize the theirs solution; maybe there's something useful here.
I can get consistently faster results by employing some basic JS micro-optimizations. These optimizations can also be applied to your original function, but I applied them to theirs.
Preincrementing is slightly faster than postincrementing, because the value does not need to be read into memory first.
Reverse-while loops are massively faster (on my machine) than anything else I've tried, because JS is translated into opcodes, and guaranteeing >= 0 is very fast. For this test, my computer scored 514,271,438 ops/sec, while the next-fastest scored 198,959,074.
Cache the result of length. For larger arrays, this would make better() more noticeably faster than theirs.
Code:
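function better(a) {
    var top = a[0],
        count = 0,
        len = a.length, // cache the length once (declared locally, not as an implicit global)
        i = len;
    while (i--) { // i runs from len - 1 down to 0, so the last element is included
        var j = len,
            temp = 0;
        while (j--) {
            if (a[j] == a[i]) ++temp;
        }
        if (temp > count) {
            count = temp;
            top = a[i];
        }
    }
    console.log("most repeated is " + top);
}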
[fiddle]
It's very similar, if not the same, to theirs, but with the above micro-optimizations.
Here are the results for running each function 500 times. The array is pre-sorted before any function is run, and the sort is removed from mine().
mine: 44.076ms
theirs: 35.473ms
better: 32.016ms