I am trying to insert a Float32Array into the middle of another Float32Array. Currently I create a new Float32Array and use 3 for loops to copy elements into it (one for the part before the insertion, one for the inserted Float32Array, and one for the part after the insertion).
This is taking a long time. Is there a faster way to insert a Float32Array into another? For instance, is there functionality akin to
// Suppose originalArray and insertedArray are 2 Float32Arrays of
// lengths 100000 and 5000 respectively, and I want to insert
// insertedArray into originalArray at element 50000.
var combinedArray = new Float32Array(105000);
combinedArray.set(originalArray.subarray(0, 50000));
combinedArray.subarray(50000, 55000).set(insertedArray);
combinedArray.subarray(55000, 105000).set(originalArray.subarray(50000, 100000));
Currently, the above code does not work, because the subarray method does not return a value with a set method that writes through to the entire Float32Array.
There's something you can do in just a couple of statements:
var combinedArray = new Float32Array(105000);
combinedArray.set(originalArray);
[].splice.apply(combinedArray, [50000, 0].concat([].slice.call(insertedArray, 0)));
I don't really know about its performance, though. I fear that combinedArray is somehow converted into an Array, thus taking a lot of memory and maybe CPU time. It shouldn't, but I'm not sure.
Anyway, the set method has a second, optional argument: the offset within the target array at which the new elements are written. So your last two lines would become:
combinedArray.set(insertedArray, 50000);
combinedArray.set(originalArray.subarray(50000, 100000), 55000);
Maybe this is more efficient.
Edit: it is, according to this test. So, you have your way.
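Putting it together, a minimal helper sketch (the function name insertFloat32Array is mine, not from the question) that performs the whole insertion with set and subarray, with no per-element loops:
function insertFloat32Array(original, inserted, offset) {
  // allocate room for both arrays
  var combined = new Float32Array(original.length + inserted.length);
  combined.set(original.subarray(0, offset));                         // head of the original
  combined.set(inserted, offset);                                     // the inserted block
  combined.set(original.subarray(offset), offset + inserted.length);  // tail of the original
  return combined;
}
// e.g. var combinedArray = insertFloat32Array(originalArray, insertedArray, 50000);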
platform: Chrome 76.
I was doing performance testing when I came across the following. The app runs the attached code (and the sieving code before it), repeated millions of times.
t_sieve is a boolean array that has been fully sieved by this point. This bit of code just counts the false entries into an integer array t_match, so that at any index t_match contains the number of 'false' values up to that array index.
For the image attached, the code takes about 30s to complete, of which 25s happens here. Prior to this, the sieving takes about 4s.
Why? How to refactor this for better performance?
Also FYI, I've repeated the performance profiling and consistently this is where it lags.
The immediately following block is an update in response to comments received. It shows how these arrays are being instantiated (a bit higher up than the code block after it).
const t_sieve = new Array(M_MAX + 1);
const t_match = new Array(M_MAX + 1);

let cum = 0;
for (let i = 1; i < mod_period1; i++) {
  if (!t_sieve[i]) {
    cum++;
  }
  t_match[i] = cum;
}
const period_match_count = t_match[mod_period];
One thought I had was based on line 190 taking a significant amount of time. The reason is that when you add new elements to an existing array in JS, the engine may have to reallocate and copy the array, which takes progressively longer as more items are added. Check out the performance screenshots:
Here is code that is very similar to yours. It creates an array of 1 million true/false values, then counts them just like yours does. The performance shows very similar behavior: a lot of the time is spent on line 21, adding the new elements to the array.
Now check out this restructure:
I replaced line 13 with an instantiation of the array object with a predefined length. Check out how much faster line 21 ran. This is because the array already has space and doesn't have to be rebuilt every time. Granted, there is a trade-off: line 13 took slightly longer, but if you can initialize the size once, you make up for it on each iteration through the loop.
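To make that concrete, here is a small benchmark sketch (the names and sizes are mine, not the asker's code) comparing an output array grown with push() against one created with a predefined length; actual results depend heavily on the engine, so it is worth measuring on your own data:
const N = 1000000;

// stand-in for the sieve data
const flags = new Array(N);
for (let i = 0; i < N; i++) flags[i] = Math.random() < 0.5;

// variant 1: grow the output with push()
console.time('push');
const grown = [];
let cumA = 0;
for (let i = 0; i < N; i++) {
  if (!flags[i]) cumA++;
  grown.push(cumA);
}
console.timeEnd('push');

// variant 2: preallocate the output at its final length
console.time('preallocated');
const fixed = new Array(N);
let cumB = 0;
for (let i = 0; i < N; i++) {
  if (!flags[i]) cumB++;
  fixed[i] = cumB;
}
console.timeEnd('preallocated');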
I am creating an algorithm to match any combination of cells from the first array to a value in the second array, with priority given to the second array's values in order. For example, in JavaScript:
var arr=[10,20,30,40,50,60,70,80,90];
var arr2=[100,120,140];
What I want is to produce the following mapping automatically (with priority given to the second array's values, in order). Please help me find pseudocode for the algorithm:
100 = 10+20+30+40 //arr2[0] = arr1[0] + arr1[1] + arr1[2] + arr1[3]
120 = 50+70 //arr2[1] = arr1[4] + arr1[6]
140 = 60+80 //arr2[2] = arr1[5] + arr1[7]
90 = 90 //remaining arr1[8]
The values are only examples and can change dynamically.
A solution is possible if you treat both arrays as sorted and then start adding elements from the end of the first array (array1), which holds the greatest values since the array is sorted. Now check: if the sum matches, proceed; if the sum is less than the element in array2 you are checking, you need to add a third element from array1. In the other case, if the sum is greater than the element in array2, you have to drop one of the array1 elements you used in the addition and replace it with the previous (smaller) element from array1. Repeat these steps. You need to think about how to do this correctly, or else you need to share some of your work or the logic you are thinking of, so that we can help.
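As a rough sketch of that idea (one interpretation of mine, handling only a single target value from arr2, not the full priority matching): pick values from the large end of a sorted arr1 and backtrack whenever the running sum would overshoot:
function findSubsetForTarget(sortedValues, target) {
  const picked = [];
  function search(maxIndex, remaining) {
    if (remaining === 0) return true;      // target reached exactly
    for (let i = maxIndex; i >= 0; i--) {
      if (sortedValues[i] <= remaining) {  // skip values that would overshoot
        picked.push(sortedValues[i]);
        if (search(i - 1, remaining - sortedValues[i])) return true;
        picked.pop();                      // dead end: drop this value, try a smaller one
      }
    }
    return false;
  }
  return search(sortedValues.length - 1, target) ? picked : null;
}

// Example: find values from arr summing to arr2[0] = 100
console.log(findSubsetForTarget([10, 20, 30, 40, 50, 60, 70, 80, 90], 100)); // e.g. [90, 10]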
As the matter is quite complex, over and above sufficing on a pseudo code style explanation, I have also coded a practical implementation that you may find at this link.
I advise you to refrain from looking at the solution and first try to implement the algorithm yourself as there is a lot of scope for further improvement.
Here is in broad lines an explanation to the way I have decided to tackle the algorithm:
The problem presented by the OP is related to a classic example of distributing n unique elements over k unique boxes.
In this case here, arr has 9 unique elements that need to be distributed over three distinct spots, represented by the container: arr2.
So the first step in tackling this problem is to figure out how you can implement a function that given n and k, is able to calculate all the possible distributions that apply.
The closest that I could come up with was the Stirling Numbers of the Second Kind, which are defined as:
The number of ways of partitioning a set of n elements into m nonempty sets (i.e., m set blocks), also called a Stirling set number. For example, the set {1,2,3} can be partitioned into three subsets in one way: {{1},{2},{3}}; into two subsets in three ways: {{1,2},{3}}, {{1,3},{2}}, and {{1},{2,3}}; and into one subset in one way: {{1,2,3}}.
If you pay close attention to the example provided, you will realize that it pertains to the enumeration of all the distribution combinations possible over INDISTINGUISHABLE partitions as order doesn't matter.
Since in our case, each spot in the container arr2 represents a UNIQUE spot and order therefore does matter, we will thus be required to enumerate all the Stirling Combinations over every possible combination of arr2.
Practically speaking, this means that for our example where arr2.length === 3, we will be required to apply all of the Stirling combinations obtained to [100,120,140], [120,140,100], [140,100,120], etc. (6 permutations in total).
The main challenging part here is to implement the Stirling Function, but luckily somebody has already done so:
http://blogs.msdn.com/b/oldnewthing/archive/2014/03/24/10510315.aspx
After copying and pasting the Stirling function and using it to distribute arr over 3 unique spots, you now need to filter out the distributions whose sums don't match the designated spots in arr2.
This will then leave you with all the possible solutions that apply. In your case, for
var arr=[10,20,30,40,50,60,70,80,90];
var arr2=[100,120,140];
no solutions apply at all.
A quick workaround is to expand the distribution target arr2 from [100,120,140] to [100,120,140,90]. A better workaround: in the case where zero solutions are found, take away one element at a time from arr until you obtain a solution. You can then expand your solution sets later by including that element as a mapping onto itself.
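For concreteness, here is a brute-force sketch of the search described above (my own code, not the linked implementation): instead of going through the Stirling function and then permuting arr2, it assigns every element of arr directly to one of the labeled spots, which covers the same search space, and keeps only the assignments whose group sums match:
function partitionToTargets(values, targets) {
  const k = targets.length;
  const groups = Array.from({ length: k }, () => []);
  const sums = new Array(k).fill(0);
  const solutions = [];

  function assign(i) {
    if (i === values.length) {
      if (sums.every((s, j) => s === targets[j])) {
        solutions.push(groups.map(g => g.slice())); // record a full match
      }
      return;
    }
    for (let j = 0; j < k; j++) {
      if (sums[j] + values[i] <= targets[j]) {      // prune spots that would overshoot
        groups[j].push(values[i]);
        sums[j] += values[i];
        assign(i + 1);
        sums[j] -= values[i];                       // backtrack
        groups[j].pop();
      }
    }
  }

  assign(0);
  return solutions;
}

// As noted above, [100,120,140] alone yields no solutions for this arr,
// but expanding the targets with the leftover 90 does:
console.log(partitionToTargets([10, 20, 30, 40, 50, 60, 70, 80, 90], [100, 120, 140, 90]));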
Considering the performance, what's the best way to get random subset from an array?
Say we have an array with 90000 items, and I want to get 10000 random items from it.
One approach I'm thinking about is to get a random index from 0 to array.length, and then remove the selected item from the original array by using Array.prototype.splice. Then get the next random item from the rest.
But the splice method will rearrange the indexes of all the items after the one we just selected and move them forward one step. Doesn't that affect the performance?
Items may be duplicated, but the indices we select should not be. Say we've selected index 0; then we should only look up the remaining indices 1~89999.
If you want a subset of the shuffled array, you do not need to shuffle the whole array. You can stop the classic Fisher-Yates shuffle once you have drawn your 10000 items, leaving the other 80000 indices untouched.
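A minimal sketch of that idea (the function name sampleBySwapping is mine): it shuffles only the first size positions of a copy of the input, so the original array is left untouched:
function sampleBySwapping(array, size) {
  var result = array.slice();                 // copy so the input is not modified
  for (var i = 0; i < size; i++) {
    // pick a random index from the not-yet-fixed tail [i, length)
    var j = i + Math.floor(Math.random() * (result.length - i));
    var tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
  }
  return result.slice(0, size);               // only the shuffled prefix is the sample
}
// e.g. var subset = sampleBySwapping(items, 10000); // items being the 90000-element array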
I would first randomize the whole array, then splice off 10000 items.
How to randomize (shuffle) a JavaScript array?
Explains a good way to randomize an array in JavaScript.
A reservoir sampling algorithm can do this.
Here's an attempt at implementing Knuth's "Algorithm S" from TAOCP Volume 2 Section 3.4.2:
function sample(source, size) {
  var chosen = 0,
      srcLen = source.length,
      result = new Array(size);
  for (var seen = 0; chosen < size; seen++) {
    var remainingInput = srcLen - seen,
        remainingOutput = size - chosen;
    if (remainingInput * Math.random() < remainingOutput) {
      result[chosen++] = source[seen];
    }
  }
  return result;
}
Basically it makes one pass over the input array, choosing or skipping items based on a function of a random number, the number of items remaining in the input, and the number of items still needed for the output.
There are three potential problems with this code: 1. I may have mucked it up, 2. Knuth calls for a random number "between zero and one" and I'm not sure if this means the [0, 1) interval JavaScript provides or the fully closed or fully open interval, 3. it's vulnerable to PRNG bias.
The performance characteristics should be very good. It's O(srcLen). Most of the time we finish before going through the entire input. The input is accessed in order, which is a good thing if you are running your code on a computer that has a cache. We don't even waste any time reading or writing elements that don't ultimately end up in the output.
This version doesn't modify the input array. It is possible to write an in-place version, which might save some memory, but it probably wouldn't be much faster.
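For illustration, a quick usage sketch (the items array here is just placeholder data, not from the original answer):
var items = Array.from({ length: 90000 }, function (_, i) { return i; }); // placeholder source data
var picked = sample(items, 10000);
console.log(picked.length); // 10000 items, kept in their original relative order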
I'm trying to learn about array sorting. It seems pretty straightforward. But on the mozilla site, I ran across a section discussing sorting maps (about three-quarters down the page).
The compareFunction can be invoked multiple times per element within the array. Depending on the compareFunction's nature, this may yield a high overhead. The more work a compareFunction does and the more elements there are to sort, the wiser it may be to consider using a map for sorting.
The example given is this:
// the array to be sorted
var list = ["Delta", "alpha", "CHARLIE", "bravo"];
// temporary holder of position and sort-value
var map = [];
// container for the resulting order
var result = [];
// walk original array to map values and positions
for (var i = 0, length = list.length; i < length; i++) {
  map.push({
    // remember the index within the original array
    index: i,
    // evaluate the value to sort
    value: list[i].toLowerCase()
  });
}
// sorting the map containing the reduced values
map.sort(function(a, b) {
  return a.value > b.value ? 1 : -1;
});
// copy values in right order
for (var i = 0, length = map.length; i < length; i++) {
  result.push(list[map[i].index]);
}
// print sorted list
print(result);
I don't understand a couple of things. To wit: What does it mean, "The compareFunction can be invoked multiple times per element within the array"? Can someone show me an example of that. Secondly, I understand what's being done in the example, but I don't understand the potential "high[er] overhead" of the compareFunction. The example shown here seems really straightforward and mapping the array into an object, sorting its value, then putting it back into an array would take much more overhead I'd think at first glance. I understand this is a simple example, and probably not intended for anything else than to show the procedure. But can someone give an example of when it would be lower overhead to map like this? It seems like a lot more work.
Thanks!
When sorting a list, an item isn't just compared to one other item, it may need to be compared to several other items. Some of the items may even have to be compared to all other items.
Let's see how many comparisons there actually are when sorting an array:
var list = ["Delta", "alpha", "CHARLIE", "bravo", "orch", "worm", "tower"];
var o = [];
for (var i = 0; i < list.length; i++) {
o.push({
value: list[i],
cnt: 0
});
}
o.sort(function(x, y){
x.cnt++;
y.cnt++;
return x.value == y.value ? 0 : x.value < y.value ? -1 : 1;
});
console.log(o);
Result:
[
{ value="CHARLIE", cnt=3},
{ value="Delta", cnt=3},
{ value="alpha", cnt=4},
{ value="bravo", cnt=3},
{ value="orch", cnt=3},
{ value="tower", cnt=7},
{ value="worm", cnt=3}
]
(Fiddle: http://jsfiddle.net/Guffa/hC6rV/)
As you see, each item was compared to several other items. The string "tower" even had more comparisons than there are other strings, which means that it was compared to at least one other string at least twice.
If the comparison needs some calculation before the values can be compared (like the toLowerCase method in the example), then that calculation will be done several times. By caching the values after that calculation, it will be done only once for each item.
The primary time saving in that example comes from avoiding calls to toLowerCase() in the comparison function. The comparison function is called by the sort code each time a pair of elements needs to be compared, so that saves a lot of function calls. The cost of building and un-building the map is worth it for large arrays.
That the comparison function may be called more than once per element is a natural implication of how sorting works. If only one comparison per element were necessary, it would be a linear-time process.
edit — the number of comparisons that'll be made will be roughly proportional to the length of the array times the base-2 log of the length. For a 1000 element array, then, that's proportional to 10,000 comparisons (probably closer to 15,000, depending on the actual sort algorithm). Saving 20,000 unnecessary function calls is worth the 2000 operations necessary to build and un-build the sort map.
This is called the “decorate - sort - undecorate” pattern (you can find a nice explanation on Wikipedia).
The idea is that a comparison-based sort will have to call the comparison function at least n − 1 times (where n is the number of items in the list), as this is the number of comparisons you need just to check that the array is already sorted. Usually, the number of comparisons will be larger than that (O(n ln n) if you are using a good algorithm), and by the pigeonhole principle, there is at least one value that will be passed to the comparison function at least twice.
If your comparison function does some expensive processing before comparing the two values, then you can reduce the cost by first doing the expensive part and storing the result for each value (since you know that even in the best scenario you'll have to do that processing). Then, when sorting, you use a cheaper comparison function that only compares those cached values.
In this example, the "expensive" part is converting the string to lowercase.
Think of this like caching. It's simply saying that you should not do lots of calculation in the compare function, because you will be calculating the same value over and over.
What does it mean, "The compareFunction can be invoked multiple times per element within the array"?
It means exactly what it says. Let's say you have three items, A, B and C. They need to be sorted by the result of the compare function. The comparisons might be done like this:
compare(A, B)
compare(A, C)
compare(B, C)
So here, we have 3 values, but the expensive calculation inside compare() runs 6 times (twice per comparison, once for each argument). Using a temporary array to cache things ensures we do that calculation only once per item, and can compare the cached results.
Secondly, I understand what's being done in the example, but I don't understand the potential "high[er] overhead" of the compareFunction.
What if compare() does a database fetch (comparing the counts of matching rows)? Or a complex math calculation (a factorial, recursive Fibonacci, or iteration over a large number of items)? These are the sorts of things you don't want to do more than once.
I would say most of the time it's fine to leave really simple/fast calculations inline. Don't over-optimize. But if you need to do anything complex or slow in the comparison, you have to be smarter about it.
To respond to your first question, why would the compareFunction be called multiple times per element in the array?
Sorting an array almost always requires more than N comparisons, where N is the size of the array (unless the array is already sorted). Thus, every element in your array may be compared against other elements up to N times (bubble sort requires at most N² comparisons in total). The compareFunction you provide is used every time to determine whether two elements are less than, equal to, or greater than each other, and thus will be called multiple times per element in the array.
A simple response to your second question, why there would potentially be higher overhead for a compareFunction:
Say your compareFunction does a lot of unnecessary work while comparing two elements of the array. This makes the sort slower, and thus an expensive compareFunction can cause higher overhead.
Context: I'm building a little site that reads an RSS feed, and updates/checks the feed in the background. I have one array to store data for display, and another which stores the IDs of records that have already been shown.
Question: How many items can an array hold in JavaScript before things start getting slow or sluggish? I'm not sorting the array, but I am using jQuery's inArray function to do a comparison.
The website will be left running and updating, and it's unlikely that the browser will be restarted / refreshed very often.
If I should think about clearing some records from the array, what is the best way to remove some records after a limit, like 100 items.
The maximum length until "it gets sluggish" is totally dependent on your target machine and your actual code, so you'll need to test on that (those) platform(s) to see what is acceptable.
However, the maximum length of an array according to the ECMA-262 5th Edition specification is bound by an unsigned 32-bit integer due to the ToUint32 abstract operation, so the longest possible array could have 2^32 − 1 = 4,294,967,295 ≈ 4.29 billion elements.
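To illustrate that bound (this snippet is mine, not part of the original answer; run it in a modern engine's console): constructing an array at the limit works, while one past it throws.
var atLimit = new Array(Math.pow(2, 32) - 1); // allowed: a sparse array of maximum length
console.log(atLimit.length);                  // 4294967295

try {
  new Array(Math.pow(2, 32));                 // one past the limit
} catch (e) {
  console.log(e instanceof RangeError);       // true: "Invalid array length"
}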
No need to trim the array, simply address it as a circular buffer (index % maxlen). This will ensure it never goes over the limit (implementing a circular buffer means that once you get to the end you wrap around to the beginning again - not possible to overrun the end of the array).
For example:
var container = new Array();
var maxlen = 100;
var index = 0;

// 'store' 1538 items (only the last 'maxlen' items are kept)
for (var i = 0; i < 1538; i++) {
  container[index++ % maxlen] = "storing" + i;
}

// get element at index 11 (you want the 11th item in the array)
eleventh = container[(index + 11) % maxlen];

// get element at index 35 (you want the 35th item in the array)
thirtyfifth = container[(index + 35) % maxlen];

// print out all 100 elements that we have left in the array, note
// that it doesn't matter if we address past 100 - circular buffer
// so we'll simply get back to the beginning if we do that.
for (i = 0; i < 200; i++) {
  document.write(container[(index + i) % maxlen] + "<br>\n");
}
Like #maerics said, your target machine and browser will determine performance.
But for some real world numbers, on my 2017 enterprise Chromebook, running the operation:
console.time();
Array(x).fill(0).filter(x => x < 6).length
console.timeEnd();
x=5e4 takes 16ms, good enough for 60fps
x=4e6 takes 250ms, which is noticeable but not a big deal
x=3e7 takes 1300ms, which is pretty bad
x=4e7 takes 11000ms and allocates an extra 2.5GB of memory
So around 30 million elements is a hard upper limit, because the JavaScript VM falls off a cliff at 40 million elements and will probably crash the process.
EDIT: In the code above, I'm actually filling the array with elements and looping over them, simulating the minimum of what an app might want to do with an array. If you just run Array(2**32-1) you're creating a sparse array that's closer to an empty JavaScript object with a length, like {length: 4294967295}. If you actually tried to use all of those 4 billion elements, you'd definitely crash the JavaScript process.
You could try something like this to test and trim the length:
http://jsfiddle.net/orolo/wJDXL/
var longArray = [1, 2, 3, 4, 5, 6, 7, 8];
if (longArray.length >= 6) {
  longArray.length = 3;
}
alert(longArray); // 1, 2, 3
I have built a performance framework that manipulates and graphs millions of datasets, and even then, the JavaScript calculation latency was on the order of tens of milliseconds. Unless you're worried about going over the array size limit, I don't think you have much to worry about.
It will be very browser dependent. 100 items doesn't sound like a large number - I expect you could go a lot higher than that. Thousands shouldn't be a problem. What may be a problem is the total memory consumption.
I have shamelessly pulled some pretty big datasets into memory, and although it did get sluggish, it took maybe 15 MB of data and upwards, with pretty intense calculations on the dataset. I doubt you will run into problems with memory unless you have intense calculations on the data and many, many rows. Profiling and benchmarking with different mock result sets will be your best bet to evaluate performance.