I just got this challenge and I got stumped, blank. The question somewhat goes like this:
There are chess players who will play in duels. For example, 4 players (A, B, C, D), paired two at a time, will yield a total of 6 chess games: AB, AC, AD, BC, BD, CD. Write a function that takes an integer value and returns the number of possible combinations.
Tests:
gameCount(4); // 6
gameCount(10000); // 49995000
I remember covering this kind of problem in a math class many years ago. A very quick Google search yielded the nCr formula, so I quickly wrote the code:
const r = 2;

const factorial = n => {
    let f = 1;
    for (let i = 2; i <= n; ++i) {
        f = f * i;
    }
    return f;
};

const gameCount = n => factorial(n) / (factorial(r) * factorial(n - r));
console.log( gameCount(4) ); // 6
console.log( gameCount(10000) ); // NaN
The NaN baffled me at first, but then I realized that 10000! is quite a large number!
How can I optimise gameCount to accept large numbers while still remaining linear?
Note: I am aware that the factorial function is not linear, but this was what I wrote at the time. The ideal solution would be to have everything linear, perhaps removing factorials altogether.
What you currently do is calculate two very big numbers that share a very big common divisor, and then divide one by the other. You could just not calculate that divisor (factorial(n - r)) in the first place.
Change your factorial function to take a minimum argument.
const r = 2;

const factorial = (n, m) => {
    let f = 1;
    for (let i = n; i > m; --i) {
        f = f * i;
    }
    return f;
};
const gameCount = n => factorial(n, n - r) / factorial(r, 0);
console.log( gameCount(4) ); // 6
console.log( gameCount(10000) ); // 49995000
Though note that this will only avoid the NaN issue as long as r is reasonably small. If you need it to work for cases where both n and r are big, you cannot use the native JavaScript number type. I suggest looking into BigInt for that case.
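To illustrate that BigInt route, here is a minimal sketch (the binomial helper name is mine); it multiplies and divides incrementally, so it stays linear in r and never builds a full factorial:

// Hedged sketch: nCr with BigInt so that large n and r cannot overflow to Infinity/NaN.
// The intermediate value after step i is C(n, i + 1), so the division is always exact.
const binomial = (n, r) => {
    let result = 1n;
    for (let i = 0n; i < BigInt(r); i++) {
        result = result * (BigInt(n) - i) / (i + 1n);
    }
    return result;
};

console.log(binomial(4, 2));     // 6n
console.log(binomial(10000, 2)); // 49995000n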
And for the sake of completeness, if the case r = 2 is all you care about, your entire formula simplifies further: each of the n players plays each of the other n - 1 players, and dividing by 2 removes the double counting (AB and BA are the same game):
const gameCount = n => (n * (n - 1)) / 2;
Given a "split ratio", I am trying to randomly split up a dataset into two groups. The catch is, that I do not know beforehand how many items the dataset contains. My library receives the data one by one from an input stream and is expected to return the data to two output streams. The resulting two datasets should ideally be exactly split into the given split ratio.
Illustration:
┌─► stream A
input stream ──► LIBRARY ──┤
└─► stream B
For example, given a split ratio of 30/70, stream A would be expected to receive 30% of the elements from the input stream and stream B the remaining 70%. The order must remain.
My ideas so far:
Idea 1: "Roll the dice" for each element
The obvious approach: for each element the algorithm randomly decides whether the element should go into stream A or B. The problem is that the resulting data sets might be far off from the expected split ratio. Given a split ratio of 50/50, the resulting split could even be 100/0 for very small data sets. The goal is to keep the resulting split ratio as close as possible to the desired one.
Idea 2: Use a cache and randomize the cached data
Another idea is to cache a fixed number of elements before passing them on. This would mean caching, say, 1000 elements, shuffling them (or their corresponding indices to keep the order stable), splitting them up and passing the resulting data sets on. This should work very well, but I'm unsure whether the randomization is really random for large data sets (I imagine there will be patterns when looking at the distribution).
Neither algorithm is optimal, so I hope you can help me.
Background
This is about a layer-based data science tool, where each layer receives data from the previous layer via a stream. This layer is expected to split the data (vectors) up into a training and test set before passing them on. The input data can range from just a few elements to a never ending stream of data (hence, the streams). The code is developed in JavaScript, but this question is more about the algorithm than the actual implementation.
You could adjust the probability as it shifts away from the desired rate.
Here's an example along with tests for various levels of adjusting the probability. As we increase the adjustments, we see the stream splitter deviates less from the ideal ratio, but it also means it's less random (knowing the previous values, you can predict the next values).
// rateStrictness = 0 will lead to "rolling the dice" for each invocation
// higher values of rateStrictness will lead to stronger "correcting" forces
function* splitter(desiredARate, rateStrictness = .5) {
    let aCount = 0, bCount = 0;
    while (true) {
        // guard against 0/0 = NaN on the very first draw
        let actualARate = (aCount + bCount) === 0 ? desiredARate : aCount / (aCount + bCount);
        let aRate = desiredARate + (desiredARate - actualARate) * rateStrictness;
        if (Math.random() < aRate) {
            aCount++;
            yield 'a';
        } else {
            bCount++;
            yield 'b';
        }
    }
}
let test = (desiredARate, rateStrictness) => {
    let s = splitter(desiredARate, rateStrictness);
    let values = [...Array(1000)].map(() => s.next().value);
    let aCount = values.map((_, i) => values.reduce((count, v, j) => count + (v === 'a' && j <= i), 0));
    let aRate = aCount.map((c, i) => c / (i + 1));
    let deviation = aRate.map(a => a - desiredARate);
    let avgDeviation = deviation.reduce((sum, dev) => sum + dev, 0) / deviation.length;
    console.log(`inputs: desiredARate = ${desiredARate}; rateStrictness = ${rateStrictness}; average deviation = ${avgDeviation}`);
};
test(.5, 0);
test(.5, .25);
test(.5, .5);
test(.5, .75);
test(.5, 1);
test(.5, 10);
test(.5, 100);
How about rolling the dice twice? First, decide whether the target stream should be chosen randomly or whether the ratio should be taken into account. Then, in the first case, roll the dice; in the second case, use the ratio. Some pseudocode:
const toA =
    Math.random() > 0.5 // 1 -> totally random, 0 -> totally equally distributed
        ? Math.random() > 0.7
        : (numberA / (numberA + numberB) > 0.7);
That's just an idea I came up with; I haven't tried it...
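Fleshed out into runnable form, that idea might look roughly like this (the mix parameter and function name are made up for illustration; mix = 1 is fully random, mix = 0 is fully ratio-driven):

// Sketch of the "roll the dice twice" idea for a 30/70 split (aRatio = 0.3).
function makeSplitter(aRatio, mix = 0.5) {
    let numberA = 0, numberB = 0;
    return function nextGoesToA() {
        const total = numberA + numberB;
        const toA = Math.random() < mix
            ? Math.random() < aRatio                   // purely random branch
            : total === 0 || numberA / total < aRatio; // ratio-correcting branch
        if (toA) numberA++; else numberB++;
        return toA;
    };
}

// Usage: const next = makeSplitter(0.3);
// next() ? streamA.push(item) : streamB.push(item);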
Here is a way that combines both of your ideas: it uses a cache. As long as the current element can still be assigned at random without making it impossible to reach the target distribution should the stream end, we just roll a dice. If not, we add the element to the cache. When the input stream ends, we shuffle the elements in the cache and send them out so as to approach the target distribution. I am not sure there is any gain in this, in terms of randomness, over just forcing an element into a given group whenever the distribution strays too far off.
Beware that this approach does not preserve the order of the original input stream. A few other things could be added, such as a cache limit and a relaxed distribution error (0 is used here). If you need to preserve order, it can be done by sending a cached value and pushing the current one to the cache, instead of just sending the current one, whenever there are still elements in the cache.
let shuffle = (array) => array.sort(() => Math.random() - 0.5);

function* generator(numElements) {
    for (let i = 0; i < numElements; i++) yield i;
}

function* splitter(aGroupRate, generator) {
    let cache = [];
    let sentToA = 0;
    let sentToB = 0;
    let bGroupRate = 1 - aGroupRate;
    let maxCacheSize = 0;

    let sendValue = (value, group) => {
        sentToA += group == 0;
        sentToB += group == 1;
        return { value: value, group: group };
    };

    function* retRandomGroup(value, expected) {
        while ((Math.random() > aGroupRate) != expected) {
            if (cache.length) {
                yield sendValue(cache.pop(), !expected);
            } else {
                yield sendValue(value, !expected);
                return;
            }
        }
        yield sendValue(value, expected);
    }

    for (let value of generator) {
        if (sentToA + sentToB == 0) {
            yield sendValue(value, Math.random() > aGroupRate);
            continue;
        }
        let currentRateA = sentToA / (sentToA + sentToB);
        if (currentRateA <= aGroupRate) {
            // can we handle current value going to b group?
            if ((sentToA + cache.length) / (sentToB + sentToA + 1 + cache.length) >= aGroupRate) {
                for (let val of retRandomGroup(value, 1)) yield val;
                continue;
            }
        }
        if (currentRateA > aGroupRate) {
            // can we handle current value going to a group?
            if (sentToA / (sentToB + sentToA + 1 + cache.length) <= aGroupRate) {
                for (let val of retRandomGroup(value, 0)) yield val;
                continue;
            }
        }
        cache.push(value);
        maxCacheSize = Math.max(maxCacheSize, cache.length);
    }

    shuffle(cache);
    let totalElements = sentToA + sentToB + cache.length;
    while (sentToA < totalElements * aGroupRate) {
        yield { value: cache.pop(), group: 0 };
        sentToA += 1;
    }
    while (cache.length) {
        yield { value: cache.pop(), group: 1 };
    }
    yield { cache: maxCacheSize };
}
function test(numElements, aGroupRate) {
    let gen = generator(numElements);
    let sentToA = 0;
    let total = 0;
    let cacheSize = null;
    let split = splitter(aGroupRate, gen);
    for (let val of split) {
        if (val.cache != null) cacheSize = val.cache;
        else {
            sentToA += val.group == 0;
            total += 1;
        }
    }
    console.log("required rate for A group", aGroupRate, "actual rate", sentToA / total, "cache size used", cacheSize);
}
test(3000, 0.3)
test(5000, 0.5)
test(7000, 0.7)
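One caveat about the shuffle helper above: sorting with a random comparator does not produce a uniformly random permutation. If shuffle quality matters, a Fisher–Yates shuffle is a drop-in replacement:

// In-place Fisher-Yates shuffle: every permutation of the array is equally likely.
let shuffle = (array) => {
    for (let i = array.length - 1; i > 0; i--) {
        const j = Math.floor(Math.random() * (i + 1)); // random index in [0, i]
        [array[i], array[j]] = [array[j], array[i]];
    }
    return array;
};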
Let's say you have to maintain a given ratio R of data items going to stream A, e.g. R = 0.3 as per your example. Then, on receiving each data item, count the total number of items and the number passed on to stream A, and decide for each item whether it goes to A based on which choice keeps you closer to your target ratio R.
That should be about the best you can do for any size of data set. As for randomness, the resulting streams A and B should be about as random as your input stream.
Let's see how this plays out for the first couple of iterations:
Example: R = 0.3
N : total number of items processed so far (initially 0)
A : numbers passed on to stream A so far (initially 0)
First Iteration
N = 0 ; A = 0 ; R = 0.3
if next item goes to stream A then
n = N + 1
a = A + 1
r = a / n = 1
else if next item goes to stream B
n = N + 1
a = A
r = a / n = 0
So first item goes to stream B since 0 is closer to 0.3
Second Iteration
N = 1 ; A = 0 ; R = 0.3
if next item goes to stream A then
n = N + 1
a = A + 1
r = a / n = 0.5
else if next item goes to stream B
n = N + 1
a = A
r = a / n = 0
So second item goes to stream A since 0.5 is closer to 0.3
Third Iteration
N = 2 ; A = 1 ; R = 0.3
if next item goes to stream A then
n = N + 1
a = A + 1
r = a / n = 0.66
else if next item goes to stream B
n = N + 1
a = A
r = a / n = 0.33
So the third item goes to stream B since 0.33 is closer to 0.3
Fourth Iteration
N = 3 ; A = 1 ; R = 0.3
if next item goes to stream A then
n = N + 1
a = A + 1
r = a / n = 0.5
else if next item goes to stream B
n = N + 1
a = A
r = a / n = 0.25
So the fourth item goes to stream B since 0.25 is closer to 0.3
So this would be the pseudocode for deciding each data item:
if abs(((A + 1) / (N + 1)) - R) < abs((A / (N + 1)) - R) then
    put the next data item on stream A
    A = A + 1
    N = N + 1
else
    put the next data item on B
    N = N + 1
As discussed in the comments below, that is not random in the sense intended by the OP. So once we know the correct target stream for the next item we flip a coin to decide if we actually put it there, or introduce an error.
if abs(((A + 1) / (N + 1)) - R) < abs((A / (N + 1)) - R) then
    target_stream = A
else
    target_stream = B

if random() < 0.5 then
    if target_stream == A then
        target_stream = B
    else
        target_stream = A

if target_stream == A then
    put the next data item on stream A
    A = A + 1
    N = N + 1
else
    put the next data item on B
    N = N + 1
Now that could lead to an arbitrarily large error overall. So we have to set an error limit L and check how far off the resulting ratio is from the target R when errors are about to be introduced:
if abs(((A + 1) / (N + 1)) - R) < abs((A / (N + 1)) - R) then
    target_stream = A
else
    target_stream = B

if random() < 0.5 then
    if target_stream == A then
        if abs((A / (N + 1)) - R) < L then
            target_stream = B
    else
        if abs(((A + 1) / (N + 1)) - R) < L then
            target_stream = A

if target_stream == A then
    put the next data item on stream A
    A = A + 1
    N = N + 1
else
    put the next data item on B
    N = N + 1
So here we have it: processing data items one by one, we know the correct stream to put the next item on, we then introduce random local errors, and we are able to limit the overall error with L.
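A compact JavaScript rendering of that final pseudocode might look like this (the function and parameter names are mine, not from the original post):

// Greedy ratio tracker with random local errors, bounded by errorLimit.
// Call route() once per element, in order; it returns 'A' or 'B'.
function makeRatioSplitter(R, errorLimit = 0.05) {
    let N = 0; // items routed so far
    let A = 0; // items sent to stream A so far
    return function route() {
        const errIfA = Math.abs((A + 1) / (N + 1) - R);
        const errIfB = Math.abs(A / (N + 1) - R);
        let target = errIfA < errIfB ? 'A' : 'B';
        // Coin flip: possibly deviate, but only while the resulting error stays under the limit.
        if (Math.random() < 0.5) {
            if (target === 'A' && errIfB < errorLimit) target = 'B';
            else if (target === 'B' && errIfA < errorLimit) target = 'A';
        }
        N++;
        if (target === 'A') A++;
        return target;
    };
}

// Usage: const route = makeRatioSplitter(0.3);
// route() === 'A' ? streamA.write(item) : streamB.write(item);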
Looking at the two numbers you wrote (chunk size of 1000, probability split of 0.7), you might not have any problem with the simple approach of just rolling the dice for every element.
When talking about probability and large numbers, the law of large numbers is on your side.
This means that you do have a risk of splitting the streams very unevenly into 0 and 1000 elements, but in practice this is veeery unlikely to happen. As you are talking about test and training sets, I also do not expect your probability split to be far from 0.7. And in case you are allowed to cache, you can still do so for the first 100 elements, so that you are sure to have enough data for the law of large numbers to kick in.
[Image: the binomial distribution for n = 1000, p = 0.7]
In case you want to reproduce the image with other parameters
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import binom

n, p = 1000, 0.7
index = np.arange(binom.ppf(0.01, n, p), binom.ppf(0.99, n, p))
pd.Series(index=index, data=binom.pmf(index, n, p)).plot()
plt.show()
Find length of line 300x slower
First off, I have read the answer to Why is my WebAssembly function slower than the JavaScript equivalent?
But it has shed little light on the problem, and I have invested a lot of time that may well be that yellow stuff against the wall.
I do not use globals, and I do not use any memory. I have two simple functions that find the length of a line segment, and I compare them to the same thing in plain old JavaScript. Each function takes 4 params, uses 3 more locals, and returns a float or double.
On Chrome the JavaScript is 40 times faster than the WebAssembly, and on Firefox the wasm is almost 300 times slower than the JavaScript.
jsPerf test case
I have added a test case to jsPerf: WebAssembly V Javascript math.
What am I doing wrong?
Either
I have missed an obvious bug, bad practice, or I am suffering coder stupidity.
WebAssembly is not for 32bit OS (win 10 laptop i7CPU)
WebAssembly is far from a ready technology.
Please please be option 1.
I have read the WebAssembly use cases:
Re-use existing code by targeting WebAssembly, embedded in a larger
JavaScript / HTML application. This could be anything from simple
helper libraries, to compute-oriented task offload.
I was hoping I could replace some geometry libs with webAssembly to get some extra performance. I was hoping that it would be awesome, like 10 or more times faster. BUT 300 times slower WTF.
UPDATE
This is not a JS optimisation issue.
To ensure that optimisation has as little effect as possible, I have tested using the following methods to reduce or eliminate any optimisation bias:
counter c += length(... to ensure all code is executed.
bigCount += c to ensure the whole function is executed. Not needed.
4 lines for each function to reduce an inlining skew. Not needed.
all values are randomly generated doubles.
each function call returns a different result.
added a slower length calculation in JS using Math.hypot to prove the code is being run.
added an empty JS call that returns the first param, to see the call overhead.
// setup and associated functions
const setOf = (count, callback) => { var a = [], i = 0; while (i < count) { a.push(callback(i++)) } return a };
const rand = (min = 1, max = min + (min = 0)) => Math.random() * (max - min) + min;
const a = setOf(100009, i => rand(-100000, 100000));
var bigCount = 0;

function len(x, y, x1, y1) {
    var nx = x1 - x;
    var ny = y1 - y;
    return Math.sqrt(nx * nx + ny * ny);
}
function lenSlow(x, y, x1, y1) {
    var nx = x1 - x;
    var ny = y1 - y;
    return Math.hypot(nx, ny);
}
function lenEmpty(x, y, x1, y1) {
    return x;
}

// Test functions in same scope as above. None is in global scope.
// Each function is copied 4 times and tests are performed randomly.
// c += length(... to ensure all code is executed.
// bigCount += c to ensure whole function is executed.
// 4 lines for each function to reduce an inlining skew.
// all values are randomly generated doubles.
// each function call returns a different result.
tests : [{
func : function (){
var i,c=0,a1,a2,a3,a4;
for (i = 0; i < 10000; i += 1) {
a1 = a[i];
a2 = a[i+1];
a3 = a[i+2];
a4 = a[i+3];
c += length(a1,a2,a3,a4);
c += length(a2,a3,a4,a1);
c += length(a3,a4,a1,a2);
c += length(a4,a1,a2,a3);
}
bigCount = (bigCount + c) % 1000;
},
name : "length64",
},{
func : function (){
var i,c=0,a1,a2,a3,a4;
for (i = 0; i < 10000; i += 1) {
a1 = a[i];
a2 = a[i+1];
a3 = a[i+2];
a4 = a[i+3];
c += lengthF(a1,a2,a3,a4);
c += lengthF(a2,a3,a4,a1);
c += lengthF(a3,a4,a1,a2);
c += lengthF(a4,a1,a2,a3);
}
bigCount = (bigCount + c) % 1000;
},
name : "length32",
},{
func : function (){
var i,c=0,a1,a2,a3,a4;
for (i = 0; i < 10000; i += 1) {
a1 = a[i];
a2 = a[i+1];
a3 = a[i+2];
a4 = a[i+3];
c += len(a1,a2,a3,a4);
c += len(a2,a3,a4,a1);
c += len(a3,a4,a1,a2);
c += len(a4,a1,a2,a3);
}
bigCount = (bigCount + c) % 1000;
},
name : "length JS",
},{
func : function (){
var i,c=0,a1,a2,a3,a4;
for (i = 0; i < 10000; i += 1) {
a1 = a[i];
a2 = a[i+1];
a3 = a[i+2];
a4 = a[i+3];
c += lenSlow(a1,a2,a3,a4);
c += lenSlow(a2,a3,a4,a1);
c += lenSlow(a3,a4,a1,a2);
c += lenSlow(a4,a1,a2,a3);
}
bigCount = (bigCount + c) % 1000;
},
name : "Length JS Slow",
},{
func : function (){
var i,c=0,a1,a2,a3,a4;
for (i = 0; i < 10000; i += 1) {
a1 = a[i];
a2 = a[i+1];
a3 = a[i+2];
a4 = a[i+3];
c += lenEmpty(a1,a2,a3,a4);
c += lenEmpty(a2,a3,a4,a1);
c += lenEmpty(a3,a4,a1,a2);
c += lenEmpty(a4,a1,a2,a3);
}
bigCount = (bigCount + c) % 1000;
},
name : "Empty",
}
],
Results from update.
Because there is a lot more overhead in the test, the results are closer, but the JS code is still two orders of magnitude faster.
Note how slow the Math.hypot function is. If the benchmark were being optimised away, that function would be near the faster len function.
WebAssembly 13389µs
Javascript 728µs
/*
=======================================
Performance test. : WebAssm V Javascript
Use strict....... : true
Data view........ : false
Duplicates....... : 4
Cycles........... : 147
Samples per cycle : 100
Tests per Sample. : undefined
---------------------------------------------
Test : 'length64'
Mean : 12736µs ±69µs (*) 3013 samples
---------------------------------------------
Test : 'length32'
Mean : 13389µs ±94µs (*) 2914 samples
---------------------------------------------
Test : 'length JS'
Mean : 728µs ±6µs (*) 2906 samples
---------------------------------------------
Test : 'Length JS Slow'
Mean : 23374µs ±191µs (*) 2939 samples << This function use Math.hypot
rather than Math.sqrt
---------------------------------------------
Test : 'Empty'
Mean : 79µs ±2µs (*) 2928 samples
-All ----------------------------------------
Mean : 10.097ms Totals time : 148431.200ms 14700 samples
(*) Error rate approximation does not represent the variance.
*/
What's the point of WebAssembly if it does not optimise?
End of update
All the stuff related to the problem.
Find length of a line.
Original source in custom language
// declare func the < indicates export name, the param with types and return type
func <lengthF(float x, float y, float x1, float y1) float {
float nx, ny, dist; // declare locals float is f32
nx = x1 - x;
ny = y1 - y;
dist = sqrt(ny * ny + nx * nx);
return dist;
}
// and as double
func <length(double x, double y, double x1, double y1) double {
double nx, ny, dist;
nx = x1 - x;
ny = y1 - y;
dist = sqrt(ny * ny + nx * nx);
return dist;
}
The code compiles to WAT for proofreading:
(module
(func
(export "lengthF")
(param f32 f32 f32 f32)
(result f32)
(local f32 f32 f32)
get_local 2
get_local 0
f32.sub
set_local 4
get_local 3
get_local 1
f32.sub
tee_local 5
get_local 5
f32.mul
get_local 4
get_local 4
f32.mul
f32.add
f32.sqrt
)
(func
(export "length")
(param f64 f64 f64 f64)
(result f64)
(local f64 f64 f64)
get_local 2
get_local 0
f64.sub
set_local 4
get_local 3
get_local 1
f64.sub
tee_local 5
get_local 5
f64.mul
get_local 4
get_local 4
f64.mul
f64.add
f64.sqrt
)
)
As compiled wasm in a hex string (note: it does not include the name section), loaded using WebAssembly.compile. The exported functions are then run against the JavaScript function len (in the snippet below).
// hex of above without the name section
const asm = `0061736d0100000001110260047d7d7d7d017d60047c7c7c7c017c0303020001071402076c656e677468460000066c656e67746800010a3b021c01037d2002200093210420032001932205200594200420049492910b1c01037c20022000a1210420032001a122052005a220042004a2a09f0b`
const bin = new Uint8Array(asm.length >> 1);
for(var i = 0; i < asm.length; i+= 2){ bin[i>>1] = parseInt(asm.substr(i,2),16) }
var length,lengthF;
WebAssembly.compile(bin).then(module => {
const wasmInstance = new WebAssembly.Instance(module, {});
lengthF = wasmInstance.exports.lengthF;
length = wasmInstance.exports.length;
});
// test values are const (same result if from array or literals)
const a1 = rand(-100000,100000);
const a2 = rand(-100000,100000);
const a3 = rand(-100000,100000);
const a4 = rand(-100000,100000);
// javascript version of function
function len(x,y,x1,y1){
var nx = x1 - x;
var ny = y1 - y;
return Math.sqrt(nx * nx + ny * ny);
}
And the test code is the same for all 3 functions and run in strict mode.
tests : [{
func : function (){
var i;
for (i = 0; i < 100000; i += 1) {
length(a1,a2,a3,a4);
}
},
name : "length64",
},{
func : function (){
var i;
for (i = 0; i < 100000; i += 1) {
lengthF(a1,a2,a3,a4);
}
},
name : "length32",
},{
func : function (){
var i;
for (i = 0; i < 100000; i += 1) {
len(a1,a2,a3,a4);
}
},
name : "lengthNative",
}
]
The test results on Firefox are:
/*
=======================================
Performance test. : WebAssm V Javascript
Use strict....... : true
Data view........ : false
Duplicates....... : 4
Cycles........... : 34
Samples per cycle : 100
Tests per Sample. : undefined
---------------------------------------------
Test : 'length64'
Mean : 26359µs ±128µs (*) 1128 samples
---------------------------------------------
Test : 'length32'
Mean : 27456µs ±109µs (*) 1144 samples
---------------------------------------------
Test : 'lengthNative'
Mean : 106µs ±2µs (*) 1128 samples
-All ----------------------------------------
Mean : 18.018ms Totals time : 61262.240ms 3400 samples
(*) Error rate approximation does not represent the variance.
*/
Andreas describes a number of good reasons why the JavaScript implementation was initially observed to be x300 faster. However, there are a number of other issues with your code.
This is a classic 'micro benchmark', i.e. the code that you are testing is so small that the other overheads within your test loop are a significant factor. For example, there is an overhead in calling WebAssembly from JavaScript, which will factor into your results. What are you trying to measure? Raw processing speed, or the overhead of the language boundary?
Your results vary wildly, from x300 to x2, due to small changes in your test code. Again, this is a micro benchmark issue. Others have seen the same when using this approach to measure performance, for example this post claims wasm is x84 faster, which is clearly wrong!
The current WebAssembly VM is very new, and an MVP. It will get faster. Your JavaScript VM has had 20 years to reach its current speed. The performance of the JS <=> wasm boundary is being worked on and optimised right now.
For a more definitive answer, see the joint paper from the WebAssembly team, which outlines an expected runtime performance gain of around 30%
Finally, to answer your point:
What's the point of WebAssembly if it does not optimise?
I think you have misconceptions around what WebAssembly will do for you. Based on the paper above, the runtime performance optimisations are quite modest. However, there are still a number of performance advantages:
Its compact binary format and low-level nature mean that the browser can load, parse and compile the code much faster than JavaScript. It is anticipated that WebAssembly can be compiled faster than your browser can download it (see the streaming-instantiation sketch below).
WebAssembly has predictable runtime performance. With JavaScript the performance generally increases with each iteration as it is further optimised. It can also decrease due to de-optimisation.
There are also a number of non-performance-related advantages.
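As an illustration of the fast-loading point above, streaming instantiation lets the engine compile the module while it is still downloading; a minimal sketch (the URL and the empty import object are placeholders):

// Compile and instantiate a wasm module as it streams in, rather than after the full download.
(async () => {
    const { instance } = await WebAssembly.instantiateStreaming(
        fetch("module.wasm"), // placeholder URL
        {}                    // imports, if the module needs any
    );
    // instance.exports now holds the module's exported functions.
})();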
For a more realistic performance measurement, take a look at:
Its use within Figma
Results from using it with PDFKit
Both are practical, production codebases.
The JS engine can apply a lot of dynamic optimisations to this example:
1) Perform all calculations with integers and only convert to double for the final call to Math.sqrt.
2) Inline the call to the len function.
3) Hoist the computation out of the loop, since it always computes the same thing.
4) Recognise that the loop is left empty and eliminate it entirely.
5) Recognise that the result is never returned from the testing function, and hence remove the entire body of the test function.
All but (4) apply even if you add the result of every call. With (5) the end result is an empty function either way.
With Wasm an engine cannot do most of these steps, because it cannot inline across language boundaries (at least no engine does that today, AFAICT). Also, for Wasm it is assumed that the producing (offline) compiler has already performed relevant optimisations, so a Wasm JIT tends to be less aggressive than one for JavaScript, where static optimisation is impossible.
Serious answer
It seemed like
WebAssembly is far from a ready technology.
actually did play a role in this, and performance of calling WASM from JS in Firefox was improved in late 2018.
Running your benchmarks in a current FF/Chromium yields results like "Calling the WASM implementation from JS is 4-10 times slower than calling the JS implementation from JS". Still, it seems like engines don't inline across WASM/JS borders, and the overhead of having to call vs. not having to call is significant (as the other answers already pointed out).
Mocking answer
Your benchmarks are all wrong. It turns out that JS is actually 8-40 times (FF, Chrome) slower than WASM. WTF, JS is soo slooow.
Do I intend to prove that? Of course (not).
First, I re-implement your benchmarking code in C:
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
static double lengthC(double x, double y, double x1, double y1) {
double nx = x1 - x;
double ny = y1 - y;
return sqrt(nx * nx + ny * ny);
}
double lengthArrayC(double* a, size_t length) {
double c = 0;
for (size_t i = 0; i < length; i++) {
double a1 = a[i + 0];
double a2 = a[i + 1];
double a3 = a[i + 2];
double a4 = a[i + 3];
c += lengthC(a1,a2,a3,a4);
c += lengthC(a2,a3,a4,a1);
c += lengthC(a3,a4,a1,a2);
c += lengthC(a4,a1,a2,a3);
}
return c;
}
#ifdef __wasm__
__attribute__((import_module("js"), import_name("len")))
double lengthJS(double x, double y, double x1, double y1);
double lengthArrayJS(double* a, size_t length) {
double c = 0;
for (size_t i = 0; i < length; i++) {
double a1 = a[i + 0];
double a2 = a[i + 1];
double a3 = a[i + 2];
double a4 = a[i + 3];
c += lengthJS(a1,a2,a3,a4);
c += lengthJS(a2,a3,a4,a1);
c += lengthJS(a3,a4,a1,a2);
c += lengthJS(a4,a1,a2,a3);
}
return c;
}
__attribute__((import_module("bench"), import_name("now")))
double now();
__attribute__((import_module("bench"), import_name("result")))
void printtime(int benchidx, double ns);
#else
void printtime(int benchidx, double ns) {
if (benchidx == 1) {
printf("C: %f ns\n", ns);
} else if (benchidx == 0) {
printf("avoid the optimizer: %f\n", ns);
} else {
fprintf(stderr, "Unknown benchmark: %d", benchidx);
exit(-1);
}
}
double now() {
struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0) {
return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
} else {
return sqrt(-1);
}
}
#endif
#define iters 1000000
double a[iters+3];
int main() {
int bigCount = 0;
srand(now());
for (size_t i = 0; i < iters + 3; i++)
a[i] = (double)rand()/RAND_MAX*2e5-1e5;
for (int i = 0; i < 10; i++) {
double startTime, endTime;
double c;
startTime = now();
c = lengthArrayC(a, iters);
endTime = now();
bigCount = (bigCount + (int64_t)c) % 1000;
printtime(1, (endTime - startTime) * 1e9 / iters / 4);
#ifdef __wasm__
startTime = now();
c = lengthArrayJS(a, iters);
endTime = now();
bigCount = (bigCount + (int64_t)c) % 1000;
printtime(2, (endTime - startTime) * 1e9 / iters / 4);
#endif
}
printtime(0, bigCount);
return 0;
}
Compile it with clang 12.0.1:
clang -O3 -target wasm32-wasi --sysroot /opt/wasi-sdk/wasi-sysroot/ foo2.c -o foo2.wasm
And provide it with a length function from JS via imports:
"use strict";
(async (wasm) => {
const wasmbytes = new Uint8Array(wasm.length);
for (var i in wasm)
wasmbytes[i] = wasm.charCodeAt(i);
(await WebAssembly.instantiate(wasmbytes, {
js: {
len: function (x,y,x1,y1) {
var nx = x1 - x;
var ny = y1 - y;
return Math.sqrt(nx * nx + ny * ny);
}
},
bench: {
now: () => window.performance.now() / 1e3,
result: (bench, ns) => {
let name;
if (bench == 1) { name = "C" }
else if (bench == 2) { name = "JS" }
else if (bench == 0) { console.log("Optimizer confuser: " + ns); /*not really necessary*/; return; }
else { throw "unknown bench"; }
console.log(name + ": " + ns + " ns");
},
},
})).instance.exports._start();
})(atob('AGFzbQEAAAABFQRgBHx8fHwBfGAAAXxgAn98AGAAAAIlAwJqcwNsZW4AAAViZW5jaANub3cAAQViZW5jaAZyZXN1bHQAAgMCAQMFAwEAfAcTAgZtZW1vcnkCAAZfc3RhcnQAAwr2BAHzBAMIfAJ/An5BmKzoAwJ/EAEiA0QAAAAAAADwQWMgA0QAAAAAAAAAAGZxBEAgA6sMAQtBAAtBAWutNwMAQejbl3whCANAQZis6ANBmKzoAykDAEKt/tXk1IX9qNgAfkIBfCIKNwMAIAhBmKzoA2ogCkIhiKe3RAAAwP///99Bo0QAAAAAAGoIQaJEAAAAAABq+MCgOQMAIAhBCGoiCA0ACwNAEAEhBkGQCCsDACEBQYgIKwMAIQRBgAgrAwAhAEQAAAAAAAAAACECQRghCANAIAQhAyABIgQgAKEiASABoiIHIAMgCEGACGorAwAiAaEiBSAFoiIFoJ8gACAEoSIAIACiIgAgBaCfIAAgASADoSIAIACiIgCgnyACIAcgAKCfoKCgoCECIAMhACAIQQhqIghBmKToA0cNAAtBARABIAahRAAAAABlzc1BokQAAAAAgIQuQaNEAAAAAAAA0D+iEAICfiACmUQAAAAAAADgQ2MEQCACsAwBC0KAgICAgICAgIB/CyALfEQAAAAAAAAAACECQYDcl3whCBABIQMDQCACIAhBgKzoA2orAwAiBSAIQYis6ANqKwMAIgEgCEGQrOgDaisDACIAIAhBmKzoA2orAwAiBBAAoCABIAAgBCAFEACgIAAgBCAFIAEQAKAgBCAFIAEgABAAoCECIAhBCGoiCA0AC0ECEAEgA6FEAAAAAGXNzUGiRAAAAACAhC5Bo0QAAAAAAADQP6IQAkLoB4EhCgJ+IAKZRAAAAAAAAOBDYwRAIAKwDAELQoCAgICAgICAgH8LIAp8QugHgSELIAlBAWoiCUEKRw0AC0EAIAuntxACCwB2CXByb2R1Y2VycwEMcHJvY2Vzc2VkLWJ5AQVjbGFuZ1YxMS4wLjAgKGh0dHBzOi8vZ2l0aHViLmNvbS9sbHZtL2xsdm0tcHJvamVjdCAxNzYyNDliZDY3MzJhODA0NGQ0NTcwOTJlZDkzMjc2ODcyNGE2ZjA2KQ=='))
Now, calling the JS function from WASM is unsurprisingly a lot slower than calling the WASM function from WASM. (In fact, in the WASM→WASM case there is no call at all: you can see the f64.sqrt being inlined into _start.)
(One last interesting datapoint is that WASM→WASM and JS→JS seem to have about the same cost (about 1.5 ns per inlined length(…) on my E3-1280). Disclaimer: It's entirely possible that my benchmark is even more broken than the original question.)
Conclusion
WASM isn't slow, crossing the border is. For now and the foreseeable future, don't put things into WASM unless they're a significant computational task. (And even then, it depends. Sometimes, JS engines are really smart. Sometimes.)
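One practical consequence for the geometry-library idea from the question: if the work does move to wasm, cross the boundary once per batch rather than once per segment. This is only a sketch of the idea; lengthBatch is a hypothetical export (the module in the question does not have it) that would read packed segments from linear memory:

// Idea sketch: amortise the JS <-> wasm call overhead over many segments per call.
// Assumes a hypothetical export lengthBatch(ptr, count) that reads `count` segments
// (x, y, x1, y1 as f64) starting at byte offset `ptr` and writes the lengths back in place.
function lengthsViaWasm(instance, segments /* Float64Array, 4 values per segment */) {
    const { memory, lengthBatch } = instance.exports;
    const ptr = 0; // for the sketch, just use the start of linear memory
    new Float64Array(memory.buffer, ptr, segments.length).set(segments);
    lengthBatch(ptr, segments.length / 4); // one boundary crossing for the whole batch
    return new Float64Array(memory.buffer, ptr, segments.length / 4).slice();
}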
I have a 2D array, something like the following:
[1.11, 23]
[2.22, 52]
[3.33, 61]
...
Where the array is ordered by the first value in each row.
I am trying to find a value within the array that is close to the search value - within a certain sensitivity. The way this is set up, and the value of the sensitivity, ensure only one possible match within the array.
The search value is the current x-pos of the mouse. The search is called on mousemove, and so is being called often.
Originally I had the following (using a start-to-end for loop):
for (var i = 0; i < arr.length; i++) {
    if (Math.abs(arr[i][0] - x) <= sensitivity) {
        hit = true;
        break;
    }
}
And it works like a charm. So far, I've only been using small data sets, so there is no apparent lag using this method. But, eventually I will be using much larger data sets, and so want to switch this to a Binary Search:
var a = 0;
var b = arr.length - 1;
var c = 0;
while (a < b) {
    c = Math.floor((a + b) / 2);
    if (Math.abs(arr[c][0] - x) <= sensitivity) {
        hit = true;
        break;
    } else if (arr[c][0] < x) {
        a = c;
    } else {
        b = c;
    }
}
This works well, for all of 2 seconds, and then it hangs to the point where I need to restart my browser. I've used binary searches plenty in the past, and cannot for the life of me figure out why this one isn't working properly.
EDIT 1
var sensitivity = (width / arr.length) / 2.001
The points in the array are equidistant, and so this sensitivity ensures that there is no ambiguous 1/2-way point in between two arr values. You are either in one or the other.
Values are created dynamically at page load, but look exactly like what I've mentioned above. The x-values have more significant figures, and the y values are all over the place, but there is no significant difference between the small sample I provided and the generated one.
EDIT 2
Printed a list that was dynamically created:
[111.19999999999999, 358.8733333333333]
[131.4181818181818, 408.01333333333326]
[151.63636363636363, 249.25333333333327]
[171.85454545454544, 261.01333333333326]
[192.07272727272726, 298.39333333333326]
[212.29090909090908, 254.2933333333333]
[232.5090909090909, 308.47333333333324]
[252.72727272727272, 331.1533333333333]
[272.94545454545454, 386.1733333333333]
[293.16363636363633, 384.9133333333333]
[313.3818181818182, 224.05333333333328]
[333.6, 284.53333333333325]
[353.81818181818187, 278.2333333333333]
[374.0363636363637, 391.63333333333327]
[394.25454545454556, 322.33333333333326]
[414.4727272727274, 300.9133333333333]
[434.69090909090926, 452.95333333333326]
[454.9090909090911, 327.7933333333333]
[475.12727272727295, 394.9933333333332]
[495.3454545454548, 451.27333333333326]
[515.5636363636366, 350.89333333333326]
[535.7818181818185, 308.47333333333324]
[556.0000000000003, 395.83333333333326]
[576.2181818181822, 341.23333333333323]
[596.436363636364, 371.47333333333324]
[616.6545454545459, 436.9933333333333]
[636.8727272727277, 280.7533333333333]
[657.0909090909096, 395.4133333333333]
[677.3090909090914, 433.21333333333325]
[697.5272727272733, 355.09333333333325]
[717.7454545454551, 333.2533333333333]
[737.963636363637, 255.55333333333328]
[758.1818181818188, 204.7333333333333]
[778.4000000000007, 199.69333333333327]
[798.6181818181825, 202.63333333333327]
[818.8363636363644, 253.87333333333328]
[839.0545454545462, 410.5333333333333]
[859.272727272728, 345.85333333333324]
[879.4909090909099, 305.11333333333323]
[899.7090909090917, 337.8733333333333]
[919.9272727272736, 351.3133333333333]
[940.1454545454554, 324.01333333333326]
[960.3636363636373, 331.57333333333327]
[980.5818181818191, 447.4933333333333]
[1000.800000000001, 432.3733333333333]
As you can see, it is ordered by the first value in each row, ascending.
SOLUTION
Changing the condition to
while(a < b)
and
var b = positions.length;
and
else if(arr[c][0] < x){
a = c + 1;
}
did the trick.
Your binary search seems to be a bit off: try this.
var arr = [[1, 0], [3, 0], [5, 0]];
var lo = 0;
var hi = arr.length;
var x = 5;
var sensitivity = 0.1;

while (lo < hi) {
    var c = Math.floor((lo + hi) / 2);
    if (Math.abs(arr[c][0] - x) <= sensitivity) {
        hit = true;
        console.log("FOUND " + c);
        break;
    } else if (x > arr[c][0]) {
        lo = c + 1;
    } else {
        hi = c;
    }
}
This is meant as a general reference to anyone implementing binary search.
Let:
lo be the smallest index that may possibly contain your value,
hi be one more than the largest index that may contain your value
If these conventions are followed, then binary search is simply:
while (lo < hi) {
    var mid = Math.floor((lo + hi) / 2);
    if (query == ary[mid]) {
        // do stuff
    } else if (query < ary[mid]) {
        // query is smaller than mid
        // so query can be anywhere between lo and (mid - 1)
        // the upper bound should be adjusted
        hi = mid;
    } else {
        // query can be anywhere between (mid + 1) and hi.
        // adjust the lower bound
        lo = mid + 1;
    }
}
I don't know your exact situation, but here's a way the code could crash:
1) Start with an array with two X values. This array will have a length of 2, so a = 0, b = 1, c = 0.
2) a < b, so the while loop executes.
3) c = floor((a + b) / 2) = floor(0.5) = 0.
4) Assume the mouse is not within sensitivity of the first X value, so the first if branch does not hit.
5) Assume our X values are to the left of our mouse (so arr[c][0] < x), which makes the second if branch enter. This sets a = c, or 0, which it already is.
6) Thus, we get an endless loop.
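For completeness, here is that two-element scenario with the corrected narrowing step (a = c + 1, plus the exclusive upper bound) from the answer above; the search interval now shrinks on every iteration, so the loop terminates instead of hanging:

var arr = [[10, 0], [20, 0]]; // both X values to the left of the mouse
var x = 25, sensitivity = 1;  // no value is within sensitivity, so no hit is possible
var a = 0, b = arr.length, hit = false;
while (a < b) {
    var c = Math.floor((a + b) / 2);
    if (Math.abs(arr[c][0] - x) <= sensitivity) { hit = true; break; }
    else if (arr[c][0] < x) { a = c + 1; } // the interval always shrinks now
    else { b = c; }
}
console.log(hit); // false -- the loop runs out of candidates instead of spinning forever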
I am searching for a way to calculate the cumulative distribution function in JavaScript. Are there classes which have implemented this? Do you have an idea of how to get this to work? It does not need to be 100% accurate, but I need a good idea of the value.
http://en.wikipedia.org/wiki/Cumulative_distribution_function
I was able to write my own function with the help of Is there an easily available implementation of erf() for Python? and the knowledge from Wikipedia.
The calculation is not 100% correct, as it is just an approximation.
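For reference, the code below implements Abramowitz and Stegun's approximation 7.1.26 of the error function, whose maximum absolute error is about 1.5e-7:

$$\operatorname{erf}(x) \approx 1 - \left(a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + a_5 t^5\right)e^{-x^2}, \qquad t = \frac{1}{1 + p x}, \quad p = 0.3275911,$$

with the normal CDF then given by $\Phi(x) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}\bigl(\tfrac{x - \mu}{\sigma\sqrt{2}}\bigr)\bigr)$.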
function normalcdf(mean, sigma, to) {
    var z = (to - mean) / Math.sqrt(2 * sigma * sigma);
    var t = 1 / (1 + 0.3275911 * Math.abs(z));
    var a1 = 0.254829592;
    var a2 = -0.284496736;
    var a3 = 1.421413741;
    var a4 = -1.453152027;
    var a5 = 1.061405429;
    var erf = 1 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * Math.exp(-z * z);
    var sign = 1;
    if (z < 0) {
        sign = -1;
    }
    return (1 / 2) * (1 + sign * erf);
}
normalcdf(30, 25, 1.4241); //-> 0.12651187738346226
//wolframalpha.com 0.12651200000000000
The math.js library provides an erf function. Based on a definition found at Wolfram Alpha, the cdfNormal function can be implemented like this in JavaScript:
const mathjs = require('mathjs')
function cdfNormal (x, mean, standardDeviation) {
return (1 - mathjs.erf((mean - x ) / (Math.sqrt(2) * standardDeviation))) / 2
}
In the node.js console:
> console.log(cdfNormal(5, 30, 25))
> 0.15865525393145707 // Equal to Wolfram Alpha's result at: https://sandbox.open.wolframcloud.com/app/objects/4935c1cb-c245-4d8d-9668-4d353ad714ec#sidebar=compute
This formula will give the correct normal CDF unlike the currently accepted answer
function ncdf(x, mean, std) {
    var x = (x - mean) / std;
    var t = 1 / (1 + 0.2315419 * Math.abs(x));
    var d = 0.3989423 * Math.exp(-x * x / 2);
    var prob = d * t * (0.3193815 + t * (-0.3565638 + t * (1.781478 + t * (-1.821256 + t * 1.330274))));
    if (x > 0) prob = 1 - prob;
    return prob;
}
This answer comes from math.ucla.edu
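As a quick sanity check against the example used earlier in this thread (mean 30, standard deviation 25):

console.log(ncdf(1.4241, 30, 25)); // ~0.126512, in line with the normalcdf result and Wolfram Alpha above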
Due to some needs in the past, I put together an implementation of distribution functions in JavaScript. My library is available on GitHub: https://github.com/chen0040/js-stats
It provides JavaScript implementations of the CDF and inverse CDF for the Normal distribution, Student's T distribution, F distribution and Chi-Square distribution.
To use the js lib for obtaining CDF and inverse CDF:
jsstats = require('js-stats');
//====================NORMAL DISTRIBUTION====================//
var mu = 0.0; // mean
var sd = 1.0; // standard deviation
var normal_distribution = new jsstats.NormalDistribution(mu, sd);
var X = 10.0; // point estimate value
var p = normal_distribution.cumulativeProbability(X); // cumulative probability
var p = 0.7; // cumulative probability
var X = normal_distribution.invCumulativeProbability(p); // point estimate value
//====================T DISTRIBUTION====================//
var df = 10; // degrees of freedom for t-distribution
var t_distribution = new jsstats.TDistribution(df);
var t_df = 10.0; // point estimate or test statistic
var p = t_distribution.cumulativeProbability(t_df); // cumulative probability
var p = 0.7;
var t_df = t_distribution.invCumulativeProbability(p); // point estimate or test statistic
//====================F DISTRIBUTION====================//
var df1 = 10; // degrees of freedom for f-distribution
var df2 = 20; // degrees of freedom for f-distribution
var f_distribution = new jsstats.FDistribution(df1, df2);
var F = 10.0; // point estimate or test statistic
var p = f_distribution.cumulativeProbability(F); // cumulative probability
//====================Chi Square DISTRIBUTION====================//
var df = 10; // degrees of freedom for cs-distribution
var cs_distribution = new jsstats.ChiSquareDistribution(df);
var X = 10.0; // point estimate or test statistic
var p = cs_distribution.cumulativeProbability(X); // cumulative probability
This is a brute force implementation, but accurate to more digits of precision. The approximation above is accurate within 10^-7. My implementation runs slower (700 nano-sec) but is accurate within 10^-14. normal(25,30,1.4241) === 0.00022322110257305683, vs wolfram's 0.000223221102572082.
It takes the power series of the standard normal pdf, i.e. the bell-curve, and then integrates the series.
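Concretely, the series being summed is the term-by-term integral of the standard normal pdf:

$$\Phi(z) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k\, z^{2k+1}}{2^k\, k!\, (2k+1)},$$

which the code evaluates in pairs of consecutive terms to limit floating point cancellation.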
I originally wrote this in C, so I concede some of the optimizations might seem silly in Javascript.
function normal(x, mu, sigma) {
    return stdNormal((x - mu) / sigma);
}

function stdNormal(z) {
    var k, m, values, total, item, z2, z4, a, b;

    // Power series is not stable at these extreme tail scenarios
    if (z < -6) { return 0; }
    if (z > 6) { return 1; }

    m = 1;        // m(k) == (2**k) * factorial(k)
    b = z;        // b(k) == z ** (2*k + 1)
    z2 = z * z;   // cache of z squared
    z4 = z2 * z2; // cache of z to the 4th
    values = [];

    // Compute the power series in groups of two terms.
    // This reduces floating point errors because the series
    // alternates between positive and negative.
    for (k = 0; k < 100; k += 2) {
        a = 2 * k + 1;
        item = b / (a * m);
        item *= (1 - (a * z2) / ((a + 1) * (a + 2)));
        values.push(item);
        m *= (4 * (k + 1) * (k + 2));
        b *= z4;
    }

    // Add the smallest terms to the total first so that
    // we minimize the floating point errors.
    total = 0;
    for (k = 49; k >= 0; k--) {
        total += values[k];
    }

    // Multiply total by 1/sqrt(2*PI)
    // Then add 0.5 so that stdNormal(0) === 0.5
    return 0.5 + 0.3989422804014327 * total;
}
You can also take a look here: it's a scientific calculator implemented in JavaScript; it includes erf, and its author claims no copyright on the implementation.