Bias in randomizing normally distributed numbers (javascript)


I’m having problems generating normally distributed random numbers (mu=0, sigma=1) using JavaScript.
I’ve tried the Box-Muller method and the ziggurat algorithm, but the mean of the generated series of numbers comes out as 0.0015 or -0.0018, which is very far from zero! Over 500,000 randomly generated numbers this is a big issue. It should be close to zero, something like 0.000000000001.
I cannot figure out whether it’s a problem with the method, or whether JavaScript’s built-in Math.random() generates numbers that are not exactly uniformly distributed.
Has anyone run into similar problems?
Here you can find the ziggurat function:
http://www.filosophy.org/post/35/normaldistributed_random_values_in_javascript_using_the_ziggurat_algorithm/
And below is the code for the Box-Muller:
function rnd_bmt() {
  var x = 0, y = 0, rds, c;

  // Get two random numbers from -1 to 1.
  // If the radius is zero or greater than 1, throw them out and pick two
  // new ones. Rejection sampling throws away about 21% of the pairs.
  do {
    x = Math.random() * 2 - 1;
    y = Math.random() * 2 - 1;
    rds = x * x + y * y;
  } while (rds === 0 || rds > 1);

  // This is the polar (Marsaglia) form of the Box-Muller transform
  c = Math.sqrt(-2 * Math.log(rds) / rds);

  // It always creates a pair of numbers. I'll return them in an array.
  // This function is quite efficient so don't be afraid to throw one away
  // if you don't need both.
  return [x * c, y * c];
}

If you generate n independent normal random variables, the standard deviation of the mean will be sigma / sqrt(n).
In your case n = 500000 and sigma = 1, so the standard error of the mean is approximately 1 / 707 = 0.0014. The 95% confidence interval, given a true mean of 0, would be around twice this, or (-0.0028, 0.0028). Your sample means are well within this range.
Your expectation of obtaining 0.000000000001 (1e-12) is not mathematically grounded. To get within that range of accuracy, you would need to generate about 10^24 samples. At 10,000 samples per second that is 10^20 seconds, which would still take about 3 trillion years to do...this is precisely why it's good to avoid computing things by simulation if possible.
On the other hand, your algorithm does seem to be implemented correctly :)
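The arithmetic above can be checked with a few lines of JavaScript. This is only a sketch of the sigma / sqrt(n) formula from this answer; the helper names standardErrorOfMean and samplesForStandardError are made up for illustration.

```javascript
// Standard error of the mean of n independent samples with standard deviation sigma
function standardErrorOfMean(sigma, n) {
  return sigma / Math.sqrt(n);
}

// Number of samples needed so that the standard error drops to `target`
// (the same formula solved for n)
function samplesForStandardError(sigma, target) {
  return Math.pow(sigma / target, 2);
}

const se = standardErrorOfMean(1, 500000);
console.log('standard error for n = 500000:', se.toFixed(5));
console.log('rough 95% interval: +/-', (2 * se).toFixed(4));

const needed = samplesForStandardError(1, 1e-12);
const secondsPerYear = 365.25 * 24 * 3600;
const years = needed / 10000 / secondsPerYear; // at 10,000 samples per second
console.log('samples needed for 1e-12:', needed.toExponential(1));
console.log('years at 10,000 samples/s:', years.toExponential(2));
```

A sample mean of 0.0015 or -0.0018 is therefore roughly one standard error away from zero, which is what you would expect to see most of the time.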

Related

Generating a random number within a large range based on probabilities

I am curious how Stake.com managed to create the game "Limbo" where the odds of a multiplier happening is specific to the probability they've calculated. Here's the game : https://stake.com/casino/games/limbo
For example :
Multiplier -> x2
Probability -> 49.5% chance.
What it means is you have a 49.5% chance of winning, because those are the odds that the multiplier will actually hit a number above x2.
If you set the multiplier all the way up to x1,000,000, you have a 0.00099% chance of actually hitting 1,000,000.
It's not a project I'm working on but I'm just extremely curious how we could achieve this.
Example:
Math.floor(Math.random()*1000000)
is not as random as we think, since Math.random() generates a number between 0 and 1. When paired with a huge multiplier like 1,000,000, we would actually generate a 6-figure number most of the time, and it's not as random as we thought.
I've read that we have to convert it into a power law distribution but I'm not sure how it works. Would love to have more material to read up on how it works.

It sounds like you need to define some function that gives the probability of winning for a given multiplier N. These probabilities don't have to add up to 1, because they are not part of the same random variable; there is a unique random variable for each N chosen and two events, win or lose; we can subscript them as win(N) and lose(N). We really only need to define win(N) since lose(N) = 1 - win(N).
Something like an exponential function would make sense here. Consider win(N) = 2^(1 - N). Then we get the following probabilities of winning:
N    win(N)
1    1
2    1/2
3    1/4
4    1/8
etc.
Or we could use just an inverse function: win(N) = 1/N
N    win(N)
1    1
2    1/2
3    1/3
...
Then to actually see whether you win or lose for a given N, just choose a random number in some range ([0.0, 1.0) works fine for this purpose) and see whether that number is less than win(N). If so, it's a win; if not, it's a loss.
Yes, technically speaking, it is probably true that the floating point numbers are not really uniformly distributed over [0, 1) when calling standard library functions. If you really need that level of precision then you have a much harder problem. But, for a game, regular rand() type functions should be plenty uniform for your purposes.
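As a rough sketch of the idea in this answer, here is the inverse function win(N) = 1/N with a house edge applied. The 1% edge is only an assumption, chosen because it reproduces the 49.5% quoted in the question for x2; it is not taken from the actual game. The names HOUSE_EDGE, winChance, play and drawResult are hypothetical, and the random source is passed in so the functions can be tried with fixed values.

```javascript
// Assumed house edge (hypothetical): 1% makes winChance(2) equal 0.495
const HOUSE_EDGE = 0.01;

// Probability of winning at a given target multiplier: (1 - edge) / N, capped at 1
function winChance(multiplier) {
  return Math.min(1, (1 - HOUSE_EDGE) / multiplier);
}

// One round: draw a uniform number in [0, 1) and compare it with the win chance
function play(multiplier, rng = Math.random) {
  return rng() < winChance(multiplier);
}

// Alternative view: draw the result multiplier itself by inverting the formula.
// With u uniform in [0, 1), P(result >= N) = (1 - edge) / N for N >= 1 - edge,
// which is the heavy-tailed ("power law") shape mentioned in the question.
function drawResult(rng = Math.random) {
  const u = rng();
  return (1 - HOUSE_EDGE) / (1 - u);
}

console.log(winChance(2));
console.log(play(2));
console.log(drawResult());
```

Either view gives the same odds: checking play(N) directly, or drawing one result and asking whether it reached N.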

Finding a pattern in an array that is not always consistent

I have an ordered data set of decimal numbers. This data is always similar, but not always the same. The expected data is a few (0 - 5) large numbers, followed by several (10 - 90) average numbers, then followed by smaller numbers. There are cases where a large number may be mixed into the average numbers. See the following arrays.
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
let expectedData = [35.267,32.267,9.267,9.332,9.186,9.220,9.141,9.107,30.267,9.114,9.098,9.181,9.220,4.012,0.132];
I am trying to analyze the data by getting the average without the high numbers at the front and the low numbers at the back. High or low values in the middle are fine to keep in the average. I have a partial solution below. Right now I am sort of brute forcing it, but the solution isn't perfect: on smaller datasets the first average calculation is influenced by the large number.
My question is: Is there a way to handle this type of problem, which is identifying patterns in an array of numbers?
My algorithm is:
Get an average of the array
Calculate an above/below average value
Remove front (n) elements that are above average
remove end elements that are below average
Recalculate average
In JavaScript I have: (this is partial leaving out below average)
let total = expectedData.reduce((rt, cur) => { return rt + cur; }, 0);
let avg = total / expectedData.length;
let aboveAvg = avg * 0.1 + avg;
let remove = -1;
for (let k = 0; k < expectedData.length; k++) {
  if (expectedData[k] > aboveAvg) {
    remove = k;
  } else {
    if (k == 0) {
      remove = -1; // no need to remove
    }
    // break because we don't want large values from middle removed.
    break;
  }
}
if (remove >= 0) {
  // remove front above average
  expectedData.splice(0, remove + 1);
}
// remove belows
// recalculate average

I believe you are looking for an outlier detection algorithm. There are already a bunch of questions related to this on Stack Overflow.
However, each outlier detection algorithm has its own merits.
Here are a few of them
https://mathworld.wolfram.com/Outlier.html
High outliers are anything beyond the 3rd quartile + 1.5 * the inter-quartile range (IQR)
Low outliers are anything beneath the 1st quartile - 1.5 * IQR
Grubbs's test
You can check how it works for your expectations here
Apart from these two, there is a comparison calculator here. You can visit it to try other algorithms as needed.
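The quartile rule quoted above could be sketched like this. The quantile helper uses linear interpolation between neighbouring sorted values, which is one common convention among several, so treat that choice (and the function names quantile and splitOutliers) as assumptions.

```javascript
// Quantile of an already sorted array, using linear interpolation (assumed convention)
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Split values into kept values and outliers using the 1.5 * IQR fences
function splitOutliers(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const lowFence = q1 - 1.5 * iqr;
  const highFence = q3 + 1.5 * iqr;
  return {
    kept: values.filter(v => v >= lowFence && v <= highFence),
    outliers: values.filter(v => v < lowFence || v > highFence),
  };
}

const expectedData = [35.267, 9.267, 9.332, 9.186, 9.220, 9.141, 9.107, 9.114, 9.098, 9.181, 9.220, 4.012, 0.132];
const result = splitOutliers(expectedData);
const average = result.kept.reduce((sum, v) => sum + v, 0) / result.kept.length;
console.log(result.outliers);
console.log(average);
```

Note that this rule looks only at the values, not their position, so a large value in the middle of the array would be dropped as well, which may or may not be what the question wants.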

I would first try a sliding window coupled with a hysteresis / band filter in order to detect the high-value peaks.
Then, as your sliding window advances, you can add the value that just left the window (which is now the last of the analyzed values) to the global sum, and add 1 to the total number of values.
When you encounter a peak (= something that causes the hysteresis to move or overflows the band filter), you either remove the values (may be costly) or, better, set the value to NaN so you can safely ignore it.
You should keep computing a sliding average within your sliding window in order to be able to auto-correct the hysteresis/band filter, so that it rejects only the start values of a peak (the end values are the start values of the next one), but once values have stabilized at a new level, values are kept again.
The size of the sliding window sets how many consecutive "stable" values are needed before values are kept, or in other words how many UNstable values are rejected when you reach a new level.
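A heavily simplified sketch of this idea is below. It is not the full hysteresis scheme described above: it only keeps a value when the last windowSize values all sit within a relative band around their own average, so the first values after any level change are rejected. The function name stableAverage and both parameters are made up for illustration.

```javascript
// Average only the values that arrive while the recent window is "stable",
// i.e. every value in the window is within `tolerance` (relative) of the window mean.
function stableAverage(values, windowSize, tolerance) {
  let sum = 0;
  let count = 0;
  for (let end = windowSize - 1; end < values.length; end++) {
    const recent = values.slice(end - windowSize + 1, end + 1);
    const mean = recent.reduce((acc, v) => acc + v, 0) / windowSize;
    const stable = recent.every(v => Math.abs(v - mean) <= tolerance * Math.abs(mean));
    if (stable) {
      sum += values[end];
      count++;
    }
  }
  return count === 0 ? NaN : sum / count;
}

const data = [35.267, 9.267, 9.332, 9.186, 9.220, 9.141, 9.107, 9.114, 9.098, 9.181, 9.220, 4.012, 0.132];
console.log(stableAverage(data, 3, 0.1));
```

A larger windowSize demands a longer stable run before values count, at the cost of discarding more values after each jump.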

For that, you can check the mode of the values (rounded) and then take all the numbers in a certain range around the mode. That range can be taken from the data itself, for example by taking the 10% of the max - min value. That helps you to filter your data. You can select the percent that fits your needs. Something like this:
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
expectedData.sort((a, b) => a - b);
/// Get the range of the data
const RANGE = expectedData[ expectedData.length - 1 ] - expectedData[0];
const WINDOW = 0.1; /// Window of selection 10% from left and right
/// Frequency of each number
let dist = expectedData.reduce((acc, e) => (acc[ Math.floor(e) ] = (acc[ Math.floor(e) ] || 0) + 1, acc), {});
let mode = +Object.entries(dist).sort((a, b) => b[1] - a[1])[0][0];
let newData = expectedData.filter(e => mode - RANGE * WINDOW <= e && e <= mode + RANGE * WINDOW);
console.log(newData);

Precision decimals, 30 of them, in JavaScript (Node.js)

My Challenge
I am presently working my way through reddit's /r/dailyprogrammer challenges using Node.js and have caught a snag. Being that I'm finishing out day 3 with this single exercise, I've decided to look for help. I refuse to just move on without knowing how.
Challenge #6: Your challenge for today is to create a program that can calculate pi accurately to at least 30 decimal places.
My Snag
I've managed to obtain the precision arithmetic I was seeking via mathjs, but am left stumped on how to obtain 30 decimal places. Does anyone know a library, workaround or config that could help me reach my goal?
/*jslint node: true */
"use strict";
var mathjs = require('mathjs'),
    math = mathjs();
var i,
    x,
    pi;
console.log(Math.PI);
function getPi(i, x, pi) {
  if (i === undefined) {
    pi = math.eval('3 + (4/(2*3*4))');
    i = 2;
    x = 4;
    getPi(i, x, pi);
  } else {
    pi = math.eval('pi + (4/('+x+'*'+x+1+'*'+x+2+')) - (4/('+x+2+'*'+x+3+'*'+x+4+'))');
    x += 4;
    i += 1;
    if (x < 20000) {
      getPi(i, x, pi);
    } else {
      console.log(pi);
    }
  }
}
getPi();
I have made my way through many iterations of this, and in this example am using the Nilakantha series:
pi = 3 + 4/(2*3*4) - 4/(4*5*6) + 4/(6*7*8) - ...

This question uses some algorithm to compute digits of pi, apparently to arbitrary precision. Comments on that question indicate possible sources, in particular this paper. You could easily port that approach to JavaScript.

This algorithm has, as an alternating series, an error of about 4/n^3 if the last term is 4/((n-2)*(n-1)*n), that is, after (n-2)/2 fraction terms. To get an error smaller than 0.5*10^(-30), you would need (at least) n=2*10^10, or about 10^10 terms of this series. With that number, you have to take care of floating point errors, especially of cancellation effects when adding a large number and a small number. The best way to avoid that is to start the summation with the smallest term and then go backwards. Or do the summation forward, but with a precision of 60 decimals, and then round the result to 30 decimals.
It would be better to use the faster converging Machin formula, or one of the Machin-like formulas, if you want to have some idea of what exactly you are computing. If not, then use one of the super fast formulas used for billions of digits, but for 30 digits this is likely overkill.
See Wikipedia on the approximations of pi.
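For illustration, here is a sketch of Machin's formula, pi = 16*arctan(1/5) - 4*arctan(1/239), using JavaScript's built-in BigInt as fixed-point arithmetic instead of mathjs. The function names and the choice of 10 guard digits are assumptions made for this example; it needs a runtime with BigInt support.

```javascript
// arctan(1/x) multiplied by `scale`, from the Taylor series
// arctan(1/x) = 1/x - 1/(3x^3) + 1/(5x^5) - ...
// All values are BigInt, so every division truncates toward zero.
function arctanInverse(x, scale) {
  const xSquared = x * x;
  let power = scale / x; // scale / x^(2k+1)
  let sum = power;
  let sign = -1n;
  for (let k = 3n; power !== 0n; k += 2n) {
    power /= xSquared;
    sum += sign * (power / k);
    sign = -sign;
  }
  return sum;
}

// pi truncated to `decimals` decimal places, returned as a string
function piDigits(decimals) {
  const guard = 10; // extra digits to absorb the truncation errors (assumed to be enough)
  const scale = 10n ** BigInt(decimals + guard);
  const piScaled = 16n * arctanInverse(5n, scale) - 4n * arctanInverse(239n, scale);
  const truncated = piScaled / 10n ** BigInt(guard);
  const text = truncated.toString();
  return text[0] + '.' + text.slice(1);
}

console.log(piDigits(30)); // 3.141592653589793238462643383279
```

Because 1/5 and 1/239 are small, each series needs only a few dozen terms for 30 decimals, compared with the roughly 10^10 terms estimated above for the Nilakantha series.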

What are the chances that JavaScript Math.Random() will create the same number twice in a row?

Is this correct? using - http://en.wikipedia.org/wiki/Binomial_probability
Looks like values are from .0000000000000000 to .9999999999999999
Probability of happening twice = p^2 = (1/9999999999999999)^2 = 1.0 e-32
I think I am missing something here?
Also, how does being a pseudo random number generator change this calculation?
Thank You.

In an ideal world Math.random() would be absolutely random, with one output being completely independent from another, which (assuming p=the probability of any given number being produced) results in a probability of p^2 for any given value being repeated immediately after another (as others have already said).
In practice people want Math.random to be fast which means pseudo-random number generators are used by the engines. There are many different kinds of PRNG but the most basic is a linear congruential generator, which is basically a function along the lines of:
s(n + 1) = (some_prime * s(n) + some_value) mod some_other_prime
If such a generator is used and its parameters give it the maximum period, then you won't see the state repeated until you've called random() about some_other_prime times. You're guaranteed of that.
Relatively recently, however, it's become apparent that this kind of behaviour (coupled with seeding the PRNGs with the current time) could be used for some forms of tracking, which has led to browsers doing a number of things that mean you can't assume anything about subsequent random() calls.
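To make the formula above concrete, here is a toy linear congruential generator. The constants (multiplier 5, increment 3, modulus 16) are made up for the example and are not what any browser uses; the modulus is deliberately tiny so the whole cycle can be walked.

```javascript
// Toy LCG: state(n + 1) = (5 * state(n) + 3) mod 16 (illustrative constants only)
function makeLcg(seed) {
  const multiplier = 5;
  const increment = 3;
  const modulus = 16;
  let state = seed % modulus;
  return function next() {
    state = (multiplier * state + increment) % modulus;
    return state;
  };
}

// Count how many outputs are produced before any value shows up a second time
function stepsUntilRepeat(seed) {
  const next = makeLcg(seed);
  const seen = new Set();
  let steps = 0;
  while (true) {
    const value = next();
    if (seen.has(value)) {
      return steps;
    }
    seen.add(value);
    steps++;
  }
}

console.log(stepsUntilRepeat(7)); // 16
```

With these constants the generator visits all 16 states before cycling, so an immediate repeat can never happen, which is very different from what independent random draws would do.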

I think the probability of getting two numbers in a row is 1 divided by the range of the generator, assuming that it has a good distribution.
The reason for this is that the first number can be anything, and the second number needs to just be that number again, which means we don't care about the first number at all. The probability of getting the same number twice in a row is the same as the probability of getting any particular number once.
Getting some particular number twice in a row, e.g. two 0.5s in a row, would be p^2; however, if you just care about any number twice in a row, it's just p.
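The difference between the two cases can be checked exactly by enumerating every pair of outcomes for a small generator. This is just a counting sketch for an ideal generator with n equally likely values; the function name repeatProbabilities is made up.

```javascript
// Enumerate all n * n equally likely (first, second) pairs of an ideal generator
function repeatProbabilities(n) {
  const target = 0; // the "particular number" we look for twice in a row
  let anyRepeat = 0;
  let specificRepeat = 0;
  for (let first = 0; first < n; first++) {
    for (let second = 0; second < n; second++) {
      if (first === second) anyRepeat++;
      if (first === target && second === target) specificRepeat++;
    }
  }
  const total = n * n;
  return {
    anyRepeat: anyRepeat / total,          // equals 1 / n
    specificRepeat: specificRepeat / total // equals 1 / n^2
  };
}

console.log(repeatProbabilities(6));
```

For a six-sided die this gives 1/6 for "any number twice in a row" and 1/36 for "two sixes in a row", matching p and p^2 above.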

If the numbers were truly random, you'd expect each of them, indeed, to appear with probability 1/p (taking p here as the number of possible values), so a given value twice in a row would be 1/p^2.
The value for p is not exactly the one you have though, because the numbers are being represented internally as binary. Figure out how many bits of mantissa the numbers have in JavaScript and use that for your combinatoric count.
The "pseudorandom" part is more interesting, because the properties of pseudorandom number generators vary. Knuth does some lovely work with that in Seminumerical Algorithms, but basically most usual PN generators have at least some spectral distribution. Cryptographic PN generators are generally stronger.
Update: The amount of time shouldn't be significant. Whether it's a millisecond or a year, as long as you don't update the state, the probabilities will stay the same.

The probability that you would get 2 given numbers is (1/p)^2, but the probability that you get 2 of same numbers (any) is 1/p. That is because the first number can be anything, and the second just needs to match that.

You can kind of find out, just let it run a few days :)
var last = 0.1;
var count = 0 | 0;
function rand() {
  ++count;
  var num = Math.random();
  if (num === last) {
    console.log('count: ' + count + ' num: ' + num);
  }
  last = num;
}
while (true) rand();

Secure random numbers in javascript?

How do I generate cryptographically secure random numbers in javascript?

There's been discussion at WHATWG on adding this to the window.crypto object. You can read the discussion and check out the proposed API and webkit bug (22049).
Just tested the following code in Chrome to get a random byte:
(function () {
  var buf = new Uint8Array(1);
  window.crypto.getRandomValues(buf);
  alert(buf[0]);
})();

In order, I think your best bets are:
1. window.crypto.getRandomValues or window.msCrypto.getRandomValues
2. The sjcl library's randomWords function (http://crypto.stanford.edu/sjcl/)
3. The isaac library's random number generator (which is seeded by Math.random, so not really cryptographically secure) (https://github.com/rubycon/isaac.js)
window.crypto.getRandomValues has been implemented in Chrome for a while now, and relatively recently in Firefox as well. Unfortunately, Internet Explorer 10 and before do not implement the function. IE 11 has window.msCrypto, which accomplishes the same thing. sjcl has a great random number generator seeded from mouse movements, but there's always a chance that either the mouse won't have moved sufficiently to seed the generator, or that the user is on a mobile device where there is no mouse movement whatsoever. Thus, I recommend having a fallback case where you can still get a non-secure random number if there is no choice. Here's how I've handled this:
function GetRandomWords(wordCount) {
  var randomWords;

  // First we're going to try to use a built-in CSPRNG
  if (window.crypto && window.crypto.getRandomValues) {
    randomWords = new Int32Array(wordCount);
    window.crypto.getRandomValues(randomWords);
  }
  // Because of course IE calls it msCrypto instead of being standard
  else if (window.msCrypto && window.msCrypto.getRandomValues) {
    randomWords = new Int32Array(wordCount);
    window.msCrypto.getRandomValues(randomWords);
  }
  // So, no built-in functionality - bummer. If the user has wiggled the mouse enough,
  // sjcl might help us out here
  else if (sjcl.random.isReady()) {
    randomWords = sjcl.random.randomWords(wordCount);
  }
  // Last resort - we'll use isaac.js to get a random number. It's seeded from Math.random(),
  // so this isn't ideal, but it'll still greatly increase the space of guesses a hacker would
  // have to make to crack the password.
  else {
    randomWords = [];
    for (var i = 0; i < wordCount; i++) {
      randomWords.push(isaac.rand());
    }
  }

  return randomWords;
}
You'll need to include sjcl.js and isaac.js for that implementation, and be sure to start the sjcl entropy collector as soon as your page is loaded:
sjcl.random.startCollectors();
sjcl is dual-licensed BSD and GPL, while isaac.js is MIT, so it's perfectly safe to use either of those in any project. As mentioned in another answer, clipperz is another option, however for whatever bizarre reason, it is licensed under the AGPL. I have yet to see anyone who seems to understand what implications that has for a JavaScript library, but I'd universally avoid it.
One way to improve the code I've posted might be to store the state of the isaac random number generator in localStorage, so it isn't reseeded every time the page is loaded. Isaac will generate a random sequence, but for cryptography purposes, the seed is all-important. Seeding with Math.random is bad, but at least a little less bad if it isn't necessarily on every page load.

You can for instance use mouse movement as seed for random numbers, read out time and mouse position whenever the onmousemove event happens, feed that data to a whitening function and you will have some first class random at hand. Though do make sure that user has moved the mouse sufficiently before you use the data.
Edit: I have myself played a bit with the concept by making a password generator, I wouldn't guarantee that my whitening function is flawless, but being constantly reseeded I'm pretty sure that it's plenty for the job: ebusiness.hopto.org/generator.htm
Edit2: It now sort of works with smartphones, but only by disabling touch functionality while the entropy is gathered. Android won't work properly any other way.

Use window.crypto.getRandomValues, like this:
var random_num = new Uint8Array(2048 / 8); // 2048 = number length in bits
window.crypto.getRandomValues(random_num);
This is supported in all modern browsers and uses the operating system's random generator (e.g. /dev/urandom). If you need IE11 compatibility, you have to use their prefixed implementation via var crypto = window.crypto || window.msCrypto; crypto.getRandomValues(..) though.
Note that the window.crypto API can also generate keys outright, which may be the better option.

Crypto-strong
To get a cryptographically strong number in the range [0, 1) (similar to Math.random()), use crypto:
let random = ()=> crypto.getRandomValues(new Uint32Array(1))[0]/2**32;
console.log( random() );

You might want to try
http://sourceforge.net/projects/clipperzlib/
It has an implementation of Fortuna which is a cryptographically secure random number generator. (Take a look at src/js/Clipperz/Crypto/PRNG.js). It appears to use the mouse as a source of randomness as well.

First of all, you need a source of entropy, for example mouse movement, a password, or anything else. But all of these sources are very far from random and guarantee you about 20 bits of entropy, rarely more. The next step is to use a mechanism like a password-based KDF, which makes it computationally difficult to distinguish the data from random.

Many years ago, you had to implement your own random number generator and seed it with entropy collected by mouse movement and timing information. This was the Phlogiston Era of JavaScript cryptography. These days we have window.crypto to work with.
If you need a random integer, random-number-csprng is a great choice. It securely generates a series of random bytes and then converts it into an unbiased random integer.
const randomInt = require("random-number-csprng");
(async function () {
  // randomInt returns a promise, so it has to be awaited
  let random = await randomInt(10, 30);
  console.log(`Your random number: ${random}`);
})();
If you need a random floating point number, you'll need to do a little more work. Generally, though, secure randomness is an integer problem, not a floating point problem.
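If you would rather not pull in a library, the "random bytes to unbiased integer" step can be sketched with rejection sampling on top of getRandomValues. This is not the implementation of random-number-csprng, just an illustration of the general technique; the function name secureRandomInt is made up, and the Node fallback assumes the built-in webcrypto export of the crypto module.

```javascript
// Use the browser's crypto object, or Node's Web Crypto implementation as a fallback
const cryptoObj = globalThis.crypto ?? require('node:crypto').webcrypto;

// Unbiased random integer in [min, max] (both inclusive), for ranges up to 2^32 values
function secureRandomInt(min, max) {
  const range = max - min + 1;
  if (!Number.isInteger(min) || !Number.isInteger(max) || range < 1 || range > 2 ** 32) {
    throw new RangeError('min and max must be integers spanning at most 2^32 values');
  }
  // Largest multiple of `range` that fits in 32 bits; values at or above it
  // would make some results slightly more likely, so they are thrown away.
  const limit = Math.floor(2 ** 32 / range) * range;
  const buffer = new Uint32Array(1);
  let value;
  do {
    cryptoObj.getRandomValues(buffer);
    value = buffer[0];
  } while (value >= limit);
  return min + (value % range);
}

console.log(secureRandomInt(10, 30));
```

Taking a plain value % range without the rejection loop would be simpler, but it skews the result whenever 2^32 is not a multiple of the range.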

I know I'm late to the party, but if you don't want to deal with the math of getting a cryptographically secure random value, I recommend using rando.js. It's a super small 2kb library that'll give you a decimal, pick something from an array, or whatever else you want, all cryptographically secure.
It's on npm too.
Here's a sample I copied from the GitHub, but it does more than this if you want to go there and read about it more.
console.log(rando()); //a floating-point number between 0 and 1 (could be exactly 0, but never exactly 1)
console.log(rando(5)); //an integer between 0 and 5 (could be 0 or 5)
console.log(rando(5, 10)); //a random integer between 5 and 10 (could be 5 or 10)
console.log(rando(5, "float")); //a floating-point number between 0 and 5 (could be exactly 0, but never exactly 5)
console.log(rando(5, 10, "float")); //a floating-point number between 5 and 10 (could be exactly 5, but never exactly 10)
console.log(rando(true, false)); //either true or false
console.log(rando(["a", "b"])); //{index:..., value:...} object representing a value of the provided array OR false if array is empty
console.log(rando({a: 1, b: 2})); //{key:..., value:...} object representing a property of the provided object OR false if object has no properties
console.log(rando("Gee willikers!")); //a character from the provided string OR false if the string is empty. Reoccurring characters will naturally form a more likely return value
console.log(rando(null)); //ANY invalid arguments return false
<script src="https://randojs.com/2.0.0.js"></script>

If you need large amounts, here's what I would do:
// Number of 32-bit values fetched per call to getRandomValues
const randLen = 16384
var randomId = randLen
var randomArray = new Uint32Array(randLen)

// Returns a float in [0, 1) built from 32 random bits
function random32() {
  if (randomId === randLen) {
    // Buffer exhausted: refill it, then hand out the first value
    randomId = 0
    return crypto.getRandomValues(randomArray)[randomId++] * 2.3283064365386963e-10 // 2^-32
  }
  return randomArray[randomId++] * 2.3283064365386963e-10
}

// Combines two 32-bit values (scaled by 2^-32 and 2^-64). A double only keeps
// 53 bits, so in rare cases the sum can round up to exactly 1.
function random64() {
  if (randomId === randLen || randomId === randLen - 1) {
    randomId = 0
    crypto.getRandomValues(randomArray)
  }
  return randomArray[randomId++] * 2.3283064365386963e-10 + randomArray[randomId++] * 5.421010862427522e-20
}

console.log(random32())
console.log(random64())
