Confused about xxHash - javascript

I'm using xxHash to create hashes from elements id. I just don't want to show real id on website. I created script to test is there option to get same hashes:
const _ = require('lodash');
const XXH = require('xxhashjs');
let hashes = []
let uniq_hashes = []
for(let i = 0; i < 1000000; i++){
var h = XXH.h32(i.toString(), 0xABCD).toString(16)
hashes.push(h)
}
uniq_hashes = _.uniq(hashes)
console.log(hashes.length, uniq_hashes.length);
Log from the script is 1000000 999989, so some hashes was the same. Is it correct way how xxHash works?
Also, first pair is '1987' and '395360'
If i need really unique hashes (no crypto) what should I use?

By the birthday paradox you should see a collisions at around 1:16^2 or 10^6 / 2^16 = ~15 so 11 collisions seems about right. (Note: the math is grossly simplified, see Birthday problem for good math.)
Too reduce the number of collisions increase the hash size and use a cryptographic hash such as SHA-256. Cryptographic hash functions are designed to avoid collisions.

You should use a hash with a larger hash digest. Even a 32-bit chunk of a secure cryptographic hash will have unavoidable collisions eventually.
Since you are using Node.js and want something faster than cryptographic hashes, try MetroHash128 or murmur128 or CityHash128. There's also CityHash256 is you want to go completely overboard. These should be very fast due to using C++ bindings, and the chance of random collision is reduced astronomically.

Related

es6 Map and Set complexity, v8 implementation

Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?
(I know that the standard doesn't guarantee that)
Is it a fair assumption that in v8 implementation retrieval / lookup is O(1)?
Yes. V8 uses a variant of hash tables that generally have O(1) complexity for these operations.
For details, you might want to have a look at https://codereview.chromium.org/220293002/ where OrderedHashTable is implemented based on https://wiki.mozilla.org/User:Jorend/Deterministic_hash_tables.
For people don't want to dig into the rabbit hole too deep:
1: We can assume good hash table implementations have practically O(1) time complexity.
2: Here is a blog posted by V8 team explains how some memory optimization was done on its hashtable implementation for Map, Set, WeakSet, and WeakMap: Optimizing hash tables: hiding the hash code
Based on 1 and 2: V8's Set and Map's get & set & add & has time complexity practically is O(1).
The original question is already answered, but O(1) doesn't tell a lot about how efficient the implementation is.
First of all, we need to understand what variation of hash table is used for Maps. "Classical" hash tables won't do as they do not provide any iteration order guarantees, while ES6 spec requires insertion to be in the iteration order. So, Maps in V8 are built on top of so-called deterministic hash tables. The idea is similar to the classical algorithm, but there is another layer of indirection for buckets and all entries are inserted and stored in a contiguous array of a fixed size. Deterministic hash tables algorithm indeed guarantees O(1) time complexity for basic operations, like set or get.
Next, we need to know what's the initial size of the hash table, the load factor, and how (and when) it grows/shrinks. The short answer is: initial size is 4, load factor is 2, the table (i.e. the Map) grows x2 as soon as its capacity is reached, and shrinks as soon as there is more than 1/2 of deleted entries.
Let’s consider the worst-case when the table has N out of N entries (it's full), all entries belong to a single bucket, and the required entry is located at the tail. In such a scenario, a lookup requires N moves through the chain elements.
On the other hand, in the best possible scenario when the table is full, but each bucket has 2 entries, a lookup will require up to 2 moves.
It is a well-known fact that while individual operations in hash tables are “cheap”, rehashing is not. Rehashing has O(N) time complexity and requires allocation of the new hash table on the heap. Moreover, rehashing is performed as a part of insertion or deletion operations, when necessary. So, for instance, a map.set() call could be more expensive than you would expect. Luckily, rehashing is a relatively infrequent operation.
Apart from that, such details as memory layout or hash function also matter, but I'm not going to go into these details here. If you're curious how V8 Maps work under the hood, you may find more details here. I was interested in the topic some time ago and tried to share my findings in a readable manner.
let map = new Map();
let obj = {};
const benchMarkMapSet = size => {
console.time("benchMarkMapSet");
for (let i = 0; i < size; i++) {
map.set(i, i);
}
console.timeEnd("benchMarkMapSet");
};
const benchMarkMapGet = size => {
console.time("benchMarkMapGet");
for (let i = 0; i < size; i++) {
let x = map.get(i);
}
console.timeEnd("benchMarkMapGet");
};
const benchMarkObjSet = size => {
console.time("benchMarkObjSet");
for (let i = 0; i < size; i++) {
obj[i] = i;
}
console.timeEnd("benchMarkObjSet");
};
const benchMarkObjGet = size => {
console.time("benchMarkObjGet");
for (let i = 0; i < size; i++) {
let x = obj[i];
}
console.timeEnd("benchMarkObjGet");
};
let size = 2e6;
benchMarkMapSet(size);
benchMarkObjSet(size);
benchMarkMapGet(size);
benchMarkObjGet(size);
benchMarkMapSet: 382.935ms
benchMarkObjSet: 76.077ms
benchMarkMapGet: 125.478ms
benchMarkObjGet: 2.764ms
benchMarkMapSet: 373.172ms
benchMarkObjSet: 77.192ms
benchMarkMapGet: 123.035ms
benchMarkObjGet: 2.638ms
Why don't we just test.
var size_1 = 1000,
size_2 = 1000000,
map_sm = new Map(Array.from({length: size_1}, (_,i) => [++i,i])),
map_lg = new Map(Array.from({length: size_2}, (_,i) => [++i,i])),
i = size_1,
j = size_2,
s;
s = performance.now();
while (i) map_sm.get(i--);
console.log(`Size ${size_1} map returns an item in average ${(performance.now()-s)/size_1}ms`);
s = performance.now();
while (j) map_lg.get(j--);
console.log(`Size ${size_2} map returns an item in average ${(performance.now()-s)/size_2}ms`);
So to me it seems like it converges to O(1) as the size grows. Well lets call it O(1) then.

Insecure Randomness in JavaScript? [duplicate]

How do I generate cryptographically secure random numbers in javascript?
There's been discussion at WHATWG on adding this to the window.crypto object. You can read the discussion and check out the proposed API and webkit bug (22049).
Just tested the following code in Chrome to get a random byte:
(function(){
var buf = new Uint8Array(1);
window.crypto.getRandomValues(buf);
alert(buf[0]);
})();
In order, I think your best bets are:
window.crypto.getRandomValues or window.msCrypto.getRandomValues
The sjcl library's randomWords function (http://crypto.stanford.edu/sjcl/)
The isaac library's random number generator (which is seeded by Math.random, so not really cryptographically secure) (https://github.com/rubycon/isaac.js)
window.crypto.getRandomValues has been implemented in Chrome for a while now, and relatively recently in Firefox as well. Unfortunately, Internet Explorer 10 and before do not implement the function. IE 11 has window.msCrypto, which accomplishes the same thing. sjcl has a great random number generator seeded from mouse movements, but there's always a chance that either the mouse won't have moved sufficiently to seed the generator, or that the user is on a mobile device where there is no mouse movement whatsoever. Thus, I recommend having a fallback case where you can still get a non-secure random number if there is no choice. Here's how I've handled this:
function GetRandomWords (wordCount) {
var randomWords;
// First we're going to try to use a built-in CSPRNG
if (window.crypto && window.crypto.getRandomValues) {
randomWords = new Int32Array(wordCount);
window.crypto.getRandomValues(randomWords);
}
// Because of course IE calls it msCrypto instead of being standard
else if (window.msCrypto && window.msCrypto.getRandomValues) {
randomWords = new Int32Array(wordCount);
window.msCrypto.getRandomValues(randomWords);
}
// So, no built-in functionality - bummer. If the user has wiggled the mouse enough,
// sjcl might help us out here
else if (sjcl.random.isReady()) {
randomWords = sjcl.random.randomWords(wordCount);
}
// Last resort - we'll use isaac.js to get a random number. It's seeded from Math.random(),
// so this isn't ideal, but it'll still greatly increase the space of guesses a hacker would
// have to make to crack the password.
else {
randomWords = [];
for (var i = 0; i < wordCount; i++) {
randomWords.push(isaac.rand());
}
}
return randomWords;
};
You'll need to include sjcl.js and isaac.js for that implementation, and be sure to start the sjcl entropy collector as soon as your page is loaded:
sjcl.random.startCollectors();
sjcl is dual-licensed BSD and GPL, while isaac.js is MIT, so it's perfectly safe to use either of those in any project. As mentioned in another answer, clipperz is another option, however for whatever bizarre reason, it is licensed under the AGPL. I have yet to see anyone who seems to understand what implications that has for a JavaScript library, but I'd universally avoid it.
One way to improve the code I've posted might be to store the state of the isaac random number generator in localStorage, so it isn't reseeded every time the page is loaded. Isaac will generate a random sequence, but for cryptography purposes, the seed is all-important. Seeding with Math.random is bad, but at least a little less bad if it isn't necessarily on every page load.
You can for instance use mouse movement as seed for random numbers, read out time and mouse position whenever the onmousemove event happens, feed that data to a whitening function and you will have some first class random at hand. Though do make sure that user has moved the mouse sufficiently before you use the data.
Edit: I have myself played a bit with the concept by making a password generator, I wouldn't guarantee that my whitening function is flawless, but being constantly reseeded I'm pretty sure that it's plenty for the job: ebusiness.hopto.org/generator.htm
Edit2: It now sort of works with smartphones, but only by disabling touch functionality while the entropy is gathered. Android won't work properly any other way.
Use window.crypto.getRandomValues, like this:
var random_num = new Uint8Array(2048 / 8); // 2048 = number length in bits
window.crypto.getRandomValues(random_num);
This is supported in all modern browsers and uses the operating system's random generator (e.g. /dev/urandom). If you need IE11 compatibility, you have to use their prefixed implementation viavar crypto = window.crypto || window.msCrypto; crypto.getRandomValues(..) though.
Note that the window.crypto API can also generate keys outright, which may be the better option.
Crypto-strong
to get cryptographic strong number from range [0, 1) (similar to Math.random()) use crypto:
let random = ()=> crypto.getRandomValues(new Uint32Array(1))[0]/2**32;
console.log( random() );
You might want to try
http://sourceforge.net/projects/clipperzlib/
It has an implementation of Fortuna which is a cryptographically secure random number generator. (Take a look at src/js/Clipperz/Crypto/PRNG.js). It appears to use the mouse as a source of randomness as well.
First of all, you need a source of entropy. For example, movement of the mouse, password, or any other. But all of these sources are very far from random, and guarantee you 20 bits of entropy, rarely more. The next step that you need to take is to use the mechanism like "Password-Based KDF" it will make computationally difficult to distinguish data from random.
Many years ago, you had to implement your own random number generator and seed it with entropy collected by mouse movement and timing information. This was the Phlogiston Era of JavaScript cryptography. These days we have window.crypto to work with.
If you need a random integer, random-number-csprng is a great choice. It securely generates a series of random bytes and then converts it into an unbiased random integer.
const randomInt = require("random-number-csprng");
(async function() {
let random = randomInt(10, 30);
console.log(`Your random number: ${random}`);
})();
If you need a random floating point number, you'll need to do a little more work. Generally, though, secure randomness is an integer problem, not a floating point problem.
I know i'm late to the party, but if you don't want to deal with the math of getting a cryptographically secure random value, i recommend using rando.js. it's a super small 2kb library that'll give you a decimal, pick something from an array, or whatever else you want- all cryptographically secure.
It's on npm too.
Here's a sample I copied from the GitHub, but it does more than this if you want to go there and read about it more.
console.log(rando()); //a floating-point number between 0 and 1 (could be exactly 0, but never exactly 1)
console.log(rando(5)); //an integer between 0 and 5 (could be 0 or 5)
console.log(rando(5, 10)); //a random integer between 5 and 10 (could be 5 or 10)
console.log(rando(5, "float")); //a floating-point number between 0 and 5 (could be exactly 0, but never exactly 5)
console.log(rando(5, 10, "float")); //a floating-point number between 5 and 10 (could be exactly 5, but never exactly 10)
console.log(rando(true, false)); //either true or false
console.log(rando(["a", "b"])); //{index:..., value:...} object representing a value of the provided array OR false if array is empty
console.log(rando({a: 1, b: 2})); //{key:..., value:...} object representing a property of the provided object OR false if object has no properties
console.log(rando("Gee willikers!")); //a character from the provided string OR false if the string is empty. Reoccurring characters will naturally form a more likely return value
console.log(rando(null)); //ANY invalid arguments return false
<script src="https://randojs.com/2.0.0.js"></script>
If you need large amounts, here's what I would do:
// Max value of random number length
const randLen = 16384
var randomId = randLen
var randomArray = new Uint32Array(randLen)
function random32() {
if (randomId === randLen) {
randomId = 0
return crypto.getRandomValues(randomArray)[randomId++] * 2.3283064365386963e-10
}
return randomArray[randomId++] * 2.3283064365386963e-10
}
function random64() {
if (randomId === randLen || randomId === randLen - 1) {
randomId = 0
crypto.getRandomValues(randomArray)
}
return randomArray[randomId++] * 2.3283064365386963e-10 + randomArray[randomId++] * 5.421010862427522e-20
}
console.log(random32())
console.log(random64())

Generate a random big prime number with forge (or another JavaScript approach)

I need to generate a random big (around 4096 bit) prime number in JavaScript and I'm already using forge. Forge has to have some kind of generator for such tasks as it implements RSA which also relies on random prime numbers. However I haven't found something in the documentation of forge when you just want to get a random prime number (something like var myRandomPrime = forge.random.getPrime(4096); would have been great).
So what would be the best approach to get such a prime (with or without forge) in JavaScript?
Update 06/11/2014: Now, with forge version 0.6.6 you can use this:
var bits = 1024;
forge.prime.generateProbablePrime(bits, function(err, num) {
console.log('random prime', num.toString(16));
});
Finding large primes in JavaScript is difficult -- it's slow and you don't want to block the main thread. It requires some fairly customized code to do right and the code in forge is specialized for RSA key generation. There's no API call to simply produce a large random prime.
There are some extra operations that the RSA code in forge runs that you don't need if you're just looking for a single prime number. That being said, the slowest part of the process is in actually finding the primes, not in those extra operations. However, the RSA code also generates two primes (when you only need one) and they aren't the same bitsize you're looking for. So if you're using the forge API you'd have to pass a bitsize of 8196 (I believe ... that's off the top of my head, so it may be inaccurate) to get a 4096-bit prime.
One way to find a large random prime is as follows:
Generate a random number that has the desired number of bits (ensure the MSB is set).
Align the number on a 30k+1 boundary as all primes have this property.
Run a primality test (the slow part) on your number; if it passes, you're done, if not, add to the number to get to the next 30k+1 boundary and repeat. A "quick" primality test is to check against low primes and then use Miller-Rabin (see the Handbook of Applied Cryptography 4.24).
Step #3 can run for a long time -- and that's usually pretty undesirable with JavaScript (w/node or in the browser). To mitigate this, you can attempt to limit the amount time spent doing primality tests to some acceptable period of time (N milliseconds) or you can use Web Workers to background the process. Of course, both of these approaches complicate the code.
Here's some code for generating a 4096-bit random prime that shouldn't block the main thread:
var forge = require('node-forge');
var BigInteger = forge.jsbn.BigInteger;
// primes are 30k+i for i = 1, 7, 11, 13, 17, 19, 23, 29
var GCD_30_DELTA = [6, 4, 2, 4, 2, 4, 6, 2];
var THIRTY = new BigInteger(null);
THIRTY.fromInt(30);
// generate random BigInteger
var num = generateRandom(4096);
// find prime nearest to random number
findPrime(num, function(num) {
console.log('random', num.toString(16));
});
function generateRandom(bits) {
var rng = {
// x is an array to fill with bytes
nextBytes: function(x) {
var b = forge.random.getBytes(x.length);
for(var i = 0; i < x.length; ++i) {
x[i] = b.charCodeAt(i);
}
}
};
var num = new BigInteger(bits, rng);
// force MSB set
var bits1 = bits - 1;
if(!num.testBit(bits1)) {
var op_or = function(x,y) {return x|y;};
num.bitwiseTo(BigInteger.ONE.shiftLeft(bits1), op_or, num);
}
// align number on 30k+1 boundary
num.dAddOffset(31 - num.mod(THIRTY).byteValue(), 0);
return num;
}
function findPrime(num, callback) {
/* Note: All primes are of the form 30k+i for i < 30 and gcd(30, i)=1. The
number we are given is always aligned at 30k + 1. Each time the number is
determined not to be prime we add to get to the next 'i', eg: if the number
was at 30k + 1 we add 6. */
var deltaIdx = 0;
// find prime nearest to 'num' for 100ms
var start = Date.now();
while(Date.now() - start < 100) {
// do primality test (only 2 iterations assumes at
// least 1251 bits for num)
if(num.isProbablePrime(2)) {
return callback(num);
}
// get next potential prime
num.dAddOffset(GCD_30_DELTA[deltaIdx++ % 8], 0);
}
// keep trying (setImmediate would be better here)
setTimeout(function() {
findPrime(num, callback);
});
}
Various tweaks can be made to adjust it for your needs, like setting the amount of time (which is just an estimate) to run the primality tester before bailing to try again on the next scheduled tick. You'd probably want some kind of UI feedback each time it bails. If you're using node or a browser that supports setImmediate you can use that instead of setTimeout as well to avoid clamping to speed things up. But, note that it's going to take a while to generate a 4096-bit random prime in JavaScript (at least at the time of this writing).
Forge also has a Web Worker implementation for generating RSA keys that is intended to speed up the process by letting multiple threads run the primality test using different inputs. You can look at the forge source (prime.worker.js for instance) to see that in action, but it's a project in itself to get working properly. IMO, though, it's the best way to speed things up.
Anyway, hopefully the above code will help you. I'd run it with a smaller bitsize to test it.
It does more work then you specifically require but you can always use forge to generate a key pair and extract one of the primes from that.
//generate a key pair of required size
var keyPair = forge.pki.rsa.generateKeyPair(4096);
//at this point we have 2 primes p and q in the privateKey
var p = keyPair.privateKey.p;
var q = keyPair.privateKey.q;
The type of p and q are BigInteger they have a p.toByteArray() method to access their representations as a byte array.
If you decide to implement your own method, you may want to read Close to Uniform Prime Number Generation With Fewer Random Bits which has discussion and algorithms for faster generation of well-distributed large n-bit primes. The FIPS 186-4 publication also has a lot of information including algorithms for Shawe-Taylor proven prime construction.
dlongley's answer uses the "PRIMEINC" method, which is efficient but not a good distribution (this may or may not matter to you, and either way he's given a nice framework to use). Note that FIPS recommends a lot of M-R tests (this can be mitigated if your library includes a Lucas or BPSW test).
Re: proven primes, my experience using GMP is that up to at least 8192 bits, both Shawe-Taylor and Maurer's FastPrime are slower than using Fouque and Tibouchi algorithm A1 combined with BPSW + additional M-R tests. Your mileage may vary, and of course the proven prime methods get a proven prime as a result.

What is the performance of Objects/Arrays in JavaScript? (specifically for Google V8)

Performance associated with Arrays and Objects in JavaScript (especially Google V8) would be very interesting to document. I find no comprehensive article on this topic anywhere on the Internet.
I understand that some Objects use classes as their underlying data structure. If there are a lot of properties, it is sometimes treated as a hash table?
I also understand that Arrays are sometimes treated like C++ Arrays (i.e. fast random indexing, slow deletion and resizing). And, other times, they are treated more like Objects (fast indexing, fast insertion/removal, more memory). And, maybe sometimes they are stored as linked lists (i.e. slow random indexing, fast removal/insertion at the beginning/end)
What is the precise performance of Array/Object retrievals and manipulations in JavaScript? (specifically for Google V8)
More specifically, what it the performance impact of:
Adding a property to an Object
Removing a property from an Object
Indexing a property in an Object
Adding an item to an Array
Removing an item from an Array
Indexing an item in an Array
Calling Array.pop()
Calling Array.push()
Calling Array.shift()
Calling Array.unshift()
Calling Array.slice()
Any articles or links for more details would be appreciated, as well. :)
EDIT: I am really wondering how JavaScript arrays and objects work under the hood. Also, in what context does the V8 engine "know" to "switch-over" to another data structure?
For example, suppose I create an array with...
var arr = [];
arr[10000000] = 20;
arr.push(21);
What's really going on here?
Or... what about this...???
var arr = [];
//Add lots of items
for(var i = 0; i < 1000000; i++)
arr[i] = Math.random();
//Now I use it like a queue...
for(var i = 0; i < arr.length; i++)
{
var item = arr[i].shift();
//Do something with item...
}
For conventional arrays, the performance would be terrible; whereas, if a LinkedList was used... not so bad.
I created a test suite, precisely to explore these issues (and more) (archived copy).
And in that sense, you can see the performance issues in this 50+ test case tester (it will take a long time).
Also as its name suggest, it explores the usage of using the native linked list nature of the DOM structure.
(Currently down, rebuilt in progress) More details on my blog regarding this.
The summary is as followed
V8 Array is Fast, VERY FAST
Array push / pop / shift is ~approx 20x+ faster than any object equivalent.
Surprisingly Array.shift() is fast ~approx 6x slower than an array pop, but is ~approx 100x faster than an object attribute deletion.
Amusingly, Array.push( data ); is faster than Array[nextIndex] = data by almost 20 (dynamic array) to 10 (fixed array) times over.
Array.unshift(data) is slower as expected, and is ~approx 5x slower than a new property adding.
Nulling the value array[index] = null is faster than deleting it delete array[index] (undefined) in an array by ~approx 4x++ faster.
Surprisingly Nulling a value in an object is obj[attr] = null ~approx 2x slower than just deleting the attribute delete obj[attr]
Unsurprisingly, mid array Array.splice(index,0,data) is slow, very slow.
Surprisingly, Array.splice(index,1,data) has been optimized (no length change) and is 100x faster than just splice Array.splice(index,0,data)
unsurprisingly, the divLinkedList is inferior to an array on all sectors, except dll.splice(index,1) removal (Where it broke the test system).
BIGGEST SURPRISE of it all [as jjrv pointed out], V8 array writes are slightly faster than V8 reads =O
Note: These metrics applies only to large array/objects which v8 does not "entirely optimise out". There can be very isolated optimised performance cases for array/object size less then an arbitrary size (24?). More details can be seen extensively across several google IO videos.
Note 2: These wonderful performance results are not shared across browsers, especially
*cough* IE. Also the test is huge, hence I yet to fully analyze and evaluate the results : please edit it in =)
Updated Note (dec 2012): Google representatives have videos on youtubes describing the inner workings of chrome itself (like when it switches from a linkedlist array to a fixed array, etc), and how to optimize them. See GDC 2012: From Console to Chrome for more.
At a basic level that stays within the realms of JavaScript, properties on objects are much more complex entities. You can create properties with setters/getters, with differing enumerability, writability, and configurability. An item in an array isn't able to be customized in this way: it either exists or it doesn't. At the underlying engine level this allows for a lot more optimization in terms of organizing the memory that represents the structure.
In terms of identifying an array from an object (dictionary), JS engines have always made explicit lines between the two. That's why there's a multitude of articles on methods of trying to make a semi-fake Array-like object that behaves like one but allows other functionality. The reason this separation even exists is because the JS engines themselves store the two differently.
Properties can be stored on an array object but this simply demonstrates how JavaScript insists on making everything an object. The indexed values in an array are stored differently from any properties you decide to set on the array object that represents the underlying array data.
Whenever you're using a legit array object and using one of the standard methods of manipulating that array you're going to be hitting the underlying array data. In V8 specifically, these are essentially the same as a C++ array so those rules will apply. If for some reason you're working with an array that the engine isn't able to determine with confidence is an array, then you're on much shakier ground. With recent versions of V8 there's more room to work though. For example, it's possible to create a class that has Array.prototype as its prototype and still gain efficient access to the various native array manipulation methods. But this is a recent change.
Specific links to recent changes to array manipulation may come in handy here:
http://code.google.com/p/v8/source/detail?r=10024
http://code.google.com/p/v8/source/detail?r=9849
http://code.google.com/p/v8/source/detail?r=9747
As a bit of extra, here's Array Pop and Array Push directly from V8's source, both implemented in JS itself:
function ArrayPop() {
if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
throw MakeTypeError("called_on_null_or_undefined",
["Array.prototype.pop"]);
}
var n = TO_UINT32(this.length);
if (n == 0) {
this.length = n;
return;
}
n--;
var value = this[n];
this.length = n;
delete this[n];
return value;
}
function ArrayPush() {
if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
throw MakeTypeError("called_on_null_or_undefined",
["Array.prototype.push"]);
}
var n = TO_UINT32(this.length);
var m = %_ArgumentsLength();
for (var i = 0; i < m; i++) {
this[i+n] = %_Arguments(i);
}
this.length = n + m;
return this.length;
}
I'd like to complement existing answers with an investigation to the question of how implementations behave regarding growing arrays: If they implement them the "usual" way, one would see many quick pushes with rare, interspersed slow pushes at which point the implementation copies the internal representation of the array from one buffer to a larger one.
You can see this effect very nicely, this is from Chrome:
16: 4ms
40: 8ms 2.5
76: 20ms 1.9
130: 31ms 1.7105263157894737
211: 14ms 1.623076923076923
332: 55ms 1.5734597156398105
514: 44ms 1.5481927710843373
787: 61ms 1.5311284046692606
1196: 138ms 1.5196950444726811
1810: 139ms 1.5133779264214047
2731: 299ms 1.5088397790055248
4112: 341ms 1.5056755767118273
6184: 681ms 1.5038910505836576
9292: 1324ms 1.5025873221216042
Even though each push is profiled, the output contains only those that take time above a certain threshold. For each test I customized the threshold to exclude all the pushes that appear to be representing the fast pushes.
So the first number represents which element has been inserted (the first line is for the 17th element), the second is how long it took (for many arrays the benchmark is done for in parallel), and the last value is the division of the first number by that of the one in the former line.
All lines that have less than 2ms execution time are excluded for Chrome.
You can see that Chrome increases array size in powers of 1.5, plus some offset to account for small arrays.
For Firefox, it's a power of two:
126: 284ms
254: 65ms 2.015873015873016
510: 28ms 2.0078740157480315
1022: 58ms 2.003921568627451
2046: 89ms 2.0019569471624266
4094: 191ms 2.0009775171065494
8190: 364ms 2.0004885197850513
I had to put the threshold up quite a bit in Firefox, that's why we start at #126.
With IE, we get a mix:
256: 11ms 256
512: 26ms 2
1024: 77ms 2
1708: 113ms 1.66796875
2848: 154ms 1.6674473067915691
4748: 423ms 1.6671348314606742
7916: 944ms 1.6672283066554338
It's a power of two at first and then it moves to powers of five thirds.
So all common implementations use the "normal" way for arrays (instead of going crazy with ropes, for example).
Here's the benchmark code and here's the fiddle it's in.
var arrayCount = 10000;
var dynamicArrays = [];
for(var j=0;j<arrayCount;j++)
dynamicArrays[j] = [];
var lastLongI = 1;
for(var i=0;i<10000;i++)
{
var before = Date.now();
for(var j=0;j<arrayCount;j++)
dynamicArrays[j][i] = i;
var span = Date.now() - before;
if (span > 10)
{
console.log(i + ": " + span + "ms" + " " + (i / lastLongI));
lastLongI = i;
}
}
While running under node.js 0.10 (built on v8) I was seeing CPU usage that seemed excessive for the workload. I traced one performance problem to a function that was checking for the existence of a string in an array. So I ran some tests.
loaded 90,822 hosts
loading config took 0.087 seconds (array)
loading config took 0.152 seconds (object)
Loading 91k entries into an array (with validate & push) is faster than setting obj[key]=value.
In the next test, I looked up every hostname in the list one time (91k iterations, to average the lookup time):
searching config took 87.56 seconds (array)
searching config took 0.21 seconds (object)
The application here is Haraka (a SMTP server) and it loads the host_list once at startup (and after changes) and subsequently performs this lookup millions of times during operation. Switching to an object was a huge performance win.

Secure random numbers in javascript?

How do I generate cryptographically secure random numbers in javascript?
There's been discussion at WHATWG on adding this to the window.crypto object. You can read the discussion and check out the proposed API and webkit bug (22049).
Just tested the following code in Chrome to get a random byte:
(function(){
var buf = new Uint8Array(1);
window.crypto.getRandomValues(buf);
alert(buf[0]);
})();
In order, I think your best bets are:
window.crypto.getRandomValues or window.msCrypto.getRandomValues
The sjcl library's randomWords function (http://crypto.stanford.edu/sjcl/)
The isaac library's random number generator (which is seeded by Math.random, so not really cryptographically secure) (https://github.com/rubycon/isaac.js)
window.crypto.getRandomValues has been implemented in Chrome for a while now, and relatively recently in Firefox as well. Unfortunately, Internet Explorer 10 and before do not implement the function. IE 11 has window.msCrypto, which accomplishes the same thing. sjcl has a great random number generator seeded from mouse movements, but there's always a chance that either the mouse won't have moved sufficiently to seed the generator, or that the user is on a mobile device where there is no mouse movement whatsoever. Thus, I recommend having a fallback case where you can still get a non-secure random number if there is no choice. Here's how I've handled this:
function GetRandomWords (wordCount) {
var randomWords;
// First we're going to try to use a built-in CSPRNG
if (window.crypto && window.crypto.getRandomValues) {
randomWords = new Int32Array(wordCount);
window.crypto.getRandomValues(randomWords);
}
// Because of course IE calls it msCrypto instead of being standard
else if (window.msCrypto && window.msCrypto.getRandomValues) {
randomWords = new Int32Array(wordCount);
window.msCrypto.getRandomValues(randomWords);
}
// So, no built-in functionality - bummer. If the user has wiggled the mouse enough,
// sjcl might help us out here
else if (sjcl.random.isReady()) {
randomWords = sjcl.random.randomWords(wordCount);
}
// Last resort - we'll use isaac.js to get a random number. It's seeded from Math.random(),
// so this isn't ideal, but it'll still greatly increase the space of guesses a hacker would
// have to make to crack the password.
else {
randomWords = [];
for (var i = 0; i < wordCount; i++) {
randomWords.push(isaac.rand());
}
}
return randomWords;
};
You'll need to include sjcl.js and isaac.js for that implementation, and be sure to start the sjcl entropy collector as soon as your page is loaded:
sjcl.random.startCollectors();
sjcl is dual-licensed BSD and GPL, while isaac.js is MIT, so it's perfectly safe to use either of those in any project. As mentioned in another answer, clipperz is another option, however for whatever bizarre reason, it is licensed under the AGPL. I have yet to see anyone who seems to understand what implications that has for a JavaScript library, but I'd universally avoid it.
One way to improve the code I've posted might be to store the state of the isaac random number generator in localStorage, so it isn't reseeded every time the page is loaded. Isaac will generate a random sequence, but for cryptography purposes, the seed is all-important. Seeding with Math.random is bad, but at least a little less bad if it isn't necessarily on every page load.
You can for instance use mouse movement as seed for random numbers, read out time and mouse position whenever the onmousemove event happens, feed that data to a whitening function and you will have some first class random at hand. Though do make sure that user has moved the mouse sufficiently before you use the data.
Edit: I have myself played a bit with the concept by making a password generator, I wouldn't guarantee that my whitening function is flawless, but being constantly reseeded I'm pretty sure that it's plenty for the job: ebusiness.hopto.org/generator.htm
Edit2: It now sort of works with smartphones, but only by disabling touch functionality while the entropy is gathered. Android won't work properly any other way.
Use window.crypto.getRandomValues, like this:
var random_num = new Uint8Array(2048 / 8); // 2048 = number length in bits
window.crypto.getRandomValues(random_num);
This is supported in all modern browsers and uses the operating system's random generator (e.g. /dev/urandom). If you need IE11 compatibility, you have to use their prefixed implementation viavar crypto = window.crypto || window.msCrypto; crypto.getRandomValues(..) though.
Note that the window.crypto API can also generate keys outright, which may be the better option.
Crypto-strong
to get cryptographic strong number from range [0, 1) (similar to Math.random()) use crypto:
let random = ()=> crypto.getRandomValues(new Uint32Array(1))[0]/2**32;
console.log( random() );
You might want to try
http://sourceforge.net/projects/clipperzlib/
It has an implementation of Fortuna which is a cryptographically secure random number generator. (Take a look at src/js/Clipperz/Crypto/PRNG.js). It appears to use the mouse as a source of randomness as well.
First of all, you need a source of entropy. For example, movement of the mouse, password, or any other. But all of these sources are very far from random, and guarantee you 20 bits of entropy, rarely more. The next step that you need to take is to use the mechanism like "Password-Based KDF" it will make computationally difficult to distinguish data from random.
Many years ago, you had to implement your own random number generator and seed it with entropy collected by mouse movement and timing information. This was the Phlogiston Era of JavaScript cryptography. These days we have window.crypto to work with.
If you need a random integer, random-number-csprng is a great choice. It securely generates a series of random bytes and then converts it into an unbiased random integer.
const randomInt = require("random-number-csprng");
(async function() {
let random = randomInt(10, 30);
console.log(`Your random number: ${random}`);
})();
If you need a random floating point number, you'll need to do a little more work. Generally, though, secure randomness is an integer problem, not a floating point problem.
I know i'm late to the party, but if you don't want to deal with the math of getting a cryptographically secure random value, i recommend using rando.js. it's a super small 2kb library that'll give you a decimal, pick something from an array, or whatever else you want- all cryptographically secure.
It's on npm too.
Here's a sample I copied from the GitHub, but it does more than this if you want to go there and read about it more.
console.log(rando()); //a floating-point number between 0 and 1 (could be exactly 0, but never exactly 1)
console.log(rando(5)); //an integer between 0 and 5 (could be 0 or 5)
console.log(rando(5, 10)); //a random integer between 5 and 10 (could be 5 or 10)
console.log(rando(5, "float")); //a floating-point number between 0 and 5 (could be exactly 0, but never exactly 5)
console.log(rando(5, 10, "float")); //a floating-point number between 5 and 10 (could be exactly 5, but never exactly 10)
console.log(rando(true, false)); //either true or false
console.log(rando(["a", "b"])); //{index:..., value:...} object representing a value of the provided array OR false if array is empty
console.log(rando({a: 1, b: 2})); //{key:..., value:...} object representing a property of the provided object OR false if object has no properties
console.log(rando("Gee willikers!")); //a character from the provided string OR false if the string is empty. Reoccurring characters will naturally form a more likely return value
console.log(rando(null)); //ANY invalid arguments return false
<script src="https://randojs.com/2.0.0.js"></script>
If you need large amounts, here's what I would do:
// Max value of random number length
const randLen = 16384
var randomId = randLen
var randomArray = new Uint32Array(randLen)
function random32() {
if (randomId === randLen) {
randomId = 0
return crypto.getRandomValues(randomArray)[randomId++] * 2.3283064365386963e-10
}
return randomArray[randomId++] * 2.3283064365386963e-10
}
function random64() {
if (randomId === randLen || randomId === randLen - 1) {
randomId = 0
crypto.getRandomValues(randomArray)
}
return randomArray[randomId++] * 2.3283064365386963e-10 + randomArray[randomId++] * 5.421010862427522e-20
}
console.log(random32())
console.log(random64())

Categories