Optimizing hash table implementation to accommodate large amount of elements - javascript

Consider the following scenario:
One million clients visit a store and pay an amount of money using their credit card. The credit card codes are generated using a 16-digit number, and replacing 4 of its digits (randomly) with the characters 'A', 'B', 'C', 'D'. The 16-digit number is generated randomly once, and is used for every credit card, the only change between cards being the positions in the string of the aforementioned characters (that's ~40k possible distinct codes).
I have to organize the clients in a hash table, using a hash function of my choosing and also using open addressing (linear probing) to deal with the collisions. Once organized in the table, I have to find the client who:
- paid the most money during his purchases,
- visited the store the most times.
My implementation of the hash table is as follows, and seems to be working correctly for the test of 1000 clients. However once I increase the number of clients to 10000 the page never finishes loading. This is a big issue since the total number of "shopping sessions" has to be one million, and I am not even getting close to that number.
class HashTable{
constructor(size){
this.size = size;
this.items = new Array(this.size);
this.collisions = 0;
}
put(k, v){
let hash = polynomial_evaluation(k);
//evaluating the index to the array
//using modulus a prime number (size of the array)
//This works well as long as the numbers are uniformly
//distributed and sparse.
let index = hash%this.size;
//if the array position is empty
//then fill it with the value v.
if(!this.items[index]){
this.items[index] = v;
}
//if not, search for the next available position
//and fill that with value v.
//if the card already is in the array,
//update the amount paid.
//also increment the collisions variable.
else{
this.collisions++;
let i=1, found = false;
//while the array at index is full
//check whether the card is the same,
//and if not then calculate the new index.
while(this.items[index]){
if(this.items[index] == v){
this.items[index].increaseAmount(v.getAmount());
found = true;
break;
}
index = (hash+i)%this.size;
i++;
}
if(!found){
this.items[index] = v;
}
found = false;
}
return index;
}
get(k){
let hash = polynomial_evaluation(k);
let index = hash%this.size, i=1;
while(this.items[index] != null){
if(this.items[index].getKey() == k){
return this.items[index];
}
else{
index = (hash+i)%this.size;
i++;
}
}
return null;
}
findBiggestSpender(){
let max = {getAmount: function () {
return 0;
}};
for(let item of this.items){
//checking whether the specific item is a client
//since many of the items will be null
if(item instanceof Client){
if(item.getAmount() > max.getAmount()){
max = item;
}
}
}
return max;
}
findMostFrequentBuyer(){
let max = {getTimes: function () {
return 0;
}};
for(let item of this.items){
//checking whether the specific item is a client
//since many of the items will be null
if(item instanceof Client){
if(item.getTimes() > max.getTimes()){
max = item;
}
}
}
return max;
}
}
The key I use to calculate the index into the array is a list of 4 integers ranging from 0 to 15, denoting the positions of 'A', 'B', 'C', 'D' in the string.
Here's the hash function I am using:
function polynomial_evaluation(key, a=33){
//evaluates the expression:
// x1*a^(d-1) + x2*a^(d-2) + ... + xd
//for a given key in the form of a tuple (x1,x2,...,xd)
//and for a nonzero constant "a".
//for the evaluation of the expression horner's rule is used:
// x_d + a*(x_(d-1) + a(x_(d-2) + .... + a*(x_3 + a*(x_2 + a*x1))... ))
//defining a new key with the elements of the
//old times 2,3,4 or 5 depending on the position
//this helps for "spreading" the values of the keys
let nKey = [key[0]*2, key[1]*3, key[2]*4, key[3]*5];
let sum=0;
for(let i=0; i<nKey.length; i++){
sum*=a;
sum+=nKey[i];
}
return sum;
}
The values corresponding to the keys generated by the hash function are instances of a Client class which contains the fields amount (the amount of money paid), times (the times this particular client shopped), key (the array of 4 integers mentioned above), as well as getter functions for those fields. In addition there's a method that increases the amount when the same client appears more than once.
The size of the hash table is 87383 (a prime number) and the code in my main file looks like this:
//initializing the clients array
let clients = createClients(10000);
//creating a new hash table
let ht = new HashTable(N);
for(let client of clients){
ht.put(client.getKey(), client);
}
This keeps running until Google Chrome gives a "page not responding" error. Is there any way I can make this faster? Is my approach to the problem (perhaps even my choice of language) correct?
Thanks in advance.

The page is not responding because the main (UI) thread is blocked. Use a Web Worker to handle the calculations, and post the results back to the main thread as messages.
Regarding optimizing your code, one thing I see is in findBiggestSpender. I'll break it down line-by-line.
let max = {getAmount: function () {
return 0;
}};
This is wasteful. Keep the largest amount seen so far in a local variable; there is no need to keep calling max.getAmount() in every iteration.
for(let item of this.items){
An indexed for loop with a cached length is typically at least as fast as for...of here: for (let i = 0, len = this.items.length; i < len; i++), reading the element with const item = this.items[i].
if(item instanceof Client){
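This is slower than a null check. Note that slots that were never filled in an array created with new Array(size) read as undefined rather than null, so use the loose comparison item != null, which excludes both.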
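Put together, the three suggestions above might look like the sketch below. The Client class here is only a minimal stand-in for the asker's class (its exact shape is an assumption), and the function is written as a free function taking the items array so it can run on its own. Unlike the original, it returns null for an empty table instead of a dummy object.

```javascript
// Minimal stand-in for the asker's Client class (assumed shape).
class Client {
  constructor(key, amount) {
    this.key = key;
    this.amount = amount;
  }
  getAmount() {
    return this.amount;
  }
}

// Scans the table once, tracking the best amount in a local variable.
function findBiggestSpender(items) {
  let best = null;
  let bestAmount = 0;
  for (let i = 0, len = items.length; i < len; i++) {
    const item = items[i];
    // Loose comparison skips both null and never-filled (undefined) slots.
    if (item != null) {
      const amount = item.getAmount();
      if (amount > bestAmount) {
        bestAmount = amount;
        best = item;
      }
    }
  }
  return best;
}
```

Inside the class this would read this.items instead of taking a parameter; findMostFrequentBuyer can follow the same pattern with getTimes().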


(how) can I tell my JavaScript to append a variable/column to the data Qualtrics exports?

I've implemented a task in Qualtrics that randomly selects a number between 0 and 3, and then selects a corresponding word pool to sample 5 words from. To be able to analyze these data, though, I need to know which 5 words (or at minimum, the index number or name of the word pool being sampled from) is presented to each respondent. Is there a way to implement the recording of this information within JavaScript? Ideally this information would show up when I use Qualtrics' native "export" options, but if I have to somehow create a second spreadsheet with this treatment data, that works just fine as well.
Qualtrics.SurveyEngine.addOnload(function()
{
// first, create four arrays for the four word pools used in task
var wordpool1 = []
var wordpool2 = []
var wordpool3 = []
var wordpool4 = []
// assemble word list arrays into one array, with index 0-3
let masterwordlist = [wordpool1, wordpool2, wordpool3, wordpool4]
// function that randomly chooses an integer between x and y
function randomInteger(min, max) {
return Math.floor(Math.random() * (max - min + 1)) + min;
}
// function that shuffles (randomizes) a word list array (Fisher-Yates shuffle )
function shuffle(target){
for (var i = target.length - 1; i > 0; i--){
var j = Math.floor(Math.random() * (i + 1));
var temp = target[i];
target[i] = target[j];
target[j] = temp;
}
return target;
}
// function that chooses 5 words from a shuffled word list array, returns those 5 words as array
function pickWords(target) {
var randomwords = shuffle(target)
return randomwords.slice(0, 5);
}
// top-level function
function genWords(masterlist){
var x = randomInteger(0, 3)
return pickWords(masterlist[x])
}
// actually running the function
var randomwords = genWords(masterwordlist)
// save final output as embedded qualtrics data
Qualtrics.SurveyEngine.setEmbeddedData("randomwords", randomwords);
});
Is there a way I can have this code record (within Qualtrics or otherwise) which values var x or var randomwords take on?
EDIT: I found another answer on here which may be relevant. According to this answer, though, it looks like I have all the code needed to record my variable selection; do I simply need to set embedded data within the survey flow, as well?
See here: Is it possible to save a variable from javascript to the qualtrics dataset?
Yes, you need to define the embedded data field randomwords in the survey flow.
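If you also want to know which pool was sampled, one option is to have the top-level function return the pool index together with the words and save both as strings. This is only a sketch: genWordsWithIndex, the field name wordpool, and the placeholder word lists are made up for illustration, and both fields would still have to be declared in the survey flow.

```javascript
// Hypothetical placeholder pools; substitute the real word lists.
var masterwordlist = [
  ["a1", "a2", "a3", "a4", "a5", "a6"],
  ["b1", "b2", "b3", "b4", "b5", "b6"],
  ["c1", "c2", "c3", "c4", "c5", "c6"],
  ["d1", "d2", "d3", "d4", "d5", "d6"]
];

// Fisher-Yates shuffle, same as in the question.
function shuffle(target) {
  for (var i = target.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var temp = target[i];
    target[i] = target[j];
    target[j] = temp;
  }
  return target;
}

// Returns the chosen pool index along with the 5 sampled words.
function genWordsWithIndex(masterlist) {
  var x = Math.floor(Math.random() * masterlist.length);
  var words = shuffle(masterlist[x].slice()).slice(0, 5);
  return { poolIndex: x, words: words };
}

var picked = genWordsWithIndex(masterwordlist);

// Qualtrics only exists inside a survey page, hence the guard.
if (typeof Qualtrics !== "undefined") {
  Qualtrics.SurveyEngine.setEmbeddedData("wordpool", String(picked.poolIndex));
  Qualtrics.SurveyEngine.setEmbeddedData("randomwords", picked.words.join(","));
}
```

Joining the words with a comma keeps each field a single string, which is easy to split again during analysis.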

How to generate a new random number (that's different from the previous random number)

I'm trying to change the following (that currently returns a random number from an array), so that each random number is different from the last one chosen.
function randomize(arr) {
return arr[Math.floor(Math.random()*arr.length)];
}
oracleImg = [];
for (var i=1;i<=6;i++) {
oracleImg.push(i);
}
randOracleImg = randomize(oracleImg);
I tried the following, but it's not always giving me a number different from the last number.
function randomize(arr) {
var arr = Math.floor(Math.random()*arr.length);
if(arr == this.lastSelected) {
randomize();
}
else {
this.lastSelected = arr;
return arr;
}
}
How can I fix this?
Your existing function's recursive randomize() call doesn't make sense because you don't pass it the arr argument and you don't do anything with its return value. That line should be:
return randomize(arr);
...except that by the time it gets to that line you have reassigned arr so that it no longer refers to the original array. Using an additional variable as in the following version should work.
Note that I've also added a test to make sure that if the array has only one element we return that item immediately because in that case it's not possible to select a different item each time. (The function returns undefined if the array is empty.)
function randomize(arr) {
if (arr.length < 2) return arr[0];
var num = Math.floor(Math.random()*arr.length);
if(num == this.lastSelected) {
return randomize(arr);
} else {
this.lastSelected = num;
return arr[num];
}
}
document.querySelector("button").addEventListener("click", function() {
console.log(randomize(["a","b","c","d"]));
});
<button>Test</button>
Note that your original function seemed to be returning a random array index, but the code shown in my answer returns a random array element.
Note also that the way you are calling your function means that within the function this is window - not sure if that's what you intended; it works, but basically lastSelected is a global variable.
Given that I'm not keen on creating global variables needlessly, here's an alternative implementation with no global variables, and without recursion because in my opinion a simple while loop is a more semantic way to implement the concept of "keep trying until x happens":
var randomize = function () {
var lastSelected, num;
return function randomize(arr) {
if (arr.length < 2) return arr[0];
while (lastSelected === (num = Math.floor(Math.random()*arr.length)));
lastSelected = num;
return arr[num];
};
}();
document.querySelector("button").addEventListener("click", function() {
console.log(randomize(["a","b","c","d"]));
});
<button>Test</button>
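A variation that avoids retrying altogether is to draw from the n - 1 indices other than the previous one. The following is a sketch of that idea rather than a drop-in replacement; the names makeNonRepeatingPicker and pickNext are made up for illustration.

```javascript
function makeNonRepeatingPicker() {
  var last = -1; // index returned by the previous call, -1 before the first call
  return function pick(arr) {
    if (arr.length < 2) return arr[0];
    var num;
    if (last < 0 || last >= arr.length) {
      // No usable previous index: any position is allowed.
      num = Math.floor(Math.random() * arr.length);
    } else {
      // Draw from the other arr.length - 1 indices, stepping over `last`.
      num = Math.floor(Math.random() * (arr.length - 1));
      if (num >= last) num++;
    }
    last = num;
    return arr[num];
  };
}

var pickNext = makeNonRepeatingPicker();
console.log(pickNext(["a", "b", "c", "d"]));
```

Every call does exactly one random draw, so there is no loop or recursion that could run more than once.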
The code below is just an example: it generates 99 unique random integers in the range 0 to 998. The logic is simple: keep the numbers generated so far in a temporary array, and only accept a new random number if it is not already in that array.
var tempArray = [];
var i=0;
while (i != 99) {
var random = Math.floor((Math.random() * 999) + 0);
if (tempArray.indexOf(random)==-1) {
tempArray.push(random);
i++;
} else {
continue;
}
}
console.log(tempArray);
Here is a version which ensures that each random number is always different from the last one. Additionally, you can control the min and max of the generated value (the upper bound is exclusive). The defaults are min: 1 and max: 100.
var randomize = (function () {
var last;
return function randomize(min, max) {
max = typeof max != 'number' ? 100 : max;
min = typeof min != 'number' ? 1 : min;
var random = Math.floor(Math.random() * (max - min)) + min;
if (random == last) {
return randomize(min, max);
}
last = random;
return random;
};
})();
If you want to ALWAYS return a different number from an array then don't randomize, shuffle instead!*
The simplest fair (truly random) shuffling algorithm is the Fisher-Yates algorithm. Don't make the same mistake Microsoft did and try to abuse .sort() to implement a shuffle. Just implement Fisher-Yates (otherwise known as the Knuth shuffle):
// Fisher-Yates shuffle:
// Note: This function shuffles in-place, if you don't
// want the original array to change then pass a copy
// using [].slice()
function shuffle (theArray) {
var tmp;
// Walk backwards; positions after i are already final.
for (var i=theArray.length-1; i>0; i--) {
// Generate random index into the not-yet-shuffled part (0..i):
var j = Math.floor(Math.random()*(i+1));
// Swap current item with random item:
tmp = theArray[i];
theArray[i] = theArray[j];
theArray[j] = tmp;
}
return theArray;
}
So just do:
shuffledOracleImg = shuffle(oracleImg.slice());
var i=0;
randOracleImg = shuffledOracleImg[i++]; // just get the next image
// to get a random image
How you want to handle running out of images is up to you. Media players like iTunes or the music player on iPhones, iPads and iPods give users the option of stop playing or repeat from beginning. Some card game software will reshuffle and start again.
*note: One of my pet-peeves is music player software that randomize instead of shuffle. Randomize is exactly the wrong thing to do because 1. some implementations don't check if the next song is the same as the current song so you get a song played twice (what you seem to want to avoid) and 2. some songs end up NEVER getting played. Shuffling and playing the shuffled playlist from beginning to end avoids both problems. CD player manufacturers got it right. MP3 player developers tend to get it wrong.
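The "reshuffle and start again" option can be sketched as a small shuffle bag. The name makeShuffleBag, and the rule of swapping away a repeat at the reshuffle boundary, are assumptions made for illustration; the sketch assumes the items are distinct and carries its own Fisher-Yates shuffle so that it runs on its own.

```javascript
function fisherYates(a) {
  for (var i = a.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
  return a;
}

function makeShuffleBag(items) {
  var bag = [];
  var last; // last value handed out
  return function next() {
    if (bag.length === 0) {
      // Bag is empty: refill it with a freshly shuffled copy.
      bag = fisherYates(items.slice());
      // Values are taken from the end of the bag, so if the end equals the
      // previous value, swap it with the front to avoid a back-to-back repeat.
      if (bag.length > 1 && bag[bag.length - 1] === last) {
        var t = bag[0];
        bag[0] = bag[bag.length - 1];
        bag[bag.length - 1] = t;
      }
    }
    last = bag.pop();
    return last;
  };
}

var nextImg = makeShuffleBag([1, 2, 3, 4, 5, 6]);
console.log(nextImg());
```

Each pass through the bag shows every image exactly once, and the boundary swap keeps the last image of one pass from being the first image of the next.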

Give structure to 'random' function js?

I have an array and a function that picks randomly elements from this array and displays them in a div.
My array:
var testarray = ["A", "B", "C", "D", "E", "F"];
Part of the js function:
var new_word = testarray[Math.floor(Math.random()*testarray.length)];
$("#stimuli").text(new_word);
My question is, is there a way I can have them picked randomly in a certain ratio/order?
For example, that if I have my function executed 12 times, that each of the six letters is displayed exactly twice, and that there can never be the same letter displayed twice in a row?
You might want to try a quasi-random sequence. These sequences have the properties you're after. http://en.wikipedia.org/wiki/Low-discrepancy_sequence
Edit:
To your question in the comment: of course there are hundreds of ways to solve a problem. You could use artificial intelligence, a mathematical algorithm, or the answers given by others here. It depends on what you really want to achieve. I just gave a robust solution that is easy to understand and implement.
Here's another (different) approach with the same result, but which also tries to prevent a value from being displayed twice in a row.
Jsfiddle: http://jsfiddle.net/kychan/jJE7F/
Code:
function StructuredRandom(arr, nDisplay)
{
// storage array.
this.mVar = [];
this.previous;
// add it in the storage.
for (var i in arr)
for (var j=0; j<nDisplay; j++)
this.mVar.push(arr[i]);
// shuffle it, making it 'random'.
for(var a, b, c = this.mVar.length; c; a = Math.floor(Math.random() * c), b = this.mVar[--c], this.mVar[c] = this.mVar[a], this.mVar[a] = b);
// call this when you want the next item.
this.next = function()
{
// default value if empty.
if (this.mVar.length==0) return 0;
// if this is the last element...
if (this.mVar.length==1)
{
// we must give it, because we can't 'control'
// re-occurring values at this point
// (the alternative would be to return a default value such as -1).
return this.mVar.pop();
}
// fetch next element.
var element = this.mVar.pop();
// check if this was already given before.
if (element==this.previous)
{
// put it on top if so.
this.mVar.unshift(element);
// call the function again for next number.
return this.next();
}
// set 'previous' for next call.
this.previous = element;
// give an element if not.
return element;
};
}
NOTE: In this example we can't fully prevent the same value from being displayed twice in a row. We can control the earlier picks, but when there is only one element left to display, we must either give it or return a default value for it, so there is a chance that the same value is shown again.
Good luck!
Like this?
var arr = [1,2,3,4,5,6,7], // array with random values.
maxDispl = 2, // max display.
arr2 = init(arr) // storage.
;
// create object of given array.
function init(arr)
{
var pop = [];
for (var i in arr)
{
pop.push({value:arr[i], displayed:0});
}
return pop;
}
// show random number using global var arr2.
function showRandom()
{
// return if all numbers has been given.
if (arr2.length<1) return;
var randIndex= Math.floor(Math.random()*arr2.length);
if (arr2[randIndex].displayed<maxDispl)
{
document.getElementById('show').innerHTML+=arr2[randIndex].value + ', ';
arr2[randIndex].displayed++;
}
else
{
// remove from temp array.
arr2.splice(randIndex, 1);
// search for a new random.
showRandom();
}
}
// iterate the function *maxDispl plus random.
var length = (arr.length*maxDispl) + 2;
for (var i=0; i<length; i++)
{
showRandom();
}
jsfiddle: http://jsfiddle.net/kychan/JfV77/3/
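Another way to meet the stated requirement exactly (every item shown the same number of times, never the same item twice in a row, including at the very end) is to build the whole sequence up front: repeat each item, shuffle, and reshuffle until no two neighbours match. This is a sketch under the assumption that the items are distinct; balancedSequence and the helper names are made up for illustration.

```javascript
function shuffleInPlace(a) {
  for (var i = a.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
  return a;
}

function hasAdjacentRepeat(a) {
  for (var i = 1; i < a.length; i++) {
    if (a[i] === a[i - 1]) return true;
  }
  return false;
}

function balancedSequence(items, times) {
  // With a single item shown more than once, a repeat is unavoidable.
  if (items.length === 1 && times > 1) {
    throw new Error("need at least two distinct items");
  }
  var seq = [];
  for (var i = 0; i < items.length; i++) {
    for (var j = 0; j < times; j++) seq.push(items[i]);
  }
  // Rejection sampling: reshuffle until no neighbours are equal.
  do {
    shuffleInPlace(seq);
  } while (hasAdjacentRepeat(seq));
  return seq;
}

var sequence = balancedSequence(["A", "B", "C", "D", "E", "F"], 2);
console.log(sequence.join(" "));
```

Each call of the display function then simply takes the next element of sequence. For small inputs such as six letters shown twice, only a few reshuffles are usually needed.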

Detecting datatype of each column in javascript prepending

I am working on an algorithm that detects the datatype of each column and prepends it to the top of that column. So, given a 2D matrix:
1) I need to detect the datatype of each element
2) Count, column-wise, how many elements there are of each type
3) Find the most frequent type in each column
4) Prepend the type from 3) at the top of the respective column
For the first step, I know 2 techniques:
1) via jQuery
$.each(row, function (index, item) {
alert(typeof(item)); //result is object instead of specific type
});
2) matrix traversing
for (var i = 0; i < data1[0].length; i++) {
for (var j = 0; j < data1.length; j++) {
var r = data1[j][i];
if (isFinite(r)) // further conditions were elided in the original
numeric++;
else {str++;}
}
}
I know this is not the best method, but it is working for me well in giving number of string and numeric types
I know there is an unshift() method that prepends data at the top of a list, but I am still not sure how it would work for me.
Any help and suggestions.
Not sure what your end goal is. One would likely say a column is string if any of the values is a string, and number only if all items are numbers.
Be careful about polluting the global scope with variables. Usually you would confine them to closed scopes.
Deciding between "more strings" and "more numbers" also has to account for cases where the count is equal, e.g. five numbers and five strings.
The following could be a start:
// Should be expanded to fully meet needs.
function isNumber(n) {
return !isNaN(n + 0) /* && various */;
}
// Count types of data in column.
// Return array with one object for each column.
function arrTypeStat(ar) {
var type = [], i, j;
// Loop by column
for (i = 0; i < ar[0].length; ++i) {
// Type-stats for column 'i'
type[i] = {num: 0, str: 0};
// Loop rows for current column.
for (j = 1; j < ar.length; ++j) {
if (isNumber(ar[j][i]))
++type[i].num;
else
++type[i].str;
}
}
return type;
}
// Add a row after header with value equal to 'num' or 'str' depending
// on which type is most frequent. Favoring string if count is equal.
function arrAddColType(ar) {
var type = arrTypeStat(ar), i; // declare i locally so it does not leak into the global scope
// Add new array as second item in 'ar'
// as in: add a new (empty) row. This is the row to be filled
// with data types.
ar.splice(1, 0, []);
// Note:
// Favour string over number if count is equal.
for (i = 0; i < type.length; ++i) {
ar[1][i] = type[i].num > type[i].str ? 'num' : 'str';
}
return ar;
}
Sample fiddle. Use console, F12, to view result of test. Note that in this sample the numbers are represented both as quoted and un-quoted.
// Sample array.
var arrTest = [
['album','artist','price'],
[50,5,'50'],
[5,5,'5'],
['Lateralus', 'Tool', '13'],
['Ænima','Tool','12'],
['10,000 days','Tool',14]
];
// Test
var test = arrAddColType(arrTest);
// Log
console.log(
JSON.stringify(test)
.replace(/],\[/g, '],\n['));
Yield:
[["album","artist","price"],
["str","str","num"],
[50,5,"50"],
[5,5,"5"],
["Lateralus","Tool","13"],
["Ænima","Tool","12"],
["10,000 days","Tool",14]]
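As the comment on isNumber says, the check needs expanding for real data: with n + 0, values such as null, true, or an empty string are counted as numbers. One possible stricter version is sketched below. The name isNumeric is made up, and what should count as numeric (for example, whether padded strings like " 5 " qualify) is a decision for your data.

```javascript
// Accepts finite numbers and non-empty strings that parse to a finite number.
function isNumeric(v) {
  if (typeof v === "number") return isFinite(v);
  if (typeof v === "string") return v.trim() !== "" && isFinite(Number(v));
  // null, undefined, booleans, objects and so on are not treated as numbers.
  return false;
}

console.log(isNumeric("13"), isNumeric("Tool"), isNumeric(""));
```

It can be swapped in for isNumber inside arrTypeStat without other changes.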

Generating random unique data takes too long and eats 100% CPU

WARNING: CPU Usage goes to 100%, be careful.
Link to the jsFiddle
This script was written to draw a dynamic snakes and ladders board. Every time the page is refreshed a new board is created. Most of the time not all of the background images appear, and the CPU usage goes up to 100%. But on occasion all of them appear and the CPU usage is normal.
Opera shows some of the background images, Firefox lags and asks me if I wish to stop the script.
I believe that the problem is with these lines of code:
for(var key in origin) // Need to implement check to ensure that two keys do not have the same VALUES!
{
if(origin[key] == random_1 || origin[key] == random_2 || key == random_2) // End points cannot be the same AND starting and end points cannot be the same.
{
valFlag = 1;
}
console.log(key);
}
Your algorithm is very inefficient. When the array is almost filled up, you literally do millions of useless iterations until you get lucky and the RNG happens to pick a missing number. Rewrite it to:
Generate an array of all possible numbers, from 1 to 99.
When you need a random number, generate a random index within the current bounds of this array, splice out the element at that position (removing it from the array), and use its value as your desired random number.
If the generated numbers don't fit some of your conditions (minDiff?), return them to the array. Do note that you can still loop forever if nothing that is left in the array can satisfy your conditions.
Every value you pull from array in this way is guaranteed to be unique, since you originally filled it with unique numbers and remove them on use.
I've stripped drawing and placed generated numbers into array that you can check in console. Put your drawing back and it should work - numbers are generated instantly now:
var snakes = ['./Images/Snakes/snake1.png','./Images/Snakes/snake2.jpg','./Images/Snakes/snake3.gif','./Images/Snakes/snake4.gif','./Images/Snakes/snake5.gif','./Images/Snakes/snake6.jpg'];
var ladders = ['./Images/Ladders/ladder1.jpg','./Images/Ladders/ladder2.jpg','./Images/Ladders/ladder3.png','./Images/Ladders/ladder4.jpg','./Images/Ladders/ladder5.png'];
function drawTable()
{
// Now generating snakes.
generateRand(snakes,0);
generateRand(ladders,1);
}
var uniqNumbers = []
for(var idx = 1; idx < 100; idx++){ uniqNumbers.push(idx) }
var results = []
function generateRand(arr,flag)
{
var valFlag = 0;
var minDiff = 8; // Minimum difference between start of snake/ladder to its end.
var temp;
for(var i = 0; i< arr.length; ++i) {
var valid = false
// This is the single place it can still hang, though with the current size of the arrays it is highly unlikely
do {
var random_1 = uniqNumbers.splice(Math.random() * uniqNumbers.length, 1)[0]
var random_2 = uniqNumbers.splice(Math.random() * uniqNumbers.length, 1)[0]
if (Math.abs(random_1 - random_2) < minDiff) {
// return numbers
uniqNumbers.push(random_1)
uniqNumbers.push(random_2)
} else {
valid = true
}
} while (!valid);
if(flag == 0) // Snake
{
if(random_1 < random_2) // Swapping them if the first number is smaller than the second number.
{
var temp = random_1; random_1 = random_2; random_2 = temp
}
}
else // Ladders
{
if(random_1>random_2) // Swapping them if the first number is greater than the second number.
{
var temp = random_1; random_1 = random_2; random_2 = temp
}
}
// Just for debug - look results up on console
results.push([random_1, random_2])
}
}
drawTable()
I had a problem like this using Highcharts in a for loop. Browsers have built-in functionality to detect dead scripts or infinite loops, so the browser halts or pops up a message saying the page is not responding. Not sure if you have that symptom!
This resulted from a loop over a large pool of data. I wrote a tutorial on it on CodeProject; you might try it, and it might be your answer.
http://www.codeproject.com/Tips/406739/Preventing-Stop-running-this-script-in-Browsers
