How to convert an array to a matrix and vice versa? - javascript

I was trying to solve a 2D matrix problem and stumbled upon the matrix-array conversion formulas:
an r * c matrix converts to an array => matrix[x][y] => arr[x * c + y]
an array converts to an r * c matrix => arr[x] => matrix[x / c][x % c]
But I can't seem to understand why it works like this. Can someone explain how the conversion works? I mean, why do the conversions use the number of columns (c)? And why division and remainder ([x / c][x % c])? And why x * c + y?
I was solving the problem in JavaScript.

First have a look at "Row Major Vs Column Major Order: 2D arrays access in programming languages" on the Computer Science Stack Exchange as an insight into the main ways an array can be serialized:
Row order stores all the values for a row contiguously in the serialized output, starting with the first row, before proceeding to the next row.
Column order stores all the values for a column contiguously, starting with the first column, before proceeding to the next.
Formula Confusion
The quoted formula creates output in row order because it multiplies one of the indexes by the number of columns, not the number of rows, which means index x is being used as the row index. This is nonsensical in a spreadsheet context, where "x" means column and "y" means row.
JavaScript Formula for row order storage of a row/column matrix
matrix[row][column] is serialized to arr[row * c + col]
arr[index] is restored to matrix[Math.floor(index / c)][index % c]
where calculating the row during de-serialization requires truncating the division result, since JavaScript has no integer division operator to match the one used in the quoted formula.
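Putting the two formulas into code, a minimal sketch (the function names are illustrative):

// Flatten an r x c matrix into a one-dimensional array in row order.
function toArray(matrix, r, c) {
  const arr = new Array(r * c);
  for (let row = 0; row < r; row++) {
    for (let col = 0; col < c; col++) {
      arr[row * c + col] = matrix[row][col];
    }
  }
  return arr;
}

// Restore an r x c matrix from a row-order array.
function toMatrix(arr, r, c) {
  const matrix = [];
  for (let index = 0; index < arr.length; index++) {
    const row = Math.floor(index / c);
    const col = index % c;
    if (col === 0) matrix[row] = [];
    matrix[row][col] = arr[index];
  }
  return matrix;
}

const m = [[1, 2, 3], [4, 5, 6]];  // 2 x 3
const flat = toArray(m, 2, 3);     // [1, 2, 3, 4, 5, 6]
const back = toMatrix(flat, 2, 3); // [[1, 2, 3], [4, 5, 6]]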
Typical JavaScript Application
ImageData stores canvas pixel data in row order:
Pixels then proceed from left to right, then downward, throughout the [serialized] array.
It's not that you can't use x and y variable names to access image pixel data, but bear in mind that x is a column index and y is the row number from the top.
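For instance, a small helper (illustrative, not part of any API) that reads the pixel at column x, row y with the same row-order formula:

// Read the RGBA values of the pixel at column x, row y of an ImageData object.
// Each pixel occupies 4 consecutive bytes in the row-order data array.
function getPixel(imageData, x, y) {
  const i = (y * imageData.width + x) * 4; // row * columns + column, times 4 bytes
  const d = imageData.data;
  return { r: d[i], g: d[i + 1], b: d[i + 2], a: d[i + 3] };
}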

Related

Finding a pattern in an array that is not always consistent

I have an ordered data set of decimal numbers. This data is always similar, but not always the same. The expected data is a few (0 - 5) large numbers, followed by several (10 - 90) average numbers, then followed by smaller numbers. There are cases where a large number may be mixed into the average numbers. See the following arrays.
let expectedData = [35.267,9.267,9.332,9.186,9.220,9.141,9.107,9.114,9.098,9.181,9.220,4.012,0.132];
let expectedData = [35.267,32.267,9.267,9.332,9.186,9.220,9.141,9.107,30.267,9.114,9.098,9.181,9.220,4.012,0.132];
I am trying to analyze the data by getting the average without the high numbers at the front and the low numbers at the back. The high/low values in the middle are fine to keep in the average. I have a partial solution below. Right now I am sort of brute-forcing it, but the solution isn't perfect. On smaller datasets the first average calculation is influenced by the large number.
My question is: Is there a way to handle this type of problem, which is identifying patterns in an array of numbers?
My algorithm is:
Get an average of the array
Calculate an above/below average value
Remove front (n) elements that are above average
Remove end elements that are below average
Recalculate average
In JavaScript I have (this is partial, leaving out the below-average part):
let total = expectedData.reduce((rt, cur) => rt + cur, 0);
let avg = total / expectedData.length;
let aboveAvg = avg * 0.1 + avg;
let remove = -1;
for (let k = 0; k < expectedData.length; k++) {
  if (expectedData[k] > aboveAvg) {
    remove = k;
  } else {
    if (k == 0) {
      remove = -1; // no need to remove
    }
    // break because we don't want large values from the middle removed
    break;
  }
}
if (remove >= 0) {
  // remove front above-average values
  expectedData.splice(0, remove + 1);
}
// remove below-average values at the end
// recalculate average
I believe you are looking for an outlier detection algorithm. There are already a bunch of questions related to this on Stack Overflow.
However, each outlier detection algorithm has its own merits. Here are a few of them:
https://mathworld.wolfram.com/Outlier.html
High outliers are anything beyond the 3rd quartile + 1.5 * the interquartile range (IQR)
Low outliers are anything beneath the 1st quartile - 1.5 * IQR
Grubbs's test
You can check how it works for your expectedData here.
Apart from these two, there is a comparison calculator here. You can visit it to try other algorithms per your needs.
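As a rough sketch, the 1.5 * IQR rule above could look like this in JavaScript (the quartiles here use a simple nearest-rank approximation):

// Remove outliers using the 1.5 * IQR rule (nearest-rank quartile approximation).
function removeOutliersIQR(data) {
  const sorted = [...data].sort((a, b) => a - b);
  const q1 = sorted[Math.floor(sorted.length * 0.25)];
  const q3 = sorted[Math.floor(sorted.length * 0.75)];
  const iqr = q3 - q1;
  return data.filter(v => v >= q1 - 1.5 * iqr && v <= q3 + 1.5 * iqr);
}

let data = [35.267, 9.267, 9.332, 9.186, 9.220, 9.141, 9.107, 9.114, 9.098, 9.181, 9.220, 4.012, 0.132];
console.log(removeOutliersIQR(data)); // drops 35.267, 4.012 and 0.132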
I would first have tried a sliding window coupled with a hysteresis/band filter in order to detect the high-value peaks.
Then, as your sliding window advances, you can add the previous first value (which is now the last of the analyzed values) to the global sum, and add 1 to the number of total values.
When you encounter a peak (= something that causes the hysteresis to move or overflow the band filter), you either remove the values (which may be costly) or, better, set the value to NaN so you can safely ignore it.
You should keep computing a sliding average within your sliding window in order to auto-correct the hysteresis/band filter, so that it rejects only the start values of a peak (the end values of a peak are the start values of the next one); once values have stabilized at a new level, values are kept again.
The size of the sliding window sets how many consecutive "stable" values are needed for values to be kept, or in other words, how many unstable values are rejected when you reach a new level.
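A much-simplified sketch of the band-filter part (it leaves out the auto-correcting hysteresis; as a tweak of my own, the reference level is seeded with the median of the first few values so that a leading peak cannot poison it):

// Replace values deviating from the window mean by more than `band` (relative) with NaN.
// The window is seeded with the median of the first few values (my own tweak,
// not part of the description above) so a leading peak is not taken as the baseline.
function bandFilter(data, windowSize = 5, band = 0.5) {
  const seed = data.slice(0, windowSize).sort((a, b) => a - b);
  const window = [seed[Math.floor(seed.length / 2)]];
  return data.map(v => {
    const mean = window.reduce((s, x) => s + x, 0) / window.length;
    if (Math.abs(v - mean) > band * mean) return NaN; // peak: ignore it later
    window.push(v);
    if (window.length > windowSize) window.shift();
    return v;
  });
}

// NaN entries can then be skipped when computing the final average:
const kept = bandFilter([35.267, 9.267, 9.332, 9.186, 9.220, 4.012]).filter(v => !Number.isNaN(v));
const average = kept.reduce((s, x) => s + x, 0) / kept.length; // about 9.25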
For that, you can compute the mode of the (rounded) values and then take all the numbers within a certain range around the mode. That range can be taken from the data itself, for example as 10% of the max - min range. That helps you filter your data, and you can select the percentage that fits your needs. Something like this:
let expectedData = [35.267, 9.267, 9.332, 9.186, 9.220, 9.141, 9.107, 9.114, 9.098, 9.181, 9.220, 4.012, 0.132];
expectedData.sort((a, b) => a - b);
// Get the range of the data
const RANGE = expectedData[expectedData.length - 1] - expectedData[0];
const WINDOW = 0.1; // window of selection: 10% from left and right
// Frequency of each (floored) number
let dist = expectedData.reduce((acc, e) => (acc[Math.floor(e)] = (acc[Math.floor(e)] || 0) + 1, acc), {});
let mode = +Object.entries(dist).sort((a, b) => b[1] - a[1])[0][0];
let newData = expectedData.filter(e => mode - RANGE * WINDOW <= e && e <= mode + RANGE * WINDOW);
console.log(newData);

Storing a Minecraft-like map as a JavaScript array

I'm writing a map editor that converts a 3D space into JavaScript arrays so that it can be exported to a JSON file.
Each map will have a 2D plane acting as a ground layer (the user will have to specify its X and Y size); then, to add height, the user can place blocks on top of this 2D plane, following an X & Y grid (similar to Minecraft).
My idea was to have an array for each Z layer and fill it with information about which blocks are placed there. Because the X and Y sizes of the map must be specified, a simple array should do the trick: to read the map, you would simply loop over each Z-layer array and fill the map with its contents, which would be another array, creating rows defined by the X and Y size of the ground layer.
I know you can fill arrays like layer[165] = grassBlock after you declare them, which would leave everything before index 165 empty and thus save space. But in JSON format, wouldn't that array have 164 zeroes or nulls before it reaches this index?
Is this even the most efficient way to store a 3D space? I'm trying to minimize map size and speed up load time as much as possible.
If you only have block/empty then a single bit is sufficient, and you can use a single JavaScript typed array for the matrix.
Assuming size of the matrix is X, Y and Z then the conversion from coordinates (x, y, z) to array index could be:
index = (x*Y + y)*Z + z;
then the map could be stored as a single Uint8Array object initialized with length (X*Y*Z + 7) >> 3 (each byte gives you 8 bits, but you need to round up).
To read/write a single bit you can then use:
bit = (matrix[index >> 3] >> (index & 7)) & 1; // Read element
matrix[index >> 3] |= 1 << (index & 7); // Set element to 1
matrix[index >> 3] &= ~(1 << (index & 7)); // Clear element to 0
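Putting the pieces together, a minimal sketch (the class name and API are illustrative):

// Bit-packed X * Y * Z block/empty matrix backed by a Uint8Array.
class BitMatrix3D {
  constructor(X, Y, Z) {
    this.Y = Y;
    this.Z = Z;
    this.bits = new Uint8Array((X * Y * Z + 7) >> 3); // round up to whole bytes
  }
  index(x, y, z) {
    return (x * this.Y + y) * this.Z + z;
  }
  get(x, y, z) {
    const i = this.index(x, y, z);
    return (this.bits[i >> 3] >> (i & 7)) & 1;
  }
  set(x, y, z, value) {
    const i = this.index(x, y, z);
    if (value) this.bits[i >> 3] |= 1 << (i & 7);
    else this.bits[i >> 3] &= ~(1 << (i & 7));
  }
}

const map = new BitMatrix3D(16, 16, 16);
map.set(3, 2, 1, 1);
console.log(map.get(3, 2, 1)); // 1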
If instead you need to store a logical ID and there are no more than 256 distinct values (including "empty"), then a single byte per element is enough. The index computation is as above, but you can use X*Y*Z as the size and then simply read/write elements with matrix[index].
If more than 256 but no more than 65536 distinct values are needed, then a Uint16Array can be used.
If most of the elements carry no specific data beyond their class (e.g. they're just "air", "land", "water") and only a small percentage require much more, then a byte map with a value for "other", plus a dictionary mapping (x,y,z) to data only for the "other" blocks, is a reasonable approach (very simple code, still fast access and update).
Note that while JavaScript has data types to store binary data efficiently, JSON unfortunately doesn't provide a type for sending/receiving arbitrary bytes (not text) over the network, so you'll need to convert to and load from base64 encoding or something similar (if you want to use JSON).
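One possible round trip in a browser (btoa/atob only handle byte-sized characters, so this simple version is fine for modest map sizes; in Node you would use Buffer instead):

// Encode a Uint8Array as a base64 string for embedding in JSON.
function toBase64(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Decode the base64 string back into a Uint8Array.
function fromBase64(text) {
  const binary = atob(text);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

const json = JSON.stringify({ blocks: toBase64(new Uint8Array([1, 2, 255])) });
const restored = fromBase64(JSON.parse(json).blocks); // Uint8Array [1, 2, 255]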

Very large plots - generated on the fly?

I have two very simple functions in Python that calculate the size of a struct (a C struct, a numpy struct, etc.) given the range of numbers you want to store. So, if you wanted to store numbers from 0 to 8389798, 8389798 would be the value you feed to the function:
def ss(value):
    nbits = 8
    while value > 2**nbits:
        nbits += 8
    return nbits * value

def mss(value):
    total_size = 0
    max_bits = [(0, 0)]  # (bits for 1 row in struct, max rows in struct)
    while value > 2**max_bits[-1][0]:
        total_size += max_bits[-1][0] * max_bits[-1][1]
        value -= max_bits[-1][1]
        new_struct_bits = max_bits[-1][0] + 8
        max_bits.append((new_struct_bits, 2**new_struct_bits))
    total_size += max_bits[-1][0] * value
    # print(max_bits)
    return total_size
ss is for a single struct where the first row (storing "1") takes as many bytes as the last row (storing "8389798"). However, this is not as space-efficient as breaking your struct up into structs of 1 byte, 2 bytes, 3 bytes, etc., up to the N bytes needed to store your maximum value. That is what mss calculates.
So now I want to see how much more efficient mss is than ss for a range of values - that range being 1 to, say, 100 billion. That's much too much data to save and plot, and it's totally unnecessary in the first place. Far better to take the plot window and, for every value of x that has a pixel in that window, calculate the value of y [which is ss(x) - mss(x)].
This sort of interactive graph is really the only way I can think of to look at the relationship between mss and ss. Does anyone know how I should plot such a graph? I'm willing to use a JavaScript solution, because I can rewrite the Python into that, as well as "solutions" like Excel, R, or Wolfram, if they offer a way to do interactive/generated graphs.
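A hedged sketch of that per-pixel idea in JavaScript: ports of ss/mss (straight translations of the Python above) plus one sample per pixel column drawn on a canvas. All names are illustrative.

// Port of ss: one struct whose row width fits the largest value.
function ss(value) {
  let nbits = 8;
  while (value > 2 ** nbits) nbits += 8;
  return nbits * value;
}

// Port of mss: a chain of 1-byte, 2-byte, ... wide structs.
function mss(value) {
  let totalSize = 0;
  let bits = 0, rows = 0; // bits per row, max rows of the current sub-struct
  while (value > 2 ** bits) {
    totalSize += bits * rows;
    value -= rows;
    bits += 8;
    rows = 2 ** bits;
  }
  return totalSize + bits * value;
}

// Sample one x per pixel column of the view [xMin, xMax] and draw y = ss(x) - mss(x).
function drawPlot(canvas, xMin, xMax) {
  const ctx = canvas.getContext("2d");
  const w = canvas.width, h = canvas.height;
  const ys = [];
  for (let px = 0; px < w; px++) {
    const x = Math.round(xMin + (px / (w - 1)) * (xMax - xMin));
    ys.push(ss(x) - mss(x));
  }
  const yMin = Math.min(...ys), yMax = Math.max(...ys);
  ctx.clearRect(0, 0, w, h);
  ctx.beginPath();
  ys.forEach((y, px) => {
    const py = h - ((y - yMin) / (yMax - yMin || 1)) * h;
    if (px === 0) ctx.moveTo(px, py); else ctx.lineTo(px, py);
  });
  ctx.stroke();
}

// Re-run with a new [xMin, xMax] whenever the user zooms or pans, e.g.:
// drawPlot(document.querySelector("canvas"), 1, 100e9);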

How is linear interpolation of data sets usually implemented?

Suppose you are given a bunch of points as (x, y) values, and you need to generate points by linearly interpolating between the 2 nearest values on the x axis. What is the fastest implementation for doing so?
I searched around but was unable to find a satisfactory answer; I feel it's because I wasn't searching for the right words.
For example, if I was given (0,0) (0.5, 1) (1, 0.5), and I want the value at 0.7, it would be (0.7-0.5)/(1-0.5) * (0.5-1) + 1; but what data structure would allow me to find the 2 nearest key values to interpolate between? Is a simple linear search/binary search the best I can do if I have many key values?
The way I usually implement O(1) interpolation is by means of an additional data structure, which I call an IntervalSelector, that in O(1) time will give the two surrounding values of the sequence that have to be interpolated.
An IntervalSelector is a class that, given a sequence of n abscissas, builds and remembers a table that maps any given value x to the index i such that sequence[i] <= x < sequence[i+1], in O(1) time.
Note: in what follows, arrays are 1-based.
The algorithm that builds the table proceeds as follows:
1. Find delta, the minimum distance between two consecutive elements in the input sequence of abscissas.
2. Set count := (b-a)/delta + 1, where a and b are respectively the first and last elements of the (ascending) sequence and / stands for the integer quotient of the division.
3. Define table to be an Array of count elements.
4. For j between 1 and n, set table[(sequence[j]-a)/delta + 1] := j.
5. Copy every entry of table visited in step 4 to the unvisited positions that come right after it.
On output, table maps j to i if (j-1)*delta <= sequence[i] - a < j*delta.
Here is an example:
Since the 3rd and 4th elements are the closest ones, we divide the interval into subintervals of this smallest length. We then record in the table the position of the left end of each of these delta-intervals. Later on, when an input x is given, we compute the delta-interval of that x as (x-a)/delta + 1 and use the table to deduce the corresponding interval in the sequence. If x falls to the left of the i-th sequence element, we choose the (i-1)-th.
More precisely:
Given any input x between a and b, calculate j := (x-a)/delta + 1 and i := table[j]. If x < sequence[i], put i := i - 1. Then the index i satisfies sequence[i] <= x < sequence[i+1]; otherwise the distance between these two consecutive elements would be smaller than delta, which it is not.
Remark: be aware that if the minimum distance delta between consecutive elements of sequence is too small, the table will have too many entries. The simple description presented here ignores these pathological cases, which require additional work.
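A 0-based JavaScript sketch of this scheme (the 1-based description above is shifted down by one, the pathological small-delta case is not handled, and the class and method names are illustrative):

// O(1) lookup of the interval index i such that seq[i] <= x < seq[i+1].
class IntervalSelector {
  constructor(sequence) { // sequence: ascending abscissas
    this.seq = sequence;
    this.a = sequence[0];
    let delta = Infinity;
    for (let i = 1; i < sequence.length; i++) {
      delta = Math.min(delta, sequence[i] - sequence[i - 1]);
    }
    this.delta = delta;
    const count = Math.floor((sequence[sequence.length - 1] - this.a) / delta) + 1;
    this.table = new Array(count);
    for (let i = 0; i < sequence.length; i++) { // step 4
      this.table[Math.floor((sequence[i] - this.a) / delta)] = i;
    }
    for (let j = 1; j < count; j++) { // step 5: fill the unvisited slots
      if (this.table[j] === undefined) this.table[j] = this.table[j - 1];
    }
  }
  select(x) {
    const j = Math.floor((x - this.a) / this.delta);
    let i = this.table[j];
    if (x < this.seq[i]) i -= 1;
    return i;
  }
}

const sel = new IntervalSelector([0, 0.5, 1]);
console.log(sel.select(0.7)); // 1, so interpolate between (0.5, 1) and (1, 0.5)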
Yes, a simple binary search should do well and will typically suffice.
If you need to get better, you might try interpolation search (has nothing to do with your value interpolation).
If your points are distributed at fixed intervals (like in your example, 0 0.5 1), you can also simply store the values in an array and access them in constant time via their index.
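For the binary-search route, a minimal sketch using the points from the question (xs must be sorted ascending):

// Binary search for the largest lo with xs[lo] <= x, then interpolate linearly.
function interpolate(xs, ys, x) {
  let lo = 0, hi = xs.length - 2;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (xs[mid] <= x) lo = mid;
    else hi = mid - 1;
  }
  const t = (x - xs[lo]) / (xs[lo + 1] - xs[lo]);
  return ys[lo] + t * (ys[lo + 1] - ys[lo]);
}

console.log(interpolate([0, 0.5, 1], [0, 1, 0.5], 0.7)); // 0.8, as in the question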

Javascript - dataset too large, need to only include data up to 1000 values that's spaced out evenly

Basically, I can only plot 1000 values on a chart, but my dataset frequently has more than 1000 values.
So... let's say I have 3000 values - that's easy: every 3rd point is plotted (if i % 3 == 0). What about when it's a number like 2106? I'm trying to plot evenly.
for (var i = 0; i < chartdata.node.length; i++) {
    // something in here
}
Since you may have more or fewer than 1000, I would go with something like this:
var inc = Math.floor(chartdata.node.length / 1000);
if (inc == 0)
    inc = 1;
for (var i = 0; i < chartdata.node.length; i += inc) {
    // plot chartdata.node[i]
}
Exactly 1000 points, slightly irregular spacing
Let A be the number of data points you have (i.e. 2106) and B be the number of data points you want to use (i.e. 1000). In a continuous case, you'd place your plot points every A/B data points. With discrete data points, you can do the following: maintain a counter C, initialized to zero. For each of the A input data points, add B to the counter. If the resulting value is larger than A, plot the data point and subtract A from the counter. On the whole, you'll have added B to the counter A times and subtracted A from it B times, so you should end up with a zero counter again, having plotted exactly B data points.
You can tweak this to obtain different behaviour at the end points, e.g. to always include the first and last data points. Simply plot the first point unconditionally, then apply the above scheme to the remaining points, i.e. with A = 2105 and B = 999. One benefit of this whole approach is that everything works in integer arithmetic, so rounding errors will be of no concern to you.
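A sketch of that counter scheme (the comparison here uses >= rather than a strict >, so that exact multiples, which are possible when gcd(A, B) > 1, still trigger a plot and the counter ends at zero with exactly B points):

// Pick exactly B of the A data points, evenly spread, in pure integer arithmetic.
function samplePoints(data, B) {
  const A = data.length;
  const result = [];
  let counter = 0;
  for (const point of data) {
    counter += B;         // add B for every input point (A times in total)
    if (counter >= A) {
      result.push(point); // plot this point
      counter -= A;       // subtract A (B times in total)
    }
  }
  return result;
}

const sampled = samplePoints(Array.from({ length: 2106 }, (_, i) => i), 1000);
console.log(sampled.length); // 1000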
Perfectly regular spacing, but less data points
If even spacing is more important, then you can simply compute the amount by which to increment your index for every plotted point as the rounded-up quotient ceil(A/B). Due to the ceiling, this will be a larger number than the fractional result would be. In the worst case, a quotient just above one gets rounded up to two, resulting in only slightly more than 500 data points actually being plotted. These will be evenly spaced, though.
You could try something like this:
var desired_data_length = 1000;
for (var i = 0; i < desired_data_length; i++) {
    var actual_i = Math.floor(i / desired_data_length * chartdata.length);
    // do something with actual_i as the index
}
This will use desired_data_length indices, linearly mapping [0, desired_data_length) onto [0, chartdata.length), which is what you want.
If the data is purely numerical you may try Typed Arrays.
