CouchDB / PouchDB Pass values to MapReduce


So, I'm being a bit creative in searching through persistent graphical objects in HTML5 canvas by using PouchDB and MapReduce. (I'm trying to tell whether the user has clicked on an object, using simple bounding-box logic.) That part isn't all that important; it might be silly, but I just want to do it, because I'm a dork like that.
That said, I want to pass a pair of custom values to the reducer function of my PouchDB query. I'm unsure how to do that, exactly.
Here's kinda what I'm doing right now:
var x = evt.clientX, y = evt.clientY
var map = function (doc) {
emit('bbox',
{
x0: doc.x,
x1: doc.x + doc.w,
y0: doc.y,
y1: doc.y + doc.h
}
)
}
var reduce = function (keys, values, rereduce) {
return values.forEach(function (bbox) {
if (x >= bbox.x0 && x < bbox.x1 && y >= bbox.y0 && y < bbox.y1) {
return true
}
})
}
var result = db.px.query({map: map, reduce: reduce}, function (err, rsp) {
cb(rsp)
})
It doesn't work right now because the reduce function can't access the x and y values since, for some reason, they are inaccessible from the scope in which the functions are run. So, I need to pass them to Pouch/Couch through that query method, I think. I'm kinda stuck here.

Shouldn't it be sufficient to have a "createReduce" function that makes the x and y parameters more accessible?
var createReduce = function(x,y) { return function(keys, values, rereduce) {...} }
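A minimal sketch of that factory idea, runnable as plain JavaScript. It uses Array.prototype.some, because forEach discards whatever its callback returns, so the reduce in the question would always produce undefined. The rereduce branch is an assumption about how partial results would be combined. Whether the captured x and y survive inside PouchDB or CouchDB depends on how the function is evaluated: if it is serialized to a string and re-evaluated elsewhere, closure variables are lost.

```javascript
// Factory: captures x and y in a closure and returns a reduce-style function.
function createReduce(x, y) {
  return function (keys, values, rereduce) {
    if (rereduce) {
      // Assumption: values are booleans produced by earlier calls of this function.
      return values.some(Boolean);
    }
    // True if the point (x, y) falls inside at least one bounding box.
    return values.some(function (bbox) {
      return x >= bbox.x0 && x < bbox.x1 && y >= bbox.y0 && y < bbox.y1;
    });
  };
}

// Usage outside any database: the point (5, 5) against two boxes.
var reduce = createReduce(5, 5);
var hit = reduce(null, [
  { x0: 0, x1: 10, y0: 0, y1: 10 },
  { x0: 20, x1: 30, y0: 20, y1: 30 }
], false);
console.log(hit); // true
```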

It seemed I was trying to do the impossible, and I really didn't need the persistence for this particular application. Furthermore, more complex graphics-oriented spatial queries will require a more specialized data persistence and manipulation framework. I'll be working on such a thing in the future.
That said, here's some helpful filter-based bbox code to help out any intrepid coders out there who might be walking down my own path in the future.
var db = {
// 2d, canvas-related data store and methods
px: {
store: [],
add: function (obj) {
db.px.store.push(obj)
},
pick: function (x, y) {
return db.px.store.filter(function (obj) {
return x >= obj.x
&& x < obj.x + obj.w * gfx.px.ratio
&& y >= obj.y
&& y < obj.y + obj.h * gfx.px.ratio
})
}
}
}
This works over graphical objects formatted like so:
{
x: 0, y: 0, w: 0, h: 0
}
I hope that's helpful for anyone who tries to do what I'm doing in the future.
Also, the gfx.px.ratio variable is for hi-DPI (retina) display compatibility.

Related

Performance of findIndex and possible alternative for large coordinate array in JavaScript

I have an array of vectors, myCoords, with x and y coordinates, of size larger than 50,000.
I am looking to find the index of the vector in myCoords that has the same coordinates (x, y) as a vector, myVector. Note that myVector always exists in myCoords.
The following line of code returns the index value I am looking for. I was wondering if there is any faster way to obtain the same result. I have to perform this search thousands of times and realized it is significantly slowing down my script.
myIndex = myCoords.findIndex(a => a.x === myVector.x && a.y === myVector.y);

Here is my no-brainer improvement (using Deno, but you can use it in any flavor you like):
import { runBenchmarks, bench } from "https://deno.land/std/testing/bench.ts";
bench({
name: "Array.prototype.findIndex",
runs: 10_000,
func(b): void {
const arr = Array(100_000)
.fill(null)
.map(() => ({ x: Math.random(), y: Math.random() } as const));
const lastItem = arr[arr.length - 1];
b.start();
arr.findIndex((item) => item.x === lastItem.x && item.y === lastItem.y);
b.stop();
},
});
function findIndex(
arrX: Float64Array,
arrY: Float64Array,
x: number,
y: number
) {
const length = arrX.length;
for (let i = 0; i < length; i++) {
if (arrX[i] === x && arrY[i] === y) {
return i;
}
}
return -1;
}
bench({
name: "typed array findIndex",
runs: 10_000,
func(b): void {
const arrX = new Float64Array(100_000).map(Math.random);
const arrY = new Float64Array(100_000).map(Math.random);
const lastItemX = arrX[arrX.length - 1];
const lastItemY = arrY[arrY.length - 1];
b.start();
findIndex(arrX, arrY, lastItemX, lastItemY);
b.stop();
},
});
await runBenchmarks();
results:
running 2 benchmarks ...
benchmark Array.prototype.findIndex ...
10000 runs avg: 1.1294ms
benchmark typed array findIndex ...
10000 runs avg: 0.1306ms
benchmark result: DONE. 2 measured; 0 filtered
Why do we see such an improvement here?
Well, technically you could go below O(n) by using e.g. binary search (which is O(log n), but requires the data to be sorted). If you don't want to bother with that, just optimize how your code runs by using more performant data structures.
First, don't use closures in the hot loop; the per-element callback is not very performant.
Second, notice I've used two typed arrays here: coordinates are numbers, so let's work on numbers. However, I don't know what kind of coordinates you have, so you may be able to optimize further by narrowing the type.
Additionally, you could try Object.is instead of the === operator, but I'm not sure there would be any gains (note that Object.is treats NaN and -0 differently from ===).
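Since the search is repeated thousands of times over the same array, another option (a different technique from the typed-array loop above) is to build a lookup table once and answer each query in roughly constant time. The sketch below is only an illustration: buildIndex, indexOfVector and the "x,y" string key are hypothetical names and choices, and it assumes the coordinates do not change after the index is built.

```javascript
// Build the index once: O(n). The "x,y" key format is an assumption of this sketch.
function buildIndex(coords) {
  const index = new Map();
  coords.forEach((c, i) => {
    const key = c.x + "," + c.y;
    // Keep the first occurrence, mirroring what findIndex would return.
    if (!index.has(key)) index.set(key, i);
  });
  return index;
}

// Look up a vector; returns -1 when it is absent, like findIndex.
function indexOfVector(index, v) {
  const i = index.get(v.x + "," + v.y);
  return i === undefined ? -1 : i;
}

const myCoords = [{ x: 1, y: 2 }, { x: 3, y: 4 }, { x: 1, y: 2 }];
const index = buildIndex(myCoords);
console.log(indexOfVector(index, { x: 3, y: 4 })); // 1
```

The trade-off is extra memory for the Map and the cost of rebuilding it whenever myCoords changes.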

How to get median and quartiles/percentiles of an array in JavaScript (or PHP)?

This question has been turned into a Q&A, because I struggled to find the answer and think it can be useful for others.
I have a JavaScript array of values and need to calculate in JavaScript its Q2 (50th percentile aka MEDIAN), Q1 (25th percentile) and Q3 (75th percentile) values.

I updated the JavaScript translation from the first answer to use arrow functions and a bit more concise notation. The functionality remains mostly the same, except for std, which now computes the sample standard deviation (dividing by arr.length - 1 instead of just arr.length)
// sort a copy ascending, so the caller's array is not reordered
const asc = arr => [...arr].sort((a, b) => a - b);
const sum = arr => arr.reduce((a, b) => a + b, 0);
const mean = arr => sum(arr) / arr.length;
// sample standard deviation
const std = (arr) => {
const mu = mean(arr);
const diffArr = arr.map(a => (a - mu) ** 2);
return Math.sqrt(sum(diffArr) / (arr.length - 1));
};
const quantile = (arr, q) => {
const sorted = asc(arr);
const pos = (sorted.length - 1) * q;
const base = Math.floor(pos);
const rest = pos - base;
if (sorted[base + 1] !== undefined) {
return sorted[base] + rest * (sorted[base + 1] - sorted[base]);
} else {
return sorted[base];
}
};
const q25 = arr => quantile(arr, .25);
const q50 = arr => quantile(arr, .50);
const q75 = arr => quantile(arr, .75);
const median = arr => q50(arr);
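As a quick worked example of the interpolation, here is the same quantile logic applied to four values. The function is repeated (sorting a copy of the input) so the snippet runs on its own; the data is made up for illustration.

```javascript
const quantile = (arr, q) => {
  const sorted = [...arr].sort((a, b) => a - b); // copy, then sort ascending
  const pos = (sorted.length - 1) * q;           // fractional index into the sorted data
  const base = Math.floor(pos);                  // index of the lower neighbour
  const rest = pos - base;                       // how far towards the upper neighbour
  return sorted[base + 1] !== undefined
    ? sorted[base] + rest * (sorted[base + 1] - sorted[base])
    : sorted[base];
};

const data = [4, 1, 3, 2];
console.log(quantile(data, 0.25)); // 1.75  (pos = 0.75: three quarters of the way from 1 to 2)
console.log(quantile(data, 0.5));  // 2.5   (pos = 1.5: halfway between 2 and 3)
console.log(quantile(data, 0.75)); // 3.25  (pos = 2.25: a quarter of the way from 3 to 4)
```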

After searching for a long time, finding different versions that give different results, I found this nice snippet on Bastian Pöttner's web blog, but for PHP. For the same price, we get the average and standard deviation of the data (for normal distributions)...
PHP Version
//from https://blog.poettner.de/2011/06/09/simple-statistics-with-php/
function Median($Array) {
return Quartile_50($Array);
}
function Quartile_25($Array) {
return Quartile($Array, 0.25);
}
function Quartile_50($Array) {
return Quartile($Array, 0.5);
}
function Quartile_75($Array) {
return Quartile($Array, 0.75);
}
function Quartile($Array, $Quartile) {
sort($Array);
$pos = (count($Array) - 1) * $Quartile;
$base = floor($pos);
$rest = $pos - $base;
if( isset($Array[$base+1]) ) {
return $Array[$base] + $rest * ($Array[$base+1] - $Array[$base]);
} else {
return $Array[$base];
}
}
function Average($Array) {
return array_sum($Array) / count($Array);
}
function StdDev($Array) {
if( count($Array) < 2 ) {
return;
}
$avg = Average($Array);
$sum = 0;
foreach($Array as $value) {
$sum += pow($value - $avg, 2);
}
return sqrt((1 / (count($Array) - 1)) * $sum);
}
Based on the author's comments, I simply wrote a JavaScript translation that will certainly be useful, because surprisingly, it is nearly impossible to find a JavaScript equivalent on the web, and otherwise requires additional libraries like Math.js
JavaScript Version
//adapted from https://blog.poettner.de/2011/06/09/simple-statistics-with-php/
function Median(data) {
return Quartile_50(data);
}
function Quartile_25(data) {
return Quartile(data, 0.25);
}
function Quartile_50(data) {
return Quartile(data, 0.5);
}
function Quartile_75(data) {
return Quartile(data, 0.75);
}
function Quartile(data, q) {
data=Array_Sort_Numbers(data);
var pos = ((data.length) - 1) * q;
var base = Math.floor(pos);
var rest = pos - base;
if( (data[base+1]!==undefined) ) {
return data[base] + rest * (data[base+1] - data[base]);
} else {
return data[base];
}
}
function Array_Sort_Numbers(inputarray){
return inputarray.sort(function(a, b) {
return a - b;
});
}
function Array_Sum(t){
return t.reduce(function(a, b) { return a + b; }, 0);
}
function Array_Average(data) {
return Array_Sum(data) / data.length;
}
function Array_Stdev(tab){
var i,j,total = 0, mean = 0, diffSqredArr = [];
for(i=0;i<tab.length;i+=1){
total+=tab[i];
}
mean = total/tab.length;
for(j=0;j<tab.length;j+=1){
diffSqredArr.push(Math.pow((tab[j]-mean),2));
}
return (Math.sqrt(diffSqredArr.reduce(function(firstEl, nextEl){
return firstEl + nextEl;
})/tab.length));
}

TL;DR
The other answers appear to have solid implementations of the "R-7" version of computing quantiles. Below is some context and another JavaScript implementation borrowed from D3 using the same R-7 method, with the bonuses that this solution is es5 compliant (no JavaScript transpilation required) and probably covers a few more edge cases.
Existing solution from D3 (ported to es5/"vanilla JS")
The "Some Background" section, below, should convince you to grab an existing implementation instead of writing your own.
One good candidate is D3's d3.array package. It has a quantile function that's essentially BSD licensed:
https://github.com/d3/d3-array/blob/master/src/quantile.js
I've quickly created a pretty straight port from es6 into vanilla JavaScript of d3's quantileSorted function (the second function defined in that file) that requires the array of elements to have already been sorted. Here it is. I've tested it against d3's own results enough to feel it's a valid port, but your experience might differ (let me know in the comments if you find a difference, though!):
Again, remember that sorting must come before the call to this function, just as in D3's quantileSorted.
//Credit D3: https://github.com/d3/d3-array/blob/master/LICENSE
function quantileSorted(values, p, fnValueFrom) {
var n = values.length;
if (!n) {
return;
}
fnValueFrom =
Object.prototype.toString.call(fnValueFrom) == "[object Function]"
? fnValueFrom
: function (x) {
return x;
};
p = +p;
if (p <= 0 || n < 2) {
return +fnValueFrom(values[0], 0, values);
}
if (p >= 1) {
return +fnValueFrom(values[n - 1], n - 1, values);
}
var i = (n - 1) * p,
i0 = Math.floor(i),
value0 = +fnValueFrom(values[i0], i0, values),
value1 = +fnValueFrom(values[i0 + 1], i0 + 1, values);
return value0 + (value1 - value0) * (i - i0);
}
Note that fnValueFrom is a way to process a complex object into a value. You can see how that might work in a list of d3 usage examples here -- search down where .quantile is used.
The quick version is if the values are tortoises and you're sorting tortoise.age in every case, your fnValueFrom might be x => x.age. More complicated versions, including ones that might require accessing the index (parameter 2) and entire collection (parameter 3) during the value calculation, are left up to the reader.
I've added a quick check here so that if nothing is given for fnValueFrom or if what's given isn't a function the logic assumes the elements in values are the actual sorted values themselves.
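To make the tortoise example concrete, here is a small usage sketch. The function body is a condensed copy of the port above so the snippet runs on its own (it uses typeof for the function check), and the tortoise data is invented for illustration.

```javascript
// Condensed copy of the port above; credit D3 (see the license link above).
function quantileSorted(values, p, fnValueFrom) {
  var n = values.length;
  if (!n) return;
  if (typeof fnValueFrom !== "function") {
    fnValueFrom = function (x) { return x; };
  }
  p = +p;
  if (p <= 0 || n < 2) return +fnValueFrom(values[0], 0, values);
  if (p >= 1) return +fnValueFrom(values[n - 1], n - 1, values);
  var i = (n - 1) * p, i0 = Math.floor(i);
  var value0 = +fnValueFrom(values[i0], i0, values);
  var value1 = +fnValueFrom(values[i0 + 1], i0 + 1, values);
  return value0 + (value1 - value0) * (i - i0);
}

// Hypothetical data: sort by age first, because quantileSorted expects sorted input.
var tortoises = [
  { name: "Ada", age: 80 },
  { name: "Bo", age: 10 },
  { name: "Cy", age: 40 },
  { name: "Di", age: 20 }
];
tortoises.sort(function (a, b) { return a.age - b.age; });

var ageOf = function (t) { return t.age; };
console.log(quantileSorted(tortoises, 0.5, ageOf));  // 30
console.log(quantileSorted(tortoises, 0.25, ageOf)); // 17.5
```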
Logical comparison to existing answers
I'm reasonably sure this reduces to the same version in the other two answers (see "The R-7 Method", below), but if you needed to justify why you're using this to a product manager or whatever maybe the below will help.
Quick comparison:
function Quartile(data, q) {
data=Array_Sort_Numbers(data); // we're assuming it's already sorted, above, vs. the function use here. same difference.
var pos = ((data.length) - 1) * q; // i = (n - 1) * p
var base = Math.floor(pos); // i0 = Math.floor(i)
var rest = pos - base; // (i - i0);
if( (data[base+1]!==undefined) ) {
// value0 + (i - i0) * (value1 which is values[i0+1] - value0 which is values[i0])
return data[base] + rest * (data[base+1] - data[base]);
} else {
// reached only when base is the last index (q >= 1, or a single element); I think this is covered by the p >= 1 and n < 2 checks
return data[base];
}
}
So that's logically close/appears to be exactly the same. I think d3's version that I ported covers a few more edge/invalid conditions and includes the fnValueFrom integration, both of which could be useful.
The R-7 Method vs. "Common Sense"
As mentioned in the TL;DR, the answers here, according to d3.array's readme, all use the "R-7 method".
This particular implementation [from d3] uses the R-7 method, which is the default for the R programming language and Excel.
Since the d3.array code matches the other answers here, we can safely say they're all using R-7.
Background
After a little sleuthing on some math and stats StackExchange sites (1, 2), I found that there are "common sensical" ways of calculating each quantile, but that those don't typically mesh up with the results of the nine generally recognized ways to calculate them.
The answer at that second link from stats.stackexchange says of the common-sensical method that...
Your textbook is confused. Very few people or software define quartiles this way. (It tends to make the first quartile too small and the third quartile too large.)
The quantile function in R implements nine different ways to compute quantiles!
I thought that last bit was interesting, and here's what I dug up on those nine methods...
Wikipedia's description of those nine methods here, nicely grouped in a table
An article from the Journal of Statistics Education titled "Quartiles in Elementary Statistics"
A blog post at SAS.com called "Sample quantiles: A comparison of 9 definitions"
The difference between d3's use of "method 7" (R-7) to determine quantiles and the common-sensical approach is demonstrated nicely in the SO question "d3.quantile seems to be calculating q1 incorrectly", and the why is described in good detail in this post, which can be found in philippe's original source for the PHP version.
Here's a bit from Google Translate (original is in German):
In our example, this value is at the (n + 1) / 4 digit = 5.25, i.e. between the 5th value (= 5) and the 6th value (= 7). The fraction (0.25) indicates that in addition to the value of 5, ¼ of the distance between 5 and 6 is added. Q1 is therefore 5 + 0.25 * 2 = 5.5.
All together, that tells me I probably shouldn't try to code something based on my understanding of what quartiles represent and should borrow someone else's solution.

Based on buboh's answer, which I have used for over a year, I have noticed some weird results when calculating Q1 and Q3 when there are 2 numbers in the middle.
I have no clue why there is a rest value or how it is used, but by my understanding, if you end up having 2 numbers in the middle, you need to take the average of them, as you do to calculate the median. With that in mind I edited the function:
const asc = (arr) => arr.sort((a, b) => a - b);
const quantile = (arr, q) => {
const sorted = asc(arr);
let pos = (sorted.length - 1) * q;
if (pos % 1 === 0) {
return sorted[pos];
}
pos = Math.floor(pos);
if (sorted[pos + 1] !== undefined) {
return (sorted[pos] + sorted[pos + 1]) / 2;
}
return sorted[pos];
};
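For context on the rest value: in the interpolating versions above it is the fractional part of the position, used as a weight between the two neighbouring values. The sketch below puts the two rules side by side on made-up data; quantileInterpolated and quantileMidpoint are hypothetical names for this comparison. The two agree on the median here but differ on Q1.

```javascript
const sortedCopy = (arr) => [...arr].sort((a, b) => a - b);

// Linear interpolation (the R-7 style used by the other answers).
const quantileInterpolated = (arr, q) => {
  const sorted = sortedCopy(arr);
  const pos = (sorted.length - 1) * q;
  const base = Math.floor(pos);
  const rest = pos - base; // weight of the upper neighbour, between 0 and 1
  return sorted[base + 1] !== undefined
    ? sorted[base] + rest * (sorted[base + 1] - sorted[base])
    : sorted[base];
};

// Midpoint rule from this answer: always average the two neighbours.
const quantileMidpoint = (arr, q) => {
  const sorted = sortedCopy(arr);
  const pos = (sorted.length - 1) * q;
  if (pos % 1 === 0) return sorted[pos];
  const base = Math.floor(pos);
  return sorted[base + 1] !== undefined
    ? (sorted[base] + sorted[base + 1]) / 2
    : sorted[base];
};

const sample = [10, 20, 30, 40, 50, 60];
console.log(quantileInterpolated(sample, 0.25)); // 22.5 (pos = 1.25, rest = 0.25)
console.log(quantileMidpoint(sample, 0.25));     // 25
console.log(quantileInterpolated(sample, 0.5));  // 35
console.log(quantileMidpoint(sample, 0.5));      // 35
```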

can two variables operated equate to a constant in JavaScript?

I need to have two variables, x and y, and an equation involving both that always equals one.
This is a simple example. This example may be easy to define in terms of y, like y = 1/x, but I have another equation I need to use, and it is too hard to define in terms of y.
var x, y;
x*y = 1;
x = Math.random()*20;
console.log(y);
I would get an error message for this, like
Uncaught ReferenceError: Invalid left-hand side in assignment (line 2)
A variable could be defined as x*y, like var z = x*y, but apparently not a constant. Maybe there is some way around this, like defining two variables, one as the constant and the other as the equation, and finding some way to relate them? Maybe JavaScript's looseness will allow for some new technique?
Thanks ahead of time! :)

You can't do this directly. The value of x depends on the value of y. You must describe their dependency.
You can do something like:
var x_y = (function () {
var x = 1, y = 1;
return {
set_x: function (new_x) {
x = new_x;
y = 1 / new_x;
},
set_y: function (new_y) {
y = new_y;
x = 1 / new_y;
},
get_x: function () {
return x;
},
get_y: function () {
return y;
}
};
}());
x_y.set_x(Math.random()*20);
console.log(x_y.get_y());
So essentially we're saying x = 1 / y; and y = 1 / x; and we couple the values: if you change one, the other changes as well.
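The same coupling can also be written with accessor properties, so that callers read and assign x and y like ordinary fields. This is only a sketch of the idea for the x*y = 1 example: the object name pair and the backing field _x are hypothetical, and a harder equation would need its own solving step inside the accessors.

```javascript
// Only x is stored; y is always derived from it, so x * y stays equal to 1.
var pair = {
  _x: 1,
  get x() { return this._x; },
  set x(newX) { this._x = newX; },
  get y() { return 1 / this._x; },
  set y(newY) { this._x = 1 / newY; }
};

pair.x = 4;
console.log(pair.y); // 0.25
pair.y = 2;
console.log(pair.x); // 0.5
```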

Javascript Functions using unicode operator signs as names?

I was looking at this: What characters are valid for JavaScript variable names?
And I had a thought, and I can't decide if it is a good idea or not...
Say you have some object, for example a Vector (x, y, z):
Vector = (function() {
function Vector(x, y, z) {
this.x = x != null ? x : 0;
this.y = y != null ? y : 0;
this.z = z != null ? z : 0;
if (isNaN(this.x) || isNaN(this.y) || isNaN(this.z)) {
throw new Error("Vector contains a NaN");
}
}
return Vector;
})();
and you wanted to add some Vectors, possibly with a function like this:
this.add = function(v) {
return new Vector(this.x + v.x, this.y + v.y, this.z + v.z);
};
Would it be awful, or awesome, to declare the addition function as this["\u002b"] = function... (or even just this["+"] = function...), and use it like:
var v1 = new Vector(1, 2, 3)
var v2 = new Vector(2, 3, 4)
var v3 = v1["+"](v2)
Obviously, for "add", there's no actual gain in terms of code size (though I think it might be slightly more readable than v1.add(v2)), but something such as "multiply" would be,
var v1 = new Vector(1, 2, 3);
v1["×"](2);
or for other objects something like "integrate" could be:
integrand["∫"](0, 1000)
Thoughts? Is it wonderfully wonderful, or horribly hacky? How reliable are unicode characters across browsers? Would it be safe to use in node.js?
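For reference, a small sketch of how this behaves. Any string can be a property key, so the methods work, but "+", "×" and "∫" are not valid identifier characters, which means they can only be reached with bracket notation (v1.+ or v1.× would be a syntax error). The sketch keeps a plain add method and aliases the symbol to it; the scale method name is a hypothetical choice.

```javascript
function Vector(x, y, z) {
  this.x = x != null ? x : 0;
  this.y = y != null ? y : 0;
  this.z = z != null ? z : 0;
}

Vector.prototype.add = function (v) {
  return new Vector(this.x + v.x, this.y + v.y, this.z + v.z);
};
Vector.prototype.scale = function (k) {
  return new Vector(this.x * k, this.y * k, this.z * k);
};

// Symbol-named aliases; "\u00d7" is the multiplication sign.
Vector.prototype["+"] = Vector.prototype.add;
Vector.prototype["\u00d7"] = Vector.prototype.scale;

var v1 = new Vector(1, 2, 3);
var v2 = new Vector(2, 3, 4);
var v3 = v1["+"](v2);      // same result as v1.add(v2)
var v4 = v1["\u00d7"](2);  // same result as v1.scale(2)
console.log(v3.x, v3.y, v3.z); // 3 5 7
console.log(v4.x, v4.y, v4.z); // 2 4 6
```

Keeping the named methods alongside the aliases means the code stays usable even where typing or displaying the symbols is awkward.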

Difficulties in converting a recursive algorithm into an iterative one

I've been trying to implement a recursive backtracking maze generation algorithm in JavaScript. I did this after reading a great series of posts on the topic here
While the recursive version of the algorithm was a no brainer, the iterative equivalent has got me stumped.
I thought I understood the concept, but my implementation clearly produces incorrect results. I've been trying to pin down a bug that might be causing it, but I am beginning to believe that my problems are being caused by a failure in logic, but of course I am not seeing where.
My understanding of the iterative algorithm is as follows:
A stack is created holding representations of cell states.
Each representation holds the coordinates of that particular cell, and a list of directions to access adjacent cells.
While the stack isn't empty iterate through the directions on the top of the stack, testing adjacent cells.
If a valid cell is found place it at the top of the stack and continue with that cell.
Here is my recursive implementation ( note: keydown to step forward ): http://jsbin.com/urilan/14
And here is my iterative implementation ( once again, keydown to step forward ): http://jsbin.com/eyosij/2
Thanks for the help.
edit: I apologize if my question wasn't clear. I will try to further explain my problem.
When running the iterative solution, various unexpected behaviors occur. First and foremost, the algorithm doesn't exhaust all available options before backtracking. Rather, it appears to be selecting cells at random when there is one valid cell left. Overall, however, the movement doesn't appear to be random.
var dirs = [ 'N', 'W', 'E', 'S' ];
var XD = { 'N': 0, 'S':0, 'E':1, 'W':-1 };
var YD = { 'N':-1, 'S':1, 'E':0, 'W': 0 };
function genMaze(){
var dirtemp = dirs.slice().slice(); //copies 'dirs' so its not overwritten or altered
var path = []; // stores path traveled.
var stack = [[0,0, shuffle(dirtemp)]]; //Stack of instances. Each subarray in 'stacks' represents a cell
//and its current state. That is, its coordinates, and which adjacent cells have been
//checked. Each time it checks an adjacent cell a direction value is popped from
//from the list
while ( stack.length > 0 ) {
var current = stack[stack.length-1]; // With each iteration focus is to be placed on the newest cell.
var x = current[0], y = current[1], d = current[2];
var sLen = stack.length; // For testing whether there is a newer cell in the stack than the current.
path.push([x,y]); // Store current coordinates in the path
while ( d.length > 0 ) {
if( stack.length != sLen ){ break;}// If there is a newer cell in stack, break and then continue with that cell
else {
var cd = d.pop();
var nx = x + XD[ cd ];
var ny = y + YD[ cd ];
if ( nx >= 0 && ny >= 0 && nx < w && ny < h && !cells[nx][ny] ){
dtemp = dirs.slice().slice();
cells[nx][ny] = 1;
stack.push( [ nx, ny, shuffle(dtemp) ] ); //add new cell to the stack with new list of directions.
// from here the code should break from the loop and start again with this latest addition being considered.
}
}
}
if (current[2].length === 0){stack.pop(); } //if all available directions have been tested, remove from stack
}
return path;
}
I hope that helps clear up the question for you. If it is still missing any substance please let me know.
Thanks again.

I'm not very good at JavaScript, but I tried to convert your recursive code to an iterative version. You also need to store the for-loop index on the stack. So the code looks like:
function genMaze(cx,cy) {
var path = []; // stores path traveled.
var stack = [[cx, cy, shuffle(dirs.slice()), 0]]; // shuffle a copy of 'dirs' so it's not overwritten; we also need to store the `for` index
while (stack.length > 0) {
var current = stack[stack.length - 1]; // With each iteration focus is to be placed on the newest cell.
var x = current[0], y = current[1], d = current[2], i = current[3];
if (i >= d.length) { // every direction has been tried: backtrack
stack.pop();
continue;
}
stack[stack.length - 1][3] = i + 1; // for next iteration
path.push([x, y]); // Store current coordinates in the path
cells[x][y] = 1;
var cd = d[i];
var nx = x + XD[cd];
var ny = y + YD[cd];
if (nx >= 0 && ny >= 0 && nx < w && ny < h && !cells[nx][ny]) {
stack.push([nx, ny, shuffle(dirs.slice()), 0]); // fresh copy of 'dirs' for the new cell
}
}
return path;
}
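For comparison, here is a self-contained sketch of the stack-based approach described in the question: each stack frame keeps its own list of untried directions, and a cell is popped only once that list is empty. The shuffle helper (an in-place Fisher-Yates), the function name genMazeIterative and the choice to return the order of first visits are assumptions of this sketch, not the code from the linked jsbin pages.

```javascript
// Hypothetical helper: in-place Fisher-Yates shuffle; returns the same array.
function shuffle(a) {
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

const DIRS = ['N', 'W', 'E', 'S'];
const XD = { N: 0, S: 0, E: 1, W: -1 };
const YD = { N: -1, S: 1, E: 0, W: 0 };

// Returns the cells in the order they are first visited.
function genMazeIterative(w, h) {
  const visited = Array.from({ length: w }, () => new Array(h).fill(false));
  const order = [[0, 0]];
  visited[0][0] = true;
  // Each frame: [x, y, directions not yet tried from this cell].
  const stack = [[0, 0, shuffle(DIRS.slice())]];
  while (stack.length > 0) {
    const [x, y, d] = stack[stack.length - 1];
    if (d.length === 0) { // dead end: backtrack to the previous cell
      stack.pop();
      continue;
    }
    const cd = d.pop(); // try one direction per loop iteration
    const nx = x + XD[cd];
    const ny = y + YD[cd];
    if (nx >= 0 && ny >= 0 && nx < w && ny < h && !visited[nx][ny]) {
      visited[nx][ny] = true;
      order.push([nx, ny]);
      stack.push([nx, ny, shuffle(DIRS.slice())]); // continue from the new cell
    }
  }
  return order;
}

console.log(genMazeIterative(5, 4).length); // 20: every cell is visited exactly once
```

Because only one direction is consumed per iteration and the loop always re-reads the top of the stack, a newly pushed cell is processed immediately and the parent resumes with its remaining directions after backtracking.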

Could this little snippet also help?
/**
Examples
var sum = tco(function(x, y) {
return y > 0 ? sum(x + 1, y - 1) :
y < 0 ? sum(x - 1, y + 1) :
x
})
sum(20, 100000) // => 100020
**/
function tco(f) {
var value, active = false, accumulated = []
return function accumulator() {
accumulated.push(arguments)
if (!active) {
active = true
while (accumulated.length) value = f.apply(this, accumulated.shift())
active = false
return value
}
}
}
Credits, explanations and more info are on GitHub: https://gist.github.com/1697037
It has the benefit of not modifying your code, so it could be applied in other situations too. Hope that helps :)
