I have a data set in CouchDB with multiple documents, each listing a timestamp and a set of signals from sensors. In this example I've only used a few different names, but the number of names is unbounded, since additional sensors can be added to the system. Here's an example of three sample documents:
{ timestamp: 12345,
signals: ["highTemperature", "highPressure"]
}
{ timestamp: 12346,
signals: ["highTemperature"]
}
{ timestamp: 12347,
signals: ["lowPressure", "highTemperature"]
}
What I'd like to be able to do is get the frequency of each signal. A simple way to do this is to create a map function like this:
function (doc) {
    for (var idx in doc.signals) {
        emit(doc.signals[idx], 1);
    }
}
Along with a reduce function like this:
function(signal, counts) {
var sum = 0;
for(var i = 0; i < counts.length; i++) {
sum += counts[i];
};
return sum;
}
This will return a nice set of data like this:
{"rows":[
{"key":"highTemperature","value":3},
{"key":"highPressure","value":1},
{"key":"lowPressure","value":1}
]}
This is great if I want to know the signal distribution over all time, but what I really want to know is the distribution of signals for a subset of data points, say timestamps 12346 - 12349. However, I can't slice the data by timestamp using startkey and endkey because timestamp is not part of the key. If I make timestamp the key, then I can't reduce to get a distribution of signals.
Is there a way to do such a grouping so you reduce on an element that isn't part of the key? Ideally I'd like to specify the grouping interval via a URL parameter such as: /mydb/_design/main/_view/signalsByTime?startkey=12346&endkey=12347 and have it return the distribution of signals for just that time period, like this:
{"rows":[
{"key":"highTemperature","value":2},
{"key":"lowPressure","value":1}
]}
If you want timestamp to be the key and the number of possible signals is very small (O(1); let's assume 3, as in your example), then the map function can emit a characteristic vector for the signal:
if (doc.signal == "highTemperature") {
emit(doc.timestamp, [1,0,0]);
} else if (doc.signal == "highPressure") {
emit(doc.timestamp, [0,1,0]);
} ...
and sum up the vectors in reduce, possibly like this:
function(keys, values) {
    var sum = [0,0,0];
    // declare the loop variables so they don't leak as globals
    for (var v in values) {
        for (var s in sum) {
            sum[s] += values[v][s];
        }
    }
    return sum;
}
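A minimal sketch of the same idea adapted to the question's documents, where each document carries a `signals` array rather than a single `signal`. The `SIGNALS` list, the `mapDoc` and `reduceVectors` names, and the in-memory simulation of the view are assumptions for illustration; in CouchDB the map would call `emit`, and the view would be queried with `startkey`/`endkey`.

```javascript
// Assumed fixed, known list of signal names; position = index in the vector.
var SIGNALS = ["highTemperature", "highPressure", "lowPressure"];

// Map step: returns the [key, value] pair that a CouchDB map would emit.
function mapDoc(doc) {
  var vector = SIGNALS.map(function (name) {
    return doc.signals.indexOf(name) !== -1 ? 1 : 0;
  });
  return [doc.timestamp, vector];
}

// Reduce step: element-wise sum of vectors. The output has the same shape
// as each input value, so the function can be applied to its own results.
function reduceVectors(keys, values) {
  var sum = SIGNALS.map(function () { return 0; });
  values.forEach(function (vector) {
    vector.forEach(function (count, i) { sum[i] += count; });
  });
  return sum;
}

var docs = [
  { timestamp: 12345, signals: ["highTemperature", "highPressure"] },
  { timestamp: 12346, signals: ["highTemperature"] },
  { timestamp: 12347, signals: ["lowPressure", "highTemperature"] }
];

// Simulate ?startkey=12346&endkey=12347 on the emitted rows.
var rows = docs.map(mapDoc).filter(function (row) {
  return row[0] >= 12346 && row[0] <= 12347;
});
var totals = reduceVectors(null, rows.map(function (row) { return row[1]; }));
console.log(totals); // [2, 0, 1]
```

Each position in the result lines up with `SIGNALS`, so `[2, 0, 1]` means two highTemperature and one lowPressure, matching the distribution the question asks for.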
I would like to store product information in a key, value array, with the key being the unique product url. Then I would also like to store the visit frequency of each of these products. I will store these objects as window.localStorage items, but that's not very important.
The thing I had in mind was two key value arrays:
//product information
prods["url"] = ["name:product_x,type:category_x,price:50"]
//product visits frequency
freq["url"] = [6]
Then I would like to sort these prods based on the frequency.
Is that possible?
Hope you guys can help! Thanks a lot
Well you seem to have made several strange choices for your data format/structure. But assuming the format of the "prod" is beyond your control but you can choose your data structure, here's one way to do it.
Rather than two objects both using url as a key and having one value field each I've made a single object still keyed on url but with the product and frequency information from each in a field.
Objects don't have any inherent order, so rather than sorting the object itself I sort its keys (your "url"s) by ascending frequency.
To show that it's sorted that way I print it out (not in the same format).
For descending frequency, change data[a].freq - data[b].freq to data[b].freq - data[a].freq
var data = {
"url": {
prod: "name:product_x,type:category_x,price:50",
freq: 6
},
"url2": {
prod: "name:product_y,type:category_y,price:25",
freq: 3
}
};
var sorted = Object.keys(data).sort((a, b) => data[a].freq - data[b].freq);
console.log(sorted.map(k => [data[k].freq, k, data[k].prod]));
There's more than one way to format the data, which would change the shape of the code here.
maybe something like this:
var prods = [
{url:1, val:[{name:'a',type:'x',price:60}]},
{url:2, val:[{name:'b',type:'x',price:30}]},
{url:3, val:[{name:'c',type:'x',price:50}]},
{url:4, val:[{name:'c',type:'x',price:20}]},
{url:5, val:[{name:'c',type:'x',price:10}]},
{url:6, val:[{name:'c',type:'x',price:40}]}
];
var freq = [
{url:1, freq:6},
{url:2, freq:3},
{url:3, freq:5},
{url:4, freq:2},
{url:5, freq:1},
{url:6, freq:4}
];
prods.sort(function (a, b) {
var aU = freq.filter(function(obj) {
return obj.url === a.url;
});
var bU = freq.filter(function(obj) {
return obj.url === b.url;
});
if (aU[0].freq > bU[0].freq) {
return 1;
}
if (aU[0].freq < bU[0].freq) {
return -1;
}
return 0;
});
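The comparator above scans `freq` twice on every comparison. A possible variation, sketched here with a smaller copy of the data, is to build a lookup object from url to frequency once and sort against that. The names `freqByUrl` and `sortedProds` are assumptions, not part of the original answer.

```javascript
var prods = [
  { url: 1, val: [{ name: 'a', type: 'x', price: 60 }] },
  { url: 2, val: [{ name: 'b', type: 'x', price: 30 }] },
  { url: 3, val: [{ name: 'c', type: 'x', price: 50 }] }
];
var freq = [
  { url: 1, freq: 6 },
  { url: 2, freq: 3 },
  { url: 3, freq: 5 }
];

// Build the url -> frequency lookup once.
var freqByUrl = {};
freq.forEach(function (entry) {
  freqByUrl[entry.url] = entry.freq;
});

// Sort a copy in ascending frequency; swap a and b for descending order.
var sortedProds = prods.slice().sort(function (a, b) {
  return freqByUrl[a.url] - freqByUrl[b.url];
});

console.log(sortedProds.map(function (p) { return p.url; })); // [2, 3, 1]
```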
I have a CSV that records which number called which number, along with the call details (duration, time, etc.).
I want to have all the numbers a particular number called in an array.
That array should be an array of documents, so that each document can also hold the call details.
So finally I need documents with a "caller" number and a "called" array (as defined above).
For this I came up with a map-reduce solution (quite basic and intuitive).
My problem is that I need only the distinct numbers that a "caller" number has called.
My current mapReduce script repeats the dialled numbers.
How can I consider only unique numbers during the reduce phase?
My code looks like this (I enter this in the mongo shell):
db.contacts.mapReduce(
function(){
var numbers = [];
var value={phone:this.<<called_number>>};
numbers.push(value);
emit(this.<<caller_number>>,{called:numbers});
},
function(key,values) {
var result={called:[]};
values.forEach(function (v) {
var i,j;
for(i=0;i<v.called.length;i++) {
var flag=0;
for(j=0;j<result.called.length;j++) {
if(v.called[i].phone==result.called[j].phone){
flag=1;
}
}
if(flag==0) {
result.called.push(v.called[i])
}
}
});
return result;
},
{"query": {},"out":"new_collection"}
)
I understand that the map and reduce functions are JavaScript functions,
so JavaScript coders can also help me out here (to create the reduce function).
Try this.
db.contacts.mapReduce(
    function() {
        emit(this.<<caller_number>>, {called:this.<<called_number>>, callDuration:this.<<callDuration>>,...});
    },
    function(key, values) {
        var map = {};
        var called = values.filter(function removeDuplicated(it) {
            if (!map[it.called]) {
                map[it.called] = 1;
                return true;
            }
            return false;
        });
        return {caller:key, called:called};
    },
    {"query": {},"out":"new_collection"}
)
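One caveat worth checking: MongoDB may call the reduce function more than once for the same key, feeding earlier reduce output back in, so the value emitted by map and the value returned by reduce should have the same shape. The sketch below is plain JavaScript that simulates this outside the database; `mapCall`, `reduceCalls`, and the field names `caller`, `called`, and `duration` are hypothetical stand-ins for the placeholders above.

```javascript
// Map step: wrap each call in a one-element array so that map output
// and reduce output share the shape { called: [...] }.
function mapCall(doc) {
  return [doc.caller, { called: [{ phone: doc.called, duration: doc.duration }] }];
}

// Reduce step: merge the arrays, keeping the first entry per phone number.
function reduceCalls(key, values) {
  var seen = {};
  var result = { called: [] };
  values.forEach(function (value) {
    value.called.forEach(function (entry) {
      if (!seen[entry.phone]) {
        seen[entry.phone] = true;
        result.called.push(entry);
      }
    });
  });
  return result;
}

var calls = [
  { caller: "100", called: "200", duration: 30 },
  { caller: "100", called: "300", duration: 12 },
  { caller: "100", called: "200", duration: 45 }
];
var values = calls.map(function (doc) { return mapCall(doc)[1]; });

// Reduce everything at once, then reduce in two stages (a re-reduce).
var direct = reduceCalls("100", values);
var partial = reduceCalls("100", values.slice(0, 2));
var staged = reduceCalls("100", [partial, values[2]]);

console.log(direct.called.map(function (e) { return e.phone; })); // ["200", "300"]
```

Both `direct` and `staged` end up with the same two entries, which is the property a reduce function needs when its own output can be passed back to it.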
I have a 'user' document with an array named 'Orders'. Every order has properties like 'title', 'date', 'fee'. I would like to calculate the sum of the Order fees for every user in the database.
This is the map function:
map: function(doc) {
if (doc.Doc_type && doc.Doc_type === 'user' && doc.Orders) {
for (var i = 0; i < doc.Orders.length; i++) {
emit([doc.Orders[i].Order_date], doc.Orders[i].Fee);
}
}
}
And the reduce function:
reduce: function (keys, values, rereduce){
var sum = 0;
for(var i=0,fee;fee=values[i];i++){
sum+=fee;
}
return {
Transactions: sum,
Revenue: 10
};
}
The result I get is:
{"Transactions":"0[object Object][object Object]","Revenue":10}}
For simplicity I don't use your particular data structure, but the names should be self-explanatory.
Emit the key/value pair :username/:fee for every order, as you do in your map code (you may have to parse the fee into an integer/float):
for(every_order) {
emit(username, fee)
}
You will get one or more order rows for every username in the index. If you request the view with ?key=":username" you will get all of that user's order rows in the result array of the response. Now you could sum the fees client-side, e.g. in the browser, or, as you have asked for, do the same server-side with the built-in reduce function
_sum
You will get one row per username with the total amount of order fees as value.
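The odd string in the question is consistent with CouchDB re-reducing: when `rereduce` is true, `values` holds the objects returned by earlier reduce calls, and adding an object to the number 0 turns the sum into a concatenated string. If a custom reduce is needed instead of `_sum`, it has to handle both cases. A minimal sketch, assuming the fees are numeric and reusing the question's `Transactions` field; the function name `sumFees` is hypothetical:

```javascript
// Custom reduce that returns an object, as in the question.
function sumFees(keys, values, rereduce) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    // On rereduce, each value is an object returned by an earlier call;
    // otherwise it is a fee emitted by the map function.
    sum += rereduce ? values[i].Transactions : Number(values[i]);
  }
  return { Transactions: sum };
}

// Simulate two first-level reductions followed by a rereduce.
var first = sumFees(null, [10, 0, 5], false);
var second = sumFees(null, [20], false);
var total = sumFees(null, [first, second], true);
console.log(total); // { Transactions: 35 }
```

Looping on `i < values.length` also avoids stopping early at a fee of 0, which the `fee=values[i]` loop condition in the question would do.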
I have a dataset of records that look like this :
[{
"d1d":"2015-05-28T00:00:00.000Z",
"d1h":0,
"d15m":0,
"ct":3
},
{
"d1d":"2015-05-28T00:00:00.000Z",
"d1h":0,
"d15m":0,
"ct":1
}
]
The ct value changes in every record. If d1d, d1h, and d15m are the same in one or more records, I need to combine those records into one with the sum of all the ct values.
I do have jquery, can I use grep for this?
I realize the server side could do a better job of getting me this data , but I have zero control over that.
You don't have to use jQuery for this, vanilla JavaScript will do.
I'll show you two solutions to your problem;
Example 1: Using Array#reduce with an accumulator array
// Assumes records with the same d1d, d1h and d15m are adjacent in dataset.
var intermediaryArray = dataset.reduce(function(acc, curr) {
    var last = acc[acc.length - 1];
    if (last && last.d1d === curr.d1d && last.d1h === curr.d1h && last.d15m === curr.d15m) {
        // same interval as the previous merged record: add to its ct
        last.ct += curr.ct;
    } else {
        // new interval: push a copy so the input records are not mutated
        acc.push({
            d1d: curr.d1d,
            d1h: curr.d1h,
            d15m: curr.d15m,
            ct: curr.ct
        });
    }
    // return the accumulator so reduce has something to work on
    // for the next iteration.
    return acc;
}, []);
Example 2: Using Array#map and Array#reduce in conjunction
This example utilises underscore.js to demonstrate the logic behind what you want to do.
.groupBy() produces an object whose values are subarrays of the records that share the same grouping key, here d1d, d1h and d15m combined.
.map() produces the new array, with one merged object per group.
.reduce() boils each subarray down to one value, your object with all the cts added together.
var merged = _.map(_.groupBy(dataset, function(record) {
    // group on all three fields, not just d1d
    return record.d1d + '|' + record.d1h + '|' + record.d15m;
}), function(subGroup) {
    return subGroup.reduce(function(prev, curr) {
        return {
            d1d: prev.d1d,
            d1h: prev.d1h,
            d15m: prev.d15m,
            ct: prev.ct + curr.ct
        };
    });
});
Here's one possible solution:
var dataset = [{
"d1d":"2015-05-28T00:00:00.000Z",
"d1h":0,
"d15m":0,
"ct":3
},
{
"d1d":"2015-05-28T00:00:00.000Z",
"d1h":0,
"d15m":0,
"ct":1
}
]
function addCt(dataset) {
var ctMap = {}
var key, value
for (var ii=0, record; record=dataset[ii]; ii++) {
key = record.d1d+"|"+record.d1h+"|"+record.d15m
value = ctMap[key]
if (!value) {
value = 0
}
value += record.ct
ctMap[key] = value
}
return ctMap
}
var ctMap = addCt(dataset)
console.log(ctMap)
// { "2015-05-28T00:00:00.000Z|0|0": 4 }
You may want to construct the key in a different way. You may also want to set the value to an object containing the d1d, d1h, d15m and accumulated ct values, with a single object for all matching d1d, d1h and d15m values.
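As a sketch of that second suggestion, the map can hold one object per key and be turned back into an array at the end. The name `mergeByInterval` and the three-record sample are assumptions for illustration.

```javascript
function mergeByInterval(dataset) {
  var byKey = {};
  dataset.forEach(function (record) {
    var key = record.d1d + "|" + record.d1h + "|" + record.d15m;
    if (!byKey[key]) {
      // First record for this key: copy the fields, start ct at 0.
      byKey[key] = { d1d: record.d1d, d1h: record.d1h, d15m: record.d15m, ct: 0 };
    }
    byKey[key].ct += record.ct;
  });
  // Return an array of merged records in first-seen order.
  return Object.keys(byKey).map(function (key) { return byKey[key]; });
}

var sample = [
  { d1d: "2015-05-28T00:00:00.000Z", d1h: 0, d15m: 0, ct: 3 },
  { d1d: "2015-05-28T00:00:00.000Z", d1h: 0, d15m: 15, ct: 2 },
  { d1d: "2015-05-28T00:00:00.000Z", d1h: 0, d15m: 0, ct: 1 }
];
var mergedSample = mergeByInterval(sample);
console.log(mergedSample);
```

Unlike a pass that only compares neighbouring records, this does not require matching records to be adjacent in the input.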
I am trying to build a data structure.
In my limited knowledge, 'hash table' seems to be the way to go. If you think there is an easier way, please suggest it.
I have two, 1-dimensional arrays:-
A[] - contains names of badges (accomplishment)
B[] - contains the dates on which the corresponding achievements in A[] were accomplished.
An achievement/accomplishment/badge can be accomplished more than one time.
Therefore a sample of the two arrays:-
A['scholar', 'contributor', 'teacher', 'student', 'tumbleweed', 'scholar'.....,'scholar',......]
B['1/2010', '2/2011', '3/2011', '6/2012', '10/2012', '2/2013',......'3/2013',........]
What I want to achieve with my data structure is:-
A list of unique keys (e.g. 'scholar') and all of its existing values (dates in array B[]).
Therefore my final result should be like:-
({'scholar': '1/2010', '2/2013', '3/2013'}), ({'contributor' : ........})..........
This way I can pick out a unique key and then traverse through all its unique values and then use them to plot on x-y grid. (y axis labels being unique badge names, and x axis being dates, sort of a timeline.)
Can anyone guide me how to build such a data structure??
And how do I access the keys of the resulting data structure, given that I don't know how many keys there are or what their individual values are? The keys are assigned dynamically, so their number and names vary.
Your final object structure would look like this:
{
'scholar': [],
'contributor': []
}
To build this, iterate through the names array and build the final result as you go: if the final result contains the key, push the corresponding date on to its value otherwise set a new key to an array containing its corresponding date.
something like:
var resultVal = {};
for(var i = 0; i < names.length; ++i) {
if(resultVal[names[i]]) {
resultVal[names[i]].push(dates[i]);
} else {
resultVal[names[i]] = [dates[i]];
}
}
Accessing the result - iterating through all values:
for(var key in resultVal) {
var dates = resultVal[key];
for(var i = 0; i < dates.length; ++i) {
// your logic here for each date
console.log("resultVal[" + key + "] ==> " + resultVal[key][i]);
}
}
will give results like:
resultVal[scholar] ==> 1/2010
resultVal[scholar] ==> 2/2013
resultVal[scholar] ==> 3/2013
resultVal[contributor] ==> 2/2011
resultVal[teacher] ==> 3/2011
resultVal[student] ==> 6/2012
resultVal[tumbleweed] ==> 10/2012
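Since the badge names are not known in advance, `Object.keys` is another way to discover them. A small sketch, assuming shortened versions of the question's two arrays; `buildTimeline` is a hypothetical helper name.

```javascript
function buildTimeline(names, dates) {
  var result = {};
  names.forEach(function (name, i) {
    // Use hasOwnProperty so a name like "constructor" cannot collide
    // with properties inherited from Object.prototype.
    if (!Object.prototype.hasOwnProperty.call(result, name)) {
      result[name] = [];
    }
    result[name].push(dates[i]);
  });
  return result;
}

var timeline = buildTimeline(
  ['scholar', 'contributor', 'scholar'],
  ['1/2010', '2/2011', '2/2013']
);

// The keys come back in the order they were first added.
var badgeNames = Object.keys(timeline);
console.log(badgeNames);       // ["scholar", "contributor"]
console.log(timeline.scholar); // ["1/2010", "2/2013"]
```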
You can try this...
var A = ['scholar', 'contributor',
'teacher', 'student', 'tumbleweed', 'scholar','scholar'];
var B = ['1/2010', '2/2011',
'3/2011', '6/2012', '10/2012', '2/2013','3/2013'];
var combined = {};
for(var i=0;i<A.length;i++) {
if(combined[A[i]] === undefined) {
combined[A[i]] = [];
}
combined[A[i]].push(B[i]);
}
Then each one of the arrays in combined can be accessed via
combined.scholar[0]
or
combined['scholar'][0]
Note the === when comparing against undefined