Expand a variable in a MongoDB aggregation pipeline

In TypeScript code, I would like to use a variable's value as a field name in a MongoDB aggregation pipeline. The problem is that the "keyToCheck" field name is a variable set by the TypeScript code and can therefore change depending on many conditions.
Is there a way to expand the variable "keyToCheck"?
I have tried $$keyToCheck and $keyToCheck, with no result (compilation errors).
Thanks.
...
const pipeline = [
{
$match: {
[this.countryOriginFieldName!]: {
$in: members
},
**keyToCheck**: {
$nin: dictionaryNotAbsoluteFieldList
}
}
},
...
UPDATE: try with this example:
var keyToCheck = "indicator";
var queryMatch = {"`$${keyToCheck}`": "US$millions"}
printjson(queryMatch);
db.getCollection("temp_collection").aggregate([
{
$match: queryMatch
},
{$project: {indicator: 1, value: 1}}
]
);
db.getCollection("temp_collection").insertMany([
{
"indicator" : "US$millions",
"value" : 1.0
},
{
"indicator" : "US$millions",
"value" : 2.0
},
{
"indicator" : "EUROmillions",
"value" : 3
}
]);
Desired output:
{
"indicator" : "US$millions",
"value" : 1.0
}
{
"indicator" : "US$millions",
"value" : 2.0
}

Query
The [keyToCheck] syntax uses the value of the variable as the key (a computed property name); it is not an array.
This assumes that you also want to project the field named by keyToCheck, rather than always projecting indicator.
var keyToCheck = "indicator";
db.getCollection("temp_collection").aggregate([
{
$match: {[keyToCheck]: "US$millions"}
},
{$project: {[keyToCheck]: 1, value: 1}}
]
);
This will work: the key is just a string, both in $match and in $project.
You don't need $ or $$ in this query.
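To make the bracket syntax concrete, here is a minimal sketch that can be run without a database. The helper name buildPipeline is hypothetical (it is not part of any MongoDB driver); it only builds the pipeline array so the resulting keys can be inspected.

```javascript
// buildPipeline is a hypothetical helper, not a MongoDB driver function.
function buildPipeline(keyToCheck, wantedValue) {
  return [
    // [keyToCheck] is a computed property name: the variable's value becomes the key.
    { $match: { [keyToCheck]: wantedValue } },
    { $project: { [keyToCheck]: 1, value: 1 } }
  ];
}

const pipeline = buildPipeline("indicator", "US$millions");
console.log(JSON.stringify(pipeline));

// Without the brackets, the key is the literal text "keyToCheck",
// regardless of what the variable holds.
const keyToCheck = "indicator";
const literalKey = { keyToCheck: "US$millions" };
console.log(Object.keys(literalKey)[0]);
```

The same array could then be passed to aggregate(); computed property names are plain JavaScript (ES2015) and also compile in TypeScript.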

Related

find query to find retrieve the data of specific element in mongodb

I have a document in my mongo instance in below format,
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b70",
"fname" : "john",
"created_date" : ISODate("2017-05-24T01:13:06.829Z"),
"customProp" : [
[
"customX","{\"some data related to X \"}"
],
[
"customY","{\"some data related to Y \"}"
],
[
"customZ","{\"some data related to Z \"}"
]
]
}
Elements such as "customX", "customY" and "customZ" are not necessarily present in all documents. How can I retrieve the values in the second position of the "customProp" entries, for example the data paired with "customZ" in this document?
I can use the following query to filter and find all the documents that have a "customZ" element:
db.getCollection('col1').find({$and : [{"customProp":{$elemMatch:{0:"customZ"}}}, {"created": { $gte: ISODate("2017-05-22T00:00:00.000Z") }}] },{"created":1}).limit(1) .pretty()
output:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b45",
"created" : ISODate("2017-05-24T01:13:06.829Z")
}
But I am looking for a way to retrieve the second value of the inner array whose first value is "customZ".
expected result:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b45",
"created" : ISODate("2017-05-24T01:13:06.829Z"),
"customPro": ["customZ","{\"some data related to Z \"}"]
}
I'm fine if my query just returns
{
"{\"some data related to Z \"}"
}

Well it is a nested array, which is not a great idea but you are in fact matching the element with the $elemMatch expression, so you do get the position in the "outer" array of customProp, which allows you to project with the positional $ operator:
db.getCollection('coll1').find(
{
"customProp":{ "$elemMatch": { "0": "customZ" } },
"created_date": { "$gte": ISODate("2017-05-22T00:00:00.000Z") }
},
{ "created_date": 1, "customProp.$": 1 }
)
That yields the result:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b70",
"created_date" : ISODate("2017-05-24T01:13:06.829Z"),
"customProp" : [
[
"customZ",
"{\"some data related to Z \"}"
]
]
}
Where customProp is of course still in a nested array, but when processing the individual documents in python you can just access the property at the array index:
doc['customProp'][0][1]
Which of course returns the value:
'{"some data related to Z "}'
Same goes for JavaScript really, which is basically identical in syntax. As a shell example:
db.getCollection('coll1').find(
{
"customProp":{ "$elemMatch": { "0": "customZ" } },
"created_date": { "$gte": ISODate("2017-05-22T00:00:00.000Z") }
},
{ "created_date": 1, "customProp.$": 1 }
).map(function(doc) {
doc['customProp'] = doc['customProp'][0][1];
return doc;
})
And the output:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b70",
"created_date" : ISODate("2017-05-24T01:13:06.829Z"),
"customProp" : "{\"some data related to Z \"}"
}
And the positional $ projection here ensures there is only one element in the returned array, so the notation is always the same to extract from all document results. So you get the matched element from the database, and you extract the property through the code.
Also note that you do not need $and here since all the query arguments are already AND conditions. This is the MongoDB default, so you do not need to explicitly express it. See how much nicer this looks without it.

Integrate between two collections

I have two collections:
'DBVisit_DB':
{
"_id" : ObjectId("582bc54958f2245b05b455c6"),
"visitEnd" : NumberLong(1479252157766),
"visitStart" : NumberLong(1479249815749),
"fuseLocation" : {.... }
"userId" : "A926D9E4853196A98D1E4AC6006DAF00#1927cc81cfcf7a467e9d4f4ac7a1534b",
"modificationTimeInMillis" : NumberLong(1479263563107),
"objectId" : "C4B4CE9B-3AF1-42BC-891C-C8ABB0F8DC40",
"creationTime" : NumberLong(1479252167996),
"lastUserInteractionTime" : NumberLong(1479252167996)
}
'device_data':
{
"_id" : { "$binary" : "AN6GmE7Thi+Sd/dpLRjIilgsV/4AAAg=", "$type" : "00" },
"auditVersion" : "1.0",
"currentTime" : NumberLong(1479301118381),
"data" : {
"networkOperatorName" : "Cellcom",...
},
"timezone" : "Asia/Jerusalem",
"collectionAlias" : "DEVICE_DATA",
"shortDate" : 17121,
"userId" : "00DE86984ED3862F9277F7692D18C88A#1927cc81cfcf7a467e9d4f4ac7a1534b"
}
In DBVisit_DB I need to show all visits that took more than 1 hour (visitEnd - visitStart > 1 hour), only for Cellcom users, by matching the userId value in both collections.
This is what I did so far:
//create an array that contains all the rows that "Cellcom" is their networkOperatorName
var users = db.device_data.find({ "data.networkOperatorName": "Cellcom" },{ userId: 1, _id: 0}).toArray();
//create an array that contains all the rows that the visit time is more then one hour
var time = db.DBVisit_DB.find( { $where: function() {
timePassed = new Date(this.visitEnd - this.visitStart).getHours();
return timePassed > 1}},
{ userId: 1, _id: 0, "visitEnd" : 1, "visitStart":1} ).toArray();
//merge between the two arrays
var result = [];
var i, j;
for (i = 0; i < time; i++) {
for (j = 0; j < users; j++) {
if (time[i].userId == users[j].userId) {
result.push(time[i]);
}
}
}
for (var i = 0; i < result.length; i++) {
print(result[i].userId);
}
But it doesn't print anything, although I know for sure that there are IDs that can be found in both of the arrays I created.
*For verification: I'm not 100% sure that I calculated the visit time correctly.
By the way, I'm new to both JavaScript and MongoDB.
********update********
In "device_data" there are different rows with the same "userId" field.
"device_data" also has the "data.networkOperatorName" field, which contains different cellular companies.
I've been asked to show all "Cellcom" users that, based on the 'DBVisit_DB' collection, have been connected for more than an hour. That means,
based on the fields "visitEnd" and "visitStart", I need to know whether ("visitEnd" - "visitStart" > 1 hour).
{ "userId" : "457A7A0097F83074DA5E05F7E05BEA1D#1927cc81cfcf7a467e9d4f4ac7a1534b" }
{ "userId" : "E0F5C56AC227972CFAFC9124E039F0DE#1927cc81cfcf7a467e9d4f4ac7a1534b" }
{ "userId" : "309FA12926EC3EB49EB9AE40B6078109#1927cc81cfcf7a467e9d4f4ac7a1534b" }
{ "userId" : "B10420C71798F1E8768ACCF3B5E378D0#1927cc81cfcf7a467e9d4f4ac7a1534b" }
{ "userId" : "EE5C11AD6BFBC9644AF3C742097C531C#1927cc81cfcf7a467e9d4f4ac7a1534b" }
{ "userId" : "20EA1468672EFA6793A02149623DA2C4#1927cc81cfcf7a467e9d4f4ac7a1534b" }
Each array has this format after my queries. I need to merge them into one, so that I have the intersection between them.
Thanks a lot for all the help!

With the aggregation framework, you can achieve the desired result with the $lookup operator, which performs a "left-join" on collections in the same database, together with the $redact pipeline operator, which can accommodate arithmetic operators that manipulate the timestamps and convert them to minutes that you can query.
To show a simple example of how useful the above aggregate operators are, you can run the following pipeline on the DBVisit_DB collection to see the actual time difference in minutes:
db.getCollection('DBVisit_DB').aggregate([
{
"$project": {
"visitStart": { "$add": [ "$visitStart", new Date(0) ] },
"visitEnd": { "$add": [ "$visitEnd", new Date(0) ] },
"timeDiffInMinutes": {
"$divide": [
{ "$subtract": ["$visitEnd", "$visitStart"] },
1000 * 60
]
},
"isMoreThanHour": {
"$gt": [
{
"$divide": [
{ "$subtract": ["$visitEnd", "$visitStart"] },
1000 * 60
]
}, 60
]
}
}
}
])
Sample Output
{
"_id" : ObjectId("582bc54958f2245b05b455c6"),
"visitEnd" : ISODate("2016-11-15T23:22:37.766Z"),
"visitStart" : ISODate("2016-11-15T22:43:35.749Z"),
"timeDiffInMinutes" : 39.0336166666667,
"isMoreThanHour" : false
}
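The numbers in the sample output can be checked by hand. The following sketch repeats the $subtract and $divide arithmetic in plain JavaScript, using the visitStart and visitEnd values from the document posted in the question; the constant name MS_PER_MINUTE is only for illustration.

```javascript
// Values copied from the sample DBVisit_DB document in the question.
const visitStart = 1479249815749;
const visitEnd = 1479252157766;

// Same arithmetic as the $subtract / $divide expressions in the pipeline.
const MS_PER_MINUTE = 1000 * 60;
const timeDiffInMinutes = (visitEnd - visitStart) / MS_PER_MINUTE;
const isMoreThanHour = timeDiffInMinutes > 60;

console.log(timeDiffInMinutes, isMoreThanHour);
```

The difference is 2342017 milliseconds, or roughly 39 minutes, which is why isMoreThanHour is false for this document.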
Now that you understand how the above operators work, you can apply them in the following example. The aggregate pipeline uses the device_data collection as the main collection, first filters the documents on the specified field using $match, and then joins to the DBVisit_DB collection using $lookup. $redact evaluates, within $cond, the logical condition that selects visits longer than an hour, and uses the special system variable $$KEEP to "keep" the document where the condition is true or $$PRUNE to "discard" the document where it is false.
The arithmetic operators $divide and $subtract allow you to calculate the difference between the two timestamp fields as minutes, and the $gt logical operator then evaluates the condition:
db.device_data.aggregate([
/* Filter input documents */
{ "$match": { "data.networkOperatorName": "Cellcom" } },
/* Do a left-join to DBVisit_DB collection */
{
"$lookup": {
"from": "DBVisit_DB",
"localField": "userId",
"foreignField": "userId",
"as": "userVisits"
}
},
/* Flatten resulting array */
{ "$unwind": "$userVisits" },
/* Redact documents */
{
"$redact": {
"$cond": [
{
"$gt": [
{
"$divide": [
{ "$subtract": [
"$userVisits.visitEnd",
"$userVisits.visitStart"
] },
1000 * 60
]
},
60
]
},
"$$KEEP",
"$$PRUNE"
]
}
}
])

There are a couple of things incorrect in your JavaScript.
Replace time and users in the for loop conditions with time.length and users.length.
Your timePassed calculation should be
timePassed = this.visitEnd - this.visitStart
return timePassed > 3600000
You also have a couple of data-related issues.
For the documents you posted in the question, there is no matching userId, and the difference between visitEnd and visitStart is less than an hour.
For a MongoDB-based query, you should check out the other answer.
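As a rough sketch of the corrected client-side merge, the snippet below works on small in-memory arrays. The sample rows and the names ONE_HOUR_MS, visits and cellcomIds are made up for illustration; in the shell, the arrays would come from the two find(...).toArray() calls in the question. It swaps the nested loops for a Set lookup, which is a design choice rather than a requirement; the loops work too once the conditions use .length.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000; // 3600000

// Hypothetical sample rows standing in for the two query results.
const users = [{ userId: "u1" }, { userId: "u2" }];
const visits = [
  { userId: "u1", visitStart: 0, visitEnd: 2 * ONE_HOUR_MS }, // Cellcom user, long visit
  { userId: "u2", visitStart: 0, visitEnd: 30 * 60 * 1000 },  // Cellcom user, short visit
  { userId: "u3", visitStart: 0, visitEnd: 3 * ONE_HOUR_MS }  // long visit, other operator
];

// Compare the raw millisecond difference. new Date(diff).getHours() returns
// an hour of the day in the local time zone, not a duration.
const cellcomIds = new Set(users.map(u => u.userId));
const result = visits.filter(
  v => v.visitEnd - v.visitStart > ONE_HOUR_MS && cellcomIds.has(v.userId)
);

result.forEach(v => console.log(v.userId));
```

Only the first visit satisfies both conditions, so a single userId is printed.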

How to exclude a document if two fields are the same?

After performing some aggregation magic, I have arrived at this data:
{ "_id" : "5700edfe03fcdb000347bebb", "size" : 3, "count" : 2 }
{ "_id" : "5700edfe03fcdb000347bebf", "size" : 2, "count" : 2 }
Now, I want to eliminate all the entries where size is equal to count.
So I ran this aggregation instruction:
match3 = { "$match" : { "size" : { "$ne" : "count"} } }
But it doesn't eliminate anything and returns both lines as they are.
I want the result to be just this one line as it is the only one where size is not equal to count:
{ "_id" : "5700edfe03fcdb000347bebb", "size" : 3, "count" : 2 }

You need to add a $redact stage to your aggregation pipeline:
{ "$redact": {
"$cond": [
{ "$eq": [ "$size", "$count" ] },
"$$PRUNE",
"$$KEEP"
]
}}
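To see why the original $match kept both documents, it may help to replay the two comparisons in plain JavaScript on the sample data: "$ne": "count" compares size with the literal string "count", not with the count field. The last constant shows an alternative stage under the assumption that MongoDB 3.6 or later is available, where $expr allows field-to-field comparison inside $match; the variable names are illustrative only.

```javascript
// The two sample documents from the question.
const docs = [
  { _id: "5700edfe03fcdb000347bebb", size: 3, count: 2 },
  { _id: "5700edfe03fcdb000347bebf", size: 2, count: 2 }
];

// Original attempt: size is compared with the string "count",
// which never equals a number, so every document passes.
const literalMatch = docs.filter(d => d.size !== "count");

// Field-to-field comparison, which is what the $redact stage expresses.
const fieldMatch = docs.filter(d => d.size !== d.count);

// Assumption: MongoDB 3.6+, where $expr can compare two fields in $match.
const matchStage = { $match: { $expr: { $ne: ["$size", "$count"] } } };

console.log(literalMatch.length, fieldMatch.length);
```

The literal comparison keeps both documents, while the field comparison keeps only the one where size is 3 and count is 2.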

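You can use the $where operator for this:
// returns the documents where size differs from count
db.collection.find({ $where: "this.size != this.count" })
// deletes the documents where size equals count
db.collection.remove({ $where: "this.size == this.count" })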
UPDATE:
After I got downvoted I decided to compare the 2 solutions.
Both use a COLLSCAN and both return the same results.
So please enlighten me what is so wrong about my solution? :)

Merging MongoDB fields of documents into one document

I'm using MongoDB 2.6.6
I have these documents in a MongoDB collection and here is an example:
{ ..., "field3" : { "one" : [ ISODate("2014-03-18T05:47:33Z"),ISODate("2014-06-02T20:00:25Z") ] }, ...}
{ ..., "field3" : { "two" : [ ISODate("2014-03-18T05:47:33Z"),ISODate("2014-06-02T20:00:25Z") ] }, ...}
{ ..., "field3" : { "three" : [ ISODate("2014-03-18T05:47:39Z"),ISODate("2014-03-19T20:18:38Z") ] }, ... }
I would like to merge these documents into one field. For example, I would like the new result to be as follows:
{ "field3" : { "all" : [ ISODate("2014-03-18T05:47:39Z"),ISODate("2014-03-19T20:18:38Z"),...... ] } }
I'm just not sure any more how to have that result!

Doesn't really leave much to go on here but you can arguably get the kind of merged result with mapReduce:
db.collection.mapReduce(
function() {
var field = this.field3;
Object.keys(field).forEach(function(key) {
field[key].forEach(function(date) {
emit( "field3", { "all": [date] } )
});
});
},
function (key,values) {
var result = { "all": [] };
values.forEach(function(value) {
value.all.forEach(function(date) {
result.all.push( date );
});
});
result.all.sort(function(a,b) { return a.valueOf()-b.valueOf() });
return result;
},
{ "out": { "inline": 1 } }
)
Which, being mapReduce, is not exactly in the same output format, given its own restrictions for doing things:
{
"results" : [
{
"_id" : "field3",
"value" : {
"all" : [
ISODate("2014-03-18T05:47:33Z"),
ISODate("2014-03-18T05:47:33Z"),
ISODate("2014-03-18T05:47:39Z"),
ISODate("2014-03-19T20:18:38Z"),
ISODate("2014-06-02T20:00:25Z"),
ISODate("2014-06-02T20:00:25Z")
]
}
}
],
"timeMillis" : 86,
"counts" : {
"input" : 3,
"emit" : 6,
"reduce" : 1,
"output" : 1
},
"ok" : 1
}
Since the aggregation here into a single document is fairly arbitrary you could pretty much argue that you simply take the same kind of approach in client code.
At any rate, this is only going to be useful over a relatively small set of data, and client processing has much the same sort of restriction: it can go beyond the 16MB BSON limit for MongoDB, but is certainly limited by the memory it consumes.
So I presume you would need to add a "query" argument but it's not really clear from your question. Either using mapReduce or your client code, you are basically going to need to follow this sort of process to "mash" the arrays together.
I would personally go with the client code here.
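A minimal sketch of that client-side approach, assuming the documents have already been fetched into an array, could look like this. The function name mergeField3 is hypothetical, and JavaScript Date objects stand in for the ISODate values from the question.

```javascript
// In-memory stand-ins for the three documents from the question.
const docs = [
  { field3: { one: [new Date("2014-03-18T05:47:33Z"), new Date("2014-06-02T20:00:25Z")] } },
  { field3: { two: [new Date("2014-03-18T05:47:33Z"), new Date("2014-06-02T20:00:25Z")] } },
  { field3: { three: [new Date("2014-03-18T05:47:39Z"), new Date("2014-03-19T20:18:38Z")] } }
];

// mergeField3 is a hypothetical helper: it collects every date under any
// key of field3 and returns them as one sorted "all" array.
function mergeField3(documents) {
  const all = [];
  for (const doc of documents) {
    for (const key of Object.keys(doc.field3)) {
      all.push(...doc.field3[key]);
    }
  }
  all.sort((a, b) => a.valueOf() - b.valueOf());
  return { field3: { all } };
}

const merged = mergeField3(docs);
console.log(merged.field3.all.map(d => d.toISOString()));
```

Like the mapReduce version, this keeps duplicate dates; deduplicating would be an extra step if that is what is wanted.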

How to extract an array of fields from an array of JSON documents?

I have 2 mongodb collections, stu_creds and stu_profile. I first want to retrieve all the student records from stu_creds where stu_pref_contact is the email and then for those stu_ids I want to retrieve the complete profile from stu_profile. The problem is, the first query returns an array of JSON documents, with each document holding one field, the stu_id. Here is my query and the result:
db.stu_creds.find({"stu_pref_contact" : "email"}, {'_id' : 1})
Result:
[{ "_id" : ObjectId("51927cc93080baac04000001") },
{ "_id" : ObjectId("51927d7b3080baac04000002") },
{ "_id" : ObjectId("519bb011c5c5035b2a000002") },
{ "_id" : ObjectId("519ce3d09f047a192b000010") },
{ "_id" : ObjectId("519e6dc0f919cfdc66000003") },
{ "_id" : ObjectId("51b39be0c74f0e3d23000012") },
{ "_id" : ObjectId("51b39ca9c74f0e3d23000014") },
{ "_id" : ObjectId("51b39cb7c74f0e3d23000016") },
{ "_id" : ObjectId("51b39e87c74f0e3d23000018") },
{ "_id" : ObjectId("51b39f2fc74f0e3d2300001a") },
{ "_id" : ObjectId("51b39f47c74f0e3d2300001c") },
{ "_id" : ObjectId("518d454deb1e3a525e000009") },
{ "_id" : ObjectId("51bc8381dd10286e5b000002") },
{ "_id" : ObjectId("51bc83f7dd10286e5b000004") },
{ "_id" : ObjectId("51bc85cbdd10286e5b000006") },
{ "_id" : ObjectId("51bc8630dd10286e5b000008") },
{ "_id" : ObjectId("51bc8991dd10286e5b00000a") },
{ "_id" : ObjectId("51bc8a43dd10286e5b00000c") },
{ "_id" : ObjectId("51bc8a7ddd10286e5b00000e") },
{ "_id" : ObjectId("51bc8acadd10286e5b000010") }]
The thing is, I don't think I can use the above array as part of an $in clause for my second query to retrieve the student profiles. I have to walk through the array and create a new array which is just an array of object IDs rather than an array of JSON docs.
Is there an easier way to do this?

Use Array.map (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map). This allows you to perform a transform on each element of the array, returning you a new array of the transformed items.
var arrayOfIds = result.map(function(item){ return item._id; });
Array.map was introduced in ECMAScript 5. If you're using node.js, a modern browser, or an Array polyfill, it should be available to use.
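Put together with the $in clause mentioned in the question, the idea can be sketched as follows. Plain strings stand in for the ObjectId values so the snippet runs outside the shell, and matching stu_profile on _id is an assumption, since the question does not show the profile schema.

```javascript
// Stand-in for the first query's result; strings replace ObjectId values here.
const result = [
  { _id: "51927cc93080baac04000001" },
  { _id: "51927d7b3080baac04000002" },
  { _id: "519bb011c5c5035b2a000002" }
];

// Transform the array of documents into an array of bare ids.
const arrayOfIds = result.map(function (item) { return item._id; });

// Assumption: stu_profile documents are keyed by the same _id values.
const profileQuery = { _id: { $in: arrayOfIds } };

console.log(JSON.stringify(profileQuery));
```

In the shell, the filter object would then be passed as db.stu_profile.find(profileQuery).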

Ummm, am I missing something or is all you want the following:
var results = [];
for(var i = 0; i < yourArray.length; i++) {
results.push(yourArray[i]._id);
}

You could use $or:
db.stu_profile.find({ $or : results }) // `results` is the original array of { _id: ... } documents, since $or expects query documents
But it's considerably slower than $in, so I would suggest using one of the other answers ;)
