Find Index of Object in array with Aggregate - javascript

Is there a way to get an index in an aggregate pipeline? I have a result from a long aggregate query:
[
{
"_id": "59ed949227ec482044b2671e",
"points": 300,
"fan_detail": [
{
"_id": "59ed949227ec482044b2671e",
"name": "mila ",
"email": "mila#gmail.com ",
"password": "$2a$10$J0.KfwVnZkaimxj/BiqGW.D40qXhvrDA952VV8x.xdefjNADaxnSW",
"username": "mila 0321",
"updated_at": "2017-10-23T07:04:50.004Z",
"created_at": "2017-10-23T07:04:50.004Z",
"celebrity_request_status": 0,
"push_notification": [],
"fan_array": [],
"fanLength": 0,
"celeb_bio": null,
"is_admin": 0,
"is_blocked": 2,
"notification_setting": [
1,
2,
3,
4,
5,
6,
7
],
"total_stars": 0,
"total_points": 134800,
"user_type": 2,
"poster_pic": null,
"profile_pic": "1508742289662.jpg",
"facebook_id": "alistnvU79vcc81PLW9o",
"is_user_active": 1,
"is_username_selected": "false",
"__v": 0
}
]
}
],
So I want to find the index of a given _id within the aggregate query. The array above can contain hundreds of objects.

Depending on the available version of MongoDB you have there are different approaches:
$indexOfArray - MongoDB 3.4
The best operator for this is simply $indexOfArray where you have it available. The name says it all really:
Model.aggregate([
{ "$match": { "fan_detail._id": mongoose.Types.ObjectId("59ed949227ec482044b2671e") } },
{ "$addFields": {
"fanIndex": {
"$indexOfArray": [
"$fan_detail._id",
mongoose.Types.ObjectId("59ed949227ec482044b2671e")
]
}
}}
])
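As a rough illustration of what that stage computes, the lookup can be mirrored in plain JavaScript outside the database. This is only a sketch and not MongoDB API: `indexOfFan` and `sampleDoc` are hypothetical names, and the first `_id` in the sample is a made-up placeholder.

```javascript
// Plain-JavaScript analogue of $indexOfArray applied to "$fan_detail._id".
// `indexOfFan` and `sampleDoc` are hypothetical, for illustration only.
function indexOfFan(doc, targetId) {
  // "$fan_detail._id" resolves to an array holding each element's _id
  const ids = doc.fan_detail.map((fan) => String(fan._id));
  // As with $indexOfArray, a value that is not found yields -1
  return ids.indexOf(String(targetId));
}

const sampleDoc = {
  _id: "doc1",
  fan_detail: [
    { _id: "000000000000000000000001", name: "other" }, // placeholder entry
    { _id: "59ed949227ec482044b2671e", name: "mila" }
  ]
};

console.log(indexOfFan(sampleDoc, "59ed949227ec482044b2671e")); // 1
console.log(indexOfFan(sampleDoc, "ffffffffffffffffffffffff")); // -1
```

Comparing string forms here stands in for the ObjectId equality that the server performs.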
$unwind with includeArrayIndex - MongoDB 3.2
Going back a release, $unwind can report the array index through its includeArrayIndex option, which takes the name of the field the index is written to. But this does require you to $unwind the array:
Model.aggregate([
{ "$match": { "fan_detail._id": mongoose.Types.ObjectId("59ed949227ec482044b2671e") } },
{ "$unwind": { "path": "$fan_detail", "includeArrayIndex": "fanIndex" } },
{ "$match": { "fan_detail._id": mongoose.Types.ObjectId("59ed949227ec482044b2671e") } }
])
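To make the effect of includeArrayIndex concrete, here is a plain JavaScript sketch of what $unwind followed by the second $match produces for a single document. `unwindWithIndex` is a hypothetical helper (not part of MongoDB or Mongoose), and the index field name `fanIndex` and the sample `_id` values are assumptions chosen for the example.

```javascript
// Plain-JavaScript analogue of:
//   { "$unwind": { "path": "$fan_detail", "includeArrayIndex": "fanIndex" } }
// `unwindWithIndex` is a hypothetical helper, for illustration only.
function unwindWithIndex(doc, arrayField, indexField) {
  return doc[arrayField].map((element, index) => ({
    ...doc,
    [arrayField]: element, // the array is replaced by one of its elements
    [indexField]: index    // position of that element in the original array
  }));
}

const doc = {
  _id: "doc1",
  points: 300,
  fan_detail: [{ _id: "a" }, { _id: "b" }, { _id: "c" }]
};

// The trailing $match keeps only the unwound document for the wanted element
const matched = unwindWithIndex(doc, "fan_detail", "fanIndex")
  .filter((d) => d.fan_detail._id === "b");

console.log(matched);
// [ { _id: 'doc1', points: 300, fan_detail: { _id: 'b' }, fanIndex: 1 } ]
```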
mapReduce - Earlier versions
Versions of MongoDB earlier than 3.2 don't have a way of returning an array index in an aggregation pipeline. So if you want the matched index instead of all the data, then you use mapReduce instead:
Model.mapReduce({
map: function() {
emit(
this._id,
this['fan_detail']
.map( f => f._id.valueOf() )
.indexOf("59ed949227ec482044b2671e")
)
},
reduce: function() {},
query: { "fan_detail._id": mongoose.Types.ObjectId("59ed949227ec482044b2671e") }
})
In all cases we essentially "query" for the presence of the element "somewhere" in the array beforehand. The "indexOf" variants would return -1 where nothing was found otherwise.
Also $addFields is here just for example. If it's your real intent to not return the array of 100's of items, then you're probably using $project or other output anyway.

Related

Concatenate all the arrays of the elements in a collection [MongoDB]

Sorry, I didn't get the MongoDB aggregation well.
How can I achieve with an aggregation this:
[
{array: [1,2,3] },
{array: [4,5,6] },
{array: [7,8,9] }
]
desired result:
[1,2,3,4,5,6,7,8,9]
Does the performance change if instead of using MongoDB aggregation I consider documents as normal objects?
Aggregation is generally a better option than doing this in application code, because the database can return the result in one go.
db.collection.aggregate([
{ "$group": {
"_id": null,
"data": { "$push": "$array" }
}},
{ "$project": {
"_id": 0,
"data": {
"$reduce": {
"input": "$data",
"initialValue": [],
"in": { "$concatArrays": ["$$value", "$$this"] }
}
}
}}
])
The only thing to watch for is that the resulting single document must not exceed the 16MB BSON document size limit.
You can $group by null to get an array of arrays as a single document and then you can run $reduce with $concatArrays to flatten that array:
db.col.aggregate([
{
$group: {
_id: null,
array: { $push: "$array" }
}
},
{
$project: {
_id: 0,
array: {
$reduce: {
input: "$array",
initialValue: [],
in: { $concatArrays: [ "$$value", "$$this" ] }
}
}
}
}
])
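The same flattening can be sketched in plain JavaScript to show why the operand order inside $concatArrays matters. This is only an in-memory analogue of the pipeline, using the sample documents from the question.

```javascript
// Plain-JavaScript analogue of $group/$push followed by $reduce/$concatArrays.
const docs = [
  { array: [1, 2, 3] },
  { array: [4, 5, 6] },
  { array: [7, 8, 9] }
];

// $group with $push: collect every "array" field into one array of arrays
const pushed = docs.map((d) => d.array);

// ["$$value", "$$this"]: append each array to the accumulated value
const flattened = pushed.reduce((value, current) => value.concat(current), []);

// ["$$this", "$$value"]: prepend instead, which reverses the order of the groups
const prepended = pushed.reduce((value, current) => current.concat(value), []);

console.log(flattened); // [1, 2, 3, 4, 5, 6, 7, 8, 9]
console.log(prepended); // [7, 8, 9, 4, 5, 6, 1, 2, 3]
```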

Query and filter key names instead of values in MongoDB

I want to find all key names from a collection that partially match a certain string.
The closest I got was to check if a certain key exists, but that's an exact match:
db.collection.find({ "fkClientID": { $exists:1 }})
I'd like to get all keys that start with fk instead.
You can do that using mapReduce:
To get just the field names at root level:
db.collection.mapReduce(function () {
Object.keys(this).map(function(key) {
if (key.match(/^fk/)) emit(key, null);
// OR: key.indexOf("fk") === 0
});
}, function(/* key, values */) {
// No need for params or to return anything in the
// reduce, just pass an empty function.
}, { out: { inline: 1 }});
This will output something like this:
{
"results": [{
"_id": "fkKey1",
"value": null
}, {
"_id": "fkKey2",
"value": null
}, {
"_id": "fkKey3",
"value": null
}],
"timeMillis": W,
"counts": {
"input": X,
"emit": Y,
"reduce": Z,
"output": 3
},
"ok" : 1
}
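For intuition, the map step above can be mirrored over an in-memory array in plain JavaScript. This is a sketch, not something to run against the server: `fkKeys` and the sample documents are hypothetical, and a Set stands in for mapReduce grouping the emitted keys.

```javascript
// Plain-JavaScript analogue of emitting every root-level key that starts with "fk".
// `fkKeys` and `docs` are hypothetical, for illustration only.
function fkKeys(docs) {
  const found = new Set(); // plays the role of grouping by emitted key
  for (const doc of docs) {
    for (const key of Object.keys(doc)) {
      if (/^fk/.test(key)) found.add(key);
    }
  }
  return [...found].sort();
}

const docs = [
  { _id: 1, fkClientID: 10, name: "a" },
  { _id: 2, fkOrderID: 7, fkClientID: 11 },
  { _id: 3, other: true }
];

console.log(fkKeys(docs)); // [ 'fkClientID', 'fkOrderID' ]
```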
To get field names along with their values (or the whole document):
db.test.mapReduce(function () {
var obj = this;
Object.keys(this).map(function(key) {
// With `obj[key]` you will get the value of the field as well.
// You can change `obj[key]` for:
// - `obj` to return the whole document.
// - `obj._id` (or any other field) to return its value.
if (key.match(/^fk/)) emit(key, obj[key]);
});
}, function(key, values) {
// We can't return values or an array directly yet:
return { values: values };
}, { out: { inline: 1 }});
This will output something like this:
{
"results": [{
"_id": "fkKey1",
"value": {
"values": [1, 4, 6]
}
}, {
"_id": "fkKey2",
"value": {
"values": ["foo", "bar"]
}
}],
"timeMillis": W,
"counts": {
"input": X,
"emit": Y,
"reduce": Z,
"output": 2
},
"ok" : 1
}
To get field names in subdocuments (without path):
To do that you will have to store JavaScript functions on the server:
db.system.js.save({ _id: "hasChildren", value: function(obj) {
// typeof null is also "object", so exclude it before recursing with Object.keys
return obj !== null && typeof obj === "object";
}});
db.system.js.save({ _id: "getFields", value: function(doc) {
Object.keys(doc).map(function(key) {
if (key.match(/^fk/)) emit(key, null);
if (hasChildren(doc[key])) getFields(doc[key])
});
}});
And change your map to:
function () {
getFields(this);
}
Now run db.loadServerScripts() to load them.
To get field names in subdocuments (with path):
The previous version will just return field names, not the whole path to get them, which you will need if what you want to do is rename those keys. To get the path:
db.system.js.save({ _id: "getFields", value: function(doc, prefix) {
Object.keys(doc).map(function(key) {
if (key.match(/^fk/)) emit(prefix + key, null);
if (hasChildren(doc[key]))
getFields(doc[key], prefix + key + '.')
});
}});
And change your map to:
function () {
getFields(this, '');
}
To exclude overlapping path matches:
Note that if you have a field fkfoo.fkbar, it will return fkfoo and fkfoo.fkbar. If you don't want overlapping path matches, then:
db.system.js.save({ _id: "getFields", value: function(doc, prefix) {
Object.keys(doc).map(function(key) {
if (hasChildren(doc[key]))
getFields(doc[key], prefix + key + '.')
else if (key.match(/^fk/)) emit(prefix + key, null);
});
}});
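The difference between the two path-collecting variants is easier to see side by side. The following plain JavaScript sketch walks one in-memory document; `collectFkPaths`, `isSubdocument` and the sample document are hypothetical, and the null check is an assumption added so that null values are treated as leaves.

```javascript
// Plain-JavaScript analogue of the recursive getFields variants above.
// All names here are hypothetical, for illustration only.
function isSubdocument(value) {
  return value !== null && typeof value === "object";
}

function collectFkPaths(doc, prefix, out, skipOverlaps) {
  for (const key of Object.keys(doc)) {
    const nested = isSubdocument(doc[key]);
    // With skipOverlaps, a matching key is reported only when it is a leaf
    if (/^fk/.test(key) && !(skipOverlaps && nested)) out.push(prefix + key);
    if (nested) collectFkPaths(doc[key], prefix + key + ".", out, skipOverlaps);
  }
  return out;
}

const sample = {
  fkfoo: { fkbar: 1, x: 2 },
  fkbaz: 3,
  plain: { fkdeep: null }
};

console.log(collectFkPaths(sample, "", [], false));
// [ 'fkfoo', 'fkfoo.fkbar', 'fkbaz', 'plain.fkdeep' ]
console.log(collectFkPaths(sample, "", [], true));
// [ 'fkfoo.fkbar', 'fkbaz', 'plain.fkdeep' ]
```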
Going back to your question, renaming those fields:
With this last option, you get all the paths that include keys that start with fk, so you can use $rename for that.
However, $rename doesn't work for those that contain arrays, so for those you could use forEach to do the update. See MongoDB rename database field within array
Performance note:
MapReduce is not particularly fast though, so you may want to specify { out: "fk_fields" } to output the results into a new collection called fk_fields and query those results later, but that will depend on your use case.
Possible optimisations for specific cases (consistent schema):
Also, note that if you know that the schema of your documents is always the same, then you just need to check one of them to get its fields, so you can do that adding limit: 1 to the options object or just retrieving one document with findOne and reading its fields in the application level.
If you have MongoDB 3.4.4 or later, you can use $objectToArray in an aggregate statement with $redact, which is about the fastest way this can be done with native operators. Not that scanning the collection is "fast", but it is as fast as you get for this:
db[collname].aggregate([
{ "$redact": {
"$cond": {
"if": {
"$gt": [
{ "$size": { "$filter": {
"input": { "$objectToArray": "$$ROOT" },
"as": "doc",
"cond": {
"$eq": [ { "$substr": [ "$$doc.k", 0, 2 ] }, "fk" ]
}
}}},
0
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
The presently undocumented $objectToArray translates an "object" into "key" and "value" form in an array. So this:
{ "a": 1, "b": 2 }
Becomes this:
[{ "k": "a", "v": 1 }, { "k": "b", "v": 2 }]
Used with $$ROOT which is a special variable referring to the current document "object", we translate to an array so the values of "k" can be inspected.
Then it's just a matter of applying $filter and using $substr to get the preceding characters of the "key" string.
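The shape of that transformation can be sketched in plain JavaScript. The functions below are hypothetical in-memory analogues (named after the operators only for readability), not the server implementation.

```javascript
// Plain-JavaScript analogue of $objectToArray followed by $filter on the key prefix.
// `objectToArray` and `hasFkKey` are hypothetical helpers, for illustration only.
function objectToArray(obj) {
  return Object.entries(obj).map(([k, v]) => ({ k, v }));
}

function hasFkKey(doc) {
  // { "$substr": [ "$$doc.k", 0, 2 ] } compared with "fk", then $size > 0
  const matches = objectToArray(doc).filter((pair) => pair.k.substring(0, 2) === "fk");
  return matches.length > 0;
}

console.log(objectToArray({ a: 1, b: 2 }));
// [ { k: 'a', v: 1 }, { k: 'b', v: 2 } ]
console.log(hasFkKey({ _id: 1, fkClientID: 10 })); // true
console.log(hasFkKey({ _id: 2, name: "x" }));      // false
```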
For the record, this would be the MongoDB 3.4.4 optimal way of obtaining a unique list of the matching keys:
db[collname].aggregate([
{ "$redact": {
"$cond": {
"if": {
"$gt": [
{ "$size": { "$filter": {
"input": { "$objectToArray": "$$ROOT" },
"as": "doc",
"cond": {
"$eq": [ { "$substr": [ "$$doc.k", 0, 2 ] }, "fk" ]
}
}}},
0
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$project": {
"j": {
"$filter": {
"input": { "$objectToArray": "$$ROOT" },
"as": "doc",
"cond": {
"$eq": [ { "$substr": [ "$$doc.k", 0, 2 ] }, "fk" ]
}
}
}
}},
{ "$unwind": "$j" },
{ "$group": { "_id": "$j.k" }}
])
That's the safe provision, which is considering that the key may not be present in all documents and that there could possibly be multiple keys in the document.
If you are absolutely certain that you "always" have the key present in the document and that there will only be one, then you can shorten to just $group:
db[collname].aggregate([
{ "$group": {
"_id": {
"$arrayElemAt": [
{ "$map": {
"input": { "$filter": {
"input": { "$objectToArray": "$$ROOT" },
"as": "doc",
"cond": {
"$eq": [ { "$substr": [ "$$doc.k", 0, 2 ] }, "fk" ]
}
}},
"as": "el",
"in": "$$el.k"
}},
0
]
}
}}
])
The most efficient way in earlier versions would be using the $where syntax that allows a JavaScript expression to evaluate. Not that anything that evaluates JavaScript is the "most" efficient thing you can do, but analyzing "keys" as opposed to "data" is not optimal for any data store:
db[collname].find(function() { return Object.keys(this).some( k => /^fk/.test(k) ) })
The inline function there is just shell shorthand and this could also be written as:
db[collname].find({ "$where": "return Object.keys(this).some( k => /^fk/.test(k) )" })
The only requirement for $where is that the expression returns a true value for any document you want to return, so the documents return unaltered.

Conditional aggregate function in MongoDB

I have a mongodb that has data as
{
"_id": "a",
"reply": "<",
"criterion": "story"
},
{
"_id": "b",
"reply": "<",
"criterion": "story"
},
{
"_id": "c",
"reply": ">",
"criterion": "story"
}
And I want the result as:
{
"criterion": "story",
"result" : {
">" : 1,
"<" : 2
}
}
I want to aggregate on "criterion". So if I do that there will be 1 document. However, I want to count the number of "<" and ">" and write that in the new key as shown in the json above. That is the logic behind this. Could anyone who has a good idea in mongodb help me with this?
You'd need to use the aggregation framework where you would run an aggregation pipeline that has a $group operator pipeline stage which aggregates the documents to create the desired counts using the accumulator operator $sum.
For the desired result, you would need to use a ternary operator like $cond to create the independent count fields, since that will feed either 1 or 0 to the $sum expression depending on the reply value. The $cond operator takes a logical condition as its first argument (if) and then returns the second argument where the evaluation is true (then) or the third argument where it is false (else). This converts the true/false result into 1 and 0 respectively, which then feed into $sum:
"$cond": [
{ "$eq": ["$reply", ">"] },
1, 0
]
So, if within the document being processed the "$reply" field has a ">" value, the $cond operator feeds the value 1 to the $sum else it sums a zero value.
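The same counting logic can be written in plain JavaScript over the sample documents from the question. This is only an in-memory analogue to show how $cond and $sum combine, not a replacement for the pipeline.

```javascript
// Plain-JavaScript analogue of { "$sum": { "$cond": [ { "$eq": [...] }, 1, 0 ] } }.
const docs = [
  { _id: "a", reply: "<", criterion: "story" },
  { _id: "b", reply: "<", criterion: "story" },
  { _id: "c", reply: ">", criterion: "story" }
];

// The conditional contributes 1 for a matching reply and 0 otherwise;
// the reduce plays the role of the $sum accumulator.
const greater = docs.reduce((sum, d) => sum + (d.reply === ">" ? 1 : 0), 0);
const less = docs.reduce((sum, d) => sum + (d.reply === "<" ? 1 : 0), 0);

console.log({ ">": greater, "<": less }); // { '>': 1, '<': 2 }
```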
Use the $project as your final pipeline step as it allows you to reshape each document in the stream, include, exclude or rename fields, inject computed fields, create sub-document fields, using mathematical expressions, dates, strings and/or logical (comparison, boolean, control) expressions. It is similar to SELECT in SQL.
The following pipeline should return the desired result:
Model.aggregate([
{
"$group": {
"_id": "$criterion",
">": {
"$sum": {
"$cond": [
{ "$eq": [ "$reply", ">" ] },
1, 0
]
}
},
"<": {
"$sum": {
"$cond": [
{ "$eq": [ "$reply", "<" ] },
1, 0
]
}
}
}
},
{
"$project": {
"_id": 0,
"criterion": "$_id",
"result.>": "$>",
"result.<": "$<"
}
}
]).exec(function(err, result) {
console.log(JSON.stringify(result, null, 4));
});
Sample Console Output
{
"criterion" : "story",
"result" : {
">" : 1,
"<" : 2
}
}
Note: This approach takes into consideration the values for the $reply field are fixed and known hence it's not flexible where the values are dynamic and unknown.
For a more flexible alternative that should also perform better than the above and that handles unknown values for the count fields, I would suggest running the pipeline as follows:
Model.aggregate([
{
"$group": {
"_id": {
"criterion": "$criterion",
"reply": "$reply"
},
"count": { "$sum": 1 }
}
},
{
"$group": {
"_id": "$_id.criterion",
"result": {
"$push": {
"reply": "$_id.reply",
"count": "$count"
}
}
}
}
]).exec(function(err, result) {
console.log(JSON.stringify(result, null, 4));
});
Sample Console Output
{
"_id" : "story",
"result" : [
{
"reply" : "<",
"count" : 2
},
{
"reply" : ">",
"count" : 1
}
]
}
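If the object shape from the question is still wanted, the array form can be reshaped in client code after the query returns. The following is a minimal sketch assuming the result looks like the sample output above; `aggregated` and `reshaped` are hypothetical variable names.

```javascript
// Reshape [{ reply, count }, ...] into { "<": 2, ">": 1 } on the client.
// `aggregated` stands in for the array returned by the pipeline above.
const aggregated = [
  {
    _id: "story",
    result: [
      { reply: "<", count: 2 },
      { reply: ">", count: 1 }
    ]
  }
];

const reshaped = aggregated.map((group) => ({
  criterion: group._id,
  // Each { reply, count } pair becomes one key/value entry
  result: Object.fromEntries(group.result.map((r) => [r.reply, r.count]))
}));

console.log(JSON.stringify(reshaped[0]));
// {"criterion":"story","result":{"<":2,">":1}}
```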

Cannot get correct result when using MongoDB aggregation in meteor

I am using MongoDB aggregation in meteor.
The items in database look like this:
// item1
{
products: {
aaa: 100,
bbb: 200
}
}
// item2
{
products: {
aaa: 300,
bbb: 400
}
}
My pipeline looks like this
let pipeline = [{
$limit: 10
}, {
$group: {
_id: {
// …
},
total: {
$sum: "$products.aaa"
}
}
}];
And it is working perfect. But when I change my database structure to this
// item1
{
products: [
{code: "aaa", num: 100},
{code: "bbb", num: 200}
]
}
// item2
{
products: [
{code: "aaa", num: 300},
{code: "bbb", num: 400}
]
}
The result I get for total is always 0, so I think my pipeline is wrong. Please see the comment inside:
let pipeline = [{
$limit: 10
}, {
$group: {
_id: {
// …
},
total: {
$sum: "$products.0.num" // Neither this nor "$products[0].num" works
}
}
}];
So how can I write it correctly? Thanks
With MongoDB 3.2 (which won't be the server bundled with Meteor, but there is nothing stopping you from using a separate server instance, and that would actually be recommended) you can use $arrayElemAt with $map:
let pipeline = [
{ "$limit": 10 },
{ "$group": {
"_id": {
// …
},
"total": {
"$sum": { "$arrayElemAt": [
{ "$map": {
"input": "$products",
"as": "product",
"in": "$$product.num"
}},
0
]}
}
}}
];
With older versions, use "two" $group stages and the $first operator after processing with $unwind. And that's just for the "first" index value:
let pipeline = [
{ "$limit": 10 },
{ "$unwind": "$products" },
{ "$group": {
"_id": "$_id", // The document _id
"otherField": { "$first": "$eachOtherFieldForGroupingId" },
"productNum": { "$first": "$products.num" }
}},
{ "$group": {
"_id": {
// …
},
"total": {
"$sum": "$productNum"
}
}}
];
So in the latter case, after you $unwind you just want to use $first to get the "first" index from the array, and it would also be used to get every field you want to use as part of the grouping key from the original document. All elements would be copied for each array member after $unwind.
In the former case, $map just extracts the "num" values for each array member, then $arrayElemAt just retrieves the wanted index position.
Naturally the newer method for MongoDB 3.2 is better. If you wanted another array index then you would need to repeatedly get the $first element from the array and keep filtering it out from the array results until you reached the required index.
So whilst it's possible in earlier versions, it's a lot of work to get there.
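As a sanity check on what the $map and $arrayElemAt combination computes, the same total can be worked out in plain JavaScript from the sample items in the question. This is an in-memory sketch only; `items`, `index` and `total` are hypothetical names.

```javascript
// Plain-JavaScript analogue of $sum over $arrayElemAt applied to a $map result.
const items = [
  { products: [{ code: "aaa", num: 100 }, { code: "bbb", num: 200 }] },
  { products: [{ code: "aaa", num: 300 }, { code: "bbb", num: 400 }] }
];

const index = 0; // the array position passed to $arrayElemAt

const total = items
  // $map extracts every "num", then $arrayElemAt picks the wanted position
  .map((item) => item.products.map((product) => product.num)[index])
  // $sum in the $group stage adds the picked values together
  .reduce((sum, num) => sum + num, 0);

console.log(total); // 400
```

Changing `index` to 1 gives the total for the second element of each array, which is the case that needs repeated filtering in the older approach.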

Concatenate string values in array in a single field in MongoDB

Suppose that I have a series of documents with the following format:
{
"_id": "3_0",
"values": ["1", "2"]
}
and I would like to obtain a projection of the array's values concatenated in a single field:
{
"_id": "3_0",
"values": "1_2"
}
Is this possible? I have tried $concat but I guess I can't use $values as the array for $concat.
In Modern MongoDB releases you can. You still cannot "directly" apply an array to $concat, however you can use $reduce to work with the array elements and produce this:
db.collection.aggregate([
{ "$addFields": {
"values": {
"$reduce": {
"input": "$values",
"initialValue": "",
"in": {
"$cond": {
"if": { "$eq": [ { "$indexOfArray": [ "$values", "$$this" ] }, 0 ] },
"then": { "$concat": [ "$$value", "$$this" ] },
"else": { "$concat": [ "$$value", "_", "$$this" ] }
}
}
}
}
}}
])
Combining of course with $indexOfArray in order to not "concatenate" with the "_" underscore when it is the "first" index of the array.
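One thing worth checking with this approach is how it behaves when the first value appears more than once, since $indexOfArray reports the first matching position. The plain JavaScript sketch below emulates the $reduce logic and, as an assumption of mine rather than part of the answer, compares it with a variant that tests whether the accumulated string is still empty. Both function names are hypothetical.

```javascript
// Emulates the $reduce above: the separator is skipped whenever the current
// value is found at index 0 of the array.
function joinByIndexCheck(values) {
  return values.reduce(
    (acc, current) =>
      values.indexOf(current) === 0 ? acc + current : acc + "_" + current,
    ""
  );
}

// Alternative sketch: skip the separator while the accumulator is still empty.
// Trade-off: leading empty strings in the array would lose their separator.
function joinByEmptyCheck(values) {
  return values.reduce(
    (acc, current) => (acc === "" ? current : acc + "_" + current),
    ""
  );
}

console.log(joinByIndexCheck(["1", "2"])); // "1_2"
console.log(joinByIndexCheck(["1", "1"])); // "11"
console.log(joinByEmptyCheck(["1", "1"])); // "1_1"
```

In pipeline terms, the second variant would correspond to testing "$$value" against the empty string instead of calling $indexOfArray.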
Also my additional "wish" has been answered with $sum:
db.collection.aggregate([
{ "$addFields": {
"total": { "$sum": "$items.value" }
}}
])
This kind of gets raised a bit in general with aggregation operators that take an array of items. The distinction here is that it means an "array" of "arguments" provided in the coded representation, as opposed to an "array element" present in the current document.
The only way you can really do the kind of concatenation of items within an array present in the document is to do some kind of JavaScript option, as with this example in mapReduce:
db.collection.mapReduce(
function() {
emit( this._id, { "values": this.values.join("_") } );
},
function() {},
{ "out": { "inline": 1 } }
)
Of course if you are not actually aggregating anything, then possibly the best approach is to simply do that "join" operation within your client code in post processing your query results. But if it needs to be used in some purpose across documents then mapReduce is going to be the only place you can use it.
I could add that "for example" I would love for something like this to work:
{
"items": [
{ "product": "A", "value": 1 },
{ "product": "B", "value": 2 },
{ "product": "C", "value": 3 }
]
}
And in aggregate:
db.collection.aggregate([
{ "$project": {
"total": { "$add": [
{ "$map": {
"input": "$items",
"as": "i",
"in": "$$i.value"
}}
]}
}}
])
But it does not work that way because $add expects arguments as opposed to an array from the document. Sigh! :(. Part of the "by design" reasoning for this could be argued that "just because" it is an array or "list" of singular values being passed in from the result of the transformation it is not "guaranteed" that those are actually "valid" singular numeric type values that the operator expects. At least not at the current implemented methods of "type checking".
That means for now we still have to do this:
db.collection.aggregate([
{ "$unwind": "$items" },
{ "$group": {
"_id": "$_id",
"total": { "$sum": "$items.value" }
}}
])
And also sadly there would be no way to apply such a grouping operator to concatenate strings either.
So you can hope for some sort of change on this, or hope for some change that allows an externally scoped variable to be altered within the scope of a $map operation in some way. Better yet a new $join operation would be welcome as well. But these do not exist as of writing, and probably will not for some time to come.
You can use the reduce operator together with the substr operator.
db.collection.aggregate([
{
$project: {
values: {
$reduce: {
input: '$values',
initialValue: '',
in: {
$concat: ['$$value', '_', '$$this']
}
}
}
}
},
{
$project: {
values: { $substr: ['$values', 1 , -1]}
}
}])
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to concatenate an array of strings:
// { "_id" : "3_0", "values" : [ "1", "2" ] }
db.collection.aggregate(
{ $set:
{ "values":
{ $function: {
body: function(values) { return values.join('_'); },
args: ["$values"],
lang: "js"
}}
}
}
)
// { "_id" : "3_0", "values" : "1_2" }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to join.
args, which contains the fields from the record that the body function takes as parameter. In our case "$values".
lang, which is the language in which the body function is written. Only js is currently available.
