How to join array of documents from array of _ids in mongoDB

I have a document in MongoDB that has an array of ObjectIds. It looks like this:
--Shops Collection--
listing_ids: [
ObjectId("6092f741ccb0d55ba444883e"),
ObjectId("6092f741ccb0d55ba444883f"),
ObjectId("6092f741ccb0d55ba4448840"),
ObjectId("6092f741ccb0d55ba4448841")
],
...
and I have a listing document like this:
--Listings Collection--
_id: ObjectId("6092f741ccb0d55ba444883e"),
...
How do I join the listings documents onto the shops document that contains the array? I have tried the "$lookup" operator from MongoDB but can't get it to work; I always get a FailedToParse error. This is what the "$lookup" looks like:
const collection = client.db("development").collection("shops");
const shop = await collection.aggregate([
{
"$lookup": {
"from": "listings",
"localfield": "listing_ids",
"foreignField": "_id",
"as": "listings"
}}
]).toArray();
This is expected to return all of the shops documents, each with its corresponding listings documents in an array called listings.

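Turns out I almost had it right: the visible difference is the key "localfield", which must be spelled "localField" (the option names are case-sensitive). I ended up using the MongoDB Atlas Aggregation tool and it helped me. This is the working pipeline, which also adds a second lookup for a highlightedListings field: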
.aggregate([
{
'$lookup': {
'from': 'listings',
'localField': 'listing_ids',
'foreignField': '_id',
'as': 'listings'
}
}, {
'$lookup': {
'from': 'listings',
'localField': 'highlightedListings',
'foreignField': '_id',
'as': 'highlightedListings'
}
}
]).toArray();
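Since the failure came down to a mis-cased option name, one way to guard against it is to spell the option names in a single place. This is only a minimal sketch: `lookupStage` is a hypothetical helper, not part of the MongoDB driver, and the commented usage line assumes the `collection` variable from the question.

```javascript
// Hypothetical helper (not a driver API): builds a $lookup stage so the
// case-sensitive option names are spelled in exactly one place.
function lookupStage(from, localField, foreignField, as) {
  return { $lookup: { from, localField, foreignField, as } };
}

// When localField is an array of ObjectIds, $lookup (MongoDB 3.4 and later)
// matches each element against foreignField, so no $unwind is needed first.
const pipeline = [
  lookupStage("listings", "listing_ids", "_id", "listings"),
];

// Usage, assuming `collection` is the shops collection from the question:
// const shops = await collection.aggregate(pipeline).toArray();
console.log(JSON.stringify(pipeline, null, 2));
```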

Related

How to filter only some rows of an array inside a document?

I have a MongoDB collection of documents, using a schema like this:
var schema = new Schema({
name: String,
images: [{
uri: String,
active: Boolean
}]
});
I'd like to get all documents (or filter using some criteria), but retrieve - in the images array - only the items with a specific property (in my case, it's {active: true}).
This is what I do now:
db.people.find( { 'images.active': true } )
But this retrieves only documents with at least one image which is active, which is not what I need.
I know of course I can filter in code after the find is returned, but I do not like wasting memory.
Is there a way I can filter array items in a document using mongoose?

Here is the aggregation you're looking for:
db.collection.aggregate([
{
$match: {}
},
{
$project: {
name: true,
images: {
$filter: {
input: "$images",
as: "images",
cond: {
$eq: [
"$$images.active",
true
]
}
}
}
}
}
])
https://mongoplayground.net/p/t_VxjfiBBMK
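If you also want to "filter using some criteria", the same pipeline can be built by a small function and handed to Mongoose. A rough sketch: `activeImagesPipeline` and the `Person` model name in the usage comment are assumptions for illustration, not names from the question.

```javascript
// Hypothetical helper: builds the pipeline above for an arbitrary $match
// filter, keeping only the array items whose `active` flag is true.
function activeImagesPipeline(criteria = {}) {
  return [
    { $match: criteria },
    {
      $project: {
        name: true,
        images: {
          $filter: {
            input: "$images",
            as: "image",
            cond: { $eq: ["$$image.active", true] },
          },
        },
      },
    },
  ];
}

// Usage with Mongoose, assuming a model named Person built from the schema:
// const docs = await Person.aggregate(activeImagesPipeline({ name: "Ann" }));
console.log(JSON.stringify(activeImagesPipeline({ name: "Ann" }), null, 2));
```

Documents that have no active image still come back, with an empty images array, which is the difference from the find() in the question.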

'InvalidPipelineOperator' with $lookup when adding .toArray()

I'm new to MongoDB, and working with aggregation. Here goes:
I have two collections 'video_card' and 'vendors'. I am fetching a video card document from the collection with the following structure:
_id:ObjectId(x),
vendorList:
0: ID1;
1: ID2;
I am trying to do a join between this document and this vendor collection:
_id: ObjectId(y),
name: "Amazon"
My aggregate is as follows so far:
const products = await db
.collection("video_card")
.aggregate([
{
$project: {
mappedVendors: {
$map: {
input: "$vendorList",
as: "vendorName",
in: {
$lookup: {
from: "vendors",
localField: "vendorList",
foreignField: "name",
as: "VendorNames"
},
},
},
},
},
},
]);
This returns a cursor object. However, when I attach .toArray() to the end of this, I get a code:168 'InvalidPipelineOperator'. Why is this?
To clarify, my intent is to return the data with vendorList ids swapped with names.

To follow up, I realized that my foreignField in the $lookup was incorrect, and changed it to _id. I hope this helps other people learning aggregate functions.
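Beyond the foreignField fix, two things explain the error itself. aggregate() returns a lazy cursor, so the server only receives the pipeline once .toArray() runs. And $lookup is a pipeline stage, not an expression, so it cannot appear inside $map. Below is a sketch of one way to swap the ids for names, assuming vendorList holds the vendors' _id values; the output field names vendorDocs and vendorNames are made up for this example.

```javascript
const pipeline = [
  // $lookup has to be a top-level stage. With an array localField, each
  // element of vendorList is matched against vendors._id.
  {
    $lookup: {
      from: "vendors",
      localField: "vendorList",
      foreignField: "_id",
      as: "vendorDocs",
    },
  },
  // $map is an expression, so it is fine inside $project.
  {
    $project: {
      vendorNames: {
        $map: { input: "$vendorDocs", as: "vendor", in: "$$vendor.name" },
      },
    },
  },
];

// Usage, assuming `db` is a connected Db instance as in the question:
// const products = await db.collection("video_card").aggregate(pipeline).toArray();
console.log(JSON.stringify(pipeline, null, 2));
```

Note that the looked-up documents may not come back in the same order as vendorList, so don't rely on positions lining up.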

Async/await functionality for db.eval aggregate

I am trying to execute a db.collection.aggregate() query within a call to db.eval(). I am using eval() because I am making a dynamic number of lookups, so I generate the query by concatenating relevant strings. The query works perfectly when I manually remove the quotes from the string:
await db.collection('Products').aggregate([{
$lookup: {
from: 'Golomax',
localField: 'barcode',
foreignField: 'barcode',
as: 'Golomax'
}
}, {
$unwind: {
path: '$Golomax',
preserveNullAndEmptyArrays: true
}
}, {
$lookup: {
from: 'Masivos SA',
localField: 'barcode',
foreignField: 'barcode',
as: 'Masivos SA'
}
}, {
$unwind: {
path: '$Masivos SA',
preserveNullAndEmptyArrays: true
}
}, {
$out: 'Output'
}]).toArray();
Unfortunately, it does not work when I am using the string in a call to db.eval(). I put quotes around the code snippet above and set the string equal to the variable 'query' and tried this:
db.eval('async function(){' + query + ' return;}', function(err, result) {
console.log('the result is: ', result);
});
I've also tried removing the word "async," and this still has not worked. How do I ensure that the function will finish aggregating before returning? Thanks.
-- EDIT --
I just noticed that db.eval() is deprecated and planned for removal. The alternative is to "implement the equivalent queries/operations using the normal MongoDB query language and client driver API." How can I do this using a string query?

You don't need db.eval() for this. It sounds like you want to create a $lookup and $unwind for each item in an array. That is exactly what map() is for. You can build the array of stages separately and then pass it to aggregate():
// Have a list of places
const thingsToUnwind = [
'Golomax',
'Masivos SA',
'Some Other Place',
'Yet Another Place'
];
const unwindables = thingsToUnwind
// Create a $lookup and $unwind for each place
.map(place => {
return [{
$lookup: {
from: place,
localField: 'barcode',
foreignField: 'barcode',
as: place
}
},
{
$unwind: {
path: `$${place}`,
preserveNullAndEmptyArrays: true
}
}
];
})
// Flatten the array of arrays
.reduce((acc, curr) => [...acc, ...curr], []);
// Add the $out stage
unwindables.push({
$out: 'Output'
});
// Perform the aggregation
await db
.collection('Products')
.aggregate(unwindables)
.toArray();
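On Node 11 or later, the map() plus reduce() pair can be collapsed into a single flatMap() call. A short sketch of the same idea, using only two of the collection names from above; the commented usage line assumes a connected `db`.

```javascript
// Collection names taken from the answer above.
const places = ["Golomax", "Masivos SA"];

// flatMap builds the [$lookup, $unwind] pair per place and flattens in one pass.
const stages = places.flatMap((place) => [
  { $lookup: { from: place, localField: "barcode", foreignField: "barcode", as: place } },
  { $unwind: { path: `$${place}`, preserveNullAndEmptyArrays: true } },
]);

// $out has to be the last stage of the pipeline.
stages.push({ $out: "Output" });

// Usage, assuming `db` is a connected Db instance:
// await db.collection("Products").aggregate(stages).toArray();
console.log(stages.length); // 5: two stages per place plus $out
```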

I just solved my own problem using the Javascript eval() and removing the "await" at the beginning of the string. It executes perfectly now!

Insert value inside array within Mongo DB documents using bulk write [duplicate]

I want to show products by id (56e641d4864e5b780bb992c6 and 56e65504a323ee0812e511f2) and show the price after subtracting the discount, if there is one.
I can compute the final price using aggregate, but this returns every document in the collection. How do I make it return only the documents with the matching ids?
"_id" : ObjectId("56e641d4864e5b780bb992c6"),
"title" : "Keyboard",
"discount" : NumberInt(10),
"price" : NumberInt(1000)
"_id" : ObjectId("56e65504a323ee0812e511f2"),
"title" : "Mouse",
"discount" : NumberInt(0),
"price" : NumberInt(1000)
"_id" : ObjectId("56d90714a48d2eb40cc601a5"),
"title" : "Speaker",
"discount" : NumberInt(10),
"price" : NumberInt(1000)
this is my query
productModel.aggregate([
{
$project: {
title : 1,
price: {
$cond: {
if: {$gt: ["$discount", 0]}, then: {$subtract: ["$price", {$divide: [{$multiply: ["$price", "$discount"]}, 100]}]}, else: "$price"
}
}
}
}
], function(err, docs){
if (err){
console.log(err)
}else{
console.log(docs)
}
})
and if I add this $in query, it returns an empty array
productModel.aggregate([
{
$match: {_id: {$in: ids}}
},
{
$project: {
title : 1,
price: {
$cond: {
if: {$gt: ["$discount", 0]}, then: {$subtract: ["$price", {$divide: [{$multiply: ["$price", "$discount"]}, 100]}]}, else: "$price"
}
}
}
}
], function(err, docs){
if (err){
console.log(err)
}else{
console.log(docs)
}
})

Your ids variable will be constructed of "strings", and not ObjectId values.
Mongoose "autocasts" string values for ObjectId into their correct type in regular queries, but this does not happen in the aggregation pipeline, as described in issue #1399.
Instead you must do the correct casting to type manually (using "new" works across Mongoose versions):
ids = ids.map(function(el) { return new mongoose.Types.ObjectId(el) })
Then you can use them in your pipeline stage:
{ "$match": { "_id": { "$in": ids } } }
The reason is because aggregation pipelines "typically" alter the document structure, and therefore mongoose makes no presumption that the "schema" applies to the document in any given pipeline stage.
It is arguable that the "first" pipeline stage when it is a $match stage should do this, since indeed the document is not altered. But right now this is not how it happens.
Any values that may possibly be "strings" or at least not the correct BSON type need to be manually cast in order to match.
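Because a malformed id string is easy to miss, it can help to validate before casting. This is only a sketch: `toObjectIds` is a hypothetical helper, and the cast function is injected so the snippet runs without Mongoose installed. With Mongoose you would pass `(s) => new mongoose.Types.ObjectId(s)` instead of the stand-in used here.

```javascript
// Hypothetical helper: validates 24-character hex strings before casting, so
// a stray space or truncated id fails loudly instead of matching nothing.
// `cast` is injected; with Mongoose it would be
//   (s) => new mongoose.Types.ObjectId(s)
function toObjectIds(ids, cast) {
  return ids.map((id) => {
    if (!/^[0-9a-fA-F]{24}$/.test(id)) {
      throw new TypeError(`Not a 24-character hex ObjectId string: "${id}"`);
    }
    return cast(id);
  });
}

// Stand-in cast (NOT a real ObjectId) so the sketch runs on its own.
const fakeCast = (s) => ({ $oid: s });

const ids = toObjectIds(
  ["56e641d4864e5b780bb992c6", "56e65504a323ee0812e511f2"],
  fakeCast
);
const matchStage = { $match: { _id: { $in: ids } } };
console.log(JSON.stringify(matchStage));
```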

In Mongoose, this works fine with find({_id:'606c1ceb362b366a841171dc'})
But when using the aggregate function, we have to convert the _id to an ObjectId through mongoose, e.g.
$match: { "_id": new mongoose.Types.ObjectId("606c1ceb362b366a841171dc") }
This will work fine.

You can simply convert your id to
let id = new mongoose.Types.ObjectId(req.query.id);
and then match
{ $match: { _id: id } },

instead of:
$match: { _id: "6230415bf48824667a417d56" }
use:
$match: { _id: ObjectId("6230415bf48824667a417d56") }

Use this
$match: { _id: { $in: [ new mongoose.Types.ObjectId("56e641d4864e5b780bb992c6"), new mongoose.Types.ObjectId("56e65504a323ee0812e511f2") ] } }
Because Mongoose autocasts string values for ObjectId into their correct type in regular queries, but this does not happen in the aggregation pipeline. So we need to define ObjectId cast in pipeline queries.

Is there a way to find a document matching two different populates and get his document in findOne()?

I'm using mongoose with the combo mongoDb/nodejs. I would like to findOne() a doc with some conditions.
There is my Schema :
var prognosticSchema = new Schema({
userRef : { type : Schema.Types.ObjectId, ref : 'users'},
matchRef : { type : Schema.Types.ObjectId, ref : 'match'},
...
});
The 'users' model schema contains a String 'email' and the 'match' model schema contains a Number 'id_match', like this:
var userSchema = new Schema({
email: String,
...
});
then
var matchSchema = new Schema({
id_match: {type: Number, min: 1, max: 51},
...
});
My goal is to findOne() one doc which contains an id_match = id_match and an email = req.headers['x-key'].
I tried this:
var prognoSchema = require('../db_schema/prognostic'); // require prognostics
require('../db_schema/match'); // require match to be able to populate
var prognoQuery = prognoSchema.find()
.populate({path: 'userRef', // populate userRef
match : {
'email' : req.headers['x-key'] // populate where email match with email in headers of request (I'm using Express as node module)
},
select : 'email pseudo'
});
prognoQuery.findOne() // search for only one doc
.populate({path: 'matchRef', // populate match
match: {
'id_match': id_match // populate match where id_match is correct
}})
.exec(function(err, data) {
... // Return of value as response ...
});
When I run this code, knowing that my database holds many other prognostic documents for other users and matches, I get userRef as null and a correct matchRef in the returned document.
My database contains other users and other id_match values, but I would like findOne() to return the right document based on these two ObjectIds in my schema.
Is there a way to findOne() a document that matches two different populates?

Well, you can include "both" populate expressions in the same query. But since you actually want to "match" on properties contained in the "referenced" collections, the query has to fetch "all parents" first and only then populate the data:
prognoSchema.find()
.populate([
{
"path": "userRef",
"match": { "email": req.headers['x-key'] }
},
{
"path": "matchRef",
"match": { "id_match": id_match }
}
]).exec(function(err,data) {
/*
data contains the whole collection since there was no
condition there. But populated references that did not
match are now null. So .filter() them:
*/
data = data.filter(function(doc) {
return ( doc.userRef != null && doc.matchRef != null );
});
// data now contains only those item(s) that matched
})
That is not ideal, but it's just how using "referenced" data works.
A better approach would be to search the other collections "individually" for their single match, and then supply the found _id values to the "parent" collection. A little help from async.parallel here facilitates waiting on the results of the other queries before executing on the parent with the matched values. This can be done in various ways, but this looks relatively clean:
async.parallel(
{
"userRef": function(callback) {
User.findOne({ "email": req.headers['x-key'] },callback);
},
"id_match": function(callback) {
Match.findOne({ "id_match": id_match },callback);
}
},
function(err,result) {
prognoSchema.findOne({
"userRef": result.userRef._id,
"matchRef": result.id_match._id
}).populate([
{ "path": "userRef", "match": { "email": req.headers['x-key'] } },
{ "path": "matchRef", "match": { "id_match": id_match } }
]).exec(function(err,progno) {
// Matched and populated data only
})
}
)
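The same "query the references first" idea can also be written with promises instead of the async library. A sketch under assumptions: `findPrognostic` is a hypothetical function, and `User`, `Match` and `Prognostic` stand for Mongoose models (or anything whose findOne() returns a promise or thenable). The stubs exist only so the snippet runs without a database.

```javascript
// Sketch: the models are passed in so the function can be tried with stubs.
async function findPrognostic({ User, Match, Prognostic }, email, idMatch) {
  // Run both reference lookups concurrently, as async.parallel does.
  const [user, match] = await Promise.all([
    User.findOne({ email }),
    Match.findOne({ id_match: idMatch }),
  ]);

  // If either referenced document is missing, no prognostic can match.
  if (!user || !match) return null;

  return Prognostic.findOne({ userRef: user._id, matchRef: match._id });
}

// Stub models (made up for this example) standing in for real ones.
const stubs = {
  User: { findOne: async (q) => (q.email === "a@b.c" ? { _id: "u1" } : null) },
  Match: { findOne: async (q) => (q.id_match === 7 ? { _id: "m1" } : null) },
  Prognostic: { findOne: async (q) => ({ ...q, found: true }) },
};

findPrognostic(stubs, "a@b.c", 7).then((doc) => console.log(doc));
```

Unlike the callback version, this returns null early when the user or match is not found, rather than failing on result.userRef._id.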
As an alternative, in modern MongoDB releases from 3.2 onwards, you could use the $lookup aggregation stage instead:
prognoSchema.aggregate(
[
// $lookup the userRef data
{ "$lookup": {
"from": "users",
"localField": "userRef",
"foreignField": "_id",
"as": "userRef"
}},
// target is an array always so $unwind
{ "$unwind": "$userRef" },
// Then filter out anything that does not match
{ "$match": {
"userRef.email": req.headers['x-key']
}},
// $lookup the matchRef data
{ "$lookup": {
"from": "matches",
"localField": "matchRef",
"foreignField": "_id",
"as": "matchRef"
}},
// target is an array always so $unwind
{ "$unwind": "$matchRef" },
// Then filter out anything that does not match
{ "$match": {
"matchRef.id_match": id_match
}}
],
function(err,prognos) {
}
)
But again similarly ugly since the "source" is still selecting everything and you are only gradually filtering out results after each $lookup operation.
The basic premise here is "MongoDB does not 'really' perform joins", and neither is .populate() a "JOIN", but just additional queries on the related collections. Since this is "not" a "join" there is no way to filter out the "parent" until the actual related data is retrieved, even if it's done on the "server" via $lookup rather than on the "client" via .populate().
So if you "must" query this way, it's generally better to query the other collections for results "first" and then match the "parent" based on the matching _id property values as references.
But the other case here is that you "should" consider "embedding" the data instead, where it is your intent to "query" on those properties. Only when that data resides in a "single collection" is it possible for MongoDB to query and match those conditions with a single query and a performant operation.
