I'm trying to get data from two collections and return a single array with the merged data of both. The best solution for me was:
const bothValues = await ValueA.aggregate([
{ $unionWith: { coll: 'valueB' } },
{ $sort: { rank: -1, _id: -1 } },
{
$match: {
isAvailable: true,
},
},
{ $skip: skip },
{ $limit: 30 },
]);
which works perfectly. But $unionWith is not implemented in my MongoDB version (4.0.x), so I can't use it. Instead I tried the following:
const bothValues = await ValueA.aggregate(
[
{ $limit: 1 },
{
$lookup: {
from: 'valueB',
pipeline: [{ $limit: 15 }],
as: 'valueB',
},
},
{
$lookup: {
from: 'ValueA',
pipeline: [{ $limit: 15 }, { $sort: { rank: -1, _id: -1 } }],
as: 'ValueA',
},
},
{
$project:
{
Union: { $concatArrays: ['$valueB', '$ValueA'] },
},
},
{ $unwind: '$Union' },
{ $replaceRoot: { newRoot: '$Union' } },
],
);
But now I have two problems:
I can't use $skip, which is important. Where should I put it?
How do I use $match?
Thanks
Query
Here is your query, with some changes so it works like the first one:
$match in both pipelines, $sort in both, and $limit each to (limit + skip) documents
(this way we make sure that we always have enough documents, even if all of them come from valueA or from valueB)
Take the sorted top 70 from each, so in every case we have the 70 documents needed for the final sort/skip/limit after the union.
Concat, unwind, replace-root as in your query.
Sort again (this time sorting the union), then skip and limit.
No matter what happens, we always have enough documents to skip.
This example query is written for skip=40 and limit=30, so in the first two pipelines we use limit=70.
db.ValueA.aggregate([
{
"$limit": 1
},
{
"$lookup": {
"from": "valueB",
"pipeline": [
{
"$match": {
"isAvailable": true
}
},
{
"$sort": {
"rank": -1,
"_id": -1
}
},
{
"$limit": 70
}
],
"as": "valueB"
}
},
{
"$lookup": {
"from": "valueA",
"pipeline": [
{
"$match": {
"isAvailable": true
}
},
{
"$sort": {
"rank": -1,
"_id": -1
}
},
{
"$limit": 70
}
],
"as": "valueA"
}
},
{
"$project": {
"union": {
"$concatArrays": [
"$valueA",
"$valueB"
]
}
}
},
{
"$unwind": {
"path": "$union"
}
},
{
"$replaceRoot": {
"newRoot": "$union"
}
},
{
"$sort": {
"rank": -1,
"_id": -1
}
},
{
"$skip": 40
},
{
"$limit": 30
}
])
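If skip and limit need to stay dynamic, here is a sketch of the same pipeline written against the Mongoose model from the question; skip and limit are assumed to be integers supplied by the caller (inside an async function), and the inner limits simply become skip + limit:
// Sketch only: `skip` and `limit` are assumed integers supplied by the caller,
// and `ValueA` is the Mongoose model from the question.
const innerLimit = skip + limit; // each $lookup must return enough documents to survive the final $skip

const bothValues = await ValueA.aggregate([
  { $limit: 1 },
  {
    $lookup: {
      from: 'valueB',
      pipeline: [
        { $match: { isAvailable: true } },
        { $sort: { rank: -1, _id: -1 } },
        { $limit: innerLimit },
      ],
      as: 'valueB',
    },
  },
  {
    $lookup: {
      from: 'valueA',
      pipeline: [
        { $match: { isAvailable: true } },
        { $sort: { rank: -1, _id: -1 } },
        { $limit: innerLimit },
      ],
      as: 'valueA',
    },
  },
  { $project: { union: { $concatArrays: ['$valueA', '$valueB'] } } },
  { $unwind: '$union' },
  { $replaceRoot: { newRoot: '$union' } },
  { $sort: { rank: -1, _id: -1 } },
  { $skip: skip },
  { $limit: limit },
]);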
Related
I have the following aggregate function in my code to count how many times a value is found in the db:
let data: any = await this.dataModel.aggregate(
[
{
$match: {
field: new ObjectID(fieldID),
},
},
{
$group: {
_id: "$value",
total_for_value: { $sum: 1 },
},
},
]
);
This works correctly, however my data setup is a bit different. I have two types of value fields. Some like this:
{
"_id" : ObjectId("123"),
"value" : "MALE"
}
and some like this:
{
"_id" : ObjectId("456"),
"value" : {
"value" : "MALE",
}
}
Is there a way to group the ones where value and value.value are the same? At the moment they are counted separately.
db.collection.aggregate([
  {
    "$addFields": {
      "value2": {
        "$cond": {
          "if": {
            "$eq": [
              { "$type": "$value" },
              "object"
            ]
          },
          "then": "$value.value",
          "else": "$value"
        }
      }
    }
  },
  {
    "$group": {
      "_id": "$value2",
      "data": {
        "$push": "$$ROOT"
      }
    }
  }
])
This does the job: when value is an embedded document it groups on value.value, and otherwise on value itself.
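If what you actually need, as in the original question, is a count per normalized value rather than the pushed documents, the same $cond trick can feed a counting $group. A sketch, assuming the field is called value as in the question (normalizedValue is just an illustrative name):
db.collection.aggregate([
  {
    "$addFields": {
      "normalizedValue": {
        // if `value` is an embedded document, use its inner `value`, otherwise use it as-is
        "$cond": {
          "if": { "$eq": [{ "$type": "$value" }, "object"] },
          "then": "$value.value",
          "else": "$value"
        }
      }
    }
  },
  {
    "$group": {
      "_id": "$normalizedValue",
      "total_for_value": { "$sum": 1 }
    }
  }
])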
I have the following document structure
{
"_id": "60b7b7c784bd6c2a1ca57f29",
"user": "607c58578bac8c21acfeeae1",
"exercises": [
{
"executed_reps": [8,7],
"_id": "60b7b7c784bd6c2a1ca57f2a",
"exercise_name": "Push up"
},
{
"executed_reps": [5,5],
"_id": "60b7b7c784bd6c2a1ca57f2b",
"exercise_name": "Pull up"
}
],
}
In aggregation, I am trying to sum all the executed_reps so the end value in this example should be 25 (8+7+5+5).
Here is the code I have so far:
const exerciseStats = await UserWorkout.aggregate([
{
$match: {
user: { $eq: ObjectId(req.query.user) },
},
},
{ $unwind: '$exercises' },
{
$group: {
_id: null,
totalReps: {
$sum: {
$reduce: {
input: '$exercises.executed_reps',
initialValue: '',
in: { $add: '$$this' },
},
},
},
},
},
]);
This gives a result of 5 for totalReps. What am I doing wrong?
Well 10 minutes later I found the solution myself:
UserWorkout.aggregate([
{
$match: {
user: { $eq: ObjectId(req.query.user) },
},
},
{ $unwind: '$exercises' },
{
$project: {
total: { $sum: '$exercises.executed_reps' },
},
},
{
$group: {
_id: null,
totalExercises: { $sum: '$total' },
},
},])
Had to use $project first. :)
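For what it's worth, the original $reduce attempt was close: the problems are the string initialValue and the fact that $add: '$$this' never adds to the accumulator. A sketch of the corrected $reduce version, keeping the rest of the pipeline from the question:
const exerciseStats = await UserWorkout.aggregate([
  {
    $match: {
      user: { $eq: ObjectId(req.query.user) },
    },
  },
  { $unwind: '$exercises' },
  {
    $group: {
      _id: null,
      totalReps: {
        $sum: {
          $reduce: {
            input: '$exercises.executed_reps',
            initialValue: 0,                     // start the accumulator at 0, not ''
            in: { $add: ['$$value', '$$this'] }, // add each rep count to the running total
          },
        },
      },
    },
  },
]);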
You can do it like this:
const exerciseStats = await UserWorkout.aggregate([
{
"$addFields": {
"total": {
"$sum": {
"$map": {
"input": "$exercises",
"as": "exercise",
"in": {
"$sum": "$$exercise.executed_reps"
}
}
}
}
}
}
])
Here is the working example: https://mongoplayground.net/p/5_fsPgSh8EP
My documents look like this.
{
"_id" : ObjectId("572c4bffd073dd581edae045"),
"name" : "What's New in PHP 7",
"description" : "PHP 7 is the first new major version number of PHP since 2004. This course shows what's new, and what's changed.",
"difficulty_level" : "Beginner",
"type" : "Normal",
"tagged_skills" : [
{
"_id" : "5714e894e09a0f7d804b2254",
"name" : "PHP"
}
],
"created_at" : 1462520831.649,
"updated_at" : 1468233074.243 }
Is it possible to get the 5 most recent documents and the total count in a single query?
I am currently using two queries for this requirement, as given below.
db.course.find().sort({created_at:-1}).limit(5)
db.course.count()
This is a perfect job for the aggregation framework.
db.course.aggregate(
[
{ "$sort": { "created_at": -1 }},
{ "$group": {
"_id": null,
"docs": { "$push": "$$ROOT" },
"count": { "$sum": 1 }
}},
{ "$project": { "_id": 0, "count": 1, "docs": { "$slice": [ "$docs", 5 ] } }}
]
)
If your MongoDB server doesn't support $slice then you need to use the ugly and inefficient approach.
db.course.aggregate(
[
{ "$sort": { "created_at": -1 }},
{ "$group": {
"_id": null,
"docs": { "$push": "$$ROOT" },
"count": { "$sum": 1 }
}},
{ "$unwind": "$docs" },
{ "$limit": 5 }
]
)
You can implement this easily with $facet:
myCollection.aggregate([
{
$facet: {
count: [{ $count: "value" }],
data: [{ $sort: { _id: -1 } }, { $skip: skip }, { $limit: limit }]
}
},
{ $unwind: "$count" },
{ $set: { count: "$count.value" } }
])
The returned result will look like this:
[
{
"count": 234,
"data": [
// ...
]
}
]
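In application code the single document in that result array can then be unpacked directly. A sketch, assuming a Mongoose model (so aggregate() is awaitable) and the same skip/limit placeholders as above; the destructuring default covers an empty collection, where $unwind leaves no document at all:
const [result = { count: 0, data: [] }] = await myCollection.aggregate([
  {
    $facet: {
      count: [{ $count: "value" }],
      data: [{ $sort: { _id: -1 } }, { $skip: skip }, { $limit: limit }]
    }
  },
  { $unwind: "$count" },
  { $set: { count: "$count.value" } }
]);

console.log(result.count, result.data.length);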
@styvane I tested this myself; this query is even less efficient than running two separate queries.
// get count
db.course.aggregate([{$match:{}}, {$count: "count"}]);
// get docs
db.course.aggregate(
[
{$match:{}},
{ "$sort": { "created_at": -1 }},
{"$skip": offset},
{"$limit": limit}
]
)
No, there is no other way: two queries, one for the count and one with the limit.
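If you do end up with two queries, you can at least run them in parallel rather than sequentially. A rough sketch, assuming a Mongoose model named Course (hypothetical) on Mongoose 5+, with the same offset/limit placeholders as above:
// Fire the count and the page query concurrently and wait for both.
const [count, docs] = await Promise.all([
  Course.countDocuments({}),
  Course.find({})
    .sort({ created_at: -1 })
    .skip(offset)
    .limit(limit)
    .exec(),
]);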
I currently have this schema
var dataSchema = new Schema({
hid: { type: String },
sensors: [{
nid: { type: String },
sid: { type: String },
data: {
param1: { type: String },
param2: { type: String },
data: { type: String }
},
date: { type: Date, default: Date.now }
}],
actuators: [{
nid: { type: String },
aid: { type: String },
control_id: { type: String },
data: {
param1: { type: String },
param2: { type: String },
data: { type: String }
},
date: { type: Date, default: Date.now }
}],
status: [{
nid: {type: String},
status_code: {type: String},
date: { type: Date, default: Date.now }
}],
updated: { type: Date, default: Date.now },
created: { type: Date }
});
And the query that I'm trying to build should search the collection by "hid" and then pick only the last object (by date) from the "sensors", "actuators" and "status" arrays, but I can't figure out how to do that.
With this query I can partially achieve what I'm trying to get, but it only gives me one array at a time, so I have to query the database three times, which I would like to avoid.
db.getCollection('data').aggregate([
{ $match : { hid : "testhid" } },
{$project : {"sensors" : 1}},
{$unwind : "$sensors"},
{$sort : {"sensors.date" : -1}},
{$limit : 1}
])
Thanks in advance for any help
The best advice here would be to "store" the arrays as sorted in the first place. Chances are they probably already are, considering that any $push operation (or even .push()) simply "appends" to the array, so the latest item is "last" anyway.
So unless you are actually "changing" the "date" properties after creation, the "latest date" is always the "last" item anyway. In that case, just $slice the entries:
Data.find({ "hid": "testhid" }).select({
    "sensors": { "$slice": -1 },
    "actuators": { "$slice": -1 },
    "status": { "$slice": -1 }
}).exec(function(err,data) {
    // each of the three arrays now contains only its last (latest) element
});
"If", some reason you actually did manage to store in a different way or altered the "date" properties so they latest is no longer the "last", then it's probably a good idea to have all future updates use the $sort modifier with $push. This can "ensure" that additions to the array are consistently sorted. You can even modify the whole collection in one simple statement:
Data.update(
{},
{
"$push": {
"sensors": { "$each": [], "$sort": { "date": 1 } },
"actuators": { "$each": [], "$sort": { "date": 1 } },
"status": { "$each": [], "$sort": { "date": 1 } }
}
},
{ "multi": true },
function(err,num) {
}
)
In that one statement, every document in the collection has each of the mentioned arrays re-sorted so that the "latest date" is the "last" entry in every array. This then means that the above usage of $slice is perfectly fine.
Now "If", absolutely none of that is possible for your case and you actually have some reason why the array entries are not to be commonly stored in "date" order, then ( and only really then ) should you resort to using .aggregate() in order to the the results:
Data.aggregate(
[
{ "$match": { "hid": "testhid" } },
{ "$unwind": "$sensors" },
{ "$sort": { "_id": 1, "sensors.date": -1 } },
{ "$group": {
"_id": "$_id",
"sensors": { "$first": "$sensors" },
"actuators": { "$first": "$actuators" },
"status": { "$first": "$status" },
"updated": { "$first": "$updated" },
"created": { "$first": "$created" }
}},
{ "$unwind": "$actuators" },
{ "$sort": { "_id": 1, "actuators.date": -1 } },
{ "$group": {
"_id": "$_id",
"sensors": { "$first": "$sensors" },
"actuators": { "$first": "$actuators" },
"status": { "$first": "$status" },
"updated": { "$first": "$updated" },
"created": { "$first": "$created" }
}},
{ "$unwind": "$status" },
{ "$sort": { "_id": 1, "status.date": -1 } },
{ "$group": {
"_id": "$_id",
"sensors": { "$first": "$sensors" },
"actuators": { "$first": "$actuators" },
"status": { "$first": "$status" },
"updated": { "$first": "$updated" },
"created": { "$first": "$created" }
}}
],
function(err,data) {
}
)
The reality there is that MongoDB has no way to "inline sort" array content in a return from any query or aggregation pipeline statement. You can only really do this by processing with $unwind then using $sort and finally a $group using $first to effectively get the single item from the sorted array.
This you need to do "per" array, since the process of $unwind creates separate documents for each array item. You "could" do it all in one go, like this:
Data.aggregate(
[
{ "$match": { "hid": "testhid" } },
{ "$unwind": "$sensors" },
{ "$unwind": "$actuators" },
{ "$unwind": "$status" }
{ "$sort": {
"_id": 1,
"sensors.date": -1,
"actuators.date": -1,
"actuators.status": -1
}},
{ "$group": {
"_id": "$_id",
"sensors": { "$first": "$sensors" },
"actuators": { "$first": "$actuators" },
"status": { "$first": "$status" },
"updated": { "$first": "$updated" },
"created": { "$first": "$created" }
}}
],
function(err,data) {
}
)
But all things considered, it's really not much of an improvement over the other process.
The real lesson here is to "keep the arrays sorted"; then taking the last item with $slice is a very simple operation.
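As a sketch of what that looks like on a routine update (the newReading document is hypothetical), the $sort modifier keeps the array ordered by date on every push, so the $slice: -1 projection shown earlier always returns the latest entry:
Data.update(
    { "hid": "testhid" },
    {
        "$push": {
            "sensors": {
                "$each": [newReading],   // the new sensor entry being appended
                "$sort": { "date": 1 }   // keep the array ordered, latest last
            }
        }
    },
    function(err,num) {
        // the array stays sorted, so reading the last element is always safe
    }
)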
Assuming I have a schema that looks something like this:
{
field: [{
subDoc: ObjectId,
...
}],
...
}
and I have some list of ObjectIds (user input), how would I get a count of those specific ObjectIds? For example, if I have data like this:
[
{field: [ {subDoc: 123}, {subDoc: 234} ]},
{field: [ {subDoc: 234}, {subDoc: 345} ]},
{field: [ {subDoc: 123}, {subDoc: 345}, {subDoc: 456} ]}
]
and the list of ids given by the user is 123, 234, 345, I need to get a count for each of the given ids, so a result approximating this:
{
123: 2,
234: 2,
345: 2
}
What would be the best way to go about this?
The aggregation framework itself is not going to dynamically name keys the way you have presented in your proposed output, and that is probably a good thing really. But you can probably just do a query like this:
db.collection.aggregate([
// Match documents that contain the elements
{ "$match": {
"field.subDoc": { "$in": [123,234,345] }
}},
// De-normalize the array field content
{ "$unwind": "$field" },
// Match just the elements you want
{ "$match": {
"field.subDoc": { "$in": [123,234,345] }
}},
// Count by the element as a key
{ "$group": {
"_id": "$field.subDoc",
"count": { "$sum": 1 }
}}
])
That gives you output like this:
{ "_id" : 345, "count" : 2 }
{ "_id" : 234, "count" : 2 }
{ "_id" : 123, "count" : 2 }
But if you really want to go nuts on this, you are specifying the "keys" that you want as part of your query, so you could form a pipeline like this:
db.collection.aggregate([
{ "$match": {
"field.subDoc": { "$in": [123,234,345] }
}},
{ "$unwind": "$field" },
{ "$match": {
"field.subDoc": { "$in": [123,234,345] }
}},
{ "$group": {
"_id": "$field.subDoc",
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": null,
"123": {
"$max": {
"$cond": [
{ "$eq": [ "$_id", 123 ] },
"$count",
0
]
}
},
"234": {
"$max": {
"$cond": [
{ "$eq": [ "$_id", 234 ] },
"$count",
0
]
}
},
"345": {
"$max": {
"$cond": [
{ "$eq": [ "$_id", 345 ] },
"$count",
0
]
}
}
}}
])
It is relatively simple to construct that last stage in code by just processing the list of arguments:
var list = [123,234,345];
var group2 = { "$group": { "_id": null } };
list.forEach(function(id) {
group2["$group"][id] = {
"$max": {
"$cond": [
{ "$eq": [ "$_id", id ] },
"$count",
0
]
}
};
});
And that comes out more or less how you want it.
{
"_id" : null,
"123" : 2,
"234" : 2,
"345" : 2
}
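Putting that together, a sketch of assembling the whole pipeline dynamically from the list (variable names follow the snippet above; db.collection is a placeholder):
var list = [123, 234, 345];

// Build the final keyed $group stage from the list of ids.
var group2 = { "$group": { "_id": null } };
list.forEach(function(id) {
    group2["$group"][id] = {
        "$max": {
            "$cond": [
                { "$eq": [ "$_id", id ] },
                "$count",
                0
            ]
        }
    };
});

// Prepend the fixed stages and run the aggregation.
var pipeline = [
    { "$match": { "field.subDoc": { "$in": list } } },
    { "$unwind": "$field" },
    { "$match": { "field.subDoc": { "$in": list } } },
    { "$group": { "_id": "$field.subDoc", "count": { "$sum": 1 } } },
    group2
];

db.collection.aggregate(pipeline);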
Not exactly what you're asking for but it can give you an idea:
db.test.aggregate([
{
$unwind: '$field'
},
{
$group: {
_id: {
subDoc: '$field.subDoc'
},
count: {
$sum: 1
}
}
},
{
$project: {
subDoc: '$subDoc.subDoc',
count: '$count'
}
}
]);
Output:
{
"result": [
{
"_id": {
"subDoc": 456
},
"count": 1
},
{
"_id": {
"subDoc": 345
},
"count": 2
},
{
"_id": {
"subDoc": 234
},
"count": 2
},
{
"_id": {
"subDoc": 123
},
"count": 2
}
],
"ok": 1
}