I want to return a group only if its data array contains exactly 2 entries (one per by). The number of distinct _id values is unlimited.
However, adding { $size: { data: 2 } } does not work: I get the error "$size is not allowed in this atlas tier".
Expected return:
[
{
"_id": "Something1?",
"data": [
{
"by": "user1",
},
{
"by": "user2",
}
]
},
]
I want something like $size in the pipeline; otherwise a group is returned even when it has 0, 1, or 3 by entries. I only want groups with exactly 2.
What should I do? Here is the full code without $size:
let x = await Answer.aggregate([
{
$match: {
$and: [
{
by: {
$in: [user.email, user2[0].email],
},
},
],
},
},
{
$group: {
_id: "$question",
data: {
$push: "$$ROOT",
},
},
},
{
$project: {
"data._id": 0,
"data.question": 0,
"data.__v": 0,
},
},
{ $sort: { "data.date": -1 } },
]);
It looks like your Atlas tier doesn't support $size.
Instead, you can add a count field that is incremented by 1 for every document in the group, and then match on it:
db.collection.aggregate([
{
$group: {
_id: "$question",
data: {
$push: "$$ROOT",
},
count: {
$sum: 1
}
}
},
{
$match: {
count: 2
}
}
])
Try this in playground
Update
Finally, your aggregation should look like this:
[
{
$match: {
$and: [
{
by: {
$in: [user.email, user2[0].email],
},
},
],
},
},
{
$group: {
_id: "$question",
data: {
$push: "$$ROOT",
},
count: {
$sum: 1
}
},
},
{
$match: {
count: 2
}
},
{
$project: {
"data._id": 0,
"data.question": 0,
"data.__v": 0,
"count": 0
},
},
{ $sort: { "data.date": -1 } },
]
You can learn more about $sum here.
Presuming your model is called Employee, this matches documents whose social_account array has at most two elements:
Employee.find({ "social_account.2": { "$exists": false } }, function(err, docs) {
})
$exists checks whether index 2 of the array (its third element) is present, so "$exists": false means the array has fewer than three elements.
The same idea gives a minimum length. This matches arrays with at least ten elements:
Employee.find({ "social_account.9": { "$exists": true } }, function(err, docs) {
})
For your case, I think this should be your answer:
Employee.find({ "data.2": { "$exists": false } }, function(err, docs) {
})
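Checking only that index 2 is missing also lets through arrays with zero or one element. A minimal sketch of a filter for exactly n elements without $size, assuming the array field is called data; the helper name exactLength is made up for illustration and is not part of MongoDB or Mongoose:

```javascript
// Hypothetical helper: builds a filter that matches arrays with exactly
// n elements using only $exists on positional paths (no $size needed).
function exactLength(field, n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("n must be a non-negative integer");
  }
  // Index n must be absent: the array has at most n elements.
  const filter = { [`${field}.${n}`]: { $exists: false } };
  // Index n - 1 must be present: the array has at least n elements.
  if (n > 0) {
    filter[`${field}.${n - 1}`] = { $exists: true };
  }
  return filter;
}

console.log(JSON.stringify(exactLength("data", 2)));
// {"data.2":{"$exists":false},"data.1":{"$exists":true}}
```

The returned object can be passed to find() or used as the body of a $match stage placed after the $group. For n = 0 the filter also matches documents where the field is missing altogether.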
I have a collection named Vote that looks like the following:
{
postId: "1",
comment:{
text_sentiment: "positive",
topic: "A"
}
}, // DOC-1
{
postId: "2",
comment:{
text_sentiment: "negative",
topic: "A"
}
}, // DOC-2
{
postId: "3",
comment:{
text_sentiment: "positive",
topic: "B"
}
},..//DOC-3 ..
I want to do an aggregation on this collection such that it returns the following structure.
[
{
_id: "hash",
topic: "A",
topicOccurance: 2,
sentiment: {
positive: 1,
negative: 1,
neutral: 0
},
postIds: [1,2]
},
..
]
I created the following aggregation:
db.Vote.aggregate([
{
$match: {
surveyId: "e6d38e1ecd",
"comment.topic": {
$exists: 1
},
}
},
{
$group: {
_id: {
topic: "$comment.topic",
text_sentiment: "$comment.text_sentiment"
},
total: {
$sum: 1
},
}
},
{
$group: {
_id: "$_id.topic",
total: {
$sum: "$total"
},
text_sentiments: {
$push: {
k: "$_id.text_sentiment",
v: "$total"
}
}
}
},
{
$project: {
topic: "$_id",
topicOccurance: "$total",
sentiment: {
"$arrayToObject": "$text_sentiments"
}
}
},
{
$sort: {
"topicOccurance": -1
}
}
])
This works fine, but I do not know how to also get a postIds array in the response. Each document in the Vote collection has a postId, and I want to collect the postIds of all documents with the same topic into an array. How can I do this?
2nd stage ($group) - Add each postId to a postIds array via $push.
3rd stage ($group) - Push each postIds array into a new postIds array via $push. This makes postIds a nested array:
[[1,2], ...]
4th stage ($project) - For the postIds field, use the $reduce operator to flatten the nested array with $concatArrays. Update: wrap the result in $setUnion to keep only distinct items.
db.collection.aggregate([
// match stage
{
$group: {
_id: {
topic: "$comment.topic",
text_sentiment: "$comment.text_sentiment"
},
total: {
$sum: 1
},
postIds: {
$push: "$postId"
}
}
},
{
$group: {
_id: "$_id.topic",
total: {
$sum: "$total"
},
text_sentiments: {
$push: {
k: "$_id.text_sentiment",
v: "$total"
}
},
postIds: {
"$push": "$postIds"
}
}
},
{
$project: {
topic: "$_id",
topicOccurance: "$total",
sentiment: {
"$arrayToObject": "$text_sentiments"
},
postIds: {
$setUnion: [
{
$reduce: {
input: "$postIds",
initialValue: [],
in: {
$concatArrays: [
"$$value",
"$$this"
]
}
}
}
]
}
}
},
// sort stage
])
Sample Mongo Playground
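To picture what the last stage does to the nested postIds array, here is a plain JavaScript sketch of the same two steps. It only illustrates the logic and does not talk to MongoDB; the function name flattenUnique is made up for this example:

```javascript
// Plain-JavaScript picture of $reduce + $concatArrays followed by $setUnion.
function flattenUnique(nested) {
  // $reduce with $concatArrays: start from [] and append each inner array.
  const flat = nested.reduce((acc, inner) => acc.concat(inner), []);
  // $setUnion with a single array argument: drop duplicate values.
  return [...new Set(flat)];
}

// Two sentiment groups of the same topic that share postId "2".
console.log(flattenUnique([["1", "2"], ["2", "3"]]));
```

One difference to keep in mind: the Set here keeps first-seen order, while $setUnion makes no promise about the order of its output.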
Task 1:
I have a MongoDB collection whose documents hold values from sequential ranges, as follows:
{x:1}
{x:2}
{x:3}
{x:5}
{x:6}
{x:7}
{x:8}
{x:20}
{x:21}
I need to extract a list of sequential ranges in the following form (the count is optional, but I need at least the first and last value of each range):
{x:[1,3] , count:3}
{x:[5,8], count:4}
{x:[20,21],count:2}
or
{ min:1 , max:3 , count:3}
{ min:5 , max:8 , count:4}
{ min:20 , max:21 , count:2}
Please advise a suitable solution. The collection has ~100M documents; some values have 10 digits and others 15 digits, but within each range they increase sequentially.
Task 2:
The same as Task 1, but based on a custom sequence step,
for example if the sequence step is 3:
{y:1}
{y:3}
{y:5}
{y:20}
{y:22}
I need to produce:
{y:[1,5], count:3}
{y:[20,22], count:2}
Thanks!
P.S.
I partially succeeded in getting a picture of the ranges by grouping on the number of digits, but this is very coarse:
db.collection.aggregate([
{
$addFields: {
range: {
$strLenCP: {
$toString: "$x"
}
}
}
},
{
$group: {
_id: "$range",
minValue: {
$min: "$x"
},
maxValue: {
$max: "$x"
},
Count: {
$sum: 1
}
}
},
{
$addFields: {
x: [
{
$toString: "$minValue"
},
{
$toString: "$maxValue"
}
]
}
},
{
$project: {
range: "$_id",
"_id": 0,
x: 1,
Count: 1
}
},
{
$sort: {
range: 1
}
}
])
playground
Here is another way of querying - produces result with format [ { min: 1 , max: 3 , count: 3 }, ... ]:
db.collection.aggregate([
{
$sort: { x: 1 }
},
{
$group: {
_id: null,
docs: { $push: "$x" },
firstVal: { $first: "$x" },
lastVal: { $last: "$x" }
}
},
{
$project: {
_id: 0,
output: {
$let: {
vars: {
result: {
$reduce: {
input: "$docs",
initialValue: {
prev: { $add: [ "$firstVal", -1 ] },
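// max starts as $lastVal so that a collection forming one single range still reports the right max
val: { min: "$firstVal", max: "$lastVal", count: 0 },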
vals: [ ]
},
in: {
$cond: [
{ $eq: [ { $subtract: [ "$$this", "$$value.prev" ] }, 1 ] },
{
prev: "$$this",
val: {
min : "$$value.val.min",
max: "$$value.val.max",
count: { $add: [ "$$value.val.count", 1 ] }
},
vals: "$$value.vals"
},
{
vals: {
$concatArrays: [
"$$value.vals",
[ { min : "$$value.val.min", max: "$$value.prev", count: "$$value.val.count" } ]
]
},
val: { min: "$$this", max: "$lastVal", count: 1 },
prev: "$$this"
},
]
}
}
}
},
in: {
$concatArrays: [ "$$result.vals", [ "$$result.val" ] ]
}
}
}
}
},
])
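Use $setWindowFields (available from MongoDB 5.0) instead of pushing all the data into a single $group. The range window [ -3, 0 ] below implements Task 2 with a step of 3; use [ -1, 0 ] for Task 1: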
db.collection.aggregate([
{
$setWindowFields: {
partitionBy: "",
sortBy: { x: 1 },
output: {
c: {
$push: "$x",
window: {
range: [ -3, 0 ]
}
}
}
}
},
{
$set: {
"c": {
"$cond": {
"if": { "$gt": [ { "$size": "$c" }, 1 ] },
"then": 0,
"else": 1
}
}
}
},
{
$setWindowFields: {
partitionBy: "",
sortBy: { x: 1 },
output: {
g: {
$sum: "$c",
window: {
documents: [ "unbounded", "current" ]
}
}
}
}
},
{
$group: {
_id: "$g",
count: { $sum: 1 },
max: { "$max": "$x" },
min: { "$min": "$x" }
}
}
])
mongoplayground
In PostgreSQL
CREATE TABLE test (
id INT,
x INT
);
INSERT INTO test VALUES (1, 1);
INSERT INTO test VALUES (2, 3);
INSERT INTO test VALUES (3, 5);
INSERT INTO test VALUES (4, 20);
INSERT INTO test VALUES (5, 22);
SELECT
MAX(x) AS max, MIN(x) AS min, COUNT(*) AS count
FROM (
SELECT *, SUM(inc) OVER(ORDER BY x) AS grp
FROM (
SELECT *, CASE WHEN x - LAG(x) OVER(ORDER BY x) < 4 THEN NULL ELSE 1 END AS inc
FROM test
) q
) q
GROUP BY grp
db-fiddle
Using $reduce:
If I'm not mistaken, for Task 2 you only need to change the 1 in the three $gt comparisons (marked with the sequence step comment) to the sequence step you want.
playground
db.collection.aggregate([
{
"$sort": {
x: 1
}
},
{
$group: {
_id: null,
temp: {
$push: "$$ROOT"
}
}
},
{
"$project": {
_id: 0,
"temp_field": {
"$reduce": {
"input": "$temp",
"initialValue": {
"prev": -999999,
"min": -999999,
"count": 0,
"ranges": []
},
"in": {
"prev": "$$this.x",
"count": {
"$cond": [
{
$gt: [
{
"$subtract": [
"$$this.x",
"$$value.prev"
]
},
1//sequence step
],
},
1,
{
"$add": [
"$$value.count",
1
]
}
]
},
"min": {
"$cond": [
{
$gt: [
{
"$subtract": [
"$$this.x",
"$$value.prev"
]
},
1//sequence step
],
},
"$$this.x",
"$$value.min"
]
},
"ranges": {
"$concatArrays": [
"$$value.ranges",
{
"$cond": [
{
$gt: [
{
"$subtract": [
"$$this.x",
"$$value.prev"
]
},
1//sequence step
],
},
[
{
max: "$$value.prev",
min: "$$value.min",
count: "$$value.count"
}
],
[]
]
}
]
}
}
}
}
}
},
{
"$project": {
ranges: {
"$concatArrays": [
"$temp_field.ranges",
[
{
max: "$temp_field.prev",
min: "$temp_field.min",
count: "$temp_field.count"
}
]
]
}
}
}
])
Finally, remove the first element of the ranges array: it is a placeholder produced by the -999999 sentinel in initialValue.
Comment by R2D2: after testing in the real use case, I hit the memory limit even with allowDiskUse: true:
2022-02-14T09:38:27.575+0100 E QUERY [js] Error: command failed: {
"ok" : 0,
"errmsg" : "$push used too much memory and cannot spill to disk. Memory limit: 104857600 bytes",
"code" : 146,
"codeName" : "ExceededMemoryLimit",
I increased the limit to about 2 GB (the maximum allowed) with:
db.adminCommand({setParameter:1 , internalQueryMaxPushBytes: 2048576000 })
But I still hit the limit, so I decided to split the collection into smaller ones and finally got my results. Thank you once again!
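The $push-based pipelines above collect every value into one document, which is where the memory limit comes from. One way around it, sketched here as an assumption rather than a tested solution for the 100M-document collection, is to read the values in sorted order and build the ranges on the client with constant memory. The function name toRanges is made up for this example:

```javascript
// Groups an ascending sequence of numbers into ranges. Two neighbours belong
// to the same range when their difference is at most `step`.
// `sortedValues` can be any iterable, so only one range is held at a time.
function* toRanges(sortedValues, step = 1) {
  let current = null;
  for (const x of sortedValues) {
    if (current !== null && x - current.max <= step) {
      // Still inside the running range: extend it.
      current.max = x;
      current.count += 1;
    } else {
      // Gap larger than `step`: emit the finished range, start a new one.
      if (current !== null) yield current;
      current = { min: x, max: x, count: 1 };
    }
  }
  // Emit the last open range, if there was any input at all.
  if (current !== null) yield current;
}

// Task 1 (step 1) and Task 2 (step 3) on the sample data from the question.
console.log([...toRanges([1, 2, 3, 5, 6, 7, 8, 20, 21])]);
console.log([...toRanges([1, 3, 5, 20, 22], 3)]);
```

With the Node.js driver, a cursor such as collection.find().sort({ x: 1 }) is async-iterable, so the same loop could be written with for await inside an async generator. An index on x would let the server stream the sorted values instead of sorting them in memory.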
Is there a way to group by several fields separately in MongoDB?
These are the documents I want to query:
[ {
_id: '1615658138236',
englishName: 'samsung smart tv 50',
screen_resulation: '4K',
screen_size: '50' }, {
_id: '1615750981674',
englishName: 'lg tv 55 led uhd',
screen_resulation: 'UHD',
screen_size: '55' }, {
_id: '1615791834538',
englishName: 'samsung smart 55 inch crystal 4k',
screen_resulation: '4K',
screen_size: '55' } ]
For example, I have 2 fields whose names are not known in advance. I collect them like this:
for (let i = 0; i < result[0].filters.length; i++) {
const item = result[0].filters[i].key;
groupBy[item] = `$${item}`;
}
Then I query MongoDB to get the count for every field:
const products = await Product.aggregate([
{
$match: {
category,
},
},
{
$group: {
_id: groupBy,
count: {
$sum: 1,
},
},
},
{
$sort: { count: -1 },
},
]);
The result I get:
[
{ _id: { screen_size: '50', screen_resulation: '4K' }, count: 1 },
{ _id: { screen_size: '55', screen_resulation: 'UHD' }, count: 1 },
{ _id: { screen_size: '55', screen_resulation: '4K' }, count: 1 }
]
What I expect:
[
{ _id: { screen_size: '50' }, count: 1 },
{ _id: { screen_size: '55' }, count: 2 },
{ _id: { screen_resulation: '4K' }, count: 2 },
{ _id: { screen_resulation: 'UHD' }, count: 1 },
]
I find MongoDB great but very hard to use, and I don't know why.
You can use $facet for multiple aggregation pipelines.
db.collection.aggregate([
{
"$facet": {
"screen_size_count": [
{
"$group": {
"_id": "$screen_size",
"count": {
$sum: 1
}
}
}
],
"screen_resulation_count": [
{
"$group": {
"_id": "$screen_resulation",
"count": {
$sum: 1
}
}
}
]
}
}
])
Mongo Playground: https://mongoplayground.net/p/cnEY8NV4HNs
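Since the field names are not known in advance in the question, the $facet stage can be built from the list of keys, and its single result document can be reshaped into the flat list the question expects. A minimal sketch; buildFacetStage and flattenFacetResult are hypothetical helper names, and the sample facet output is hand-written from the three products in the question:

```javascript
// Hypothetical helper: one $facet branch per key, each grouping on that field.
function buildFacetStage(keys) {
  const facet = {};
  for (const key of keys) {
    facet[key] = [
      { $group: { _id: `$${key}`, count: { $sum: 1 } } },
      { $sort: { count: -1 } },
    ];
  }
  return { $facet: facet };
}

// Reshapes the single $facet result document into
// [{ _id: { <key>: <value> }, count }, ...].
function flattenFacetResult(facetDoc) {
  return Object.entries(facetDoc).flatMap(([key, buckets]) =>
    buckets.map(({ _id, count }) => ({ _id: { [key]: _id }, count }))
  );
}

// Hand-written sample of what $facet would return for the question's data.
const sampleFacetDoc = {
  screen_size: [{ _id: "55", count: 2 }, { _id: "50", count: 1 }],
  screen_resulation: [{ _id: "4K", count: 2 }, { _id: "UHD", count: 1 }],
};
console.log(flattenFacetResult(sampleFacetDoc));
```

In the question's code this would be used roughly as Product.aggregate([{ $match: { category } }, buildFacetStage(keys)]), taking the first element of the returned array as the facet document.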
I have ten stations stored in the stations collection: Station A, Station B, Station C, Station D, Station E, Station F, Station G, Station H, Station I, Station J.
Right now, to create a count list of all inter-station rides between all possible pairs of stations, I do the following in my Node.js code (using Mongoose):
const stationCombinations = []
// get all stations from the stations collection
const stationIds = await Station.find({}, '_id name').lean().exec()
// list of all possible from & to combinations with their names
stationIds.forEach(fromStation => {
stationIds.forEach(toStation => {
stationCombinations.push({ fromStation, toStation })
})
})
const results = []
// loop through all station combinations
for (const stationCombination of stationCombinations) {
// create aggregation query promise
const data = Ride.aggregate([
{
$match: {
test: false,
state: 'completed',
duration: { $gt: 2 },
fromStation: mongoose.Types.ObjectId(stationCombination.fromStation._id),
toStation: mongoose.Types.ObjectId(stationCombination.toStation._id)
}
},
{
$group: {
_id: null,
count: { $sum: 1 }
}
},
{
$addFields: {
fromStation: stationCombination.fromStation.name,
toStation: stationCombination.toStation.name
}
}
])
// push promise to array
results.push(data)
}
// run all aggregation queries
const stationData = await Promise.all(results)
// flatten nested/empty arrays and return
return stationData.flat()
Executing this function gives me the result in this format:
[
{
"fromStation": "Station A",
"toStation": "Station A",
"count": 1196
},
{
"fromStation": "Station A",
"toStation": "Station B",
"count": 1
},
{
"fromStation": "Station A",
"toStation": "Station C",
"count": 173
},
]
And so on for all other combinations...
The query currently takes a lot of time to execute and I keep getting alerts from MongoDB Atlas about excessive load on the database server because of these queries. Surely there must be an optimized way to do something like this?
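Let MongoDB do the work in a single aggregation instead of one query per station pair: $group by fromStation and toStation, then use $lookup to join the station names from the other collection.
Note: I assume you have MongoDB >= v3.6 and that Station._id is an ObjectId. Set from in $lookup to the actual name of your stations collection.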
db.ride.aggregate([
{
$match: {
test: false,
state: "completed",
duration: {
$gt: 2
}
}
},
{
$group: {
_id: {
fromStation: "$fromStation",
toStation: "$toStation"
},
count: {
$sum: 1
}
}
},
{
$lookup: {
from: "station",
let: {
fromStation: "$_id.fromStation",
toStation: "$_id.toStation"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$_id",
[
"$$fromStation",
"$$toStation"
]
]
}
}
}
],
as: "tmp"
}
},
{
$project: {
_id: 0,
fromStation: {
$reduce: {
input: "$tmp",
initialValue: "",
in: {
$cond: [
{
$eq: [
"$_id.fromStation",
"$$this._id"
]
},
"$$this.name",
"$$value"
]
}
}
},
toStation: {
$reduce: {
input: "$tmp",
initialValue: "",
in: {
$cond: [
{
$eq: [
"$_id.toStation",
"$$this._id"
]
},
"$$this.name",
"$$value"
]
}
}
},
count: 1
}
},
{
$sort: {
fromStation: 1,
toStation: 1
}
}
])
MongoPlayground
Not tested:
const data = await Ride.aggregate([
{
$match: {
test: false,
state: 'completed',
duration: { $gt: 2 }
}
},
{
$group: {
_id: {
fromStation: "$fromStation",
toStation: "$toStation"
},
count: { $sum: 1 }
}
},
{
$lookup: {
from: "station",
let: {
fromStation: "$_id.fromStation",
toStation: "$_id.toStation"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$_id",
[
"$$fromStation",
"$$toStation"
]
]
}
}
}
],
as: "tmp"
}
},
{
$project: {
_id: 0,
fromStation: {
$reduce: {
input: "$tmp",
initialValue: "",
in: {
$cond: [
{
$eq: [
"$_id.fromStation",
"$$this._id"
]
},
"$$this.name",
"$$value"
]
}
}
},
toStation: {
$reduce: {
input: "$tmp",
initialValue: "",
in: {
$cond: [
{
$eq: [
"$_id.toStation",
"$$this._id"
]
},
"$$this.name",
"$$value"
]
}
}
},
count: 1
}
},
{
$sort: {
fromStation: 1,
toStation: 1
}
}
])
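With only ten stations, the name lookup could also be done in application code: run just the $match and $group stages in MongoDB, then map the ids to names in Node.js. This is a sketch under that assumption, not something the answers above propose; attachStationNames is a made-up helper and the ids in the demo are placeholder strings rather than real ObjectIds:

```javascript
// Replaces station ids in grouped counts with station names.
// `groups`   : documents shaped like the $group output, { _id: { fromStation, toStation }, count }
// `stations` : documents shaped like { _id, name }
function attachStationNames(groups, stations) {
  // String() turns ObjectId values into plain strings usable as Map keys.
  const nameById = new Map(stations.map((s) => [String(s._id), s.name]));
  // Fall back to the raw id if a station document is missing.
  const nameOf = (id) => nameById.get(String(id)) ?? String(id);
  return groups
    .map(({ _id, count }) => ({
      fromStation: nameOf(_id.fromStation),
      toStation: nameOf(_id.toStation),
      count,
    }))
    .sort(
      (a, b) =>
        a.fromStation.localeCompare(b.fromStation) ||
        a.toStation.localeCompare(b.toStation)
    );
}

// Demo with placeholder ids.
const demoStations = [
  { _id: "id-a", name: "Station A" },
  { _id: "id-b", name: "Station B" },
];
const demoGroups = [
  { _id: { fromStation: "id-b", toStation: "id-a" }, count: 7 },
  { _id: { fromStation: "id-a", toStation: "id-b" }, count: 1 },
];
console.log(attachStationNames(demoGroups, demoStations));
```

As with the $group pipeline itself, station pairs without any matching ride simply do not appear in the result.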
I have an aggregation query that returns the sum / total number of reviews submitted for a given location ( not the average star rating ). Reviews are scored 1 - 5 stars. This particular query groups these reviews into two categories, "internal" and "google".
I have a query that returns results that are almost what I'm looking for. However, I need to add an additional condition for internal reviews. I want to ensure that the internal reviews "stars" value exists / is not null and contains a value of at least 1. So, I was thinking adding something similar to this would work:
{ "stars": {$gte: 1} }
This is the current aggregation query:
[
{
$match: { createdAt: { $gte: fromDate, $lte: toDate } }
},
{
$lookup: {
from: 'branches',
localField: 'branch',
foreignField: '_id',
as: 'branch'
}
},
{ $unwind: '$branch' },
{
$match: { 'branch.org_id': branchId }
},
{
$group: {
_id: '$branch.name',
google: {
$sum: {
$cond: [{ $eq: ['$source', 'Google'] }, 1, 0]
}
},
internal: {
$sum: {
$cond: [ { $eq: ['$internal', true]}, 1, 0 ],
},
}
}
}
]
Truncated Schema:
{
branchId: { type: String, required: true },
branch: { type: Schema.Types.ObjectId, ref: 'branches' },
wouldRecommend: { type: String, default: '' }, // RECOMMENDATION ONLY
stars: { type: Number, default: 0 }, // IF 1 - 5 DOCUMENT IS A REVIEW
comment: { type: String, default: '' },
internal: { type: Boolean, default: true },
source: { type: String, required: true },
},
{ timestamps: true }
I need to make sure that I'm not counting "wouldRecommend" recommendations in the sum of the internal reviews. To determine whether something is a review, check its star rating: a review has 1 or more stars, while a recommendation has a star value of 0.
How can I add the condition that ensures the internal "$stars" value is >= 1 ( greater than or equal to 1 ) ?
Using Ashh's answer I was able to form this query:
[
{
$lookup: {
from: 'branches',
localField: 'branch',
foreignField: '_id',
as: 'branch'
}
},
{ $unwind: '$branch' },
{
$match: {
'branch.org_id': branchId
}
},
{
$group: {
_id: '$branch.name',
google: {
$sum: {
$cond: [{ $eq: ['$source', 'Google'] }, 1, 0]
}
},
internal: {
$sum: {
$cond: [
{
$and: [{ $gte: ['$stars', 1] }, { $eq: ['$internal', true] }]
},
1,
0
]
}
}
}
}
];
You can use $and with the $cond operator
{ "$group": {
"_id": "$branch.name",
"google": { "$sum": { "$cond": [{ "$eq": ["$source", "Google"] }, 1, 0] }},
"internal": { "$sum": { "$cond": [{ "$eq": ["$internal", true] }, 1, 0 ] }},
"rating": {
"$sum": {
"$cond": [
{
"$and": [
{ "$gte": ["$stars", 1] },
{ "$eq": ["$internal", true] }
]
},
1,
0
],
}
}
}}
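To see why the combined condition leaves out recommendations, here is an in-memory JavaScript equivalent of the two conditional sums for the documents of a single branch. It is only an illustration of the $cond logic; the function name countReviews and the sample documents are made up:

```javascript
// In-memory equivalent of the google / internal sums in the $group stage.
function countReviews(docs) {
  const totals = { google: 0, internal: 0 };
  for (const doc of docs) {
    // $cond: [{ $eq: ["$source", "Google"] }, 1, 0]
    if (doc.source === "Google") totals.google += 1;
    // $cond: [{ $and: [{ $gte: ["$stars", 1] }, { $eq: ["$internal", true] }] }, 1, 0]
    if (doc.stars >= 1 && doc.internal === true) totals.internal += 1;
  }
  return totals;
}

// Made-up sample: one Google review, one internal review,
// and one internal recommendation (stars: 0) that must not be counted.
const sampleDocs = [
  { source: "Google", internal: false, stars: 4 },
  { source: "App", internal: true, stars: 5 },
  { source: "App", internal: true, stars: 0 },
];
console.log(JSON.stringify(countReviews(sampleDocs)));
// {"google":1,"internal":1}
```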