Hello, I have a MongoDB collection whose documents follow this schema:
{
  _id: "any",
  ids: {
    user: "some value that can repeat" // there are more keys, but I will only use this one here
  },
  time: 400 // can vary
}
I need to get some documents from this collection, filtered to time less than 700, without repeating the key ids.user.
I tried to use JS tools for this, but a single find query returns 900+ documents:
const ids = [];
const query = (await Personagens.find({ time: { $lt: 700 } }).sort({ time: 1 }))
  .filter(x => {
    // keep only the first (lowest-time) document seen for each user
    if (!ids.includes(x.ids.user)) {
      ids.push(x.ids.user);
      return true;
    }
    return false;
  })
  .slice(0, 50);
This works, but the find alone returns the 900+ documents before the in-memory filter runs.
So I want to know if MongoDB has an operator to filter out repeated keys (the key ids.user) and return only 50 documents (note: I use Mongoose).
db.collection.aggregate([
  {
    "$match": {
      "time": {
        "$lt": 700
      }
    }
  },
  {
    "$sort": {
      "time": 1
    }
  },
  {
    "$group": {
      "_id": "$ids.user",
      "doc": {
        "$push": "$$ROOT"
      }
    }
  },
  {
    "$set": {
      "doc": {
        "$first": "$doc"
      }
    }
  },
  {
    "$replaceWith": "$doc"
  },
  {
    "$limit": 50
  }
])
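Since you use Mongoose, the same pipeline can run through the model's .aggregate(). Here is a sketch (assuming your model is Personagens, as in your snippet) that uses the $first accumulator directly as a shortcut for the $push / $set / $replaceWith sequence above:

// Keep the lowest-time document per ids.user, and return at most 50 of them
const docs = await Personagens.aggregate([
  { $match: { time: { $lt: 700 } } },
  { $sort: { time: 1 } }, // so $first picks the lowest-time document per user
  { $group: { _id: "$ids.user", doc: { $first: "$$ROOT" } } },
  { $replaceWith: "$doc" }, // requires MongoDB 4.2+; use $replaceRoot on older servers
  { $limit: 50 }
]);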
I am really new to MongoDB and NoSQL databases.
I have this user schema:
const postSchema = {
title: String,
posted_on: Date
}
const userSchema = {
name: String,
posts: [postSchema]
}
I want to retrieve the posts by a user in a given range (/api/users/:userId/posts?from=date&to=date&limit=limit) using a MongoDB query. In a relational database, we would generally create two separate tables and query the second table (posts) with some condition to get the required result.
How can we achieve the same in MongoDB? I have tried using $elemMatch by referring to this, but it doesn't seem to work.
There are two ways to do it with the aggregation framework, which can do much more than a find can.
With find we mostly select documents from a collection, or project to keep some fields of a selected document; here you need only some members of an array, so aggregation is used.
Local way (solution at document level), no unwind etc.
Query
filter the array and keep only posts with posted_on > 1 and < 5
(I used numbers for simplicity; with dates it's the same)
take the first 2 elements of the array (limit 2)
db.collection.aggregate([
{
"$match": {
"name": {
"$eq": "n1"
}
}
},
{
"$set": {
"posts": {
"$slice": [
{
"$filter": {
"input": "$posts",
"cond": {
"$and": [
{
"$gt": [
"$$this.posted_on",
1
]
},
{
"$lt": [
"$$this.posted_on",
5
]
}
]
}
}
},
2
]
}
}
}
])
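For illustration, on a hypothetical document
{ "name": "n1", "posts": [ { "posted_on": 1 }, { "posted_on": 2 }, { "posted_on": 3 }, { "posted_on": 4 } ] }
the $filter keeps the posts with posted_on in the open interval (1, 5), giving [ { "posted_on": 2 }, { "posted_on": 3 }, { "posted_on": 4 } ], and the $slice then keeps the first 2:
{ "name": "n1", "posts": [ { "posted_on": 2 }, { "posted_on": 3 } ] }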
Unwind solution (solution at collection level)
(it's a bit smaller, but keeping things local is better; in your case it doesn't matter)
Query
match the user
unwind the array, and make each member the new root
match the dates > 1 and < 5
limit 2
db.collection.aggregate([
{
"$match": {
"name": {
"$eq": "n1"
}
}
},
{
"$unwind": {
"path": "$posts"
}
},
{
"$replaceRoot": {
"newRoot": "$posts"
}
},
{
"$match": {
"$and": [
{
"posted_on": {
"$gt": 1
}
},
{
"posted_on": {
"$lt": 5
}
}
]
}
},
{
"$limit": 2
}
])
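To wire either pipeline into the route itself, here is a sketch, assuming an Express app and a Mongoose model named User (both names are illustrative, adjust to your setup):

// assumes: const mongoose = require("mongoose"); const app = require("express")();
app.get("/api/users/:userId/posts", async (req, res) => {
  const from = new Date(req.query.from);
  const to = new Date(req.query.to);
  const limit = parseInt(req.query.limit, 10) || 10; // default of 10 is an assumption

  const result = await User.aggregate([
    { "$match": { "_id": new mongoose.Types.ObjectId(req.params.userId) } },
    { "$set": {
      "posts": {
        "$slice": [
          { "$filter": {
            "input": "$posts",
            "cond": { "$and": [
              { "$gt": [ "$$this.posted_on", from ] },
              { "$lt": [ "$$this.posted_on", to ] }
            ]}
          }},
          limit
        ]
      }
    }}
  ]);

  res.json(result.length ? result[0].posts : []);
});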
Sorry, I don't understand MongoDB aggregation well.
How can I achieve this with an aggregation:
[
{array: [1,2,3] },
{array: [4,5,6] },
{array: [7,8,9] }
]
desired result:
[1,2,3,4,5,6,7,8,9]
Does the performance change if, instead of using MongoDB aggregation, I process the documents as normal objects?
Aggregation is generally a better option than doing this in application code, and it is exactly why the database provides it: you get the result in one go.
db.collection.aggregate([
{ "$group": {
"_id": null,
"data": { "$push": "$array" }
}},
{ "$project": {
"_id": 0,
"data": {
"$reduce": {
"input": "$data",
"initialValue": [],
"in": { "$concatArrays": ["$$this", "$$value"] }
}
}
}}
])
The only thing you have to take care of here is that the result is returned as a single document, so it must not exceed the 16MB BSON document size limit. You can learn more about that limit in the MongoDB documentation.
You can $group by null to get an array of arrays as a single document and then you can run $reduce with $concatArrays to flatten that array:
db.col.aggregate([
{
$group: {
_id: null,
array: { $push: "$array" }
}
},
{
$project: {
_id: 0,
array: {
$reduce: {
input: "$array",
initialValue: [],
in: { $concatArrays: [ "$$value", "$$this" ] }
}
}
}
}
])
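Note that both pipelines return one wrapper document rather than a bare array, so in driver code you unwrap it. A sketch, assuming the Node.js driver and a collection named col:

const [doc] = await db.collection("col").aggregate([
  { $group: { _id: null, array: { $push: "$array" } } },
  { $project: {
    _id: 0,
    array: {
      $reduce: {
        input: "$array",
        initialValue: [],
        in: { $concatArrays: [ "$$value", "$$this" ] }
      }
    }
  }}
]).toArray();

const flat = doc ? doc.array : []; // [1,2,3,4,5,6,7,8,9]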
I have a MongoDB collection with documents like the ones below. I want to count, cumulatively over all documents, how many subdocuments of the events field are not null.
{
name: "name1",
events: {
created: {
timestamp: 1512477520951
},
edited: {
timestamp: 1512638551022
},
deleted: null
}
}
{
name: "name2",
events: {
created: {
timestamp: 1512649915779
},
edited: null,
deleted: null
}
}
So the result of the query on these two documents should be 3, because there are 3 events that are not null in the collection. I cannot change the format of the document to make the events field an array.
You want $objectToArray from MongoDB 3.4.7 or greater in order to do this as an aggregation statement:
db.collection.aggregate([
{ "$group": {
"_id": null,
"total": {
"$sum": {
"$size": {
"$filter": {
"input": {
"$objectToArray": "$events"
},
"cond": { "$ne": [ "$$this.v", null ] }
}
}
}
}
}}
])
That part is needed to look at the "events" object and translate each of the "key/value" pairs into array entries. In this way you can apply the $filter operation in order to remove the null "values" ( the "v" property ) and then use $size in order to count the matching list.
All of that is done under a $group pipeline stage using the $sum accumulator
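To make the translation concrete, this is what $objectToArray produces for the "events" object of the first sample document:

[
  { "k": "created", "v": { "timestamp": 1512477520951 } },
  { "k": "edited", "v": { "timestamp": 1512638551022 } },
  { "k": "deleted", "v": null }
]

The $filter drops the entry whose "v" is null, $size counts the remaining 2, and together with the 1 from the second document the $sum yields { "_id": null, "total": 3 }.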
Or if you don't have a supporting version, you need mapReduce and JavaScript execution in order to do the same "object to array" operation:
db.collection.mapReduce(
function() {
emit(null,
Object.keys(this.events).filter(k => this.events[k] != null).length);
},
function(key,values) {
return Array.sum(values);
},
{ out: { inline: 1 } }
)
That uses the same basic process by obtaining the object keys as an array and rejecting those where the value is found to be null, then obtaining the length of the resulting array.
Because of the JavaScript evaluation, this is much slower than the aggregation framework counterpart. But it's really a question of what server version you have available to support what you need.
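For reference, with { out: { inline: 1 } } the mapReduce result comes back in the command response rather than in a collection; on the two sample documents it would look roughly like this (metadata abridged):

{
  "results": [ { "_id": null, "value": 3 } ],
  "ok": 1
}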
I have a mapReduce function I want to write in MongoDB to count how many times each character has been played with. The relevant part of my JSON looks like this:
"playerInfo": {
"Player 1": {
"info":{
"characterId":17
}
},
"Player 2": {
"info":{
"characterId":20
}
}
}
I want to count how many times every "characterId" occurs across my documents; there are 10 players, from Player 1 to Player 10.
Two questions:
1. How do I use mapReduce in Mongo when I have a number as part of my key?
2. How do I concatenate strings in mapReduce so that the code shown below can be correct?
db.LoL.mapReduce( function()
{
for (var i in this.playerInfo)
{
emit(this.playerInfo.'Player '+(i).info.characterId, 1);
}
},
function(keys, values) {
return Array.sum(values)
}, {out: { merge: "map_reduce_example5" } } )
Thank you very much for your answers!
So there are really a couple of things wrong with the structure here, and you really "should" change it.
The mapReduce is pretty simple, since you can just iterate the key names via Object.keys():
db.LoL.mapReduce(
  function() {
    var doc = this; // capture the document: `this` is not the document inside the callback
    Object.keys(doc.playerInfo).forEach(function(key) {
      emit({ "player": key, "characterId": doc.playerInfo[key].info.characterId }, 1)
    })
  },
  function(key, values) { return Array.sum(values) },
  {
    "query": { "playerInfo": { "$exists": true } },
    "out": { "inline": 1 }
  }
)
If you instead change the data format to use an array, with properties and values instead of named keys:
{
"playerInfo": [
{ "player": "Player 1", "characterId": 17 },
{ "player": "Player 2", "characterId": 20 }
]
}
Then the .aggregate() method is much faster in processing this, and returns a cursor for large result sets:
db.collection.aggregate([
{ "$unwind": "$playerInfo" },
{ "$group": {
"_id": "$playerInfo",
"count": { "$sum": 1 }
}}
])
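On the restructured sample document, this produces one group per distinct array element, for example:

{ "_id": { "player": "Player 1", "characterId": 17 }, "count": 1 }
{ "_id": { "player": "Player 2", "characterId": 20 }, "count": 1 }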
With MongoDB 3.4 and greater you can even use $objectToArray on your present structure:
db.LoL.aggregate([
{ "$project": {
"playerInfo": { "$objectToArray": "$playerInfo" }
}},
{ "$unwind": "$playerInfo" },
{ "$group": {
"_id": {
"player": "$playerInfo.k",
"characterId": "$playerInfo.v.info.characterId"
},
"count": { "$sum": 1 }
}}
])
Which is basically the same as the mapReduce, only a lot faster due to the native operators used as opposed to JavaScript evaluation, which runs much slower.
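And if you only want totals per characterId, regardless of which player slot it appeared in, here is a sketch of the same pipeline grouping on the value alone:

db.LoL.aggregate([
  { "$project": {
    "playerInfo": { "$objectToArray": "$playerInfo" }
  }},
  { "$unwind": "$playerInfo" },
  { "$group": {
    "_id": "$playerInfo.v.info.characterId",
    "count": { "$sum": 1 }
  }}
])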
I am using MongoDB aggregation in meteor.
The items in database look like this:
// item1
{
products: {
aaa: 100,
bbb: 200
}
}
// item2
{
products: {
aaa: 300,
bbb: 400
}
}
My pipeline looks like this
let pipeline = [{
$limit: 10
}, {
$group: {
_id: {
// …
},
total: {
$sum: "$products.aaa"
}
}
}];
And it works perfectly. But when I change my database structure to this:
// item1
{
products: [
{code: "aaa", num: 100},
{code: "bbb", num: 200}
]
}
// item2
{
products: [
{code: "aaa", num: 300},
{code: "bbb", num: 400}
]
}
The result I get for total is always 0, so I think my pipeline is wrong. Please see the comment inside:
let pipeline = [{
$limit: 10
}, {
$group: {
_id: {
// …
},
total: {
$sum: "$products.0.num" // Neither this nor "$products[0].num" works
}
}
}];
So how can I write it correctly? Thanks
With MongoDB 3.2 (which won't be the bundled server with Meteor, but there is nothing stopping you from using a separate server instance, and that would actually be recommended) you can use $arrayElemAt with $map:
let pipeline = [
{ "$limit": 10 },
{ "$group": {
"_id": {
// …
},
"total": {
"$sum": { "$arrayElemAt": [
{ "$map": {
"input": "$products",
"as": "product",
"in": "$$product.num"
}},
0
]}
}
}}
];
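As an aside, the $map may not even be necessary here: in aggregation expressions, a field path through an array of documents already resolves to an array of values (so "$products.num" gives e.g. [100, 200]), which suggests this shorter sketch should behave the same (worth verifying against your data):

let pipeline = [
  { "$limit": 10 },
  { "$group": {
    "_id": {
      // …
    },
    "total": {
      "$sum": { "$arrayElemAt": [ "$products.num", 0 ] }
    }
  }}
];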
With older versions, use "two" $group stages and the $first operator after processing with $unwind. And that's just for the "first" index value:
let pipeline = [
{ "$limit": 10 },
{ "$unwind": "$products" },
{ "$group": {
"_id": "$_id", // The document _id
"otherField": { "$first": "$eachOtherFieldForGroupingId" },
"productNum": { "$first": "$products.num" }
}},
{ "$group": {
"_id": {
// …
},
"total": {
"$sum": "$productNum"
}
}}
];
So in the latter case, after you $unwind you just use $first to get the "first" index from the array, and it is also used to carry along every field you want as part of the grouping key, since $unwind copies all the original document's fields into each array member's output document.
In the former case, $map just extracts the "num" values for each array member, then $arrayElemAt just retrieves the wanted index position.
Naturally the newer method for MongoDB 3.2 is better. If you wanted another array index then you would need to repeatedly get the $first element from the array and keep filtering it out from the array results until you reached the required index.
So whilst it's possible in earlier versions, it's a lot of work to get there.