MongoDB aggregation: How to extract the field in the results - javascript

Hi all!
I'm new to MongoDB aggregation. After running my aggregation, I get this result:
"result" : [
{
"_id" : "531d84734031c76f06b853f0"
},
{
"_id" : "5316739f4031c76f06b85399"
},
{
"_id" : "53171a7f4031c76f06b853e5"
},
{
"_id" : "531687024031c76f06b853db"
},
{
"_id" : "5321135cf5fcb31a051e911a"
},
{
"_id" : "5315b2564031c76f06b8538f"
}
],
"ok" : 1
The data is exactly what I'm looking for, but I want to take it one step further and have it displayed like this:
"result" : [
"531d84734031c76f06b853f0",
"5316739f4031c76f06b85399",
"53171a7f4031c76f06b853e5",
"531687024031c76f06b853db",
"5321135cf5fcb31a051e911a",
"5315b2564031c76f06b8538f"
],
"ok" : 1
In other words, I just want all the unique ids in a plain array of strings. Is there any way to do this? Any help would be appreciated!

All MongoDB queries produce "key/value" pairs in the result document. All MongoDB content is basically a BSON document in this form, which the driver simply "translates" back into the native form of the language it is implemented in.
So the aggregation framework alone is never going to produce a bare array of just the values, as you want. But you can always transform the array of results yourself since, after all, it is only an array:
var result = db.collection.aggregate(pipeline);
var response = result.result.map(function(x) { return x._id } );
Also note that from MongoDB 2.6 onwards, the default behavior in the shell (and the preferred option) is for the aggregation result to be returned as a cursor. Since this is a list of documents rather than a single document with a result field, you would process it differently:
var response = db.collection.aggregate(pipeline).map(function(x) {
  return x._id;
});
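To try the transformation without a database, here is a minimal sketch in plain JavaScript. The aggregateOutput object is a hypothetical stand-in for what aggregate() returned in the question (shortened to three ids), and extractIds is just an illustrative helper name, not part of any MongoDB API.

```javascript
// Hypothetical stand-in for the pre-2.6 aggregate() return value shown in the question.
const aggregateOutput = {
  result: [
    { _id: "531d84734031c76f06b853f0" },
    { _id: "5316739f4031c76f06b85399" },
    { _id: "53171a7f4031c76f06b853e5" }
  ],
  ok: 1
};

// Pull the _id value out of each result document, leaving a bare array of strings.
function extractIds(docs) {
  return docs.map(function (doc) {
    return doc._id;
  });
}

const ids = extractIds(aggregateOutput.result);
console.log(ids);
```

The same map callback should work on the cursor form too, since the only thing that changes is where the list of documents comes from.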


Query firebase to return if value more than number

I want to get data from Firebase.
This is more or less my data structure:
"Reports" : {
"N06Jrz5hx6Q9bcVDBBUrF3GKSTp2" : 2,
"eLLfNlWLkTcImTRqrYnU0nWuu9P2" : 2
},
"Users":{
"N06Jrz5hx6Q9bcVDBBUrF3GKSTp2" : {
"completedWorks" : {
...
},
"reports" : {
"-LHs0yxUXn-TQC7z_MJM" : {
"category" : "Niewyraźne zdjęcie",
"creatorID" : "z8DxcXyehgMhRyMqmf6q8LpCYfs1",
"reportedID" : "N06Jrz5hx6Q9bcVDBBUrF3GKSTp2",
"resolved" : false,
"text" : "heh",
"workID" : "-LHs-aZJkAhEf1RHVasg"
},
"-LHs1hzlL4roUJfMlvyA" : {
"category" : "Zdjęcie nie przedstawia zadania",
"creatorID" : "z8DxcXyehgMhRyMqmf6q8LpCYfs1",
"reportedID" : "N06Jrz5hx6Q9bcVDBBUrF3GKSTp2",
"resolved" : false,
"text" : "",
"workID" : "-LHs-aZJkAhEf1RHVasg"
}
},
"userType" : "company",
"verified" : true
},
}
As you can see, the number of reports per user is stored under Reports. How can I make Firebase return only the ids of the users whose report count is greater than or equal to 3?
Something like this (it will not work, but I hope it shows what I have in mind):
firebase.database().ref('Reports').orderBy(whatHere?).moreThen(2).on('value', snap => {
Is this even doable? If so, how? I want to grab the IDs of the users where reports are >= 3.
You're looking for orderByValue():
firebase.database().ref('Reports').orderByValue().startAt(3).on('value', snapshot => {
  snapshot.forEach(reportSnapshot => {
    console.log(reportSnapshot.key);
  });
});
Also check out the Firebase documentation on ordering data.
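If it helps to see what that query selects, here is a rough plain-JavaScript sketch of the same idea, with no Firebase involved. The reports object copies the two entries from the question and adds a third, clearly hypothetical user with 5 reports, since both real entries have the value 2 and would not match. keysWithValueAtLeast is an illustrative helper, not a Firebase function, and it ignores how Firebase orders children that share the same value.

```javascript
// Plain-object stand-in for the Reports node; the third entry is hypothetical.
const reports = {
  "N06Jrz5hx6Q9bcVDBBUrF3GKSTp2": 2,
  "eLLfNlWLkTcImTRqrYnU0nWuu9P2": 2,
  "hypotheticalUserWithFiveReports": 5
};

// Roughly mimic orderByValue().startAt(min):
// order the children by value, then keep those whose value is >= min.
function keysWithValueAtLeast(node, min) {
  return Object.entries(node)
    .sort(function (a, b) { return a[1] - b[1]; })
    .filter(function (entry) { return entry[1] >= min; })
    .map(function (entry) { return entry[0]; });
}

const flaggedUsers = keysWithValueAtLeast(reports, 3);
console.log(flaggedUsers);
```

With only the two entries from the question, the result would be empty, which is what the real query should give as well.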
There are two other options, though neither does exactly what you want, so you will have to do further processing in JavaScript. One is to use limitToLast after ordering, which returns the last (highest) entries of the ordered result:
firebase.database().ref('Reports').orderByValue().limitToLast(2).on('value', snap => { /* process snap here */ });
Or use startAt and endAt to fetch only the entries that fall between two bounds:
firebase.database().ref('Reports').orderByValue()
  .startAt(reportIdStart)
  .endAt(reportIdLast)
  .limitToLast(15)
According to the Firebase documentation:
Using startAt(), endAt(), and equalTo() allows you to choose arbitrary starting and ending points for your queries.
To filter data, you can combine any of the limit or range methods with an order-by method when constructing a query.
Unlike the order-by methods, you can combine multiple limit or range functions. For example, you can combine the startAt() and endAt() methods to limit the results to a specified range of values.
For more information, see the documentation on filtering data.

Meteor: Return only single object in nested array within collection

I'm attempting to filter the data returned by Meteor's find().fetch() so that it contains just a single object. When I query for a single subdocument, I instead receive several, some of which don't even contain any of the matched terms, which isn't very useful.
I have a simple mixed data collection that looks like this:
{
"_id" : ObjectId("570d20de3ae6b49a54ee01e7"),
"name" : "Entertainment",
"items" : [
{
"_id" : ObjectId("57a38b5f2bd9ac8225caff06"),
"slug" : "this-is-a-long-slug",
"title" : "This is a title"
},
{
"_id" : ObjectId("57a38b835ac9e2efc0fa09c6"),
"slug" : "mc",
"title" : "Technology"
}
]
}
{
"_id" : ObjectId("570d20de3ae6b49a54ee01e8"),
"name" : "Sitewide",
"items" : [
{
"_id" : ObjectId("57a38bc75ac9e2efc0fa09c9"),
"slug" : "example",
"name" : "Single Example"
}
]
}
I can easily query for a specific object in the nested items array with the MongoDB shell like this:
db.categories.find( { "items.slug": "mc" }, { "items.$": 1 } );
This returns good data, it contains just the single object I want to work with:
{
  "_id" : ObjectId("570d20de3ae6b49a54ee01e7"),
  "items" : [
    {
      "_id" : ObjectId("57a38b835ac9e2efc0fa09c6"),
      "slug" : "mc",
      "title" : "Technology"
    }
  ]
}
However, when I attempt a similar query directly within Meteor:
/* server/publications.js */
Meteor.publish('categories.all', function () {
  return Categories.find({}, { sort: { position: 1 } });
});
/* imports/ui/page.js */
Template.page.onCreated(function () {
  this.subscribe('categories.all');
});
Template.page.helpers({
  items: function () {
    var item = Categories.find(
      { "items.slug": "mc" },
      { "items.$": 1 }
    ).fetch();
    console.log('item: %o', item);
  }
});
The outcome isn't ideal as it returns the entire matched block, as well as every object in the nested items array:
{
  "_id" : ObjectId("570d20de3ae6b49a54ee01e7"),
  "name" : "Entertainment",
  "items" : [
    {
      "_id" : ObjectId("57a38b5f2bd9ac8225caff06"),
      "slug" : "this-is-a-long-slug",
      "title" : "This is a title"
    },
    {
      "_id" : ObjectId("57a38b835ac9e2efc0fa09c6"),
      "slug" : "mc",
      "title" : "Technology"
    }
  ]
}
I can then of course filter the returned cursor even further with a for loop to get just the needed object, but this seems unscalable and terribly inefficient while dealing with larger data sets.
I can't grasp why Meteor's find returns a completely different set of data than the MongoDB shell's find. The only reasonable explanation I can think of is that the two function signatures are different.
Should I break up my nested collections into smaller collections and take a more relational database approach (i.e. store references to ObjectIDs) and query data from collection-to-collection, or is there a more powerful means available to efficiently filter large data sets into single objects that contain just the matched objects as demonstrated above?
The client side implementation of Mongo used by Meteor is called minimongo. It currently only implements a subset of available Mongo functionality. Minimongo does not currently support $ based projections. From the Field Specifiers section of the Meteor API:
Field operators such as $ and $elemMatch are not available on the client side yet.
This is one of the reasons why you're getting different results between the client and the Mongo shell. The closest you can get with your original query is the result you'll get by changing "items.$" to "items":
Categories.find(
  { "items.slug": "mc" },
  { "items": 1 }
).fetch();
This query still isn't quite right though. Minimongo expects your second find parameter to be one of the allowed option parameters outlined in the docs. To filter fields for example, you have to do something like:
Categories.find(
  { "items.slug": "mc" },
  {
    fields: {
      "items": 1
    }
  }
).fetch();
On the client side (with Minimongo) you'll then need to filter the result further yourself.
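As a rough illustration of that extra client-side filtering step, here is a plain-JavaScript sketch that runs without Meteor. The fetched array is a hypothetical stand-in for what fetch() returns (with ObjectIds written as plain strings), and keepMatchingItems is an illustrative helper name. Note one difference from the server-side "items.$" projection: this keeps every nested item with a matching slug, whereas the positional projection returns only the first match.

```javascript
// Hypothetical stand-in for the array returned by Categories.find(...).fetch() on the client.
const fetched = [
  {
    _id: "570d20de3ae6b49a54ee01e7",
    items: [
      { _id: "57a38b5f2bd9ac8225caff06", slug: "this-is-a-long-slug", title: "This is a title" },
      { _id: "57a38b835ac9e2efc0fa09c6", slug: "mc", title: "Technology" }
    ]
  }
];

// Return copies of the documents whose items array holds only the entries with the given slug.
// The original documents are left untouched.
function keepMatchingItems(docs, slug) {
  return docs.map(function (doc) {
    return Object.assign({}, doc, {
      items: doc.items.filter(function (item) {
        return item.slug === slug;
      })
    });
  });
}

const filtered = keepMatchingItems(fetched, "mc");
console.log(JSON.stringify(filtered, null, 2));
```

For a handful of documents this is cheap; for large data sets it is probably better not to send the unneeded items to the client in the first place.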
There is another way of doing this though. If you run your Mongo query on the server, you won't be using Minimongo, which means projections are supported. As a quick example, try the following:
/* server/main.js */
const filteredCategories = Categories.find(
  { "items.slug": "mc" },
  {
    fields: {
      "items.$": 1
    }
  }
).fetch();
console.log(filteredCategories);
The projection will work, and the logged results will match the results you see when using the Mongo console directly. Instead of running your Categories.find on the client side, you could instead create a Meteor Method that calls your Categories.find on the server, and returns the results back to the client.

MongoDB Finding duplicates sharing multiple fields using MapReduce

I am trying to find duplicates in a Mongo version 2.4 database that is being used for production and therefore cannot be updated. Since aggregate does not exist in 2.4, I cannot use the aggregate pipeline to find duplicates, therefore I am trying to find a solution using MapReduce.
I have tried the following set of map, reduce, and finalize functions, through MongoVUE's Map Reduce interface, and they returned nothing after running for less than a second on a 3,000,000 record collection that definitely has duplicates on the indicated fields. Obviously something went wrong, but MongoVUE did not show any error messages or helpful indications.
function Map() {
  emit(
    {name: this.name, LocationId: this.LocationId,
     version: this.version},
    {count:1, ScrapeDate: this.ScrapeDate}
  );
}
function Reduce(key, values) {
  var reduced = {count:0, ScrapeDate:'2000-01-01'};
  values.forEach(function(val) {
    reduced.count += val.count;
    if (reduced.ScrapeDate.localeCompare(val.ScrapeDate) < 0)
      reduced.ScrapeDate=val.ScrapeDate;
  });
  return reduced;
  return values[0];
}
function Finalize(key, reduced) {
  if (reduced.count > 1)
    return reduced;
}
I just need to find any instance of multiple records that share the same name, LocationId, and version, and ideally display the most recent ScrapeDate of such a record.
Your map-reduce code worked without any issues for me, though only on a very small dataset. I think the return values[0]; in the reduce function is a copy-paste error, since it can never be reached. You could try running the same thing through the mongo shell.
Since aggregate does not exist in 2.4, I cannot use the aggregate pipeline to find duplicates, therefore I am trying to find a solution
using MapReduce.
That is not correct: db.collection.aggregate(pipeline, options) was introduced in version 2.2.
Here is how it could be done with the aggregation framework. It would not be the preferred approach, though, since your dataset is very large and, in v2.4, the $sort operator has a memory limit of 10% of RAM.
db.collection.aggregate(
  [
    // sort the records, based on the 'ScrapeDate' field, in descending order.
    {$sort:{"ScrapeDate":-1}},
    // group by the key fields, and take the 'ScrapeDate' of the first document.
    // Since it is in sorted order, the first document would contain the
    // highest field value.
    {$group:{"_id":{"name":"$name","LocationId":"$LocationId","version":"$version"}
            ,"ScrapeDate":{$first:"$ScrapeDate"}
            ,"count":{$sum:1}}
    },
    // output only the groups that contain more than one document.
    {$match:{"count":{$gt:1}}}
  ]
);
As for your map-reduce functions, they ran without issues on my test data:
db.collection.insert({"name":"c","LocationId":1,"version":1,"ScrapeDate":"2000-01-01"});
db.collection.insert({"name":"c","LocationId":1,"version":1,"ScrapeDate":"2001-01-01"});
db.collection.insert({"name":"c","LocationId":1,"version":1,"ScrapeDate":"2002-01-01"});
db.collection.insert({"name":"d","LocationId":1,"version":1,"ScrapeDate":"2002-01-01"});
Running the map-reduce:
db.collection.mapReduce(Map,Reduce,{out:{"inline":1},finalize:Finalize});
Output:
{
"results" : [
{
"_id" : {
"name" : "c",
"LocationId" : 1,
"version" : 1
},
"value" : {
"count" : 3,
"ScrapeDate" : "2002-01-01"
}
},
{
"_id" : {
"name" : "d",
"LocationId" : 1,
"version" : 1
},
"value" : null
}
],
"timeMillis" : 0,
"counts" : {
"input" : 4,
"emit" : 4,
"reduce" : 1,
"output" : 2
},
"ok" : 1,
}
Notice that the output contains value:null for a record which doesn't have any duplicates.
This is due to your finalize function:
function Finalize(key, reduced) {
  if (reduced.count > 1)
    return reduced; // returns null by default for keys with a single value,
                    // i.e. count=1
}
The finalize function does not filter out keys, so you cannot get only the duplicate keys from map-reduce alone. The output will contain every key; in your finalize function you can only suppress the values of the non-duplicates, which is what you are doing.
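Since the null entries cannot be removed inside map-reduce itself, one option is to drop them afterwards on the client. The sketch below is plain JavaScript and assumes inline output (out: {inline: 1}); mapReduceOutput copies the results array from the output above, and onlyDuplicates is an illustrative helper name.

```javascript
// Inline results copied from the mapReduce output shown above.
const mapReduceOutput = {
  results: [
    { _id: { name: "c", LocationId: 1, version: 1 }, value: { count: 3, ScrapeDate: "2002-01-01" } },
    { _id: { name: "d", LocationId: 1, version: 1 }, value: null }
  ]
};

// Keep only the keys for which finalize returned a value.
// The loose != null check also covers undefined, in case a driver reports it that way.
function onlyDuplicates(results) {
  return results.filter(function (entry) {
    return entry.value != null;
  });
}

const duplicates = onlyDuplicates(mapReduceOutput.results);
console.log(JSON.stringify(duplicates));
```

On a 3,000,000 record collection the inline output may well be too large, in which case writing to an output collection and querying it for non-null values would be the analogous step.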

MongoDB - Updated $ref value unable to query new value

I've posted the following question which has been answered correctly:
MongoDB - Updating only $ref from DBRef field type
Despite this, when I execute a find with the following query:
{ "codeId" : { "$ref" : "code" , "$id" : { "$oid" : "4ff1c08c6ef25616ce21c4b6"}} }
the document isn't returned. Any idea why?
After the update the document is stored like this:
{ "_id" : { "$oid" : "5097ae1cd3159eb52d05574c"} , "codeId" : { "$ref" : "code" , "$id" : { "$oid" : "4ff1c08c6ef25616ce21c4b6"}} }
By the way, if I use the uMongo GUI to select the Update option on this stored document and save it without making any changes whatsoever, and then run the find query again, the document is returned.
Thanks
This is clearly one of those DBRef "tweaky" things...
As a temporary (but probably correct) fix, I managed to solve the problem by running this JavaScript procedure:
var cursor = db.menu.find( { "codeId.$ref" : "version" } );
while( cursor.hasNext() )
{
  var document = cursor.next();
  db.menu.update(
    document,
    { $set: {"codeId" : DBRef("code", document.codeId.$id) }},
    { upsert: false, multi: true }
  );
}
Still, I wouldn't consider this the best way to achieve what I want. Is there another solution that involves fewer lines?

how to push a dictionary to a nested array with mongodb?

I have data that looks like this in my database:
> db.whocs_up.find()
{ "_id" : ObjectId("52ce212cb17120063b9e3869"), "project" : "asnclkdacd", "users" : [ ] }
and I tried to add to the 'users' array like this:
> db.whocs_up.update({'users.user': 'usex', 'project' : 'asnclkdacd' },{ '$addToSet': { 'users': {'user':'userx', 'lastactivity' :2387843543}}},true)
but I get the following error:
Cannot apply $addToSet modifier to non-array
The same thing happens with the $push operator. What am I doing wrong?
I'm on 2.4.8.
I tried to follow the example from this question:
MongoDB - Update objects in a document's array (nested updating)
db.bar.update( {user_id : 123456, "items.item_name" : {$ne : "my_item_two" }} ,
{$addToSet : {"items" : {'item_name' : "my_item_two" , 'price' : 1 }} } ,
false ,
true)
The python tag is there because I was working with Python when I ran into this, but as you can see it does not work in the mongo shell either.
EDIT: got it to work.
Apparently, if I modify the update from
db.whocs_up.update({'users.user': 'usex', 'project' : 'asnclkdacd' },{ '$addToSet': { 'users': {'user':'userx', 'lastactivity' :2387843543}}},true)
to this:
db.whocs_up.update({'project' : 'asnclkdacd' },{ '$addToSet': { 'users': {'user':'userx', 'lastactivity' :2387843543}}},true)
it works. Can anyone explain why the two do not achieve the same thing? As I understand it, they should have matched the same document and hence done the same thing.
What does the addition of 'users.user': 'userx' change in the update? Does it refer to some inner document in the array rather than the document as a whole?
This is a known bug in MongoDB (SERVER-3946). Currently, an update with $push/$addToSet with a query on the same field does not work as expected.
In the general case, there are a couple of workarounds:
1. Restructure your update operation so that it does not query on a field that is also updated using $push/$addToSet (as you have done above).
2. Use the $all operator in the query, supplying a single-value array containing the lookup value. For example, instead of this:
db.foo.update({ x : "a" }, { $addToSet : { x : "b" } }, true)
do this:
db.foo.update({ x : { $all : ["a"] } }, { $addToSet : { x : "b" } } , true)
In your specific case, I think you need to re-evaluate the operation you're trying to do. The update operation you have above will add a new array entry for each unique (user, lastactivity) pair, which is probably not what you want. I assume you want a unique entry for each user.
Consider changing your schema so that you have one document per user:
{
_id : "userx",
project : "myproj",
lastactivity : 123,
...
}
The update operation then becomes something like:
db.users.update({ _id : "userx" }, { $set : { lastactivity : 456 } })
All users in a given project may still be looked up efficiently by adding a secondary index on project.
This schema also avoids the unbounded document growth of the above schema, which is better for performance.
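To make the point about (user, lastactivity) pairs concrete, here is a small plain-JavaScript sketch of set-style insertion on whole subdocuments. The addToSet function below is a simplified stand-in written for illustration, not MongoDB's implementation: it treats two elements as equal only when their JSON serializations match, which assumes identical key order. The second timestamp is a made-up value.

```javascript
// Simplified stand-in for $addToSet: append the element only if an identical one is not present.
// Equality is checked by comparing JSON strings, which assumes the keys appear in the same order.
function addToSet(array, element) {
  const serialized = JSON.stringify(element);
  const exists = array.some(function (existing) {
    return JSON.stringify(existing) === serialized;
  });
  return exists ? array : array.concat([element]);
}

let users = [];
users = addToSet(users, { user: "userx", lastactivity: 2387843543 });
// Identical subdocument: not added again.
users = addToSet(users, { user: "userx", lastactivity: 2387843543 });
// Same user, hypothetical newer timestamp: counts as a different element, so it is appended.
users = addToSet(users, { user: "userx", lastactivity: 2387843999 });

console.log(users);
```

The same user ends up in the array once per distinct timestamp, which is why one document per user with a $set on lastactivity is usually the easier shape to work with.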
