I have a document in my mongo instance in below format,
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b70",
"fname" : "john",
"created_date" : ISODate("2017-05-24T01:13:06.829Z"),
"customProp" : [
[
"customX","{\"some data related to X \"}"
],
[
"customY","{\"some data related to Y \"}"
],
[
"customZ","{\"some data related to Z \"}"
]
]
}
the elements/values like "customX","customY" & "customZ" are not necessarily be in all documents. How to retrieve all the values in second element of "customProp" array, in this document it contains "customZ"?
I'm able to use following query to filter & find all the documents which are having "customZ" element,
db.getCollection('col1').find({$and : [{"customProp":{$elemMatch:{0:"customZ"}}}, {"created": { $gte: ISODate("2017-05-22T00:00:00.000Z") }}] },{"created":1}).limit(1) .pretty()
output:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b45",
"created" : ISODate("2017-05-24T01:13:06.829Z")
}
but finding a way to retrieve all the values in second element of array where the first value is "customZ".
expected result:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b45",
"created" : ISODate("2017-05-24T01:13:06.829Z"),
"customPro": ["customZ","{\"some data related to Z \"}"]
}
I'm fine if my query just returns
{
"{\"some data related to Z \"}"
}
Well it is a nested array, which is not a great idea but you are in fact matching the element with the $elemMatch expression, so you do get the position in the "outer" array of customProp, which allows you to project with the positional $ operator:
db.getCollection('coll1').find(
{
"customProp":{ "$elemMatch": { "0": "customZ" } },
"created_date": { "$gte": ISODate("2017-05-22T00:00:00.000Z") }
},
{ "created_date": 1, "customProp.$": 1 }
)
That yields the result:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b70",
"created_date" : ISODate("2017-05-24T01:13:06.829Z"),
"customProp" : [
[
"customZ",
"{\"some data related to Z \"}"
]
]
}
Where customProp is of course still in a nested array, but when processing the individual documents in python you can just access the property at the array index:
doc['customProp'][0][1]
Which of course returns the value:
'{"some data related to Z "}'
Same goes for JavaScript really, which is basically identical in syntax. As a shell example:
db.getCollection('coll1').find(
{
"customProp":{ "$elemMatch": { "0": "customZ" } },
"created_date": { "$gte": ISODate("2017-05-22T00:00:00.000Z") }
},
{ "created_date": 1, "customProp.$": 1 }
).map(function(doc) {
doc['customProp'] = doc['customProp'][0][1];
return doc;
})
And the output:
{
"_id" : "08d4a242-08fb-07f7-46e5-8717a81d5b70",
"created_date" : ISODate("2017-05-24T01:13:06.829Z"),
"customProp" : "{\"some data related to Z \"}"
}
And the positional $ projection here ensures there is only one element in the returned array, so the notation is always the same to extract from all document results. So you get the matched element from the database, and you extract the property through the code.
Also note that you do not need $and here since all the query arguments are already AND conditions. This is the MongoDB default, so you do not need to explicitly express it. See how much nicer this looks without it.
Related
I have a document structure something along the lines of the following:
{
"_id" : "777",
"someKey" : "someValue",
"someArray" : [
{
"name" : "name1",
"someNestedArray" : [
{
"name" : "value"
},
{
"name" : "delete me"
}
]
}
]
}
I want to delete the nested array element with the value "delete me".
I know I can find documents which match this description using nested $elemMatch expressions. What is the query syntax for removing the element in question?
To delete the item in question you're actually going to use an update. More specifically you're going to do an update with the $pull command which will remove the item from the array.
db.temp.update(
{ _id : "777" },
{$pull : {"someArray.0.someNestedArray" : {"name":"delete me"}}}
)
There's a little bit of "magic" happening here. Using .0 indicates that we know that we are modifying the 0th item of someArray. Using {"name":"delete me"} indicates that we know the exact data that we plan to remove.
This process works just fine if you load the data into a client and then perform the update. This process works less well if you want to do "generic" queries that perform these operations.
I think it's easiest to simply recognize that updating arrays of sub-documents generally requires that you have the original in memory at some point.
In response to the first comment below, you can probably help your situation by changing the data structure a little
"someObjects" : {
"name1": {
"someNestedArray" : [
{
"name" : "value"
},
{
"name" : "delete me"
}
]
}
}
Now you can do {$pull : { "someObjects.name1.someNestedArray" : ...
Here's the problem with your structure. MongoDB does not have very good support for manipulating "sub-arrays". Your structure has an array of objects and those objects contain arrays of more objects.
If you have the following structure, you are going to have a difficult time using things like $pull:
array [
{ subarray : array [] },
{ subarray : array [] },
]
If your structure looks like that and you want to update subarray you have two options:
Change your structure so that you can leverage $pull.
Don't use $pull. Load the entire object into a client and use findAndModify.
MongoDB 3.6 added $[] operator that facilitates updates to arrays that contain embedded documents. So the problem can be solved by:
db.test.update(
{ _id : "777" },
{$pull : {"someArray.$[].someNestedArray" : {"name":"delete me"}}}
)
As #Melkor has commented (should probably be an answer as itself),
If you do not know the index use:
{
_id: TheMainID,
"theArray._id": TheArrayID
},
{
$pull: {
"theArray.$.theNestedArray": {
_id: theNestedArrayID
}
}
}
From MongoDB 3.6 on you can use arrayFilters to do this:
db.test.update(
{ _id: "777" },
{ $pull: { "someArray.$[elem].someNestedArray": { name: "delete me" } } },
{ arrayFilters: [{ "elem.name": "name1"}] }
)
see also https://docs.mongodb.com/manual/reference/operator/update/positional-filtered/index.html#update-all-documents-that-match-arrayfilters-in-an-array
Other example and usage could be like this:
{
"company": {
"location": {
"postalCode": "12345",
"Address": "Address1",
"city": "Frankfurt",
"state": "Hessen",
"country": "Germany"
},
"establishmentDate": "2019-04-29T14:12:37.206Z",
"companyId": "1",
"ceo": "XYZ"
},
"items": [{
"name": "itemA",
"unit": "kg",
"price": "10"
},
{
"name": "itemB",
"unit": "ltr",
"price": "20"
}
]
}
DELETE : Mongodb Query to delete ItemB:
db.getCollection('test').update(
{"company.companyId":"1","company.location.city":"Frankfurt"},
{$pull : {"items" : {"name":"itemB"}}}
)
FIND: Find query for itemB:
db.getCollection('test').find(
{"company.companyId":"1","company.location.city":"Frankfurt","items.name":"itemB"},
{ "items.$": 1 }
)
3.UPDATE : update query for itemB:
db.getCollection('test').update
(
{"company.companyId":"1","company.location.city":"Frankfurt","items.name":"itemB"},
{ $set: { "items.$[].price" : 90 }},
{ multi: true });
I have the following sub-documents:
experiences: [
{
"workExperienceId" : ObjectId("59f8064e68d1f61441bec94a"),
"workType" : "Full Time",
"functionalArea" : "Law",
"company" : "Company A",
"title" : "new",
"from" : ISODate("2010-10-13T00:00:00.000Z"),
"to" : ISODate("2012-10-13T00:00:00.000Z"),
"_id" : ObjectId("59f8064e68d1f61441bec94b"),
"currentlyWorking" : false
},
...
...
{
"workExperienceId" : ObjectId("59f8064e68d1f61441bec94a"),
"workType" : "Full Time",
"functionalArea" : "Law",
"company" : "Company A",
"title" : "new",
"from" : ISODate("2014-10-14T00:00:00.000Z"),
"to" : ISODate("2015-12-13T00:00:00.000Z"),
"_id" : ObjectId("59f8064e68d1f61441bec94c"),
"currentlyWorking" : false
},
{
"workExperienceId" : ObjectId("59f8064e68d1f61441bec94a"),
"workType" : "Full Time",
"functionalArea" : "Law",
"company" : "Company A",
"title" : "new",
"from" : ISODate("2017-10-13T00:00:00.000Z"),
"to" : null,
"_id" : ObjectId("59f8064e68d1f61441bec94d"),
"currentlyWorking" : true
},
{
"workExperienceId" : ObjectId("59f8064e68d1f61441bec94a"),
"workType" : "Full Time",
"functionalArea" : "Law",
"company" : "Company A",
"title" : "new",
"from" : ISODate("2008-10-14T00:00:00.000Z"),
"to" : ISODate("2009-12-13T00:00:00.000Z"),
"_id" : ObjectId("59f8064e68d1f61441bec94c"),
"currentlyWorking" : false
},
]
As you see, there may not be date ordered within sequential date and maybe a non ordered date. Above data is for each user. So what I want is to get total years of experience for each user in year format. When to field is null and currentlyWorking is true then it means that I am currently working on that company.
Aggregation
Using the aggregation framework you could apply $indexOfArray where you have it available:
Model.aggregate([
{ "$addFields": {
"difference": {
"$subtract": [
{ "$cond": [
{ "$eq": [{ "$indexOfArray": ["$experiences.to", null] }, -1] },
{ "$max": "$experiences.to" },
new Date()
]},
{ "$min": "$experiences.from" }
]
}
}}
])
Failing that as long as the "latest" is always the last in the array, using $arrayElemAt:
Model.aggregate([
{ "$addFields": {
"difference": {
"$subtract": [
{ "$cond": [
{ "$eq": [{ "$arrayElemAt": ["$experiences.to", -1] }, null] },
new Date(),
{ "$max": "$experiences.to" }
]},
{ "$min": "$experiences.from" }
]
}
}}
])
That's pretty much the most efficient ways to do this, as a single pipeline stage applying $min and $max operators. For $indexOfArray you would need MongoDB 3.4 at least, and for simply using $arrayElemAt you can have MongoDB 3.2, which is the minimal version you should be running in production environments anyway.
One pass, means it gets done fast with little overhead.
The brief parts are the $min and $max allow you to extract the appropriate values directly from the array elements, being the "smallest" value of "from" and the largest value of "to" within the array. Where available the $indexOfArray operator can return the matched index from a provided array ( in this case from "to" values ) where a specified value ( as null here ) exists. If it's there the index of that value is returned, and where it is not the value of -1 is returned indicating that it is not found.
We use $cond which is a "ternary" or if..then..else operator to determine that when the null is not found then you want the $max value from "to". Of course when it is found this is the else where the value of the current Date which is fed into the aggregation pipeline as an external parameter on execution is returned instead.
The alternate case for a MongoDB 3.2 is that you instead "presume" the last element of your array is the most recent employment history item. In generally would be best practice to order these items so the most recent was either the "last" ( as seems to be indicated in your question ) or the "first" entry of the array. It would be logical to keep these entries in such order as opposed to relying on sorting the list at runtime.
So when using a "known" position such as "last", we can use the $arrayElemAt operator to return the value from the array at the specified position. Here it is -1 for the "last" element. The "first" element would be 0, and could arguably be applied to geting the "smallest" value of "from" as well, since you should have your array in order. Again $cond is used to transpose the values depending on whether null is returned. As an alternate to $max you can even use $ifNull to swap the values instead:
Model.aggregate([
{ "$addFields": {
"difference": {
"$subtract": [
{ "$ifNull": [{ "$arrayElemAt": ["$experiences.to", -1] }, new Date()] },
{ "$min": "$experiences.from" }
]
}
}}
])
That operator essentially switches out the values returned if the response of the first condition is null. So since we are grabbing the value from the "last" element already, we can "presume" that this does mean the "largest" value of "to".
The $subtract is what actually returns the "difference", since when you "subtract" one date from another the difference is returned as the milliseconds value between the two. This is how BSON Dates are actually internally stored, and it's the common internal date storage of date formats being the "milliseconds since epoch".
If you want the interval in a specific duration such as "years", then it's a simple matter of applying the "date math" to change from the milliseconds difference between the date values. So adjust by dividing out from the interval ( also showing $arrayElemAt on the "from" just for completeness ):
Model.aggregate([
{ "$addFields": {
"difference": {
"$floor": {
"$divide": [
{ "$subtract": [
{ "$ifNull": [{ "$arrayElemAt": ["$experiences.to", -1] }, new Date()] },
{ "$arrayElemAt": ["$experiences.from", 0] }
]},
1000 * 60 * 60 * 24 * 365
]
}
}
}}
])
That uses $divide as a math operator and 1000 milliseconds 60 for each of seconds and minutes, 24 hours and 365 days as the divisor value. The $floor "rounds down" the number from decimal places. You can do whatever you want there, but it "should" be used "inline" and not in separate stages, which simply add to processing overhead.
Of course, the presumption of 365 days is an "approximation" at best. If you want something more complete, then you can instead apply the date aggregation operators to the values to get a more accurate reading. So here, also applying $let to declare as "variables" for later manipulation:
Model.aggregate([
{ "$addFields": {
"difference": {
"$let": {
"vars": {
"to": { "$ifNull": [{ "$arrayElemAt": ["$experiences.to", -1] }, new Date()] },
"from": { "$arrayElemAt": ["$experiences.from", 0] }
},
"in": {
"years": {
"$subtract": [
{ "$subtract": [
{ "$year": "$$to" },
{ "$year": "$$from" }
]},
{ "$cond": {
"if": { "$gt": [{ "$month": "$$to" },{ "$month": "$$from" }] },
"then": 0,
"else": 1
}}
]
},
"months": {
"$add": [
{ "$subtract": [
{ "$month": "$$to" },
{ "$month": "$$from" }
]},
{ "$cond": {
"if": { "$gt": [{ "$month": "$$to" },{ "$month": "$$from" }] },
"then": 0,
"else": 12
}}
]
},
"days": {
"$add": [
{ "$subtract": [
{ "$dayOfYear": "$$to" },
{ "$dayOfYear": "$$from" }
]},
{ "$cond": {
"if": { "$gt": [{ "$month": "$$to" },{ "$month": "$$from" }] },
"then": 0,
"else": 365
}}
]
}
}
}
}
}}
])
Again that's a slight approximation on the days of the year. MongoDB 3.6 actually would allow you to test the "leap year" by implementing $dateFromParts to determine if 29th February was valid in the current year or not by assembling from the "pieces" we have available.
Work with returned data
Of course all the above is using the aggregation framework to determine the intervals from the array for each person. This would be the advised course if you were intending to "reduce" the data returned by essentially not returning the array items at all, or if you wanted these numbers for further aggregation in reporting to a larger "sum" or "average" statistic from the data.
If on the other hand you actually do want all the data returned for the person including the complete "experiences" array, then it's probably the best course of action to simply apply the calculations "after" all the data is returned from the server as you process each item returned.
The simple application of this would be to "merge" a new field into the results, just like $addFields does but on the "client" side instead:
Model.find().lean().cursor().map( doc =>
Object.assign(doc, {
"difference":
((doc.experiences.map(e => e.to).indexOf(null) === -1)
? Math.max.apply(null, doc.experiences.map(e => e.to))
: new Date() )
- Math.min.apply(null, doc.experiences.map(e => e.from)
})
).toArray((err, result) => {
// do something with result
})
That's just applying the same logic represented in the first aggregation example to a "client" side processing of the result cursor. Since you are using mongoose, the .cursor() method actually returns us a Cursor object from the underlying driver, of which mongoose normally hides away for "convenience". Here we want it because it gives us access to some handy methods.
The Cursor.map() is one such handy method which allows use to apply a "transform" on the content returned from the server. Here we use Object.assign() to "merge" a new property to the returned document. We could alternately use Array.map() on the "array" returned by mongoose by "default", but processing inline looks a little cleaner, as well as being a bit more efficient.
In fact Array.map() is the main tool here in manipulation since where we applied statements like "$experiences.to" in the aggregation statement, we apply on the "client" using doc.experiences.map(e => e.to), which does the same thing "transforming" the array of objects into an "array of values" for the specified field instead.
This allows the same checking using Array.indexOf() against the array of values, and also the Math.min() and Math.max() are used in the same way, implementing apply() to use those "mapped" array values as the argument values to the functions.
Finally of course since we still have a Cursor being returned, we convert this back into the more typical form you would work with mongoose results as an "array" using Cursor.toArray(), which is exactly what mongoose does "under the hood" for you on it's default requests.
The Query.lean() us a mongoose modifier which basically says to return and expect "plain JavaScript Objects" as opposed to "mongoose documents" matched to the schema with applied methods that are again the default return. We want that because we are "manipulating" the result. Again the alternate is to do the manipulation "after" the default array is returned, and convert via .toObject() which is present on all mongoose documents, in the event that "serializing virtual properties" is important to you.
So this is essentially a "mirror" of that first aggregation approach, yet applied to "client side" logic instead. As stated, it generally makes more sense to do it this way when you actually want ALL of the properties in the document in results anyway. The simple reason being that it makes no real since to add "additional" data to the results returned "before" you return those from the server. So instead, simply apply the transform "after" the database returns them.
Also much like above, the same client transformation approaches can be applied as was demonstrated in ALL the aggregation examples. You can even employ external libraries for date manipulation which give you "helpers" for some of the "raw math" approaches here.
you can achieve this with the aggregation framework like this:
db.collection.aggregate([
{
$unwind:"$experiences"
},
{
$sort:{
"experiences.from":1
}
},
{
$group:{
_id:null,
"from":{
$first:"$experiences.from"
},
"to":{
$last:{
$ifNull:[
"$to",
new Date()
]
}
}
}
},
{
$project:{
"diff":{
$subtract:[
"$to",
"$from"
]
}
}
}
])
This returns:
{ "_id" : null, "diff" : NumberLong("65357827142") }
Which is the difference in ms between the two dates, see $subtract for details
You can get the year by adding this additional stage to the end of the pipeline:
{
$project:{
"year":{
$floor:{
$divide:[
"$diff",
1000*60*60*24*365
]
}
}
}
}
This would then return:
{ "_id" : null, "year" : 2 }
I have two arrays, one I am getting from form request and the other I am getting from a mongoDb query. I want to look for all the items passed in request parameters inside the result obtained in mongodb array.
var userIdArr = ["6fbeb3e28de42200e7e7d36", "56fbde9be618221c1d67f01"];
var mongoDbResultsArr= [
{"_id": "56fce5bf7031bd482abdfb8a",
"userIds": "[56fbeb3e28de42200e7e7d36, 56fbde9be618221c1d67f010]"
},
{"_id": "56fd2c22ea0f3664156685ac",
"userIds": "[56fbeb3e28de42200e7e7d36, 56fbde9be618221c1d67f010, 56f53491886e496e32c68b00]"
}
]
Basically I want to search if the values given in userIdArr exists in the 'userIds' key of mongoDbResultsArr. Also if this can be done by writing just mongodb query, or any other better way.
This is how my schema looks like
{
"_id" : ObjectId("56fd2c22ea0f3664156685ac"),
"startDateTime" : ISODate("2016-03-31T08:34:59.000Z"),
"endDateTime" : ISODate("2016-03-31T08:34:59.000Z"),
"productId" : "#789990#2",
"applicableThemes" : [
"red"
],
"userIds" : "[56fbeb3e28de42200e7e7d36, 56fbde9be618221c1d67f010, 56f53491886e496e32c68b00]",
"price" : 45,
"dealType" : 1,
"photos" : {
"dealimageUrl3" : "http://mailers.shopping.indiatimes.com/shopstatic/shopping/images/diwali2013/banner.jpg",
"dealimageUrl2" : "http://mailers.shopping.indiatimes.com/shopstatic/shopping/images/diwali2013/banner.jpg",
"dealimageUrl1" : "http://mailers.shopping.indiatimes.com/shopstatic/shopping/images/diwali2013/banner.jpg"
},
"dealDescription" : "Bang Bang deal!!",
"dealName" : "Christmas Special Offer",
"storeId" : "56f5339fe99b1a294818ecbe",
"__v" : 0
}
and I am using mongoose ORM.
I'm using MongoDB 2.6.6
I have these documents in a MongoDB collection and here is an example:
{ ..., "field3" : { "one" : [ ISODate("2014-03-18T05:47:33Z"),ISODate("2014-06-02T20:00:25Z") ] }, ...}
{ ..., "field3" : { "two" : [ ISODate("2014-03-18T05:47:33Z"),ISODate("2014-06-02T20:00:25Z") ] }, ...}
{ ..., "field3" : { "three" : [ ISODate("2014-03-18T05:47:39Z"),ISODate("2014-03-19T20:18:38Z") ] }, ... }
I would like the merge these documents in one field. For an example, I would like the new result to be as follows:
{ "field3", : { "all" : [ ISODate("2014-03-18T05:47:39Z"),ISODate("2014-03-19T20:18:38Z"),...... ] },}
I'm just not sure any more how to have that result!
Doesn't really leave much to go on here but you can arguably get the kind of merged result with mapReduce:
db.collection.mapReduce(
function() {
var field = this.field3;
Object.keys(field).forEach(function(key) {
field[key].forEach(function(date) {
emit( "field3", { "all": [date] } )
});
});
},
function (key,values) {
var result = { "all": [] };
values.forEach(function(value) {
value.all.forEach(function(date) {
result.all.push( date );
});
});
result.all.sort(function(a,b) { return a.valueOf()-b.valueOf() });
return result;
},
{ "out": { "inline": 1 } }
)
Which being mapReduce is not exactly in the same output format given it's own restrictions for doing things:
{
"results" : [
{
"_id" : "field3",
"value" : {
"all" : [
ISODate("2014-03-18T05:47:33Z"),
ISODate("2014-03-18T05:47:33Z"),
ISODate("2014-03-18T05:47:39Z"),
ISODate("2014-03-19T20:18:38Z"),
ISODate("2014-06-02T20:00:25Z"),
ISODate("2014-06-02T20:00:25Z")
]
}
}
],
"timeMillis" : 86,
"counts" : {
"input" : 3,
"emit" : 6,
"reduce" : 1,
"output" : 1
},
"ok" : 1
}
Since the aggregation here into a single document is fairly arbitrary you could pretty much argue that you simply take the same kind of approach in client code.
At any rate this is only going to be useful over a relatively small set of data with next to the same sort of restrictions on the client processing. More than the 16MB BSON limit for MongoDB, but certainly limited by memory to be consumed.
So I presume you would need to add a "query" argument but it's not really clear from your question. Either using mapReduce or your client code, you are basically going to need to follow this sort of process to "mash" the arrays together.
I would personally go with the client code here.
I have a collection that looks like the ff:
{
"word":"approve",
"related" : [
{
"relationshipType" : "cross-reference",
"words" : [
"note"
]
},
{
"relationshipType" : "synonym",
"words" : [
"demonstrate",
"ratify",
]
}
],
},
{
"word": "note",
"related" : [
{
"relationshipType" : "synonym",
"words" : [
"butt",
"need",
],
},
{
"relationshipType" : "hypernym",
"words" : [
"air",
"tone",
]
},
{
"relationshipType" : "cross-reference",
"words" : [
"sign",
"letter",
"distinction",
"notice",
]
},
],
}
I want to group/categorize all the objects which have a word in another object, be it
the name (as the cross-reference field of 'approve', has note. searches for the word 'note'.
or
the word is in another object's related words. like having a synonym of 'ratify' under 'approve' then looking for other objects that have have 'ratify' in any field of their related words.
Then save these to a new collection called categories.
result should Be:
{
"category": 1,
"words":["approve","note"],
}
...and the value of the word field for all the linked objects in the words array.
Any way how to do this.. i'm thinking about some sort of recursion in checking links but i'm not sure how to implement. another potential problem is going back to the parent layer creating an infinite loop of sorts.
and is it possible through map reduce?
EDIT: clarity.