How do I identify that an array has unique entries in MongoDB? - javascript

I have an array of strings
users: ['user1', 'user2']
If I run a search looking for exactly ['user1', 'user2'] in that order, it will find that entry. However, if they are back to front, the query returns nothing.
What's the best way to compare an input array against the list in the database to determine if it is a unique entry?

You can identify a unique array in a collection with the query below:
db.getCollection('mycollection').find({users: { $size: 2, $all: [ "user1" , "user2" ] }})
You need to specify the number of elements in the array you are checking with $size, and check for all of its elements with the $all operator.
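Since $all ignores element order, a small helper can build this query from any input array. A minimal sketch (the uniqueArrayQuery name is my own, assuming the users field from the question):

```javascript
// Hypothetical helper: build an order-insensitive exact-match query for an
// input array. $size pins the length and $all checks membership, so
// ['user2', 'user1'] will match a stored ['user1', 'user2'].
function uniqueArrayQuery(input) {
  return { users: { $size: input.length, $all: input } };
}

console.log(JSON.stringify(uniqueArrayQuery(['user2', 'user1'])));
// {"users":{"$size":2,"$all":["user2","user1"]}}
```

Note that $all only checks membership, so this does not distinguish arrays containing duplicates of the same elements.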

Using the aggregation framework with the $redact pipeline operator allows you to process the logical condition with the $cond operator, using the special operations $$KEEP to "keep" the document where the logical condition is true and $$PRUNE to "remove" the document where the condition is false.
This operation is similar to having a $project pipeline that selects the fields in the collection and creates a new field that holds the result of the logical condition, followed by a subsequent $match, except that $redact uses a single pipeline stage, which is more efficient.
As for the logical condition, there are Set Operators that you can use, since they allow expressions that perform set operations on arrays, treating arrays as sets. Set expressions ignore duplicate entries in each input array as well as the order of the elements, which is a suitable property in your case since you want to disregard the order of the elements.
There are a couple of these operators that you can use for the logical condition, namely $setEquals and $setDifference.
Consider the following examples which demonstrate the above concept:
Populate Test Collection
db.collection.insert([
    { users: ['user1', 'user2'] },
    { users: ['user1', 'user2', 'user2'] },
    { users: ['user1', 'user2', 'user3'] },
    { users: ['user1', 'user3'] }
])
Example 1: $redact with $setEquals
var arr = [ "user2", "user1" ];
db.collection.aggregate([
    {
        "$redact": {
            "$cond": [
                { "$setEquals": [ "$users", arr ] },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    }
])
Sample Output
/* 1 */
{
    "_id" : ObjectId("5804902900ce8cbd028523d1"),
    "users" : [ "user1", "user2" ]
}

/* 2 */
{
    "_id" : ObjectId("5804902900ce8cbd028523d2"),
    "users" : [ "user1", "user2", "user2" ]
}
Example 2: $redact with $setDifference
var arr = [ "user2", "user1" ];
db.collection.aggregate([
    {
        "$redact": {
            "$cond": [
                {
                    "$eq": [
                        { "$setDifference": [ "$users", arr ] },
                        []
                    ]
                },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    }
])
Sample Output
/* 1 */
{
    "_id" : ObjectId("5804902900ce8cbd028523d1"),
    "users" : [ "user1", "user2" ]
}

/* 2 */
{
    "_id" : ObjectId("5804902900ce8cbd028523d2"),
    "users" : [ "user1", "user2", "user2" ]
}
Another approach, though only recommended when $redact is not available, would be to use the $where operator:
db.collection.find({
    "$where": function() {
        var arr = ["user2", "user1"].sort();
        var users = this.users.slice().sort();
        return users.length === arr.length &&
            users.every(function(u, i) { return u === arr[i]; });
    }
})
However, bear in mind that this won't perform very well, since a query with the $where operator calls the JavaScript engine to evaluate JavaScript code on every document and checks the condition for each one.
This is very slow, as MongoDB evaluates non-$where query operations before $where expressions, and non-$where query statements may use an index.
It is advisable to combine $where with indexed query conditions if you can, so that the query may be faster. In general, use JavaScript expressions and the $where operator as a last resort, when you can't structure the data in any other way or when you are dealing with a small subset of data.

Related

MongoDB: Efficiency of operation pushing to a nested array or updating it when identifier found, using aggregation pipeline

I have a document that holds lists containing nested objects. The document simplified looks like this:
{
    "username": "user",
    "listOne": [
        { "name": "foo", "qnty": 5 },
        { "name": "bar", "qnty": 3 }
    ],
    "listTwo": [
        { "id": 1, "qnty": 13 },
        { "id": 2, "qnty": 9 }
    ]
}
And I need to update the quantity in these lists based on an identifier. For list one it was easy. I was doing something like this:
db.collection.findOneAndUpdate(
    {
        "username": "user",
        "listOne.name": name
    },
    {
        $inc: { "listOne.$.qnty": qntyChange }
    }
)
Then I would catch the case where the find failed because there was no object in the list with that name (so nothing was updated), and do a new operation with $push. Since this is the rarer case, doing two queries against the collection didn't bother me.
But now I also had to add list two to the document. And since the identifiers are not the same, I would have to query them individually, meaning four searches in the collection in the worst-case scenario if I kept the same strategy.
So, to avoid this, I wrote an update using an aggregation pipeline. What it does is:
1. Look if there is an object in list one with the queried identifier.
2. If true, map through the entire array and:
2.1) Return the same object if the identifier is different.
2.2) Return the object with the quantity changed when the identifier matches.
3. If false, push a new object with this identifier to the list.
4. Repeat for list two.
This is the pipeline for list one:
db.coll1.updateOne(
    { "username": "user" },
    [{
        "$set": {
            "listOne": {
                "$cond": {
                    "if": { "$in": [ name, "$listOne.name" ] },
                    "then": {
                        "$map": {
                            "input": "$listOne",
                            "as": "one",
                            "in": {
                                "$cond": {
                                    "if": { "$eq": [ "$$one.name", name ] },
                                    "then": {
                                        "$mergeObjects": [
                                            "$$one",
                                            { "qnty": { "$add": [ "$$one.qnty", qntyChange ] } }
                                        ]
                                    },
                                    "else": "$$one"
                                }
                            }
                        }
                    },
                    "else": {
                        "$concatArrays": [
                            "$listOne",
                            [ { "name": name, "qnty": qntyChange } ]
                        ]
                    }
                }
            }
        }
    }]
);
The entire pipeline can be found on this Mongo Playground.
So my question is about how efficient this is. As I am paying for server time, I would like an efficient solution to this problem. Querying the collection four times, or even just twice on every call, seems like a bad idea, as the collection will have thousands of entries. The two lists, on the other hand, are not that big and should not exceed a thousand elements each. But the way it's written, it looks like it will iterate over each list about two times.
And besides, what worries me the most is: when I use $map to change the list and return the same object in cases where the identifier does not match, does MongoDB rewrite those elements too? Because not only would that increase my time on the server rewriting the entire list with the same objects, but it would also count towards the byte size of my write operation, which is also charged by MongoDB.
So if anyone has a better solution to this, I'm all ears.
According to this SO answer,
What you actually do inside of the document (push around an array, add a field) should not have any significant impact on the total cost of the operation
So, in your case, your array operations should not be causing a heavy impact on the total cost.

How can I return an element of a MongoDB array?

If I have a document that looks like this:
{
    "firstName": "John",
    "lastName": "Doe",
    "favoriteFoods": [{ "name": "Cheeseburgers" }, { "name": "Broccoli" }]
}
And I want to create a search expression in NodeJS that returns just the elements of favoriteFoods whose name matches req.body.term. How could I implement this? I have tried the code below, but it returns the entire document, which I don't want, because I then have to filter the array.
User.find({ "favoriteFoods.title": { $regex: req.body.term, $options: "i" } })
    .then((food) => {
        res.status(200).send(food);
    })
You can use Array.filter() to match the values:
res.status(200).send(
    food.favoriteFoods.filter(f => f.title.match(new RegExp(req.body.term, 'i')))
);
You used name in the example JSON but title in the code, so make sure you're using whichever of those is actually correct.
Also, allowing users to specify their own regular expressions could allow for Regex DOS attacks, so be warned of that.
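To guard against that, a common approach is to escape regex metacharacters in the user's term before building the RegExp, so the input is matched literally. A minimal sketch (the escapeRegex helper is my own):

```javascript
// Hypothetical helper: escape regex metacharacters so user-supplied input
// is treated as a literal string rather than interpreted as a pattern.
function escapeRegex(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const re = new RegExp(escapeRegex('chee.se'), 'i');
console.log(re.test('Chee.se'));  // true: literal, case-insensitive match
console.log(re.test('cheeXse'));  // false: the escaped dot no longer matches any character
```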
I don't know the exact desired result format, so here are multiple ways to get it:
Using $elemMatch in the projection stage:
db.collection.find({},
    {
        "favoriteFoods": {
            "$elemMatch": {
                "name": { "$regex": "chee", "$options": "i" }
            }
        }
    }
)
But be careful: $elemMatch only returns the first matching element. Check this example.
Using $filter in an aggregation stage: this query will return an array called "food" containing only the objects that match the regex.
db.collection.aggregate([
    {
        "$project": {
            "food": {
                "$filter": {
                    "input": "$favoriteFoods",
                    "cond": {
                        "$regexMatch": {
                            "input": "$$this.name",
                            "regex": "chee",
                            "options": "i"
                        }
                    }
                }
            }
        }
    }
])
It will return more than one object if multiple elements match. Example here
Using $unwind and $match: this query uses $unwind, which is not the best stage to reach for but is very useful. Using it with $match and $project you can get the result as an object instead of an array (keeping in mind that a Mongo result is always an array of documents, but each document will have a scalar food property rather than an array).
db.collection.aggregate([
    { "$unwind": "$favoriteFoods" },
    {
        "$match": {
            "favoriteFoods.name": { "$regex": "chee", "$options": "i" }
        }
    },
    {
        "$project": { "food": "$favoriteFoods.name" }
    }
])
Example here

Mongoose array filtering and returning filtered array

My document has a field roomname and a field users, which is an array:
['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7']
How can I get a filtered array of users from 'name2' to 'name5'?
I get the array from 'name1' to 'name7' with this code:
roommodel.find({roomname:'room1'},'users').then(res=>{
console.log(res)
})
When the number of users is small like this, one way is:
let filteredusers = res.slice(1, 4)
But with a huge array this may slow down the server. I want to know if there is a direct method.
You can use the $nin operator in the MongoDB query, like this:
roommodel.find(
    { roomname: 'room1', users: { $nin: [ 'name1', 'name7' ] } },
    'users'
).then(res => {
    console.log(res)
})
You can use the aggregation framework. Note that you will have to pass as input all the indexes of the users array that you want returned.
$match to filter the relevant document
$map to iterate over the input array of indexes
$arrayElemAt to return the element from the users array at each index
roommodel.aggregate([
    { "$match": { "roomname": "room1" } },
    {
        "$set": {
            "users": {
                "$map": {
                    "input": [ 2, 3, 4 ],
                    "in": { "$arrayElemAt": [ "$users", "$$this" ] }
                }
            }
        }
    }
])
Working example
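For a contiguous range like "from 'name2' to 'name5'", the $slice projection operator is arguably more direct, since it takes a [skip, limit] pair. A sketch, with a plain-JS illustration of what the projection returns:

```javascript
// Server-side sketch (not run here): $slice takes [skip, limit], so this
// projects 4 elements of users starting at index 1, i.e. name2..name5:
//   roommodel.find({ roomname: 'room1' }, { users: { $slice: [1, 4] } })
//
// The equivalent slice on plain data:
const users = ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7'];
const ranged = users.slice(1, 1 + 4);
console.log(ranged); // [ 'name2', 'name3', 'name4', 'name5' ]
```

This avoids spelling out each index, but only works for contiguous ranges; the $map/$arrayElemAt approach above handles arbitrary index sets.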

MongoDB - Updating multiple subarrays in an array

Is it possible to remove elements from multiple subarrays inside of one big array? My structure looks something like this:
{
    "_id": { "$oid": "" },
    "users": [
        { "friends": [ "751573404103999569" ] },
        { "friends": [ "220799458408005633" ] }
    ]
}
I have a friend id and I need to remove it from all the "friends" arrays in the "users" array.
You can do it with the all-positional operator $[], as follows:
db.no_more_friends.update(
    { "users.friends": "the_friend_id" },
    { $pull: { "users.$[].friends": "the_friend_id" } },
    { multi: true }
)
Just take into consideration that with { multi: true } it will remove the friend id from all friends sub-arrays in all documents where "the_friend_id" is found.
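For reference, this is roughly what the $pull with $[] does to each matched document, shown as a client-side transformation on sample data (the extra '999' entry is illustrative):

```javascript
// Client-side illustration of $pull with the all-positional operator $[]:
// remove the friend id from every nested friends array.
const doc = {
  users: [
    { friends: ['751573404103999569', '999'] },
    { friends: ['220799458408005633'] }
  ]
};
const friendId = '751573404103999569';
doc.users.forEach(u => {
  u.friends = u.friends.filter(f => f !== friendId);
});
console.log(JSON.stringify(doc.users));
// [{"friends":["999"]},{"friends":["220799458408005633"]}]
```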

Concatenate string values in array in a single field in MongoDB

Suppose that I have a series of documents with the following format:
{
"_id": "3_0",
"values": ["1", "2"]
}
and I would like to obtain a projection of the array's values concatenated in a single field:
{
"_id": "3_0",
"values": "1_2"
}
Is this possible? I have tried $concat but I guess I can't use $values as the array for $concat.
In modern MongoDB releases you can. You still cannot "directly" apply an array to $concat; however, you can use $reduce to work with the array elements and produce this:
db.collection.aggregate([
    { "$addFields": {
        "values": {
            "$reduce": {
                "input": "$values",
                "initialValue": "",
                "in": {
                    "$cond": {
                        "if": { "$eq": [ { "$indexOfArray": [ "$values", "$$this" ] }, 0 ] },
                        "then": { "$concat": [ "$$value", "$$this" ] },
                        "else": { "$concat": [ "$$value", "_", "$$this" ] }
                    }
                }
            }
        }
    }}
])
Combining of course with $indexOfArray in order not to "concatenate" with the "_" underscore when the element is at the "first" index of the array.
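The same logic can be checked client-side; this plain-JS fold mirrors the $reduce with the $indexOfArray condition (with the same caveat that indexOf finds the first occurrence, so a duplicate of the first element would lose its separator):

```javascript
// Plain-JS mirror of the $reduce pipeline above: no "_" before the
// element at index 0, "_" before every other element.
const values = ['1', '2', '3'];
const joined = values.reduce(
  (acc, v) => values.indexOf(v) === 0 ? acc + v : acc + '_' + v,
  ''
);
console.log(joined); // "1_2_3"
```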
Also my additional "wish" has been answered with $sum:
db.collection.aggregate([
    { "$addFields": {
        "total": { "$sum": "$items.value" }
    }}
])
This kind of question gets raised a bit in general with aggregation operators that take an array of items. The distinction here is that such operators mean an "array" of "arguments" provided in the coded representation, as opposed to an "array element" present in the current document.
The only way you can really do this kind of concatenation of items within an array present in the document is with some kind of JavaScript option, as with this example using mapReduce:
db.collection.mapReduce(
    function() {
        emit( this._id, { "values": this.values.join("_") } );
    },
    function() {},
    { "out": { "inline": 1 } }
)
Of course if you are not actually aggregating anything, then possibly the best approach is to simply do that "join" operation within your client code in post processing your query results. But if it needs to be used in some purpose across documents then mapReduce is going to be the only place you can use it.
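That client-side post-processing is a one-liner; a sketch assuming the query has already returned the documents (the second document is illustrative):

```javascript
// Post-processing query results client-side, as suggested above:
// join each document's values array into a single string field.
const docs = [
  { _id: '3_0', values: ['1', '2'] },
  { _id: '4_0', values: ['a', 'b', 'c'] }
];
const joined = docs.map(d => ({ _id: d._id, values: d.values.join('_') }));
console.log(JSON.stringify(joined));
// [{"_id":"3_0","values":"1_2"},{"_id":"4_0","values":"a_b_c"}]
```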
I could add that "for example" I would love for something like this to work:
{
    "items": [
        { "product": "A", "value": 1 },
        { "product": "B", "value": 2 },
        { "product": "C", "value": 3 }
    ]
}
And in aggregate:
db.collection.aggregate([
    { "$project": {
        "total": { "$add": [
            { "$map": {
                "input": "$items",
                "as": "i",
                "in": "$$i.value"
            }}
        ]}
    }}
])
But it does not work that way, because $add expects arguments as opposed to an array from the document. Sigh! :(. Part of the "by design" reasoning for this could be argued to be that "just because" it is an array or "list" of singular values being passed in from the result of the transformation, it is not "guaranteed" that those are actually "valid" singular numeric values that the operator expects. At least not with the currently implemented methods of "type checking".
That means for now we still have to do this:
db.collection.aggregate([
    { "$unwind": "$items" },
    { "$group": {
        "_id": "$_id",
        "total": { "$sum": "$items.value" }
    }}
])
And also sadly there would be no way to apply such a grouping operator to concatenate strings either.
So you can hope for some sort of change on this, or hope for some change that allows an externally scoped variable to be altered within the scope of a $map operation in some way. Better yet, a new $join operation would be welcome as well. But these do not exist as of this writing, and probably will not for some time to come.
You can use the $reduce operator together with the $substr operator.
db.collection.aggregate([
    {
        $project: {
            values: {
                $reduce: {
                    input: '$values',
                    initialValue: '',
                    in: { $concat: ['$$value', '_', '$$this'] }
                }
            }
        }
    },
    {
        $project: {
            values: { $substr: ['$values', 1, -1] }
        }
    }
])
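The second $project exists because the fold seeds the accumulator with an empty string, so the result starts with a stray separator that has to be trimmed off. The same two steps client-side, for illustration:

```javascript
// Client-side illustration of the reduce-then-trim approach: folding with
// a leading separator produces "_1_2", then the first character is dropped.
const values = ['1', '2'];
const reduced = values.reduce((acc, v) => acc + '_' + v, '');
console.log(reduced);              // "_1_2"
console.log(reduced.substring(1)); // "1_2"
```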
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom JavaScript function to implement behaviour not supported by the MongoDB Query Language.
For instance, in order to concatenate an array of strings:
// { "_id" : "3_0", "values" : [ "1", "2" ] }
db.collection.aggregate(
    { $set: {
        "values": { $function: {
            body: function(values) { return values.join('_'); },
            args: ["$values"],
            lang: "js"
        }}
    }}
)
// { "_id" : "3_0", "values" : "1_2" }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to join.
args, which contains the fields from the record that the body function takes as parameters; in our case "$values".
lang, which is the language in which the body function is written. Only js is currently available.
