I am writing a REST API which I want to make idempotent. I am currently struggling with nested arrays and idempotency. I want to update an item in the product_notes array in one atomic operation. Is that possible in MongoDB? Or do I have to store arrays as objects instead (see my example at the end of this post)? Is it, for example, possible to mimic the upsert behaviour but for arrays?
{
  username: "test01",
  product_notes: [
    { product_id: ObjectID("123"), note: "My comment!" },
    { product_id: ObjectID("124"), note: "My other comment" }
  ]
}
If I want to update the note for an existing product_note I just use the update command with $set, but what if the product_id isn't in the array yet? Then I would like to do an upsert, but that (as far as I know) isn't part of the embedded document/array operators.
One way to solve this, and make it idempotent, would be to just add a new collection product_notes to relate product_id and username.
But this feels like it violates the purpose of document-based databases.
Another solution:
{
  username: "test01",
  product_notes: {
    "123": { product_id: ObjectID("123"), note: "My comment!" },
    "124": { product_id: ObjectID("124"), note: "My other comment" }
  }
}
Does anyone a bit more experienced than me have anything to share regarding this?
My understanding of your requirement is that you would like to store unique product ids in the array for a user.
You could create a compound unique index on "username" and "product_notes.product_id". That way, when the same product id is inserted into the array, you would get an exception which you can catch and handle in the code, as you want the service to be idempotent.
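For reference, a minimal sketch of creating such an index in the mongo shell (the users collection name is an assumption; also note that MongoDB enforces unique indexes across documents, not within a single document's array):
// Sketch only: compound unique index on username + the array's product_id.
// "users" is an assumed collection name.
db.users.createIndex(
  { username: 1, "product_notes.product_id": 1 },
  { unique: true }
)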
In terms of adding a new element to the array (i.e. product_notes), I have used Spring Data, in which you need to fetch the document by primary key (i.e. the top-level attribute, for example "_id"), add the new element to the array, and then update the document.
In terms of updating an attribute in an existing array element:
Again, get the document by primary key (i.e. the top-level attribute, for example "_id").
Find the correct product id occurrence by iterating over the array data.
Replace the "[]" in the path product_notes.[].note with the position you found, e.g. product_notes.2.note.
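Standard MongoDB also offers the positional $ operator, which does the locate-and-update server-side in one atomic call. A sketch, with the users collection name assumed and the id taken from the question's example:
// The positional $ refers to the first array element matched by the
// query filter, so the update happens in one atomic operation.
db.users.updateOne(
  { username: "test01", "product_notes.product_id": ObjectId("123") },
  { $set: { "product_notes.$.note": "Updated note" } }
)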
For every document I have an array of admins who are allowed access to that document. The items in the array are all objects similar to this:
[
  {
    user_ID: "Wfdwwwrdfsdfsdf",
    avatar: "www.dfsfsd.com/dfdfd",
    name: "Ben Ben"
  },
  {
    user_ID: "Hdfsdbbf",
    avatar: "www.dfsfsd.com/popo",
    name: "Josh Josh"
  }
]
In my Firestore Rules I want to check that the user making the request is an admin, so I need to check if their uid is part of this array. In JS, I'd just create a new array from the admins array containing only the IDs, using map, and check if the ID is there. In Firestore Rules that doesn't seem to be an option. How can I get around this?
Do I have to create another array that only stores the IDs of admins for every document? That seems excessive.
I can't really find all the methods and functions that I can use when working with Firestore. All I find are examples for certain operations.
There is no way to do loops in the rules, so you won't be able to go through the objects and build an array of IDs. Having this array of admin IDs pre-calculated seems the best option; then you would just do
allow update: if request.auth.uid in resource.data.admins
The other option is to transform your array of admins into a map with the uid as the key. Then you don't need to duplicate the IDs.
{
  Wfdwwwrdfsdfsdf: {
    avatar: "www.dfsfsd.com/dfdfd",
    name: "Ben Ben"
  },
  Hdfsdbbf: {
    avatar: "www.dfsfsd.com/popo",
    name: "Josh Josh"
  }
}
The rule remains the same, since the in operator also checks a map's keys.
The reference containing all the functions you can use is here
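For completeness, here is a sketch of writing that map shape from a JS client; the modular Firebase SDK usage is real, but the documents collection name and the db/docId variables are assumptions:
// Hypothetical sketch (Firebase modular JS SDK); db and docId assumed.
import { doc, setDoc } from "firebase/firestore";

await setDoc(
  doc(db, "documents", docId),
  {
    admins: {
      Wfdwwwrdfsdfsdf: { avatar: "www.dfsfsd.com/dfdfd", name: "Ben Ben" },
      Hdfsdbbf: { avatar: "www.dfsfsd.com/popo", name: "Josh Josh" }
    }
  },
  { merge: true } // merge so other fields on the document are kept
);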
I am trying to check if a field exists in a sub-document of an array, and if it does, only those documents should be provided in the callback. But every time I log the callback document it gives me all the values in my array instead of only the ones matching the query.
I am following this tutorial
The only difference is that I am using the findOne function instead of the find function, but it still gives me back all the values. I tried using find and it does the same thing.
I am also using the same collection style as the example in the link above.
Example
In the image above you can see I have a document with a uid field and a contacts array. What I am trying to do is first select a document based on the inputted uid. After selecting that document, I want to display the values from the contacts array where the contacts.uid field exists. So from the image above, the only values that would be displayed are contacts[0] and contacts[3], because contacts[1] doesn't have a uid field.
Contact.contactModel.findOne({
  $and: [
    { uid: self.uid },
    {
      contacts: {
        $elemMatch: {
          uid: {
            $exists: true,
            $ne: undefined,
          },
        },
      },
    },
  ],
})
Your problems come from a misconception about data modeling in MongoDB, which is not uncommon for developers coming from other DBMSs. Let me illustrate this with an example of how data modeling works with an RDBMS vs MongoDB (and a lot of other NoSQL databases as well).
With an RDBMS, you identify your entities and their properties. Next, you identify the relations, normalize the data model and bang your head against the wall for a few hours to get the UPPER LEFT ABOVE AND BEYOND JOIN™ that will answer the questions arising from use case A. Then, you pretty much do the same for use case B.
With MongoDB, you would turn this upside down. Looking at your use cases, you would try to find out what information you need to answer the questions arising from the use case and then model your data so that those questions can get answered in the most efficient way.
Let us stick with your example of a contacts database. A few assumptions to be made here:
Each user can have an arbitrary number of contacts.
Each contact and each user need to be uniquely identified by something other than a name, because names can change and whatnot.
Redundancy is not a bad thing.
With the first assumption, embedding contacts into a user document is out of the question, since there is a document size limit. Regarding our second assumption: the uid field becomes not redundant but simply useless, as there already is the _id field uniquely identifying the data set in question.
The use cases
Let us look at some use cases. They are simplified for the sake of the example, but they will give you the picture.
Given a user, I want to find a single contact.
Given a user, I want to find all of his contacts.
Given a user, I want to find the details of his contact "John Doe"
Given a contact, I want to edit it.
Given a contact, I want to delete it.
The data models
User
{
  "_id": new ObjectId(),
  "name": new String(),
  "whatever": {}
}
Contact
{
  "_id": new ObjectId(),
  "contactOf": ObjectId(),
  "name": new String(),
  "phone": new String()
}
Obviously, contactOf refers to an ObjectId which must exist in the User collection.
The implementations
Given a user, I want to find a single contact.
If I have the user object, I have its _id, and the query for a single contact becomes as easy as
db.contacts.findOne({"contactOf":self._id})
Given a user, I want to find all of his contacts.
Equally easy:
db.contacts.find({"contactOf":self._id})
Given a user, I want to find the details of his contact "John Doe"
db.contacts.find({"contactOf":self._id,"name":"John Doe"})
Now that we have the contact one way or the other, including his/her/undecided/chose-not-to-say _id, we can easily edit or delete it:
Given a contact, I want to edit it.
db.contacts.update({"_id":contact._id},{$set:{"name":"John F Doe"}})
I trust that by now you get an idea on how to delete John from the contacts of our user.
Notes
Indices
With your data model, you would have needed to add additional indices for the uid fields - which, as we found out, serve no purpose. Furthermore, _id is indexed by default, so we make good use of this index. An additional index should be created on the contacts collection, however:
db.contacts.createIndex({"contactOf": 1, "name": 1})
Normalization
Not done here at all. The reasons for this are manifold, but the most important is that while John Doe might only have the mobile number of "Mallory H Ousefriend", his wife Jane Doe might also have the email address "janes_naughty_boy#censored.com" - which at least Mallory surely would not want to pop up in John's contact list. So even if we had identity of a contact, you most likely would not want to reflect that.
Conclusion
With a little bit of data remodeling, we reduced the number of additional indices we need to 1, made the queries much simpler and circumvented the BSON document size limit. As for the performance, I guess we are talking of at least one order of magnitude.
In the tutorial you mentioned above, they pass 2 parameters to the method: one for the filter and one for the projection. You passed just one; that's the difference. You can change your query to be like this:
Contact.contactModel.findOne(
  { uid: self.uid },
  {
    contacts: {
      $elemMatch: {
        uid: {
          $exists: true,
          $ne: undefined,
        },
      },
    },
  }
)
The aggregation framework makes filtering for the existence of a field a little tricky. I believe the OP wants all docs where a field exists in an array of subdocs, and then to return ONLY those subdocs where the field exists. The following should do the trick:
var inputtedUID = 0; // the value doesn't matter for the example

var c = db.foo.aggregate([
  // This $match finds the docs with our input UID:
  { $match: { "uid": inputtedUID } },

  // ... and the $addFields/$filter will strip out those entries in contacts
  // where contacts.uid does NOT exist. We wish we could use
  // { cond: { "$$zz.uid": { $exists: true } } } but we cannot use $exists
  // here, so we need the convoluted $ifNull treatment. Note we overwrite
  // the original contacts with the filtered contacts:
  { $addFields: { contacts: { $filter: {
      input: "$contacts",
      as: "zz",
      cond: { $ne: [ { $ifNull: [ "$$zz.uid", null ] }, null ] }
  }}}},

  { $limit: 1 } // just get 1, like findOne()
]);

c.forEach(printjson);
{
  "_id" : 0,
  "uid" : 0,
  "contacts" : [
    {
      "uid" : "buzz",
      "n" : 1
    },
    {
      "uid" : "dave",
      "n" : 2
    }
  ]
}
The internet is full of resources for dealing with arrays, but often objects are a more natural fit for data and seemingly more efficient.
I want to store key-value objects under dynamic field names like this:
project['en-US'] = { 'nav-back': 'Go back', ... }
project['pt-BR'] = { 'nav-back': 'Volte', ... }
Doing this seems like it would be more efficient than keeping an array of all languages and having to filter it to get all the entries for a given language.
My question is: how can I insert a key-value pair into an object with a dynamic name using mongoose? And would the object need to exist, or can I create it in one operation if it doesn't?
I tried this:
await Project.update(
  { _id: projectId },
  {
    $set: {
      [`${language}.${key}`]: value,
    },
  }
);
But no luck, regardless of whether I have an empty object there to begin with or not: { ok: 0, n: 0, nModified: 0 }.
Bonus: Should I index these objects and how? (I will want to update single items)
Thanks!
In mongoose, the schema is everything. It describes the data you are going to read from and store in the database. If you want to add a new key to the schema dynamically, it's going to be hard.
In this particular case I would recommend using the mongodb-native-driver, which is way more permissive about data manipulation. With it you can read the data in a specific format and dynamically add your field to it.
To summarize my thought, here is how your dynamic change should happen:
Use mongodb-native-driver to insert the new key into the database data
Modify the mongoose schema you have in the code (push a new key into it)
Use mongoose to manipulate the data afterward
Do not forget to dynamically update your mongoose model or you won't read the new key at the next find.
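A sketch of step 1, assuming the official MongoDB Node.js driver, a projects collection, and an existing connection URI (all names here are assumptions):
// Hypothetical sketch; the collection name and uri are assumptions.
const { MongoClient } = require("mongodb");

const client = await MongoClient.connect(uri);
const projects = client.db().collection("projects");

// $set with a dotted, computed path creates the intermediate object
// if it does not exist yet, so no empty object is needed beforehand.
await projects.updateOne(
  { _id: projectId },
  { $set: { [`${language}.${key}`]: value } }
);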
I solved this using the original code snippet unchanged, but adding { strict: false } to the schema:
const projectSchema = new Schema({ ...schema... }, { strict: false });
I've run into a bit of an issue with some data that I'm storing in my MongoDB (Note: I'm using mongoose as an ODM). I have two schemas:
mongoose.model('Buyer', {
  credit: Number,
})
and
mongoose.model('Item', {
  bid: Number,
  location: { type: [Number], index: '2d' }
})
Buyer/Item will have a parent/child association, with a one-to-many relationship. I know that I can set up Items to be embedded subdocs to the Buyer document or I can create two separate documents with object id references to each other.
The problem I am facing is that I need to query Items whose bid is lower than the Buyer's credit, but also whose location is near a certain geo coordinate.
To satisfy the first criterion, it seems I should embed Items as subdocs so that I can compare the two numbers. But in order to compare locations with a geoNear query, it seems it would be better to separate the documents; otherwise I can't perform geoNear on each subdocument.
Is there any way that I can perform both tasks on this data? If so, how should I structure my data? If not, is there a way that I can perform one query and then a second query on the result from the first query?
Thanks for your help!
There is another option (besides embedding and normalizing) for storing hierarchies in MongoDB: storing them as tree structures. In this case you would store Buyers and Items in separate documents but in the same collection. Each Item document would need a field pointing to its Buyer (parent) document, and each Buyer document's parent field would be set to null. The docs I linked to explain several implementations you could choose from.
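A minimal sketch of that parent-reference layout, with assumed names (a single parties collection, a type discriminator, integer ids for brevity):
// Sketch only; collection and field names are assumptions.
db.parties.insertMany([
  { _id: 1, type: "Buyer", credit: 100, parent: null },
  { _id: 2, type: "Item", bid: 40, location: [ -73.97, 40.77 ], parent: 1 },
  { _id: 3, type: "Item", bid: 250, location: [ -73.88, 40.78 ], parent: 1 }
]);

// All Items belonging to Buyer 1:
db.parties.find({ type: "Item", parent: 1 });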
If your items are stored in two separate collections, then the best option will be to write your own function and call it using mongoose.connection.db.eval('some code...');. In that case you can execute your advanced logic on the server side.
You can write something like this:
var allNearItems = db.Items.find({
  location: {
    $near: {
      $geometry: {
        type: "Point",
        coordinates: [ <longitude>, <latitude> ]
      },
      $maxDistance: 100
    }
  }
});

var res = [];
allNearItems.forEach(function (item) {
  // findOne instead of find()[0]; match on _id, the buyer's primary key
  var buyer = db.Buyers.findOne({ _id: item.buyerId });
  if (!buyer) return; // "continue" is invalid inside a forEach callback
  if (item.bid < buyer.credit) {
    res.push(item._id);
  }
});
return res;
After evaluation (place it in a mongoose.connection.db.eval("...") call) you will get the array of item ids.
Use it with caution: if your allNearItems array is too large, or you run this query very often, you may face performance problems. The MongoDB team has actually deprecated direct JS code execution, but it is still available in the current stable release.
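On reasonably recent MongoDB (3.6+ for $expr), the same question can also be answered without db.eval in a single aggregation. This is a sketch under assumptions: items/buyers collection names, a buyerId reference field on Item, and a 2dsphere index on location:
// Sketch only: $geoNear must be the first stage and needs a geospatial
// index; lng and lat are assumed to be defined.
db.items.aggregate([
  { $geoNear: {
      near: { type: "Point", coordinates: [lng, lat] },
      distanceField: "dist",
      maxDistance: 100,
      spherical: true
  }},
  // Pull in the buyer document for each item:
  { $lookup: {
      from: "buyers",
      localField: "buyerId",
      foreignField: "_id",
      as: "buyer"
  }},
  { $unwind: "$buyer" },
  // Keep only items whose bid is below their buyer's credit:
  { $match: { $expr: { $lt: ["$bid", "$buyer.credit"] } } },
  { $project: { _id: 1 } }
]);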
Can you suggest an algorithm for filtering out data?
I am using JavaScript and trying to write a filter function which filters an array of data. I have an array of data and an array of filters, so in order to apply each filter to every datum, I have written 2 for loops:
foreach (data) {
  foreach (filter) {
    check data with filter
  }
}
This is not the proper code, but in short that's what my function does. The problem is that this takes a huge amount of time; can someone suggest a better method?
I am using the MooTools library, and the array of data is a JSON array.
Details of the data and filters
The data is a JSON array of, let's say, users, so it will be:
data = [
  { "name": "first", "email": "first#first", "age": "20" },
  { "name": "second", "email": "second#second", "age": "21" },
  { "name": "third", "email": "third#third", "age": "22" }
]
The array of filters is basically made of self-defined classes for the different fields of the data:
alFilter[0] = filterName;
alFilter[1] = filterEmail;
alFilter[2] = filterAge;
So when I enter the first for loop, I get a single JSON object (the first row in the above case).
When I enter the second for loop (the filters loop), I have a filter class which extracts the exact field the current filter works on and checks the filter against the appropriate field of the data.
So in my example
foreach (data) {
  foreach (filter) {
    // loop one - filter name
    // loop two - filter email
    // loop three - filter age
  }
}
When the second loop ends, I set a flag denoting whether the datum has been filtered out or not, and depending on it the datum is displayed.
You're going to have to give us some more detail about the exact structure of your data and filters to really be able to help you out. Are the filters being used to select a subset of data, or to modify the data? What are the filters doing?
That said, there are a few general suggestions:
Do less work. Is there some way you can limit the amount of data you're working on? Some pre-filter that can run quickly and cut it down before you do your main loop?
Break out of the inner loop as soon as possible. If one of the filters rejects a datum, then break out of the inner loop and move on to the next datum (see the sketch after this list). If this is possible, then you should also try to make the most selective filters come first. (This is assuming that your filters are being used to reject items from the list, rather than modify them.)
Check for redundancy in the computation the filters perform. If each of them performs some complicated calculations that share some subroutines, then perhaps memoization or dynamic programming may be used to avoid redundant computation.
Really, it all boils down to the first point, do less work, at all three levels of your code. Can you do less work by limiting the items in the outer loop? Do less work by stopping after a particular filter and doing the most selective filters first? Do less work by not doing any redundant computation inside of each filter?
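A sketch of the early-exit idea, assuming the filters can be expressed as predicate functions that return true to keep a datum (an assumption, since the question doesn't show the filter classes):
// Array.prototype.every short-circuits, so a datum is rejected at the
// first failing filter and the remaining filters are skipped.
function applyFilters(data, filters) {
  return data.filter(function (datum) {
    return filters.every(function (filter) {
      return filter(datum);
    });
  });
}

// Usage with the question's data (filters ordered most selective first):
var adults = applyFilters(data, [
  function (d) { return Number(d.age) >= 21; },
  function (d) { return d.email.indexOf("#") !== -1; }
]);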
That's pretty much how you should do it. The trick is to optimize the "check data with filter" part. You need to traverse all your data and check it against all your filters; you're not going to get any faster than that.
Avoid string comparisons, use data models as native as possible, try to reduce the data set on each filter pass, etc.
Without further knowledge, it's hard to optimize this for you.
You should sort the application of your filters so that two things are optimized: expensive checks should come last, and checks that eliminate a lot of data should come first. Then, you should make sure that checking is cut short as soon as an "out" result occurs.
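To make that ordering concrete: if each filter carries an estimated cost and pass rate (both assumed metadata, not part of the original filters), a classic predicate-ordering result says to run them in ascending order of cost / (1 - passRate):
// Hypothetical sketch: cheap filters that reject a lot run first,
// which minimizes the expected work per datum.
var orderedFilters = filters.slice().sort(function (a, b) {
  return a.cost / (1 - a.passRate) - b.cost / (1 - b.passRate);
});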
If your filters are looking for specific values, a range, or start of a text then jOrder (http://github.com/danstocker/jorder) will fit your problem.
All you need to do is create a jOrder table like this:
var table = jOrder(data)
.index('name', ['name'], { grouped: true, ordered: true })
.index('email', ['email'])
.index('age', ['age'], { grouped: true, ordered: true, type: jOrder.number });
And then call table.where() to filter the table.
When you're looking for exact matches:
filtered = table.where([{name: 'first'}, {name: 'second'}]);
When you're looking for a certain range of one field:
filtered = table.where([{age: {lower: 20, upper: 21}}], {mode: jOrder.range});
Or, when you're looking for values starting with a given string:
filtered = table.where([{name: 'fir'}], {mode: jOrder.startof});
Filtering will be magnitudes faster this way than with nested loops.
Supposing that a filter removes the data if it doesn't match, I suggest that you switch the two loops, like so:
foreach (filter) {
  foreach (data) {
    check data with filter
  }
}
By doing so, the second filter doesn't have to process all the data, but only the data that passed the first filter, and so on. Of course the tips above (like doing expensive checks last) are still true and should additionally be considered.
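A sketch of that loop order, again assuming predicate-style filters (the same assumption as in the earlier sketch):
// Each pass shrinks the working set, so later (ideally more expensive)
// filters only see the survivors of the earlier ones.
function applyFiltersInPasses(data, filters) {
  var remaining = data;
  filters.forEach(function (filter) {
    remaining = remaining.filter(filter);
  });
  return remaining;
}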