I've run into a bit of an issue with some data that I'm storing in my MongoDB (Note: I'm using mongoose as an ODM). I have two schemas:
mongoose.model('Buyer', {
  credit: Number
})
and
mongoose.model('Item', {
  bid: Number,
  location: { type: [Number], index: '2d' }
})
Buyer/Item will have a parent/child association, with a one-to-many relationship. I know that I can set up Items to be embedded subdocs to the Buyer document or I can create two separate documents with object id references to each other.
The problem I am facing is that I need to query Items whose bid is lower than the Buyer's credit, but also whose location is near a certain geo coordinate.
To satisfy the first criterion, it seems I should embed Items as subdocs so that I can compare the two numbers. But in order to compare locations with a geoNear query, it seems it would be better to keep the documents separate, since otherwise I can't perform geoNear on each subdocument.
Is there any way that I can perform both tasks on this data? If so, how should I structure my data? If not, is there a way that I can perform one query and then a second query on the result from the first query?
Thanks for your help!
There is another option (besides embedding and normalizing) for storing hierarchies in MongoDB: storing them as tree structures. In this case you would store Buyers and Items in separate documents but in the same collection. Each Item document would need a field pointing to its Buyer (parent) document, and each Buyer document's parent field would be set to null. The docs I linked to explain several implementations you could choose from.
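A minimal sketch of the parent-reference flavor of this pattern, assuming a shared entities collection and a parent field (both names are mine, not from the docs):
var buyerId = ObjectId();
db.entities.insert({ _id: buyerId, type: "buyer", credit: 100, parent: null });
db.entities.insert({ type: "item", bid: 42, location: [2.35, 48.85], parent: buyerId });
// all items belonging to that buyer:
db.entities.find({ type: "item", parent: buyerId });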
If your items are stored in two separate collections, then the best option is to write your own function and call it using mongoose.connection.db.eval('some code...');. That way you can execute your advanced logic on the server side.
You can write something like this:
var allNearItems = db.Items.find({
  location: {
    $near: {
      $geometry: {
        type: "Point",
        coordinates: [ <longitude>, <latitude> ]
      },
      $maxDistance: 100
    }
  }
});

var res = [];
allNearItems.forEach(function (item) {
  var buyer = db.Buyers.findOne({ _id: item.buyerId });
  if (!buyer) return; // "continue" is invalid inside forEach; return skips this item
  if (item.bid < buyer.credit) {
    res.push(item._id);
  }
});

return res;
After evaluation (place it in a mongoose.connection.db.eval("...") call) you will get the array of item ids.
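For illustration only, a hedged sketch of what that call might look like (db.eval is deprecated; the callback style is the old Node driver's):
mongoose.connection.db.eval(
  "function (maxDistance) { /* the find/forEach logic above */ return res; }",
  [100],                    // parameters handed to the server-side function
  function (err, result) {  // result is the returned array of item ids
    if (err) throw err;
    console.log(result);
  }
);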
Use it with caution. If your allNearItems array is too large, or you run the query very often, you may face performance problems. The MongoDB team has actually deprecated direct JS code execution, but it is still available in the current stable release.
I am trying to check if a field exists in a sub-document of an array and, if it does, have the callback provide only those documents. But every time I log the callback document it gives me all the values in my array instead of only the ones matching the query.
I am following this tutorial
The only difference is that I am using the findOne function instead of the find function, but it still gives me back all the values. I tried using find and it does the same thing.
I am also using the same collection style as the example in the link above.
Example
I have a document with a uid field and a contacts array. What I am trying to do is first select a document based on the inputted uid. Then, after selecting that document, I want to display the values from the contacts array where the contacts.uid field exists. So only contacts[0] and contacts[3] would be displayed, because contacts[1] doesn't have a uid field.
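The shape being described is roughly this (values are placeholders):
{
  "uid": "0",
  "contacts": [
    { "uid": "buzz", "name": "..." },  // has uid -> keep
    { "name": "..." },                 // no uid  -> filter out
    { "uid": "dave", "name": "..." }   // has uid -> keep
  ]
}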
Contact.contactModel.findOne({ $and: [
  { uid: self.uid },
  { contacts: {
      $elemMatch: {
        uid: {
          $exists: true,
          $ne: undefined
        }
      }
  }}
]})
Your problems come from a misconception about data modeling in MongoDB, which is not uncommon for developers coming from other DBMSs. Let me illustrate this with the example of how data modeling works with an RDBMS vs MongoDB (and a lot of the other NoSQL databases as well).
With an RDBMS, you identify your entities and their properties. Next, you identify the relations, normalize the data model, and bang your head against the wall for a while to get the UPPER LEFT ABOVE AND BEYOND JOIN™ that will answer the questions arising from use case A. Then you pretty much do the same for use case B.
With MongoDB, you would turn this upside down. Looking at your use cases, you would try to find out what information you need to answer the questions arising from the use case and then model your data so that those questions can get answered in the most efficient way.
Let us stick with your example of a contacts database. A few assumptions to be made here:
Each user can have an arbitrary number of contacts.
Each contact and each user need to be uniquely identified by something other than a name, because names can change and whatnot.
Redundancy is not a bad thing.
With the first assumption, embedding contacts into a user document is out of the question, since there is a document size limit. Regarding our second assumption: the uid field becomes not redundant, but simply useless, as there already is the _id field uniquely identifying the data set in question.
The use cases
Let us look at some use cases, which are simplified for the sake of the example, but it will give you the picture.
Given a user, I want to find a single contact.
Given a user, I want to find all of his contacts.
Given a user, I want to find the details of his contact "John Doe"
Given a contact, I want to edit it.
Given a contact, I want to delete it.
The data models
User
{
  "_id": new ObjectId(),
  "name": new String(),
  "whatever": {}
}
Contact
{
  "_id": new ObjectId(),
  "contactOf": ObjectId(),
  "name": new String(),
  "phone": new String()
}
Obviously, contactOf refers to an ObjectId which must exist in the User collection.
The implementations
Given a user, I want to find a single contact.
If I have the user object, I have its _id, and the query for a single contact becomes as easy as
db.contacts.findOne({"contactOf":self._id})
Given a user, I want to find all of his contacts.
Equally easy:
db.contacts.find({"contactOf":self._id})
Given a user, I want to find the details of his contact "John Doe"
db.contacts.find({"contactOf":self._id,"name":"John Doe"})
Now we have the contact one way or the other, including his/her/undecided/choose not to say _id, we can easily edit/delete it:
Given a contact, I want to edit it.
db.contacts.update({"_id":contact._id},{$set:{"name":"John F Doe"}})
I trust that by now you get an idea on how to delete John from the contacts of our user.
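For completeness, one plausible one-liner (a sketch):
db.contacts.remove({ "_id": contact._id })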
Notes
Indices
With your data model, you would have needed to add additional indices for the uid fields, which serve no purpose, as we found out. Furthermore, _id is indexed by default, so we make good use of this index. An additional index should be created on the contact collection, however:
db.contact.ensureIndex({"contactOf":1,"name":1})
Normalization
Not done here at all. The reasons for this are manifold, but the most important is that while John Doe might only have the mobile number of "Mallory H Ousefriend", his wife Jane Doe might also have the email address "janes_naughty_boy@censored.com" - which at least Mallory surely would not want to pop up in John's contact list. So even if we had identity of a contact, you most likely would not want to reflect that.
Conclusion
With a little bit of data remodeling, we reduced the number of additional indices we need to 1, made the queries much simpler and circumvented the BSON document size limit. As for the performance, I guess we are talking of at least one order of magnitude.
In the tutorial you mentioned above, they pass 2 parameters to the method, one for the filter and one for the projection, but you passed just one; that's the difference. You can change your query to be like this:
Contact.contactModel.findOne(
  { uid: self.uid },
  { contacts: {
      $elemMatch: {
        uid: {
          $exists: true,
          $ne: undefined
        }
      }
  }}
)
The agg framework makes filtering for existence of a field a little tricky. I believe the OP wants all docs where a field exists in an array of subdocs and then to return ONLY those subdocs where the field exists. The following should do the trick:
var inputtedUID = "0"; // doesn't matter

var c = db.foo.aggregate([
  // This $match finds the docs with our input UID:
  {$match: {"uid": inputtedUID}}

  // ... and the $addFields/$filter will strip out those entries in contacts
  // where contacts.uid does NOT exist. We wish we could write
  // {cond: {"$$zz.uid": {$exists: true}}} but $exists cannot be used here,
  // so we need the convoluted $ifNull treatment. Note we overwrite the
  // original contacts with the filtered contacts:
  ,{$addFields: {contacts: {$filter: {
    input: "$contacts",
    as: "zz",
    cond: {$ne: [{$ifNull: ["$$zz.uid", null]}, null]}
  }}
  }}

  ,{$limit: 1} // just get 1 like findOne()
]);

c.forEach(printjson);
{
  "_id": 0,
  "uid": 0,
  "contacts": [
    { "uid": "buzz", "n": 1 },
    { "uid": "dave", "n": 2 }
  ]
}
I'm relatively new to NoSQL, but I have been enjoying the journey very much! I am, however, finding the map-reduce way of life a bit tricky, and I need some help with a problem.
I have a database with two types of documents: opening transactions and closing transactions. For replication and offline-functionality reasons I cannot merge the data into one document. The opening transaction document looks something like:
{
  _id: "transaction-open-randomgeneratedstring",
  type: "transactions-open",
  vehicle: "vehicle-id",
  created: "date string"
}
The closing document looks something like:
{
  _id: "transaction-close-randomgeneratedstring",
  type: "transactions-close",
  openid: "transaction-open-randomgeneratedstring",
  created: "date string"
}
The randomgeneratedstring of a closing transaction matches the randomgeneratedstring of the corresponding opening transaction.
I need a map-reduce to give me the list of open transactions that do not have a corresponding closing transaction. This will basically give me a list of outstanding transactions.
This is the map-reduce I have thus far, but it is not doing the job.
{
  "map": function(doc) {
    if (doc.type == "transactions-open") {
      emit([doc._id, 0], "OPEN");
    }
    if (doc.type == "transactions-close") {
      emit([doc.openid, 1], "CLOSE");
    }
  },
  "reduce": function(keys, values, rereduce) {
    var unique_labels = {};
    var open = {};
    keys.forEach(function(label) {
      if (!unique_labels[label[0]]) {
        unique_labels[label[0]] = true;
      } else {
        open[label[0]] = true;
      }
    });
    return open;
  }
}
I am open for changes in the _id naming / structure, but I cannot combine the two documents into one.
Thanks!
EDIT
Based on response from Hod, I changed the reduce to look like:
function(keys, values, rereduce) {
  if (values.length == 1)
    return true;
}
This is certainly a step in the right direction, but the unwanted transactions are still in the result set; their value is just null. Is there no way to get those out of the result set?
As described - what you would do with a join in SQL, you do with a reduce in CouchDB. Something like this - not tested:
{
  "map": function(doc) {
    if (doc.type == "transactions-open") {
      emit([doc._id], 1);
    }
    if (doc.type == "transactions-close") {
      emit([doc.openid], -1);
    }
  },
  "reduce": "_sum"
}
So we emit a 1 for an open transaction under an ID and a -1 for a close under the same ID. Now when you reduce you will get a result for each ID of:
-1 = Closed with no record of an open (error condition).
0 = Opened and Closed
1 = Open and not yet closed.
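Querying that view grouped by key then surfaces the per-ID sums; a sketch, assuming the view is saved in a design document (the names here are mine):
// GET /mydb/_design/transactions/_view/open_state?group=true
// -> { "rows": [ { "key": ["transaction-open-abc"], "value": 1 }, ... ] }

// client side, with body the parsed response, keep only the outstanding ids:
var outstanding = body.rows
  .filter(function (row) { return row.value === 1; })
  .map(function (row) { return row.key[0]; });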
The problem is with the keys parameter in your reduce function. The reduce phase is not called once with all possible keys. It's called per distinct key, and based on the group_level you specify.
Looking at your code, if you haven't specified any group_level, your reduce function is going to get called for every document separately.
Because you're emitting the id of the open transaction doc for both open and close markers, if you grouped at the first level, you'd get open or open/close pairs. You're still only getting a reduction on a limited set of docs at a time.
You could fix this either in your logic calling the query, or by emitting a key that lets you reduce over the entire set at once. (I imagine there are other ways too. These are the ones that come to mind.)
If you use the key approach, you'd need to emit something that looked like ["transaction", doc._id, 0]. Then a first-level grouping would give you the whole transaction set like your current code expects.
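A sketch of the reworked map function under that approach (the constant first key element is the only change):
function(doc) {
  if (doc.type == "transactions-open") {
    emit(["transaction", doc._id, 0], "OPEN");
  }
  if (doc.type == "transactions-close") {
    emit(["transaction", doc.openid, 1], "CLOSE");
  }
}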
EDIT (Adding information based on edit of question.)
The reduce function is going to get called with whatever grouping you set up. It's always going to return something, even if it's just no results emitted (i.e. null).
If you don't want to handle that in the logic that's running the queries and processing the results, you need to use an approach that will allow you to group all the transaction documents together, instead of just the documents for a single transaction.
Based on what you've done so far, another approach would be to forgo the reduce phase and just look at the number of results returned by a query that's limited to the unique doc id.
I am writing a REST api which I want to make idempotent. I am kind of struggling right now with nested arrays and idempotency. I want to update an item in product_notes array in one atomic operation. Is that possible in MongoDB? Or do I have to store arrays as objects instead (see my example at the end of this post)? Is it for example possible to mimic the upsert behaviour but for arrays?
{
  username: "test01",
  product_notes: [
    { product_id: ObjectID("123"), note: "My comment!" },
    { product_id: ObjectID("124"), note: "My other comment" }
  ]
}
If I want to update the note for an existing product note I just use the update command and $set, but what if the product_id isn't in the array yet? Then I would like to do an upsert, but that (as far as I know) isn't part of the embedded document/array operators.
One way to solve this, and make it idempotent, would be to just add a new collection product_notes that relates product_id and username.
This feels like violating the purpose of document-based databases.
Another solution:
{
  username: "test01",
  product_notes: {
    "123": { product_id: ObjectID("123"), note: "My comment!" },
    "124": { product_id: ObjectID("124"), note: "My other comment" }
  }
}
Anyone a bit more experienced than me who have anything to share regarding this?
My understanding of your requirement is that you would like to store unique product ids (an array) for a user.
You could create a composite unique index on "username" and "product_notes.product_id", so that when the same product id is inserted into the array you would get an exception, which you can catch and handle in the code since you want the service to be idempotent.
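A sketch of creating the index described above in the shell (the collection name is assumed):
db.users.createIndex(
  { "username": 1, "product_notes.product_id": 1 },
  { unique: true }
);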
In terms of adding a new element to the array (i.e. product_notes), I have used Spring Data, in which you get the document by primary key (i.e. the top-level attribute, for example "_id"), add the new element to the array, and update the document.
In terms of updating an attribute in an existing array element (a shell sketch of the equivalent atomic updates follows below):
Again, get the document by primary key (i.e. the top-level attribute, for example "_id").
Find the correct product_id occurrence by iterating over the array data.
Replace the "[]" in product_notes.[].note with the index of that occurrence (e.g. product_notes.2.note) and update the document.
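In the shell, both cases can each be handled in one atomic statement; a sketch (the collection name is assumed):
// update an existing note in place via the positional operator:
db.users.update(
  { username: "test01", "product_notes.product_id": ObjectID("123") },
  { $set: { "product_notes.$.note": "My updated comment" } }
);

// insert the note only if that product_id is not in the array yet:
db.users.update(
  { username: "test01", "product_notes.product_id": { $ne: ObjectID("123") } },
  { $push: { product_notes: { product_id: ObjectID("123"), note: "My comment!" } } }
);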
I'm trying to test out Firebase to allow users to post comments using push. I want to display the data I retrieve with the following;
fbl.child('sell').limit(20).on("value", function(fbdata) {
  // handle data display here
});
The problem is the data is returned in order of oldest to newest - I want it in reversed order. Can Firebase do this?
Since this answer was written, Firebase has added a feature that allows ordering by any child or by value. So there are now four ways to order data: by key, by value, by priority, or by the value of any named child. See this blog post that introduces the new ordering capabilities.
The basic approaches remain the same though:
1. Add a child property with the inverted timestamp and then order on that.
2. Read the children in ascending order and then invert them on the client.
Firebase supports retrieving child nodes of a collection in two ways:
by name
by priority
What you're getting now is by name, which happens to be chronological. That's no coincidence btw: when you push an item into a collection, the name is generated to ensure the children are ordered in this way. To quote the Firebase documentation for push:
The unique name generated by push() is prefixed with a client-generated timestamp so that the resulting list will be chronologically-sorted.
The Firebase guide on ordered data has this to say on the topic:
How Data is Ordered
By default, children at a Firebase node are sorted lexicographically by name. Using push() can generate child names that naturally sort chronologically, but many applications require their data to be sorted in other ways. Firebase lets developers specify the ordering of items in a list by specifying a custom priority for each item.
The simplest way to get the behavior you want is to also specify an always-decreasing priority when you add the item:
var ref = new Firebase('https://your.firebaseio.com/sell');
var item = ref.push();
item.setWithPriority(yourObject, 0 - Date.now());
Update
You'll also have to retrieve the children differently:
fbl.child('sell').startAt().limitToLast(20).on('child_added', function(fbdata) {
  console.log(fbdata.exportVal());
});
In my test, using on('child_added') ensures that the last few children added are returned in reverse chronological order. Using on('value'), on the other hand, returns them in the order of their name.
Be sure to read the section "Reading ordered data", which explains the usage of the child_* events to retrieve (ordered) children.
A bin to demonstrate this: http://jsbin.com/nonawe/3/watch?js,console
Since Firebase 2.0.x you can use limitToLast() to achieve that:
fbl.child('sell').orderByValue().limitToLast(20).on("value", function(fbdataSnapshot) {
  // fbdataSnapshot is returned in ascending order;
  // you will still need to put these 20 items
  // in descending order yourself
});
Here's a link to the announcement: More querying capabilities in Firebase
To augment Frank's answer, it's also possible to grab the most recent records--even if you haven't bothered to order them using priorities--by simply using endAt().limit(x) like this demo:
var fb = new Firebase(URL);

// listen for all changes and update
fb.endAt().limit(100).on('value', update);

// print the output of our array
function update(snap) {
  var list = [];
  snap.forEach(function(ss) {
    var data = ss.val();
    data['.priority'] = ss.getPriority();
    data['.name'] = ss.name();
    list.unshift(data);
  });
  // print/process the results...
}
Note that this is quite performant even up to perhaps a thousand records (assuming the payloads are small). For more robust usages, Frank's answer is authoritative and much more scalable.
This brute force can also be optimized to work with bigger data or more records by doing things like monitoring child_added/child_removed/child_moved events in lieu of value, and using a debounce to apply DOM updates in bulk instead of individually.
DOM updates, naturally, are a stinker regardless of the approach, once you get into the hundreds of elements, so the debounce approach (or a React.js solution, which is essentially an uber debounce) is a great tool to have.
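A rough sketch of that debounce idea (the render function is a placeholder):
var pending = [], timer = null;

function queueUpdate(snap) {
  pending.push(snap.val());
  clearTimeout(timer);
  timer = setTimeout(function() {
    renderAll(pending);  // hypothetical bulk DOM render
    pending = [];
  }, 100);               // flush at most ~10 times per second
}

fb.on('child_added', queueUpdate);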
There is really no server-side way, but since we have RecyclerView we can reverse the layout:
query = mCommentsReference.orderByChild("date_added");
query.keepSynced(true);

// Initialize Views
mRecyclerView = (RecyclerView) view.findViewById(R.id.recyclerView);
mManager = new LinearLayoutManager(getContext());
mManager.setReverseLayout(true);
mManager.setStackFromEnd(true);
mRecyclerView.setHasFixedSize(true);
mRecyclerView.setLayoutManager(mManager);
I have a date variable (long) and wanted to keep the newest items on top of the list. So what I did was:
Add a new long field 'dateInverse'
Add a new method called 'getDateInverse', which just returns: Long.MAX_VALUE - date;
Create my query with: .orderByChild("dateInverse")
Presto! :p
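The same idea in JavaScript, as a sketch (field names from the steps above):
var date = Date.now();
ref.push({ title: "...", date: date, dateInverse: Number.MAX_SAFE_INTEGER - date });

// ascending order on dateInverse = newest first
ref.orderByChild("dateInverse").limitToFirst(20);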
You are searching for limitToLast(int x). This will give you the x highest elements of your database, but they are returned in ascending order.
If your database holds {10, 300, 150, 240, 2, 24, 220}, this method:
myFirebaseRef.orderByChild("highScore").limitToLast(4)
will retrieve: {150, 220, 240, 300}
In Android there is a way to actually reverse the data in an Arraylist of objects through the Adapter. In my case I could not use the LayoutManager to reverse the results in descending order since I was using a horizontal Recyclerview to display the data. Setting the following parameters to the recyclerview messed up my UI experience:
llManager.setReverseLayout(true);
llManager.setStackFromEnd(true);
The only working way I found around this was through the BindViewHolder method of the RecyclerView adapter:
@Override
public void onBindViewHolder(final RecyclerView.ViewHolder holder, int position) {
    final SuperPost superPost = superList.get(getItemCount() - position - 1);
}
Hope this answer will help all the devs out there who are struggling with this issue in Firebase.
Firebase: How to display a thread of items in reverse order with a limit for each request and an indicator for a "load more" button.
This will get the last 10 items of the list
FBRef.child("childName")
.limitToLast(loadMoreLimit) // loadMoreLimit = 10 for example
This will get the last 10 items. Grab the id of the last record in the list and save it for the load-more functionality. Next, convert the collection of objects into an array and do a list.reverse().
LOAD MORE Functionality: The next call will do two things, it will get the next sequence of list items based on the reference id from the first request and give you an indicator if you need to display the "load more" button.
this.FBRef
.child("childName")
.endAt(null, lastThreadId) // Get this from the previous step
.limitToLast(loadMoreLimit+2)
You will need to strip the first and last item of this object collection. The first item is the reference to get this list. The last item is an indicator for the show more button.
I have a bunch of other logic that will keep everything clean. You will need to add this code only for the load more functionality.
list = snapObjectAsArray; // The list is an array from snapObject
lastItemId = key;         // get the first key of the list

if (list.length < loadMoreLimit + 1) {
  lastItemId = false;
}
if (list.length > loadMoreLimit + 1) {
  list.pop();
}
if (list.length > loadMoreLimit) {
  list.shift();
}

// Return the list.reverse() and lastItemId.
// If lastItemId is an ID, it will be used for the next reference and as a flag to show the "load more" button.
I'm using ReactFire for easy Firebase integration.
Basically, it helps me store the data in the component state as an array. Then all I have to use is the reverse() function (read more).
Here is how I achieve this :
import React, { Component, PropTypes } from 'react';
import ReactMixin from 'react-mixin';
import ReactFireMixin from 'reactfire';
import Firebase from '../../../utils/firebaseUtils'; // Firebase.initializeApp(config);

@ReactMixin.decorate(ReactFireMixin)
export default class Add extends Component {
  constructor(args) {
    super(args);
    this.state = {
      articles: []
    };
  }

  componentWillMount() {
    let ref = Firebase.database().ref('articles').orderByChild('insertDate').limitToLast(10);
    this.bindAsArray(ref, 'articles'); // bind retrieved data to this.state.articles
  }

  render() {
    return (
      <div>
        {
          this.state.articles.reverse().map(function(article) {
            return <div>{article.title}</div>;
          })
        }
      </div>
    );
  }
}
There is a better way: order by a negative server timestamp. How do you get a negative server timestamp, even offline? There is a hidden field that helps. Related snippet from the documentation:
var offsetRef = new Firebase("https://<YOUR-FIREBASE-APP>.firebaseio.com/.info/serverTimeOffset");
offsetRef.on("value", function(snap) {
  var offset = snap.val();
  var estimatedServerTimeMs = new Date().getTime() + offset;
});
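With that offset in hand, a write could then store the negative estimated server time as its sort key; a sketch (the field name is mine):
ref.push({
  text: "hello",
  negServerTime: -estimatedServerTimeMs // ascending order = newest first
});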
To add to Dave Vávra's answer, I use a negative timestamp as my sort_key like so
Setting
const timestamp = new Date().getTime();
const data = {
name: 'John Doe',
city: 'New York',
sort_key: timestamp * -1 // Gets the negative value of the timestamp
}
Getting
const ref = firebase.database().ref('business-images').child(id);
const query = ref.orderByChild('sort_key');
return $firebaseArray(query); // AngularFire function
This fetches all objects from newest to oldest. You can also add an .indexOn rule for sort_key to make it run even faster.
I had this problem too, and I found a very simple solution that doesn't involve manipulating the data in any way. If you are rendering the result to the DOM as a list of some sort, you can use flexbox and set up a class to reverse the elements in their container.
.reverse {
display: flex;
flex-direction: column-reverse;
}
Alternatively, reverse the array itself: myarray.reverse(); or this.myitems = items.map(item => item).reverse();
I did this by prepending:
query.orderByChild('sell').limitToLast(4).on("value", function(snapshot){
snapshot.forEach(function (childSnapshot) {
// PREPEND
});
});
Someone has pointed out that there are 2 ways to do this:
Manipulate the data client-side
Make a query that will order the data
The easiest way that I have found to do this is option 1, via a LinkedList: I just prepend each object to the front of the list. It is flexible enough to still allow the list to be used in a ListView or RecyclerView. This way, even though the items arrive oldest to newest, you can still view or retrieve them newest to oldest.
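The same prepend idea in JavaScript, as a sketch:
var items = [];
ref.on('child_added', function(snap) {
  items.unshift(snap.val()); // newest ends up at index 0
});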
You can add a column named orderColumn where you save the time as:
long referenceTime = /* some large future time */;
long currentTime = System.currentTimeMillis();
long order = referenceTime - currentTime;
Now save order in the orderColumn, and when you retrieve the data with orderBy(orderColumn) you will get what you need.
Just use reverse() on the array. Suppose you are storing the values in an array items[]; then do a this.items.reverse():
ref.subscribe(snapshots => {
  this.loading.dismiss();
  this.items = [];
  snapshots.forEach(snapshot => {
    this.items.push(snapshot);
  });
  this.items.reverse();
});
For me it was limitToLast that worked. I also found out that limitLast is NOT a function:)
const query = messagesRef.orderBy('createdAt', 'asc').limitToLast(25);
The above is what worked for me.
PRINT in reverse order
Let's think outside the box... If your information will be printed directly onto the user's screen (without any content that needs to be computed in consecutive order, like a sum or something), simply print from bottom to top.
So, instead of inserting each new block of content at the end of the print space (A += B), add that block to the beginning (A = B + A).
If you include the elements as a consecutively ordered list, the DOM can put the numbers in for you if you insert each element as a list item (<li>) inside an ordered list (<ol>).
This way you save space in your database, avoiding unnecessarily storing reversed data.
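A sketch of that bottom-to-top printing (the element id and field are assumptions):
var list = document.getElementById('list'); // an <ol> in the page
ref.on('child_added', function(snap) {
  var li = document.createElement('li');
  li.textContent = snap.val().title;        // hypothetical field
  list.insertBefore(li, list.firstChild);   // A = B + A
});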
I'm trying to make a MongoDB database for a simple voting system. If I were to draw a schema, then it looks something like this:
User {
name: String
, email: String
}
Vote {
message: String
, voters: [ObjectId(User._id)]
}
I have some questions about this design when there are a lot of voters for one vote:
Sending the whole voters array to the client side (not to mention caching it in memory) is very expensive, right? Is there a way to get the Vote in a "shallow" way, so that when I need vote.voters it makes another database request for the array of voters?
If a voter has voted already, I want to detect that and not count his vote again. To do that, is there a query I can run against the array of embedded voters to quickly find him?
When showing votes, I'd want to show the number of votes without fetching the voters array to the client side. Is there some kind of count query I can run to get the length of the voters array?
I would add a bit of redundancy to the schema to avoid some of the potential problems you mention. I assume you want to 1) quickly count the number of votes and 2) make sure a user cannot vote twice.
One way to achieve this is to keep both a list of users and a count of votes, and add a clause to the update query that makes sure that a vote is only added if the user's ID is not in the list of voters:
var query = {_id: xyz, voters: {$ne: user_id}}
var update = {$inc: {votes: 1}, $push: {voters: user_id}}
db.votes.update(query, update, true)
(the last parameter is upsert, a very nice feature of Mongo)
Then, if you want to show the number of votes you can limit the properties of the result to the votes property:
db.votes.find({_id: xyz}, {votes: true})
You can find a complete description of more or less exactly what you want to do here: http://cookbook.mongodb.org/patterns/votes/
1) You can specify only to return a subset of fields to the client (docs).
e.g. to find a specific message and only return "AnotherField".
db.Vote.find({ message : "search for this message" }, { AnotherField : 1 } );
2) You could use the $addToSet operator to add a voter into the voters array (docs) which (quote):
Adds value to the array only if it's not in the array already
e.g.
{ $addToSet : { voters: "Bob"} }
3) You could store the count as an extra field in the document and then just return that.
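Combining 2) and 3), a sketch of keeping a counter in step with the set (field names are mine):
// add the voter and bump the count only if "Bob" hasn't voted yet:
db.Vote.update(
  { _id: voteId, voters: { $ne: "Bob" } },
  { $addToSet: { voters: "Bob" }, $inc: { count: 1 } }
);

// read just the count, never the array:
db.Vote.find({ _id: voteId }, { count: 1 });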
Hope this helps.