Azure function update CosmosDB SQL array in document - javascript

I am using Azure Functions and the Cosmos DB SQL API to build a serverless application in JavaScript.
I have the following schema for a user document:
{
  "id": "user_id_2",
  "username": "username_2",
  "pass": "pass_1",
  "feed": [],
  "followed": [
    "username_1",
    "username_3",
    "username_4"
  ],
  "followers": [
    "username_3",
    "username_4"
  ]
}
Currently, when user_1 follows user_2, I update the document for user_1 - no problem. But now I also need to update the document for user_2, specifically its followers array of strings. How can I do that through an Azure Function with bindings? The only way I have come up with is to query the database for the whole document, update it client-side, and then PUT it back, overwriting the previous document. However, this seems ridiculous...

Cosmos DB does not support partial updates at this moment, so pulling the document, adding the item to the array and then doing the PUT is the only option.
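For illustration, a minimal sketch of that read-modify-replace round trip with the @azure/cosmos JavaScript SDK. The database/container names and the assumption that id doubles as the partition key are mine, not from the question:

  const { CosmosClient } = require("@azure/cosmos");

  // Hypothetical names: adjust database, container and partition key to your account
  const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING);
  const container = client.database("app").container("users");

  async function addFollower(followedId, followerUsername) {
    // 1. Pull the whole document
    const { resource: user } = await container.item(followedId, followedId).read();
    // 2. Modify the array client-side
    user.followers.push(followerUsername);
    // 3. PUT it back, overwriting the previous version
    await container.item(followedId, followedId).replace(user);
  }

If concurrent follows are possible, you would normally guard the replace with the document's _etag (an IfMatch access condition) so that a racing write fails instead of being silently lost.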
Having said that, the problem with your data design is that followed and followers are unbounded arrays, so your user document can grow unchecked. The bigger the document, the more RUs each operation will consume.
Please see When not to embed here https://learn.microsoft.com/azure/cosmos-db/modeling-data#when-not-to-embed
I wrote a design doc for social apps back when I worked on a social application: https://learn.microsoft.com/azure/cosmos-db/social-media-apps
Ideally, the relationship of A follows B would be a document of its own. You could store them on a Graph account for optimal performance, since what you are really building here is a graph, and the queries you will be running are "Who follows B?" or "Who follows my followers?".
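For illustration, a hedged sketch of what such a standalone relationship document might look like (the field names are hypothetical):

  {
    "id": "username_1_follows_username_2",
    "type": "follow",
    "follower": "username_1",
    "followed": "username_2"
  }

"Who follows username_2?" then becomes SELECT * FROM c WHERE c.type = "follow" AND c.followed = "username_2", and neither user document grows with the relationship count.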

Related

Querying multiple databases in a microservice architecture

I am currently working on a project that uses a microservice architecture. I am somewhat new to this architecture and have a few concerns. I understand the concept of microservices in general, and also the idea of one database per service. This is where I get confused: how do I pull data from different databases for a particular user?
Scenario
Assuming I have a Users and a Posts service with schemas like this:
User
const schema = {
  name: String,
  id: String,
  ...
}
Post
const schema = {
  text: String,
  user: Id // reference to the user who made this post
}
Now on the UI, I want to load a set of posts along with the users who made them. How do I get a Post alongside the User who made it? I am using MongoDB; how do I populate data that is stored in other databases? I am also using Kafka to handle async operations; how do I leverage Kafka for this use case? Or is there a much better way of doing this? The final response for a Post could be something like this:
{
  text: 'Some random message',
  user: {
    name: 'John Doe',
    id: 1234
  }
}
Also, I know I could call the User service to get the User, then call the Post service to get the Post, and merge both objects together, but is there a better option than this? I am thinking of cases where I want to do multiple lookups for a user, e.g. to get a User and their associated Posts, Messages, etc. How can I handle scenarios like this? Are there any techniques I could leverage for situations like this?
Thank you in advance!
I think your issue is that your service boundaries are too granular. I would recommend aligning your services to bounded contexts (https://martinfowler.com/bliki/BoundedContext.html). For example, if you have a "blog" service with posts and users, it's quite alright for the blog service to contain both a Mongo and a relational database for the different models.
Then you ask the service "give me posts for a user" and it is responsible for combining that data as part of its logic.
If you MUST keep them separate (which I would not recommend for the exact problem you are having), then I would keep a lightweight cache of usernames inside the posts service.
Use that cache to populate the usernames into a post when you return one. You can update the cache on a regular basis using events, polling, or batches, or just query the user service on a cache miss, as in the sketch below.
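A minimal sketch of that cache with a fallback call on a miss (the service URL and field names are hypothetical, and fetch assumes Node 18+ or a polyfill):

  // In-memory username cache inside the posts service
  const usernameCache = new Map();

  async function getUsername(userId) {
    if (usernameCache.has(userId)) {
      return usernameCache.get(userId); // cache hit: no network call
    }
    // Cache miss: fall back to a call to the user service
    const res = await fetch(`http://users-service/users/${userId}`);
    const user = await res.json();
    usernameCache.set(userId, user.name);
    return user.name;
  }

  // Populate the username into a post before returning it
  async function enrichPost(post) {
    return { text: post.text, user: { id: post.user, name: await getUsername(post.user) } };
  }

An event consumer (e.g. a Kafka listener on user-updated events) would call usernameCache.set(...) to keep entries fresh.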
When dealing with distributed systems you cannot rely on consistency and synchronous, stable communication like you can in a monolith.

Meteor, mongodb - canteen optimization

TL;DR:
I'm making an app for a canteen. I have a collection with the persons and a collection where I "log" every meal taken. I need to know who DIDN'T take the meal.
Long version:
I'm making an application for my local Red Cross.
I'm trying to optimize this situation:
there is a canteen at which the helped people can take food at breakfast, lunch and supper. We need to know how many took the meal (and this is easy).
if they are present they HAVE TO take the meal and eat, so we need to know how many (and who) HAVEN'T eaten (this is the part that I need to optimize).
When they take the meal the "cashier" inserts their barcode, and the program logs the "transaction" in the log collection.
Currently, on creation of the "canteen" template, I create a local collection "Meals" and populate it with the data of all the people in the DB (ID, name, fasting/satiated), then I use this collection for my counters and to display who took the meal and who didn't.
(The variable "mealKind" is "breakfast", "lunch", or "dinner", depending on the current serving.)
Template.canteen.created = function(){
  Meals = new Mongo.Collection(null); // client-only local collection
  var today = new Date();
  today.setHours(0, 0, 1);
  var pers = Persons.find(
    {"status": "present"},
    {fields: {"Name": 1, "Surname": 1, "barcode": 1}}
  ).fetch();
  pers.forEach(function(uno){
    // One Log query per person: did they take this meal today?
    var vediamo = Log.findOne({"dest": uno.barcode, "what": mealKind, "when": {"$gte": today}});
    if(typeof vediamo == "object"){
      uno['eat'] = "satiated";
    }else{
      uno['eat'] = "fasting";
    }
    Meals.insert(uno);
  });
};
Template.canteen.destroyed = function(){
  Meals.remove({}); // clear the local collection
};
From the Meals collection I extract the two columns of people, satiated and fasting (with name, surname and barcode), and I also use two helpers:
fasting: function(){
  return Meals.find({"eat": "fasting"});
},
countFasting: function(){
  return Meals.find({"eat": "fasting"}).count();
}
//same for satiated
This was OK, but now the number of people is really increasing (we are around 1000 and counting), the creation of the page is very slow, and it usually stops with errors, so I can read something like "100 fasting, 400 satiated" even though I have around 1000 persons in the DB.
I can't figure out how to optimize the workflow; every other method I tried involved (in one way or another) more queries to the DB. I think I have missed the point, and now I cannot see it.
I'm not sure about aggregation at this level and inside Meteor, because of minimongo.
Although making this server side rather than client side is clever, the problem here is HOW to discriminate "fasting" vs "satiated" without cycling through the whole Persons collection.
+1 if the solution is compatible with aldeed:tabular
EDIT
I am still not sure what is causing your performance issue (too many things in client memory / minimongo? too many calls to it?), but you could at least try different approaches that rely more traditionally on your server.
By the way, you did not mention how you display your data, nor how you get the incorrect reading for your number of already served / missing Persons.
If you are building a classic HTML table, please note that browsers struggle to render more than a few hundred rows. If you are in that case, you could implement client-side table pagination / infinite scrolling. Look for example at the jQuery DataTables plugin (on which aldeed:tabular is based). Skip the step of building an actual HTML table and fill it directly using $table.rows.add(myArrayOfData).draw() to avoid the browser limitation.
Original answer
I do not exactly understand why you need to duplicate your Persons collection into a client-side Meals local collection.
This requires all documents of Persons to be sent from server to client first (which may not be a problem if your server is well connected / local; you may also still have the autopublish package on, in which case you have already seen that penalty), and then cloning every document (checking your Logs collection to retrieve any previous passages), effectively doubling your memory need.
Are your server and/or remote DB really so slow that you need to do everything locally (client side)?
Much more problematic: should you have more than one "cashier" / client browser open, their Meals local collections will not be synchronized.
If your server-client connection is good, there is no reason to do everything client side. Meteor will automatically cache just what is needed, and provide optimistic DB modification to keep your user experience fast (should you structure your code correctly).
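As an example of "structuring your code correctly", here is a minimal server-side sketch (a hypothetical Meteor method, reusing the field names from your snippet) that answers "who is fasting?" with two DB queries instead of one Log lookup per person:

  // Server side: two queries total, independent of how many persons exist
  Meteor.methods({
    fastingPersons: function (mealKind) {
      var today = new Date();
      today.setHours(0, 0, 1);
      // Query 1: barcodes of everyone already served this meal today
      var served = Log.find(
        { what: mealKind, when: { $gte: today } },
        { fields: { dest: 1 } }
      ).map(function (l) { return l.dest; });
      // Query 2: present persons whose barcode is NOT in the served list
      return Persons.find(
        { status: "present", barcode: { $nin: served } },
        { fields: { Name: 1, Surname: 1, barcode: 1 } }
      ).fetch();
    }
  });

The client then calls Meteor.call('fastingPersons', mealKind, callback) and only ever receives the missing people.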
With the aldeed:tabular package, you can easily display your big Persons table by "pages".
You can also link it with your Logs collection using dburles:collection-helpers (IIRC there is an example on the aldeed:tabular home page).

How to use Firebase as a relational database with generated keys (JS)?

I'm working on a simple JavaScript Twitter clone utilizing Firebase as the backend storage mechanism (JSON). I am familiar with relational databases (SQL) but not with non-relational databases, and I am currently trying to work out how to design the structure of the dataset in Firebase, as there are no foreign key relationships or table joins.
The app has three tables: users, tweets, and followers. Users can post tweets, as well as follow other users and see a feed of their tweets. The problem comes when attempting to create the data structure, as I have no idea how to join the necessary tables. For instance, how will I be able to implement the user-follower functionality?
Here is the ERD that I am using to give myself a starting point:
As I've been trying to wrap my head around this whole JSON thing, this is the closest I could get to relating it to a relational database while still using Firebase's .push() function to add to the lists in the database (as seen from the Firebase dashboard):
I've seen some people attempt to solve this by "de-normalizing" the data, citing that Firebase doesn't have any query mechanisms. However, those articles are mostly from before 2014, which is when Firebase added queries. That said, I don't understand how the queries could help me, and I think I'm getting stuck on the generated keys.
How should I best structure my data to work with the Firebase JSON lists? I've read some of their documentation but haven't been able to locate anything that covers what I'm looking for with the generated keys.
Should I be attempting to use the .set() method somehow, using email addresses as the "primary keys" for users instead of the generated keys? [As mentioned below, this is something I plan to avoid. Thanks @metame]
Update
Is this more like what I should be looking at for the data structure, and then querying by the keys?
users: {
  Hj83Kd83: {
    username: "test",
    password: "2K44Djl3",
    email: "a@b.c"
  },
  J3kk0dK4: {
    username: "another",
    password: "33kj4l5K",
    email: "d@e.f"
  }
}
tweets: {
  Jkk3ld92: {
    userkey: "Hj83Kd83",
    message: "Test message here!"
  },
  K3Ljdi94: {
    userkey: "J3kk0dK4",
    message: "Another message here!"
  }
}
followers: {
  Lk3jdke9: {
    userkey: "Hj83Kd83",
    followerkey: "J3kk0dK4"
  }
}
Let me know if I should include anything else!
Representing relationships in non-relational (noSQL) databases is generally solved either by embedding documents (the noSQL term for rows) or through document references, as you have done in your example solution.
The MongoDB site has some decent articles that are mostly applicable to all non-relational databases, including Model One-to-Many Relationships with Document References, which I think is most relevant to your issue.
As far as the reference key goes, it is typically best practice to use the generated IDs, as you have assurance that they are unique.
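As an illustration of the denormalized alternative (using the legacy Firebase Realtime Database JS API; the structure is a hypothetical sketch, not the only option), storing the relationship as a nested index makes "who follows X?" a single read with no query at all:

  // Hypothetical structure: followers/<followedKey>/<followerKey> = true
  var ref = firebase.database().ref();

  // J3kk0dK4 follows Hj83Kd83
  ref.child('followers/Hj83Kd83/J3kk0dK4').set(true);

  // Who follows Hj83Kd83? One direct read
  ref.child('followers/Hj83Kd83').once('value').then(function (snap) {
    var followerKeys = Object.keys(snap.val() || {});
    console.log(followerKeys); // ["J3kk0dK4"]
  });

The trade-off is a fan-out write when following/unfollowing, in exchange for cheap reads.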

Manage relations among users in db

I am creating a mock app with user creation/auth/friends as a Node.js learning exercise. Having spent my time mostly on the front end of things, I am a n00b as far as DBs are concerned. I want to create a user database that keeps track of user profiles and their connections/friends.
The primary objective is to load/store users' connections in the database,
then fetch this information and serve it to the user as efficiently as possible, in the least number of queries.
I'd really appreciate some help with a DB structure that can accomplish this. I am using MongoDB and Node.
Off the top of my head: I could store the user's connections in an object in a "connections" field. But this would involve making a lot of queries to fetch connection details like their "about me" information - which I could also store in the same object.
Confused. Would really appreciate some pointers.
Take a look at the Mongoose ODM. It has a populate method that grabs foreign documents. Lots of other great stuff too.
You could say
Users.find({}).populate('connections').exec(function(err,users) { ... });
Before populate, the users' array of connections is an array of IDs; after, it's an array of user documents.
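For populate to work, the connections field needs a ref in the schema. A minimal sketch with hypothetical field names:

  var mongoose = require('mongoose');
  var Schema = mongoose.Schema;

  // Hypothetical user schema: connections hold ObjectId references back
  // to the same User model, which is what populate() resolves
  var userSchema = new Schema({
    name: String,
    aboutMe: String,
    connections: [{ type: Schema.Types.ObjectId, ref: 'User' }]
  });
  var Users = mongoose.model('User', userSchema);

  Users.find({}).populate('connections').exec(function (err, users) {
    if (err) throw err;
    // Each entry in connections is now a full user document
    console.log(users[0].connections[0].aboutMe);
  });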

How to do SQL-like queries in client side browser?

I've been looking for a way to do complex queries like SQL can perform, but entirely client side. I know I can get the exact results I want by doing SQL queries on the server, and I could even AJAX it so that it looks smooth. However, for scalability, performance, and bandwidth reasons I'd prefer to do this all client side.
Some requirements:
Wide browser compatibility. Anything that can run jQuery is fine. I'd actually prefer that it be a jQuery plugin.
Can sort on more than one column. For instance, order by state alphabetically and list all cities alphabetically within each state.
Can filter results. For instance, the equivalent of "where state = 'CA' or 'NY' or 'TX'".
Must work completely client side, so the user only needs to download a large set of data once and can slice the data however they want without constantly fetching from the server, and would in fact be able to do all queries offline after the initial pull.
I've looked around on Stack Overflow and found jslinq, but it was last updated in 2009 and has no documentation. I also can't tell whether it can do more complex queries like ordering on two different columns or doing "and"/"or" filtering.
I would think that something like this would have been done already. I know HTML5 got started down this path but then hit a roadblock. I just need basic queries, no joins or anything. Does anyone know of something that can do this? Thanks.
Edit: I think I should include a use case to help clarify what I'm looking for.
For example, I have a list of the 5000 largest cities in the US. Each record includes Cityname, State, and Population. I would like to download the entire dataset once, populate a JS array with it, and then, entirely client side, be able to run queries like the following and build a table from the resulting records:
Ten largest cities in California
All cities that start with "S" with populations of 1,000,000 or more.
Largest three cities in California, New York, Florida, Texas, and Illinois and order them alphabetically by state then by population. i.e. California, Los Angeles, 3,792,621; California, San Diego, 1,307,402; California, San Jose, 945,942...etc.
All of these queries would be trivial via SQL, but I don't want to keep going back and forth to the server, and I also want to allow offline use.
Take a look at http://linqjs.codeplex.com/
It easily meets all your requirements.
Try Alasql.js. It is a client-side SQL database written in JavaScript.
You can do complex queries with joins and grouping, even optimization of joins and WHERE clauses. It does not use WebSQL.
Regarding your requirements:
Wide browser compatibility - all modern browser versions, including mobile.
Can sort on more than one column - Alasql does this with the ORDER BY clause.
Can filter results - with the WHERE clause.
Must work completely client side - you can use pure JavaScript operations (Array.push(), etc.) to modify the data; just do not forget to set the table.dirty flag.
Here is a simple example (play with it in jsFiddle):
// Fill table with data
var person = [
  { name: 'bill',  sex: 'M', income: 50000 },
  { name: 'sara',  sex: 'F', income: 100000 },
  { name: 'larry', sex: 'M', income: 90000 },
  { name: 'olga',  sex: 'F', income: 85000 }
];
// Do the query
var res = alasql("SELECT * FROM ? person WHERE sex='F' AND income > 60000", [person]);
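And, closer to the requirements in the question, a hedged sketch of multi-column ordering and IN-style filtering over a hypothetical cities array:

  var cities = [
    { name: 'San Diego',   state: 'CA', population: 1307402 },
    { name: 'Los Angeles', state: 'CA', population: 3792621 },
    { name: 'New York',    state: 'NY', population: 8175133 }
  ];

  // Order by state alphabetically, then by population within each state
  var sorted = alasql('SELECT * FROM ? ORDER BY state ASC, population DESC', [cities]);

  // Equivalent of "where state = 'CA' or 'NY' or 'TX'"
  var filtered = alasql("SELECT * FROM ? WHERE state IN ('CA','NY','TX')", [cities]);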
As long as the data can fit in memory as an array of objects, you can just use sort and filter. For example, say you want to filter products. You want to find all products either below $5 or above $100 and you want to sort by price (ascending), and if there are two products with the same price, sort by manufacturer (descending). You could do that like this:
var results = products.filter(function(product) {
  // price is in cents
  return product.price < 500 || product.price > 10000;
});
results.sort(function(a, b) {
  var order = a.price - b.price;
  if (order === 0) {
    order = b.manufacturer.localeCompare(a.manufacturer);
  }
  return order;
});
For cross-browser compatibility, just shim filter.
How about Yahoo's YQL? I've only briefly looked at it, but it looks interesting.
Backbone is a pretty good js library which (their words) "gives structure to web applications by providing models with key-value binding and custom events, collections with a rich API of enumerable functions, views with declarative event handling, and connects it all to your existing API over a RESTful JSON interface."
I am not sure if this is what you are looking for, but you can use it to mock up your model and bind event listeners to it. This seems to be a good tutorial covering some of its basic uses.
You can use CanJS. It's a relatively new library that performs better than Backbone and others, and it's based on the well-known JavaScriptMVC library. In reality, it's the MVC part of JavaScriptMVC with a bit of spice.
You can take a look at this tut by net.tutsplus.com http://net.tutsplus.com/tutorials/javascript-ajax/diving-into-canjs-part-3/
It's pretty powerful and fast. It has features like live binding that make your life easy.
Coils is a ClojureScript framework which compiles to JavaScript and supports client-side SQL queries like this:
(go
  (log (sql "SELECT * FROM test_table where name = ?" ["shopping"])))
It is full SQL that is passed through to a server-side relational database:
https://github.com/zubairq/coils
