MySQL suggestions on DB design: N values in 1 column or 1 column per value - javascript

I need to move my local project to a webserver, and it is time to start saving things (users' progress and history).
The main idea is that, every 50 ms or so, the webapp will calculate 8 values related to the user who is using it.
My questions are:
Should I use MySQL to store the data? At the moment I'm using a plain text file with a predefined format like:
Option1,Option2,Option3
Iteration 1
value1,value2,value3,value4,value5
Iteration 2
value1,value2,value3,value4,value5
Iteration 3
value1,value2,value3,value4,value5
...
If so, should I use 5 (or more in the future) columns (one for each value), with the row ID as the iteration number? Keep in mind I will have 5000+ iterations per session (roughly 4 minutes).
Each user can have 10-20 sessions a day.
Will the DB become too big to be efficient?
Due to the sampling speed, a call to the DB every 50 ms seems like a problem to me (especially since I have to animate the webpage heavily). I was wondering if it would be better to implement a Save button which populates the DB with all 5000+ values in one go. If so, what would be the best way to do it?
Would it be better to save the *.txt directly in a folder on the webserver? Something like DB/usernameXYZ/dateZXY/filename_XZA.txt . To me yes, way less effort. If so, which is the function that allows me to do so (possibly JS/HTML)?

The rules are simple, and are discussed in many Q&A here.
With rare exceptions...
Do not have multiple tables with the same schema. (Eg, one table per User)
Do not splay an array across columns. Use another table.
Do not put an array into a single column as a commalist. Exception: If you never use SQL to look at the individual items in the list, then it is ok for it to be an opaque text field.
Be wary of EAV schema.
Do batch INSERTs or use LOAD DATA. (10x speedup over one-row-per-INSERT; see the sketch after this list.)
Properly indexed, a billion-row table performs just fine. (Problem: It may not be possible to provide an adequate index.)
Images (a la your .txt files) could be stored in the filesystem or in a TEXT column in the database -- there is no universal answer of which to do. (That is, need more details to answer your question.)
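For the "Save button" idea from the question, a batched insert on the server is the natural fit. Below is a minimal sketch, assuming a Node.js endpoint with the mysql2 package and a hypothetical iterations table; adjust names to your actual schema.
// Minimal sketch of a batched save, assuming the mysql2 package and a
// hypothetical `iterations` table (session_id, iteration, v1..v5).
const mysql = require('mysql2/promise');

async function saveSession(sessionId, iterations) {
  const conn = await mysql.createConnection({
    host: 'localhost', user: 'app', password: 'secret', database: 'webapp'
  });
  // Buffer the 5000+ samples on the client, send them once, then insert them
  // with a single multi-row INSERT (roughly the 10x speedup mentioned above).
  const rows = iterations.map((values, i) => [sessionId, i + 1, ...values]);
  await conn.query(
    'INSERT INTO iterations (session_id, iteration, v1, v2, v3, v4, v5) VALUES ?',
    [rows]
  );
  await conn.end();
}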
"calculate 8 values that are related to the user" -- to vague. Some possibilities:
Dynamically computing a 'rank' is costly and time-consuming.
Summary data is best pre-computed
Huge numbers (eg, search hits) are best approximated
Calculating age from birth date - trivial
Data sitting in the table for that user is, of course, trivial to get
Counting number of 'friends' - it depends
etc.

Related

How to get total number of pages in DynamoDB if we set a limit?

I have a list of documents that are stored in DynamoDB. There are more than 10,000 documents being stored in the db. I have set the limit to 100 when requesting. DynamoDB is doing a good job by returning me the first 100 documents and the LastEvaluatedKey to get the next 100 documents.
The problem here is that I also want DynamoDB to return me the total number of pages, for pagination purposes. In this case, since I have 10,000 documents, it should return 100 (the number of pages).
For now, what I have done is count manually by looping the queries until no LastEvaluatedKey is returned, and adding up how many loops were done to get the total number of pages. But I believe there is a better approach.
As the other answer has correctly explained, there is no efficient way to get total result counts for DynamoDB query or scan operations. As such there is no efficient way to get total number of pages.
However, what I wanted to call out is that modern UIs have been moving away from classic pagination towards an infinite scroll design pattern, where the "next page" of results is loaded on demand as the list is scrolled. This can be achieved with DynamoDB. You can still show discrete pages, but you cannot show, a priori, how many results or how many pages there are. It's a current shortcoming of DynamoDB.
Neither "Scan" (to read the entire table) nor "Query" (to read all the items with the same partition key) operations return an estimate on how many total results there are. In some cases (e.g., when a FilterExpression is provided), there is no efficient way for DynamoDB to do this estimation. In other cases there may be a way for DynamoDB to provide this information, but it doesn't.
If I understand you correctly, you want to Scan the entire table, without a filter. Like I said, Scan itself doesn't give you the number of items in the table. But you can find this number using DescribeTable, which returns, among other things, an "ItemCount" attribute: an estimate of the total number of items in the table. It may not be completely up to date, but it is perhaps good enough for your needs (e.g., an estimate for some sort of progress report that doesn't need to be 100% accurate).
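A minimal sketch of that approach with the AWS SDK for JavaScript v3 (the table name and page size here are just placeholders):
const { DynamoDBClient, DescribeTableCommand } = require('@aws-sdk/client-dynamodb');

async function estimatePageCount(tableName, pageSize) {
  const client = new DynamoDBClient({});
  const { Table } = await client.send(new DescribeTableCommand({ TableName: tableName }));
  // ItemCount is an estimate that DynamoDB refreshes periodically (roughly
  // every six hours), so treat the resulting page count as approximate.
  return Math.ceil(Table.ItemCount / pageSize);
}

// estimatePageCount('Documents', 100).then(console.log); // ~100 for 10,000 items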
If you really need accurate and up-to-the-moment counts of items in a partition or an entire table, you can always try to maintain such counters as separate data. Doing this correctly is not trivial, and will have performance implications, but in some use cases (e.g., rare writes and a lot of reads) may be useful.
You can maintain your own counts using DynamoDB streams - basically you create a Lambda function that watches for items being created/deleted and then write back to a counter item in DynamoDB that stores the current count of items.
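A rough sketch of that streams-based counter as a Lambda handler; the counter table and key names below are assumptions, not anything from the answer:
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, UpdateCommand } = require('@aws-sdk/lib-dynamodb');

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

exports.handler = async (event) => {
  // Net change in item count for this batch of stream records.
  const delta = event.Records.reduce((sum, r) =>
    sum + (r.eventName === 'INSERT' ? 1 : r.eventName === 'REMOVE' ? -1 : 0), 0);
  if (delta !== 0) {
    await ddb.send(new UpdateCommand({
      TableName: 'Counters',               // hypothetical counter table
      Key: { counterId: 'documents' },     // one counter item per tracked table
      UpdateExpression: 'ADD itemCount :d',
      ExpressionAttributeValues: { ':d': delta },
    }));
  }
};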

Best Way To Save Data that keeps coming every 1 second to Postgres using node.js

I am building a project using node.js that is integrated with 4 other systems that keep sending data from sensors every second. I am trying to have something like a timeline, so I need to save that data, but I don't feel it's right to hit the database with a couple of insert statements every second.
What is the best way to save data that comes in so frequently? I was thinking about having some log files and then inserting the data in bulk. Any suggestions?
Thank you.
That would be a premature optimization. I've benchmarked PostgreSQL under Node.js many times, and at any given moment inserting several records per second will take under 10 ms, i.e. less than 1% of your app's load, if you do it every second.
The only worthwhile optimization you should do from the start is to use multi-row inserts, even if you insert only 2 rows at a time. The reasons for this are as follows:
Node.js IO is a valuable resource, so the fewer round trips you do the better
Multi-row inserts are tremendously faster than separate insert queries
Separate inserts typically require a transaction, and a single multi-row insert doesn't.
You can find a good example here: Multi-row insert with pg-promise.
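For reference, a minimal sketch of such a multi-row insert with pg-promise's helpers; the connection string, table and column names are assumptions:
const pgp = require('pg-promise')();
const db = pgp('postgres://user:pass@localhost:5432/sensors');

// Describe the target table once, reuse it for every batch.
const cs = new pgp.helpers.ColumnSet(['sensor_id', 'reading', 'recorded_at'],
                                     { table: 'readings' });

// Called once per second with whatever the four systems reported,
// e.g. rows = [{ sensor_id: 1, reading: 2.5, recorded_at: new Date() }, ...]
function saveBatch(rows) {
  const insert = pgp.helpers.insert(rows, cs); // one INSERT ... VALUES (...),(...),...
  return db.none(insert);
}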

Is filtering a JSON object faster than querying the database through ajax?

I'm working on a product page where you have a set of options which will affect the price.
The main option, which is always there, lets you choose a material. Depending on the material, the set of options can then change.
In the database I have a table listing the final products with their prices: about 2000 rows, one for every single product available with the different options.
Something like:
product_id   code   price   size   option   color
1            ABC    20$     1      3        5
2            DEF    30$     2      4        5
3            FFF    30$     3      4        5
and so on.
The whole thing works with ajax calls, so every time an option changes, I query the database, look for the product with that set of options and show the price.
Would it make sense in this specific case to get the whole list of products at the beginning (would be a single query, and about 2000 rows), store it in a Javascript object and filter it?
If it's of any importance, I'm using MySQL.
Likely yes, but there are a lot of variables that could affect it. I'm assuming that:
The visitor is a typical web user
The ajax request has a round trip time of roughly 100ms
Given these circumstances, your average visitor's browser could almost certainly search through millions of products during that time.
However, assuming you're optimising the user experience (i.e. the delay caused by ajax is rather noticeable), you probably want a hybrid:
Cache everywhere
The chances are your product set changes far less often than people access it; that means your data is very read-heavy. This is a great opportunity to avoid hitting the database entirely and cache something like example.com/products/14/all-options.json as a static file.
Storage of text is cheap. Server CPU time less so.
If there are a lot of options for a particular product (i.e. tens of thousands), then you could alternatively cache them as a tree of static files. For example, example.com/products/14/size-1/all-options.json gives all the options that are size #1 of product #14. example.com/products/14/size-1/option-4/all.json is all size 1, option #4, and so on.
You can then go ahead and filter these smaller sets with Javascript and potentially have millions of products without needing to have huge database hits or large-ish downloads on startup.
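A rough sketch of the hybrid approach in the browser, assuming a cached JSON file per product (the URL and option names are purely illustrative):
let products = [];

// Fetch the pre-generated static file once, when the product page loads.
fetch('/products/14/all-options.json')
  .then((res) => res.json())
  .then((data) => { products = data; });

// Then every option change is answered locally, with no further ajax round trip.
function priceFor(size, option, color) {
  const match = products.find((p) =>
    p.size === size && p.option === option && p.color === color);
  return match ? match.price : null;
}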
Filtering 2000 objects in JavaScript is not a problem, but bear this in mind: MySQL is made for querying data, so it is better at it, and you also have to think about mobile devices with low specifications, PCs with low resources, and so on. And what if the 2000 objects turn into more? That will lengthen both the request time and the JavaScript filtering.

Meteor, mongodb - canteen optimization

TL;DR:
I'm making an app for a canteen. I have a collection with the persons and a collection where I "log" every meal taken. I need to know those who DIDN'T take the meal.
Long version:
I'm making an application for my local Red Cross.
I'm trying to optimize this situation:
there is a canteen at which the people we help can take food at breakfast, lunch and supper. We need to know how many took the meal (and this is easy).
if they are present they HAVE TO take the meal and eat, so we need to know how many (and who) HAVEN'T eaten (this is the part that I need to optimize).
When they take the meal the "cashier" inserts their barcode, and the program logs the "transaction" in the Log collection.
Actually, on creation of the template "canteen" I create a local collection "Meals" and populate it with the data of all the people in the DB (so ID, name, fasting/satiated); then I use this collection for my counters and to display who took the meal and who didn't.
(the variable "mealKind" is = "breakfast" OR "lunch" OR "dinner" depending on the actual serving.)
Template.canteen.created = function(){
  // Client-only collection (not persisted): one doc per present person.
  Meals = new Mongo.Collection(null);
  var today = new Date();
  today.setHours(0, 0, 1);
  var pers = Persons.find({"status": "present"},
                          {fields: {"Name": 1, "Surname": 1, "barcode": 1}}).fetch();
  pers.forEach(function(uno){
    // One Log lookup per person: has this barcode already been served this meal today?
    // NB: `uno.codice` is matched against `dest`, but "codice" is not in the field
    // projection above -- make sure the projection includes whichever field
    // actually holds the barcode.
    var vediamo = Log.findOne({"dest": uno.codice, "what": mealKind, "when": {"$gte": today}});
    if (typeof vediamo == "object") {
      uno['eat'] = "satiated";
    } else {
      uno['eat'] = "fasting";
    }
    Meals.insert(uno);
  });
};
Template.canteen.destroyed = function(){
  Meals.remove({});  // was `meals.remove({})` -- the collection is named `Meals`
};
From the Meals collection I extract the two columns of people satiated (with name, surname and barcode) and fasting, and I also use two helpers:
fasting: function(){
  return Meals.find({"eat": "fasting"});
},
"countFasting": function(){
  return Meals.find({"eat": "fasting"}).count();
}
// same for satiated
This was OK, but now the number of people is really increasing (we are around 1000 and counting) and the creation of the page is very, very slow; it usually stops with errors, so I can read something like "100 fasting, 400 satiated" when I actually have around 1000 persons in the DB.
I can't figure out how to optimize the workflow; every other method that I tried involved (in one manner or another) more queries to the DB. I think that I missed the point and now I cannot see it.
I'm not sure about aggregation at this level and inside Meteor, because of minimongo.
Although making this server side and not client side is clever, the problem here is HOW to discriminate "fasting" vs "satiated" without cycling through the whole Persons collection.
+1 if the solution is compatible with aldeed:tabular
EDIT
I am still not sure about what is causing your performance issue (too many things in client memory / minimongo, too many calls to it?), but you could at least try different approaches, more traditionally based on your server.
By the way, you did not mention how you display your data, nor how you end up with an incorrect reading for your number of already served / missing Persons.
If you are building a classic HTML table, please note that browsers struggle rendering more than a few hundred rows. If you are in that case, you could implement a client-side table pagination / infinite scrolling. Look for example at jQuery DataTables plugin (on which is based aldeed:tabular). Skip the step of building an actual HTML table, and fill it directly using $table.rows.add(myArrayOfData).draw() to avoid the browser limitation.
Original answer
I do not exactly understand why you need to duplicate your Persons collection into a client-side Meals local collection.
This requires first sending all Persons documents from server to client (this may not be problematic if your server is well connected / local; you may also still have the autopublish package on, so you would have already paid that penalty), and then cloning all documents (checking your Logs collection to retrieve any previous passages), effectively doubling your memory need.
Is your server and/or remote DB that slow to justify your need to do everything locally (client side)?
It could be much more problematic should you have more than one "cashier" / client browser open: their Meals local collections will not be synchronized.
If your server-client connection is good, there is no reason to do everything client side. Meteor will automatically cache just what is needed, and provide optimistic DB modification to keep your user experience fast (should you structure your code correctly).
With aldeed:tabular package, you can easily display your Persons big table by "pages".
You can also link it with your Logs collection using the dburles:collection-helpers package (IIRC there is an example on the aldeed:tabular home page).
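As an illustration of the server-side direction suggested above, here is a hedged sketch of a Meteor method that discriminates "fasting" vs "satiated" with two queries instead of one Log lookup per person; the method name is hypothetical, and it assumes `dest` in Log holds the person's barcode:
// server-side
Meteor.methods({
  fastingPersons: function(mealKind) {
    var today = new Date();
    today.setHours(0, 0, 1);
    // Barcodes of everyone already served this meal today.
    var served = Log.find(
      {"what": mealKind, "when": {"$gte": today}},
      {fields: {"dest": 1}}
    ).map(function(l){ return l.dest; });
    // Present persons whose barcode is not among the served ones.
    return Persons.find(
      {"status": "present", "barcode": {"$nin": served}},
      {fields: {"Name": 1, "Surname": 1, "barcode": 1}}
    ).fetch();
  }
});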

AngularJS - Large sets of data

I've been pondering moving our current admin system over to a JS framework for a while and I tested out AngularJS today. I really like how powerful it is. I created a demo application (source: https://github.com/andyhmltn/portfolio-viewer) that has a list of 'items' and displays them in a paginated list that you can order/search in realtime.
The problem that I'm having is figuring out how I would replicate this kind of behaviour with a larger data set. Ideally, I want to have a table of items that's sortable/searchable and paginated, all in realtime.
The part that concerns me is that this table will have 10,000+ records at least. Currently, that's no problem, as it's a PHP file that limits the query to the current page and appends any search options to the end. The demo above only has about 15-20 records in it. I'm wondering how hard it would be to do the same thing with such a large number of records without pulling all of them into one JSON request at once, as that would be incredibly slow.
Does anyone have any ideas?
I'm used to handling large datasets in JavaScript, and I would suggest that you:
use pagination (either server-sided or client-sided, depending on the actual volume of your data, see below)
use Crossfilter.js to group your records and adopt a several-levels architecture in your GUI (records per month, double click, records per day for the clicked month, etc.)
An indicator I often use is the following :
rowsAmount x columnsAmount x dataManipulationsPerRow
Also, consider the fact that handling large datasets and displaying them are two very different things.
Indeed pulling so many rows in one request would be a killer. Fortunately Angular has the ng-grid component that can do server-side paging (among many other things). Instructions are provided in the given link.
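For the server-side paging side of this, a minimal AngularJS sketch (independent of ng-grid's own configuration; the endpoint and parameter names are assumptions about your PHP backend):
var app = angular.module('adminApp', []);

app.controller('ItemsCtrl', function ($scope, $http) {
  $scope.items = [];
  $scope.page = 1;
  $scope.pageSize = 50;

  $scope.load = function (search) {
    $http.get('/api/items', {
      params: { page: $scope.page, pageSize: $scope.pageSize, q: search || '' }
    }).then(function (res) {
      $scope.items = res.data.rows;      // only the current page travels over the wire
      $scope.total = res.data.totalRows; // lets the grid/pager show the total page count
    });
  };

  $scope.load();
});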
