Best performance choice for cookie arrays in PHP and JS? - javascript

Hi, I have a website with hundreds of TV shows and tens of thousands of episodes. I want users who aren't logged in to be able to save their favorite shows, mark the episodes they've watched, and store this data in cookies.
So I've already implemented it in both PHP and JS. I just wanted to know what the best choice is when it comes to arrays of cookies.
At the moment I save the data in a cookie called 'favorites', which is a JSON-encoded array that simply contains the IDs of the TV shows.
The cookie 'watched' holds the IDs of the episodes that the user has marked as watched.
What's the best option for performance and memory? Keep in mind this is also intended for mobile, so if the user marks thousands of episodes as watched, I don't want this to slow everything down.
The code below is not real code, it's just to make you understand the differences between the options.
Option 1
one cookie called 'watched', which is a JSON-encoded array containing the IDs. I would need to check whether the array contains a given value
$_COOKIE['watched'] = array(1,2,3,4,...)
With this option, on the TV-show page I would always have to load the whole array and check whether every single episode ID is a value in it.
Option 2
one cookie for each ID, called for example "watched_[ID]". I would just check if it exists or not.
isset( $_COOKIE['watched_1'] ) ? watched : !watched
isset( $_COOKIE['watched_2'] ) ? watched : !watched
Option 3
one cookie called 'watched', which is an associative JSON-encoded array keyed by show ID (e.g. 'watched[show_ID]'); each entry contains only the IDs of the watched episodes belonging to that show. (So on the TV-show page I first look up 'watched[show_ID]' and immediately have all the watched episodes, instead of comparing every single episode ID against the flat array from Option 1.)
$_COOKIE['watched'] = array(
1 => array(1,2,3),
2 => array(4,5,6)
)
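For concreteness, here is a minimal JavaScript sketch of what reading the Option 3 structure could look like on the client, assuming the 'watched' cookie stores the URL-encoded JSON shown above (the helper names are invented for illustration):
// Hedged sketch: read the 'watched' cookie (Option 3) and check one episode.
// Assumes the cookie value is URL-encoded JSON like {"1":[1,2,3],"2":[4,5,6]}.
function readWatched() {
  const match = document.cookie.match(/(?:^|;\s*)watched=([^;]*)/);
  return match ? JSON.parse(decodeURIComponent(match[1])) : {};
}
function isEpisodeWatched(showId, episodeId) {
  const episodes = readWatched()[showId] || [];
  return episodes.includes(episodeId);
}
// On the page for show 2, parse once and reuse the result:
// isEpisodeWatched(2, 5); // true if the cookie contains "2": [4, 5, 6]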

Which approach you select will depend on the kind of users you have. If the data size is generally large in your case, you should go with Option 3: an associative array is a good approach for large data sets, since it is fairly easy to loop through one and get the results.
If your users' data sets are fairly small, Option 2 is the best approach, since there is a separate cookie for each show and you can check the corresponding value immediately.
I would also suggest looking into the HTML5 IndexedDB API as a solution to your problem: it gives you a persistent client-side database.
Hope this helps

Something not addressed in the accepted answer is browser cookie limitations (on the client side of things). Option 2 will clearly not work due to the limits on the number of cookies allowed:
Browser              Num Cookies   Size Per Cookie    Size Per Domain
-------------------  -----------   ----------------   ----------------
Chrome 8-25          180           4096 bytes
FireFox 3.6.13-19    150           4097 characters
IE 8/9/10            50            5117 characters    10234 characters
Safari               600           4093 bytes         4093 bytes
Android 2.1/2.3.4    50            4096 bytes
Safari Mobile 5.1    600           4093 bytes         4093 bytes
Some of those numbers may be outdated...[source]
However, you will still have to be careful when using option 1 or 3 because of the size limitation per cookie/domain. This fiddle shows how big a stringified associative array gets when it contains 1000 items with a key length of 7 characters each (it's about 12,000 bytes).
If you're intent on storing thousands of items in cookies, you will probably have to write a cookie wrapper library that chunks your data across multiple cookies to avoid the per-cookie size limitations.
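Since this only hints at what such a wrapper would look like, here is a hypothetical JavaScript sketch of the chunking idea: split a long encoded value across numbered cookies and reassemble it on read. The names and the 3500-byte chunk size are assumptions made up for the example, not a real library.
// Hypothetical sketch: spread a long value across name_0, name_1, ...
// to stay under the ~4KB per-cookie limit.
const CHUNK_SIZE = 3500; // leave headroom for the cookie name and attributes
function setChunkedCookie(name, value, days) {
  const expires = new Date(Date.now() + days * 864e5).toUTCString();
  const encoded = encodeURIComponent(value);
  const chunks = Math.ceil(encoded.length / CHUNK_SIZE);
  for (let i = 0; i < chunks; i++) {
    const part = encoded.slice(i * CHUNK_SIZE, (i + 1) * CHUNK_SIZE);
    document.cookie = name + '_' + i + '=' + part + '; expires=' + expires + '; path=/';
  }
  // (A real wrapper would also delete stale higher-numbered chunks when the value shrinks.)
}
function getChunkedCookie(name) {
  let result = '';
  for (let i = 0; ; i++) {
    const match = document.cookie.match(new RegExp('(?:^|;\\s*)' + name + '_' + i + '=([^;]*)'));
    if (!match) break;
    result += match[1];
  }
  return result ? decodeURIComponent(result) : null;
}
// Usage: setChunkedCookie('watched', JSON.stringify(watchedIds), 365);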

Related

Node.js + websocket - Low-bandwidth game entity ids

I have been working on a game for the past few days and I am currently working on optimizing the websocket frame data. I've changed the data type from plain JSON to an ArrayBuffer, and that part is working fine. Entity data such as position and rotation is sent and received with no problem.
The entity ID is the problem. Currently the client keeps track of every entity by its ID in order to store its last position for smooth movement, and it will continue to do so.
Every entity on the server currently has a UUID (e.g. f2e9f5e2-a810-416e-a1ce-a300a0b7a088). That is 16 bytes. I'm not sending that!
And now to my question: how do they do it in big games? Is there any way of getting around this, or a way to generate a unique 2-byte or some other low-bandwidth UID?
UPDATE
I need more than 256 IDs, which means that 1 byte is obviously not going to work. 2 bytes gives me 65,536 IDs, which is probably enough. I also know that iterating through a loop until you find the next free ID is an option, but maybe that is too "costly".
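One common way to avoid iterating until you find the next free ID is a recycling ID pool: hand out small integers and push them back onto a free list when an entity despawns. This is a hedged JavaScript sketch, with all names invented for illustration:
// Illustrative sketch of a 2-byte ID pool with recycling.
// Freed IDs are reused first, so no loop over the ID space is ever needed.
class IdPool {
  constructor(max = 65536) {
    this.max = max;   // 2 bytes => 65536 distinct IDs (0..65535)
    this.next = 0;    // next never-used ID
    this.free = [];   // IDs handed back by removed entities
  }
  acquire() {
    if (this.free.length > 0) return this.free.pop();
    if (this.next >= this.max) throw new Error('ID space exhausted');
    return this.next++;
  }
  release(id) {
    this.free.push(id); // recycle when the entity despawns
  }
}
// Usage: map the short wire ID to the server-side UUID once, on spawn.
// const pool = new IdPool();
// const wireId = pool.acquire(); // fits in a Uint16 field of the ArrayBuffer
// idToUuid.set(wireId, 'f2e9f5e2-...'); // the UUID never goes over the wire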

Is filtering a JSON object faster than querying the database through ajax?

I'm working on a product page where you have a set of options which will affect the price.
The main option, which is always there, lets you choose a material. Depending on the material, the set of options can then change.
In the database I have a table listing the final products with their prices: about 2000 rows covering every single product available with the different options.
Something like:
product_id   code   price   size   option   color
1            ABC    20$     1      3        5
2            DEF    30$     2      4        5
3            FFF    30$     3      4        5
and so on.
The whole thing works with ajax calls, so every time an option changes, I query the database, look for the product with that set of options and show the price.
Would it make sense in this specific case to get the whole list of products at the beginning (would be a single query, and about 2000 rows), store it in a Javascript object and filter it?
If it's of any importance, I'm using MySQL.
Likely yes, but there are a lot of variables that could affect it. I'm assuming that:
The visitor is a typical web user
The ajax request has a round trip time of roughly 100ms
Given these circumstances, your average visitor's browser could almost certainly search through millions of products in that time.
However, assuming you're optimising the user experience (i.e. the delay caused by ajax is rather noticeable), you probably want a hybrid:
Cache everywhere
The chances are your product set changes far less often than people access it; that means your data is very read-heavy. This is a great opportunity to avoid hitting the database entirely and cache something like example.com/products/14/all-options.json as a static file.
Storage of text is cheap. Server CPU time less so.
If there are a lot of options for a particular product (i.e. tens of thousands), you could instead cache them as a tree of static files. For example, example.com/products/14/size-1/all-options.json gives all the options that are size #1 of product #14, example.com/products/14/size-1/option-4/all.json is all size #1, option #4, and so on.
You can then go ahead and filter these smaller sets with JavaScript and potentially have millions of products without needing huge database hits or large-ish downloads on startup.
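As a concrete illustration of that hybrid, here is a hedged JavaScript sketch: fetch the cached static JSON once (the /products/14/all-options.json path reuses the example URL above) and resolve prices locally as the options change. Field names mirror the table in the question.
// Sketch: load the cached product list once, then filter it in the browser.
let products = null;
async function loadProducts(productId) {
  const res = await fetch('/products/' + productId + '/all-options.json');
  products = await res.json(); // e.g. [{code: 'ABC', price: 20, size: 1, option: 3, color: 5}, ...]
}
function findPrice(size, option, color) {
  const match = products.find(p => p.size === size && p.option === option && p.color === color);
  return match ? match.price : null;
}
// After loadProducts(14), each option change is resolved with no ajax round trip:
// findPrice(1, 3, 5); // 20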
Filtering 2000 objects in JavaScript is not a problem. But bear this in mind: MySQL is made for querying data, so it is better at it, and think about mobile devices and PCs with low resources too. And if those 2000 objects turn into more, the request time and the JavaScript filtering will both get longer.

MySQL suggestions on DB design of N° values in 1 column or 1 column for value

I need to move my local project to a web server, and it is time to start saving things locally (user progress and history).
The main idea is that the webapp, every 50 ms or so, will calculate 8 values related to the user who is using it.
My questions are:
Should I use MySQL to store the data? At the moment I'm using a plain text file with a predefined format like:
Option1,Option2,Option3
Iteration 1
value1,value2,value3,value4,value5
Iteration 2
value1,value2,value3,value4,value5
Iteration 3
value1,value2,value3,value4,value5
...
If so, should I use 5 (or more in the future) columns, one for each value, with their ID as the iteration? Keep in mind I will have 5000+ iterations per session (roughly 4 minutes).
Each user can have 10-20 sessions a day.
Will the DB become too big to be efficient?
Given the sampling rate, a call to the DB every 50 ms seems like a problem to me (especially since I have to animate the webpage heavily). I was wondering if it would be better to implement a Save button which populates the DB with all 5000+ values in one go. If so, what would be the best way to do it?
Would it be better to save the *.txt directly in a folder on the webserver? Something like DB/usernameXYZ/dateZXY/filename_XZA.txt . To me yes, way less effort. If so, which function allows me to do that (possibly JS/HTML)?
The rules are simple, and are discussed in many Q&A here.
With rare exceptions...
Do not have multiple tables with the same schema. (Eg, one table per User)
Do not splay an array across columns. Use another table.
Do not put an array into a single column as a comma-separated list. Exception: if you never use SQL to look at the individual items in the list, then it is OK for it to be an opaque text field.
Be wary of EAV schema.
Do batch INSERTs or use LOAD DATA; see the sketch after this list. (10x speedup over one-row-per-INSERT.)
Properly indexed, a billion-row table performs just fine. (Problem: It may not be possible to provide an adequate index.)
Images (a la your .txt files) could be stored in the filesystem or in a TEXT column in the database -- there is no universal answer of which to do. (That is, need more details to answer your question.)
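As a sketch of the batching rule above, assuming a Node.js client and the npm mysql package (the table and column names are made up for the example), buffering the iterations and writing them in one multi-row INSERT looks roughly like this:
// Buffer iterations in memory, then write them in a single multi-row INSERT
// when the user hits "Save" (or every N iterations). Names are illustrative.
const mysql = require('mysql');
const connection = mysql.createConnection({ host: 'localhost', user: 'app', database: 'progress' });
const buffer = [];
function recordIteration(sessionId, values) {
  buffer.push([sessionId, ...values]); // e.g. [sessionId, v1, v2, v3, v4, v5]
}
function flush(callback) {
  if (buffer.length === 0) return callback();
  // Nested arrays expand to (..),(..),(..): one round trip for 5000+ rows.
  connection.query(
    'INSERT INTO iterations (session_id, v1, v2, v3, v4, v5) VALUES ?',
    [buffer.splice(0)],
    callback
  );
}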
"calculate 8 values that are related to the user" -- to vague. Some possibilities:
Dynamically computing a 'rank' is costly and time-consuming.
Summary data is best pre-computed
Huge numbers (eg, search hits) are best approximated
Calculating age from birth date - trivial
Data sitting in the table for that user is, of course, trivial to get
Counting number of 'friends' - it depends
etc.

Data Structure for storing partial urls where search speed is the priority

We have a large database of partial urls (strings) such as:
"example1.com"
"example2.com/test.js"
"/foo.js"
Our software listens for HTTP requests and tries to find one of our database's partial urls in the HTTP request's full url.
So we are getting full URLs (e.g. http://www.example.com/blah.js?foo=bar) and trying to match one of our database's partial patterns against them.
Which would be the best data structure to store our partial urls database on if all we care about is search speed?
Right now, this is what we do:
Iterating through the entire database of partial URLs (strings) and using indexOf (in JavaScript) to see if the full URL contains each partial string.
UPDATE:
This software is an extension for Firefox written in Javascript on Firefox's Addon SDK.
Assuming that your partial strings are only domain names and/or page names you could try to generate all possible combinations from the URL starting from the end:
http://www.example.com/blah.js?foo=bar
blah.js
example.com/blah.js
www.example.com/blah.js
Then hash all the combinations, store them in an array and try to find any of them in another array that contains the hashes of all your partial strings from the database.
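A rough JavaScript sketch of that combination-and-lookup idea (a plain Set is used here instead of explicit hashing, since Set membership checks are already constant time on average; the helper names are invented):
// Sketch: build candidate suffix combinations of the full URL and check them
// against a Set holding the partial-url database.
const partials = new Set(['example1.com', 'example2.com/test.js', '/foo.js']);
function candidateKeys(fullUrl) {
  const { hostname, pathname } = new URL(fullUrl);
  const keys = [pathname];                    // "/blah.js"
  const labels = hostname.split('.');         // ["www", "example", "com"]
  for (let i = 0; i < labels.length - 1; i++) {
    const domain = labels.slice(i).join('.'); // "www.example.com", "example.com"
    keys.push(domain, domain + pathname);
  }
  return keys;
}
function matches(fullUrl) {
  return candidateKeys(fullUrl).some(key => partials.has(key));
}
// matches('http://www.example.com/blah.js?foo=bar') checks a handful of keys
// instead of running indexOf against every stored partial.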
NOTE:
In case you want to match ANY string in the URL, like ample in example.com, it becomes a little complicated in terms of storage, because the number of possible string combinations in a URL is C(n, k) = n! / (k! * (n - k)!), where n is the length of the URL and k is the length of the string to find. According to this SO question the maximum reasonable length of a URL is 2000 characters, and assuming you want to match a random string, k would vary from 1 to 2000, which would result in a large number of hashes generated from the URL: the sum of C(2000, k) over each k from 1 to 2000.
Or more precisely, 2000! / (k! * (2000 - k)!) different hashes for each k.
There are a few things you could do:
Don't process the URLs on the client side. JavaScript is going to be slow, especially if you have a lot of these URLs. You can create a REST API and pass in the URL for matching as a query parameter i.e. domain.com/api/?url=.... Placing the heavy lifting and memory use on the server side will also decrease your bandwidth.
Bootstrap the URLs into RAM and don't read from the database every time. Something like memcached could work perfectly in this case.
Once in RAM, a hash table structure would work best since you are doing simple matching. Whatever you do, avoid string comparison.
If you follow these few suggestions, you would have a significant speedup. Hope this helps.

Best way to encode/decode large quantities of data for a JavaScript client?

I'm writing a standalone JavaScript application with Spine, Node.js, etc. (here is an earlier incarnation of it if you are interested). Basically, the application is an interactive 'number property' explorer. The idea is that you can select any number and see what properties it possesses. Is it a prime, or triangular, etc.? Where are other numbers that share the same properties? That kind of thing.
At the moment I can pretty easily show numbers 1-10k, but I would like to show properties for numbers up to 1 million, or even better 1 billion.
I want my client to download a set of static data files, and then use them to present the information to the user. I don't want to write a server backend.
Currently I'm using JSON for the data files. For some data, I know a simple algorithm to derive the information I'm looking for on the client side, and I use that (e.g. is it even?). For the harder numbers, I pre-compute them and store the values in JSON-parseable data files. I've gone a little overboard with the whole thing: I implemented a pure JavaScript Bloom filter, and when that didn't scale to 1 million for primes, I tried using CONCISE bitmaps underneath (which didn't help). Eventually I realized that it doesn't matter too much how 'compressed' I get my data if I'm representing it as JSON.
So the question is: I want to display 30 properties for each number, and I want to show a million numbers... that's like 30 million data points. I want the JavaScript app to download this data and present it to the user, but I don't want the user to have to download megabytes of information to use the app...
What options do I have for efficiently sending these large sets of data to my javascript only solution?
Can I convert to binary and then read binary on the client side? Examples, please!
How about just computing these data points on the client?
You'll save yourself a lot of headache. You can pre-compute the index chart and leave the rest of the data-points to be processed only when the user selects a particular number.
For the properties exhibited per number: pure JavaScript on modern desktops is blindingly fast (if you stay away from the DOM). I think you'll find the processing speed difference between the algorithmic and the pre-computed JSON solution is negligible, and you'll be saving yourself a lot of pain and unnecessary bandwidth usage.
As for the initial index chart, this displays only the number of properties per number and can be transferred as an array:
'[18,12,9,11,9,7,8,2,6,1,4, ...]'
or in JSON:
{"i": [18,12,9,11,9,7,8,2,6,1,4, ...]}
Note that this works the same for a logarithmic scale, since either way you can only attach a value to one point on the screen at any one time. You just have to fill the contents of the array accordingly (by returning logarithmic values sequentially in a 1-2K sized array).
You can even use a DEFLATE algorithm to compress it further, but since you can only display a limited amount of numbers on screen (<1-2K pixels on desktop), I would recommend you build your solution around this fact, for example by checking whether you can calculate 2K * 30 = 60K properties on the fly with minimal impact; that will probably be faster than asking the server for some JSON at this point.
UPDATE 10-Jan-2012
I just saw your comment about users being able to click on a particular property and get a list of numbers that display that property.
I think the initial transfer of the number of properties above can be jazzed up to include all properties in the initial payload, bearing in mind that you only want to transfer the values for the numbers shown in the initial logarithmic scale you wish to display (that means you can skip numbers that are not going to be represented on screen when a user first loads the page or clicks on a property). Anything beyond the initial payload can be calculated on the client.
{
"n": [18,12,9,11,9,7,8,2,6,1,4, ...] // number of properties x 1-2K
"p": [1,2,3,5,7,13,...] // prime numbers x 1-2K
"f": [1,2,6, ...] // factorials x 1-2K
}
My guess is that a JSON object like this will be around 30-60K, but you can further reduce this by removing properties whose algorithms are not recursive and letting the client calculate those locally.
If you want an alternative way to compress those arrays when you get to large numbers, you can format your array as a VECTOR instead of a list of numbers, storing the differences between one number and the next; this keeps the size down when you are dealing with large numbers (>1000). An example of the JSON above using vectors would be as follows:
{
"n": [18,-6,-3,2,-2,-2,1,-6,4,-5,-1, ...] // vectorised no of properties x 1-2K
"p": [1,1,2,2,2,6,...] // vectorised prime numbers x 1-2K
"f": [1,1,4, ...] // vectorised factorials x 1-2K
}
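A small sketch of how those vectorised (delta-encoded) arrays could be packed and unpacked on the client; the function names are just for illustration:
// Delta-encode an ascending list (store differences) and decode it back.
function deltaEncode(values) {
  return values.map((v, i) => (i === 0 ? v : v - values[i - 1]));
}
function deltaDecode(deltas) {
  let running = 0;
  return deltas.map(d => (running += d));
}
// Example with the primes from the answer:
// deltaEncode([1, 2, 3, 5, 7, 13]); // [1, 1, 1, 2, 2, 6]
// deltaDecode([1, 1, 1, 2, 2, 6]);  // [1, 2, 3, 5, 7, 13]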
I would say the easiest way would be to break the dataset out into multiple data files. The "client" can then download the files as needed, based on which number(s) the user is looking for.
One advantage of this is that you can tune the size of the data files as you see fit, from one number per file up to all of the numbers in one file. The client only has to know how to pick the file its numbers are in. This does require there to be some server, but all it needs to do is serve out the static data files.
To reduce the data load, you can also cache the data files using local storage within the browser.
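A hedged sketch of that chunk-and-cache idea, assuming the data is split into static files by bucket (the bucket size and the /data/properties-N.json URL pattern are invented for the example):
// Fetch the static data file covering a given number, caching it in
// localStorage so repeat lookups skip the download entirely.
const BUCKET_SIZE = 10000;
async function propertiesFor(n) {
  const bucket = Math.floor(n / BUCKET_SIZE);
  const key = 'props-' + bucket;
  let data = localStorage.getItem(key);
  if (!data) {
    const res = await fetch('/data/properties-' + bucket + '.json');
    data = await res.text();
    try { localStorage.setItem(key, data); } catch (e) { /* quota full: skip caching */ }
  }
  return JSON.parse(data)[n - bucket * BUCKET_SIZE]; // properties of n within its bucket
}
// propertiesFor(123456).then(props => console.log(props));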
