I have been working on a game for the past few days and I am currently optimizing the websocket frame data. I've changed the data format from plain JSON to an ArrayBuffer, and that part is working fine. Entity data such as position and rotation is sent and received with no problem.
The entity ID is the problem. Currently the client keeps track of every entity by its ID, storing the last position for smooth movement, and that will stay the same.
Every entity on the server currently has a UUID (e.g. f2e9f5e2-a810-416e-a1ce-a300a0b7a088). That is 16 bytes. I'm not sending that!
And now to my question: how do they do it in big games? Is there any way of getting around this, or a way to generate a unique 2-byte (or some other low-bandwidth) UID?
UPDATE
I need more than 256 IDs, so 1 byte is obviously not going to work. 2 bytes gives me 65,536 IDs, which is probably enough. I also know that looping until you find the next free ID is an option, but maybe that is too "costly".
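To be concrete, something like this free-list approach is what I had in mind, if there's no better way (just a sketch, not tested; the 65,536 cap is my own assumption):

    // Hands out 2-byte IDs, reusing the IDs of despawned entities first.
    const MAX_IDS = 65536;            // 2 bytes -> 0..65535
    const freeIds = [];               // IDs released by despawned entities
    let nextId = 0;                   // next never-used ID

    function allocateId() {
      if (freeIds.length > 0) return freeIds.pop();
      if (nextId >= MAX_IDS) throw new Error('Out of 2-byte entity IDs');
      return nextId++;
    }

    function releaseId(id) {
      freeIds.push(id);               // called when the entity despawns
    }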
I'm working on a product page where you have a set of options which will affect the price.
The main option, which is always there, lets you choose a material. Depending on the material, the set of options can change.
In the database I have a table of about 2000 rows listing every single final product available, with its price and the different options.
Something like:
product_id  code  price  size  option  color
1           ABC   20$    1     3       5
2           DEF   30$    2     4       5
3           FFF   30$    3     4       5
and so on.
The whole thing works with ajax calls, so every time an option changes, I query the database, look for the product with that set of options, and show the price.
Would it make sense in this specific case to get the whole list of products at the beginning (a single query returning about 2000 rows), store it in a JavaScript object, and filter it client-side?
If it's of any importance, I'm using MySQL.
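To make it concrete, this is roughly what I have in mind on the client (a sketch; the field names match my table above, the endpoint name is made up):

    // One request up front, then filter in memory on every option change.
    let products = [];

    fetch('/api/products.php')                    // hypothetical endpoint returning all ~2000 rows as JSON
      .then(res => res.json())
      .then(rows => { products = rows; });

    function findPrice(size, option, color) {
      const match = products.find(p =>
        p.size === size && p.option === option && p.color === color);
      return match ? match.price : null;          // null if no product has that combination
    }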
Likely yes, but there are a lot of variables that could affect it. I'm assuming that:
The visitor is a typical web user
The ajax request has a round trip time of roughly 100ms
Given these circumstances, your average visitor's browser could almost certainly search through millions of products during that time.
However, assuming you're optimising the user experience (i.e. the delay caused by ajax is rather noticeable), you probably want a hybrid:
Cache everywhere
The chances are your product set changes far less often than people access it; that means your data is very read-heavy. This is a great opportunity to avoid hitting the database entirely and cache something like example.com/products/14/all-options.json as a static file.
Storage of text is cheap. Server CPU time less so.
If there are a lot of options for a particular product (i.e. tens of thousands), you could instead cache them as a tree of static files. For example, example.com/products/14/size-1/all-options.json gives all the options that are size #1 of product #14, example.com/products/14/size-1/option-4/all.json is all of size #1, option #4, and so on.
You can then filter these smaller sets with JavaScript and potentially have millions of products without needing huge database hits or large-ish downloads on startup.
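A rough sketch of that hybrid, assuming the static-file layout above (the exact paths and field names are just illustrative):

    // Fetch the pre-generated static file for the chosen size, then filter the
    // remaining options in the browser. No database hit per option change.
    function loadOptions(productId, size) {
      return fetch('/products/' + productId + '/size-' + size + '/all-options.json')
        .then(res => res.json());
    }

    loadOptions(14, 1).then(options => {
      const match = options.find(o => o.option === 4 && o.color === 5);
      console.log(match ? match.price : 'no such combination');
    });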
Filtering 2000 objects in JavaScript isn't a problem, but bear this in mind: MySQL is made for querying data, so it is better at it. Also think about mobile devices with low specifications, PCs with few resources, and so on. And what if those 2000 objects turn into more? That will lengthen both the request time and the JavaScript filtering.
I need to move my local project to a web server, and it is time to start saving things locally (users' progress and history).
The main idea is that every 50 ms or so the webapp will calculate 8 values related to the user who is using it.
My questions are:
Should I use MySQL to store the data? At the moment I'm using a plain text file with a predefined format like:
Option1,Option2,Option3
Iteration 1
value1,value2,value3,value4,value5
Iteration 2
value1,value2,value3,value4,value5
Iteration 3
value1,value2,value3,value4,value5
...
If so, should I use 5 (or more in the future) columns (one for each value), with the iteration number as their ID? Keep in mind I will have 5000+ iterations per session (roughly 4 minutes).
Each user can have 10-20 sessions a day.
Will the DB become too big to be efficient?
Due to the sampling rate, a call to the DB every 50 ms seems like a problem to me (especially since I have to animate the webpage heavily). I was wondering if it would be better to implement a Save button which populates the DB with all the 5000+ values in one go. If so, what would be the best way to do it?
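Something like this is what I imagine for the Save button (just a sketch; the endpoint and payload shape are made up):

    // Collect every iteration in memory while the animation runs...
    const session = [];

    function recordIteration(values) {        // values = [value1, ..., value5]
      session.push(values);                   // no network traffic during the session
    }

    // ...and send the whole session in one request when the user clicks Save.
    function saveSession() {
      return fetch('/save_session.php', {     // hypothetical endpoint
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ iterations: session })
      });
    }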
Would it be better to save the *.txt directly in a folder on the web server? Something like DB/usernameXYZ/dateZXY/filename_XZA.txt . To me yes, way less effort. If so, which function allows me to do that (possibly JS/HTML)?
The rules are simple, and are discussed in many Q&A here.
With rare exceptions...
Do not have multiple tables with the same schema. (Eg, one table per User)
Do not splay an array across columns. Use another table.
Do not put an array into a single column as a comma-separated list. Exception: if you never use SQL to look at the individual items in the list, then it is OK for it to be an opaque text field.
Be wary of EAV schema.
Do batch INSERTs or use LOAD DATA. (10x speedup over one-row-per-INSERT; see the sketch after this list.)
Properly indexed, a billion-row table performs just fine. (Problem: It may not be possible to provide an adequate index.)
Images (a la your .txt files) could be stored in the filesystem or in a TEXT column in the database -- there is no universal answer as to which to do. (That is, I need more details to answer your question.)
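For the batch-INSERT rule above, a minimal sketch, assuming a Node.js backend with the mysql package (your server stack may differ; the table and column names are made up):

    const mysql = require('mysql');
    const connection = mysql.createConnection({
      host: 'localhost', user: 'app', password: 'secret', database: 'app'
    });

    // rows = [[sessionId, iteration, v1, v2, v3, v4, v5], ...] -- all 5000+ at once
    function insertSession(rows, callback) {
      // One multi-row INSERT instead of 5000 single-row INSERTs (~10x faster).
      connection.query(
        'INSERT INTO samples (session_id, iteration, v1, v2, v3, v4, v5) VALUES ?',
        [rows],
        callback
      );
    }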
"calculate 8 values that are related to the user" -- to vague. Some possibilities:
Dynamically computing a 'rank' is costly and time-consuming.
Summary data is best pre-computed
Huge numbers (eg, search hits) are best approximated
Calculating age from birth date - trivial
Data sitting in the table for that user is, of course, trivial to get
Counting number of 'friends' - it depends
etc.
We have a large database of partial urls (strings) such as:
"example1.com"
"example2.com/test.js"
"/foo.js"
Our software listens for HTTP requests and tries to find one of our database's partial urls in the HTTP request's full url.
So we are getting full URLs (e.g. "http://www.example.com/blah.js?foo=bar") and trying to match one of our database's partial patterns against them.
Which would be the best data structure to store our partial-URL database in, if all we care about is search speed?
Right now, this is what we do:
We iterate through the entire database of partial URLs (strings) and use indexOf (in JavaScript) to see whether the full URL contains each partial string.
UPDATE:
This software is a Firefox extension written in JavaScript on Firefox's Add-on SDK.
Assuming that your partial strings are only domain names and/or page names, you could try to generate all possible combinations from the URL, starting from the end:
http://www.example.com/blah.js?foo=bar
blah.js
example.com/blah.js
www.example.com/blah.js
Then hash all the combinations, store them in an array, and check whether any of them appears in another structure that contains the hashes of all your partial strings from the database.
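A sketch of that idea in JavaScript (I'm skipping the explicit hashing step, since looking strings up as object keys is already a hash lookup; the example patterns are the ones from your question):

    // Build a hash lookup from the database of partial urls once.
    const patterns = Object.create(null);
    ['example1.com', 'example2.com/test.js', '/foo.js'].forEach(function (p) {
      patterns[p] = true;
    });

    // Generate the candidate combinations for a full url, starting from the end.
    function candidates(fullUrl) {
      const noScheme = fullUrl.replace(/^[a-z]+:\/\//i, '');   // drop "http://"
      const noQuery = noScheme.split('?')[0];                  // drop "?foo=bar"
      const slash = noQuery.indexOf('/');
      const host = slash === -1 ? noQuery : noQuery.slice(0, slash);
      const path = slash === -1 ? '' : noQuery.slice(slash);   // "/blah.js"
      const out = [];
      if (path) out.push(path, path.slice(1));                 // "/blah.js", "blah.js"
      const parts = host.split('.');
      for (let i = 0; i < parts.length - 1; i++) {
        const domain = parts.slice(i).join('.');
        out.push(domain, domain + path);                       // "example.com", "example.com/blah.js", ...
      }
      return out;
    }

    function matches(fullUrl) {
      return candidates(fullUrl).some(function (c) { return c in patterns; });
    }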
NOTE:
In case you want to match ANY string in the URL, like "ample" in example.com, it becomes a little complicated in terms of storage, because the number of possible combinations of characters from a URL is "n choose k" (the binomial coefficient), where n is the length of the URL and k is the length of the string to find. According to this SO question the maximum reasonable length of a URL is 2000 characters, and assuming you want to match arbitrary strings, k would vary from 1 to 2000, which results in a huge number of hashes generated from the URL: the sum of "n choose k" for each k from 1 to 2000.
Or, more precisely, 2000! / (k! * (2000 - k)!) different hashes for each k.
There are a few things you could do:
Don't process the URLs on the client side. JavaScript is going to be slow, especially if you have a lot of these URLs. You can create a REST API and pass in the URL to match as a query parameter, e.g. domain.com/api/?url=.... Moving the heavy lifting and memory use to the server side will also decrease your bandwidth use.
Bootstrap the URLs into RAM and don't read from the database every time. Something like memcached could work perfectly in this case.
Once in RAM, a hash table structure would work best since you are doing simple matching. Whatever you do, avoid string comparison.
If you follow these few suggestions, you should see a significant speedup; a rough sketch of how the pieces could fit together is below. Hope this helps.
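A sketch of points 1-3 together, assuming a Node.js/Express backend (the route name, port, and hard-coded patterns are placeholders; only exact host/path candidates are checked here, so extend the candidate list as needed):

    const express = require('express');
    const url = require('url');
    const app = express();

    // 2. Bootstrap the partial urls into RAM once at startup (swap this array
    //    for your real database query or a memcached read).
    const patterns = new Set(['example1.com', 'example2.com/test.js', '/foo.js']);

    // 3. Simple matching against a hash-based structure: look up a handful of
    //    candidate keys instead of scanning every pattern with indexOf.
    function isMatch(fullUrl) {
      const parsed = url.parse(fullUrl);        // { hostname: 'www.example.com', pathname: '/blah.js', ... }
      const host = parsed.hostname || '';
      const path = parsed.pathname || '';
      const candidates = [host, host + path, path, path.slice(1)];
      return candidates.some(function (c) { return patterns.has(c); });
    }

    // 1. Expose it as a REST endpoint: GET /api/match?url=<encoded full url>
    app.get('/api/match', function (req, res) {
      res.json({ matched: isMatch(String(req.query.url || '')) });
    });

    app.listen(3000);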
I'm writing a standalone JavaScript application with Spine, Node.js, etc. (Here is an earlier incarnation of it if you are interested.) Basically, the application is an interactive 'number property' explorer. The idea is that you can select any number and see what properties it possesses. Is it prime, or triangular, etc.? Where are other numbers that share the same properties? That kind of thing.
At the moment I can pretty easily show numbers 1-10k, but I would like to show properties for numbers up to 1 million, or even better 1 billion.
I want my client to download a set of static data files, and then use them to present the information to the user. I don't want to write a server backend.
Currently I'm using JSON for the data files. For some data, I know a simple algorithm to derive the information I'm looking for on the client side, and I use that (e.g. is it even?). For the harder ones, I pre-compute the values and store them in JSON-parseable data files. I've kinda gone a little overboard with the whole thing: I implemented a pure JavaScript Bloom filter, and when that didn't scale to 1 million for primes, I tried using CONCISE bitmaps underneath (which didn't help). Eventually I realized that it doesn't matter too much how 'compressed' I get my data if I'm representing it as JSON.
So the question is: I want to display 30 properties for each number, and I want to show a million numbers... that's 30 million data points. I want the JavaScript app to download this data and present it to the user, but I don't want the user to have to download megabytes of information to use the app...
What options do I have for efficiently sending these large sets of data to my javascript only solution?
Can I convert to binary and then read binary on the client side? Examples, please!
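For instance, is one bit per number per property workable? Something like this is what I'm imagining on the client (just a sketch; the file name and layout are made up):

    // primes.bin: 1,000,000 bits = 125,000 bytes, bit i set if number i+1 is prime.
    fetch('data/primes.bin')
      .then(function (res) { return res.arrayBuffer(); })
      .then(function (buf) {
        const bits = new Uint8Array(buf);
        function hasProperty(n) {               // n is 1-based
          const i = n - 1;
          return (bits[i >> 3] & (1 << (i & 7))) !== 0;
        }
        console.log(hasProperty(17));           // true if 17 is flagged in the file
      });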
How about just computing these data points on the client?
You'll save yourself a lot of headache. You can pre-compute the index chart and leave the rest of the data-points to be processed only when the user selects a particular number.
As for the properties exhibited per number: pure JavaScript on modern desktops is blindingly fast (if you stay away from the DOM). I think you'll find the processing-speed difference between the algorithmic and the pre-computed JSON solutions is negligible, and you'll save yourself a lot of pain and unnecessary bandwidth usage.
As for the initial index chart, this displays only the number of properties per number and can be transferred as an array:
'[18,12,9,11,9,7,8,2,6,1,4, ...]'
or in JSON:
{"i": [18,12,9,11,9,7,8,2,6,1,4, ...]}
Note that this works the same for a logarithmic scale, since either way you can only attach a value to one point on the screen at any one time. You just have to tailor the contents of the array accordingly (by returning logarithmically spaced values sequentially in a 1-2K-sized array).
You can even use DEFLATE to compress it further, but since you can only display a limited number of values on screen (<1-2K pixels on a desktop), I would recommend building your solution around this fact, for example by checking whether you can calculate 2K * 30 = 60K properties on the fly with minimal impact, which will probably be faster than asking the server for some JSON at that point.
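For example, a quick sketch of checking two properties on the fly for a window of ~2K visible numbers (my own property choices, just to illustrate the cost):

    function isPrime(n) {
      if (n < 2) return false;
      for (let d = 2; d * d <= n; d++) {
        if (n % d === 0) return false;
      }
      return true;
    }

    function isTriangular(n) {
      // n is triangular iff 8n + 1 is a perfect square
      const s = Math.sqrt(8 * n + 1);
      return Math.floor(s) === s;
    }

    // Properties for the ~2K numbers currently on screen, computed per render.
    function propertiesForWindow(start, count) {
      const out = [];
      for (let n = start; n < start + count; n++) {
        out.push({ n: n, prime: isPrime(n), triangular: isTriangular(n) });
      }
      return out;
    }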
UPDATE 10-Jan-2012
I just saw your comment about users being able to click on a particular property and get a list of numbers that display that property.
I think the initial transfer of the number of properties above can be jazzed up to include all properties in the initial payload, bearing in mind that you only want to transfer the values for numbers shown on the initial logarithmic scale (that means you can skip numbers that are not going to be represented on screen when a user first loads the page or clicks on a property). Anything beyond the initial payload can be calculated on the client.
{
"n": [18,12,9,11,9,7,8,2,6,1,4, ...] // number of properties x 1-2K
"p": [1,2,3,5,7,13,...] // prime numbers x 1-2K
"f": [1,2,6, ...] // factorials x 1-2K
}
My guess is that a JSON object like this will be around 30-60K, but you can further reduce this by removing properties whose algorithms are not recursive and letting the client calculate those locally.
If you want an alternative way to compress those arrays when you get to large numbers, you can format your array as a VECTOR instead of a list of numbers, storing the differences between one number and the next; this keeps the size down when you are dealing with large numbers (>1000). An example of the JSON above using vectors would be as follows:
{
"n": [18,-6,-3,2,-2,-2,1,-6,4,-5,-1, ...] // vectorised no of properties x 1-2K
"p": [1,1,2,2,2,6,...] // vectorised prime numbers x 1-2K
"f": [1,1,4, ...] // vectorised factorials x 1-2K
}
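Decoding the vectorised form on the client is just a running sum, e.g. (sketch):

    // Turn a vectorised (delta-encoded) array back into absolute values.
    function unvectorise(deltas) {
      const out = [];
      let current = 0;
      for (let i = 0; i < deltas.length; i++) {
        current += deltas[i];
        out.push(current);
      }
      return out;
    }

    unvectorise([1, 1, 1, 2, 2, 6]);   // -> [1, 2, 3, 5, 7, 13]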
I would say the easiest way would be to break the dataset out into multiple data files. The "client" can then download the files as needed based on what number(s) the user is looking for.
One advantage of this is that you can tune the size of the data files as you see fit, from one number per file up to all of the numbers in one file. The client only has to know how to pick the file its numbers are in. This does require there to be some server, but all it needs to do is serve out the static data files.
To reduce the data load, you can also cache the data files using local storage within the browser.
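A sketch of what the client side could look like, assuming chunks of 10,000 numbers per file (the file naming and chunk size are made up):

    const CHUNK_SIZE = 10000;

    // Fetch the data file containing `number`, caching it in localStorage so a
    // chunk is only ever downloaded once per browser.
    function loadChunkFor(number) {
      const chunk = Math.floor((number - 1) / CHUNK_SIZE);     // numbers are 1-based
      const key = 'numprops-chunk-' + chunk;
      const cached = localStorage.getItem(key);
      if (cached) return Promise.resolve(JSON.parse(cached));

      return fetch('data/chunk-' + chunk + '.json')
        .then(function (res) { return res.json(); })
        .then(function (data) {
          localStorage.setItem(key, JSON.stringify(data));
          return data;
        });
    }

    loadChunkFor(123456).then(function (data) {
      console.log(data);   // properties for numbers 120001-130000 (for example)
    });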