I'm working on a PhoneGap app and find it most efficient to cache as much user data as possible on the device when the user logs in. (Profile information, etc.) Due to project constraints, I can't use local storage.
I'm making an API call and pulling back a chunk of JSON data that I use to power the app. My specific question: is it somewhat safe to assume that the byte size of the JSON response will be roughly equal to the memory consumed? I.e. if the API call response is 200k of JSON data, will about 200k of memory be used to store it in a JavaScript object?
The best answer is, "Test it."
As several commenters have said, under many common conditions there will be a rough 1:1 correspondence, but there are so many caveats and gotchas, some of which are engine-specific, that it's impossible to say with any certainty.
In other words, if you test it and it looks like the assumption holds under your particular use-case, then 'somewhat safe' is a reasonable assumption. Just don't bet the farm on it.
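If you do want a quick sanity check, here is a minimal sketch. It assumes Chrome, whose non-standard performance.memory API exposes the JS heap size; the numbers are only rough, because garbage collection can shift the heap at any moment.

// Compare the serialized size of the response with the heap growth after
// parsing. Chrome-only: performance.memory is non-standard and may be absent.
function roughMemoryCheck(jsonText) {
  var before = performance.memory ? performance.memory.usedJSHeapSize : 0;
  var parsed = JSON.parse(jsonText);                  // build the JavaScript object
  var after = performance.memory ? performance.memory.usedJSHeapSize : 0;
  console.log('response size (chars):', jsonText.length);
  console.log('approx heap delta (bytes):', after - before);
  return parsed;
}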
I am new to this area, so the question may seem strange. Before asking, though, I've read a bunch of introductory articles about the key ideas in machine learning and the moving parts of neural networks, including a very useful one: What is machine learning. Basically, as I understand it, a trained NN is (correct me if this is wrong):
a set of connections between neurons (maybe self-connected, maybe gated, etc.)
activation probabilities formed on each connection.
Both things are adjusted during training to fit the expected output as closely as possible. Then, what we do with a trained NN is load the test subset of data into it and check how well it performs. But what happens if we're happy with the test results and want to keep the training results, rather than run training again later when the dataset gets new values?
So my question is: is that learned knowledge stored anywhere other than RAM? Can it be dumped (think of object serialisation, in a way) so that you don't need to retrain your NN on the data you get tomorrow or later?
Now I am trying to make a simple demo with my dataset using synaptic.js, but I could not spot that kind of save-the-training concept in the project's wiki.
That library is just an example; if you can reference some Python lib as well, that would be good too!
With regards to storing it via synaptic.js:
This is quite easy to do! Synaptic.js actually has built-in support for it, and there are two ways to go about it.
If you want to use the network without training it again
This will create a standalone function of your network; you can use it anywhere with JavaScript, without requiring synaptic.js! Wiki
var standalone = myNetwork.standalone();
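Judging by the wiki example, the returned function can then be fed an input array directly; double-check the exact signature against the wiki, but the usage should look roughly like this:

// Assumed usage: the standalone function takes an input array and returns
// the network's output activations, with no synaptic.js dependency.
var output = standalone([0, 1]);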
If you want to modify the network later on
Just convert your network to a JSON. This can be loaded up anytime again with synaptic.js! Wiki
// Export the network to a JSON which you can save as plain text
var exported = myNetwork.toJSON();
// Convert the JSON back into a usable network
var imported = Network.fromJSON(exported);
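Since the exported object is plain JSON, persisting it is just ordinary serialisation. A minimal sketch, using localStorage purely as an example store (a file or database works just as well):

// Persist the trained network as text...
localStorage.setItem('trainedNet', JSON.stringify(exported));
// ...and later restore it without retraining
var saved = JSON.parse(localStorage.getItem('trainedNet'));
var restored = Network.fromJSON(saved);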
I will assume in my answer that you are working with a simple multi-layer perceptron (MLP), although my answer is applicable to other networks too.
The purpose of 'training' an MLP is to find the correct synaptic weights that minimise the error on the network output.
When a neuron is connected to another neuron, its input is given a weight. The neuron performs a function, such as the weighted sum of all inputs, and then outputs the result.
Once you have trained your network, and found these weights, you can verify the results using a validation set.
If you are happy that your network is performing well, you simply record the weights that you applied to each connection. You can store these weights wherever you like (along with a description of the network structure) and then retrieve them later. There is no need to re-train the network every time you would like to use it.
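As an illustration only (the names below are made up, not taken from any particular library), the snapshot can be as simple as the structure plus the weight values, serialised to JSON:

// Hypothetical trained MLP: record the structure and the learned weights
var snapshot = {
  layers: [4, 8, 3],            // neurons per layer (input, hidden, output)
  weights: trainedNet.weights,  // e.g. nested arrays of synaptic weights
  biases: trainedNet.biases
};
var serialised = JSON.stringify(snapshot);   // write this to a file or DB
// Later: parse it and copy the weights back into a freshly built network
var restored = JSON.parse(serialised);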
Hope this helps.
I have a website that contains graphs which display employee activity records. There are tiers of data (i.e. region -> state -> office -> manager -> employee -> activity record), and each time you click the graph it drills down a level to display more specific information. The highest level (region) requires me to load ~1000 objects into an array, and the lowest level is ~500,000 objects. I am populating the graphs from a JSON-formatted text file using:
$.ajax({url: 'data/jsondata.txt', dataType: 'json',
success: function (data) {
largeArray = data.employeeRecords;
}
});
Is there an alternative method I could use without hindering response time/performance? I am caught up in the thought that I must pre-load all of the data client side, otherwise there will be lag time if I need to fetch it on a user click. If anyone can point me to best practices, and maybe even explain what is considered "TOO MUCH" client-side data, I'd appreciate it.
FYI, I'm restricted to using an old web server, and if I want to do anything server side I'd be limited to classic ASP; otherwise it has to be client side. Thank you!
If your server responds quickly
In this case, you can probably simply load data on demand when a user clicks. The server is quick, so why bother trying to be smarter for no gain.
If the server is quick, but not quick enough, then you might be able to preload the next level while drawing the current one. E.g. if you have just rendered the graph at the "office" level, silently preload the "manager" (next level down) data while the user is still reacting to the screen update.
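A rough sketch of that idea with jQuery; the endpoint, parameter names and drawGraph function are invented for illustration (classic ASP would be fine as the back end):

// Load one level of the hierarchy on demand
function loadLevel(level, parentId, onReady) {
  $.ajax({
    url: 'data/records.asp',                  // hypothetical server endpoint
    data: { level: level, parent: parentId },
    dataType: 'json',
    success: onReady
  });
}

// Draw the level the user clicked, then quietly prefetch the next one down
loadLevel('office', regionId, function (data) {
  drawGraph(data);                             // your existing charting code
  loadLevel('manager', data.officeId, function () { /* warms the cache */ });
});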
If the server is too slow for data on demand
In this case you probably need to model exactly where it is slow and address that. There are several things in play here, and your question doesn't say exactly which:
1. Is the server slow to query the database? If yes, fix it; there is little you can do client side to solve this.
2. Is the server slow to package the data for transmission? Harder to fix; is the server big enough?
3. Is network transmission slow? Hmmm, then you need to send less data or get users onto faster bandwidth.
4. Is browser unpack time slow (i.e. the delay decoding the data before your script can chart it)? Change how you package the data, or send less data, such as chunks (see the sketch after this list).
5. Can browsers handle 500,000 objects? You should be able to just monitor the memory of the browser you are using, and there are opinions both ways on this. It will really depend on your target users' browsers/hardware.
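As one illustration of the "chunks" idea in step 4, here is a minimal sketch; the endpoint and its offset/limit parameters are invented, and the server would need to support slicing the result set:

// Fetch the records one slice at a time instead of 500,000 objects at once
var largeArray = [];
function loadChunk(offset, size) {
  $.ajax({
    url: 'data/jsondata.asp',                 // hypothetical chunked endpoint
    data: { offset: offset, limit: size },
    dataType: 'json',
    success: function (data) {
      largeArray = largeArray.concat(data.employeeRecords);
      if (data.employeeRecords.length === size) {
        loadChunk(offset + size, size);       // keep going until exhausted
      }
    }
  });
}
loadChunk(0, 5000);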
You might like to look at this question, What is the most efficient way of sending data for a very large playlist over http?, as it shows an alternative way of sending and handling data which I've found to be much quicker for step 4 above. Of course, at 500k objects you will no longer be able to use localStorage, but I've been experimenting with downloading millions of array elements and it works OK (still WIP). I don't use jQuery, so I'm not sure how usable this is for you either.
Best practice? Sorry cannot help with that part of the question.
Let's say I'm making an HTML5 game using JavaScript and the <canvas> element. The variables are stored in the DOM, such as level, exp, current_map, and the like.
Obviously, they can be edited client-side using Firebug. What would I have to do to maximize security, so it would be really hard to edit (and cheat)?
Don't store the variables in the DOM if you want a reasonable level of security. JavaScript, even if obfuscated, can easily be reverse engineered. That defeats any local encryption mechanisms.
Store key variables server-side and use https to maximize security. Even so, the client code (JavaScript) is quite vulnerable to hacking.
You can use Object.freeze or a polyfill or a framework which does the hiding for you.
Check out http://netjs.codeplex.com/
You could also optionally implement some type of signing system but nothing is really impenetrable. For instance objects locked with Object.freeze or Object.watch can still be manually modified in memory.
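A minimal illustration of the Object.freeze approach mentioned above; note that, as said, it only blocks casual script-level tampering, not someone editing memory or re-creating the object:

// Freeze the game state so plain assignments can no longer change it
var gameState = Object.freeze({
  level: 1,
  exp: 0,
  current_map: 'start'
});
gameState.exp = 999999;       // silently ignored (throws a TypeError in strict mode)
console.log(gameState.exp);   // still 0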
What are you really trying to accomplish in the end?
What you could do is send a representation of the matrix of the game, or the game itself, or a special hash, or a combination of both, and tally the score at the server... forcing the user not only to modify the score but also to correctly modify the state of the game.
Server-side game logic
You need to keep the sensitive data on the server and a local copy on the browser for display purposes only. Then for every action that changes these values the server should be the one responsible for verifying them. For example if the player needs to solve a puzzle you should never verify the solution client side, but take for example the hash value of the ordered pieces represented as a string and send it to the server to verify that the hash value is correct. Then increase the xp/level of the player and send the information back to the client.
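As a sketch of that puzzle example (the endpoint, puzzle id and field names are invented; any server stack could do the comparison), the client only ever submits a fingerprint of its ordered pieces, and the server decides whether it matches the correct solution:

// Hash the ordered pieces in the browser and let the server verify it
function submitSolution(puzzleId, pieces) {
  var bytes = new TextEncoder().encode(pieces.join(','));
  return crypto.subtle.digest('SHA-256', bytes).then(function (buf) {
    var hex = Array.prototype.map.call(new Uint8Array(buf), function (b) {
      return ('0' + b.toString(16)).slice(-2);
    }).join('');
    return fetch('/api/verify-solution', {    // hypothetical endpoint
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ puzzle: puzzleId, hash: hex })
    });
  });
}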
Anything that lives in the client can be modified. That is why, in an MMORPG, the character's data lives on the server: players can't hack their characters using memory tools, hex editors, etc. (they actually "can", but because the server keeps the correct version of the character's data it is useless).
A good example was Diablo 2: you actually had two different kinds of characters: one for single player (and network play with other players, where one machine acted as the server), and one for Battle.net. In the first case, people could "hack" the character's level and points just by editing the memory on the fly, or the character file with a hex editor. But that wasn't possible with the character you used on Battle.net.
Another simple example could be a quiz where you have a limited time to answer. If you handle everything on the client side, players could hack it, modify the elapsed time and always get the best score: so you need to store the timestamp on the server as well, and use that value as the comparison when you get the answer.
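A sketch of that quiz example with a hypothetical Node/Express back end (the routes, the in-memory store and the isCorrect helper are all made up; any server stack works the same way):

// The server, not the client, owns the clock
var express = require('express');
var app = express();
app.use(express.json());

var startTimes = {};   // quizId -> start timestamp (use a real store in practice)

app.post('/quiz/:id/start', function (req, res) {
  startTimes[req.params.id] = Date.now();
  res.json({ started: true });
});

app.post('/quiz/:id/answer', function (req, res) {
  var elapsed = Date.now() - startTimes[req.params.id];
  var inTime = elapsed <= 30000;   // e.g. a 30 second limit
  // isCorrect() is a placeholder for your own answer check
  res.json({ accepted: inTime && isCorrect(req.params.id, req.body.answer) });
});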
To sum up, it doesn't matter if it's JavaScript, C++ or Assembly: the rule is always "Don't rely on the client". If you need security for your game data, you have to use something the clients have no access to: the server.
For maximum load speed and page efficiency, is it better to have:
An 18MB JSON file, containing an array of dictionaries, that I can load and start using as a native JavaScript object (e.g. var myname = jsonobj[1]['name']).
A 4MB CSV file that I need to read using the jquery.csv plugin, and then use lookups to refer to (e.g. var nameidx = titles.getPos('name'); var myname = jsonobj[1][nameidx]).
I'm not really expecting anyone to give me a definitive answer, but a general suspicion would be very useful. Or tips for how to measure - perhaps I can check the trade-off between load speed and efficiency using Developer Tools.
My suspicion is that any extra efficiency from using a native JavaScript object in (1) will be outweighed by the much smaller size of the CSV file, but I would like to know if others think the same.
Did you consider delivering the JSON content using gzip? Here are some benchmarks on gzip: http://www.cowtowncoder.com/blog/archives/2009/05/entry_263.html
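How you enable it depends entirely on your server; purely as an illustration, with a hypothetical Node/Express layer serving the files it is a one-line middleware (Apache's mod_deflate or IIS compression achieve the same thing):

// Gzip every compressible response, including the big JSON/CSV files
var express = require('express');
var compression = require('compression');
var app = express();
app.use(compression());
app.use(express.static('public'));   // e.g. public/data.json, public/data.csv
app.listen(3000);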
What is your situation? Are you writing some intranet site where you know what browser users are using and have something like a reasonable expectation of bandwidth, or is this a public-facing site?
If you have control of what browsers people use, for example because they're your employees, consider taking advantage of client-side caching. If you're trying to convince people to use this data you should probably consider breaking the data up into chunks and serving it via XHR.
If you really need to serve it all at once then:
Use gzip
Are you doing heavy processing of the data on the client side? How many of the items are you actually likely to go through? If you're only likely to access fewer than 1,000 of them in any given session then I would imagine that the 14MB savings would be worth it. If on the other hand you're comparing all kinds of things against each other all the time (because you're doing some sort of visualization or... anything) then I imagine that the JSON would pay off.
In other words: it depends. Benchmark it.
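A simple way to benchmark it yourself, using console.time in the browser; the file names are placeholders, and the $.csv.toArrays call is from the jquery.csv plugin mentioned in the question (check its docs for the exact API):

// Time option 1: download + native JSON parse
console.time('json');
$.getJSON('data.json', function (obj) {
  console.timeEnd('json');
});

// Time option 2: download + CSV parse via the plugin
console.time('csv');
$.get('data.csv', function (text) {
  var rows = $.csv.toArrays(text);   // rows is now an array of arrays
  console.timeEnd('csv');
});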
4MB vs 18MB? Where is the problem? JSON is just the standard format now; CSV is maybe just as good, and it's fine if you're using it. My opinion.
14MB of data is a HUGE difference, but I would first try to serve both kinds of content with GZIP/Deflate server-side compression and then compare the requests (probably the CSV request will still come out ahead in content length).
Then I would also try to create some data-manipulation tests on jsPerf, with both the CSV and the JSON data, using a realistic test case/common usage.
That depends a lot on the bandwidth of the connection to the user.
Unless this is only going to be used by people who have a super fast connection to the server, I would say that the best option would be an even smaller file that only contains the actual information that you need to display right away, and then load more data as needed.
I am playing around with CouchDB to test if it is "possible" [1] to store scientific data (simulated and experimental raw data + metadata). A big pro is the schema-less approach of CouchDB: we have to be very flexible with the metadata, as the set of parameters changes very often.
Up to now I have some code to feed raw data, plots (both as attachments), and hierarchical metadata (as JSON) into CouchDB documents, and I have written some prototype JavaScript for filtering and displaying them. But the filtering is done on the client side (i.e. in the browser): the map function simply returns everything.
How could I change the map function of a specific _design document (or push a second one) with simple browser-side JS?
I do not think that a temporary view would yield any performance gain...
Thanks for your time and answers.
[1]: of course it is possible, but is it also useful? feasible? reasonable?
[added]
Ah, jquery.couch.js (version 0.9.0) provides a saveDoc() function, which could update the _design document with the new map function.
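If I read the jquery.couch.js API correctly, updating a design document with a new (server-side) view could look roughly like the sketch below; the database, document and field names are just placeholders, and the session needs admin rights on the database:

// Fetch the design doc, add/replace a view whose map function filters on a
// metadata field, then save it back - it is just a normal document update.
$.couch.db('experiments').openDoc('_design/filters', {
  success: function (ddoc) {
    ddoc.views = ddoc.views || {};
    ddoc.views.by_temperature = {
      map: "function (doc) { if (doc.metadata && doc.metadata.temperature) { emit(doc.metadata.temperature, doc); } }"
    };
    $.couch.db('experiments').saveDoc(ddoc, {
      success: function () { console.log('design document updated'); }
    });
  }
});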
But I also tried out the query function, which uses a temporary view. Okay, "do not use this in the real product, only during development"... But scientific research is steady development, right?
Temporary views do get cached, as I noticed, and it works well for ~1000 documents per DB. A second plus: all users (think 1 to 3, so a big user-management setup is quite an overkill) can work with their own temporary view.
Never ever use temporary views. They are really only there for dev and debugging purposes. For more information, see http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views (specifically the bold "NOTE").
And yes, because design documents are really just documents with special powers, you can run your GET/POST/PUT/DELETE methods on them. However, you will usually need admin privileges to do this. So, if you are allowing a client-side piece of software to do that, you are making your entire database public for read/write access - this may be fine for your application, but it is important to remember.
E.g., if you restrict access to your database but put the username and password in client-side JavaScript, then anyone can see that username and password.
Cheers.
I've written some helper functions for jquery.couch and design docs; take a look at:
https://github.com/grischaandreew/jquery.couch.js