I'm writing a diagnostics application that writes data to IndexedDB every three seconds. The size of the data is fixed, and data older than one week is automatically deleted, resulting in a maximum of 201600 entries. This makes calculating the IndexedDB space requirements rather easy and precise. I would like to reserve space in IndexedDB when the site is launched, as the end user is likely to leave the browser unattended for long periods of time.
The simple solution seems to be to store a very large object that will prompt the user to accept space requirements. This requires storing then deleting a very large object, which takes a lot of time and processing power. This would also require checking whether this test has already taken place to ensure that it is only run once.
Is there a better solution?
I don't think there is a good solution to this. Even your solution doesn't really work. Just because you were once able to store a certain amount of data doesn't mean you will always be able to store that amount of data. The quota isn't constant. For instance, in Chrome it's roughly 10% of free hard drive space, which clearly can change over time.
If possible, the best solution would be to make your app gracefully handle the case where the quota is exceeded. Because it will happen no matter how you design your app.
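For example, here is a minimal sketch of what that graceful handling could look like, assuming an already-open database db with an object store called 'samples' (both names, and the recovery helper, are hypothetical):

// Wrap the periodic write so a quota failure is detected and handled
// instead of crashing the app.
function saveSample(db, sample) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('samples', 'readwrite');
    tx.objectStore('samples').add(sample);
    tx.oncomplete = () => resolve();
    tx.onabort = () => {
      if (tx.error && tx.error.name === 'QuotaExceededError') {
        // App-specific recovery: e.g. delete the oldest entries and retry,
        // or warn the user that diagnostics can no longer be recorded.
        resolve(handleQuotaExceeded(db, sample)); // hypothetical helper
      } else {
        reject(tx.error);
      }
    };
  });
}

navigator.storage.estimate() can also give you a rough idea of the remaining quota, but as noted above that figure can change over time, so treat it as a hint rather than a reservation.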
Related
What I am trying to do is set a limit on how much processing power my script can use in JavaScript. For instance, after 2 MB it starts to lag, similar to how NES games would lag when too much action is being displayed on the screen.
I have tried doing research, but nothing is giving me the answer I'm looking for. Everything is about improving performance as opposed to reducing the amount that can be used.
Since it seems you are trying to limit memory usage, you might be able to track whatever it is that is using the memory and simply cap that. For example, if you're creating a lot of objects, you could reuse already-created objects once you've created a certain number, which would limit the space used.
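As a rough illustration of that reuse idea, here is a minimal object-pool sketch (the pool size and the particle shape are arbitrary examples):

// Hand out recycled objects instead of allocating new ones, and cap the
// total number of objects that can ever exist.
function createPool(factory, maxSize) {
  const free = [];
  let created = 0;
  return {
    acquire() {
      if (free.length > 0) return free.pop();
      if (created >= maxSize) return null; // cap reached: caller must wait or reuse
      created += 1;
      return factory();
    },
    release(obj) {
      free.push(obj); // make the object available for reuse
    },
  };
}

// Usage: at most 1000 particle objects will ever be allocated.
const particles = createPool(() => ({ x: 0, y: 0, vx: 0, vy: 0 }), 1000);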
As an aside, I would suggest you check out service workers; they might present an alternative way of solving your issue.
Do javascript variables have a storage capacity limit?
I'm designing a YUI DataTable where I fetch data from the database, store it in a JS object, and extract it wherever required to update the DataTable. Right now in development I have very few records and it stores them correctly. In production I may have thousands of records; is a JS object capable of storing all these thousands of records?
If it's not, I'll create a hidden textarea in the JSP and store the data there.
Yes, objects and arrays have storage limits. They are sufficiently large to be, for most purposes, theoretical. You will be more limited by the VM than the language.
In your specific case (sending thousands of items to a client), you will run into the same problem whether it is JSON, JavaScript, or plain text on the JSP page: client memory. The client is far more likely to run out of usable system memory than you are to run into a language restriction. For thousands of small objects, this shouldn't be an issue.
Arrays have a limit of 2^32 - 1 items (about 4.29 billion), shown in the spec at 15.4.2.2, for example. This is caused by the length being a 32-bit counter. Assuming each element is a single 32-bit integer, that allows you to store roughly 16 GB of numeric data in a single array.
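You can see that 32-bit length limit directly in any modern engine; a quick sketch:

// The array length must fit in an unsigned 32-bit integer (max 2^32 - 1).
const max = 2 ** 32 - 1;

const ok = new Array(max);   // allowed: a sparse array of maximum length
console.log(ok.length);      // 4294967295

try {
  new Array(max + 1);        // one past the limit
} catch (e) {
  console.log(e instanceof RangeError); // true: "Invalid array length"
}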
The semantics on objects are more complex, but most functions to work with objects end up using arrays, so you're limited to 4.2 billion keys in most practical scenarios. Again, that's over 16GB of data, not counting the overhead to keep references.
The VM, and probably the garbage collector, will start to hang for long periods of time well before you get near the limits of the language. Some implementations will have smaller limits, especially older ones or interpreters. Since the JS spec does not specify minimum limits in most cases, those may be implementation-defined and could be much lower (this question on the maximum number of arguments discusses that).
Even a good optimizing VM that tries to track the structures you use will, at that size, incur enough overhead that it will probably fall back to using maps for your objects (it's theoretically possible to define a struct representing that much data, but not terribly practical). Maps have a small amount of overhead, and lookup times get longer as size increases, so you will see performance implications: just not at any reasonable object size.
If you run into another limit, I suspect it will be 65k elements (2^16), as discussed in this answer. Finding an implementation that supports fewer than 65k elements seems unlikely, as most browsers were written after 32-bit architectures became the norm.
There isn't a hard-coded limit as such. In practice it looks like there is a limit around 16 GB; you can read some tests below, or see ssube's answer.
But you'll probably encounter strange behaviour once your object/JSON is around 50 MB.
For JSON, here is an interesting article: http://josh.zeigler.us/technology/web-development/how-big-is-too-big-for-json/
For JS objects, there is more information here: javascript object max size limit (it says there isn't such a limit, but that strange behaviour is encountered at around 40 MB).
The limit depends on the available memory of the browser, so every PC, Mac, or mobile setup will give you a different limit.
I don't know how much memory one of your records needs, but I would guess that 1000 records should work on most machines.
But you should avoid storing massive amounts of data in plain variables; depending on how much memory the records take, it can slow down the behavior of the whole site. Users with average computers may see ugly scrolling, delayed hover effects, and so on.
I would recommend you use local storage instead. I'm sorry, I don't know the YUI library, but I am pretty sure you can point your DataTable's data source at that storage.
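A rough sketch of that idea, independent of YUI (the key name is made up, and the ~5 MB figure is only a common default, not a guarantee):

// Cache the fetched records in localStorage instead of holding everything
// in a plain variable for the lifetime of the page.
function cacheRecords(records) {
  try {
    localStorage.setItem('datatable-records', JSON.stringify(records));
  } catch (e) {
    // Storage is full (often around 5 MB per origin): fall back gracefully.
    console.warn('Could not cache records:', e.name);
  }
}

function loadRecords() {
  const raw = localStorage.getItem('datatable-records');
  return raw ? JSON.parse(raw) : null;
}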
There is a limit on JavaScript objects: max js object limit. What I would suggest using is session objects, because that's what it sounds like you're trying to do anyway.
I need to extract about 300,000 final lines, roughly 30 MB, from thousands of online JSON files.
Being a beginner in coding, I prefer to stick to JS: $getJSON the data, cut it down, append the interesting parts to my <body>, and loop over the thousands of online JSON files. But I wonder:
Can my web browser handle 300,000 $getJSON queries and the resulting 30-50 MB webpage without crashing?
Is it possible to use JS to write the results to a file, so the script's work is constantly saved?
I expect my script to run for about 24 hours. The numbers are estimates.
Edit: I don't have server side knowledge, just JS.
A few things aren't right about your approach for this:
If what you are doing is fetching (and processing) data from another source then displaying it for a visitor, processing of this scale should be done separately and beforehand in a background process. Web browsers should not be used as data processors on the scale you're talking about.
If you try to display a 30-50MB webpage, your user is going to experience lots of frustrating issues - browser crashes, lack of responsiveness, timeouts, long load times, and so on. If you expect any users on older IE browsers, they might as well give up without even trying.
My recommendation is to pull this task out and do it using your backend infrastructure, saving the results in a database which can then be searched, filtered, and accessed by your user. Some options worth looking into:
Cron
Cron will allow you to run a task on a repeated and regular basis, such as daily or hourly. Use this if you want to continually update your dataset.
Worker (Heroku)
If you're running on Heroku, take the job out of the web dyno and run it in a separate worker so it doesn't clog up any existing traffic on your app.
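Whichever scheduler you use, the job itself can still be written in JavaScript and run with Node on the server rather than in the browser. A rough sketch (the output file name and the extractInterestingParts helper are placeholders), which also covers the "save the results to a file" part of the question:

// Node 18+ sketch: fetch each JSON source in turn, keep the interesting
// parts, and append them to a results file so progress is saved as it goes.
const fs = require('fs/promises');

async function run(urls) {
  for (const url of urls) {
    try {
      const res = await fetch(url); // fetch is built into Node 18+
      const data = await res.json();
      const lines = extractInterestingParts(data); // hypothetical helper
      await fs.appendFile('results.ndjson',
        lines.map((l) => JSON.stringify(l)).join('\n') + '\n');
    } catch (err) {
      console.error('Failed for', url, err.message);
    }
  }
}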
Lately, I've been playing around with Ramsey's theorem for R(5,5). You can see some examples of previous attempts here: http://zacharymaril.com/thoughts/constructionGraph.html
Essence: find all the k4's in a graph/its complement and then connect another point in such a way that no k5's are formed (I know with one type of choice, mathematically it becomes improbable that you would get past 14. But there are ways around that choice and I've gotten it to run as far as 22-23 without bricking my browser.)
With new ideas, I started playing around with storing information from batch to batch. The current construction graph goes through and searches for all the K4s in a graph every time it sees the graph. I thought this was overkill, since the K4s in the previous graph stay the same and only new K4s could show up among the connections produced by the addition of the new point. If you store the previous K4s each time you find them and then only search within the newly created frontier, you reduce the number of comparisons you have to do from C(n, 4) to C(n-1, 3).
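(For a concrete sense of the savings: at n = 20, that is C(20, 4) = 4845 four-point subsets to check, versus C(19, 3) = 969 new ones involving the added point.)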
I took a shot at implementing this last night and got it to work without obvious errors. While I am going to go back and comb through it for any problems, the new method makes the program much, much slower. Before, the time it took to do all the comparisons was only roughly doubling; now it is growing at what looks like a factorial rate. I've gone back through and tried to ferret out any obvious errors, but I am wondering whether the new dependence on memory could have caused the whole slowdown.
So, with that long intro, my main question is how are the memory and speed of a program related in a web browser like chrome? Am I slowing down the program by keeping a bunch of little graphs around as JSON objects? Should it not matter in theory how much memory I take up in terms of speed? Where can I learn more about the connection between the two? Is there a book that could explain this sort of thing better?
Thank you for any advice or answers. Sorry about the length of this: I am still buried pretty deep in the idea and it's hard to explain it briefly.
Edit:
Here are the two webpages that show each algorithm,
With storage of previous finds:
http://zacharymaril.com/thoughts/constructionGraph.html
Without storage of previous find:
http://zacharymaril.com/thoughts/Expanding%20Frontier/expandingFrontier.html
They are both best viewed in Chrome; it is the browser I have been using to build this. If you open the dev panel with Ctrl+Shift+I and type "times", you can see a collection of all the times so far.
Memory and speed of a program are not closely interrelated.
Simple examples:
A computer with almost no RAM but lots of hard drive space is going to be thrashing the hard drive for virtual memory. This will slow things down, as hard drives are significantly slower than RAM.
A computer built out of all RAM is not going to do the same thing. It won't have to go to the hard drive, so it will stay quicker.
Caching usually takes up a lot of RAM. It also significantly increases the speed of an application. This is how memcache works.
An algorithm may take a long time but use very little RAM. Think of a program that attempts to calculate pi. It will never finish, but needs very little RAM.
In general, the less RAM you use (caching aside), the better for speed, because there's less chance you're going to run into constraints imposed by other processes.
If you have a program that takes considerable time to calculate items that are going to be referenced again, it makes sense to cache them in memory so you don't need to recalculate them.
You can mix the two by adding timeouts to the cached items: every time you add another item to the cache, you check the items already there and remove any that haven't been accessed in a while. "A while" is determined by your needs.
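A minimal sketch of that time-based eviction (the one-minute age is an arbitrary example):

// Tiny cache that evicts entries not accessed within maxAgeMs.
function createCache(maxAgeMs) {
  const entries = new Map(); // key -> { value, lastAccess }
  return {
    get(key) {
      const e = entries.get(key);
      if (!e) return undefined;
      e.lastAccess = Date.now();
      return e.value;
    },
    set(key, value) {
      const now = Date.now();
      // Evict anything that hasn't been touched in a while.
      for (const [k, e] of entries) {
        if (now - e.lastAccess > maxAgeMs) entries.delete(k);
      }
      entries.set(key, { value, lastAccess: now });
    },
  };
}

const k4Cache = createCache(60 * 1000); // "a while" = one minute here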
I came across a site that does something very similar to Google Suggest. When you type in 2 characters in the search box (e.g. "ca" if you are searching for "canon" products), it makes 4 Ajax requests. Each request seems to get done in less than 125ms. I've casually observed Google Suggest taking 500ms or longer.
In either case, both sites are fast. What are the general concepts/strategies that should be followed in order to get super-fast requests/responses? Thanks.
EDIT 1: by the way, I plan to implement an autocomplete feature for an e-commerce site search where it 1.) provides search suggestion based on what is being typed and 2.) a list of potential products matches based on what has been typed so far. I'm trying for something similar to SLI Systems search (see http://www.bedbathstore.com/ for example).
This is a bit of a "how long is a piece of string" question and so I'm making this a community wiki answer — everyone feel free to jump in on it.
I'd say it's a matter of ensuring that:
The server / server farm / cloud you're querying is sized correctly according to the load you're throwing at it and/or can resize itself according to that load
The server / server farm / cloud is attached to a good, quick network backbone
The data structures you're querying server-side (database tables or what-have-you) are tuned to respond to those precise requests as quickly as possible
You're not making unnecessary requests (HTTP requests can be expensive to set up; you want to avoid firing off four of them when one will do); you probably also want to throw in a bit of hysteresis management (delaying the request while people are typing, only sending it a couple of seconds after they stop, and resetting that timeout if they start again; see the debounce sketch after this list)
You're sending as little information across the wire as can reasonably be used to do the job
Your servers are configured to re-use connections (HTTP 1.1) rather than re-establishing them (this will be the default in most cases)
You're using the right kind of server; if a server has a large number of keep-alive requests, it needs to be designed to handle that gracefully (NodeJS is designed for this, as an example; Apache isn't, particularly, although it is of course an extremely capable server)
You can cache results for common queries so as to avoid going to the underlying data store unnecessarily
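For the hysteresis point above, a minimal debounce sketch (the 300 ms delay and the /suggest endpoint are arbitrary placeholders, and searchBox is assumed to be the search input element):

// Only fire the suggestion request once the user has paused typing.
function debounce(fn, delayMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), delayMs);
  };
}

const requestSuggestions = debounce((query) => {
  fetch('/suggest?q=' + encodeURIComponent(query)); // hypothetical endpoint
}, 300);

searchBox.addEventListener('input', (e) => requestSuggestions(e.target.value));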
You will need a web server that is able to respond quickly, but that is usually not the problem. You will also need a database server that is fast and can answer very quickly which popular search results start with 'ca'. Google doesn't use a conventional database for this at all; it uses large clusters of servers and a Cassandra-like database, and most of that data is kept in memory as well for quicker access.
I'm not sure if you will need this, because you can probably get pretty good results using only a single server running PHP and MySQL, but you'll have to make some good choices about the way you store and retrieve the information. You won't get these fast results if you run a query like this:
select
q.search
from
previousqueries q
where
q.search LIKE 'ca%'
group by
q.search
order by
count(*) DESC
limit 1
This will probably work as long as fewer than 20 people have used your search, but it will likely fail on you before you reach 100,000.
This link explains how they made instant previews fast. The whole site highscalability.com is very informative.
Furthermore, you should store everything in memory and avoid retrieving data from the disk (slow!). Redis, for example, is lightning fast!
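As a small illustration of the "keep it in memory" idea, even without Redis you can precompute a prefix index once and answer each lookup with a single map access (the terms and counts below are made up):

// Map each 2-character prefix to its searches, sorted by popularity.
function buildPrefixIndex(searchCounts /* { term: count } */) {
  const index = new Map();
  for (const [term, count] of Object.entries(searchCounts)) {
    const prefix = term.slice(0, 2);
    if (!index.has(prefix)) index.set(prefix, []);
    index.get(prefix).push({ term, count });
  }
  for (const list of index.values()) {
    list.sort((a, b) => b.count - a.count); // most popular first
  }
  return index;
}

const index = buildPrefixIndex({ canon: 120, camera: 95, cable: 40 });
console.log(index.get('ca').map((s) => s.term)); // ["canon", "camera", "cable"]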
You could start by building a fast search engine for your products. Check out Lucene for full-text searching. It is available for PHP, Java and .NET, among others.