I have some very large objects that are used intensively, but occasionally in my Node.JS program. Loading these objects are expensive. In total they occupy more memory than I have in the system.
Is there any way to create a "weak reference" in JavaScript, so that the garbage collector will remove my objects when memory is low, and then I can check whether the object is and reload it again if it was garbage collected since my last access?
The particular use case I had in mind was cartographic reprojection and tiling of gigabytes of map imagery.
Is there any way to create a "weak reference" in Javascript, so that the garbage collector will remove my objects when memory is low?
No, Javascript does not have that.
I don't think a weakMap or weakSet will offer you anything useful here. They don't do what you're asking for. Instead, they allow you to have a reference to something that will NOT prohibit garbage collection. But, if there are no other references to the data, then it will be garbage collected immediately. So, they won't keep the data around for awhile like you want. If you have any other reference to those objects (to keep them around), then they will never get garbage collected. Javascript doesn't offer a weak reference that is only garbage collected when memory starts to get full. Something is either eligible for garbage collection or it isn't. If it's eligible, it will be freed in the next GC pass.
It sounds like what you probably want is a memory cache. You could decide how large you want the cache to be and then keep items in the cache based on some strategy. The most common strategy is LRU (least recently used) where you kick an item out of the cache when you reach the cache size limit and you need to load a new item in the cache. With LRU, you keep track of when an item was last used from the cache and you kick out the oldest one. If you are trying to manage the cache to a memory usage size, you will have to have some scheme for estimating the memory usage of your objects in the cache.
Note that many databases will essentially offer you this functionality as a built-in feature since they will usually contain some sort of caching themselves so if you request an item from the database that you have recently requested, it probably comes from a read cache. You don't really say what your objects are so it's hard for us to suggest exactly how they could be used from a database.
In Javascript I'm iterating through an array of UNIDs and getting a NotesDocument by UNID, then I do a doc.remove(true);
having done that is it necessary to do a doc.recycle()?
Short answer is yes.
For newbies, Notes objects in Java consist of a Java object and a reference to a C++ object. So when you a Java object becomes null (or useless), garbage collector will clear the memory space after a certain amount of time. However, the C++ handle will persist. So we are recycling notes objects to destroy C++ object references. This page has a good explanation abouyt recycling.
On the other hand, doc.remove() can be thought as a state change. Moreover, if soft deletion is enabled in your database, it won't even remove the document, will just mark as deleted (you have to call .removePermanently() to hard-delete it). The C++ object reference will stay in the memory.
Therefore, remove method will not trigger a recycle for the object. Recycle is only triggerred by the object itself or its parent.
I believe you should still recycle it. It's an object at that point not a document.
When I remove items from an array in javascript using the splice method, an array of the removed items are returned.
var a = [{name:'object1'},{name:'object2'},{name:'object3'}];
// a.splice(0,2) -> [{name:'object1'},{name:'object2'}]
// Where do these guys live now? Are they really gone?
Do I then need to call 'delete' on those returned objects to make sure they are taken out of memory? Does the garbage collector just handle this? Can I trust that?
The objects are 'gone' (from your perspective) and the GC will actually free the memory when it deems appropriate. JavaScript does not give you explicit control over garbage collection.
If you're concerned about performance, it's generally better (after profiling, of course) to focus on saving allocations rather than worrying about when exactly things will get GC'd, since that behavior will change depending on which JS engine you're on.
I have a very large collection of objects in node.js (millions) that I need to keep in memory for caching purposes (they are maintained in several global hash objects). Each hash collections stores about 750k keys.
In order to keep GC to a minimum, I want to find out the best way to store these items. Would it be better to split these items into 100s of thousands of hashes? Should I maybe not use hashes at all? Is there any way to keep them off the heap completely so they never are examined by GC (and if so how would I do so)?
There is no public API to control garbage collection from JavaScript.
But GC has gone a long way over the years. Modern GC implementations will notice that some objects live long and put them into a special "area" which will be collected only rarely.
How this exactly works is completely implementation dependent; every browser does it's own thing and usually, this also often changes when a new browser version is released.
EDIT The layout on memory and the organization is completely irrelevant. Modern GCs are really hard to understand in detail without investing a couple of weeks reading the actual code. So what I explain now is a really simplified picture; the real code will work different (and some GCs will use completely different tricks to achieve the same goal).
Imagine that the GC has a counter for each object in which it counts how often it has seen it in the past. Plus it has several lists where it keeps objects of different age, that is with objects whose counters have passed certain thresholds. So when the counter reaches some limit, the object is moved to the next list.
The first list is visited every time GC runs. The second list is only considered for every Nth GC run.
An alternate implementation might add new objects to the top of the "GC list" and for every GC run, it will only check N elements. So long living objects will move down the list and after a while, they don't be checked every time.
What this means for you is that you don't have to do anything; the GC will figure out that your huge map lives a long time (and the same is true for all objects in the map) and after a while, it will start to ignore this data structure.
I want to trigger JavaScript garbage collection. Is it possible? Why would I want to, or not want to, do this?
I went out on a small journey to seek an answer to one of your questions: Is it possible?
People all over town are saying that deleting the references will do the trick. Some people say that wiping the object is an extra guarantee (example). So I wrote a script that will try every trick in the book, and I was astonished to see that in Chrome (22.0.1229.79) and IE (9.0.8112.16421), garbage collection doesn't even seem to work. Firefox (15.0.1) managed without any major drawbacks apart from one (see case 4f down below).
In pseudo-code, the test goes something like this.
Create a container, an array, that will hold objects of some sort. We'll call this container Bertil here on.
Each and every object therein, as an element in Bertil, shall have his own array-container declared as a property. This array will hold a whole lot of bytes. We'll call any one of Bertil's elements, the object, Joshua. Each Joshua's byte array will be called Smith.
Here's a mind map for you to lean back on:
Bertil [Array of objects] -> Joshua [Object] -> Smith [Array of bytes] -> Unnamed [Bytes].
When we've made a mess out of our available memory, hang around for a sec or two and then execute any one of the following "destruction algorithms":
4a. Throw a delete operand on the main object container, Bertil.
4b. Throw a delete operand on each and every object in that container, kill every Joshua alive.
4c. Throw a delete operand on each and every array of bytes, the Smiths.
4d. Assign NULL to every Joshua.
4e. Assign UNDEFINED to every Joshua.
4f. Manually delete each and every byte that any Joshua holds.
4g. Do all of the above in a working order.
So what happened? In case 4a and 4b, no browser's garbage collector (GC) kicked in. In case 4c to 4e, Firefox did kick in and displayed some proof of concept. Memory was reclaimed shortly within the minute. With current hardcoded default values on some of the variables used as test configuration, case 4f and 4e caused Chrome to hang, so I can't draw any conclusions there. You are free to do your own testing with your own variables, links will be posted soon. IE survived case 4f and 4e but his GC was dead as usual. Unexpectedly, Firefox survived but didn't pass 4f. Firefox survived and passed 4g.
In all of the cases when a browser's GC failed to kick in, waiting around for at least 10 minutes didn't solve the problem. And reloading the entire page caused the memory footprint to double.
My conclusion is that I must have made a horrible error in the code or the answer to your question is: No we can't trigger the GC. Whenever we try to do so we will be punished severely and we should stick our heads in the sand. Please I encourage you to go ahead, try these test cases on your own. Have a look in the code were comment on the details. Also, download the page and rewrite the script and see if you can trigger the GC in a more proper way. I sure failed and I can't for the life of me believe that Chrome and IE doesn't have a working garbage collector.
http://martinandersson.com/dev/gc_test/?case=1
http://martinandersson.com/dev/gc_test/?case=2
http://martinandersson.com/dev/gc_test/?case=3
http://martinandersson.com/dev/gc_test/?case=4
http://martinandersson.com/dev/gc_test/?case=5
http://martinandersson.com/dev/gc_test/?case=6
http://martinandersson.com/dev/gc_test/?case=7
You can trigger manually JavaScript garbage collector in IE and Opera, but it's not recommended, so better don't use it at all. I give commands more just for information purpose.
Internet Explorer:
window.CollectGarbage()
Opera 7+:
window.opera.collect()
Garbage collection runs automatically. How and when it runs and actually frees up unreferenced objects is entirely implementation specific.
If you want something to get freed, you just need to clear any references to it from your javascript. The garbage collector will then free it.
If you explain why you even think you need to do this or want to do this and show us the relevant code, we might be able to help explain what your alternatives are.
Check your code for global variables. There may be data coming through an ajax call that is stored, and then referenced somewhere and you did not take this into account.
As a solution, you should wrap huge data processing into an anonymous function call and use inside this call only local variables to prevent referencing the data in a global scope.
Or you can assign to null all used global variables.
Also check out this question. Take a look at the third example in the answer. Your huge data object may still be referenced by async call closure.
This answer suggests the following garbage collection request code for Gecko based browsers:
window.QueryInterface(Components.interfaces.nsIInterfaceRequestor)
.getInterface(Components.interfaces.nsIDOMWindowUtils)
.garbageCollect();
Came across this question and decided to share with my recent findings.
I've looked to see a proper handling of a WeakMap in Chrome and it's actually looks okay:
1) var wm = new WeakMap()
2) var d = document.createElement('div')
3) wm.set(d, {})
at this stage weak map holds the entry cause d is still referencing the element
4) d = null
at this stage nothing references the element and it's weakly referenced object, and indeed after a couple of minutes entry disappeared and garbage collected.
when did the same but appended the element to the DOM, it was not reclaimed, which is correct, removed from the DOM and still waiting for it to be collected :)
Yes, you can trigger garbage collection by re-loading the page.
You might want to consider using a Factory Pattern to help re-use objects, which will greatly cut down on how many objects are created. Especially, if you are continuously creating objects that are the same.
If you need to read up on Factory Patterns then get yourself this book, "Pro Javascript Design Patterns" by Ross Harmes and Dustin Diaz and published by APress.
I was reading Trevor Prime's answer and it gave me a chuckle but then I realized he was on to something.
Reloading the page does 'garbage-collect'.
location.reload() or alternatives to refresh page.
JSON.parse/stringify and localStorage.getItem/setItem for persistence of needed data.
iframes as reloading pages for user experience.
All you need to do is run your code in an iframe and refresh the iframe page while saving useful information into localStorage. You'll have to make sure the iframe is on the same domain as main page to access its DOM.
You could do it without an iframe but user experience will no doubt suffer as the page will be visibly resetting.
If it is true that there is no way to trigger a GC, as implied by the other answers, then the solution would be for the appropriate browser standards group to add a new JavaScript function to window or document to do this. It is useful to allow the web page to trigger a GC at a time of its own choosing, so that animations and other high-priority operations (sound output?) will not be interrupted by a GC.
This might be another case of "we've always done it this way; don't rock the boat" syndrome.
ADDED:
MDN documents a function "Components.utils.schedulePreciseGC" that lets a page schedule a GC sometime in the future with a callback function that is called when the GC is complete. This may not exist in browsers other than Firefox; the documentation is unclear. This function might be usable prior to animations; it needs to be tested.