Garbage collector issues on spidermonkey.... JS_AnchorPtr()?


I've rolled my own server-side JavaScript language called bondi, and I recently upgraded to the new SpiderMonkey.
Now that JS_EnterLocalRootScope and JS_LeaveLocalRootScope are gone/useless in the 1.8.5 API, is it enough to call JS_AnchorPtr(varname) at the end of your functions, so that the compiler doesn't remove the references and the garbage collector stays happy?
I've been testing it by removing all my calls to JS_EnterLocalRootScope / JS_LeaveLocalRootScope and adding JS_AnchorPtr() to the bottom of the script.
I looked up the JS_AnchorPtr function in the SpiderMonkey source code. Guess what... it does nothing. There's no documentation for it either. I'm using it only to get a late mention of those variables, so the garbage collector doesn't kill them.

Well, blame seems to say that bug 519949 is recommending you use js::Anchor so that the conservative stack scanner will pick it up.
Note that the conservative scanner can find any GC thing that's on the stack or in registers, so the only really tricky case is where you use derived values when the "owning" GC thing may be dead, like so:
{
JSString *str = GetMeSomeStringYo();
const jschar *chars = str->chars();
// Note, |str| is not "live" here, but the derived |chars| is!
// The conservative stack scanner won't see |chars| and know
// to keep |str| alive, so we should be anchoring |str|.
DoSomethingThatCanCauseGC();
return chars[0];
}
If you're using C the JS_AnchorPtr at the end of the functions should be enough. You are correct that the function has a nop implementation! The idea is that, so long as it's performing a call to a shared object symbol with the variable to keep alive as a parameter, the calling function will have to keep that value around in machine state in order to perform the do-nothing call. This is more sucky for perf than js::Anchor.
There's one potential trap in the unlikely case that you're statically linking against SpiderMonkey and have Link Time Optimization enabled: the cross-object call may be inlined with a null implementation, eliminating liveness of the variable, in which case the same GC hazards may pop back up.

Related

Do languages like JS with a copying GC ever store anything on the cpu registers?

I am learning about GCs and I know there's a thing called HandleScope which 'protects' your local variables from the GC and updates them if a GC heap copy happens. For example, if I have a routine which adds together 2 values and I call it, it may invoke the garbage collector, which will copy the Object that my value is pointing to (or the GC will not even know that the Object the value is pointing to is referenced). A really minimal example:
#include <vector>
Value addValues(Value a, Value b);
std::vector<Value*> gc_vals_with_protection;
Value func(Value a, Value b)
{
gc_vals_with_protection.push_back(&a); // set protection for a
gc_vals_with_protection.push_back(&b); // set protection for b
Value res = addValues(a, b); // do calculations
gc_vals_with_protection.pop_back(); // remove protection for b
gc_vals_with_protection.pop_back(); // remove protection for a
return res;
}
But this has got me thinking: it means that a and b will NEVER be in physical CPU registers, because you have taken their addresses (and CPU registers don't have addresses), which will make calculations on them inefficient. Also, at the beginning of every function, you would have to push back twice to the vector (https://godbolt.org/z/dc6vY1Yc5 for assembly).
I think I may be missing something, as this must be not optimal. Is there any other trick I am missing?

(V8 developer here.)
Do languages like JS with a copying GC ever store anything on the cpu registers?
Yes, of course. Pretty much anything at all that a CPU does involves its registers.
That said, JavaScript objects are generally allocated on the heap anyway, for at least the following reasons:
(1) They are bigger than a register. So registers typically hold pointers to objects on the heap. It's these pointers, not the objects themselves, that Handles are needed for (both to update them, and to inform the GC that there are references to the object in question, so the object must not be freed).
(2) They tend to be much longer-lived than the typical amount of time you can hold something in a register, which is only a couple of machine instructions: since the set of registers is so small, they are reused for something else all the time (regardless of JavaScript or GC etc), so whatever they held before will either be spilled (usually though not necessarily to the stack), or re-read from wherever it originally came from next time it's needed.
(3) They have "pointer identity": JavaScript code like obj1 === obj2 (for objects, not primitives) only works correctly when there is exactly one location where an object is stored. Trying to store objects in registers would imply copying them around, which would break this.
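To illustrate point (3), here is a minimal sketch (the variable names are made up for the example): two variables holding the same reference compare equal with ===, while a structurally identical copy does not.

```javascript
// Two names for the same heap object: one object, two references.
const original = { x: 1, y: 2 };
const alias = original;

// A distinct object with the same contents lives at a different location.
const copy = { x: 1, y: 2 };

const sameIdentity = original === alias; // true: same object
const copyIdentity = original === copy;  // false: equal contents, different object

// A mutation made through one reference is visible through the other.
alias.x = 42;
const seenThroughOriginal = original.x;  // 42
```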
There is certainly some cost to creating Handles; it's faster than adding something to a std::vector though.
Also, when passing Handles from one function to another, the called function doesn't have to re-register anything: Handles can be passed around without having to create new entries in the HandleScope's backing store.
A very important observation is that JavaScript functions don't need Handles for their locals. When executing JavaScript, V8 carefully keeps track of the contents of the stack (i.e. spilled contents of registers), and can walk and update the stack directly. HandleScopes are only needed for C++ code dealing with JS objects, because this technique isn't possible for C++ stack frames (which are controlled by the C++ compiler). Such C++ code is typically not the most critical performance bottleneck of an app; so while its performance certainly matters, some amount of overhead is acceptable.
(Side note: one can "blindly" (i.e. without knowledge about their contents) scan C++ stack frames and do so-called "conservative" (instead of "precise") garbage collection; this comes with its own pros and cons, in particular it makes a moving GC impossible, so is not directly relevant to your question.)
Taking this one step further: sufficiently "hot" functions will get compiled to optimized machine code; this code is the result of careful analysis and hence can be quite aggressive about keeping values (primarily numbers) in registers as long as possible, for example for chains of calculations before the final result is eventually stored in some property of some object.
For completeness, I'll also mention that sometimes, entire objects can be held in registers: this is when the optimizing compiler successfully performs "escape analysis" and can prove that the object never "escapes" to the outside world. A simple example would be:
function silly(a, b) {
let vector = {x: a, y: b};
return vector.x + vector.y;
}
When this function gets optimized, the compiler can prove that vector never escapes, so it can skip the allocation and keep a and b in registers (or at least as "standalone values", they might still get spilled to the stack if the function is bigger and needs those registers for something else).
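For contrast, a hypothetical variant in which the object does escape. The function name and the sink parameter are invented for this sketch, and whether a given engine actually skips the allocation in the non-escaping case is an implementation detail that cannot be observed from JavaScript.

```javascript
// Here the temporary object is stored in an array owned by the caller,
// so it outlives the call and has to exist as a real heap object.
function sillyEscaping(a, b, sink) {
  const vector = { x: a, y: b };
  sink.push(vector); // the object escapes through `sink`
  return vector.x + vector.y;
}

const kept = [];
const sum = sillyEscaping(2, 3, kept);
```

After the call, kept[0] is still reachable from the caller, so the compiler cannot replace it with two standalone values.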

Life of JavaScript objects & Memory Leaks

I have researched this quite a bit, but mostly by piecing other questions together, which still leaves some doubt. In an app that never refreshes the browser page and may live for quite a while (hours) without closing (assuming that refreshing a page or navigating to another would restart the JS code), what's the best way to ensure objects are released and that there's no memory leak?
These are the specific scenarios I'm concerned about:
All of the code below is within a revealing module pattern.
mycode = function(){}()
variables within functions, I'm sure this one is collected by the GC just fine
function(){ var h = "ss";}
variables within the module, should g = null when it's no longer needed?
var g;
function(){ g = "dd";}
And lastly the life of a jqXHR: is it cleaned up after it returns? Should it be set to null in all cases as a precaution whether kept inside a function or module?
If doing this, is x cleaned up by the GC after it returns?:
function(){
var x = $.get();
x.done = ...;
x.fail = ...;
}
How about when doing this, will it also be cleaned up after x returns?:
var x;
function(){
x = $.get();
x.done = ...;
x.fail = ...;
}
Lastly, is there a way to cleanup all variables and restart a module without restarting the browser?

variables within functions, I'm sure this one is collected by the GC just fine
Yes.
variables within the module, should g = null when it's no longer needed?
Sure.
And lastly the life of a jqXHR: is it cleaned up after it returns? Should it be set to null in all cases as a precaution whether kept inside a function or module?
Various browsers have had bugs related to XHR that caused the onreadystatechange and anything it closed over to remain uncollectable unless the dev was careful to replace it with a dummy value (xhr.onreadystatechange = new Function('')) but I believe jQuery handles this for you.
Lastly, is there a way to cleanup all variables and restart a module without restarting the browser?
Global state associated with the page will take up browser memory until the page is evicted from the browser history stack. location.replace can help you here by letting you kill the current page and replace it with a new version of the same app without expanding the history stack.
Replace the current document with the one at the provided URL. The difference from the assign() method is that after using replace() the current page will not be saved in session history, meaning the user won't be able to use the Back button to navigate to it.
When you use the word "module", that is not a term that has a well-defined meaning to the browser or its JavaScript interpreter so there is no way to evict a module and only a module from memory. There are several things that you have to worry about that might keep things in memory:
References to JavaScript objects that have been attached to DOM nodes and everything they close over -- event handlers are a very common example.
Live setInterval and setTimeout callbacks and everything they close over.
Properties of the global object and everything they close over.
As you noted, properties of certain host objects like XHR instances, web worker callbacks, etc. and (you guessed it) everything they close over.
Any scheme that is going to unload a module and only a module would need to deal with all of these and figure out which of them are part of the module and which are not. That's a lot of different kinds of cleanup.
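One way to make that cleanup tractable is to have the module record everything it registers and expose a single teardown function. The sketch below is only an illustration under that assumption: createModule, getTicks and dispose are hypothetical names, and it covers just timers and event handlers, not globals or host objects.

```javascript
// Hypothetical module that remembers how to undo everything it registered.
function createModule(target) {
  const cleanups = [];
  let ticks = 0;

  // Timer: its callback (and everything it closes over) stays alive
  // until the interval is cleared.
  const timerId = setInterval(() => { ticks += 1; }, 1000);
  cleanups.push(() => clearInterval(timerId));

  // Event handler: must be removed with the same function reference.
  const onPing = () => { ticks += 1; };
  target.addEventListener('ping', onPing);
  cleanups.push(() => target.removeEventListener('ping', onPing));

  return {
    getTicks: () => ticks,
    dispose() {
      // Run the recorded cleanups in reverse order of registration.
      while (cleanups.length > 0) cleanups.pop()();
    },
  };
}

const target = new EventTarget();
const mod = createModule(target);
target.dispatchEvent(new Event('ping')); // handler runs, ticks becomes 1
mod.dispose();
target.dispatchEvent(new Event('ping')); // handler was removed, ticks stays 1
```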

Javascript is a garbage-collected language. It relies on the garbage collector to clean up unused memory. So essentially, you have to trust that the GC will do its job.
The GC will (eventually, not necessarily immediately) collect objects that are unreachable to you. If you have a reference to an object, then it is potentially still in use, and so the GC won't touch it.
If you have no reference to the object, directly or indirectly, then the GC knows that the object cannot possibly be used, and the object can be collected. So all you have to do, really, is make sure you reset any references to the object.
However, the GC makes no guarantees about when the object will be collected. And you shouldn't need to worry about that.

Really the only leaks you should worry about are closures.
function foo(a){
var b = 10 + a;
return function(c){
return b + c;
}
}
var bar = foo(20);
var baz = bar(5);
The GC cannot collect b while bar is still reachable, because the returned closure keeps referencing it even though b is out of scope. This is a big problem with IE, not so much with Mozilla and much less with Chrome.
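A small sketch of when the captured variable is reachable and when it stops being so. The names makeAdder and add30 are invented here, and the actual collection time is up to the engine.

```javascript
function makeAdder(a) {
  const b = 10 + a; // captured by the closure returned below
  return function (c) {
    return b + c;
  };
}

let add30 = makeAdder(20);  // `b` (30) stays alive because add30 refers to it
const result = add30(5);    // 35

// Dropping the last reference to the closure makes both the closure
// and its captured `b` eligible for collection.
add30 = null;
```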

As a rule of thumb with any garbage-collected language (this applies to Java, .NET and JavaScript for example), what you want to do is make sure that there is no lingering reference to a block of memory that you want to have the GC clean up for you. When the GC looks at a block of memory and finds that there is still something in the program referencing it, then it will avoid releasing it.
With regard to the jqXHR, there's no point in you setting them to null at the end of an AJAX function call. All of the parameters of a AJAX success/error/complete will be released once the function returns by the GC unless jQuery is doing something bizarre like keeping a reference to them.

Every variable you can no longer access can be collected by the GC. If you declare a variable inside a function, the variable can be removed once the function returns. When that actually happens is up to the engine: when the computer runs low on memory, or at any other time.
This becomes more complicated when you execute asynchronous functions like XHR. The done and fail closures can access all variables declared in the outer functions. Thus, as long as done and fail can be executed, all the variables must remain in memory. Once the request is finished, the variables can be released.
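The following sketch imitates that lifetime with a hand-written stand-in for an asynchronous request. fakeRequest, startWork and pending are hypothetical and are not jQuery's API; the point is only that the callback keeps bigBuffer reachable until the request object drops the callback.

```javascript
// Hypothetical stand-in for an async API: stores callbacks until completion.
function fakeRequest() {
  const handlers = [];
  return {
    done(fn) { handlers.push(fn); return this; },
    // Completing empties the handler list and then runs the callbacks,
    // so whatever they closed over is no longer referenced from here.
    complete(value) {
      handlers.splice(0).forEach((fn) => fn(value));
    },
    pending: () => handlers.length,
  };
}

function startWork(log) {
  const bigBuffer = new Array(100000).fill(0); // captured by the callback
  const req = fakeRequest();
  req.done((value) => log.push(value + bigBuffer.length));
  return req;
}

const log = [];
const req = startWork(log); // bigBuffer is reachable through req's handler
req.complete(1);            // handler ran and was dropped
```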
Anyway, you should simply make sure that every variable is declared as deep as possible.

In the first example with 'g', g should be set to null. It will retain the pointer to "dd" otherwise.
In the second example, the 'x' in the first case does not need to be set to null, since that variable will "go away" when the surrounding function exits. In the second case, with 'x' outside of the function, 'x' will retain whatever is assigned to it, and that will not be GC'd until 'x' is set to null or something else.

Can I trigger JavaScript's garbage collection?

I want to trigger JavaScript garbage collection. Is it possible? Why would I want to, or not want to, do this?

I went out on a small journey to seek an answer to one of your questions: Is it possible?
People all over town are saying that deleting the references will do the trick. Some people say that wiping the object is an extra guarantee (example). So I wrote a script that will try every trick in the book, and I was astonished to see that in Chrome (22.0.1229.79) and IE (9.0.8112.16421), garbage collection doesn't even seem to work. Firefox (15.0.1) managed without any major drawbacks apart from one (see case 4f down below).
In pseudo-code, the test goes something like this.
Create a container, an array, that will hold objects of some sort. We'll call this container Bertil here on.
Each and every object therein, as an element in Bertil, shall have his own array-container declared as a property. This array will hold a whole lot of bytes. We'll call any one of Bertil's elements, the object, Joshua. Each Joshua's byte array will be called Smith.
Here's a mind map for you to lean back on:
Bertil [Array of objects] -> Joshua [Object] -> Smith [Array of bytes] -> Unnamed [Bytes].
When we've made a mess out of our available memory, hang around for a sec or two and then execute any one of the following "destruction algorithms":
4a. Throw a delete operand on the main object container, Bertil.
4b. Throw a delete operand on each and every object in that container, kill every Joshua alive.
4c. Throw a delete operand on each and every array of bytes, the Smiths.
4d. Assign NULL to every Joshua.
4e. Assign UNDEFINED to every Joshua.
4f. Manually delete each and every byte that any Joshua holds.
4g. Do all of the above in a working order.
So what happened? In case 4a and 4b, no browser's garbage collector (GC) kicked in. In case 4c to 4e, Firefox did kick in and displayed some proof of concept. Memory was reclaimed shortly within the minute. With current hardcoded default values on some of the variables used as test configuration, case 4f and 4e caused Chrome to hang, so I can't draw any conclusions there. You are free to do your own testing with your own variables, links will be posted soon. IE survived case 4f and 4e but his GC was dead as usual. Unexpectedly, Firefox survived but didn't pass 4f. Firefox survived and passed 4g.
In all of the cases when a browser's GC failed to kick in, waiting around for at least 10 minutes didn't solve the problem. And reloading the entire page caused the memory footprint to double.
My conclusion is that either I have made a horrible error in the code, or the answer to your question is: No, we can't trigger the GC. Whenever we try to do so we will be punished severely and we should stick our heads in the sand. I encourage you to go ahead and try these test cases on your own. Have a look at the code, where I comment on the details. Also, download the page, rewrite the script and see if you can trigger the GC in a more proper way. I sure failed, and I can't for the life of me believe that Chrome and IE don't have a working garbage collector.
http://martinandersson.com/dev/gc_test/?case=1
http://martinandersson.com/dev/gc_test/?case=2
http://martinandersson.com/dev/gc_test/?case=3
http://martinandersson.com/dev/gc_test/?case=4
http://martinandersson.com/dev/gc_test/?case=5
http://martinandersson.com/dev/gc_test/?case=6
http://martinandersson.com/dev/gc_test/?case=7

You can manually trigger the JavaScript garbage collector in IE and Opera, but it's not recommended, so it's better not to use it at all. I give the commands purely for information.
Internet Explorer:
window.CollectGarbage()
Opera 7+:
window.opera.collect()
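If you do experiment with these hooks, feature detection avoids errors where they are missing. This is a sketch only: tryCollectGarbage is a made-up helper, and the gc branch is an addition that assumes a V8-based runtime started with the --expose-gc flag, which is not something a normal web page can rely on.

```javascript
// Calls the first available non-standard GC hook and returns its name,
// or null when the environment offers none.
function tryCollectGarbage(env = globalThis) {
  if (typeof env.CollectGarbage === 'function') { // old Internet Explorer
    env.CollectGarbage();
    return 'CollectGarbage';
  }
  if (env.opera && typeof env.opera.collect === 'function') { // old Opera
    env.opera.collect();
    return 'opera.collect';
  }
  if (typeof env.gc === 'function') { // assumption: V8 with --expose-gc
    env.gc();
    return 'gc';
  }
  return null;
}
```

Calling tryCollectGarbage() with no argument probes the real global object; passing an object is only meant for testing the helper.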

Garbage collection runs automatically. How and when it runs and actually frees up unreferenced objects is entirely implementation specific.
If you want something to get freed, you just need to clear any references to it from your javascript. The garbage collector will then free it.
If you explain why you even think you need to do this or want to do this and show us the relevant code, we might be able to help explain what your alternatives are.

Check your code for global variables. There may be data coming through an ajax call that is stored, and then referenced somewhere and you did not take this into account.
As a solution, you should wrap huge data processing into an anonymous function call and use inside this call only local variables to prevent referencing the data in a global scope.
Or you can assign to null all used global variables.
Also check out this question. Take a look at the third example in the answer. Your huge data object may still be referenced by async call closure.

This answer suggests the following garbage collection request code for Gecko based browsers:
window.QueryInterface(Components.interfaces.nsIInterfaceRequestor)
.getInterface(Components.interfaces.nsIDOMWindowUtils)
.garbageCollect();

I came across this question and decided to share my recent findings.
I checked whether Chrome handles a WeakMap properly, and it actually looks okay:
1) var wm = new WeakMap()
2) var d = document.createElement('div')
3) wm.set(d, {})
At this stage the weak map holds the entry, because d is still referencing the element.
4) d = null
At this stage nothing references the element and its weakly referenced object, and indeed after a couple of minutes the entry disappeared and was garbage collected.
When I did the same but appended the element to the DOM, it was not reclaimed, which is correct. I then removed it from the DOM and am still waiting for it to be collected :)
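The same steps as a runnable sketch, using a plain object in place of the div so that it also works outside a browser. Only the part before the reference is dropped can be checked from code, because a WeakMap gives no way to observe when an entry is actually collected.

```javascript
const wm = new WeakMap();
let key = { name: 'stand-in for the div' }; // plain object instead of a DOM node
wm.set(key, { meta: 'attached data' });

// While `key` is strongly referenced, the entry is guaranteed to be present.
const presentWhileReferenced = wm.has(key); // true

// After this, nothing references the key object; the entry is eligible for
// collection, but when that happens is decided by the engine.
key = null;
```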

Yes, you can trigger garbage collection by re-loading the page.
You might want to consider using a Factory Pattern to help re-use objects, which will greatly cut down on how many objects are created. Especially, if you are continuously creating objects that are the same.
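As a rough illustration of that idea, here is a factory that recycles released objects instead of allocating new ones. The names createParticleFactory, acquire, release and createdCount are invented for this sketch and do not come from the book mentioned below.

```javascript
// Hypothetical factory that hands out recycled objects when it has any.
function createParticleFactory() {
  const free = [];
  let created = 0;
  return {
    acquire(x, y) {
      let p = free.pop();
      if (p === undefined) { // nothing to recycle, allocate a new object
        p = { x: 0, y: 0 };
        created += 1;
      }
      p.x = x;
      p.y = y;
      return p;
    },
    release(p) { free.push(p); }, // caller promises not to use `p` afterwards
    createdCount: () => created,
  };
}

const factory = createParticleFactory();
const first = factory.acquire(1, 2);
factory.release(first);
const second = factory.acquire(3, 4); // reuses the released object
```

The trade-off is that callers must stop using an object once they release it, since the next acquire may hand it to someone else.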
If you need to read up on Factory Patterns then get yourself this book, "Pro Javascript Design Patterns" by Ross Harmes and Dustin Diaz and published by APress.

I was reading Trevor Prime's answer and it gave me a chuckle but then I realized he was on to something.
Reloading the page does 'garbage-collect'.
location.reload() or alternatives to refresh page.
JSON.parse/stringify and localStorage.getItem/setItem for persistence of needed data.
iframes as reloading pages for user experience.
All you need to do is run your code in an iframe and refresh the iframe page while saving useful information into localStorage. You'll have to make sure the iframe is on the same domain as main page to access its DOM.
You could do it without an iframe but user experience will no doubt suffer as the page will be visibly resetting.
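A minimal sketch of the save/restore half of that scheme. saveState, loadState and createMemoryStorage are hypothetical helpers; in a browser you would pass window.localStorage as the storage argument, while the in-memory stand-in is there only so the example runs anywhere.

```javascript
// `storage` is anything with getItem/setItem, such as window.localStorage.
function saveState(storage, key, state) {
  storage.setItem(key, JSON.stringify(state));
}

function loadState(storage, key, fallback) {
  const raw = storage.getItem(key);
  return raw === null ? fallback : JSON.parse(raw);
}

// In-memory stand-in that mimics getItem returning null for missing keys.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (k) => (data.has(k) ? data.get(k) : null),
    setItem: (k, v) => { data.set(k, String(v)); },
  };
}

const storage = createMemoryStorage();
saveState(storage, 'app-state', { page: 3, filters: ['new'] }); // before reload
const restored = loadState(storage, 'app-state', {});           // after reload
```

Only JSON-serialisable data survives this round trip, so functions, DOM nodes and cyclic structures have to be rebuilt after the reload.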

If it is true that there is no way to trigger a GC, as implied by the other answers, then the solution would be for the appropriate browser standards group to add a new JavaScript function to window or document to do this. It is useful to allow the web page to trigger a GC at a time of its own choosing, so that animations and other high-priority operations (sound output?) will not be interrupted by a GC.
This might be another case of "we've always done it this way; don't rock the boat" syndrome.
ADDED:
MDN documents a function "Components.utils.schedulePreciseGC" that lets a page schedule a GC sometime in the future with a callback function that is called when the GC is complete. This may not exist in browsers other than Firefox; the documentation is unclear. This function might be usable prior to animations; it needs to be tested.

In JavaScript, when finished with an object created via new ActiveXObject, do I need to set it to null?

In a Javascript program that runs within WSH and creates objects, let's say Scripting.FileSystemObject or any arbitrary COM object, do I need to set the variable to null when I'm finished with it? Eg, am I recommended to do this:
var fso = new ActiveXObject("Scripting.FileSystemObject");
var fileStream = fso.openTextFile(filename);
fso = null; // recommended? necessary?
... use fileStream here ...
fileStream.Close();
fileStream = null; // recommended? necessary?
Is the effect different than just letting the vars go out of scope?

Assigning null to an object variable will decrement the reference counter so that the memory management system can discard the resource - as soon as it feels like it. The reference counter will be decremented automagically when the variable goes out of scope. So doing it manually is a waste of time in almost all cases.
In theory a function using a big object A in its first and another big object B in its second part could be more memory efficient if A is set to null in the middle. But as this does not force the mms to destroy A, the statement could still be a waste.
You may get circular references if you do some fancy class design. Then breaking the circle by hand may be necessary - but perhaps avoiding such loops in the first place would be better.
There are rumours about ancient database access objects with bugs that could be avoided by zapping variables. I wouldn't base my programming rules on such voodoo.
(There are tons of VBscript code on the internet that is full of "Set X = Nothing"; when asked, the authors tend to talk about 'habit' and other languages (C, C++))

Building on what Ekkehard.Horner has said...
Scripts like VBScript, JScript, and ASP are executed within an environment that manages memory for you. As such, explicitly setting an object reference to Null or Empty, does not necessarily remove it from memory...at least not right away. (In practice it's often nearly instantaneous, but in actuality the task is added to a queue within the environment that is executed at some later point in time.) In this regard, it's really much less useful than you might think.
In compiled code, it's important to clean up memory before a program (or section of code in some cases) ends so that any allocated memory is returned to the system. This prevents all kinds of problems. Outside of slowly running code, this is most important when a program exits. In scripting environments like ASP or WSH, memory management takes care of this cleanup automatically when a script exits. So all object references are set to null for you even if you don't do it explicitly yourself which makes the whole mess unnecessary in this instance.
As far as memory concerns during script execution, if you are building arrays or dictionary objects large enough to cause problems, you've either gone way beyond the scope of scripting or you've taken the wrong approach in your code. In other words, this should never happen in VBScript. In fact, the environment imposes limits to the sizes of arrays and dictionary objects in order to prevent these problems in the first place.

If you have long running scripts which use objects at the top/start, which are unneeded during the main process, setting these objects to null may free up memory sooner and won't do any harm. As mentioned by other posters, there may be little practical benefit.

Memory leak in JavaScript (Chrome)

I'm calling a function 50 times a second, which does some expensive things as it is painting a lot on a <canvas> element.
It works great, no problems there, but I just took a look at the memory usage and it was stealing 1MB a second of my RAM. Chrome seems to garbage collect, as it went down each minute or so, but then the usage grew again.
What I tried is putting return at certain places in my function so as to decide what part of my function exactly causes the leak. I've been able to cut it down to a specific line of code, after which the evil part comes, but I don't really know how to solve it.
My questions are:
What tool is available to effectively measure JavaScript memory leaks in Chrome?
Would it be effective to set variables to null / undefined after they have been used, something like disposing them?
If the source code is really necessary I wouldn't hesitate to post it here, but I must admit that it's both long and perhaps a little hard for others to understand.

I'm just going to pull this quote directly, linked from the article;
Speaking of memory leaks, breaking circular references — the cause of the leaks — is usually done with simple null assignment. There’s usually no need to use delete. Moreover, null‘ing allows to “dereference” variables — what delete would normally not be able to do.
var el = document.getElementById('foo');
// circular reference is formed
el.onclick = function() { /* ... */ };
// circular reference is broken
el = null;
// can't `delete el` in this case, as `el` has DontDelete
For these reasons, it’s best to stick with null‘ing when breaking circular references.
delete Explained
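A short sketch of the difference the quote describes, using a made-up cache object: delete removes a property from an object, while assigning null keeps the binding and only drops the reference it held.

```javascript
const cache = { big: new Array(1000).fill(0), small: 1 };

// `delete` removes a property from an object.
const removed = delete cache.big;       // true
const stillHasBig = 'big' in cache;     // false

// Assigning null keeps the property but drops the reference it held.
cache.small = null;
const stillHasSmall = 'small' in cache; // true

// A declared variable is not a deletable property; reassigning it is how
// the object it referred to gets released.
let el = { onclick: function () { /* ... */ } };
el = null;
```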

Look at heap profile under the Profiles tab in Chrome's developer tools for information about memory usage.
You can do the following to prevent memory leaks:
Test your code with JSLint, to see if that will give you some pointers.
Use the var keyword to give your variables function scope, so they can be garbage collected when they go out of scope. Without the var keyword variables have global scope.
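Use delete obj.prop; statements to remove properties you no longer need from long-lived objects. Note that delete does not work on variables declared with var; to release the object such a variable refers to, set the variable to null, which drops the reference so that the object can be garbage collected.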
