Detecting changes in a complex object - javascript

I'm writing a compiler in JavaScript, and for the optimizer I'm following the usual pattern where a list of optimizations is run repeatedly until nothing happens. The obvious way to detect the 'nothing happens' condition is to have each optimization set a flag if it successfully does something. I'm wondering if there's a more elegant way to go about it.
In the abstract, the problem can be phrased like this: given a complex object (with many levels of subobjects including arrays with circular references etc.), run it through a possible transformation then detect whether anything has changed. So the question is whether there is a simple way to detect changes in a complex object.
Watch.js provides ways to detect changes in an object, but only at top level, and will trigger if a field is changed even if it is subsequently returned to its original value.
Another approach would be to make a deep copy of the object and then deep compare with the original. However, from other questions here, deep copy looks like a nontrivial operation and deep compare has challenges of its own.
Is there an elegant trick I'm missing, or should I just stick to letting each optimization pass do its own bit of bookkeeping?

I would just post this as a comment, but I don't have the required rep.
I don't know if this would work in your situation, but what you could do is convert to JSON, then compare the strings:
JSON.stringify(firstObject) === JSON.stringify(secondObject)
Was looking around a bit more, and found another Stack Overflow post with a similar question. There is a solution similar to mine, but what I found most interesting was the second solution, not chosen as the answer; I think it has what you need: Object comparison in JavaScript
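Two pitfalls of the stringify approach are worth flagging for this exact use case: JSON.stringify throws on circular structures (which the question explicitly mentions), and it is sensitive to property insertion order. A minimal sketch:

```javascript
// Same contents, different insertion order: the strings differ.
const a = { x: 1, y: 2 };
const b = { y: 2, x: 1 };

console.log(JSON.stringify(a) === JSON.stringify(b)); // false

// Circular references, as in the question, make stringify throw.
const c = {};
c.self = c;
let threw = false;
try {
  JSON.stringify(c);
} catch (e) {
  threw = true; // TypeError: Converting circular structure to JSON
}
console.log(threw); // true
```

So for an optimizer IR with circular references, the stringify comparison would need a custom replacer or a different approach entirely.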

Related

Object.constructor() vs Presence of Join in terms of performance

I've been reading this thread to check if argument is Array or String
Check if object is array?
The solution mentioned is:
function isArray(obj) {
    return !!obj && Array === obj.constructor;
}
But I've written the following function
function isArray(obj) {
    return obj.join;
}
I've done some sample tests, and I'm finding the obj.join check returns results faster than the constructor check. Am I missing something?
I thought the constructor check would be faster, since Object is the parent class, but I'm getting faster results using obj.join. Does that mean the answer in the other SO post was incorrect?
Jsfiddle:
https://jsfiddle.net/vdpseotp/6/
Please explain.
The approach of using obj.join is sometimes called "duck-typing". Using join as you propose will fail miserably if you use it on an object which happens to contain a join property.
Just use the APIs which are designed for the task, such as Array.isArray, or a routine from a utility library. They have already thought through all of the issues. For example, Underscore uses Object.prototype.toString.call(obj) === '[object Array]'. There is also an extensive literature right here on SO on the topic, starting with the one you mention.
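To make the failure mode concrete: an ordinary object that happens to have a `join` property fools the duck-typing check, while `Array.isArray` and the Underscore-style `toString` check are not fooled. (`isArrayDuck` and `impostor` are hypothetical names for this sketch.)

```javascript
// The duck-typing check from the question, under a hypothetical name:
function isArrayDuck(obj) {
  return !!obj && !!obj.join;
}

// An ordinary object that happens to carry a `join` property:
const impostor = { join: () => "gotcha" };

console.log(isArrayDuck(impostor));   // true, but wrong
console.log(Array.isArray(impostor)); // false, correct
console.log(Array.isArray([1, 2]));   // true

// The Underscore-style check mentioned above:
console.log(Object.prototype.toString.call([1]) === "[object Array]"); // true
```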
The more basic question is, why do you have things and don't know whether they are arrays or not? Perhaps some utility routines for traversing arbitrary objects might need to check for array-ness, or a server might send down something that might or might not be an array, but a well designed user-program should already know which objects are what. To put it a different way, if you were ever to move to a typed approach such as TypeScript, what type would you propose to assign to these maybe-an-array-maybe-not objects? Would you use any, which sort of defeats the whole purpose?
The second question is, why are you obsessing over a few microseconds? Do you have a game which is recalculating the position of hundreds of thousands of objects, sixty times per second? If not, it doesn't matter. If it does matter, see previous paragraph, and refactor your code so as not to have to check for something being an array all the time.
By the way, a meaningful benchmark should be run a million times. FWIW, console.time/console.timeEnd often comes in handy for quick-and-dirty benchmarks. Anyway, according to your fiddle, the time difference is no more than a factor of 1.5 or so at worst.
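A sketch of such a quick-and-dirty benchmark; the labels and the million-iteration count are arbitrary choices for illustration:

```javascript
const arr = [1, 2, 3];
const N = 1e6;

// Time the built-in check over a million iterations.
console.time("Array.isArray");
for (let i = 0; i < N; i++) Array.isArray(arr);
console.timeEnd("Array.isArray");

// Time the constructor check over the same number of iterations.
console.time("constructor check");
let last = false;
for (let i = 0; i < N; i++) last = !!arr && Array === arr.constructor;
console.timeEnd("constructor check");
```

The absolute numbers will vary per machine and engine; only the ratio between the two timings is meaningful, and even that can be skewed by JIT warm-up.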
Minor note: Array === obj.constructor is not going to play well with subclasses of Array.
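A short sketch of that note, using a hypothetical `Stack` subclass:

```javascript
// A subclass of Array for illustration only.
class Stack extends Array {}
const s = new Stack();
s.push(1, 2, 3);

console.log(s.constructor === Array); // false: the constructor check fails
console.log(Array.isArray(s));        // true: isArray handles subclasses
```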

Is it a bad practice to declare an Array in arguments?

validationError([elem1,elem2],type,shiftNo);
or
var arr = [elem1,elem2];
validationError(arr,type,shiftNo);
What I mean to ask is: is approach 1 of calling the function considered bad (and does it have any performance ramifications)? And, for that matter, is it a bad approach to declare strings, objects, and functions inside arguments?
Performance is not an issue, not in a language like JS, Ruby or whatnot. So all we can do is think about code readability. And this case is not strongly related to JS, so neither will my examples be.
move = ["E2", "E4"];
if chessboard.valid(move, player) {
...
}
This clearly states: "if the move (E2 E4) is valid for this chessboard, then...", you don't even need to look at the docs to know that. If we write that without assigning our array a name, the result looks a little cryptic (still easy to guess, but harder for such a tiny example):
if chessboard.valid(["E2", "E4"], player) {
...
}
What is this supposed to mean? What does valid stand for here? Maybe it's asking whether these cells contain the player's valid pieces? This is a symptom of a design flaw, more precisely bad naming. It makes bold assumptions about how the chessboard code will be used. We can make it obvious that this array represents a move by renaming the chessboard's method:
if chessboard.valid_move(["E2", "E4"], player) {
...
}
This is better, but you may not have an API that allows your code to stay so readable without some additional naming.
So, I suggest a rule of thumb:
If the array will be used more than once, name it.
If the meaning of the array is not obvious from where it goes (function name), name it.
Don't name it, unless points 1 or 2 apply.
It doesn't make any difference, really. Either way you create a JavaScript Array, which is basically an Object, and get a reference in return (which you pass to your method). If you don't need to access that array (or other data) later in your code, the second approach is completely fine.
Are the contents of arr ever going to get used again? If so, then option 2 is definitely the way to go. If not... something as simple as this is probably just personal opinion.
Personally, I'd have to say that option 2 is better practice, even though sometimes I'm guilty of using option 1. Option 2 is easier to read, it's easier to follow and it's less likely someone will have to re-read it because they became temporarily confused or lost in flow of thought whilst reading through your code (especially newer programmers). For those reasons it's easier to maintain, you, and potentially future developers working with your code, will likely save time working with it.
The only negatives I can see would be the absolutely minuscule amount of overhead generated, and that you now have 2 lines of code instead of 1. But I think that's irrelevant; the tiny potential benefits of option 2 outweigh the tiny negatives of option 1.
It is subjective, but in my opinion it is better to use the second approach.
As @jAndy said, there is no difference in code execution or performance, but the second approach is easier to debug, read, and understand.

Comparing a string against any other variable to determine type in javascript

I am aware of being able to use typeof, however, i would like to know if using
String(anyVariable) === anyVariable
in order to figure out if anyVariable is a string:
Is a generally valid approach?
Works consistently among browsers?
Has any pitfalls?
I would say do not do that; use typeof instead. String is meant to manipulate a stored piece of text (and to extend the string type with methods), not to compare types. It is best to use features for their intended purpose, to get the most stability and best practice out of them. Your approach also causes more work and processing: a conversion and a comparison instead of a simple type check. Hopefully that answers it, though this is a question that merely has an "opinion" as an answer. You wouldn't create a new object and assign it to your current object just to check whether it is an object, would you? No, you would just use typeof.
I can think of no reason to use your method vs. the much simpler typeof. Yours is likely to perform worse (15x slower by Matti's jsperf) and be more complex.
Your method is going to require multiple memory operations (creating a string object, then assigning the string value to it) and will give the garbage collector work to do afterwards, whereas typeof just looks at a property of the internal JavaScript object.
When in doubt, choose the simplest method that solves your problem.
When in doubt, choose the method that is specified in the language definition for solving your problem.
When in doubt, choose the method that requires less memory manipulation.
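To make the trade-off concrete, here is a sketch comparing the two checks; `isStringCoerce` is a hypothetical name for the approach from the question:

```javascript
// The question's approach, wrapped in a helper for illustration.
const isStringCoerce = (x) => String(x) === x;

console.log(typeof "abc" === "string"); // true
console.log(isStringCoerce("abc"));     // true
console.log(isStringCoerce(42));        // false

// Both checks reject boxed String objects, which may surprise you:
const boxed = new String("abc");
console.log(typeof boxed);          // "object", not "string"
console.log(isStringCoerce(boxed)); // false: String(boxed) is a primitive,
                                    // and === never equates it to the object
```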

Chrome performance: "standard" property names vs. non-standard

So this is an interesting one... While I was testing the performance of setAttribute vs. normal property set on an element, I found an odd behavior, which I then tested on regular objects and... It's still odd!
So if you have an object A = {},
and you set its property like A['abc_def'] = 1, or A.abc_def = 1, they are basically the same.
But then if you do A['abc-def'] = 1 or A['123-def'] = 1, then you are in trouble. It goes way slower.
I set up a test here: http://jsfiddle.net/naPYL/1/. They all behave the same in all browsers except Chrome.
The funny thing is that for the "abc_def" property, Chrome is actually much faster than Firefox and IE, as I expected. But for "abc-def" it's at least twice as slow.
So what happens here, basically (at least from my tests), is that when you use "correct" syntax for property names (legal C identifiers, which you can use with dot notation) it's fast, but when you use a name that requires bracket notation (a[...]) you're in trouble.
I tried to imagine what implementation detail would distinguish between the two modes in such a way, and couldn't. Because as I think of it, if you do support those non-standard names, you are probably translating all names into the same mechanism, and the rest is just syntax that compiles down to it. So dot syntax and [] should be all the same after compilation. But obviously something is going the other way around here...
Without looking at V8's source code, could anyone think of a really satisfying answer? (Think of it as an exercise :-))
Here's also a quick jsperf.com example
Thanks to NDM for the jsperf example!
Edit:
To clarify, of course I want also a concrete answer from the real code
(which I already found) or to be more precise - the reason behind that
specific implementation. That is one of the reasons I asked you to
look at it "as an exercise", to look behind the technical
implementation and try to find the reason.
But I also wanted to see how other people's minds work in cases like these.
This may sound "vague" to some of you - but it is very useful to try and think
like other people from time to time, or take their point of view. It
enhances your own ways of thinking.
So JS objects can be used for two conflicting purposes. They can be used as objects but they can be used as hash tables too. However what is fast and makes sense
for objects is not so for hash tables, so V8 tries to guess what a given object is.
Some signs a user can give that they want a dictionary are deleting a property, or giving a property a name that cannot be accessed using dot notation.
Some other heuristics are also used, I have made a gist https://gist.github.com/petkaantonov/6327915.
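The "name that cannot be accessed using dot notation" point can be sketched like this (the actual dictionary-mode transition is internal to V8 and not observable from plain JavaScript):

```javascript
const A = {};
A.abc_def = 1;     // identifier-like name: dot notation works
A["abc-def"] = 2;  // hyphenated name: bracket notation required

console.log(A.abc_def);    // 1
console.log(A["abc-def"]); // 2
// Note: A.abc-def would parse as (A.abc) - def, a subtraction,
// which is why the hyphenated name forces bracket syntax.
```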
There is, however, a really cool hack that rescues an object from hash-table hell:
function ensureFastProperties(obj) {
    function f() {}
    f.prototype = obj;
    return obj;
}
See it in action: http://jsperf.com/property-dash-parformance/2.
The rescued object is not as fast as the original because its properties are stored in the external properties array rather than in-object, but that's still far better than a hash table. Note that this is still a pretty broken benchmark; do not think for a second that hash tables are only 2x slower than in-object properties.

Which way is faster to copy an array of objects: slice or clone?

This is related to: How do I pass the value instead of the reference of an array?
I need to send the value instead of a reference to an array. To that question I got 2-3 valid answers: one was to use slice; the second (and the similar third) was to use clone or write my own clone function.
From a (very) quick test, it seems like slice was faster (tested on a 100,000 elements array). But I don't have any explanation for that.
Can anyone clarify if and why is slice faster?
The clone function presented in that answer is very general (also quite poor; never, ever, ever add enumerable properties to Object.prototype, and there are other issues as well), and is implemented in JavaScript. In contrast, the slice answer uses the JavaScript engine's built-in function, which can be written in highly optimized machine code. (Or not, of course.)
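Either way, note that slice produces a shallow copy: the new array holds the same object references as the original. A small sketch of what that means in practice:

```javascript
const original = [{ n: 1 }, { n: 2 }];
const copy = original.slice();

console.log(copy !== original);       // true: a brand-new array object
console.log(copy[0] === original[0]); // true: elements are shared references

// Structural changes to the copy do not affect the original...
copy.push({ n: 3 });
console.log(original.length); // 2

// ...but mutating a shared element is visible through both arrays:
copy[0].n = 99;
console.log(original[0].n); // 99
```

If the elements themselves must be independent, a deep copy is needed, which is exactly the harder problem discussed in the first question above.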
