According to my research and googling, Javascript seems to lack support for locale aware sorting and string comparisons. There is localeCompare(), but it has been reported of browser specific differencies and impossibility to explicitly set which locale is used (the OS locale is not always the one wanted). There is some intentions to add collation support inside ECMAScript, but before it, we are on our own. And depending how consistent the results are across browsers, may be we are on our own forever :(.
I have the following code, which makes alphabetical sort of an array. It's made speed in mind, and the ideas are got from https://stackoverflow.com/a/11598969/1691517, to which I made some speed improvements.
In this example, the words array has 13 members and the sort-function is called 34 times. I want to replace some of the letters in the words-array (you don't have to know what replacements are made, because it's not the point in this question). If I make these replacements in sort-function ( the one that starts with return function(a, b) ), the code is inefficient, because replacements are made more than once per array member. Of course I can make these replacements outside of this closure, I mean before the line words.sort(sortbyalphabet_timo);, but it's not what I want.
Question 1: Is it possible to modify the words-array in between the lines "PREPARATION STARTS" and "PREPARATION ENDS" so that the sort function uses modified words-array?
Question 2: Is it possible to input arguments to the closure so that code between PREPARATION STARTS and PREPARATION ENDS can use them? I have tried this without success:
var caseinsensitive = true;
words.sort( sortbyalphabet_timo(caseinsensitive) );
And here is finally the code example, and the ready to run example is in http://jsfiddle.net/3E7wb/:
var sortbyalphabet_timo = (function() {
// PREPARATION STARTS
var i, alphabet = "-0123456789AaÀàÁáÂâÃãÄäBbCcÇçDdEeÈèÉéÊêËëFfGgHhIiÌìÍíÎîÏïJjKkLlMmNnÑñOoÒòÓóÔôÕõÖöPpQqRrSsTtUuÙùÚúÛûÜüVvWwXxYyÝýŸÿZz",
index = {};
i = alphabet.length;
while (i--) index[alphabet.charCodeAt(i)] = i;
// PREPARATION ENDS
return function(a, b) {
var i, len, diff;
if (typeof a === "string" && typeof b === "string") {
(a.length > b.length) ? len = a.length : len = b.length;
for (i = 0; i < len; i++) {
diff = index[a.charCodeAt(i)] - index[b.charCodeAt(i)];
if (diff !== 0) {
return diff;
}
}
// sort the shorter first
return a.length - b.length;
} else {
return 0;
}
};
})();
var words = ['tauschen', '66', '55', '33', 'täuschen', 'andern', 'ändern', 'Ast', 'Äste', 'dosen', 'dösen', 'Donaudam-0', 'Donaudam-1'];
$('#orig').html(words.toString());
words.sort(sortbyalphabet_timo);
$('#sorted').html(words.toString());`
Is it possible to modify the words-array in between the lines "PREPARATION STARTS" and "PREPARATION ENDS" so that the sort function uses modified words-array?
No, not really. You don't have access to the array itself, your function only builds the compare-function that is later used when .sort is invoked on the array. If you needed to alter the array, you'll need to write a function that gets it as an argument; for example you could add a method on Array.prototype. It would look like
function mysort(arr) {
// Preparation
// declaration of compare function
// OR execution of closure to get the compare function
arr.sort(comparefn);
return arr;
}
Is it possible to input arguments to the closure so that code between PREPARATION STARTS and PREPARATION ENDS can use them?
Yes, of course - that is the reason to use closures :-) However, you can't use sortbyalphabet_timo(caseinsensitive) with your current code. The closure you have is immediately invoked (called an IIFE) and returns the compare-function, which you pass into sort as in your demo.
If you want sortbyalphabet_timo to be the closure instead of the result, you have to remove the brackets after it. You also you can use arguments there, which are accessible in the whole closure scope (including the comparefunction):
var sortbyalphabet_timo_closure = function(caseinsensitive) {
// Preparation, potentially using the arguments
// Declaration of compare function, potentially using the arguments
return comparefn;
}
// then use
words.sort(sortbyalphabet_timo_closure(true));
Currently, you are doing this:
var sortbyalphabet_timo_closure = function(/*having no arguments*/) {
// Preparation, potentially using the arguments
// Declaration of compare function, potentially using the arguments
return comparefn;
}
var sortbyalphabet_timo = sortbyalphabet_timo_closure();
// then use
words.sort(sortbyalphabet_timo);
…which just caches the result of executing the closure, if you'd need to sort multiple times.
Related
Theoretical question, if for e.g. I have a big object called Order and it has a tons on props: strings, numbers, arrays, nested objects.
I have a function:
function removeShipment(order) {
order.shipment.forEach(
// remove shipment action
);
}
Which mean I access only one prop (shipment), but send a big object.
From perspective of garbage collection and performance is there a difference, between pass Order and pass Order.shipment?
Because object passed by reference, and don't actually copy Order into variable.
As ibrahim mahrir stated in a comment-- though I don't know why they didn't post an answer, because OPs are incentivised to pick a "best answer" & the sole, bewildering response was therefore chosen-- there is no practical performance difference between passing order to your removeShipment method, or passing order.shipment
This is because JavaScript functions are "pass-by-value" for primitive types, like number and boolean, and it uses something known as "call-by-sharing" for passing copies of references for Objects (like your order and assumedly your Array of shipments). The entire object is not copied when passed as a parameter, just a copy of a reference to it in memory. Either approach, passing order or order.shipments, is effectively identical.
I did write a couple timing tests for this, but the actual difference is so small that it's exceptionally difficult to write a test that even properly measures it. I'll include my code at the end for completeness' sake, but from my limited testing in Firefox & Chrome, they were practically identical, as expected.
For another question / answer in the same vein as yours (as well as a great video on why "Micro-benchmarking" often doesn't produce correct results) that corroborates what I wrote, see: does size of argument in a javascript function affects its performance?
See this answer regarding the implications of "call-by-sharing" Is JavaScript a pass-by-reference or pass-by-value language?
You didn's specify what, "remove shipment action" actually "means" in practice. You could just do testOrder.shipments = [] if you just wanted to "remove all shipments" from the order object. They'd be garbage collected at some point after this if nothing else can reach them. I'm just going to iterate through each & perform an addition operation as a stub, as I'm afraid otherwise everything would just be optimised out.
// "num" between 0 inclusive & 26 exclusive
function letter(num)
{
return String.fromCharCode(num + 65)
}
// Ships have a 3-letter name & a random value between 0 & 1
function getShipment() {
return { "name": "Ship", "val": Math.random() }
}
// "order" has 100 "Shipments"
// As well as 676 other named object properties with random "result" values
// e.g. order.AE => Object { result: 14.9815045239037 }
function getOrder() {
var order = {}
for (var i = 0; i < 26; i++)
for (var j = 0; j < 26; j++) {
order[letter(i) + letter(j)] = { "result": (i+j) * Math.random() }
}
order.shipments = Array.from({length: 100}).map(getShipment)
return order
}
function removeShipmentOrder(order) {
order.shipments.forEach(s => s.val++);
}
function removeShipmentList(shipmentList) {
shipmentList.forEach(s => s.val++);
}
// Timing tests
var testOrder = getOrder();
console.time()
for(var i = 0; i < 1000000; i++)
removeShipmentOrder(testOrder)
console.timeEnd()
// Break in-between tests;
// Running them back-to-back, the second test always took longer.
// I assume it's actually due to some kind of compiler optimisation
var testOrder = getOrder();
console.time()
for(var i = 0; i < 1000000; i++)
removeShipmentList(testOrder.shipments)
console.timeEnd()
I was wondering this myself. I decided to test it. Here is my test code:
var a = "Here's a string value";
var b = 5; // and a number
var c = false;
var object = {
a, b, c
}
var array = [
a, b, c
];
var passObject = (obj) => {
return obj.a.length + obj.b * obj.c ? 2 : 1;
}
var passRawValues = (val_a, val_b, val_c) => {
return val_a.length + val_b * val_c ? 2 : 1;
}
var passArray = (arr) => {
return arr[0].length + arr[1] * arr[2] ? 2 : 1;
}
var x = 0;
Then I called the three functions like this:
x << 1;
x ^= passObject(object);
x << 1;
x ^= passRawValues(a, b, c);
x << 1;
x ^= passArray(array);
The reason it does the bit shifting and XORing is that without it, the function call was optimized away entirely by some JS runtimes. By storing the result of the function, I forced the runtime to actually do the function call.
Results
In Webkit and Chromium, passing an object and passing an array were about the same speed, and passing raw values was a little bit slower. Firefox showed about the same performance ratio but I'm not sure that I trust the results since it was literally ten times faster than Chromium.
Here is a link to my my test case on MeasureThat. In case the link doesn't work: it's the same code as above.
Here's a screenshot of the run results (in Chromium on an M1 Macbook Air):
About 5 million ops/s in Chromium for passing an object, versus about 3.7 million for passing a trio of primitive values.
Explanation
So why is that? Well, JavaScript strictly uses pass-by-value semantics. But when you pass an object to a function, the value that you're passing isn't actually the object itself, but rather a pointer to the object. So the variable storing the pointer gets duplicated, but the contents of what it points to does not. This is also why you can have a function that takes an object and alters its properties and that change will happen outside the function as well, but if you reassign the object, the outside scope will still reference the old object.
For this reason, the size of the passed object is largely irrelevant for performance. If the var object = {...} above is changed to contain a bunch of other data, the operations per second achieved when passing it to the function remains exactly the same, because the only thing changing is the amount of data in the block of memory storing the object. The value being passed to the function isn't bigger just because the object is bigger.
Created a simple test here https://jsperf.com/passing-object-vs-passing-raw-value
Test results:
in Chrome passing object is ~7% slower that passing raw value
in Firefox passing object is ~15% slower that passing raw value
in IE11 passing object is ~10% slower that passing raw value
This is syntetic test for passing only one variable, so in other cases results may differ
I have an assignment to count repeated strings base on a Heap's Algorithm Permutation.The first thing I want to do is output the swapped strings, I found this code from jake's answer Can someone please help me understand recursion within this code in a loop? The output of this function are swapped strings.
function permAlone(string) {
var arr = string.split(''), // Turns the input string into a letter array.
permutations = []; // results
function swap(a, b) {
debugger; // This function will simply swap positions a and b inside the input array.
var tmp = arr[a];
arr[a] = arr[b];
arr[b] = tmp;
}
function gen(n) {
debugger;
if (n === 1) {
var x =arr.join('');
permutations.push(x);
} else {
for (var i = 0; i != n; i++) { // how does this loop executes within the call stack?
gen(n - 1);
debugger;
swap(n % 2 ? 0 : i, n - 1); // i don't understand this part. i understand the swap function, but I don't get how indexes are swapped here
}
}
}
gen(arr.length);
return permutations;
}
permAlone('xyz'); // output -> ["xyz","yxz","zxy","xzy","yzx","zyx"]
I have been experimenting it on debugger but still can't get what's happening.
I'm not sure what you mean by
understand recursion within this code in a loop
If you mean you want to see the algorithm in a loop form rather than a recursion version you can see them one by side in pseudocode in the wikipedia page here.
For your questions within the code:
how does this loop executes within the call stack?
You are right to refer to the call stack, and this is a general question regarding recursion. If you don't understand how recursion works with the stack you can refer to this really nice and simple video that demonstrates recursive calls using factorial calculation in java (start around min 4:00).
The line you look at is no different than any other line in the recursive function. We start by defining i and assigning the value 0 to it. We continue to check if it satisfies the condition of the for loop. If it does we step into the loop and execute the first line inside the loop which is the recursive call. Inside the recursive call we have a new stack frame which has no knowledge of the i variable we defined before executing the recursive call, because it is a local variable. So when we get to the loop in the new call we define a new variable i, assigning it 0 at first and incrementing it as the loop repeats in this stack frame/call instance. When this call finishes we delete the stack frame and resume to the previous stack frame (the one we started with) where i=0 still, and we continue to the next line.
All the calls have access to the arr and permutations variables since the function is defined in the same scope as the variables (inside the function permAlone) so within each call - no matter what the stack frame we are in, the changes made to those are made to the same instances. That's why every push done to permutations adds to the existing results and will be there when the function returns the variable at the end.
i don't understand this part. i understand the swap function, but I don't get how indexes are swapped here
Indexes are not swapped here. It is merely a call for the swap function with the correct indices.
swap(n % 2 ? 0 : i, n - 1);
is just
swap(a, b);
with
a = n% 2 ? 0 : i;
b = n - 1;
If the a part is what confuses you, then this is a use of the ternary operator for conditional value. That is, it's symbols used to form an expression that is evaluated differently according to the circumstances. The use is by
<<i>boolean epression</i>> ? <<i>value-if-true</i>> : <<i>value-if-false</i>>
to evaluate the above, first <boolean expression> is evaluated. If it's value it true then the whole expression is evaluated as <value-if-true>. Otherwise, the whole expression is evaluated as <value-if-false>.
In the code itself, for a, n % 2 is the boolean expression - js divides n by 2 and takes the remainder. The remainder is either 1 or 0. js implicitly converts those to true and false respectively. So if n is odd we get
a = 0
and if it's even we get
a = i
as the algorithm requires.
I need to create wrapper functions around other functions. It is well known that arguments object is quite cranky and can't be passed to any function verbatim. Array creation in V8 is not cheap either. So the fastest code I could come up with is this:
function wrap (fun) {
// reuse the same array object for each call
var args = [];
var prevLen = 0;
return function () {
// do some wrappy things here
// only set args.length if it changed (unlikely)
var l = arguments.length;
if (l != prevLen) {
prevLen = args.length = l;
}
// copy the args and run the functon
for (var i = 0; i < l; i++) {
args[i] = arguments[i];
}
fun.apply(this, args);
};
}
var test = wrap(function (rec) {
document.write(arguments[1] + '<br />');
if (rec) test(false, 'something else');
document.write(arguments[1] + '<br />');
});
test(true, 'something');
This way I avoid creating or changing length of the array object unless really needed. The performance gain is quite serious.
The problem: I use the same array all over the place and it could change before the function call is finished (see example)
The question: is the array passed to .apply() copied to somewhere else in all JavaScript implementations? Is it guaranteed by the current EcmaScript spec that the fourth line of output will never ever be something else?
It works fine in all the browsers I checked, but I want to be future-proof here.
Is the array passed to .apply() copied to somewhere else, and is it guaranteed by the current EcmaScript spec?
Yes. The apply method does convert the array (or whatever you pass in) to a separate arguments list using CreateListFromArrayLike, which is then passed around and from which the arguments object for the call is created as well as the parameters are set.
According to the spec it is indeed supposed to be copied so the code seems to be safe.
Let argList be CreateListFromArrayLike(argArray).
Return Call(func, thisArg, argList).
How does Babel implement tail recursion?
How does this transpiled es5 code work?
Thought that I would post a detailed explanation, to allow those landing here from Google to hopefully be able to come to a faster understanding more quickly.
Note that the following explanation was derived from what I can tell from the code (no consultation with the author of Babel or other experts in its code was performed), so it is unknown whether the meaning I derived was the intended meaning of #sebmck or others who contributed to this transformation.
"use strict"; // ES2015/ES6 modules are assumed to be in strict mode.
function factorial(_x2) {
// This variable is the list of arguments to factorial.
// Since factorial may be called again from within this function
// (but must be called indirectly to prevent call stack growth),
// a custom arguments-like variable must be maintained.
var _arguments = arguments;
// _again equals true when the original factorial function should be called
// once more.
var _again = true;
// This line creates a labeled while loop, use to allow continuing this
// specific loop, without the requirement of having no nested while loops
// Nested while loops would cause a "continue" statement within them to
// execute them, not this loop.
_function: while (_again) {
// The variable used in the original factorial function was called "n",
// this method allows Babel to not rename it, but simply assign the
// possibly modified first argument of the factorial function to it.
var n = _x2;
// Temporal dead zone (TDZ) mechanic
acc = undefined;
// The "use strict" directive from the original function
"use strict";
// Beginning of user code, this is always inserted to ensure that if
// this is the final run, the while loop will not run again.
_again = false;
// This is Babel's default argument handling. The steps, in order, are:
// 1. Make sure that there will not be an out-of-bounds access for the
// undefined check.
// 2. Check if the second argument to the current iteration is undefined,
// if yes: the default value is `1`, if no, use the value of the first argument.
var acc = _arguments.length <= 1 || _arguments[1] === undefined ? 1 : _arguments[1];
// Input code - no modifications.
if (n <= 1) return acc;
// The following three lines are the call to factorial() within factorial
// in the input code. The first line assigns the new arguments, as well
// as updating the _x2 variable to it's new value. The second line
// overrides the assignment in the beginning of the loop.
// The third line brings the loop back to the beginning.
_arguments = [_x2 = n - 1, n * acc];
_again = true;
continue _function;
}
}
here is the function:
var M = [];
function haveComponents () {
var a = 0;
for (var n in this.M) a++;
return a > 0;
}
I would like to understand:
the construct of "for(var n in this.M)"; I'm used to a regular for loop and I'm not familiar with this construct.
how "this.M" fits into the code i.e. its purpose
generally speaking, what this function would likely be used for.
Thanks
There appears to be some missing code.
var M = [];
Assigns a new array to the variable M, which seems to be a global variable (but likely isn't, you just haven't shown enough code to properly determine the context).
haveComponents: function () {
That appears to be part of an object literal that assigns a function to a property called haveComponents.
var a = 0;
Creates a local variable a and when the code executes, assigns it a value of 0.
for (var n in this.M) a++;
Creates a local variable n and sequentially assigns it the name of an enumerable property of whatever this.M references. If this is the global object, M will be the array initialised above. If not, it may or may not be something else. You haven't shown any other assignment, or what this has been set to.
For each enumerable property of M (which includes its inherited properties), a will be incremented by one.
return a > 0;
}
Returns true if a is greater than zero.
An equivalent function is:
haveComponents: function () {
for (var n in this.M) {
// this.M has at least one enumerable property
return true;
}
// this.M has no enumerable properties
return false;
}
or for the purists:
haveComponents: function () {
var hasEnumerable = false;
for (var n in this.M) {
hasEnumerable = true;
break;
}
return hasEnumerable;
}
The function counts how many elements are in the M array.
The for in allows you to iterate object's enumerable properties , note that this is different from a for each behaviour where the iteration is over items rather tha properties. In javascript this translates into going into the prototype property names and list them as well, possibly resulting in unexpected result.
for(var n in this.M) this is a for-each loop, used to iterate over a set of values instead that by using conditions. It is used to iterate over properties of objects.
the this keyword refer to the owner of the function (whose is the haveComponents function), while M is a property of this
this function just, uselessly, counts elements in M to see if they are more than 0. Counting them is absolutely superfluous for this purpose though.
The for(var n in this.M) iterates through all of the elements of this.M, successively storing them in the variable n.
I have no idea what this.M is, that depends on where your code comes from.
In general, I would say that this code returns whether M is empty or not (and returns true if it is not empty).