I am developing a JavaScript application and I needed a recursive algorithm for the longest common subsequence, so I went here and tried this one out.
It goes like this:
function lcs(a, b) {
var aSub = a.substr(0, a.length - 1);
var bSub = b.substr(0, b.length - 1);
if (a.length === 0 || b.length === 0) {
return '';
} else if (a.charAt(a.length - 1) === b.charAt(b.length - 1)) {
return lcs(aSub, bSub) + a.charAt(a.length - 1);
} else {
var x = lcs(a, bSub);
var y = lcs(aSub, b);
return (x.length > y.length) ? x : y;
}
}
It worked fine with the few test cases I have tried so far, but I found that it appears to loop forever on the following test case:
a: This entity works ok
b: This didn't work ok but should after
It also loops with:
a: This entity works ok
b: This didn't work as well
which at some point should get in the middle branch.
I have noticed that it is a translation of a Java version (here) of the same algorithm. It goes like this:
public static String lcs(String a, String b){
int aLen = a.length();
int bLen = b.length();
if(aLen == 0 || bLen == 0){
return "";
}else if(a.charAt(aLen-1) == b.charAt(bLen-1)){
return lcs(a.substring(0,aLen-1),b.substring(0,bLen-1))
+ a.charAt(aLen-1);
}else{
String x = lcs(a, b.substring(0,bLen-1));
String y = lcs(a.substring(0,aLen-1), b);
return (x.length() > y.length()) ? x : y;
}
}
I suspected that the JavaScript translation was wrong because it assumed that String.substr() and String.substring() were the same (which they aren't).
To check whether that was the case, I tried the Java version on the same test case here.
Guess what? The Java version does not finish either.
I am struggling to debug it, as it is recursive.
Does anyone have any idea what is going wrong with it?
As others have pointed out in the comments, the program itself is correct. The issue you are experiencing is that this implementation has exponential time complexity, and therefore takes a LONG time to run with your example input. If you let it run long enough, it will return the correct result.
As others have also pointed out in the comments, the LCS of two strings can be computed with a much lower time complexity using dynamic programming, which will solve it far more quickly. Refer to the internet for more help (Wikipedia) or, better, try to solve it yourself. As a warm-up, consider the related longest common substring problem: a string of length n has n(n+1)/2 non-empty contiguous substrings, so on the order of n^2, and you can trivially solve that problem with roughly n^2*m^2 checks (n and m being the lengths of the two strings) by testing whether each substring of a is present in b. A subsequence does not have to be contiguous, so a string has up to 2^n of them, which is why brute force gets out of hand. As an exercise, ask yourself whether you can do better. If yes, how? If no, why not?
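The dynamic programming approach mentioned above can be sketched roughly as follows. This is a minimal illustration rather than the version from either linked page: it fills a table of LCS lengths for every pair of prefixes and then walks back through the table to recover one longest common subsequence. When several subsequences share the maximum length, it may return a different one than the recursive code in the question.

```javascript
// Longest common subsequence via a table of prefix results.
// Time and memory are both O(n * m) instead of exponential.
function lcs(a, b) {
  const n = a.length;
  const m = b.length;
  // table[i][j] = length of the LCS of a.slice(0, i) and b.slice(0, j)
  const table = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= m; j++) {
      table[i][j] = a[i - 1] === b[j - 1]
        ? table[i - 1][j - 1] + 1
        : Math.max(table[i - 1][j], table[i][j - 1]);
    }
  }
  // Walk back from the bottom-right corner to rebuild one LCS.
  const out = [];
  let i = n;
  let j = m;
  while (i > 0 && j > 0) {
    if (a[i - 1] === b[j - 1]) {
      out.push(a[i - 1]);
      i--;
      j--;
    } else if (table[i - 1][j] >= table[i][j - 1]) {
      i--;
    } else {
      j--;
    }
  }
  return out.reverse().join('');
}

// The inputs from the question now finish immediately.
console.log(lcs('This entity works ok', "This didn't work ok but should after"));
```

For the two strings in the question the table has only 21 x 37 cells, which is why this version returns at once while the plain recursion does not.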
Problem statement: I'm trying to convert a string to binary without using the built-in method in JavaScript.
This is part of a program where a string input (like "ABC") is accepted and then translated into an array of the equivalent character codes ([65,66,67]).
Function binary() will change a number to binary. But I'm unable to join them together to loop through all the contents. Please help. (I'm a noob, please forgive my bad code and bad explanation)
var temp3 = [65,66,67];
var temp2 = [];
var r;
for(i=0;i<temp3.length;i++) {
var r = temp3[i];
temp2.push(binary(r));
}
function binary(r) {
if (r === 0) return;
temp2.unshift(r % 2);
binary(Math.floor(r / 2));
return temp2;
}
console.log(temp2);
I think this is a cleaner version of this function. It should work for any non-negative integers, and would be easy enough to extend to the negatives. If we have a single binary digit (0 or 1) and hence are less than 2, we just return the number converted to a string. Otherwise we call recursively on the floor of half the number (as yours does) and append the final digit.
const binary = (n) =>
n < 2
? String (n)
: binary (Math.floor (n / 2)) + (n % 2)
console.log (binary(22)) //=> '10110'
console.log ([65, 66, 67] .map (binary)) //=> ['1000001', '1000010', '1000011']
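The extension to negative integers mentioned above could look something like the following sketch. It repeats the binary function so the block is self-contained. The name signedBinary is hypothetical, and prefixing a minus sign (rather than, say, producing two's complement) is an assumption about what extending to the negatives should mean.

```javascript
// Same recursive conversion as above, for non-negative integers.
const binary = (n) =>
  n < 2
    ? String(n)
    : binary(Math.floor(n / 2)) + (n % 2);

// Hypothetical wrapper: convert the magnitude and put a sign in front.
const signedBinary = (n) =>
  n < 0 ? '-' + binary(-n) : binary(n);

console.log(signedBinary(-22)); //=> '-10110'
console.log([65, -66].map((n) => signedBinary(n))); //=> ['1000001', '-1000010']
```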
In your function you have this code
var r = temp3[i];
I don't see any temp3 variable anywhere in your code above so I'd imagine that could be causing some issues.
I am not sure whether my issue is a programming problem or a misunderstanding of the LLL algorithm as described on Wikipedia.
I decided to implement the LLL algorithm exactly as it is written on Wikipedia (step by step, line by line) to learn the algorithm and make sure it really works, but I am getting unexpected or invalid results.
So, I used JavaScript (programming language) and Node.js (JavaScript runtime) to implement it, and this is the git repository with the complete code.
Long story short, the value of k goes out of range. For example, when we have only 3 vectors (array size is 3, so the maximum index is 2), k becomes 3, which is nonsense.
My code is a step-by-step (line-by-line) implementation of the algorithm described on Wikipedia; all I did was implement it. So I don't know what the issue is.
// ** important
// {b} set of vectors are denoted by this.matrix_before
// {b*} set of vectors are denoted by this.matrix_after
calculate_LLL() {
this.matrix_after = new gs(this.matrix_before, false).matrix; // initialize after vectors: perform Gram-Schmidt, but do not normalize
var flag = false; // invariant
var k = 1;
while (k <= this.dimensions && !flag) {
for (var j = k - 1; j >= 0; j--) {
if (Math.abs(this.mu(k, j)) > 0.5) {
var to_subtract = tools.multiply(Math.round(this.mu(k, j)), this.matrix_before[j], this.dimensions);
this.matrix_before[k] = tools.subtract(this.matrix_before[k], to_subtract, this.dimensions);
this.matrix_after = new gs(this.matrix_before, false).matrix; // update after vectors: perform Gram-Schmidt, but do not normalize
}
}
if (tools.dot_product(this.matrix_after[k], this.matrix_after[k], this.dimensions) >= (this.delta - Math.pow(this.mu(k, k - 1), 2)) * tools.dot_product(this.matrix_after[k - 1], this.matrix_after[k - 1], this.dimensions)) {
if (k + 1 >= this.dimensions) { // invariant: there is some issue, something is wrong
flag = true; // invariant is broken
console.log("something bad happened ! (1)");
}
k++;
// console.log("if; k, j");
// console.log(k + ", " + j);
} else {
var temp_matrix = this.matrix_before[k];
this.matrix_before[k] = this.matrix_before[k - 1];
this.matrix_before[k - 1] = temp_matrix;
this.matrix_after = new gs(this.matrix_before, false).matrix; // update after vectors: perform Gram-Schmidt, but do not normalize
if (k === Math.max(k - 1, 1) || k >= this.dimensions || Math.max(k - 1, 1) >= this.dimensions) { // invariant: there is some issue, something is wrong
flag = true; // invariant is broken
console.log("something bad happened ! (2)");
}
k = Math.max(k - 1, 1);
// console.log("else; k, j");
// console.log(k + ", " + j);
}
console.log(this.matrix_before);
console.log("\n");
} // I added this flag variable to prevent getting exceptions and terminate the loop gracefully
console.log("final: ");
console.log(this.matrix_before);
}
// calculated mu as been mentioned on Wikipedia
// mu(i, j) = <b_i, b*_j> / <b*_j, b*_j>
mu(i, j) {
var top = tools.dot_product(this.matrix_before[i], this.matrix_after[j], this.dimensions);
var bottom = tools.dot_product(this.matrix_after[j], this.matrix_after[j], this.dimensions);
return top / bottom;
}
Here is the screenshot of the algorithm that is on Wikipedia:
Update #1: I added more comments to the code to clarify the question hoping that someone would help.
Just in case you are wondering about an already available implementation, you can type LatticeReduce[{{0,1},{2,0}}] into Wolfram Alpha to see how this code is supposed to behave.
Update #2: I cleaned up the code further and added a validate function to make sure the Gram-Schmidt code is working correctly, but the code still fails and the value of k exceeds the number of dimensions (or number of vectors), which doesn't make sense.
The algorithm description in Wikipedia uses rather odd notation -- the vectors are numbered 0..n (rather than, say, 0..n-1 or 1..n), so the total number of vectors is n+1.
The code you've posted here treats this.dimensions as if it corresponds to n in the Wikipedia description. Nothing wrong with that so far.
However, the constructor in the full source file on GitHub sets this.dimensions = matrix[0].length. Two things about this look wrong. The first is that surely matrix[0].length is more like m (the dimension of the space) than n (the number of vectors, minus 1 for unclear reasons). The second is that if it's meant to be n then you need to subtract 1 because the number of vectors is n+1, not n.
So if you want to use this.dimensions to mean n, I think you need to initialize it as matrix.length-1. With the square matrix in your test case, using matrix[0].length-1 would work, but I think the code will then break when you feed in a non-square matrix. The name dimensions is kinda misleading, too; maybe just n to match the Wikipedia description?
Or you could call it something like nVectors, let it equal matrix.length, and change the rest of the code appropriately, which just means an adjustment in the termination condition for the main loop.
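To make the indexing concrete, here is a small self-contained sketch of the loop with the vector count taken from matrix.length, so the main loop runs while k < nVectors. It is not the code from the repository: the helper names dot, gramSchmidt and lll are made up for this example, delta = 0.75 is an assumed default, and the input rows are assumed to be linearly independent.

```javascript
// Dot product of two vectors of equal length.
const dot = (u, v) => u.reduce((sum, x, i) => sum + x * v[i], 0);

// Gram-Schmidt orthogonalisation without normalisation.
function gramSchmidt(basis) {
  const ortho = [];
  for (const b of basis) {
    let v = b.slice();
    for (const o of ortho) {
      const coeff = dot(b, o) / dot(o, o);
      v = v.map((x, i) => x - coeff * o[i]);
    }
    ortho.push(v);
  }
  return ortho;
}

function lll(input, delta = 0.75) {
  const basis = input.map((row) => row.slice());
  const nVectors = basis.length; // number of vectors, not the space dimension
  let ortho = gramSchmidt(basis);
  const mu = (i, j) => dot(basis[i], ortho[j]) / dot(ortho[j], ortho[j]);
  let k = 1;
  while (k < nVectors) { // valid indices are 0 .. nVectors - 1
    for (let j = k - 1; j >= 0; j--) {
      const m = mu(k, j);
      if (Math.abs(m) > 0.5) {
        const q = Math.round(m);
        basis[k] = basis[k].map((x, i) => x - q * basis[j][i]);
        ortho = gramSchmidt(basis);
      }
    }
    const m = mu(k, k - 1);
    if (dot(ortho[k], ortho[k]) >= (delta - m * m) * dot(ortho[k - 1], ortho[k - 1])) {
      k++;
    } else {
      [basis[k], basis[k - 1]] = [basis[k - 1], basis[k]];
      ortho = gramSchmidt(basis);
      k = Math.max(k - 1, 1);
    }
  }
  return basis;
}

console.log(lll([[2, 0], [0, 1]]));       // square case
console.log(lll([[1, 0, 0], [0, 1, 0]])); // 2 vectors in 3-dimensional space
```

With this termination condition k never reaches an index that does not exist, and the non-square input shows why the vector count and the dimension of the space have to be kept apart.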
I want to know whether this is possible.
Suppose:
var a = 2592;
var b = 2584;
if(a nearly equal to b) {
// do something
}
Like so.
var diff = Math.abs( a - b );
if( diff > 50 ) {
console.log('diff greater than 50');
}
That would compare if the absolute difference is greater than 50 using Math.abs and simple comparison.
Here's the old school way to do it...
approxeq = function(v1, v2, epsilon) {
if (epsilon == null) {
epsilon = 0.001;
}
return Math.abs(v1 - v2) < epsilon;
};
so,
approxeq(5,5.000001)
is true, while
approxeq(5,5.1)
is false.
You can pass in epsilon explicitly to suit your needs. A tolerance of one thousandth usually covers my JavaScript roundoff issues.
var ratio = 0;
if ( a > b) {
ratio = b / a;
}
else {
ratio = a / b;
}
if (ratio > 0.90) {
//do something
}
A one-line ES6 version of The Software Barbarian's answer:
const approxeq = (v1, v2, epsilon = 0.001) => Math.abs(v1 - v2) <= epsilon;
console.log(approxeq(3.33333, 3.33322)); // true
console.log(approxeq(2.3, 2.33322)); // false
console.log(approxeq(3, 4, 1)); // true
I changed it to include the boundary of the margin, so with an epsilon of 1, approxeq between 1 and 2 is true.
Floating point comparison gets complicated in a hurry. It's not as simple as diff less than epsilon in a lot of cases.
Here's an article about the subject, though not javascript specific.
https://floating-point-gui.de/errors/comparison/
TLDR:
When one of the numbers being compared is very close to zero, subtracting the smaller from the larger can lose digits of precision, making the diff appear smaller than it is (or zero).
Very small numbers with different signs work in a weird way.
Dividing by zero will cause problems.
The article includes a function (in Java) which handles these cases better:
public static boolean nearlyEqual(float a, float b, float epsilon) {
final float absA = Math.abs(a);
final float absB = Math.abs(b);
final float diff = Math.abs(a - b);
if (a == b) { // shortcut, handles infinities
return true;
} else if (a == 0 || b == 0 || (absA + absB < Float.MIN_NORMAL)) {
// a or b is zero or both are extremely close to it
// relative error is less meaningful here
return diff < (epsilon * Float.MIN_NORMAL);
} else { // use relative error
return diff / Math.min((absA + absB), Float.MAX_VALUE) < epsilon;
}
}
Before you complain: Yes, that's Java, so you'd have to rewrite it in Javascript. It's just to illustrate the algorithm and it's just copied from the article.
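A possible JavaScript port of the function above is sketched here. JavaScript numbers are doubles rather than floats, so this sketch assumes that 2 ** -1022 (the smallest normal double) is the right stand-in for Float.MIN_NORMAL and that Number.MAX_VALUE replaces Float.MAX_VALUE. The default epsilon of 1e-9 is an arbitrary choice for illustration, not something taken from the article.

```javascript
// Smallest positive normal double; assumed analogue of Java's Float.MIN_NORMAL.
const MIN_NORMAL = 2 ** -1022;

function nearlyEqual(a, b, epsilon = 1e-9) {
  const absA = Math.abs(a);
  const absB = Math.abs(b);
  const diff = Math.abs(a - b);

  if (a === b) {
    // shortcut, handles infinities
    return true;
  } else if (a === 0 || b === 0 || absA + absB < MIN_NORMAL) {
    // a or b is zero or both are extremely close to it;
    // relative error is less meaningful here
    return diff < epsilon * MIN_NORMAL;
  } else {
    // use relative error
    return diff / Math.min(absA + absB, Number.MAX_VALUE) < epsilon;
  }
}

console.log(nearlyEqual(0.1 + 0.2, 0.3));     // true
console.log(nearlyEqual(1, 1.1));             // false
console.log(nearlyEqual(2592, 2584, 0.01));   // true: relative difference is about 0.0015
```

Because the comparison is relative, epsilon here means a fraction of the combined magnitude, not an absolute distance as in the approxeq answers.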
I'm still looking for a thorough solution to this problem, ideally with an NPM package so I don't have to figure this out again every time I need it.
Edit: I've found a package which implements the solution from the article linked above (which has the same link in their readme).
https://www.npmjs.com/package/@intocode-io/nearly-equal
This will be a less error-prone solution than others shown in other answers. There are several npm packages which implement the naive solutions which have error cases near zero as described above. Make sure you look at the source before you use them.
[5, 4, 4, 6].indexOfArray([4, 6]) // 2
['foo', 'bar', 'baz'].indexOfArray(['foo', 'baz']) // -1
I came up with this:
Array.prototype.indexOfArray = function(array) {
var m = array.length;
var found;
var index;
var prevIndex = 0;
while ((index = this.indexOf(array[0], prevIndex)) != -1) {
found = true;
for (var i = 1; i < m; i++) {
if (this[index + i] != array[i]) {
found = false;
}
}
if (found) {
return index;
}
prevIndex = index + 1
}
return index;
};
Later I found that Wikipedia calls this naïve string search:
In the normal case, we only have to look at one or two characters for each wrong position to see that it is a wrong position, so in the average case, this takes O(n + m) steps, where n is the length of the haystack and m is the length of the needle; but in the worst case, searching for a string like "aaaab" in a string like "aaaaaaaaab", it takes O(nm) steps.
Can someone write a faster indexOfArray method in JavaScript?
The algorithm you want is the KMP algorithm (http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm) used to find the starting index of a substring within a string -- you can do exactly the same thing for an array.
I couldn't find a javascript implementation, but here are implementations in other languages http://en.wikibooks.org/wiki/Algorithm_implementation/String_searching/Knuth-Morris-Pratt_pattern_matcher -- it shouldn't be hard to convert one to js.
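For reference, a KMP search adapted to arrays might look like the sketch below. It is written as a standalone function instead of an Array.prototype method, the helper name buildFailureTable is made up for this example, and it compares elements with === where the code in the question uses !=.

```javascript
// For each prefix of needle, the length of the longest proper prefix
// that is also a suffix of that prefix.
function buildFailureTable(needle) {
  const table = new Array(needle.length).fill(0);
  let len = 0;
  for (let i = 1; i < needle.length; i++) {
    while (len > 0 && needle[i] !== needle[len]) {
      len = table[len - 1];
    }
    if (needle[i] === needle[len]) {
      len++;
    }
    table[i] = len;
  }
  return table;
}

// Index of the first occurrence of needle in haystack, or -1.
// Runs in O(n + m) because the haystack index never moves backwards.
function indexOfArray(haystack, needle) {
  if (needle.length === 0) return 0;
  const table = buildFailureTable(needle);
  let matched = 0;
  for (let i = 0; i < haystack.length; i++) {
    while (matched > 0 && haystack[i] !== needle[matched]) {
      matched = table[matched - 1];
    }
    if (haystack[i] === needle[matched]) {
      matched++;
    }
    if (matched === needle.length) {
      return i - needle.length + 1;
    }
  }
  return -1;
}

console.log(indexOfArray([5, 4, 4, 6], [4, 6]));                    // 2
console.log(indexOfArray(['foo', 'bar', 'baz'], ['foo', 'baz']));   // -1
```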
FWIW: I found the article Efficient substring searching a good read. It discusses several variants of Boyer-Moore, although it's not in JavaScript. The Boyer-Moore-Horspool variant (Timo Raita's version; see the first link for a link) was going to be my "suggestion" for a potential practical speed gain (it does not reduce big-O though; big-O is an upper limit only!). Pay attention to the conclusion at the bottom of the article and the benchmarks above it.
I'm mainly trying to put up opposition for the Knuth-Morris-Pratt implementation ;-)
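As a rough illustration of the Horspool idea applied to arrays, here is a plain Boyer-Moore-Horspool sketch (not Raita's tuned variant, and not code from the article). The function name horspoolIndexOf is hypothetical, and the shift table is kept in a Map so that array elements of any primitive type can serve as keys.

```javascript
// Boyer-Moore-Horspool on arrays: compare from the right end of the window
// and skip ahead based on the haystack element under the window's last slot.
function horspoolIndexOf(haystack, needle) {
  const n = haystack.length;
  const m = needle.length;
  if (m === 0) return 0;

  // For every element of needle except the last, how far the window may
  // shift when that element is aligned with the window's last position.
  const shift = new Map();
  for (let i = 0; i < m - 1; i++) {
    shift.set(needle[i], m - 1 - i);
  }

  let pos = 0;
  while (pos <= n - m) {
    let j = m - 1;
    while (j >= 0 && haystack[pos + j] === needle[j]) {
      j--;
    }
    if (j < 0) return pos;
    const s = shift.get(haystack[pos + m - 1]);
    pos += s === undefined ? m : s; // unknown element: skip the whole window
  }
  return -1;
}

console.log(horspoolIndexOf([5, 4, 4, 6], [4, 6]));                  // 2
console.log(horspoolIndexOf(['foo', 'bar', 'baz'], ['foo', 'baz'])); // -1
```

The worst case is still O(nm), but on inputs with many distinct elements the window often jumps by the full needle length, which is where the practical speed gain comes from.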
When I write "Ł" > "Z" in JavaScript, it returns true. In Polish alphabetical order it should of course be false. How can I fix this? My site is using UTF-8.
You can use Intl.Collator, or String.prototype.localeCompare with a locale argument, both specified by the ECMAScript Internationalization API:
"Ł".localeCompare("Z", "pl"); // -1
new Intl.Collator("pl").compare("Ł","Z"); // -1
-1 means that Ł comes before Z, like you want.
Note it only works on latest browsers, though.
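To sort a whole array this way, the collator's compare function can be passed straight to sort. The word list below is made up for illustration, and the result assumes the runtime ships Polish collation data.

```javascript
const collator = new Intl.Collator('pl');
const words = ['Zebra', 'Łódź', 'Lublin', 'Adam'];

// Default sort compares UTF-16 code units, so Ł ends up after Z.
console.log([...words].sort());

// Locale-aware sort puts Ł between L and M.
const sorted = [...words].sort(collator.compare);
console.log(sorted);
```

Creating the collator once and reusing its compare function is usually cheaper than calling localeCompare with a locale for every pair.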
Here is an example for the french alphabet that could help you for a custom sort:
var alpha = function(alphabet, dir, caseSensitive){
return function(a, b){
var pos = 0,
min = Math.min(a.length, b.length);
dir = dir || 1;
caseSensitive = caseSensitive || false;
if(!caseSensitive){
a = a.toLowerCase();
b = b.toLowerCase();
}
while(a.charAt(pos) === b.charAt(pos) && pos < min){ pos++; }
return alphabet.indexOf(a.charAt(pos)) > alphabet.indexOf(b.charAt(pos)) ?
dir:-dir;
};
};
To use it on an array of strings a:
a.sort(
alpha('ABCDEFGHIJKLMNOPQRSTUVWXYZaàâäbcçdeéèêëfghiïîjklmnñoôöpqrstuûüvwxyÿz')
);
Add 1 or -1 as the second parameter of alpha() to sort ascending or descending.
Add true as the 3rd parameter to sort case sensitive.
You may need to add numbers and special chars to the alphabet list
You may be able to build your own sorting function using localeCompare() that - at least according to the MDC article on the topic - should sort things correctly.
If that doesn't work out, here is an interesting SO question where the OP employs string replacement to build a "brute-force" sorting mechanism.
Also in that question, the OP shows how to build a custom textExtract function for the jQuery tablesorter plugin that does locale-aware sorting - maybe also worth a look.
Edit: As a totally far-out idea - I have no idea whether this is feasible at all, especially because of performance concerns - if you are working with PHP/mySQL on the back-end anyway, I would like to mention the possibility of sending an Ajax query to a mySQL instance to have it sorted there. mySQL is great at sorting locale aware data, because you can force sorting operations into a specific collation using e.g. ORDER BY xyz COLLATE utf8_polish_ci, COLLATE utf8_german_ci.... those collations would take care of all sorting woes at once.
Mic's code improved for non-mentioned chars:
var alpha = function(alphabet, dir, caseSensitive){
dir = dir || 1;
function compareLetters(a, b) {
var ia = alphabet.indexOf(a);
var ib = alphabet.indexOf(b);
if(ia === -1 || ib === -1) {
if(ib !== -1)
return a > 'a';
if(ia !== -1)
return 'a' > b;
return a > b;
}
return ia > ib;
}
return function(a, b){
var pos = 0;
var min = Math.min(a.length, b.length);
caseSensitive = caseSensitive || false;
if(!caseSensitive){
a = a.toLowerCase();
b = b.toLowerCase();
}
while(a.charAt(pos) === b.charAt(pos) && pos < min){ pos++; }
return compareLetters(a.charAt(pos), b.charAt(pos)) ? dir:-dir;
};
};
function assert(bCondition, sErrorMessage) {
if (!bCondition) {
throw new Error(sErrorMessage);
}
}
assert(alpha("bac")("a", "b") === 1, "b is first than a");
assert(alpha("abc")("ac", "a") === 1, "shorter string is first than longer string");
assert(alpha("abc")("1abc", "0abc") === 1, "non-mentioned chars are compared as normal");
assert(alpha("abc")("0abc", "1abc") === -1, "non-mentioned chars are compared as normal [2]");
assert(alpha("abc")("0abc", "bbc") === -1, "non-mentioned chars are compared with mentioned chars in special way");
assert(alpha("abc")("zabc", "abc") === 1, "non-mentioned chars are compared with mentioned chars in special way [2]");
You have to keep two sort key strings. One is for the primary order, where German ä=a (primary a->a) and French é=e (primary sort key e->e), and one is for the secondary order, where ä comes after a (translating a->azzzz in the secondary key) or é comes after e (secondary key e->ezzzz). In Czech especially, some letters are variations of a letter (áéí…) whereas others stand in their own right in the alphabet (ABCČD…GHChI…RŘSŠT…). On top of that, there is the problem of treating digraphs as single letters (primary ch->hzzzz). This is not a trivial problem, and there should be a solution within JS.
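A very reduced sketch of the two-key idea follows. It swaps in a different technique for building the primary key: instead of a hand-written translation table it uses Unicode NFD normalisation and strips combining marks, and it uses the original string as the secondary key. The names primaryKey and twoLevelCompare are hypothetical, and digraphs such as the Czech ch are not handled at all.

```javascript
// Primary key: decompose accented letters and drop the combining marks,
// so that ä is keyed as a and é as e.
const primaryKey = (s) =>
  s.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();

// Plain code-unit comparison returning -1, 0 or 1.
const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);

// Compare by primary key first; only on a tie fall back to the original
// strings, which places the accented form after the plain one.
const twoLevelCompare = (a, b) =>
  cmp(primaryKey(a), primaryKey(b)) || cmp(a, b);

const words = ['été', 'etat', 'bär', 'bar', 'zoo'];
console.log([...words].sort(twoLevelCompare));
```

Letters that have no canonical decomposition, such as Ł, keep their own code point in the primary key, so a real solution would still need a per-language table or a collator.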
Funny, I had to think about this problem and ended up searching here, and then it came to mind that I can use my own JavaScript module. I wrote a module to generate clean URLs, and for that I have to transliterate the input string... (http://pid.github.io/speakingurl/)
var mySlug = require('speakingurl').createSlug({
maintainCase: true,
separator: " "
});
var input = "Schöner Titel läßt grüßen!? Bel été !";
var result;
result = mySlug(input);
console.log(result); // Output: "Schoener Titel laesst gruessen bel ete"
Now you can sort with these results. For example, you can store the original title in the field "title" and the value used for sorting, the result of mySlug, in the field "title_sort".