This question already has answers here:
Find difference between two strings in JavaScript
(6 answers)
Closed last year.
I want to create a function to compare two strings and return changes in string (like Stack Overflow shows changes in answer which was edited).
Expected results:
console.log(detectChange("SHANTI DEVI","SHANT DEVI")); // SHANT_I_ DEVI
console.log(detectChange("MOHAN SINGH","MOHAN SINGH")); // MOHAN SINGH
console.log(detectChange("SURESH SINGH","MOHAN SINGH")); // -MOHAN-_SURESH_ SINGH
console.log(detectChange("SEETA DEVI","SITA SINGH")); // S-I-_EE_TA -SINGH-_DEVI_
First parameter is the new value and second parameter is the old value.
A letter or word that was removed should be wrapped in "-".
A letter or word that was added should be wrapped in "_".
The below code was unsuccessful for me.
function detectChange(name1, name2) {
name1 = name1.split("");
name2 = name2.split("");
var visit = 0;
var final_name = [];
if (name2.length > name1.length) {
nameTmp = name1;
name1 = name2;
name2 = nameTmp;
}
for (i = 0; i <= name1.length; i++) {
if (name1[i] == name2[visit]) {
final_name.push(name1[i]);
visit++;
} else if (name1[i] !== null) {
final_name.push("_" + name1[i] + "_");
visit++;
}
}
return final_name.join("");
}
// Getting unexpected results
console.log(detectChange("SHANTI DEVI", "SHANT DEVI")); // SHANT_I__ __D__E__V__I_
console.log(detectChange("MOHAN SINGH", "MOHAN SINGH")); // MOHAN SINGH
console.log(detectChange("SURESH SINGH", "MOHAN SINGH")); // _S__U__R__E__S__H__ __S__I__N__G__H_
console.log(detectChange("SEETA DEVI", "SITA SINGH")); // S_E__E__T__A__ __D__E__V__I_
Apart from the case where both strings are identical, none of these outputs match the expected results. How can I fix this?
You probably want to read this paper, "An O(ND) Difference Algorithm and Its Variations". Here's the abstract:
Abstract
The problems of finding a longest common subsequence of two sequences A and B and a shortest edit script for transforming A into B have long been known to be dual problems. In this paper, they are shown to be equivalent to finding a shortest/longest path in an edit graph. Using this perspective, a simple O(ND) time and space algorithm is developed where N is the sum of the lengths of A and B and D is the size of the minimum edit script for A and B. The algorithm performs well when differences are small (sequences are similar) and is consequently fast in typical applications. The algorithm is shown to have O(N + D^2) expected-time performance under a basic stochastic model. A refinement of the algorithm requires only O(N) space, and the use of suffix trees leads to an O(N lg N + D^2) time variation.
And then implement the algorithm.
But no... you don't actually want to do that, because it's already been done for you: See the npm package diff.
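For a feel of what such a diff does, here is a minimal sketch of a character-level diff. It uses a plain longest-common-subsequence table (quadratic time), not the O(ND) algorithm from the paper and not the diff package's API. The function keeps the question's name detectChange and its argument order (new value first); everything else is an assumption for illustration. When several minimal diffs exist, the tie-breaking here may not reproduce the exact expected output for the harder examples in the question.

```javascript
// Sketch only: LCS-based character diff, not the O(ND) algorithm.
// Removed runs are wrapped in "-", added runs in "_".
function detectChange(newStr, oldStr) {
  const n = oldStr.length;
  const m = newStr.length;

  // lcs[i][j] = length of the LCS of oldStr.slice(i) and newStr.slice(j)
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = oldStr[i] === newStr[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start, grouping characters into runs.
  const runs = [];
  const push = (type, ch) => {
    const last = runs[runs.length - 1];
    if (last && last.type === type) last.text += ch;
    else runs.push({ type, text: ch });
  };

  let i = 0;
  let j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && oldStr[i] === newStr[j]) {
      push("same", oldStr[i]);
      i++;
      j++;
    } else if (i < n && (j === m || lcs[i + 1][j] >= lcs[i][j + 1])) {
      push("removed", oldStr[i]); // present only in the old value
      i++;
    } else {
      push("added", newStr[j]); // present only in the new value
      j++;
    }
  }

  return runs
    .map(r => (r.type === "removed" ? "-" + r.text + "-"
      : r.type === "added" ? "_" + r.text + "_"
      : r.text))
    .join("");
}

console.log(detectChange("SHANTI DEVI", "SHANT DEVI")); // SHANT_I_ DEVI
console.log(detectChange("MOHAN SINGH", "MOHAN SINGH")); // MOHAN SINGH
```

For real use, a maintained library is still the better choice, since it handles word-level diffs and large inputs.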
This question already has answers here:
Length of a JavaScript object
(43 answers)
Closed 2 years ago.
Lately I have been trying to create a webpage with a search feature. My way of implementing this, while not the fastest or most elegant, should work in theory. All it does is split the search term into a list, the delimiter being a space, and then splits the keywords (in dictionary format, with the value being a download link, and with the key being the "keywords" I was referring to) and finally, it has an outer loop looping through the keys (being split each iteration into a list), and an inner loop looping through the words input through the input field. If a word in the search field matches one keyword of the key words list, then that key from the dictionary gets a score of +1.
This should sort the keys into order of best result to worst, and then the code can continue on to process all this information and display links to the downloadable files (the point of the webpage is to supply downloads to old software [of which I have collected over the years] etc.). However, when I run the program, whenever the alert(ranking.length) function is called, all I get is undefined in the output window.
Here is the code. (The search() function is called whenever the search button is pressed):
var kw_href = {
"windows":["windows3.1.7z"],
"ms dos 6.22":["ms-dos 6.22.7z"]
}
function search(){
var element = document.getElementById("search_area");
var search_term = element.value.toLowerCase();
var s_tags = search_term.split(" ");
var keys = Object.keys(kw_href);
ranking = {
"windows":0,
"ms dos 6.22":0
};
for (i = 0; i < keys.length; i++){
keywords_arr = keys[i].split(" ");
for (x = 0; x < s_tags.length; x++){
if (keywords_arr.includes(s_tags[x])){
ranking[keys[i]] = ranking[keys[i]] + 1;
}
}
}
// now we have a results list with the best results. Lets sort them into order.
alert(ranking.length);
}
Edit
The alert(ranking.length) line is for debugging purposes only; I was not specifically trying to find the length.
ranking is a generic object, not an array, so it won't have a computed length property.
If you want to count the number of properties in it, get its keys as an array with Object.keys(ranking) and read that array's length.
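A minimal sketch of that idea, using the two keys from the question with made-up scores. The sorting step is an assumption about what the asker wants to do next, not part of the original code:

```javascript
// Scores as they might look after a search for "windows" (values are illustrative).
const ranking = {
  "windows": 1,
  "ms dos 6.22": 0
};

// A plain object has no length; count its own enumerable keys instead.
const count = Object.keys(ranking).length;

// Order the keys from best score to worst.
const sorted = Object.entries(ranking)
  .sort((a, b) => b[1] - a[1])
  .map(([key]) => key);

console.log(ranking.length); // undefined
console.log(count);          // 2
console.log(sorted);         // [ 'windows', 'ms dos 6.22' ]
```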
Alternatively, ranking could be an array of objects, for example ranking = [{"windows":0,"ms dos 6.22":0},{"windows":1,"ms dos 6.22":10}]
Then ranking.length will work.
I am currently studying another user’s code for a coding question from LeetCode. My question is about certain aspects of his code. Here’s a link to the question.
Question:
Why does this user use a # to mark the end of the array?
Under the second if case, the user writes:
ans.push(nums[t] + '->' + (nums[i-1]))
Now, I understand what this statement does. My question is: Why does this produce an output of ["0->2",...] instead of [0"->"2,...]?
var summaryRanges = function(nums) {
var t = 0
var ans = []
nums.push('#')
for(var i=1;i<nums.length;i++)
if(nums[i]-nums[t] !== i-t){
if(i-t>1)
ans.push(nums[t]+'->'+(nums[i-1]))
else
ans.push(nums[t].toString())
t = i
}
return ans
}
The algorithm relies on detecting when the difference between nums[i] and nums[t] is not the same as the difference between i and t. When that happens, a range has just ended and the algorithm adds it to the output. This creates a special case for the last range: no later element follows it, so nothing can trigger the condition.
Hence the hash character is padding that extends the array in order to make the algorithm work, so that the condition nums[i]-nums[t] !== i-t will trigger even for the final range. It could be any string really, as long as it does not convert to a number.
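Both behaviours can be checked in isolation. The sketch below is only an illustration with hand-picked values; the variable names are not from the original code:

```javascript
// Subtracting from a non-numeric string yields NaN, and NaN is never
// equal to anything, so the !== test is always true at the sentinel.
const sentinelDiff = '#' - 7;
const triggers = sentinelDiff !== 1;

// The + operator concatenates as soon as one operand is a string:
// 0 + '->' gives "0->", then "0->" + 2 gives "0->2".
const label = 0 + '->' + 2;

console.log(Number.isNaN(sentinelDiff)); // true
console.log(triggers);                   // true
console.log(label, typeof label);        // 0->2 string
```

This is why each pushed range is a single string such as "0->2" rather than a number followed by a string.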
I'm working on the MIU system problem from "Gödel, Escher, Bach" chapter 2.
One of the rules states
Rule III: If III occurs in one of the strings in your collection, you may make a new string with U in place of III.
Which means that the string MIII can become MU, but for other, longer strings there may be multiple possibilities [matches in brackets]:
MIIII could yield
M[III]I >> MUI
MI[III] >> MIU
MUIIIUIIIU could yield
MU[III]UIIIU >> MUUUIIIU
MUIIIU[III]U >> MUIIIUUU
MUIIIIU could yield
MU[III]IU >> MUUIU
MUI[III]U >> MUIUU
Clearly regular expressions such as /(.*)III(.*)/ are helpful, but I can't seem to get them to generate every possible match, just the first one it happens to find.
Is there a way to generate every possible match?
(Note, I can think of ways to do this entirely manually, but I am hoping there is a better way using the built in tools, regex or otherwise)
(Edited to clarify overlapping needs.)
Here's the regex you need: /III/g - simple enough, right? Now here's how you use it:
var text = "MUIIIUIIIU", find = "III", replace = "U",
regex = new RegExp(find,"g"), matches = [], match;
while(match = regex.exec(text)) {
matches.push(match);
regex.lastIndex = match.index+1;
}
That regex.lastIndex... line overrides the usual regex behaviour of not matching results that overlap. Also I'm using a RegExp constructor to make this more flexible. You could even build it into a function this way.
Now you have an array of match objects, you can do this:
matches.forEach(function(m) { // older browsers need a shim or old-fashioned for loop
console.log(text.substr(0,m.index)+replace+text.substr(m.index+find.length));
});
EDIT: Here is a JSFiddle demonstrating the above code.
Sometimes regexes are overkill. In your case a simple indexOf might be fine too!
Here is, admittedly, a hack, but you can transform it into pretty, reusable code on your own:
var s = "MIIIIIUIUIIIUUIIUIIIIIU";
var results = [];
for (var i = 0; true; i += 1) {
i = s.indexOf("III", i);
if (i === -1) {
break;
}
results.push(i);
}
console.log("Match positions: " + JSON.stringify(results));
It takes care of overlaps just fine, and at least to me, the indexOf just looks simpler.
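As a sketch of how the indexOf loop could be turned into reusable code, the function below returns every string Rule III can produce from its input. The name applyRuleIII is made up for this example:

```javascript
// Returns every result of replacing one occurrence of "III" with "U",
// including occurrences that overlap each other.
function applyRuleIII(s) {
  const results = [];
  let i = s.indexOf("III");
  while (i !== -1) {
    results.push(s.slice(0, i) + "U" + s.slice(i + 3));
    i = s.indexOf("III", i + 1); // step by one so overlapping matches are found
  }
  return results;
}

console.log(applyRuleIII("MIIII"));   // [ 'MUI', 'MIU' ]
console.log(applyRuleIII("MUIIIIU")); // [ 'MUUIU', 'MUIUU' ]
```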
I want to find the number of tabs at the beginning of a string (and of course I want it to be fast running code ;) ). This is my idea, but not sure if this is the best/fastest choice:
//The regular expression
var findBegTabs = /(^\t+)/g;
//This string has 3 tabs and 2 spaces: "<tab><tab><space>something<space><tab>"
var str = "\t\t something \t";
//Look for the tabs at the beginning
var match = findBegTabs.exec( str );
//We found...
var numOfTabs = ( match ) ? match[ 0 ].length : 0;
Another possibility is to use a loop and charAt:
//This string has 3 tabs and 2 spaces: "<tab><tab><space>something<space><tab>"
var str = "\t\t something \t";
var numOfTabs = 0;
var start = 0;
//Loop and count number of tabs at beg
while ( str.charAt( start++ ) == "\t" ) numOfTabs++;
In general, if you can calculate the data by simply iterating through the string and checking the character at each index, this will be faster than a regular expression, which must build up a more complex search engine. I encourage you to profile this, but I think you'll find the straight search is faster.
Note: Your search should use === instead of == here as you don't need to introduce conversions in the equality check.
function numberOfTabs(text) {
var count = 0;
var index = 0;
while (text.charAt(index++) === "\t") {
count++;
}
return count;
}
Try using a profiler (such as jsPerf or one of the many available backend profilers) to create and run benchmarks on your target systems (the browsers and/or interpreters you plan to support for your software).
It's useful to reason about which solution will perform best based on your expected data and target system(s); however, you may sometimes be surprised by which solution actually performs fastest, especially with regard to big-oh analysis and typical data sets.
In your specific case, iterating over characters in the string will likely be faster than regular expression operations.
One-liner (if you find smallest is best):
"\t\tsomething".split(/[^\t]/)[0].length;
i.e. splitting by all non-tab characters, then fetching the first element and obtaining its length.
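Before benchmarking, it helps to confirm that the approaches in this thread agree on edge cases. The sketch below wraps each one in a function (the names are made up here) and runs them on a few sample strings. It says nothing about speed, which still needs profiling on your target engines:

```javascript
// Three ways to count leading tabs; function names are illustrative.
function tabsByRegex(text) {
  const match = /^\t+/.exec(text); // no "g" flag, so no lastIndex state to reset
  return match ? match[0].length : 0;
}

function tabsByLoop(text) {
  let count = 0;
  while (text.charAt(count) === "\t") count++;
  return count;
}

function tabsBySplit(text) {
  return text.split(/[^\t]/)[0].length;
}

const samples = ["\t\t something \t", "something", "", "\t\t\t"];
const results = samples.map(s => [tabsByRegex(s), tabsByLoop(s), tabsBySplit(s)]);
console.log(results); // [ [ 2, 2, 2 ], [ 0, 0, 0 ], [ 0, 0, 0 ], [ 3, 3, 3 ] ]
```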
I have a group of strings in Javascript and I need to write a function that detects if another specific string belongs to this group or not.
What is the fastest way to achieve this? Is it alright to put the group of values into an array, and then write a function that searches through the array?
I think if I keep the values sorted and do a binary search, it should work fast enough. Or is there some other smart way of doing this, which can work faster?
Use a hash table, and do this:
// Initialise the set
mySet = {};
// Add to the set
mySet["some string value"] = true;
...
// Test if a value is in the set:
if (testValue in mySet) {
alert(testValue + " is in the set");
} else {
alert(testValue + " is not in the set");
}
You can use an object like so:
// prepare a mock-up object
setOfValues = {};
for (var i = 0; i < 100; i++)
setOfValues["example value " + i] = true;
// check for existence
if (setOfValues["example value 99"]); // true
if (setOfValues["example value 101"]); // undefined, essentially: false
This takes advantage of the fact that objects are implemented as associative arrays. How fast that is depends on your data and the JavaScript engine implementation, but you can do some performance testing easily to compare against other variants of doing it.
If a value can occur more than once in your set and the "how often" is important to you, you can also use an incrementing number in place of the boolean I used for my example.
A comment to the above mentioned hash solutions.
Actually the {} creates an object (also mentioned above) which can lead to some side-effects.
One of them is that your "hash" is already pre-populated with the default object methods.
So "toString" in setOfValues will be true (at least in Firefox).
You can prepend another character e.g. "." to your strings to work around this problem or use the Hash object provided by the "prototype" library.
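Two other common workarounds, not mentioned above, are to create the object without a prototype or to test only own properties. A short sketch, with variable names chosen for this example:

```javascript
// An ordinary object inherits from Object.prototype.
const plainSet = {};
plainSet["some string value"] = true;

// An object created with a null prototype inherits nothing.
const bareSet = Object.create(null);
bareSet["some string value"] = true;

const inheritedHit = "toString" in plainSet; // true: found on the prototype
const bareHit = "toString" in bareSet;       // false: no prototype to search

// Alternative: keep the ordinary object but check own properties only.
const ownHit = Object.prototype.hasOwnProperty.call(plainSet, "toString"); // false

console.log(inheritedHit, bareHit, ownHit); // true false false
```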
Stumbled across this and realized the answers are out of date. In this day and age, you should not be implementing sets using hashtables except in corner cases. You should use sets.
For example:
> let set = new Set();
> set.add('red')
> set.has('red')
true
> set.delete('red')
true
> set.has('red')
false
Refer to this SO post for more examples and discussion: Ways to create a Set in JavaScript?
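For the original question, where the group of strings is known up front, a Set can be built directly from an array. A small sketch with sample values:

```javascript
// Build the set once from an array of strings (sample data).
const days = new Set(["monday", "tuesday", "wednesday"]);

const hasTuesday = days.has("tuesday");   // true
const hasFriday = days.has("friday");     // false
const hasToString = days.has("toString"); // false: no inherited keys to worry about

console.log(days.size, hasTuesday, hasFriday, hasToString); // 3 true false false
```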
A possible way, particularly efficient if the set is immutable, but is still usable with a variable set:
var haystack = "monday tuesday wednesday thursday friday saturday sunday";
var needle = "Friday";
if (haystack.indexOf(needle.toLowerCase()) >= 0) alert("Found!");
Of course, you might need to change the separator depending on the strings you have to put there...
A more robust variant can include bounds to ensure neither "day wed" nor "day" can match positively:
var haystack = "!monday!tuesday!wednesday!thursday!friday!saturday!sunday!";
var needle = "Friday";
if (haystack.indexOf('!' + needle.toLowerCase() + '!') >= 0) alert("Found!");
This might not be needed if the input is trusted (e.g. it comes from a database).
I used that in a Greasemonkey script, with the advantage of using the haystack directly out of GM's storage.
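The difference between the two variants shows up with partial words. A short sketch comparing them; the helper names are made up for this example:

```javascript
const plain = "monday tuesday wednesday";
const delimited = "!monday!tuesday!wednesday!";

// Plain substring search: matches fragments and text spanning two entries.
function inPlain(needle) {
  return plain.indexOf(needle.toLowerCase()) >= 0;
}

// Delimited search: the needle must match a whole entry.
function inDelimited(needle) {
  return delimited.indexOf("!" + needle.toLowerCase() + "!") >= 0;
}

console.log(inPlain("day"), inDelimited("day"));         // true false
console.log(inPlain("day wed"), inDelimited("day wed")); // true false
console.log(inPlain("Tuesday"), inDelimited("Tuesday")); // true true
```

The delimiter must be a character that can never appear inside an entry.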
Using a hash table might be a quicker option.
Whatever option you go for, it's definitely worth testing its performance against the alternatives you consider.
Depends on how many values there are.
If there are a few values (less than 10 to 50), searching through the array may be ok. A hash table might be overkill.
If you have lots of values, a hash table is the best option. It requires less work than sorting the values and doing a binary search.
I know this is an old post, but to detect whether a value is in a set of values we can use the array indexOf() method, which searches for the value and returns its index (or -1 if it is absent).
var myString="this is my large string set";
var myStr=myString.split(' ');
console.log('myStr contains "my" = '+ (myStr.indexOf('my')>=0));
console.log('myStr contains "your" = '+ (myStr.indexOf('your')>=0));
console.log('integer example : [1, 2, 5, 3] contains 5 = '+ ([1, 2, 5, 3].indexOf(5)>=0));
You can use the ES6 includes method (shown here on a string, where it tests for a substring).
var string = "The quick brown fox jumps over the lazy dog.",
substring = "lazy dog";
console.log(string.includes(substring));