Find the longest common starting substring in a set of strings [closed] - javascript

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
This is a challenge to come up with the most elegant JavaScript, Ruby or other solution to a relatively trivial problem.
This problem is a more specific case of the Longest common substring problem. I need to only find the longest common starting substring in an array. This greatly simplifies the problem.
For example, the longest substring in [interspecies, interstelar, interstate] is "inters". However, I don't need to find "ific" in [specifics, terrific].
I've solved the problem by quickly coding up a solution in JavaScript as a part of my answer about shell-like tab-completion (test page here). Here is that solution, slightly tweaked:
function common_substring(data) {
var i, ch, memo, idx = 0
do {
memo = null
for (i=0; i < data.length; i++) {
ch = data[i].charAt(idx)
if (!ch) break
if (!memo) memo = ch
else if (ch != memo) break
}
} while (i == data.length && idx < data.length && ++idx)
return (data[0] || '').slice(0, idx)
}
This code is available in this Gist along with a similar solution in Ruby. You can clone the gist as a git repo to try it out:
$ git clone git://gist.github.com/257891.git substring-challenge
I'm not very happy with those solutions. I have a feeling they might be solved with more elegance and less execution complexity—that's why I'm posting this challenge.
I'm going to accept as an answer the solution I find the most elegant or concise. Here is for instance a crazy Ruby hack I come up with—defining the & operator on String:
# works with Ruby 1.8.7 and above
class String
def &(other)
difference = other.to_str.each_char.with_index.find { |ch, idx|
self[idx].nil? or ch != self[idx].chr
}
difference ? self[0, difference.last] : self
end
end
class Array
def common_substring
self.inject(nil) { |memo, str| memo.nil? ? str : memo & str }.to_s
end
end
Solutions in JavaScript or Ruby are preferred, but you can show off clever solution in other languages as long as you explain what's going on. Only code from standard library please.
Update: my favorite solutions
I've chosen the JavaScript sorting solution by kennebec as the "answer" because it struck me as both unexpected and genius. If we disregard the complexity of actual sorting (let's imagine it's infinitely optimized by the language implementation), the complexity of the solution is just comparing two strings.
Other great solutions:
"regex greed" by FM takes a minute or two to grasp, but then the elegance of it hits you. Yehuda Katz also made a regex solution, but it's more complex
commonprefix in Python — Roberto Bonvallet used a feature made for handling filesystem paths to solve this problem
Haskell one-liner is short as if it were compressed, and beautiful
the straightforward Ruby one-liner
Thanks for participating! As you can see from the comments, I learned a lot (even about Ruby).

It's a matter of taste, but this is a simple javascript version:
It sorts the array, and then looks just at the first and last items.
//longest common starting substring in an array
function sharedStart(array){
var A= array.concat().sort(),
a1= A[0], a2= A[A.length-1], L= a1.length, i= 0;
while(i<L && a1.charAt(i)=== a2.charAt(i)) i++;
return a1.substring(0, i);
}
DEMOS
sharedStart(['interspecies', 'interstelar', 'interstate']) //=> 'inters'
sharedStart(['throne', 'throne']) //=> 'throne'
sharedStart(['throne', 'dungeon']) //=> ''
sharedStart(['cheese']) //=> 'cheese'
sharedStart([]) //=> ''
sharedStart(['prefix', 'suffix']) //=> ''

In Python:
>>> from os.path import commonprefix
>>> commonprefix('interspecies interstelar interstate'.split())
'inters'

Ruby one-liner:
l=strings.inject{|l,s| l=l.chop while l!=s[0...l.length];l}

My Haskell one-liner:
import Data.List
commonPre :: [String] -> String
commonPre = map head . takeWhile (\(x:xs)-> all (==x) xs) . transpose
EDIT: barkmadley gave a good explanation of the code below. I'd also add that haskell uses lazy evaluation, so we can be lazy about our use of transpose; it will only transpose our lists as far as necessary to find the end of the common prefix.

You just need to traverse all strings until they differ, then take the substring up to this point.
Pseudocode:
loop for i upfrom 0
while all strings[i] are equal
finally return substring[0..i]
Common Lisp:
(defun longest-common-starting-substring (&rest strings)
(loop for i from 0 below (apply #'min (mapcar #'length strings))
while (apply #'char=
(mapcar (lambda (string) (aref string i))
strings))
finally (return (subseq (first strings) 0 i))))

Yet another way to do it: use regex greed.
words = %w(interspecies interstelar interstate)
j = '='
str = ['', *words].join(j)
re = "[^#{j}]*"
str =~ /\A
(?: #{j} ( #{re} ) #{re} )
(?: #{j} \1 #{re} )*
\z/x
p $1
And the one-liner, courtesy of mislav (50 characters):
p ARGV.join(' ').match(/^(\w*)\w*(?: \1\w*)*$/)[1]

In Python I wouldn't use anything but the existing commonprefix function I showed in another answer, but I couldn't help to reinvent the wheel :P. This is my iterator-based approach:
>>> a = 'interspecies interstelar interstate'.split()
>>>
>>> from itertools import takewhile, chain, izip as zip, imap as map
>>> ''.join(chain(*takewhile(lambda s: len(s) == 1, map(set, zip(*a)))))
'inters'
Edit: Explanation of how this works.
zip generates tuples of elements taking one of each item of a at a time:
In [6]: list(zip(*a)) # here I use list() to expand the iterator
Out[6]:
[('i', 'i', 'i'),
('n', 'n', 'n'),
('t', 't', 't'),
('e', 'e', 'e'),
('r', 'r', 'r'),
('s', 's', 's'),
('p', 't', 't'),
('e', 'e', 'a'),
('c', 'l', 't'),
('i', 'a', 'e')]
By mapping set over these items, I get a series of unique letters:
In [7]: list(map(set, _)) # _ means the result of the last statement above
Out[7]:
[set(['i']),
set(['n']),
set(['t']),
set(['e']),
set(['r']),
set(['s']),
set(['p', 't']),
set(['a', 'e']),
set(['c', 'l', 't']),
set(['a', 'e', 'i'])]
takewhile(predicate, items) takes elements from this while the predicate is True; in this particular case, when the sets have one element, i.e. all the words have the same letter at that position:
In [8]: list(takewhile(lambda s: len(s) == 1, _))
Out[8]:
[set(['i']),
set(['n']),
set(['t']),
set(['e']),
set(['r']),
set(['s'])]
At this point we have an iterable of sets, each containing one letter of the prefix we were looking for. To construct the string, we chain them into a single iterable, from which we get the letters to join into the final string.
The magic of using iterators is that all items are generated on demand, so when takewhile stops asking for items, the zipping stops at that point and no unnecessary work is done. Each function call in my one-liner has a implicit for and an implicit break.

This is probably not the most concise solution (depends if you already have a library for this), but one elegant method is to use a trie. I use tries for implementing tab completion in my Scheme interpreter:
http://github.com/jcoglan/heist/blob/master/lib/trie.rb
For example:
tree = Trie.new
%w[interspecies interstelar interstate].each { |s| tree[s] = true }
tree.longest_prefix('')
#=> "inters"
I also use them for matching channel names with wildcards for the Bayeux protocol; see these:
http://github.com/jcoglan/faye/blob/master/client/channel.js
http://github.com/jcoglan/faye/blob/master/lib/faye/channel.rb

Just for the fun of it, here's a version written in (SWI-)PROLOG:
common_pre([[C|Cs]|Ss], [C|Res]) :-
maplist(head_tail(C), [[C|Cs]|Ss], RemSs), !,
common_pre(RemSs, Res).
common_pre(_, []).
head_tail(H, [H|T], T).
Running:
?- S=["interspecies", "interstelar", "interstate"], common_pre(S, CP), string_to_list(CPString, CP).
Gives:
CP = [105, 110, 116, 101, 114, 115],
CPString = "inters".
Explanation:
(SWI-)PROLOG treats strings as lists of character codes (numbers). All the predicate common_pre/2 does is recursively pattern-match to select the first code (C) from the head of the first list (string, [C|Cs]) in the list of all lists (all strings, [[C|Cs]|Ss]), and appends the matching code C to the result iff it is common to all (remaining) heads of all lists (strings), else it terminates.
Nice, clean, simple and efficient... :)

A javascript version based on #Svante's algorithm:
function commonSubstring(words){
var iChar, iWord,
refWord = words[0],
lRefWord = refWord.length,
lWords = words.length;
for (iChar = 0; iChar < lRefWord; iChar += 1) {
for (iWord = 1; iWord < lWords; iWord += 1) {
if (refWord[iChar] !== words[iWord][iChar]) {
return refWord.substring(0, iChar);
}
}
}
return refWord;
}

Combining answers by kennebec, Florian F and jberryman yields the following Haskell one-liner:
commonPrefix l = map fst . takeWhile (uncurry (==)) $ zip (minimum l) (maximum l)
With Control.Arrow one can get a point-free form:
commonPrefix = map fst . takeWhile (uncurry (==)) . uncurry zip . (minimum &&& maximum)

Python 2.6 (r26:66714, Oct 4 2008, 02:48:43)
>>> a = ['interspecies', 'interstelar', 'interstate']
>>> print a[0][:max(
[i for i in range(min(map(len, a)))
if len(set(map(lambda e: e[i], a))) == 1]
) + 1]
inters
i for i in range(min(map(len, a))), number of maximum lookups can't be greater than the length of the shortest string; in this example this would evaluate to [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
len(set(map(lambda e: e[i], a))), 1) create an array of the i-thcharacter for each string in the list; 2) make a set out of it; 3) determine the size of the set
[i for i in range(min(map(len, a))) if len(set(map(lambda e: e[i], a))) == 1], include just the characters, for which the size of the set is 1 (all characters at that position were the same ..); here it would evaluate to [0, 1, 2, 3, 4, 5]
finally take the max, add one, and get the substring ...
Note: the above does not work for a = ['intersyate', 'intersxate', 'interstate', 'intersrate'], but this would:
>>> index = len(
filter(lambda l: l[0] == l[1],
[ x for x in enumerate(
[i for i in range(min(map(len, a)))
if len(set(map(lambda e: e[i], a))) == 1]
)]))
>>> a[0][:index]
inters

Doesn't seem that complicated if you're not too concerned about ultimate performance:
def common_substring(data)
data.inject { |m, s| s[0,(0..m.length).find { |i| m[i] != s[i] }.to_i] }
end
One of the useful features of inject is the ability to pre-seed with the first element of the array being interated over. This avoids the nil memo check.
puts common_substring(%w[ interspecies interstelar interstate ]).inspect
# => "inters"
puts common_substring(%w[ feet feel feeble ]).inspect
# => "fee"
puts common_substring(%w[ fine firkin fail ]).inspect
# => "f"
puts common_substring(%w[ alpha bravo charlie ]).inspect
# => ""
puts common_substring(%w[ fork ]).inspect
# => "fork"
puts common_substring(%w[ fork forks ]).inspect
# => "fork"
Update: If golf is the game here, then 67 characters:
def f(d)d.inject{|m,s|s[0,(0..m.size).find{|i|m[i]!=s[i]}.to_i]}end

This one is very similar to Roberto Bonvallet's solution, except in ruby.
chars = %w[interspecies interstelar interstate].map {|w| w.split('') }
chars[0].zip(*chars[1..-1]).map { |c| c.uniq }.take_while { |c| c.size == 1 }.join
The first line replaces each word with an array of chars. Next, I use zip to create this data structure:
[["i", "i", "i"], ["n", "n", "n"], ["t", "t", "t"], ...
map and uniq reduce this to [["i"],["n"],["t"], ...
take_while pulls the chars off the array until it finds one where the size isn't one (meaning not all chars were the same). Finally, I join them back together.

The accepted solution is broken (for example, it returns a for strings like ['a', 'ba']). The fix is very simple, you literally have to change only 3 characters (from indexOf(tem1) == -1 to indexOf(tem1) != 0) and the function would work as expected.
Unfortunately, when I tried to edit the answer to fix the typo, SO told me that "edits must be at least 6 characters". I could change more then those 3 chars, by improving naming and readability but that feels like a little bit too much.
So, below is a fixed and improved (at least from my point of view) version of the kennebec's solution:
function commonPrefix(words) {
max_word = words.reduce(function(a, b) { return a > b ? a : b });
prefix = words.reduce(function(a, b) { return a > b ? b : a }); // min word
while(max_word.indexOf(prefix) != 0) {
prefix = prefix.slice(0, -1);
}
return prefix;
}
(on jsFiddle)
Note, that it uses reduce method (JavaScript 1.8) in order to find alphanumeric max / min instead of sorting the array and then fetching the first and the last elements of it.

While reading these answers with all the fancy functional programming, sorting and regexes and whatnot, I just thought: what's wrong a little bit of C? So here's a goofy looking little program.
#include <stdio.h>
int main (int argc, char *argv[])
{
int i = -1, j, c;
if (argc < 2)
return 1;
while (c = argv[1][++i])
for (j = 2; j < argc; j++)
if (argv[j][i] != c)
goto out;
out:
printf("Longest common prefix: %.*s\n", i, argv[1]);
}
Compile it, run it with your list of strings as command line arguments, then upvote me for using goto!

Here's a solution using regular expressions in Ruby:
def build_regex(string)
arr = []
arr << string.dup while string.chop!
Regexp.new("^(#{arr.join("|")})")
end
def substring(first, *strings)
strings.inject(first) do |accum, string|
build_regex(accum).match(string)[0]
end
end

I would do the following:
Take the first string of the array as the initial starting substring.
Take the next string of the array and compare the characters until the end of one of the strings is reached or a mismatch is found. If a mismatch is found, reduce starting substring to the length where the mismatch was found.
Repeat step 2 until all strings have been tested.
Here’s a JavaScript implementation:
var array = ["interspecies", "interstelar", "interstate"],
prefix = array[0],
len = prefix.length;
for (i=1; i<array.length; i++) {
for (j=0, len=Math.min(len,array[j].length); j<len; j++) {
if (prefix[j] != array[i][j]) {
len = j;
prefix = prefix.substr(0, len);
break;
}
}
}

Instead of sorting, you could just get the min and max of the strings.
To me, elegance in a computer program is a balance of speed and simplicity.
It should not do unnecessary computation, and it should be simple enough to make its correctness evident.
I could call the sorting solution "clever", but not "elegant".

Oftentimes it's more elegant to use a mature open source library instead of rolling your own. Then, if it doesn't completely suit your needs, you can extend it or modify it to improve it, and let the community decide if that belongs in the library.
diff-lcs is a good Ruby gem for least common substring.

My solution in Java:
public static String compute(Collection<String> strings) {
if(strings.isEmpty()) return "";
Set<Character> v = new HashSet<Character>();
int i = 0;
try {
while(true) {
for(String s : strings) v.add(s.charAt(i));
if(v.size() > 1) break;
v.clear();
i++;
}
} catch(StringIndexOutOfBoundsException ex) {}
return strings.iterator().next().substring(0, i);
}

Golfed JS solution just for fun:
w=["hello", "hell", "helen"];
c=w.reduce(function(p,c){
for(r="",i=0;p[i]==c[i];r+=p[i],i++){}
return r;
});

Here's an efficient solution in ruby. I based the idea of the strategy for a hi/lo guessing game where you iteratively zero in on the longest prefix.
Someone correct me if I'm wrong, but I think the complexity is O(n log n), where n is the length of the shortest string and the number of strings is considered a constant.
def common(strings)
lo = 0
hi = strings.map(&:length).min - 1
return '' if hi < lo
guess, last_guess = lo, hi
while guess != last_guess
last_guess = guess
guess = lo + ((hi - lo) / 2.0).ceil
if strings.map { |s| s[0..guess] }.uniq.length == 1
lo = guess
else
hi = guess
end
end
strings.map { |s| s[0...guess] }.uniq.length == 1 ? strings.first[0...guess] : ''
end
And some checks that it works:
>> common %w{ interspecies interstelar interstate }
=> "inters"
>> common %w{ dog dalmation }
=> "d"
>> common %w{ asdf qwerty }
=> ""
>> common ['', 'asdf']
=> ""

Fun alternative Ruby solution:
def common_prefix(*strings)
chars = strings.map(&:chars)
length = chars.first.zip( *chars[1..-1] ).index{ |a| a.uniq.length>1 }
strings.first[0,length]
end
p common_prefix( 'foon', 'foost', 'forlorn' ) #=> "fo"
p common_prefix( 'foost', 'foobar', 'foon' ) #=> "foo"
p common_prefix( 'a','b' ) #=> ""
It might help speed if you used chars = strings.sort_by(&:length).map(&:chars), since the shorter the first string, the shorter the arrays created by zip. However, if you cared about speed, you probably shouldn't use this solution anyhow. :)

Javascript clone of AShelly's excellent answer.
Requires Array#reduce which is supported only in firefox.
var strings = ["interspecies", "intermediate", "interrogation"]
var sub = strings.reduce(function(l,r) {
while(l!=r.slice(0,l.length)) {
l = l.slice(0, -1);
}
return l;
});

This is by no means elegant, but if you want concise:
Ruby, 71 chars
def f(a)b=a[0];b[0,(0..b.size).find{|n|a.any?{|i|i[0,n]!=b[0,n]}}-1]end
If you want that unrolled it looks like this:
def f(words)
first_word = words[0];
first_word[0, (0..(first_word.size)).find { |num_chars|
words.any? { |word| word[0, num_chars] != first_word[0, num_chars] }
} - 1]
end

It's not code golf, but you asked for somewhat elegant, and I tend to think recursion is fun. Java.
/** Recursively find the common prefix. */
public String findCommonPrefix(String[] strings) {
int minLength = findMinLength(strings);
if (isFirstCharacterSame(strings)) {
return strings[0].charAt(0) + findCommonPrefix(removeFirstCharacter(strings));
} else {
return "";
}
}
/** Get the minimum length of a string in strings[]. */
private int findMinLength(final String[] strings) {
int length = strings[0].size();
for (String string : strings) {
if (string.size() < length) {
length = string.size();
}
}
return length;
}
/** Compare the first character of all strings. */
private boolean isFirstCharacterSame(String[] strings) {
char c = string[0].charAt(0);
for (String string : strings) {
if (c != string.charAt(0)) return false;
}
return true;
}
/** Remove the first character of each string in the array,
and return a new array with the results. */
private String[] removeFirstCharacter(String[] source) {
String[] result = new String[source.length];
for (int i=0; i<result.length; i++) {
result[i] = source[i].substring(1);
}
return result;
}

A ruby version based on #Svante's algorithm. Runs ~3x as fast as my first one.
def common_prefix set
i=0
rest=set[1..-1]
set[0].each_byte{|c|
rest.each{|e|return set[0][0...i] if e[i]!=c}
i+=1
}
set
end

My Javascript solution:
IMOP, using sort is too tricky.
My solution is compare letter by letter through looping the array.
Return string if letter is not macthed.
This is my solution:
var longestCommonPrefix = function(strs){
if(strs.length < 1){
return '';
}
var p = 0, i = 0, c = strs[0][0];
while(p < strs[i].length && strs[i][p] === c){
i++;
if(i === strs.length){
i = 0;
p++;
c = strs[0][p];
}
}
return strs[0].substr(0, p);
};

Realizing the risk of this turning into a match of code golf (or is that the intention?), here's my solution using sed, copied from my answer to another SO question and shortened to 36 chars (30 of which are the actual sed expression). It expects the strings (each on a seperate line) to be supplied on standard input or in files passed as additional arguments.
sed 'N;s/^\(.*\).*\n\1.*$/\1\n\1/;D'
A script with sed in the shebang line weighs in at 45 chars:
#!/bin/sed -f
N;s/^\(.*\).*\n\1.*$/\1\n\1/;D
A test run of the script (named longestprefix), with strings supplied as a "here document":
$ ./longestprefix <<EOF
> interspecies
> interstelar
> interstate
> EOF
inters
$

Related

Check length between two characters in a string

So I've been working on this coding challenge for about a day now and I still feel like I haven't scratched the surface even though it's suppose to be Easy. The problem asks us to take a string parameter and if there are exactly 3 characters (not including spaces) in between the letters 'a' and 'b', it should be true.
Example: Input: "maple bread"; Output: false // Because there are > 3 places
Input: "age bad"; Output: true // Exactly three places in between 'a' and 'b'
Here is what I've written, although it is unfinished and most likely in the wrong direction:
function challengeOne(str) {
let places = 0;
for (let i=0; i < str.length; i++) {
if (str[i] != 'a') {
places++
} else if (str[i] === 'b'){
}
}
console.log(places)
}
So my idea was to start counting places after the letter 'a' until it gets to 'b', then it would return the amount of places. I would then start another flow where if 'places' > 3, return false or if 'places' === 3, then return true.
However, attempting the first flow only returns the total count for places that aren't 'a'. I'm using console.log instead of return to test if it works or not.
I'm only looking for a push in the right direction and if there is a method I might be missing or if there are other examples similar to this. I feel like the solution is pretty simple yet I can't seem to grasp it.
Edit:
I took a break from this challenge just so I could look at it from fresh eyes and I was able to solve it quickly! I looked through everyone's suggestions and applied it until I found the solution. Here is the new code that worked:
function challengeOne(str) {
// code goes here
str = str.replace(/ /g, '')
let count = Math.abs(str.lastIndexOf('a')-str.lastIndexOf('b'));
if (count === 3) {
return true
} else return false
}
Thank you for all your input!
Here's a more efficient approach - simply find the indexes of the letter a and b and check whether the absolute value of subtracting the two is 4 (since indexes are 0 indexed):
function challengeOne(str) {
return Math.abs(str.indexOf("a") - str.indexOf("b")) == 4;
}
console.log(challengeOne("age bad"));
console.log(challengeOne("maple bread"));
if there are exactly 3 characters (not including spaces)
Simply remove all spaces via String#replace, then perform the check:
function challengeOne(str) {
return str = str.replace(/ /g, ''), Math.abs(str.indexOf("a") - str.indexOf("b")) == 4;
}
console.log(challengeOne("age bad"));
console.log(challengeOne("maple bread"));
References:
Math#abs
String#indexOf
Here is another approach: This one excludes spaces as in the OP, so the output reflects that. If it is to include spaces, that line could be removed.
function challengeOne(str) {
//strip spaces
str = str.replace(/\s/g, '');
//take just the in between chars
let extract = str.match(/a(.*)b/).pop();
return extract.length == 3
}
console.log(challengeOne('maple bread'));
console.log(challengeOne('age bad'));
You can go recursive:
Check if the string starts with 'a' and ends with 'b' and check the length
Continue by cutting the string either left or right (or both) until there are 3 characters in between or the string is empty.
Examples:
maple bread
aple brea
aple bre
aple br
aple b
ple
le
FALSE
age bad
age ba
age b
TRUE
const check = (x, y, z) => str => {
const exec = s => {
const xb = s.startsWith(x);
const yb = s.endsWith(y);
return ( !s ? false
: xb && yb && s.length === z + 2 ? true
: xb && yb ? exec(s.slice(1, -1))
: xb ? exec(s.slice(0, -1))
: exec(s.slice(1)));
};
return exec(str);
}
const challenge = check('a', 'b', 3);
console.log(`
challenge("maple bread"): ${challenge("maple bread")}
challenge("age bad"): ${challenge("age bad")}
challenge("aabab"): ${challenge("aabab")}
`)
I assume spaces are counted and your examples seem to indicate this, although your question says otherwise. If so, here's a push that should be helpful. You're right, there are JavaScript methods for strings, including one that should help you find the index (location) of the a and b within the given string.
Try here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#instance_methods

How to return all combinations of N coin flips using recursion?

Request:
Using JavaScript, write a function that takes an integer. The integer represents the number of times a coin is flipped. Using recursive strategies only, return an array containing all possible combinations of coin flips. Use "H" to represent heads and "T" to represent tails. The order of combinations does not matter.
For example, passing in "2" would return:
["HH", "HT", "TH", "TT"]
Context:
I am relatively new to JavaScript as well as the concept of recursion. This is purely for practice and understanding, so the solution does not necessarily need to match the direction of my code below; any useful methods or other ways of thinking through this are helpful, as long as it's purely recursive (no loops).
Attempt:
My attempt at this started out simple however the "action" progressively got more convoluted as I increased the input. I believe this works for inputs of 2, 3, and 4. However, inputs of 5 or higher are missing combinations in the output. Many thanks in advance!
function coinFlips(num){
const arr = [];
let str = "";
// adds base str ("H" * num)
function loadStr(n) {
if (n === 0) {
arr.push(str);
return traverseArr();
}
str += "H";
loadStr(n - 1);
}
// declares start point, end point, and index to update within each str
let start = 0;
let end = 1;
let i = 0;
function traverseArr() {
// base case
if(i === str.length) {
console.log(arr);
return arr;
}
// updates i in base str to "T"
// increments i
// resets start and end
if(end === str.length) {
str = str.split('');
str[i] = "T";
str = str.join('');
i++;
start = i;
end = i + 1;
return traverseArr();
}
// action
let tempStr = str.split('');
tempStr[start] = "T";
tempStr = tempStr.join('');
if(!arr.includes(tempStr)){
arr.push(tempStr);
};
tempStr = tempStr.split('');
tempStr.reverse();
tempStr = tempStr.join('');
if(!arr.includes(tempStr)){
arr.push(tempStr);
};
tempStr = str.split('');
tempStr[end] = "T";
tempStr = tempStr.join('');
if(!arr.includes(tempStr)){
arr.push(tempStr);
};
tempStr = tempStr.split('');
tempStr.reverse();
tempStr = tempStr.join('');
if(!arr.includes(tempStr)){
arr.push(tempStr);
};
tempStr = str.split('');
tempStr[start] = "T";
tempStr[end] = "T";
tempStr = tempStr.join('');
if(!arr.includes(tempStr)){
arr.push(tempStr);
};
tempStr = tempStr.split('');
tempStr.reverse();
tempStr = tempStr.join('');
if(!arr.includes(tempStr)){
arr.push(tempStr);
};
// recursive case
start++;
end++;
return traverseArr();
}
loadStr(num);
}
coinFlips(5);
Below is a long description about how to create such recursive functions. I think the steps described help solve a great number of problems. They are not a panacea, but they can be quite useful. But first, here's what we'll work toward:
const getFlips = (n) =>
n <= 0
? ['']
: getFlips (n - 1) .flatMap (r => [r + 'H', r + 'T'])
Determining our algorithm
To solve a problem like this recursively, we need to answer several questions:
What value are we recurring on?
For simple recursions, it's often a single numeric parameter. In all cases there must be a way to demonstrate that we are making progress toward some final state.
This is a simple case, and it should be pretty obvious that we want to recur on the number of flips; let's call it n.
When does our recursion end?
We need to stop recurring eventually. Here we might consider stopping when n is 0 or possibly when n is 1. Either choice could work. Let's hold off on this decision for a moment to see which might be simpler.
How do we convert the answer from one step into the answer for the next?
For recursion to do anything useful, the important part is calculating the result of our next step based on the current one.
(Again, there are possible complexities here for more involved recursions. We might for instance have to use all the lower results to calculate the next value. For an example look up the Catalan Numbers. Here we can ignore that; our recursion is simple.)
So how do we convert, say ['HH', 'HT', 'TH', 'TT'] into the next step, ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']? Well if we look at the next result closely, we can see that the in first half all elements begin with 'H' and in the second one they begin with 'T'. If we ignore the first letters, each half is a copy of our input, ['HH', 'HT', 'TH', 'TT']. That looks very promising! So our recursive step can be to make two copies of the previous result, the first one with each value preceded by 'H', the second one by 'T'.
What is the value for our base case?
This is tied to the question we skipped. We can't say what it ends on without also knowing when it ends. But a good way to make the determination for both is to work backward.
To go backward from ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT'] to ['HH', 'HT', 'TH', 'TT'], we can take the first half and remove the initial 'H' from each result. Let's do it again. From ['HH', 'HT', 'TH', 'TT'], we take the first half and remove the initial 'H' from each to get ['H', 'T']. While that might be our stopping point, what happens if we take it one step further? Taking the first half and removing the initial H from the one remaining element leaves us just ['']. Does this answer make sense? I'd argue that it does: How many ways are there to flip the coin zero times? Just one. How would we record it as a string of Hs and Ts? As the empty string. So an array containing just the empty string is a great answer for the case of 0. That also answers our second question, about when the recursion ends. It ends when n is zero.
Writing code for that algorithm
Of course now we have to turn that algorithm into code. We can do this in a few steps as well.
Declaring our function
We write this by starting with a function definition. Our parameter is called n. I'm going to call the function getFlips. So we start with
const getFlips = (n) =>
<something here>
Adding our base case.
We've already said that we're going to end when n is zero. I usually prefer to make that a little more resilient by checking for any n that is less than or equal to zero. This will stop an infinite recursion if someone passes a negative number. We could instead choose to throw an exception in this case, but our explanation of [''] for the case of zero seems to hold as well for the negative values. (Besides, I absolutely hate throwing exceptions!)
That gives us the following:
const getFlips = (n) =>
n <= 0
? ['']
: <something here>
I choose here to use the conditional (ternary) expression instead of if-else statements because I prefer working with expressions over statements as much as possible. This same technique can easily be written with if-else instead if that feels more natural to you.
Handling the recursive case
Our description was to "make two copies of the previous result, the first one with each value preceded by 'H', the second one by 'T'." Our previous result is of course getFlips (n - 1). If we want to precede each value in that array with 'H', we're best using .map. We can to id like this: getFlips (n - 1) .map (r => 'H' + r). And of course the second half is just getFlips (n - 1) .map (r => 'T' + r). If we want to combine two arrays into one, there are many techniques, including .push and .concat. But the modern solution would probably be to use spread parameters and just return [...first, ...second].
Putting that all together, we get this snippet:
const getFlips = (n) =>
n <= 0
? ['']
: [...getFlips (n - 1) .map (r => 'H' + r), ...getFlips (n - 1) .map (r => 'T' + r)]
console .log (getFlips (3))
Examining the results
We can test this on a few cases. But we should be fairly convinced by the code. It seems to work, it's relatively simple, there are no obvious edge cases missing. But I still see a problem. We're calculating getFlips (n - 1) twice, for no good reason. In a recursive situation that it usually quite problematic.
There are several obvious fixes for this. First would be to give up my fascination with expression-based programming and simply use if-else logic with a local variable:
Replace conditional operator with if-else statements
const getFlips = (n) => {
if (n <= 0) {
return ['']
} else {
const prev = getFlips (n - 1)
return [...prev .map (r => 'H' + r), ...prev .map (r => 'T' + r)]
}
}
(Technically, the else isn't necessary, and some linters would complain about it. I think the code reads better with it included.)
Calculate a default parameter to use as a local variable
Another would be to use a parameter default value in the earlier definition.
const getFlips = (n, prev = n > 0 && getFlips (n - 1)) =>
n <= 0
? ['']
: [...prev .map (r => 'H' + r), ...prev .map (r => 'T' + r)]
This might rightly be viewed as over-tricky, and it can cause problems when your function is used in unexpected circumstances. Don't pass this to an array's map call, for instance.
Rethink the recursive step
Either of the above would work. But there is a better solution.
We can also write much the same code with a different approach to the recursive step if we can see another way of turning ['HH', 'HT', 'TH', 'TT'] into ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']. Our technique was to split the array down the middle and remove the first letters. But there are other copies of that base version in the version of the array without one of their letters. If we remove the last letters from each, we get ['HH', 'HH', 'HT', 'HT', 'TH', 'TH', 'TT', 'TT'], which is just our original version with each string appearing twice.
The first code that comes to mind to implement this is simply getFlips (n - 1) .map (r => [r + 'H', r + 'T']). But this would be subtly off, as it would convert ['HH', 'HT', 'TH', ' TT'] into [["HHH", "HHT"], ["HTH", "HTT"], ["THH", "THT"], [" TTH", " TTT"]], with an extra level of nesting, and applied recursively would just yield nonsense. But there is an alternative to .map that removes that extra level of nesting, .flatMap.
And that leads us to a solution I'm very happy with:
const getFlips = (n) =>
n <= 0
? ['']
: getFlips (n - 1) .flatMap (r => [r + 'H', r + 'T'])
console .log (getFlips (3))
function getFlips(n) {
// Helper recursive function
function addFlips(n, result, current) {
if (n === 1) {
// This is the last flip, so add the result to the array
result.push(current + 'H');
result.push(current + 'T');
} else {
// Let's say current is TTH (next combos are TTHH and TTHT)
// Then for each of the 2 combos call add Flips again to get the next flips.
addFlips(n - 1, result, current + 'H');
addFlips(n - 1, result, current + 'T');
}
}
// Begin with empty results
let result = [];
// Current starts with empty string
addFlips(n, result, '');
return result;
}
In case this is of any interest, here's a solution that doesn't use recursion as such but make use of the Applicative type.
Except when n is 1, the list of all possible combinations is obtained by combining all possible outcomes of each coin flip:
22 → [H, T] × [H, T] → [HH, HT, TH, TT]
23 → [H, T] × [H, T] × [H, T] → [HHH, HHT, HTH, HTT, THH, THT, TTH, TTT]
...
A function that can take n characters and concat them can be written as such:
const concat = (...n) => n.join('');
concat('H', 'H'); //=> 'HH'
concat('H', 'H', 'T'); //=> 'HHT'
concat('H', 'H', 'T', 'H'); //=> 'HHTH'
//...
A function that produces a list of outcomes for n coin flips can be written as such:
const outcomes = n => Array(n).fill(['H', 'T']);
outcomes(2); //=> [['H', 'T'], ['H', 'T']]
outcomes(3); //=> [['H', 'T'], ['H', 'T'], ['H', 'T']]
// ...
We can now sort of see a solution here: to get the list of all possible combinations, we need to apply concat across all lists.
However we don't want to do that. Instead we want to make concat work with containers of values instead of individual values.
So that:
concat(['H', 'T'], ['H', 'T'], ['H', 'T']);
Produces the same result as:
[ concat('H', 'H', 'H')
, concat('H', 'H', 'T')
, concat('H', 'T', 'H')
, concat('H', 'T', 'T')
, concat('T', 'H', 'H')
, concat('T', 'H', 'T')
, concat('T', 'T', 'H')
, concat('T', 'T', 'T')
]
In functional programming we say that we want to lift concat. In this example I'll be using Ramda's liftN function.
const flip = n => {
const concat = liftN(n, (...x) => x.join(''));
return concat(...Array(n).fill(['H', 'T']));
};
console.log(flip(1));
console.log(flip(2));
console.log(flip(3));
console.log(flip(4));
<script src="https://cdnjs.cloudflare.com/ajax/libs/ramda/0.27.1/ramda.min.js"></script>
<script>const {liftN} = R;</script>

Sort array by specific array objects column [duplicate]

I have a list of objects I wish to sort based on a field attr of type string. I tried using -
list.sort(function (a, b) {
return a.attr - b.attr
})
but found that - doesn't appear to work with strings in JavaScript. How can I sort a list of objects based on an attribute with type string?
Use String.prototype.localeCompare as per your example:
list.sort(function (a, b) {
return ('' + a.attr).localeCompare(b.attr);
})
We force a.attr to be a string to avoid exceptions. localeCompare has been supported since Internet Explorer 6 and Firefox 1. You may also see the following code used that doesn't respect a locale:
if (item1.attr < item2.attr)
return -1;
if ( item1.attr > item2.attr)
return 1;
return 0;
An updated answer (October 2014)
I was really annoyed about this string natural sorting order so I took quite some time to investigate this issue.
Long story short
localeCompare() character support is badass, just use it.
As pointed out by Shog9, the answer to your question is:
return item1.attr.localeCompare(item2.attr);
Bugs found in all the custom JavaScript "natural string sort order" implementations
There are quite a bunch of custom implementations out there, trying to do string comparison more precisely called "natural string sort order"
When "playing" with these implementations, I always noticed some strange "natural sorting order" choice, or rather mistakes (or omissions in the best cases).
Typically, special characters (space, dash, ampersand, brackets, and so on) are not processed correctly.
You will then find them appearing mixed up in different places, typically that could be:
some will be between the uppercase 'Z' and the lowercase 'a'
some will be between the '9' and the uppercase 'A'
some will be after lowercase 'z'
When one would have expected special characters to all be "grouped" together in one place, except for the space special character maybe (which would always be the first character). That is, either all before numbers, or all between numbers and letters (lowercase & uppercase being "together" one after another), or all after letters.
My conclusion is that they all fail to provide a consistent order when I start adding barely unusual characters (i.e., characters with diacritics or characters such as dash, exclamation mark and so on).
Research on the custom implementations:
Natural Compare Lite https://github.com/litejs/natural-compare-lite : Fails at sorting consistently https://github.com/litejs/natural-compare-lite/issues/1 and http://jsbin.com/bevututodavi/1/edit?js,console, basic Latin characters sorting http://jsbin.com/bevututodavi/5/edit?js,console
Natural Sort https://github.com/javve/natural-sort : Fails at sorting consistently, see issue https://github.com/javve/natural-sort/issues/7 and see basic Latin characters sorting http://jsbin.com/cipimosedoqe/3/edit?js,console
JavaScript Natural Sort https://github.com/overset/javascript-natural-sort: seems rather neglected since February 2012, Fails at sorting consistently, see issue https://github.com/overset/javascript-natural-sort/issues/16
Alphanum http://www.davekoelle.com/files/alphanum.js , Fails at sorting consistently, see http://jsbin.com/tuminoxifuyo/1/edit?js,console
Browsers' native "natural string sort order" implementations via localeCompare()
localeCompare() oldest implementation (without the locales and options arguments) is supported by Internet Explorer 6 and later, see http://msdn.microsoft.com/en-us/library/ie/s4esdbwz(v=vs.94).aspx (scroll down to localeCompare() method).
The built-in localeCompare() method does a much better job at sorting, even international & special characters.
The only problem using the localeCompare() method is that "the locale and sort order used are entirely implementation dependent". In other words, when using localeCompare such as stringOne.localeCompare(stringTwo): Firefox, Safari, Chrome, and Internet Explorer have a different sort order for Strings.
Research on the browser-native implementations:
http://jsbin.com/beboroyifomu/1/edit?js,console - basic Latin characters comparison with localeCompare()
http://jsbin.com/viyucavudela/2/ - basic Latin characters comparison with localeCompare() for testing on Internet Explorer 8
http://jsbin.com/beboroyifomu/2/edit?js,console - basic Latin characters in string comparison : consistency check in string vs when a character is alone
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare - Internet Explorer 11 and later supports the new locales & options arguments
Difficulty of "string natural sorting order"
Implementing a solid algorithm (meaning: consistent but also covering a wide range of characters) is a very tough task. UTF-8 contains more than 2000 characters and covers more than 120 scripts (languages).
Finally, there are some specification for this tasks, it is called the "Unicode Collation Algorithm", which can be found at http://www.unicode.org/reports/tr10/. You can find more information about this on this question I posted https://softwareengineering.stackexchange.com/questions/257286/is-there-any-language-agnostic-specification-for-string-natural-sorting-order
Final conclusion
So considering the current level of support provided by the JavaScript custom implementations I came across, we will probably never see anything getting any close to supporting all this characters and scripts (languages). Hence I would rather use the browsers' native localeCompare() method. Yes, it does have the downside of being non-consistent across browsers but basic testing shows it covers a much wider range of characters, allowing solid & meaningful sort orders.
So as pointed out by Shog9, the answer to your question is:
return item1.attr.localeCompare(item2.attr);
Further reading:
https://softwareengineering.stackexchange.com/questions/257286/is-there-any-language-agnostic-specification-for-string-natural-sorting-order
How to sort strings in JavaScript
Natural sort of alphanumerical strings in JavaScript
Sort Array of numeric & alphabetical elements (Natural Sort)
Sort mixed alpha/numeric array
https://web.archive.org/web/20130929122019/http://my.opera.com/GreyWyvern/blog/show.dml/1671288
https://web.archive.org/web/20131005224909/http://www.davekoelle.com/alphanum.html
http://snipplr.com/view/36012/javascript-natural-sort/
http://blog.codinghorror.com/sorting-for-humans-natural-sort-order/
Thanks to Shog9's nice answer, which put me in the "right" direction I believe.
Answer (in Modern ECMAScript)
list.sort((a, b) => (a.attr > b.attr) - (a.attr < b.attr))
Or
list.sort((a, b) => +(a.attr > b.attr) || -(a.attr < b.attr))
Description
Casting a boolean value to a number yields the following:
true -> 1
false -> 0
Consider three possible patterns:
x is larger than y: (x > y) - (y < x) -> 1 - 0 -> 1
x is equal to y: (x > y) - (y < x) -> 0 - 0 -> 0
x is smaller than y: (x > y) - (y < x) -> 0 - 1 -> -1
(Alternative)
x is larger than y: +(x > y) || -(x < y) -> 1 || 0 -> 1
x is equal to y: +(x > y) || -(x < y) -> 0 || 0 -> 0
x is smaller than y: +(x > y) || -(x < y) -> 0 || -1 -> -1
So these logics are equivalent to typical sort comparator functions.
if (x == y) {
return 0;
}
return x > y ? 1 : -1;
Since strings can be compared directly in JavaScript, this will do the job:
list.sort(function (a, b) {
return a.attr < b.attr ? -1: 1;
})
This is a little bit more efficient than using
return a.attr > b.attr ? 1: -1;
because in case of elements with same attr (a.attr == b.attr), the sort function will swap the two for no reason.
For example
var so1 = function (a, b) { return a.atr > b.atr ? 1: -1; };
var so2 = function (a, b) { return a.atr < b.atr ? -1: 1; }; // Better
var m1 = [ { atr: 40, s: "FIRST" }, { atr: 100, s: "LAST" }, { atr: 40, s: "SECOND" } ].sort (so1);
var m2 = [ { atr: 40, s: "FIRST" }, { atr: 100, s: "LAST" }, { atr: 40, s: "SECOND" } ].sort (so2);
// m1 sorted but ...: 40 SECOND 40 FIRST 100 LAST
// m2 more efficient: 40 FIRST 40 SECOND 100 LAST
You should use > or < and == here. So the solution would be:
list.sort(function(item1, item2) {
var val1 = item1.attr,
val2 = item2.attr;
if (val1 == val2) return 0;
if (val1 > val2) return 1;
if (val1 < val2) return -1;
});
Nested ternary arrow function
(a,b) => (a < b ? -1 : a > b ? 1 : 0)
I had been bothered about this for long, so I finally researched this and give you this long winded reason for why things are the way they are.
From the spec:
Section 11.9.4 The Strict Equals Operator ( === )
The production EqualityExpression : EqualityExpression === RelationalExpression
is evaluated as follows:
- Let lref be the result of evaluating EqualityExpression.
- Let lval be GetValue(lref).
- Let rref be the result of evaluating RelationalExpression.
- Let rval be GetValue(rref).
- Return the result of performing the strict equality comparison
rval === lval. (See 11.9.6)
So now we go to 11.9.6
11.9.6 The Strict Equality Comparison Algorithm
The comparison x === y, where x and y are values, produces true or false.
Such a comparison is performed as follows:
- If Type(x) is different from Type(y), return false.
- If Type(x) is Undefined, return true.
- If Type(x) is Null, return true.
- If Type(x) is Number, then
...
- If Type(x) is String, then return true if x and y are exactly the
same sequence of characters (same length and same characters in
corresponding positions); otherwise, return false.
That's it. The triple equals operator applied to strings returns true iff the arguments are exactly the same strings (same length and same characters in corresponding positions).
So === will work in the cases when we're trying to compare strings which might have arrived from different sources, but which we know will eventually have the same values - a common enough scenario for inline strings in our code. For example, if we have a variable named connection_state, and we wish to know which one of the following states ['connecting', 'connected', 'disconnecting', 'disconnected'] is it in right now, we can directly use the ===.
But there's more. Just above 11.9.4, there is a short note:
NOTE 4
Comparison of Strings uses a simple equality test on sequences of code
unit values. There is no attempt to use the more complex, semantically oriented
definitions of character or string equality and collating order defined in the
Unicode specification. Therefore Strings values that are canonically equal
according to the Unicode standard could test as unequal. In effect this
algorithm assumes that both Strings are already in normalized form.
Hmm. What now? Externally obtained strings can, and most likely will, be weird unicodey, and our gentle === won't do them justice. In comes localeCompare to the rescue:
15.5.4.9 String.prototype.localeCompare (that)
...
The actual return values are implementation-defined to permit implementers
to encode additional information in the value, but the function is required
to define a total ordering on all Strings and to return 0 when comparing
Strings that are considered canonically equivalent by the Unicode standard.
We can go home now.
tl;dr;
To compare strings in javascript, use localeCompare; if you know that the strings have no non-ASCII components because they are, for example, internal program constants, then === also works.
An explanation of why the approach in the question doesn't work:
let products = [
{ name: "laptop", price: 800 },
{ name: "phone", price:200},
{ name: "tv", price: 1200}
];
products.sort( (a, b) => {
{let value= a.name - b.name; console.log(value); return value}
});
> 2 NaN
Subtraction between strings returns NaN.
Echoing Alejadro's answer, the right approach is:
products.sort( (a,b) => a.name > b.name ? 1 : -1 )
A typescript sorting method modifier using a custom function to return a sorted string in either ascending or descending order
const data = ["jane", "mike", "salome", "ababus", "buisa", "dennis"];
const sortStringArray = (stringArray: string[], mode?: 'desc' | 'asc') => {
if (!mode || mode === 'asc') {
return stringArray.sort((a, b) => a.localeCompare(b))
}
return stringArray.sort((a, b) => b.localeCompare(a))
}
console.log(sortStringArray(data, 'desc'));// [ 'salome', 'mike', 'jane', 'dennis', 'buisa', 'ababus' ]
console.log(sortStringArray(data, 'asc')); // [ 'ababus', 'buisa', 'dennis', 'jane', 'mike', 'salome' ]
There should be ascending and descending orders functions
if (order === 'asc') {
return a.localeCompare(b);
}
return b.localeCompare(a);
If you want to control locales (or case or accent), then use Intl.collator:
const collator = new Intl.Collator();
list.sort((a, b) => collator.compare(a.attr, b.attr));
You can construct a collator like:
new Intl.Collator("en");
new Intl.Collator("en", {sensitivity: "case"});
...
See the above link for documentation.
Note: unlike some other solutions, it handles null, undefined the JavaScript way, i.e., moves them to the end.
Use sort() straightforward without any - or <
const areas = ['hill', 'beach', 'desert', 'mountain']
console.log(areas.sort())
// To print in descending way
console.log(areas.sort().reverse())
In your operation in your initial question, you are performing the following operation:
item1.attr - item2.attr
So, assuming those are numbers (i.e. item1.attr = "1", item2.attr = "2") You still may use the "===" operator (or other strict evaluators) provided that you ensure type. The following should work:
return parseInt(item1.attr) - parseInt(item2.attr);
If they are alphaNumeric, then do use localCompare().
list.sort(function(item1, item2){
return +(item1.attr > item2.attr) || +(item1.attr === item2.attr) - 1;
})
How they work samples:
+('aaa'>'bbb')||+('aaa'==='bbb')-1
+(false)||+(false)-1
0||0-1
-1
+('bbb'>'aaa')||+('bbb'==='aaa')-1
+(true)||+(false)-1
1||0-1
1
+('aaa'>'aaa')||+('aaa'==='aaa')-1
+(false)||+(true)-1
0||1-1
0
var str = ['v','a','da','c','k','l']
var b = str.join('').split('').sort().reverse().join('')
console.log(b)
<!doctype html>
<html>
<body>
<p id = "myString">zyxtspqnmdba</p>
<p id = "orderedString"></p>
<script>
var myString = document.getElementById("myString").innerHTML;
orderString(myString);
function orderString(str) {
var i = 0;
var myArray = str.split("");
while (i < str.length){
var j = i + 1;
while (j < str.length) {
if (myArray[j] < myArray[i]){
var temp = myArray[i];
myArray[i] = myArray[j];
myArray[j] = temp;
}
j++;
}
i++;
}
var newString = myArray.join("");
document.getElementById("orderedString").innerHTML = newString;
}
</script>
</body>
</html>

Punch/Combine multiple strings into a single (shortest possible) string that includes all the chars of each strings in forward direction

My purpose is to punch multiple strings into a single (shortest) string that will contain all the character of each string in a forward direction. The question is not specific to any language, but more into the algorithm part. (probably will implement it in a node server, so tagging nodejs/javascript).
So, to explain the problem:
Let's consider I have few strings
["jack", "apple", "maven", "hold", "solid", "mark", "moon", "poor", "spark", "live"]
The Resultant string should be something like:
"sjmachppoalidveonrk"
jack: sjmachppoalidveonrk
apple: sjmachppoalidveonrk
solid: sjmachppoalidveonrk
====================================>>>> all in the forward direction
These all are manual evaluation and the output may not 100% perfect in the example.
So, the point is all the letters of each string have to exist in the output in
FORWARD DIRECTION (here the actual problem belongs), and possibly the server will send the final strings and numbers like 27594 will be generated and passed to extract the token, in the required end. If I have to punch it in a minimal possible string it would have much easier (That case only unique chars are enough). But in this case there are some points:
Letters can be present multiple time, though I have to reuse any
letter if possible, eg: for solid and hold o > l > d can be
reused as forward direction but for apple (a > p) and spark
(p > a) we have to repeat a as in one case it appears before p
for apple, and after p for sparks so either we need to repeat
a or p. Even, we cannot do p > a > p as it will not cover both the case
because we need two p after a for apple
We directly have no option to place a single p and use the same
index twice in a time of extract, we need multiple p with no option
left as the input string contains that
I am (not) sure, that there is multiple outputs possible for a set of
strings. but the concern is it should be minimal in length,
the combination doesn't matter if its cover all the tokens in a forward direction. all (or one ) outputs of minimal possible length
need to trace.
Adding this point as an EDIT to this post. After reading the comments and knowing that it's already an existing
problem is known as shortest common supersequence problem we can
define that the resultant string will be the shortest possible
string from which we can re generate any input string by simply
removing some (0 to N) chars, this is same as all inputs can be found in a forward direction in the resultant string.
I have tried, by starting with an arbitrary string, and then made an analysis of next string and splitting all the letters, and place them accordingly, but after some times, it seems that current string letters can be placed in a better way, If the last string's (or a previous string's) letters were placed according to the current string. But again that string was analysed and placed based on something (multiple) what was processed, and placing something in the favor of something that is not processed seems difficult because to that we need to process that. Or might me maintaining a tree of all processed/unprocessed tree will help, building the building the final string? Any better way than it, it seems a brute force?
Note: I know there are a lot of other transformation possible, please try not to suggest anything else to use, we are doing a bit research on it.
I came up with a somewhat brute force method. This way finds the optimal way to combine 2 words then does it for each element in the array.
This strategy works by trying finding the best possible way to combine 2 words together. It is considered the best by having the fewest letters. Each word is fed into an ever growing "merged" word. Each time a new word is added the existing word is searched for a matching character which exists in the word to be merged. Once one is found both are split into 2 sets and attempted to be joined (using the rules at hand, no need 2 add if letter already exists ect..). The strategy generally yields good results.
The join_word method takes 2 words you wish to join, the first parameter is considered to be the word you wish to place the other into. It then searches for the best way to split into and word into 2 separate parts to merge together, it does this by looking for any shared common characters. This is where the splits_on_letter method comes in.
The splits_on_letter method takes a word and a letter which you wish to split on, then returns a 2d array of all the possible left and right sides of splitting on that character. For example splits_on_letter('boom', 'o') would return [["b","oom"],["bo","om"],["boo","m"]], this is all the combinations of how we could use the letter o as a split point.
The sort() at the beginning is to attempt to place like elements together. The order in which you merge the elements generally effects the results length. One approach I tried was to sort them based upon how many common letters they used (with their peers), however the results were varying. However in all my tests I had maybe 5 or 6 different word sets to test with, its possible with a larger, more varying word arrays you might find different results.
Output is
spmjhooarckpplivden
var words = ["jack", "apple", "maven", "hold", "solid", "mark", "moon", "poor", "spark", "live"];
var result = minify_words(words);
document.write(result);
function minify_words(words) {
// Theres a good sorting method somewhere which can place this in an optimal order for combining them,
// hoever after quite a few attempts i couldnt get better than just a regular sort... so just use that
words = words.sort();
/*
Joins 2 words together ensuring each word has all its letters in the result left to right
*/
function join_word(into, word) {
var best = null;
// straight brute force each word down. Try to run a split on each letter and
for(var i=0;i<word.length;i++) {
var letter = word[i];
// split our 2 words into 2 segments on that pivot letter
var intoPartsArr = splits_on_letter(into, letter);
var wordPartsArr = splits_on_letter(word, letter);
for(var p1=0;p1<intoPartsArr.length;p1++) {
for(var p2=0;p2<wordPartsArr.length;p2++) {
var intoParts = intoPartsArr[p1], wordParts = wordPartsArr[p2];
// merge left and right and push them together
var result = add_letters(intoParts[0], wordParts[0]) + add_letters(intoParts[1], wordParts[1]);
if(!best || result.length <= best.length) {
best = result;
}
}
}
}
// its possible that there is no best, just tack the words together at that point
return best || (into + word);
}
/*
Splits a word at the index of the provided letter
*/
function splits_on_letter(word, letter) {
var ix, result = [], offset = 0;;
while((ix = word.indexOf(letter, offset)) !== -1) {
result.push([word.substring(0, ix), word.substring(ix, word.length)]);
offset = ix+1;
}
result.push([word.substring(0, offset), word.substring(offset, word.length)]);
return result;
}
/*
Adds letters to the word given our set of rules. Adds them starting left to right, will only add if the letter isnt found
*/
function add_letters(word, addl) {
var rIx = 0;
for (var i = 0; i < addl.length; i++) {
var foundIndex = word.indexOf(addl[i], rIx);
if (foundIndex == -1) {
word = word.substring(0, rIx) + addl[i] + word.substring(rIx, word.length);
rIx += addl[i].length;
} else {
rIx = foundIndex + addl[i].length;
}
}
return word;
}
// For each of our words, merge them together
var joinedWords = words[0];
for (var i = 1; i < words.length; i++) {
joinedWords = join_word(joinedWords, words[i]);
}
return joinedWords;
}
A first try, not really optimized (183% shorter):
function getShort(arr){
var perfect="";
//iterate the array
arr.forEach(function(string){
//iterate over the characters in the array
string.split("").reduce(function(pos,char){
var n=perfect.indexOf(char,pos+1);//check if theres already a possible char
if(n<0){
//if its not existing, simply add it behind the current
perfect=perfect.substr(0,pos+1)+char+perfect.substr(pos+1);
return pos+1;
}
return n;//continue with that char
},-1);
})
return perfect;
}
In action
This can be improved trough simply running the upper code with some variants of the array (200% improvement):
var s=["jack",...];
var perfect=null;
for(var i=0;i<s.length;i++){
//shift
s.push(s.shift());
var result=getShort(s);
if(!perfect || result.length<perfect.length) perfect=result;
}
In action
Thats quite close to the minimum number of characters ive estimated ( 244% minimization might be possible in the best case)
Ive also wrote a function to get the minimal number of chars and one to check if a certain word fails, you can find them here
I have used the idea of Dynamic programming to first generate the shortest possible string in forward direction as stated in OP. Then I have combined the result obtained in the previous step to send as a parameter along with the next String in the list. Below is the working code in java. Hope this would help to reach the most optimal solution, in case my solution is identified to be non optimal. Please feel free to report any countercases for the below code:
public String shortestPossibleString(String a, String b){
int[][] dp = new int[a.length()+1][b.length()+1];
//form the dynamic table consisting of
//length of shortest substring till that points
for(int i=0;i<=a.length();i++){
for(int j=0;j<=b.length();j++){
if(i == 0)
dp[i][j] = j;
else if(j == 0)
dp[i][j] = i;
else if(a.charAt(i-1) == b.charAt(j-1))
dp[i][j] = 1+dp[i-1][j-1];
else
dp[i][j] = 1+Math.min(dp[i-1][j],dp[i][j-1]);
}
}
//Backtrack from here to find the shortest substring
char[] sQ = new char[dp[a.length()][b.length()]];
int s = dp[a.length()][b.length()]-1;
int i=a.length(), j=b.length();
while(i!=0 && j!=0){
// If current character in a and b are same, then
// current character is part of shortest supersequence
if(a.charAt(i-1) == b.charAt(j-1)){
sQ[s] = a.charAt(i-1);
i--;
j--;
s--;
}
else {
// If current character in a and b are different
if(dp[i-1][j] > dp[i][j-1]){
sQ[s] = b.charAt(j-1);
j--;
s--;
}
else{
sQ[s] = a.charAt(i-1);
i--;
s--;
}
}
}
// If b reaches its end, put remaining characters
// of a in the result string
while(i!=0){
sQ[s] = a.charAt(i-1);
i--;
s--;
}
// If a reaches its end, put remaining characters
// of b in the result string
while(j!=0){
sQ[s] = b.charAt(j-1);
j--;
s--;
}
return String.valueOf(sQ);
}
public void getCombinedString(String... values){
String sSQ = shortestPossibleString(values[0],values[1]);
for(int i=2;i<values.length;i++){
sSQ = shortestPossibleString(values[i],sSQ);
}
System.out.println(sSQ);
}
Driver program:
e.getCombinedString("jack", "apple", "maven", "hold",
"solid", "mark", "moon", "poor", "spark", "live");
Output:
jmapphsolivecparkonidr
Worst case time complexity of the above solution would be O(product of length of all input strings) when all strings have all characters distinct and not even a single character matches between any pair of strings.
Here is an optimal solution based on dynamic programming in JavaScript, but it can only get through solid on my computer before it runs out of memory. It differs from #CodeHunter's solution in that it keeps the entire set of optimal solutions after each added string, not just one of them. You can see that the number of optimal solutions grows exponentially; even after solid there are already 518,640 optimal solutions.
const STRINGS = ["jack", "apple", "maven", "hold", "solid", "mark", "moon", "poor", "spark", "live"]
function map(set, f) {
const result = new Set
for (const o of set) result.add(f(o))
return result
}
function addAll(set, other) {
for (const o of other) set.add(o)
return set
}
function shortest(set) { //set is assumed non-empty
let minLength
let minMatching
for (const s of set) {
if (!minLength || s.length < minLength) {
minLength = s.length
minMatching = new Set([s])
}
else if (s.length === minLength) minMatching.add(s)
}
return minMatching
}
class ZipCache {
constructor() {
this.cache = new Map
}
get(str1, str2) {
const cached1 = this.cache.get(str1)
if (!cached1) return undefined
return cached1.get(str2)
}
set(str1, str2, zipped) {
let cached1 = this.cache.get(str1)
if (!cached1) {
cached1 = new Map
this.cache.set(str1, cached1)
}
cached1.set(str2, zipped)
}
}
const zipCache = new ZipCache
function zip(str1, str2) {
const cached = zipCache.get(str1, str2)
if (cached) return cached
if (!str1) { //str1 is empty, so only choice is str2
const result = new Set([str2])
zipCache.set(str1, str2, result)
return result
}
if (!str2) { //str2 is empty, so only choice is str1
const result = new Set([str1])
zipCache.set(str1, str2, result)
return result
}
//Both strings start with same letter
//so optimal solution must start with this letter
if (str1[0] === str2[0]) {
const zipped = zip(str1.substring(1), str2.substring(1))
const result = map(zipped, s => str1[0] + s)
zipCache.set(str1, str2, result)
return result
}
//Either do str1[0] + zip(str1[1:], str2)
//or str2[0] + zip(str1, str2[1:])
const zip1 = zip(str1.substring(1), str2)
const zip2 = zip(str1, str2.substring(1))
const test1 = map(zip1, s => str1[0] + s)
const test2 = map(zip2, s => str2[0] + s)
const result = shortest(addAll(test1, test2))
zipCache.set(str1, str2, result)
return result
}
let cumulative = new Set([''])
for (const string of STRINGS) {
console.log(string)
const newCumulative = new Set
for (const test of cumulative) {
addAll(newCumulative, zip(test, string))
}
cumulative = shortest(newCumulative)
console.log(cumulative.size)
}
console.log(cumulative) //never reached

JavaScript: check if an array is a subsequence of another array (write a faster naïve string search algo)

[5, 4, 4, 6].indexOfArray([4, 6]) // 2
['foo', 'bar', 'baz'].indexOfArray(['foo', 'baz']) // -1
I came up with this:
Array.prototype.indexOfArray = function(array) {
var m = array.length;
var found;
var index;
var prevIndex = 0;
while ((index = this.indexOf(array[0], prevIndex)) != -1) {
found = true;
for (var i = 1; i < m; i++) {
if (this[index + i] != array[i]) {
found = false;
}
}
if (found) {
return index;
}
prevIndex = index + 1
}
return index;
};
Later I have find wikipedia calls it Naïve string search:
In the normal case, we only have to look at one or two characters for each wrong position to see that it is a wrong position, so in the average case, this takes O(n + m) steps, where n is the length of the haystack and m is the length of the needle; but in the worst case, searching for a string like "aaaab" in a string like "aaaaaaaaab", it takes O(nm) steps.
Can someone write a faster indexOfArray method in JavaScript?
The algorithm you want is the KMP algorithm (http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm) used to find the starting index of a substring within a string -- you can do exactly the same thing for an array.
I couldn't find a javascript implementation, but here are implementations in other languages http://en.wikibooks.org/wiki/Algorithm_implementation/String_searching/Knuth-Morris-Pratt_pattern_matcher -- it shouldn't be hard to convert one to js.
FWIW: I found this article a good read Efficient substring searching It discusses several variants of Boyer-Moore although it's not in JavaScript. The Boyer-Moore-Horspool variant (by Timo Raita’s -- see first link for link) was going to be my "suggestion" for a potential practical speed gain (does not reduce big-O though -- big-O is upper limit only!). Pay attention to the Conclusion at the bottom of the article and the benchmarks above.
I'm mainly trying to put up opposition for the Knuth-Morris-Pratt implementation ;-)

Categories