Improve Regex to match duplicates in a list - javascript

I am using a regex to find duplicates in a list. It is only a short comma-separated list, and performance is not an issue, so there is no need to tell me I should not use regex for those reasons.
// returns a match because some is repeated
"some,thing,here,some,whatever".match(/(^|,)(.+?)(,|,.+,)\2(,|$)/g)
Questions...
Can this regex be improved?
Does it cover all possible scenarios where the comma does not appear inside the separated strings?
Is there a better (preferably more readable and more efficient) way to do this?

I don't see the purpose of using regexes here, unless you like unimaginable pain. If I had to find duplicates I would
Obtain an array of words
var words = "...".split(',');
optionally lowercase everything, if you feel like doing that
sort the array
words.sort()
Duplicates should now all be in consecutive positions of the array.
As an extra advantage, I'm pretty sure this would be vastly more efficient than a regex version.
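The steps above could be sketched roughly as follows. The function name findDupsSorted is made up for illustration, and the sketch assumes the items contain no commas and that case matters (no lowercasing).

```javascript
// Sketch: find duplicates by sorting, then comparing each word with its neighbour.
function findDupsSorted(list) {
  var words = list.split(",");
  words.sort();
  var dups = [];
  for (var i = 1; i < words.length; i++) {
    // A word equal to its predecessor is a duplicate; record each one only once.
    if (words[i] === words[i - 1] && dups[dups.length - 1] !== words[i]) {
      dups.push(words[i]);
    }
  }
  return dups;
}

var sortedDups = findDupsSorted("some,thing,here,some,whatever"); // ["some"]
```

Because equal words end up next to each other after sorting, a single pass over the array is enough to spot them.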

If I wanted to find dups in a comma separated list, I'd do it like this, using the hash capabilities of an object to accumulate unique values and detect dups:
function getDups(list) {
var data = list.split(",");
var uniques = {}, dups = {}, item, uniqueList = [];
for (var i = 0; i < data.length; i++) {
item = data[i];
if (uniques[item]) {
// found dup
dups[item] = true;
} else {
// found unique item
uniques[item] = true;
}
}
// at the end here, you'd have an object called uniques with all the unique items in it
// you could turn that back into an array easily if you wanted to
// Since it uses the object hash for dup detection, it scales to large numbers of items just fine
// you can return whatever you want here (a new list, a list of dups, etc...)
// in this implementation, I chose to return an array of unique values
for (var key in uniques) {
uniqueList.push(key);
}
return(uniqueList); // return array of unique values
}
var list = "some,thing,here,some,whatever";
getDups(list);
Here's a jsFiddle that shows it working: http://jsfiddle.net/jfriend00/NGQCz/
This type of implementation scales well with large numbers of words because the dup detection is relatively efficient.
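If you want the duplicates themselves rather than the unique values, a variation along the same lines might look like this. It is only a sketch: listDuplicates is a hypothetical name, and it uses Object.create(null) (an assumption beyond the code above) so that items such as "constructor" are not mistaken for keys inherited from Object.prototype.

```javascript
// Sketch: return each duplicated item once, in the order the repeats are found.
function listDuplicates(list) {
  var seen = Object.create(null);     // items encountered so far
  var reported = Object.create(null); // duplicates already added to the result
  var dups = [];
  var items = list.split(",");
  for (var i = 0; i < items.length; i++) {
    var item = items[i];
    if (seen[item] && !reported[item]) {
      reported[item] = true;
      dups.push(item);
    }
    seen[item] = true;
  }
  return dups;
}

var foundDups = listDuplicates("some,thing,here,some,whatever"); // ["some"]
```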

Related

Javascript making combinations with an array

I am trying to find all combinations and create an array of string from an array. For example:
var text_array = [["h"],["e","è","é","ê","ë"],["l"],["l"],["o","ò","ó","ô","õ"]]
output = ["hello","hèllo","héllo",etc...]
I have tried several ways but they all seemed extremely long winded and I think I'm just maybe missing a function I don't know about.
One way to abstract the implementation from the number of characters is to incrementally build the list of final values and the values themselves. You start with an empty result, [""]. Then you take the first list of variants and add each variant to every result we have so far, producing a new list of intermediate results, ["h"].
The second list of variants has multiple elements, so after this iteration you'll have this list of results, ["he", "hè", "hé", "hê", "hë"]. And so on.
In pseudo-code, it could look like this:
results = [""]
for each list_of_variants in text_array {
    new_results = []
    for each variant in list_of_variants {
        for each result in results {
            new_results.push(result + variant)
        }
    }
    results = new_results
}
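Translated into JavaScript, the pseudo-code could look roughly like this. The function name combine is an assumption for illustration, and the sample input is a shortened version of the one in the question.

```javascript
// Sketch: build every combination by extending the partial results one position at a time.
function combine(textArray) {
  var results = [""];
  textArray.forEach(function (variants) {
    var newResults = [];
    variants.forEach(function (variant) {
      results.forEach(function (result) {
        // append the current variant to every partial string built so far
        newResults.push(result + variant);
      });
    });
    results = newResults;
  });
  return results;
}

var combos = combine([["h"], ["e", "è"], ["l"], ["l"], ["o", "ò"]]);
// combos: ["hello", "hèllo", "hellò", "hèllò"]
```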

How get the probability of shuffling an array without getting duplicate Values in javascript

In JavaScript, I am a little confused about how to get the actual, accurate probability of shuffling the objects in an array.
For example
var numberOfOrder=[
{
id:1
},
{
id:2
},
{
id:3
}
]
In the example above, the array can be arranged in 6 ways, which is the factorial of numberOfOrder.length.
But what is the actual way to shuffle the objects in that array?
My Try
function newShuffle(value) {
for(var i = value.length-1;i >=0; i--){
var randomIndex = Math.floor(Math.random()*(i+1));
var itemAtIndex = value[randomIndex];
value[randomIndex] = value[i];
value[i] = itemAtIndex
}
return value
}
But the function above does not give the result I expect: if I run it 6 times, it returns duplicate orderings.
What is the correct function to do this?
You have to understand the difference between probability and permutations. The second term comes from combinatorics. There are algorithms that generate all possible permutations of an array's items. Here is one of them:
function permutations(items) {
// single item array - no permutations available
if (items.length == 1) return [items];
var combos = [];
for (var i = 0; i < items.length; i++) {
// first - the current item, rest - array without the current item
var first = items[i], rest = items.slice(0);
rest.splice(i, 1);
// getting permutations of shorter array and for each of them...
permutations(rest).forEach(function(combo){
// prepend the current item
combo.unshift(first);
// save the permutation
combos.push(combo);
});
}
return combos;
}
alert(permutations([ 1, 2, 3 ]).join("\n"));
Update
The code above implements a recursive algorithm. The function permutations takes an array and, for each item, recursively gets all permutations that begin with that item. At each step of the recursion the array is shorter by one item (the current item is removed), and at the last step a single-element array is returned as is, because it has no further permutations.
I also added some comments to the code.
The last line is just a test that gets all permutations of the array [1, 2, 3] and shows them via alert. To make the output easier to read, the permutations are joined with a newline character (.join("\n")).
As stated in the comments and the answer above, you need a permutations operation. However, there are many ways to obtain the permutations of an array. For further information on permutations, I would advise you to have a look at the Permutations in JavaScript topic.
On the other hand, a recursive approach can be noticeably slower than a dynamic programming approach. Recently I came up with a permutations algorithm which seems to be the fastest of all. Check it out.
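For comparison, here is one way to build permutations without recursion. This is not the algorithm referred to above (its link is not available here), and no claim is made about its speed; it is just a minimal sketch in which each new item is inserted at every position of the permutations built so far. The name permutationsIterative is hypothetical.

```javascript
// Sketch: non-recursive permutations, growing the list one item at a time.
function permutationsIterative(items) {
  var perms = [[]];
  for (var i = 0; i < items.length; i++) {
    var next = [];
    for (var p = 0; p < perms.length; p++) {
      // insert items[i] at every possible position of each shorter permutation
      for (var pos = 0; pos <= perms[p].length; pos++) {
        var copy = perms[p].slice();
        copy.splice(pos, 0, items[i]);
        next.push(copy);
      }
    }
    perms = next;
  }
  return perms;
}

var allOrders = permutationsIterative([1, 2, 3]); // 6 arrays, one per ordering
```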

Sorting a dynamically filled array of objects

I have an array that is initialized like this: var generationObject = [{string:"", score: 0}];
which I then fill dynamically:
for(var i = 0; i < amount_offspring; i++)
{
// "load" text into array and send the string to see if it evolves
generationObject[i].string = evolve(start_text, characters, mutation_rate);
// then score the string
generationObject[i].score = score(target_text, generationObject.string);
}
I then want to sort this array by score. I don't know what's best, to sort it in the for loop or sort the entire array afterwards.
I will then take the string of the highest scoring object and pass it through the function again, recursively.
So what would be a good way to go about this sort function? I've seen some here use this
generationObject.sort(function(a, b) {
return (a.score) - (b.score);
});
But I'm not sure if .sort is still supported? It didn't seem to work for me.
generationObject is an array, not an object, so score(target_text, generationObject.string); could be the problem, as .string will be undefined. (Did you mean generationObject[i].string?)
Try building your array like this:
var generationObject = [];
for (var i = 0; i < amount_offspring; i++)
{
    // evolve once, then reuse the same string for scoring
    var evolved_string = evolve(start_text, characters, mutation_rate);
    generationObject.push({
        string: evolved_string,
        score: score(target_text, evolved_string)
    });
}
And then Array.prototype.sort should do the trick.
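As a rough illustration of that last step, the sketch below sorts in descending order so that the highest-scoring entry comes first. The sample data and the names generation and best are made up for the example; the real scores would come from your score function.

```javascript
// Sketch: sort by score, highest first, then pick the best entry.
var generation = [
  { string: "abc", score: 2 },
  { string: "abd", score: 7 },
  { string: "xyz", score: 5 }
];

generation.sort(function (a, b) {
  // b minus a gives descending order
  return b.score - a.score;
});

var best = generation[0]; // { string: "abd", score: 7 }
```

The string of best is then the one to pass into the next round.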
You should write your sorting logic outside the for loop. If you put it inside, the array will be sorted N times, where N is the number of iterations of your loop. The following are two ways to do it:
By using sort() function- To clarify your question, sort() is still supported across almost all the browsers. If you are still concerned about the browser compatibility, you can check the MDN documentation to see the list of supported browsers.
generationObject = generationObject.sort(function(a, b) {
    // scores are already numbers, so subtract them directly
    return a.score - b.score;
});
By using underscorejs-
In underscore, you can take advantage of the sortBy() function.
Returns a (stably) sorted copy of list, ranked in ascending order by the results of running each value through iteratee. iteratee may also be the string name of the property to sort by (eg. length).
You can simply do this in underscorejs-
generationObject = _.sortBy(generationObject, 'score');

Which javascript structure has a faster access time for this particular case?

I need to map specific numbers to string values. These numbers are not necessarily consecutive, and so for example I may have something like this:
var obj = {};
obj[10] = "string1";
obj[126] = "string2";
obj[500] = "string3";
If I'm doing a search like this obj[126] would it be faster for me to use an object {} or an array []?
There will be no practical difference. ECMAScript arrays, if sparse (that is, they don't have consecutive indices set), are typically implemented as hash tables. In either case, you can expect roughly constant-time, O(1), access, so this shouldn't concern you at all.
I created a microbenchmark for you - check out the more comprehensive test by @Bergi. In my browser the object literal is a little bit slower, but not significantly. Try it yourself.
A JS array is an object, so it should not matter which one you choose.
Created a jsperf test (http://jsperf.com/array-is-object) to demonstrate this.
Definitely, an object should be the best choice.
If you have such code:
var arr = [];
arr[10] = 'my value';
, your array becomes an array of 11 values
alert(arr.length); // will show you 11
, where first 10 are undefined.
Obviously you don't need an array of length 1000 to store just
var arr = [];
arr[999] = 'the string';
Also, I have to point out that in programming you have to choose an appropriate data structure for each particular case.
Your task is to make a map of key: value pairs and object is the better choice here.
If your task was to make an ordered collection, then sure you need an array.
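To make the difference concrete, the following sketch stores one value under the key 999 in both structures and looks at what each one reports. The variable names are illustrative only.

```javascript
// Sketch: the same single entry stored in an array and in a plain object.
var sparseArray = [];
sparseArray[999] = "the string";

var lookup = {};
lookup[999] = "the string";

var arrayLength = sparseArray.length;     // 1000, although only one slot is set
var arrayKeys = Object.keys(sparseArray); // ["999"]
var lookupKeys = Object.keys(lookup);     // ["999"]
var sameValue = sparseArray[999] === lookup[999]; // true
```

Direct access by key works the same way in both cases; the array additionally carries a length that suggests 1000 positions, which matters as soon as you loop from 0 to length.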
UPDATE:
Answering your question from the comments.
Imagine that you have two "collections" - an array and an object. Each of them has only one key/index equal to 999.
If you need to find a value, you need to iterate through your collection.
For array you'll have 999 iterations.
For object - only one iteration.
http://jsfiddle.net/f0t0n/PPnKL/
var arrayCollection = [],
objectCollection = {};
arrayCollection[999] = 1;
objectCollection[999] = 1;
var i = 0,
l = arrayCollection.length;
for(; i < l; i++) {
if(arrayCollection[i] == 1) {
alert('Count of iterations for array: ' + i); // displays 999
}
}
i = 0;
for(var prop in objectCollection) {
i++;
if(objectCollection[prop] == 1) {
alert('Count of iterations for object: ' + i); // displays 1
}
}
​
​
In total:
You have to design an application properly and take into account possible future tasks that will require different manipulations of your collection.
If you need your collection to be ordered, you have to choose an array.
Otherwise an object could be a better choice: accessing one of its properties is roughly as fast as accessing an array item, but searching for a value in an object will be faster than in a sparse array.

JavaScript Multidimensional arrays - column to row

Is it possible to turn a column of a multidimensional array into a row using JavaScript (maybe jQuery), without looping through it?
so in the example below:
var data = new Array();
//data is a 2D array
data.push([name1,id1,major1]);
data.push([name2,id2,major2]);
data.push([name3,id3,major3]);
//etc..
Is it possible to get a list of IDs from data without looping? Thanks.
No, it is not possible to construct an array of IDs without looping.
In case you were wondering, you'd do it like this:
var ids = [];
for(var i = 0; i < data.length; i++)
ids.push(data[i][1]);
For better structural integrity, I'd suggest using an array of objects, like so:
data.push({"name": name1, "id": id1, "major":major1});
data.push({"name": name2, "id": id2, "major":major2});
data.push({"name": name3, "id": id3, "major":major3});
Then iterate through it like so:
var ids = [];
for(var i = 0; i < data.length; i++)
ids.push(data[i].id);
JavaScript doesn't really have multidimensional arrays. What JavaScript allows you to have is an array of arrays, with which you can interact as if it was a multidimensional array.
As for your main question, no, you would have to loop through the array to get the list of IDs. It means that such an operation cannot be done faster than in linear time O(n), where n is the height of the "2D array".
Also keep in mind that arrays in JavaScript are not necessarily represented in memory as contiguous blocks. Therefore any fast operations that you might be familiar with in other low level languages will not apply. The JavaScript programmer should treat arrays as Hash Tables, where the elements are simply key/value pairs, and the keys are the indices (0, 1, 2...). You can still access/write elements in constant time O(1) (at least in modern JavaScript engines), but copying of elements will often be done in O(n).
You could use the Array map function which does the looping for you:
var ids = data.map(function(x) { return x[1] });
Unfortunately, like many things on the web that would be really nice to use, older versions of Internet Explorer (before IE9) don't support it.
See this page for details on how the map function works:
https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/map
The good news is that the link above provides some nice code in the "Compatibility" section which will check for the existence of Array.prototype.map and define it if it's missing.
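If you would rather not patch Array.prototype, a small helper can make the same check at call time. This is only a sketch: pluckColumn is a hypothetical name, and the sample rows are invented for the example.

```javascript
// Sketch: pull one column out of an array of arrays, with a plain-loop fallback.
function pluckColumn(rows, index) {
  if (typeof Array.prototype.map === "function") {
    return rows.map(function (row) { return row[index]; });
  }
  // fallback for environments without Array.prototype.map
  var out = [];
  for (var i = 0; i < rows.length; i++) {
    out.push(rows[i][index]);
  }
  return out;
}

var sampleRows = [["Ann", "id1", "Math"], ["Ben", "id2", "Art"]];
var sampleIds = pluckColumn(sampleRows, 1); // ["id1", "id2"]
```

Either branch still loops over every row; map only hides the loop.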
You don't need anything special- make a string by joining with newlines, and match the middle of each line.
var data1=[['Tom Swift','gf102387','Electronic Arts'],
['Bob White','ea3784567','Culinarey Arts'],
['Frank Open','bc87987','Janitorial Arts'],
['Sam Sneer','qw10214','Some Other Arts']];
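// [^,\n] keeps a match from running across lines; (?:\n|$) also accepts the last line
data1.join('\n').match(/[^,\n]+(?=,[^,\n]+(?:\n|$))/g)
/* returned value: (Array)
gf102387,ea3784567,bc87987,qw10214
*/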
