Set vs Map - which is faster?

Set and Map are both newer data structures introduced in ES6, and in certain situations either one can be used.
For example, if I want to store all unique elements, I can use a Set, or a Map with true as the value.
const data: string[] = ['a', 'b', 'a'];
const set = new Set<string>();
const map = new Map<string, boolean>();
data.forEach((item) => {
  map.set(item, true);
});
data.forEach((item) => {
  set.add(item);
});
Both work, but I was wondering: which one is faster?
Update 1
I am looking at which of the two data structures is faster for:
storing data
checking whether a value exists, using map.has(<value>) or set.has(<value>)
deleting values
Also, I understand that true is redundant and not used anywhere; I am just trying to show how Map and Set can be used interchangeably.
What matters is speed.

In the most basic sense:
Maps are for holding key-value pairs
Sets are for holding values
The true in your map is completely redundant: if a key exists, that automatically implies it is true/exists, so you will never need the value of the key-value pair in the map. (So why use a map at all if you're never going to make use of what it is actually for? To me that sounds like a set/array with extra steps.)
If you just want to store values, use an array or a set. Which of the two depends on what you are trying to do.
The question of "which is faster" can't really be answered properly, because it largely depends on what you are doing with the stored values. (What you are trying to do also determines which data structure to use.)
So choose whatever data structure you think fits your needs best; when you run into a problem that another one would fix, you can always change it later and convert from one into the other.
And the more you use them and see what they can and cannot do, the better you will get at choosing the right one from the start for a given problem.
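That said, if you do want to measure it for your own workload, a minimal micro-benchmark sketch is below (the input size is arbitrary, and console.time timings are coarse; results vary across engines):
const items = Array.from({ length: 1000000 }, (_, i) => `item${i}`);
const set = new Set(items);
const map = new Map(items.map((x) => [x, true]));
// membership checks
console.time('set.has');
for (const x of items) set.has(x);
console.timeEnd('set.has');
console.time('map.has');
for (const x of items) map.has(x);
console.timeEnd('map.has');
// deletions
console.time('set.delete');
for (const x of items) set.delete(x);
console.timeEnd('set.delete');
console.time('map.delete');
for (const x of items) map.delete(x);
console.timeEnd('map.delete');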

Related

accessing and removing objects by ID

I have certain requirements, and I want to do the following in the quickest way possible.
I have thousands of objects like the ones below:
{id:1, value:"value1"} ... {id:1000, value:"value1000"}
I want to access these objects by id.
I want to clean out objects with an id lower than a certain value every few minutes (because my high-frequency algorithm generates thousands of objects every second).
I can clean easily using this:
myArray = myArray.filter(function(obj) {
  return obj.id > cleanSize;
});
I can find an object by id using:
myArray.find(x => x.id === 45);
The problem is that find feels a little slow when there is a larger set of data, so I created an object of objects, like below:
const id = 22;
myArray["x" + id] = { id: id, value: "test" };
So I can access an item easily by id with myArray["x22"]; the problem is that I can't find a way to remove older items by id.
Can someone show me a better way to achieve the three points above using arrays or objects?
The trouble with your question is that you're asking for a way to finish an algorithm that is supposed to solve a problem of yours, but I think there's something fundamentally wrong with the problem to begin with :)
If you store a sizeable amount of data records, each associated with an ID, and allow your code to access them freely, then you cannot have another part of your code dump some of them to the bin out of the blue (say, from within some timer callback) just because they are becoming "too old". You must be sure nobody is still working on them (and will ever need to) before deleting any of them.
If you don't explicitly synchronize the creation and deletion of your records, you might end up with a code that happens to work (because your objects happen to be processed quickly enough never to be deleted too early), but will be likely to break anytime (if your processing time increases and your data becomes "too old" before being fully processed).
This is especially true in the context of a browser. Your code is supposed to run on any computer connected to the Internet, which could have dozens of reasons to be running 10 or 100 times slower than the machine you test your code on. So making assumptions about the processing time of thousands of records is asking for serious trouble.
Without further specification, it seems to me answering your question would be like helping you finish a gun that would only allow you to shoot yourself in the foot :)
All this being said, any JavaScript object inherently does exactly what you ask for, provided you're okay with using strings for IDs, since an object property name can also be used as an index in an associative array.
var associative_array = {}
var bob = { id:1456, name:"Bob" }
var ted = { id:2375, name:"Ted" }
// store some data with arbitrary ids
associative_array[bob.id] = bob
associative_array[ted.id] = ted
console.log(JSON.stringify(associative_array)) // Bob and Ted
// access data by id
var some_guy = associative_array[2375] // index will be converted to string anyway
console.log(JSON.stringify(some_guy)) // Ted
var some_other_guy = associative_array["1456"]
console.log(JSON.stringify(some_other_guy)) // Bob
var some_AWOL_guy = associative_array[9999]
console.log(JSON.stringify(some_AWOL_guy)) // undefined
// delete data by id
delete associative_array[bob.id] // so long, Bob
console.log(JSON.stringify(associative_array)) // only Ted left
Though I doubt speed will really be an issue, this mechanism is about as fast as you will ever get JavaScript to run, since the underlying data structure is a hash table with theoretically O(1) access.
Anything involving array methods like find() or filter() will run in at least O(n).
Besides, each invocation of filter() wastes memory and CPU recreating the array to no avail.
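For the periodic cleanup from the question, a minimal sketch on top of the same associative-array idea (the ids and the threshold are illustrative):
var records = {}
// store a few records keyed by id
records[1456] = { id: 1456, value: "value1456" }
records[2375] = { id: 2375, value: "value2375" }
// delete every record whose id does not exceed the threshold
function cleanOldRecords(cleanSize) {
  Object.keys(records).forEach(function (key) {
    if (records[key].id <= cleanSize) delete records[key]
  })
}
cleanOldRecords(2000)
console.log(JSON.stringify(records)) // only 2375 left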

How to speed up performance of autocomplete from indexeddb database

I have a jQuery autocomplete field that has to search through several thousand items, populated from an IndexedDB query (using the idb wrapper). The following autocomplete function is called when the user begins typing in the box. hasKW() is a function that finds keywords.
async function siteAutoComplete(request, response) {
  const db = await openDB('AgencySite');
  const hasKW = createKeyWordFunction(request.term);
  const state = "NY";
  const PR = 0;
  const agency_id = 17;
  const range = IDBKeyRange.bound([state, PR, agency_id], [state, PR, agency_id || 9999999]);
  let cursor = await db.transaction('sites').store.index("statePRAgency").openCursor(range);
  let result = [];
  while (cursor) {
    if (hasKW(cursor.value.name)) result.push({
      value: cursor.value.id,
      label: cursor.value.name
    });
    cursor = await cursor.continue();
  }
  response(result);
}
My question is this: I'm not sure whether the cursor is what makes everything slow. Is there a way to get all database rows that match the query without using a cursor? Is building the result array slowing me down? Is there a better way of doing this? Currently it takes 2-3 s to show the autocomplete list.
I hope this will be useful to someone else. I removed the cursor, downloaded the whole DB into a JavaScript array, and then used .filter. The speedup was dramatic: it took 2300 ms the way above and about 21 ms using this:
let result = await db.transaction('sites').store.index("statePRAgency").getAll();
response(result.filter(hasKW));
You probably want to use an index, and by the term index I mean a custom-built one that represents a search-engine index. You cannot easily and efficiently perform "startsWith"-style queries over one of indexedDB's indices because they are effectively whole-value (or at least lexicographic).
There are many ways to create the search engine index I am suggesting. You probably want something like a prefix-tree, also known informally as a trie.
Here is a nice article by John Resig that you might find helpful: https://johnresig.com/blog/javascript-trie-performance-analysis/. Otherwise, I suggest searching for trie implementations and then figuring out how to represent a similar data structure within an indexedDB object store, or an indexedDB index on an object store.
Essentially, insert the data first without the properties used by the index. Then, in an "indexing step", visit each object, index its value, and set the properties used by the indexedDB index. Or do this at insert/update time.
From there, you probably want to open a connection shortly after page load and keep it open for the entire lifetime of the page. Then query against the index every time a character is typed (you probably want to rate-limit this call so it doesn't fire more than n times per second, perhaps using some kind of debounce helper function).
On the other hand, I might be a bit rusty on this one, but maybe you can create an index on the string property, then use a lower bound that is the entered characters. A string of lesser length than another string that starts with it comes earlier in lexicographic order, so maybe it is actually that easy. You would also need to impose an upper bound consisting of the characters entered so far concatenated with some kind of sentinel value that can never realistically exist in the data, something silly like ZZZZZ.
Try this out in the browser's console:
indexedDB.cmp('test', 'tasting'); // 1
indexedDB.cmp('test', 'testing'); // -1
indexedDB.cmp('test', 'test'); // 0
You essentially want to experiment with a query like this:
const sentinel = 'ZZZ';
const index = store.index('myStore');
const bounds = IDBKeyRange.bound(value, value + sentinel);
const request = index.get(bounds);
You might need to tweak the sentinel, experiment with the other parameters to IDBKeyRange.bound (the inclusive/exclusive flags), probably store the value in homogenized case so that the search is case-insensitive, avoid ever sending a query when nothing has been typed, etc.
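On the rate limiting mentioned above, a minimal debounce helper sketch (the 250 ms delay is an arbitrary choice; siteAutoComplete is the function from the question):
function debounce(fn, delayMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), delayMs);
  };
}
// fires only once the user pauses typing for 250 ms
const debouncedAutoComplete = debounce(siteAutoComplete, 250);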

update, instead of replace, list used for ng-repeat

How it is
I have an array of objects called vm.queued_messages (vm is set to this in my controller), and vm.queued_messages is used in ng-repeat to display a list of divs.
When I make an API call which changes the underlying model in the database, I have the API call return a fresh list of queued messages, and in my controller I set the variable vm.queued_messages to that new value, that fresh list of queued messages.
vm.queued_messages = data; // data is the full list of new message objects
The problem
This "full replacement" of vm.queued_messages worked exactly as I wanted, at first. But what I didn't think about was the fact that even objects in that list which had no changes to any properties were leaving and new objects were taking their place. This made no different to the display because the new objects had identical keys and values, they were technically different objects, and thus the div's were secretly leaving and entering every time. THIS MEANS THERE ARE MANY UNWANTED .ng-enter's AND .ng-leave's OCCURRING, which came to my attention when I tried to apply an animation to these div's when they entered or left. I would expect a single div to do the .ng-leave animation on some click, but suddenly a bunch of them did!
My solution attempt
I made a function, softRefreshObjectList, which updates the keys and values of an existing list (as well as adding entirely new objects and dropping now-absent ones) to match a new list, without replacing the objects, so as to maintain their identity. I matched objects between the new list and the old list by their _id field.
softRefreshObjectList: function(oldObjs, newObjs) {
  var resultingObjList = [];
  var oldObjsIdMap = {};
  _.each(oldObjs, function(obj) {
    oldObjsIdMap[obj._id] = obj;
  });
  _.each(newObjs, function(newObj) {
    var correspondingOldObj = oldObjsIdMap[newObj._id];
    if (correspondingOldObj) {
      // clear out the old obj and put in the keys/values from the new obj
      for (var key in correspondingOldObj) delete correspondingOldObj[key];
      for (var key in newObj) correspondingOldObj[key] = newObj[key];
      resultingObjList.push(correspondingOldObj);
    } else {
      resultingObjList.push(newObj);
    }
  });
  return resultingObjList;
}
This works for certain things, but with other ng-repeat lists I get odd behavior, I believe because of the deletes and because some of the objects' values are references to other controller variables. Before continuing down this rabbit hole, I want to make this post in case I'm thinking about this wrong, or there's something I'm missing.
My question
Is there a more appropriate way to handle this case, which would either make it easier to handle, or bypass my issue altogether?
Perhaps there is a way to signal to Angular that these objects are identified by their _id rather than by reference, so that it doesn't make them leave and enter as long as the _id doesn't change.
Or perhaps a better softRefreshObjectList function that iterates through the objects differently, if there's something fishy about how I'm doing it.
Thanks to Petr's comment, I now know about track by for ng-repeat. It lets you specify a field that "identifies" each element, so that Angular can know when an element really is leaving or entering. In my case, that field was _id, and adding track by message._id to my ng-repeat (ng-repeat="message in ctrl.queued_messages track by message._id") solved my issue perfectly.
Docs here. Search for track by.

High performance JS map for int-string pairs

I need a high-performance map in JavaScript (a hashmap or whatever) that maps ints to strings. The map will be used to build some sections of the webpage after the DOM is ready. I know a plain JavaScript object also works like a map, but I need the best possible performance.
I would like to initialize the map with all the data pairs at once, by appending a string to the webpage while generating the response page on the server.
Are there any ways to improve the performance of a JavaScript map for int-string pairs, or any existing implementations of the same?
--
Using jQuery 1.7
Ok, I'll post it here since it's more of an answer:
Use an array. Taking into account that any implementation will have to use JS primitives and objects, you'll be hard-pressed to find something more performant than that.
Arrays in most (all?) implementations of JavaScript can be sparse. So array.length will return the index of the last element + 1, but a sparse array will not have all its elements allocated and will use object-property semantics to access its elements (meaning it is effectively a hashtable with ints as keys).
It basically gives you the behavior you're looking for.
In case of negative ints, use a second array.
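To see the sparse behavior concretely:
var arr = [];
arr[5] = "five";
arr[100000] = "hundred thousand";
console.log(arr.length);       // 100001 (index of last element + 1)
console.log(Object.keys(arr)); // ["5", "100000"] - only two slots actually exist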
As for single-statement initialization: you can't do it in general, since it relies on implicitly knowing each item's index.
What you can do is append something along these lines:
var arr = [];
arr[int1] = val1;
arr[int2] = val2;
arr[int3] = val3;
arr[int4] = val4;
...
arr[intn] = valn;
I mean you have to list (Number, String) pairs somehow anyway.
Please check out this jsPerf test case and draw your own conclusions.
Objects are also sparse. Arrays are simply specialized objects that account for their own length, among other things.
I think you should use the following:
var l_map = {};
To add an element, use:
l_map[<your integer>] = <your string>;
and to retrieve it:
var l_value = l_map[<your integer>];
This is one way to solve your problem.
The second way is quite simple: just use an array (or list), since it stores values by position, as follows:
var l_array = [];
To add an element at the end: l_array.push(<your string>);
To add an element at a specific position: l_array.splice(<position>, 0, <your string>);
And to retrieve: l_array[<position>];
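As for the single-statement initialization from the server mentioned in the question, a sketch under the assumption that the server emits an object literal straight into the generated page (the pairs here are placeholders):
// written by the server into a <script> block while rendering the page
var l_map = { 3: "three", 17: "seventeen", 256: "two fifty-six" };
console.log(l_map[17]); // "seventeen" - numeric keys are coerced to strings internally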

CouchDB - Variables in map function

I am quite new to CouchDB and have a very basic question:
Is there any possibility to pass a variable from the client into the map function, e.g.:
function (doc, params) {
  if (doc.property > params.property) emit(doc, null);
}
Thanks for your help,
Christian
While Dominic's answer is true, the example in the actual question can probably be implemented as a map function with an appropriate key and a query that includes a startkey. So if you want the functionality your example shows, you should change your view to this:
function(doc) {
  if (doc.property)
    emit(doc.property, null);
}
And then your query would become:
/db_name/_design/view_doc/_view/view_name?startkey="property_param"&include_docs=true
Which would give you what your example suggests you're after.
This is the key (puns are funny) to working with CouchDB: create views that let you select subsets of the view based on the key, using either key, keys, or some combination of startkey and/or endkey.
No, map functions are supposed to create indexes that always take the same input and yield the same output, so they can remain incremental (and fast).
If you need to do some sort of filtering on the results of a view, consider using a _list function, as it can take client-supplied query-string variables and use them in its transformation.
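For illustration, a minimal _list function sketch, assuming a view that emits doc.property as its key and a request ending in ?property=42 (the parameter name is hypothetical):
function (head, req) {
  var threshold = parseInt(req.query.property, 10);
  start({ headers: { "Content-Type": "application/json" } });
  var row, results = [];
  while ((row = getRow())) {
    // row.key is doc.property, as emitted by the underlying map function
    if (row.key > threshold) results.push(row);
  }
  send(JSON.stringify(results));
}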
