accessing and removing objects by ID - javascript

I have the following requirements and want to meet them as quickly as possible.
I have thousands of objects like the ones below:
{id:1,value:"value1"} . . {id:1000,value:"value1000"}
I want to access these objects by id.
I want to remove the objects with an id lower than a certain value every few minutes (my high-frequency algorithm generates thousands of objects every second).
I can clean up easily with this:
myArray = myArray.filter(function( obj ) {
return obj.id > cleanSize;
});
I can find an object by id using
myArray.find(x => x.id === 45);
The problem is that find feels slow on larger data sets, so I created an object of objects like this:
const id = 22;
myArray["x" + id] = { id: id, value: "test" };
This lets me access an item by id with myArray["x22"], but I cannot find a way to remove older items by id.
Can someone suggest a better way to achieve the three points above using arrays or objects?

The trouble with your question is, you're asking for a way to finish an algorithm that is supposed to solve a problem of yours, but I think there's something fundamentally wrong with the problem to begin with :)
If you store a sizeable amount of data records, each associated with an ID, and allow your code to access them freely, then you cannot have another part of your code dump some of them to the bin out of the blue (say, from within some timer callback) just because they are becoming "too old". You must be sure nobody is still working on them (and will ever need to) before deleting any of them.
If you don't explicitly synchronize the creation and deletion of your records, you might end up with code that happens to work (because your objects happen to be processed quickly enough never to be deleted too early) but is likely to break at any time (if your processing time increases and your data becomes "too old" before being fully processed).
This is especially true in the context of a browser. Your code is supposed to run on any computer connected to the Internet, which could have dozens of reasons to be running 10 or 100 times slower than the machine you test your code on. So making assumptions about the processing time of thousands of records is asking for serious trouble.
Without further specification, it seems to me answering your question would be like helping you finish a gun that would only allow you to shoot yourself in the foot :)
All this being said, any JavaScript object inherently does exactly what you ask for, provided you're okay with using strings for IDs, since an object property name can also be used as an index in an associative array.
var associative_array = {}
var bob = { id:1456, name:"Bob" }
var ted = { id:2375, name:"Ted" }
// store some data with arbitrary ids
associative_array[bob.id] = bob
associative_array[ted.id] = ted
console.log(JSON.stringify(associative_array)) // Bob and Ted
// access data by id
var some_guy = associative_array[2375] // index will be converted to string anyway
console.log(JSON.stringify(some_guy)) // Ted
var some_other_guy = associative_array["1456"]
console.log(JSON.stringify(some_other_guy)) // Bob
var some_AWOL_guy = associative_array[9999]
console.log(JSON.stringify(some_AWOL_guy)) // undefined
// delete data by id
delete associative_array[bob.id] // so long, Bob
console.log(JSON.stringify(associative_array)) // only Ted left
Though I doubt speed will really be an issue, this mechanism is about as fast as you will ever get JavaScript to run, since the underlying data structure is a hash table, theoretically O(1).
Anything involving array methods like find() or filter() will run in at least O(n).
Besides, each invocation of filter() would waste memory and CPU recreating the array to no avail.
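As a rough sketch of how the three requirements from the question could be combined, here is the same keyed-store idea using a Map instead of a plain object. It assumes records are added in increasing id order (as the question's example suggests), so pruning can stop at the first id that should be kept. The names RecordStore, add, get and pruneBelow are made up for illustration.

```javascript
// Hypothetical helper: keeps records keyed by numeric id.
// Assumes ids only ever increase, so Map insertion order is also id order.
class RecordStore {
  constructor() {
    this.records = new Map();
  }

  add(record) {
    this.records.set(record.id, record);
  }

  // Lookup by id; returns undefined when the id is absent.
  get(id) {
    return this.records.get(id);
  }

  // Delete every record whose id is below minId; return how many were removed.
  pruneBelow(minId) {
    let removed = 0;
    for (const id of this.records.keys()) {
      if (id >= minId) break; // everything after this is newer
      this.records.delete(id);
      removed++;
    }
    return removed;
  }
}

const store = new RecordStore();
for (let id = 1; id <= 1000; id++) {
  store.add({ id, value: "value" + id });
}
const removedCount = store.pruneBelow(501);
console.log(removedCount, store.get(45), store.get(501).value); // 500 undefined value501
```

Pruning touches only the records it deletes, so it does not rebuild the whole collection the way filter() does. The earlier warning still applies: nothing here checks whether a record is still in use before it is removed.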

Related

Most efficient way to link an object in one array with an object in another array

Beginner and self-taught coder here (always open to learning please correct me) and I'm making a web app through pretty much exclusively HTML CSS and Javascript (I don't really want to use PHP or hosting-side processing because I don't know much about web security and it makes me nervous about uploading data to my hosted site).
Very unsure about the most efficient way to do this so I'm going to try to describe it below and I'd really appreciate your input.
My main question: Is there a more efficient way to do this?
The app eventually will have a javascript canvas, where it will draw an object ('track') at a specific location. This object will then move to another location based off nested data in an array ('step') when the user moves to the next item in an array.
As of now, I'm going about it by:
storing the location values in the steps array
keeping an array of 'tracks' describing what shape/color/etc. will be drawn on the canvas
linking the two by an arbitrary ID that is present in both the 'steps' array and the 'tracks' array
A visual representation of what this might look like
steps[stepNumber].movedTracksInStep[movedTracksInStepNumber] holds object:
{track ID,
X location,
y location}
separate array trackList
trackList[trackNumber] holds object:
{track ID,
shape,
color,
bunchastuff}
I chose to do it like this because I figured it would be better to store the location in the steps array and the visual data in a separate array, so that the same data isn't repeated in every step.
My question:
Is there a more efficient way to do this, especially in terms of search functions? I'm a newbie so there very well might be something I am missing.
Currently, I just have to search through all of the ID tracks in the step and see if there is a match. I'm wondering if there is a more direct way to link the two together than having to search each time.
I've thought about perhaps having all the data for the visual representation in the first step and then not having to repeat it (though I'm not quite sure how that would work), or having the numbers of arrays match up (but this would change if the user deletes a track or adds a track).
Thank you! Let me know if you need me to explain more.
Objects in JS are stored and copied "by reference": if you assign an object to another variable or property, the object is not copied; both names refer to the same object. Below is an example close to your code (see the inline comments), and you can adapt this behavior to your task:
// Your tracks information
const trackList = {
1: {
shape: "rect",
color: "green",
bunchastuff: "foo"
}
};
// Your steps data
const steps = {
1: {
1: {
// Here we create reference to track 1 in
// trackList object data, without copying it
track: trackList[1],
x: 100,
y: 50
}
}
};
// Print step info
console.log("before track info edit:", steps[1][1].track);
// Update data in track 1
trackList[1].shape = "round";
// Print step info again and we'll
// see, that it also updated
console.log("after track info edit:", steps[1][1].track);
You can read more about object references here: https://javascript.info/object-copy
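If the steps have to keep storing only the track ID (for instance because they are saved as JSON, where object references do not survive), a lookup table built once from the track list avoids the repeated search. This is a sketch under assumptions: each track is taken to have a unique id property, and trackArray, tracksById and movedTrack are illustrative names.

```javascript
// Assumed shape: an array of track objects, each with a unique id.
const trackArray = [
  { id: "a1", shape: "rect", color: "green" },
  { id: "b2", shape: "round", color: "red" }
];

// Build the index once (and update it whenever a track is added or deleted).
const tracksById = new Map(trackArray.map(track => [track.id, track]));

// A step entry keeps only the id plus the per-step position.
const movedTrack = { trackId: "b2", x: 100, y: 50 };

// Direct lookup instead of scanning the whole array.
const linkedTrack = tracksById.get(movedTrack.trackId);
console.log(linkedTrack.shape, linkedTrack === trackArray[1]); // round true
```

Because the Map holds references to the same track objects, editing a track through either structure is visible through the other.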

How to speed up performance of autocomplete from indexeddb database

I have a jQuery autocomplete field that has to search through several thousand items, populated from an IndexedDB query (using the idb wrapper). The following is the autocomplete function called when the user begins typing in the box. hasKW() is a function that finds keywords.
async function siteAutoComplete(request, response) {
const db = await openDB('AgencySite');
const hasKW = createKeyWordFunction(request.term);
const state = "NY";
const PR = 0;
const agency_id = 17;
const range = IDBKeyRange.bound([state, PR, agency_id], [state, PR, agency_id || 9999999]);
let cursor = await db.transaction('sites').store.index("statePRAgency").openCursor(range);
let result = [];
while (cursor) {
if (hasKW(cursor.value.name)) result.push({
value: cursor.value.id,
label: cursor.value.name
});
cursor = await cursor.continue();
}
response(result);
}
My question is this: I'm not sure if the cursor is making everything slow. Is there a way to get all database rows that match the query without using a cursor? Is building the result array slowing me down? Is there a better way of doing this? Currently it takes 2-3s to show the autocomplete list.
I hope this will be useful to someone else. I removed the cursor and just downloaded the whole DB into a javascript array and then used .filter. The speedup was dramatic. It took 2300ms using the way above and about 21ms using this:
let result = await db.transaction('sites').store.index("statePRAgency").getAll();
response(result.filter(hasKW));
You probably want to use an index, and by index I mean a custom-built one that acts like a search engine index. You cannot easily and efficiently perform "startsWith"-style queries over one of IndexedDB's indices because they are effectively whole-value (or at least lexicographic).
There are many ways to create the search engine index I am suggesting. You probably want something like a prefix-tree, also known informally as a trie.
Here is a nice article by John Resig that you might find helpful: https://johnresig.com/blog/javascript-trie-performance-analysis/. Otherwise, I suggest searching around on Google for trie implementations and then figuring out how to represent a similar data structure within an IndexedDB object store or an IndexedDB index on an object store.
Essentially, insert the data first without the properties used by the index. Then, in an "indexing step", visit each object and index its value, and set the properties used by the indexedDb index. Or do this at time of insert/update.
From there, you probably want to open a connection shortly after page load and keep it open for the entire duration of the page. Then query against the index every time a character is typed (probably want to rate limit this call to refrain from querying more than n/second, perhaps using some kind of debounce helper function).
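The rate limiting mentioned above could look something like the following debounce helper. It is a minimal sketch rather than a library API: debounce, runQuery and the 200 ms delay are all assumptions, and runQuery merely stands in for whatever actually queries the index.

```javascript
// Returns a wrapper that runs fn only after waitMs have passed without a new call.
function debounce(fn, waitMs) {
  let timerId = null;
  return function debounced(...args) {
    clearTimeout(timerId); // cancel the pending call, if any
    timerId = setTimeout(() => fn.apply(this, args), waitMs);
  };
}

// Hypothetical usage: record which terms would actually be queried.
const seenTerms = [];
const runQuery = term => seenTerms.push(term);
const debouncedQuery = debounce(runQuery, 200);

debouncedQuery("n");
debouncedQuery("ne");
debouncedQuery("new"); // only this call reaches runQuery, about 200 ms later
```

A debounce waits for a pause in typing, so it fires fewer queries than a fixed "n per second" throttle; which of the two feels better in an autocomplete is worth trying out.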
On the other hand, I might be a bit rusty on this one, but maybe you can create an index on the string property, then use a lower bound that is the entered characters. A string that is a prefix of a longer string comes earlier in lexicographic order, so maybe it is actually that easy. You would also need to impose an upper bound made of the characters entered so far concatenated with some kind of sentinel value that can never realistically exist in the data, something silly like ZZZZZ.
Try this out in the browser's console:
indexedDB.cmp('test', 'tasting'); // 1
indexedDB.cmp('test', 'testing'); // -1
indexedDB.cmp('test', 'test'); // 0
You essentially want to experiment with a query like this:
const sentinel = 'ZZZ';
const index = store.index('myStore');
const bounds = IDBKeyRange.bound(value, value + sentinel);
const request = index.getAll(bounds); // get() would return only the first match in the range
You might need to tweak the sentinel, experiment with the other parameters to IDBKeyRange.bound (the inclusive/exclusive flags), probably store the value in a normalized case so that the search is case insensitive, avoid ever sending a query when nothing has been typed, etc.
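To see how the choice of sentinel affects the result, the same range test can be tried on a plain array of strings. The sketch below assumes '\uffff' as the sentinel, a common choice because it sorts after nearly any character found in real text. prefixBounds, inRange and the sample names are illustrative; with IndexedDB the two bounds would go to IDBKeyRange.bound instead.

```javascript
// Compute inclusive lower/upper bounds for "starts with prefix".
function prefixBounds(prefix, sentinel = "\uffff") {
  return { lower: prefix, upper: prefix + sentinel };
}

// The same ordering test a key range performs, using JS string comparison
// (code-unit order, which is also how IndexedDB orders string keys).
function inRange(value, { lower, upper }) {
  return value >= lower && value <= upper;
}

const names = ["albany", "new paltz", "newark", "newburgh", "nyack"];

const matches = names.filter(name => inRange(name, prefixBounds("new")));
console.log(matches); // [ 'new paltz', 'newark', 'newburgh' ]

// With an uppercase sentinel, lowercase continuations fall outside the range,
// because 'Z' sorts before 'a'.
const zMatches = names.filter(name => inRange(name, prefixBounds("new", "ZZZ")));
console.log(zMatches); // [ 'new paltz' ]
```

This is one reason the sentinel needs tweaking: an uppercase run like ZZZ only works as an upper bound if the stored values never continue with characters that sort after 'Z'.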

How are {} vs [] data structures handled in JavaScript? [duplicate]

Say you have a very simple data structure:
(personId, name)
...and you want to store a number of these in a javascript variable. As I see it you have three options:
// a single object
var people = {
1 : 'Joe',
3 : 'Sam',
8 : 'Eve'
};
// or, an array of objects
var people = [
{ id: 1, name: 'Joe'},
{ id: 3, name: 'Sam'},
{ id: 8, name: 'Eve'}
];
// or, a combination of the two
var people = {
1 : { id: 1, name: 'Joe'},
3 : { id: 3, name: 'Sam'},
8 : { id: 8, name: 'Eve'}
};
The second or third option is obviously the way to go if you have (or expect that you might have) more than one "value" part to store (eg, adding in their age or something), so, for the sake of argument, let's assume that there's never ever going to be any more data values needed in this structure. Which one do you choose and why?
Edit: The example now shows the most common situation: non-sequential ids.
Each solution has its use cases.
I think the first solution is good if you're trying to define a one-to-one relationship (such as a simple mapping), especially if you need to use the key as a lookup key.
The second solution feels the most robust to me in general, and I'd probably use it if I didn't need a fast lookup key:
It's self-describing, so you don't have to depend on anyone using people to know that the key is the id of the user.
Each object comes self-contained, which is better for passing the data elsewhere - instead of two parameters (id and name) you just pass around people.
This is a rare problem, but sometimes the key values may not be valid to use as keys. For example, I once wanted to map string conversions (e.g., ":" to ">"), but since ":" isn't a valid variable name I had to use the second method.
It's easily extensible, in case somewhere along the line you need to add more data to some (or all) users. (Sorry, I know about your "for argument's sake" but this is an important aspect.)
The third would be good if you need fast lookup time + some of the advantages listed above (passing the data around, self-describing). However, if you don't need the fast lookup time, it's a lot more cumbersome. Also, either way, you run the risk of error if the id in the object somehow varies from the id in people.
Actually, there is a fourth option:
var people = ['Joe', 'Sam', 'Eve'];
since your values happen to be consecutive. (Of course, you'll have to add/subtract one --- or just put undefined as the first element).
Personally, I'd go with your (1) or (3), because those will be the quickest for looking someone up by ID (O(log n) at worst). If you have to find id 3 in (2), you can either look it up by index (in which case my (4) is ok) or you have to search, which is O(n).
Clarification: I say O(log n) is the worst it could be because, AFAIK, an implementation could decide to use a balanced tree instead of a hash table. A hash table would be O(1), assuming minimal collisions.
Edit from nickf: I've since changed the example in the OP, so this answer may not make as much sense any more. Apologies.
Post-edit
Ok, post-edit, I'd pick option (3). It is extensible (easy to add new attributes), features fast lookups, and can be iterated as well. It also allows you to go from entry back to ID, should you need to.
Option (1) would be useful if (a) you need to save memory; (b) you never need to go from object back to id; (c) you will never extend the data stored (e.g., you can't add the person's last name)
Option (2) is good if you (a) need to preserve ordering; (b) need to iterate all elements; (c) do not need to look up elements by id, unless it is sorted by id (you can do a binary search in O(log n)). Note, of course, that if you need to keep it sorted then you'll pay a cost on insert.
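The binary search mentioned for option (2) might look like this. It is a sketch that assumes the array is already sorted by id in ascending order; findById and sortedPeople are illustrative names.

```javascript
// Assumes people is sorted by id in ascending order.
function findById(people, id) {
  let low = 0;
  let high = people.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1; // middle index, rounded down
    if (people[mid].id === id) return people[mid];
    if (people[mid].id < id) low = mid + 1; // target is in the upper half
    else high = mid - 1;                    // target is in the lower half
  }
  return undefined; // not present
}

const sortedPeople = [
  { id: 1, name: "Joe" },
  { id: 3, name: "Sam" },
  { id: 8, name: "Eve" }
];
console.log(findById(sortedPeople, 3).name, findById(sortedPeople, 4)); // Sam undefined
```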
Assuming the data will never change, the first (single object) option is the best.
The simplicity of the structure means it's the quickest to parse, and in the case of small, seldom (or never) changing data sets such as this one, I can only imagine that it will be frequently executed - in which case minimal overhead is the way to go.
I created a little library to manage key value pairs.
https://github.com/scaraveos/keyval.js#readme
It uses an object to store the keys, which allows for fast delete and value retrieval operations, and a linked list to allow for really fast value iteration.
Hope it helps :)
The third option is the best for any forward-looking application. You will probably wish to add more fields to your person record, so the first option is unsuitable. Also, it is very likely that you will have a large number of persons to store, and will want to look up records quickly - thus dumping them into a simple array (as is done in option #2) is not a good idea either.
The third pattern gives you the option to use any string as an ID, have complex Person structures and get and set person records in a constant time. It's definitely the way to go.
One thing that option #3 lacks is a stable deterministic ordering (which is the upside of option #2). If you need this, I would recommend keeping an ordered array of person IDs as a separate structure for when you need to list persons in order. The advantage would be that you can keep multiple such arrays, for different orderings of the same data set.
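A small sketch of that arrangement, using the data from the question: the records live in one keyed object, and each ordering is a separate array holding only ids. The names peopleById, idsByName and idsDescending are assumptions made for the example.

```javascript
const peopleById = {
  1: { id: 1, name: "Joe" },
  3: { id: 3, name: "Sam" },
  8: { id: 8, name: "Eve" }
};

// Separate orderings over the same records; each array holds ids only.
// Object.keys returns the ids as strings.
const idsByName = Object.keys(peopleById)
  .sort((a, b) => peopleById[a].name.localeCompare(peopleById[b].name));
const idsDescending = Object.keys(peopleById).sort((a, b) => b - a);

// Listing in a given order is a map over the id array plus a keyed lookup.
const namesAlphabetical = idsByName.map(id => peopleById[id].name);
console.log(namesAlphabetical); // [ 'Eve', 'Joe', 'Sam' ]
console.log(idsDescending);     // [ '8', '3', '1' ]
```

The cost is that every insert or delete has to update the id arrays as well as the keyed object.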
Given your constraint that you will only ever have name as the value, I would pick the first option. It's the cleanest, has the least overhead and the fastest look up.

Better design for data stored using HTML5 localStorage

I have a scenario on my web application and I would like suggestions on how I could better design it.
I have two steps in my application: Collection and Analysis.
When a collection is in progress, the user needs to be kept informed that it is going on, and the same goes for the analysis. The system also shows the last 10 collections and analyses performed by the user.
When the user is interacting with the system, the collections and analyses in progress (and, therefore, the last collections/analyses) keep changing very frequently. So, after considering different ways of storing this information in order to display it properly, given how dynamic it is, I chose to use HTML5's localStorage, and I am doing everything with JavaScript.
Here is how they are stored:
Collection in Progress: (set by a function called addItem that receives ITEMNAME)
Key: c_ITEMNAME_Storage
Value: c_ITEMNAME
Collection Finished or Error: (set by a function called editItem that also receives ITEMNAME and changes the value of the corresponding key)
Key: c_ITEMNAME_Storage
Value: c_Finished_ITEMNAME or c_Error_ITEMNAME
Collection in the 10 last Collections (set by a function called addItemLastCollections that receives ITEMNAME and prepares the key with the current date and time)
Key: ORDERNUMBER_c_ITEMNAME_DATE_TIME
Value: c_ITEMNAME
Note: The order number is from 0 to 9, and when each collection finishes, it receives the number 0. At the same time, the number 9 is deleted when the addItemLastCollections function is called.
For the analysis it is pretty much the same; the only thing that changes is that the "c" becomes an "a".
Anyway, I guess you understood the idea, but if anything is unclear, let me know.
What I want is opinions and suggestions of other approaches, as I am considering this inefficient and impractical, even though it is working fine. I want something easily maintained. I think that sticking with localStorage is probably the best, but not this way. I am not very familiar with the use of Design Patterns in JavaScript, although I use some of them very frequently in Java. If anyone can give me a hand with that, it would be good.
EDIT:
It is a bit hard even for me to explain exactly why I feel it is inefficient. I guess the main reason is that for each case (Progress, Finished, Error, Last Collections) I have to call a method and modify the string (adding underscores and more information), and to access any data (say, the name or the date) of each one I need to test which case it is and then keep using split("_"). I know this is not very straightforward, but I guess this whole approach could be better designed. As I am working alone on this part of the software, I don't have anyone to discuss things with, so I thought this would be a good place to exchange ideas :)
Thanks in advance!
Not exactly sure what you are looking for. Generally I use localStorage just to store stringified versions of objects that fit my application. Rather than setting up all sorts of different keys for each variable within localStore, I just dump stringified versions of my object into one key in localStorage. That way the data is the same structure whether it comes from server as JSON or I pull it from local.
You can quickly save or retrieve deeply nested objects/arrays using JSON.stringify( object) and JSON.parse( 'string from store');
Example:
My app object as sent from the server as JSON (I realize this isn't properly quoted JSON):
var data = { foo: { bar: [1,2,3], baz: [4,5,6,7] },
foo2: { bar: [1,2,3], baz: [4,5,6,7] }
};
saveObjLocal('app_analysis', data);
function saveObjLocal(key, obj) {
// localStorage stores strings only, so serialize the object first
localStorage.setItem(key, JSON.stringify(obj));
}
function getlocalObj(key) {
// returns null if the key has never been stored
return JSON.parse(localStorage.getItem(key));
}
var analysisObj = getlocalObj('app_analysis');
alert(analysisObj.foo.bar[2]); // 3
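Applied to the collection/analysis question above, the same idea would replace the underscore-delimited strings with one array of small records under a single key, so that name, status and date are read as properties instead of being recovered with split. This is only a sketch: the key "collections", the record fields, the sample values and the helper names are all assumptions. The storage parameter is there so the snippet also runs outside a browser; in a page you would pass localStorage.

```javascript
// In-memory stand-in with the same getItem/setItem shape as localStorage,
// used here only so the sketch can run outside a browser.
const memoryStorage = {
  data: {},
  getItem(key) { return key in this.data ? this.data[key] : null; },
  setItem(key, value) { this.data[key] = String(value); }
};

// Read the stored list, or an empty list if nothing has been saved yet.
function loadRecords(storage, key) {
  return JSON.parse(storage.getItem(key) || "[]");
}

// Add or update a record by name, newest first, keeping at most `limit` entries.
function saveRecord(storage, key, record, limit = 10) {
  const others = loadRecords(storage, key).filter(r => r.name !== record.name);
  const records = [record, ...others].slice(0, limit);
  storage.setItem(key, JSON.stringify(records));
  return records;
}

// Sample values, made up for the example.
saveRecord(memoryStorage, "collections", { name: "itemA", status: "progress", date: "2024-01-01T10:00" });
saveRecord(memoryStorage, "collections", { name: "itemB", status: "progress", date: "2024-01-01T10:05" });
saveRecord(memoryStorage, "collections", { name: "itemA", status: "finished", date: "2024-01-01T10:07" });

const latest = loadRecords(memoryStorage, "collections");
console.log(latest.map(r => r.name + ":" + r.status)); // [ 'itemA:finished', 'itemB:progress' ]
```

Progress, finished and error become values of one status field, and the "last 10" list falls out of the slice rather than from renumbering keys.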

Backbone multi-collection global search

I'm playing around with the idea of creating a global search that allows me to find any model in any of a number of collections by any of the model's attributes. For example:
I have the following collections:
Users
Applications
Roles
I don't know ahead of time what attributes each User, Application and Role will have, but for illustration purposes let's say I have:
User.name
User.last_name
User.email
Application.title
Application.description
Role.name
Role.description
Now, let's say I create a model called Site with a method called search. I want Site.search(term) to search through all the items in each collection where term matches any of the attributes. In essence, a global model search.
How would you suggest I approach this? I can brute-force it by iterating through all the collections' models and each model's attributes but that seems bloated and inefficient.
Any suggestions?
/// A few minutes later...
Here's a bit of code I tried just now:
find: function(query) {
var results = {}; // variable to hold the results
// iterate over the collections
_.each(["users", "applications", "roles"], _.bind(function(collection){
// I want the result to be grouped by type of model so I add arrays to the results object
if (!_.isArray(results[collection])) {
results[collection] = [];
}
// iterate over the collection's models
_.each(this.get(collection).models, function(model){
// iterate over each model's attributes
_.each(model.attributes, function(value){
// for now I'm only considering string searches
if (_.isString(value)) {
// see if `query` is in the attribute's string/value
if (value.indexOf(query) > -1) {
// if so, push it into the result's collection array
results[collection].push(model);
}
}
});
});
// a little cleanup
results[collection] = _.compact(results[collection]);
// remove empty arrays
if (results[collection].length < 1) {
delete results[collection];
}
},this));
// return the results
return results;
}
This yields the expected result and I suppose it works fine, but it bothers me that I'm iterating over three arrays. There may not be another solution, but I have a feeling there is. If anyone can suggest one, thank you! Meanwhile I'll keep researching.
Thank you!
I would strongly discourage you from doing this, unless you have a very limited set of data and performance is not really a problem for you.
Iteration over everything is a no-no if you want to perform search. Search engines index data and make the process feasible. It is hard to build search, and there is no client-side library that does that effectively.
Which is why everybody does searching on the server. There are (fairly) easy-to-use search engines such as Solr or, more recently and my personal preference, Elasticsearch. Presumably you already store your models/collections on the server, so it should be trivial to also index them. Then searching becomes a question of making a REST call from your client.
