I have the following example client side object:
var obj = {
    "locations": [
        [37.502917, -122.501335],
        [37.494473, -122.499619],
        [37.484394, -122.455673]
    ],
    "types": [
        ["type1"],
        ["type2"],
        ["type3"]
    ]
};
Locations could contain up to 50 values. An ajax request returns a set of new locations, and I need to check whether they are already in obj.locations. Each newly returned location is a string, e.g.:
var test = 37.502917 + ',' + -122.501335;
For each location I can iterate through the current ones and check if it is present:
for (var i = 0; i < obj.locations.length; i++) {
    // == coerces the [lat, lng] array to its "lat,lng" string form before comparing
    if (obj.locations[i] == test) {
        console.log('Found!');
    }
}
Is there a more efficient way of doing this, as iterating through the object for each new location seems inefficient?
EDIT: My Solution:
I decided to turn the locations array into a single string, then check each incoming string against it:
var test = -121.60183 + ',' + 38.025783;
var cords = [].concat([], obj.locations).toString();
if (cords.indexOf(test) !== -1) {
    console.log('found!');
}
This is perhaps one of the oldest problems in computer science: looking something up.
You first have to ask yourself if it's worth worrying about. Perhaps it will take 1ms to find the location with a linear search, but 0.5ms with some kind of optimized search. So, is it worth the trouble?
The next approach would be to sort the list of locations, and do a binary search on it.
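For illustration, a sketch of that sorted-list idea (binarySearch is a hand-rolled helper, not a built-in): convert each pair to a "lat,lng" string so ordinary string comparison works, sort once, then search in O(log n).
var sorted = obj.locations.map(function (loc) {
    return loc.join(',');
}).sort();

function binarySearch(arr, target) {
    var lo = 0, hi = arr.length - 1;
    while (lo <= hi) {
        var mid = (lo + hi) >> 1;
        if (arr[mid] === target) return mid;
        if (arr[mid] < target) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1; // not found
}

if (binarySearch(sorted, test) !== -1) {
    console.log('Found!');
}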
Another approach is to create some kind of hash table. You could use JavaScript objects for this, with properties as hash keys. The simplest approach would be to use lat+long as the property key, but now you've just shifted the efficiency problem to the efficiency of JS looking up keys in large objects.
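As a concrete illustration of that idea, a minimal sketch using an ES6 Set keyed on "lat,lng" strings (locationSet is an illustrative name; the Set plays the role of the hash table here):
var locationSet = new Set(obj.locations.map(function (loc) {
    return loc.join(',');
}));

var test = 37.502917 + ',' + -122.501335;
if (locationSet.has(test)) {
    console.log('Found!'); // O(1) average lookup instead of a linear scan
}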
You could design your own custom hash-like approach where, for example, all locations whose latitude has the same integral part (such as 37) are stored in an array under the key 37. Then the performance is governed by the time taken to find the hash key in the table, plus the time to scan the smaller number of locations within its array.
Proceeding further, if performance is truly an issue, you could build some kind of tree structure for optimal lookup. At some point, you have to start trading off between the cost of building and updating the tree, and the savings from looking things up using the tree.
It is certainly inefficient, but unless you have to deal with thousands of those objects it will not hang your browser.
However, you can index the locations in an associative array and then use that to check for presence or absence of an element.
For example, you could add a locations_index object to your object, like this:
var obj = {
    "locations": [
        [37.502917, -122.501335],
        [37.494473, -122.499619],
        [37.484394, -122.455673]
    ],
    "locations_index": {
        "37.502917,-122.501335": true,
        "37.494473,-122.499619": true,
        // ...
    },
    "types": [
        // ... as in the original object
    ]
};
Then you can check whether a location is already in locations_index with:
if (obj.locations_index["37.502917,-122.501335"]) {
    // It's already there
} else {
    // It's a new location
}
Obviously, you need to take care of adding new locations to (and removing deleted ones from) both the "real" array and the index.
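For illustration, a pair of hypothetical helpers (not part of the original answer) that keep the array and the index in sync:
function addLocation(obj, lat, lng) {
    var key = lat + ',' + lng;
    if (!obj.locations_index[key]) {
        obj.locations.push([lat, lng]);
        obj.locations_index[key] = true;
    }
}

function removeLocation(obj, lat, lng) {
    var key = lat + ',' + lng;
    if (obj.locations_index[key]) {
        delete obj.locations_index[key];
        obj.locations = obj.locations.filter(function (loc) {
            return loc.join(',') !== key;
        });
    }
}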
I want to get familiar with IndexedDB to build my Firefox WebExtension.
My sample data is structured like this:
const sampleDataRaw = [
    {
        "ent_seq": 1413190,
        "att1": [
            { "sub11": "content1", "sub12": ["word"] },
            { "sub11": "content2" }
        ],
        "att2": [
            { "sub21": "other content", "sub22": ["term"] }
        ]
    },
    {
        "ent_seq": 1000010,
        "att2": [
            { "sub21": "more content" },
            { "sub22": "more words" }
        ]
    }
]; // end sampleDataRaw
I got as far as opening/creating my database, adding this sample data and querying it by the ent_seq key using objectStore.get() and objectStore.openCursor().
The problem arises when I want to search the sub11 or sub21 fields using the indexes I created for them like this:
objectStore.createIndex("sub11Elements", "att1.sub11", { unique: false });
objectStore.createIndex("sub21Elements", "att2.sub21", { unique: false });
When I want to search, say, the sub11 fields, like this:
var index = objectStore.index("sub11Elements");
index.get("content1").onsuccess = function(event) {
// I should have the first object of my data now, alas the result is undefined instead
};
It certainly does succeed, but the returned value is undefined since the get() didn't actually find anything.
I want to know why it doesn't find the entry and how to make it find it. I figured it might be because the key path is wrong, but as stated, if I instead search by the key (ent_seq) I can successfully read the result.att1[i].sub11 values.
On Mozilla's site it is stated that keys can be of type string and array (or arrays within arrays, etc.), among others, and that key path parts are supposed to be concatenated with dots.
From searching on Stack Exchange I've so far found that it's not possible to have variable keys inside the key path, but that shouldn't be the case here anyway.
Therefore, I really don't see what might be causing the search to not find the object inside the database.
It looks like the second level of objects consists of arrays, not properties of the first-level objects. The . accessor accesses sub-properties, not the indices of an array.
IDBObjectStore.prototype.get always yields success when there is no error, and is not indicative of whether a match was found.
A bit more on point 1. Look at "att1": [{ "sub11": "content1", "sub12": [ "word" ] }, .... Pretend this was an actual basic JavaScript object. Would you be able to use att1.sub11? No, because the value of att1 is an array, not an object.
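One possible workaround, sketched under the assumption that you can reshape the records before storing them: copy the values you want to search for into a flat array property (the name sub11Values below is made up for illustration) and index that property with multiEntry.
// in onupgradeneeded, index the flattened property instead of "att1.sub11":
objectStore.createIndex("sub11Elements", "sub11Values", { multiEntry: true });

// when adding a record, precompute the flat list of sub11 values:
const record = sampleDataRaw[0];
record.sub11Values = (record.att1 || [])
    .map(entry => entry.sub11)
    .filter(value => value !== undefined);
objectStore.add(record);

// now the index can find the record by any of its sub11 values:
const index = objectStore.index("sub11Elements");
index.get("content1").onsuccess = event => {
    console.log(event.target.result); // the matching record, or undefined
};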
There are two arrays of objects, one from the database and one from a CSV. I need to compare the objects in the two arrays by their phone and email properties and find the duplicates. Due to the odd database object structure I have to do the comparison in JavaScript. What is the best algorithm and the best way to compare and find the duplicates?
Let me explain with some simple numbers.
There are 5,000 contacts in my database, and a user may upload another 3,000 contacts from a CSV. Each time, we need to find which uploaded contacts already exist in the database; duplicates should overwrite the existing entries, and the rest should be inserted. If I compare the contacts row by row, that is 5,000 database contacts x 3,000 CSV contacts = 15,000,000 comparisons.
This is the scenario I currently face, and it makes the system hang. I need an efficient solution to this problem.
I am developing this in Node.js with RethinkDB.
The database object structure looks exactly like this, and the same emails and phones may also appear as duplicate entries in other contacts:
[{
    id: 2349287349082734,
    name: "ABC",
    phones: [
        { id: 2234234, flag: true, value: 982389679823 },
        { id: 65234234, flag: false, value: 2979023423 }
    ],
    emails: [
        { id: 22346234, flag: true, value: "test@domain.com" },
        { id: 609834234, flag: false, value: "test2@domain.com" }
    ]
}]
Please review the fiddle code if you want: https://jsfiddle.net/dipakchavda2912/eua1truj/
I have already added indexing. The problem looks very easy and familiar at first sight, but when we talk about concurrency it is really critical and CPU-intensive.
If I understand the question correctly, you can use the lodash method differenceWith:
const _ = require("lodash");

let csvContacts = [];      // fill it with the contacts parsed from the CSV
let databaseContacts = []; // fill it with the contacts from your database

// diffArray will hold the CSV contacts that are not already in the database
// note: adjust the comparator to your actual structure (the question's contacts
// have emails/phones arrays rather than a single email property)
let diffArray = _.differenceWith(
    csvContacts,
    databaseContacts,
    (firstValue, secValue) => firstValue.email === secValue.email
);
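If the O(n x m) pairwise comparison behind differenceWith is too slow for 5,000 x 3,000 contacts, a hash-based pass is one alternative. A sketch assuming the phones/emails structure shown in the question (buildKeySet is an illustrative helper, not a lodash function):
// Build a Set of every email and phone value in the database contacts: O(n).
function buildKeySet(contacts) {
    const keys = new Set();
    for (const contact of contacts) {
        (contact.emails || []).forEach(e => keys.add('email:' + e.value));
        (contact.phones || []).forEach(p => keys.add('phone:' + p.value));
    }
    return keys;
}

const dbKeys = buildKeySet(databaseContacts);

// One pass over the CSV contacts: O(m) lookups instead of O(n * m) comparisons.
const duplicates = csvContacts.filter(c =>
    (c.emails || []).some(e => dbKeys.has('email:' + e.value)) ||
    (c.phones || []).some(p => dbKeys.has('phone:' + p.value))
);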
I am trying to take a JSON list that is formatted like this (the real list has over 2,500 entries):
[
    ['fb.com', 'http://facebook.com/'],
    ['ggle.com', 'http://google.com/']
]
The JSON list represents: ['request url', 'destination url']. It is for a redirect audit tool built on Node.js.
The goal is to put those JSON value pairs into a JavaScript object that maps each key to an array of values, as such:
var importedUrls = {
    requestUrl: [
        'fb.com',
        'ggle.com'
    ],
    destinationUrl: [
        'https://www.facebook.com/',
        'http://www.google.com/'
    ]
}
Due to the sheer number of redirects, I would prefer a non-blocking solution if possible.
You first need to create your object:
var importedUrls = {
    requestUrl: [],
    destinationUrl: []
}
Now, let's say you have your data in an array called importedData for lack of a better name. You can then iterate that array and push each value to its proper new array:
importedData.forEach(function (urls) {
    importedUrls.requestUrl.push(urls[0]);
    importedUrls.destinationUrl.push(urls[1]);
});
This will format your object as you want it to be formatted, I hope.
I will propose that you take another approach. Why not have importedUrls be an array of objects, each one with its corresponding keys?
You could have something like:
importedUrls = [
    {
        requestUrl: 'req',
        destinationUrl: 'dest'
    },
    {
        requestUrl: 'req2',
        destinationUrl: 'dest2'
    }
]
I'm sure you can figure out how to tweak the code I showed to fit this format if you want to. What you gain with this is a very clear separation of your urls and it makes the iterations a lot more intuitive.
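If you prefer that shape, a minimal sketch of building it from the imported pairs (assuming importedData is the array of [requestUrl, destinationUrl] pairs used above):
var importedUrls = importedData.map(function (pair) {
    return { requestUrl: pair[0], destinationUrl: pair[1] };
});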
I am trying to optimize the access cost of nested objects. I have the following structure (example):
Now I want to access the data, but the problem is that I need to keep adding loops wherever the data is nested. That means that to access the racks I need to iterate with 3 nested for loops, like this:
var jsonObj = [{
    "shelfs": [
        {
            "Shelf1": [{
                "Racks": [
                    { "Rack1": [{ "Book1": "Value" }] },
                    { "Rack2": [{ "Book1": "Value" }] }
                ]
            }]
        },
        {
            "Shelf2": [{
                "Racks": [
                    { "Rack1": [{ "Book1": "Value" }] },
                    { "Rack2": [{ "Book1": "Value" }] }
                ]
            }]
        }
    ]
}];
for (var i = 0; i < jsonObj.length; i++) {
    var shelfs = jsonObj[i];
    var key = Object.keys(shelfs)[0];
    for (var j = 0; j < shelfs[key].length; j++) {
        var shelfdetails = shelfs[key][j];
        var skeys = Object.keys(shelfdetails);
        for (var k = 0; k < skeys.length; k++) {
            var racks = shelfdetails[skeys[k]];
            alert(JSON.stringify(racks));
        }
    }
}
Here, to access the rack information I had to nest 3 for loops, which increases the time complexity. Can anybody suggest a better data structure or method to access nested JavaScript objects with lower time complexity?
You have n books that you want to display in your UI. It will take n display operations to display n books. It does not matter that they are in nested loops, the total number of display operations is still n. There is no optimization you can perform to reduce the number of display operations you need to perform.
Even if you were to flatten your data structure in to a single flat array of books the number of display operations would still be n.
I am trying to optimize the access cost of nested objects.
Do you mean the CPU cost, the storage cost, or the code complexity cost? The three have quite different implications. Since you go on to say
I need to keep on adding loops wherever I got nested data.
I am going to assume that you are most interested in code complexity. In that case, consider the following flatter data structure, which might be easier to loop through, to filter, to sort, to group, and to otherwise process using utility libraries such as underscore.
[
{ shelf: 'Shelf1', rack: 'Rack1', book: 'Book1', value: "Value"},
{ shelf: 'Shelf1', rack: 'Rack2', book: 'Book1', value: "Value"},
{ shelf: 'Shelf2', rack: 'Rack1', book: 'Book1', value: "Value"},
{ shelf: 'Shelf2', rack: 'Rack2', book: 'Book1', value: "Value"}
]
Abstractly speaking, each "Book1": "Value" item has associated with it a shelf and a rack. In your suggested data structure, this "association" is represented by "belonging to" relationships, where it belongs to an array which is the value of a property whose name specifies the shelf or rack. In the above flatter structure, the associations are instead specified explicitly by giving them as properties.
With the flatter structure, if for some reason you wanted to create a data object with keys giving the shelf and values giving an array of objects on that shelf, in Underscore that is as easy as
_.groupBy(obj, 'shelf')
So all else being equal it seems that the flatter data structure is a more flexible way to represent the data, and you can derive other things you need from it more easily.
Another way to look at it is that currently in order to find the sets of relationships of shelves, racks, and books, you need to iterate through three levels of nested arrays, whereas in the flatter structure the relationships are represented more directly.
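For completeness, a sketch of deriving the flat structure from the nested one in the question; it walks each level's wrapper arrays and key names (flat is an illustrative variable name):
var flat = [];
jsonObj.forEach(function (top) {
    top.shelfs.forEach(function (shelfEntry) {
        Object.keys(shelfEntry).forEach(function (shelfName) {
            shelfEntry[shelfName].forEach(function (rackWrapper) {
                rackWrapper.Racks.forEach(function (rackEntry) {
                    Object.keys(rackEntry).forEach(function (rackName) {
                        rackEntry[rackName].forEach(function (bookObj) {
                            Object.keys(bookObj).forEach(function (bookName) {
                                flat.push({
                                    shelf: shelfName,
                                    rack: rackName,
                                    book: bookName,
                                    value: bookObj[bookName]
                                });
                            });
                        });
                    });
                });
            });
        });
    });
});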
Performance, either CPU-wise or space-wise, is rarely going to be a reason to choose one structure over another, unless you are dealing with a huge amount of data. Otherwise, the difference in performance is likely to measured in milliseconds or microseconds, or a few K of storage. You should choose the structure that allows you to represent your algorithms in a fashion which is concise and provably correct. If you intend to handle hundreds of thousands of objects, then in that case yes, you would want to design custom structures optimized for time or space.
I have an array of approximately 19,000 items.
I'll have to access them by an arbitrary id at random (that is, there's no need to traverse the array)
I was just wondering if JS can optimize the code if I use the id as the index into the array, or if there's any kind of trick or library to speed up this kind of thing.
To be more precise, I'll have the results of an election in approximately 20k schools, and I'd like to know your advice about which one would be faster:
[
    {
        school_id: xx,
        results: [
            { party_id: xx, votes: xx },
            [...]
        ]
    },
    [...]
]
or, using school_id as the index into the array:
[
    [
        { party_id: xx, votes: xx },
        [...]
    ],
    [...]
]
The question is whether JS is smart enough to optimize random array access. Any tool you could advise me to use to test the performance would also be very welcome.
These questions are always engine-dependent. In V8 (Google Chrome, Node.js):
Objects and Arrays are not radically different. For implementation simplicity, all objects have an external elements array where properties that are positive integers are stored.
So when you do obj[5], it doesn't matter whether obj is a JavaScript Array object or any other JavaScript object - it will access the object's external elements array.
So if you created an object like this:
var a = {
    a: 3,
    b: 4,
    c: {},
    5: 5,
    6: 6
};
The object layout will be:
[HiddenClassPointer, PropertiesArrayPointer, ElementsArrayPointer, TaggedSmallInteger(3), TaggedSmallInteger(4), JSObjectPointer]
Note how the named fields are stored side by side with the internal fields. If you now add any property after the fact, it will go into the external properties array pointed to by the second field, instead of being stored on the object directly.
The "fields" with the integer key would be in the external elements array pointed to by ElementsArrayPointer like this:
[HiddenClassPointer, TaggedSmallInteger(25), TheHolePointer, TheHolePointer, TheHolePointer, TheHolePointer, TheHolePointer, TaggedSmallInteger(5), TaggedSmallInteger(6), ...more hole pointers until 25 elements]
The 25 is the length of the backing array. I will come back to that soon.
The hole pointer is needed to disambiguate between explicit undefined values given by the user and actual holes in the array. When you try to retrieve a[3], it will return undefined because there was a hole, but the actual hole object is not returned to the user. So there are actually 3 different types of null :P
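A quick illustration of the difference; the in operator can tell a hole apart from an explicitly stored undefined:
var arr = [];
arr[2] = undefined; // explicit undefined stored at index 2
arr[5] = 5;         // indices 0, 1, 3 and 4 are holes

console.log(arr[1]);   // undefined (a hole, converted on read)
console.log(arr[2]);   // undefined (actually stored)
console.log(1 in arr); // false - a hole
console.log(2 in arr); // true - a real element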
The initial length of 25 comes from the formula (initial_index + 1) + ((initial_index + 1) / 2) + 16, using integer division; with the first integer index 5 this gives 6 + 3 + 16 = 25. You can see it in a heap snapshot.
( 108 - 8 ) / 4 === 25 // a 108-byte elements array in the snapshot, minus the 8-byte header, at 4 bytes per element
Write a test using JSPerf. You can test a number of different scenarios using it.
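jsPerf is the usual choice; for a rough local check you could also time a lookup loop yourself. A crude sketch (the sizes and names here are illustrative, not from the question):
// build ~20k entries indexed by school_id
var bySchool = [];
for (var i = 0; i < 20000; i++) {
    bySchool[i] = [{ party_id: 1, votes: i }];
}

console.time('array-indexed lookup');
var total = 0;
for (var j = 0; j < 1000000; j++) {
    total += bySchool[j % 20000][0].votes;
}
console.timeEnd('array-indexed lookup');
console.log(total); // use the result so the loop is not optimized away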