How does MongoDB indexing reduce the number of docs to be scanned? - javascript

I have a collection like: [{a: 'eXeD9', b: 399}, {a: 'eXe9', b: 35399}, xOBJs].
I am going to search for 24823293 in the b field. As far as I know, I have to traverse all docs until one matches 24823293.
So I am confused: if I create an index on the b field, how can it reduce the number of docs to scan?
After all, 24823293 might not be within those reduced docs.
I am a mobile application developer, so this confuses me. Any help is appreciated.

Because with an index, the scan is performed against the possible values of { b } (which are stored in a time-efficient data structure, like a B-tree) rather than against your whole set of documents.
Creating an index on { b } can be seen as making the value of { b } an access key to the documents themselves.
You end up with an index scan instead of a full scan, which can make a dramatic difference.
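As a rough illustration only, the sketch below uses a sorted array plus a binary search as a simplified stand-in for a B-tree. It is not how MongoDB implements indexes internally, and the names docs, buildIndex, and findByB, as well as the two extra sample documents, are made up for this example.

```javascript
// Toy collection; a real one would hold many more documents.
const docs = [
  { a: 'eXeD9', b: 399 },
  { a: 'eXe9', b: 35399 },
  { a: 'q1', b: 24823293 }, // hypothetical extra document
  { a: 'zz', b: 7 },        // hypothetical extra document
];

// Build a sorted list of { key, pos } entries: a simplified stand-in for an index on one field.
function buildIndex(collection, field) {
  return collection
    .map((doc, pos) => ({ key: doc[field], pos }))
    .sort((x, y) => x.key - y.key);
}

// Binary search over the sorted index: roughly log2(n) comparisons instead of n.
function findByB(index, collection, value) {
  let lo = 0;
  let hi = index.length - 1;
  let steps = 0;
  while (lo <= hi) {
    steps++;
    const mid = (lo + hi) >> 1;
    if (index[mid].key === value) {
      return { doc: collection[index[mid].pos], steps };
    }
    if (index[mid].key < value) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, steps }; // the value is not in the collection at all
}

const bIndex = buildIndex(docs, 'b');
const hit = findByB(bIndex, docs, 24823293);
const miss = findByB(bIndex, docs, 12345);
console.log(hit.doc, miss.doc);
```

Because the index is ordered, every entry the search skips is known to be smaller or larger than the target, so the value cannot be hiding among the skipped documents.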

Related

Find tag associated with high num values - Javascript logic function

Given an array of objects, calculate the tag that produces the highest "num" most often.
I'm having an issue where the num represents level of happiness, and I'm trying to find the tag that leads to the highest level most often.
Example of array:
[{num: 5, tags:["friends", "family"]}, {num: 1, tags:["friends", "work"]}, {num: 4, tags:["school"]}]
The family tag would get a lot of points cause it appears with a 5. Friends would get points too, but then be disadvantaged because of the one.
This problem is supposed to be a little obscure. There's no one way to do it. If you have any suggestions for how, please leave a comment! If you have code, even better:).
Thank you!
This looks like homework or a job interview question, so I'm going to make suggestions rather than full solutions.
You have an input data structure; think about what your output data structure should look like. Coding then becomes a simple matter of transforming one to the other.
One possibility is a simple object, treating the tags as keys and their num as values, something like:
{friends: 5, family: 5, work: 1...}
...but this means you have to make decisions about what happens when the same tag has more than one value. Do you only care about the highest value for a given tag? Do you want the average of all values? Do you want a confidence interval? (i.e. if there are fifteen instances of tag A, and only one instance of tag B, the values for A are more likely to be "correct".) This is the part that requires some interpretation of the question.
So that implies that instead of keeping track of only a single value for each tag, maybe you need to keep track of all of them, so for every tag you'll be able to see how many scores it received, and what they were:
{friends: [1, 5], family: [5], work: [1]...}
...and then as a final pass you can go through those arrays and perform whatever interpretation of the data you decided on above, resulting in a single number for each tag.
So now that you know what you're converting to, the algorithm is pretty obvious:
Initialize an output object
for each object in the source array,
    for each tag in the object's "tags",
        if the tag doesn't exist yet in the output object, create it as a new object key, with an empty array as its value
        push the "num" value onto that array
for each key in the output object,
    Do Something™ to its values array to convert it into a single value
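For readers who want to see the steps above in code, here is one possible sketch. Averaging is only one assumed interpretation of "Do Something"; the function names groupNumsByTag and averageScores are made up for this example.

```javascript
const entries = [
  { num: 5, tags: ['friends', 'family'] },
  { num: 1, tags: ['friends', 'work'] },
  { num: 4, tags: ['school'] },
];

// Step 1: collect every num seen for each tag.
function groupNumsByTag(items) {
  // Object.create(null) avoids clashes with inherited keys such as "constructor".
  const byTag = Object.create(null);
  for (const item of items) {
    for (const tag of item.tags) {
      if (!byTag[tag]) byTag[tag] = [];
      byTag[tag].push(item.num);
    }
  }
  return byTag;
}

// Step 2: reduce each array to a single score.
// Assumption: the average is used here; max, median, etc. would work the same way.
function averageScores(byTag) {
  const scores = Object.create(null);
  for (const tag of Object.keys(byTag)) {
    const nums = byTag[tag];
    scores[tag] = nums.reduce((sum, n) => sum + n, 0) / nums.length;
  }
  return scores;
}

const grouped = groupNumsByTag(entries);
const scores = averageScores(grouped);

// Pick the tag with the highest score.
const bestTag = Object.keys(scores).reduce((best, tag) =>
  scores[tag] > scores[best] ? tag : best
);
console.log(grouped, scores, bestTag);
```

With averaging, a tag that appears once with a high num beats a tag that appears often with mixed nums, which may or may not be what you want; that is exactly the interpretation decision described earlier.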
Pretty easy with the lodash countBy function.
const theArray = [{num: 5, tags:["friends", "family"]}, {num: 1, tags:["friends", "work"]}, {num: 4, tags:["school"]}];
// Collect every tag from every item into one flat list.
const allTags = [];
theArray.forEach((item) => {
  allTags.push(...item.tags);
});
const count = _.countBy(allTags);
console.log(count);
https://jsfiddle.net/m7L1prau/

Is there a way to map a value in an object to the index of an array in javascript?

Prefacing this by noting that a solution only needs to work in the latest versions of Chrome, Firefox, and Safari as a bonus.
-
I am trying to use an associative array for a large data set with knockout. My first try made it a true associative array:
[1: {Object}, 3: {Object},...,n:{Object}]
but knockout was not happy with looping over that. So I tried a workaround, hoping for:
[undefined, {Object}, undefined, {Object},...,{Object}]
where the location in the array is the PK ID from the database table. This array is about 3.2k items large, and would be iterated over around every 10 seconds, hence the need for speed. I tried doing this with a splice, e.g.
$.each(data, function (index, item) {
    self.myArray.splice(item.PKID, 0, new Object(item));
});
but splice does not create indices, so since my first PKID is 1, it is still inserted at myArray[0] regardless. If my first PK was 500, it would start at 0 still.
My second thought is to initialize the array with var myArray = new Array(maxSize) but that seems heavy handed. I would love to be able to use some sort of map function to do this, but I'm not really sure how to make the key value translate into an index value in javascript.
My third thought was to keep two arrays, one for easy look up and the other to store the actual values. So it combines the first two solutions, almost, by finding the index of the object in the first example and doing a lookup with that in the second example. This seems to be how many people manage associative arrays in knockout, but with the array size and the fact that it's a live updating app with a growing data set seems memory intensive and not easily manageable when new information is added.
Also, maybe I'm hitting the mark wrong here? We're putting these into the DOM via knockout and managing with a library called isotope, and as I mentioned it updates about every 10 seconds. That's why I need the fast look up but knockout doesn't want to play with my hash table attempts.
--
clarity edits:
So on initial load the whole array is loaded (which is where the new Array(maxLength) would go); then every 10 seconds anything that has changed is loaded again. That is the information I'm trying to quickly update.
--
knockout code:
<!-- ko foreach: {data: myArray(), afterRender: setInitialTileColor } -->
<div class="tile" data-bind="attr: {id: 'tileID' + $data.PKID()}">
    <div class="content">
    </div>
</div>
<!-- /ko -->
Then on updates the hope is:
$.each(data.Updated, function (index, item) {
    var obj = myModel.myArray()[item.PKID];
    //do updates here - need to check what kind of change, how long it's been since a change, etc
});
Here is a way to populate the array so that each item sits at its own index instead of starting from 0.
Just use this in the loop:
arr[obj.PKID] = obj;
If your framework is smart (it iterates with forEach rather than a plain for loop), it will start from your first index (500 in the example below).
http://jsfiddle.net/0axo9Lgp/
var data = [], new_data = [];
// Generate sample array of objects with index field
for (var i = 500; i < 3700; i++) {
    data.push({
        PKID: i,
        value: '1'
    });
}
data.forEach(function(item) {
    new_data[item.PKID] = item;
});
console.log(new_data);
console.log(new_data.length); // 3700, but only 3200 slots are populated; indexes 0-499 are empty
It's not an easy problem to solve. I'm assuming you've tried (or can't try) the obvious stuff like reducing the number of items per page and possibly using a different framework like React or Mithril.
There are a couple of basic optimizations I can suggest.
Don't use the framework's each. It's either slower than or the same as the native Array method forEach; either way, it's slower than a basic for loop.
Don't loop over the array over and over again looking for every item whose data has been updated. When you send your response of data updates, send along an array of the PKIDs of the updated items. Then, do a single loop:
var indexes = [];
var updated = JSON.parse(response).updated; // example array of updated pkids.
for (var i = 0; i < allElements.length; i++) {
    if (updated.indexOf(allElements[i].pkid) > -1)
        indexes.push(i);
}
So, basically the above assumes you have a simple array of objects, where each object has a property called pkid that stores its ID. When you get a response, you loop over this array once, storing the indexes of all items that match a pk-id in the array of updated pk-ids.
Then you only have to loop over the indexes array and use its elements as indexes on the allElements array to apply the direct updates.
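A sketch of that second step, under some assumptions: each update is taken to be an object carrying a pkid plus the changed fields, and a Map is used for the membership check in place of indexOf. The updates payload and the applyUpdates helper are hypothetical.

```javascript
const allElements = [
  { pkid: 501, value: 'a' },
  { pkid: 502, value: 'b' },
  { pkid: 503, value: 'c' },
];

// Hypothetical server payload: only the items that changed.
const updates = [
  { pkid: 503, value: 'c2' },
  { pkid: 501, value: 'a2' },
];

function applyUpdates(elements, changed) {
  // Map from pkid to its update, so each check below is a single lookup.
  const byId = new Map(changed.map((u) => [u.pkid, u]));

  // Single pass over the full array, remembering where the changed items live.
  const indexes = [];
  for (let i = 0; i < elements.length; i++) {
    if (byId.has(elements[i].pkid)) indexes.push(i);
  }

  // Short loop over just the affected positions to apply the changes in place.
  for (const i of indexes) {
    Object.assign(elements[i], byId.get(elements[i].pkid));
  }
  return indexes;
}

const touched = applyUpdates(allElements, updates);
console.log(touched, allElements);
```

Swapping indexOf for a Map matters mostly when the list of updated IDs is itself long; for a handful of updates either approach should behave similarly.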
If your indexes are integers in a reasonable range, you can just use an array. It does not have to be completely populated; you can use the if binding to filter out unused entries.
Applying updates is just a matter of indexing the array.
http://jsfiddle.net/0axo9Lgp/2/
You may want to consider using the publish-subscribe pattern. Have each item subscribe to its unique ID. When an item needs updating it will get the event and update itself. This library may be helpful for this. It doesn't depend upon browser events, just arrays so it should be fairly fast.
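To make the idea concrete, here is a minimal hand-rolled sketch of publish-subscribe keyed by item ID. It is not the library referred to above; createHub, subscribe, and publish are names invented for this example.

```javascript
// Minimal publish-subscribe hub keyed by item ID (illustrative only).
function createHub() {
  const handlers = new Map(); // id -> array of callbacks
  return {
    subscribe(id, callback) {
      if (!handlers.has(id)) handlers.set(id, []);
      handlers.get(id).push(callback);
    },
    publish(id, payload) {
      const list = handlers.get(id) || [];
      list.forEach((callback) => callback(payload));
      return list.length; // number of subscribers that were notified
    },
  };
}

const hub = createHub();
const tile = { PKID: 42, status: 'old' };

// Each item subscribes to its own ID and updates itself when notified.
hub.subscribe(tile.PKID, (change) => Object.assign(tile, change));

const notified = hub.publish(42, { status: 'new' }); // reaches the tile
const ignored = hub.publish(99, { status: 'x' });    // nobody is listening for 99
console.log(tile, notified, ignored);
```

With this shape, the 10-second update handler never searches the array; it just publishes each change under its ID.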

JavaScript object vs. array lookup performance

What is the performance difference between retrieving the value by key in a JavaScript object vs iterating over an array of individual JavaScript objects?
In my case, I have a JavaScript object containing user information where the keys are the user's IDs and the values are each user's information.
The reason I ask this is because I would like to use the angular-ui-select module to select users, but I can't use that module with a Javascript object - it requires an array.
How much, if anything, am I sacrificing by switching from a lookup by key, to a lookup by iteration?
By key:
var user = users[id];
By iteration
var user;
for (var i = 0; i < users.length; i++) {
    if (users[i].id == id) {
        user = users[i];
        break;
    }
}
The answer to this is browser dependent; however, there are a few performance tests on jsperf.com on this matter. It also comes down to the size of your data. Generally it is faster to use object key-value pairs when you have large amounts of data. For small datasets, arrays can be faster.
Array search performance depends on where in the array your target item sits. Object search performance is more consistent, as keys don't have a specific order.
Also, looping through arrays is faster than looping through keys, so if you plan on doing operations on all items, it can be wise to put them in an array. In some of my projects I do both, since I need to do bulk operations and fast lookup from identifiers.
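Doing both might look like the sketch below: the array stays the source for bulk operations (and for widgets that require an array), while a lookup built once from it serves ID access. The sample users and the name usersById are assumptions for illustration.

```javascript
// The array is kept for iteration and for components that need an array.
const users = [
  { id: 'u1', name: 'Ann' },
  { id: 'u2', name: 'Bob' },
  { id: 'u3', name: 'Cy' },
];

// Build the lookup once; it stores references to the same objects, not copies.
const usersById = new Map(users.map((user) => [user.id, user]));

// Lookup by key: no iteration over the array.
const byKey = usersById.get('u2');

// Lookup by iteration, for comparison.
const byScan = users.find((user) => user.id === 'u2');

console.log(byKey === byScan);
```

The cost is keeping the two structures in sync: whenever a user is added to or removed from the array, the lookup has to be updated as well.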
A test:
http://jsben.ch/#/Y9jDP
This problem touches all programming languages. It depends on many factors:
size of your collection - arrays get slower when you are searching for the last key and the array is quite long
can elements repeat themselves - if yes, then you need an array. If no, you need either a dictionary (map) or an add method that iterates your array on each add to find possible duplicates, which can be troublesome when dealing with large lists
average key usage - you will lose performance if the most requested userId is at the end of the list.
In your example, a map would be a better solution.
Secondly, you need to add a break to your code:)
var user;
for (var i = 0; i < users.length; i++) {
    if (users[i].id == id) {
        user = users[i];
        break;
    }
}
Or you will lose performance:)
Associative arrays are much slower than arrays with numbered indexes, because associative arrays work by doing string comparisons, which are much, much slower than number comparisons!

Summarizing a javascript array of strings

I have a variable length array of strings declared in javascript that contains Dungeons and Dragons class names. An example of this is below:
var class_names = new Array("Wizard", "Wizard", "Wizard", "Sorcerer",
"Sorcerer", "Ultimate Magus");
In my HTML, I use the javascript window.onload function to set a variety of variables from the javascript file to build the content of the page being displayed locally.
For things like name, this is easy:
document.getElementById('charname').innerHTML = name[0];
But for the class info, I don't want to just pump out a massive string of class names, I want it condensed down. Using the example 'class_names' above, I want to end up with a string that looks like this:
"Wizard 3, Sorcerer 2, Ultimate Magus 1"
i.e. the number after each class name should be the number of repetitions found in the array.
Anyone have an idea how to make this happen on the fly, so when I alter the javascript file to add more class data to class_names, it is displayed appropriately on my HTML page?
Thanks in advance for any help I get on this pet project (namely creating a HTML page for each character in my campaign that can be printed out as a character sheet....it's far better than manually writing a page for each character, or handwriting it on vanilla sheets).
It's easy enough, just loop through the array and count repetitions.
var l = class_names.length, i, tmp = {}, ret = [];
// Count how many times each class name appears.
for (i = 0; i < l; i++) {
    if (!tmp[class_names[i]]) tmp[class_names[i]] = 0;
    tmp[class_names[i]]++;
}
// Turn each tally into a "Name count" string.
for (i in tmp) {
    if (tmp.hasOwnProperty(i)) {
        ret.push(i + " " + tmp[i]);
    }
}
// output is ret.join(", ");
I think there are many ways to solve your problem...
Possibility A:
If you don't know if the classes are appearing in the right order, try to sort your Array first to ensure that they are grouped properly.
Iterate over the array and count the repetitions, i.e. increase your counter if
lastElement === class_names[i]
and append the result for the last class name to the result string and set the counter back to 1 otherwise.
Possibility B:
Store your Array directly as ["Wizard", 3, "Sorcerer", 2, ...] - this is possible since JS does not require arrays to contain the same type of element at each position.
Possibility C:
Use a different structure, e.g. using objects:
var class_names = [{name: "Wizard", level: 3}, {name: "Sorcerer", level: 2}, ...]
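Possibility A could be sketched roughly as follows. The function name summarize is made up, and note that sorting means the result comes out in alphabetical order rather than in the order the classes first appear.

```javascript
// Sort a copy so equal names sit next to each other, then count each run.
function summarize(names) {
  const sorted = [...names].sort();
  const parts = [];
  let count = 0;
  for (let i = 0; i < sorted.length; i++) {
    count++;
    // A run ends at the last element or when the next name differs.
    if (i === sorted.length - 1 || sorted[i + 1] !== sorted[i]) {
      parts.push(sorted[i] + ' ' + count);
      count = 0;
    }
  }
  return parts.join(', ');
}

const summary = summarize([
  'Wizard', 'Wizard', 'Wizard', 'Sorcerer', 'Sorcerer', 'Ultimate Magus',
]);
console.log(summary);
```

If the original order matters (Wizard first, as in the question), the counting-object approach from the first answer is the better fit, since it does not reorder the names.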

Javascript/jQuery Id check to drive numbering function with validation

I need help with a loop... it's probably simple but I'm having difficulty coding it up.
Basically, I need to check existing Ids for their number so I can create a unique id with a different number. They're named like this: id="poly'+i+'" in sequence with my function where i is equal to the number of existing elements. Example: Array 1, Array 2, Array 3 corresponding with i=1 for the creation of Array 1, i=2 for Array 2, etc.
Right now i is based on the total number of existing elements, and my "CreateNew" function is driven off x=i+1 (so the example above, the new element will be named Array 4). The problem is that if you delete one of the middle numbers, the "Create" function will duplicate the high number. i.e. Array 1, 2, 3 delete 2, create new-> Array 1, 3, 3.
I need an if() statement to check if the array already exists then a for() loop to cycle through all i's until it validates. Not sure how to code this up.
The code I'm trying to correct is below (note I did not write this originally, I'm simply trying to correct it with my minimal JS skills):
function NewPanel() {
    var i = numberOfPanels.toString();
    var x = (parseInt(i)+1).toString();
    $('#items').append('<div onclick="polygonNameSelected(event)" class="polygonName" id="poly'+i+'"> Array '+ x +' </div>');
    $('div[id*=poly]').removeClass('selected');
    $('#poly'+i).addClass('selected');
    $('#poly'+i).click(function() {
        selectedPolygon = i;
        $('div[id*=poly]').removeClass('selected');
        $(this).addClass('selected');
    });
}
THANK YOU! :)
Please clarify "The problem is that if you delete one of the middle numbers, ". What do you mean by delete? Anyway, the simplest solution is to create two arrays. Both arrays will hold the same created ids. Whenever an id is created in the first array, the same id is added to the second array. So when one is deleted from the first array, check the second array's highest value and then create the next id in the first array from it. I hope this did not confuse you.
Well it is hard to tell why you cannot just splice the array down. It seems to me there is a lot of extra logic involved in the tracking of element numbers. In other words, aside from the index being the same, the ids become the same as well as other attributes due to the overlapping 1, 3, 3 (from the example). If this is not the case then my assumption is incorrect.
Based on that assumption, when I encounter a situation where I want to ensure that the index created will always be an appending one, I usually take the same approach as I would with a database primary key. I set up a field:
var primaryKeyAutoInc = 0;
And every time I "create" or add an element to the data store (in this case an array) I copy the current value of the key as its index and then increment the primaryKeyAutoInc value. This guarantees the unique indexing which I am assuming you are going for. Moreover, not only will deletes not affect future data creation, the saved key index can also be used as an accessor.
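A small sketch of that auto-increment approach, leaving out the jQuery and DOM parts. The panels store and the createPanel and deletePanel helpers are hypothetical names used only for illustration.

```javascript
// Counter that only ever goes up, like a database auto-increment key.
let primaryKeyAutoInc = 0;

// Data store keyed by the generated key (a Map here; an array would also work).
const panels = new Map();

function createPanel() {
  const key = primaryKeyAutoInc++; // copy the current value, then increment
  const panel = { id: 'poly' + key, label: 'Array ' + (key + 1) };
  panels.set(key, panel);
  return panel;
}

function deletePanel(key) {
  return panels.delete(key); // deleting never touches the counter
}

createPanel(); // poly0, "Array 1"
createPanel(); // poly1, "Array 2"
createPanel(); // poly2, "Array 3"
deletePanel(1);
const next = createPanel(); // gets a fresh key, not a reused one

console.log(next, [...panels.keys()]);
```

Because the counter is independent of how many panels currently exist, deleting a middle panel can no longer produce a duplicate such as "Array 1, 3, 3".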
