I have the following data structure, which describes an object and the time period during which it is valid. Pretend the numbers below are Unix timestamps.
{
  "id": 1234,
  "valid_from": 2000,
  "valid_to": 4000
},
{
  "id": 1235,
  "valid_from": 1000,
  "valid_to": 2200
}
...
I want to quickly be able to store these items in JavaScript and then query for items which are valid at a certain time.
For example if I were to query for objects valid at 2100 I would get [1234, 1235]. If I were to query for objects valid at 3999 I would get [1234], and at 4999 nothing.
I will have about 50-100k items in the structure and I'd like fast lookups; inserts and deletes can be slower.
Items will have duplicate valid_from and valid_to values so it needs to support duplicates. Items will overlap.
I will need to continually insert data into the structure (probably in bulk for the initial load, and then one-off updates as data changes). I will also periodically modify records, which will likely mean a remove followed by an insert.
I am not sure what the most efficient approach to this is.
Algorithms are not my strong suit but if I just know the correct approach I can research the algorithms themselves.
My Idea:
I was originally thinking a modified binary search tree to support duplicate keys and closest lookup, but this would only allow me to query objects that are > valid_from or < valid_to.
This would involve me bisecting the array or tree to find all items > valid_from and then manually checking each one for valid_to.
I suppose I could have two search trees, one for valid_from and one for valid_to, then check which IDs appear in both result sets and return those.
This still seems kind of hacky to me. Is there a better approach someone can recommend, or is this how it's done?
Imagine you have two lists: initiation/begin and expiration/end. Both are sorted by TIME.
Given a particular time, you can find where in each list the first item is that meets the criteria by binary search. You can also do inserts by binary search into each list.
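For example, if there are 1000 items and the search in the begin list lands at location 342, then items 1-342 of the begin list are candidates (they have already started). If the search in the end list lands at location 901, then items 901-1000 of the end list are candidates (they have not yet expired). You now need to intersect the two sublists.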
Take item IDs from 1-342 in begin and 901-1000 in end, and put them in a separate array (allocated ahead of time). Sort the array. Traverse the array. Whenever the same ID appears twice in a row, it is a hit, a valid match.
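The two-list approach above can be sketched roughly as follows. This is only a minimal illustration, assuming IDs are unique numbers and that an item is valid at time t when valid_from <= t <= valid_to (the question does not say whether the bounds are inclusive). The names buildIndex, countStarted, firstNotEnded and queryValidAt are made up for this example.

```javascript
// Sketch only. Assumptions: ids are unique numbers, and an item is valid at t
// when valid_from <= t <= valid_to.

// Build the two sorted lists once (re-sort or binary-insert on later updates).
function buildIndex(items) {
  return {
    byFrom: items.slice().sort((a, b) => a.valid_from - b.valid_from),
    byTo: items.slice().sort((a, b) => a.valid_to - b.valid_to),
  };
}

// Binary search: number of items in byFrom with valid_from <= t.
function countStarted(byFrom, t) {
  let lo = 0, hi = byFrom.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (byFrom[mid].valid_from <= t) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Binary search: index of the first item in byTo with valid_to >= t.
function firstNotEnded(byTo, t) {
  let lo = 0, hi = byTo.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (byTo[mid].valid_to < t) lo = mid + 1; else hi = mid;
  }
  return lo;
}

function queryValidAt(index, t) {
  // Collect candidate ids from both sublists into one array.
  const candidates = [];
  const started = countStarted(index.byFrom, t);
  for (let i = 0; i < started; i++) candidates.push(index.byFrom[i].id);
  for (let i = firstNotEnded(index.byTo, t); i < index.byTo.length; i++) {
    candidates.push(index.byTo[i].id);
  }
  // Sort, then any id that appears twice in a row is in both sublists.
  candidates.sort((a, b) => a - b);
  const hits = [];
  for (let i = 1; i < candidates.length; i++) {
    if (candidates[i] === candidates[i - 1]) hits.push(candidates[i]);
  }
  return hits;
}

const exampleIndex = buildIndex([
  { id: 1234, valid_from: 2000, valid_to: 4000 },
  { id: 1235, valid_from: 1000, valid_to: 2200 },
]);
console.log(queryValidAt(exampleIndex, 2100)); // [ 1234, 1235 ]
```

Note that the intersection step touches every candidate, so a single query can still cost O(n log n) when both sublists are large.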
Currently I have Select2 in my application, and have previously implemented ajax calls to the database to get a smaller subset based on search query entered by a user.
However, users want to be able to click the browser's back arrow and have the query run again automatically (something that currently does not happen with Select2). I was able to implement this by pulling in the entire dataset (over 18,000 elements) and calling select2 on that.
The problem is that Select2 effectively runs a loop within a loop when searching: for each element in the dataset, it goes through each string and gets the index of the query, which as I understand it means scanning the string character by character to see whether the combination is found. So every time someone types a character, we're looking at over 18,000 string searches, even though the majority of elements have already been eliminated as options.
Is there a way to make Select2 actually eliminate the options that don't match (create and bind to a temp array or something like that) or perform a binary search instead of a linear search? If not, are there any alternatives to Select2 that DO implement binary search instead of linear search, or would I need to create my own jQuery plugin to do this?
In this jsfiddle1, a hidden select element is used, along with a clone of it, to filter on input. The filtering is done with:
for (var i = 0; i < that.selector.options.length; i++) {
    if (re.test(that.selector.options[i].text)) {
        sclone.add(new Option(that.selector.options[i].text, i));
    }
}
where re is a RegExp created from an input field that is placed above the select clone.
Maybe that's an idea to play with?
1 The language used in the first selector is Dutch, but I suppose that shouldn't get in the way of the idea.
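A DOM-free variant of the same filtering idea might look like the sketch below. It assumes the options are available as a plain array of label strings and that the user's input should be matched literally and case-insensitively, so regex metacharacters are escaped first. The function names escapeRegExp and filterOptions are invented for this example and are not part of Select2 or the fiddle.

```javascript
// Sketch only: filters a plain array of labels instead of <option> elements.

// Escape characters that have a special meaning in a regular expression,
// so that user input such as "(" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns the matching labels together with their index in the full list,
// mirroring the (text, i) pair passed to new Option(...) in the loop above.
function filterOptions(labels, query) {
  const re = new RegExp(escapeRegExp(query), 'i');
  const matches = [];
  for (let i = 0; i < labels.length; i++) {
    if (re.test(labels[i])) {
      matches.push({ text: labels[i], index: i });
    }
  }
  return matches;
}

console.log(filterOptions(['Apple', 'Banana', 'Grape (red)'], 'ap'));
```

This is still a linear scan over all labels; it only avoids rebuilding the RegExp for every element.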
If I want to maintain an ordered list in Firebase, it seems like the best way to do it is to manually assign a priority to each item in my list. That means if I insert or remove an item from the list, I have to update the priorities of all the items following it. For an item at the beginning of the list, this means updating every item in the list. Is there a better performing data structure or algorithm to use in this case?
You can create an ordered list by setting the priority of elements appropriately. Items in a list are ordered lexicographically by priority or, if the priority can be parsed as a number, by numeric value.
If you want to insert items into the middle of an existing list, modifying the priorities of the existing items would work, but would be horribly inefficient. A better approach is just to pick a priority between the two items where you want to insert the value and set that priority for the new item.
For example, if you had element 1 with priority "a", and element 2 with priority "b", you could insert element 3 between the two with priority "aa" (or "aq", "az", etc).
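As a quick sanity check of the string priorities in that example, the sketch below sorts them with JavaScript's default string comparison. It assumes that the lexicographic ordering described above agrees with JavaScript's comparison for plain lowercase ASCII strings; it does not call any Firebase API.

```javascript
// Sketch only: plain JavaScript, no Firebase calls.
// Assumption: for lowercase ASCII priorities, the lexicographic order
// described above matches JavaScript's default string sort.
const priorities = ['b', 'az', 'a', 'aq', 'aa'];
const ordered = priorities.slice().sort();

// "aa", "aq" and "az" all land between "a" and "b".
console.log(ordered); // [ 'a', 'aa', 'aq', 'az', 'b' ]
```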
In our experience, most times when you create an ordered list, you don't necessarily know the position in the list you want to insert the item beforehand. For example, if you're creating a Leader Board for a game, you don't know in advance that you want to place a new score 3rd in the list, rather you know you want to insert it at whatever position score 10000 gets you (which might happen to be third). In this case, simply setting the priority to the score will accomplish this. See our Leader Board example here:
https://www.firebase.com/tutorial/#example-leaderboard
The Ruby gem ranked_model has an interesting approach to this problem. It uses a position integer like many other "acts as list" implementations, but it doesn't rely on re-writing all the integers on each position move. Instead, it spaces the integers widely apart, and so each update may only affect one or two rows. Might be worth looking through the readme and code to see if this approach could fit here.
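The general idea of widely spaced integer positions can be sketched as below. This is not ranked_model's actual code, only an illustration of the technique; GAP, positionBetween and respace are hypothetical names chosen for this example.

```javascript
// Sketch only; not taken from ranked_model. GAP is an arbitrary assumption.
const GAP = 1024;

// Returns an integer position strictly between prev and next, or null when
// the two neighbours are adjacent and the list needs to be re-spaced.
function positionBetween(prev, next) {
  const mid = Math.floor((prev + next) / 2);
  return mid > prev && mid < next ? mid : null;
}

// Fresh, evenly spaced positions for a list of `count` items. Only needed
// in the (rare) case where positionBetween returns null.
function respace(count) {
  return Array.from({ length: count }, (_, i) => (i + 1) * GAP);
}

console.log(positionBetween(1024, 2048)); // 1536
console.log(positionBetween(5, 6));       // null
```

With this scheme an insert normally writes a single item, and a full rewrite only happens once a gap has been exhausted.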
I have an html table with cells that can be edited when clicked on. I am trying to figure out the best method to change cell data in cells following an edited cell.
For example, say the table comes populated with random numbers or letters. When I change a cell to "14", I want the cells after it to change automatically to 15, 16, 17, and so on. Or if I enter "h", the following cells would change to i, j, k, l, ..., stopping at z.
The number case seems pretty easy, as I could just create a loop and i++ for each cell. However, the letter case doesn't seem as simple. Would I need to create an alphabet array, find the edited cell's letter within it, and then proceed to the end of the array, inserting each letter into the following cells?
This can actually be done with a fairly simple function call like this one:
function NextChar(c){
    return String.fromCharCode(c.charCodeAt(0) + 1);
}
where c is the alphabetic character that is entered into the cell, passed as a parameter.
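Building on that, here is a small sketch of how the values for the following cells could be generated for both the numeric and the letter case. The function name nextValues and its parameters are made up for this example; writing the values into the table cells is left out, since it depends on the markup.

```javascript
// Sketch only: returns the values for up to `count` cells that follow an
// edited cell containing `start`. Assumes `start` is either a whole number
// or a single ASCII letter; anything else yields an empty array.
function nextValues(start, count) {
  const out = [];
  if (/^-?\d+$/.test(start)) {
    // Numeric case: 14 -> 15, 16, 17, ...
    let n = parseInt(start, 10);
    for (let i = 0; i < count; i++) out.push(String(++n));
  } else if (/^[a-zA-Z]$/.test(start)) {
    // Letter case: h -> i, j, k, ... stopping at z (or Z for upper case).
    let code = start.charCodeAt(0);
    const last = (start === start.toLowerCase() ? 'z' : 'Z').charCodeAt(0);
    while (out.length < count && code < last) {
      out.push(String.fromCharCode(++code));
    }
  }
  return out;
}

console.log(nextValues('14', 3)); // [ '15', '16', '17' ]
console.log(nextValues('x', 5));  // [ 'y', 'z' ]
```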
I can see this question was asked quite some time ago, so this answer is more for people who may come seeking answers later.
I would make arrays with character sequences, as you said, and use the jQuery.inArray() API to detect which sequence the edited cell's content is in.
Check it out: http://api.jquery.com/jQuery.inArray/
I'm using Jorn Zaefferer's Autocomplete plugin on a couple of different pages. In both instances, the order of displayed strings is a little bit messed up.
Example 1: array of strings: basically they are in alphabetical order except for General Knowledge which has been pushed to the top:
General Knowledge,Art and Design,Business Studies,Citizenship,Design and Technology,English,Geography,History,ICT,Mathematics,MFL French,MFL German,MFL Spanish,Music,Physical Education,PSHE,Religious Education,Science,Something Else
Displayed strings:
General Knowledge,Geography,Art and Design,Business Studies,Citizenship,Design and Technology,English,History,ICT,Mathematics,MFL French,MFL German,MFL Spanish,Music,Physical Education,PSHE,Religious Education,Science,Something Else
Note that Geography has been pushed to be the second item, after General Knowledge. The rest are all fine.
Example 2: array of strings: as above but with Cross-curricular instead of General Knowledge.
Cross-curricular,Art and Design,Business Studies,Citizenship,Design and Technology,English,Geography,History,ICT,Mathematics,MFL French,MFL German,MFL Spanish,Music,Physical Education,PSHE,Religious Education,Science,Something Else
Displayed strings:
Cross-curricular,Citizenship,Art and Design,Business Studies,Design and Technology,English,Geography,History,ICT,Mathematics,MFL French,MFL German,MFL Spanish,Music,Physical Education,PSHE,Religious Education,Science,Something Else
Here, Citizenship has been pushed to the number 2 position.
I've experimented a little, and it seems like there's a bug that says "put things that start with the same letter as the first item after the first item and leave the rest alone". Kind of mystifying. I've tried a bit of debugging by triggering alerts inside the autocomplete plugin code, but everywhere I can see, it's using the correct order. It seems to go wrong only when it's rendered out.
Any ideas anyone?
max
EDIT - reply to Clint
Thanks for pointing me at the relevant bit of code, btw. To make diagnosis simpler I changed the array of values to ["carrot", "apple", "cherry"], which autocomplete re-orders to ["carrot", "cherry", "apple"].
Here's the array that it generates for stMatchSets:
stMatchSets = ({'':[#1={value:"carrot", data:["carrot"], result:"carrot"}, #3={value:"apple", data:["apple"], result:"apple"}, #2={value:"cherry", data:["cherry"], result:"cherry"}], c:[#1#, #2#], a:[#3#]})
So, it's collecting the first letters together into a map, which makes sense as a first-pass matching strategy. What I'd like it to do, though, is to use the given array of values, rather than the map, when it comes to populating the displayed list. I can't quite get my head around what's going on with the cache inside the guts of the code (I'm not very experienced with JavaScript).
SOLVED - I fixed this by hacking the JavaScript in the plugin.
On line 549 (or 565) we return a variable csub, which is an object holding the matching data. Before it's returned, I reorder it so that the order matches the original array of values we were given, i.e. the one we used to build the index in the first place, which I had put into another variable:
csub = csub.sort(function(a,b){ return originalData.indexOf(a.value) - originalData.indexOf(b.value); })
Hacky, but it works. Personally I think that this behaviour (possibly coded more cleanly) should be the default behaviour of the plugin, i.e. the order of results should match the original passed array of possible values. That way users can sort their array alphabetically if they want (which is trivial) to get the results in alphabetical order, or they can preserve their own 'custom' order.
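A slightly cleaner version of that reordering step might look like the sketch below. It assumes each match is an object with a value property (as in the stMatchSets dump above) and that originalData is the array originally handed to the plugin; sortByOriginalOrder is a made-up helper name, not part of the plugin. Looking ranks up in a Map avoids calling indexOf repeatedly inside the comparator.

```javascript
// Sketch only; not part of the Autocomplete plugin.
function sortByOriginalOrder(matches, originalData) {
  // Map each value to its first position in the original array.
  const rank = new Map();
  originalData.forEach(function (value, i) {
    if (!rank.has(value)) rank.set(value, i);
  });
  // Values missing from originalData are placed at the end.
  const rankOf = function (item) {
    return rank.has(item.value) ? rank.get(item.value) : originalData.length;
  };
  // The comparator returns a number (negative, zero or positive).
  return matches.slice().sort(function (a, b) {
    return rankOf(a) - rankOf(b);
  });
}

const reordered = sortByOriginalOrder(
  [{ value: 'carrot' }, { value: 'cherry' }, { value: 'apple' }],
  ['carrot', 'apple', 'cherry']
);
console.log(reordered.map(function (m) { return m.value; }));
// [ 'carrot', 'apple', 'cherry' ]
```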
What I did instead of your solution was to add
if (!q && data[q]){return data[q];}
just above
var csub = [];
found in line ~535.
What this does, if I understood correctly, is fetch the cached data for the empty input, which is set up around line ~472: stMatchSets[""] = []. Assuming the cached data for the empty input is the data you originally provided, then it's all good.
I'm not sure about this autocomplete plugin in particular, but are you sure it's not just trying to give you the best match possible? My autocomplete plugin does some heuristics and does reordering of that nature.
Which brings me to my other answer: there are a million jQuery autocomplete plugins out there. If this one doesn't satisfy you, I'm sure there is another that will.
edit:
In fact, I'm completely certain that's what it's doing. Take a look around line 474:
// loop through the array and create a lookup structure
for ( var i = 0, ol = options.data.length; i < ol; i++ ) {
    /* some code */
    var firstChar = value.charAt(0).toLowerCase();
    // if no lookup array for this character exists, look it up now
    if( !stMatchSets[firstChar] )
and so on. So, it's a feature.
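To illustrate how grouping by first character can change the order, here is a small stand-alone sketch. It is not the plugin's code: buildFirstCharIndex and flattenIndex are invented names, and the flattening step is only an assumption about how a grouped lookup might be read back out.

```javascript
// Sketch only; mimics the idea of a first-character lookup structure.
function buildFirstCharIndex(values) {
  const index = new Map(); // Map keeps keys in insertion order
  for (const value of values) {
    const firstChar = value.charAt(0).toLowerCase();
    // If no lookup array for this character exists yet, create it.
    if (!index.has(firstChar)) index.set(firstChar, []);
    index.get(firstChar).push(value);
  }
  return index;
}

// Reading the groups back one after another puts all values that share a
// first letter next to each other, which is the reordering seen above.
function flattenIndex(index) {
  const all = [];
  for (const group of index.values()) all.push(...group);
  return all;
}

console.log(flattenIndex(buildFirstCharIndex(['carrot', 'apple', 'cherry'])));
// [ 'carrot', 'cherry', 'apple' ]
```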