Say you have a very simple data structure:
(personId, name)
...and you want to store a number of these in a JavaScript variable. As I see it, you have three options:
// a single object
var people = {
    1: 'Joe',
    3: 'Sam',
    8: 'Eve'
};

// or, an array of objects
var people = [
    { id: 1, name: 'Joe' },
    { id: 3, name: 'Sam' },
    { id: 8, name: 'Eve' }
];

// or, a combination of the two
var people = {
    1: { id: 1, name: 'Joe' },
    3: { id: 3, name: 'Sam' },
    8: { id: 8, name: 'Eve' }
};
The second or third option is obviously the way to go if you have (or expect that you might have) more than one "value" part to store (e.g., adding in their age or something), so, for the sake of argument, let's assume that there will never be any more data values needed in this structure. Which one do you choose, and why?
Edit: The example now shows the most common situation: non-sequential ids.
Each solution has its use cases.
I think the first solution is good if you're trying to define a one-to-one relationship (such as a simple mapping), especially if you need to use the key as a lookup key.
The second solution feels the most robust to me in general, and I'd probably use it if I didn't need a fast lookup key:
- It's self-describing, so you don't have to depend on anyone using people to know that the key is the id of the user.
- Each object comes self-contained, which is better for passing the data elsewhere: instead of two parameters (id and name) you just pass around people.
- This is a rare problem, but sometimes the key values may not be valid to use as keys. For example, I once wanted to map string conversions (e.g., ":" to ">"), but since ":" isn't a valid variable name I had to use the second method.
- It's easily extensible, in case somewhere along the line you need to add more data to some (or all) users. (Sorry, I know about your "for argument's sake" but this is an important aspect.)
The third would be good if you need fast lookup time + some of the advantages listed above (passing the data around, self-describing). However, if you don't need the fast lookup time, it's a lot more cumbersome. Also, either way, you run the risk of error if the id in the object somehow varies from the id in people.
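One way to reduce that risk is to funnel all inserts through a small helper so the key is always derived from the embedded id. A rough sketch (names are purely illustrative):

var people = {};

function addPerson(id, name) {
    // the key is derived from the id, so the two can never diverge
    people[id] = { id: id, name: name };
}

addPerson(1, 'Joe');
addPerson(3, 'Sam');
addPerson(8, 'Eve');
console.log(people[3].name); // 'Sam'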
Actually, there is a fourth option:
var people = ['Joe', 'Sam', 'Eve'];
since your values happen to be consecutive. (Of course, you'll have to add/subtract one --- or just put undefined as the first element).
Personally, I'd go with your (1) or (3), because those will be the quickest to look up someone by ID (O(log n) at worst). If you have to find id 3 in (2), you either can look it up by index (in which case my (4) is ok) or you have to search, which is O(n).
Clarification: I say O(log n) is the worst it could be because, AFAIK, an implementation could decide to use a balanced tree instead of a hash table. A hash table would be O(1), assuming minimal collisions.
Edit from nickf: I've since changed the example in the OP, so this answer may not make as much sense any more. Apologies.
Post-edit
Ok, post-edit, I'd pick option (3). It is extensible (easy to add new attributes), features fast lookups, and can be iterated as well. It also allows you to go from entry back to ID, should you need to.
Option (1) would be useful if (a) you need to save memory; (b) you never need to go from the object back to the id; (c) you will never extend the data stored (e.g., you can't add the person's last name).
Option (2) is good if you (a) need to preserve ordering; (b) need to iterate all elements; (c) do not need to look up elements by id, unless it is sorted by id (then you can do a binary search in O(log n)). Note, of course, that if you need to keep it sorted then you'll pay a cost on insert.
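For example, a rough sketch of what keeping (2) sorted on insert could look like: a binary search finds the position in O(log n), but the splice still shifts up to n elements.

// sketch: insert `person` into an array kept sorted by id
function insertSorted(people, person) {
    var lo = 0, hi = people.length;
    while (lo < hi) {
        var mid = (lo + hi) >> 1;
        if (people[mid].id < person.id) lo = mid + 1;
        else hi = mid;
    }
    people.splice(lo, 0, person); // O(n) element shifts
}

var people = [{ id: 1, name: 'Joe' }, { id: 8, name: 'Eve' }];
insertSorted(people, { id: 3, name: 'Sam' });
// people now holds ids 1, 3, 8 in order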
Assuming the data will never change, the first (single object) option is the best.
The simplicity of the structure means it's the quickest to parse, and with a small, seldom (or never) changing data set such as this one, that parsing is likely to be executed frequently - in which case minimal overhead is the way to go.
I created a little library to manage key value pairs.
https://github.com/scaraveos/keyval.js#readme
It uses:
- an object to store the keys, which allows for fast delete and value retrieval operations, and
- a linked list to allow for really fast value iteration
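To give a rough idea of how that combination works (this is just a sketch of the general idea, not the library's actual API):

// object index for O(1) get/delete + linked list for iteration
function KeyVal() {
    this.index = {};              // key -> node
    this.head = this.tail = null; // list in insertion order
}
KeyVal.prototype.set = function (key, value) {
    // sketch only: assumes the key is not already present
    var node = { key: key, value: value, prev: this.tail, next: null };
    if (this.tail) this.tail.next = node; else this.head = node;
    this.tail = node;
    this.index[key] = node;
};
KeyVal.prototype.get = function (key) {
    var node = this.index[key];
    return node && node.value;
};
KeyVal.prototype.remove = function (key) {
    var node = this.index[key];
    if (!node) return;
    if (node.prev) node.prev.next = node.next; else this.head = node.next;
    if (node.next) node.next.prev = node.prev; else this.tail = node.prev;
    delete this.index[key];
};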
Hope it helps :)
The third option is the best for any forward-looking application. You will probably wish to add more fields to your person record, so the first option is unsuitable. Also, it is very likely that you will have a large number of persons to store, and will want to look up records quickly - thus dumping them into a simple array (as is done in option #2) is not a good idea either.
The third pattern gives you the option to use any string as an ID, have complex Person structures and get and set person records in a constant time. It's definitely the way to go.
One thing that option #3 lacks is a stable deterministic ordering (which is the upside of option #2). If you need this, I would recommend keeping an ordered array of person IDs as a separate structure for when you need to list persons in order. The advantage would be that you can keep multiple such arrays, for different orderings of the same data set.
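A rough sketch of what that could look like (illustrative names):

// option #3 for fast lookups, plus a separate id array for ordering
var peopleById = {
    1: { id: 1, name: 'Joe' },
    3: { id: 3, name: 'Sam' },
    8: { id: 8, name: 'Eve' }
};
var idsInNameOrder = [8, 1, 3]; // e.g., alphabetical: Eve, Joe, Sam

idsInNameOrder.forEach(function (id) {
    console.log(peopleById[id].name); // Eve, Joe, Sam
});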
Given your constraint that you will only ever have name as the value, I would pick the first option. It's the cleanest, has the least overhead and the fastest look up.
Related
I am doing a problem on Leetcode to write a function which checks to see if a supplied array is a palindrome. They seem to expect the solution to involve creating a linked list from the array and then using the linked list to check if its contents are a palindrome.
Am I right in assuming that the reason for using a linked list (other than to test your programming skills) is that it enables a more efficient (i.e., takes less processing power) solution than working solely with arrays?
What I find counter intuitive about that is the fact that the function takes an array as its argument so, the data is already in an array. My thinking is that it must take as much processing power to get the array into a linked list as it would take to just go through the elements in the array from each end checking each pair to see if they are equal.
In order to make the linked list you would have to access all the array elements. The only thing I can think is that accessing elements from the end of array might be more 'expensive' than from the front.
I have put my code for solving the problem with an array below:
function isPalindrome(array) {
    const numberOfTests = Math.floor(array.length / 2);
    for (let i = 0; i < numberOfTests; i++) {
        let j = array.length - 1 - i;
        if (array[i] !== array[j]) {
            return false;
        }
    }
    return true;
}
console.log(isPalindrome([1, 1, 1, 2]));
I guess my question is why are they suggesting using linked lists to solve this problem other than to test programming skills? Is there something about my function which is less efficient than using a linked list to accomplish the same task?
Edit:
The code editor for the question is pre-populated with:
/**
 * Definition for singly-linked list.
 * function ListNode(val, next) {
 *     this.val = (val===undefined ? 0 : val)
 *     this.next = (next===undefined ? null : next)
 * }
 */
/**
 * @param {ListNode} head
 * @return {boolean}
 */
var isPalindrome = function(head) {
};
Also from the question:
The number of nodes in the list is in the range [1, 10^5].
0 <= Node.val <= 9
Follow up: Could you do it in O(n) time and O(1) space?
I am not exactly sure what all of this means, but I interpreted it as suggesting that there are performance issues involved with the algorithm and that using linked lists would be a good way to address them.
The problem is at: https://leetcode.com/problems/palindrome-linked-list/
The code challenge is saying that you are "given the head of a singly linked list". So it is not an array. The misunderstanding may come from the way that LeetCode represents a linked list: they use an array notation for it. But be assured that your function will be called with a linked list, not an array.
Am I right in assuming that the reason for using a linked list (other than to test your programming skills) is that it enables a more efficient (ie takes less processing power) solution than working solely with arrays?
No, it is only for testing programming skills.
What I find counter intuitive about that is the fact that the function takes an array as its argument
This is where you got the code challenge wrong. Look at the description ("Given the head of a singly linked list"), and look at the template code you get to start from (the parameter is named head, not array).
Is there something about my function which is less efficient than using a linked list to accomplish the same task?
Your function will not work. The argument does not have a length property since it is not an array. The argument is an instance of ListNode or null.
In your code you included a call to your function. But that is not how LeetCode will call it. It will not be called like:
isPalindrome([1, 2, 2, 1])
But like:
isPalindrome(new ListNode(1,
    new ListNode(2,
        new ListNode(2,
            new ListNode(1)))))
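For reference, here is a minimal sketch of a version that works on the linked list input: collect the values into an array, then reuse the two-pointer check. This is O(n) time and O(n) space; the follow-up's O(1)-space variant would instead reverse the second half of the list in place.

var isPalindrome = function (head) {
    // walk the list into an array
    const values = [];
    for (let node = head; node !== null; node = node.next) {
        values.push(node.val);
    }
    // then do the usual two-pointer check
    for (let i = 0, j = values.length - 1; i < j; i++, j--) {
        if (values[i] !== values[j]) return false;
    }
    return true;
};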
From the way you have described it, whether we analyse this via big-O time complexity or empirical performance, there is no real reason to convert the input to a linked list first; it will only slow your program down.
This is relatively easy to comprehend: in order to create the linked list, you have to access the whole array anyway. Each approach reads every array element at most once (ideally), but the linked-list approach additionally spends time building the list before determining whether it is a palindrome.
It's like if you're doing a math question, instead of doing it on the piece of paper it was given on, copying it to a piece of parchment first and doing it there. You aren't saving time.
That said, the time complexity of both should be O(n) worst-case, and their runtimes should not differ drastically, as the difference is only a small constant factor.
Converting to a linked list is probably only for demonstrative reasons, not performance reasons.
I'll start by reiterating my comment for some context:
One of LeetCode's goals is to help you learn common algorithms, programming patterns, and data structures (language agnostic) in a puzzle-oriented way. There's nothing wrong with your approach, except that the input is not an array, so it is not valid for the problem constraints. The main purpose of this problem is for you to understand what a singly-linked list data structure is and to begin to learn about big O notation.
Based on the details of your question and your follow-up comments, it sounds like you're having trouble with the first part: understanding the structure of a singly-linked list. This is understandable if your experience is in JavaScript: a singly-linked list is not a common data structure in comparison to arrays.
Included in the description details of the problem that you linked to is the following:
Example 1:
Input: head = [1,2,2,1]
Output: true
The way that the head input argument is shown in the text uses the same syntax as an array of numbers in JavaScript. This is only an abstract (theoretical way of looking at things) representation of a linked list. It does NOT mean literally:
const head = [1, 2, 2, 1];
A linked list is a nested structure of nodes, each having a value and (maybe) a child node. The head input example actually looks like this JavaScript data structure:
const head = {
    val: 1,
    next: {
        val: 2,
        next: {
            val: 2,
            next: {
                val: 1,
                next: null,
            },
        },
    },
};
This might seem new/confusing to you (and that's ok). This data structure is much more common in some other languages. There will be other problems on LeetCode that will be more familiar to you (and less familiar to programmers who work in those languages): it's part of the challenge and enjoyment of learning.
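To make the shape concrete: traversing such a structure is just a matter of following next until it is null. A minimal sketch:

let node = head;
while (node !== null) {
    console.log(node.val); // prints 1, 2, 2, 1
    node = node.next;
}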
If the site content maintainers ever consider updating the problem details for each code puzzle, it might be a good idea to provide custom description information based on which language is selected, so that this kind of confusion happens less often.
I have certain requirements, and I want to do the following in the quickest way possible.
I have thousands of objects like the ones below:
{ id: 1, value: "value1" }
...
{ id: 1000, value: "value1000" }
I want to access the above objects by id.
I want to clean out objects with ids lower than a certain value every few minutes (because my high-frequency algorithm generates thousands of objects every second).
I can clean easily by using this:
myArray = myArray.filter(function(obj) {
    return obj.id > cleanSize;
});
I can find the object by id using:
myArray.find(x => x.id === 45);
The problem is that find feels a little slow with larger sets of data, so I created an object of objects, like below:
const id = 22;
myArray["x" + id] = { id: id, value: "test" };
so I can access an item by id easily via myArray["x22"], but the problem is that I am not able to find a way to remove older items by id.
Can someone guide me to a better way to achieve the three points I mentioned above using arrays or objects?
The trouble with your question is, you're asking for a way to finish an algorithm that is supposed to solve a problem of yours, but I think there's something fundamentally wrong with the problem to begin with :)
If you store a sizeable amount of data records, each associated with an ID, and allow your code to access them freely, then you cannot have another part of your code dump some of them to the bin out of the blue (say, from within some timer callback) just because they are becoming "too old". You must be sure nobody is still working on them (and will ever need to) before deleting any of them.
If you don't explicitly synchronize the creation and deletion of your records, you might end up with a code that happens to work (because your objects happen to be processed quickly enough never to be deleted too early), but will be likely to break anytime (if your processing time increases and your data becomes "too old" before being fully processed).
This is especially true in the context of a browser. Your code is supposed to run on any computer connected to the Internet, which could have dozens of reasons to be running 10 or 100 times slower than the machine you test your code on. So making assumptions about the processing time of thousands of records is asking for serious trouble.
Without further specification, it seems to me answering your question would be like helping you finish a gun that would only allow you to shoot yourself in the foot :)
All this being said, any JavaScript object inherently does exactly what you ask for, provided you're okay with using strings for IDs, since an object property name can also be used as an index in an associative array.
var associative_array = {}
var bob = { id:1456, name:"Bob" }
var ted = { id:2375, name:"Ted" }
// store some data with arbitrary ids
associative_array[bob.id] = bob
associative_array[ted.id] = ted
console.log(JSON.stringify(associative_array)) // Bob and Ted
// access data by id
var some_guy = associative_array[2375] // index will be converted to string anyway
console.log(JSON.stringify(some_guy)) // Ted
var some_other_guy = associative_array["1456"]
console.log(JSON.stringify(some_other_guy)) // Bob
var some_AWOL_guy = associative_array[9999]
console.log(JSON.stringify(some_AWOL_guy)) // undefined
// delete data by id
delete associative_array[bob.id] // so long, Bob
console.log(JSON.stringify(associative_array)) // only Ted left
Though I doubt speed will really be an issue, this mechanism is about as fast as you will ever get JavaScript to run, since the underlying data structure is a hash table, theoretically O(1).
Anything involving array methods like find() or filter() will run in at least O(n).
Besides, each invocation of filter() would waste memory and CPU recreating the array to no avail.
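And to cover the cleanup requirement with the same structure: deleting everything below a threshold id is a single pass over the keys. A rough sketch:

// remove every entry whose id is <= cleanSize; each delete is O(1),
// and unlike filter() no new array is allocated
function cleanOlderThan(store, cleanSize) {
    Object.keys(store).forEach(function (key) {
        if (store[key].id <= cleanSize) {
            delete store[key];
        }
    });
}

cleanOlderThan(associative_array, 2000); // would remove Bob (id 1456)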
I started off with an array of objects and used _.filter to filter down on some search criteria, and _.findWhere to select out a single object based on ID.
Unfortunately the amount of data has increased so much that it's much more efficient to use _.indexBy to index by ID, so I can just do data[id] in place of the _.findWhere calls.
However I am stumped on how to replace the _.filter method without looping through all the keys in data.
Is there a better way?!
Edit
The IDs are always unique.
I can't show any real data as it is sensitive but the structure is
data = {
    1: { id: 1, name: 'data1', date: 20/1/2016 },
    2: { id: 2, name: 'data2', date: 21/1/2016 },
    3: { ....
}
and I need to do something like:
var recentData = _.filter(data, function(d){ return d.date > 1/1/2016; });
To get an array of data or ids.
(n.b. the dates are all in epoch times)
This is really an optimization question, rather than simply which function to use.
One way to go about this would be to rely on the sort order of the whole collection. If it's already sorted, you can use something like binary search to find the border elements of your date range and then take everything from that point. (Side note: an array would probably work better for this than an object.)
If the array is not sorted you could also consider sorting it first on your end - but that makes sense only if you need to retrieve such information several times from the same dataset.
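For illustration, assuming an array sorted ascending by an epoch date property, a rough sketch could look like this:

// find the first index with date > cutoff in O(log n), then slice
function recentSince(sorted, cutoff) {
    var lo = 0, hi = sorted.length;
    while (lo < hi) {
        var mid = (lo + hi) >> 1;
        if (sorted[mid].date <= cutoff) lo = mid + 1;
        else hi = mid;
    }
    return sorted.slice(lo); // everything newer than cutoff
}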
But if all you've got is just the data, unsorted, and you need to pick all elements starting from a certain date, there is no other way than to iterate through it all with something like _.filter().
Alternatively you could revert back to the source of your data and check whether you can improve the results that way - if you're using some kind of API, maybe there are extra params for sorting or narrowing down the date selection (generally speaking database engines are really efficient at what you're trying to do)? Or if you're using a huge static JSON as the source - maybe consider improving that source object with sort order?
Just some ideas. Hard to give you the best resolution without knowing all the story behind what you're trying to do.
I've recently started with Reactjs and I find it really interesting. Keys are about identity, so identifying each component through a unique key is the way to go, right?
Suppose I have this example:
var fruits = [
    {'fruitId': 351421, 'fruit': 'banana'},
    {'fruitId': 254134, 'fruit': 'apple'},
    {'fruitId': 821553, 'fruit': 'orange'}
];
React.DOM.ul(null, fruits.map(function(item) {
    return (
        React.DOM.li({
            key: item.fruitId
        }, item.fruit)
    );
}));
Note the big ID numbers. Now, my question is whether it is better to use numbers as IDs or strings (like hashes) as IDs?
Thanks!!
It really doesn't matter; the only thing that matters is that the key is unique among siblings within the parent element. It doesn't have to be unique across your entire app, just inside the parent you're appending these items to.
Often for simple iteration over elements, such as <li> or <option> it's fine to just use the index within your iterator.
E.g.:
var options = [];
for (var i = 0; i < this.props.options.length; i++) {
    var option = this.props.options[i];
    options.push(
        <option key={i} value={option.value}>{option.name}</option>
    );
}
The only time this doesn't work well is if you are adding/removing keyed elements in different orders etc. later on, so that your key might clash with another sibling. In that case you're going to want to generate the key in some other way to make sure it's unique, or use a known unique key from your model (a sketch follows below). Whichever way you do it, as long as it's unique among its siblings, it'll be fine.
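One common way to get such a stable key, sketched here with purely illustrative names, is to assign the id when the item is created rather than at render time, so later reordering or removal never changes it:

var nextOptionId = 0;
function createOption(name, value) {
    // the id is assigned once and never recomputed
    return { id: nextOptionId++, name: name, value: value };
}
// later, when rendering:
// <option key={option.id} value={option.value}>{option.name}</option>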
As #Mike says, they are only used to preserve the ordering of the real DOM elements if you are adding to/removing from a list. They only need to be unique to the local component, so it's OK to reuse natural ids from your data. I would not use an index from an iterator, because of this.
Re: numbers vs. strings: if you're concerned about performance, I'd use whatever type you've already got. Any conversion/parsing you do will be done on every render. However, this would be pretty low on my list of performance concerns in my app.
I have a scenario on my web application and I would like suggestions on how I could better design it.
I have two steps in my application: Collection and Analysis.
When a collection is happening, the user needs to be kept informed that it is going on, and the same goes for the analysis. The system also shows the 10 last collections and analyses performed by the user.
When the user is interacting with the system, the collections and analyses in progress (and, therefore, the last collections/analyses) change very frequently. So, after considering different ways of storing this information in order to display it properly, given how dynamic it is, I chose to use HTML5's localStorage, and I am doing everything with JavaScript.
Here is how they are stored:
Collection in Progress: (set by a function called addItem that receives ITEMNAME)
Key: c_ITEMNAME_Storage
Value: c_ITEMNAME
Collection Finished or Error: (set by a function called editItem that also receives ITEMNAME and changes the value of the corresponding key)
Key: c_ITEMNAME_Storage
Value: c_Finished_ITEMNAME or c_Error_ITEMNAME
Collection in the 10 last Collections (set by a function called addItemLastCollections that receives ITEMNAME and prepares the key with the current date and time)
Key: ORDERNUMBER_c_ITEMNAME_DATE_TIME
Value: c_ITEMNAME
Note: The order number is from 0 to 9, and when each collection finishes, it receives the number 0. At the same time, the number 9 is deleted when the addItemLastCollections function is called.
For the analysis it is pretty much the same; the only thing that changes is that the "c" becomes an "a".
Anyway, I guess you understood the idea, but if anything is unclear, let me know.
What I want is opinions and suggestions of other approaches, as I am considering this inefficient and impractical, even though it is working fine. I want something easily maintained. I think that sticking with localStorage is probably the best, but not this way. I am not very familiar with the use of Design Patterns in JavaScript, although I use some of them very frequently in Java. If anyone can give me a hand with that, it would be good.
EDIT:
It is a bit hard even for me to explain exactly why I feel it is inefficient. I guess the main reason is that for each case (Progress, Finished, Error, Last Collections) I have to call a method and modify the string (adding underscores and more information), and to access any piece of data (say, the name or the date) of each one of them I need to test which case it is and then keep using split('_'). I know this is not very straightforward, but I feel this whole approach could be better designed. As I am working alone on this part of the software, I don't have anyone to discuss it with, so I thought here would be a good place to exchange ideas :)
Thanks in advance!
Not exactly sure what you are looking for. Generally I use localStorage just to store stringified versions of objects that fit my application. Rather than setting up all sorts of different keys for each variable within localStore, I just dump stringified versions of my object into one key in localStorage. That way the data is the same structure whether it comes from server as JSON or I pull it from local.
You can quickly save or retrieve deeply nested objects/arrays using JSON.stringify( object) and JSON.parse( 'string from store');
Example:
My App Object, as sent from the server as JSON (I realize this isn't properly quoted JSON):
var data = {
    foo: { bar: [1,2,3], baz: [4,5,6,7] },
    foo2: { bar: [1,2,3], baz: [4,5,6,7] }
};
saveObjLocal('app_analysis', data);

function saveObjLocal(key, obj) {
    // localStorage stores strings only, so serialize first;
    // note the API is setItem/getItem, not set/get
    localStorage.setItem(key, JSON.stringify(obj));
}

function getlocalObj(key) {
    return JSON.parse(localStorage.getItem(key));
}

var analysisObj = getlocalObj('app_analysis');
alert(analysisObj.foo.bar[2]); // 3