I recall reading that developers should think of records in a store as rows in a database, where each column is a simple data type. We store complex JS objects in Ext JS stores without any apparent ramifications.
Does anybody know of pitfalls with storing JS objects in an Ext JS store?
Modern web browsers have gotten very good at managing this kind of memory usage and processing speed. Internally we had implemented the sort of record associations that Ext JS 4 now has built in, and we have scenarios with ~250k complex nested records stored without any real problem. I believe the negligible impact on performance will continue to hold up for a long time, as Ext JS is also pretty good about cleaning up its own memory usage. We mirrored our web server's ORM models to Ext JS record definitions, and we regularly query against these nested stores in much the same way we would hit a more traditional database.
You do have to be careful what you do with the data; for example, trying to render a grid of 250k records at once will not work out very well. But that is almost entirely the cost of DOM rendering, not the iteration or storage of the record/store data. All of this seems to be even more true when testing with the recent Ext JS 4 beta releases.
Looking at the Ext JS source, a Store appears to be a wrapper around an object that adds sorting/filtering and event functionality. At heart it's a simple key/value collection.
The complexity of the object should not cause any problems.
Now, treating a Store as anything but a wrapper around a key/value collection is going to cause problems. Thinking of it as a table is going to cause problems. That kind of misunderstanding leads to poorly designed code.
A Store should be treated as a Bag of key/value data with helper methods to organise that bag.
I suppose there can be a performance impact if your JS objects are very large; it could take some time if you end up doing a lot of serialization/deserialization of those objects. If you are dealing with grids, you can mitigate this with pagination.
Stores are not necessarily used strictly for grid rows. They are used in many Ext objects, such as comboboxes (drop-down menus). There the store holds key/value pairs, usually a value and a displayValue for the data.
If you need an even lower-level object to work with, check out Ext.util.MixedCollection. There is lots of fun stuff in there; it's basically a hashmap of key/value pairs. I believe that in the Ext source code, Stores use these objects at their core.
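To make the "bag of key/value data" idea concrete, here is a rough, untested sketch in the Ext JS 4 style of keeping complex objects in a store; the Sensor model and its fields are invented for illustration:
// Hypothetical model whose 'metadata' field holds an arbitrary nested object.
// The 'auto' field type (the default) passes values through untouched.
Ext.define('Sensor', {
    extend: 'Ext.data.Model',
    fields: [
        { name: 'id',       type: 'int'    },
        { name: 'name',     type: 'string' },
        { name: 'metadata', type: 'auto'   }
    ]
});

var store = Ext.create('Ext.data.Store', {
    model: 'Sensor',
    data: [
        { id: 1, name: 'Boiler temp', metadata: { unit: 'C', range: [0, 120] } }
    ]
});

// The store behaves like a bag of records with helper methods, not a table.
var rec = store.getAt(0);
console.log(rec.get('metadata').unit); // "C"

// Under the hood, records live in a MixedCollection-style key/value collection.
var mc = new Ext.util.MixedCollection();
mc.add('k1', { anything: 'goes here' });
console.log(mc.get('k1').anything);
The point is simply that the store never inspects the nested object; it just hands it back, so the complexity of the value is irrelevant to the store itself.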
Take, for instance, a JS engine that implements associative arrays (objects) using hash tables. As we know, hash tables have a worst case of O(n) because collisions are inevitable.
Suppose I begin to develop a new data structure in JavaScript, such as a linked list, which has a worst case of O(1) for insert/delete. But since I implement it with objects/arrays, it must be true that my implementation is also at minimum worst-case O(n) as well.
I'm aware that the engine optimizes very well and that a good hash function gives O(1) on average. However, I just want to confirm my realization that this isn't all as straightforward as the textbooks say. Or is it?
I suppose since at the root all data structures are implemented with an array, where access is always O(1), shouldn't all data structures be built on an array without intermediary structures? Also, a dynamic array still has O(n) delete; can't that same problem trickle down, just like in my earlier example?
Is this where using a low-level programming language is better than using a high-level language? At a low level there isn't so much abstraction, so the textbook complexity numbers can actually match?
Apologies if my ideas are all over the place.
But since I implement [my custom data structure] with objects/arrays, it must be true that my implementation is also at minimum worst-case O(n) as well.
No. Your linked list does not use an object with n keys anywhere. You'll have a
const linkedListNode = {
value: …,
next: null,
};
but even if this were implemented using a hash table with O(n) worst-case member access, in your case n = 2. There are not arbitrarily many properties in your object, just two. That's how you get back to O(1).
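To make that concrete, here is a tiny sketch (names invented) showing why insertion after a known node stays O(1): it touches a fixed number of properties no matter how long the list is.
// Insertion after a known node touches a fixed number of properties,
// so it is O(1) regardless of list length.
function insertAfter(node, value) {
  const newNode = { value: value, next: node.next };
  node.next = newNode;
  return newNode;
}

const head = { value: 1, next: null };
const second = insertAfter(head, 2); // head -> 2
insertAfter(second, 3);              // head -> 2 -> 3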
Is this where using a low-level programming language is better than using a high-level language? At a low level there isn't so much abstraction, so the textbook complexity numbers can actually match?
No. Even in a lower-level programming language, you can begin to question the underlying abstraction. You think in C, an indexed array access in memory is constant time? No, since page faults and other caching shenanigans come into play.
This is why textbook complexity is always defined in terms of a machine model. As long as you define object property access in your JavaScript execution model as constant-time (and it's a very reasonable assumption to do that! It closely resembles the real world), your numbers do apply to JavaScript code as well. Sure, you can try unravelling abstractions and analyse your high-level algorithms in terms of the primitives of a lower level, but there's no point in doing that. It's precisely why we have these abstractions in the first place.
"associative arrays (objects) by using hash tables" -> Javascript objects are complicated and are much more than just "it's a hash map". I don't know the exact technical details but I think they change to hash map after a certain amount of values are stored in them, along with that they also store metadata which is used on other algorithms like Object.keys to automatically sort the keys after they've been pulled out of the hash map. Again I don't know the technical details but I do know that, it's not straight forward.
"as we know hash tables has worse case O(n) because collisions are inevitable" -> It depends on what hashing you're using, but more than that even if collisions are inevitable it's not correct to just claim "it's worse case O(n)" and leave it at that because the probability of it being O(n) logarithmically declines to 0, the chances of it finding a collision time after time again and again is extremely unlikely, so while it can perhaps find a collision that doesn't effectively describe the time complexity.
"it must be true that my implementation is also at minimum worse case O(n) as well" -> Not correct, you're speaking about two different things. If you build a linked list each node will be connected to the next node using a heap reference, which has nothing to do with javascript objects. Iterating through an entire linked list will be O(n) but that's because of having to iterate over every next node, not because of anything with objects or hashes.
"worse case O(1) for insert/delete" -> This is only true if you have the reference to the node where you want to insert/delete it, otherwise you'll have to search through it before insertion/deletion. But that's exactly the same in javascript.
"then shouldn't all data structure be built with array without intermediary structures" -> Most data structures I know (like a list, stack, queue) are implemented on top of a normal array. The ones that aren't (like a binary tree, dictionary/map or a linked list) are not implemented on an array because it wouldn't really make sense. For example the whole point of using object references with a linked list is so that you can directly insert/delete something, using an array under the hood would just defeat the entire point of using a linked list when you're specifically trying to take advance of the object references.
"also dynamic array still has delete O(n) cant that be the same problem which can trickle down just like my earlier example" -> Not necessarily because when you wrap things inside an object and use an internal array inside, you can add metadata, indexes, hashes, things stored outside of the array (private to the object) and all sorts of other things to speed up and keep track of things on that array. So the complexity of what's used internally doesn't just automatically spill over to using it in another object. But you do need to be careful, like if you use a list then the inner workings of it in languages like C# is that it will double the internal array when you try to add more elements to it after it's full, this can result in a lot of memory waste.
That being said, the use case for JavaScript is in 99% of cases not "optimize this by another 10 ms". JavaScript is used for its non-blocking I/O, streaming, async/await reactive programming, and its rapid speed of development, and it's mostly used for web communication, not some highly optimized graphics engine. So there are very, very few edge cases where you need over-9000 levels of complexity optimization; feature development, code readability, maintainability, and things like that are a much bigger deal in JS in general. Along with that, in most use cases you aren't going to request 1M records from your DB in JS; usually it's more like 50 that you want to display on a page, and for that you'll use the DB to optimize your query. There's hardly ever any need for such large data structures in JS development (or web development in general); it's a lot better to pull what you need and to request more, or to continuously stream what you need to the client. So a lot of data structures (like a binary tree) aren't really relevant to things in JS unless it's a very specific use case.
I have a JSON file and I need to look for a specific value in it (the file is pretty big), so I want to turn this JSON into an array, which would make searching easier and faster.
BUT what is the best way to save this array, so I won't need to run over this JSON every time and it will be kept until a recycle or service restart?
(Node.js project)
1) Use Redis (recommended)
Pros:
Access objects super fast.
Isolated from the Node process.
Will not affect heap memory.
Can be deployed on a separate server.
Data persists if your application crashes.
Supports compression.
Low memory consumption.
Cons:
You may face some limitations if you have nested objects, but there is a workaround that requires some extra handling.
2) Use a database (MongoDB preferred)
Pros:
Save/Load objects easily since MongoDB supports JSON.
The same as Redis pro #2 (isolated from the Node process).
Cons:
You have to query every time to access objects.
3) Use files (not recommended). When your application starts/restarts, load your objects from the file into a global array, and when your application closes/shuts down, dump the objects from the global array back into the file.
Pros:
Access objects fast.
Cons:
Heavy heap memory usage if your objects are huge.
Data loss if your application crashes.
In the end it's your choice: if speed matters, choose Redis; if you want the easy way, choose MongoDB; and if losing some of your data is not a problem, go for files. You can also mix options 2 and 3.
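For reference, a minimal sketch of the Redis option in Node.js, assuming the node-redis v4 client and placeholder key/file names (older client versions use a callback-based API instead):
const fs = require('fs/promises');
const { createClient } = require('redis');

async function main() {
  const client = createClient(); // defaults to localhost:6379
  await client.connect();

  // Warm the cache once, e.g. on deploy or first start.
  const raw = await fs.readFile('data.json', 'utf8');
  await client.set('mydata', raw);

  // Later, any process can read and parse it without touching the file.
  const items = JSON.parse(await client.get('mydata'));
  console.log(items.length);

  await client.quit();
}

main().catch(console.error);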
I'm trying to get my head around the use of IndexedDB. I have an SQL database which I access via REST, and I'm planning on providing some local caching using IndexedDB.
My SQL structure uses a large (and variable) number of tables, each table storing an array of data (time sequence and value) for a specific sensor. Ideally, I would have assumed I'd create a new object store for each of my tables from MySQL. However, it seems that you can only create an object store when the database is opened, which is a bit of a pain.
So, I see a number of options:
I could use a single object store and add two indexes, one for the time and one for the sensor. I'm a little worried that this might have performance issues, but I'm not sure how data is stored under the hood.
I could probably detect a new sensor somehow and open the database with a new version number. This just feels a little wrong to me.
I could alternatively use different databases for each sensor, but I've read somewhere that it's not recommended to use multiple databases (although it's unclear why, since this is possibly the easiest solution).
I'd welcome any thoughts people have regarding the best structure for this sort of data that will provide good performance.
If your data sets are independent (for example, you don't need to combine results from multiple sensors), I suggest you split them into different tables (object stores) and/or different databases. The separate-database option is more convenient for deleting data.
The performance limit for a single IndexedDB database is somewhere above 50K records, depending on browser and hardware. I have a couple of tests that can measure the speed; just tweak the size of the objects being inserted and you can test your use case.
If you have fewer than 10K records per sensor (object store/database) you won't hit big performance issues. One common mistake when inserting a batch of data is using a separate transaction for each insert; this is completely unnecessary, since you can store 10K records within one transaction. If you are working with an even larger data set, you can split the inserts across a couple of transactions so you won't block reads of that database.
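A rough sketch of both points: one object store with indexes on sensor and time, plus a batch insert done inside a single transaction. The database, store, and index names are made up, and readings is assumed to be an existing array of {sensorId, time, value} objects.
const req = indexedDB.open('sensordb', 1);

req.onupgradeneeded = function (e) {
  const db = e.target.result;
  // Object stores and indexes can only be created during a version upgrade.
  const store = db.createObjectStore('readings', { autoIncrement: true });
  store.createIndex('bySensor', 'sensorId');
  store.createIndex('byTime', 'time');
};

req.onsuccess = function (e) {
  const db = e.target.result;
  const tx = db.transaction('readings', 'readwrite'); // one transaction
  const store = tx.objectStore('readings');
  for (const r of readings) {
    store.add(r); // no per-insert transaction
  }
  tx.oncomplete = function () { db.close(); };
};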
Also, for every transaction you do in IndexedDB you need an open connection. Some people keep one connection alive and reuse it; I prefer opening and closing a separate connection for each transaction.
Also, for faster access you can store all the database info in Local Storage; that way you can keep track of how many databases you have and a description for each of them.
Additionally you can take a look at this similar question
When calling ko.toJS(entity) on any Breeze entity that has navigation properties or is part of a large object graph, IE7-8 complain of long-running script errors and even Chrome can get bogged down for over a second.
What gives?!?
I have had this plague me before and seen various forums or Github issues where this became an issue for devs. I wanted to share some knowledge in hopes that it helps others.
The problem
Breeze.js provides your client-side libraries or frameworks with an object graph. Calling JSON.stringify() on the entity will cause the dreaded stack overflow because some of the properties reference each other. Example -
Person -> Company -> Persons
Since a Person has a navigation property to its Company, which in turn has a navigation property back to its Persons, they keep finding each other.
ko.toJS() gets around this by detecting duplicates and stopping the serialization at that point.
Why is it slow?
The serialization takes longer depending on the number of entities in the object graph. Don't believe me? Load your app and call ko.toJS(), then go load up a bunch more entities in the same object graph and call it again: it will take longer, guaranteed. The problem is that even if you call ko.toJS() on a single person, Knockout isn't finding duplicates until it has serialized every person in the original person's company. This isn't a problem with Knockout or Breeze; it is caused by how interconnected everything is.
What can I do to fix it?
Using a projection query is the best solution, if you can. The problem is that usually we need deeper nested levels of entities, not just the current entity. If your entity has a navigation property with many related entities, there is no easy way to perform a projection query, so we need to take it a step further in one of two manners -
Write a custom serializer
Vote for IdeaBlade (the company behind Breeze.js) to support unwrapping more efficiently
My opinion - expose a method on the entity manager to 'unwrap' that accepts the following parameters -
entityManager.unwrap(entity || [entities], [addtlTypesToUnwrap])
Where addtlTypesToUnwrap is either an array of strings or a comma-separated string that defines which related entity types to unwrap.
Example usage -
var thisEntity = manager.fetchEntityByKey('Person', 1, true);
var relatedProps = ['Company', 'Address'];
entityManager.unwrap(thisEntity, relatedProps);
Which would unwrap the Person, their Address, their Company, and the Company's Address (if it existed).
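In the meantime, here is a rough sketch of option 1 (a custom serializer), which is not anything Breeze provides: unwrap the entity's own observables and only descend one level into an explicit whitelist of navigation properties, so cycles never come into play. The property handling is simplified and the names mirror the hypothetical proposal above.
// Unwrap an entity shallowly; only whitelisted navigation properties
// are unwrapped, and only one level deep.
function shallowUnwrap(entity, relatedProps) {
  relatedProps = relatedProps || [];
  var result = {};
  for (var name in entity) {
    if (name === 'entityAspect' || name === 'entityType') continue; // Breeze bookkeeping
    var value = ko.unwrap(entity[name]);
    if (typeof value === 'function') continue; // skip methods
    if (value && typeof value === 'object') {
      if (relatedProps.indexOf(name) !== -1) {
        result[name] = Array.isArray(value)
          ? value.map(function (v) { return shallowUnwrap(v, []); })
          : shallowUnwrap(value, []);
      }
    } else {
      result[name] = value;
    }
  }
  return result;
}

// Usage, mirroring the proposal:
// shallowUnwrap(thisEntity, ['Company', 'Address']);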
Long story short: I'd like to treat several JavaScript associative arrays as a database (where the arrays are tables). The relations could be represented by special fields inside the arrays. I'm not interested in the persistence aspect of a database; I only want to be able to query the arrays with a SQL-like language and retrieve sets of data in the form of associative arrays.
My question: is there any JavaScript library that has such features? Otherwise, is there any library that can at least take care of the SQL-like language part?
Thanks
I believe the closest thing to what you need is the jLinq library. It can operate on JS objects and arrays in much the same way you would work with a database, but in a slightly different way: you don't really write textual queries, you use chained methods to construct them. Overall I think it's way better.
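If you only need the querying part and don't want a library, plain array methods already get you quite far. A small sketch (the data and field names are invented, and this is not jLinq's syntax):
var users  = [{ id: 1, name: 'Ada',  cityId: 10 },
              { id: 2, name: 'Alan', cityId: 20 }];
var cities = [{ id: 10, name: 'London' },
              { id: 20, name: 'Manchester' }];

// SELECT * FROM users WHERE name LIKE 'A%'
var aUsers = users.filter(function (u) { return u.name.charAt(0) === 'A'; });

// SELECT u.name, c.name AS city FROM users u JOIN cities c ON u.cityId = c.id
var joined = users.map(function (u) {
  var city = cities.filter(function (c) { return c.id === u.cityId; })[0];
  return { name: u.name, city: city ? city.name : null };
});

console.log(aUsers, joined);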
Some googling found this: http://ajaxian.com/archives/two-js-solutions-to-run-sql-like-statements-on-arrays-and-objects which seemed interesting.
Can I ask why you want to do this?
I came across this question while searching for something sort of related. I wanted to share (9 years later, I realize) that I often have the same need: the script I'm working on has a lot of cross-referencing to do between various sources of information. I use PowerShell. Enumerating arrays of objects from within a loop that is enumerating other objects is just bad/slow/horrible.
To date, my solution has been to take all my arrays and build hashtables from them, where the key/name is a property value that is common across all the arrays (e.g. an ObjectId GUID) and the value is the entire object from the array. With this, while my loop is enumerating Array #1, I can check for the presence of the current item in any of the other arrays simply by checking for the key in the corresponding hashtable. That way there's no enumerating the other arrays; there's just direct, equal-effort access to the correct item (really coming from the newly built hashtable).
So my arrays are just temporary collection buckets; everything I do from there uses the hashtables, which are just index/lookup tables.
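Since the original question was about JavaScript, the same idea there looks like this (a sketch with invented data): build a Map keyed by the shared property once, then do constant-time lookups instead of nested loops.
var orders = [{ userId: 1, total: 40 }, { userId: 2, total: 99 }];
var users  = [{ id: 1, name: 'Ada' },   { id: 2, name: 'Alan' }];

// Build the lookup table once.
var usersById = new Map(users.map(function (u) { return [u.id, u]; }));

// Inside the loop over orders there is no second enumeration.
orders.forEach(function (order) {
  var user = usersById.get(order.userId);
  console.log(user.name + ' spent ' + order.total);
});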
What I was searching for when I stumbled here was solutions for keeping track of all the different hashtables in the building/planning phase of my scripts.