I have a flat data file in XML format, but there isn't currently a real Windows viewer for it. I decided to create a simple application with Node-WebKit, just for basic viewing - the application won't need to write to the data file.
My problem is that I don't know the proper way to read a large file. The data file is a backup of phone SMS and MMS messages, and the MMS entries contain Base64 image strings where applicable - so the file gets pretty big when there are a lot of images (generally around 250 MB). I didn't create/format the original data in the file, so I can't modify its structure.
So, the question is - assuming I already have a way to parse the XML into JavaScript objects, should I:
a) Parse the entire file when the application is first run, storing an array of objects in memory for the duration of the application's lifetime, or
b) Read through the entire file each time I want to extract a conversation (all of the messages with a specific outgoing or incoming number), and only store that data in memory, or
c) Employ some alternate, more efficient, solution that I don't know about yet.
Convert your XML data into an SQLite db. SQLite is NOT memory based by default. Query the db when you need the data, problem solved :)
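To make that concrete, here is a rough sketch of the approach in Node - streaming the XML into a SQLite database so the whole file never has to sit in memory. It assumes the sax and sqlite3 npm packages, and the element/attribute names (sms, address, body, date) are placeholders for whatever your backup format actually uses:

```js
// Sketch only: stream-parse the backup and insert rows into SQLite.
// Assumes `npm install sax sqlite3`; element and attribute names are guesses.
const fs = require('fs');
const sax = require('sax');
const sqlite3 = require('sqlite3');

const db = new sqlite3.Database('messages.db');
db.serialize(() => {
  db.run('CREATE TABLE IF NOT EXISTS messages (address TEXT, body TEXT, date INTEGER)');
});

const parser = sax.createStream(true); // strict mode
parser.on('opentag', (node) => {
  if (node.name === 'sms') { // placeholder element name
    const a = node.attributes;
    db.run('INSERT INTO messages (address, body, date) VALUES (?, ?, ?)',
           [a.address, a.body, a.date]);
  }
});
parser.on('end', () => db.close());

fs.createReadStream('backup.xml').pipe(parser);
```

Extracting a single conversation then becomes a simple SELECT ... WHERE address = ?, so only the messages you are actually viewing are held in memory.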
Related
I have a bunch of zip code records (50k) with the states they belong to in a CSV file (0.5 MB in size). I want to read them into an array and later write my own function to check whether a user-provided zip code matches a state.
Currently, I have those records in MongoDB and do an async read at the beginning of the application, which takes time. I never need to update the data - a one-time read and the array's filter function would do the job for me.
Would you have any suggestions on an alternative way of storing the data that is quicker to load?
Thank you,
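For scale, the one-time in-memory read described above is tiny - 0.5 MB loads in milliseconds from a plain file. A minimal sketch in Node, assuming a header-less two-column zips.csv (the file name and column order are guesses):

```js
// Sketch of the one-time synchronous load described in the question.
// Assumes zips.csv contains lines like "90210,CA"; adjust to the real layout.
const fs = require('fs');

const zipToState = new Map(
  fs.readFileSync('zips.csv', 'utf8')
    .split('\n')
    .filter(Boolean)
    .map((line) => {
      const [zip, state] = line.split(',');
      return [zip.trim(), state.trim()];
    })
);

// Lookup is then O(1) instead of filtering the whole array each time:
function zipMatchesState(zip, state) {
  return zipToState.get(zip) === state;
}

console.log(zipMatchesState('90210', 'CA'));
```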
I have a large JSON file that I would like to store in Redis. The problem is that when I parse it I run out of memory in Node.js.
I extended the heap memory from 1.39 GB to 4 GB, but it's still happening, and I believe I am not doing it properly.
After a lot of searching, I found out that streaming is my best bet. The thing is, I am not really fluent with streaming, and I am not even sure it would resolve my problem.
I have read a lot, but the information is scattered. I wanted to ask whether you think this is even a workable approach, and whether I have it right:
Would I be able to stream a JSON object into Redis?
Do I have to stringify it, or will that happen automatically?
Should I stringify my JSON chunk by chunk?
Or will streaming into Redis end up storing it as a string?
I am using the ioredis client to interact with Redis.
I appreciate your help in advance.
If you can guarantee that only one process will be updating that key, you could possibly use SETRANGE. As you parse the file, you can keep a reference to the next offset:
(pseudo-code)
offset = 0
offset = redis.set_range(key, offset, "string")
Then you can load pieces of the file up to Redis without having to load everything into memory at once.
SETRANGE returns the length of the string after it was modified.
This also assumes that you can load pieces of the file contents without having to parse everything as JSON and then convert it back to a string. It also assumes that only one process is updating the key -- if multiple processes try to update it, the JSON value can get corrupted.
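A rough, runnable version of that idea with ioredis (which the question mentions) - a sketch only; the key name is an assumption, and it only works if the file can be forwarded verbatim without re-parsing:

```js
// Sketch: stream the file into a single Redis string key with SETRANGE,
// chunk by chunk, so the whole JSON never has to be parsed or held in memory.
const fs = require('fs');
const Redis = require('ioredis');

const redis = new Redis(); // assumes a local Redis instance
const key = 'big-json';    // hypothetical key name

async function upload(path) {
  let offset = 0;
  await redis.del(key);
  for await (const chunk of fs.createReadStream(path)) {
    // SETRANGE returns the new total length, which is the next offset.
    offset = await redis.setrange(key, offset, chunk);
  }
  return offset; // final length of the stored string
}

upload('large.json').then((len) => {
  console.log(`Stored ${len} bytes`);
  redis.quit();
});
```

On the reading side you would pull it back in pieces with GETRANGE, or with a plain GET if the consumer can hold the whole string at once.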
I am working with PHP and heatmap.js to generate a heat-map.
I was thinking of going down the path of allowing the user to upload a floor-map JPG file initially, and then letting them add sensor names to different locations on the floor-map.
Once the sensor locations are specified, I need to save that configuration to an XML file. Once I have this set of information (img_id, [sensorid1,x1,y1], [sensorid2,x2,y2],..,[sensoridn,xn,yn]), I can query my database for the latest sensor values and then display them as a heat-map on the image (at the specific sensors' x and y coordinates) in real time.
I would like to know if saving the configuration as XML is the right way of doing it. Is there a better way of temporarily storing the information using JavaScript/PHP?
There are likely a bunch of ways to solve this. My preference would be for JSON, as it is natively supported by JavaScript and PHP. It is also MUCH easier to read and write.
When you say "saving", what do you mean? If you need it stored server-side, then creating DB entities that the data structure can be mapped to and stored in will be far better than trying to create files server-side. Depending on how the app is hosted, you may not have permission to do that, and if your server ever goes away you could lose that data (however, there are safe ways to create files using a service like AWS S3). Storing it in a database not only gives you a single place to worry about backups, but also lets you query the data in interesting and powerful ways (SQL etc.) easily, without having to figure out how to do that for files with every new query.
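For illustration, the configuration described in the question maps naturally onto a JSON structure like the one below (the field names are only a guess at a sensible shape):

```js
// Hypothetical shape for the floor-map/sensor configuration.
const config = {
  img_id: 'floor-1.jpg',
  sensors: [
    { id: 'sensorid1', x: 120, y: 340 },
    { id: 'sensorid2', x: 410, y: 95 }
  ]
};

// Easy to hand between the browser and the server:
const payload = JSON.stringify(config); // send via Ajax / store in a DB column
// PHP side: $config = json_decode($payload, true);
```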
Currently, my website is using a Mongo.Collection to hold data submitted from another site. Strings are sent over via HTTP methods and packed into the collection afterward. However, the collection now needs to support storing larger files while still holding the data already stored, so I've been looking into converting it to GridFS. Is there a way to attach the existing data to empty files as metadata, or is the conversion more convoluted than that?
I have some data that I want to display on a web page. There's quite a lot of data, so I really need to figure out the most efficient way of loading and parsing it. In CSV format the file size is 244 KB, and in JSON it's 819 KB. As I see it, I have three different options:
1. Load the web page and fetch the data in CSV format as an Ajax request, then transform the data into a JS object in the browser (I'm using a built-in method of the D3.js library to accomplish this).
2. Load the web page and fetch the data in JSON format as an Ajax request. The data is ready to go as is.
3. Hard-code the data in the main JS file as a JS object. No need for any async requests.
Method number one has the advantage of a smaller file size, but the disadvantage of having to loop through all 2,700 rows of data in the browser. Method number two gives us the data in its final format, so there's no need for heavy client-side operations; however, the size of the JSON file is huge. Method number three has the advantage of skipping additional requests to the server, with the disadvantage of a longer initial page load time.
What method is the best one in terms of optimization?
In my experience, data processing times in JavaScript are usually dwarfed by transfer times and the time it takes to render the display. Based on this, I would recommend going with option 1.
However, what's best really does depend on your particular case -- you'll have to try. It sounds like you have all the code/data you need to do that anyway, so why not run a simple experiment to see which one works best for you?
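As a point of reference, option 1 with D3 is only a couple of lines - a sketch assuming D3 v5 or later (promise-based) and a hypothetical data.csv path:

```js
// Fetch the CSV and parse it into an array of row objects in the browser.
// Assumes D3 v5+, where d3.csv returns a Promise.
import * as d3 from 'd3';

d3.csv('data.csv').then((rows) => {
  // rows is an array of objects keyed by the CSV header names.
  console.log(`Loaded ${rows.length} rows`);
  render(rows); // hypothetical rendering function
});
```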