I have a system that generates data every day and saves it to a JSON file. The file is about 120 MB.
I'm trying to send the data with Node.js to the client:
router.get('/getData', (req, res) => {
  const newData = require('./newDataJson.json');
  res.json(newData);
});
Then from the client I use an axios GET request:
const fetchNewData = async () => {
  const { data } = await axios.get('/api/getData/');
};
The data reaches the client in about 3 minutes in production, and in a few seconds on localhost.
My question is whether it is possible to shorten the load time in production.
Thanks!
I would suggest using a stream in Node.js, which is well suited to sending large data from the server to the client. Streams are useful when you send big chunks of data. Give it a try and check whether there are any improvements after adding this.
const fs = require('fs');
router.get('/getData', (req, res) => {
  const readStream = fs.createReadStream('./newDataJson.json');
  res.set({
    'Content-Type': 'application/json',
    'Content-Disposition': 'attachment; filename="newDataJson.json"',
  });
  readStream.pipe(res); // stream the file instead of buffering it all in memory
});
Also, as Andy suggested, another option is to divide your file's data into smaller partitions, for example based on hours.
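For illustration, here is a minimal sketch of that partitioning idea. It assumes the daily file holds a JSON array and that every record carries a timestamp field (both assumptions about your data; adjust the field name to whatever you actually store):

const fs = require('fs');

// Split the daily file into 24 hourly files so clients can fetch only what they need.
const records = JSON.parse(fs.readFileSync('./newDataJson.json', 'utf8'));
const byHour = {};

for (const record of records) {
  const hour = new Date(record.timestamp).getUTCHours(); // 0..23
  (byHour[hour] = byHour[hour] || []).push(record);
}

fs.mkdirSync('./partitions', { recursive: true });
for (const [hour, chunk] of Object.entries(byHour)) {
  fs.writeFileSync(`./partitions/newData-${hour}.json`, JSON.stringify(chunk));
}

The client can then request only the hour(s) it actually needs instead of the whole 120 MB file.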
120 MB is far too much for an initial page load. The best thing to do would be to split the file into smaller chunks:
1. Right after the user sees the loaded page, you need (as I assume) only a portion of that data. So initially send only a small chunk to make the data visible. Keep it small, so the data doesn't block loading and first paint.
2. Keep sending the rest of the data in smaller parts, or load it on demand in chunks, e.g. with pagination or scroll events (see the sketch below).
You can start splitting the data as you receive/save it.
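As an example, a paginated endpoint is one minimal way to serve the data in chunks. This is only a sketch: it assumes the file holds a JSON array, keeps the whole array in server memory, and the page/size parameter names are made up for illustration:

const data = require('./newDataJson.json'); // loaded once at startup

// GET /api/getData?page=0&size=1000
router.get('/getData', (req, res) => {
  const page = Number(req.query.page) || 0;
  const size = Math.min(Number(req.query.size) || 1000, 5000);
  res.json({
    total: data.length,
    page,
    items: data.slice(page * size, (page + 1) * size),
  });
});

The client fetches page 0 for the first paint and requests further pages on scroll or on demand.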
Loading data from api on scroll (infinite scroll)
https://kennethscoggins.medium.com/using-the-infinite-scrolling-method-to-fetch-api-data-in-reactjs-c008b2b3a8b9
Info on loading time vs response time
https://www.pingdom.com/blog/page-load-time-vs-response-time-what-is-the-difference/
Related
I'm sorry, but I have been trying to solve this problem for 2 hours now. I am writing a small web app using Next.js. I load a CSV file, send it to my backend via an API POST request, and process it there. The processed data is then sent back to the frontend (an array of about 150 objects with 9 attributes each). Now I want the frontend to take the data and forward it to a new page (the "view" page should just be a different one from the one where I read the file in). I just can't seem to do that. There was Router.query in Next.js 12, but it was removed, and I can't think of a sensible way to transmit my data. Simply stringifying it and appending it as a query parameter doesn't work (probably because it's too long). I can't have the "view" page send a GET request, because I don't want to store the data in the backend, so it wouldn't be available for such a query. Can someone enlighten me?
const response = await fetch("/api/processcsv", {
method: "POST",
body: JSON.stringify(formData)
})
if (response.ok) {
console.log("Data send")
const res = await response.json()
router.push('/datadisplay?data='+JSON.stringify(res))
} else {console.log(response.status)}
This is not working.
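One common workaround, sketched here on the assumption that the data only has to survive the client-side navigation, is to stash the processed result in sessionStorage before calling router.push and read it back on the view page. The 'processedCsv' key and the /datadisplay page below are just illustrative names:

// On the page that did the POST, instead of packing the data into the URL:
const res = await response.json();
sessionStorage.setItem('processedCsv', JSON.stringify(res));
router.push('/datadisplay');

// pages/datadisplay.js -- reads the stashed data on the client
import { useEffect, useState } from 'react';

export default function DataDisplay() {
  const [rows, setRows] = useState([]);

  useEffect(() => {
    // sessionStorage only exists in the browser, so read it inside an effect
    const stored = sessionStorage.getItem('processedCsv');
    if (stored) setRows(JSON.parse(stored));
  }, []);

  return <pre>{JSON.stringify(rows.slice(0, 10), null, 2)}</pre>;
}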
I'm writing a very basic webapp that displays data requested from a server. The server sends JSON "packets" of data as a chunked response.
I'm reading the data from the server using the JavaScript fetch API in my webpage, via the ReadableStream API. As far as I can tell from experimenting, each chunk that I send from the server arrives at the client as a separate block of data. If I can assume that, my client is straightforward:
const response = await fetch("/server_api");
const reader = response.body.getReader();
while (true) {
const {value, done} = await reader.read();
if (done) break;
// convert "value" from an array of bytes to a JS object
// by parsing it as JSON
obj = JSON.parse(new TextDecoder().decode(value))
// process the object
}
However, this will fail if a chunk from the server gets split between two reads, or if two chunks from the server get merged in a single read.
Is this a possibility that I need to worry about? If I do, my code (both server and client side) will need to get significantly more complex, so I'd prefer to avoid that if I can.
Note that I'm specifically talking here about HTTP chunked responses and how they interact with the JavaScript fetch API, not about TCP or other levels of the network stack.
Answering my own question, I have triggered a case where two blocks of data sent by the server arrived at the client in a single read() result from the reader.
So yes, you do have to allow for the possibility that chunks sent by the server get merged. I haven't seen a case of a chunk getting split, so it's possible that might not happen, but the key point is that there is no one-to-one correspondence between the chunks the server sends and the chunks the client receives.
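A minimal sketch of a client that tolerates both cases, assuming the server delimits each JSON object with a newline (NDJSON-style framing is an assumption about the server, not something chunked encoding gives you for free):

const response = await fetch("/server_api");
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // A single read may contain zero, one, or several complete lines;
  // whatever follows the last newline is an incomplete object.
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep the partial tail for the next read
  for (const line of lines) {
    if (line.trim() === "") continue;
    const obj = JSON.parse(line);
    // process the object
  }
}

if (buffer.trim() !== "") {
  const obj = JSON.parse(buffer); // final object if the stream didn't end with a newline
  // process the object
}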
I am fetching about 60,000 documents from an index ABC in Elasticsearch (v7) through their Node.js client. I tried using their recommended Scroll API, but it took almost 20s. Next, I increased max_result_window for the index ABC to 100,000 (from the default 10,000) and the query took 5s.
I am running Elasticsearch v7 on a 8-core, 32GB-RAM machine. I am not performing any aggregations and my query is just a simple GET request to fetch all documents from the index ABC. Is there any way to speed this up to less than one second through their Node.js client?
My code in Express
const { body } = await client.search({
index: 'ABC',
size: 60000,
body: {
query: {
match_all: {}
}
},
});
res.send(body.hits.hits);
If there's no other way to cut the time down, should I cache the entire response in memory and let the UI read from it rather than hitting ES directly? (The data only updates once per day, so I might not need to worry about cache invalidation.) Should I look into Redis for this?
Though I'm not exactly sure what your use case is, the scroll API is probably not appropriate for a web server servicing a web page. To quote this page in the docs:
Scrolling is not intended for real time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration.
Instead, you should paginate the results with the from/size arguments. When the UI needs to display more items, you should submit another API request to pull another page by increasing the from argument appropriately.
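A minimal sketch of that from/size approach inside the existing Express handler, using the v7 Node.js client (the page query parameter and the 1,000-document page size are illustrative choices):

const page = Number(req.query.page) || 0;
const size = 1000;

const { body } = await client.search({
  index: 'ABC',
  from: page * size,
  size,
  body: {
    query: {
      match_all: {},
    },
  },
});

res.send({
  total: body.hits.total, // { value, relation } in v7
  page,
  hits: body.hits.hits,
});

Note that from + size is still bounded by index.max_result_window (10,000 by default), so the idea is to keep each page small rather than raising that limit.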
Background
We are using Puppeteer to render PDFs on a Node server. We pass large query strings to an API, which hands them to Puppeteer. Once Puppeteer renders the web page, the data in the GET query string is pulled into the rendered HTML page, so the page is populated dynamically. Once the page renders, Puppeteer converts it to a PDF and it is downloaded to the client.
Problem
We realized that when the requests are very large, hitting the API with a GET request breaks the browser. To overcome this, we are hitting the API with a POST and hashing the data so it can be rendered later.
This got us wondering whether there is a maximum number of characters for the Puppeteer call that renders the web page used to generate the PDF.
Example Code
const browser = await puppeteer.launch({
args: ['--no-sandbox', '--disable-setuid-sandbox'],
ignoreHTTPSErrors: true,
dumpio: false
});
const page = await browser.newPage();
const data = reqMethod === 'POST' ? req.body : JSON.parse(req.query.data);
const {pdfOptions, ...templateData} = data;
const url = `${PDF_API_PROD}/${template}?data=${JSON.stringify(templateData)}`;
await page.goto(url);
const pdfBuffer = await page.pdf({
format: 'A4',
margin: {
top: '20px',
left: '20px',
right: '20px',
bottom: '20px',
},
...pdfOptions,
});
Question
After looking at the code above, you will see that we are passing the data object directly into the URL as a GET param. This is used to render the web page with Puppeteer.
Once the web page is rendered with Puppeteer, the data in the GET string is pulled into the web page with JavaScript in order to render the page dynamically.
What is the maximum number of characters that can be passed to the Puppeteer call await page.goto(url)?
There is no hard limit built into the browser. I was able to send URLs of up to 2,000,000 characters to a server myself without any problems. Even beyond that, I only ran into trouble because it simply takes some time to send that much data.
If you are having trouble sending large URLs, it is most likely one of the following two things:
1. The server is not properly configured to receive that amount of data.
To receive that much data, you have to configure your server accordingly. By default, most servers cap the amount of data that can be sent via the URL.
2. You are hitting a timeout
Keep in mind that sending a few MB of data might take some time, depending on your internet connection and the server. It might also be slower to send the data in the head of the HTTP request (the request line and headers) than to stream it inside the body. In my test cases, this was the limiting factor.
Therefore: Most likely, the problem you are encountering is not related to puppeteer but to the receiving end.
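For illustration, here is a minimal sketch of point 1, assuming the receiving end is a plain Node.js HTTP server. The URL, including the query string, counts against the header-size limit, which is roughly 16 KB by default:

const http = require('http');

// Raise the request-line/header limit so multi-MB URLs are accepted.
const server = http.createServer({ maxHeaderSize: 10 * 1024 * 1024 }, (req, res) => {
  res.end(`URL length received: ${req.url.length}`);
});

server.listen(3000);

If a reverse proxy such as nginx sits in front of the server, its own limits (for example large_client_header_buffers) would need to be raised as well.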
What puppeteer does
As you are wondering whether Puppeteer might truncate the URL: this is not the case. Puppeteer is just a wrapper around the DevTools Protocol. Puppeteer takes the URL argument, wraps it as part of the payload via JSON.stringify, and sends it to the browser. I doubt that the DevTools Protocol has any limitation built into Page.navigate. Therefore, there should be no library-specific limit introduced through Puppeteer here.
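For illustration, roughly the same navigation can be issued directly over the DevTools Protocol; this is a sketch of the kind of call Puppeteer ends up making, not its exact internals:

// Send the URL straight to the browser via a CDP session.
const session = await page.target().createCDPSession();
await session.send('Page.navigate', { url });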
I'm relatively new to full-stack development, and I'm currently trying to figure out an effective way to send and fetch large data between my front-end (React) and back-end (Express) while minimizing memory usage. Specifically, I'm building a mapping app which requires me to work with large JSON files (10-100 MB).
My current setup works for smaller JSON files:
Backend:
const data = require('../data/data.json');
router.get('/', function(req, res, next) {
res.json(data);
});
Frontend:
componentDidMount() {
fetch('/')
.then(res => res.json())
.then(data => this.setState({data: data}));
}
However, if the data is bigger than ~40 MB, the backend crashes when I test locally because it runs out of memory. Holding onto the data with require() takes quite a bit of memory as well.
I've done some research and have a general understanding of JSON parsing, stringifying, and streaming, and I think the answer lies somewhere in using a chunked JSON stream to send the data bit by bit, but I'm pretty much at a loss on its implementation, especially using a single fetch() to do so (is this even possible?).
Definitely appreciate any suggestions on how to approach this.
First off, 40 MB is huge and can be inconsiderate to your users, especially if there's a high probability of mobile use.
If possible, it would be best to collect this data on the backend, probably put it onto disk, and then provide only the necessary data to the frontend as it's needed. As the map needs more data, you would make further calls to the backend.
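For a mapping app, that could look like a bounding-box endpoint. This is only a sketch: it assumes the data is a GeoJSON-style FeatureCollection of Point features, invents a bbox query parameter, and still loads the file once at startup for simplicity (a database or pre-split tiles would avoid even that):

const data = require('../data/data.json');

// GET /features?bbox=minLng,minLat,maxLng,maxLat
router.get('/features', (req, res) => {
  const [minLng, minLat, maxLng, maxLat] = req.query.bbox.split(',').map(Number);

  const features = data.features.filter((f) => {
    const [lng, lat] = f.geometry.coordinates;
    return lng >= minLng && lng <= maxLng && lat >= minLat && lat <= maxLat;
  });

  res.json({ type: 'FeatureCollection', features });
});

As the user pans or zooms, the frontend requests a new bounding box instead of re-downloading the entire file.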
If this isn't possible, you could load this data with the client-side bundle. If the data doesn't update too frequently, you can even cache it on the frontend. This would at least prevent the user from needing to fetch it repeatedly.
Alternatively, you can read the JSON via a stream on the server and stream the data to the client and use something like JSONStream to parse the data on the client.
Here's an example of how to stream JSON from your server via sockets.