A JSON file is 6 GB. When reading it with the following code,
var fs = require('fs');
var contents = fs.readFileSync('large_file.txt').toString();
It had the following error:
buffer.js:182
throw err;
^
RangeError: "size" argument must not be larger than 2147483647
at Function.Buffer.allocUnsafe (buffer.js:209:3)
at tryCreateBuffer (fs.js:530:21)
at Object.fs.readFileSync (fs.js:569:14)
at Object.<anonymous> (/home/readHugeFile.js:4:19)
at Module._compile (module.js:569:30)
at Object.Module._extensions..js (module.js:580:10)
at Module.load (module.js:503:32)
at tryModuleLoad (module.js:466:12)
at Function.Module._load (module.js:458:3)
at Function.Module.runMain (module.js:605:10)
Could somebody help, please?
The maximum size for a Buffer, which is what readFileSync() uses internally to hold the file data, is about 2GB (source: https://nodejs.org/api/buffer.html#buffer_buffer_kmaxlength).
You probably need a streaming JSON parser, like JSONStream, to process your file:
const JSONStream = require('JSONStream');
const fs = require('fs');
fs.createReadStream('large_file.json')
.pipe(JSONStream.parse('*'))
.on('data', entry => {
console.log('entry', entry);
});
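As a rough sketch of how to decide up front whether a file can be read in one go, the file size can be compared with the Buffer limit that the runtime reports. The helper names fitsInBuffer and chooseReadStrategy are made up for this illustration; constants.MAX_LENGTH is the value Node exposes through the buffer module, and it differs between Node versions and platforms.

```javascript
const fs = require('fs');
const { constants } = require('buffer');

// Returns true when a file of the given size fits into a single Buffer.
function fitsInBuffer(sizeInBytes) {
  return sizeInBytes <= constants.MAX_LENGTH;
}

// Hypothetical helper: pick a read strategy from the file size alone.
function chooseReadStrategy(filePath) {
  const { size } = fs.statSync(filePath);
  return fitsInBuffer(size) ? 'readFileSync' : 'stream';
}

console.log('Buffer limit on this runtime:', constants.MAX_LENGTH);
```

Even when the bytes fit into a Buffer, calling toString() on it can still fail because strings have their own length limit, so streaming is likely the safer choice for a 6 GB file.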
You can read the file with a line-reader package for Node.js and, every 50,000 lines, write the lines out sequentially to small files; then process those files and delete them when you are done. This suits tasks that need the data from each line of a larger file, because line readers use streams behind the scenes. Note that the line reader does not wait for you if you process the data directly while reading, for example when updating MongoDB.
I did this and it worked even for a 10 GB file.
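A minimal sketch of the batching idea, assuming Node's built-in readline module in place of the line-reader package mentioned above (so this is a swapped-in technique, not that package's API). The function name processInBatches and the handleBatch callback are hypothetical. Each batch is handed to the callback and awaited before the next batch is collected, instead of being written to intermediate files.

```javascript
const fs = require('fs');
const readline = require('readline');

// Reads filePath line by line and passes lines to handleBatch in groups
// of batchSize. handleBatch may be async; it is awaited before the loop
// starts collecting the next batch. Resolves with the total line count.
async function processInBatches(filePath, batchSize, handleBatch) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  let batch = [];
  let total = 0;
  for await (const line of rl) {
    batch.push(line);
    total += 1;
    if (batch.length === batchSize) {
      await handleBatch(batch);
      batch = [];
    }
  }
  // Hand over whatever is left after the last full batch.
  if (batch.length > 0) await handleBatch(batch);
  return total;
}
```

It could be called as processInBatches('large_file.txt', 50000, saveBatch), where saveBatch is whatever per-batch work is needed, such as a bulk database update.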
Hello and thank you in advance.
This may be a rather simple scenario, however, I am new to Nodejs.
I have a main script which I am calling from javascript using IISNode. Everything has worked great until I decided to rename a dependency file.
Files involved:
embed.js <- main script
dev2.js <- required custom script by embed.js
which reads this json file
fred.json renamed to chartio.json
embed.js code relevant to issue:
var http = require('http');
var jwt = require('jwt-simple');
var dashinfo = require('./dev2');
var ORGANIZATION_SECRET = dashinfo.getkey();
var ORG_ID = dashinfo.getorgid();
dev2.js code relevant to issue:
var mariadb = require('mariadb');
var connectioninfo = require('./chartio.json');
module.exports = {
getkey: function () {
return connectioninfo.connection.apikey;
},
getorgid: function () {
return connectioninfo.connection.orgid;
},
And finally, I have my chartio.json file, which I cannot post due to sensitive data.
I assure you that everything was working until I renamed fred.json to chartio.json.
I have looked online to see if there is a way to clear the cache, but I couldn't find anything that seemed to work, though I am a novice. I also looked at the logs, and I tried running this in both IE and Chrome.
This is what I see logged from the error:
Application has thrown an uncaught exception and is terminated:
Error: Cannot find module './fred.json'
Require stack:
- C:\xxx\xxx\GPS411\node\dev2.js
- C:\xxx\xxx\GPS411\node\embed.js
- C:\Program Files (x86)\iisnode\interceptor.js
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:794:15)
at Function.Module._load (internal/modules/cjs/loader.js:687:27)
at Module.require (internal/modules/cjs/loader.js:849:19)
at require (internal/modules/cjs/helpers.js:74:18)
at Object.<anonymous> (C:\Omnitracs\sylectus-trunk\GPS411\node\dev2.js:3:22)
at Module._compile (internal/modules/cjs/loader.js:956:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:973:10)
at Module.load (internal/modules/cjs/loader.js:812:32)
at Function.Module._load (internal/modules/cjs/loader.js:724:14)
at Module.require (internal/modules/cjs/loader.js:849:19)
Folks, I was able to get this working by doing an old-fashioned workstation reboot. I changed the filename again to test whether the issue returns, and it seems to recognize the file changes now. I guess I was just stuck in cache-land, but now I'm free. Thanks for your consideration.
I'm now learning Node.js from nodeschool.io, and the third exercise is about file I/O.
It asks me to write a program that uses a single synchronous filesystem operation to read a file and print the number of newlines to the console (stdout). The full path to the file to read will be provided as the first command-line argument (i.e., process.argv[2]).
The official answer for this exercise is similar to mine, so I really don't know where I went wrong. This is my solution:
var fs = require('fs');
var contents = fs.readFileSync(process.argv[2]);
var strs = contents.toString();
var lines = strs.split('/n').length - 1;
console.log(lines);
But I got an error:
TypeError: path must be a string or Buffer
at Object.fs.openSync (fs.js:660:18)
at Object.fs.readFileSync (fs.js:565:33)
at Object.<anonymous> (D:\projects\dmt-node-study\first-io.js:3:19)
at Module._compile (internal/modules/cjs/loader.js:654:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:665:10)
at Module.load (internal/modules/cjs/loader.js:566:32)
at tryModuleLoad (internal/modules/cjs/loader.js:506:12)
at Function.Module._load (internal/modules/cjs/loader.js:498:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:695:10)
at startup (internal/bootstrap/node.js:201:19)
app.js
var fs = require('fs');
var contents = fs.readFileSync(process.argv[2]);
var strs = contents.toString();
var lines = strs.split('\n').length - 1;
console.log(lines);
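Run the code from the command line, passing the file path as an argument:
>node app test.txt
This assumes the text file is in your project root directory. If the argument is left out, process.argv[2] is undefined and readFileSync throws the "path must be a string or Buffer" TypeError. Note also that the newline escape is '\n', not '/n'.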
I am a newbie to JavaScript/jQuery. I am writing a simple JS file that parses a CSV file.
var jquery = require('jquery');
jquery.get('file.csv', function(data) {
alert(data); // this is a line
var tempArray = data.split(','); // array of data
for(var i = 0; i < tempArray.length; i++)
{
console.log(tempArray[i]); // probably index 1 is your IPv6 address.
}
});
When I run the code above, I get the following error:
jquery.get('file.csv', function(data) {
^
TypeError: jquery.get is not a function
at Object.<anonymous> (/Users/ishabir1/Desktop/transloc/parser.js:3:8)
at Module._compile (module.js:410:26)
at Object.Module._extensions..js (module.js:417:10)
at Module.load (module.js:344:32)
at Function.Module._load (module.js:301:12)
at Function.Module.runMain (module.js:442:10)
at startup (node.js:136:18)
at node.js:966:3
[Finished in 0.1s with exit code 1]
Can someone please advise? Thanks!
You cannot use jQuery like that in Node.js: jquery.get makes an HTTP request from a browser page, and there is no browser here.
Instead, rely on the fs module for reading files and on a CSV library for actually parsing your data. See this example:
var fs = require('fs');
var parse = require('csv').parse;
var parser = parse(function(err, data){
console.log(data);
});
fs.createReadStream('file.csv').pipe(parser);
Don't forget to run npm install csv before requiring it!
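For very simple files, a dependency-free sketch along these lines may be enough. It assumes there are no quoted fields, no commas inside values, and no newlines inside values; a real CSV library handles those cases and is the safer choice. The function names parseSimpleCsv and readSimpleCsv are made up for this example.

```javascript
const fs = require('fs');

// Splits simple CSV text into rows of fields.
// Assumption: no quoted fields, embedded commas, or embedded newlines.
function parseSimpleCsv(text) {
  return text
    .split(/\r?\n/)
    .filter(line => line.length > 0) // skip blank lines, e.g. a trailing newline
    .map(line => line.split(','));
}

// Reads a file from disk and parses it with the naive parser above.
function readSimpleCsv(filePath) {
  return parseSimpleCsv(fs.readFileSync(filePath, 'utf8'));
}
```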
I have the following tiny program.js which tries to execute a binary file:
var childProcess = require('child_process');
var path2Binary = '/home/myuser/myproj/bins/mybin';
var par = '--file=' + '/home/myuser/myproj/files/myfile.txt';
var ret = childProcess.execFileSync(path2Binary, [par]);
if (!ret) throw 'Error invoking process!';
var cnt = ret.stdout;
if (!cnt) throw 'Error retrieving output!';
console.log(cnt);
The program tries to execute a binary file and passes it a parameter (a file). The output of this process will be then displayed.
I try to run this: node program.js, but get the following
var ret = childProcess.execFileSync(path2Binary, [par]);
^
TypeError: Object #<Object> has no method 'execFileSync'
at Object.<anonymous> (/home/myuser/myproj/program.js:6:24)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Function.Module.runMain (module.js:497:10)
at startup (node.js:119:16)
at node.js:929:3
More information
I am running on CentOS, Node version is v0.10.36.
I tried running sudo yum install nodejs, but it tells me it is already installed so Node installation looks kinda good.
What's the problem?
On a side note...
If I replace childProcess.execFileSync with childProcess.spawn I get the same.
If I change the first line into the following:
var exec = require('child_process').execFileSync;
Then I get an undefined exception on exec.
Synchronous child processes aren't supported in node v0.10.36 - https://nodejs.org/docs/v0.10.36/api/child_process.html
Looks like it may have been introduced in 0.12.
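On a Node version that does provide execFileSync, note that it returns the child's stdout directly (a Buffer, or a string when an encoding is given) rather than an object with a stdout property, so the ret.stdout check in the question would come back undefined. A minimal sketch, where runAndCapture is a hypothetical wrapper and the current Node binary stands in for the real program:

```javascript
const childProcess = require('child_process');

// Runs a binary and returns its stdout as a string.
// execFileSync throws if the process cannot be started or exits non-zero,
// so failures surface as exceptions rather than as a falsy return value.
function runAndCapture(binaryPath, args) {
  return childProcess.execFileSync(binaryPath, args, { encoding: 'utf8' });
}

// Example: run the current Node binary with an inline script.
const output = runAndCapture(process.execPath, ['-e', 'console.log("hello")']);
console.log(output.trim());
```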
UPDATED
I have researched this a lot further. The web application is using the node-db-migrate package. There is a migration folder with two migrations (with table creations). Since I just git cloned it down I am sure I need to run this and I do have node-db-migrate installed on my machine when I hit npm-list.
I head into this folder and hit db-migrate-up and tried db-migrate-up [filename] but I am getting -bash: db: command not found.
I am using this with the node-postgres package. It should be loading the database.json file according to the node-db-migrate file.
Hi, here are lines 1 to 17 of my data.coffee, as requested. According to the command line, the data part could be what is causing the problem.
uuid = require 'node-uuid'
fs = require 'fs'
_ = require 'underscore'
moment = require 'moment-timezone'
apis = require '../logic/apis'
q = require 'q'
data = (_.chain fs.readdirSync "data")
.map (filename) ->
"data/" + filename
.map (f) ->
fs.readFileSync f, "utf8"
.map (p) ->
JSON.parse p
.sortBy (json) ->
-json.intlFormatDateTime
.value()
Hi I come from a ruby/rails/sinatra background. I just inherited a javascript web app and I will be rewriting the back end.
I am just trying to start the app locally for now
I did
coffee app.coffee -n
but I am getting the error below.
Error: ENOENT, no such file or directory 'data'
at Object.fs.readdirSync (fs.js:654:18)
at Object.<anonymous> (/Users/judyngai/Desktop/twiage/twiagemed/nodejs/routes/data.coffee:8:17, <js>:16:22)
at Object.<anonymous> (/Users/judyngai/Desktop/twiage/twiagemed/nodejs/routes/data.coffee:1:1, <js>:226:4)
at Module._compile (module.js:456:26)
at Object.loadFile (/usr/local/lib/node_modules/coffee-script/lib/coffee-script/coffee-script.js:182:19)
at Module.load (/usr/local/lib/node_modules/coffee-script/lib/coffee-script/coffee-script.js:211:36)
at Function.Module._load (module.js:312:12)
at Module.require (module.js:364:17)
at require (module.js:380:17)
at Object.<anonymous> (/Users/judyngai/Desktop/twiage/twiagemed/nodejs/app.coffee:3:8, <js>:8:10)
at Object.<anonymous> (/Users/judyngai/Desktop/twiage/twiagemed/nodejs/app.coffee:1:1, <js>:76:4)
at Module._compile (module.js:456:26)
In the app.coffee file there are these three lines of code:
express = require 'express'
routes = require './routes'
data = require './routes/data'
I have installed all the dependencies listed in the package.json file, but there is also a database.json file containing this line:
{ "dev": "postgres://twiage_db_user:twiage_db_password#localhost/twiage_db" }
How do I create this database? Normally in Rails it's a rake db:create. I feel like this could solve the problem.
It tries to read a "data" directory, which is missing. The relative path "data" is resolved against the current working directory, so does cwd + ./data/ exist? It's also common to build the path relative to the script using the __dirname variable:
dataDir = __dirname + "/data"
data = (_.chain fs.readdirSync dataDir)
.map (filename) ->
dataDir + "/" + filename
.map (f) ->
fs.readFileSync f, "utf8"
.map (p) ->
JSON.parse p
.sortBy (json) ->
-json.intlFormatDateTime
.value()