Download a json file from a link using Node js - javascript

I need to write a Node.js application which, given a link to a JSON file, e.g. http://data.phishtank.com/data/online-valid.json (the link doesn't open the file, it triggers a download), simply downloads the object and then prints it out. How can this be achieved? This is what I have so far, and it doesn't seem to be working:
var checkIfPhishing = function(urlToPrint){
var http = require('http');
var fs = require('fs');
var file = fs.createWriteStream("SiteObject.json");
var request = http.get(urlToPrint, function(response) {
response.pipe(file);});
var siteObj= fs.readFileSync("SiteObject.json");
console.log(siteObj);
};
Thank you!

You cannot mix asynchronous and synchronous reads and writes like this.
In your case you start streaming the data from the other server to your file, but then immediately start the synchronous read. That blocks the thread, so the stream is only processed after you have already read your 0-byte file.
So you need to collect the stream in a variable first, then log it once the stream has ended.
Simply do something like this:
var data = "";
var request = http.get(urlToPrint, function(response) {
  // the response is a readable stream, so it emits "end" (not "finish") once all data has been consumed
  response.on("data", append => data += append).on("end", () => console.log(data));
});
Store the asynchronously delivered chunks of the stream in a variable; when the stream ends, log that string.
If you want to store it too:
var http = require('http');
var fs = require('fs');
function checkIfPhishing(urlToPrint){
var file = fs.createWriteStream("SiteObject.json");
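var request = http.get(urlToPrint, function(response) {
  // pipe() returns the destination; the write stream emits "finish" once everything has been flushed
  response.pipe(file).on("finish", function(){
    console.log( fs.readFileSync("SiteObject.json",{encoding:"utf8"}));
  });
});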
}
This is exactly like your code, but it waits for the stream to finish before it reads the file synchronously.
However, note that the synchronous read slows the whole thing down, so you might prefer to stream directly to the console or to a browser.
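Since the goal is to print the downloaded object, the collected text can also be parsed with JSON.parse once the stream has ended. The following is only a minimal sketch: readJsonStream and fetchJson are hypothetical helper names, and it assumes the server answers over plain HTTP with a 200 status (http.get does not follow redirects, and an HTTPS URL would need the https module instead).

```javascript
var http = require('http');

// Hypothetical helper: collect a readable stream and parse its contents as JSON.
function readJsonStream(stream, callback) {
  var chunks = [];
  stream.on('data', function (chunk) { chunks.push(chunk); });
  stream.on('error', callback);
  stream.on('end', function () {
    var parsed;
    try {
      parsed = JSON.parse(Buffer.concat(chunks).toString('utf8'));
    } catch (err) {
      return callback(err); // the body was not valid JSON
    }
    callback(null, parsed);
  });
}

// Hypothetical helper: download a URL and hand the parsed object to the callback.
function fetchJson(url, callback) {
  http.get(url, function (response) {
    if (response.statusCode !== 200) {
      response.resume(); // discard the body so the socket is released
      return callback(new Error('Unexpected status ' + response.statusCode));
    }
    readJsonStream(response, callback);
  }).on('error', callback);
}

// Usage (the URL is the one from the question):
// fetchJson('http://data.phishtank.com/data/online-valid.json', function (err, obj) {
//   if (err) return console.error(err);
//   console.log(obj);
// });
```

Keeping the stream handling in its own function means the same code can be fed any readable stream, not only an HTTP response.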

Related

How to read the real time changes in a file with CasperJS

I am trying to let my CasperJS script read an external text file (abc.txt), which needs to be created in the middle of the CasperJS process.
(abc.txt has to be created using 'curl' against a third-party API, so I call a child process to replace the old abc.txt.)
However, the information inside abc.txt seems to be locked in at the beginning; it doesn't matter whether I change the information or delete the whole file.
Does anyone have any idea whether CasperJS can be more interactive?
Or any advice on whether I need to change the whole script?
(I am trying to get a question from websiteA, then go to websiteB to find the answer, then submit the answer on websiteA.)
var casper = require('casper').create();
var fs = require('fs');
var data = fs.read('abc.txt');
casper.start();
casper.wait(5000, function () {
console.log('wait 5000 for editing the file');
});
casper.then(function (){
console.log(data);
});
casper.run();
If the content of abc.txt changes during the execution of your script, this has no influence on variables that hold a copy of that file's content. There is never a direct connection between the string value and the file content.
If you want to refresh the string content, you need to read the file again:
data = fs.read('abc.txt');
If you need to wait for the file to change, you can periodically read the file contents to see whether they have changed in the meantime. This can be done with casper.waitFor():
var casper = require('casper').create();
var fs = require('fs');
var data = fs.read('abc.txt');
var newData;
casper.start();
casper.waitFor(function _check(){
newData = fs.read('abc.txt');
return data !== newData;
});
casper.wait(1000); // additional wait to make sure that the file writing has finished
casper.then(function (){
console.log(newData);
});
casper.run();

How to download image folder from website to local directory using nodejs

I would like to download an image folder which contains a number of images.
I need to download it into my local directory.
I was able to download one image by giving its name.
But I could not figure out how to do that for multiple images.
Here is my code.
var http = require('http');
var fs = require('fs');
var file = fs.createWriteStream("./downloads");
var request = http.get("http://www.salonsce.com/extranet/uploadfiles" + image.png, function(response) {
response.pipe(file);
});
Thanks in advance.
To download files using curl in Node.js, you will need Node's child_process module and call curl through its spawn method. I use spawn instead of exec for the sake of convenience: spawn exposes streams that emit 'data' events and, unlike exec, has no buffer size issue. That doesn't mean exec is inferior to spawn; in fact we will use exec to download files using wget.
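// Modules used below; DOWNLOAD_DIR is an assumed value, point it at an existing directory
var fs = require('fs');
var url = require('url');
var spawn = require('child_process').spawn;
var DOWNLOAD_DIR = './downloads/';
// Function to download file using curl
var download_file_curl = function(file_url) {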
// extract the file name
var file_name = url.parse(file_url).pathname.split('/').pop();
// create an instance of writable stream
var file = fs.createWriteStream(DOWNLOAD_DIR + file_name);
// execute curl using child_process' spawn function
var curl = spawn('curl', [file_url]);
// add a 'data' event listener for the spawn instance
curl.stdout.on('data', function(data) { file.write(data); });
// add an 'end' event listener to close the writeable stream
curl.stdout.on('end', function(data) {
file.end();
console.log(file_name + ' downloaded to ' + DOWNLOAD_DIR);
});
// when the spawn child process exits, check if there were any errors and close the writeable stream
curl.on('exit', function(code) {
if (code != 0) {
console.log('Failed: ' + code);
}
});
};
Another way to do this would be to collect the file names first with a tool called glob.
First install it with
npm install glob
And then,
var glob = require("glob");
var http = require('http');
var fs = require('fs');
// options is optional
var options = {};
// Caution: glob matches paths on the local filesystem; it is not designed to list a remote
// HTTP directory, so verify that files actually contains the image URLs.
glob('http://www.salonsce.com/extranet/uploadfiles/*', options, function (er, files) {
  // you will get the list of files in the directory as an array;
  // now use your previous logic to fetch each individual file
  files.forEach(function (fileUrl) {
    // write each download to its own file inside ./downloads
    var file = fs.createWriteStream("./downloads/" + fileUrl.split('/').pop());
    http.get(fileUrl, function(response) {
      response.pipe(file);
    });
  });
});
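Because glob works on local paths, a plain HTTP server usually has to tell you the file names some other way. The following is a rough sketch under the assumption that the image names are already known (for example from a listing you maintain yourself); buildDownloadPlan, downloadOne and downloadAll are hypothetical names, and the target directory is assumed to exist.

```javascript
var fs = require('fs');
var http = require('http');
var path = require('path');

// Pair each image name with its source URL and its local destination path.
function buildDownloadPlan(baseUrl, names, dir) {
  return names.map(function (name) {
    return {
      url: baseUrl.replace(/\/+$/, '') + '/' + encodeURIComponent(name),
      dest: path.join(dir, name)
    };
  });
}

// Download a single item into its own file.
function downloadOne(item, callback) {
  http.get(item.url, function (response) {
    if (response.statusCode !== 200) {
      response.resume(); // discard the body
      return callback(new Error('HTTP ' + response.statusCode + ' for ' + item.url));
    }
    var file = fs.createWriteStream(item.dest);
    file.on('error', callback);
    // "finish" fires once the whole response has been written to disk
    response.pipe(file).on('finish', function () { callback(null, item.dest); });
  }).on('error', callback);
}

// Download the items one after another and report the first error, if any.
function downloadAll(plan, done) {
  var index = 0;
  (function next(err) {
    if (err || index === plan.length) return done(err || null);
    downloadOne(plan[index++], next);
  })();
}

// Usage (the image names are placeholders):
// var plan = buildDownloadPlan('http://www.salonsce.com/extranet/uploadfiles',
//                              ['image1.png', 'image2.png'], './downloads');
// downloadAll(plan, function (err) { console.log(err || 'all done'); });
```

Downloading sequentially keeps the sketch simple; running a few downloads in parallel would be a reasonable refinement for larger folders.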

Parse json object when node server is started

I need to parse a JSON object when the node server.js (which is the entry point of my program) is started. The parsing of the JSON file is done in a different module in my project.
I have two questions:
Is it recommended to invoke the parse function with an event in the server.js file?
I read about the EventEmitter but I'm not sure how to invoke a function
from a different module. An example would be very helpful.
I have multiple JSON files.
UPDATE to make it clearer
If I read 3 JSON files (50 lines each) when the server/app is loaded (server.js file), this will be fast, I guess. My scenario is that the list of valid paths for the Express calls is in these JSON files:
app.get('/run1', function (req, res) {
res.send('Hello World!');
});
So run1 should be defined in the JSON file (like a whitelist of paths). If a user requests run2, which is not defined, I need to return an error. So I think it is better to do this parsing when the server comes up and keep an object with all the valid configured paths; when a user makes a call, I just use this already parsed object (parsed when the server loaded) and verify that the path is OK. I think this is a better approach than doing the parsing on every call.
UPDATE 2
I'll try to explain more simply.
Let's assume that you have a whitelist of paths which you should listen on,
like run1:
app.get('/run1', function
These paths are defined in JSON files inside your project under a specific folder. Before every call to your application via Express, you should verify that the requested path is in the path list from the JSON files. This is a given; the question is how to do it.
Currently I've developed a module which searches the JSON files in this folder and checks whether a specific path exists there.
Now I think the right solution is to invoke this functionality when the node application starts and keep the list of valid paths in some object which I can easily access during the user call to check whether the path is there.
My question is how to send some event to the validator module when the node app (server.js) is up, so that it provides this object.
If it's a part of your application initialization, then you could read and parse this JSON file synchronously, using either fs.readFileSync and JSON.parse, or require:
var config = require('path/to/my/config.json');
Just make sure that the module handling this JSON loading is required in your application root before the app.listen call.
In this case the JSON data will be loaded and parsed by the time your server starts, and there will be no need to trouble yourself with callbacks or event emitters.
I can't see any benefit in loading your initial config asynchronously, for two reasons:
The bottleneck of JSON parsing is the parser itself, and since it's synchronous, you won't gain anything there. So the only part you would be able to optimize is the interaction with your file system (i.e. reading data from disk).
Your application won't be able to work properly until this data is loaded.
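As an illustration of the synchronous approach applied to the whitelist from the question, here is a minimal sketch. It assumes each JSON file in the config folder holds an array of path strings such as ["/run1"] (the question does not show the actual file layout), and loadWhitelist and whitelistMiddleware are hypothetical names.

```javascript
var fs = require('fs');
var path = require('path');

// Read every *.json file in a directory synchronously and collect the allowed paths.
// Assumption: each file contains a JSON array of path strings, e.g. ["/run1", "/run2"].
function loadWhitelist(dir) {
  var allowed = Object.create(null);
  fs.readdirSync(dir).forEach(function (name) {
    if (path.extname(name) !== '.json') return;
    var paths = JSON.parse(fs.readFileSync(path.join(dir, name), 'utf8'));
    paths.forEach(function (p) { allowed[p] = true; });
  });
  return allowed;
}

// Express-style middleware that rejects any request whose path is not whitelisted.
function whitelistMiddleware(allowed) {
  return function (req, res, next) {
    if (allowed[req.path] === true) return next();
    res.status(404).send('Unknown path: ' + req.path);
  };
}

// Usage in server.js, before app.listen (the folder name is an assumption):
// var allowed = loadWhitelist(path.join(__dirname, 'config'));
// app.use(whitelistMiddleware(allowed));
// app.listen(8000);
```

Because the whitelist is built once at startup, each request only costs a property lookup.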
Update
If for some reason you can't make your initialization synchronous, you could delay starting your application until initialization is done.
The easiest solution here is to move the app.listen part inside the initialization callback:
// initialization.js
var glob = require('glob')
var path = require('path')
module.exports = function initialization (done) {
var data = {}
glob('./config/*.json', function (err, files) {
if (err) throw err
files.forEach(function (file) {
var filename = path.basename(file)
data[filename] = require(file)
})
done(data);
})
}
// server.js
var initialization = require('./initialization')
var app = require('express')()
initialization(function (data) {
app.use(require('./my-middleware')(data))
app.listen(8000)
})
An alternative solution is to use a simple event emitter to signal that your data is ready:
// config.js
var glob = require('glob')
var path = require('path')
var events = require('events')
var obj = new events.EventEmitter()
obj.data = {}
glob('./config/*.json', function (err, files) {
if (err) throw err
files.forEach(function (file) {
var filename = path.basename(file)
obj.data[filename] = require(file)
})
obj.emit('ready')
})
module.exports = obj
// server.js
var config = require('./config')
var app = require('express')()
app.use(require('./my-middleware'))
config.on('ready', function () {
app.listen(8000)
})

Get updated file in each request nodejs

Basically, I wrote a server that responds with a JS file (in object format) to users who make the request. The JS file is generated from two config files, which I call config1.js and config2.js.
Here is my code:
var express = require('express');
var app = express();
var _ = require('underscore');
app.use('/config.js', function (req, res) {
var config1 = require('config1');
var config2 = require('config2');
var config = _.extend(config1, config2);
res.set({'Content-Type': 'application/javascript'});
res.send(JSON.stringify(config));
});
As I understand it, every time I make a request to /config.js, it should fetch the latest code from the config1 and config2 files, even after the server has started. However, if I start the server, make some modifications in config1.js, and then make the request, it still returns the old code. Can anyone explain why, and how to fix it? Thanks.
You should not use require to load your files, because that is not its purpose: it caches the loaded module (see this post for more information), which is why you get the same content every time you make a request.
Use a tool like concat-files instead, or concatenate the files "by hand" if you prefer.
Concatenating files and extending objects aren't equivalent operations. You can read the files via the 'fs' module, parse the objects, extend, and send.
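The "read via fs, parse, extend, send" suggestion could look roughly like the sketch below. It assumes the two configs are stored as plain JSON files (the question uses .js modules, so config1.json and config2.json are hypothetical names), and loadMergedConfig is a hypothetical helper.

```javascript
var fs = require('fs');

// Read the given JSON files from disk on every call and merge them into a fresh object.
// Later files override keys from earlier ones; nothing is cached between calls.
function loadMergedConfig(files) {
  return files.reduce(function (merged, file) {
    return Object.assign(merged, JSON.parse(fs.readFileSync(file, 'utf8')));
  }, {});
}

// Usage inside the route from the question (file names are assumptions):
// app.use('/config.js', function (req, res) {
//   res.set({'Content-Type': 'application/javascript'});
//   res.send(JSON.stringify(loadMergedConfig(['config1.json', 'config2.json'])));
// });
```

Since the files are read again for each request, edits made while the server is running show up on the next request.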

How do you read a file or Stream synchronously in node.js?

Please, no lectures about how I should be doing everything asynchronously. Sometimes I want to do things the easy obvious way, so I can move on to other work.
For some reason, the following code doesn't work. It matches code I found on a recent SO question. Did node change or break something?
var fs = require('fs');
var rs = fs.createReadStream('myfilename'); // for example
// but I might also want to read from
// stdio, an HTTP request, etc...
var buffer = rs.read(); // simple for SSCCE example, normally you'd repeat in a loop...
console.log(buffer.toString());
After the read, the buffer is null.
Looking at rs in the debugger, I see
events
has end and open functions, nothing else
_readableState
buffer = Array[0]
emittedReadable = false
flowing = false <<< this appears to be correct
lots of other false/nulls/undefined
fd = null <<< suspicious???
readable = true
lots of other false/nulls/undefined
To read the contents of a file synchronously, use fs.readFileSync:
var fs = require('fs');
var content = fs.readFileSync('myfilename');
console.log(content);
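fs.createReadStream creates a ReadStream, which delivers its data asynchronously. Calling rs.read() immediately after creating it returns null because nothing has been buffered yet.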
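For sources that only exist as streams, read() has to be called after the stream signals that data is available. The following is a small sketch of that pattern, not a synchronous API; readAll is a hypothetical helper and it assumes no encoding has been set on the stream, so the chunks are Buffers.

```javascript
var fs = require('fs');

// Collect a readable stream's contents using the 'readable' event and read().
// Assumes the stream yields Buffers (no setEncoding call).
function readAll(stream, callback) {
  var chunks = [];
  stream.on('readable', function () {
    var chunk;
    // read() returns null once the internal buffer has been drained
    while ((chunk = stream.read()) !== null) {
      chunks.push(chunk);
    }
  });
  stream.on('end', function () { callback(null, Buffer.concat(chunks)); });
  stream.on('error', callback);
}

// Usage with the file from the question:
// readAll(fs.createReadStream('myfilename'), function (err, buffer) {
//   if (err) throw err;
//   console.log(buffer.toString());
// });
```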
