Read CSV over SSH and convert to JSON - javascript

This is a duplicate of this question here
Here is the code I'm trying to work with:
let Client = require('ssh2-sftp-client');
let sftp = new Client();
var csv = require("csvtojson");
sftp.connect({
  host: 'HOST',
  port: 'PORT',
  username: 'USERNAME',
  password: 'PASSWORD'
}).then(() => {
  return sftp.get('/home/user/etc/testfile.csv');
}).then((data) => {
  csv()
    .fromString(data.toString()) // changed this from .fromStream(data)
    .subscribe(function (jsonObj) { // a single JSON object will be emitted for each CSV line
      // parse each JSON object asynchronously
      return new Promise(function (resolve, reject) {
        console.log(jsonObj);
        resolve();
      });
    });
}).catch((err) => {
  console.log(err, 'catch error');
});
I can read back the CSV data and can see it being converted to JSON in console.log(jsonObj), but the data is unreadable, all '\x00o\x00n\x00'...
I'm not sure what to do in the line:
// parse each json asynchronously
Could anyone help to figure out how to parse the CSV/JSON after it comes back from the buffer?

The null bytes \x00 are pointing towards an encoding/decoding issue. The CSV file might be encoded using UTF-16, but Buffer.toString() by default decodes the data using UTF-8. You can change that to data.toString('utf16le') (or data.toString('ucs2')) to force using the correct encoding.
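Applied to the snippet above, that is a one-line change (a sketch; 'utf16le' assumes the file is little-endian UTF-16, which is what the interleaved \x00 bytes suggest):
csv()
  .fromString(data.toString('utf16le')) // decode as UTF-16 LE instead of the UTF-8 default
  .subscribe(function (jsonObj) {
    console.log(jsonObj); // now prints readable text
  });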

Related

ffmpeg app using node occasionally crashes as file doesn't appear to be read correctly

I have a simple Node application that allows me to pass an AWS S3 URL to a file (in this case a video file). It uses the FFMPEG library to read the video file and return data such as codecs, duration, bitrate, etc.
The script is called from a PHP script, which in turn sends the data to the Node endpoint and passes the Amazon S3 URL to Node. Sometimes, for no obvious reason, the video file fails to return the expected values for container, codec, duration, etc. and just returns '0'. But when I try the exact same file/request again it returns the data correctly, e.g. container:mp4.
I'm not sure, but I think the script somehow needs the createWriteStream to be closed. I can't be certain, though; the problem doesn't happen all the time, only sporadically, so it's hard to pin down when it's difficult to replicate.
Any ideas?
router.post('/', async function(req, res) {
  const fileURL = new URL(req.body.file);
  var path = fileURL.pathname;
  path = 'tmp/' + path.substring(1); // removes the initial / from the path
  let file = fs.createWriteStream(path); // create the file locally
  const request = https.get(fileURL, function(response) {
    response.pipe(file);
  });
  // after file has saved
  file.on('finish', function () {
    var process = new ffmpeg(path);
    process.then(function (video) {
      let metadata = formatMetadata(video.metadata);
      res.send({
        status: '200',
        data: metadata,
        errors: errors,
        response: 'success'
      });
    }, function (err) {
      console.warn('Error: ' + err);
      res.send({
        status: '400',
        data: 'Something went wrong processing this video',
        response: 'fail',
      });
    });
  });
  file.on('error', function (err) {
    console.warn(err);
  });
});

function formatMetadata(metadata) {
  const data = {
    'video': metadata.video,
    'audio': metadata.audio,
    'duration': metadata.duration
  };
  return data;
}
// Expected output
{"data":{"video":{"container":"mov","bitrate":400,"stream":0,"codec":"h264","resolution":{"w":1280,"h":720},"resolutionSquare":{"w":1280,"h":720},"aspect":{"x":16,"y":9,"string":"16:9","value":1.7777777777777777},"rotate":0,"fps":25,"pixelString":"1:1","pixel":1},"audio":{"codec":"aac","bitrate":"127","sample_rate":44100,"stream":0,"channels":{"raw":"stereo","value":2}},"duration":{"raw":"00:00:25.68","seconds":25}}
// Actual output
{"data":{"video":{"container":"","bitrate":0,"stream":0,"codec":"","resolution":{"w":0,"h":0},"resolutionSquare":{"w":0,"h":null},"aspect":{},"rotate":0,"fps":0,"pixelString":"","pixel":0},"audio":{"codec":"","bitrate":"","sample_rate":0,"stream":0,"channels":{"raw":"","value":""}},"duration":{"raw":"","seconds":0}}
Note - this happens sporadically
You are not accounting for a failed fetch from AWS. You should check the status code of the response before you move on to your pipe.
const request = https.get(fileURL, function (response) {
  if (response.statusCode === 200) {
    response.pipe(file);
  } else {
    // Handle error case
  }
});
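One way to fill in that error branch, sketched against the question's code (the 400 payload mirrors the existing failure response; response.resume() just drains the response so the socket is freed):
const request = https.get(fileURL, function (response) {
  if (response.statusCode === 200) {
    response.pipe(file);
  } else {
    response.resume(); // drain the response so the socket is released
    file.close(); // stop waiting for a 'finish' event that will never come
    res.send({
      status: '400',
      data: 'Could not fetch the file from S3 (HTTP ' + response.statusCode + ')',
      response: 'fail',
    });
  }
});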

How do I send a YAML file as a base64 encoded string?

I am trying to send a yaml file as a base64 string so that this code works:
const response = await octokit.request('GET /repos/{owner}/{repo}/git/blobs/{file_sha}', {
  owner: 'DevEx',
  repo: 'hpdev-content',
  file_sha: fileSha,
  headers: {
    authorization: `Bearer ${githubConfig?.token}`,
  },
});
const decoded = Buffer.from(response.data.content, 'base64').toString('utf8');
In the above code response.data.content should have the data.
I have this route:
router.get('/repos/:owner/:repo/git/blobs/:file_sha', (req, res) => {
  // TODO: do we need to do anything with the path params?
  // eslint-disable-next-line @typescript-eslint/no-unused-vars
  const { owner, repo, file_sha } = req.params;
  const contents = writeUsersReport();
  const encoded = Buffer.from(contents, 'binary').toString('base64');
  res.send(encoded);
});
The code is working fine except that the client code expects the base64 string in a property called content in the following code:
const decoded = Buffer.from(response.data.content, 'base64').toString('utf8');
But the string is in response.data.
How can I set the content property instead?
How about sending a JSON response containing an object with a content property from your server side, instead of the encoded string directly?
// ...
const encoded = Buffer.from(contents, 'binary').toString('base64');
res.json({ content: encoded });
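With that change, the client code from the question should work unmodified, since the parsed response body is now an object with a content property:
// response.data is now { content: '<base64 string>' }
const decoded = Buffer.from(response.data.content, 'base64').toString('utf8');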

Issues writing currency symbols to file in nodejs

I am trying to write to a file:
private async writeToFile(data: any) {
  try {
    fs.writeFile(filePath as string, JSON.stringify(data), 'utf8', (error: any) => {
      if (error) {
        logger.error(`[JSON] Error while saving file : ${error}`);
      }
      logger.info('The file has been saved!');
    });
  } catch (error) {
    logger.error(`[JSON] Error while saving file : ${error}`);
  }
}
where data has:
var data = [{label:'Egyptian Pound £', value: 'E£'}, {"label":"Albanian Lek-AL","value":"AL"}];
When I write to file, the characters are saved as {label: Egyptian Pound E�, value: E�}
The data array is created from a multi line string returned from server:
Egyptian Pound|E£
Albanian Lek|AL
Code to create the data array:
const currencyArr = response
  .split('\n')
  .map(val => val.trim())
  .reduce((arr, currencyString) => {
    arr.push({
      label: currencyString.split('|')[0] + '-' + currencyString.split('|')[1],
      value: currencyString.split('|')[1]
    });
    return arr;
  }, []);
this.writeToFile(currencyArr);
I am not sure why this is happening. As per the docs, Node supports UTF-8 encoding by default.
The only reason I can find for this kind of thing happening is if your JS source file itself is not encoded in UTF-8.
Make sure the JS file is saved with UTF-8 encoding, so the strings in your script are stored in the corresponding encoding.
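A quick way to verify this (a sketch): print the raw bytes of the symbol before writing. A correctly UTF-8-encoded '£' is the two-byte sequence 0xC2 0xA3:
// A healthy 'E£' prints <Buffer 45 c2 a3>; a lone 0xA3, or 0xEF 0xBF 0xBD (the
// replacement character), means the text was already mangled before the write.
console.log(Buffer.from('£', 'utf8')); // expected: <Buffer c2 a3>
console.log(Buffer.from(currencyArr[0].value, 'utf8'));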

Find the content-length of the stream before uploading the file or writing the file on network

I am reading a file, zipping & encrypting it, and then uploading/writing it over the network. But I need to know the content-length of the final stream (the stream returned after passing through read, zip, and encryption) to make a POST request.
let gzip = zlib.createGzip(),
  encrypt = crypto.createCipheriv(....),
  input = fs.createReadStream('file.jpg');

function zipAndEncrypt() {
  let stream = input.pipe(gzip).pipe(encrypt);
  let options = {
    "stream_length": 0,
    headers: {
      "content-type": 'image/jpeg',
      "content-length": '123456', // need to get this length
      .....
    }
  };
  // post the stream
  needle('post', url, stream, options)
    .then(resp => { console.log("file length", resp.body.length); })
    .catch(err => {});
}
The above code works if I enter the correct content length in the headers (in this case I knew the length), so I need to find the length of the stream.
So far I have obtained the length with:
let chunk = [], conLength;
stream.on('data', (data) => {
  chunk.push(data);
})
.on('end', () => {
  conLength = Buffer.concat(chunk).length;
});
But the POST request fails with a SOCKET hang up error.
It looks like the stream is drained or consumed, as it does not emit a 'data' event after finding the length using the code above.
I tried stream.resume(), but nothing works. Could you please suggest how to find the length of the stream without consuming it?
If you need to send the content length, the only way to know it is after the file has been zipped & encrypted.
So your solution works, but only if you send the buffer and not the stream, because you have already consumed all the data from the stream. And since you already have all the chunks in memory, you might as well send them.
let chunk = [];
stream.on('data', data => chunk.push(data))
  .on('end', () => {
    const buffer = Buffer.concat(chunk);
    const conLength = buffer.length;
    // Execute the request here, sending the whole buffer, not the stream
    needle(/*...*/)
  });
But if your file is too big, you're required to stream it, otherwise you will run out of memory. An easy workaround, with a little overhead, is to pipe it to a temporary file and then send that file. That way you can know the file size before performing the request, by accessing the stream.bytesWritten property or using fs.lstat.
function zipAndEncrypt(input) {
  const gzip = zlib.createGzip();
  const encrypt = crypto.createCipheriv(algo, key, iv);
  const stream = input.pipe(gzip).pipe(encrypt);
  const fileName = tmpFileName();
  const file = fs.createWriteStream(fileName);
  stream
    .pipe(file)
    .on('finish', () => {
      let options = {
        "stream_length": 0,
        headers: {
          "content-type": 'image/jpeg',
          "content-length": file.bytesWritten
        }
      };
      const readStream = fs.createReadStream(fileName);
      // post the stream
      needle('post', url, readStream, options)
        .then(resp => {
          console.log("file length", resp.body.length);
        })
        .catch(err => {})
        .finally(() => {
          fs.unlink(fileName, () => {}); // remove the temp file from disk
        });
    });
}
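If you would rather read the size back from disk than rely on file.bytesWritten, a stat call works just as well (a sketch; fs.stat is sufficient here since the temp file is not a symlink):
fs.stat(fileName, (err, stats) => {
  if (err) throw err;
  console.log(stats.size); // the content-length to send, in bytes
});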

how to load an image from url into buffer in nodejs

I am new to nodejs and am trying to set up a server where I get the EXIF information from an image. My images are on S3, so I want to be able to just pass in the S3 URL as a parameter and grab the image from it.
I am using the ExifImage project below to get the EXIF info, and according to their documentation:
"Instead of providing a filename of an image in your filesystem you can also pass a Buffer to ExifImage."
How can I load an image into a buffer in Node from a URL, so I can pass it to the ExifImage function?
ExifImage Project:
https://github.com/gomfunkel/node-exif
Thanks for your help!
Try setting up request like this:
var request = require('request').defaults({ encoding: null });
request.get(s3Url, function (err, res, body) {
  // process exif here
});
Setting encoding to null will cause request to output a buffer instead of a string.
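From there, the body can go straight into node-exif (a sketch following the project's README; body is the Buffer from the callback above):
var ExifImage = require('exif').ExifImage;
new ExifImage({ image: body }, function (error, exifData) {
  if (error) {
    console.log('Error: ' + error.message);
  } else {
    console.log(exifData); // EXIF tags parsed from the in-memory buffer
  }
});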
Use axios:
const response = await axios.get(url, { responseType: 'arraybuffer' });
const buffer = Buffer.from(response.data); // response.data is already binary; no encoding argument needed
import fetch from "node-fetch";
let fimg = await fetch(image.src)
let fimgb = Buffer.from(await fimg.arrayBuffer())
I was able to solve this only after reading that encoding: null is required, and providing it as a parameter to request.
This will download the image from the URL and produce a buffer with the image data.
Using the request library -
const request = require('request');
let url = 'http://website.com/image.png';
request({ url, encoding: null }, (err, resp, buffer) => {
// Use the buffer
// buffer contains the image data
// typeof buffer === 'object'
});
Note: omitting the encoding: null will result in an unusable string and not in a buffer. Buffer.from won't work correctly either.
This was tested with Node 8
Use the request library.
request('<s3imageurl>', function(err, response, buffer) {
// Do something
});
Also, node-image-headers might be of interest to you. It sounds like it takes a stream, so it might not even have to download the full image from S3 in order to process the headers.
Here's a solution that uses the native https library.
import { get } from "https";

function urlToBuffer(url: string): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const data: Uint8Array[] = [];
    get(url, (res) => {
      res
        .on("data", (chunk: Uint8Array) => {
          data.push(chunk);
        })
        .on("end", () => {
          resolve(Buffer.concat(data));
        })
        .on("error", (err) => {
          reject(err);
        });
    });
  });
}
const imageUrl = "https://i.imgur.com/8k7e1Hm.png";
const imageBuffer = await urlToBuffer(imageUrl);
Feel free to delete the types if you're looking for javascript.
I prefer this approach because it doesn't rely on 3rd party libraries or the deprecated request library.
request is deprecated and should be avoided if possible.
Good alternatives include got (only for Node.js) and axios (which also supports browsers).
Example of got:
npm install got
Using the async/await syntax:
const got = require('got');
const url = 'https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png';

(async () => {
  try {
    const response = await got(url, { responseType: 'buffer' });
    const buffer = response.body;
  } catch (error) {
    console.log(error.body);
  }
})();
You can do it this way:
import axios from "axios";

async function getFileContentById(
  download_url: string
): Promise<Buffer> {
  const response = await axios.get(download_url, {
    responseType: "arraybuffer",
  });
  return Buffer.from(response.data); // response.data is already binary; no encoding argument needed
}
