Pako not able to inflate gzip files generated in Python - javascript

I'm generating gzip files from Python 3 using the following code:
import gzip
import json
file = gzip.open('output.json.gzip', 'wb')
dataToWrite = json.dumps(data).encode('utf-8')
file.write(dataToWrite)
file.close()
Now I'm trying to read this file in JavaScript using the Pako library (I'm using Angular 2):
this.http.get("output.json.gzip")
.map((res:Response) => {
var resText:any = new Uint8Array(res.arrayBuffer());
var result = "";
try {
result = pako.inflate(resText, {"to": "string"});
} catch (err) {
console.log("Error " + err);
}
return result;
});
But I'm getting this error in the console: unknown compression method. Should I be doing something else to properly inflate gzip files?

It turns out that I needed to call res.blob() to get the true binary data, not res.arrayBuffer(), and then convert the blob to an array buffer with a FileReader:
return this.http.get("output.json.gzip", new RequestOptions({ responseType: ResponseContentType.Blob }))
.map((res:Response) => {
var blob = res.blob();
var arrayBuffer;
var fileReader = new FileReader();
fileReader.onload = function() {
arrayBuffer = this.result;
try {
let result:any = pako.ungzip(new Uint8Array(arrayBuffer), {"to": "string"});
let obj = JSON.parse(result);
console.log(obj);
} catch (err) {
console.log("Error " + err);
}
};
fileReader.readAsArrayBuffer(blob);
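// Placeholder return value: FileReader is asynchronous, so the parsed object is only available inside onload.
return "abc";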
});
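One quick way to see why the two reads behave differently is to look at the first two bytes of the payload: a gzip stream always starts with 0x1f 0x8b. The sketch below uses Node's built-in zlib module as a stand-in for pako so that it runs without extra dependencies, and looksLikeGzip is a hypothetical helper name. It assumes, as one plausible explanation rather than a confirmed diagnosis, that the failing read had passed through text decoding somewhere, which would damage the header.

```javascript
const zlib = require('zlib');

// Hypothetical helper: gzip streams start with the magic bytes 0x1f 0x8b.
function looksLikeGzip(bytes) {
  return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// Stand-in for the Python side: gzip a UTF-8 encoded JSON string.
const payload = JSON.stringify({ hello: 'world' });
const gzipped = zlib.gzipSync(Buffer.from(payload, 'utf8'));

// Intact binary data round-trips back to the original object.
const restored = JSON.parse(zlib.gunzipSync(gzipped).toString('utf8'));

// Assumption for illustration: the bytes were decoded as text along the way.
// 0x8b is not valid UTF-8 on its own, so the header does not survive.
const mangled = Buffer.from(gzipped.toString('utf8'), 'utf8');

console.log(looksLikeGzip(gzipped)); // true
console.log(looksLikeGzip(mangled)); // false
console.log(restored.hello);         // world
```

Logging the first two bytes of the Uint8Array before calling pako is a cheap way to tell a decoding problem apart from a transport problem.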

Related

Getting a PDF file from Express and treating it as if it came from an input

I am trying to build a component that takes a PDF, either from a file input or one that was already uploaded, extracts pages from it, and uploads the result again.
When choosing a file from the input (a file from my computer), I am using this:
const handleFileChange = async (event) => {
const file = event.target.files[0];
setFiles(event.target.files[0])
const fileName = event.target.files[0].name
setFileName(fileName);
const fileReader = new FileReader();
fileReader.onload = async () => {
const pdfBytes = new Uint8Array(fileReader.result);
const pdfDoc = await PDFDocument.load(pdfBytes);
setPdfDoc(pdfDoc);
setPdfBlob(pdfBytes)
};
fileReader.readAsArrayBuffer(file);
setShowPdf(true)
};
This gives us a pdfDoc and a Uint8Array. I then use the pdfDoc to get the pages and extract a new PDF file, and this works fine.
Now, when selecting a file that was already uploaded, I use this to call the API and get the file:
const handleGetFile = async (url) => {
const headers = {
Authorization: "Bearer " + (localStorage.getItem("token")),
Accept: 'application/pdf'
}
await axios.put(`${process.env.NEXT_PUBLIC_API_URL}getPdfFileBlob`, {
pdfUrl: `https://handle-pdf-photos-project-through-compleated-task.s3.amazonaws.com/${url}`
}, { responseType: 'arraybuffer', headers }).then((res) => {
const handlePdf = async () => {
const uint8Array = new Uint8Array(res.data);
const pdfBlob = new Blob([uint8Array], { type: 'application/pdf' });
setPdfBlob(uint8Array)
// setPdfDoc(pdfBlob) .....? how do I create a pdfDoc from the Uint8Array
}
handlePdf()
}).catch((err) => {
console.log(err)
})
}
This is the endpoint I am calling:
app.put('/getPdfFileBlob',async function(req,res){
try {
console.log(req.body.pdfUrl)
const url =req.body.pdfUrl;
const fileName = 'file.pdf';
const file = fs.createWriteStream(fileName);
https.get(url, (response) => {
response.pipe(file);
file.on('finish', () => {
file.close();
// Serve the file as a response
const pdf = fs.readFileSync(fileName);
res.setHeader('Content-Type', 'application/pdf');
res.setHeader('Content-Transfer-Encoding', 'Binary');
res.setHeader('Content-Disposition', 'inline; filename="' + fileName + '"');
res.send(pdf);
});
});
} catch (error) {
res.status(500).json({success:false,msg:"server side err"})
}
})
After getting this file, here is what I am trying to do:
const handlePageSelection = (index) => {
setSelectedPages(prevSelectedPages => {
const newSelectedPages = [...prevSelectedPages];
const pageIndex = newSelectedPages.indexOf(index);
if (pageIndex === -1) {
newSelectedPages.push(index);
} else {
newSelectedPages.splice(pageIndex, 1);
}
return newSelectedPages;
});
};
const handleExtractPages = async () => {
for (let i = pdfDoc.getPageCount() - 1; i >= 0; i -= 1) {
if (!selectedPages.includes(i + 1)) {
pdfDoc.removePage(i);
}
}
await pdfDoc.save();
};
In the first case, where I upload the PDF file from my computer, I get a pdfDoc:
[screenshot: console output of pdfDoc and pdfBlob]
When I select an already existing file, I can't find a way to turn the Uint8Array buffer into a pdfDoc:
[screenshot: log of pdfBlob and no pdfDoc]
What I want is to transform the pdfBlob into a PDFDocument, or get the PDF document from the array buffer, so I can use getPages on it.

readAsDataUrl converting to octet-stream instead of pdf

I am getting blob data from a RESTful endpoint and need to convert it to a base64 string with the file type application/pdf; however, it is converted to application/octet-stream instead.
Here's what my code does:
const getBytesData = () => {
if (user) {
getInvoicePDFStringByInvoiceId('129', user.user.token).then((data) => {
if (data.blob) {
const reader = new FileReader();
reader.readAsDataURL(data.blob);
reader.onloadend = () => {
var base64data = reader.result.replace('octet-stream', 'pdf');
console.log('Pdf loaded:- ', base64data);
setPDFLoaded(base64data);
return;
};
} else {
console.log('Error happened from API = ', data.error, data.message);
}
});
}
};
Can someone help me understand what could solve this issue?
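In browsers, readAsDataURL takes the MIME type in the resulting data URL from the Blob's own type and falls back to application/octet-stream when that type is empty. A minimal sketch of one way to avoid the string replacement is to wrap the blob in a new Blob with an explicit type before reading it. It assumes data.blob arrives without a type; asPdfBlob is a hypothetical helper name, and the snippet relies on a global Blob (browsers, or Node 18 and later).

```javascript
// Hypothetical helper: copy a Blob's bytes into a new Blob typed as application/pdf.
function asPdfBlob(blob) {
  return new Blob([blob], { type: 'application/pdf' });
}

// Stand-in for data.blob: a few bytes ("%PDF") with no MIME type attached.
const untyped = new Blob([new Uint8Array([0x25, 0x50, 0x44, 0x46])]);
const typed = asPdfBlob(untyped);

console.log(JSON.stringify(untyped.type)); // ""
console.log(typed.type);                   // application/pdf
console.log(typed.size);                   // 4
```

Passing the retyped blob to reader.readAsDataURL should then produce a data URL that starts with data:application/pdf;base64, without any post-processing.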

Upload Image from form-data to S3 using a Lambda

So I am writing a Lambda that will take in some form data via a straight POST through API Gateway (testing using Postman for now) and then send that image to S3 for storage. Every time I run it, the image uploaded to S3 is corrupted and won't open properly. I have seen people having to decode/encode the incoming data but I feel like I have tried everything using Buffer.from. I am only looking to store either .png or .jpg. The below code does not reflect my attempts using Base64 encoding/decoding seeing they all failed. Here is what I have so far -
Sample Request in postman
{
image: (uploaded .jpg/.png),
metadata: {tag: 'iPhone'}
}
Lambda
const AWS = require('aws-sdk')
const multipart = require('aws-lambda-multipart-parser')
const s3 = new AWS.S3();
exports.handler = async (event) => {
const form = multipart.parse(event, false)
const s3_response = await upload_s3(form)
return {
statusCode: '200',
body: JSON.stringify({ data: s3_response })
}
};
const upload_s3 = async (form) => {
const uniqueId = Math.random().toString(36).substr(2, 9);
const key = `${uniqueId}_${form.image.filename}`
const request = {
Bucket: 'bucket-name',
Key: key,
Body: form.image.content,
ContentType: form.image.contentType,
}
try {
const data = await s3.putObject(request).promise()
return data
} catch (e) {
console.log('Error uploading to S3: ', e)
return e
}
}
EDIT:
I am now attempting to save the image into the /tmp directory and then use a read stream to upload it to S3. Here is some code for that:
s3 upload function
const AWS = require('aws-sdk')
const fs = require('fs')
const s3 = new AWS.S3()
module.exports = {
upload: (file) => {
return new Promise((resolve, reject) => {
const key = `${Date.now()}.${file.extension}`
const bodyStream = fs.createReadStream(file.path)
const params = {
Bucket: process.env.S3_BucketName,
Key: key,
Body: bodyStream,
ContentType: file.type
}
s3.upload(params, (err, data) => {
if (err) {
return reject(err)
}
return resolve(data)
}
)
})
}
}
form parser function
const busboy = require('busboy')
module.exports = {
parse: (req, temp) => {
const ctype = req.headers['Content-Type'] || req.headers['content-type']
let parsed_file = {}
return new Promise((resolve) => {
try {
const bb = new busboy({
headers: { 'content-type': ctype },
limits: {
fileSize: 31457280,
files: 1,
}
})
bb.on('file', function (fieldname, file, filename, encoding, mimetype) {
const stream = temp.createWriteStream()
const ext = filename.split('.')[1]
console.log('parser -- ext ', ext)
parsed_file = { name: filename, path: stream.path, f: file, type: mimetype, extension: ext }
file.pipe(stream)
}).on('finish', () => {
resolve(parsed_file)
}).on('error', err => {
console.error(err)
resolve({ err: 'Form data is invalid: parsing error' })
})
if (req.end) {
req.pipe(bb)
} else {
bb.write(req.body, req.isBase64Encoded ? 'base64' : 'binary')
}
return bb.end()
} catch (e) {
console.error(e)
return resolve({ err: 'Form data is invalid: parsing error' })
}
})
}
}
handler
const form_parser = require('./form-parser').parse
const s3_upload = require('./s3-upload').upload
const temp = require('temp')
exports.handler = async (event, context) => {
temp.track()
const parsed_file = await form_parser(event, temp)
console.log('index -- parsed form', parsed_file)
const result = await s3_upload(parsed_file)
console.log('index -- s3 result', result)
temp.cleanup()
return {
statusCode: '200',
body: JSON.stringify(result)
}
}
The edited code above is a combination of other code and a GitHub repo I found that is trying to achieve the same result. Even with this solution, the file is still corrupted.
Figured out this issue. The code works perfectly fine; it was an issue with API Gateway. You need to go into the API Gateway settings, set the Binary Media Type to multipart/form-data, and then re-deploy the API. Hope this helps someone else who is banging their head against the wall trying to send images via form data to a Lambda.
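The corruption is easier to understand with a small experiment. The sketch below assumes the failure mode was the request body being treated as text before it reached the Lambda, which is an illustration and not a description of API Gateway internals. decodeBody is a hypothetical helper that mirrors the isBase64Encoded branch in the form parser above.

```javascript
// Hypothetical helper mirroring the busboy branch above: decode the Lambda event body.
function decodeBody(event) {
  return Buffer.from(event.body, event.isBase64Encoded ? 'base64' : 'binary');
}

// First bytes of a PNG file; 0x89 is not valid UTF-8 on its own.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Body delivered as base64 with isBase64Encoded set to true.
const viaBase64 = decodeBody({
  body: original.toString('base64'),
  isBase64Encoded: true,
});

// Assumed failure mode: the body was decoded as UTF-8 text on the way in.
const viaText = decodeBody({
  body: original.toString('utf8'),
  isBase64Encoded: false,
});

console.log(original.equals(viaBase64)); // true
console.log(original.equals(viaText));   // false
```

Once bytes such as 0x89 have been replaced during text decoding, nothing done with Buffer.from inside the Lambda can recover them, which fits the observation that the fix had to happen in the gateway configuration.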

How to read content of JSON file uploaded to google cloud storage using node js

I manually uploaded a JSON file to Google Cloud Storage after creating a new project. I am able to read the metadata for the file, but I don't know how to read the JSON content.
The code I used to read the metadata is:
var Storage = require('@google-cloud/storage');
const storage = Storage({
keyFilename: 'service-account-file-path',
projectId: 'project-id'
});
storage
.bucket('project-name')
.file('file-name')
.getMetadata()
.then(results => {
console.log("results is", results[0])
})
.catch(err => {
console.error('ERROR:', err);
});
Can someone guide me to the way to read the JSON file content?
I've used the following code to read a json file from Cloud Storage:
'use strict';
const Storage = require('@google-cloud/storage');
const storage = Storage();
exports.readFile = (req, res) => {
console.log('Reading File');
var archivo = storage.bucket('your-bucket').file('your-JSON-file').createReadStream();
console.log('Concat Data');
var buf = '';
archivo.on('data', function(d) {
buf += d;
}).on('end', function() {
console.log(buf);
console.log("End");
res.send(buf);
});
};
I'm reading from a stream and concatenating all the data in the file into the buf variable.
Hope it helps.
UPDATE
To read multiple files:
'use strict';
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();
listFiles();
async function listFiles() {
const bucketName = 'your-bucket'
console.log('Listing objects in a Bucket');
const [files] = await storage.bucket(bucketName).getFiles();
files.forEach(file => {
console.log('Reading: '+file.name);
var archivo = file.createReadStream();
console.log('Concat Data');
var buf = '';
archivo.on('data', function(d) {
buf += d;
}).on('end', function() {
console.log(buf);
console.log("End");
});
});
};
I was using the createReadStream method like the other answers, but I had a problem with the output: it randomly produced invalid characters (�) in place of some characters in a string. I thought it could be an encoding problem.
I came up with my workaround that uses the download method. The download method returns a DownloadResponse that contains an array of Buffer. We then use Buffer.toString() method and give it an encoding of utf8 and parse the result with JSON.parse().
const downloadAsJson = async (bucket, path) => {
const file = await new Storage()
.bucket(bucket)
.file(path)
.download();
return JSON.parse(file[0].toString('utf8'));
}
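A likely explanation for the stray � characters is that a multi-byte UTF-8 character was split across two stream chunks, and buf += d decodes each chunk on its own. The sketch below reproduces that with plain Node Buffers, with no Cloud Storage involved. The chunk boundary is chosen by hand for illustration, and the sample data is made up.

```javascript
// Sample JSON containing a two-byte UTF-8 character ("é" is 0xc3 0xa9).
const text = JSON.stringify({ name: 'café' });
const bytes = Buffer.from(text, 'utf8');

// Simulate a stream that happens to cut between the two bytes of "é".
const splitAt = bytes.indexOf(0xc3) + 1;
const chunks = [bytes.subarray(0, splitAt), bytes.subarray(splitAt)];

// Naive approach: each chunk is converted to a string separately.
let naive = '';
for (const chunk of chunks) {
  naive += chunk;
}

// Safer approach: join the raw bytes first, then decode once.
const safe = Buffer.concat(chunks).toString('utf8');

console.log(naive.includes('\ufffd')); // true
console.log(safe === text);            // true
console.log(JSON.parse(safe).name);    // café
```

The download method avoids the problem for the same reason: it hands back the whole file as a single Buffer, which is decoded in one step.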
There is a convenient download method that downloads a file into memory or to a local destination. You can use it as follows:
const bucketName='bucket name here';
const fileName='file name here';
const storage = new Storage.Storage();
const file = storage.bucket(bucketName).file(fileName);
file.download(function(err, contents) {
console.log("file err: "+err);
console.log("file data: "+contents);
});
A modern version of this:
const { Storage } = require('@google-cloud/storage')
const storage = new Storage()
const bucket = storage.bucket('my-bucket')
// The function that returns a JSON string
const readJsonFromFile = async remoteFilePath => new Promise((resolve, reject) => {
let buf = ''
bucket.file(remoteFilePath)
.createReadStream()
.on('data', d => (buf += d))
.on('end', () => resolve(buf))
.on('error', e => reject(e))
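}); // the semicolon keeps the parenthesized function below from being parsed as a call on this expression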
// Example usage
(async () => {
try {
const json = await readJsonFromFile('path/to/json-file.json')
console.log(json)
} catch (e) {
console.error(e)
}
})()

Reactjs - Can't base64 encode file from react-dropzone

I am using react-dropzone to handle file upload on my website. When successfully loading a file, the dropzone triggers the following callback:
onDrop: function (acceptedFiles, rejectedFiles) {
myFile = acceptedFiles[0];
console.log('Accepted files: ', myFile);
}
I would like to base64 encode this file. When doing :
var base64data = Base64.encode(myFile)
console.log("base64 data: ", base64data) // => base64 data: W29iamVjdCBGaWxlXQ==W29iamVjdCBGaWxlXQ==
Regardless of file uploaded, it always prints out the same string.
Am I missing something ? I need to base64 encode this file (always images)
This JS Bin is a working example of converting a File to base64: http://jsbin.com/piqiqecuxo/1/edit?js,console,output . The main addition seems to be reading the file using a FileReader, where FileReader.readAsDataURL() returns a base64 encoded string
document.getElementById('button').addEventListener('click', function() {
var files = document.getElementById('file').files;
if (files.length > 0) {
getBase64(files[0]);
}
});
function getBase64(file) {
var reader = new FileReader();
reader.readAsDataURL(file);
reader.onload = function () {
console.log(reader.result);
};
reader.onerror = function (error) {
console.log('Error: ', error);
};
}
If you want it in a neat method that works with async / await, you can do it this way
const getBase64 = (file: Blob): Promise<string> => {
const reader = new FileReader();
return new Promise((resolve, reject) => {
// readAsDataURL produces a string, so the result can be narrowed safely
reader.onload = () => resolve(reader.result as string);
reader.onerror = () => reject(reader.error);
reader.readAsDataURL(file);
});
}
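The constant output in the question has a simple explanation: W29iamVjdCBGaWxlXQ== is the base64 encoding of the text "[object File]". This suggests the encoder turned the File object into a string instead of reading its bytes, although the exact Base64 library used in the question is not known. The sketch below checks this with Node's Buffer. The literal string stands in for String(file), and the JPEG bytes are sample data.

```javascript
// In a browser, String(file) for a File object is "[object File]".
// The literal is used here so the snippet also runs outside a browser.
const coerced = '[object File]';
const encoded = Buffer.from(coerced, 'utf8').toString('base64');
console.log(encoded); // W29iamVjdCBGaWxlXQ==

// Encoding the actual bytes gives a result that depends on the content.
const bytes = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]); // first bytes of a JPEG
const realEncoded = Buffer.from(bytes).toString('base64');
console.log(realEncoded); // /9j/4A==
```

This is why the FileReader step matters: it is what turns the File handle into the bytes to be encoded.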
