Creating an Image File using node.js buffer from input text - javascript

I have a requirement in which I need to convert an input text to a png/jpeg file and then convert to base64 string and send as an input to an API.
I cannot use node.js fs module since I cannot physically create files.
So I was trying to use the Node.js Buffer module to achieve the same.
But the issue I'm facing is that I cannot add an extension to it (I don't know if there is any such option).
Is there any other way of doing so?
Below is the code I tried...
function textToFileBase64(str) {
  var buf = Buffer.from(str, 'utf-8');
  return buf.toString('base64');
}
The only problem with the above code is that it creates a file without an extension, and even if I name the file abc.png, it is reported as damaged when I open it.
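One way to see why the file is reported as damaged, as a minimal sketch: the Base64 string above decodes back to the UTF-8 bytes of the text, not to image data. A PNG file must begin with a fixed 8-byte signature, and a Buffer built from plain text does not. The helper name isPng is hypothetical and only for illustration; actually rendering text into pixels would need an image or canvas library, which is not shown here.

```javascript
// The 8-byte signature that every PNG file starts with.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: true if the Base64 payload decodes to bytes that start like a PNG.
function isPng(base64) {
  const bytes = Buffer.from(base64, 'base64');
  return bytes.length >= PNG_SIGNATURE.length &&
    bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
}

// Same function as in the question: it encodes the text itself, not a picture of it.
function textToFileBase64(str) {
  return Buffer.from(str, 'utf-8').toString('base64');
}

const encoded = textToFileBase64('hello');
console.log(Buffer.from(encoded, 'base64').toString('utf-8')); // prints "hello"
console.log(isPng(encoded)); // prints false
```

Renaming the output to abc.png does not change these bytes, so an image viewer still sees text rather than PNG data.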

Related

How to extract text from PDF?

I'm creating a React application with Node.js, and it needs to get some text from a PDF that the user uploads.
I already tried pdf-parse, pdf2json, pdf.js and react-pdf-js. The file should be selected by the user, but all those libraries use a path to access the file. What should I do?
PS1: I'm using an input type='file' button to get the file.
The code must work in both Node.js and the web browser.
You didn't post any code snippet, so my answer is based on that scenario.
You can look at this example, which is a good illustration of how to use pdf.js:
http://git.macropus.org/2011/11/pdftotext/example/
And this is the code on git
https://github.com/hubgit/hubgit.github.com/tree/master/2011/11/pdftotext
But I think you will have to make some changes according to your requirements.
Enjoy..
I'm answering my own question. First I create a regular html input.
<input type='file'/>
I'm using React, so I use the onChange attribute in place of an id.
So, when the user selects a file, a handler function is called, and I use the following code to get the file:
const file = event.target.files[0];
The file does not have a path, which is what PDF.JS uses to get the real file.
Then I use a FileReader to convert the file into an array of bytes (an ArrayBuffer):
const fileReader = new FileReader();
Then we assign a function to fileReader.onload (the function can be found here):
fileReader.onload = function() {...}
Finally we do this:
fileReader.readAsArrayBuffer(file);
Important PS: pdf.pdfInfo must be replaced with pdf in newer PDF.JS versions.
Thanks for helping.
Extra PS: To use pdfjsLib as PDFJS in React, I did this in the index.html file:
window.PDFJS = pdfjsLib
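The steps above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the answer: the function name readFileBytes and its ReaderCtor parameter are assumptions, added so the reader can be swapped out when no browser is available; in a browser the built-in FileReader is used. The resulting bytes can then be handed to PDF.JS, for example via getDocument({ data: bytes }), instead of a path.

```javascript
// Hypothetical helper: reads a File/Blob and passes its bytes to onBytes.
// ReaderCtor defaults to the browser's FileReader; it is a parameter only
// so that the sketch can be exercised outside a browser.
function readFileBytes(file, onBytes, ReaderCtor) {
  const Reader = ReaderCtor || FileReader;
  const fileReader = new Reader();
  fileReader.onload = function () {
    // fileReader.result is an ArrayBuffer after readAsArrayBuffer().
    onBytes(new Uint8Array(fileReader.result));
  };
  fileReader.readAsArrayBuffer(file);
}

// Usage in a React onChange handler (browser only):
// function handleChange(event) {
//   const file = event.target.files[0];
//   readFileBytes(file, function (bytes) {
//     console.log('PDF size in bytes:', bytes.length);
//   });
// }
```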

Issues with synchronously downloading a file in Node

I have tried multiple methods to download a file and write it synchronously in my local application (currently using a module called download-file-sync), but I am having issues with the file written using writeFileSync.
Here is my code:
var fs = require('fs');
var downloadFileSync = require('download-file-sync');
fs.writeFileSync("twc.mp4", downloadFileSync(sourceURLEncoded));
Now this technically writes something, and opening the file in Notepad++ shows that at least the start of the file is identical to the same file downloaded via Chrome, with the same number of lines. However, the file size is roughly doubled.
The Node download will not play, while the Chrome download does.
How can I achieve a successful synchronous file download in Node?
The reason is that download-file-sync calls curl and encodes the result as a string when you want "pure" bytes.
If the downloaded bytes aren't valid UTF-8, some of them are replaced during decoding, resulting in a different size and content than the original binary file.
To fix this, you can replace the module with the code it uses, in a new function that keeps the default buffer encoding:
function downloadFileSync(url) {
  return require('child_process')
    .execFileSync('curl', ['--silent', '-L', url]); // remove options {encoding: 'utf8'}
}
And try with that instead.
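The size increase can be reproduced without any download. The following is a small illustration using made-up bytes (not the actual video data): decoding arbitrary binary as UTF-8 replaces each invalid byte with the replacement character U+FFFD, which takes three bytes when written back out.

```javascript
// Four arbitrary bytes; 0xff and 0xd8 do not form valid UTF-8 here.
const original = Buffer.from([0xff, 0xd8, 0x00, 0x41]);

// What happens when binary output is decoded with {encoding: 'utf8'}
// and later written to disk as a string.
const asString = original.toString('utf8');
const roundTripped = Buffer.from(asString, 'utf8');

console.log(original.length);               // 4
console.log(roundTripped.length);           // 8
console.log(roundTripped.equals(original)); // false
```

Keeping the data as a Buffer from start to finish avoids this decode step entirely.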

Including a text file in Chrome extension and reading it with Javascript

I want to create a Chrome extension that contains a text file with static data (a dictionary of English words), and I want the extension to be able to parse that file. I've only managed to find the FileReader class, but it looks like it's made for reading user-selected files, while in my case I always want to read the exact same file included in the extension's package. As a workaround, I can convert the file to a JavaScript array of strings declared in some .js file included in the manifest, but in that case the whole contents would be loaded into memory at once, while what I need is to read the data line by line. Is there any way to do this?
You can go the FileReader route, since you can obtain the Entry of your package directory with chrome.runtime.getPackageDirectoryEntry().
However, an easier way is to just make an XHR to your file using chrome.runtime.getURL() with a relative path. The first way is useful when you want to list files, though.
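As a sketch of the XHR route: the loader below requests a packaged file through chrome.runtime.getURL() and hands the text to a callback, and a generator then walks it line by line. The file name words.txt and the function names loadPackagedText and iterateLines are assumptions for illustration. The XHR still loads the whole file into memory; the generator only avoids building an array of all lines up front.

```javascript
// Browser/extension only: fetches a file bundled with the extension.
// 'relativePath' is relative to the extension root, e.g. 'words.txt' (assumed name).
function loadPackagedText(relativePath, onText) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', chrome.runtime.getURL(relativePath));
  xhr.onload = function () {
    onText(xhr.responseText);
  };
  xhr.send();
}

// Yields one line at a time without splitting the whole text into an array.
function* iterateLines(text) {
  let start = 0;
  while (start < text.length) {
    let end = text.indexOf('\n', start);
    if (end === -1) end = text.length;
    const line = text.slice(start, end);
    // Drop a trailing '\r' so Windows line endings are handled too.
    yield line.endsWith('\r') ? line.slice(0, -1) : line;
    start = end + 1;
  }
}

// Usage inside the extension:
// loadPackagedText('words.txt', function (text) {
//   for (const word of iterateLines(text)) { /* process word */ }
// });
```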

Can dropbox-js readFile only get contents of a file of type "text/plain"?

I am using the dropbox-js API as a back-end to an application I am creating.
I need to get the contents of a file and I understand that the method "readFile" that is used to get the contents only really supports text files.
I can get the contents of a text file of type "text/plain" i.e. .txt files, using the following:
client.readFile(d2.path, {arrayBuffer: true}, function(error, contents){
var decoded = decodeUtf8(contents);
console.log(decoded);
});
The API reference for this method is here: http://coffeedoc.info/github/dropbox/dropbox-js/master/classes/Dropbox/Client.html#readFile-instance
The decode function was found here: https://gist.github.com/boushley/5471599
This does not seem to work for any other document type file. If I try and read a .docx / .doc file the result consists of what looks like scrambled characters. Should it be able to work with other document type files? How would I read it differently?
I really need it to support more than .txt files.
Edit:
This is a test document (.docx) that I tried to read:
This is how it is decoded (Contents shows that it is indeed an ArrayBuffer, while Decoded is the actual string that is returned after decoding):
readFile should work for any content type. Presumably the "scrambled characters" you see are exactly the content of the .docx or .doc file you're reading. (If you looked at the file via type on Windows or cat on Mac/Linux, you would see the same thing.)
So I think the issue you're having is that you want to somehow extract the text from a variety of file formats. Dropbox (and dropbox.js) won't help you with that particular problem... you'll need to find software that understands all those file formats and can convert them to the form you need. For example, textract is a Python library that can do this.
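A quick way to confirm this, sketched under the assumption that the value passed in is the ArrayBuffer returned by readFile: look at the first bytes instead of decoding them as text. A .docx file is a ZIP container and starts with the bytes "PK" (0x50 0x4B), while a legacy .doc file starts with 0xD0 0xCF 0x11 0xE0. The function name sniffDocumentType is hypothetical.

```javascript
// Hypothetical helper: guesses the container format from the leading bytes.
function sniffDocumentType(arrayBuffer) {
  const bytes = new Uint8Array(arrayBuffer);
  const startsWith = function (signature) {
    return signature.every(function (value, i) { return bytes[i] === value; });
  };
  if (startsWith([0x50, 0x4b, 0x03, 0x04])) return 'zip (e.g. .docx)';
  if (startsWith([0xd0, 0xcf, 0x11, 0xe0])) return 'ole (e.g. .doc)';
  return 'unknown (possibly plain text)';
}

// Example with hand-made buffers standing in for readFile results.
console.log(sniffDocumentType(new Uint8Array([0x50, 0x4b, 0x03, 0x04, 0x14]).buffer));
console.log(sniffDocumentType(new Uint8Array([0x68, 0x69]).buffer));
```

A check like this only identifies the container; extracting the text still needs a tool that understands the format.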

Generate a Word document in JavaScript with Docx.js?

I am trying to use docx.js to generate a Word document but I can't seem to get it to work.
I copied the raw code into the Google Chrome console after amending line 247 to fix a "'textAlign' undefined" error:
if (inNode.style && inNode.style.textAlign){..}
This makes the function convertContent available, the result of which is an object, e.g.
JSON.stringify( convertContent($('<p>Word!</p>')[0]) )
Results in -
"{"string":
"<w:body>
<w:p>
<w:r>
<w:t xml:space=\"preserve\">Word!</w:t>
</w:r>
</w:p>
</w:body>"
,"charSpaceCount":5
,"charCount":5,
"pCount":1}"
I copied
<w:body>
<w:p>
<w:r>
<w:t xml:space="preserve">Word!</w:t>
</w:r>
</w:p>
</w:body>
into Notepad++ and saved it as a file with an extension of 'docx', but when I open it in MS Word it says 'cannot be opened because there is a problem with the contents'.
Am I missing some attribute or XML tags or something?
You can generate a docx document from a template using docxtemplater (a library I created).
It can replace tags by their values (like a template engine), and also replace images in a paid version.
Here is a demo of the templating engine: https://docxtemplater.com/demo/
This code can't work on JSFiddle because of the Ajax calls to local files (everything that is in the blankfolder), unless you enter all files in ByteArray format and use the jsFiddle echo API: http://doc.jsfiddle.net/use/echo.html
I know this is an older question and you already have an answer, but I struggled getting this to work for a day, so I thought I'd share my results.
Like you, I had to fix the textAlign bug by changing the line to this:
if (inNode.style && inNode.style.textAlign)
Also, it didn't handle HTML comments. So, I had to add the following line above the check for a "#text" node in the for loop:
if (inNodeChild.nodeName === '#comment') continue;
To create the docx was tricky since there is absolutely no documentation on this thing as of yet. But looking through the code, I see that it is expecting the HTML to be in a File object. For my purposes, I wanted to use the HTML I rendered, not some HTML file the user has to select to upload. So I had to trick it by making my own object with the same property that it was looking for and pass it in. To save it to the client, I use FileSaver.js, which requires a blob. I included this function that converts base64 into a blob. So my code to implement it is this:
var result = docx({ DOM: $('#myDiv')[0] });
var blob = b64toBlob(result.base64, "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
saveAs(blob, "test.docx");
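The b64toBlob helper itself is not shown in the answer. A minimal sketch of what such a function could look like, assuming an environment with the standard atob and Blob globals (browsers, or recent Node.js versions), is given below; it is an illustration, not the author's original function.

```javascript
// Sketch of a Base64-to-Blob converter (not the original helper from the answer).
function b64toBlob(base64, contentType) {
  // atob yields a "binary string": one character per byte.
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: contentType || '' });
}

const sample = b64toBlob('V29yZCE=', 'text/plain'); // Base64 of "Word!"
console.log(sample.size, sample.type); // 5 text/plain
```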
In the end, this would work for simple Word documents, but it isn't nearly sophisticated enough for anything more. I couldn't get any of my styles to render and I didn't even attempt to get images working. I've since abandoned this approach and am now researching DocxgenJS or some server-side solution.
You may find this link useful:
http://evidenceprime.github.io/html-docx-js/
An online demo here:
http://evidenceprime.github.io/html-docx-js/test/sample.html
You are doing the correct thing codewise, but your file is not a valid docx file. If you look through the docx() function in docx.js, you will see that a docx file is actually a zip containing several xml files.
I am using Open Xml SDK for JavaScript.
http://ericwhite.com/blog/open-xml-sdk-for-javascript/
Basically, on the web server, I have an empty docx file as a new template.
When the user clicks to create a new docx file in the browser, I retrieve the empty docx file as a template, convert it to Base64 and return it as the Ajax response.
In the client script, you convert the Base64 string to a byte array and use openxmlsdk.js to load the byte array as a JavaScript OpenXmlPackage object.
Once you have the package loaded, you can use the regular OpenXmlPart to create a real document (inserting images, creating tables/rows).
The last step is to stream it out to the end user as a document. This part is security related. In my code, I send it back to the web server, where it is saved temporarily, and prepare an HTTP response to notify the end user to download it.
Check the URL above, there are useful samples of doing this in JavaScript.
