I have created a rich text editor in a UIWebView. My requirement is to save this text in a .doc Word file. How can I achieve this? I am getting the HTML content by using:
NSString *strWebText = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];
Now how can I proceed to convert it to .doc format? Or is there a JavaScript function to convert the text or save it to a .doc file?
Microsoft Word can open a .doc file that is really HTML and will treat it as such. There isn't any easy way to convert your HTML to a binary .doc file without significant code or the intervention of a server.
If you save a Word document as an HTML file, you will see the HTML that Word produces. You will find certain headers at the top; copy these into your own HTML and Word should open it correctly.
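The idea above can be sketched as follows. This is a minimal illustration in Python rather than Objective-C, only to show the shape of the output file. The function names wrap_html_for_word and save_as_doc are hypothetical; the namespace declarations are the ones Word itself writes when saving a document as a web page, and the exact set of headers Word needs is an assumption you should check against a file Word has exported.

```python
import html
from pathlib import Path

# Skeleton modelled on the header Word emits when saving as "Web Page".
# The Office namespaces hint to Word that it should treat the file as a
# Word document rather than a generic web page.
WORD_HTML_TEMPLATE = """<html xmlns:o="urn:schemas-microsoft-com:office:office"
      xmlns:w="urn:schemas-microsoft-com:office:word"
      xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>{title}</title>
</head>
<body>
{body}
</body>
</html>
"""


def wrap_html_for_word(body_html, title="Document"):
    """Return a full HTML page with Word-style headers around body_html."""
    # The title is escaped; the body is inserted as-is because it is
    # already HTML (for example, the value of document.body.innerHTML).
    return WORD_HTML_TEMPLATE.format(title=html.escape(title), body=body_html)


def save_as_doc(body_html, path):
    """Write the wrapped HTML to path (for example "Data.doc") as UTF-8."""
    Path(path).write_text(wrap_html_for_word(body_html), encoding="utf-8")
```

The resulting file is still HTML, not a binary .doc; it only relies on Word opening HTML content that carries a .doc extension.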
iOS does not have any built-in support for editing rich text or converting between formats. You'll have to find a library or write it yourself.
You can use:
[strWebText writeToFile:@"Data.doc" atomically:YES encoding:NSUnicodeStringEncoding error:&error];
Related
I want to develop an app like CamScanner, in which the user can scan any type of file, such as an image or a PDF, and then convert a Word file to PDF. So I want to ask: is there any way to convert a Word document to PDF in React Native?
You can use libreoffice-convert to achieve this.
Link >> https://www.npmjs.com/package/libreoffice-convert
There is also another library, awesome-unoconv, which provides the same functionality and will convert Word to PDF.
Link >> https://www.npmjs.com/package/awesome-unoconv
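For context, LibreOffice itself can perform this conversion from the command line in headless mode. Below is a minimal sketch in Python, assuming LibreOffice is installed and its soffice binary is on the PATH. The function names are hypothetical. This kind of conversion would most likely have to run on a server or desktop machine rather than inside the React Native app itself.

```python
import subprocess
from pathlib import Path


def build_convert_command(input_path, output_dir, soffice="soffice"):
    """Build the LibreOffice command line for a headless PDF conversion."""
    return [
        soffice,
        "--headless",            # run without a GUI
        "--convert-to", "pdf",   # target format
        "--outdir", str(output_dir),
        str(input_path),
    ]


def convert_to_pdf(input_path, output_dir):
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(build_convert_command(input_path, output_dir), check=True)
    # LibreOffice keeps the input file's base name and swaps the extension.
    return Path(output_dir) / (Path(input_path).stem + ".pdf")
```

A mobile client would then upload the document to a service that calls something like convert_to_pdf and download the resulting PDF.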
I'm using Node.js to build an app that finds and replaces text in a PDF. I have found some approaches:
Using an npm package, like pdfReader, that converts the PDF to JSON, so I can get the text and replace it with what I want. The problem is converting the output back to PDF.
A possible solution for the first item is to convert the PDF to HTML, edit the HTML, and convert it back to PDF. But most Node.js tutorials are about converting HTML to PDF, not PDF to HTML.
Are there any solutions for this problem?
Update
I ended up using PDFKit to create the PDF files that I need. In my case, this solution doesn't cover every possible case. But if you have to find a word and replace it in an arbitrary PDF file, this problem may have no solution in Node.js. The PDFKit library has an open issue for this feature.
Look at the approach in "How to export JSON data to a PDF file with a specified format with Node.js?". It basically uses your idea: convert the PDF to JSON, render the JSON as HTML, then convert the HTML to PDF.
I am trying to convert multiple files from PDF to plain text in Adobe. I found a solution online that reads:
/* PDF to Text */
this.saveAs("C:\Users\sandr\Dropbox\Light\Doctorate\Supervisor meetings\2018\October\Method\test_corpus\2sleep.tar\2sleep\2sleep\pdf\txt_output" + this.documentFileName + ".txt","com.adobe.acrobat.plain-text");
The script runs but it always gives an error saying it could not open the file and it doesn't actually create the text file. Does anyone know why this is?
Because Acrobat Javascript needs to run on both Mac and Windows, you need to use platform independent paths. Windows-specific file names and paths won't work. For example...
this.saveAs("/C/Users/user/Dropbox/foo.pdf");
Also...
this.documentFileName
will have a ".pdf" extension at the end, so you may want to trim that before appending the ".txt".
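The two string fixes described above can be sketched as follows. This is written in Python only to illustrate the transformations; inside Acrobat the same logic would be written in Acrobat JavaScript. The function names are hypothetical, and the leading-slash form of the path is my understanding of Acrobat's device-independent path format, not something confirmed in this thread.

```python
from pathlib import PureWindowsPath


def to_device_independent_path(windows_path):
    """Turn a Windows path such as C:\\Users\\me\\a.pdf into /C/Users/me/a.pdf."""
    path = PureWindowsPath(windows_path)
    drive = path.drive.rstrip(":")   # "C:" -> "C"
    parts = path.parts[1:]           # drop the "C:\\" anchor
    return "/" + "/".join([drive, *parts])


def text_output_name(document_file_name):
    """Replace the trailing extension (e.g. ".pdf") with ".txt"."""
    return str(PureWindowsPath(document_file_name).with_suffix(".txt"))
```

Note also that the original script concatenates the folder and the file name with no separator between them, so a "/" is needed between the output folder and the trimmed file name.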
(1) Is there a way to search for text in a PDF file and jump to its location in the file using Python?
(2) Is there a way to highlight text in a PDF file and have that text extracted, using Python?
I tried using the JavaScript library pdf.js, which actually worked, but I want to try Python. Any help would be appreciated. Thanks!
For searching for text within a PDF file you can use PyMuPDF or pdfminer. PyMuPDF would also let you create a PDF viewer and highlight the text if that's what you have in mind.
I am using the dropbox-js API as a back-end to an application I am creating.
I need to get the contents of a file and I understand that the method "readFile" that is used to get the contents only really supports text files.
I can get the contents of a text file of type "text/plain" i.e. .txt files, using the following:
client.readFile(d2.path, {arrayBuffer: true}, function(error, contents) {
    var decoded = decodeUtf8(contents);
    console.log(decoded);
});
The API reference for this method is here: http://coffeedoc.info/github/dropbox/dropbox-js/master/classes/Dropbox/Client.html#readFile-instance
The decode function was found here: https://gist.github.com/boushley/5471599
This does not seem to work for any other document type file. If I try and read a .docx / .doc file the result consists of what looks like scrambled characters. Should it be able to work with other document type files? How would I read it differently?
I really need it to support more than .txt files.
Edit:
This is a test document (.docx) that I tried to read:
This is how it is decoded (Contents shows that it is indeed an ArrayBuffer, while Decoded is the actual string that is returned after decoding):
readFile should work for any content type. Presumably the "scrambled characters" you see are exactly the content of the .docx or .doc file you're reading. (If you looked at the file via type on Windows or cat on Mac/Linux, you would see the same thing.)
So I think the issue you're having is that you want to somehow extract the text from a variety of file formats. Dropbox (and dropbox.js) won't help you with that particular problem... you'll need to find software that understands all those file formats and can convert them to the form you need. For example, textract is a Python library that can do this.
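To illustrate why the raw bytes look scrambled: a .docx file is a ZIP archive, and the document text lives in an XML part named word/document.xml inside it. The following is a minimal sketch using only the Python standard library, assuming the bytes have already been downloaded. The function name docx_bytes_to_text is hypothetical, the sketch ignores tables, headers, footnotes and formatting, and it does not handle the older binary .doc format at all.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_bytes_to_text(data):
    """Extract paragraph text from the raw bytes of a .docx file."""
    # The .docx container is a ZIP archive; open it from memory.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its text is split across <w:t> runs.
    for para in root.iter("{%s}p" % W_NS):
        runs = [node.text or "" for node in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

A library such as textract covers many more formats; this only shows what "understanding the file format" means for the simplest .docx case.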