How to create a single-page PDF from an HTML file
I'm currently using the convert-html-to-pdf library (version 1.0.1) to convert HTML files into PDFs. But the conversion splits my HTML across multiple pages, and it ends up breaking the layout (stretched divs).
Since my HTML is responsive, the size of the div blocks changes from file to file, so I can't determine where to break each page unless I somehow measure the size of the page and the size of the blocks.
There are two possible fixes:
create a single-page PDF (this one is preferred)
measure the size of the page and compare it with the items to make sure they are not split across pages
The first one should be easier, but I can't find anything in the documentation that helps me do that.
I'm open to library suggestions for creating a single-page PDF.
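One possible approach (just a sketch, assuming a switch from convert-html-to-pdf to Puppeteer; the file paths are hypothetical) is to measure the rendered height of the document and pass it to page.pdf(), so the whole document lands on one tall page:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Hypothetical input file; wait until all resources have loaded.
  await page.goto('file:///path/to/input.html', { waitUntil: 'networkidle0' });

  // Measure the full rendered height of the document.
  const height = await page.evaluate(() => document.documentElement.scrollHeight);

  // Emit a single tall page instead of paginating to a fixed page size.
  await page.pdf({
    path: 'output.pdf',
    width: '210mm',          // A4 width; adjust to your layout
    height: height + 'px',   // the whole document on one page
    printBackground: true,
  });

  await browser.close();
})();

Since the height is taken from the rendered page, responsive layouts of different sizes each get a PDF page that exactly fits them.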
Related
I am currently working on a PDF generator that uses HTML/CSS. Everything is already functional, including page number generation in the footer. My current problem is that I want to create a table of contents with dynamic page numbers (since, depending on the content, the page numbers of the different chapters differ). Is it possible to do this with just HTML/CSS?
The generator is working properly, but I have no idea how to proceed to get the page numbers of the PDF file.
The system I am working on has a questionnaire in it and then shows the responses to the admin in a nice on-screen report. I need to create functionality that turns the on-screen report into a PDF, similar to how the browser generates a PDF of the page when you select print, although I only need to turn a section of the page into a PDF. It would also be ideal to be able to alter the HTML so that the PDF page breaks don't interfere with the presentation of the report.
You can download a PDF of how the report looks, generated by the browser functionality. This is just an example; I need the PDF to be generated by a link or button, and not to include the whole page (the top part in this case).
I have tried some PHP HTML-to-PDF generators, but it's difficult because the HTML is dynamically generated, so I'm not sure how to send all the HTML, once rendered, to the page that creates the PDF.
To overcome the page breaks, I've considered using JavaScript or jQuery to read the height of each question's div within the report, then writing a simple script to calculate whether the next div will fit on the page and, if not, adding a margin on top of that div so that it starts on a new page (see the sketch after this question).
What software can I use to generate the PDF, given these requirements? Either PHP or JavaScript. Appreciate the help.
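As a rough illustration of the measure-and-pad idea (a sketch, not code from this thread; the .question class and the page height are assumptions):

// Assumed: each question lives in a div with class "question", and one
// printed page is roughly 1122px tall (A4 at 96dpi, before margins).
var pageHeight = 1122;
var used = 0; // vertical space already consumed on the current page

document.querySelectorAll('.question').forEach(function (div) {
  var h = div.offsetHeight;
  if (used + h > pageHeight) {
    // This div would straddle a page boundary: pad it down to a fresh page.
    div.style.marginTop = (pageHeight - used) + 'px';
    used = h % pageHeight;
  } else {
    used += h;
  }
});

In print CSS, page-break-inside: avoid on the same divs achieves much the same effect with less bookkeeping.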
Have you considered Snappy for PHP? It uses wkhtmltopdf behind the scenes to convert any HTML document into a PDF.
We are using it and it works great.
https://github.com/KnpLabs/snappy
You could try mPDF and use page-break-inside: avoid, which is actually a CSS property. I have not used this, but it might be what you're looking for.
It looks like you can add this property to <div> and <table> tags (mPDF Supported CSS).
I have a page with a lot of JavaScript. However, once rendered, the page remains static: there are no moving things or special effects. It should be possible to render the same HTML without any JavaScript at all, using only plain HTML and CSS. This is exactly what I want: a no-JavaScript version of this particular page. Naturally, I do not expect any dynamic behavior, so I am OK if buttons are dead, for example; I just want them rendered.
Now, I do not want an image. It needs to be HTML with CSS, which may be embedded in the HTML; that is fine too.
How can I do it?
EDIT
I am sorry, but I must not have been clear. My website works with JavaScript and will not work without it. I do not want to check whether it works without JavaScript; I know it will not, and I really do not care. That is not what I am asking. I am asking about a specific page, which I want to grab as pure HTML + CSS. The fact that its dynamic nature is lost is of no importance.
EDIT2
There is a suggestion to grab the HTML from the DOM inspector. This is the first thing I did: in the Chrome developer tools I copied the root html element as HTML and saved it to a file. Of course, this does not work, because it continues to reference the CSS files on the web. I guess I should have mentioned that I want it to work from the file system.
Next I saved the page as complete, with all its environment, using the browser's Save menu. It saves the page and all the related files forming a closure, which can be opened from the file system. But the HTML has to be manually cleaned of all the JavaScript, which is tedious and error prone.
EDIT3
I seem to keep forgetting things. Images should be preserved, of course.
I have to do a similar task on a semi-regular basis. As yet I haven't found an automated method, but here's my workflow:
Open the page in Google Chrome (I imagine Firefox also has the relevant tools);
"Save Page As" (complete page), rename the HTML page to something nicer, delete any .js scripts which got downloaded, and move everything into a single folder;
On the original page, open the Elements tab (DOM inspector), find and delete any tags which I know cause problems (Facebook "like" buttons, for example) (I also try to delete script tags at this stage because it's easier), and copy as HTML (right-click the <html> tag). Paste this over the contents of the downloaded HTML file (remember to keep the DOCTYPE, which doesn't get copied);
Search all HTML files for any remaining script sections and delete them (also delete any noscript content), and search for " on" (with a leading space) to remove inline handlers (onload, onclick, etc.); a console sketch for this step appears at the end of this answer;
Search for images (src=, url()), find common patterns in the image filenames, and use regular expressions to replace them globally, e.g. stripping the path so that src="/images/myimage.png" becomes src="myimage.png" (since everything now sits in one folder). This needs to be applied to all HTML and CSS files. Also make sure the CSS files have the correct path (href). While doing this I usually replace all link hrefs with #;
Finally, open the converted page in a browser (actually I tend to do this early on so that I can see if any change I make breaks it), use the Console tab to check for 404 errors (images that didn't get downloaded or had a different name) and the Network tab to check whether anything is still being loaded from the online version;
For any files which didn't get downloaded I go back to the original page and use the Resources tab to find them and download manually;
(Optional) Cull any content which isn't needed (tracker images/iframes, unused CSS, etc).
It's a big job. I'd love a tool which automated all that, but so far I haven't found one. The pages I download are quite badly made (shops) which have a lot of unusual code, so that's why there are so many steps. You might not need to follow every step.
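To take some of the tedium out of steps 3 and 4, a snippet along these lines, run in the browser console before copying the HTML, can strip the script tags and inline handlers in one go (a sketch, untested against any particular page):

// Remove all script and noscript elements from the live DOM.
document.querySelectorAll('script, noscript').forEach(function (el) {
  el.remove();
});

// Strip every inline event handler attribute (onclick, onload, ...).
document.querySelectorAll('*').forEach(function (el) {
  // Copy the attribute list first: removing entries while iterating a
  // live NamedNodeMap skips some of them.
  Array.prototype.slice.call(el.attributes).forEach(function (attr) {
    if (attr.name.indexOf('on') === 0) {
      el.removeAttribute(attr.name);
    }
  });
});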
I am implementing multilingual support on my webpage. I would like to minimize the page blinking caused by a page reload, so I came up with the idea of changing the page language without forcing the whole page to reload. The only way I can think of to achieve this is with JavaScript:
I dynamically load the appropriate language .js file with the appropriate translations
I manually go through every text object on the page and update it by setting the appropriate new text value
To give you example code, here is the part that updates just the submit buttons. On a language change, I call a function that loads the appropriate .js language file dynamically.
var fileRef = LoadJsCssFile("Language/svk.js", "js", UpdateLanguage);
After the language .js file is fully loaded, I call the function that updates every element containing text on the webpage:
function UpdateLanguage()
{
    // "lang" is the translation object defined by the loaded language file.
    var buttons = document.getElementsByClassName("submit_button");
    // Use an index loop: for...in over an HTMLCollection also visits
    // properties like "length" and "item", which breaks the update.
    for (var i = 0; i < buttons.length; i++)
    {
        buttons[i].innerHTML = lang.SUBMIT;
    }
}
Manually updating every text object on the webpage is complex and error prone. As I am not very experienced with JavaScript yet, I was wondering whether there is a way to simply refresh all the key elements on the webpage with one JavaScript command, without causing the webpage to blink?
If you have any other idea for how to effectively implement a language change without a page blink, I am interested to know. :-)
I found a solution on my own:
I prepare several JavaScript language files containing a string for every keyword
On the language selection button, I import the language file for the language I wish to use
I manually update every text on the webpage through JavaScript (sketched below).
The above solution is suitable for smaller sites. For large ones it would be a lot of work to update every single text string through JavaScript.
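The manual update can also be made generic. A minimal sketch of that idea (an assumption, not my exact code: every translatable element carries a data-i18n attribute naming its key, and lang is the dictionary loaded from the language file):

function UpdateLanguage()
{
    // Walk every element tagged with a translation key and swap its text.
    document.querySelectorAll('[data-i18n]').forEach(function (el) {
        var key = el.getAttribute('data-i18n');
        if (lang[key] !== undefined) {
            el.textContent = lang[key];
        }
    });
}

With markup like <button class="submit_button" data-i18n="SUBMIT">Submit</button>, adding a new string only requires a dictionary entry, not another loop.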
SHORT: My Python code generates a webpage with a table. I'm considering rewriting it to generate a JS file instead that holds the table contents in an array, and then letting the table be generated client-side. I am not sure of the pros and cons. Would anyone care to offer their experience/insight? Are there other solutions?
LONG: The web page contains a single table and an embedded gmap. The table is a set of locations with several columns of location stats and also two navigation columns. One nav column consists of onclick handlers that recenter the embedded gmap on the location's lat,lon; the other consists of hrefs that open a new window with a gmap centered on the lat,lon.
Until recently, my Python code would do some number crunching on a list of files and then generate the HTML file. I also wrote a JS file that keeps the webpage liquid when the browser window is resized.
Recently, I modified my Python code so that it:
placed the lat,lon info in a custom attribute of the tr elements
no longer produced the nav column tds
and then wrote a JS function (sketched below) that
loops through the trs onLoad
reads the lat,lon from the custom attribute
inserts the nav tds
FWIW, this reduced the size of the HTML file by 70% while increasing the JS by 10%.
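That onload pass might look roughly like this (a sketch; the data-lat/data-lon attribute names, the table id, and the global map variable are assumptions):

window.addEventListener('load', function () {
  document.querySelectorAll('#locations tr[data-lat]').forEach(function (tr) {
    var lat = parseFloat(tr.dataset.lat);
    var lon = parseFloat(tr.dataset.lon);

    // Nav cell 1: recenter the embedded gmap (assumed global "map").
    var recenter = document.createElement('td');
    recenter.textContent = 'center';
    recenter.onclick = function () {
      map.setCenter({ lat: lat, lng: lon });
    };
    tr.appendChild(recenter);

    // Nav cell 2: link that opens a new window with a map at this spot.
    var link = document.createElement('td');
    link.innerHTML = '<a href="https://www.google.com/maps?q=' + lat + ',' +
                     lon + '" target="_blank">map</a>';
    tr.appendChild(link);
  });
});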
OK, so now I am debating whether I should go all the way and have my Python code generate two files:
an essentially abstract html file
a js file containing a js array of the locations and their stats
If your API can output a JSON document with your data, you gain significant flexibility and future-proofing. This might even be something your users will want to access directly for their own external consumption. And of course your JS code can easily generate a table from this data.
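For instance, a minimal sketch (the endpoint, the table id, and the field names are all assumptions):

// Fetch the locations as JSON and build the table body from them.
fetch('/locations.json')            // hypothetical endpoint
  .then(function (r) { return r.json(); })
  .then(function (locations) {
    var tbody = document.querySelector('#locations tbody');
    locations.forEach(function (loc) {
      var tr = document.createElement('tr');
      tr.dataset.lat = loc.lat;     // same custom attributes as before
      tr.dataset.lon = loc.lon;
      [loc.name, loc.visits].forEach(function (value) {
        var td = document.createElement('td');
        td.textContent = value;
        tr.appendChild(td);
      });
      tbody.appendChild(tr);
    });
  });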
However nobody here can tell you whether this is worth doing or not, as that depends entirely on the scope of your project and opportunity cost of time spent re-architecting.