Unable to get local storage variable in Wicked Pdf - javascript

I have stored some data "xyz" in JavaScript local storage. Now I'm trying to access it in Wicked PDF but am unable to do so. It works when I display the page as HTML, but the value is not printed in the PDF.
This is what I have saved:
localStorage.setItem("my_id", "xyz");
In file.pdf.erb:
<script type="text/javascript">
setTimeout((function () {
//console.log(localStorage.getItem("dashboard_bar_chart_actual_situation_control"));
document.getElementById("p1").innerHTML = "HELLo "+localStorage.getItem("my_id");
window.status = "FLAG_FOR_PDF";
}), 1000);
</script>

Are you setting the localStorage value on the same page that you are rendering to PDF, or are you setting it on a prior page, and then expecting it to be retrieved on the PDF page?
The latter won't work in the default PDF rendering mode, because your HTML is saved to a file on-disk, and opened in a virtual browser, without any context (localStorage, cookies) of what happened on other pages (unless you supply them as options to wkhtmltopdf, at least for cookies).
You could explicitly set it during render, so that it becomes available, something like this:
<script>
// Render it to the page, so it is executed when the JS runs:
localStorage.setItem("my_id", "<%= "xyz" %>")
// Use it:
document.getElementById("p1").innerHTML = "HELLo "+localStorage.getItem("my_id");
</script>
Alternatively, you can use the WickedPDF middleware, which attempts an in-place replacement of the HTML you see with a PDF version, without saving it to a temporary HTML file on-disk, but that also may not work well, depending on how the JS code gets executed.
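A minimal sketch of the "set it during render" idea, written as plain JavaScript so it can be run under Node. The helper name buildStorageSeedScript is hypothetical and not part of WickedPDF; in a Rails view the same effect would come from the template itself. It assumes the values are strings, and escapes them so that a value containing "</script>" cannot end the inline script early:

```javascript
// Hypothetical helper: builds the inline script that seeds localStorage
// at render time, so the PDF renderer sees the values without prior context.
function buildStorageSeedScript(entries) {
  const lines = Object.entries(entries).map(([key, value]) => {
    // JSON.stringify yields valid JS string literals; escaping "<" keeps a
    // value such as "</script>" from terminating the inline script.
    const k = JSON.stringify(key).replace(/</g, "\\u003c");
    const v = JSON.stringify(String(value)).replace(/</g, "\\u003c");
    return `localStorage.setItem(${k}, ${v});`;
  });
  return `<script>\n${lines.join("\n")}\n</script>`;
}

console.log(buildStorageSeedScript({ my_id: "xyz" }));
```

The generated block has to appear in the rendered page before any script that reads the value.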

Related

getting OnLoad HTML/DOM for an HTML page in PHP

I am trying to get the HTML (ie what you see initially when the page completes loading) for some web-page URI. Stripping out all error checking and assuming static HTML, it's a single line of code:
function GetDisplayedHTML($uri) {
return file_get_contents($uri);
}
This works fine for static HTML, and is easy to extend by simple parsing, if the page has static file dependencies/references. So tags like <script src="XXX">, <a href="XXX">, <img src="XXX">, and CSS, can also be detected and the dependencies returned in an array, if they matter.
But what about web pages where the HTML is dynamically created using events/AJAX? For example suppose the HTML for the web page is just a brief AJAX-based or OnLoad script that builds the visible web page? Then parsing alone won't work.
I guess what I need is a way from within PHP, to open and render the http response (ie the HTML we get at first) via some javascript engine or browser, and once it 'stabilises', capture the HTML (or static DOM?) that's now present, which will be what the user's actually seeing.
Since such a webpage could continually change itself, I'd have to define "stable" (OnLoad or after X seconds?). I also don't need to capture any timer or async event states (ie "things set in motion that might cause web page updates at some future time"). I only need enough of the DOM to represent the static appearance the user could see, at that time.
What would I need to do, to achieve this programmatically in PHP?
To render a page with JS you need some kind of browser. PhantomJS was created for tasks like this. Here is a simple script to run with PhantomJS:
var webPage = require('webpage');
var page = webPage.create();
var system = require('system');
var args = system.args;
if (args.length === 1) {
console.log('First argument must be page URL!');
} else {
page.open(args[1], function (status) {
window.setTimeout(function () { //Wait for scripts to run
var content = page.content;
console.log(content);
phantom.exit();
}, 500);
});
}
It prints the resulting HTML to the console.
You can run it from console like this:
./phantomjs.exe render.js http://yandex.ru
Or you can use PHP to run it:
<?php
$path = dirname(__FILE__);
$html = shell_exec($path . DIRECTORY_SEPARATOR . 'phantomjs.exe render.js http://phantomjs.org/');
echo htmlspecialchars($html);
My PHP code assumes that the PhantomJS executable is in the same directory as the PHP script.
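The fixed 500 ms wait in the script above is a guess at when the page has "stabilised". One possible alternative, sketched here under assumptions, is to take snapshots of the content at a fixed interval and treat the page as stable once the same content has been seen several times in a row. The helper createStabilityDetector is hypothetical, and the snapshots are simulated so the sketch runs without PhantomJS:

```javascript
// Hypothetical helper: reports "stable" once the same content has been
// seen `requiredRepeats` times in a row.
function createStabilityDetector(requiredRepeats) {
  let last = null;
  let repeats = 0;
  return function isStable(content) {
    if (content === last) {
      repeats += 1;
    } else {
      last = content;
      repeats = 1;
    }
    return repeats >= requiredRepeats;
  };
}

// Simulated snapshots, standing in for page.content read at a fixed interval.
const snapshots = ["<p></p>", "<p>a</p>", "<p>ab</p>", "<p>ab</p>", "<p>ab</p>"];
const isStable = createStabilityDetector(3);
const stableAt = snapshots.findIndex((html) => isStable(html));
console.log(stableAt); // index of the first snapshot considered stable
```

In a real script the detector would be fed from a timer, with an overall timeout for pages that never stop changing.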

Send a PDF URL in a browser to the printer via iframe

For current non-IE browsers (Chrome, Firefox, Opera, Safari), I would like to send a PDF document to the printer given a URL to that PDF.
To avoid superfluous windows popping up, I am presently doing this with an <iframe> but I would like to close the iframe after the printing is completed, otherwise some browsers will pop up a dialog when one tries to leave the page.
Here is what I have come up with so far (using lodash and jQuery for simplicity):
var _print_url, _remove_iframe, _set_print;
_print_url = function(src_url) {
$("<iframe src='" + src_url + "' type='application/pdf'>").css({
visibility: 'hidden',
position: 'fixed',
right: '0',
bottom: '0'
}).on('load', _set_print).appendTo(document.body);
};
_remove_iframe = function(iframe) {
return $(iframe).parent().find(iframe).detach();
};
_set_print = function() {
this.contentWindow.print();
/*
* Where we could do #contentWindow.close() with a window, we must remove the
* print iframe from the DOM. We have to engage in some async madness
* it seems, for no apparent reason other than this won't work
* synchronously (#cw.print above must be async, it seems) - even though
* window.close() appears to work synchronously.
*/
_.delay(_.partial(_remove_iframe, this), 100);
};
Sometimes it seems with Google Chrome the print-dialog ends up showing the PDF correctly, but when one selects the printer and confirms the intention to print it will actually send the contents of the frame's parent page to the printer instead of the PDF itself.
There are suggestions on the Mozilla page, but that document seems obsolete at the moment. The best example I could find was by reverse-engineering the Amazon Web Services print dialog for invoices, but that opens a window.
One alternative I have considered is Google Cloud Print, but obviously this requires the installation of extra software or configuration of a Google Account, neither of which I would wish to impose on users unless necessary.
Are there other examples of how one might print a PDF given a URL, particularly with Javascript and without otherwise superfluous additional browser add-ons or artefacts such as windows popping up?
-- NOTE: with this approach you will never trigger the popup blocker --
I ran into a similar problem with an AJAX application a couple of months ago. The problem I faced was that I needed to create the PDF file and store it before sending it to print. Here is what I did:
I didn't use iframes. The application uses PHP TCPDF to create the PDF, and jQuery and Underscore for the templating system.
You can see the demo video at http://www.screenr.com/Ur07 (2:18 min)
1.- Send the information via JSON (AJAX). Because everything in the application was done with AJAX, I couldn't make a regular POST with the information. So I appended a hidden form to the DOM and, using target="_blank", posted it to a new window (which is closed at the end of the process).
HTML hidden virtual form
<form method='<%= method %>'
action="<%= action %>"
name="<%= name %>"
id="<%= id %>"
target="_blank">
<input type='hidden' name='json' id='<%= valueId %>' />
</form>
Javascript
function makePost(){
var _t = _.template(_JS_Templates["print.form.hidden"]); //this the html of the form
var o = {
method : "POST",
action : js_WEB_ + "/print.php",
name : "print_virtual_form",
id : "print_virtual_form_id",
valueId : "print_virtual_value"
}
var form = _t(o);
$(".warp").append(form); //appending the form to the dom
var json = {
data : data // some data
}
$("#print_virtual_value").val(JSON.stringify(json)); //assing to the hidden input the value and stringify the json
$("#print_virtual_form_id").submit(); //make the post submmitting the form
$("#print_virtual_form_id").remove(); //removing the "virtual hidden form"
}
2.- When you receive the JSON (in my case using PHP), create whatever you have to create. You can see how I did it in the PHP code in this example (it is my answer):
Generate PDF using TCPDF on ajax call
3.- Finally, the cool part is the response. Here the PHP file writes JavaScript code that closes the newly opened window and reports the result to a specific function in the opening window, as a kind of callback.
switch($_SERVER['REQUEST_METHOD']){
case "GET":
echo "Denied";
break;
case "POST":
if(isset($_POST["json"])){
if(!empty($_POST["json"])){
$json = $_POST["json"];
$hash = PDF::makePDF($json);
echo "<script>
function init() {
//calling callback
//returning control to the previous browser
window.opener.someobject.finish('".$hash."');
window.close();
}
window.onload = init;
</script>";
}else{
echo "not json detected";
}
}else{
echo "not json detected";
}
break;
}
4.- Finally, with control back in your original browser window, you can do:
var someobject = (function(name){
//private
function finish(data){
//do whatever you want with the json data
}
//public
var $r = {};
$r.finish = function(data){
finish(data); //you will pass the data to the finish function and that is..!! voila!!
}
return $r;
})("someobject");
I know this is not exactly what you asked for, but it is another approach to the same problem. While I think it is a little more complex, I can guarantee it works in a lot of browsers, and users will love it: they never see what is happening, and they download the file just how they expect to, by clicking a link and then saving the file.
Just my two cents and happy coding.
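As a rough sketch of step 1 without the Underscore template, the hidden form can also be built as a plain string. The helpers escapeAttr and renderHiddenForm are hypothetical names, not part of the answer's code; the field names mirror the template shown earlier, and the JSON payload is written straight into the hidden input's value attribute, so it has to be attribute-escaped:

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: renders the hidden form with the payload embedded.
function renderHiddenForm({ method, action, name, id, valueId }, payload) {
  const json = JSON.stringify(payload);
  return [
    `<form method="${escapeAttr(method)}" action="${escapeAttr(action)}"`,
    `      name="${escapeAttr(name)}" id="${escapeAttr(id)}" target="_blank">`,
    `  <input type="hidden" name="json" id="${escapeAttr(valueId)}" value="${escapeAttr(json)}" />`,
    `</form>`,
  ].join("\n");
}

console.log(
  renderHiddenForm(
    {
      method: "POST",
      action: "/print.php",
      name: "print_virtual_form",
      id: "print_virtual_form_id",
      valueId: "print_virtual_value",
    },
    { data: "some data" }
  )
);
```

The resulting string can be appended to the DOM and submitted in the same way as the templated form.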

Inject local .js file into a webpage?

I'd like to inject a couple of local .js files into a webpage. I just mean client side, as in within my browser, I don't need anybody else accessing the page to be able to see it. I just need to take a .js file, and then make it so it's as if that file had been included in the page's html via a <script> tag all along.
It's okay if it takes a second after the page has loaded for the stuff in the local files to be available.
It's okay if I have to be at the computer to do this "by hand" with a console or something.
I've been trying to do this for two days, I've tried Greasemonkey, I've tried manually loading files using a JavaScript console. It amazes me that there isn't (apparently) an established way to do this, it seems like such a simple thing to want to do. I guess simple isn't the same thing as common, though.
If it helps, the reason why I want to do this is to run a chatbot on a JS-based chat client. Some of the bot's code is mixed into the pre-existing chat code -- for that, I have Fiddler intercepting requests to .../chat.js and replacing it with a local file. But I have two .js files which are "independent" of anything on the page itself. There aren't any .js files requested by the page that I can substitute them for, so I can't use Fiddler.
Since you're already using a Fiddler script, you can do something like this in the OnBeforeResponse(oSession: Session) function:
if ( oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
oSession.hostname.Contains("MY.TargetSite.com") ) {
oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
// Remove any compression or chunking
oSession.utilDecodeResponse();
var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
// Find the end of the HEAD script, so you can inject script block there.
var oRegEx = /(<\/head>)/gi;
// replace the head-close tag with new-script + head-close
oBody = oBody.replace(oRegEx, "<script type='text/javascript'>console.log('We injected it');</script></head>");
// Set the response body to the changed body string
oSession.utilSetResponseBody(oBody);
}
Working example for www.html5rocks.com :
if ( oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
oSession.hostname.Contains("html5rocks") ) { //goto html5rocks.com
oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
oSession.utilDecodeResponse();
var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
var oRegEx = /(<\/head>)/gi;
oBody = oBody.replace(oRegEx, "<script type='text/javascript'>alert('We injected it')</script></head>");
oSession.utilSetResponseBody(oBody);
}
Note, you have to turn streaming off in fiddler : http://www.fiddler2.com/fiddler/help/streaming.asp and I assume you would need to decode HTTPS : http://www.fiddler2.com/fiddler/help/httpsdecryption.asp
I have been using fiddler script less and less, in favor of fiddler .Net Extensions - http://fiddler2.com/fiddler/dev/IFiddlerExtension.asp
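The string manipulation at the heart of the snippets above can be tried outside Fiddler. This is a sketch in plain JavaScript, assuming the response body is already available as a decoded string; injectBeforeHeadClose is a hypothetical name. It uses a function as the replacement so that "$"-sequences in the injected code are not interpreted by replace, and it omits the "g" flag so the script is injected only once:

```javascript
// Hypothetical helper: inserts an inline script just before the first </head>.
function injectBeforeHeadClose(html, scriptSource) {
  const tag = `<script type="text/javascript">${scriptSource}</script>`;
  // A function replacement returns its result verbatim, so patterns such as
  // "$&" inside scriptSource are not expanded by String.prototype.replace.
  return html.replace(/<\/head>/i, (closeTag) => tag + closeTag);
}

const page = "<html><HEAD><title>t</title></HEAD><body></body></html>";
console.log(injectBeforeHeadClose(page, "console.log('cost: $&');"));
```

If the page has no closing head tag, the body is returned unchanged.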
If you are using Chrome then check out dotjs.
It will do exactly what you want!
How about just using jquery's jQuery.getScript() method?
http://api.jquery.com/jQuery.getScript/
Save the normal HTML pages to the file system, add the JS files by hand, and then use Fiddler to intercept those requests so that you get your version of the HTML file.

.NET in js include file

Can I print server-side data into my .js files before they are sent to the client? I want to be able to print the session id into a variable to use in getting around the flash cookie bug. Here's my code:
//Add our custom post-params
var auth = "<% = If(Request.Cookies(FormsAuthentication.FormsCookieName) Is Nothing, String.Empty, Request.Cookies(FormsAuthentication.FormsCookieName).Value) %>";
var ASPSESSID = "<%= Session.SessionID %>";
this.setPostParams({
ASPSESSID: ASPSESSID,
AUTHID: auth
});
That shows up exactly the same way in my js file. I want the server to process the code!
You cannot include .NET markup in a .js file. You can use an .ashx file to generate dynamic script files if you really want to do it.
Why write it in the JS file? Write it to the page with ClientScriptManager.RegisterClientScriptBlock so the JavaScript can be cached properly. The script will still be able to access the variable in the page as long as the included script appears after the code.
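A sketch of how the static script could consume values written into the page, assuming the page renders a global object before the script is included. The name APP_CONFIG and its fields are hypothetical, and a plain object stands in for window so the example runs under Node:

```javascript
// Assumption: the page (not the .js file) renders this object server-side,
// for example through a registered client script block. Stand-in for window.
const pageGlobals = { APP_CONFIG: { sessionId: "abc123", authToken: "" } };

// This part lives in the static, cacheable .js file.
function buildPostParams(globals) {
  const config = globals.APP_CONFIG || {};
  return {
    ASPSESSID: config.sessionId || "",
    AUTHID: config.authToken || "",
  };
}

console.log(buildPostParams(pageGlobals));
```

In the browser the call would be buildPostParams(window), and the result would be passed to setPostParams.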
I think you want to use immediate if. Like:
var auth = "<% = iif(Request...

Applying DOM Manipulations to HTML and saving the result?

I have about 100 static HTML pages that I want to apply some DOM manipulations to. They all follow the same HTML structure. I want to apply some DOM manipulations to each of these files, and then save the resulting HTML.
These are the manipulations I want to apply:
# [start]
$("h1.title, h2.description", this).wrap("<hgroup>");
if ( $("h1.title").height() < 200 ) {
$("div.content").addClass('tall');
}
# [end]
# SAVE NEW HTML
The first line (.wrap()) I could easily do with a find and replace, but it gets tricky when I have to determine the calculated height of an element, which can't easily be determined sans-JavaScript.
Does anyone know how I can achieve this? Thanks!
While the first part could indeed be solved in "text mode" using regular expressions or a more complete DOM implementation in JavaScript, for the second part (the height calculation), you'll need a real, full browser or a headless engine like PhantomJS.
From the PhantomJS homepage:
PhantomJS is a command-line tool that packs and embeds WebKit.
Literally it acts like any other WebKit-based web browser, except that
nothing gets displayed to the screen (thus, the term headless). In
addition to that, PhantomJS can be controlled or scripted using its
JavaScript API.
A schematic instruction (which I admit is not tested) follows.
In your modification script (say, modify-html-file.js), open an HTML page, modify its DOM tree and console.log the HTML of the root element:
var page = new WebPage();
page.open(encodeURI('file://' + phantom.args[0]), function (status) {
if (status === 'success') {
var html = page.evaluate(function () {
// your DOM manipulation here
return document.documentElement.outerHTML;
});
console.log(html);
}
phantom.exit();
});
Next, save the new HTML by redirecting your script's output to a file:
#!/bin/bash
mkdir -p modified
# Use the loop variable "$i"; "$1" would be the script's first argument
for i in *.html; do
phantomjs modify-html-file.js "$i" > modified/"$i"
done
I tried PhantomJS as in katspaugh's answer, but ran into several issues trying to manipulate pages. My use case was modifying the static HTML output of Doxygen, without modifying Doxygen itself. The goal was to reduce the delivered file size by removing unnecessary elements from the page, and to convert it to HTML5. Additionally, I wanted to use jQuery to access and modify elements more easily.
Loading the page in PhantomJS
The APIs appear to have changed drastically since the accepted answer. Additionally, I used a different approach (derived from this answer), which will be important in mitigating one of the major issues I encountered.
var system = require('system');
var fs = require('fs');
var page = require('webpage').create();
// Reading the page's content into your "webpage"
// This automatically refreshes the page
page.content = fs.read(system.args[1]);
// Make all your changes here
fs.write(system.args[2], page.content, 'w');
phantom.exit();
Preventing JavaScript from Running
My page uses Google Analytics in the footer, and now the page is modified beyond my intention, presumably because javascript was run. If we disable javascript, we can't actually use jQuery to modify the page, so that isn't an option. I've tried temporarily changing the tag, but when I do, every special character is replaced with an html-escaped equivalent, destroying all javascript code on the page. Then, I came across this answer, which gave me the following idea.
var rawPageString = fs.read(system.args[1]);
rawPageString = rawPageString.replace(/<script type="text\/javascript"/g, "<script type='foo/bar'");
rawPageString = rawPageString.replace(/<script>/g, "<script type='foo/bar'>");
page.content = rawPageString;
// Make all your changes here
rawPageString = page.content;
rawPageString = rawPageString.replace(/<script type='foo\/bar'/g, "<script");
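The disable/restore round trip can be checked in isolation under Node. This sketch wraps the same replacements in two hypothetical functions, disableScripts and restoreScripts. As an assumption on my part, the restore pattern accepts either quote style, in case the serialised page.content comes back with double-quoted attributes. Note that the original type="text/javascript" attribute is dropped rather than restored:

```javascript
// Hypothetical helper: swaps script types so the engine will not execute them.
function disableScripts(html) {
  return html
    .replace(/<script type="text\/javascript"/g, "<script type='foo/bar'")
    .replace(/<script>/g, "<script type='foo/bar'>");
}

// Hypothetical helper: removes the placeholder type again.
function restoreScripts(html) {
  // Accept single or double quotes around the placeholder type.
  return html.replace(/<script type=(['"])foo\/bar\1/g, "<script");
}

const input = '<script>a()</script><script type="text/javascript" src="x.js"></script>';
const disabled = disableScripts(input);
console.log(disabled);
console.log(restoreScripts(disabled));
```

Script tags with other attribute orders or types are not matched by these patterns and would still run.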
Adding jQuery
There's actually an example on how to use jQuery. However, I thought an offline copy would be more appropriate. Initially I tried using page.includeJs as in the example, but found that page.injectJs was more suitable for the use case. Unlike includeJs, there's no <script> tag added to the page context, and the call blocks execution which simplifies the code. jQuery was placed in the same directory I was executing my script from.
page.injectJs("jquery-2.1.4.min.js");
page.evaluate(function () {
// Make all changes here
// Remove the foo/bar type more easily here
$("script[type^=foo]").removeAttr("type");
});
fs.write(system.args[2], page.content, 'w');
phantom.exit();
Putting it All Together
var system = require('system');
var fs = require('fs');
var page = require('webpage').create();
var rawPageString = fs.read(system.args[1]);
// Prevent in-page javascript execution
rawPageString = rawPageString.replace(/<script type="text\/javascript"/g, "<script type='foo/bar'");
rawPageString = rawPageString.replace(/<script>/g, "<script type='foo/bar'>");
page.content = rawPageString;
page.injectJs("jquery-2.1.4.min.js");
page.evaluate(function () {
// Make all changes here
// Remove the foo/bar type
$("script[type^=foo]").removeAttr("type");
});
fs.write(system.args[2], page.content, 'w');
phantom.exit();
Using it from the command line:
phantomjs modify-html-file.js "input_file.html" "output_file.html"
Note: This was tested and working with PhantomJS 2.0.0 on Windows 8.1.
Pro tip: If speed matters, you should consider iterating the files from within your PhantomJS script rather than a shell script. This will avoid the latency that PhantomJS has when starting up.
You can get your modified content with $('html').html() (or a more specific selector if you don't want things like the head tags), then submit it as one big string to your server and write the file server-side.
