Hi, I was wondering whether Node.js and Zombie.js can inject JavaScript files into the headless browser, similar to what you can do with PhantomJS.
For example, in PhantomJS you would do:
page.injectJs("amino/TVI.js")
I have used PhantomJS and it does what I want, but I am testing other options because of its high memory usage.
You can append a script tag to the document object, since Zombie supports the DOM API.
The following example shows how to insert jQuery into the Zombie homepage:
var Browser = require("zombie");
var assert = require("assert");
// Load the Zombie homepage
var browser = new Browser();
browser.visit("http://zombie.labnotes.org/", function () {
assert.ok(browser.success);
// append script tag
var injectedScript = browser.document.createElement("script");
injectedScript.setAttribute("type","text/javascript");
injectedScript.setAttribute("src", "http://code.jquery.com/jquery-1.11.0.min.js");
browser.body.appendChild(injectedScript);
browser.wait(function(window) {
// make sure the new script tag is inserted
return window.document.querySelectorAll("script").length == 4;
}, function() {
// jquery is ready
assert.equal(browser.evaluate("$.fn.jquery"), "1.11.0");
console.log(browser.evaluate("$('title').text()"));
});
});
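If the file you want to inject lives on disk rather than behind a URL (as with page.injectJs in the question), one option is to read it yourself and hand the source to browser.evaluate. This is only a sketch: injectLocalScript is a made-up helper name, and it assumes browser.evaluate accepts a string of source code, as it does in the example above.

```javascript
var fs = require('fs');

// Read a local script and run its source inside the page's window.
// `browser` is assumed to expose evaluate(sourceString), like the
// zombie Browser used in the answer above.
function injectLocalScript(browser, filePath) {
  var source = fs.readFileSync(filePath, 'utf8');
  return browser.evaluate(source);
}

// Hypothetical usage, inside the browser.visit callback:
//   injectLocalScript(browser, "amino/TVI.js");
```

Because the script is evaluated synchronously, there is no need for the browser.wait step that the script-tag approach requires.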
Try thinking about it the other way around. Zombie already gives you everything you need to inject whatever you want.
For example, that.browser.window points to the jsdom window that all of your site's JavaScript uses as its base, so you can access the DOM and every other window object of the loaded page directly.
I don't know what you want to achieve by injecting. You should not use it for testing anyway, but it looks like that is not your actual goal.
A friend has asked me to capture a client-side rendered website built with React.js, preferably using PhantomJS. I'm using a simple rendering script as follows:
var system = require('system'),
fs = require('fs'),
page = new WebPage(),
url = system.args[1],
output = system.args[2],
result;
page.open(url, function (status) {
if (status !== 'success') {
console.log('FAILED to load the url');
phantom.exit();
} else {
result = page.evaluate(function(){
var html, doc;
html = document.querySelector('html');
return html.outerHTML;
});
if(output){
var rendered = fs.open(output,'w');
rendered.write(result);
rendered.flush();
rendered.close();
}else{
console.log(result);
}
}
phantom.exit();
});
The url is http://azertyjobs.tk
I consistently get an error
ReferenceError: Can't find variable: Promise
http://azertyjobs.tk/build/bundle.js:34
http://azertyjobs.tk/build/bundle.js:1 in t
...
Ok so I figured out that ES6 Promises aren't natively supported by PhantomJS yet, so I tried various extra packages like the following https://www.npmjs.com/package/es6-promise and initiated the variable as such:
var Promise = require('es6-promise').Promise
However this still produces the same error, although Promise is now a function. The output of the webpage is also still as good as empty (obviously..)
Now I'm pretty oldschool, so this whole client-side rendering stuff is kind of beyond me (in every aspect), but maybe someone has a solution. I've tried using a waiting script too, but that brought absolutely nothing. Am I going about this completely wrong? Is this even possible to do?
Much appreciated!
Ludwig
I tried the polyfill you linked and it didn't work. I switched to core.js and was able to take a screenshot. You need to inject the polyfill before the page is opened:
page.onInitialized = function() {
if(page.injectJs('core.js')){
console.log("Polyfill loaded");
}
}
page.open(url, function (status) {
setTimeout(function(){
page.render('output.jpg');
phantom.exit();
}, 3000);
});
What you need to understand is that there are several parts of a page loading. First there is the HTML - the same thing you see when you "view source" on a web page. Next there are images and scripts and other resources loaded. Then the scripts are executed, which may or may not result in more content being loaded and possible modifications to the HTML.
What you must do then is figure out a way to determine when the page is actually "loaded" as the user sees it. PhantomJS provides a paradigm for you to waitFor content to load. Read through their example and see if you can figure out a method which works for you. Take special note of where they put phantom.exit(); as you want to make sure that happens at the very end. Good luck.
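A polling helper in the spirit of that waitFor idea might look like the sketch below. It is not PhantomJS's own example code: the signature (including the separate onTimeout callback and the interval argument) is my own assumption, and the '#app' selector in the usage comment is a placeholder for whatever element your page renders into.

```javascript
// Poll `test` every `intervalMs` until it returns a truthy value, then call
// `onReady`. If `timeoutMs` elapses first, call `onTimeout` instead.
function waitFor(test, onReady, onTimeout, timeoutMs, intervalMs) {
  var start = Date.now();
  var timer = setInterval(function () {
    if (test()) {
      clearInterval(timer);
      onReady();
    } else if (Date.now() - start >= timeoutMs) {
      clearInterval(timer);
      onTimeout();
    }
  }, intervalMs);
}

// Hypothetical usage inside a PhantomJS script:
//   page.open(url, function () {
//     waitFor(function () {
//       return page.evaluate(function () {
//         return document.querySelectorAll('#app *').length > 0;
//       });
//     }, function () {
//       page.render('output.jpg');
//       phantom.exit();
//     }, function () {
//       console.log('Timed out waiting for content');
//       phantom.exit(1);
//     }, 10000, 250);
//   });
```

Note that phantom.exit() is only called from the two final callbacks, never right after page.open returns.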
Where (how) are you trying to initialise Promise? You'll need to create it as a property of window, or use es6-promise as a global polyfill, like this require('es6-promise').polyfill(); or this require('es6-promise/auto'); (from the readme).
Also, what do you mean by "capture"? If you're trying to scrape data, you may have better luck using X-ray. It supports Phantom, Nightmare and other drivers.
Keep in mind also that React can also be server rendered. React is like templating, but with live data bindings. It's not as complicated as you're making it out to be.
Is there an easy way to create a Firefox extension for a simple content script WITHOUT having to use the Add-On SDK and PageMod? It really seems overkill to go through the hassle of installing Python and the SDK, learning how to use the SDK and API, and adding unnecessary bloat and abstraction layers, just to execute a simple content script.
I already tried using a XUL browser overlay and injecting the scripts there, but having everything injected in the browser.xul context instead of document.body is also adding a lot of complexity...
So what's the easiest, lightweight way to inject a few scripts and css files in the html document instead of the XUL document?
Your question is borderline too broad, so I won't discuss everything in detail but will give a general overview.
Easy might be an overstatement, but SDK content scripts (and actually modules too), Greasemonkey/Scriptish and everything else that resembles a content script use Sandbox internally. Even bootstrap.js in restartless add-ons is executed in a sandbox.
The basic idea is the following:
1. Get a reference to the content window you want to attach to.
2. Choose a "principal" the script should run under. The principal is essentially the security context/policy that also defines the same origin. An unprivileged content script would usually use the content window itself (which is a principal too), while a privileged script (with chrome access to Components) would use the system principal.
3. Choose if you want XRay wrappers. The docs tell you more about it.
4. Choose the Sandbox prototype (the "global" or top-level this). Usually for content script stuff you'll choose the content window.
5. Create the Sandbox.
6. Add any stuff your content script may need to the Sandbox.
7. Execute a script by either evalInSandbox or the subscript loader.
Here is a limited example adding an unprivileged content script to a window:
// 1. get the content window, e.g. the currently selected tab window
var contentWindow = gBrowser.contentWindow;
// 2. Choose the principal, e.g. just use the content window again
var principal = contentWindow;
// 3. We want XRay wrappers, to keep our content script and the actual
// page scripts in their own corners.
var wantXrays = true;
// 4. Our prototype will be the window
var sbProto = contentWindow;
// 5. Putting it all together to create a sandbox
var sandbox = Cu.Sandbox(principal, {
sandboxPrototype: sbProto,
wantXrays: wantXrays
});
// 6. Adding a random helper function (e.g.)
sandbox.getRandomInt = function (min, max) {
return Math.floor(Math.random() * (max - min + 1)) + min;
};
// 7. Execute some content script, aka. the stupid example.
try {
var execute = function() {
var demo1 = document.querySelector('title').textContent;
var demo2 = getRandomInt(1, 1000);
alert(demo1 + " " + demo2);
}
Cu.evalInSandbox(
"(" + execute.toSource() + ")()",
sandbox
);
} catch(ex) {
console.error(ex);
}
PS: This example will run verbatim in a Scratchpad with Environment/Browser.
Regarding styles:
Do what the SDK does, I guess, which in simplified form is:
var wu = contentWindow.QueryInterface(Ci.nsIInterfaceRequestor).
getInterface(Ci.nsIDOMWindowUtils);
var uri = Services.io.newURI(
"chrome://myaddon/style/content-style.css",
null,
null);
wu.loadSheet(uri, wu.USER_SHEET);
I'm trying to develop an extension that works only on specified pages: if the page owner adds a global variable to their code (e.g. ACCEPT_STATS = true;), I want to execute specified code.
I've already bound my function to the onload event, and I've also found a solution for doing this in Firefox:
var win = window.top.getBrowser().selectedBrowser.contentWindow;
if (typeof win.wrappedJSObject.ACCEPT_STATS !== 'undefined') {
// code to run if global variable present
}
but I couldn't make this work in Chrome. Is there any way to access the document's global variables through Chrome extension code?
My extension's code is injected as a content-script.
Yes, a script included in the page this way runs in a context isolated from the page's own runtime script.
However, it is possible to work around the isolated worlds issue by pushing inline script into the runtime context via a script tag appended to the document's html. That inline script can then throw a custom event.
The included script in the isolated context can listen for that event and respond to it accordingly.
So code in your included script would look something like this:
// inject code into "the other side" to talk back to this side;
var scr = document.createElement('script');
//appending text to a function to convert its source to a string only works in Chrome
scr.textContent = '(' + function () {
var check = [do your custom code here];
var event = document.createEvent("CustomEvent");
event.initCustomEvent("MyCustomEvent", true, true, {"passback":check});
window.dispatchEvent(event); } + ')();'
//cram that sucker in
(document.head || document.documentElement).appendChild(scr);
//and then hide the evidence as much as possible.
scr.parentNode.removeChild(scr);
//now listen for the message
window.addEventListener("MyCustomEvent", function (e) {
var check = e.detail.passback;
// [do what you need to here].
});
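For the ACCEPT_STATS case from the question, the "[do your custom code here]" placeholder could be filled in roughly as follows. This is a sketch under assumptions: buildProbeSource is a hypothetical helper, it uses the CustomEvent constructor rather than initCustomEvent, and it builds the inline source as a plain string instead of relying on function-to-string conversion.

```javascript
// Build the source of an inline script that runs in the page's own context,
// checks whether a global variable exists, and reports the result through a
// custom event that the content script can listen for.
function buildProbeSource(globalName, eventName) {
  return '(function () {' +
    'var present = typeof window[' + JSON.stringify(globalName) + '] !== "undefined";' +
    'window.dispatchEvent(new CustomEvent(' + JSON.stringify(eventName) +
    ', { detail: { present: present } }));' +
    '})();';
}

// Hypothetical usage in the content script. Register the listener first,
// because the inline script runs as soon as it is appended:
//   window.addEventListener('MyCustomEvent', function (e) {
//     if (e.detail.present) { /* code to run if the global is present */ }
//   });
//   var scr = document.createElement('script');
//   scr.textContent = buildProbeSource('ACCEPT_STATS', 'MyCustomEvent');
//   (document.head || document.documentElement).appendChild(scr);
//   scr.parentNode.removeChild(scr);
```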
The javascript running on the page is running in a different "isolated world" than the javascript that you inject using content scripts. Google Chrome keeps these two worlds separate for security reasons and therefore you can't just read window.XYZ on any window. More info on how isolated worlds work : http://www.youtube.com/watch?v=laLudeUmXHM
The correct way to implement this is to communicate with the page via the window.postMessage API. Here's how I would go about it:
1. Inject a content script into each tab.
2. Send a message to the tab via window.postMessage.
3. If the page understands this message, it responds correctly (again via window.postMessage).
4. The content script executes the code that it needed to execute.
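The handshake above could be sketched like this. The message type names STATS_QUERY and STATS_REPLY and both function names are made up for illustration, and the approach only works if the page owner includes the answering half in their own code. ACCEPT_STATS is the flag from the question.

```javascript
// Content-script side: ask the page whether it opts in and run `onAccepted`
// only if it answers positively.
function askPage(win, onAccepted) {
  win.addEventListener('message', function (event) {
    if (event.source !== win) { return; } // ignore messages from other frames
    var data = event.data;
    if (data && data.type === 'STATS_REPLY' && data.accepted === true) {
      onAccepted();
    }
  });
  win.postMessage({ type: 'STATS_QUERY' }, '*');
}

// Page side: what a cooperating site would ship alongside ACCEPT_STATS.
function answerFromPage(win) {
  win.addEventListener('message', function (event) {
    if (event.source !== win) { return; }
    if (event.data && event.data.type === 'STATS_QUERY') {
      win.postMessage(
        { type: 'STATS_REPLY', accepted: win.ACCEPT_STATS === true },
        '*'
      );
    }
  });
}

// Hypothetical usage: the page calls answerFromPage(window) in its own code,
// and the content script calls askPage(window, function () { /* ... */ }).
```

The page's listener has to be registered before the content script sends its query, so the content script should run after the page's scripts have loaded.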
HTH
I need a way to load a website - something like gBrowser.loadURI, window.location or window.open - but I need to execute some more code AFTER that website has been loaded (and parsed by the browser). The functions I've mentioned don't block execution of my code until the site is fully loaded, but only until it has started loading.
In case it matters: This code will not be part of my/a website, but will be a FireGestures script.
https://developer.mozilla.org/en/Code_snippets/Tabbed_browser#Manipulating_content_of_a_new_tab seems to be what you want. They suggest:
var newTabBrowser = gBrowser.getBrowserForTab(gBrowser.addTab("http://www.google.com/"));
newTabBrowser.addEventListener("load", function () {
// use newTabBrowser.contentDocument to manipulate DOM
// or do whatever you want on-load
}, true);
See also docs for tabbrowser and browser.
My app is loading an external javascript file with jQuery.getScript(). When I use the bookmarklet or an extension to start the app everything works fine. When the app is installed through KBX though inside Chrome with the KBX extension the included functions inside the javascript file are not accessible in the callback anymore and I get : Uncaught ReferenceError: myfunc is not defined .
Is there any trick to get access to the included functions?
Bookmarklet : javascript:(function(){var d=document;var s=d.createElement('script');s.text="KOBJ_config={'rids':['a1135x30']};";d.body.appendChild(s);var l=d.createElement('script');l.src='http://init.kobj.net/js/shared/kobj-static.js';d.body.appendChild(l);})()
Chrome extension : crx
url for installation via KBX : app on KBX
Here is the ruleset:
ruleset a1135x30 {
meta {
name "test_external_js_loading"
description <<
debugging external loading in kbx
>>
author "loic devaux"
logging on
}
dispatch {
domain ".*"
}
global {
}
rule first_rule {
select when pageview ".*" setting ()
// pre { }
// notify("Hello World", "This is a sample rule.");
{
emit <|
$K.getScript('http\:\/\/lolo.asia/kynetx_debug/js/myfunc.js',function() {
myfunc();
/*
* myfunc.js content:
myfunc = function(){
console.log('running myfunc');
};
*/
}
);
|>
}
}
}
I'm not completely sure that your issue has to do with the sandboxed environment that the KBX runs your code in but I think it might. Here is a post I wrote about dealing with the sandboxed environment of the KBX http://geek.michaelgrace.org/2011/03/kynetxs-new-sandboxed-browser-extensions/
From blog post
I recently released my “Old School Retweet” Kynetx app in the Kynetx app store for the newly released browser extensions. I super love the new extensions and all that they do for users and developers alike. Something that I forgot when I released the app in the app store is that the new extensions are sandboxed.
Because the extensions are sandboxed, all of the scripts from the extensions run a bit differently than they used to in the previous Kynetx extensions. Without getting into the technical details too much, the previous extensions just injected JavaScript into the page and the new extensions run JavaScript in a sandbox which has access to the DOM but can’t access anything else on the page. Because of this change my retweet app broke since I was using the jQuery loaded by Twitter.com to bring up the new tweet box (I do this because Twitter.com used that library to bind a click event and to trigger that event it has to be from the same library that bound it). Thankfully, with the help of a friend, I was able to get a work around for both Firefox and Chrome’s sandbox environment.
How I did it…
If the app is run not inside a sandbox I can just access the jQuery that Twitter.com loads to open a new tweet box
$("#new-tweet").trigger("click");
From within the Firefox sandbox I can access the page outside of the sandbox
window['$']("#new-tweet").trigger("click");
If I am in the Chrome sandbox I can create a script element that has the JavaScript that I want to execute. Crude, but it works. : )
var trigger_click_script = document.createElement("script");
var fallback = "window['$']('#new-tweet').trigger('click');";
trigger_click_script.innerHTML = fallback;
document.getElementsByTagName("head")[0].appendChild(trigger_click_script);
Here is the JavaScript code that I ended up with that gets executed when a user clicks on the retweet button.
// get stuff to retweet
var tweet = $K(this).parents(".tweet-content").find(".tweet-text").text();
var name = $K(this).parents(".tweet-content").find(".tweet-screen-name").text();
// build tweet
var retweet = "RT #"+name+" "+tweet;
// open new tweet box
$("#new-tweet").trigger("click");
// hack for FF sandbox
if ($("#tweet-dialog:visible").length === 0) {
window['$']("#new-tweet").trigger("click");
}
// put tweet in new tweet box
$K(".draggable textarea.twitter-anywhere-tweet-box-editor").val(retweet).focus();
$K("#tweet_dialog a.tweet-button.button.disabled").removeClass("disabled");
// hack for chrome sandbox
if ($("#tweet-dialog:visible").length === 0) {
var fallback = "window['$']('#new-tweet').trigger('click'); ";
fallback += "window['$']('.draggable textarea.twitter-anywhere-tweet-box-editor').val('"+retweet+"').focus(); ";
fallback += "window['$']('#tweet_dialog a.tweet-button.button.disabled').removeClass('disabled'); ";
var trigger_click_script = document.createElement("script");
trigger_click_script.innerHTML = fallback;
document.getElementsByTagName("head")[0].appendChild(trigger_click_script);
}
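One caveat with the Chrome fallback: concatenating retweet straight into the script source breaks as soon as the tweet contains a single quote or a line break. A possible way around that, sketched here with a hypothetical helper name, is to let JSON.stringify produce the string literals:

```javascript
// Build the source for the injected script element. JSON.stringify turns each
// value into a quoted, escaped JavaScript string literal, so quotes and line
// breaks in the tweet text cannot terminate the string early.
function buildFallbackScript(retweet) {
  var box = '.draggable textarea.twitter-anywhere-tweet-box-editor';
  var button = '#tweet_dialog a.tweet-button.button.disabled';
  return "window['$']('#new-tweet').trigger('click'); " +
    "window['$'](" + JSON.stringify(box) + ").val(" +
    JSON.stringify(retweet) + ").focus(); " +
    "window['$'](" + JSON.stringify(button) + ").removeClass('disabled');";
}

// Hypothetical usage in place of the hand-built `fallback` string:
//   trigger_click_script.innerHTML = buildFallbackScript(retweet);
```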
Another thing that you can do to make your stuff accessible outside of the sandbox is to declare it at the window level (this defeats the purpose of the sandbox and is not recommended). For example: if you want to perform a console.log whilst inside the sandbox, the console.log won't log to the window console. But if you say window.console.log, it will. So, you could (but shouldn't) declare a var the following way:
window.myvar = "MyValue";
That would make the var a window level var. Even though I am preaching against this, I have done it a time or two, for testing.
So... I just did something that worked for both FF and Chrome. It isn't pretty, but none of this really is. It was nice to have one workaround for both instead of having to work differently for FF than Chrome. I needed to get the value from a global object... but the sandbox was blocking that. With this hack I was able to do that one way for both browsers.
First, from within the sandbox... add an invisible div to the bottom of the document.body
$K('body').append('<div id="randomdiv" style="display:none;"></div>');
Then create a script in the document.head that will set the text of the randomdiv to the value that I needed.
var temp = '$("#randomdiv").text(twttr.currentUserScreenName);';
var somescript = document.createElement("script");
somescript.innerHTML = temp;
document.getElementsByTagName("head")[0].appendChild(somescript);
Then... at this point, from within the sandbox, you can select the value from the DOM, rather than from some global js object. This is how you would do it.
var myvar = $K('#randomdiv').text();
Let me know your thoughts. This is what was the easiest for me.