A friend has asked me to capture a client-side rendered website built with React.js, preferably using PhantomJS. I'm using a simple rendering script as follows:
var system = require('system'),
fs = require('fs'),
page = new WebPage(),
url = system.args[1],
output = system.args[2],
result;
page.open(url, function (status) {
if (status !== 'success') {
console.log('FAILED to load the url');
phantom.exit();
} else {
result = page.evaluate(function(){
var html, doc;
html = document.querySelector('html');
return html.outerHTML;
});
if(output){
var rendered = fs.open(output,'w');
rendered.write(result);
rendered.flush();
rendered.close();
}else{
console.log(result);
}
}
phantom.exit();
});
The url is http://azertyjobs.tk
I consistently get an error
ReferenceError: Can't find variable: Promise
http://azertyjobs.tk/build/bundle.js:34
http://azertyjobs.tk/build/bundle.js:1 in t
...
Ok so I figured out that ES6 Promises aren't natively supported by PhantomJS yet, so I tried various extra packages like the following https://www.npmjs.com/package/es6-promise and initiated the variable as such:
var Promise = require('es6-promise').Promise
However this still produces the same error, although Promise is now a function. The output of the webpage is also still as good as empty (obviously..)
Now I'm pretty oldschool, so this whole client-side rendering stuff is kind of beyond me (in every aspect), but maybe someone has a solution. I've tried using a waiting script too, but that brought absolutely nothing. Am I going about this completely wrong? Is this even possible to do?
Much appreciated!
Ludwig
I tried the polyfill you linked and it didn't work, so I switched to core.js and was able to take a screenshot. You need to inject the polyfill before the page is opened:
page.onInitialized = function() {
if(page.injectJs('core.js')){
console.log("Polyfill loaded");
}
}
page.open(url, function (status) {
setTimeout(function(){
page.render('output.jpg');
phantom.exit();
}, 3000);
});
What you need to understand is that there are several parts of a page loading. First there is the HTML - the same thing you see when you "view source" on a web page. Next there are images and scripts and other resources loaded. Then the scripts are executed, which may or may not result in more content being loaded and possible modifications to the HTML.
What you must do then is figure out a way to determine when the page is actually "loaded" as the user sees it. PhantomJS provides a paradigm for you to waitFor content to load. Read through their example and see if you can figure out a method which works for you. Take special note of where they put phantom.exit(); as you want to make sure that happens at the very end. Good luck.
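A minimal sketch of the polling idea behind that waitFor example, assuming the page counts as "loaded" once some test function returns true. The names waitFor, testFn, onReady and onTimeout, and the default timings, are illustrative choices for this sketch rather than the exact code from the PhantomJS example:

```javascript
// Poll testFn until it returns true or timeoutMs elapses.
// All names and default values here are illustrative assumptions.
function waitFor(testFn, onReady, onTimeout, timeoutMs, intervalMs) {
  var limit = typeof timeoutMs === 'number' ? timeoutMs : 3000;
  var step = typeof intervalMs === 'number' ? intervalMs : 250;
  var start = Date.now();

  function check() {
    if (testFn()) {
      onReady();               // condition met: hand control back
    } else if (Date.now() - start >= limit) {
      onTimeout();             // give up once the time limit has passed
    } else {
      setTimeout(check, step); // otherwise try again a little later
    }
  }
  check();
}
```

Inside a PhantomJS script, testFn would typically wrap a page.evaluate call that checks for an element which only exists after React has rendered (which selector to use depends on the site), and both onReady and onTimeout would finish with phantom.exit().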
Where (how) are you trying to initialise Promise? You'll need to create it as a property of window, or use es6-promise as a global polyfill, like this require('es6-promise').polyfill(); or this require('es6-promise/auto'); (from the readme).
Also, what do you mean by "capture"? If you're trying to scrape data, you may have better luck using X-ray. It supports Phantom, Nightmare and other drivers.
Keep in mind that React can also be server rendered. React is like templating, but with live data bindings. It's not as complicated as you're making it out to be.
I'm trying to change a JavaScript src in order to get a "dev environment" for testing some scripts.
(Obviously I can't build a real dev environment; I can't mirror this website on a dev env.)
So I was thinking about manipulating the DOM with PhantomJS and testing the JavaScript with CasperJS. I want to convert (for example) this script
<script type="..." language="..." src="production_path/source.js"></script>
into this one
<script type="..." language="..." src="dev_path/source.js"></script>
before the script starts loading.
I'm trying with
casper.start("http://www.example.com/",function(status){
var scripts = document.getElementsByTagName('script');
casper.each(scripts,function(self,my_script){
//here i would rewrite script url
});
});
casper.run();
but it doesn't work. I'm afraid I have to wait for something, but I don't understand what.
Taking a step back, is it okay to re-phrase your question as: how do I get PhantomJS to load "dev_path/source.js" when it tries to load "production_path/source.js"?
If so, write an onResourceRequested handler, and use the changeUrl function of the networkRequest object.
It will be something like this:
casper.page.onResourceRequested = function(requestData, networkRequest) {
if(requestData.url == 'production_path/source.js'){
console.log("Changing request from production to dev for source.js");
networkRequest.changeUrl('dev_path/source.js');
}
};
Of course in a real situation I'd use a regex replace (as I expect there are multiple URLs to replace).
(Untested, so let me know if it does not work, and I'll look into it more carefully.)
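The regex variant mentioned above could be sketched roughly as follows. It assumes requestData.url holds the full URL of the request, keeps "production_path" and "dev_path" as the placeholders from the question, and uses the hypothetical helper names toDevUrl and onResourceRequested. Like the handler above, the wiring into CasperJS is untested:

```javascript
// Map a production URL to its dev counterpart; the URL is returned
// unchanged when it does not contain the production path.
// 'production_path' and 'dev_path' are placeholders from the question.
function toDevUrl(url) {
  return url.replace(/\/production_path\//, '/dev_path/');
}

// Handler with the (requestData, networkRequest) signature used above.
function onResourceRequested(requestData, networkRequest) {
  var devUrl = toDevUrl(requestData.url);
  if (devUrl !== requestData.url) {
    networkRequest.changeUrl(devUrl);
  }
}

// Wiring, as in the answer above (equally untested):
// casper.page.onResourceRequested = onResourceRequested;
```

Keeping the URL mapping in its own function makes it possible to check the regex without starting a browser.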
Hi, I was wondering whether Node.js and Zombie.js can inject JavaScript files into the headless browser, similar to what you can do with PhantomJS.
For example in phantom js you would do:
page.injectJs("amino/TVI.js")
I have used PhantomJS and it does what I want, but I am testing other options because of the high memory PhantomJS requires.
You can append a script tag to the document object, since Zombie supports the DOM API.
The following example shows how to insert jquery into zombie homepage:
var Browser = require("zombie");
var assert = require("assert");
// Load the Zombie home page
var browser = new Browser();
browser.visit("http://zombie.labnotes.org/", function () {
assert.ok(browser.success);
// append script tag
var injectedScript = browser.document.createElement("script");
injectedScript.setAttribute("type","text/javascript");
injectedScript.setAttribute("src", "http://code.jquery.com/jquery-1.11.0.min.js");
browser.body.appendChild(injectedScript);
browser.wait(function(window) {
// make sure the new script tag is inserted
return window.document.querySelectorAll("script").length == 4;
}, function() {
// jquery is ready
assert.equal(browser.evaluate("$.fn.jquery"), "1.11.0");
console.log(browser.evaluate("$('title').text()"));
});
});
Try to think about it the other way around. In Zombie you already have everything at hand to inject whatever you want.
For example: that.browser.window points to the jsdom window that all of your site's JavaScript uses as a base. So you can access the DOM and all other window objects in the page already loaded.
I don't know what you want to achieve with injecting. You should not use it for testing anyway, but it looks like that is not your actual goal.
So, as a sort of exercise for myself, I'm writing a little async script loader utility (think require.js, head.js, yepnope.js), and have run across a little bit of a conundrum. First, the basic syntax is like this:
using("Models/SomeModel", function() {
//callback when all dependencies loaded
});
Now, I want to know, when this call is made, what file I'm in. I could do it with an ajax call, so that I can mark a flag after the content loads, but before I eval it to mark that all using calls are going to be for a specific file, then unset the flag immediately after the eval (I know eval is evil, but in this case it's javascript in the first place, not json, so it's not AS evil). I'm pretty sure this would get what I need, however I would prefer to do this with a script tag for a few reasons:
It's semantically more correct
Easier to find scripts for debugging (unique file names are much easier to look through than anonymous script blocks and debugger statements)
Cross-domain requests. I know I could try to use XDomainRequest, but most servers aren't going to be set up for that, and I want the ability to reference external scripts on CDN's.
I tried something that almost got me what I needed. I keep a list of every time using is called. When one of the scripts loads, I take any of those using references and incorporate them into the correct object for the file that just loaded, and clear the global list. This actually seems to work alright in Firefox and Chrome, but fails in IE because the load events seem to go off at weird times (a jQuery reference swallowed a reference to another type and ended up showing it as a dependency). I thought I could latch on to the "interactive" readystate, but it doesn't appear to ever happen.
So now I come asking if anybody here has any thoughts on this. If y'all want, I can post the code, but it's still very messy and probably hard to read.
Edit: Additional usages
//aliasing and multiple dependencies
using.alias("ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js", "jQuery");
using(["jQuery", "Models/SomeModel"], function() {
//should run after both jQuery and SomeModel have been loaded and run
});
//css and conditionals (using some non-existent variables here)
using.css({ src: "IEFix", conditionally: browser === "MSIE" && version < 9 });
//should include the IEFix.css file if the browser is IE8 or below
and to expound more on my response below, consider this to be file A (and consider the jquery alias from before to be there still):
using(["jQuery", "B"], function() {
console.log("This should be last (after both jQuery and B have loaded)");
console.log(typeof($));
});
Then this would be B:
using("C", function() {
console.log("This should be second");
});
And finally, C:
console.log("This should be first");
The output should be:
This should be first
This should be second
This should be last (after both jQuery and B have loaded)
function
Commendable that you are taking on such an educational project.
However, you won't be able to pull it off quite the way you want to do it.
The good news is:
No need to know what file you are in
No need to mess with eval.
You actually have everything you need right there: A function reference. A callback, if you will.
A rough P-code for your using function would be:
function using(modules, callback) {
var loadedModules = []
// This will be an ajax call to load things, several different ways to do it..
loadedModules[0] = loadModule(modules[0]);
loadedModules[1] = loadModule(modules[1]);
// Great, now we have all the modules
// null = value for `this`
callback.apply(null, loadedModules);
}
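To make the P-code above concrete, here is a small sketch that swaps the Ajax loading for an in-memory registry, which is an assumption of this example and not part of the answer. The register function and the registry and cache objects are hypothetical names. It accepts a single name or an array, and it does no cycle detection:

```javascript
// In-memory stand-in for real script loading (an assumption of this sketch):
// each module name maps to a factory that runs once and returns its exports.
var registry = {};
var cache = {};

function register(name, factory) {
  registry[name] = factory;
}

function loadModule(name) {
  if (!(name in cache)) {
    if (!(name in registry)) {
      throw new Error('Unknown module: ' + name);
    }
    cache[name] = registry[name]();
  }
  return cache[name];
}

function using(modules, callback) {
  // Accept both using("A", fn) and using(["A", "B"], fn).
  var names = Array.isArray(modules) ? modules : [modules];
  var loadedModules = names.map(loadModule);
  // null = value for `this`
  callback.apply(null, loadedModules);
}

// Mirrors the A / B / C files from the question.
var log = [];
register('C', function () { log.push('first'); return {}; });
register('B', function () {
  using('C', function () { log.push('second'); });
  return {};
});
using('B', function () { log.push('last'); });
// log is now ['first', 'second', 'last']
```

With real asynchronous loading the callback could only run once every dependency has reported back, so the synchronous map call would have to become a counter or a chain of promises.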
My site uses pushState to load pages. I have one issue, I want to use javascript on one of the pages but can't because it loads everything with AJAX. So what do I do? I've been told something about "parseScript" but I can't find enough information on it.
--Example--
I load using AJAX
On my page I have this script:
<script type="text/javascript">
function go(){
alert('1');
}
</script>
GO!!!
Nothing happens.
--Edit--
If I open up Google Chrome's debugger:
"Uncaught ReferenceError: go is not defined"
And the <script> tag is nowhere to be found
Browsers don't seem to parse <script> element content that's added to the document via targetElement.innerHTML. That's probably what you're running into.
The best solution is to use a well-tested framework like jQuery for solving problems like this. They've already figured out how to safely and correctly inject scripts into the DOM. There's no sense re-inventing the wheel unless you absolutely can't spare the bandwidth for the library.
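For readers who cannot pull in a library, a hand-rolled variant might look roughly like the sketch below. It assumes the Ajax response contains only simple, well-formed inline script blocks (a regex is not a full HTML parser, and scripts with a src attribute are not handled). The function names extractScripts and injectInto are hypothetical:

```javascript
// Split an Ajax response into markup and inline script bodies.
// Assumption: only simple, well-formed inline <script> blocks occur.
function extractScripts(responseText) {
  var scripts = [];
  var html = responseText.replace(
    /<script\b[^>]*>([\s\S]*?)<\/script>/gi,
    function (match, body) {
      scripts.push(body);
      return '';
    }
  );
  return { html: html, scripts: scripts };
}

// Browser-only part: insert the markup, then run each script by
// appending a freshly created <script> element to the document.
function injectInto(target, responseText) {
  var parts = extractScripts(responseText);
  target.innerHTML = parts.html;
  parts.scripts.forEach(function (code) {
    var el = document.createElement('script');
    el.text = code;
    document.head.appendChild(el);
  });
}
```

Functions declared in a script that is run this way end up in the global scope, so an onclick="go()" attribute in the loaded markup can find them.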
One way you might fix this is by separating the JavaScript from the HTML in the Ajax response, either by issuing two requests (probably slower) or by structuring your JavaScript and HTML within a JSON object (probably harder to maintain).
Here's an example:
<script>
function load_content(){
var req = new XMLHttpRequest();
req.open("GET", "ajax.json", true);
req.onreadystatechange = function (e){
if (req.readyState === 4){
if (req.status === 200){
// these three lines inject your JavaScript and
// HTML content into the DOM
var json = JSON.parse(req.responseText);
document.getElementById("target").innerHTML = json.html;
eval(json.js);
} else {
console.log("Error", req.statusText);
}
}
};
req.send(null);
}
</script>
Load more stuff
<div id="target"></div>
The document ajax.json on the server looks like this:
{
"js": "window.bar = function (){ console.log(\"bar\"); return false; }",
"html": "<p>Log a message</p>"
}
If you choose this route, you must either:
namespace your functions: MyApp.foo = function (){ ... };, or
explicitly add your functions to the global namespace: window.foo = function (){ ... };.
This is because eval executes in the current scope, so your function definitions inherit that scope and won't be globally available. In my example, I chose the latter option since it's just a trivial example, but you should be aware of why this is necessary.
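The scoping behaviour described above can be checked with a short sketch. It uses globalThis in place of window so that it also runs outside a browser. The function name runResponseScript is hypothetical and stands in for the onreadystatechange handler that calls eval:

```javascript
// Stands in for the handler that evals the "js" part of the response.
function runResponseScript(code) {
  // Direct eval: declarations inside `code` stay local to this call.
  eval(code);
}

runResponseScript('function foo() { return "foo"; }');
runResponseScript('globalThis.bar = function () { return "bar"; };');

// foo was declared inside the eval and never reached the global object.
var fooVisible = typeof globalThis.foo === 'function'; // false
// bar was attached explicitly, so it is reachable from anywhere.
var barVisible = typeof globalThis.bar === 'function'; // true
```

In a browser, window.bar = ... has the same effect as the globalThis assignment here.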
Please make sure to read When is JavaScript's eval() not evil? if you decide to implement this yourself.
I think it would be helpful to have a little more detail as to how the Ajax call is made and the content is loaded. That said, a few things of note:
the syntax for javascript:void() is invalid. It should be javascript:void(0). For that matter, using javascript:void(0) on the href of an anchor tag is generally bad practice. Some browsers do not support it. If you must use an <a> tag, set the href to # and add "return false;" to the click event.
you should use a button tag instead of the a tag in this case anyway.
given what you have provided, it should work (aside from the syntax error with void())
If I were to do this I would use jQuery's load call.
That takes care of making the Ajax call and of handling script/noscript elements in the response.
If you don't want to use jQuery, I would suggest you look up what the jQuery load method does and implement the same thing as an event handler for your Ajax call.
I'm quite sure this is a common question, but I'm pretty new to JS and am having some trouble with this.
I would like to load x.html into a div with id "y" without using iframes. I've tried a few things, searched around, but I can't find a decent solution to my issue.
I would prefer something in JavaScript if possible.
Wow, from all the framework-promotional answers you'd think this was something JavaScript made incredibly difficult. It isn't really.
var xhr= new XMLHttpRequest();
xhr.open('GET', 'x.html', true);
xhr.onreadystatechange= function() {
if (this.readyState!==4) return;
if (this.status!==200) return; // or whatever error handling you want
document.getElementById('y').innerHTML= this.responseText;
};
xhr.send();
If you need IE<8 compatibility, do this first to bring those browsers up to speed:
if (!window.XMLHttpRequest && 'ActiveXObject' in window) {
window.XMLHttpRequest= function() {
return new ActiveXObject('MSXML2.XMLHttp');
};
}
Note that loading content into the page with scripts will make that content invisible to clients without JavaScript available, such as search engines. Use with care, and consider server-side includes if all you want is to put data in a common shared file.
jQuery .load() method:
$("#y").load("x.html");
Using fetch
<script>
fetch('page.html')
.then(response=> response.text())
.then(text=> document.getElementById('elementID').innerHTML = text);
</script>
<div id='elementID'> </div>
fetch needs an http or https URL, which means it won't work for a page opened directly from the local file system (file://).
Note: As Altimus Prime said, it is a feature for modern browsers
2021
Two possible changes to thiagola92's answer.
async await - if preferred
insertAdjacentHTML over innerHTML (faster)
<script>
async function loadHtml() {
const response = await fetch("page.html")
const text = await response.text()
document.getElementById('elementID').insertAdjacentHTML('beforeend', text)
}
loadHtml()
</script>
<!-- ... -->
<div id='elementID'> </div>
I'd suggest getting into one of the JS libraries out there. They ensure compatibility so you can get up and running really fast. jQuery and DOJO are both really great. To do what you're trying to do in jQuery, for example, it would go something like this:
<script type="text/javascript" language="JavaScript">
$.ajax({
url: "x.html",
context: document.body,
success: function(response) {
$("#yourDiv").html(response);
}
});
</script>
document.getElementById("id").innerHTML='<object type="text/html" data="x.html"></object>';
There was a way to achieve this in the past, but it was removed from the specification, and subsequently, from browsers as well (e.g. Chrome removed it in Chrome 70). It was called HTML imports and it originally was part of the web components specs.
Currently folks are working on a replacement for this obviously lacking platform feature, which will be called HTML modules. Here's the explainer, and here's the Chrome platform status for this feature. There is no milestone specified yet as of when this feature will land.
Chances are the syntax is going to look similar to this:
import { content } from "file.html";
Resolving the remaining issues with HTML modules might, I assume, take quite some time, so until then the only viable options you have are
either your build stack resolve the issue for you (e.g. with webpack-raw-loader (Webpack 4), or with asset modules (Webpack 5)),
or to rely on async fetch to get the job done (which might result in a less-than-optimal performance experience).
We already have JSON modules and CSS module scripts (which both were sorely missing features for a long time as well).
http://www.boutell.com/newfaq/creating/include.html
This explains how to write your own client-side include, but jQuery is a much easier option, and you will gain a lot more by using jQuery anyway.