PhantomCSS/CasperJS - Greying out advertisement images - javascript

Hey guys, just testing our pages out using the grunt-phantomcss plugin (it's essentially a wrapper for PhantomJS & CasperJS).
We have some stuff on our sites that comes in dynamically (random profile images for users and random advertisements), so technically the page looks different each time we load it, meaning the build fails. We would like to be able to jump in with good ol' DOM API techniques and 'grey out'/make opaque these images so that Casper/Phantom doesn't see them and passes the build.
We've already looked at pageSettings.loadImages = false and although that technically works, it also takes out every image, meaning that even our non-ad, non-profile images get filtered out.
Here's a very basic sample test script (it doesn't work):
casper.start( 'http://our.url.here.com' )
.then(function(){
    this.evaluate(function(){
        var profs = document.querySelectorAll('.profile');
        profs.forEach(function( val, i ){
            val.style.opacity = 0;
        });
        return;
    });
    phantomcss.screenshot( '.profiles-box', 'profiles' );
});
Would love to know how other people have solved this, because I'm sure this isn't a strange use case (so many people have dynamic ads on their sites).

Your script might actually work, except that profs is a NodeList, which doesn't have a forEach method in PhantomJS's engine. Use this instead:
var profs = document.querySelectorAll('.profile');
Array.prototype.forEach.call(profs, function( val, i ){
    val.style.opacity = 0;
});
It is always a good idea to register handlers for the page.error and remote.message events to catch errors like this.
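For example (a minimal sketch - the handler bodies are only illustrative):
casper.on("page.error", function(msg, trace){
    // JavaScript errors thrown inside the remote page
    this.echo("Page error: " + msg, "ERROR");
});
casper.on("remote.message", function(msg){
    // console.log() output from the remote page
    this.echo("Remote console: " + msg);
});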
Another idea would be to use the resource.requested event handler to abort all the resources that you don't want loaded. It uses the underlying onResourceRequested PhantomJS callback.
casper.on("resource.requested", function(requestData, networkRequest){
if (requestData.url.indexOf("mydomain") === -1) {
// abort all resources that are not on my domain
networkRequest.abort();
}
});
If your page handles unloaded resources well, then this should be a viable option.
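A variation on the same idea would be to abort only the requests matching the ad and profile-image URLs, so the rest of your images still load (the URL patterns here are hypothetical - substitute your real ad server / profile image paths):
casper.on("resource.requested", function(requestData, networkRequest){
    // abort only dynamic content: ads and randomized profile images
    if (/adserver\.example\.com|\/profile-images\//.test(requestData.url)) {
        networkRequest.abort();
    }
});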


How to make this script resume execution after page reload?

Preliminary context sharing
I am asked to manually perform a very repetitive action on a website that I do not own and for which I do not have any API access.
The only hope I have of automating these actions is to write some JavaScript and execute it in the browser, just to automate the actions I would otherwise be doing manually.
Apologies in advance if this question already has an answer somewhere else; I'm a backend developer, and with my limited knowledge of front-end I didn't manage to find an equivalent.
Explanation of the issue
Say I have to post several entries, one by one, into a form. I have written the following code (oversimplified just for demonstration purposes):
//This array of JSON objects is produced by an upstream service
var inputs = [
    {
        ...
    },
    {
        ...
    },
    {
        ...
    }
];
for (var i = 0; i < inputs.length; i++) {
    fillSomeForms(inputs[i]);
    clickSubmit(); //<-- this will make the page reload, and so the script execution stops
}
The problem that I have here is very basic: after the first iteration of the loop, when I invoke clickSubmit(), the page reloads (because the submission is a POST followed by a redirect to a "submit next" page), and so the JS stops executing.
I have tried to look around the web for similar issues, and I've seen people tweaking localStorage in order to resume the execution of their script.
However, that seems to assume the script is a resource of the front-end code, which is not the case for me (I don't own the code; I simply inject this JS into the browser's developer console and execute it to save some time).
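For reference, the localStorage pattern I found looks roughly like this (just a sketch; it assumes the script runs again on every page load, e.g. as a userscript, with the same hypothetical fillSomeForms/clickSubmit helpers as above):
// read how far we got on the previous page load
var i = Number(localStorage.getItem("submitIndex") || 0);
if (i < inputs.length) {
    // persist progress before the POST/redirect wipes the script out
    localStorage.setItem("submitIndex", i + 1);
    fillSomeForms(inputs[i]);
    clickSubmit(); // page reloads; on the next load the script resumes at i + 1
}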
Is there any way to achieve this? I am not necessarily looking for a clean solution, just something that gets the job done and spares us some monkey work (nothing of what I'm doing here is clean, but the system administrators do not want to give us access to the REST APIs that the platform actually provides for this).
When you inject into the console, load a copy of the page into an iframe, and submit your forms from that copy:
const inputs = [ /* a convenient inputs array */ ];
const pageCopy = document.body.appendChild( document.createElement( "iframe" ) );
pageCopy.addEventListener( "load", () => {
    //The page copy has finished loading / reloading, let's submit more stuff
    if( inputs.length > 0 ) {
        const moreInput = inputs.pop();
        console.log( "Submitting inputs: ", moreInput );
        //this shouldn't work, but let's clone the current DOM into the iframe...
        pageCopy.contentDocument.body.parentElement.innerHTML =
            document.body.parentElement.innerHTML;
        fillSomeFormsInPageCopy( pageCopy.contentDocument, moreInput );
        pageCopy.contentDocument.querySelector( "#submitButtonId" ).click();
        console.log( "Clicked submit. Will wait for iframe to finish reloading..." );
        //Okay, we clicked and the iframe is reloading. This event will fire again as soon as it's done reloading, ready to submit more form data
    }
    else if( inputs.length === 0 ) {
        console.log( "Finished submitting all the inputs in the array!" );
    }
} );
pageCopy.src = document.location.href;
Please understand I can't test this code. (I'm not even sure the click() event can be fired across an iframe boundary, for security, but I hope it can.)
Hopefully you can understand how to use the pageCopy's document to find your form elements and set their values. E.g., you can use
pageCopy.contentDocument.getElementById( "form-entry-id-1" ).value =
    moreInput[ "form-entry-id-1" ];
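For illustration, a hypothetical fillSomeFormsInPageCopy could be as simple as this (assuming the input objects are keyed by form element id):
function fillSomeFormsInPageCopy( doc, moreInput ) {
    // set every field of the copied document whose id matches a key in the input object
    Object.keys( moreInput ).forEach( function ( id ) {
        doc.getElementById( id ).value = moreInput[ id ];
    } );
}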
In case it helps someone in the future: I was finally able to work around the problem by opening a new tab (and working in that tab) per iteration of my loop.
Something like this:
while (inputs.length > 0) {
    const singleInput = inputs.pop();
    const newWindow = window.open('about:blank', '_blank');
    newWindow.addEventListener('load', () => {
        newWindow.document.body.parentElement.innerHTML = document.body.parentElement.innerHTML;
        fillForm(newWindow.document, singleInput); //<-- fillForm uses the document passed as a parameter to perform the different gets/sets
        newWindow.document.getElementById("submit-button").click();
    });
}

Track Links Using Events - Race Conditions

I am building AngularJS applications which share a common header with links to each application:
App1
App2
Each application is running on its own subdomain, and when the user clicks a link in the header, the page redirects to that application.
I have to track user actions on the links, e.g. onClick events, with Omniture (but the problem applies to Google Analytics as well). When I add an onClick event that calls a function to send the event to Omniture, e.g.:
App1
trackLink() is a function of an AngularJS service; here is a brief implementation:
trackLink: function (eVar8Code) {
    s = this.getSVariable(s);
    s.eVar8 = eVar8Code;
    s.prop28 = s.eVar8;
    this.sendOmnitureMessage(s, send, false);
    return s;
},
The function executes asynchronously and returns right away. Then the link's standard behaviour kicks in: the page is redirected to the URL defined in the href attribute. The new page loads very quickly (around 70 ms), but by then the AJAX request to Omniture has not yet been executed: it's all async.
I believe that using events for the links is the incorrect approach; one should rather use query parameters, e.g.:
App1
but it's hard to convince some people.
What is good practice for tracking events on links?
Change your function to include a short timeout (probably you'd let it return false to suppress the default link behaviour, too, and redirect via the location object).
Google Analytics has hit callbacks which are executed after the call to Google has been sent; you might want to check whether Adobe Analytics has something similar (as this can be used for redirects after the tracking call has been made).
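A minimal sketch of that timeout approach, reusing the trackLink() function from the question (the function name, the handler wiring and the 200ms delay are assumptions, not Adobe API):
function trackAndNavigate(event, eVar8Code) {
    event.preventDefault(); // suppress the default link navigation
    var href = event.currentTarget.href;
    trackLink(eVar8Code); // fire the async tracking request
    setTimeout(function () {
        window.location.href = href; // redirect after tracking has had time to complete
    }, 200);
}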
Whether event tracking and query parameters are interchangeable depends on your use case (they certainly measure different things). However, event tracking is a well-accepted way to do link tracking.
As @Eike Pierstorff suggested, I used the capabilities of the Adobe Analytics native library to set a delay (200ms), which gives the call to Adobe Analytics a much better chance of succeeding:
in HTML:
App1
in AngularJS service:
sendOmnitureMessageWithDelay: function (s, element, eVar8Code) {
    var s = s_gi(s_account); // jshint ignore:line
    s.useForcedLinkTracking = true;
    s.forcedLinkTrackingTimeout = 200; // Max number of milliseconds to wait for tracking to finish
    s.linkTrackVars = 'eVar8,prop28';
    s.eVar8 = eVar8Code;
    s.prop28 = eVar8Code;
    var target = element;
    if (!target) {
        target = true;
    }
    s.tl(target, 'o', s.eVar8, null, 'navigate');
    this.cleanOmnitureVars();
}
Here, element is the anchor element from the HTML above.
It works pretty well in 99% of cases but has issues on slow, old devices, where the page loads before the call to Adobe has been made. It appears that there is no good solution to this problem, and there is no guarantee that events will always be recorded in Adobe Analytics (or Google Analytics).

Bug in my lazyload plugin for mootools

I want to implement a plug-in for serial downloading of pictures in MooTools. Let's say there are pictures with the img tag inside a div with the class imageswrapper. I need to download each image in sequence: after one loads, the next begins, and so on until all the images are loaded.
window.addEvent('domready', function(){
    // get all images in div with class 'imageswrapper'
    var imagesArray = $$('.imageswrapper img');
    var tempProperty = '';
    // hide them by moving 'src' to the 'data-src' attribute to cancel the background download
    for (var i=0; i<imagesArray.length; i++) {
        tempProperty = imagesArray[i].getProperty('src');
        imagesArray[i].removeProperty('src');
        imagesArray[i].setProperty('data-src', tempProperty);
    }
    tempProperty = '';
    var iterator = 0;
    // select the block in which we will inject pictures
    var injDiv = $$('div.imageswrapper');
    // recursive function that executes itself after a new image is loaded
    function imgBomber() {
        // exit condition of the recursion
        if (iterator > (imagesArray.length-1)) {
            return false;
        }
        tempProperty = imagesArray[iterator].getProperty('data-src');
        imagesArray[iterator].removeProperty('data-src');
        imagesArray[iterator].setProperty('src', tempProperty);
        imagesArray[iterator].addEvent('load', function() {
            imagesArray[iterator].inject(injDiv);
            iterator++;
            imgBomber();
        });
    }
    imgBomber();
});
There are several issues I can see here. You have not actually said what the issue is, so this is more of a code review / ideas for you until you post the actual problems with it (or a jsfiddle):
You run this code in domready, where the browser may have already initiated the download of the images based upon the src property. You would be better off sending data-src from the server directly, before you even start.
Probably the biggest problem: var injDiv = $$('div.imageswrapper'); will return a COLLECTION - so [<div.imageswrapper></div>, ...] - which cannot take an inject, since the target can be multiple DOM nodes. Use var injDiv = document.getElement('div.imageswrapper'); instead.
There are cross-browser issues with the load events and .addEvent('load'). The handlers need to be cleaned up after execution: in IE < 9, load will fire every time an animated gif loops, for example. Also, you don't have onerror and onabort handlers, which means your loader will stop at a 404 or any other unexpected response.
You should not use data-src to store the data; it's slow. MooTools has Element storage - use el.store('src', oldSource), el.retrieve('src') and el.eliminate('src'). Much faster (see the sketch after this list).
You expose the iterator to the upper scope.
Use the MooTools API - use .set() and .get(), not .getProperty() and .setProperty().
for (var i) iterators are unsafe for async operations. The control flow of the app will continue to run, and different operations may reference the wrong iterator index. Looking at your code this shouldn't be the case here, but you should use the MooTools .each(fn(item, index), scope) method from Elements / Array.
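For example, the element storage pattern mentioned above might look like this (a sketch using the standard MooTools store/retrieve/eliminate API):
$$('.imageswrapper img').each(function(el){
    el.store('src', el.get('src')); // stash the original source in element storage
    el.erase('src'); // drop the attribute so the browser stops downloading
});
// later, when it is this image's turn to load:
// el.set('src', el.retrieve('src'));
// el.eliminate('src'); // clean up the storage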
Anyway, your problem has already been solved in several places.
E.g., I wrote pre-loader - a framework-agnostic image loader plugin that can download an array of images either in parallel or pipelined (like you are trying to do), with onProgress etc. events - see http://jsfiddle.net/dimitar/mFQm6/ and the screenshots at the bottom of the readme.md.
MooTools also solves this (without waiting on the previous image) via Asset.js - see http://mootools.net/docs/more/Utilities/Assets#Asset:Asset-image, and Asset.images for multiple images. See the source for inspiration: https://github.com/mootools/mootools-more/blob/master/Source/Utilities/Assets.js
Here's an example doing this via my pre-loader class: http://jsfiddle.net/dimitar/JhpsH/
(function(){
    var imagesToLoad = [],
        imgDiv = document.getElement('div.injecthere');

    $$('.imageswrapper img').each(function(el){
        imagesToLoad.push(el.get('src'));
        el.erase('src');
    });

    new preLoader(imagesToLoad, {
        pipeline: true, // sequential loading like yours
        onProgress: function(img, imageEl, index){
            imgDiv.adopt(imageEl);
        }
    });
}());

PJax - how do I turn off the modified behaviour

I've got PJax up and running on my test site - it works a treat. However, it relies heavily on a lot of JavaScript widgets and hence leaks memory.
Since I don't have time right now to rewrite every widget, I thought a simple solution would be to do a normal page load after, say, 20 pjax page transitions. A simple plan... but it doesn't seem to be possible.
$.pjax.disable();
...still fetches the content via AJAX, but doesn't change the page.
$(document).pjax();
...doesn't change the behaviour.
$.pjax.handleClick = function (event, container, options) { return; };
...doesn't change the behaviour.
$.pjax.state.timeout = 0;
...doesn't change the behaviour.
delete $.pjax;
...breaks navigation.
$.pjax.defaults.timeout = 0;
...doesn't change the behaviour.
How do I suspend pjax?
If you add a listener for pjax:beforeSend, you can capture the requested URL, set location.href yourself, and return false to cancel the pjax behaviour. That is how I'm doing it, with the following code:
var pageLoadCounter = 0;
var MAX_PAGE_LOADS = 20;

$(".pjaxContainer").on("pjax:beforeSend", function (e, xhr, settings) {
    if (++pageLoadCounter > MAX_PAGE_LOADS) {
        // URI can be found at https://github.com/medialize/URI.js
        var uri = URI(settings.url);
        // Remove _pjax from query string before reloading
        uri.removeSearch("_pjax");
        location.href = uri.toString();
        return false;
    }
});
I've discovered that changing the id of the pjax container div gives me the desired result - although this seems like a bit of a kludge. It should also be possible by changing the timeout of the AJAX request to 0 - but I still need to work out how to do that.
I did ask on the PJax github page about this but so far have not received a response.

How does the javascript preloading work?

I don't want to know a way to preload images - I found plenty on the net - but I want to know how it works.
How is JavaScript able to preload images?
I mean, I tried a snippet from here, and even if it works, it doesn't seem to preload images.
When I check Firebug, I can see that the image is loaded twice: once during the preloading, and another time when displaying it!
To improve this code I'd like to know how it works.
Here is what I do:
function preload(arrayOfImages) {
    $(arrayOfImages).each(function(){
        $('<img/>')[0].src = this;
        //(new Image()).src = this;
        alert(this +' && ' + i++);
    });
}
Then I do something like this:
preloader = function() {
    preload(myImages);
};
$(document).ready(preloader);
Here is how I display/add the image:
$("li.works").click(function() {
    $("#viewer").children().empty();
    $('#viewer').children().append('<img src=\'images/ref/'+this.firstChild.id+'.jpg\' alt="'+this.firstChild.id+'" \/>');
    $("#viewer").children().fadeIn();
});
Your basic Javascript preloader does this:
var image = new Image();
image.src = '/path/to/the/image.jpg';
The way it works is simple: by creating a new Image object and setting its src, the browser goes and grabs the image. We're not adding this particular image to the page, but when the time comes to show the image via whatever method we have set up, the browser will already have it in its cache and will not fetch it again. I can't really tell you why whatever you have isn't working this way without looking at the code, though.
One interesting gotcha that is discussed in this question is what happens when you have an array of images and try preloading them all by using the same Image object:
var images = ['image1.jpg','image2.jpg'];
var image = new Image();
for(var x = 0; x < images.length; x++) {
    image.src = images[x];
}
This will only preload the last image, as the rest will not have time to preload before the loop comes around again and changes the source of the object. View an example of this: you should be able to see the second image instantly once you click the button, but the first one will have to load, as it didn't get a chance to preload before you tried to view it.
As such, the proper way to do many at once would be:
var images = ['image1.jpg','image2.jpg'];
for(var x = 0; x < images.length; x++) {
    var image = new Image();
    image.src = images[x];
}
Javascript preloading works by taking advantage of the caching mechanism used by browsers.
The basic idea is that once a resource is downloaded, it is stored for a period of time locally on the client machine so that the browser doesn't have to retrieve the resource again from across the net, the next time it is required for display/use by the browser.
Your code is probably working just fine and you're just misinterpreting what Firebug is displaying.
To test this theory, just hit www.google.com with a clean cache, i.e. clear your download history first.
The first time through, everything will likely have a status of 200 OK, meaning your browser requested the resource and the server sent it. If you look at the bottom of the Firebug window it will say how big the page was, say 195Kb, and how much of that was pulled from cache - in this case, 0Kb.
Then reload the same page without clearing your cache, and you will still see the same number of requests in Firebug.
The reason for this is simple enough: the page hasn't changed and still needs all the same resources it needed before.
What is different is that for the majority of these requests the server returned a 304 Not Modified status, so the browser checked its cache to see if it already had the resource stored locally, which in this case it did from the previous page load. So the browser just pulled the resource from the local cache.
If you look at the bottom of the Firebug window you will see that the page size is still the same (195Kb), but that the majority of it, in my case 188Kb, was pulled locally from cache.
So the cache did work, and the second time I hit Google I saved 188Kb of download.
I'm sure you will find the same thing when preloading your images. The request is still made, but if the server returns a status of 304 then you will see that the image is in fact just pulled from the local cache and not the net.
So with caching, the advantage is NOT that you kill off all future resource requests - a URI lookup is still made to the net - but rather that, where possible, the browser will pull from the local cache to satisfy the need for the content rather than go around the net looking for it.
You may be confused by the concept of "preloading". If you have a bunch of images in your HTML with <img src="...">, they cannot be preloaded with JavaScript; they just load with the page.
Preloading images with JavaScript is about loading images that are not already in the document source and displaying them later. They are loaded after the page has rendered for the first time, and they are preloaded in order to eliminate/minimize loading time when it comes to making them appear, for example when changing an image on mouse rollover.
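A small sketch of that rollover case (the image path and element id are hypothetical):
var hover = new Image();
hover.src = 'images/button-hover.jpg'; // preload the hover state once, up front

$('#button').on('mouseover', function(){
    this.src = hover.src; // swap is instant: the file is already in the browser cache
});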
For most applications, it is usually better practice to use "CSS sprites" as a form of preloading, in lieu of Javascript. SO should have a ton of questions on this.
It just involves making a new DOM image object and setting the src attribute - nothing clever, and AFAIK it has always worked for me.
Is it possible that the second "load" Firebug is showing you is the image being loaded from cache?
The index on the loop is only looking at the first image. Change it to use the index:
function preload(arrayOfImages) {
    $(arrayOfImages).each(function(i){ // Note the argument
        $('<img/>')[i].src = this; // Note the i
        //(new Image()).src = this;
        alert(this +' && ' + i++);
    });
}
Edit: In retrospect, this was wrong, and I can see you're trying to create image elements. I don't understand why the index is there at all; there need not be an index. I think the function should look like this:
function preload(arrayOfImages) {
    $(arrayOfImages).each(function () {
        $('<img/>').attr('src', this);
    });
}
And to instantiate it, why not just do this:
$(function () { // Equivalent to $(document).ready()
    preload(myImages);
});
JavaScript image preloading works because when a DOM element that contains an image is created, the image is downloaded and cached. Even if another request is made when the image is actually rendered from the HTML, the server will send back a 304 (Not Modified), and the browser will simply load the image from its cache.
Paolo suggests using the following notation to create an image object:
var image = new Image();
While this will work, the DOM-compliant way of doing this is:
var image = document.createElement('img');
image.setAttribute('src', 'path/to/image.jpg');
This is the way it is being done in the script, except it uses jQuery's HTML string literal syntax to do it. Additionally, most modern browsers offer compatibility with the Image() constructor by simply calling DOM-standard methods. For example, if you open the Google Chrome JavaScript console and type Image, this is what you'll get:
function Image() {
    return document.createElementNS('http://www.w3.org/1999/xhtml', 'img');
}
Chrome merely uses the native DOM methods to create an image element.
