How should my site handle occasionally missing JavaScript files gracefully?

Say I've got this script tag on my site (borrowed from SO).
<script type="text/javascript" async="" src="http://edge.quantserve.com/quant.js"></script>
If edge.quantserve.com goes down or stops responding without returning a 404, won't SO have to wait for the timeout before the rest of the page loads? I'm thinking Chaos Monkey shows up and blasts a server that my site is depending on, a server that isn't part of a CDN and has a poor failover.
What's the industry standard way to handle this issue? I couldn't find a dupe on SO, maybe I'm searching for the wrong terms.
Update: I should have looked a bit more closely at the SO code, there's this at the bottom:
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-5620270-1']);
_gaq.push(['_setCustomVar', 2, 'accountid', '14882', 2]);
_gaq.push(['_trackPageview']);
var _qevents = _qevents || [];
(function () {
    var s = document.getElementsByTagName('script')[0];
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = 'http://www.google-analytics.com/ga.js';
    s.parentNode.insertBefore(ga, s);
    var sc = document.createElement('script');
    sc.type = 'text/javascript';
    sc.async = true;
    sc.src = 'http://edge.quantserve.com/quant.js';
    s.parentNode.insertBefore(sc, s);
})();
</script>
OK, so rather than a static tag, it's creating the script elements dynamically with async=true, so a slow or dead server can't block the rest of the page. Maybe that's the trick.
Possible answer: https://stackoverflow.com/a/1834129/30946

Generally, it's tricky to do it well and cross-browser.
Some proposals:
Move the script to the very bottom of the HTML page (so that almost everything is displayed before you request that script)
Move it to the bottom and wrap it in <script>document.write("<scr"+"ipt src='http://example.org/script.js'></scr"+"ipt>")</script> or the way you added after update (document.createElement('script'))
A last option is to load it via XHR (but this works only same-domain, or cross-domain only if CORS is enabled on the third-party server); you can then use the timeout property of the XHR (for IE and Fx12+), and in the other browsers use setTimeout and check the XHR's readyState. It's convoluted and very non-cross-browser for now, so option 2 looks the best.

Make a copy of the file on your server and use it as a fallback; the snippet below loads your copy only if the one from the CDN failed to load:
<script src="http://edge.quantserve.com/quant.js"></script>
<script>window.quant || document.write('<script src="js/quant.js"><\/script>')</script>
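A variant of the same idea, sketched with an explicit error handler instead of the global check (the js/quant.js path is carried over from the snippet above; note that older browsers don't fire onerror for script tags, so the global check is the more portable test):
<script>
var remote = document.createElement('script');
remote.async = true;
remote.src = 'http://edge.quantserve.com/quant.js';
remote.onerror = function () {
    // CDN copy failed to load; fall back to the local copy
    var local = document.createElement('script');
    local.src = 'js/quant.js'; // assumed local path, mirroring the snippet above
    document.head.appendChild(local);
};
document.head.appendChild(remote);
</script>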

To answer your question about whether the browser has to wait for the script to load before the rest of the page loads: typically, no. Browsers download a page's linked content (CSS, images, JS) over multiple parallel connections, so the rest of the page should load, though the browser's loading indicator will keep spinning until the final request is fulfilled or times out.
Depending on the nature of the resource you are trying to load, this will obviously affect your page differently. If you are worried about this, you can host all your files on a common CDN (or on your own site if it is not highly trafficked); that way, if one thing fails, chances are everything is failing and you have a bigger issue to contend with :)

Related

Trigger prefetching after page is fully loaded

My scenario is:
user visits domain.com (home page)
domain.com/products contains a large image library and quite large CSS and JS libraries
when the user visits domain.com and the home page has fully loaded, we start prefetching resources and, if possible, at least some percentage of the images from the archive.
Currently on some pages the JS "eats" quite a lot of resources, so triggering the prefetch during page load is not the best answer: it causes a small lag when the user interacts with JS-created events and elements.
My questions are:
Is it even possible (will it work) to trigger <link rel="prefetch" href="image.png"> or a CSS file to be added to <head> so it prefetches data for another page after the current page has fully loaded?
Should I do it similarly to rendering an additional stylesheet with JS, where I add a new <link> tag within <head>, or is there another way?
You might use Cache Storage to prefetch (precache) assets. I work on an open-source project which uses this approach, though to serve precached assets you need a service worker. The logic for finding assets in my project looks like this.
The demo of this project is here. Also, I wrote an article which explains technical details of the project.
Assets get prefetched once the lib is loaded, so I don't wait for the entire page load. Maybe I should use requestIdleCallback to wait until the browser is idle.
Hopefully, it gives you some inspiration.
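For reference, a minimal sketch of that requestIdleCallback idea (the asset URLs are placeholders, and the setTimeout branch is a fallback for browsers without requestIdleCallback):
<script>
var assets = ['/products/styles.css', '/products/app.js']; // hypothetical URLs
function prefetchAssets() {
    assets.forEach(function (url) {
        var link = document.createElement('link');
        link.rel = 'prefetch';
        link.href = url;
        document.head.appendChild(link);
    });
}
if ('requestIdleCallback' in window) {
    requestIdleCallback(prefetchAssets); // run when the browser is idle
} else {
    setTimeout(prefetchAssets, 0);
}
</script>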
Just to note that you can add an additional stylesheet after the page has completely loaded (or whenever you want) with something like this:
document.addEventListener("DOMContentLoaded", function (event) {
    var link = document.createElement("link");
    link.rel = "stylesheet";
    link.href = "stylesOfAnotherPage2.css";
    document.getElementsByTagName("body")[0].appendChild(link); // or head
});
When you load page1, stylesOfAnotherPage2.css gets cached, so if page2 references the same file it is served from the cache.
You might use HTTP Caching and Link prefetching to use your browser idle time to download or prefetch documents that the user might visit in the near future.
Prefetching hints
The browser observes all of these hints and queues up each unique request to be prefetched when the browser is idle. There can be multiple hints per page, as it might make sense to prefetch multiple documents. For example, the next document might contain several large images.
<link rel="prefetch alternate stylesheet" title="Designed for Mozilla" href="mozspecific.css">
<link rel="next" href="2.html">
Also, you can read this thread:
Preload, Prefetch And Priorities in Chrome
There you can read about the different states and priorities of execution, load, and preload times, plus some tips to improve them.
Preloading CSS and JS via <link rel="preload"> (https://developer.mozilla.org/en-US/docs/Web/HTML/Preloading_content) might be a good fit, and it has good support in most modern browsers: https://caniuse.com/#search=preload
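For reference, preload hints look like this (the file names are placeholders; the as attribute tells the browser what kind of resource it is fetching so it can assign the right priority):
<link rel="preload" href="styles.css" as="style">
<link rel="preload" href="app.js" as="script">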
There are probably better solutions, such as the one suggested by @soulshined, but another crude way to do this, provided the other page's resources are served with ETags or cache-control headers, would be to use AJAX to request the resources you expect to load. This causes the browser to fetch those resources and prefill the user agent's cache, so when the user requests them on the other page there is a higher chance the cache already contains them and the page loads faster than if everything had to be fetched for the first time.
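A minimal sketch of that cache-warming idea, assuming the resources are served with caching headers that let the browser reuse them (the URLs are placeholders):
<script>
// Request next-page resources and discard the responses; we only want them cached
['/products/gallery.css', '/products/gallery.js'].forEach(function (url) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true); // asynchronous
    xhr.send();
});
</script>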
To prefetch assets there are several resource hints: dns-prefetch, preconnect, prerender, and prefetch. Each has its own purpose, so it is useful to know what each one does and pick the appropriate one for your requirement.
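For reference, the four hints as link tags (the hosts and paths are placeholders):
<link rel="dns-prefetch" href="//cdn.example.com">     <!-- resolve DNS early -->
<link rel="preconnect" href="https://cdn.example.com"> <!-- DNS + TCP + TLS handshake -->
<link rel="prefetch" href="/next-page.js">             <!-- fetch a resource for a future page -->
<link rel="prerender" href="https://example.com/next"> <!-- load a whole page in the background -->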

Wistia E-v1.js script being loaded twice

So I am calling the wistia script with a script tag in my head like this:
<script charSet='ISO-8859-1' src='//fast.wistia.com/assets/external/E-v1.js' async defer data-script='wistia' />
However, when I check out the network tab on Chrome, I notice that the E-v1.js script from Wistia is being loaded twice, which is rather significant as it is a 273kb script.
The first load of the script is from https://fast.wistia.com/assets/external/E-v1.js, the location to which I have called it.
However, the second load of the script comes from an iframe, despite me not having put any iframes on the page. This iframe calls the script even on webpages which do not contain any wistia videos. The referrer is: https://fast.wistia.com/embed/iframe_shim?domain=com.
What's going on here? I assume this is some trying-to-be-helpful behaviour from wistia to lazy load their script via an iframe, but it's already loaded...
So I contacted Wistia and got an answer. Their development practices are not exactly intuitive.
Here's what the guy said:
The iframe_shim is a way of tracking the visitor_key for stats tracking, and storing that information on the fast.wistia domain rather than your domain. For a more lightweight method of doing that, you can set window.wistiaIframeShim = false in script tags on your page, and that will stop E-v1.js from loading again. Visitors will then be tracked via a cookie and localstorage directly on your domain instead of the fast.wistia.com domain. As far as I know this shouldn't be problematic, and we'll eventually be changing how that works to make it more efficient, it just hasn't been prioritized yet.
So they seem to load it twice from two different origins just to store a tiny amount of information on their own domain rather than on the client's. Seems ridiculous to me, but I can confirm, as of right now, that all you have to do is change that window variable.
THE FIX: window.wistiaIframeShim = false
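In practice that means setting the flag before E-v1.js executes; the placement below is my assumption, since the flag presumably has to exist by the time the script runs:
<script>window.wistiaIframeShim = false;</script>
<script charSet='ISO-8859-1' src='//fast.wistia.com/assets/external/E-v1.js' async defer data-script='wistia'></script>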

Find jQuery cache hit/miss from CDN

If you include jQuery from a CDN, is there a way to determine whether a user fetched the content from the CDN or retrieved it from their cache?
Obviously a cache hit doesn't make an HTTP request, but could you test that and report Javascript back to your own server with the data?
Why not just use Charles or a similar debugging proxy to determine loading speed?
If you want to know the speed from a client's perspective from multiple locations, use http://www.webpagetest.org/ with two differing versions of your website (one with CDN, one with self-hosted static location) and compare the loading speeds. Personally, unless you have a lot of custom javascript code, it makes sense to use a CDN for jQuery, especially since lots of sites use the Google Libraries API for jQuery.
If you have logging on your CDN (we don't seem to?) you could change the CDN url for test runs and also use a pingback url on your server. Over a period of time, compare the ping back url hits and times with your CDN url hits and the unique visitors counts.
You should be able to get an idea of how many unique hits your CDN URL gets vs. unique hits on your page. The difference should be bots, scrapers, and cached or failed loads of the resource. Bots you can eliminate, scrapers probably as well, so your percentages should be representative over a long enough period.
Would this work for you?
We do this on non-cdn resources to see if people are downloading the latest CSS files or not to force a name change only on those IPs that seem to have cached a resource after a change was made to the css file.
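A sketch of such a pingback (the /ping endpoint and CDN URL are hypothetical):
<script src="https://cdn.example.com/jquery.min.js"></script>
<script>
// Fires once per page view; compare these hits against the CDN's log counts
new Image().src = '/ping?resource=jquery&t=' + Date.now();
</script>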
Testing for a 304 (Not Modified) would be difficult without using AJAX, and AJAX will be very difficult unless you can get around the same-origin policy on the CDN.
I assume you want to test the actual time the scripts load and become available on the client, and compare that for CDN vs. local hosting. If so, wouldn't it be better to measure the actual time instead of doing a cache test?
It’s fairly easy to set up an A/B test of the actual time the scripts are loading.
For the A test you could do the CDN/local separation
<script>var _time = new Date().getTime();</script>
<script src="http://code.jquery.com/jquery-1.7.1.min.js"></script>
<script src="project.js"></script>
<script>_time = new Date().getTime() - _time;</script>
And the B test a local script merge or whatever:
<script>var _time = new Date().getTime();</script>
<script src="project.includingjquery.min.js"></script>
<script>_time = new Date().getTime() - _time;</script>
Then report the _time variable into analytics or your own database using ajax. If the B users have lower _time reported, you know it’s the right way to go...
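Reporting that back could be as simple as the following (a sketch; the /log endpoint and variant parameter are hypothetical):
<script>
// Send the measured load time to a hypothetical logging endpoint
var xhr = new XMLHttpRequest();
xhr.open('GET', '/log?variant=A&load_ms=' + _time, true);
xhr.send();
</script>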
If you were willing to add some bulk to the page to test this, you could make an AJAX request to the CDN for jQuery and check the response headers for a 304, then make a second AJAX request to ping back to your server reporting whether jQuery was cached or not, i.e. whether it was a 200 or a 304 response. I haven't tried this, but it should work. It does add extra requests for your users, but since they'd be asynchronous it probably wouldn't have any impact.
<script>var s = new Date().getTime();</script>
<script src="cdncontent"></script>
<script>
    var s = new Date().getTime() - s;
    if (s < 100) {
        // likely from cache
    } else {
        // likely from CDN
    }
</script>
It's always worth loading resources from a CDN. The only drawback is whether the CDN is geographically close to the user or not, so check your target audience and decide whether it's better to use one. In the case of jQuery, I always use the Google CDN, which I find reliable and robust.

Detect and log when external JavaScript or CSS resources fail to load

I have multiple <head> references to external js and css resources. Mostly, these are for things like third party analytics, etc. From time to time (anecdotally), these resources fail to load, often resulting in browser timeouts. Is it possible to detect and log on the server when external JavaScript or CSS resources fail to load?
I was considering some type of lazy loading mechanism that when, upon failure, a special URL would be called to log this failure. Any suggestions out there?
What I think happens:
The user hits our page and the server side processes successfully and serves the page
On the client side, the HTML header tries to connect to our 3rd party integration partners, usually by a javascript include that starts with "http://www.someothercompany.com...".
The other company cannot handle our load or has poor uptime, and so the connection fails.
The user sees a generic IE Page Not Found, not one from our server.
So even though my site was up and everything else was running fine, just because this one call out to the third-party servers failed (one in the HTML page header), we get a complete failure to launch.
If your app/page is dependent on JS, you can load the content with JS too (I know it sounds confusing). When loading these with JS, you can attach callbacks that let you enable only the functionality whose content actually loaded, without worrying about what didn't load.
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "http://domain.com/somefile.js";
script.onload = CallBackForAfterFileLoaded;
document.body.appendChild(script);

function CallBackForAfterFileLoaded(e) {
    // Do your magic here...
}
I usually make this a bit more complex, with arrays of JS and other files that are dependent on each other, and if they don't load I go into an error state.
I forgot to mention: obviously I'm just showing how to create a JS tag; you would have to create your own method for the other types of files you want to load.
Hope that helps, cheers
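Worth noting, since the question is about detecting failures: the same dynamically created tag can also take an onerror handler (a sketch; the /log-failure endpoint is hypothetical, and older browsers don't fire error events for script tags):
script.onerror = function () {
    // The script failed to load; report it to our own server
    new Image().src = '/log-failure?src=' + encodeURIComponent(script.src);
};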
You can look for the presence of an object in JavaScript, e.g. to see if jQuery is loaded or not...
if (typeof jQuery !== 'function') {
// Was not loaded.
}
jsFiddle.
You could also check for CSS styles missing, for example, if you know a certain CSS file sets the background colour to #000.
if ($('body').css('backgroundColor') !== 'rgb(0, 0, 0)') {
// Was not loaded.
}
jsFiddle.
When these fail, you can make an XHR to the server to log these failings.
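For example (a sketch; the /log endpoint is hypothetical):
if (typeof jQuery !== 'function') {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/log', true); // hypothetical logging endpoint on our server
    xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
    xhr.send('failed=jquery');
}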
What about a service worker? We can use it to intercept all HTTP requests and inspect the response code, logging whenever an external resource fails to load.
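A minimal sketch of that idea inside the service worker (the file name and /log endpoint are hypothetical; note that opaque cross-origin responses report status 0, so the status check only works for CORS-enabled resources):
// sw.js
self.addEventListener('fetch', function (event) {
    event.respondWith(
        fetch(event.request).then(function (response) {
            if (!response.ok && response.type !== 'opaque') {
                // Resource came back with an error status; log it on our server
                fetch('/log?failed=' + encodeURIComponent(event.request.url));
            }
            return response;
        }).catch(function (err) {
            // Network-level failure (server down, DNS error, timeout)
            fetch('/log?failed=' + encodeURIComponent(event.request.url));
            throw err;
        })
    );
});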
Make a hash of the JS file name and the session cookie, and send both the plain file name and the hash. Server-side, compute the same hash; if the two match, log it, and if not, assume it's abuse.

Is there a way to mitigate downloading of resources (images/css and js files) with Javascript?

I have a html page on my localhost - get_description.html.
The snippet below is part of the code:
<input type="text" id="url"/>
<button id="get_description_button">Get description</button>
<iframe id="description_container" src="#"/>
When the button is clicked, the src of the iframe is set to the URL entered in the textbox. The pages fetched this way are very big, with lots of linked files. All I am interested in on the page is a block of text contained in a <div id="description"> element.
Is there a way to mitigate downloading of resources linked in the page that loads into the iframe?
I don't want to use curl because the data is only available to logged-in users and the steps needed to get the content with curl are too complicated. The iframe is simple, as I use this on a box which sends the right cookies to identify the request as coming from a logged-in user; the problem is that it is very wasteful to fetch nearly 1 MB of data only to keep 1 KB of it and throw out the rest.
Edit
If the proposed method just works in Firefox it is fine, so I added Firefox tag. Also, it is possible that the answer actually is from the realm of Firefox add-on techniques, so I added that tag as well.
The problem is not that I cannot get at what I'm looking for, rather, the problem is the easy iframe method is wasteful.
I know that Firefox does allow loading only the text of a page. If you open a page and press Ctrl+U you are taken to the 'view page source' window. The links there behave as normal and are clickable; if you click a link in source view, the source of the new page is loaded into the view-source window without the linked resources being downloaded, which is exactly what I'm trying to get. But I don't know how to access this behaviour.
Another example is the Adblock add-on. It somehow kills elements before they get loaded. With plain JavaScript this is not possible, because it is triggered too late to intervene in good time.
The same-origin policy forbids any web page from accessing the contents of any other web page on a different domain, so basically you cannot do that.
However, it seems that some browsers allow access to another page's content when the request comes from a local web page, which seems to be your case.
Safari and IE 6/7/8 are browsers that allow a local web page to do so via XMLHttpRequest (source: Google Browser Security Handbook), so you may want to use one of those browsers for this task (note that future versions of those browsers may no longer allow it).
Apart from this solution, I only see two possibilities:
If the web pages you need to fetch content from are somehow controlled by you, you can create a simpler interface that lets other web pages get the content you need (for example, by allowing JSONP requests).
If the web pages you need to fetch content from are not controlled by you, the only solution I see is to fetch the content server-side, logging in from the server directly (I know you don't want to do that, but I don't see any other possibility if the previous ones are not practicable).
Hope it helps.
Actually I've seen Cross Domain jQuery .load request before, here: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The author claims that code like the following, found on that page,
$('#container').load('http://google.com'); // SERIOUSLY!
$.ajax({
    url: 'http://news.bbc.co.uk',
    type: 'GET',
    success: function(res) {
        var headline = $(res.responseText).find('a.tsh').text();
        alert(headline);
    }
});
// Works with $.get too!
would work. (The BBC code might not work because of the recent redesign, but you get the idea)
Apparently it is using YQL wrapped in a jQuery plugin to do the trick. Now, I cannot say I fully understand what he is doing there, but it appears to work and fits the bill. Once you load the data, I suppose it is a simple matter of filtering out what you need.
If you prefer something that works at the browser level, may I suggest Mozilla's Jetpack framework for lightweight extensions. I've not yet read the documentation in its entirety, but it should contain the APIs needed for this to work.
There are various ways to go about this with AJAX; I'm going to show the jQuery way for brevity, though you could do this in vanilla JavaScript as well.
Instead of an <iframe> you can just use a container, let's say a <div> like this:
<div id="description_container"></div>
Then to load it:
$(function() {
    $("#get_description_button").click(function() {
        $("#description_container").load($("input").val() + " #description");
    });
});
This uses the .load() method, which takes a string in the format .load("url selector"), grabs that element from the fetched page, and places its content inside the container you're loading into, in this case #description_container.
This is just the jQuery route, mainly to illustrate that yes, you can do what you want, though you don't have to do it exactly like this; the concept is getting what you want with an AJAX request rather than an <iframe>.
Your description sounds like you are fetching pages from the same domain (you said you need to be logged in and have session credentials), so have you tried an async request via XMLHttpRequest? It might complain if the HTML on a page is particularly messed up, but you should still be able to get the raw text via .responseText and extract what you need with a regex.
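A rough sketch of that approach (same-origin only; the #description id comes from the question, it assumes the iframe has been swapped for a plain div as suggested in the previous answer, and the regex is crude and will break on nested divs):
var xhr = new XMLHttpRequest();
xhr.open('GET', document.getElementById('url').value, true);
xhr.onload = function () {
    // Crude regex extraction of the #description div from the raw HTML text
    var match = xhr.responseText.match(/<div id="description">([\s\S]*?)<\/div>/);
    if (match) {
        document.getElementById('description_container').innerHTML = match[1];
    }
};
xhr.send();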
