Subresource Integrity and cache busting techniques in PHP

I'd like to implement Subresource Integrity and cache busting for static assets such as stylesheets and JavaScript files in my application. Currently I use PHP with Twig templates.
I know there are many tools out there to generate hashes for all the JS and CSS files, but I am looking for how to get the hashes into the <script> and <link> tags for hundreds of files.
This blog post describes most of what I'm trying to do; however, the author only covers cache busting and uses a static timestamp in the file name that he changes manually every time. Using a build tool to programmatically generate that timestamp isn't difficult either, but with SRI the value is a hash, which is different for every file.
For example, a snippet of header.html.twig:
<!-- cdn requests -->
<script src='https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js'
integrity='sha384-8gBf6Y4YYq7Jx97PIqmTwLPin4hxIzQw5aDmUg/DDhul9fFpbbLcLh3nTIIDJKhx'
crossorigin='anonymous'></script>
<!-- same-origin requests -->
<script src='foo.1a516fba.min.js'
integrity='sha384-GlFvui4Sp4wfY6+P13kcTmnzUjsV78g61ejffDbQ1QMyqL3lVzFZhGqawasU4Vg+'></script>
<script src='bar.faf315f3.min.js'
integrity='sha384-+vMV8w6Qc43sECfhc+5+vUA7Sg4NtwVr1J8+LNNROMdHS5tXrqGWSSebmORC6O86'></script>
Changing the src/href and integrity attributes by hand every time is not a sane approach.
I could write a Twig function that calls a PHP function to hash the file on every render, and it might work OK in dev, but that seems awfully computationally expensive.
What is a feasible approach to this?

To answer your question: There is no feasible approach because this is not a proper application of Subresource Integrity.
According to W3C the integrity attribute is:
...a mechanism by which user agents may verify that a fetched resource has been delivered without unexpected manipulation
It was introduced because these days lots of pages are fetching their CSS and JS scripts from CDNs, as you are, and if a hacker were ever to gain control of a CDN, they could wreak an extraordinary amount of havoc across thousands of websites by injecting malicious code into the resources delivered!
Imagine if every version of jQuery delivered by code.jquery.com or ajax.googleapis.com suddenly contained malicious code! How many sites would be affected? Scary.
By providing the agent (browser) with an integrity hash that the contents of the fetched resource should be compared against, you are ensuring the agent only continues to execute the code if it gets exactly what you told it to expect. If it's different, don't trust it!
In the case of the resources in your application, I assume they exist on the same server so there is no middle route to intercept. If a hacker gains control of your server and injects malicious code in the JS scripts, they could just as easily rehash the contents and change the integrity attribute in your HTML as well. Subresource Integrity offers no additional security check.
But...
Just for the sport of solving what is quite a fun problem, here is what I would suggest if you wanted to dynamically generate the hash for the integrity attribute:
Use Gulp (my personal preference) to concatenate, minify and fingerprint the filename of your resource. Read the contents of the generated file using gulp.src('bar.*.min.js'), use the npm sha1 package to create the hash as a variable, and finally maybe use gulp-inject to change the src attribute and gulp-replace to write the integrity attribute too. Some flow like that is what I would go for :-)
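If you would rather stay in the asker's PHP/Twig stack, the same flow can be sketched as a deploy-time manifest plus a template lookup, so nothing gets hashed per request. This is only a sketch; the file paths, manifest name and Twig wiring are my own assumptions:

<?php
// build-sri.php - run once per deploy, not per request.
// Hash every minified asset and write the results to a manifest file.
$manifest = [];
foreach (glob(__DIR__ . '/public/js/*.min.js') as $path) {
    // SRI expects "<algo>-<base64 of the raw digest>".
    $digest = base64_encode(hash_file('sha384', $path, true));
    $manifest[basename($path)] = 'sha384-' . $digest;
}
file_put_contents(__DIR__ . '/sri-manifest.json', json_encode($manifest, JSON_PRETTY_PRINT));

At render time a Twig function only has to look the value up:

<?php
// Load the manifest once per process and expose it to templates.
$sri = json_decode(file_get_contents(__DIR__ . '/sri-manifest.json'), true);
$twig->addFunction(new \Twig\TwigFunction('sri', fn (string $file): string => $sri[$file] ?? ''));

A template could then render <script src="{{ asset }}" integrity="{{ sri(asset) }}"></script> without any per-request hashing.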
I hope that answers your question.

Related

Securing Third party libraries in web applications

I have a web application which has a login page.
In the source code (specifically in the <head>), I can see the third-party JavaScript libraries used and the paths to them, sometimes with the library version.
I can even access the code of these libraries without authentication.
Is that a security risk?
For example:
<script type="text/javascript" src="/****/js/ui/js/jquery-ui-1.2.2.custom.min.js"></script>
<script type="text/javascript" src="/*****/dwr/interface/AjaxService.js"></script>
If yes, how to mitigate it?
Yes, there are two threats you need to mitigate:
First, the authenticity of the library. This can be achieved with SRI, which lets the browser verify that the fetched file matches a cryptographic hash you declare - see this great post by Scott Helme.
Second, you want to check the library itself for known vulnerabilities. I'm not sure how it can be done when you add the libraries that way, but there are tools you can use, like Snyk, to test whether a library has known security issues. For example, here are Snyk's results for the jQuery version you're using. See here to find out more on the issue.
Hope this helps you out :)
Yes, that approach has some issues.
An attacker could compromise the library's server and serve you modified library code.
First, I recommend downloading the library (or, even better, adding it to your bundle via package.json) and including all libraries from your own server, not a third party.
Every time you download it, you can check the library's checksum to make sure it has not been modified.
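As an illustration, such a check could look like this in PHP - the expected digest below is a placeholder for whatever checksum the project publishes:

<?php
// verify-lib.php - compare a downloaded library against its published checksum.
$expected = 'paste-the-published-sha384-hex-digest-here';
$actual = hash_file('sha384', __DIR__ . '/js/jquery.min.js');

// hash_equals() does a constant-time string comparison.
if (!hash_equals($expected, $actual)) {
    fwrite(STDERR, "Checksum mismatch: the file may have been tampered with.\n");
    exit(1);
}
echo "Checksum OK\n";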
This saves you from some issues, but your address can still be changed by the attacker too:
he can redirect the user to his own host, instead of yours, when the user resolves your address.
So it's safer to have the HTML + JS in one file without cross-links.
This can be achieved using webpack bundling.
That way the attacker can only compromise the whole app, not a single library, which is harder.
EDIT
(However, the single-file option is only good for a small project. For a bigger project you should use links for performance and accept just a bit more risk.)
You can also check the code that you have (on the server or in package.json) using Snyk, which maintains an open-source database of vulnerabilities.
EDIT
One more form of protection is CSP headers. They allow content of a given type (styles, scripts, images, etc.) to be downloaded only from a specific list of sources, which can prevent some kinds of XSS. Using all applicable CSP directives is highly recommended. However, some risk always remains: a trusted source can be compromised, and even DNS can be compromised.
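As an illustration, sending such a header from PHP might look like this (the whitelisted CDN host is just an example):

<?php
// Must be sent before any output. Scripts may come only from our own
// origin and one whitelisted CDN; styles only from our own origin.
header("Content-Security-Policy: script-src 'self' https://cdnjs.cloudflare.com; style-src 'self'");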

content distribution network (CDN): Loading CSS or Javascript library

I am using knitrBootstrap for some projects and I am beginning to learn jQuery (JavaScript) and CSS in order to modify the generated pages. I also understand that usually the CSS files and scripts are placed in separate files and loaded from the same domain (or locally) into an HTML document, but when I read the documentation of both libraries I see that they can be loaded from a CDN provider, and the HTML files generated by knitrBootstrap also do that.
E.g.: http://rawgithub.com/jimhester/knitrBootstrap/master/vignettes/illusions.html (lines 18-24)
<!-- jQuery -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jqueryui/1.10.3/jquery-ui.min.js"></script>
<!-- bootstrap -->
<link href=https://netdna.bootstrapcdn.com/bootstrap/3.0.0/css/bootstrap.min.css rel="stylesheet">
<script src="https://netdna.bootstrapcdn.com/bootstrap/3.0.0/js/bootstrap.min.js"></script>
This seems very nice since it allows loading static resources from a third-party provider and spares resources on your own server when hosted. However, I was also a little concerned about the security (not exactly for my purposes, but for webpages using this in general) and therefore searched about it. I found the concept of the Same-Origin policy, and from what I understand, the functions provided by jQuery should not be allowed to change the DOM objects of the page itself, yet they do.
Why are the JQuery code and the Bootstrap CSS allowed to alter the remaining document even if they are not loaded from the same domain but from another (in this case a CDN)?
There is nothing stopping the CDN from replacing the files, and of more concern, there is nothing stopping someone else from replacing those files maliciously without the CDN being aware of it, except whatever unknown security measures are in place there.
The reason that the community is usually willing to ignore that potential flag is because of one huge benefit of CDNs: the ability for all users to use the exact same CDN for a given file. For example, imagine that every major site used the CloudFlare CDN link for JQuery. That means that when you, as a user, visit another major site that also uses it that you can save your own bandwidth by using a likely cached copy of the file. This of course brings up the other major point: the site is not wasting any of its own bandwidth serving up the file or handling requests for it.
However, getting to your question: the Same-Origin policy does not apply to loading scripts or CSS; it applies to in-page requests (see: Ajax) made by your scripts, in order to avoid cross-site scripting (XSS). The intent here is that you, as the site creator, should be in control of which scripts get loaded, but your in-page requests may easily be tricked into making a cross-site request, thus potentially exposing data that should not be exposed (e.g., session variables). The key is that when the browser makes the request to the CDN, it does not give that CDN your session variables or any other cookies it should not get (your domain's). However, once the script is executing, it does have access to your domain's cookies, and it could forward those on to other sites with no Same-Origin policy in the way.
Unlike Javascript, CSS does not actually execute code directly, rather it specifies a bunch of properties that have a visual effect on your page (which causes the browser to execute code to make it happen, including potentially downloading images used by the CSS).

Minify everything into one file or use CDN

For things like jQuery etc., is it better to load them from a CDN, or to minify them into one file together with the rest of your JS?
CDN - it's likely the file will already be cached on users' machines, so you'll save them the download. Not to mention it will load faster from a CDN than from your site anyway; the overhead of the one extra connection to grab that file is de minimis.
All your own code should definitely be combined and minified. For the libraries, it's a bit trickier. CDNs are good in theory, but some studies have shown that, for various reasons, they are not actually as efficient as they could be.
That means that if you have a 50% cache-miss rate on your CDN, the overhead of the extra DNS resolution and extra connection can actually slow you down more than it helps.
The most important thing anyway is that you should version your minified/combined JS file: give it a unique URL for every version of the code you deploy. That way you can set Expires headers ten years out and make sure that anyone who downloads it only downloads it once.
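A tiny PHP helper along these lines can produce those unique URLs. The paths are assumptions, and in production you would compute the fingerprint once per deploy rather than on every request:

<?php
// Hypothetical helper: returns e.g. "/js/app.min.js?v=3f2a9c1d".
// The query string changes whenever the file's contents change,
// so a ten-year Expires header on the asset is safe.
function versioned_url(string $path): string
{
    $fingerprint = substr(hash_file('sha1', __DIR__ . '/public' . $path), 0, 8);
    return $path . '?v=' . $fingerprint;
}

// Usage: echo '<script src="' . versioned_url('/js/app.min.js') . '"></script>';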
Also, don't forget to enable gzip (mod_deflate in Apache); that will typically compress the transfer to between a fifth and a tenth of its original size.
Using a CDN is great, as the JS file may already be cached on the user's computer from another site that served it from the same CDN.
But there may be jQuery plugins, your own site's validation, and other functions split across different JS files; for those, minifying and combining is a good approach.
For our own convenience we separate the code into different files, but the browser limits how many simultaneous requests it sends to the same server. A CDN is outside your domain, so requests to it don't count against that limit and load fast. You should combine your own JS files to reduce the number of requests the browser makes, so your page loads faster.
For me, I use PHP to combine and minify.
In HTML:
<script src="js.php"></script>
and in PHP (js.php):
<?php
header('Content-Type: text/javascript');
include 'js/plugin.js';
include 'js/validation.js';
You can use output buffering to minify, and also to send this content gzipped to the browser.
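A sketch of that buffering approach, reusing the files above (ob_gzhandler negotiates compression with the browser automatically):

<?php
// js.php - same combining approach, with gzip negotiated automatically.
ob_start('ob_gzhandler'); // compresses output if the browser's Accept-Encoding allows it
header('Content-Type: text/javascript');

include 'js/plugin.js';
include 'js/validation.js';

ob_end_flush();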

min.js to clear source

As far as I know, the min version of a .js (JavaScript) file is obtained by removing the unnecessary blank spaces and comments in order to reduce the file size.
My questions are:
How can I convert a min.js file into a clear, easily readable .js file?
Besides size (and speed), are there any other advantages of the min.js file?
Can JS files be encrypted?
Can JS be infected? I think the answer is yes, so the question is: how to protect .js files from infection?
The first question is the most important one and the one I'm looking for help on.
TY
To convert a minified file into an editable source, simply open any IDE that supports auto-formatting and auto-format it. I use NetBeans to do this.
If you do client-side caching for the minified file, the client needs to transfer and process fewer bytes. Size and speed are the main advantages of a minified file, and they are already great advantages in preparing for a future that demands ever more data transfer. By the way, it also saves you some bandwidth on your server, and therefore money.
I don't see the need for encryption. See How to disable or encrypt "View Source" for my site
JavaScript files cannot be edited unless it is done on the server, so the security of your JavaScript files depends on 1) server protection and 2) data protection. Data should not be exploitable. Of course, since JavaScript is executed on the client side, it would be meaningless for the client user to attack him/herself; however, Twitter has suffered multiple JavaScript exploits. You need to constantly test and check your code against XSS, CSRF and other attacks. That is to say, if your JavaScript file has a loophole, it was the developer, you, who created it.
Multiple minifiers exist that are also able to compress JS; see http://dean.edwards.name/weblog/2007/04/packer3 for one of the most widely used. Others exist too; see also the JSMin library: http://www.crockford.com/javascript/jsmin.html
The main advantage is the size gain. You should also aggregate your JS files when you have multiple ones; this saves a lot of I/O (fewer HTTP requests) between the server and the client, and is probably more important than minifying.
I can't answer you about encryption. Client security will mainly depend on the browser.
EDIT: OK, my first answer was not about the first question; I've merged both answers into this one.

How to determine if a javascript was already loaded by other html file

How can I determine whether a JavaScript file was already loaded by another HTML file? I want to reduce redundant loading of JavaScript files to decrease the loading time of my web pages.
If your web server is providing correct caching headers this shouldn't be necessary; the browser will cache the JavaScript file across multiple requests.
You might want to check out the YDN page Best Practices for Speeding Up Your Web Site
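If you serve a script through PHP rather than as a static file, you can emit those caching headers yourself. A minimal sketch; the path and lifetime are example values:

<?php
// Serve a script with validation headers so repeat visits get a
// 304 Not Modified instead of a full re-download.
$file = __DIR__ . '/js/app.min.js';
$etag = '"' . md5_file($file) . '"';

header('Content-Type: text/javascript');
header('Cache-Control: public, max-age=86400');
header('ETag: ' . $etag);

// If the browser already has this exact version, tell it nothing changed.
if (($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
    http_response_code(304);
    exit;
}
readfile($file);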
If you want to prevent the files from being downloaded twice, this is automatic provided they are set to be cacheable (most web servers should set these headers sensibly by default).
If you want to make sure that the include tag happens only once when including files in a dynamic language, then you will need some sort of manager. ASP.NET provides a ScriptManager class that does this (among other things); I cannot speak for other languages and frameworks.
As Rory says, the second request will probably be served from cache; and noting that it's a bit of a design failure if this can happen at all, understand that the cached file is still going to execute, with the same negative effect.
This is horrible, but you could wrap your JS script like this:
if (!document.foo)
{
    // your script here
    document.foo = true; // flag set so a second inclusion of this script does nothing
}
