automate http expires - javascript

I have a web app written in PHP, and I generate the headers with the header() function.
The problem is that when I change the app's JavaScript, clients keep running the old code because it is cached in their browsers.
How can I automate header expiration? I assume there has to be a better way than modifying that function each time I modify the JavaScript code.

The only bullet-proof solution is to change filenames of server-side resources:
From: Yahoo's Best Practices for Speeding Up Your Web Site:
Keep in mind, if you use a far future Expires header you have to change the component's filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component's filename[...]
Of course this process must be automated. We append a hash of the JavaScript file's contents to the file name.
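A minimal sketch of that idea in Node, assuming an MD5 content hash shortened to 8 characters; the helper name `hashedFileName` and the `name.<hash>.ext` layout are illustrative choices, not something the Yahoo guide prescribes:

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical helper: embeds a short content hash in the file name,
// e.g. "app.js" -> "app.<hash>.js". The 8-character length is arbitrary.
function hashedFileName(fileName, contents) {
  const hash = crypto.createHash('md5').update(contents).digest('hex').slice(0, 8);
  const ext = path.extname(fileName);         // ".js"
  const base = path.basename(fileName, ext);  // "app"
  return `${base}.${hash}${ext}`;
}

// The name changes only when the contents change.
console.log(hashedFileName('app.js', 'var a = 1;'));
```

Because the name is derived from the contents, an unchanged file keeps its name (and stays cached) across releases.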

Change the URI to the script with each release.
This can be done by adding a query string. You can automate this by, for example, taking the revision number from your version control system and inserting it into your template.
This will allow you to have long expiry times (for optimal caching) and still get fresh JavaScript each time a new release is published (so long as the HTML document isn't loaded from the cache (but they tend to have short cache times compared to JS)).

The best way to version javascript files is to include a version number in their filename. When you rev the code, you bump the version number and then you rev any web pages that include the JS file to refer to the new filename. You then only need to expire the web pages themselves and they will automatically refer to the new JS files. The JS files can have very long expiration (months or years) so you get maximum caching benefit for them.
This also ensures that you get a consistent set of JS files.
This is how jQuery does it with versioning.

Since you don't provide much detail, here is only a general pointer:
Usually you can configure the Expires and other header params in the webserver - either globally and/or per "folder" etc.
You can make the JS file expire for example after 1 hour... this way you would know that 1 hour after a change all clients will be using the new JS file...
If you need the change to take effect immediately, even for clients that are currently active, the header won't help much; you would have to do some AJAX magic...

Related

Why do some external JavaScript files have ?numbers?

I have seen many websites whose external JavaScript and CSS resources are referenced like this:
filename.js?v=3cc1b79c2abb
And:
filename.css?v=7bbb71ecd5eb
The "?v=..." things at the end...
What is this? And for what is this useful?
Thank you!
Cheers :)
These are a form of "Cache Busting" - they force the browser to download the latest version of the file, rather than taking a chance at loading an old file from cache.
There is a deeper question: why do we need cache busting?
For efficiency's sake we have to make the browser cache the resource files. To do that we set the last-modified date to a very old date (say, 01-Jan-1970 00:00:00.000) and the expiry date far into the future. Together these make the browser cache the files so that they are not requested from the server again. That is very efficient, but it causes a problem when you update the application: none of the resources will be downloaded again! To work around that, we configure the build tool to append a version query string, unique to the build, to the end of resource URLs. Typical choices are the build timestamp, a UUID, or the source repository revision number (for version control tools like SVN, which give every commit a unique revision number). That forces the browser to download the new version whenever the application is updated.
The v=7bbb71ecd5eb part is your own version keyword for the JS and CSS files. Once you use it, the browser will not serve your older JavaScript and CSS from its cache.
That means your updated CSS and JavaScript are applied without a stale cached copy getting in the way.
It's to force the browser to download the file instead of getting it from cache.
For example, you have this URL for your CSS: styles.css?v=blablabla. Later you change the CSS and want the changes to be seen instantly (instead of waiting for the browser cache to expire or forcing the user to press Ctrl + F5), so you change it to something like styles.css?v=otherblablabla. The browser sees it as a different URL, so it has to download it.
It's just a parameter in the query string, and because the URL points to a static resource, the web server ignores it.
You could also see something like image.png?1392469113262. That is just a parameter named 1392469113262 that has no value. image.png is a static resource, so the server ignores this parameter. These numbers are usually a timestamp, and that is often the best way to force the browser not to use a cached image (or any other resource).
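To make the "different URL, same file" point concrete, here is a small sketch using the standard URL class. The host example.com and the helper names are placeholders for illustration:

```javascript
// Adds (or replaces) a "v" query parameter on a resource URL.
function withVersion(url, version) {
  const u = new URL(url);
  u.searchParams.set('v', version);
  return u.href;
}

// True when two URLs name the same path but are distinct URLs,
// which is what makes the browser treat them as separate cache entries.
function sameFileDifferentCacheKey(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  return ua.pathname === ub.pathname && ua.href !== ub.href;
}

const oldUrl = withVersion('https://example.com/styles.css', '7bbb71ecd5eb');
const newUrl = withVersion('https://example.com/styles.css', '3cc1b79c2abb');
console.log(oldUrl);
console.log(newUrl);
console.log(sameFileDifferentCacheKey(oldUrl, newUrl));
```

A static file server typically looks only at the path, so both URLs return the same file, while the browser stores them under different keys.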

Forcing the browser to reload css/js only if they have changed

There are a lot of questions and answers on SO related to my problem [I want the browser to cache js/css forever. During a new release if some of the js/css files have been updated, the browser should reload and cache them.]
This solution seemed most appropriate to me :
What is an elegant way to force browsers to reload cached CSS/JS files?
However, there is just one thing that I am unable to figure out.
The solution makes use of last_modified_time. However, I am not allowed to use it. I need to use some other mechanism.
What are the options? Is there a possibility of pre-calculating the versions during build and updating(replacing) them in jsps via build script (before deployment, so that the version numbers are not calculated on run time)? Any existing tool for this purpose? I use Java/Jsp.
We always use
file.css?[deploytimestamp]
This way the CSS file is cached for each deployment at the client. The same goes for our minified javascript. Is this an option for you?
It may not be the best way, but this is what I am doing now:
All of my js/css have a [source control = svn] revision number
References in my jsp are like /foo/path1/path2/xyz000000/foo.
Build Step 1 - Generate a map of css|js files and their revision numbers
Build Step 2 - Replace xyz000000 references in jsps with a hash of svn revisions
A rule in url rewriter to direct all /foo/path1/path2/xyz<767678>/foo. to /foo/path1/path2/foo.[js|css]
Infinitely cache the css|js files
Whenever there is a commit, the revision number changes and so do the references in .jsp
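The rewrite rule in that setup could look roughly like the sketch below. It assumes the version segment is literally "xyz" followed by hex digits and that it sits directly before the file name; the function name and the example path are made up for illustration, and in practice this would live in the URL rewriter's configuration rather than in JavaScript:

```javascript
// Hypothetical rewrite: drop a path segment of the form "xyz<revision>" so that
// every versioned URL maps back to the one real file on disk.
function stripVersionSegment(urlPath) {
  return urlPath.replace(/\/xyz[0-9a-f]+(?=\/)/, '');
}

console.log(stripVersionSegment('/foo/path1/path2/xyz767678/foo.js'));
```

Paths without a version segment pass through unchanged, so the same rule can sit in front of all static resources.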
Generate an md5-hash of each css file after deployment. Use this hash instead of the timestamp in the url of the css.
file.css?[hash of file.css contents]
It may be wise to calculate the hashes once after deployment and store them to gain some performance. You could store them in a database, or even in a PHP array in a separate file that is included in your website code.
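As a rough sketch of computing the hashes once and reusing them, the following builds a manifest and rewrites references in a template string. It is written in Node rather than PHP or JSP, the function names are invented for the example, and file contents are passed in directly instead of being read from disk:

```javascript
const crypto = require('crypto');

// Build a { fileName: hash } manifest once, at build or deploy time.
// `files` maps a file name to its contents.
function buildManifest(files) {
  const manifest = Object.create(null);
  for (const [name, contents] of Object.entries(files)) {
    manifest[name] = crypto.createHash('md5').update(contents).digest('hex');
  }
  return manifest;
}

// Append "?<hash>" to every src/href that refers to a file in the manifest.
function versionReferences(template, manifest) {
  return template.replace(/(src|href)="([^"?]+)"/g, (match, attr, url) => {
    const hash = manifest[url];
    return hash ? `${attr}="${url}?${hash}"` : match;
  });
}

const manifest = buildManifest({ 'file.css': 'body { margin: 0; }' });
console.log(versionReferences('<link rel="stylesheet" href="file.css">', manifest));
```

Running this as a build step means no hashing happens at request time; the pages are deployed with the versioned URLs already in place.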

Improving Javascript Load Times - Concatenation vs Many + Cache

I'm wondering which of the following is going to result in better performance for a page which loads a large amount of javascript (jQuery + jQuery UI + various other javascript files). I have gone through most of the YSlow and Google Page Speed stuff, but am left wondering about a particular detail.
A key thing for me here is that the site I'm working on is not on the public net; it's a business to business platform where almost all users are repeat visitors (and therefore with caches of the data, which is something that YSlow assumes will not be the case for a large number of visitors).
First up, the standard approach recommended by tools such as YSlow is to concatenate it, compress it, and serve it up in a single file loaded at the end of your page. This approach sounds reasonably effective, but I think that a key part of the reasoning here is to improve performance for users without cached data.
The system I currently have is something like this
All javascript files are compressed and loaded at the bottom of the page
All javascript files have far future cache expiration dates, so will remain (for most users) in the cache for a long time
Pages only load the javascript files that they require, rather than loading one monolithic file, most of which will not be required
Now, my understanding is that, if the cache expiration date for a javascript file has not been reached, then the cached version is used immediately; there is no HTTP request sent to the server at all. If this is correct, I would assume that having multiple script tags is not causing any performance penalty, as I'm still not having any additional requests on most pages (recalling from above that almost all users have populated caches).
In addition to this, not loading the JS means that the browser doesn't have to interpret or execute all this additional code which it isn't going to need; as a B2B application, most of our users are unfortunately stuck with IE6 and its painfully slow JS engine.
Another benefit is that, when code changes, only the affected files need to be fetched again, rather than the whole set (granted, it would only need to be fetched once, so this is not so much of a benefit).
I'm also looking at using LabJS to allow for parallel loading of the JS when it's not cached.
Specific questions
If there are many script tags, but all files are being loaded from the local cache, and less javascript is being loaded overall, is this going to be faster than one script tag which is also being loaded from the cache, but contains all the javascript needed anywhere on the site, rather than an appropriate subset?
Are there any other reasons to prefer one over the other?
Does similar thinking apply to CSS? (I'm currently using a much more monolithic approach to CSS)
2021 Edit:
As this answer has had some recent upvotes, do notice that with http 2.0 things changed a lot. You don't get the per-request hit as you now multiplex over a single TCP connection. You also get server-push. While most of the answer is still valid, do take it as how things were previously done.
I would say that the most important thing to focus on is the perception of speed.
First thing to take into consideration, there is no win-win formula out there but a threshold where a javascript file grows into such a size that it could (and should) be split.
GWT uses this and they call it DFN (Dead-for-now) code. There isn't much magic here. You just have to manually define when you'll need a new piece of code and, should the user need it, just call that file.
How, when, where will you need it?
Benchmark. Chrome has a great benchmarking tool. Use it extensively. See if having just a small javascript file will greatly improve the loading of that particular page. If it does, by all means start DFNing your code.
Apart from that it's all about the perception.
Don't let the content jump!
If your page has images, set their widths and heights up front. Because the page will load with the elements positioned right where they are supposed to be, there will be no content shifting and readjusting, and the user's perception of speed will increase.
Defer javascript!
All major libraries can wait for page load before executing javascript. Use it. jQuery's goes like this $(document).ready(function(){ ... }). It doesn't wait for parsing the code but makes the parsed code fire exactly when it should. After page load, before image load.
Important things to take into consideration:
Make sure js files are cached by the client (all the others stand short compared to this one)
Compile your code with Closure Compiler
Deflate your code; it's faster than gzipping it (on both ends)
Apache example of caching:
# Set up caching on media files for 1 month
# (needs mod_expires and mod_headers)
<FilesMatch "\.(gif|jpg|jpeg|png|swf|js|css)$">
ExpiresActive On
ExpiresDefault A2629744
Header append Cache-Control "public, proxy-revalidate"
Header append Vary Accept-Encoding
</FilesMatch>
Apache example of deflating:
# Compresses all files for faster transfer
LoadModule deflate_module modules/mod_deflate.so
AddOutputFilterByType DEFLATE text/html text/plain text/xml font/opentype font/truetype font/woff
<FilesMatch "\.(js|css|html|htm|php|xml)$">
SetOutputFilter DEFLATE
</FilesMatch>
And last, and probably least, serve your Javascript from a cookie-less domain.
And to keep your question in focus, remember that when you have DFN code, you'll have several smaller javascript files that, precisely for being split, won't have the level of compression Closure can give you with a single one. The sum of the parts isn't equal to the whole in this scenario.
Hope it helps!
I really think you need to do some measurement to figure out if one solution is better than the other. You can use JavaScript and log data to get a clear idea of what your users are seeing.
First, analyze your logs to see if your cache rate is really as good as you would expect for your userbase. For example, if each html page includes jquery.js, look over the logs for a day--how many requests were there for html pages? How many for jquery.js? If the cache rate is good, you should see far fewer requests for jquery.js than for html pages. You probably want to do this for a day right after an update, and also a day a few weeks after an update, to see how that affects the cache rate.
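A rough sketch of that log check is below. It assumes heavily simplified log lines of the form "<method> <path>"; real access logs need a proper parser, and the function name is made up for the example:

```javascript
// Returns (requests for scriptPath) / (requests for .html pages),
// or null when no page requests were seen.
function scriptToPageRatio(logLines, scriptPath) {
  let pages = 0;
  let scripts = 0;
  for (const line of logLines) {
    const requestPath = line.split(' ')[1] || '';
    if (requestPath === scriptPath) scripts += 1;
    else if (requestPath.endsWith('.html')) pages += 1;
  }
  return pages === 0 ? null : scripts / pages;
}

const sampleLog = [
  'GET /index.html',
  'GET /jquery.js',
  'GET /about.html',
  'GET /index.html',
  'GET /index.html',
];
console.log(scriptToPageRatio(sampleLog, '/jquery.js'));
```

A ratio well below 1 suggests most page views found the script in the cache; a ratio near 1 suggests the cache is not doing much.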
Next, add some simple measurements to your page in JavaScript. You said the script tags are at the bottom, so I assume it looks something like this?
<html>
<!-- all your HTML content... -->
<script src="jquery.js"></script>
<script src="jquery-ui.js"></script>
<script src="mycode.js"></script>
In that case, you time how long it takes to load the JS, and ping the server like this:
<html>
<!-- all your HTML content... -->
<script>var startTime = new Date().getTime();</script>
<script src="jquery.js"></script>
<script src="jquery-ui.js"></script>
<script src="mycode.js"></script>
<script>
var endTime = new Date().getTime();
var totalTime = endTime - startTime; // In milliseconds
new Image().src = "/time_tracker?script_load=" + totalTime;
</script>
Then you can look through the logs for /time_tracker (or whatever you want to call it) and see how long it's taking people to load the scripts.
If your cache rate isn't good, and/or you're dissatisfied with how long it takes to load the scripts, then try moving all the scripts to a concatenated/minified file on one of your pages, and measure how long that takes to load in the same way. If the results look promising, do the rest of the site.
I would definitely go with the non-monolithic approach. Not only in your case, but in general, it gives you more flexibility when you need something changed or re-configured.
If you make a change to one of these files then you will have to merge-compress and deliver. If you are doing this in an automated way then you are OK.
As for the browser question "if the cache expiration date for a javascript file has not been reached, then the cached version is used immediately; there is no HTTP request sent to the server at all", I think that an HTTP request is still made, but the response is "NOT MODIFIED". To be sure, you should check all the requests made to the web server (using one of the tools available). After that response is given, the browser uses the unmodified resource: the JS file, image, or other.
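For reference, the usual HTTP caching model distinguishes the two cases: a cached copy that is still fresh is used without contacting the server, and only a stale copy triggers a conditional request that the server may answer with 304 Not Modified. The sketch below is a deliberately simplified model that assumes only an Expires time is known; it ignores explicit reloads, Cache-Control directives and heuristic freshness, and the function name and return values are invented for illustration:

```javascript
// Simplified model of what the browser does with a cached resource.
// expiresAt and now are timestamps in milliseconds.
function cacheDecision(expiresAt, now) {
  if (now < expiresAt) {
    return 'use-cache-no-request';  // still fresh: nothing is sent to the server
  }
  return 'conditional-request';     // stale: revalidate, server may answer 304
}

const oneDay = 24 * 60 * 60 * 1000;
const now = Date.now();
console.log(cacheDecision(now + 30 * oneDay, now));
console.log(cacheDecision(now - oneDay, now));
```

Checking the network panel or the server logs, as suggested above, is still the reliable way to see which case applies to your users.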
Good luck with your B2B.
Even though you are dealing with repeat-visitors, there are many reasons why their cache may have been cleared, including privacy and performance tools that delete temporary cache files to "speed up your computer".
Merging and mini-fying your script doesn't have to be an onerous process. I write my JavaScript in separate files, nicely spaced out to be readable to me so it is easier to maintain. However, I serve it via a script page that combines all of the scripts into a single script and mini-fies it all - so one script gets sent to the browser with all my scripts in. This is the best of both worlds as I work on a collection of JavaScript files that are all readable, and the visitor gets one compressed JavaScript file, which is the recommendation for reducing the HTTP requests (and therefore the queue time).
Did you try Google Closure? From what I've read about it, it seems quite promising.
http://code.google.com/closure/
http://googlecode.blogspot.com/2009/11/introducing-closure-tools.html - blog post
http://axod.blogspot.com/2010/01/google-closure-compiler-advanced-mode.html - performance of GC
http://www.sitepoint.com/google-closure-how-not-to-write-javascript/ - a few tips for javascript
Generally it's better to have fewer, larger requests than to have many small requests, since the browser will only do two (?) requests in parallel to a particular domain.
So whilst you say that most users are repeat visitors, when the cache expires there will be many round-trips for the many files, rather than one for a monolithic file.
If you take this to an extreme and have potentially thousands of files with individual functions in them, it would become obvious that this would lead to a huge number of requests when the cache expires.
Another reason to have a monolithic file is for when various parts of the site have different chunks of javascript associated with them, as you again get this in the cache when you hit the first page, saving later requests and round-trips.
If you're worried about the initial hit loading a "large" javascript file you can try loading it asynchronously, using the method described here : http://www.webmaster-source.com/2010/06/07/loading-javascript-asynchronously/
Whichever way you go in the end, remember that since you're setting a far-future expiry date, you'll need to change the name of the javascript (and CSS) files when changes are made in them, otherwise clients won't pick up the changes until their cache expires anyway.
PS : Profile it on the different browsers with the differing methods and write it up, as it will prove useful to those who are also stuck on slow JS engines like IE6 :)
I've used the following for both CSS and Javascript -- most of my pages in Google Speed report being 94-96/100 and they load very fast (always within a second, even if there are 100kb's of Javascript).
1. I have a PHP function to call files -- this is a class and stores all the unique files that are asked for. My call looks something like:
javascript( 'jquery', 'jquery.ui', 'home-page' );
2. I spit out a url-encoded version of these strings combined together to call a dynamic PHP page:
<script type="text/javascript" src="/js/?files=eNptkFsSgjAMRffCP4zlTVmDi4iQkVwibbEUHzju3UYEHMffc5r05gJnEX8IvisHnnHPQN9cMHZeKThzJOVeex7R3AmEDhQLCEZBLHLMLVhgpaXUikRMXCJbhdTjgNcG59UJyWSVPSh_0lqSSp0KN6XNEZSYwAqt_KoBY-lRRvNblBZrYeHQYdAOpHPS-VeoTpteVFwnNGSLX6ss3uwe1fi-mopg8aqt7P0LzIWwz-T_UCycC2sQavrp-QIsrnKh"></script>
3. That dynamic PHP page decodes the string and creates an array of the files that need to be loaded. A cache_file path is created:
$compressed_js_file_path = $_SERVER['DOCUMENT_ROOT'] . '/cache/js/' . md5( implode( '|', $js_files ) ) . '.js';
4. It checks to see if that file path already exists in the cache, if so, it just reads the file:
if( file_exists( $compressed_js_file_path ) ) {
echo file_get_contents( $compressed_js_file_path );
} else {
5. If it doesn't exist, it compresses all the javascript into one "monolith" file, but realize it has ONLY the necessary javascript for that page, not for the entire site.
if( $fh = @fopen( $compressed_js_file_path, 'w' ) ) {
fwrite( $fh, $js );
fclose( $fh );
}
// Echo the compressed Javascript
echo $js;
I've given you excerpts of the code. The program you use to compress javascript is completely up to you. I use this with both CSS and Javascript so that all those files require just 1 HTTP request, ever; the result is cached on the server (simply delete that file if you change something), and it has only the necessary Javascript & CSS for that page.
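The cache key from step 3 can be sketched in Node as follows. This mirrors the md5( implode( '|', $js_files ) ) . '.js' expression above; the function name is made up, and the directory prefix is left out:

```javascript
const crypto = require('crypto');

// Derives the cache file name from the list of requested script names.
// The same list (in the same order) always maps to the same cache file.
function cacheFileName(files) {
  const key = crypto.createHash('md5').update(files.join('|')).digest('hex');
  return `${key}.js`;
}

console.log(cacheFileName(['jquery', 'jquery.ui', 'home-page']));
```

Note that the key depends on order, so pages that request the same files in a different order end up with separate cache files unless the list is sorted first.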

Put javascript in one .js file or break it out into multiple .js files?

My web application uses jQuery and some jQuery plugins (e.g. validation, autocomplete). I was wondering if I should stick them into one .js file so that it could be cached more easily, or break them out into separate files and only include the ones I need for a given page.
I should also mention that my concern is not only the time it takes to download the .js files but also how much the page slows down based on the contents of the .js file loaded. For example, adding the autocomplete plugin tends to slow down the response time by 100ms or so from my basic testing even when cached. My guess is that it has to scan through the elements in the DOM which causes this delay.
I think it depends how often they change. Let's take this example:
JQuery: change once a year
3rd party plugins: change every 6 months
your custom code: change every week
If your custom code represents only 10% of the total code, you don't want the users to download the other 90% every week. You would split in at least 2 js: the JQuery + plugins, and your custom code. Now, if your custom code represents 90% of the full size, it makes more sense to put everything in one file.
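To put rough numbers on that, here is a back-of-the-envelope sketch. The sizes and release counts are illustrative assumptions (100 KB total, custom code released weekly, library code twice a year), and each bundle is assumed to be re-downloaded once per release by a returning user:

```javascript
// Yearly re-download volume per returning user, in KB.
function yearlyDownloadKb(bundles) {
  return bundles.reduce((total, b) => total + b.sizeKb * b.releasesPerYear, 0);
}

const split = yearlyDownloadKb([
  { sizeKb: 90, releasesPerYear: 2 },   // jQuery + plugins
  { sizeKb: 10, releasesPerYear: 52 },  // custom code
]);
const combined = yearlyDownloadKb([
  { sizeKb: 100, releasesPerYear: 52 }, // one file, renamed with every custom release
]);

console.log(split, combined); // 700 5200
```

With the proportions flipped (90% custom code), the split saves very little, which is why a single file makes more sense in that case.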
When choosing how to combine JS files (and same for CSS), I balance:
relative size of the file
number of updates expected
Common but relevant answer:
It depends on the project.
If you have a fairly limited website where most of the functionality is re-used across multiple sections of the site, it makes sense to put all your script into one file.
In several large web projects I've worked on, however, it has made more sense to put the common site-wide functionality into a single file and put the more section-specific functionality into their own files. (We're talking large script files here, for the behavior of several distinct web apps, all served under the same domain.)
The benefit to splitting up the script into separate files, is that you don't have to serve users unnecessary content and bandwidth that they aren't using. (For example, if they never visit "App A" on the website, they will never need the 100K of script for the "App A" section. But they would need the common site-wide functionality.)
The benefit to keeping the script under one file is simplicity. Fewer hits on the server. Fewer downloads for the user.
As usual, though, YMMV. There's no hard-and-fast rule. Do what makes most sense for your users based on their usage, and based on your project's structure.
If people are going to visit more than one page in your site, it's probably best to put them all in one file so they can be cached. They'll take one hit up front, but that'll be it for the whole time they spend on your site.
At the end of the day it's up to you.
However, the less information that each web page contains, the quicker it will be downloaded by the end-viewer.
If you only include the js files required for each page, it seems more likely that your web site will be more efficient and streamlined
If the files are needed on every page, put them in a single file. This will reduce the number of HTTP requests and will improve the response time (for lots of visits).
See Yahoo best practice for other tips
I would pretty much concur with what bigmattyh said, it does depend.
As a general rule, I try to aggregate the script files as much as possible, but if you have some scripts that are only used on a few areas of the site, especially ones that perform large DOM traversals on load, it would make sense to leave those in separate file(s).
e.g. if you only use validation on your contact page, why load it on your home page?
As an aside, you can sometimes sneak these files into interstitial pages, where not much else is going on, so when a user lands on an otherwise quite heavy page that needs it, it should already be cached - use with caution - but can be a handy trick when you have someone benchmarking you.
So, as few script files as possible, within reason.
If you are sending a 100K monolith, but only using 20K of it for 80% of the pages, consider splitting it up.
It depends pretty heavily on the way that users interact with your site.
Some questions for you to consider:
How important is it that your first page load be very fast?
Do users typically spend most of their time in distinct sections of the site with subsets of functionality?
Do you need all of the scripts ready the moment that the page is ready, or can you load some in after the page is loaded by inserting <script> elements into the page?
Having a good idea of how users use your site, and what you want to optimize for is a good idea if you're really looking to push for performance.
However, my default method is to just concatenate and minify all of my javascript into one file. jQuery and jQuery.ui are small and have very low overhead. If the plugins you're using are having a 100ms effect on page load time, then something might be wrong.
A few things to check:
Is gzipping enabled on your HTTP server?
Are you generating static files with unique names as part of your deployment?
Are you serving static files with never ending cache expirations?
Are you including your CSS at the top of your page, and your scripts at the bottom?
Is there a better (smaller, faster) jQuery plugin that does the same thing?
I've basically gotten to the point where I reduce an entire web application to 3 files.
vendor.js
app.js
app.css
Vendor is neat, because it has all the styles in it too. I.e. I convert all my vendor CSS into minified css then I convert that to javascript and I include it in the vendor.js file. That's after it's been sass transformed too.
My vendor stuff does not update often; once in production, changes are pretty rare. When it does update I just rename it to something like vendor_1.0.0.js.
Also there are minified versions of those files. In dev I load the unminified versions and in production I load the minified versions.
I use gulp to handle doing all of this. The main plugins that make this possible are....
gulp-include
gulp-css2js
gulp-concat
gulp-csso
gulp-html-to-js
gulp-mode
gulp-rename
gulp-uglify
node-sass-tilde-importer
Now this also includes my images because I use sass and I have a sass function that will compile images into data-urls in my css sheet.
function sassFunctions(options) {
    options = options || {};
    options.base = options.base || process.cwd();
    var fs = require('fs');
    var path = require('path');
    var types = require('node-sass').types;
    var funcs = {};
    // Custom Sass function: inline-image("img.png") -> url(data:image/png;base64,...)
    funcs['inline-image($file)'] = function (file, done) {
        // Resolve the Sass string argument against the base directory
        var filePath = path.resolve(options.base, file.getValue());
        var ext = filePath.split('.').pop();
        fs.readFile(filePath, function (err, data) {
            if (err) return done(err);
            // readFile already yields a Buffer, so it can be encoded directly
            var encoded = data.toString('base64');
            var dataUrl = 'url(data:image/' + ext + ';base64,' + encoded + ')';
            done(new types.String(dataUrl));
        });
    };
    return funcs;
}
So my app.css will have all of my application's images in the CSS, and I can add the images to any chunk of styles I want. Typically I create classes for the images that are unique, and I'll just tag stuff with that class if I want it to have that image. I avoid using image tags completely.
Additionally, using the html-to-js plugin, I compile all of my HTML into the JS file as a template object keyed by the path to the HTML files, e.g. 'html\templates\header.html', and then using something like Knockout I can data-bind that HTML to an element, or multiple elements.
The end result is I can end up with an entire web application that spins up off one "index.html" that doesn't have anything in it but this:
<html>
<head>
<script src="dst/vendor.js"></script>
<link rel="stylesheet" href="dst/app.css">
<script src="dst/app.js"></script>
</head>
<body id="body">
<xyz-app params="//xyz.com/api/v1"></xyz-app>
<script>
ko.applyBindings(document.getElementById("body"));
</script>
</body>
</html>
This will kick off my component "xyz-app" which is the entire application, and it doesn't have any server side events. It's not running on PHP, DotNet Core MVC, MVC in general or any of that stuff. It's just basic html managed with a build system like Gulp and everything it needs data wise is all rest apis.
Authentication -> Rest Api
Products -> Rest Api
Search -> Google Compute Engine (python apis built to index content coming back from rest apis).
So I never have any html coming back from a server (just static files, which are crazy fast). And there are only 3 files to cache other than index.html itself. Webservers support default documents (index.html) so you'll just see "blah.com" in the url and any query strings or hash fragments used to maintain state (routing etc for bookmarking urls).
Crazy quick, all depending on the JS engine running it.
Search optimization is trickier. It's just a different way of thinking about things. I.e. you have google crawl your apis, not your physical website and you tell google how to get to your website on each result.
So say you have a product page for ABC Thing with a product ID of 129. Google will crawl your products API to walk through all of your products and index them. In there, your API returns a URL in the result that tells Google how to get to that product on a website, e.g. "http://blah#products/129".
So when users search for "ABC thing" they see the listing and clicking on it takes them to "http://blah#products/129".
I think search engines need to start getting smart like this, it's the future imo.
I love building websites like this because it gets rid of all the back end complexity. You don't need RAZOR, or PHP, or Java, or ASPX web forms, or whatever; you get rid of those entire stacks. All you need is a way to write REST APIs (WebApi2, Java Spring, or whatever).
This separates web design into UI Engineering, Backend Engineering, and Design and creates a clean separation between them. You can have a UX team building the entire application and an Architecture team doing all the rest api work, no need for full stack devs this way.
Security isn't a concern either, because you can pass credentials on ajax requests and if your stuff is all on the same domain you can just make your authentication cookie on the root domain and presto (automatic, seamless SSO with all your rest apis).
Not to mention how much simpler server farm setup is. Load balance needs are a lot less. Traffic capabilities a lot higher. It's way easier to cluster rest api servers on a load balancer than entire websites.
Just set up 1 nginx reverse proxy server to serve up your index.html and also direct API requests to one of 4 REST API servers.
Api Server 1
Api Server 2
Api Server 3
Api Server 4
And your sql boxes (replicated) just get load balanced from the 4 rest api servers (all using SSD's if possible)
Sql Box 1
Sql Box 2
All of your servers can be on internal network with no public ips and just make the reverse proxy server public with all requests coming in to it.
You can load balance reverse proxy servers on round robin DNS.
This means you only need 1 SSL cert, since it's one public domain.
If you're using Google Compute Engine for search and seo, that's out in the cloud so nothing to worry about there, just $.
If you like the code in separate files for development you can always write a quick script to concatenate them into a single file before minification.
One big file is better for reducing HTTP requests as other posters have indicated.
I also think you should go the one-file route, as the others have suggested. However, to your point on plugins eating up cycles by merely being included in your large js file:
Before you execute an expensive operation, use some checks to make sure you're even on a page that needs the operations. Perhaps you can detect the presence (or absence) of a dom node before you run the autocomplete plugin, and only initialize the plugin when necessary. There's no need to waste the overhead of dom traversal on pages or sections that will never need certain functionality.
A simple conditional before an expensive code chunk will give you the benefits of both the approaches you are deciding on.
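A minimal sketch of such a guard might look like the following. The `initIfPresent` helper, the `#search-box` selector, and the `setUpAutocomplete` call are all made up for illustration; the document object is passed in as a parameter only to keep the helper easy to test.

```javascript
// Run an expensive initializer only when the page contains the element it needs.
// `doc` is anything with a querySelector method (normally `document`).
function initIfPresent(doc, selector, init) {
  const node = doc.querySelector(selector);
  if (node === null) {
    return false; // nothing to do on this page, so skip the expensive work
  }
  init(node);
  return true;
}

// Example usage in a browser (selector and plugin call are assumptions):
// initIfPresent(document, "#search-box", (el) => setUpAutocomplete(el));
```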
I tried breaking my JS into multiple files and ran into a problem. I had a login form, and I put its code (AJAX submission, etc.) in its own file. When the login was successful, the AJAX callback called functions to display other page elements. Since these elements were not part of the login process, I put their JS code in a separate file. The problem is that JS in one file can't call functions in a second file unless the second file is loaded first (see Stack Overflow Q. 25962958), so in my case the called functions couldn't display the other page elements. There are ways around this loading sequence problem (see Stack Overflow Q. 8996852), but I found it simpler to put all the code in one larger file and to clearly separate and comment the sections that belong to the same functional group, e.g., keep the login code together and clearly commented as the login code.

What are best practices for preventing stale CSS and JavaScript

I'm researching this for a project and I'm wondering what other people are doing to prevent stale CSS and JavaScript files from being served with each new release. I don't want to append a timestamp or something similar which may prevent caching on every request.
I'm working with the Spring 2.5 MVC framework and I'm already using the Google APIs to serve Prototype and script.aculo.us. I'm also considering using Amazon S3 and the new CloudFront offering to minimize network latency.
I add a parameter to the request with the revision number, something like:
<script type="text/javascript" src="/path/to/script.js?ver=456"></script>
The 'ver' parameter is updated automatically with each build (read from file, which the build updates). This makes sure the scripts are cached only for the current revision.
Like #eran-galperin, I use a parameter in the reference to the JS file, but I include a server-generated reference to the file's "last modified" date. #stein-g-strindhaug suggests this approach. It would look something like this:
<script type="text/javascript" src="/path/to/script.js?1347486578"></script>
The server ignores the parameter for the static file and the client may cache the script until the date code changes. If (and only if) you modify the JS file on the server, the date code will change automatically.
For instance, in PHP, my script to create this code looks like this:
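// Return the file's last-modified time (a Unix timestamp) for use as a
// cache-busting query string, or an empty string if the file does not exist.
function cachePreventCode($filename) {
    if (!file_exists($filename)) {
        return "";
    }
    return filemtime($filename);
}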
So then when your PHP file includes a reference to a CSS file, it might look like this:
<link rel="stylesheet" type="text/css" href="main.css?<?= cachePreventCode("main.css") ?>" />
... which will create ...
<link rel="stylesheet" type="text/css" href="main.css?1347489244" />
With regard to cached files, I have yet to run into any bugs caused by stale cached files when using the querystring method.
However, with regards to performance, and echoing Todd B's mention of revving by filename, please check out Steve Souders' work for more on the topic:
"Squid, a popular proxy, doesn’t cache resources with a querystring. This hurts performance when multiple users behind a proxy cache request the same file - rather than using the cached version everybody would have to send a request to the origin server."
"Proxy administrators can change the configuration to support caching resources with a querystring, when the caching headers indicate that is appropriate. But the default configuration is what web developers should expect to encounter most frequently."
http://www.stevesouders.com/blog/2008/08/23/revving-filenames-dont-use-querystring/
Use a conditional GET request with an If-Modified-Since header.
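To make the idea concrete, here is a small sketch of the server-side decision: compare the client's If-Modified-Since value with the file's modification time and answer 304 Not Modified when the file has not changed. The `isNotModified` function is a hypothetical helper, not part of any framework, and most web servers already do this for static files.

```javascript
// Decide whether a 304 Not Modified response is appropriate.
// ifModifiedSince: raw header value sent by the client (or undefined).
// lastModifiedMs: the file's modification time in milliseconds since the epoch.
function isNotModified(ifModifiedSince, lastModifiedMs) {
  if (!ifModifiedSince) return false;
  const since = Date.parse(ifModifiedSince);
  if (Number.isNaN(since)) return false; // unparseable header: send the full file
  // HTTP dates have one-second resolution, so drop the milliseconds first.
  const lastModified = Math.floor(lastModifiedMs / 1000) * 1000;
  return lastModified <= since;
}
```

Note that the browser still makes a request for every file with this approach; it only saves re-sending the body.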
This is actually a very hard issue, and something that you can spend a while engineering the correct solution for.
I would recommend publishing your files using a timestamp and/or version built into the url, so instead of:
/media/js/my.js you end up with:
/media/js/v12/my.js or something similar.
You can automate the versioning/timestamping with any tool.
This has the added benefit of NOT breaking the site as you roll out new versions, and lets you do real side-by-side testing (unlike a rewrite rule that just strips the version and sends back the newest file).
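As a sketch of how a template helper could produce such URLs, assuming the version number comes from your build or version control system (the `versionedUrl` name is made up for this example):

```javascript
// Insert the release version as a path segment in front of the file name.
function versionedUrl(url, version) {
  const slash = url.lastIndexOf("/");
  if (slash === -1) {
    return "v" + version + "/" + url; // bare file name with no directory part
  }
  const dir = url.slice(0, slash);    // e.g. "/media/js"
  const file = url.slice(slash + 1);  // e.g. "my.js"
  return dir + "/v" + version + "/" + file;
}

// versionedUrl("/media/js/my.js", 12) returns "/media/js/v12/my.js"
```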
One thing to watch out for with JS or CSS is when you include dependent urls inside of them (background images, etc) you need to make sure the JS/CSS timestamp/version changes if a resource inside does (as well as rewrite them, but that is possible with a very simple regex and a resource manifest).
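The regex-plus-manifest rewrite could look roughly like this. It is only a sketch: the manifest shape (original path mapped to versioned path) and the `rewriteCssUrls` name are assumptions, and the pattern handles plain `url(...)` references, not every form CSS allows.

```javascript
// Rewrite url(...) references in a stylesheet using a manifest that maps
// original paths to versioned ones. References missing from the manifest
// are left exactly as they were.
function rewriteCssUrls(css, manifest) {
  const urlPattern = /url\(\s*(['"]?)([^'")]+)\1\s*\)/g;
  return css.replace(urlPattern, (match, quote, path) => {
    const key = path.trim();
    if (!Object.prototype.hasOwnProperty.call(manifest, key)) {
      return match;
    }
    return "url(" + quote + manifest[key] + quote + ")";
  });
}
```

Because the rewritten stylesheet's content changes whenever an image version changes, the stylesheet's own version or timestamp needs to change with it.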
No matter what you do, make sure not to toss a ?vblah on the end, as you are basically throwing caching out the window when you do that (which is unfortunate, as it is by far the easiest way to handle this).
If you use the file's "modified time" as a timestamp, the file will be cached until it is modified. Just use a helper function (or whatever it is called in other frameworks) to add script/css/image tags that get the timestamp from the file. On a Unix-like system (which most servers are) you can simply touch the files to force the modified time to change if necessary.
Ruby on Rails uses this strategy in production mode (by default, I believe), and uses a normal timestamp in development mode (to be really sure something isn't cached).
If you use Maven, you can add this to your pom.xml:
<properties>
    <maven.build.timestamp.format>yyyyMMddHHmm</maven.build.timestamp.format>
    <timestamp>${maven.build.timestamp}</timestamp>
</properties>
With this you can access ${timestamp} in your view.
Like this sample:
<script type="text/javascript" src="/js/myScript.js?t=${timestamp}"></script>
Based on Todd Berman's answer of incorporating a revision number into the URL (but not as a query string), a perhaps slightly more convenient approach would be to have the server transform the versioned URL into a canonical form. This could be done with symlinks, e.g.:
/media/js/v12/my.js => /media/js/my.js
or you could set up server-side URL rewrites to always transform paths of the form /media/js/v*/my.js to, say, /media/js/my.js.
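The mapping such a rewrite rule performs can be sketched as a small function. This is only an illustration of the transformation, not a server configuration; the `canonicalPath` name is made up, and the pattern requires digits after the "v" (an assumption on my part) so that a directory like /media/js/vendor/ is not mistaken for a version.

```javascript
// Map a versioned path such as /media/js/v12/my.js to its canonical form
// /media/js/my.js. Paths without a version segment are returned unchanged.
function canonicalPath(path) {
  return path.replace(/^(\/media\/js)\/v\d+\//, "$1/");
}
```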
