This is not a Meta question.
I am trying to understand, technically, what principle lies behind the following behaviour. It's very easy to reproduce:
Vote up/down anything on this page¹,
Click on any other link on this page,
Come back by pressing the back button.
Your upvote is gone, along with any other AJAX activity that had appeared on the page.
Why is that? Why is the browser acting this way? How could Stack Overflow prevent it?
¹ If you are not logged in, just wait for someone else's activity on the page (a new comment, answer, or vote) before navigating away.
It’s the browser’s cache that is at play here.
Since you asked how SO could "prevent" this: it could be done by advising the browser to check whether the document has changed every time. But SO does not do so, for performance reasons. So the HTML document is seen as "still valid" for a certain amount of time, during which the browser takes it straight from its cache, without making a round-trip to the server.
If you look at the HTTP response headers in your browser's developer tools for the request your browser made for this page, you will see something like this:
Cache-Control: public, no-cache="Set-Cookie", max-age=60
– so this HTML document is to be considered valid for 60 seconds. If you navigate away from it and back in your browser, or close the tab and reopen it from history, within that 60 seconds, the browser is supposed to take the cached version of it and display it, without checking again with the server whether or not something has changed. And since your vote did not manipulate this original HTML document (only the DOM was updated with your vote), you still get the previous vote count shown.
But if you press [F5] in your browser, the cache will be circumvented – it will request the document from SO again, and then you see your vote, because this time the updated numbers are part of the updated HTML document that SO serves you.
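For reference, the kind of header that would force that check on every navigation (this is only an illustration, not what SO actually sends) looks like:
Cache-Control: no-cache
or
Cache-Control: max-age=0, must-revalidate
With either of these the browser may still keep a copy, but it has to revalidate it with the server (via If-Modified-Since / If-None-Match) before showing it again.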
If you want to delve more into HTTP caching, some resources off the top of Google that seem worth a look:
Caching Tutorial for Web Authors and Webmasters
A Beginner's Guide to HTTP Cache Headers
You are not "unvoting"; you are just not seeing your vote because your browser is caching the AJAX request.
If you press F12 in Chrome, click the Settings icon and then check "Disable cache (while DevTools is open)", the browser will resend the request when you press back.
To prevent that, you must specify in your code that you never want that specific request to be cached.
You may want to check the following post:
Prevent browser caching of jQuery AJAX call result
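For example, in jQuery this is just the cache option (the URL below is only a placeholder):
// Per request: jQuery appends a timestamp parameter ("_=...") to bust the cache
$.ajax({
  url: '/votes/count',   // placeholder URL
  cache: false,
  success: function (data) {
    console.log(data);
  }
});
// Or globally, for every jQuery AJAX request:
$.ajaxSetup({ cache: false });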
P.S. You must keep the DevTools window (F12) open while doing the test.
First, I found some resources online here and here that say roughly the same thing:
For a normal/soft reload, the browser will re-validate the cache, checking to see if the files are modified.
I tested it in Chrome. I have a webpage index.html which loads a few JavaScript files at the end of the body. When hitting the refresh button (soft/normal), I saw from the network panel that index.html was 304 Not Modified, which was good. However, all the JavaScript files were loaded from the memory cache with status code 200. No revalidation!
Then I tried modifying one of the javascript files. Did the soft reload. And guess what? That file was still loaded from memory cache!
Why does Chrome do this? Doesn't that defeat the purpose of the refresh button?
Here is more information about Chrome's memory cache.
This is relatively new behaviour, introduced in 2017 by the Chrome browser.
The well-known behaviour of browsers is to revalidate cached resources when the user refreshes the page (either with the CTRL+R combination or the dedicated refresh button) by sending an If-Modified-Since or If-None-Match header. This happens for all resources obtained by GET request: stylesheets, scripts, HTML documents, etc. It leads to tons of HTTP requests that in the majority of cases end with 304 Not Modified responses.
The most popular websites are the ones with constantly changing content, so their users tend to refresh them habitually to get the latest news, tweets, videos and posts. It's not hard to imagine how many unnecessary requests were made every second, and since, as the saying goes, the best request is the one never made, Facebook decided to address this problem and asked Chrome and Firefox to find a solution together.
Chrome came up with the described solution.
Instead of revalidating each subresource, Chrome only checks whether the HTML document changed. If it didn't, it is very likely that everything else wasn't modified either, so it is served from the browser's cache. This works best when each resource has a content-addressed URL, for example a URL that contains a hash of the file's content. Users can always override this behaviour by performing a hard refresh.
Firefox's solution gives more control to developers, and it is well on its way to being implemented by all browser vendors: the new Cache-Control directive, immutable.
You can find more information about it here: https://developer.mozilla.org/pl/docs/Web/HTTP/Headers/Cache-Control#Revalidation_and_reloading
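As a rough sketch of how the immutable directive might be sent (assuming a Node/Express server and content-hashed file names; neither is prescribed by the resources above):
var express = require('express');
var app = express();
// Fingerprinted assets (e.g. app.3f2a1b.js) never change in place, so the
// browser is told it never needs to revalidate them, not even on a normal reload.
app.use('/assets', express.static('assets', {
  setHeaders: function (res) {
    res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
  }
}));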
Resources:
Facebook's article about the motivation behind proposed change, numbers, comparisons: https://code.fb.com/web/this-browser-tweak-saved-60-of-requests-to-facebook/?utm_source=codedot_rss_feed
Chromium team introducing new behaviour: https://blog.chromium.org/2017/01/reload-reloaded-faster-and-leaner-page_26.html
Browser caches are a little more complex than they once were: rather than dealing only in simple 200s and 304s, they pay attention to server-side directives in headers that tell them how to handle caching for each specific site.
We can adjust a browser's caching profile using various headers (such as Cache-Control). By setting the time before expiry, you can tell a browser to use its local copy instead of requesting a fresh one, and this can be quite aggressive for content you really don't want to change (e.g. a company's logo), by doing something like Cache-Control: public, max-age=31536000.
Additionally, you can also set the Expires header, which allows you to do almost the same as Cache-Control but with a little less control. It just sets the amount of time that passes before the browser considers an asset stale and re-requests it. Although with a re-request we could still get a cached result if a 304 Not Modified response code is sent back from the server.
A lot of web servers have settings enabled to allow more aggressive caching of certain asset files (js, images, css) but less aggressive caching of content files.
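As an illustration (the values and paths are made up), a typical split looks something like this:
Cache-Control: public, max-age=31536000              (static assets such as /logo.png or /app.css)
Cache-Control: private, max-age=0, must-revalidate   (content pages such as /index.html)
Expires: Thu, 31 Dec 2026 23:59:59 GMT               (the older, less flexible alternative)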
We're developing a web application that handles state changes via changes to the page's hash (e.g. example.com/#/page1).
Lately, I've been running into an issue with Google Chrome, when the prefetch option is enabled ("Predict network actions to improve page load performance"). Among the different routes, we have #/logout that performs the logout.
In the "normal" state, I'm on the page example.com/#/ (the main page), and as I start typing "l" after that (example.com/#/l), Chrome autocompletes with logout. However, not only does it autocomplete, it also fires the "hashchange" event, so the client sends a logout request to the server... just from typing an "l"!
This behaviour is not only unexpected, it's also dangerous. Aside from unchecking "Predict network actions to improve page load performance" on the settings page (which is on by default), is there a way to prevent Chrome from doing this?
EDIT
A small new "discovery". Actually, Chrome is not firing the "hashchange" event, as a console.log in the event handler is not being executed. Chrome has learnt that, when visiting the #/logout page, a request to the server (GET /auth/destroy) is made, and so it is firing that request by itself! What can we do to stop this?
Answering my own question. This is not really a solution, but rather a workaround.
According to this documentation, prerendering is disabled in certain situations: with POST requests (not an option in our case) and when the resources are served via HTTPS.
Since we were already going to enable HTTPS in the production environment, we just enabled it in the development one as well and the issue disappeared. However, I still feel like this is more of a workaround than a real solution.
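If changing the logout flow is possible (it wasn't practical for us), the other situation mentioned in that documentation, a POST request, can be used instead of a plain GET link. A rough sketch, reusing our /auth/destroy endpoint (the server would need to accept POST for it):
// Trigger logout with an explicit POST instead of navigating to #/logout;
// prerendering does not apply to POST requests.
function logout() {
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/auth/destroy');
  xhr.onload = function () {
    window.location.href = '/';  // back to the main page once logged out
  };
  xhr.send();
}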
I am quite new to web programming and trying to get my head around iframes.
So, let us say I have an iframe on my webpage (which is on a server) to the popular bbc site as follows:
<iframe src="http://www.bbc.co.uk"></iframe>
Now, when the user goes to my page, the iframe loads - but who is making the calls within the iframe (i.e. for the BBC content)? Is it my server or the user?
I guess another way to ask the question is: whose IP will the BBC's logs see in this case? The web server's or the user's IP?
Stupid question I suppose, but I just am confused!
The user's web browser would still be making the request.
You can use your browser's developer tools to see this happen and confirm (they usually pop up by pressing F12). Please become comfortable with them as they will be one of your trusty tools for web development in the future. :)
So, to answer your question: regardless of where the page holding the iframe lives, ultimately the user is still making the request, therefore their IP is what should show up.
Your visitor's browser will simply get a whole HTML page from your server, and after that it's up to the browser to make do. As a result, all further calls, whether for external scripts, images, or iframes, will be made by the client.
I'm having problems with a website in Chrome.
Most of the site uses AJAX/XMLHttpRequest for page loads and the History API to enable the back button. Only the page content changes with each request; the menu etc. are never reloaded. Pressing back just re-runs the AJAX request for the previous page. This all works fine until someone clicks the back button after viewing the blog. The blog isn't loaded with AJAX; it's just a standard link.
In Firefox, if I go to the blog then press back, the site loads correctly: the main page with the navigation is loaded, and so is the page to be viewed within it.
In Chrome however if I press the back button from the blog the 'outer' page isn't loaded, only the contents of the ajax request is. You may need to view it to fully understand.
Is this a bug in Chrome or my work? It seems I can't return to a page that was partially loaded using xmlhttprequest as only the requested item is loaded.
The site is here: http://www.basmooarc.com
Thanks
Ric
short answer
Add a Cache-Control: no-store HTTP header for XHR responses.
long answer
I'm pretty sure this is a bug in Chrome. I found the exact same bug in my app, and it works fine in Firefox but breaks in Chrome. I think the issue is that Chrome caches the XHR response and serves it from the cache when you press the back button. My app uses Etags, but Chrome does not bother to check the Etag. It just uses the cached response, which is missing all the outer content. The best solution I've come up with so far is to add no-store to the cache control header for XHR responses.
You can see the behavior through Chrome's net-internals page. Type chrome://net-internals in the URL bar, open the Events tab and go through the steps to reproduce your bug. When you go to a non-ajax page and then press the back button, you'll see a URL_REQUEST entry for the URL of the page you're trying to go to, but Chrome just checks the cache and that's it. Contrast that with a normal request for that URL. The normal one will have a cache check, followed by an HTTP_TRANSACTION_SEND_REQUEST section, which is where Chrome makes the actual HTTP request.
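As a minimal sketch of the short answer (assuming a Node/Express backend and an /ajax path prefix, both of which are just my example):
// Mark the partial-page AJAX responses as never cacheable, so pressing
// Back cannot replay the bare fragment from the cache.
app.use('/ajax', function (req, res, next) {
  res.setHeader('Cache-Control', 'no-store');
  next();
});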
If JavaScript modifies the DOM on page A, the user navigates to page B, and then hits the back button to get back to page A, all modifications to the DOM of page A are lost and the user is presented with the version that was originally retrieved from the server.
It works that way on stackoverflow, reddit and many other popular websites. (try to add test comment to this question, then navigate to different page and hit back button to come back - your comment will be "gone")
This makes sense, yet some websites (apple.com, basecamphq.com, etc.) are somehow forcing the browser to serve the user the latest state of the page. (Go to http://www.apple.com/ca/search/?q=ipod, click on, say, the Downloads link at the top and then click the back button - all DOM updates will be preserved.)
Where is the inconsistency coming from?
One answer: Among other things, unload events cause the back/forward cache to be invalidated.
Some browsers store the current state of the entire web page in the so-called "bfcache" or "page cache". This allows them to re-render the page very quickly when navigating via the back and forward buttons, and preserves the state of the DOM and all JavaScript variables. However, when a page contains onunload events, those events could potentially put the page into a non-functional state, and so the page is not stored in the bfcache and must be reloaded (but may be loaded from the standard cache) and re-rendered from scratch, including running all onload handlers. When returning to a page via the bfcache, the DOM is kept in its previous state, without needing to fire onload handlers (because the page is already loaded).
Note that the behavior of the bfcache is different from the standard browser cache with regards to Cache-Control and other HTTP headers. In many cases, browsers will cache a page in the bfcache even if it would not otherwise store it in the standard cache.
jQuery automatically attaches an unload event to the window, so unfortunately using jQuery will disqualify your page from being stored in the bfcache for DOM preservation and quick back/forward. [Update: this has been fixed in jQuery 1.4 so that it only applies to IE]
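If you need to detect whether a particular load came out of the bfcache (for example to refresh data that may have gone stale), the pageshow event exposes that, and pagehide can replace unload so you don't disqualify yourself. A small sketch:
window.addEventListener('pageshow', function (event) {
  // event.persisted is true when the page was restored from the bfcache
  if (event.persisted) {
    console.log('restored from the back/forward cache');
  }
});
// Prefer pagehide over unload: registering an unload handler is exactly
// what can keep a page out of the bfcache in some browsers.
window.addEventListener('pagehide', function () {
  // save any state here
});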
Information about the Firefox bfcache
Information about the Safari Page Cache and possible future changes to how unload events work
Opera uses fast history navigation
Chrome doesn't have a page cache ([1], [2])
Pages for playing with DOM manipulations and the bfcache:
This page will be stored in the regular cache
This page will not, but will still be bfcached
I've been trying to get Chrome to behave like Safari does, and the only way I've found that works is to set Cache-control: no-store in the headers. This forces the browser to re-fetch the page from the server when the user presses the back button. Not ideal, but better than being shown an out-of-date page.
Facebook remembers page state by modifying the hash identifier in the URL for AJAX requests. These changes are recorded in browser history, so when the user clicks the back button, the hash changes back to what it was before. The implication is that you will need some JavaScript to monitor the hash identifier and react when it is changed by the browser. Andreas Blixt has a hash monitoring script available.
This has nothing to do with the hash (#) symbol.
If you check Apple's HTTP headers, it is simply caching the page.
Using the URL hash/fragment identifier is a pretty common way to hook/remember state in a web application that relies on Ajax and DOM updates.
Check out the Really Simple History project for some ideas. It's possible to monitor the URL for changes to the hash, and rsh does this, taking into account browser differences.
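In browsers that support it, the hashchange event does that monitoring for you (older browsers are why polling scripts like rsh exist). A minimal sketch, where renderPage stands in for your own routing code:
window.addEventListener('hashchange', function () {
  var state = window.location.hash.slice(1);  // e.g. "page1"
  renderPage(state);  // hypothetical function that redraws the page for that state
});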
For anybody running into problems with Rails and this: your issue isn't the bfcache (I thought it was), it's the turbolinks gem. Here is how to remove it.
Hopefully this'll save you some time and banging your head against the wall.
What you are looking for is some type of URL hash management. The # in the URL is for the client side only.
When you change the state of the page with JS, you update the data in the # part of the URL.
You also add some type of polling that monitors whether the hash has changed, and loads the state of the page based on the new data in the hash.
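A bare-bones version of that polling approach (loadState is a placeholder for whatever restores your page from the hash data):
var lastHash = window.location.hash;
setInterval(function () {
  if (window.location.hash !== lastHash) {
    lastHash = window.location.hash;
    loadState(lastHash.slice(1));  // rebuild the page state from the new hash
  }
}, 100);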
Take a look at this:
http://ajaxpatterns.org/Unique_URLs