Execution order of Http Response headers? - javascript

I saw this plugin which download files using Ajax and some other fallback techniques.
But since ajax download file feature is not supported in all browsers , he used a trick with Iframe. ( which is pretty easy to implement)
But one thing caught my eye :
He also added an option which tells you when the file has finished
download.
He did it via cookie. He polls to see if the cookie via setInterval. as long as the cookie does not exist - the file wasn't finish download.( and when the cookie is present - the file has downloaded)
So the header for downloading a file is:
Content-Disposition: attachment; filename=Report0.pdf
And he added :
Set-Cookie: fileDownload=true; path=/
But then I thought - who said that set-cookie is called after the file has finish downloaded ?
Questions:
Looking at the actual headers :
1 - Does the browser digest each header according to the actual order of appearance ?
2 - Are there any headers which must appear prior to other headers ?
3 - Does the digest of each header - blocks the digest until current hedare digest is completed ? I mean : does the line content-disposition:attachment;filename=1.jpg prevents the browser from digesting the next header - until the filename=1.jpg is finished loading ?
nb
I've also tried investigate it via fiddler but I didn't get any conclusion.( I mean how can I test it in fiddler ?)

You're right to be skeptical.
There's no requirement that a client wait until the response body is complete to evaluate the Set-Cookie header that preceded the body, and there's in fact good reason to believe that most browsers will set the cookie before the body is complete (since many web pages will look at document.cookie in JavaScript inside a HTML page).
In fact, I tested this (using a MeddlerScript you can see here: http://pastebin.com/SUwCFyxS) and found that IE, Chrome and Firefox all set the cookie before the download completes, and set the cookie even if the user hits "Cancel" on the download.
The HTTP specification includes the notion of a Trailer (which is a header that appears after the response body) but these are little used and not supported in many clients (e.g. WinINET/IE). If the client did support Trailers, the server could send the Set-Cookie header after the body which would mean that the client couldn't see it until the body finished downloading.

Related

Looking at how imgur serves its images. How come when I check network tab in Chrome, its sending this PNG and its only 387B? [duplicate]

I'm using the Google "Page Speed" plug-in for Firefox to access my web site.
Some of the components on my page is indicated as HTTP status:
200
200 (cache)
304
By Google's "Page Speed".
What I'm confused about is the difference between 200 (cache) and 304.
I've refreshed the page multiple times (but have not cleared my cache) and it always seems that my favicon.ico and a few images are status=200 (cache) while some other images are http status 304.
I don't understand why the difference.
UPDATE:
Using Google "Page Speed", I receive a "200 (cache)" for http://example.com/favicon.ico as well as http://cdn.example.com/js/ga.js
But, I receive a http status "304" for http://cdn.example.com/js/combined.min.js
I don't understand why I have two JavaScript files located in the same directory /js/, one returning a http status 304 and the other returning a 200 (cache) status code.
The items with code "200 (cache)" were fulfilled directly from your browser cache, meaning that the original requests for the items were returned with headers indicating that the browser could cache them (e.g. future-dated Expires or Cache-Control: max-age headers), and that at the time you triggered the new request, those cached objects were still stored in local cache and had not yet expired.
304s, on the other hand, are the response of the server after the browser has checked if a file was modified since the last version it had cached (the answer being "no").
For most optimal web performance, you're best off setting a far-future Expires: or Cache-Control: max-age header for all assets, and then when an asset needs to be changed, changing the actual filename of the asset or appending a version string to requests for that asset. This eliminates the need for any request to be made unless the asset has definitely changed from the version in cache (no need for that 304 response). Google has more details on correct use of long-term caching.
200 (cache) means Firefox is simply using the locally cached version. This is the fastest because no request to the Web server is made.
304 means Firefox is sending a "If-Modified-Since" conditional request to the Web server. If the file has not been updated since the date sent by the browser, the Web server returns a 304 response which essentially tells Firefox to use its cached version. It is not as fast as 200 (cache) because the request is still sent to the Web server, but the server doesn't have to send the contents of the file.
To your last question, I don't know why the two JavaScript files in the same directory are returning different results.
This threw me for a long time too. The first thing I'd verify is that you're not reloading the page by clicking the refresh button, that will always issue a conditional request for resources and will return 304s for many of the page elements. Instead go up to the url bar select the page and hit enter as if you had just typed in the same URL again, that will give you a better indicator of what's being cached properly. This article does a great job explaining the difference between conditional and unconditional requests and how the refresh button affects them:
http://blogs.msdn.com/b/ieinternals/archive/2010/07/08/technical-information-about-conditional-http-requests-and-the-refresh-button.aspx
HTTP 304 is "not modified". Your web server is basically telling the browser "this file hasn't changed since the last time you requested it." Whereas an HTTP 200 is telling the browser "here is a successful response" - which should be returned when it's either the first time your browser is accessing the file or the first time a modified copy is being accessed.
For more info on status codes check out http://en.wikipedia.org/wiki/List_of_HTTP_status_codes.
For your last question, why ? I'll try to explain with what I know
A brief explanation of those three status codes in layman's terms.
200 - success (browser requests and get file from server)
If caching is enabled in the server
200 (from memory cache) - file found in browser, so browser is not going request from server
304 - browser request a file but it is rejected by server
For some files browser is deciding to request from server and for some it's deciding to read from stored (cached) files. Why is this ? Every files has an expiry date, so
If a file is not expired then the browser will use from cache (200 cache).
If file is expired, browser requests server for a file. Server check file in both places (browser and server). If same file found, server refuses the request. As per protocol browser uses existing file.
look at this nginx configuration
location / {
add_header Cache-Control must-revalidate;
expires 60;
etag on;
...
}
Here the expiry is set to 60 seconds, so all static files are cached for 60 seconds. So if u request a file again within 60 seconds browser will read from memory (200 memory). If u request after 60 seconds browser will request server (304).
I assumed that the file is not changed after 60 seconds, in that case you would get 200 (ie, updated file will be fetched from server).
So, if the servers are configured with different expiring and caching headers (policies), the status may differ.
In your case you are using cdn, the main purpose of cdn is high availability and fast delivery. Therefore they use multiple servers. Even though it seems like files are in same directory, cdn might use multiple servers to provide u content, if those servers have different configurations. Then these status can change. Hope it helps.

Does browser continue to send requests for resource even with `Cache-Control: max-age`

I'm reading this great article on caching and there is the following there:
Validators are very important; if one isn’t present, and there isn’t
any freshness information (Expires or Cache-Control) available, caches
will not store a representation at all.
The most common validator is the time that the document last changed,
as communicated in Last-Modified header. When a cache has a
representation stored that includes a Last-Modified header, it can use
it to ask the server if the representation has changed since the last
time it was seen, with an If-Modified-Since request.
So, I'm wondering whether browser continues to send requests (for example HEAD) for a resource even if I specified Cache-Control: max-age=3600? If it doesn't, than what's the point in this header? Is it used after the max-age time passes?
The Cache-Control: max-age=3600 header means that the browser will cache the response for up to 3600 seconds. After that time has passed it may no longer serve the response without first confirming that it is still fresh.
In order to this, the browser can either:
Fetch the full resource with a normal GET request (transfers the whole response body again)
Or perform a revalidation based on an ETag (If-None-Match) or the Last-Modified header (If-Modified-Since), i.e. the client only fetches the response body if it has actually changed. This is of course only possible if the validator was present in the original response.
In short: the reason to use both max-age and a cache validator is to first cache the response for some time and then perform a bandwidth-saving revalidation to confirm the resource's freshness.

Why directive templates sometimes are loaded from server (with 304 response) and sometimes from browser cache (no request at all)?

When I reload page, Angular directive templates are loaded in two ways.
First one - browser makes a request to the server and it responses with 304 - it's ok.
But the second one - browser doesn't make a request. And I can't guess why.
As a result, when I make changes in templates from the first group, the changes are shown with the next page reload. But the changes in templates from the second group are not shown. That's the trouble.
And the question is - how to make the browser send request to the server for each template?
It seems that in the response headers for templates there are no Cache-Control headers. In such case a browser will use a heuristic to decide for how long a response can be cached.
To solve your problem of having always fresh templates fetched in development. You can:
check "Disable cache" in developer tools
set a proper Cache-Control header for resources you care about i.e :
Cache-Control: no-cache
If you want to understand different behaviours triggered by various Cache-Control values I highly recommend this article by Ilya Grigorik.

"Re open last closed tab" causing to show last ajax request content

I am using HTML 5 history api to save state when ajax requests happen and i provide full html content if user request to same page with none ajax request.
"Re-open last closed tab" feature of browser brings last ajax request content without hitting to server. If browser would request without bring last request content then everything would work without problem. But browser just show last ajax request content.
I have been experienced this on Chrome 17, Firefox 10. (i haven't tried it on ie9 because it has no support history api)
What is well-known solution for this problem ?
Edit: These ajax requests are just "get" request to server.
it is really not possible to demonstrate it in jsfiddle.net because few reasons. You can demonstrate it in your localhost like below.
Make "get" request to server and pull json objects then push that url into history api like below.
history.pushState(null,null,url);
Then close that tab and click "Re-open last closed tab" feature of your browser. What do you see ? Json response body ? Browser shows it without making request to server, right ?
Problem was causing by http response headers. Headers was contain cacheable information for ajax requests so browser was showing that url content from cache without hit to database.
After removing cache params from response headers then browser was able to hit server without brings content from cache.
When you reopen a closed tab, the browser is allowed to reuse the data from cache for the given URL to populate the window. Since the data in cache is from the ajax request response, that's what it uses, and you see the JSON.
So that leads to the question: Why didn't the browser use the HTML from cache when satisfying the ajax request? Browsers use different rules to determine whether to use cached content depending on what they're doing. In this case, it appears Chrome is happy to reuse it when restoring the recently-closed tab, and not when doing the ajax request.
You can correct it by telling the browser to never cache the response. Whether that's desirable depends on your use case.
For instance, inserting these at the top of your file (after the opening <?php tag, of course) makes it not happen for me:
header("Cache-Control: no-store, no-cache, must-revalidate, max-age=0");
header("Cache-Control: post-check=0, pre-check=0", false);
header("Pragma: no-cache");
It all depends which browser you are using, and which optimisations are enabled.
Google Chrome for instance will keep the page in memory, so when you hit back and it goes to a new site, or when you do re-open closed tab - it will restore the page from memory.
Older/slower browsers will just refresh anything.
Though this shouldn't really be a problem, as your javascript state should be restored as well - it should be exactly the same in every way when they re-open that page.

Why serve 1x1 pixel GIF (web bugs) data at all?

Many analytic and tracking tools are requesting 1x1 GIF image (web bug, invisible for the user) for cross-domain event storing/processing.
Why to serve this GIF image at all? Wouldn't it be more efficient to simply return some error code such as 503 Service Temporary Unavailable or empty file?
Update: To be more clear, I'm asking why to serve GIF image data when all information required has been already sent in request headers. The GIF image itself does not return any useful information.
Doug's answer is pretty comprehensive; I thought I'd add in an additional note (at the OP's request, off of my comment)
Doug's answer explains why 1x1 pixel beacons are used for the purpose they are used for; I thought I'd outline a potential alternative approach, which is to use HTTP Status Code 204, No Content, for a response, and not send an image body.
204 No Content
The server has fulfilled the request
but does not need to return an
entity-body, and might want to return
updated metainformation. The response
MAY include new or updated
metainformation in the form of
entity-headers, which if present
SHOULD be associated with the
requested variant.
Basically, the server receives the request, and decides to not send a body (in this case, to not send an image). But it replies with a code to inform the agent that this was a conscious decision; basically, its just a shorter way to respond affirmatively.
From Google's Page Speed documentation:
One popular way of recording page
views in an asynchronous fashion is to
include a JavaScript snippet at the
bottom of the target page (or as an
onload event handler), that notifies a
logging server when a user loads the
page. The most common way of doing
this is to construct a request to the
server for a "beacon", and encode all
the data of interest as parameters in
the URL for the beacon resource. To
keep the HTTP response very small, a
transparent 1x1-pixel image is a good
candidate for a beacon request. A
slightly more optimal beacon would use
an HTTP 204 response ("no content")
which is marginally smaller than a 1x1
GIF.
I've never tried it, but in theory it should serve the same purpose without requiring the gif itself to be transmitted, saving you 35 bytes, in the case of Google Analytics. (In the scheme of things, unless you're Google Analytics serving many trillions of hits per day, 35 bytes is really nothing.)
You can test it with this code:
var i = new Image();
i.src = "http://httpstat.us/204";
First, i disagree with the two previous answers--neither engages the question.
The one-pixel image solves an intrinsic problem for web-based analytics apps (like Google Analytics) when working in the HTTP Protocol--how to transfer (web metrics) data from the client to the server.
The simplest of the methods described by the Protocol, the simplest (at lest the simplest method that includes a request body) is the GET request. According to this Protocol method, clients initiate requests to servers for resources; servers process those requests and return appropriate responses.
For a web-based analytics app, like GA, this uni-directional scheme is bad news, because it doesn't appear to allow a server to retrieve data from a client on demand--again, all servers can do is supply resources not request them.
So what's the solution to the problem of getting data from the client back to the server? Within the HTTP context there are other Protocol methods other than GET (e.g., POST) but that's a limited option for many reasons (as evidenced by its infrequent and specialized use such as submitting form data).
If you look at a GET Request from a browser, you'll see it is comprised of a Request URL and Request Headers (e.g., Referer and User-Agent Headers), the latter contains information about the client--e.g., browser type and version, browser langauge, operating system, etc.
Again, this is part of the Request that the client sends to the server. So the idea that motivates the one-pixel gif is for the client to send the web metrics data to the server, wrapped inside a Request Header.
But then how to get the client to Request a resource so it can be "tricked" into sending the metrics data? And how to get the client to send the actual data the server wants?
Google Analytics is a good example: the ga.js file (the large file whose download to the client is triggered by a small script in the web page) includes a few lines of code that directs the client to request a particular resource from a particular server (the GA server) and to send certain data wrapped in the Request Header.
But since the purpose of this Request is not to actually get a resource but to send data to the server, this resource should be a small as possible and it should not be visible when rendered in the web page--hence, the 1 x 1 pixel transparent gif. The size is the smallest size possible, and the format (gif) is the smallest among the image formats.
More precisely, all GA data--every single item--is assembled and packed into the Request URL's query string (everything after the '?'). But in order for that data to go from the client (where it is created) to the GA server (where it is logged and aggregated) there must be an HTTP Request, so the ga.js (google analytics script that's downloaded, unless it's cached, by the client, as a result of a function called when the page loads) directs the client to assemble all of the analytics data--e.g., cookies, location bar, request headers, etc.--concatenate it into a single string and append it as a query string to a URL (*http://www.google-analytics.com/__utm.gif*?) and that becomes the Request URL.
It's easy to prove this using any web browser that has allows you to view the HTTP Request for the web page displayed in your browser (e.g., Safari's Web Inspector, Firefox/Chrome Firebug, etc.).
For instance, i typed in valid url to a corporate home page into my browser's location bar, which returned that home page and displayed it in my browser (i could have chosen any web site/page that uses one of the major analytics apps, GA, Omniture, Coremetrics, etc.)
The browser i used was Safari, so i clicked Develop in the menu bar then Show Web Inspector. On the top row of the Web Inspector, click Resources, find and click the utm.gif resource from the list of resources shown on the left-hand column, then click the Headers tab. That will show you something like this:
Request URL:http://www.google-analytics.com/__utm.gif?
utmwv=1&utmn=1520570865&
utmcs=UTF-8&
utmsr=1280x800&
utmsc=24-bit&
utmul=enus&
utmje=1&
utmfl=10.3%20r181&
Request Method:GET
Status Code:200 OK
Request Headers
User-Agent:Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/533.21.1
(KHTML, like Gecko) Version/5.0.5 Safari/533.21.1
Response Headers
Cache-Control:private, no-cache, no-cache=Set-Cookie, proxy-revalidate
Content-Length:35
Content-Type:image/gif
Date:Wed, 06 Jul 2011 21:31:28 GMT
The key points to notice are:
The Request was in fact a request
for the utm.gif, as evidenced by the
first line above: *Request
URL:http://www.google-analytics.com/__utm.gif*.
The Google Analytics parameters are clearly visible in the query string
appended to the Request URL: e.g.,
utmsr is GA's variable name to refer to the client screen
resolution, for me, shows a value of
1280x800; utmfl is the variable
name for flash version, which has a
value of 10.3, etc.
The Response Header called
Content-Type (sent by the server back to the client) also confirms
that the resource requested and
returned was a 1x1 pixel gif:
Content-Type:image/gif
This general scheme for transferring data between a client and a server has been around forever; there could very well be a better way of doing this, but it's the only way i know of (that satisfies the constraints imposed by a hosted analytics service).
Some browsers may display an error icon if the resource could not load. It makes debugging/monitoring the service also a little bit more complicated, you have to make sure that your monitoring tools treat the error as a good result.
OTOH you don't gain anything. The error message returned by the server/framework is typically bigger then the 1x1 image. This means you increase your network traffic for basically nothing.
Because such a GIF has a known presentation in a browser - it's a single pixel, period. Anything else presents a risk of visually interfering with the actual content of the page.
HTTP errors could appear as oversized frames of error text or even as a pop-up window. Some browsers may also complain if they receive empty replies.
In addition, in-page images are one of the very few data types allowed by default in all broswers. Anything else may require explicit user action to be downloaded.
This is to answer the OP's question - "why to serve GIF image data..."
Some users will put a simple img tag to call your event logging service -
<img src="http://www.example.com/logger?event_id=1234">
In this case, if you don't serve an image, the browser will show a placeholder icon that will look ugly and give the impression that your service is broken!
What I do is, look for the Accept header field. When your script is called via an img tag like this, you will see something like following in the header of the request -
Accept: image/gif, image/*
Accept-Encoding:gzip,deflate
...
When there is "image/"* string in the Accept header field, I supply the image, otherwise I just reply with 204.
Well the major reason is to attach the cookie to it so if users go from one side to another we still have the same element to attach cookie to.
#Maciej Perliński is basically correct, but I feel a detailed answer will be beneficial.
why 1x1 GIF and not a 204 No-Content status code?
204 No-Content enables the server to omit all response headers (Content-Type, Content-Length, Content-Encoding, Cache-Control etc...) and return an empty response body with 0 bytes (and saving a lot of unneeded bandwidth).
Browsers know to respect 204 No-Content responses, and not to expect/wait for response headers and response body.
if the server needs to set any response header (e.g. cache-control or cookie), he cannot use 204 No-Content because browsers will ignore any response header by design (according to the HTTP protocol spec).
why 1x1 GIF and not a Content-Length: 0 header with 200 OK status code?
Probably a mix of several issues, just to name a few:
legacy browsers compatibility
MIME type checks on browsers, 0 bytes is not a valid image.
200 OK with 0 bytes might not be fully supported by intermediate proxy servers and VPNs
You don't have to serve an image if you are using the Beacon API (https://w3c.github.io/beacon/) implementation method.
An error code would work if you have access to the log files of your server. The purpose of serving the image is to obtain more data about the user than you normally would with a log file.

Categories