I was reading this post from Alex Maccaw, where he states:
The last issue is with Ajax requests that get sent out in parallel. If a user creates a record, and then immediately updates the same record, two Ajax requests will be sent out at the same time, a POST and a PUT. However, if the server processes the 'update' request before the 'create' one, it'll freak out. It has no idea what record needs updating, as the record hasn't been created yet.
The solution to this is to pipeline Ajax requests, transmitting them serially. Spine does this by default, queuing up POST, PUT and DELETE Ajax requests so they're sent one at a time. The next request is sent only after the previous one has returned successfully.
But the HTTP spec Sec 8.1.2.2 Pipelining says:
Clients SHOULD NOT pipeline requests using non-idempotent methods or non-idempotent sequences of methods (see section 9.1.2). Otherwise, a premature termination of the transport connection could lead to indeterminate results.
So, does Spine really 'pipeline' POSTs?
Maccaw's usage of the term "pipelining" and the HTTP spec's usage are not the same here. In fact, they're opposites.
In the HTTP spec, the term "pipelining" means sending multiple requests without waiting for a response. See section 8.1.2.2.
A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response).
Based on this definition, you can see why the spec would strongly discourage pipelining non-idempotent requests, as one of the pipelined requests might change the state of the app, with unexpected results.
When Maccaw writes about Spine's "pipelining", he's actually referring to the solution to the fact that, left alone, the client will "pipeline" requests without waiting for a response, in the HTTP spec's sense. That is, Spine queues the requests and submits them serially, each request being made only after its predecessor completes.
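A minimal sketch of that kind of serial queue (illustrative only, not Spine's actual code) could look like this:

// Queue requests and send them one at a time, in order.
const queue = [];
let busy = false;

function enqueue(url, options) {
  return new Promise((resolve, reject) => {
    queue.push({ url, options, resolve, reject });
    drain();
  });
}

function drain() {
  if (busy || queue.length === 0) return;
  busy = true;
  const { url, options, resolve, reject } = queue.shift();
  fetch(url, options)
    .then(resolve, reject)   // settle the caller's promise
    .finally(() => {         // then send the next queued request
      busy = false;
      drain();
    });
}

// The POST (create) is now guaranteed to complete before the PUT (update):
enqueue('/records', { method: 'POST', body: '{"name":"a"}' });
enqueue('/records/1', { method: 'PUT', body: '{"name":"b"}' });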
Related
In theory, one should use the HTTP GET method only for idempotent requests.
But, for some intricate reasons, I cannot use any other method than GET and my requests are not idempotent (they mutate the database). So my idea is to use the Cache-Control: no-cache header to ensure that any GET request actually hits the database. Also, I cannot change the URLs which means I cannot append a random URL argument to bust caches.
Am I safe or shall I implement some kind of mechanism to ensure that the GET request was received exactly once? (The client is the browser and the server is Node.js.)
What about a GET request that gets duplicated by some middle-man, resulting in the same GET request being received twice by the server? I believe the spec allows such a situation, but does this ever happen in "real life"?
I've never seen a middle-man, such as Cloudflare or NGINX, prevent or duplicate a GET request with Cache-Control: no-cache.
Let's start by saying what you've already pointed out -- GET requests should be idempotent. That is, they should not modify the resource and therefore should return the same thing every time (barring any other methods being used to modify it in the meantime.)
It's worth pointing out, as restcookbook.com notes, that this doesn't mean nothing can change as a result of the request. Rather, the resource's representation should not change. So for instance, your database might log the request, but shouldn't return a different value in the response.
The main concern you've listed is middleware caching.
The danger isn't that the middleware sends the request to your server more than once (you mentioned 'duplicating' a request), but rather that (a) it sends an old, cached, no-longer-accurate response to whatever is making the request, and (b) the request does not reach the server.
For instance, imagine a response returning a count property that starts at 0 and increments each time the GET endpoint is hit. Request #1 will return "1" as the count. Request #2 should now return "2", but if it's served from cache, it might still show 1, and never hit the server to increase the count to 2. That's two separate problems (a stale response, and a request that never reaches the server).
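To make that concrete, here's a hypothetical Node.js/Express endpoint with exactly this problem (Express and the /count route are stand-ins for illustration):

// A GET handler with a side effect -- the root of the problem.
// If a cache serves a stored response, this handler never runs,
// so the count is neither incremented nor reported correctly.
const express = require('express');
const app = express();

let count = 0;

app.get('/count', (req, res) => {
  count += 1;           // state mutated on GET
  res.json({ count });  // response depends on how many requests got through
});

app.listen(3000);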
So, will a middleware prevent a request from reaching the server and serve a cached copy instead? We don't know. It depends on the middleware. You can absolutely write one right now that does just that. You can also write one that doesn't.
If you don't know what will be consuming your API, then it's not a great option. But whether it's "safe" depends on the specifics.
As you know, it's always best to follow the set of expectations that comes with the grammar of HTTP requests. Deviating from them sets yourself up for failure in many ways. (For instance, there are different security expectations for requests based on method. A browser may treat a GET request as "simple" from a CORS perspective, while it would never treat a PATCH request as such.)
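For example (assuming a cross-origin endpoint at https://api.example.com, a placeholder host):

// A GET with no custom headers is a "simple" CORS request -- sent directly:
fetch('https://api.example.com/data');

// A PATCH is never "simple": the browser first sends an OPTIONS preflight
// and only issues the PATCH if the server's CORS headers allow it:
fetch('https://api.example.com/data', {
  method: 'PATCH',
  headers: { 'Content-Type': 'application/json' },
  body: '{}',
});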
I would go to great lengths to not break this convention, but if I were forced by circumstances to break this expectation, I would definitely note it in my API's documentation.
One workaround to ensure that your GET request is only called once is to allow caching of responses and use the Vary header, specified in RFC 7231, Section 7.1.4.
In summary, the Vary header tells any HTTP cache which parts of the request headers to take into account when trying to find a cached object.
For example, you have an endpoint /api/v1/something that accepts GET requests and does the required database updates. Let's say that when successful, this endpoint returns the following response.
HTTP/1.1 200 OK
Content-Length: 3458
Cache-Control: max-age=86400
Vary: X-Unique-ID
Notice the Vary header has a value of X-Unique-ID. This means that if you include the X-Unique-ID header in your request, any HTTP caching layer (be it the browser, CDN, or other middleware) will use the value in this header to determine whether to use a previously cached response or not.
Say you make a first request that includes an X-Unique-ID header with the value id_1, then make a subsequent request with an X-Unique-ID of id_2. The caching layer will not use a previously cached response for the second request because the value of X-Unique-ID is different.
However, if you make another request with the X-Unique-ID value id_1 again, the caching layer will not make a request to the backend but will instead reuse the response cached for the first request, assuming the cache hasn't expired yet.
One thing you have to consider though is this will only work if the caching layer actually respects the specifications for the Vary header.
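On the client side, that could look roughly like this (the X-Unique-ID header matches the example above; crypto.randomUUID() is just one way to generate the value):

// One ID per logical operation. A retry reuses the same ID, so a
// Vary-respecting cache replays the cached response instead of
// letting the backend run the update a second time.
async function mutateOnce(url) {
  const id = crypto.randomUUID();
  const doRequest = () => fetch(url, { headers: { 'X-Unique-ID': id } });
  try {
    return await doRequest();
  } catch (err) {
    return doRequest();  // retry with the same ID
  }
}

mutateOnce('/api/v1/something').then(res => console.log(res.status));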
The Hypertext Transfer Protocol (HTTP) is designed to enable communication between clients and servers, and the GET method is used to request data from a specified resource.
Strictly speaking, 'Cache-Control: no-cache' means a cache must revalidate a stored response with the origin server before reusing it (it's 'no-store' that forbids storing anything). In practice, the request hits the server and a full response is downloaded each and every time.
This depends a lot on what sits in the middle and where the retry logic lives, if there is any. Almost all of your problems will come from failure handling and retry handling, not the basic requests.
Let's say, for example, that Alice talks to Bob via a proxy. Assume for the sake of simplicity that the requests are small and the proxy logic is pure store-and-forward, i.e. most of the time a request either gets through or doesn't, and is unlikely to stall half-way through. There's no guarantee this is the case, and some proxies will stop requests part-way through by design.
Alice -> Proxy GET
Proxy -> Bob GET
Bob -> Proxy 200
Proxy -> Alice 200
So far so good. Now imagine Bob dies before responding to the proxy. Does the proxy retry? If so, we have this:
Alice -> Proxy GET
Proxy -> Bob GET
Bob manipulates database then dies
Proxy -> Bob GET (retry)
Now we have a dupe
Unlikely, but possible.
Now imagine (much more likely) that the proxy (or even more likely, some bit of the network between the proxy and the client) dies. Does the client retry? If so, we have this:
Alice -> Proxy GET
Proxy -> Bob GET
Bob -> Proxy 200
Proxy or network dies
Alice -> Proxy GET (retry)
Proxy -> Bob GET
Is this a dupe or not? Depends on your point of view.
Plus, for completeness there's also the degenerate case where the server receives the request zero times.
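If you need to rule out duplicates despite all that, the usual answer is to deduplicate on the server. A rough sketch (the x-unique-id header, the in-memory Map, and mutateDatabase() are all hypothetical; production code would use persistent storage):

// Replay the stored response for a request ID we've already seen,
// instead of running the mutation a second time.
const seen = new Map();

function handle(req, res) {
  const id = req.headers['x-unique-id'];
  if (id && seen.has(id)) {
    res.end(seen.get(id));        // duplicate: replay, don't re-mutate
    return;
  }
  const body = mutateDatabase();  // hypothetical: does the real work
  if (id) seen.set(id, body);
  res.end(body);
}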
What happens when a client triggers a single GET request while an HTTP/2 push for the same resource is still in-flight?
What is the specified behavior and what do the browsers actually do?
An example scenario could look like this:
at time 0: GET / (get document) and the server pushes /data.json
at time 1: GET /data.json (triggered by script, while the h2 push is still not finished / in-flight)
Will this result in two calls towards the server? Is this behavior specified, or is it browser-specific, e.g. in Chromium maybe via the HTTP Cache:
The cache implements a single writer - multiple reader lock so that only one network request for the same resource is in flight at any given time.
https://www.chromium.org/developers/design-documents/network-stack/http-cache
The HTTP/2 specification in RFC 7540 says:
Once a client receives a PUSH_PROMISE frame and chooses to accept the pushed response, the client SHOULD NOT issue any requests for the promised response until after the promised stream has closed.
So it seems likely that the request will wait for the pushed response to be delivered, provided the server does not take too long to start sending:
If the client determines, for any reason, that it does not wish to receive the pushed response from the server or if the server takes too long to begin sending the promised response, the client can send a RST_STREAM frame, using either the CANCEL or REFUSED_STREAM code and referencing the pushed stream's identifier.
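For reference, the server side of such a push looks roughly like this with Node's built-in http2 module (certificate paths and payloads are placeholders):

// Push /data.json alongside the document. The PUSH_PROMISE goes out
// before the document body, so the client knows not to request
// /data.json itself.
const http2 = require('http2');
const fs = require('fs');

const server = http2.createSecureServer({
  key: fs.readFileSync('key.pem'),
  cert: fs.readFileSync('cert.pem'),
});

server.on('stream', (stream, headers) => {
  if (headers[':path'] !== '/') return;
  stream.pushStream({ ':path': '/data.json' }, (err, pushStream) => {
    if (err) return;
    pushStream.respond({ ':status': 200, 'content-type': 'application/json' });
    pushStream.end('{"hello":"world"}');
  });
  stream.respond({ ':status': 200, 'content-type': 'text/html' });
  stream.end('<script>fetch("/data.json")</script>');
});

server.listen(8443);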
If I make an AJAX request, it will be displayed in the network tab in Chrome. But if, at the same moment, I trigger a client-side redirect, the AJAX request is canceled. Will the request still make it to the server and execute as normal? Is there something in HTTP/TCP that knows the client has canceled the request? I don't think so, but I want to be sure.
If you're running PHP server-side, it will stop processing in the event of a client-side abort. (From what I've read, this isn't the case with other server-side technologies, which will continue processing after a client aborts.) See:
http://php.net/manual/en/features.connection-handling.php
But, it's best not to assume anything one way or another. The browser may cancel the request. And this cancellation may occur in time to stop processing server-side. But, that's not necessarily the case. The client could cancel at any stage during the request -- from right before the request is actually sent to just after a response body is sent. Also bear in mind, there are other things which can interrupt server-side request processing (hardware, power, OS failure, etc.). Expect some unpredictability.
From this, I'd make two recommendations:
Write your code to be as transaction-safe as possible. If a request makes data changes, don't commit them until all changes have been piped to the database. And if your application relies on multiple AJAX requests to change some data, don't commit any of the changes until the end of the "final" AJAX request.
Do not assume, even if a request finishes, that the client receives the response. Off the top of my head, that means if your application is AJAX-heavy, always rely on client-side state to tell the server what information it has, rather than relying on server-side state to assume the client "knows" something.
This is one of the few cases where synchronous requests (async: false in the $.ajax(...) options) are appropriate. A synchronous request usually prevents the browser from navigating to the other page until the request has finished.
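For example (jQuery shown because the answer mentions $.ajax; the URLs are placeholders, and note that synchronous XHR on the main thread is deprecated in modern browsers):

// Blocks until the server responds, so the redirect below cannot
// cancel the request mid-flight.
$.ajax({
  url: '/log/banner-click',
  method: 'POST',
  async: false,
});
window.location.href = '/next-page';  // runs only after the request completes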
This question seems to suggest that Ajax requests are not guaranteed to return in their sent order. However, Ajax uses the TCP protocol, which seems to guarantee that the packets will return in their sent order:
Ordered data transfer — the destination host rearranges according to sequence number
Are asynchronous Ajax requests guaranteed to return in the order that they were sent?
No.
This has nothing to do with TCP. It's due to the fact that a request must be handled by an HTTP server and there's no guarantee that parallel requests will take the same time to complete.
Are asynchronous Ajax requests guaranteed to return in the order that they were sent?
Nope. What if the server takes, say, 3 times as long to respond to the first request? Example:
Time 0: request A sent
Time 1: request B sent
Time 2: server processing requests A and B
Time 3: server processing request A, sends response B to client
Time 4: server processing request A
Time 5: server sends response A to client
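If order matters, don't rely on arrival order; impose it yourself. Two common patterns (URLs are placeholders):

// Pattern 1: chain the requests, so B is only sent after A completes.
fetch('/a')
  .then(resA => fetch('/b'))
  .then(resB => console.log('A finished before B was even sent'));

// Pattern 2: send both in parallel, but consume the results in send order.
Promise.all([fetch('/a'), fetch('/b')]).then(([resA, resB]) => {
  // resA and resB are in the order they were listed,
  // no matter which response arrived first.
});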
I believe you're confusing two contexts here: in Ajax, if you fire off two requests at the "same" time, one is not guaranteed to return before the other. This has nothing to do with TCP, which sits on a different layer of the OSI model. TCP packets make up the traffic, and their reordering and rearrangement happen completely invisibly to the HTTP protocol (which Ajax rides on).
The term "asynchronous" answers your own question. However, there are circumstances in which asynchronous requests may, effectively, become synchronous. See this answer for more on that.
When you send a request, the server will begin to handle that request. If another request follows, the server will begin working on it (if it can), and so on. As each requests finishes (with or without output), your callback will be fired (if present).
All of the packets pertaining to a single TCP connection are guaranteed to be received in the order they were sent, but this ordering only applies within that one connection. Multiple requests can be sent to various hosts, and there is normally no guarantee as to the order in which you will receive the responses. Thus, when sending asynchronous requests you are essentially sending multiple requests in parallel, and it is impossible to guarantee the order in which they will be responded to, since each request is independent of all the others.
Ordered data transfer — the destination host rearranges according to sequence number
You are right... but you're taking it out of context. AJAX requests are over HTTP, which is in turn over TCP.
Each AJAX request is a separate HTTP request, typically on its own TCP connection, so the responses are not rearranged and ordered in the way you think they are.
Since each AJAX HTTP request may take a variable amount of time to be handled, and they are being handled concurrently, there is no guarantee about the order in which they finish.
If an HTTP request is made and the caller abandons the request, does it get completed anyway? For example, an asynchronous JavaScript GET request to log a banner click in the DB and then redirect. Does the script need to wait for the response?
How critical is your request? What if the database is not available at that time? What if the server side code throws an exception?
For very critical requests, you may need to implement some sort of message queueing that is able to hold the request data until it can be fully processed. This gets more complicated if you are dealing with grids and clouds (you can't just queue the message on a single node, since the node can potentially have a hardware failure). But this is an extreme case, where you end up with dedicated queue servers.
The client doesn't send any notification to the server that it is canceling the request.
PHP doesn't know the client has disconnected until it tries to send the client some data (e.g., an unbuffered echo() call), so if your script doesn't return any data to the user, it will execute fully. If it does return data, it may abort partway through, but this behavior can be changed with ignore_user_abort().
If you're using a different environment, you will have to explore the documentation.
For most cases, once the request is received by the server, it will not stop processing if the client stops listening.
However, the server can always fail while servicing the request, so it's probably not a good idea to assume it completed.
You should wait for it, to be safe. You never know when the server will get around to processing your request (though it is usually within a couple hundred milliseconds or less), so unless you wait, you won't know whether something timed out, failed, or returned a different response than expected.
You don't have to wait for the response in order for the request to reach the server. The server can check whether someone is still listening while processing the request, but processing will start even if no one is listening for the response (unless there was an error on the way, of course).
If you want to be sure that the request really was processed, you should wait for the response, but it's not required for the request to go through to the server.
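In practice that means awaiting the response and treating a failure as "outcome unknown", e.g. (the URL is a placeholder):

async function logClick() {
  try {
    const res = await fetch('/log/banner-click', { method: 'POST' });
    if (!res.ok) {
      // Reached the server, but wasn't processed successfully.
    }
  } catch (err) {
    // Network failure: the server may or may not have seen the request.
  }
}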