I'm using JavaScript's Cache Web API to store responses from my server to requests made by my client application. I also need a way to programmatically remove them based on when the request was originally made. Here is the code I use to store the responses:
/** Searches for the corresponding cache for the given request. If found, returns
* the cached response. Otherwise, performs the fetch request and adds the response
* to cache. Returns the HTTP response.
*/
export async function fetchCachedData(request: Request) {
const cache = await caches.open(CACHE_NAME);
// Check if response is already cached
const cachedResponse = await cache.match(request);
if (cachedResponse) {
console.debug("Using cached response for", request.url);
return cachedResponse.clone();
}
// Fetch new response
console.debug("Fetching", request.url, "...");
const response = await fetchFromAPI(request);
const responseDate = new Date().getTime().toString();
response.headers.set("Date", responseDate);
// Cache the new response
if (
response.ok /*&& response.clone().headers.get("Cache-Control") !== "no-store"*/
) {
await cache.put(request, response.clone());
console.info("Cached response as", response.url);
}
return response.clone();
}
This approach seems to work in browsers like Firefox; on Chrome, however, I get an error telling me that the headers are read-only:
TypeError: Failed to execute 'set' on 'Headers': Headers are immutable
I have also tried setting the Date header on the server side, however it appears that not all headers set in the express-based app are honoured when cloning and retrieving them from the cache. This is why I wish to manually set the request date when the response is retrieved on the client side.
I don't necessarily need the date to be stored in the cached response's headers, that's just the way I have my cache filtering code set up. Ideally, the request date should be stored somewhere in the response object so that it is preserved when using clone() and is present in the cache.
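For what it's worth, the immutability applies to the Headers of a Response returned by fetch(); a Response you construct yourself has mutable headers. Here is a minimal sketch of stamping a copy before caching. The header name X-Fetched-At and the helper withFetchedAt are hypothetical and not part of the code above:

```javascript
// Hypothetical helper: returns a copy of `response` carrying an extra
// header with the time it was fetched (milliseconds since the epoch).
// The header name "X-Fetched-At" is an assumption made for illustration.
function withFetchedAt(response, now = Date.now()) {
  // Copying into a fresh Headers object gives us a mutable header list.
  const headers = new Headers(response.headers);
  headers.set("X-Fetched-At", String(now));
  // The body stream is handed over to the copy, so the original
  // response should not be read afterwards.
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

In fetchCachedData this would replace the response.headers.set("Date", ...) call: stamp the result of fetchFromAPI(request) once, then pass stamped.clone() to cache.put() and return the stamped response.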
I've solved this by using the pragma HTTP header -- it appears to be unused as of the HTTP/1.1 spec, however when set on the server side it is preserved in the headers of the response object obtained after making a request from the fetch() API.
Server code (Express.js):
export function headerMiddleware(
_req: Request,
res: Response,
next: NextFunction
) {
const now = new Date().getTime();
res.setHeader("pragma", now);
next();
}
This implementation is probably discouraged: pragma is a deprecated header whose original purpose was to signal that the response should be handled as no-cache. However, when I set its value to a numeric string there appear to be no errors and the solution runs smoothly. Ideally I'd use the Date header, but when I set that on the server side it is missing when I inspect the response on the client.
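With the timestamp stored in pragma, the age-based cleanup could look roughly like the sketch below. The helper names isExpired and purgeExpired are hypothetical, and the sketch assumes every cached response carries the millisecond timestamp written by the middleware above:

```javascript
// Returns true when the cached response is older than maxAgeMs.
// Assumes the server wrote a millisecond timestamp into "pragma".
function isExpired(response, maxAgeMs, now = Date.now()) {
  const storedAt = Number(response.headers.get("pragma"));
  // Missing or non-numeric values (e.g. "no-cache") count as expired.
  if (!Number.isFinite(storedAt) || storedAt <= 0) return true;
  return now - storedAt > maxAgeMs;
}

// Deletes every expired entry from an already opened Cache object,
// e.g. the result of `await caches.open(CACHE_NAME)`.
async function purgeExpired(cache, maxAgeMs, now = Date.now()) {
  for (const request of await cache.keys()) {
    const response = await cache.match(request);
    if (response && isExpired(response, maxAgeMs, now)) {
      await cache.delete(request);
    }
  }
}
```

Treating a missing or malformed timestamp as expired is a design choice of this sketch; it errs on the side of refetching.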
Related
I’ve now spent countless hours trying to get the cache API to cache a simple request. I had it working once in between but forgot to add something to the cache key, and now it's not working anymore. Needless to say, cache.put() not having a return value that specifies whether the request was actually cached does not exactly help, and I am left with trial and error. Can someone maybe give me a hint on what I’m doing wrong and what is actually required? I’ve read all the documentation more than 3 times now and I’m at a loss…
Noteworthy maybe is that this REST endpoint sets pragma: no-cache and everything else cache-related to no-cache, but I want to forcibly cache the response anyway, which is why I tried to completely rewrite the headers before caching. It still isn’t working (either not matching or not storing; I can't tell which).
async function apiTest(token, url) {
let apiCache = await caches.open("apiResponses");
let request = new Request(
new URL("https://api.mysite.com/api/"+url),
{
headers: {
"Authorization": "Bearer "+token,
}
}
)
// Check if the response is already in the cloudflare cache
let response = await apiCache.match(request);
if (response) {
console.log("Serving from cache");
}
if (!response) {
// if not, ask the origin if the permission is granted
response = await fetch(request);
// cache response in cloudflare cache
response = new Response(response.body, {
status: response.status,
statusText: response.statusText,
headers: {
"Cache-Control": "max-age=900",
"Content-Type": response.headers.get("Content-Type"),
}
});
await apiCache.put(request, response.clone());
}
return response;
}
Thanks in advance for any help, I've asked the same question on the Cloudflare community first and not received an answer in 2 weeks
This might be related to your use of caches.default, instead of opening a private cache with caches.open("whatever"). When you use caches.default, you are sharing the same cache that fetch() itself uses. So when your worker runs, your worker checks the cache, then fetch() checks the cache, then fetch() later writes the cache, and then your worker also writes the same cache entry. Since the write operations in particular happen asynchronously (as the response streams through), it's quite possible that they are overlapping and the cache is getting confused and tossing them all out.
To avoid this, you should open a private cache namespace. So, replace this line:
let cache = caches.default;
with:
let cache = await caches.open("whatever");
(This await always completes immediately; it's only needed because the Cache API standard insists that this method is asynchronous.)
This way, you are reading and writing a completely separate cache entry from the one that fetch() itself reads/writes.
The use case for caches.default is when you intentionally want to operate on exactly the cache entry that fetch() would also use, but I don't think you need to do that here.
EDIT: Based on conversation below, I now suspect that the presence of the Authorization header was causing the cache to refuse to store the response. But, using a custom cache namespace (as described above) means that you can safely cache the value using a Request that doesn't have that header, because you know the cached response can only be accessed by the Worker via the cache API. It sounds like this approach worked in your case.
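A minimal sketch of that idea, assuming a hypothetical helper toCacheKey: it builds a header-free GET request for the same URL to use as the cache key, while the real request (with Authorization) still goes to the origin.

```javascript
// Hypothetical helper: derive a cache key from a request by keeping
// only its URL. The Authorization header (and every other header)
// is deliberately left out of the key.
function toCacheKey(request) {
  return new Request(request.url, { method: "GET" });
}

// Intended use inside the worker (not executed here):
//   const key = toCacheKey(request);
//   let response = await apiCache.match(key);
//   ...
//   await apiCache.put(key, response.clone());
```

Note that every token then shares the same cache entry for a given URL, so this is only appropriate when the response does not depend on who is asking.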
Specifically I am interested in changing all responses with code 403 to code 404, and changing all responses with code 301 to 302. I do not want any other part of the response to change, except the status text (which I want to be empty). Below is my own attempt at this:
addEventListener("fetch", event => {
event.respondWith(fetchAndModify(event.request));
});
async function fetchAndModify(request) {
// Send the request on to the origin server.
const response = await fetch(request);
const body = await response.body
newStatus = response.status
if (response.status == 403) {
newStatus = 404
} else if (response.status == 301) {
newStatus = 302
}
// Return modified response.
return new Response(body, {
status: newStatus,
statusText: "",
headers: response.headers
});
}
I have confirmed that this code works. I would like to know if there is any possibility at all that this overwrites part of the response other than the status code or text, and if so, how can I avoid that? If this goes against certain best practices of Cloudflare workers or javascript, please describe which ones and why.
You've stumbled on a real problem with the Fetch API spec as it is written today.
As of now, status, statusText, and headers are the only standard properties of Response's init structure. However, there's no guarantee that they will remain the only properties forever, and no guarantee that an implementation doesn't provide additional non-standard or not-yet-standard properties.
In fact, Cloudflare Workers today implements a non-standard property: webSocket, which is used to implement WebSocket proxying. This property is present if the request passed to fetch() was a WebSocket initiation request and the origin server completed a WebSocket handshake. In this case, if you drop the webSocket field from the Response, WebSocket proxying will break -- which may or may not matter to you.
Unfortunately, the standard does not specify any good way to rewrite a single property of a Response without potentially dropping unanticipated properties. This differs from Request objects, which do offer a (somewhat awkward) way to do such rewrites: Request's constructor can take another Request object as the first parameter, in which case the second parameter specifies only the properties to modify. Alternately, to modify only the URL, you can pass the URL as the first parameter and a Request object as the second parameter. This works because a Request object happens to be the same "shape" as the constructor's initializer structure (it's unclear if the spec authors intended this or if it was a happy accident). Examples:
// change URL
request = new Request(newUrl, request);
// change method (or any other property)
request = new Request(request, {method: "GET"});
But for Response, you cannot pass an existing Response object as the first parameter to Response's constructor. There are straightforward ways to modify the body and headers:
// change response body
response = new Response(newBody, response);
// change response headers
// Making a copy of a Response object makes headers mutable.
response = new Response(response.body, response);
response.headers.set("Foo", "bar");
But if you want to modify status... well, there's a trick you can do, but it's not pretty:
// Create an initializer by copying the Response's enumerable fields
// into a new object.
let init = {...response};
// Modify it.
init.status = 404;
init.statusText = "Not Found";
// Work around a bug where `webSocket` is `null` but needs to be `undefined`.
// (Sorry, I only just noticed this when testing this answer! We'll fix this
// in the future.)
init.webSocket = init.webSocket || undefined;
// Create a new Response.
response = new Response(response.body, init);
But, ugh, that sure was ugly.
I have proposed improvements to the Fetch API to solve this, but I haven't yet had time to follow through on them. :(
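One caveat on the spread trick: it depends on the runtime exposing the Response fields as enumerable own properties. In browsers, for example, they are getters on the prototype, so {...response} copies nothing. A more explicit sketch for the 403/404 and 301/302 case is below; the names STATUS_REWRITES and rewriteStatus are hypothetical, and only the fields named in this answer are carried over, so any other non-standard field would still be dropped:

```javascript
// Status codes to rewrite, taken from the question: 403 -> 404, 301 -> 302.
const STATUS_REWRITES = new Map([[403, 404], [301, 302]]);

// Hypothetical helper: returns a new Response with the rewritten status
// and an empty status text; body and headers are passed through.
function rewriteStatus(response) {
  const status = STATUS_REWRITES.get(response.status) ?? response.status;
  const init = {
    status,
    statusText: "",
    headers: response.headers,
  };
  // Cloudflare-specific field described above; copied only when set,
  // so it is never passed as null.
  if (response.webSocket) {
    init.webSocket = response.webSocket;
  }
  return new Response(response.body, init);
}
```

Inside fetchAndModify this would reduce the handler to return rewriteStatus(await fetch(request)).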
I want to dynamically add http-headers via CloudFlare workers ONLY for the first time visitors. For example these headers:
Link: </path/to/file.css>; rel=preload; as=style; nopush
Link: </path/to/script.js>; rel=preload; as=script; nopush
So, what I need is the following, via JavaScript, in CloudFlare Workers:
Check if a specific cookie exists on the client's side.
If the cookie doesn't exist add http-headers and then set that specific cookie.
If the cookie does exist do nothing.
You can play with the code here.
Here's a general example (involving cookie and headers) from the CF's blog:
// A Service Worker which skips cache if the request contains
// a cookie.
addEventListener('fetch', event => {
let request = event.request
if (request.headers.has('Cookie')) {
// Cookie present. Add Cache-Control: no-cache.
let newHeaders = new Headers(request.headers)
newHeaders.set('Cache-Control', 'no-cache')
event.respondWith(fetch(request, {headers: newHeaders}))
}
// Use default behavior.
return
})
Here's a Cloudflare Worker that implements what you describe:
addEventListener('fetch', event => {
event.respondWith(handle(event.request))
})
async function handle(request) {
// Check for cookie.
let cookies = request.headers.get('Cookie') || ""
if (cookies.includes("returning=true")) {
// User has been here before. Just pass request through.
return fetch(request)
}
// Forward request to origin, get response.
let response = await fetch(request)
// Copy Response object so that we can edit headers.
response = new Response(response.body, response)
// Add headers.
response.headers.append("Link",
"</path/to/file.css>; rel=preload; as=style; nopush")
response.headers.append("Link",
"</path/to/script.js>; rel=preload; as=script; nopush")
// Set cookie so that we don't add the headers
// next time.
response.headers.set("Set-Cookie", "returning=true")
// Return on to client.
return response
}
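One thing to watch in the worker above: cookies.includes("returning=true") is a substring test, so a cookie such as notreturning=true would also match. A stricter check could be sketched as follows; parseCookies and isReturningVisitor are hypothetical helper names:

```javascript
// Parses a Cookie request header into a Map of name -> value.
// Only splits on ";" and the first "="; no decoding is attempted.
function parseCookies(header) {
  const jar = new Map();
  for (const part of (header || "").split(";")) {
    const index = part.indexOf("=");
    if (index === -1) continue; // skip fragments without a value
    const name = part.slice(0, index).trim();
    const value = part.slice(index + 1).trim();
    // Keep the first occurrence if a name is repeated.
    if (name && !jar.has(name)) jar.set(name, value);
  }
  return jar;
}

// True only when a cookie named exactly "returning" has the value "true".
function isReturningVisitor(header) {
  return parseCookies(header).get("returning") === "true";
}
```

In handle() the check would then read if (isReturningVisitor(request.headers.get('Cookie'))).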
I've managed to modify the worker and provide a solution where Cache-Control header is removed if the user has commented on the website.
addEventListener('fetch', event => {
event.respondWith(addHeaders(event.request))
})
async function addHeaders(request) {
// Forward request to origin, get response.
let response = await fetch(request)
if (response.headers.has("Content-Type") &&
!response.headers.get("Content-Type").includes("text/html")) {
// File is not text/html. Return the origin response unchanged
// instead of fetching it a second time.
return response
}
// Copy Response object so that we can edit headers.
response = new Response(response.body, response)
// Check for cookie.
let cookies = request.headers.get('Cookie') || ""
if (cookies.includes("returning=true")) {
if (cookies.includes("comment_") && response.headers.has("Cache-Control")) {
// User has commented. Delete "cache-control" header.
response.headers.delete("Cache-Control")
// Return on to client.
return response
}
// User has been here before but has not commented.
// Return the origin response as-is instead of fetching again.
return response
}
// Add headers.
response.headers.set("Link",
"</path/to/file.css>; rel=preload; as=style; nopush")
response.headers.append("Link",
"</path/to/script.js>; rel=preload; as=script; nopush")
// Set cookie so that we don't add the headers next time.
response.headers.set("Set-Cookie", "returning=true")
// Return on to client.
return response
}
However, if you're trying to delete a header which is set by Cloudflare in the first place (in this case Cache-Control) you'll encounter an unknown error 1101 which will make your site inaccessible. Also you can't modify a header set by Cloudflare. Apparently you can ONLY manipulate headers set by origin and eventually add new headers.
Consider this sample index.html file.
<!DOCTYPE html>
<html><head><title>test page</title>
<script>navigator.serviceWorker.register('sw.js');</script>
</head>
<body>
<p>test page</p>
</body>
</html>
Using this Service Worker, designed to load from the cache, then fallback to the network if necessary.
cacheFirst = (request) => {
var mycache;
return caches.open('mycache')
.then(cache => {
mycache = cache;
cache.match(request);
})
.then(match => match || fetch(request, {credentials: 'include'}))
.then(response => {
mycache.put(request, response.clone());
return response;
})
}
addEventListener('fetch', event => event.respondWith(cacheFirst(event.request)));
This fails badly on Chrome 62. Refreshing the HTML fails to load in the browser at all, with a "This site can't be reached" error; I have to shift refresh to get out of this broken state. In the console, it says:
Uncaught (in promise) TypeError: Failed to execute 'fetch' on 'ServiceWorkerGlobalScope': Cannot construct a Request with a Request whose mode is 'navigate' and a non-empty RequestInit.
"construct a Request"?! I'm not constructing a request. I'm using the event's request, unmodified. What am I doing wrong here?
Based on further research, it turns out that I am constructing a Request when I fetch(request, {credentials: 'include'})!
Whenever you pass an options object to fetch, that object is the RequestInit, and it creates a new Request object when you do that. And, uh, apparently you can't ask fetch() to create a new Request in navigate mode and a non-empty RequestInit for some reason.
In my case, the event's navigation Request already allowed credentials, so the fix is to convert fetch(request, {credentials: 'include'}) into fetch(request).
I was fooled into thinking I needed {credentials: 'include'} due to this Google documentation article.
When you use fetch, by default, requests won't contain credentials such as cookies. If you want credentials, instead call:
fetch(url, {
credentials: 'include'
})
That's only true if you pass fetch a URL, as they do in the code sample. If you have a Request object on hand, as we normally do in a Service Worker, the Request knows whether it wants to use credentials or not, so fetch(request) will use credentials normally.
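Putting the fix together, a cache-first handler without a RequestInit might look like the sketch below. It is an async/await rewrite rather than the original promise chain, and it also awaits the result of cache.match(), which the chain in the question never returned from its first .then(). The cacheStorage and fetchFn parameters are hypothetical additions so the function can be exercised outside a Service Worker; inside one they default to the globals.

```javascript
// Cache-first strategy: serve from the cache when possible,
// otherwise go to the network and store a copy of the result.
async function cacheFirst(request, cacheStorage = caches, fetchFn = fetch) {
  const cache = await cacheStorage.open("mycache");
  const cached = await cache.match(request);
  if (cached) {
    return cached;
  }
  // No second argument here: passing a RequestInit would construct a
  // new Request, which is not allowed for navigation requests.
  const response = await fetchFn(request);
  if (response.ok) {
    // Store a clone so the caller can still read the original body.
    await cache.put(request, response.clone());
  }
  return response;
}

// In the Service Worker (not executed here):
//   addEventListener('fetch', event => event.respondWith(cacheFirst(event.request)));
```

Only caching response.ok results is an extra choice of this sketch, so that error pages are not stored.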
https://developers.google.com/web/ilt/pwa/caching-files-with-service-worker
var networkDataReceived = false;
// fetch fresh data
var networkUpdate = fetch('/data.json').then(function(response) {
return response.json();
}).then(function(data) {
networkDataReceived = true;
updatePage(data);
});
// fetch cached data
caches.match('/data.json').then(function(response) {
if (!response) throw Error("No data");
return response.json();
}).then(function(data) {
// don't overwrite newer network data
if (!networkDataReceived) {
updatePage(data);
}
}).catch(function() {
// we didn't get cached data, the network is our last hope:
return networkUpdate;
}).catch(showErrorMessage);
This is the closest example to what you are trying to do, though you will have to adapt it to your own code. It is taken from the "Cache then network" section of the article linked above.
For the service worker:
self.addEventListener('fetch', function(event) {
event.respondWith(
caches.open('mycache').then(function(cache) {
return fetch(event.request).then(function(response) {
cache.put(event.request, response.clone());
return response;
});
})
);
});
Problem
I came across this problem when trying to override fetch for all kinds of different assets. navigate mode was set for the initial Request that gets the index.html (or other html) file; and I wanted the same caching rules applied to it as I wanted to several other static assets.
Here are the two things I wanted to be able to accomplish:
When fetching static assets, I want to sometimes be able to override the url, meaning I want something like: fetch(new Request(newUrl))
At the same time, I want them to be fetched just as the sender intended; meaning I want to set second argument of fetch (i.e. the RequestInit object mentioned in the error message) to the originalRequest itself, like so: fetch(new Request(newUrl), originalRequest)
However, the second part is not possible for requests in navigate mode (i.e. the initial html file); at the same time it is not needed, as explained by others, since the request will already keep its cookies, credentials etc.
Solution
Here is my work-around: a versatile fetch that...
can override the URL
can override RequestInit config object
works with both, navigate as well as any other requests
function fetchOverride(originalRequest, newUrl) {
const fetchArgs = [new Request(newUrl)];
if (originalRequest.mode !== 'navigate') {
// customize the request only if NOT in navigate mode
// (since in "navigate" that is not allowed)
fetchArgs.push(originalRequest);
}
return fetch(...fetchArgs);
}
In my case I was constructing a request from a serialized form in a service worker (to handle failed POSTs). The original request had the mode attribute set, which is read-only, so delete the mode attribute before reconstructing the request:
delete serializedRequest["mode"];
request = new Request(serializedRequest.url, serializedRequest);
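The same idea without mutating the stored object could be sketched like this. The helper name deserializeRequest and the shape of the serialized object (url, method, headers, body, mode) are assumptions for illustration:

```javascript
// Hypothetical helper: rebuilds a Request from a plain serialized object,
// leaving out `mode` because the Request constructor rejects "navigate".
function deserializeRequest(serialized) {
  // Pull out `url` and `mode`; everything else is passed on as RequestInit.
  const { url, mode, ...init } = serialized;
  return new Request(url, init);
}
```

Destructuring leaves the serialized object untouched, which helps if it is kept in a queue for later retries.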
I am using Vue.js and Choices.js javascript plugin and I have to dynamically populate values of two select fields via ajax.
What I am trying to achieve is to initiate a GET request at page load to populate the universities select, and then, after a value in the universities select is chosen, start a new GET request to populate the faculties select.
What is happening is that when I pick the university for the first time, everything works normally. For example, if I pick a university option with value="1", an ajax GET request is sent to /faculties?university_id=1. The console log prints "onChange started", so we are sure the method is running correctly; the corresponding v-model="selectedUniversity" is updating too.
If I now change the value of the select field again, the ajax function is not called anymore and no additional requests are made to the server. The console.log still runs, and the v-model is still being updated. Does anyone understand what is going on here?
var Choices = require('choices.js');
module.exports = {
data: function() {
return {
selectedUniversity: '',
selectedFaculty: '',
universities: {},
faculties: {}
}
},
mounted: function () {
var self = this;
var universitySelect = new Choices(document.getElementById('university'));
universitySelect.ajax(function(callback) {
fetch('/universities')
.then(function(response) {
response.json().then(function(data) {
callback(data, 'id', 'name');
self.universities = data;
});
})
.catch(function(error) {
console.log(error);
});
});
},
methods: {
onChange: function () {
console.log("onChange started");
var self = this;
var url = '/faculties?university_id=' + self.selectedUniversity;
var facultySelect = new Choices(document.getElementById('faculty'));
//This part below only runs the first time when the university select is selected
facultySelect.ajax(function(callback) {
fetch(url)
.then(function(response) {
response.json().then(function(data) {
callback(data, 'id', 'name');
self.faculties = data;
});
})
.catch(function(error) {
console.log(error);
});
});
}
}
}
I think the response for your request URL /faculties?university_id=1 is being cached: the first request goes to the server, and later requests are answered from the cache.
In your fetch() call, set the cache mode so that the cached response is ignored:
fetch(url, {cache: "no-store"}).then(....)
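Applied to the onChange method from the question, that could be sketched as below. The helper facultiesFetchArgs is hypothetical; it also URL-encodes the selected id, which the original string concatenation did not do.

```javascript
// Hypothetical helper: builds the arguments for fetch() so that the
// faculties request always bypasses the browser's HTTP cache.
function facultiesFetchArgs(universityId) {
  const url = "/faculties?university_id=" + encodeURIComponent(universityId);
  return [url, { cache: "no-store" }];
}

// Intended use inside onChange (not executed here):
//   fetch(...facultiesFetchArgs(self.selectedUniversity)).then(...)
```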
For the complete list of cache modes for the fetch() API, see:
https://hacks.mozilla.org/2016/03/referrer-and-cache-control-apis-for-fetch/
In case the above link is unavailable, here is the relevant excerpt:
Fetch cache control APIs
The idea behind this API is specifying a caching policy for fetch to explicitly indicate how and when the browser HTTP cache should be consulted. It’s important to have a good understanding of the HTTP caching semantics in order to use these most effectively. There are many good articles on the web such as this one that describe these semantics in detail. There are currently five different policies that you can choose from.
“default” means use the default behavior of browsers when downloading resources. The browser first looks inside the HTTP cache to see if there is a matching request. If there is, and it is fresh, it will be returned from fetch(). If it exists but is stale, a conditional request is made to the remote server and if the server indicates that the response has not changed, it will be read from the HTTP cache. Otherwise it will be downloaded from the network, and the HTTP cache will be updated with the new response.
“no-store” means bypass the HTTP cache completely. This will make the browser not look into the HTTP cache on the way to the network, and never store the resulting response in the HTTP cache. Using this cache mode, fetch() will behave as if no HTTP cache exists.
“reload” means bypass the HTTP cache on the way to the network, but update it with the newly downloaded response. This will cause the browser to never look inside the HTTP cache on the way to the network, but update the HTTP cache with the downloaded response. Future requests can use that updated response if appropriate.
“no-cache” means always validate a response that is in the HTTP cache even if the browser thinks that it’s fresh. This will cause the browser to look for a matching request in the HTTP cache on the way to the network. If such a request is found, the browser always creates a conditional request to validate it even if it thinks that the response should be fresh. If a matching cached entry is not found, a normal request will be made. After a response has been downloaded, the HTTP cache will always be updated with that response.
“force-cache” means that the browser will always use a cached response if a matching entry is found in the cache, ignoring the validity of the response. Thus even if a really old version of the response is found in the cache, it will always be used without validation. If a matching entry is not found in the cache, the browser will make a normal request, and will update the HTTP cache with the downloaded response.
Let’s look at a few examples of how you can use these cache modes.
// Download a resource with cache busting, to bypass the cache
// completely.
fetch("some.json", {cache: "no-store"})
.then(function(response) { /* consume the response */ });
// Download a resource with cache busting, but update the HTTP
// cache with the downloaded resource.
fetch("some.json", {cache: "reload"})
.then(function(response) { /* consume the response */ });
// Download a resource with cache busting when dealing with a
// properly configured server that will send the correct ETag
// and Date headers and properly handle If-Modified-Since and
// If-None-Match request headers, therefore we can rely on the
// validation to guarantee a fresh response.
fetch("some.json", {cache: "no-cache"})
.then(function(response) { /* consume the response */ });
// Download a resource with economics in mind! Prefer a cached
// albeit stale response to conserve as much bandwidth as possible.
fetch("some.json", {cache: "force-cache"})
.then(function(response) { /* consume the response */ });