In a Progressive Web App that uses a service worker, responses to requests are stored in the cache. If the content is updated on the server, will the copy in the cache be updated as well, and how does that happen?
Check out Jake Archibald's article on caching strategies and service workers. I think this is exactly what you are looking for.
Also note that sw-precache, a tool for generating service workers, will hash your cached content automatically and likely solve some caching issues for you.
Related
I am serving an Angular app as static content from an Express server. When serving static files, Express adds an ETag to the files by default. So each subsequent request first checks whether the ETag matches, and if it does, the files are not sent again. I know that a Service Worker works similarly and tries to match a hash. Does anyone know what the main difference is between these two approaches (caching with ETag and caching with Service Workers), and when we should use one over the other? What would be the most efficient when it comes to performance:
1. Server side caching and serving Angular app static files
2. Implementing Angular Service Worker for caching
3. Do both 1 and 2
To give a better perspective, I'll address a third cache option as well, to clarify the differences.
Types of caching
Basically we have 3 possible layers of caching, listed in the order the client checks them:
1. Service Worker cache (client-side)
2. Browser cache, also known as HTTP cache (client-side)
3. Server side cache (CDN)
PS: Some browsers, like Chrome, have an extra memory cache layer in front of the service worker cache.
Characteristics / differences
The service worker cache is the more reliable of the two client-side options, since it defines its own rules for managing the cache and provides extra capabilities and fine-grained control over exactly what is cached and how caching is done.
Browser caching is driven by HTTP headers on the asset response (Cache-Control and Expires), but the main issue is that there are many conditions under which those are ignored.
For instance, I've heard that files bigger than 25 MB are normally not cached, especially on mobile, where memory is limited (I believe it's getting even stricter lately, due to the increase in mobile usage).
So between those 2 options, I'd always choose the Service Worker cache for more reliability.
Now, turning to the 3rd option, the CDN checks the HTTP headers and looks at the ETag to bust the cache.
The idea of server-side caching is to call the origin server only when the asset is not found on the CDN.
Now, between the 1st and 3rd, the main difference is that a Service Worker works best for slow or failing network connections and for offline use. Since the caching is done client-side, when the network is down the service worker serves the last cached information, allowing for a smooth user experience.
Server-side caching, on the other hand, only works when we are able to reach the server. At the same time, the caching happens off the user's device, which saves local space and reduces the application's memory consumption.
So as you see, there's no right / wrong answers, just what works best for your use case.
Some Sources
MDN Cache
MDN HTTP caching
Great article from web.dev
Facebook study on caching duration and efficiency
Let's answer your questions:
what is the main difference between these two approaches (caching with ETag and caching with Service Workers)
Both solutions cache files, the main difference is the need to reach the server or stay locally:
With ETag, the browser asks the server for a file and sends along the hash it already has (the ETag, in the If-None-Match request header). Depending on the file stored on the server, the server answers either "the file was not modified, use your local copy" with a 304 HTTP response, or "here is a new version of that file" with a 200 HTTP response and the new file. In both cases the server decides, and the user waits for a round trip.
With the Service Worker approach you can decide locally what to do. You can write logic to control when to use a local (cached) copy and when to go to the server. This is very useful for offline capabilities, since the logic runs in the client and there is no need to hit the server.
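The ETag round trip described above can be sketched as a small pure function. This is only an illustration of the decision the server makes, not how Express implements it: the name `revalidate` and the shape of its arguments are hypothetical, and real servers also handle weak validators and lists of tags.

```javascript
// Hypothetical helper: decide how a server answers a conditional GET.
// `ifNoneMatch` is the value of the request's If-None-Match header (or undefined),
// `current` describes the file currently stored on the server.
function revalidate(ifNoneMatch, current) {
  if (ifNoneMatch !== undefined && ifNoneMatch === current.etag) {
    // The browser's copy is still valid: no body is sent.
    return { status: 304, headers: { ETag: current.etag }, body: null };
  }
  // First request, or the file changed: send the full file and its tag.
  return { status: 200, headers: { ETag: current.etag }, body: current.body };
}

const file = { etag: '"v2"', body: "body { color: red }" };
console.log(revalidate('"v2"', file).status); // 304
console.log(revalidate('"v1"', file).status); // 200
```

Either way the browser has to wait for the server's answer, which is the round trip a service worker can avoid.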
when we should use one over the other?
You can use both together. You can define some logic in the service worker: if there is no connection, return the local copies; otherwise go to the server.
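A minimal sketch of that logic, assuming a network-first strategy. The function name `networkFirst` is made up, and `fetchFn` and `cache` are passed in so the sketch can run outside a browser; in a real service worker they would be `fetch` and a Cache obtained from `caches.open()`.

```javascript
// Network-first: try the server, fall back to the cached copy when offline.
async function networkFirst(request, fetchFn, cache) {
  try {
    const response = await fetchFn(request);
    // Keep a copy for later offline use; a Response body can be read only once,
    // so a clone is stored and the original is returned.
    await cache.put(request, response.clone());
    return response;
  } catch (err) {
    const cached = await cache.match(request);
    if (cached) return cached; // no connection, but a local copy exists
    throw err; // no connection and nothing cached
  }
}

// Inside a service worker it could be wired up like this
// (the cache name "pages" is an assumption):
// self.addEventListener("fetch", (event) => {
//   event.respondWith(
//     caches.open("pages").then((cache) => networkFirst(event.request, fetch, cache))
//   );
// });
```

When the server is reachable, the normal HTTP rules (including ETag revalidation) still apply to the request made by `fetch`.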
What would be the most efficient when it comes to performance:
1. Server side caching and serving Angular app static files
2. Implementing Angular Service Worker for caching
3. Do both 1 and 2
My recommendation is to use both approaches, but to treat your files differently. The 'index.html' file can change, so use the service worker for it when there is no internet access, and when there is internet access let the web server answer using the ETag. All the other static files (CSS and JS) should be immutable, which means you can be sure the local copy is valid. To get there, add a hash to the file names (so each version is a unique file) and cache them. When you have a new version of your app, you modify 'index.html' to point to the new immutable files.
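As a rough sketch of that split, assuming a Node build step: `hashedName` and `cacheControlFor` are hypothetical helpers, and the header values are one common choice rather than the only valid one. Build tools such as the Angular CLI can generate hashed file names for you.

```javascript
const { createHash } = require("crypto");
const path = require("path");

// Build a content-addressed file name, e.g. main.css -> main.<hash>.css.
// Any change to the content yields a different name, so a cached copy
// under the old name can never be mistaken for the new version.
function hashedName(fileName, content) {
  const hash = createHash("md5").update(content).digest("hex").slice(0, 8);
  const ext = path.extname(fileName);
  return `${path.basename(fileName, ext)}.${hash}${ext}`;
}

// Pick a Cache-Control value: revalidate index.html on every load (ETag),
// and let hashed assets be cached for a year without revalidation.
function cacheControlFor(fileName) {
  return fileName === "index.html"
    ? "no-cache"
    : "public, max-age=31536000, immutable";
}

console.log(hashedName("main.css", "body { color: red }"));
console.log(cacheControlFor("index.html")); // no-cache
```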
When creating a React app, a service worker is registered by default. Why is a service worker used? What is the reason for registering it by default?
You may not need a service worker for your application. If you create a project with create-react-app, it is registered by default.
Service workers are well explained in this article. To summarise from it:
A service worker is a script that your browser runs in the
background, separate from a web page, opening the door to features
that don't need a web page or user interaction. Today, they already
include features like push notifications and background sync and have
ability to intercept and handle network requests, including
programmatically managing a cache of responses.
In the future, service workers might support other things like
periodic sync or geofencing.
According to this PR to create-react-app
Service workers are introduced with create-react-app via
SWPrecacheWebpackPlugin.
Using a service worker with a cache-first strategy offers performance
advantages, since the network is no longer a bottleneck for fulfilling
navigation requests. It does mean, however, that developers (and
users) will only see deployed updates on the "N+1"
visit to a page, since previously cached resources are updated in the
background.
The call to register service worker is enabled by default in new apps but you can always remove it and then you’re back to regular behaviour.
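The "N+1" behaviour mentioned in the quote can be seen in a tiny simulation. This is not the code that create-react-app generates; `visit` is a made-up function and a Map stands in for the service worker cache.

```javascript
// Simulate one page visit under a cache-first strategy with background update.
// `cache` is a Map standing in for the service worker cache; `deployed` is
// the version currently on the server.
function visit(cache, deployed) {
  // Cache first: the network is used for rendering only on a cache miss.
  const shown = cache.has("app") ? cache.get("app") : deployed;
  // Background update: the fresh version is stored for the next visit.
  cache.set("app", deployed);
  return shown;
}

const cache = new Map();
console.log(visit(cache, "v1")); // v1 (first visit, nothing cached yet)
console.log(visit(cache, "v2")); // v1 (v2 is deployed, but the cached v1 is shown)
console.log(visit(cache, "v2")); // v2 (the update appears one visit later)
```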
In plain words, it's a script that the browser runs in the background, with no relation whatsoever to web pages or the DOM, and it provides features out of the box. It also helps you cache your assets and other files so that the app keeps working when the user is offline or on a slow network.
Some of these features are proxying network requests, push notifications and background sync. Service workers ensure that the user has a rich offline experience.
You can think of the service worker as someone who sits between the client and the server, with all requests made to the server passing through it. Basically, a middle man. Since all requests pass through the service worker, it is able to intercept them on the fly.
I'd like to add 2 important considerations about Service Workers to take into account:
Service Workers require HTTPS. This is for security reasons, as a Service Worker acts like a man in the middle between the web application and the server. To enable local testing, the restriction doesn't apply to localhost.
With Create React App, the Service Worker is only enabled in the production environment, for example in the output of npm run build.
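The HTTPS rule with its localhost exception can be approximated as follows. This is a simplified sketch: `mayRegisterServiceWorker` is a hypothetical helper, and in a browser the authoritative answer is `window.isSecureContext`, which covers more cases than this check.

```javascript
// Hypothetical helper approximating the secure-origin rule:
// HTTPS pages qualify, and so do pages served from the local machine.
function mayRegisterServiceWorker(pageUrl) {
  const { protocol, hostname } = new URL(pageUrl);
  const isLocalhost =
    hostname === "localhost" || hostname === "127.0.0.1" || hostname === "[::1]";
  return protocol === "https:" || isLocalhost;
}

console.log(mayRegisterServiceWorker("https://example.com/app")); // true
console.log(mayRegisterServiceWorker("http://example.com/app"));  // false
console.log(mayRegisterServiceWorker("http://localhost:3000/"));  // true
```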
A Service Worker is there to help with developing a Progressive Web App. A good resource about it in the context of Create React App can be found on their website.
I am using sw-precache and sw-toolbox to manage my service worker. Let's say I have a css file which I want to cache
staticFileGlobs: ['public/asset/build/css/m_index.min.css']
It gets added to the service worker when the gulp task runs, as
var precacheConfig = [["public/asset/build/css/m_college.min.css","8d9b0e69820ba2fab83c45e2884bd61f"]
The hash with the file helps me in cache busting when service worker is registered. All works fine.
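To illustrate why the hash gives cache busting, here is a sketch that compares two manifests of [path, hash] pairs. It is not sw-precache's actual implementation; `changedEntries` and the sample paths and hashes are invented for the example.

```javascript
// Compare two precache manifests of [path, hash] pairs and list the paths
// that are new or whose content hash changed, i.e. the ones to re-download.
function changedEntries(oldManifest, newManifest) {
  const oldHashes = new Map(oldManifest);
  return newManifest
    .filter(([path, hash]) => oldHashes.get(path) !== hash)
    .map(([path]) => path);
}

const before = [["css/a.css", "111"], ["css/b.css", "222"]];
const after = [["css/a.css", "111"], ["css/b.css", "333"], ["css/c.css", "444"]];
console.log(changedEntries(before, after)); // [ 'css/b.css', 'css/c.css' ]
```

An unchanged file keeps its hash, so it is served from the cache without another download.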
Now consider a situation where a certain PC, user, or browser is unable to register the service worker, and the file is served to them over the network every time. In this case the file gets stored in the browser cache, and because there is no cache busting by default, the browser might keep feeding the old file to that user indefinitely, even after the developer has updated it.
What is the way to handle this scenario?
I would use the Etag response header (https://developer.mozilla.org/en/docs/Web/HTTP/Headers/ETag) to avoid loading obsolete assets for users.
For anyone looking out for answers, there is a simple solution - COOKIES.
If something fails while trying to register the service worker, set a cookie. This cookie can then be read on the backend before sending files.
If the cookie is not set, load files without cache busting; otherwise append a cache buster to the URL to prevent stale browser caching where the service worker is not running.
Because cookies are the only thing accessible directly on both frontend and backend.
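A sketch of the backend half of that idea, under stated assumptions: the cookie name `sw_failed`, the `v` query parameter, and the helper `assetUrl` are all hypothetical.

```javascript
// Hypothetical names: the page sets the cookie `sw_failed=1` when service
// worker registration fails; `version` is the current build identifier.
function assetUrl(path, version, cookieHeader = "") {
  const swFailed = cookieHeader
    .split(";")
    .map((pair) => pair.trim())
    .includes("sw_failed=1");
  // Only clients without a working service worker get the cache buster.
  return swFailed ? `${path}?v=${encodeURIComponent(version)}` : path;
}

console.log(assetUrl("/css/m_index.min.css", "42", "sw_failed=1")); // /css/m_index.min.css?v=42
console.log(assetUrl("/css/m_index.min.css", "42", "theme=dark"));  // /css/m_index.min.css

// On the page, the cookie could be set like this (the script path is an assumption):
// navigator.serviceWorker.register("/service-worker.js")
//   .catch(() => { document.cookie = "sw_failed=1; path=/"; });
```

Browsers without service worker support at all would need the same cookie set through a feature check such as `"serviceWorker" in navigator`.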
I am learning about the Service Worker API and have found many cool features in it. What puzzles me is why it works only on HTTPS sites and not on HTTP sites.
I believe this has something to do with security, yet I couldn't find the reason while browsing. Any explanation is appreciated.
Since service workers are rather powerful, being able to run even when the original page is no longer open, you'll really want to limit who can set up a service worker and who can't. Since a plain HTTP request is very easily man-in-the-middled, any random JavaScript could be injected into such a request, and that script could set up a service worker. This means your ISP, a government like China's, or a serious attacker could set up service workers very easily. By requiring an HTTPS connection, men-in-the-middle are largely avoided, and you can at least be assured that the JavaScript that sets up the service worker actually came from the page you think it did.
To complement deceze's answer: if an attacker man-in-the-middles a connection to an HTTP site, they can compromise the site for that connection only. A later connection to the site is not guaranteed to be compromised, and attacks that rely on local network access, for instance, may not be easily reproducible.
If service workers could be used on HTTP sites, an attacker would need just one man-in-the-middle attack to become the source of content for that site forever (or at least until that browser's data is cleared).
I have an app that runs service worker. I'm using sw-toolbox library for dynamic caching of URLs, but I want to create wrapper over sw-toolbox that provides getters and setters to my application for URL caching.
Since the service worker runs in a different thread and my application runs in the main thread, I'm wondering how to create a wrapper in JavaScript through which my application can communicate with the service worker and cache resources on demand.
The Cache API is available from page content, which means you can use it directly from your JavaScript code; there is no need to run in the Service Worker.
Check this thread to find the resolution: https://github.com/slightlyoff/ServiceWorker/issues/698 You will be able to use caches from the window object.
This means you have the full range of Cache methods for working with the content: https://developer.mozilla.org/en-US/docs/Web/API/Cache
Just another reminder: from Chrome 46, you can store things only on secure origins.
From the documentation, the Cache API/interface is exposed to windowed scopes as well as workers.
You don't have to use it in conjunction with service workers, even
though it is defined in the service worker spec.
It depends on how your worker caches data. If it just uses the standard "Cache" API, then you can simply query the caches object, which is attached to the global scope.
In this particular case Cache.match() is your friend.
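A possible shape for such a wrapper, as a sketch only: `createUrlCache` is a hypothetical name, and the cache name "dynamic-v1" is an assumption that would have to match whatever cache the service worker actually writes to. `caches.open()`, `Cache.match()` and `Cache.add()` are the standard Cache API calls.

```javascript
// Hypothetical wrapper giving page code a getter and setter over the Cache API.
// `cacheStorage` defaults to the global `caches` object available on window.
function createUrlCache(cacheName = "dynamic-v1", cacheStorage = globalThis.caches) {
  return {
    // Resolves to the cached Response, or undefined when the URL is not cached.
    async get(url) {
      const cache = await cacheStorage.open(cacheName);
      return cache.match(url);
    },
    // Fetches the URL and stores the response (Cache.add does both).
    async set(url) {
      const cache = await cacheStorage.open(cacheName);
      await cache.add(url);
    },
  };
}

// Usage from page code (the URL is an example):
// const urlCache = createUrlCache();
// await urlCache.set("/api/articles/1");
// const response = await urlCache.get("/api/articles/1");
```

Because the page and the service worker see the same cache storage for the origin, no message passing between threads is needed for this.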