I'm playing with the service worker API on my computer so I can grasp how I can benefit from it in my real-world apps.
I came across a weird situation where I registered a service worker which intercepts the fetch event so it can check its cache for the requested content before sending a request to the origin.
The problem is that this code has an error which prevents the function from making the request, so my page is left blank; nothing happens.
As the service worker has been registered, the second time I load the page it intercepts the very first request (the one which loads the HTML). Because of this bug, that fetch event fails, the HTML is never requested, and all I see is a blank page.
In this situation, the only way I know to remove the bad service worker script is through the chrome://serviceworker-internals/ console.
If this error gets to a live website, what is the best way to solve it?
Thanks!
I wanted to expand on some of the other answers here, and approach this from the point of view of "what strategies can I use when rolling out a service worker to production to ensure that I can make any needed changes?" Those changes might include fixing any minor bugs that you discover in production, or they might (but hopefully don't) include neutralizing the service worker due to an insurmountable bug: a so-called "kill switch".
For the purposes of this answer, let's assume you call
navigator.serviceWorker.register('service-worker.js');
on your pages, meaning your service worker JavaScript resource is service-worker.js. (See below if you're not sure of the exact service worker URL that was used, perhaps because you added a hash or versioning info to the service worker script.)
The question boils down to how you go about resolving the initial issue in your service-worker.js code. If it's a small bug fix, then you can obviously just make the change and redeploy your service-worker.js to your hosting environment. If there's no obvious bug fix, and you don't want to leave your users running the buggy service worker code while you take the time to work out a solution, it's a good idea to keep a simple, no-op service-worker.js handy, like the following:
// A simple, no-op service worker that takes immediate control.
self.addEventListener('install', () => {
  // Skip over the "waiting" lifecycle state, to ensure that our
  // new service worker is activated immediately, even if there's
  // another tab open controlled by our older service worker code.
  self.skipWaiting();
});

/*
self.addEventListener('activate', () => {
  // Optional: Get a list of all the current open windows/tabs under
  // our service worker's control, and force them to reload.
  // This can "unbreak" any open windows/tabs as soon as the new
  // service worker activates, rather than users having to manually reload.
  self.clients.matchAll({type: 'window'}).then(windowClients => {
    windowClients.forEach(windowClient => {
      windowClient.navigate(windowClient.url);
    });
  });
});
*/
That should be all your no-op service-worker.js needs to contain. Because there's no fetch handler registered, all navigation and resource requests from controlled pages will end up going directly against the network, effectively giving you the same behavior you'd get if there were no service worker at all.
Additional steps
It's possible to go further, and forcibly delete everything stored using the Cache Storage API, or to explicitly unregister the service worker entirely. For most common cases, that's probably going to be overkill, and following the above recommendations should be sufficient to get you in a state where your current users get the expected behavior, and you're ready to redeploy updates once you've fixed your bugs. There is some degree of overhead involved with starting up even a no-op service worker, so you can go the route of unregistering the service worker if you have no plans to redeploy meaningful service worker code.
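If you do decide to go further, a minimal sketch of the "delete everything and unregister" route might look like this. The helper name purgeAndUnregister is made up for illustration; caches.keys(), caches.delete() and registration.unregister() are the standard calls. The cache storage and registration are passed in as parameters only so the logic can be tried outside a service worker.

```javascript
// Hypothetical helper: delete every cache, then unregister the service worker.
// `cacheStorage` is expected to behave like the global `caches` object and
// `registration` like `self.registration`.
async function purgeAndUnregister(cacheStorage, registration) {
  const names = await cacheStorage.keys();
  await Promise.all(names.map((name) => cacheStorage.delete(name)));
  return registration.unregister();
}

// Inside a real service worker, run it when the replacement worker activates.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('install', () => self.skipWaiting());
  self.addEventListener('activate', (event) => {
    event.waitUntil(purgeAndUnregister(self.caches, self.registration));
  });
}
```

Note that this wipes every cache on the origin, so it only makes sense when nothing else on the site relies on the Cache Storage API.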
If you're already in a situation in which you're serving service-worker.js with HTTP caching directives giving it a lifetime that's longer than your users can wait for, keep in mind that a Shift + Reload on desktop browsers will force the page to reload outside of service worker control. Not every user will know how to do this, and it's not possible on mobile devices, though. So don't rely on Shift + Reload as a viable rollback plan.
What if you don't know the service worker URL?
The information above assumes that you know what the service worker URL is: service-worker.js, sw.js, or something else that's effectively constant. But what if you included some sort of versioning or hash information in your service worker script, like service-worker.abcd1234.js?
First of all, try to avoid this in the future, since it's against best practices. But if you've already deployed a number of versioned service worker URLs and you need to disable things for all users, regardless of which URL they might have registered, there is a way out.
Every time a browser makes a request for a service worker script, regardless of whether it's an initial registration or an update check, it will set an HTTP request header called Service-Worker:.
Assuming you have full control over your backend HTTP server, you can check incoming requests for the presence of this Service-Worker: header, and always respond with your no-op service worker script response, regardless of what the request URL is.
The specifics of configuring your web server to do this will vary from server to server.
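As one possible illustration, here is a rough sketch of that check written as a request handler for Node's built-in http server. The names handleRequest and NOOP_WORKER are hypothetical, and the sketch assumes the browser sends the header as Service-Worker: script; adapt the idea to whatever server you actually run.

```javascript
// The no-op worker served to every browser asking for a service worker script.
const NOOP_WORKER =
  "self.addEventListener('install', () => self.skipWaiting());\n";

// Hypothetical handler. Node lowercases incoming header names,
// which is why the lookup key is 'service-worker'.
function handleRequest(req, res) {
  if (req.headers['service-worker'] === 'script') {
    res.writeHead(200, {
      'Content-Type': 'text/javascript',
      'Cache-Control': 'no-store',
    });
    res.end(NOOP_WORKER);
    return true; // answered as a service worker script request
  }
  return false; // not a service worker script request: use normal routing
}
```

You would call handleRequest first from the callback you pass to http.createServer(), and fall through to your usual routing whenever it returns false.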
The Clear-Site-Data: response header
A final note: some browsers will automatically clear out specific data and potentially unregister service workers when a special HTTP response header is returned as part of any response: Clear-Site-Data:.
Setting this header can be helpful when trying to recover from a bad service worker deployment, and kill-switch scenarios are included in the feature's specification as an example use case.
It's important to check the browser support story for Clear-Site-Data: before you rely solely on it as a kill-switch. As of July 2019, it's not supported in 100% of the browsers that support service workers, so at the moment, it's safest to use Clear-Site-Data: along with the techniques mentioned above if you're concerned about recovering from a faulty service worker in all browsers.
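For reference, a small sketch of building that header follows. The helper name withClearSiteData is hypothetical. The directive values are quoted strings, and I'm assuming you want "cache" and "storage", the latter being the directive that covers service worker registrations.

```javascript
// Hypothetical helper: return a copy of `headers` with Clear-Site-Data added.
// Each directive must be sent as a quoted string, separated by commas.
function withClearSiteData(headers, directives = ['cache', 'storage']) {
  const value = directives.map((directive) => `"${directive}"`).join(', ');
  return { ...headers, 'Clear-Site-Data': value };
}
```

The resulting object can be handed to whatever your server uses to write response headers, for example on the response to a navigation request.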
You can 'unregister' the service worker using JavaScript.
Here is an example:
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.getRegistrations().then(function (registrations) {
    // returns installed service workers
    if (registrations.length) {
      for (let registration of registrations) {
        registration.unregister();
      }
    }
  });
}
That's a really nasty situation that hopefully won't happen to you in production.
In that case, if you don't want to go through the developer tools of the different browsers (chrome://serviceworker-internals/ for Blink-based browsers, or about:serviceworkers, about:debugging#workers in the future, in Firefox), there are two things that come to my mind:
Use the service worker update mechanism. Your user agent will check whether there is any change in the registered worker, fetch it, and go through the activate phase again. So potentially you can change the service worker script, fix any weird situation (purge caches, etc.) and continue working. The only downside is that you will need to wait until the browser updates the worker, which could take up to a day.
Add some kind of kill switch to your worker: a special URL that you can point users to, which restores the state of your caches, etc.
I'm not sure if clearing your browser data will remove the worker, so that could be another option.
I haven't tested this, but there is an unregister() and an update() method on the ServiceWorkerRegistration object. You can get this object from navigator.serviceWorker.
navigator.serviceWorker.getRegistration('/').then(function(registration) {
  registration.update();
});
update() should then immediately check whether there is a new service worker and, if so, install it. This bypasses the 24-hour waiting period and downloads serviceworker.js every time this JavaScript runs.
For live situations you need to alter the service worker at byte level (put a comment on the first line, for instance) and it will be updated within the next 24 hours. You can emulate this with chrome://serviceworker-internals/ in Chrome by clicking the Update button.
This should work even in situations where the service worker script itself got cached, as step 9 of the update algorithm sets a flag to bypass the service worker.
We had moved a site from godaddy.com to a regular WordPress install. The client (not us) had a service worker file (sw.js) cached in all their browsers, which completely messed things up. Our site, a normal WordPress site, has no service workers.
It's like a virus, in that it's on every page, it does not come from our server and there is no way to get rid of it easily.
We made a new empty file called sw.js on the root of the server, then added the following to every page on the site.
<script>
if (navigator && navigator.serviceWorker && navigator.serviceWorker.getRegistration) {
  navigator.serviceWorker.getRegistration('/').then(function(registration) {
    if (registration) {
      registration.update();
      registration.unregister();
    }
  });
}
</script>
In case it helps someone else, I was trying to kill off service workers that were running in browsers that had hit a production site that used to register them.
I solved it by publishing a service-worker.js that contained just this:
self.globalThis.registration.unregister();
I use a service worker with the sw-toolbox library. My PWA caches everything except API queries (images, CSS, JS, HTML). But what if some of the files change someday? Or what if service-worker.js itself changes?
How should the application find out about changes in files?
My service-worker.js:
'use strict';
importScripts('./build/sw-toolbox.js');

self.toolbox.options.cache = {
  name: 'ionic-cache'
};

// pre-cache our key assets
self.toolbox.precache(
  [
    './build/main.js',
    './build/main.css',
    './build/polyfills.js',
    'index.html',
    'manifest.json'
  ]
);

// dynamically cache any other local assets
self.toolbox.router.any('/*', self.toolbox.cacheFirst);

// for any other requests go to the network, cache,
// and then only use that cached resource if your user goes offline
self.toolbox.router.default = self.toolbox.networkFirst;
I don't know what the usual method for updating the cache in a PWA is. Maybe the PWA should send an AJAX request in the background and check the UI version?
AFAIK sw-toolbox does not have a strategy for cache with network update. I think this is really what you want.
You want to modify the cache-network race strategy: https://jakearchibald.com/2014/offline-cookbook/#cache-network-race
Instead of just letting the loser fade away, once the network responds you will want to update the client. This is a little more advanced than I have time to explain here.
I would post a message to the client to let it know there is an update. You may want to alert the user to the update or just force the update.
I don't consider this to be an edge case, but a very common, if advanced, scenario. I hope to publish a more detailed solution soon.
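To make the idea above a bit more concrete, here is a rough sketch of "serve from cache, refresh from the network, then tell the page". It is not part of sw-toolbox; the helper name cacheThenUpdate and the CACHE_UPDATED message type are made up, and the cache name is borrowed from the question. Dependencies are passed in so the logic can be tried outside a service worker.

```javascript
// Hypothetical helper. `cache` behaves like a Cache object, `fetchFn` like
// fetch(), `notify` tells the page about a refresh, and `waitUntil` keeps the
// worker alive while the background refresh finishes.
async function cacheThenUpdate(request, { cache, fetchFn, notify, waitUntil }) {
  const cached = await cache.match(request);
  const refreshed = fetchFn(request).then(async (fresh) => {
    if (fresh.ok) {
      await cache.put(request, fresh.clone());
      // Only announce a refresh when an older copy had been cached before.
      if (cached) await notify({ type: 'CACHE_UPDATED', url: request.url });
    }
    return fresh;
  });
  if (!cached) return refreshed; // nothing cached yet: wait for the network
  waitUntil(refreshed.catch(() => {})); // offline: the cached copy still works
  return cached;
}

if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return;
    event.respondWith(
      caches.open('ionic-cache').then((cache) =>
        cacheThenUpdate(event.request, {
          cache,
          fetchFn: (req) => fetch(req),
          waitUntil: (promise) => event.waitUntil(promise),
          notify: async (message) => {
            const client = await self.clients.get(event.clientId);
            if (client) client.postMessage(message);
          },
        })
      )
    );
  });
}
```

On the page side you would listen for the message with navigator.serviceWorker.addEventListener('message', ...) and either show a "reload for the latest version" prompt or reload straight away.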
There is a nice solution written here, where the author suggests (in a nutshell) either not using a cache-first strategy or adopting a UX pattern that displays a "Reload for the latest updates" prompt.
I dealt with service workers without using any library, and the solution I ended up with involved a bit of server-side code and some client-side code. The strategy in a nutshell:
Firstly the variables you will need and where:
On the server side, have a "service worker version" variable. (Put this in a database or config file if you are using something like PHP, which will pick up the change immediately on the server side without requiring a redeploy.) Let's call it serverSWVersion.
In one of the JavaScript files you cache (I have a JavaScript file dedicated to this), have a global variable that will also be the "service worker version". Let's call it clientSWVersion.
Now how to use the two:
Whenever a person lands on the page make an ajax call to your server to get the serverSWVersion value. Compare this with the clientSWVersion value.
If the values are different, that means your web app version is not the latest.
If this is the case, unregister the service worker and refresh the page so that the service worker will be re-registered and the new files will be cached.
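The steps above could be sketched on the client roughly like this. Everything except getRegistrations(), unregister() and location.reload() is an assumption made for illustration: the helper name refreshIfOutdated, the /sw-version endpoint and its JSON shape ({ "version": "..." }). The endpoint must not be answered from the service worker's cache, otherwise the comparison is pointless.

```javascript
// Hypothetical client-side check. `clientSWVersion` ships inside a cached
// script; `fetchServerVersion` asks the server for serverSWVersion.
async function refreshIfOutdated({
  clientSWVersion, fetchServerVersion, getRegistrations, reload,
}) {
  const serverSWVersion = await fetchServerVersion();
  if (serverSWVersion === clientSWVersion) return false; // already up to date
  const registrations = await getRegistrations();
  await Promise.all(registrations.map((registration) => registration.unregister()));
  reload();
  return true;
}

// Browser wiring; the '/sw-version' endpoint is an assumption.
if (typeof window !== 'undefined' && 'serviceWorker' in navigator) {
  refreshIfOutdated({
    clientSWVersion: window.clientSWVersion,
    fetchServerVersion: () =>
      fetch('/sw-version', { cache: 'no-store' })
        .then((res) => res.json())
        .then((data) => data.version),
    getRegistrations: () => navigator.serviceWorker.getRegistrations(),
    reload: () => window.location.reload(),
  }).catch((err) => console.log('Version check failed', err));
}
```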
What to actually do when new file is available
Update the serverSWVersion and clientSWVersion variables and upload to the server where applicable.
When a person visits again, the service worker should be re-registered and all the cached files will be retrieved.
I have provided the PHP server-side code that I used while implementing this strategy. It should show you the principles. Just drop the "Exercise" folder into the htdocs of a PHP server and it should work without you having to do anything else. I hope you find it useful... And remember, you could just use a database instead of a config file to store the server-side service worker variable if you are using some other server instead of PHP:
Zip file with code:
ServiceWorkerExercise.zip
When a service worker is altered, the browser will install it, but the new version will not be activated until every browser tab or PWA app window controlled by the old version is closed and re-opened (unless the new worker calls skipWaiting()).
So, if you change the cache name, the new cache will not serve any files until the browser re-opens, nor will the old cache be deleted until that time.
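For the cleanup part, the usual pattern is to delete the old caches in the new worker's activate handler. A minimal sketch, assuming a single versioned cache name (CACHE_NAME and deleteOldCaches are hypothetical names):

```javascript
const CACHE_NAME = 'ionic-cache-v2'; // hypothetical: bump on each release

// Delete every cache except `keep`; resolves with the names that were removed.
async function deleteOldCaches(cacheStorage, keep) {
  const names = await cacheStorage.keys();
  const stale = names.filter((name) => name !== keep);
  await Promise.all(stale.map((name) => cacheStorage.delete(name)));
  return stale;
}

if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('activate', (event) => {
    event.waitUntil(deleteOldCaches(self.caches, CACHE_NAME));
  });
}
```

Be aware that this removes every other cache on the origin, including ones created by libraries, so filter by a name prefix if that matters for your app.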
You can detect service worker changes in your page javascript using registration.onupdatefound and ask the user to close and re-open the window - something like this:
// register the service worker
navigator.serviceWorker.register('sw.js').then(function(registration)
{
  registration.onupdatefound = function()
  {
    console.log("ServiceWorker update found.");
    alert("A new version is available - please close this browser tab or app window and re-open to update ... ");
  }
}, function(err)
{
  console.log('ServiceWorker registration failed: ', err);
});
Change self.toolbox.router.any('/*', self.toolbox.cacheFirst); to self.toolbox.router.any('/*', self.toolbox.fastest);
I want to create a website that can work even when its server is offline, and I found that that's what service workers are for.
When I reload the page with the service worker and no connectivity, it works just fine. However, shift+reload (i.e. bypassing the cache) disarms the service worker and I get a "could not connect to server" error.
My question is: can I somehow prevent shift+reload (shift+f5, ctrl+f5 etc.) from ruining the service worker, or at least make it recover afterwards without restoring connectivity?
I was able to keep using the service worker even after Ctrl+F5 via the following approach:
In the window script:
navigator.serviceWorker.register(<...>).then(registration => {
  if (navigator.serviceWorker.controller === null) {
    // we get here after a ctrl+f5 OR if there was no previous service worker.
    navigator.serviceWorker.ready.then(() => {
      registration.active.postMessage("claimMe");
    });
  }
  <...>
});
In the service script:
self.onmessage = (event) => {
  if (event.data === "claimMe") {
    self.clients.claim();
  }
};
In short, we ask the service worker to claim all clients again, which works even for clients that used Ctrl+F5.
If you want to respect the Ctrl+F5 in the service worker code, you could either:
Clear the cache before claiming. Note that this will also affect any other existing clients, which may be unwanted.
Do something more complicated like sending the id of the client that requested a Ctrl+F5 and treat fetch requests specially for such clients.
QUICK Answer
How to make ServiceWorker survive cache reset/Shift+F5?
Theoretically (*), you can do it with JS plugins which detect the Ctrl or Shift key press and then prevent the "force refresh" from happening
...but there is a story behind this Ctrl/Shift reload.
(*) Disclaimer: I've not tried this yet.
LONG story... (kind of)
This is actually part of the Service Worker spec, and it is only present in recent versions of Chrome. In earlier versions of Chrome, a service worker had no issue surviving a "force refresh".
According to the W3C spec:
navigator.serviceWorker.controller returns null if the request is a force refresh (shift+refresh).
Also, many people (**) have suggested that a "force refresh" should always clear out all kinds of caches, which matches the purpose of its existence and its spec.
...Furthermore, it is even on Wikipedia...
Wikipedia: Bypassing your cache means forcing your web browser to re-download a web page from scratch
(**) Us, web developers in the early stages of service workers.
In my opinion, it is OK to let a force refresh do its job, since pretty much all of us expect the browser not to use any cache when we do this.
I was able to solve this by detecting the ctrl+shift+r and reloading:
import { Workbox } from 'workbox-window'

const wb = new Workbox(swUrl)
const wbReg = await wb.register({ immediate: true })
// workaround for ctrl + shift + r disabling service workers
// https://web.dev/service-worker-lifecycle/#shift-reload
// (controller is also null on a first visit, before any worker controls the page)
if (wbReg && navigator.serviceWorker.controller === null) {
  console.error('detected ctrl+shift+r: reloading page')
  location.reload()
  throw new Error('page loaded with cache disabled: ctrl+shift+r')
}
My users, when on a SPA page, are getting logged out after a couple of hours. If they use the older postback forms, though, they never time out. I have included enough code to give context for the description of the issue at the bottom.
Web.config for authentication
<authentication mode="Forms">
  <forms loginUrl="~/Account/Login" timeout="480" slidingExpiration="true" defaultUrl="~" ticketCompatibilityMode="Framework40"/>
</authentication>
My api controller
namespace my.Controllers
{
    public class ApiMotionController : ApiController
    {
        [Authorize(Roles = "Mover")]
        public IQueryable<Motions> Get()
JavaScript code
(function () {
  'use strict';
  angular.module('app')
    .controller('MotionManager', ['$scope', '$http', buildMotionManager]);

  function buildMotionManager($scope, $http) {
    /*Static Members*/
    $scope._whoami = 'MotionManager'; //Used for troubleshooting controller
    /*Initialization Code*/
    getMotions($scope, $http)();
    /*Scope methods*/
    $scope.refreshMotionsList = getMotions($scope, $http);
    $scope.addMotion = addMotion($scope, $http);
    $scope.playMotion = playMotion($scope, $http);
  }

  function getMotions($scope, $http){
    return function(){
      // .success()/.error() are the legacy $http callbacks (AngularJS before 1.6)
      $http.get('/api/getMotions')
        .success(function(data){
          $scope.motionList = data;
        })
        .error(function(data){
          console.log('FAIL', data);
        });
    };
  }

  function addMotion($scope, $http){
    //stub. Code not shown here.
  };

  function playMotion($scope, $http){
    //stub. Code not shown here.
  };
})();
There may be typos in the above code, since I retyped it from my original while sanitizing.
The code does work as expected, but the problem is that after hours of working, suddenly all web API calls fail with a 401 error. That is, they all act as if the user is now de-authenticated.
As above, I cannot duplicate this issue when I am using web forms, or even MVC forms, and re-posting whole pages. It only happens when I am using SPA-style coding. I haven't tried other SPA frameworks; since I have 6 months of Angular-directed code in this project, switching isn't an option.
I have considered putting in an iframe, with a timer to fire off in the background against a form object, just to trick the browser into generating a proper form postback. I want to avoid doing that, because it seems too hacky.
The only other key issue I have found is that I am seeing a bunch of schannel errors being logged in my application event log on the IIS server. They are all 10,10, which isn't well documented. The 10 series is well documented outside of 10,10, but none of those suggestions seem to work, or are even relevant.
The server is IIS 7.5, and I have also tried this on IIS 8.
Application Log Errors:
A fatal alert was generated and sent to the remote endpoint. This may result in termination of the connection. The TLS protocol defined fatal error code is 10. The Windows SChannel error state is 10.
Error State: 10, Alert Description: 10
A fatal alert was generated and sent to the remote endpoint. This may result in termination of the connection. The TLS protocol defined fatal error code is 40. The Windows SChannel error state is 1205.
An TLS 1.2 connection request was received from a remote client application, but none of the cipher suites supported by the client application are supported by the server. The SSL connection request has failed.
Discovery
Error code 40 means that there is a handshake issue. Since state management is custom for my platform, I decided to change it to InProc. So far, I have seen the frequency of new errors in the log drop, but not disappear. However, I am still testing for the 401 issue.
Post discovery follow up
Had the certs re-issued, and the schannel errors cleared, but the problem remained.
I started exploring the header information with a fine-tooth comb, even if it meant that I had to add custom header information to accompany my server calls.
I have now included withCredentials: true in all $http calls, which has brought my failure rate down to around 15%. That means the failures are down to once or twice a day.
I started watching my 'auth' cookie on the client, and something confusing happens occasionally. The cookie will change without prompting, then change back. It is almost like the session is bouncing from the current one to a new one, then back to the current one. So I have stopped my cleanup process on the session table on the server, to see what I am getting there.
I had also been checking the system logs for exceptions or SQL timeouts, and found nothing.
Started to convert all controllers to MVC controllers, but I have hit conversion problem after conversion problem, including the use of the JSON serializer. I still don't understand the decision to stick with the MS serializer when the JSON.NET one works so much better.
Current Status
The last change I made was to add filters.Add(new AuthorizeAttribute()); to my FilterConfig.RegisterGlobalFilters function.
Everything is still failing. After investigating the IIS logs I am still seeing everything getting de-authenticated.
FF on Windows - Fail
Chrome on Windows - Fail
Chrome on Droid - Fail
Safari on iPad - Fail
IE on Windows - Fail
12/10 Discovery
I have found the real problem. The authentication in MVC controllers is just not compatible with the web API controllers. So when I authenticate with the MVC controller, the web API controllers basically ignore it, and eventually time out on the authentication.
Latest Discovery
Apparently, when the ASP.NET worker process shut down and restarted, it would get a false flag that the database schema didn't exist. So I removed the check, and all reads and writes started working fine. It is interesting that the API controller would forge a new cookie when the MVC controller would fail the authentication. It was like it was creating a new provider instance. However, I couldn't find a second instance, so I have to assume the existing provider was being duplicated.
Fix that is being tested
Now that I have removed the DB test, I am now testing the issue in long run tests. Each long run is longer than the worker process stays alive, but shorter than the session timeout.
Cornerstone of finding this bug
Apparently IIS Express was hiding the bug in that it seems to act without an external worker process. So I moved the test environment to my local IIS server.
It looks like there were several issues causing my problem, each one broken down here:
IIS Express wasn't closing sessions the same way that full IIS would.
So I moved the application to my local IIS, and added logging to everything.
The ASP.NET worker process would launch a new provider instance every time the API controllers were called.
This would cause a new schema check per call. MVC controllers would only cause this check once per initial launch.
Since my provider is married to my application schema, I just disabled the schema check.
Angular must be told to marshal the cookies.
So I added: cfg: { withCredentials: true, responseType: "json" }
The response type was to cover the occasional issue where I would see 'text/text'. Now I always see 'application/json'. This seems to be a browser issue, mostly with IE.
I also had to add config.MapHttpAttributeRoutes(); to the Register method of my WebApiConfig class.
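As an aside on the cookie point above: instead of passing the config object on every call, AngularJS 1.x also lets you set the flag once through $httpProvider.defaults. A small sketch of that alternative follows; the function name sendCredentials is made up, and this is not the code I actually shipped.

```javascript
// Config function for AngularJS 1.x: send cookies with every $http request.
function sendCredentials($httpProvider) {
  $httpProvider.defaults.withCredentials = true;
}

// Register it on the app module when AngularJS is loaded on the page.
if (typeof angular !== 'undefined') {
  angular.module('app').config(['$httpProvider', sendCredentials]);
}
```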
Using all of this, I was able to discover that the core of the problem was that every API call was causing my security provider to re-test the schema, whereas my MVC controllers are set to suppress that test after the first load. The test always fails, because I had to expand a couple of tables, but I didn't need the models changed.
Resolution: I removed the test from the provider. Since the provider is strongly tied into the rest of the application, it didn't seem logical to keep treating it as a typical ASP.NET Membership provider. And that was the top feature that I didn't need.
Second benefit, I gained back a little bit of performance.