I want to intercept some user requests, cancel them, and then present hand-made HTML instead.
First I tried calling window.stop() at the beginning of content.js, but after checking the Network tab of the Developer Tools I noticed that some elements are still being requested (although not loaded, it seems).
Then I tried adding a chrome.webRequest.onBeforeRequest listener inside background.js. Inside the callback I set my hand-made HTML (document.getElementsByTagName("html")[0].innerHTML = myHtml) and at the end I canceled the original request (return {cancel: true}). The original request is canceled, but my HTML doesn't appear.
Is there a way to make the web browser send an additional HTTP header when the user clicks on a link?
Background: In our environment every http-request has a unique ID on the server side. See https://serverfault.com/questions/797609/apache-x-request-id-like-in-heroku
When our web application receives an HTTP request, I would like to know which page preceded it. The HTTP referrer is not enough, since the user could have several tabs open in the browser.
I would like to avoid putting the ugly request ID into every GET request that gets sent from the browser to the server. Up to now our URLs are nice.
My preferred solution would be some JavaScript magic which adds the request ID of the current page to the next HTTP request.
Steps in detail:
browser access URL http://example.com/search
web server receives http request with request ID 123
web server sends content of the URL to the browser (a search page). The page includes the request ID 123 somewhere
the user searches for "foobar".
the web browser submits a http request to the server and includes the previous request id somehow.
web server receives second http request (ID 456) and can access the value of the first request (ID 123) somehow.
Web server can store the relation "123 --> 456" in a database for later analysis.
My goal is to track the relations "123 --> 456". Above solution is just a strategy to get to the goal. Other strategies are welcome.
We use the web framework Django, but AFAIK this does not matter in this context.
the user could use several tabs in his browser
Let me elaborate on what that means for a matching solution. Looking only at the sequence of requests coming from one user does not solve the issue.
One user with several tabs:
user looks at page A in tab1
user looks at page B in tab2
user follows a link on page A to page C
user follows a link on page C to page D
user follows a link on page B (tab2) to page E.
I want to see two sequences:
A -> C -> D
And
B -> E
The only modern 'sane' option here is to use a ServiceWorker.
A ServiceWorker can intercept HTTP requests for a domain you control and decorate them with more headers.
A ServiceWorker works 'outside' of a browser tab, and if multiple tabs are open with the same website, the same serviceworker will be used for all of them.
A full tutorial on how to accomplish that is definitely too much for this answer box, but intercepting and doing stuff with HTTP requests is a big use-case, so off-site sources will usually have this as an example.
I would say that this is kind of a bad idea. If you think you need this, maybe you can handle this in a different way. A common way to do this might be using cookies instead.
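For the ServiceWorker approach mentioned above, a minimal sketch might look like the following. It assumes each page reports its own request ID to the worker via postMessage, and it uses a hypothetical header name, X-Previous-Request-ID. Note two limits of this sketch: for top-level navigations event.clientId is empty, so only requests issued by an already loaded page are decorated here, and the in-memory map is lost whenever the browser stops the worker.

```javascript
// sw.js: a sketch under the assumptions above, not a drop-in implementation.
const HEADER_NAME = "X-Previous-Request-ID"; // hypothetical header name

// Latest request ID reported by each open page, keyed by client (tab) ID.
const requestIdByClient = new Map();

// Pure helper: copy [name, value] header entries into a plain object and
// add the tracking header when a previous request ID is known.
function decorateHeaders(entries, previousRequestId) {
  const result = {};
  for (const [name, value] of entries) {
    result[name] = value;
  }
  if (previousRequestId) {
    result[HEADER_NAME] = String(previousRequestId);
  }
  return result;
}

// Only register the handlers when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== "undefined") {
  // A page would report its ID with:
  //   navigator.serviceWorker.controller.postMessage({ requestId: "123" });
  self.addEventListener("message", (event) => {
    if (event.source && event.data && event.data.requestId) {
      requestIdByClient.set(event.source.id, event.data.requestId);
    }
  });

  self.addEventListener("fetch", (event) => {
    const previousId = requestIdByClient.get(event.clientId);
    if (!previousId) return; // unknown tab: let the browser handle the request
    const headers = decorateHeaders(event.request.headers.entries(), previousId);
    event.respondWith(fetch(new Request(event.request, { headers })));
  });
}
```

Because the worker is shared by all tabs of the site, the map is keyed by client ID so that each tab keeps its own "previous" request.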
We can modify request headers using:
.setRequestHeader() method of XMLHttpRequest() object (in same or allowed origins).
Editing the headers in the browser console or using some browser extension (not practical).
Performing the request from the server side e.g using CURL, wget, or some library (client->serverProxy->url with custom headers ).
It is not possible (using JavaScript) to change the headers the browser sends for a request triggered by a plain link click, because, at least for now, building those headers is an internal browser capability (the partial exception being XMLHttpRequest to the same or an allowed origin).
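As an illustration of the XMLHttpRequest option in the list above, here is a minimal sketch. The header name X-Previous-Request-ID and the function name are assumptions made for the example; the optional last parameter exists only so the function can be exercised outside a browser.

```javascript
// Sends a GET request that carries the previous request ID in a custom header.
// "X-Previous-Request-ID" is a hypothetical header name; the server must be
// the same origin or allow this header via CORS.
function getWithPreviousId(url, previousRequestId, onLoad, XhrCtor) {
  const Xhr = XhrCtor || XMLHttpRequest; // XMLHttpRequest exists in browsers only
  const xhr = new Xhr();
  xhr.open("GET", url);
  // setRequestHeader must be called after open() and before send().
  xhr.setRequestHeader("X-Previous-Request-ID", String(previousRequestId));
  xhr.onload = () => onLoad(xhr.responseText);
  xhr.send();
  return xhr;
}

// Browser usage:
// getWithPreviousId("/search?q=foobar", "123", (html) => console.log(html.length));
```

This only covers requests made from script; a normal link click still goes out with the browser's own headers.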
So, in my opinion, as @Evert said, you have two practical ways (three, in fact) to achieve your goal: using a server-side proxy or using cookies. Here is a very simple third way using window.localStorage:
LocalStorage example
var id = Math.random(); // replace with a unique identifier of the current page
if (!localStorage.getItem("ids")) { // first visit: create the store
    localStorage.setItem("ids", JSON.stringify({ ids: [id] }));
} else {
    var ids = JSON.parse(localStorage.getItem("ids"));
    ids.ids.push(id); // append the current page's identifier
    localStorage.setItem("ids", JSON.stringify(ids));
}
Full example here: https://jsfiddle.net/hy4rzob9/ Press Run several times and you'll see that each visit is stored. Of course, in your implementation you have to replace the random number with a unique identifier for each page.
LocalStorage example with several tabs
Taking into account the update, we could store the history using also document.referrer with localStorage with something like this:
var session = Math.random();
if (!localStorage.getItem("routes")) { //<-- first time
    var routes = {};
    routes[session] = [document.location.href];
    localStorage.setItem("routes", JSON.stringify(routes));
} else {
    var routes = JSON.parse(localStorage.getItem("routes"));
    if (!document.referrer) {
        routes[session] = [document.location.href]; //<-- new root
    } else {
        for (let ses in routes) {
            if (routes[ses].includes(document.referrer)) {
                routes[ses].push(document.location.href);
            }
        }
    }
    localStorage.setItem("routes", JSON.stringify(routes));
}
var r = JSON.parse(localStorage.getItem("routes"));
console.log(r);
Full example here https://codesandbox.io/s/qk99o4vy7q, to emulate your example open this https://qk99o4vy7q.codesandbox.io/a.html (represents A) and open in a new tab https://qk99o4vy7q.codesandbox.io/b.html (represents B), navigate in both tabs and see the console. This example won't work if we share some referrer, because we can't differentiate between referrers if we attach nothing in the URL. A -> C -> D and B -> E will work, but A -> C -> D and B -> E -> A won't.
Ping example
There is another way that is easy but has limited browser compatibility: using the ping attribute of <a>, like this:
<a href="..." ping="trackPing.py">Link to track</a>
ping Contains a space-separated list of URLs to which, when the
hyperlink is followed, POST requests with the body PING will be sent
by the browser (in the background). Typically used for tracking.
Open the browser console, go to the Network tab, clear it, run the snippet, and click the link. If your browser supports ping, you will see it send a POST request to trackPing.py (which I guess doesn't exist on SO). The POST carries no useful payload, but you could still track environment variables such as request.environ['REMOTE_ADDR'].
First of all, sorry for my english.
Edit:
After reading your edit, I realised that my answer didn't fit at all, because of the tabs.
It is not possible to directly modify the way the browser makes a GET request. Knowing that, your possibilities are:
Use GET parameters. I know you try to avoid this.
As @Evert said, use ServiceWorkers. It is the cleanest way to modify a request before it leaves the browser.
The last approach (and an easy one) is similar to @Emeeus's, but instead of localStorage, whose values are shared between tabs, you should use sessionStorage, whose values are tab-independent. Also, instead of storing the entire route, you should store just a random ID. This ID identifies the chain of requests for a specific tab. Then, once your web server returns each request ID, for example using <meta name="request_id" content="123" />, you just need to make an Ajax request to a dedicated tracking endpoint and store:
chain_id (stored in sessionStorage)
request_id (stored in head > meta)
timestamp (generated in webserver)
session_id (accessible from the web server). You can leave this out, but it is still useful for checking purposes.
The request that stores the route is made after your page is loaded, instead of before. This approach is quite similar to how Analytics works.
// Generate a unique code and store it in sessionStorage.
if (!sessionStorage.getItem('chain_id')) {
    sessionStorage.setItem('chain_id', 'a7835e0a-3ee9-e981-...');
}
// Then, if you use jQuery:
$(document).ready(function() {
    $.ajax({
        type: "POST",
        url: 'your/tracking/endpoint/',
        data: {
            'chain_id': sessionStorage.getItem('chain_id'),
            'request_id': document.querySelector("meta[name='request_id']").getAttribute('content'),
        }
    });
});
Note: It is preferable not to use jQuery to handle tracking requests, nor to wait until the document is fully loaded. It is just an example.
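The hardcoded chain ID in the example stands in for a generated value. One possible way to generate and reuse it per tab is sketched below; the function name is an assumption, and the ID generator is passed in so the sketch does not depend on any particular browser API.

```javascript
// Returns the chain ID for this tab, creating and storing one on first use.
// `storage` is anything with getItem/setItem (sessionStorage in the browser);
// `makeId` is any function that returns a fresh unique string.
function getOrCreateChainId(storage, makeId) {
  let chainId = storage.getItem("chain_id");
  if (!chainId) {
    chainId = makeId();
    storage.setItem("chain_id", chainId);
  }
  return chainId;
}

// Browser usage (crypto.randomUUID needs a secure context such as HTTPS):
// const chainId = getOrCreateChainId(sessionStorage, () => crypto.randomUUID());
```

Calling it again in the same tab returns the stored value, so every page load in that tab reports the same chain.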
And that's all. You have the relation between user-agent, the chain, the request and the timestamp of the request, so if you need to know what request was made before or after a given one, you just need to lookup in the database using the Chain-ID and the timestamp as filters.
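To make the lookup concrete, here is a small sketch of how stored rows could be turned back into per-tab sequences. In practice this would be a database query filtered by chain ID and ordered by timestamp; the sketch does the same grouping in memory. The row shape mirrors the fields listed above, and the sample data is made up to match the A/B/C/D/E example from the question.

```javascript
// Groups tracking rows by chain_id and orders each group by timestamp.
// Returns an object mapping chain_id to the ordered list of request IDs.
function buildChains(rows) {
  const chains = new Map();
  for (const row of rows) {
    if (!chains.has(row.chain_id)) chains.set(row.chain_id, []);
    chains.get(row.chain_id).push(row);
  }
  const result = {};
  for (const [chainId, chainRows] of chains) {
    chainRows.sort((a, b) => a.created - b.created);
    result[chainId] = chainRows.map((row) => row.request_id);
  }
  return result;
}

// Made-up sample data: two tabs, requests interleaved in time.
const sampleRows = [
  { chain_id: "tab1", request_id: "A", created: 1 },
  { chain_id: "tab2", request_id: "B", created: 2 },
  { chain_id: "tab1", request_id: "C", created: 3 },
  { chain_id: "tab1", request_id: "D", created: 4 },
  { chain_id: "tab2", request_id: "E", created: 5 },
];
console.log(buildChains(sampleRows)); // tab1: A, C, D and tab2: B, E
```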
The Django model for your requests could look like this:
from django.db import models
from django.contrib.sessions.models import Session

class Request(models.Model):
    # on_delete is a required argument since Django 2.0
    session = models.ForeignKey(Session, on_delete=models.CASCADE)
    chain_id = models.CharField(max_length=100)
    request_id = models.WhatEverField...  # placeholder: use the field type that matches your request IDs
    request_url = models.URLField(max_length=200)
    created = models.DateTimeField(auto_now_add=True)
I hope it helps.
I don't know if this will help, but I think Ajax might do it: set the additional header inside an onclick event listener. As for the request ID, if it's not sensitive you could carry it in a cookie, or maybe in something better.
I've run into a situation where I would like to be sure about how browsers handle URIs that include a fragment identifier, such as Products#A. Imagine my website has two pages: Products and FAQs. Inside each I want to use # to navigate to specific HTML elements. So:
What is the difference between href="Products#A" and href="#A" if I'm already in page Products?
And if I'm in page FAQs?
Does a URL like href="Products#A" always trigger a server call, or does the browser know that it is already on page Products and skip the server call?
What if I add a / (like href="/products#A")? Does this force a server call?
Is this standard for all browsers?
I've run a few tests but I'm missing some theory here.
If the link, fragment excepted, resolves to the same URL as the current page, then it will navigate around the current page.
If it resolves to a different page, then it will load a new page.
It doesn't matter how the relative (or absolute) URL is actually expressed, only what it resolves to.
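The rule above can be checked with the standard URL class. The following is a sketch of the comparison, not the browser's actual implementation; the function names are made up for the example. It also treats a link without any fragment as a normal navigation.

```javascript
// Returns the URL as a string with its fragment removed.
function stripFragment(url) {
  const copy = new URL(url.href);
  copy.hash = "";
  return copy.href;
}

// True when following `href` from `currentUrl` would only move around the
// current page: the link has a fragment and, fragment excepted, resolves to
// the same URL as the current page.
function isFragmentNavigation(href, currentUrl) {
  const current = new URL(currentUrl);
  const target = new URL(href, current); // resolve relative to the current page
  return target.href.includes("#") && stripFragment(target) === stripFragment(current);
}

console.log(isFragmentNavigation("#A", "http://example.com/Products"));          // true
console.log(isFragmentNavigation("Products#A", "http://example.com/Products"));  // true
console.log(isFragmentNavigation("Products#A", "http://example.com/FAQs"));      // false
console.log(isFragmentNavigation("/products#A", "http://example.com/Products")); // false
```

The last case is false because URL paths are case-sensitive, so /products and /Products resolve to different pages.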
I'll answer your questions based on my own experience.
What is the difference between href="Products#A" and href="#A" if I'm already in page Products?
Nothing. The longer form is just safer: if the same partial (or the same source) is also rendered on a different page, the link still points to the right place.
And if I'm in page FAQs?
It navigates normally to Products and finds the element with id="A" and scrolls to there.
Does a URL like href="Products#A" always trigger a server call, or does the browser know that it is already on page Products and skip the server call?
That's a client call, no server calls are made for URL fragments.
Is this standard for all browsers?
I believe so.
My page makes two requests: one to /login and one to /treeContents?rootID=9 for example. I'd like to combine them into one request, /loginAndTreeContents?rootID=9
But, the old way means that subsequent requests to /treeContents?rootID=9 will be retrieved from the cache, and the new way means that they won't.
So I was thinking, is there a way with javascript to manually take the response for /loginAndTreeContents?rootID=9 and jam it into the cache as if it was from /treeContents?rootID=9, so that subsequent requests will just return that?
Is there a way to manually do it with javascript? Or perhaps, can HTML5's appcache help me? Thanks!
No. You cannot make the response from one URL magically be the cached result from another URL. That is not how the cache works. It stores the page in the cache only as the URL that it actually came from.
It is possible to precache pages by loading them into iframes that are not visible.
It would also be possible to stuff the desired HTML into LocalStorage and then retrieve it from there with your own JS in some other page.
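A rough sketch of the LocalStorage idea follows. It assumes the combined /loginAndTreeContents response is JSON with a treeContents field, which is a guess about the response shape, and the helper names and the "cache:" key prefix are made up. Note that this is an application-level cache read by your own code, not the browser's HTTP cache.

```javascript
// Stores the tree part of the combined response under the URL that would
// normally have served it. `storage` is localStorage in the browser.
function storeTreeFromCombined(storage, combinedResponse, treeUrl) {
  storage.setItem("cache:" + treeUrl, JSON.stringify(combinedResponse.treeContents));
}

// Returns the cached tree for `treeUrl`, or null when nothing was stored.
function readCachedTree(storage, treeUrl) {
  const raw = storage.getItem("cache:" + treeUrl);
  return raw === null ? null : JSON.parse(raw);
}

// Browser usage:
// storeTreeFromCombined(localStorage, combined, "/treeContents?rootID=9");
// const tree = readCachedTree(localStorage, "/treeContents?rootID=9");
// if (tree === null) { /* fall back to requesting /treeContents?rootID=9 */ }
```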
In Chrome, I'm looking to detect whether a page URL is going to be on example.com's domain and, if it is, append foo=bar as a parameter before loading and load that instead.
I've found that I can detect when the Omnibar has been submitted here, but it seems it would load the original URL anyway. While that works, it uses twice the bandwidth I feel is necessary. That's not a problem for a single page, but the change needs to happen on every page of a site, so doubling the bandwidth definitely becomes an issue.
Currently, it works to detect if the URL is going to be example.com and is submitted, then call window.stop() and then set location.href to example.com/&?foo=bar but that doesn't seem ideal.
In short, the user goes to http://www.example.com and then the script changes it to http://www.example.com/&?foo=bar before loading the original link.
Take a look at the chrome.webRequest API, in particular the following method:
onBeforeRequest (optionally synchronous)
Fires when a request is about to occur. This event is sent before any TCP connection is made and can be used to cancel or redirect requests.
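A minimal sketch of a redirect with onBeforeRequest could look like this. It assumes a Manifest V2 extension with the "webRequest" and "webRequestBlocking" permissions plus host access to the site; the helper name addParam is made up, and the sketch produces a standard ?foo=bar query rather than the /&?foo=bar form from the question.

```javascript
// Returns the URL with key=value added, or null when the URL is on another
// host or already has the parameter (which avoids a redirect loop).
function addParam(urlString, key, value, host) {
  const url = new URL(urlString);
  if (url.hostname !== host || url.searchParams.has(key)) return null;
  url.searchParams.set(key, value);
  return url.href;
}

// Register the listener only when running inside an extension background page.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    (details) => {
      const redirectUrl = addParam(details.url, "foo", "bar", "www.example.com");
      // An empty object lets the request continue unchanged.
      return redirectUrl ? { redirectUrl } : {};
    },
    { urls: ["*://www.example.com/*"], types: ["main_frame"] },
    ["blocking"]
  );
}
```

Because the redirect happens before any connection is made, the original URL is never fetched, which addresses the double-bandwidth concern.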
You can use
window.location.search = "foo=bar"
Maybe this helps.