Persistent unique ID for Chrome tabs that lasts between browser sessions


I'm trying to ascertain some way to establish a unique ID for Chrome tabs that meets the following conditions:
Uniquely identifies each tab
Stays the same for a given tab between browser restarts (session-restored tabs)
Stays the same if a tab is closed and then reopened with Undo Closed Tab (Ctrl+Shift+T)
Stays distinct if a tab is duplicated
I've done some rather aggressive research to find a comprehensive solution, but nothing seems to quite do the trick. Here are the methods I have tried, in increasing order of efficacy:
Use Chrome's provided tab.id: does not persist between browser sessions or close/undo-close
Put a GUID in cookies: is not unique per tab, only per domain/URL
Put a GUID in localStorage: persists between browser sessions and close/undo-close, but is not unique per tab, only per domain
Put a GUID in sessionStorage: unique per tab, persists across close/undo-close, unique for duplicated tabs, but is wiped out between browser sessions
Use identifiable webpage document attributes as a unique key: this is the best approach I've found so far. A key can be constructed via a content script from the following values: [location.href, document.referrer, history.length].
Regarding this last approach, the constructed key is unique across all tabs which share a common URL, referrer, and history length. Those values will remain the same for a given tab between browser restarts/session-restores and close/undo-closes. While this key is "pretty" unique, there are cases where it is ambiguous: for example, 3 new tabs opened to http://www.google.com would all have the same key in common (and this kind of thing happens pretty often in practice).
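As a rough sketch of the constructed key described above: the helper name buildTabKey is hypothetical, and in a real content script the three arguments would come from location.href, document.referrer and history.length. It also shows the ambiguous case, where fresh tabs on the same page end up with identical keys.

```javascript
// Hypothetical helper: serialize the three document attributes into one key.
// In a content script you would call:
//   buildTabKey(location.href, document.referrer, history.length)
function buildTabKey(href, referrer, historyLength) {
  // JSON.stringify keeps the boundaries between the three parts unambiguous
  return JSON.stringify([href, referrer, historyLength]);
}

// Two fresh tabs on the same page produce the same key (the ambiguous case).
var keyA = buildTabKey('http://www.google.com/', '', 1);
var keyB = buildTabKey('http://www.google.com/', '', 1);
// A tab that has navigated further has a longer history and a different key.
var keyC = buildTabKey('http://www.google.com/', '', 2);
```

Serializing to a string is only one option; it is convenient here because the result can be used directly as an object key or a storage key.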
The "put GUID in sessionStorage" method can additionally be used to disambiguate between multiple tabs with the same constructed key for the close/undo-close and duplicated-tab cases during the current browser session. But this does not solve the ambiguity problem between browser restarts.
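For illustration, the "put GUID in sessionStorage" step could look roughly like this. The names getOrCreateTabGuid and makeFakeStorage are made up for the sketch, the token is a random string rather than a real GUID, and the in-memory storage only exists so the snippet runs outside a browser (in a page you would pass window.sessionStorage instead).

```javascript
// Hypothetical helper: return the token stored under `key`, creating it on first use.
// `storage` is anything with getItem/setItem, e.g. window.sessionStorage.
function getOrCreateTabGuid(storage, key) {
  var existing = storage.getItem(key);
  if (existing !== null) {
    return existing;
  }
  // Not a real RFC 4122 GUID, just a random token for illustration
  var guid = Date.now().toString(36) + '-' + Math.random().toString(36).slice(2);
  storage.setItem(key, guid);
  return guid;
}

// In-memory stand-in for sessionStorage (one instance per simulated tab)
function makeFakeStorage() {
  var data = {};
  return {
    getItem: function (k) {
      return Object.prototype.hasOwnProperty.call(data, k) ? data[k] : null;
    },
    setItem: function (k, v) { data[k] = String(v); }
  };
}

var tabOne = makeFakeStorage();
var tabTwo = makeFakeStorage();
var first = getOrCreateTabGuid(tabOne, 'tabGuid');
var again = getOrCreateTabGuid(tabOne, 'tabGuid'); // same tab: same token
var other = getOrCreateTabGuid(tabTwo, 'tabGuid'); // different tab: new token
```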
This last ambiguity can be partially mitigated during session restore by observing which tabs Chrome opens together in which windows, and extrapolating for a given ambiguous key which tab belongs to which window based on the presence of expected 'sibling' tabs (recorded during the previous browser session). As you might imagine, implementing this solution is quite involved and rather dodgy. And it can only disambiguate between same-keyed tabs that Chrome restores into different windows. That leaves same-keyed tabs that restore into the same window as irreconcilably ambiguous.
Is there a better way? A guaranteed unique, browser-generated, per-tab GUID that persists between browser restarts (session restores) and close/undo-close would be ideal but so far I haven't found anything like this.

The question here does most of the discovery work, and the accepted answer basically completes it, but there's a big implementation gap still for people looking to implement something which requires persistent tab IDs. I've attempted to distill this into an actual implementation.
To recap: Tabs can be (almost) uniquely and consistently identified as required by the question by maintaining a register of tabs which stores the following combination of variables in local persistent storage:
Tab.id
Tab.index
A 'fingerprint' of the document open in the tab - [location.href, document.referrer, history.length]
These variables can be tracked and stored in the registry using listeners on a combination of the following events:
onUpdated
onCreated
onMoved
onDetached
onAttached
onRemoved
onReplaced
There are still ways to fool this method, but in practice they are probably pretty rare - mostly edge cases.
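To make the idea of the register a bit more concrete, here is a minimal sketch of the matching step after a session restore. This is not the code from the library; matchRestoredTabs, persistentId and the record shapes are assumptions made for the example. Saved records are matched to current tabs by fingerprint, and when several records share a fingerprint the one with the closest stored index wins.

```javascript
// Hypothetical matcher: pair tabs restored by the browser with saved records.
// saved:   [{ persistentId, index, fingerprint }]  (from the previous session)
// current: [{ id, index, fingerprint }]            (tabs as they exist now)
// Returns an object mapping current tab id -> persistentId.
function matchRestoredTabs(saved, current) {
  var pool = saved.slice(); // copy, so matched records can be removed
  var result = {};
  current.forEach(function (tab) {
    var best = -1;
    for (var i = 0; i < pool.length; i++) {
      if (pool[i].fingerprint !== tab.fingerprint) continue;
      // prefer the record whose stored index is closest to the tab's index
      if (best === -1 ||
          Math.abs(pool[i].index - tab.index) < Math.abs(pool[best].index - tab.index)) {
        best = i;
      }
    }
    if (best !== -1) {
      result[tab.id] = pool[best].persistentId;
      pool.splice(best, 1); // each saved record is used at most once
    }
  });
  return result;
}

var saved = [
  { persistentId: 'a', index: 0, fingerprint: 'google|1' },
  { persistentId: 'b', index: 1, fingerprint: 'google|1' },
  { persistentId: 'c', index: 2, fingerprint: 'docs|3' }
];
var current = [
  { id: 101, index: 0, fingerprint: 'google|1' },
  { id: 102, index: 1, fingerprint: 'google|1' },
  { id: 103, index: 2, fingerprint: 'docs|3' }
];
var mapping = matchRestoredTabs(saved, current);
```

Tabs that share both fingerprint and index across different windows would still be ambiguous with this sketch; adding the window to the record is one way to narrow that down.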
Since it looks like I'm not the only one who has needed to solve this problem, I built my implementation as a library with the intention that it could be used in any Chrome extension. It's MIT licensed and available on GitHub for forking and pull requests (in fact, any feedback would be welcome - there are definitely possible improvements).

If I understand your problem correctly, your 5th method should do the trick, combined with these two additional criteria:
tab.windowId (The ID of the window the tab is contained within)
tab.index (The zero-based index of the tab within its window)
All these values need to be stored inside your extension. Besides that, you will also have to hook your extension up to chrome.tabs.onUpdated and update the stored values accordingly when tabs are dragged around, moved across owner windows, etc.

Put this as a persistent background script in manifest.json:
"background": {
"scripts": [ "background.js" ],
"persistent": true
},
Here is background.js.
Hopefully the code is self explanatory.
var tabs_hashes = {};
var tabs_hashes_save_queued = false;
function Start(){
chrome.tabs.query({windowType: "normal"}, function(querytabs){
querytabs.forEach(function(tab){
tabs_hashes[tab.id] = GetHash(tab.url);
});
if (localStorage.getItem("tabs_hashes") !== null){
var ref_load = JSON.parse(localStorage["tabs_hashes"]);
var ref_tabId = {};
querytabs.forEach(function(tab){
for (var t = 0; t < ref_load.length; t++){
if (ref_load[t][1] === tabs_hashes[tab.id]){
ref_tabId[ref_load[t][0]] = tab.id;
ref_load.splice(t, 1);
break;
}
}
});
// do what you have to do to convert previous tabId to the new one
// just use ref_tabId[your_previous_tabId] to get the current corresponding new tabId
console.log(ref_tabId);
}
});
}
function SaveHashes(){
if (!tabs_hashes_save_queued && Object.keys(tabs_hashes).length > 0){
tabs_hashes_save_queued = true;
chrome.tabs.query({windowType: "normal"}, function(querytabs){
var data = [];
querytabs.forEach(function(tab){
if (tabs_hashes[tab.id]){
data.push([tab.id, tabs_hashes[tab.id]]);
} else {
data.push([tab.id, GetHash(tab.url)]);
}
});
localStorage["tabs_hashes"] = JSON.stringify(data);
setTimeout(function(){ tabs_hashes_save_queued = false; }, 1000);
});
}
}
function GetHash(s){
var hash = 0;
if (s.length === 0){
return 0;
}
for (var i = 0; i < s.length; i++){
hash = (hash << 5)-hash;
hash = hash+s.charCodeAt(i);
hash |= 0;
}
return Math.abs(hash);
}
chrome.tabs.onCreated.addListener(function(tab){
SaveHashes();
});
chrome.tabs.onAttached.addListener(function(tabId){
SaveHashes();
});
chrome.tabs.onRemoved.addListener(function(tabId){
delete tabs_hashes[tabId];
SaveHashes();
});
chrome.tabs.onDetached.addListener(function(tabId){
SaveHashes();
});
chrome.tabs.onUpdated.addListener(function(tabId, changeInfo){
if (changeInfo.pinned != undefined || changeInfo.url != undefined){
delete tabs_hashes[tabId];
SaveHashes();
}
});
chrome.tabs.onMoved.addListener(function(tabId){
SaveHashes();
});
chrome.tabs.onReplaced.addListener(function(addedTabId, removedTabId){
delete tabs_hashes[removedTabId];
SaveHashes();
});
Start();
I use an array to save the data because that preserves the tab order, which is not guaranteed if the data is saved in an object. When loading the data after a browser restart, even if a URL is not unique, I can trust that it will be found at a "close enough" index. I could make this more complex, for example by doing a reverse check if a tab was not found, but this works OK so far.

Related

Set document title as constant and override dynamic updating by page (e.g. 'Facebook' -> '(1) Facebook') in Chrome extension [duplicate]

I'm creating a Google Chrome extension and I need to detect when a page's title changes. The page's title changes the way it does on Twitter: (num) Twitter. When a new tweet is posted, the number increments.
I'm trying to detect the title changes of a URL that's loaded in one of my tabs and play a beep sound whenever there's a difference. This check is to be done at a repeated interval, and I think that can be accomplished using the setTimeout() function.
I've created a manifest.json as follows:
{
"manifest_version": 2,
"name": "Detect Page Title Changes",
"description": "Blah",
"version": "1.0",
"browser_action": {
"default_icon": "icon.png",
"default_popup": "background.html"
},
"permissions": [
"tabs"
]
}
However, I'm clueless about the rest. I've searched through the docs and tried the solutions in similar Stack Overflow threads, but couldn't find anything that suits my requirements.
Do you have any suggestions? Please include an example, if possible.

Instead of arguing in comments that a certain approach is better, let me be more constructive and add an answer by showing a particular implementation I co-wrote myself, and explain some gotchas you may run into. Code snippets refer to a service different from Twitter, but the goal was the same. In fact, this code's goal is to report the exact number of unread messages, so yours might be simpler.
My approach is based on an answer here on SO, and instead of being polling-driven (check condition at fixed intervals) is event-driven (be notified of potential changes in condition).
Advantages include immediate detection of a change (which would otherwise not be detected until the next poll) and not wasting resources on polls while the condition does not change. Admittedly, the second argument hardly applies here, but the first one still stands.
Architecture at a glance:
Inject a content script into the page in question.
Analyze initial state of the title, report to background page via sendMessage.
Register a handler for a title change event.
Whenever the event fires and the handler is called, analyze the new state of the title, report to background page via sendMessage.
Already step 1 has a gotcha to it. Normal content script injection mechanism, when the content script is defined in the manifest, will inject it in pages upon navigation to a page that matches the URL.
"content_scripts": [
{
"matches": [
"*://theoldreader.com/*"
],
"js": ["observer.js"],
"run_at": "document_idle"
}
]
This works pretty well, until your extension is reloaded. This can happen in development as you're applying changes you've made, or in deployed instances as it is auto-updated. What happens then is that content scripts are not re-injected in existing open pages (until navigation happens, like a reload). Therefore, if you rely on manifest-based injection, you should also consider including programmatic injection into already-open tabs when extension initializes:
function startupInject() {
chrome.tabs.query(
{url: "*://theoldreader.com/*"},
function (tabs) {
for (var i = 0; i < tabs.length; i++) {
chrome.tabs.executeScript(tabs[i].id, {file: "observer.js"});
}
}
);
}
On the other end, content script instances that were active at the time of extension reload are not terminated, but are orphaned: any sendMessage or similar request will fail. It is, therefore, recommended to always check for exceptions when trying to communicate with the parent extension, and self-terminate (by removing handlers) if it fails:
try {
chrome.runtime.sendMessage({'count' : count});
} catch(e) { // Happens when parent extension is no longer available or was reloaded
console.warn("Could not communicate with parent extension, deregistering observer");
observer.disconnect();
}
Step 2 also has a gotcha to it, though it depends on the specifics of the service you're watching. Some pages inside the scope of the content script will not show the number of unread items, but it does not mean that there are no new messages.
After observing how the web service works, I concluded that if the title changes to something without navigation, it's safe to assume the new value is correct, but for the initial title, "no new items" should be ignored as unreliable.
So, the analysis code accounts for whether it's the initial reading or handling an update:
function notify(title, changed) {
// ...
var match = /^\((\d+)\)/.exec(title);
var match_zero = /^The Old Reader$/.exec(title);
if (match && match[1]) {
count = match[1];
} else if (match_zero && changed) {
count = 0;
}
// else, consider that we don't know the count
//...
}
It is called with the initial title and changed = false in step 2.
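Since the snippet above elides parts of notify, here is a self-contained sketch of just the title analysis. The function name parseUnreadCount and the choice of null for "count unknown" are my own assumptions, not part of the original extension.

```javascript
// Hypothetical, self-contained version of the title analysis.
// Returns the unread count, or null when the count is unknown.
function parseUnreadCount(title, changed) {
  var match = /^\((\d+)\)/.exec(title);
  if (match) {
    return parseInt(match[1], 10);
  }
  // A bare site title only means "zero" if it was reached by a title change
  if (/^The Old Reader$/.test(title) && changed) {
    return 0;
  }
  return null;
}

var initialWithCount = parseUnreadCount('(3) The Old Reader', false); // 3
var initialBare = parseUnreadCount('The Old Reader', false);          // null
var changedToBare = parseUnreadCount('The Old Reader', true);         // 0
```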
Steps 3 & 4 are the main answer to "how to watch for title changes" (in an event-driven way).
var target = document.querySelector('head > title');
var observer = new window.MutationObserver(
function(mutations) {
mutations.forEach(
function(mutation){
notify(mutation.target.textContent, true);
}
);
}
);
observer.observe(target, { subtree: true, characterData: true, childList: true });
For specifics as to why certain options of observer.observe are set, see the original answer.
Note that notify is called with changed = true, so going from "(1) The Old Reader" to "The Old Reader" without navigation is considered to be a "true" change to zero unread messages.

Put chrome.tabs.onUpdated.addListener in your background script:
chrome.tabs.onUpdated.addListener(function(tabId, changeInfo, tab) {
console.log(changeInfo);
});
changeInfo is an object that lists what changed in the tab; when the page title changes, it contains a title property with the new title.
You can then filter on the object so that an action only occurs if changeInfo includes a title change. For additional manipulation, e.g. responding to page title changes with page content / actions, you can send a message to a content script from inside the listener once your conditions are met.
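A minimal sketch of that filtering, assuming a handler named onTabUpdated and a seenTitles array that exist only for this example. The registration is guarded so the snippet does nothing outside an extension background script.

```javascript
// Hypothetical handler: react only when the update carries a new title.
var seenTitles = [];
function onTabUpdated(tabId, changeInfo) {
  if (changeInfo.title === undefined) {
    return false; // some other property changed (status, url, ...)
  }
  seenTitles.push({ tabId: tabId, title: changeInfo.title });
  return true;
}

// Register only when running inside an extension background script
if (typeof chrome !== 'undefined' && chrome.tabs && chrome.tabs.onUpdated) {
  chrome.tabs.onUpdated.addListener(onTabUpdated);
}
```

The return value is only there to make the handler easy to check by hand; chrome.tabs.onUpdated itself ignores it.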

Create an event page.
Create a content script that gets injected into a webpage when a webpage loads.
Within the content script, use setInterval to poll the page to see if window.document.title changes.
If the title has changed, use chrome.runtime.sendMessage to send a message to your event page.
On your event page, listen for messages with chrome.runtime.onMessage and play a sound.
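The polling steps above could be sketched like this. createTitlePoller is a made-up helper, the message shape { title: ... } and the 1000 ms interval are arbitrary assumptions, and the wiring at the bottom only runs inside a content script.

```javascript
// Hypothetical poller: `getTitle` returns the current title, `onChange` is
// called with (newTitle, oldTitle) whenever two consecutive reads differ.
function createTitlePoller(getTitle, onChange) {
  var last = getTitle();
  return function check() {
    var now = getTitle();
    if (now !== last) {
      var previous = last;
      last = now;
      onChange(now, previous);
    }
  };
}

// Content-script wiring, skipped when not running in an extension page
if (typeof document !== 'undefined' && typeof chrome !== 'undefined' &&
    chrome.runtime && chrome.runtime.sendMessage) {
  var check = createTitlePoller(
    function () { return document.title; },
    function (title) { chrome.runtime.sendMessage({ title: title }); }
  );
  setInterval(check, 1000); // the 1000 ms interval is an arbitrary choice
}
```

On the event page, a chrome.runtime.onMessage listener would then receive the message and play the sound.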

After researching Chrome's tabs API, it doesn't look like anything stands out to help you directly. However, you should be able to attach an event listener to the title node of the tab(s) you're interested in. The DOMSubtreeModified mutation event works in Chrome, and a quick test in a normal html document proves to work for me - should be no different from within an extension.
var title = document.getElementsByTagName('title')[0];
if (title) {
title.addEventListener('DOMSubtreeModified', function (e) {
// title changed
}, false);
}

Close all tabs to right of current tab with JavaScript or greasemonkey?

I'm currently coding a bot that has to open a large number of tabs on every iteration. To make the bot fully automatic, I have to find a way to close all of them except the original one the bot is running from. (The tabs have to be closed before the next iteration, or what it's doing fails.)

I found a way to actually do this using a function that would detect the URL and do a specific function for that and then close it. Here's the code...
if (window.location.href.indexOf("https://www.google.com") != -1) {
So this detects that my URL contains a certain string and therefore activates on all pages after this URL. It works for what I want.

Not possible generally via userscripts because modern browsers block the attempts to close tabs/windows (Firefox has a config value to allow it but not all users would be willing to enable it). You will have to convert the userscript to an extension/addon.
In case you don't mind changing the default browser config to allow scripts to close tabs, use GM_setValue to raise a flag that will be periodically checked by your script in other tabs:
var dontCloseMe = false;
setInterval(function() {
var shouldClose = Date.now() - GM_getValue("terminate", 0) < 2 * 100;
if (shouldClose && !dontCloseMe) {
window.close();
}
}, 100);
.................
if (shouldCloseOtherTabs) {
dontCloseMe = true;
GM_setValue("terminate", Date.now());
}
And make sure the // @include actually includes the URLs of those other tabs.

Are cookie read/write atomic in browser

I am trying to implement a cross-tab mutex for my needs. I found an implementation here, which seems quite promising. Basically, it implements Leslie Lamport's algorithm, which needs atomic reads/writes for creating a mutex.
However it relies on localStorage providing atomic read/writes. This works well in most browsers except for Chrome.
So my question is, can I use cookie read/write instead? Are cookie reads/writes atomic in all mainstream browsers (IE, Chrome, Safari, Firefox)?

Neither cookies nor localStorage provide atomic transactions.
I think you might have misunderstood that blog post. It doesn't say that his implementation doesn't work in Chrome, and it does not rely on localStorage providing atomic reads/writes. He says that normal localStorage access is more volatile in Chrome. I'm assuming this is related to the fact that Chrome uses a separate process for each tab, whereas most other browsers tend to use a single process for all tabs. His code implements a locking system on top of localStorage, which should protect against things getting overwritten.
Another solution would be to use IndexedDB. IndexedDB does provide atomic transactions. Being a new standard it is not supported in as many browsers as localStorage, but it does have good support in recent versions of Firefox, Chrome and IE10.

No. Even if the browsers probably implement a read and a write lock on the cookie, it won't protect you from changes that happen between a read and a subsequent write. This is easy to see by looking at the JavaScript API for cookies: there is no mutex functionality there...
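To illustrate the point about changes between a read and a subsequent write, here is a small simulation. It does not touch real cookies; the shared object and the makeTab helper are stand-ins invented for the example, and the interleaving is forced by hand.

```javascript
// A shared store standing in for document.cookie or localStorage
var shared = { counter: 0 };

// Each simulated tab increments in two separate steps: read, then write
function makeTab() {
  var seen;
  return {
    read: function () { seen = shared.counter; },
    write: function () { shared.counter = seen + 1; }
  };
}

var tabA = makeTab();
var tabB = makeTab();

// Interleaving: both tabs read before either one writes
tabA.read();
tabB.read();
tabA.write();
tabB.write();

// Two increments were attempted, but one was lost
var finalValue = shared.counter; // 1, not 2
```

Each individual read and write here is "atomic", yet the update is still lost, which is why atomic single operations alone are not enough for a mutex-free counter.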

I ran into this concurrency issue using localStorage today (two years later..)
Scenario: Multiple tabs of a browser (e.g. Chrome) have identical script code that gets executed, basically at the same time (called by e.g. SignalR). The code reads and writes to localStorage. Since the tabs run in different processes but access the shared local storage collectively, reading and writing leads to undefined results since a locking mechanism is missing here. In my case I wanted to make sure that only one of the tabs actually works with the local storage and not all of them..
I tried the locking mechanism of Benjamin Dumke-von der Ehe mentioned in the question above but got undesired results. So I decided to roll my own experimental code:
localStorage lock:
Object.getPrototypeOf(localStorage).lockRndId = new Date().getTime() + '.' + Math.random();
Object.getPrototypeOf(localStorage).lock = function (lockName, maxHold, callback) {
var that = this;
var wait = setInterval(function() {
// re-read the lock and the clock on every attempt, so that a release
// (or an expired hold) by another tab is actually noticed
var value = that.getItem(lockName);
var now = new Date().getTime();
if ((value == null) || (parseInt(value.split('_')[1], 10) + maxHold < now)) {
var token = that.lockRndId + '_' + now;
that.setItem(lockName, token);
setTimeout(function () {
// if our token is still there after 100 ms, no other tab overwrote it
if (that.getItem(lockName) == token) {
clearInterval(wait);
try { callback(); }
catch (e) { throw 'exception in user callback'; }
finally { that.removeItem(lockName); }
}
}, 100);
}
}, 200);
};
usage:
localStorage.lock(lockName, maxHold, callback);
lockName - a global scope unique name for the lock - string
maxHold - the maximum time to protect the script in milliseconds - integer
callback - the function containing the script that gets protected
example: "only play a sound in one tab"
//var msgSound = new Audio('/sounds/message.mp3');
localStorage.lock('lock1', 5000, function(){
// only one of the tabs / browser processes gets here at a time
console.log('lock acquired:' + new Date().getTime());
// work here with local storage using getItem, setItem
// e.g. only one of the tabs is supposed to play a sound and only if none played it within 3 seconds
var tm = new Date().getTime();
if ((localStorage.lastMsgBeep == null)||(localStorage.lastMsgBeep <tm-3000)) {
localStorage.lastMsgBeep = tm;
//msgSound.play();
console.log('beep');
}
});

How to share a data between a window and a frame in JavaScript

This is specific to WebKit browsers (meaning that I only need to make it work in WebKit-based browsers, i.e. iOS/Android browsers, but I'm testing in Chrome).
I have a page. The page loads one or more iframes, with contents from another domain. I need to receive messages (using postMessage()) from these iframes, and I need to be able to identify which iframe a specific message came from.
I can't find a way to do that which does not involve putting something into the iframe URL that the iframe contents can then pass back to me. I would like to not have to meddle with the URL, as there is no guarantee I can safely do that (redirects can throw the parameters out, for example).
I tried something that I thought was reasonable. When I create the iframe element (it's done from JS), I associate a property with the element, let's say 'shared_secret'. When I get the message event back from the frame, I try locating the element that the calling frame was created with, and reading that property.
function onMessage(evt) {
var callerId = evt.source.frameElement.shared_secret;
// ....
}
window.addEventListener('message', onMessage);
var frameEl = document.createElement('iframe');
frameEl.shared_secret = 'sommething blue';
frameEl.src = 'http://aliens.com/my.html';
somewhereInMyDoc.appendChild(frameEl);
When the frame loads, it will run:
window.parent.postMessage('do you know who I am?', '*');
However, frameElement turns out to be undefined in the above onMessage(). I guess this is for security reasons; it does work perfectly when the parent and child are from the same domain.
And it's actually ironic. Parent window can not access event.source.frameElement because event.source is an alien window. iFrame window can not call window.frameElement, because frameElement is in an alien window. So nobody can get access to it.
So, is there something that I can use as a token that I can set on a newly loaded frame, and somehow get back?
Thank you.

For people looking for some code, here is what I used to find the iframe that sent the message:
/**
* Returns the iframe corresponding to a message event or null if not found.
*
* 'e' is the event object
*/
function getFrameTarget (e) {
var frames = document.getElementsByTagName('iframe'),
frameId = 0,
framesLength = frames.length;
for (; frameId < framesLength; frameId++) {
if (frames[frameId].contentWindow === e.source) {
return frames[frameId];
}
}
return null;
}
Thanks to Pawel Veselov and DMoses !

This should be credited to https://stackoverflow.com/users/695461/dmoses.
You can actually compare the content window object of the frame element to the event.source of the message event, and the comparison will yield TRUE if they are, in fact, the same.
So, to solve my particular problem, I'll need to keep the list of frame elements that I've created (sprinkling them, if needed, with whatever additional properties), and when the event comes in, iterating through all, looking for one that has its contentWindow property equal to the event.source property.
UPDATE
Through encountering some nasty bugs, we also found out that you should put an 'id' on the iframe created from within the parent window, especially if that window is itself in an iframe. Otherwise, certain browsers (Android 4.x being known for sure) will yield a true comparison even if the message is being received from a completely different child frame.

using javascript to track another javascript script?

I was just wondering whether there is any way (libraries, frameworks, tutorials) to do JavaScript tracking with another script. Basically, I want to track, as the user works with the site, which function gets executed with what parameters and so on, in as much detail as possible.
thanks a lot!

The extent of detail you're expecting will be challenging for any solution to gather and report on without severely slowing down your scripts -- consider that, for every call, at least 1 other call would need to occur to gather this.
You'd be better to pick a few key events (mouse clicks, etc.) and track only a few details (such as time) for them. If you're using ajax, keep JavaScript and the browser oblivious and just track this on server-side.
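As a rough sketch of tracking only a few key events with minimal detail: createEventLog and the entry shape are assumptions for this example, and the click hook is skipped when there is no document (e.g. when run outside a browser).

```javascript
// Hypothetical lightweight tracker: records only the event type and a timestamp.
function createEventLog() {
  var entries = [];
  return {
    record: function (type) {
      entries.push({ type: type, time: Date.now() });
    },
    // return a copy so callers cannot modify the log by accident
    entries: function () { return entries.slice(); }
  };
}

var log = createEventLog();

// In a page, hook a few key events; skipped outside a browser
if (typeof document !== 'undefined' && document.addEventListener) {
  document.addEventListener('click', function () { log.record('click'); });
}
```

The collected entries could then be sent to the server in batches rather than one request per event.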

There are a few options, but I'm not sure if there are any "great" ones. I take it Firebug/IE Dev toolbar profiling won't work because you are trying to track remote users' actions.
So, one option (which I'm not highly recommending for production purposes) will work in some but not all browsers.
Essentially, you overwrite every function with a wrapper into which you then inject your logging.
(I haven't tested this, I'm trying to recall it from memory... hopefully you get the idea from the "pseudo code"...)
//e.g. get all functions defined on the global window object
function logAll(){
var funcs = [];
for(var i in window){
try {
if(typeof(window[i]) == 'function' && i != 'logAll'){
funcs.push(i);
}
} catch(ex){
//handle as desired
}
}
funcs.forEach(function(name){
var original = window[name];//keep a reference to the old function
//redefine original
window[name] = function(){
//do your logging here...
console.log(name, arguments);
//then call the real function (and pass all params along)
return original.apply(this, arguments);
};
});
}
