Make a JavaScript code (injected with selenium) survive page reload

I'm working on a macro recording and playback system with selenium and JavaScript. At some point I run some JavaScript code that basically subscribes a new event handler to all window events and dumps some event data to localStorage, which I will collect later. The problem is that when the user clicks a link, or the page is reloaded for some other reason, the event handlers are lost. All the data so far is still in localStorage, but I cannot continue collecting new data.
I don't have control of the server, so I cannot insert code in the page source. I can only control the browser using selenium, so all I can do is execute some JavaScript at some point to start dumping events, and some JavaScript at a later point to recover the event data. The user might be browsing StackOverflow, for all I know.
Is there any workaround?
PS: I'm using selenium for python, if that matters.
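For illustration, a minimal sketch of the kind of recorder described above. The function name installRecorder, the storage key "macroEvents", and the recorded fields are assumptions made for this example, not the asker's actual code. Because the log lives in localStorage, running the same script again after a reload keeps appending to the existing data for that origin.

```javascript
// Hypothetical recorder: all names here are illustrative only.
function installRecorder(target, storage, eventTypes, key) {
    key = key || "macroEvents";
    function record(event) {
        // Read what is already stored so data from earlier page loads is kept.
        var log = JSON.parse(storage.getItem(key) || "[]");
        log.push({ type: event.type, time: Date.now() });
        storage.setItem(key, JSON.stringify(log));
    }
    eventTypes.forEach(function (type) {
        // Capture phase: the event is seen before handlers further down
        // the tree get a chance to stop its propagation.
        target.addEventListener(type, record, true);
    });
    return record;
}

// In the browser (for example, in the string passed to Selenium's execute_script):
// installRecorder(window, window.localStorage, ["click", "keydown", "submit"]);
```

The handlers themselves still disappear on every navigation, so something outside the page has to run this script again on each new document.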

If you use a proxy, such as squid, then you can integrate that with an ICAP* server to transform the pages before they arrive at your browser. This would allow any page to be altered before it arrives at the browser, inserting your javascript.
Squid version 3 or later has built-in ICAP client support, so it can pass responses to an ICAP server for modification.
* Internet Content Adaptation Protocol - defined in RFC 3507
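Whatever component rewrites the response, the transformation itself amounts to adding a script tag to each HTML page. The following is only a sketch of that step, not an ICAP implementation; the function name injectScript and the script URL are assumptions, and the URL is assumed to be trusted since it is not escaped.

```javascript
// Hypothetical transform: insert a <script> tag right after the opening <head>.
function injectScript(html, scriptSrc) {
    var tag = '<script src="' + scriptSrc + '"></script>';
    // Match <head> or <head ...attributes>, but not <header>.
    var match = /<head(\s[^>]*)?>/i.exec(html);
    if (!match) {
        // No <head> found: fall back to prepending the tag.
        return tag + html;
    }
    var end = match.index + match[0].length;
    return html.slice(0, end) + tag + html.slice(end);
}

// Example with a made-up URL for the recorder script:
// injectScript(pageHtml, "http://localhost:8000/recorder.js");
```

Placing the tag at the start of the head means the injected script runs before the page's own scripts on every load, including after a reload.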

I think I have a really simple solution:
To make the injection easiest, you can use the Gatejs SPDY/HTTP proxy and its injection opcode, which works on both forward and reverse proxies.
Gatejs injection will try to add your HTML code to any content of type HTML (text/html).
Below is a forward proxy example using injection.
var serverConfig = function(bs) { return({
    hostname: "testServer0",
    runDir: "/tmp/gatejs",
    dataDir: "/path/to/dataDir",
    logDir: "/var/log/gatejs",
    http: {
        testInterface: {
            type: 'forward',
            port: 8080,
            pipeline: 'pipetest'
        },
    },
    pipeline: {
        pipetest: [
            ['injection', {
                code: "<h1>w00t injection</h1>"
            }],
            ['proxyPass', { mode: 'host', timeout: 10 }]
        ],
    }
})};
mk-

Related

How to obtain and manipulate the requestheaders using javascript in console?

I've encountered a paywall and I'm trying to bypass it using javascript in the console. I did some research and found a few different approaches, one of which is changing the requestheader in order to make a given website believe that you got there through a twitter link (thus allowing you to view the content for free). The function I use aims to change the referer by listening to the onBeforeSendHeaders event as specified on https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onBeforeSendHeaders. It looks like the following (NOTE: This function is typed and executed directly inside of the devtools console):
function setReferer(x) {
    // Drop any existing Referer header
    x.requestHeaders = x.requestHeaders.filter(function(header) {
        if (header.name === 'Referer') {
            return false;
        }
        return true;
    });
    x.requestHeaders.push({
        "name": "Referer",
        "value": "https://t.co/" // Twitter website
    });
    return {requestHeaders: x.requestHeaders};
}
// this example uses the Chrome browser
chrome.webRequest.onBeforeSendHeaders.addListener(setReferer,
    {
        urls: ["<all_urls>"],
        types: ["main_frame"]
    },
    ["requestHeaders", "blocking", "extraHeaders"] // extraHeaders meant to bypass CORS protocol
);
Unfortunately, upon refreshing the window, this approach gives me the following error:
GET <some_url> net::ERR_BLOCKED_BY_CLIENT
Behind this error is the URL to the source code of the article, which I was able to load and copy into Word, so I got the article I was looking for anyway. However, I wasn't able to view it inside the browser's main frame. Note that I am doing this only for the purpose of polishing my coding skills. I am trying to get a better understanding of the more complicated facets of the HTTP protocol, especially the way headers get sent client-side and interpreted server-side. If anyone knows more about the subject or has a resource that he or she wants to share, this would be greatly appreciated!

Is there an alternative to preprocessorScript for Chrome DevTools extensions?

I want to create a custom profiler for Javascript as a Chrome DevTools Extension. To do so, I'd have to instrument all Javascript code of a website (parse to AST, inject hooks, generate new source). This should've been easily possible using chrome.devtools.inspectedWindow.reload() and its parameter preprocessorScript described here: https://developer.chrome.com/extensions/devtools_inspectedWindow.
Unfortunately, this feature has been removed (https://bugs.chromium.org/p/chromium/issues/detail?id=438626) because nobody was using it.
Do you know of any other way I could achieve the same thing with a Chrome Extension? Is there any other way I can replace an incoming Javascript source with a changed version? This question is very specific to Chrome Extensions (and maybe extensions to other browsers), I'm asking this as a last resort before going a different route (e.g. dedicated app).

Use the Chrome Debugging Protocol.
First, use DOMDebugger.setInstrumentationBreakpoint with eventName: "scriptFirstStatement" as a parameter to add a break-point to the first statement of each script.
Second, in the Debugger Domain, there is an event called scriptParsed. Listen to it and if called, use Debugger.setScriptSource to change the source.
Finally, call Debugger.resume each time after you edited a source file with setScriptSource.
Example in semi-pseudo-code:
// Prevent code being executed
cdp.sendCommand("DOMDebugger.setInstrumentationBreakpoint", {
    eventName: "scriptFirstStatement"
});
// Enable Debugger domain to receive its events
cdp.sendCommand("Debugger.enable");
cdp.addListener("message", (event, method, params) => {
    // Script is ready to be edited
    if (method === "Debugger.scriptParsed") {
        cdp.sendCommand("Debugger.setScriptSource", {
            scriptId: params.scriptId,
            scriptSource: `console.log("edited script ${params.url}");`
        }, (err, msg) => {
            // After editing, resume code execution.
            cdp.sendCommand("Debugger.resume");
        });
    }
});
The implementation above is not ideal. It should probably listen to the breakpoint event, get to the script using the associated event data, edit the script and then resume. Listening to scriptParsed and then resuming the debugger are two things that shouldn't be together, it could create problems. It makes for a simpler example, though.
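A rough sketch of that alternative, using the same hypothetical cdp client as the example above (sendCommand with a callback, addListener for messages). It reacts to Debugger.paused, takes the script id from the top call frame, fetches the source with Debugger.getScriptSource, rewrites it, and only then resumes. The function name instrumentOnPause and the transform callback are assumptions, and whether an edit made while paused on the first statement takes effect in every case has not been verified here.

```javascript
// Hypothetical variant: edit the script when the breakpoint is hit,
// instead of when the script is parsed.
function instrumentOnPause(cdp, transform) {
    cdp.sendCommand("Debugger.enable", {}, function () {});
    cdp.sendCommand("DOMDebugger.setInstrumentationBreakpoint", {
        eventName: "scriptFirstStatement"
    }, function () {});
    cdp.addListener("message", function (event, method, params) {
        if (method !== "Debugger.paused") {
            return;
        }
        // The top call frame belongs to the script about to run.
        var scriptId = params.callFrames[0].location.scriptId;
        cdp.sendCommand("Debugger.getScriptSource", { scriptId: scriptId }, function (err, result) {
            if (err) {
                // Could not read the source: let the page continue unchanged.
                cdp.sendCommand("Debugger.resume", {}, function () {});
                return;
            }
            cdp.sendCommand("Debugger.setScriptSource", {
                scriptId: scriptId,
                scriptSource: transform(result.scriptSource)
            }, function () {
                // Resume only after the edit has been acknowledged.
                cdp.sendCommand("Debugger.resume", {}, function () {});
            });
        });
    });
}
```

Tying the resume to the pause that triggered the edit avoids calling Debugger.resume for scripts that never paused.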

On HTTP you can use the chrome.webRequest API to redirect requests for JS code to data URLs containing the processed JavaScript code.
However, this won't work for inline script tags. It also won't work on HTTPS, since the data URLs are considered unsafe. In addition, data URLs can't be longer than 2MB in Chrome, so you won't be able to redirect to large JS files.
If the exact order of execution of each script isn't important you could cancel the script requests and then later send a message with the script content to the page. This would make it work on HTTPS.
To address both issues you could redirect the HTML page itself to a data URL, in order to gain more control. That has a few negative consequences though:
1. Can't reload page because URL is fixed to data URL
2. Need to add or update <base> tag to make sure stylesheet/image URLs go to the correct URL
3. Breaks ajax requests that require cookies/authentication (not sure if this can be fixed)
4. No support for localStorage on data URLs
Not sure if this works: in order to fix #1 and #4 you could consider setting up an HTML page within your Chrome extension and then using that as the base page instead of a data URL.
Another idea that may or may not work: Use chrome.debugger to modify the source code.
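The data URL redirect mentioned at the start of this answer could look roughly like the sketch below. The helper names toDataUrl and makeRedirectListener and the rewrite callback are assumptions. A blocking webRequest listener has to answer synchronously, so this assumes the processed source for a URL is already available when the request is made.

```javascript
// Encode JavaScript source as a data URL.
function toDataUrl(source) {
    return "data:text/javascript;charset=utf-8," + encodeURIComponent(source);
}

// Build a blocking listener. `rewrite` is a hypothetical function that
// returns the processed source for a URL, or null to leave the request alone.
function makeRedirectListener(rewrite) {
    return function (details) {
        var replacement = rewrite(details.url);
        if (replacement === null) {
            return {};
        }
        return { redirectUrl: toDataUrl(replacement) };
    };
}

// In an extension background page with the "webRequest" and
// "webRequestBlocking" permissions (myRewrite is a placeholder):
// chrome.webRequest.onBeforeRequest.addListener(
//     makeRedirectListener(myRewrite),
//     { urls: ["http://*/*"], types: ["script"] },
//     ["blocking"]);
```

The limitations listed in the answer still apply: inline scripts are not covered and large files may exceed the data URL size limit.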

How to start two or more custom URL Protocol from Javascript

I have an old html page that creates a script file and executes it using:
fsoObject = new ActiveXObject("Scripting.FileSystemObject")
wshObject = new ActiveXObject("WScript.Shell")
I am trying to modify it and make it usable also from other browsers. If you know the answer stop reading and please answer. If there is no quick answer, here is the description of my attempts. I was successful in doing the job, but only when the script is shorter than 2000 characters. I need help for scripts longer than 2000 characters.
The webpage is for internal use only, so it is easy for me to create a custom URL protocol on each computer that runs a VBScript file from a network drive.
I created my custom URL Protocol that starts a VBScript file like this:
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\MyUrlProtocol]
"URL Protocol"=""
@="Url:MyUrlProtocol"
"UseOriginalUrlEncoding"=dword:00000001
[HKEY_CLASSES_ROOT\MyUrlProtocol\DefaultIcon]
@="C:\\Windows\\System32\\WScript.exe"
[HKEY_CLASSES_ROOT\MyUrlProtocol\shell]
[HKEY_CLASSES_ROOT\MyUrlProtocol\shell\open]
[HKEY_CLASSES_ROOT\MyUrlProtocol\shell\open\command]
@="C:\\Windows\\System32\\WScript.exe \"X:\\MyUrlProtocol.vbs\" \"%1\""
In MyUrlProtocol.vbs I have this:
MsgBox "The length of the link is " & Len(WScript.Arguments(0)) & " characters"
MsgBox "The content of the link is: " & WScript.Arguments(0)
When I click on click me I see two messages, so everything works well (tested with Chrome and IE in Windows 7.)
It works also when I execute document.getElementById("test").click()
I thought this could be the solution: I would pass the text of the script to the VBS static script, which would create the dynamic script and run it, but with this system I can't pass more than ~2000 characters.
So I tried to split the text of the script in chunks smaller than 2000 characters and simulate several clicks on the link, but only the first one works.
So I tried with xmlhttp.open("GET","MyUrlProtocol:test",false);, but Chrome says Cross origin requests are only supported for HTTP.
Is it possible to pass more than 2000 characters to a VBScript script via a custom URL protocol?
If not, is it possible to call several custom URL protocols in sequence?
If not, is there another way to create a script file and run it from Javascript?
EDIT 1
I found a solution, but in Chrome only works when it likes, so I'm back to square one.
The code below in IE executes the script 4 times (correct), but in Chrome only the first execution runs.
If I change it to delay += 2000, then Chrome usually runs the script 2 times, but sometimes 1 and sometimes 3 or even 4 times.
If I change it to delay += 10000, then it usually runs the script 4 times, but sometimes misses one.
The function is always executed 4 times, both in Chrome and IE. What is weird is that the sr.click() sometimes does nothing and the function execution continues.
<HTML>
<HEAD>
<script>
var delay;
function runScript(text) {
    setTimeout(function(){runScript2(text)}, delay);
    delay += 100;
}
function runScript2(text) {
    var sr = document.getElementById('scriptRunner');
    sr.href='intelliclad:'+text;
    sr.click();
}
function test(){
    delay = 0;
    runScript("uno");
    runScript("due");
    runScript("tre");
    runScript("quattro");
}
</script>
</HEAD>
<BODY>
<input type="button" value="Run test" onclick="test()">
<a id="scriptRunner">scriptRunner</a>
</BODY>
</HTML>
EDIT 2
I tried with Luke's suggestion of setting the next timeout from inside the call back but nothing changed (IE works always, Chrome whenever it likes).
Here is the new code:
var scripts;
var delay = 2000;
function runScript() {
    var sr = document.getElementById('scriptRunner');
    sr.href = 'intelliclad:' + scripts.shift();
    sr.click();
    if(scripts.length)
        setTimeout(function() {runScript()}, delay);
}
function test(){
    scripts = ["uno", "due", "tre", "quattro"];
    runScript();
}
Some background: The page asks for the shape of a panel, which can be just a few parameters [nfaces=1, shape1='square', width1=100] or hundreds of parameters for panels with many faces, many slots, many fasteners, etc. After asking for all the parameters a script for our internal 3D CAD (which can be larger than 20KB) is generated and the CAD is started and asked to execute the script.
I would like to do all on the client side, because the page is served by a Domino web server, which can't even dream of managing such a complex script.

I didn't read your whole post...have an answer:
I too wish that custom url protocols can handle long urls. They simply do not. IE is even worse as some OSs only accept 800 chars.
So, here's the solution:
For long urls, only pass a single use token. The vbscript uses the token
and does a url get to your web server to get all of the data.
This is the only way I've been able to successfully pass lots of data around. If you ever find a clearer solution, please remember to post it here.
Update:
Note that this is the best way I have found to deal with the url protocol limitations. I too wish this was not necessary. This does work and works well.
You mentioned Dominos, so possibly you need something in a POS environment... I create a web based POS system, so we could face a lot of the same issues.
Suppose you want a custom url to print a pdf to the default printer without the annoying popup window. We need to do this thousands of times a day...
1. When building the web page, add the print button, which when pressed calls the custom url: myproto://printpdf?id=12345&token=onetimetoken
2. This will execute your vbscript on the local desktop.
3. In your vbscript, parse the arguments and react. In this case, your command is printpdf, the id is 12345, and you have a one-time token key.
4. Have the vbscript do an https get to: https://mydomain.com/APIs/printpdf.whatever?id=12345&key=onetimetoken
5. Check the credentials based on the ip address and token. If all aligns, return the contents of the pdf (you may want to convert the pdf to a byte array string).
6. Now the vbscript has the pdf. Assemble it, write it to a temp folder, then execute a silent pdf print command (I use Sumatra PDF http://blog.kowalczyk.info/software/sumatrapdf/free-pdf-reader.html).
Mission accomplished.
Since I don't know what you want to do in your custom url or the general workflow, I can only describe how I've solved the short url issue.
Using this technique, the possibilities are limitless. You have full control over the local computer running the web browser, and you have a one-time use token which grants access to a web API which can return any sort of information you program.
You could write a custom url protocol to turn on the pizza oven if you wanted :)
If you are not able to create the server side code which is listening for vbscript's get request then this would not work.
You might be able to pass the data from the browser to the vbscript using the clipboard.
Update 2:
Since in this case the data is on the client (one single form can define hundreds of parameters), the server API doesn't know what to answer to the vb script request. So the workflow described above must be preceded by these two steps:
The onkeypress event executes a submit to send the current parameters to the server
The server replies with the refreshed form, adding to the body onload a call to a function which uses another submit to call the custom url, as described on point 1 listed above.
Update 3:
stenci, what you've added (in Update 2) will work. I would do it like this:
1. user presses a button saying I'm done editing the form
2. ajax post the form to the server
3. the server saves the data and attaches unique key to the datastore
4. the server returns the key to ajax callback function
5. now the client has a single use key and invokes the url schema passing the key
6. vbscript does an https get to the server and passes the key
7. server returns the data to the vbscript
It is a bit long winded. Once coded it will work like a charm.
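The server side of the steps above boils down to a store that hands out a key when the form is saved and gives the data back exactly once. Here is a minimal in-memory sketch, assuming a Node.js server (the answer does not name a platform). The names createTokenStore, issue and redeem are made up for this example, and expiry and the IP check mentioned earlier are left out.

```javascript
var crypto = require("crypto");

// Hypothetical single-use token store, kept in memory.
function createTokenStore() {
    var pending = new Map();
    return {
        // Called when the form is posted: save the data, return the key.
        issue: function (data) {
            var token = crypto.randomBytes(16).toString("hex");
            pending.set(token, data);
            return token;
        },
        // Called when the vbscript requests the data: valid only once.
        redeem: function (token) {
            if (!pending.has(token)) {
                return null;
            }
            var data = pending.get(token);
            pending.delete(token);
            return data;
        }
    };
}

// Usage: the key returned by issue() is what goes into the custom url.
// var store = createTokenStore();
// var key = store.issue({ nfaces: 1, shape1: "square", width1: 100 });
// var params = store.redeem(key); // a second redeem(key) returns null
```

Only the short key travels through the custom url, so the length limit no longer depends on the size of the form data.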
The only other alternative I can see is to copy the form data to the clipboard using something like: http://zeroclipboard.org/
and then in vbscript see if you can read the clipboard like: Use clipboard from VBScript

How about creating an iFrame for each instance?
Something like this:
function runScript(text) {
    var iframe = document.createElement('iframe');
    iframe.src = 'intelliclad:'+text;
    document.body.appendChild(iframe);
}
function test(){
    runScript("uno");
    runScript("due");
    runScript("tre");
    runScript("quattro");
}
You can then use css styling to make these iframes transparent / hidden.

You might not like this answer, but I've used this method in the past and it works.
Instead of relying on ActiveX, consider using a Java Applet, and JNI.
Basically, you have to make sure the native scripts you want to run are available on your client machine, along with a JNI wrapper.
The applet will have to be at least self signed, for the browser to allow it to load and access a native library. Once the JNI libraries are loaded, you can easily call methods from the page / applet.
As a consequence of using Java, you could possibly use the same applet for windows as well as linux clients, provided of course you have native libraries present on the respective clients.
This series of articles talks about precisely your problem : http://www.javaworld.com/article/2076775/java-security/escape-the-sandbox--access-native-methods-from-an-applet.html
P.S the article is really old, but the concept remains unchanged.

Firefox addon sdk: retrieving value from a different site based on clipboard content

I just got started with Firefox add-ons to help my team speed up our work. Here is what I am trying to create:
When on a specific site (let's call it mysite.com/input), I want to automatically fill in an input with the id "textinput" using the value stored on the clipboard.
Yes, it sounds simple, and pasting would be enough, wouldn't it? Now here is the twist:
I need another form of the value: on the clipboard it is x/y/z. There is a database site (let's call it database.com) where a search like database.com?s=x/y/z leads directly to a page from which the correct value can be read, as it has an id: #result
I got lost on how to properly communicate between page and content scripts. I'm not even sure in what order I should use the page-mod and the page-worker.
Please help me out! Thank you!

The basic flow is this:
In your content script, you get the value from the form, somehow. I'll leave that up to you.
Still in the content script, you send the data to main.js using self.port.emit:
Code:
self.port.emit('got-my-value', myValue);
In main.js, you would then receive the 'got-my-value' event and make a cross-domain request using the request module.
Code:
// `data` gives access to files in the add-on's data directory
var data = require('self').data;
require('page-mod').PageMod({
    include: 'somesite.com',
    contentScriptFile: data.url('somescript.js'),
    onAttach: function(worker) {
        worker.port.on('got-my-value', function(value) {
            require('request').Request({
                url: 'http://someurl.com',
                onComplete: function(response) {
                    console.log(response);
                    // maybe send data back to worker?
                    worker.port.emit('got-other-data', response.json);
                }
            }).post();
        });
    }
});
If you need to receive the data back in the original worker, you would add another listener for the event coming back.
Code:
self.port.on('got-other-data', function(value) {
    // do something
});

I've been struggling with the same issue for the past 2 days until I found this:
https://developer.mozilla.org/en-US/Add-ons/SDK/Guides/Content_Scripts/Cross_Domain_Content_Scripts
They indicate the following:
However, you can enable these features for specific domains by adding
them to your add-on's package.json under the "cross-domain-content"
key, which itself lives under the "permissions" key:
"permissions": {
"cross-domain-content": ["http://example.org/", "http://example.com/"] }
The domains listed must include the scheme
and fully qualified domain name, and these must exactly match the
domains serving the content - so in the example above, the content
script will not be allowed to access content served from
https://example.com/. Wildcards are not allowed. This feature is
currently only available for content scripts, not for page scripts
included in HTML files shipped with your add-on.
That did the trick for me.

Chrome Extension: Skipping red page on created tab

chrome.tabs.create({
    'url': 'https://www.myserver.com/',
    'selected': false
}, function(tab) {
    chrome.tabs.executeScript(tab.id, {
        'code': "doSomething();"
    });
});
I'm unable to execute the code because there is an invalid certificate on "myserver.com", so Chrome displays the red warning page, which I'm unable to skip in order to run my code.
Is there any way to skip the red page other than adding the certification authority to the trusted list, that is, without any necessary step on the client side?

You cannot inject or manipulate that page, due to security reasons. Which makes sense since that page is there to protect the user :)
The only way to do something like that is through Native Code, NPAPI. You implement a plugin that bypasses it. But as you know, implementing a plugin makes the whole computer vulnerable since you will have access to the entire host machine.
That is why creating plugins is not favoured, but recommended if you absolutely cannot do what you wanted with the current API and limitations.
