Headless JS site testing with test for each page - javascript

I'm trying to write headless integration tests for every page on my site in CoffeeScript/JavaScript and run them in one command. I've tried using CasperJS, but I keep running into issues when attempting to run more than one test suite in a loop of requests.
Ideally I'd like to do something like this:
for page in ['/products', '/about', '/contact']
  open page, ->
    require("tests/#{page}/test.coffee").execute()
Where the test file looks something like:
exports.execute = ->
  test.assert(pageTitleIs('about us'))
That way I could keep the tests for each page in separate files, but run them all headlessly with one command.
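A minimal sketch of the page-to-test-file mapping in plain JavaScript, assuming the tests/<page>/test.coffee layout from the pseudocode above. The helper name testFileFor is hypothetical. Because every page path starts with a slash, interpolating it directly would produce tests//products/test.coffee, so the sketch strips the leading slash first.

```javascript
// Hypothetical helper: maps a page path to the test file that covers it.
// The tests/<page>/test.coffee layout is taken from the pseudocode above.
function testFileFor(page) {
  const name = page.replace(/^\/+/, ''); // drop leading slashes: '/about' -> 'about'
  return 'tests/' + name + '/test.coffee';
}

const pages = ['/products', '/about', '/contact'];
const testFiles = pages.map(testFileFor);
console.log(testFiles.join('\n'));
```

Keeping the mapping in one small function means the loop that opens each page and the loop that loads each test file cannot drift apart.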

casper.thenOpen(url, ->
  @CurrentUrl = url
  @currentRoute = route
  @test.currentSuite = @test.running = @test.started = false # Hack. If we don't do this and a test fails, no tests after it will be executed
  @test.info("Testing #{urlPath}")
  path = currentDirectory + '/root' + route
  if fs.isDirectory(path)
    for file in fs.list(path)
      if file.indexOf('.spec') != -1
        @echo 'executing file: ' + path + '/' + file
        @test.exec(path + '/' + file)
  @test.exec currentDirectory + "/smoke.test.js"
  @waitFor ->
    if test_done # Hack. If we don't do this, new tests will start before old ones finished.
      return true
  test_done = false
)
This is the solution I came up with, where smoke.test.js gets executed for every page. It's pretty hacky and I've run into many issues related to the scope of the tests, but it works.

String interpolation does not work with single quotes. You have to use double-quotes. See the difference:
require('tests/#{page}/test.coffee').execute()
# becomes: require('tests/#{page}/test.coffee').execute();
require("tests/#{page}/test.coffee").execute()
# becomes: require("tests/" + page + "/test.coffee").execute();

Related

Node.js: requesting a page and allowing the page to build before scraping

I've seen some answers to this that refer the asker to other libraries (like PhantomJS), but I'm wondering whether it is at all possible to do this in just Node.js.
Consider my code below. It requests a webpage using request, then uses cheerio to explore the DOM and scrape the page for data. The code itself runs without errors, and if the page had behaved as I expected, it would have produced the file I had in mind.
The problem is that the page I am requesting builds the table I'm looking at asynchronously, using either Ajax or JSONP; I'm not entirely sure how .jsp pages work.
So here I am trying to find a way to "wait" for this data to load before I scrape the data for my new file.
var cheerio = require('cheerio'),
    request = require('request'),
    fs = require('fs');
// Go to the page in question
request({
  method: 'GET',
  url: 'http://www1.chineseshipping.com.cn/en/indices/cbcfinew.jsp'
}, function(err, response, body) {
  if (err) return console.error(err);
  // Tell Cheerio to load the HTML
  var $ = cheerio.load(body);
  // Create an empty object to write to the file later
  var toSort = {};
  // Iterate over the DOM and fill the toSort object
  $('#emb table td.list_right').each(function() {
    var row = $(this).parent();
    toSort[$(this).text()] = {
      [$("#lastdate").text()]: $(row).find(".idx1").html(),
      [$("#currdate").text()]: $(row).find(".idx2").html()
    };
  });
  // Write/overwrite a new file
  var stream = fs.createWriteStream("/tmp/shipping.txt");
  var toWrite = "";
  stream.once('open', function(fd) {
    toWrite += "{\r\n";
    for (var i in toSort) {
      toWrite += "\t" + i + ": { \r\n";
      for (var j in toSort[i]) {
        toWrite += "\t\t" + j + ":" + toSort[i][j] + ",\r\n";
      }
      toWrite += "\t" + "}, \r\n";
    }
    toWrite += "}";
    stream.write(toWrite);
    stream.end();
  });
});
The expected result is a text file with information formatted like a JSON object.
It should contain several entries like this one:
"QINHUANGDAO - GUANGZHOU (50,000-60,000DWT)": {
  "2016-09-29": 26.7,
  "2016-09-30": 26.8,
},
But since the name is the only thing that doesn't load async, (the dates and values are async) I get a messed up object.
I actually tried just adding a setTimeout in various places in the code. The script will only be touched by developers who can afford to run it several times if it fails a few times. So while not ideal, even a setTimeout (up to maybe 5 seconds) would be good enough.
It turns out the setTimeout calls don't work. I suspect that once I request the page, I'm stuck with the snapshot of the page "as is" when I receive it, and that I'm not in fact looking at a live thing that I can wait on while it loads its dynamic content.
I've considered investigating how to intercept the packets as they arrive, but I don't understand HTTP well enough to know where to start.
The setTimeout will not make any difference even if you increase it to an hour. The problem here is that you are making a request against this url:
http://www1.chineseshipping.com.cn/en/indices/cbcfinew.jsp
Their server returns the HTML, and that HTML contains the JS and CSS imports. For your program, that is the end of the story: you have the HTML and nothing more. A browser, by contrast, knows how to parse the HTML document and execute the scripts it references, and that is exactly what your program is missing. Your program does not know that it has to do anything further with the HTML contents. You need to find or write a scraper that can run JavaScript. I found this similar question on Stack Overflow:
Web-scraping JavaScript page with Python
The answer there suggests https://github.com/niklasb/dryscrape, which appears to be able to run JavaScript. It is written in Python, though.
You are trying to scrape the original page, which doesn't include the data you need.
When the page is loaded, the browser evaluates the JS code it includes, and this code knows where and how to get the data.
The first option is to evaluate the same code, as PhantomJS does.
The other (and you seem to be interested in it) is to investigate the page's network activity and work out which additional requests you need to perform to get the data.
In your case, these are:
http://index.chineseshipping.com.cn/servlet/cbfiDailyGetContrast?SpecifiedDate=&jc=jsonp1475577615267&_=1475577619626
and
http://index.chineseshipping.com.cn/servlet/allGetCurrentComposites?date=Tue%20Oct%2004%202016%2013:40:20%20GMT+0300%20(MSK)&jc=jsonp1475577615268&_=1475577620325
In both requests:
_ is a cache-busting parameter that prevents caching.
jc is the name of the JS wrapper function that is invoked with the result (https://en.wikipedia.org/wiki/JSONP)
So, by scraping the table template at http://www1.chineseshipping.com.cn/en/indices/cbcfinew.jsp and performing the two additional requests, you will be able to combine them into the same data structure you see in the browser.
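A minimal sketch of the unwrapping step for such a response, assuming the body has the usual JSONP shape callbackName(payload) and that the payload is valid JSON. Whether these servlets return strict JSON is not confirmed here. The function name parseJsonp and the sample body are made up for illustration.

```javascript
// Strips a JSONP wrapper such as  jsonp123({...});  and parses the payload.
// Assumption: the payload inside the parentheses is valid JSON.
function parseJsonp(body, callbackName) {
  const text = body.trim();
  const prefix = callbackName + '(';
  const end = text.lastIndexOf(')');
  if (!text.startsWith(prefix) || end === -1) {
    throw new Error('Response is not wrapped in ' + callbackName + '(...)');
  }
  return JSON.parse(text.slice(prefix.length, end));
}

// Made-up response body for illustration; the real servlet output may differ.
const sample = 'jsonp1475577615267({"date": "2016-09-30", "value": 26.8});';
const data = parseJsonp(sample, 'jsonp1475577615267');
console.log(data.value);
```

Since jc is chosen by the client, the same name passed in the request URL can be passed to the parser, so the wrapper never has to be guessed.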

javascript testing using sinon.js & Qunit. how to test for window.location.href and avoid downloading

The function is in coffee script:
downloadCSVData: ->
  @interval = $('#line_interval').val()
  csv_data_path = "/api/As/" + "&interval=" + @interval
  window.location.href = csv_data_path
I need to test this function, but I don't know how to check the last line of code. Whenever I call the function, it downloads a file. I wonder if there's a way to call the function without downloading the CSV file, so that I can test whether window.location.href is set to csv_data_path.
Thanks.
You can just trust that window.location will do its thing properly; if it doesn't, you have bigger issues. So just extract the preceding code into a function, and test that:
getCSVURL: ->
  @interval = $('#line_interval').val()
  "/api/As/" + "&interval=" + @interval
downloadCSVData: ->
  window.location.href = @getCSVURL()
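A minimal sketch in plain JavaScript of checking the extracted URL builder without touching window.location. The names makeView and fakeJQuery are hypothetical, and the hand-written fake stands in for jQuery; in a QUnit suite a Sinon stub could play the same role.

```javascript
// Factory so the test can pass in a fake `$`; in the real app this would be jQuery.
// makeView and fakeJQuery are hypothetical names used only for this sketch.
function makeView($) {
  return {
    getCSVURL: function () {
      this.interval = $('#line_interval').val();
      return '/api/As/' + '&interval=' + this.interval;
    }
  };
}

// Fake `$` that returns a fixed value for the interval field.
function fakeJQuery(selector) {
  return {
    val: function () {
      return selector === '#line_interval' ? '15' : undefined;
    }
  };
}

const view = makeView(fakeJQuery);
const url = view.getCSVURL();
console.log(url);
```

Because the sketch never assigns to window.location.href, running it triggers no navigation and no download.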

Inject local .js file into a webpage?

I'd like to inject a couple of local .js files into a webpage. I just mean client side, as in within my browser, I don't need anybody else accessing the page to be able to see it. I just need to take a .js file, and then make it so it's as if that file had been included in the page's html via a <script> tag all along.
It's okay if it takes a second after the page has loaded for the stuff in the local files to be available.
It's okay if I have to be at the computer to do this "by hand" with a console or something.
I've been trying to do this for two days. I've tried Greasemonkey, and I've tried manually loading files using a JavaScript console. It amazes me that there is (apparently) no established way to do this; it seems like such a simple thing to want to do. I guess simple isn't the same thing as common, though.
If it helps, the reason I want to do this is to run a chatbot on a JS-based chat client. Some of the bot's code is mixed into the pre-existing chat code; for that, I have Fiddler intercepting requests to .../chat.js and replacing it with a local file. But I have two .js files which are independent of anything on the page itself. There aren't any .js files requested by the page that I can substitute them for, so I can't use Fiddler.
Since you're already using a Fiddler script, you can do something like this in the OnBeforeResponse(oSession: Session) function:
if (oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
    oSession.hostname.Contains("MY.TargetSite.com")) {
  oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
  // Remove any compression or chunking
  oSession.utilDecodeResponse();
  var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
  // Find the closing HEAD tag, so you can inject a script block just before it.
  var oRegEx = /(<\/head>)/gi;
  // Replace the head-close tag with new-script + head-close
  oBody = oBody.replace(oRegEx, "<script type='text/javascript'>console.log('We injected it');</script></head>");
  // Set the response body to the changed body string
  oSession.utilSetResponseBody(oBody);
}
Working example for www.html5rocks.com:
if (oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
    oSession.hostname.Contains("html5rocks")) { // go to html5rocks.com
  oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
  oSession.utilDecodeResponse();
  var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
  var oRegEx = /(<\/head>)/gi;
  oBody = oBody.replace(oRegEx, "<script type='text/javascript'>alert('We injected it')</script></head>");
  oSession.utilSetResponseBody(oBody);
}
Note that you have to turn streaming off in Fiddler: http://www.fiddler2.com/fiddler/help/streaming.asp and I assume you would also need to decrypt HTTPS traffic: http://www.fiddler2.com/fiddler/help/httpsdecryption.asp
I have been using Fiddler script less and less, in favor of Fiddler .NET extensions: http://fiddler2.com/fiddler/dev/IFiddlerExtension.asp
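The string transformation at the heart of this approach can be sketched in plain JavaScript, outside Fiddler. This variant injects a script tag with a src attribute rather than inline code, which assumes the local file is served from somewhere the page can reach. The function name injectScript, the sample page and the localhost URL are all hypothetical. It also patches only the first closing head tag instead of every match.

```javascript
// Inserts a <script src="..."> tag just before the first closing </head> tag.
// Returns the HTML unchanged when there is no </head> to anchor on.
function injectScript(html, scriptUrl) {
  const tag = '<script type="text/javascript" src="' + scriptUrl + '"></script>';
  // The function form of replace avoids special handling of "$" in the URL.
  return html.replace(/<\/head>/i, function (match) {
    return tag + match;
  });
}

// Hypothetical page and script URL, for illustration only.
const page = '<html><head><title>Chat</title></head><body></body></html>';
const patched = injectScript(page, 'http://localhost:8000/bot.js');
console.log(patched);
```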
If you are using Chrome then check out dotjs.
It will do exactly what you want!
How about just using jquery's jQuery.getScript() method?
http://api.jquery.com/jQuery.getScript/
Save the normal HTML pages to the file system, add the JS files by hand, and then use Fiddler to intercept those calls so that you get your version of the HTML file.

Logging value of a variable in MonkeyTalk IDE Javascript file

I'm using MonkeyTalk IDE Beta2 to test an iPad application. I exported the JavaScript from the MonkeyTalk IDE and got a new .js file. I am storing the Boolean result of a Verify command in a var and want to see what its value is, so that I can run custom logic accordingly. I tried document.write, console.log and alert, as used in JavaScript, but got an error that they are not defined. Please help me with this.
Also, is it possible to output the result of a test as XML (as in FoneMonkey) or as an Excel spreadsheet or something like that?
Thank you in advance.
Believe it or not*, to date there is no direct way to make MonkeyTalk log messages to the console. What you can do, however, is abuse a command like verifyNot, which will result in a log message. In a MonkeyTalk .mt file this would be done like:
View * VerifyNot Message
I created the following helper script called log.js for this purpose. Timestamps are automatically added by Eclipse, but not elsewhere so I have prepended the time.
load("libs/Executor.js");
function getTimeStamp() {
  var now = new Date();
  return now.getHours() + ":" + now.getMinutes() + ":" + now.getSeconds();
}
EXECUTOR.defineScript("Log", function(msg) {
  this.app.view().verifyNot(getTimeStamp() + ": " + msg);
});
Finally, you don't need the executor boilerplate (only the verifyNot line), but we use that with scripts by Doba in order to be able to organize files in different directories (Doba.js renamed to Executor.js) -- another feature not available out of the box.
* It's almost like GorillaLogic doesn't want you to be able to resolve your own problems. ;)
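One detail of getTimeStamp above is that it does not zero-pad, so 09:05:07 would be logged as 9:5:7. A minimal sketch of a padded variant follows. The names pad2 and formatTimeStamp are hypothetical, and the padding is done by hand because which string helpers the MonkeyTalk script engine supports is not confirmed here.

```javascript
// Left-pads a number to two digits without relying on String.prototype.padStart,
// which older script engines may not provide.
function pad2(n) {
  return (n < 10 ? '0' : '') + n;
}

// Same idea as getTimeStamp, but takes the Date as a parameter so it is easy to check.
function formatTimeStamp(date) {
  return pad2(date.getHours()) + ':' + pad2(date.getMinutes()) + ':' + pad2(date.getSeconds());
}

var example = formatTimeStamp(new Date(2016, 0, 1, 9, 5, 7));
console.log(example);
```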

Javascript wshell.run not working properly

I'm using an HTA, and in it I have a function that should run a command line with wshell.run. If I type this line into the Windows 'Run' utility it works fine; I want it to work in the HTA with wshell.run as well.
The line is:
C:\xxxx\xxx\xxx.EXE aaa.psl abc
( The names are xxx just in here - not in the real code.. )
In the javascript code I'm using:
function runCmd()
{
wshShell.exec( "C:\xxxx\xxx\xxx.EXE aaa.psl abc" );
}
The error I get comes from the xxx.EXE application, which says "couldn't open aaa.psl File not found".
Thanks,
Rotem
I'm surprised the xxx.EXE program is running at all. You need to escape those backslashes in the command:
wshShell.Exec( "C:\\xxxx\\xxx\\xxx.EXE aaa.psl abc" );
//                ^-----^----^--- here
If you're doing the same thing in the aaa.psl filename, that's your problem.
If you're not passing a full path to the aaa.psl file, then most programs (not all) will expect it to be in the current directory, so you'll want to make sure you've set the current directory correctly (although using absolute paths may be a better option).
Here's an example, for instance, of telling Notepad to edit a file:
shell = WScript.CreateObject("WScript.Shell");
shell.Exec("c:\\windows\\system32\\notepad.exe c:\\temp\\temp.txt");
...or via the current directory:
shell = WScript.CreateObject("WScript.Shell");
shell.CurrentDirectory = "c:\\temp";
shell.Exec("c:\\windows\\system32\\notepad.exe temp.txt");
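To see why the unescaped backslashes matter, here is a small sketch that compares the two spellings of a string literal. The path is hypothetical and was chosen because \t and \n are valid escape sequences; how an engine treats other sequences, such as the \x in the placeholder names from the question, varies and is not assumed here.

```javascript
// Hypothetical path, chosen because \t and \n are valid escape sequences.
var unescaped = "C:\temp\new.txt";   // \t becomes a tab, \n becomes a newline
var escaped = "C:\\temp\\new.txt";   // each \\ yields one literal backslash

console.log(unescaped.indexOf('\\')); // -1: no backslash survived
console.log(escaped.indexOf('\\'));   // 2: the backslash right after "C:"
```

The program being launched only ever sees the resulting string, so a path written with single backslashes reaches it with the separators already gone.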
OK, T.J. is the man!! :)
I finally got it working with your help, by replacing exec with run.
This is the final (and working) code:
function runCmd()
{
wshShell.CurrentDirectory = "G:\\xxx\\xxx";
wshShell.run( "xxx.EXE xxx.psl abc" );
}
