I'm trying to scrape JavaScript-generated HTML elements from a website. I've tried many different libraries and found that Selenium has everything I need.
I've tried to run the driver without being in headless mode, and it WORKS,
but that's not what I need. I need a background task to keep the user interface as clean as possible.
var chromeOptions = new ChromeOptions();
chromeOptions.AddArgument("--no-startup-window");
chromeOptions.AddArgument("--user-agent=" + Settings.globalUserAgent);
chromeOptions.AddArgument("--log-level=3");
ChromeDriverService service = ChromeDriverService.CreateDefaultService();
service.HideCommandPromptWindow = true;
service.SuppressInitialDiagnosticInformation = true;
using(var driver = new ChromeDriver(service, chromeOptions))
{
driver.Url = "https://example.com";
driver.Navigate().GoToUrl("https://example.com/otherpage");
var wait = new WebDriverWait(driver, new TimeSpan(0, 0, 30));
var element = wait.Until(SeleniumExtras.WaitHelpers.ExpectedConditions.ElementIsVisible(By.Id("sub-conv-fu")));
string source = driver.PageSource;
if (source.Contains("some string:"))
{
File.WriteAllText("test.txt", source);
} else
{
File.WriteAllText("test.txt", source);
return "Error";
}
}
return "";
This line waits for the element I want to get. The element is loaded using JavaScript:
var element = wait.Until(SeleniumExtras.WaitHelpers.ExpectedConditions.ElementIsVisible(By.Id("sub-conv-fu")));
It works if I remove the "--headless" flag.
I would like to write a Thunderbird add-on that encrypts stuff. For this, I have already extracted all data from the compose window. Now I have to save it into files and run a local executable for encryption, but I have found no way to save the files and run an executable on the local machine. How can I do that?
I found the File and Directory Entries API documentation, but it seems to not work. I always get undefined while trying to get the object with this code:
var filesystem = FileSystemEntry.filesystem;
console.log(filesystem); // --> undefined
At least, is there a working AddOn that I can examine to find out how this is working and maybe what permissions I have to request in the manifest.json?
NOTE: Must work cross-platform (Windows and Linux).
The answer is that WebExtensions are currently not able to execute local files. Saving to a local folder on disk is not possible either.
Instead, you need to add a WebExtension Experiment to your project and use the legacy APIs there. In the experiment you can use IOUtils and FileUtils to reach your goal:
Execute a file:
In your background JS file:
var ret = await browser.experiment.execute("/usr/bin/executable", [ "-v" ]);
In the experiment you can execute like this:
var { ExtensionCommon } = ChromeUtils.import("resource://gre/modules/ExtensionCommon.jsm");
var { FileUtils } = ChromeUtils.import("resource://gre/modules/FileUtils.jsm");
var { XPCOMUtils } = ChromeUtils.import("resource://gre/modules/XPCOMUtils.jsm");
XPCOMUtils.defineLazyGlobalGetters(this, ["IOUtils"]);
async execute(executable, arrParams) {
var fileExists = await IOUtils.exists(executable);
if (!fileExists) {
Services.wm.getMostRecentWindow("mail:3pane")
.alert("Executable [" + executable + "] not found!");
return false;
}
var progPath = new FileUtils.File(executable);
let process = Cc["@mozilla.org/process/util;1"].createInstance(Ci.nsIProcess);
process.init(progPath);
process.startHidden = false;
process.noShell = true;
process.run(true, arrParams, arrParams.length);
return true;
},
Save an attachment to disk:
In your background JS file you can do it like this:
var f = await messenger.compose.getAttachmentFile(attachment.id);
var blob = await f.arrayBuffer();
var t = await browser.experiment.writeFileBinary(tempFile, blob);
In the experiment you can then write the file like this:
async writeFileBinary(filename, data) {
// first we need to convert the arrayBuffer to some Uint8Array
var uint8 = new Uint8Array(data);
uint8.reduce((binary, uint8) => binary + uint8.toString(2), "");
// then we can save it
var ret = await IOUtils.write(filename, uint8);
return ret;
},
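The conversion step above can be tried outside Thunderbird. The following is a minimal sketch, assuming plain Node.js; toUint8Array is a hypothetical helper name and not part of IOUtils. It suggests that wrapping the ArrayBuffer in a Uint8Array is the whole conversion: the view shares memory with the buffer, and the reduce call in the experiment code above builds a string that is never used.

```javascript
// Hypothetical helper: wrap an ArrayBuffer in a byte view.
// No bytes are copied; the Uint8Array is a view over the same memory.
function toUint8Array(arrayBuffer) {
  if (!(arrayBuffer instanceof ArrayBuffer)) {
    throw new TypeError("expected an ArrayBuffer");
  }
  return new Uint8Array(arrayBuffer);
}

// Example: a 4-byte buffer filled through a DataView (arbitrary sample bytes,
// written big-endian, which is the DataView default).
const buffer = new ArrayBuffer(4);
new DataView(buffer).setUint32(0, 0x50444621);
const bytes = toUint8Array(buffer);
console.log(Array.from(bytes));
```

The resulting Uint8Array is the kind of value that is passed to IOUtils.write in the experiment code above.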
IOUtils documentation:
https://searchfox.org/mozilla-central/source/dom/chrome-webidl/IOUtils.webidl
FileUtils documentation:
https://searchfox.org/mozilla-central/source/toolkit/modules/FileUtils.jsm
I am fairly new to JS/Winappdriver.
The application I am trying to test is a windows based "Click Once" application from .Net, so I have to go to a website from IE and click "Install". This will open the application.
Once the application is running, I have no way to connect the application to perform my UI interactions while using JavaScript.
Using C#, I was looping through the processes looking for a process name, get the window handle, convert it to hex, add that as a capability and create the driver - it worked. Sample code below,
public Setup_TearDown()
{
string TopLevelWindowHandleHex = null;
IntPtr TopLevelWindowHandle = new IntPtr();
foreach (Process clsProcess in Process.GetProcesses())
{
if (clsProcess.ProcessName.StartsWith($"SomeName-{exec_pob}-{exec_env}"))
{
TopLevelWindowHandle = clsProcess.Handle;
TopLevelWindowHandleHex = clsProcess.MainWindowHandle.ToString("x");
}
}
var appOptions = new AppiumOptions();
appOptions.AddAdditionalCapability("appTopLevelWindow", TopLevelWindowHandleHex);
appOptions.AddAdditionalCapability("ms:experimental-webdriver", true);
appOptions.AddAdditionalCapability("ms:waitForAppLaunch", "25");
AppDriver = new WindowsDriver<WindowsElement>(new Uri(WinAppDriverUrl), appOptions);
AppDriver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(60);
}
How do I do this in JavaScript? I can't seem to find any code examples.
Based on an example from this repo, I tried the following in JS to find the process to latch on to but without luck.
import {By2} from "selenium-appium";
// this.appWindow = this.driver.element(By2.nativeAccessibilityId('xxx'));
// this.appWindow = this.driver.element(By2.nativeXpath("//Window[starts-with(@Name,\"xxxx\")]"));
// this.appWindow = this.driver.elementByName('WindowsForms10.Window.8.app.0.13965fa_r11_ad1');
// this.appWindow = this.driver.elementByName('xxxxxxx');
async connectAppDriver(){
await this.waitForAppWindow();
var appWindow = await this.appWindow.getAttribute("NativeWindowHandle");
let hex = (Number(appWindow)).toString(16);
var currentAppCapabilities =
{
"appTopLevelWindow": hex,
"platformName": "Windows",
"deviceName": "WindowsPC",
"newCommandTimeout": "120000"
}
let driverBuilder = new DriverBuilder();
await driverBuilder.stopDriver();
this.driver = await driverBuilder.createDriver(currentAppCapabilities);
return this.driver;
}
I keep getting this error in Winappdriver
{"status":13,"value":{"error":"unknown error","message":"An unknown error occurred in the remote end while processing the command."}}
I've also opened this ticket here.
It seems like such an easy thing to do, but I couldn't figure this one out.
Are there any Node packages I could use to get the top-level window handle easily?
I am open to suggestions on how to tackle this issue while using JavaScript for Winappdriver.
Hope this helps someone out there.
I got around this by creating an exe in C# that returns the hex window handle of the app to connect to, based on the process name. It looks something like this:
public string GetTopLevelWindowHandleHex()
{
string TopLevelWindowHandleHex = null;
IntPtr TopLevelWindowHandle = new IntPtr();
foreach (Process clsProcess in Process.GetProcesses())
{
if (clsProcess.ProcessName.StartsWith(_processName))
{
TopLevelWindowHandle = clsProcess.Handle;
TopLevelWindowHandleHex = clsProcess.MainWindowHandle.ToString("x");
}
}
if (!String.IsNullOrEmpty(TopLevelWindowHandleHex))
return TopLevelWindowHandleHex;
else
throw new Exception($"Process: {_processName} cannot be found");
}
Called it from JS to get the hex of the top level window handle, like this,
async getHex () {
// path.join is synchronous, so no await is needed
const pathToDir = path.join(process.cwd(), "features\\support\\ProcessUtility");
const pathToExe = path.join(pathToDir, "GetWindowHandleHexByProcessName.exe");
// execFileSync takes no callback; it blocks and returns the child's stdout
const result = execFileSync(pathToExe, [this.processName]
, {cwd: pathToDir, encoding: 'utf-8'});
console.log("Data(hex): "+ result);
return result.toString().trim();
}
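For reference, a minimal sketch of how execFileSync hands back the child's output, assuming Node.js. The call is synchronous, accepts no callback, and returns stdout as its return value. The child process here is Node itself printing a made-up hex string; it only stands in for GetWindowHandleHexByProcessName.exe, and readStdout is a hypothetical helper name.

```javascript
const { execFileSync } = require("child_process");

// Run an executable and return its trimmed stdout.
// With an `encoding` option the return value is a string, not a Buffer.
function readStdout(executable, args) {
  const stdout = execFileSync(executable, args, { encoding: "utf-8" });
  return stdout.trim();
}

// Stand-in child: the current Node binary printing a sample value.
const hex = readStdout(process.execPath, ["-e", "console.log('350d12')"]);
console.log("Data(hex): " + hex);
```

If the executable exits with a non-zero code, execFileSync throws, so a missing process surfaces as an exception in the caller.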
Used the hex to connect to the app like this,
async connectAppDriver(hex) {
console.log(`Hex received to connect to app using hex: ${hex}`);
const currentAppCapabilities=
{
"browserName": '',
"appTopLevelWindow": hex.trim(),
"platformName": "Windows",
"deviceName": "WindowsPC",
"newCommandTimeout": "120000"
};
const appDriver = await new Builder()
.usingServer("http://localhost:4723/wd/hub")
.withCapabilities(currentAppCapabilities)
.build();
await driver.startWithWebDriver(appDriver);
return driver;
}
Solution:
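In WebDriverJS (used by selenium / appium), use getDomAttribute instead of getAttribute. As the logs below show, getAttribute is sent as a script execution (POST .../execute/sync), which WinAppDriver does not implement, while getDomAttribute is sent as a plain attribute GET, which it does. Took several hours to find :(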
element.getAttribute("NativeWindowHandle")
POST: /session/270698D2-D93B-4E05-9FC5-3E5FBDA60ECA/execute/sync
Command not implemented: POST: /session/270698D2-D93B-4E05-9FC5-3E5FBDA60ECA/execute/sync
HTTP/1.1 501 Not Implemented
let topLevelWindowHandle = await element.getDomAttribute('NativeWindowHandle')
topLevelWindowHandle = parseInt(topLevelWindowHandle).toString(16)
GET /session/DE4C46E1-CC84-4F5D-88D2-35F56317E34D/element/42.3476754/attribute/NativeWindowHandle HTTP/1.1
HTTP/1.1 200 OK
{"sessionId":"DE4C46E1-CC84-4F5D-88D2-35F56317E34D","status":0,"value":"3476754"}
and topLevelWindowHandle now holds the hex value :)
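The decimal-to-hex step can be checked on its own. This is a small sketch, assuming the attribute arrives as a decimal string as in the log above; windowHandleToHex is a hypothetical helper name.

```javascript
// Convert the decimal NativeWindowHandle string into the lowercase hex
// string that is passed as the appTopLevelWindow capability.
function windowHandleToHex(nativeWindowHandle) {
  const value = Number.parseInt(nativeWindowHandle, 10);
  if (!Number.isInteger(value) || value < 0) {
    throw new TypeError("not a window handle: " + nativeWindowHandle);
  }
  return value.toString(16);
}

// The value from the response shown above.
console.log(windowHandleToHex("3476754"));
```

Passing the radix explicitly avoids surprises if the attribute ever comes back with a leading "0x" or stray whitespace.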
I'm trying to automate my workflow using Selenium in Node.js. When accessing sellercentral.amazon.com, it sends an OTP code to my phone. How can I show a prompt in Node.js so I can input the code?
I've tried using readline-sync, but the prompt is always displayed even before selenium starts.
const webdriver = require('selenium-webdriver'),
By = webdriver.By,
until = webdriver.until;
const driver = new webdriver.Builder()
.forBrowser('firefox')
// .setFirefoxOptions(options)
.build();
//Main body
driver.get('https://sellercentral.amazon.com');
driver.wait(until.elementLocated(By.id('sign-in-button')));
driver.findElement(By.id('sign-in-button')).click();
const fillForm = (idToLook, keys) => {
this.idToLook = idToLook;
if (keys) {
driver.wait(until.elementLocated(By.id(idToLook)));
driver.findElement(By.id(idToLook)).sendKeys(keys);
}
else {
keys = readline.question(`what are the keys for ${this.idToLook}: `);
driver.findElement(By.id(idToLook)).sendKeys(keys);
}
}
fillForm('ap_email', amazon.id);
fillForm('ap_password', amazon.password);
driver.findElement(By.name('rememberMe')).click();
driver.findElement(By.id('a-autoid-0')).click();
driver.wait(until.elementIsNotVisible(By.id('auth-mfa-optcode')));
// fillForm('auth-mfa-otpcode');
driver.findElement(By.id('auth-mfa-remember-device')).click();
driver.quit();
You may try something along these lines. Wrap the code that launches the site in a function (initialize here) that returns a Promise, and only ask for the OTP once that Promise has resolved.
A base script like the one below:
function main() {
var initializeSite = initialize();
initializeSite.then(function(result) {
// Do your different actions to bring up the form that needs the OTP
readline.question("Add OTP", (otp) => {
// Add the rest of your code here
});
console.log("success");
}, function(err) {
console.log(err);
});
}
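The same ordering can be written with async/await. This is only a sketch, assuming Node's built-in readline module rather than readline-sync; ask, loginWithOtp, bringUpOtpForm and submitOtp are hypothetical names, and the last two stand in for the Selenium steps before and after the prompt.

```javascript
const readline = require("readline");

// Ask one question and resolve with the trimmed answer.
// `input` and `output` default to the terminal; they are parameters only so
// the sketch can be exercised without a keyboard.
function ask(prompt, input = process.stdin, output = process.stdout) {
  const rl = readline.createInterface({ input, output });
  return new Promise((resolve) => {
    rl.question(prompt, (answer) => {
      rl.close();
      resolve(answer.trim());
    });
  });
}

// The prompt is shown only after the browser steps have finished, because
// each step is awaited before the next one starts.
async function loginWithOtp(bringUpOtpForm, submitOtp, input, output) {
  await bringUpOtpForm(); // e.g. navigate, fill in email and password
  const otp = await ask("Enter OTP: ", input, output);
  await submitOtp(otp); // e.g. sendKeys on the OTP field
  return otp;
}
```

In the script from the question, bringUpOtpForm would await the existing driver.get, driver.wait and sendKeys calls, since each of them returns a promise.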
I want to embed a javascript snippet inside of a pdf file so that it will immediately print when it's opened from a browser window. To try and achieve this I'm following this example here.
I have created a helper class that has a static method to handle this task. I already have the PDF file path string ready to pass into the method. What I don't understand is how the output stream portion of this works. I would like the updated PDF to be saved to my server's hard drive. I do not want to stream it back to my browser. Any guidance would be greatly appreciated.
public class PdfHelper
{
public static void AddPrintFunction(string pdfPath, Stream outputStream)
{
PdfReader reader = new PdfReader(pdfPath);
int pageCount = reader.NumberOfPages;
Rectangle pageSize = reader.GetPageSize(1);
// Set up Writer
PdfDocument document = new PdfDocument();
PdfWriter writer = PdfWriter.GetInstance(document, outputStream);
document.Open();
//Copy each page
PdfContentByte content = writer.DirectContent;
for (int i = 0; i < pageCount; i++)
{
document.NewPage();
// page numbers are one based
PdfImportedPage page = writer.GetImportedPage(reader, i + 1);
// x and y correspond to position on the page
content.AddTemplate(page, 0, 0);
}
// Insert JavaScript to print the document after a fraction of a second to allow time to become visible.
string jsText = "var res = app.setTimeOut('var pp = this.getPrintParams();pp.interactive = pp.constants.interactionLevel.full;this.print(pp);', 200);";
//string jsTextNoWait = "var pp = this.getPrintParams();pp.interactive = pp.constants.interactionLevel.full;this.print(pp);";
PdfAction js = PdfAction.JavaScript(jsText, writer);
writer.AddJavaScript(js);
document.Close();
}
}
For how to accomplish this task, please take a look at these two SO posts.
Basically you should have something like this:
var pdfLocalFilePath = Server.MapPath("~/sourceFile.pdf");
var outputLocalFilePath = Server.MapPath("~/outputFile.pdf");
using (var outputStream = new FileStream(outputLocalFilePath, FileMode.CreateNew))
{
PdfHelper.AddPrintFunction(pdfLocalFilePath, outputStream);
outputStream.Flush();
}
I'm trying to do something like a C #include "filename.c", or PHP include(dirname(__FILE__)."filename.php"), but in JavaScript. I know I can do this if I can get the URL a JS file was loaded from (e.g. the URL given in the src attribute of the <script> tag). Is there any way for the JavaScript to know that?
Alternatively, is there any good way to load JavaScript dynamically from the same domain (without knowing the domain specifically)? For example, let's say we have two identical servers (QA and production) but they clearly have different URL domains. Is there a way to do something like include("myLib.js"); where myLib.js will load from the domain of the file loading it?
Sorry if that's worded a little confusingly.
Within the script:
var scripts = document.getElementsByTagName("script"),
src = scripts[scripts.length-1].src;
This works because the browser loads and executes ordinary (non-async, non-deferred) scripts in order, so while your script is executing, the document it was included in is sure to have your script element as the last one on the page. This code of course must be 'global' to the script, so save src somewhere where you can use it later. Avoid leaking global variables by wrapping it in:
(function() { ... })();
All browsers except Internet Explorer (any version) have document.currentScript, which works no matter how the file was included (async, bookmarklet etc.), as long as you read it while the script is first executing. Inside callbacks and event handlers it is null.
If you want to know the full URL of the JS file you're in right now:
var script = document.currentScript;
var fullUrl = script.src;
Tadaa.
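Once the script's own URL is known, a sibling file can be resolved against it with the standard URL constructor. A minimal sketch, assuming the URLs below are made-up examples; resolveSibling is a hypothetical helper name, and in a browser scriptUrl would come from document.currentScript.src.

```javascript
// Resolve a file name relative to the URL of the script that wants to load it.
// The host is taken from scriptUrl, so QA and production need no special casing.
function resolveSibling(scriptUrl, fileName) {
  return new URL(fileName, scriptUrl).href;
}

const qa = resolveSibling("https://qa.example.com/js/app.js", "myLib.js");
const prod = resolveSibling("https://www.example.com/static/js/app.js?v=3", "myLib.js");
console.log(qa);
console.log(prod);
```

The resolved URL can then be assigned to the src of a dynamically created script element to get an include("myLib.js")-style loader.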
I just made this little trick :
window.getRunningScript = () => {
return () => {
return new Error().stack.match(/([^ \n])*([a-z]*:\/\/\/?)*?[a-z0-9\/\\]*\.js/ig)[0]
}
}
console.log('%c Currently running script:', 'color: blue', getRunningScript()())
✅ Works on: Chrome, Firefox, Edge, Opera
Enjoy !
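The stack-parsing idea can be exercised without a browser by feeding it a stack string. The sketch below is an assumption-laden variant, not the regex from the answer above: scriptUrlFromStack is a hypothetical name, the pattern only looks for http(s) or file URLs ending in .js, and the sample stacks are made up in Chrome and Firefox style.

```javascript
// Return the first script URL found in a stack trace string, or null.
// The lookahead makes sure ".js" ends the path (followed by ":line:col",
// a query, a fragment, ")" or whitespace), so hosts like cdn.js.example.org
// do not cut the match short.
function scriptUrlFromStack(stack) {
  const match = /(?:https?|file):\/\/[^\s()]+?\.js(?=[:?#)\s]|$)/i.exec(stack);
  return match ? match[0] : null;
}

const chromeStyle =
  "Error\n    at getRunningScript (https://example.com/js/app.js:3:11)\n" +
  "    at https://example.com/js/app.js:7:1";
const firefoxStyle =
  "getRunningScript@https://cdn.js.example.org/lib/main.js:3:11\n" +
  "@https://cdn.js.example.org/lib/main.js:7:1";

console.log(scriptUrlFromStack(chromeStyle));
console.log(scriptUrlFromStack(firefoxStyle));
```

In a browser the argument would be new Error().stack. Stack formats are not standardized, so returning null when nothing matches is safer than indexing into the match result directly.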
The accepted answer here does not work if you have inline scripts in your document. To avoid this you can use the following to only target <script> tags with a [src] attribute.
/**
* Current Script Path
*
* Get the dir path to the currently executing script file
* which is always the last one in the scripts array with
* an [src] attr
*/
var currentScriptPath = function () {
var scripts = document.querySelectorAll( 'script[src]' );
var currentScript = scripts[ scripts.length - 1 ].src;
var currentScriptChunks = currentScript.split( '/' );
var currentScriptFile = currentScriptChunks[ currentScriptChunks.length - 1 ];
return currentScript.replace( currentScriptFile, '' );
}
This effectively captures the last external .js file, solving some issues I encountered with inline JS templates.
Refining upon the answers found here I came up with the following:
getCurrentScript.js
var getCurrentScript = function() {
if (document.currentScript) {
return document.currentScript.src;
} else {
var scripts = document.getElementsByTagName('script');
return scripts[scripts.length - 1].src;
}
}
// module.exports = getCurrentScript;
console.log({log: getCurrentScript()})
getCurrentScriptPath.js
var getCurrentScript = require('./getCurrentScript');
var getCurrentScriptPath = function () {
var script = getCurrentScript();
var path = script.substring(0, script.lastIndexOf('/'));
return path;
};
module.exports = getCurrentScriptPath;
BTW: I'm using CommonJS module format and bundling with webpack.
I've more recently found a much cleaner approach to this, which can be executed at any time, rather than being forced to do it synchronously when the script loads.
Use stackinfo to get a stacktrace at a current location, and grab the info.file name off the top of the stack.
info = stackinfo()
console.log('This is the url of the script '+info[0].file)
I've coded a simple function that returns the absolute location of the current JavaScript file, using a try/catch.
// Get script file location
// doesn't work for older browsers
var getScriptLocation = function() {
var fileName = "fileName";
var stack = "stack";
var stackTrace = "stacktrace";
var loc = null;
var matcher = function(stack, matchedLoc) { return loc = matchedLoc; };
try {
// Invalid code
0();
} catch (ex) {
if(fileName in ex) { // Firefox
loc = ex[fileName];
} else if(stackTrace in ex) { // Opera
ex[stackTrace].replace(/called from line \d+, column \d+ in (.*):/gm, matcher);
} else if(stack in ex) { // WebKit, Blink and IE10
ex[stack].replace(/at.*?\(?(\S+):\d+:\d+\)?$/g, matcher);
}
return loc;
}
};
You can see it here.
Refining upon the answers found here:
little trick
getCurrentScript and getCurrentScriptPath
I came up with the following:
//Thanks to https://stackoverflow.com/a/27369985/5175935
var getCurrentScript = function() {
if (document.currentScript && (document.currentScript.src !== ''))
return document.currentScript.src;
var scripts = document.getElementsByTagName('script'),
str = scripts[scripts.length - 1].src;
if (str !== '')
return str ;
//Thanks to https://stackoverflow.com/a/42594856/5175935
return new Error().stack.match(/(https?:[^:]*)/)[0];
};
//Thanks to https://stackoverflow.com/a/27369985/5175935
var getCurrentScriptPath = function() {
var script = getCurrentScript(),
path = script.substring(0, script.lastIndexOf('/'));
return path;
};
console.log({path: getCurrentScriptPath()})
Regardless of whether it's a script, an HTML file (for a frame, for example), a CSS file, an image, or anything else, if you don't specify a server/domain, the server/domain of the HTML document is used by default. So you could do, for example,
<script type=text/javascript src='/dir/jsfile.js'></script>
or
<script type=text/javascript src='../../scripts/jsfile.js'></script>
If you don't provide the server/domain, the path is resolved relative to the URL of the main document (the page), not the URL of the script that references it.
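How the two src values above resolve can be checked with the URL constructor, which follows the same resolution rules a browser applies to a page URL. A small sketch; the page URLs are made up for illustration and resolveAgainstPage is a hypothetical name.

```javascript
// Resolve a src value the way a browser would for a page at pageUrl
// (assuming the page has no <base> element).
function resolveAgainstPage(pageUrl, src) {
  return new URL(src, pageUrl).href;
}

// Root-relative: keeps the page's scheme and host, replaces the whole path.
const rooted = resolveAgainstPage(
  "https://qa.example.com/app/pages/index.html", "/dir/jsfile.js");

// Document-relative: each "../" climbs one directory up from the page.
const climbed = resolveAgainstPage(
  "https://prod.example.com/app/pages/index.html", "../../scripts/jsfile.js");

console.log(rooted);
console.log(climbed);
```

Because the host always comes from the page, the same markup works unchanged on QA and production as long as both use the same directory layout.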
I may be misunderstanding your question but it seems you should just be able to use a relative path as long as the production and development servers use the same path structure.
<script src="js/myLib.js"></script>
I've thrown together some spaghetti code that gets the path of the .js file currently being run (e.g. if you run a script with "node .", you can use this to get the directory of the script that's running).
This returns it as "file://path/to/directoryWhere/fileExists":
var thisFilesDirectoryPath = stackinfo()[0].traceline.substring("readFile (".length, stackinfo()[0].traceline.length - ")".length-"readFile (".length);
This requires an npm package (I'm sure it's available on other platforms as well):
npm i stackinfo
import stackinfo from 'stackinfo'; or var {stackinfo} = require("stackinfo");
function getCurrentScriptName() {
const url = new URL(document.currentScript.src);
const {length:len, [len-1]:last} = url.pathname.split('/');
return last.slice(0,-3);
}
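The function above assumes the file name ends in exactly ".js". A slightly more general sketch, taking the URL as a parameter so it can be tried outside a browser; scriptNameFromUrl is a hypothetical name and the URLs are examples.

```javascript
// Return the file name of a script URL without its last extension.
// The query string and fragment are ignored because only pathname is used.
function scriptNameFromUrl(src) {
  const fileName = new URL(src).pathname.split("/").pop();
  const dot = fileName.lastIndexOf(".");
  return dot > 0 ? fileName.slice(0, dot) : fileName;
}

console.log(scriptNameFromUrl("https://example.com/js/myLib.js?v=2"));
console.log(scriptNameFromUrl("https://example.com/js/jquery.min.js"));
```

In a browser it would be called as scriptNameFromUrl(document.currentScript.src).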