I want to get the content of a DOM element from a website. I can't use
var html = await client.GetAsync("http://example.com/test.html");
because the specific DOM element is populated only after the page's JS has executed.
I can use a WebView to host the content, but that alone won't solve the problem.
I can't use the HtmlDocument.GetElementById(String) method either, as it lives in the System.Windows.Forms namespace, which isn't supported on the Universal Windows Platform.
Another option would be InvokeScriptAsync:
await webView5.InvokeScriptAsync("doSomething", null);
as seen in scenario 5 of the XAML WebView control sample,
but that just fires events and won't help in getting the DOM element (or even the source code once the JS execution is done).
Note: the app is C# UWP, not WinJS.
You can use the HtmlAgilityPack library to query the DOM you downloaded. See the example here or search for more; there are plenty.
Alternatively, you can run JavaScript that does what you need and returns a string with the content of your element. As you can see, InvokeScriptAsync returns a String, which can contain anything your JavaScript returned. For example, this is how to get the DOM with JavaScript:
var result = await this.webView.InvokeScriptAsync("eval", new[] { "document.documentElement.outerHTML;" });
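As a sketch of the second approach, the string passed to eval can be built so that it returns only the element you want instead of the whole page. The helper name buildOuterHtmlScript, the element id "price", and the fake document below are assumptions made purely for illustration; in the app, the generated string would take the place of the script argument in the InvokeScriptAsync("eval", ...) call above.

```javascript
// Hypothetical helper: builds the expression string that is handed to eval in the page.
function buildOuterHtmlScript(elementId) {
  const id = JSON.stringify(elementId); // quote and escape the id safely
  return '(function () {' +
    'var el = document.getElementById(' + id + ');' +
    'return el ? el.outerHTML : "";' + // always hand back a string
    '})()';
}

// Minimal fake document so the sketch can run outside a browser (assumption).
const fakeDocument = {
  getElementById: (id) =>
    (id === 'price' ? { outerHTML: '<span id="price">42</span>' } : null),
};

// Evaluate the generated script with the fake document standing in for the page.
const evaluate = (script) => new Function('document', 'return ' + script + ';')(fakeDocument);

const found = evaluate(buildOuterHtmlScript('price'));
const missing = evaluate(buildOuterHtmlScript('nope'));
console.log(found, JSON.stringify(missing));
```

Returning an empty string for a missing element keeps the result a string in every case, which is what InvokeScriptAsync hands back to the C# side.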
I have tried many ways to get the value of the attribute isPermaLink using Protractor.
I can get the value of any other element fine, but isPermaLink always returns null.
HTML
<guid isPermaLink="false">public-anger-grows-over-coronavirus-123829864.html</guid>
Code
const isPerma = element(by.xpath('//guid[@isPermaLink]'));
console.log('isPermaLink value ', await isPerma.getAttribute('isPermaLink'));
If I try other elements, such as the source tag, I can get the value:
<source url="http://www.ap.org/">Associated Press</source>
Element located in dev tools
Link to Yahoo Rss feed being used: https://news.yahoo.com/rss/
Answers (or rather my understanding of the problem):
How can you get the attribute?
let $permaLink = $$('guid[isPermaLink]').get(0);
let attr = await browser.executeScript('return arguments[0].getAttribute("isPermaLink")', $permaLink.getWebElement());
console.log(attr) // false
Why doesn't await element.getAttribute('isPermaLink') work?
Imagine you have an iframe inside an HTML page, and you're looking for an element inside this frame. In this case you can interact with the element in the browser's console (locate it, etc.), but from the Protractor side you'll only be able to interact with it after browser.switchTo().frame(element(by.tagName('iframe')).getWebElement());
In your case, you have an XML document inside an HTML page, which behaves similarly. However, Protractor only works with the document itself (not a document nested inside it). The problem is that if you try switchTo, you'll get the error no such frame: element is not a frame, because the method is designed to work ONLY with iframes. The concept is the same, though.
Is there any chance to access the DOM elements of an "x-ms-webview"? For my Cordova application, I want to use a custom login page. That's why I need to load a webview in the background, access the login fields and "click" the submit button via JavaScript.
For iOS and the UIWebView this works (I could hardly believe it) really smoothly and nicely. Is there any chance to realize this for Windows 10 as well? This is what I have so far (not much):
function onWebviewLoadedLoginPage(ev) {
  webview.removeEventListener("MSWebViewDOMContentLoaded", onWebviewLoadedLoginPage);
  // --> Here I need to access the dom elements of the loaded webview !?
}
var webview = document.createElement("x-ms-webview");
document.body.appendChild(webview);
webview.addEventListener("MSWebViewDOMContentLoaded", onWebviewLoadedLoginPage);
webview.navigate(url);
In Objective-C I can set the login credentials directly on the DOM elements quite easily:
NSString* statement = [NSString stringWithFormat:@"document.getElementById('%@').value = '%@'", self->usernameInputFieldId, self->username];
[webView stringByEvaluatingJavaScriptFromString:statement];
Thanks in advance !!!
I can see that you need to execute JavaScript inside the webview once the content is loaded. In that case, you can use the invokeScriptAsync method of the webview element. invokeScriptAsync accepts two parameters: a function's name and that function's arguments. It can only call a JavaScript function that exists inside the webview; however, eval() is available there and executes JavaScript code supplied as a string. Hence, to execute your JavaScript code, your function should be:
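function onWebviewLoadedLoginPage(ev) {
  webview.removeEventListener("MSWebViewDOMContentLoaded", onWebviewLoadedLoginPage);
  // Your JavaScript code in the form of a string; the value it yields must be a string
  var javascriptcode = "document.getElementById('myelement').value";
  // invokeScriptAsync returns an operation that does nothing until start() is called
  var injectedJavascript = webview.invokeScriptAsync('eval', javascriptcode);
  injectedJavascript.oncomplete = function (e) { console.log(e.target.result); };
  injectedJavascript.onerror = function (e) { console.log("script failed", e); };
  injectedJavascript.start();
}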
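To mirror the Objective-C stringWithFormat approach from the question, here is a minimal sketch of building the injected login script in JavaScript. The helper name buildLoginScript, the field ids ("user", "pass", "go"), and the fake document are assumptions for illustration only; JSON.stringify is used so that quotes inside the credentials cannot break the generated code.

```javascript
// Hypothetical helper: builds the code string passed to invokeScriptAsync('eval', ...).
function buildLoginScript(fields) {
  const q = JSON.stringify; // produces a safely escaped JavaScript string literal
  return [
    'document.getElementById(' + q(fields.userId) + ').value = ' + q(fields.username) + ';',
    'document.getElementById(' + q(fields.passId) + ').value = ' + q(fields.password) + ';',
    'document.getElementById(' + q(fields.submitId) + ').click();',
    '"submitted";', // last expression becomes the (string) result of eval
  ].join('\n');
}

// Fake page elements so the sketch runs outside a webview (assumption).
const elements = {
  user: { value: '' },
  pass: { value: '' },
  go: { clicked: false, click() { this.clicked = true; } },
};
const fakeDocument = { getElementById: (id) => elements[id] };

const loginScript = buildLoginScript({
  userId: 'user', username: 'alice',
  passId: 'pass', password: 'p\'a"ss',
  submitId: 'go',
});

// Stand-in for the webview: eval the script with the fake document in scope.
const runInPage = new Function('document', 'script', 'return eval(script);');
const result = runInPage(fakeDocument, loginScript);
console.log(result, elements.user.value, elements.go.clicked);
```

In the real handler, loginScript would replace javascriptcode in the function above.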
I have to check the status of an action started by one automation script from another automation script. To track the action, I have to capture its Id and write it to a file, which the second script can then read to look for the action. The problem is that I am unable to get the action Id, which is a number, using getText(). Instead, I keep seeing nonsensical text when I check the content of the file or of the variable in which I first store the Id.
The html code for action Id is:
<dd class="ng-binding">232</dd>
I am trying to capture the Id (#232 here) like this:
var Id = element(by.xpath('html/body/div[1]/div[2]/div/div[2]/div/div[2]/div[3]/dl/dd[1]')).getText();
Upon executing the automation script, the console output shows this for the var Id:
Id:[object Object]
I have verified using Protractor's elementExplorer that the XPath points to the right element, and it can even extract the Id and display it on the screen correctly. It just does not work when I try to store the same Id in a variable and then write it to a file so that I can retrieve it for later use. Any help or hints on how to do this would be greatly appreciated.
getText() returns a promise, so you have to resolve it before using the value. Also, an absolute XPath is a brittle way of locating elements. Have you tried locating it with a CSS selector?
element(by.css('dd.ng-binding')).getText().then(function(text) {
console.log(text);
});
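Since the goal is to write the Id to a file, here is a minimal sketch of resolving the promise first and writing only the resolved text. The fakeElement object is a hypothetical stand-in for a Protractor element, and the file name action-id.txt and the function name saveActionId are likewise assumptions; a native Promise is used here, so the stringified value differs slightly from the [object Object] seen in the question.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for a Protractor element: getText() resolves later.
const fakeElement = { getText: () => Promise.resolve('232') };

// Concatenating the promise itself stringifies the promise object, not the text.
const pending = 'Id:' + fakeElement.getText();
console.log(pending); // Id:[object Promise]

// Wait for the text, then write the resolved string to the file.
async function saveActionId(el, filePath) {
  const id = await el.getText();
  fs.writeFileSync(filePath, id, 'utf8');
  return id;
}

const idFile = path.join(os.tmpdir(), 'action-id.txt');
const saved = saveActionId(fakeElement, idFile);
saved.then((id) => console.log('Id:' + id)); // Id:232
```

The second script can then read the file with fs.readFileSync(idFile, 'utf8') once the first one has finished.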
I'm working on a Chrome extension that uses jQuery to parse the source of a page for specific things. For example, I'm looking through Wikipedia to get the categories.
I get the source of the page via
chrome.tabs.executeScript(tabId, {
code: "chrome.extension.sendMessage({action: 'getContentText', source: document.body.innerHTML, location: window.location});"
}, function() {
if (chrome.extension.lastError)
console.log(chrome.extension.lastError.message);
});
I then listen for this message (successfully) and use jQuery to parse the source key of the object, like so:
if (request.action == "getContentText")
{
//console.log(request.source);
$('#mw-normal-catlinks > ul > li > a', request.source).each(function()
{
console.log("category", $(this).html());
});
}
This works as expected and logs the innerHTML of every category link. However, running that jQuery selector against request.source also makes the browser try to load the images contained in the source. This results in errors such as
GET chrome-extension://upload.wikimedia.org/wikipedia/commons/thumb/f/fc/Padlock-silver.svg/20px-Padlock-silver.svg.png net::ERR_FAILED
These are valid links; however, they are being requested (needlessly) from my extension with the chrome-extension:// prefix (which is invalid). I'm not sure why jQuery would try to evaluate/request images from within the source when using a selector.
I guess this is happening because Wikipedia uses protocol-relative URLs for its images (simply // instead of https:// or http://, so the scheme is taken from whatever page is loading the content). The requests are being made by jQuery, and you can see here how to fix this issue (in the future, please make sure to search SO more thoroughly).
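The effect can be reproduced with the standard URL constructor, which resolves a // URL against a base the same way a browser does. This is only an illustrative sketch: the extension id abcdefghijklmnop and the helper forceHttps are assumptions, and rewriting the scheme is one possible workaround, not necessarily the fix linked above.

```javascript
const src = '//upload.wikimedia.org/wikipedia/commons/thumb/f/fc/Padlock-silver.svg/20px-Padlock-silver.svg.png';

// On Wikipedia itself the scheme is inherited from the https page.
const onWikipedia = new URL(src, 'https://en.wikipedia.org/wiki/Example').href;

// Inside an extension page the inherited scheme is chrome-extension:,
// which yields the invalid address seen in the error message.
const inExtension = new URL(src, 'chrome-extension://abcdefghijklmnop/background.html').href;

// Hypothetical workaround: give protocol-relative src/href attributes an explicit scheme.
// Does not cover srcset or URLs inside inline styles.
function forceHttps(html) {
  return html.replace(/(\b(?:src|href)=["'])\/\//gi, '$1https://');
}

console.log(onWikipedia);
console.log(inExtension);
console.log(forceHttps('<img src="//upload.wikimedia.org/a.png">'));
```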
A huge thank you to @timonwimmer for helping me in the chat. We both happened to find different solutions at the same time.
My solution was to use a regex to remove any occurrences of the images:
var source = request.source.replace(/.*?\.wikimedia\.org\/.*?/g, "");
His was already an answer on Stack Overflow, which was derived from another answer. If you are interested, this answer works perfectly.
If you give jQuery a string with a complete element declaration it actually generates a new DOM element, similar to calling document.createElement(tagName) and setting all of the attributes.
For instance: var $newEl = $("<p>test</p>") or in your case img tag elements with $("<img/>"). That would get parsed and created as a new DOM HTML element and wrapped by jQuery so you can query it.
Since you are passing a complete and valid HTML string, it is parsing it into an actual DOM first. This is because jQuery uses the built in underlying document.querySelector methods and they act on the DOM not on strings -- think of the DOM as a database with indexes for id and class and attributes for querying. For instance, MongoDB cannot perform queries on a raw JSON string, it needs to first process the JSON into BSON and index it all and the queries are performed on that.
Your problem is less with jQuery and more so with how elements are created and what happens when attributes change for those elements. For instance, when the img elements are created with document.createElement('img') and then the src attribute is set with imgElement.src = "link to image" this automatically triggers the load for the image at location src.
You can test this out for yourself by running this in your JavaScript Developer Console:
var img = document.createElement('img');
img.src = "broken-link";
Notice that this will likely show an error in your console after running, stating that the image cannot be found.
So, to ensure that it does not resolve the image's src, you want to either
1) apply jQuery on an existing DOM (document.body, etc.), or
2) clean the string beforehand (remove the img tags using a regex or something similar) and only then let jQuery parse and evaluate it into a DOM. Take a look at https://stackoverflow.com/a/11230103/2578205 for removing HTML tags from a string.
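As a rough sketch of option 2, the img tags can be stripped from the string before it is handed to jQuery. The function name stripImgTags is made up for this example, and the regex is deliberately simple: it assumes no ">" character appears inside an attribute value, so treat it as a starting point rather than a general HTML sanitizer.

```javascript
// Remove every <img ...> tag so that parsing the string cannot trigger image loads.
function stripImgTags(html) {
  return html.replace(/<img\b[^>]*>/gi, '');
}

const source = '<ul><li><img src="//upload.wikimedia.org/a.png"><a href="/wiki/Cat">Cat</a></li></ul>';
const cleaned = stripImgTags(source);
console.log(cleaned); // <ul><li><a href="/wiki/Cat">Cat</a></li></ul>
```

The cleaned string can then be used as the context of the selector in place of request.source.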
Hope it works out!
So essentially I'm trying to build my own version of GitHub's tree slider. The relevant Javascript/JQuery code is:
// handles clicking a link to move through the tree
$('#slider a').click(function() {
history.pushState({ path: this.path }, '', this.href) // change the URL in the browser using HTML5 history module
$.get(this.href, function(data) {
$('#slider').slideTo(data) // handle the page transition, preventing full page reloads
})
return false
})
// binds hitting the back button in the browser to prevent full page reloads
$(window).bind('popstate', function() {
$('#slider').slideTo(location.pathname)
})
Ok, hopefully that's understandable. Now here's my interpretation of what's going on here, followed by my problem/issue:
The callback function for the GET request when navigating through the tree is the slideTo method, and an HTML string is passed in as an argument to that function. I'm assuming that slideTo is a function defined elsewhere in the script or in a custom library, as I can't find it in the JQuery documentation. So, for my purposes, I'm trying to build my own version of this function. But the argument passed into this function, "data", is just the string of HTML returned from the GET request. However, this isn't just a snippet of HTML that I can append to a div in the document, because if I perform the same GET request (e.g. by typing the url into a web browser) I would expect to see a whole webpage and not just a piece of one.
So, within this callback function that I am defining, I would need to parse the "data" argument into a DOM so that I can extract the relevant nodes and then perform the animated transition. However, this doesn't make sense to me. It generally seems like a Bad Idea. It doesn't make sense that the client would have to parse a whole string of HTML just to access part of the DOM. GitHub claims this method is faster than a full page reload. But if my interpretation is correct, the client still has to parse a full string of HTML whether navigating through the tree by clicking (and running the callback function) or by doing full page loads such as by typing the new URL in the browser. So I'm stuck with either parsing the returned HTML string into a DOM, or ideally only fetching part of an HTML document.
Is there a way to simply load the fetched document into a Javascript or JQuery DOM object so I can easily manipulate it? or even better, is there a way to fetch only an element with an arbitrary id without doing some crazy server-side stuff (which I already tried but ended up being too spaghetti code and difficult to maintain)?
I've also already tried simply parsing the data argument into a JQuery object, but that involved a roundabout solution that only seems to work half the time, using javascript methods to strip the HTML of unwanted things, like doctype declarations and head tags:
var d = document.createElement('html');
d.innerHTML = data;
var body = d.getElementsByTagName("body")[0].innerHTML;
var newDOM = $(body);
// finally I have a JQuery DOM context that I can use,
// but for some reason it doesn't always seem to work quite right
How would you approach this problem? When I write this code myself and try to make it work on my own, I feel like no matter what I do, I'm doing something horribly inefficient and hacky.
Is there a way to easily return a JQuery DOM object with a GET request? or better, just return part of a document fetched with a GET request?
Just wrap it; jQuery will parse it.
$(data) // in your callback
Imagine you want to parse a <p> tag in your normal HTML web page. You probably would use something like:
var p = $('<p>');
Right? So you have to use the same approach to parse an entire HTML document and then, navigate through the DOM tree to get the specific elements you want. Therefore, you just need to say:
$.get(this.href, function(data) {
var html = $(data);
// (...) Navigating through the DOM tree
$('#slider').slideTo( HTMLportion );
});
Notice that it also works for XML documents, so if you need to download an XML document from the server via AJAX, parse its contents, and display them on the client side, the method is exactly the same.
I hope it helps you :)
P.S.: Don't forget to put semicolons at the end of each JavaScript statement. The engine will probably still work without them, but it is safer to always write them!