Interacting with webpage from C++ Application - javascript

i have a sitation where i want to access HTML DOM object from within my application to update certain parts of web page through javascript commands at run time.
It is a local webpage opened in FireFox which would be accessed by my application, so that the final output is always shown at the webpage which is updated by appliation.
It would be great if you could give me some idea about how this can be accomplished.
I have similar requirement like the webmonkey extension of firefox but need to do it outside of browser from my application.

You can try QtWebKit from the Qt framework, it provides an OO set of classes to interact with webpages from basic actions to very complicated and advanced stuff. I believe you may find your answer there, a link is provided below...
Good Luck
see Here

Related

How to save a website made with javascript to a file

A little info:
When 'inspected' (Google Chrome), the website displays the information I need (namely, a simple link to a .pdf).
When I cURL the website, only a part of it gets saved. This coupled with the fact that there are functions and <script> tags leads me to believe that javascript is the culprit (I'm honestly not 100% sure, as I'm pretty new at this).
I need to pull this link periodically, and it changes each time.
The question:
Is there a way for me, in bash, to run this javascript and save the new HTML code it generates to a file?
Not trivially.
Typically, for that approach, you need to:
Construct a DOM from the HTML
Execute the JavaScript in the context of that DOM while resolving URLs relative to the URL you fetched the HTML from
There are tools which can help with this, such as Puppeteer, PhantomJS, and Selenium, but they generally lend themselves to being driven with beefier programming languages than bash.
As an alternative, you can look at reverse engineering the page. It gets the data from somewhere. You can probably work out the URLs (the Network tab of a browser's developer tools is helpful there) and access them directly.
If you want to download a web page that generates itself with JavaScript, you'll need to execute this JavaScript in order to load the page. To achieve this you can use libraries that do this like puppeteer with NodeJS. There's a lot of other libraries, but that's the most popular.
If you're wondering why does this happens, it's because web developers often use frameworks like React, Vue or Angular to quote the most popular ones which only generates a JavaScript output that's not executed by common HTTP requesting libraries.

Chrome Apps <a> tag not working

Its been a few years since I have worked on Chrome apps. I started messing around with making some simple examples to learn the new process. The problem is that I have found that using an <a> tag to change the html page that is loaded to another html page inside the packaged app will not work. I'm looking for different methods of changing the screen with retrieving user input (button click, link click. and so on). I have looked online and have found very little documentation about making chrome apps most examples show how to make a simple "Hello World" and how to publish it. Not many extensive tutorials beyond that. Just to clarify this app would be a real chrome app not a link to some site. all files would be packaged with the chrome application.
Chrome Apps are meant to be "single page apps", and cannot navigate links by design. <a> links should open in a regular browser window instead.
If you want to do "url routing" within your application to change views, you can just roll your own solution, but should probably use a framework to help you out instead.
Here are some examples:
Polymer
Angular
Ember
Meteor
Backbone
The list is extensive. There are likely many other stack overflow answers that compare each framework.
The DOM (Document Object Model) which represents the contents of a Chrome App window is initiated from an HTML file, but from that point on you can't change it by referencing any other file, which is what you're expecting navigation to do. (<a> elements that navigate to an external browser via a "target=_blank" attribute are perfectly OK.)
However, you are free to change the DOM from your JavaScript at runtime. If you like, set an event handler on the <a> element and change the DOM as you wish. If you want to change the DOM via HTML (not from a file), you might find the JavaScript method insertAdjacentHTML useful. Actually, you can get the HTML from a file, but you have to read that file yourself with the Chrome App file I/O API.
Advice in another answer to use a framework is, in my opinion, overkill. If you think of a Windows app, a Mac app, or any other kind of GUI-based app, you would never assume that you could just change the UI over to something completely different by referencing an HTML file. Think of a Chrome App as being similar to those technologies in that sense, and you'll be on the right track.

How to architect a HUD in a Google Chrome extension?

I am trying to make a Google Chrome extension using content script.
My goal is to have a display at the top of the page (which is already working on my own pages) that can interact with the page.
I need things which are very complicated to put together in an extension, due to security policies :
Using require.js on the extension (that works for now, using this Github repo)
Using a templating engine to describe my display : I need to add a lot of content to the page and I don't think writing HTML in javascript would be a good workflow.
For my current version I use jade with my server, but this is not possible with an extension. I think I need to use something like Angular.js or Backbone.js, but I can't make them work on the content script.
I need a lot of communication between my extension and the page : For example I need to detect almost constantly mouse moves
I need communication with my server using socket.io
Every bit of functionality of my extension have been developed and tried in a standalone web page, but now I need to integrate it in a real extension and I am really stuck
So due to these requirements, I am wondering what would be the right approach for building this : putting it all in an iFrame (would the server-side communication work? And how to communicate with the page ?), or a way to make a templating engine work nicely in there, or a solution I didn't think of?
Try this:
Develop the HUD part as a standalone page that the content script will include in an iframe. You should be able to use Angular.js etc. with this, but you will need local copies of as much as possible and you'll need appropriate entries in the manifest.json to get it working in the extension. See/create other questions for the details.
Have your content script inject the code to monitor mouse-moves, etc. into the target page. Have this code digest and summarize the data, so it's not spamming the system. Maybe message the summaries to the HUD page and/or content script five or six times a second.
After that, it should just be a matter of getting the pieces working, one at a time. Break it down to specific problems and ask a question on one specific problem at a time (If you can't find the answers in previous questions).
I'm pretty sure what you appear to want is do-able, but the details are too broad for a single Stack Overflow question.

Interacting with Local webpage through Application in real time?

I want to interact with my local HTML page through my C++ application. Just like using java script console, we can edit a page in real time, e.g
document.getElementById('divlayer').style.visibility = 'hidden';
Similarly i want to call such functions in real time through my application.
Can you give me some idea if there is a way to accomplish this job?
I am using Google Chrome at the moment.
Do i need some plugin, but how can i make plugin to interact with my application then?
Also, i head about JQuery, can this be done using JQuery? Or do i have to try some server mecahnism may be using Ajax??
I know that you can control your IE browser with COM on Windows, and you can interact with the page with it. But I didn't try it with C++, I just use it with Python and it works well. May be you'd like to check it out.

How to simulate JavaScript in a client C# Applications

I'm writing a web crawler (web spider) that crawl all links in a website.
My application is a Win32 App, written in C# with .Net framework 3.5.
Now I'm using HttpWebRequest an HttpWebResponse to communicate with the web server.
I also built my own Http Parser that can parse anything I want.
I found all link like "href", "src", "action"... in the parse.
But I can not solve one problem: Simulate Client Script in the page (like JS and VBS)
For example, if a link like:
a href = "javascript:buildLink(1)"
... with buildLink(parameter) is a Javascript function that will make a custom link due to the parameter.
Please help me to solve this problem. How to simulate JavaScript in this app? I can parse the HTML source code and take all JavaScript code to another file, but how to simulate a function of it?
Thanks.
Your only real option is to automate a browser. As other answers have said, you cannot reliably simulate browser javascript without having a complete DOM.
There are fortunately ways to automate the browser, check out Selenium.
It has a C# API, so you can control the browser from C#.
Use your .NET web crawler code to crawl the site. Whenever you encounter a href="javascript:... link, handle the page containing the link in Selenium:
Use the Selenium API to tell the browser to load the page.
Use the Selenium API to find all links on the page.
This way, your spider only uses Selenium when necessary (pages without javascript links can be handled by the browser-less spider code you already got). And since this is an embarrassingly parallel workload, you could easily have multiple Selenium processes running at the same time (either on one computer or on other computers).
But remember that href="javascript is hardly the only way a page can have dynamic links. The more common case is probably that a onload or $(document).ready() script manipulates the DOM and adds links that way.
To catch that case (and others), the spider probably will have to use Selenium for all pages that have a <script> tag.
You are basically pretending to be a browser, except that HttpWebRequest only does the networking stuff for you.
I would recommend using the ie web browser control and interop'ing into that from your c# application. That will allow you to run JavaScript, set variables, post, etc etc.
Here's some basic links I found after a search for "ie web browser control":
http://www.c-sharpcorner.com/UploadFile/mahesh/WebBrowserInCSMDB12022005001524AM/WebBrowserInCSMDB.aspx
http://support.microsoft.com/kb/313068
This is a problem which is not easily solved. You could consider taking one of the existing JavaScript implementations and porting or interfacing with it somehow.
If I were tackling this problem, I'd probably build a small side application in Java on top of Rhino, with some sort of RPC framework layered on top of that so that I could communicate with it from my primary application.
Unfortunately, without having a complete DOM implementation on top of that, you would be limited to only very simple javascript.
You could execute the javascript by using the MS JScript engine or something similar.
MSDN Reference
Eric Lippert's blog on using Eval (part 1) (part 2) (part 3)
This isn't guaranteed to work, especially if the javascript tries to access the DOM, or somesuch... But for simple scripts, it might be enough.

Categories