I have a website, www.website.com. A user opens website.com/article.html, which contains HTML text, images, and JavaScript content (which is different for every user).
My website is powered by WordPress. How can I download the final version (with JavaScript loaded and executed) of the pages opened by my users?
I want to do this because I want to know what content the JavaScript displays for each of them.
Can I use a PHP/JavaScript function, or is there a service that does this?
You'll need a headless browser like PhantomJS to visit the page, let the JavaScript run, and then extract the content.
There is a PHP bridge available at https://github.com/diggin/php-PhantomjsRunner, but I don't know whether it's any good.
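As a rough illustration of the headless-browser approach, here is a minimal Python sketch that asks a headless Chrome/Chromium to render a page and print the resulting DOM. It uses Chrome's --headless and --dump-dom flags instead of PhantomJS; the executable names in CANDIDATE_BROWSERS, the helper names, and the example URL are assumptions, not part of the original answer.

```python
import shutil
import subprocess
from typing import List, Optional

# Executable names to try; which one exists depends on the machine (assumption).
CANDIDATE_BROWSERS = ["google-chrome", "chromium", "chromium-browser", "chrome"]


def build_dump_dom_command(browser: str, url: str) -> List[str]:
    """Build the command that makes headless Chrome print the DOM after scripts have run."""
    return [browser, "--headless", "--disable-gpu", "--dump-dom", url]


def find_browser() -> Optional[str]:
    """Return the path of the first candidate browser found on PATH, or None."""
    for name in CANDIDATE_BROWSERS:
        path = shutil.which(name)
        if path:
            return path
    return None


def fetch_rendered_html(url: str, timeout: float = 30.0) -> str:
    """Load the URL in a headless browser and return the serialized DOM."""
    browser = find_browser()
    if browser is None:
        raise RuntimeError("no Chrome/Chromium executable found on PATH")
    result = subprocess.run(
        build_dump_dom_command(browser, url),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical URL taken from the question.
    print(fetch_rendered_html("http://www.website.com/article.html"))
```

Note that this renders the page as a fresh visitor would see it. It does not reproduce the cookies or session of a particular user, so per-user content may differ from what that user actually saw.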
Related
I have a legacy web application that we are not allowed to modify yet. We need to add a new function to the application in the short term. We have been told that we may modify the webpage with any local scripts we want but we have to wait 4 months before they will unlock the application.
So my goal is to create a webpage locally, click on that local html file and have it open the url for the legacy application, and then inject the new JavaScript function to the application.
On "your" page, use an iframe to "import" the page you cannot edit, then add whatever modifications you need to your own page.
If there is no server-side scripting on the page, copy the page source into your page and add whatever you want to it. It is difficult to give a focused answer without access to, or more information about, the actual legacy page.
It can't be done directly: browsers prevent cross-site scripting, so JavaScript injected from the local machine will fail with same-origin errors. The only workaround I know of is to open the browser's developer tools, go to the console, and type and run your JavaScript there directly.
I am working on a project in which I have to extract data from a website. The project is in Java, and the website uses JavaScript. I am using Jsoup to extract the data, but there are some modal windows (dialog boxes, pop-up windows) on the web page. Is it possible to extract the data in the modal windows using Jsoup?
If the answer is yes, how can I do it? Please provide links. If not, what are the best alternatives?
Thanks for your help. I really appreciate it.
I assume that the modal is generated by Javascript.
Jsoup is just a parser. This means that it will make an HTTP request (GET or POST, whatever you tell it to do) and the server (website) will respond with the initial html. By saying initial, I mean the html before any javascript is executed.
Javascript can generate html (like the modal in question), but this is not visible to Jsoup because a parser can only read, it cannot execute code. The browser is able to generate the modal because it includes a Javascript execution engine that parses and executes Javascript.
When you visit a web page, you can't tell at a glance what is dynamic (generated by JavaScript) and what is static (sent by the server as is).
A little trick to check what is dynamic and what is static (static is visible to Jsoup) is to do the following:
Visit the web page you want to parse (with Chrome if possible; Firefox should work too, I think).
Press Ctrl + U. This will open a new tab showing the page source.
The new tab will contain a mix of HTML, CSS and JS. This is what the server sends to the browser, and it is also what Jsoup sees.
If the modal is in there, then great, it is visible to Jsoup. If not, then you have to use a library that acts as a headless browser.
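The same check can be sketched in code. The example below uses Python's standard html.parser as a stand-in for Jsoup (both only read markup and never execute scripts). The sample page and the ids "content" and "modal" are hypothetical, chosen only to show that an element created by a script does not appear in the static source.

```python
from html.parser import HTMLParser
from typing import Set


class IdCollector(HTMLParser):
    """Collect every id attribute that appears in the static markup."""

    def __init__(self) -> None:
        super().__init__()
        self.ids: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def ids_in_static_html(html: str) -> Set[str]:
    """Return the ids a pure parser can see, i.e. without running any JavaScript."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return collector.ids


# Hypothetical page: the modal only exists after the script has run in a browser.
STATIC_PAGE = """
<html><body>
  <div id="content">Hello</div>
  <script>
    var modal = document.createElement("div");
    modal.id = "modal";
    document.body.appendChild(modal);
  </script>
</body></html>
"""

if __name__ == "__main__":
    ids = ids_in_static_html(STATIC_PAGE)
    print("modal visible to a parser:", "modal" in ids)
```

If the id you are looking for is missing from the result, the element is most likely generated by JavaScript and a plain parser will not find it.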
A headless browser is essentially a browser without the graphical interface. It can parse and execute Javascript. It "sees" what a normal browser sees.
The most commonly used library is Selenium WebDriver. Be careful: Selenium is a testing framework with a lot of parts. What you need is the WebDriver component.
There are a lot of examples out there with ready-made code to get you started.
I am developing a Windows Phone (WP) application that contains a WebBrowser control which loads a web page. I want to add a JavaScript file that fills a text box in the loaded page with some data and clicks the submit button. I want this JavaScript file to run automatically once the WebBrowser control has loaded the page completely.
When the web browser completely loads a page, the Navigated event will fire.
In the event handler, you can execute arbitrary JavaScript code by calling theBrazza.InvokeScript( "eval", SomeJavaScriptSource ); where SomeJavaScriptSource is a variable or constant containing the JavaScript you’d like to run (just don't forget to specify IsScriptEnabled="True" in your web browser).
If your page already has some JavaScript code in it, you'll be fine; otherwise this approach won't work. That thread is old, but the problem is still present in Windows Phone 8 :-(
I am developing an extension that fetches pages that the user is likely to access on a website. My extension uses jQuery.get() to fetch a page. This works correctly for a site like amazon.com.
But if the user logs in to Gmail and I try to fetch some other pages, such as "account settings", I get an incomplete page. Somewhere in that page, I get the message:
"Your browser does not support Javascript or Javascript has been disabled. As your browser does not support Javascript or has Javascript disabled, we are not able to display the requested page."
Is there some way to fetch the complete page in such cases?
I ended up opening a new tab and fetching the page in that tab, then analyzing the page data with a content script. Admittedly, this is a drawback in that the user will see the newly opened tab, but it also makes the behavior transparent to the user.
If you are developing an extension on Firefox using Jetpack, you can use page-worker which is an invisible page and gives access to the DOM.
Alright, first off this is not a malicious question I'm asking. I have no intentions of using any info for ill gains.
I have an application that contains an embedded browser. This browser runs within the application's process, so I can't access it via Selenium WebDriver or anything like that. I know that it's possible to dynamically append scripts and html to loaded web pages via WebDriver, because I've done it.
In the embedded browser, I don't have access to the pages that get loaded. Instead, I can create my own html/javascript pages and execute them, to manipulate the application that houses the browser. I'm having trouble manipulating the existing pages within the browser.
Is there a way to dynamically add javascript to a page when you navigate to it and have it execute right after the page loads?
Something like
page1.navigateToUrl(executeThisScriptOnLoad)
page2 then executes the passed script.
I guess it is not possible without knowledge of the destination site. You can, however, send data to the site and then use the eval() function to evaluate the sent data on the destination page.