Capture URL from iframe either programmatically or with user input - javascript

I am setting up a site which will work as a containing site around a 3rd party site using an iframe. I want the user to be able to browse the 3rd party site (iframe), then submit the URL to the containing site, for me to then store.
I would like to therefore provide either:
a way to programmatically access the current URL of the iframe, and submit it to the containing site
a way to display the current URL of the iframe, which the user can then copy or paste into the containing site
another method

a way to programmatically access the current URL of the iframe
You can't. That would involve spying on what the user does on another website. Browsers will not let you do that.
a way to display the current URL of the iframe
That would require you to programmatically access the current URL of the iframe. See above.
another method
Write a browser extension instead of trying to do this as a website.
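The restriction described in this answer can be observed directly. The following is a minimal sketch, assuming a hypothetical helper name `readFrameUrl` and a hypothetical element id `thirdParty`: reading `contentWindow.location.href` works only while the framed document is same-origin with the containing page; once the user is on the third-party site the browser throws, and the helper returns `null`.

```javascript
// Hypothetical helper: returns the iframe's current URL, or null when the
// read fails, which is what happens when the framed document is cross-origin
// (the browser throws a SecurityError on access to location.href).
function readFrameUrl(frame) {
  try {
    return frame.contentWindow.location.href;
  } catch (err) {
    return null;
  }
}

// Browser-only usage; the element id "thirdParty" is an assumption.
if (typeof document !== "undefined") {
  const frame = document.getElementById("thirdParty");
  if (frame) console.log(readFrameUrl(frame));
}
```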

Related

Pass URL parameters from iframe

I have an iframe hosted on my site which is provided by a third party service. I can't manipulate the source code of the iframe and this results in users being able to navigate through the frame whilst the parent URL remains the same. For example
mysite.co.uk/
This presents an issue in regards to tracking and further, post session navigation etc.
Is it possible to listen for events of the hosted iframe and then pass in URL parameters? For example, a user clicks on a listing within an iframe and their URL is appended
mysite.co.uk/listings?listing1
mysite.co.uk/listings?listing2
mysite.co.uk/listings?listing3
My research suggests JavaScript's window and location objects, but I'm struggling with these suggestions given that I can't amend the code of the iframe.
Any direction would be greatly appreciated.
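It is only possible to do this with a same-domain iframe, unless you can get the third party to make changes on their end, for example by having their page report its URL to yours with window.postMessage. (CORS would not help here: it governs cross-origin HTTP requests, not access to another frame's document.)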
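If the third party is willing to change their pages, one commonly used option is `window.postMessage`. The sketch below is only an illustration under stated assumptions: the origin `https://thirdparty.example`, the message shape `{ type: "navigation", url }`, the `listing` query parameter and the function name `makeNavigationHandler` are all hypothetical.

```javascript
// Builds a "message" event handler that accepts navigation reports only from
// the trusted iframe origin. The message shape is an assumption of this sketch.
function makeNavigationHandler(trustedOrigin, onUrl) {
  return function handleMessage(event) {
    if (event.origin !== trustedOrigin) return; // ignore other senders
    const data = event.data;
    if (!data || data.type !== "navigation" || typeof data.url !== "string") return;
    onUrl(data.url);
  };
}

// Parent page (browser only): mirror the reported URL into the parent's query string.
if (typeof window !== "undefined" && typeof window.addEventListener === "function") {
  window.addEventListener(
    "message",
    makeNavigationHandler("https://thirdparty.example", (url) => {
      history.replaceState(null, "", "?listing=" + encodeURIComponent(url));
    })
  );
}

// The third party would have to add something like this to their own pages:
//   parent.postMessage({ type: "navigation", url: location.href }, "https://mysite.co.uk");
```

Checking `event.origin` before trusting the payload matters here, because any window can post messages to the parent page.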

Write to external webpage using javascript

I have a website written in HTML and JavaScript/jQuery. How can I follow a link to another page (outside my origin, let's say google.com) and automatically enter data into a form and submit it on that page?
For example, I want to follow a link on my page to google.com and then have the browser enter info into the search box and make a search.
Selenium?
It's impossible unless the page you are working with has set up GET or POST options that allow you to do such things. There may be other methods, but the page you are working with must first set it up and allow it.
For example, you can run a Google search by appending a q parameter to the URL like so. You can do this because Google uses GET parameters which are easy to determine.
https://www.google.com/search?q=stackoverflow
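A minimal sketch of building such a link from user input, so that the query is encoded correctly. The function name `buildSearchUrl` is hypothetical; `URL` and `URLSearchParams` are standard in browsers and Node.

```javascript
// Returns a Google search URL whose q parameter carries the given query.
// URLSearchParams takes care of escaping spaces and reserved characters.
function buildSearchUrl(query) {
  const url = new URL("https://www.google.com/search");
  url.searchParams.set("q", query);
  return url.toString();
}

// Browser-only usage: follow the link.
// window.location.href = buildSearchUrl("stackoverflow");
```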

Access elements on an external page

I have an html page that is being accessed via a link that places an external page in the url - e.g.
http://www.mydomain.com/mypage?external-page=encodedURL
It is the responsibility of my page to scrape some data from the URL it is handed.
How can I access the passed-in page using javascript/jquery? I need to be able to pull out the content for certain classes and ids.
Is this a violation of same origin policy? If so, is there some other way to process an external page like this? Seems strange to me that I can hit the web page in a browser or a terminal command and receive the content, but not in a js file.
You can use a browser extension to scrape the external page, then send the data to your site, OR display it within the page, so that it can then be accessed by your page's javascript via the DOM.
You can use a proxy on your domain which fetches the external page and hands it to your javascript whose origin is on your domain, too.
You can use an API for the external page which is accessible.
You can ask the owner of the external page, or change its code yourself if you have access to it, to serve pages with the header Access-Control-Allow-Origin: *
I think this is all you can do.
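The proxy option could look roughly like this on the client side. This is a sketch under assumptions, not a definitive implementation: the `/proxy?url=...` endpoint on your own domain is hypothetical and would have to be written separately, and the function names `proxyRequestFor` and `scrape` are made up for illustration. Only the `external-page` parameter name comes from the question.

```javascript
// Reads the `external-page` parameter from a query string and returns the
// path of a same-origin proxy request for it, or null if the parameter is
// missing or is not an absolute http(s) URL. "/proxy" is a hypothetical endpoint.
function proxyRequestFor(search) {
  const target = new URLSearchParams(search).get("external-page");
  if (!target) return null;
  let parsed;
  try {
    parsed = new URL(target);
  } catch (err) {
    return null; // not an absolute URL
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null;
  return "/proxy?url=" + encodeURIComponent(parsed.href);
}

// Browser-only: fetch the page through the proxy, parse it, and return the
// text content of every element matching the selector.
async function scrape(search, selector) {
  const request = proxyRequestFor(search);
  if (!request) return [];
  const html = await (await fetch(request)).text();
  const doc = new DOMParser().parseFromString(html, "text/html");
  return Array.from(doc.querySelectorAll(selector), (el) => el.textContent);
}
```

In the page itself this might be called as `scrape(window.location.search, ".some-class")`. Because the request goes to your own domain, the same-origin policy does not apply to it.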
EDIT: The "seems strange" part makes sense once you consider the intended difference between a user and a process. The user is not assumed to be malicious, but a process could be. A process could, for example, grab data from a user's logged-in Gmail session if it had access to the external page, and transmit that data to a server. Since the user at the terminal is probably (but not always!) the one who logged in to that session, the user is not assumed to be malicious. But a script whose origin is some website the user navigates to should not be able to act with the same permissions as that user: the script is an agent too and can take actions, but it is not created or directed by the user. That's the strongest reason for the isolation of origins and the same-origin policy.
Example
Execution Context of Bookmarklets, and IFrames
If you are injecting JS into every page via a bookmarklet, then that injected code will behave as if it has the same origin as the rest of the page, or at least the "top frame" of that page. It will execute in the same context as the top frame. If there are nested iframes in the page then you will get an "unsafe attempt to access page x from " error if your bookmarklet tries to inject into there. This is because the bookmarklet has its origin in the top page, and the top page can never access nested iframes on different domains anyway.
So if some part of the site you wish to scrape is in an iframe below the top frame, your bookmarklet will fail to get it.
Transmitting Data using a bookmarklet
If you want to take a url on one page, on your domain, then grab data from that url, on another domain, then display that data back on the same page, you need a way to get the data across. You could use a bookmarklet but the flow would still involve some "user help". It would go something like this:
Load your domain's page, D. User puts a url into an input box. Clicks submit.
Javascript on D opens a new tab/window pointing to the user provided url.
User clicks your scraping bookmarklet on that external page, which collects the desired data, X.
Desired data, X, is sent via Ajax to a "server", S, with session identifier I.
Page D, polls the server S, until it gets notified that some data with session identifier I has been grabbed, then it gets that data and displays it on D.
This flow needs a server. You can't use local storage to transmit the information, since local storage is specific to a domain. There is an alternative that does not require a server, but it requires making a browser extension.
Transmitting data using a browser extension
The "background page" of the extension is basically the same as a local server for all the browser tabs: it permits transmitting information across tabs that point to different domains. The "clients" in this setup are the "content scripts", which are loaded into every page (just like a bookmarklet, except that the user does not have to click anything to load them; it happens automatically). The flow would go like this:
Page D again. User inputs url in input box. Clicks submit -> which triggers some code in the extension.
The extension background page instructs a tab to open and targets it to the url.
A content script loads automatically into that tab, checks with the background what data it should get. It gets that data, and sends it, via a message (a json string) to the background page.
The background page pushes that notification and the data on to the original content script on page D, which displays the information.
Optionally, the background page also transmits the information to your server for saving into that user's datastore.
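The background-page side of this flow can be sketched as below. It is a simplified illustration, not the answer's actual code: the message shapes (`open`, `scraped`, `result`) and the name `installRelay` are assumptions, and the step where the content script asks the background page what to collect is left out. The extension API is passed in as a parameter so the logic can be read and exercised outside a browser; inside an extension the global `chrome` object would be passed.

```javascript
// Relays scraped data from a newly opened tab back to the tab that asked for it.
// `api` is expected to look like the `chrome` extension object
// (runtime.onMessage, tabs.create, tabs.sendMessage).
function installRelay(api) {
  const requesters = new Map(); // opened tab id -> id of the tab showing page D

  api.runtime.onMessage.addListener((message, sender) => {
    if (!sender.tab) return; // only handle messages sent by content scripts in tabs

    if (message.type === "open") {
      // Page D's content script asked for a URL to be opened and scraped.
      api.tabs.create({ url: message.url }, (tab) => {
        requesters.set(tab.id, sender.tab.id);
      });
    } else if (message.type === "scraped" && requesters.has(sender.tab.id)) {
      // The content script in the opened tab reported its data: forward it to D.
      api.tabs.sendMessage(requesters.get(sender.tab.id), {
        type: "result",
        data: message.data,
      });
      requesters.delete(sender.tab.id);
    }
  });
}

// Inside an extension's background page, wire it to the real API.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.tabs) {
  installRelay(chrome);
}
```

A content script on page D would send `{ type: "open", url }` with `chrome.runtime.sendMessage`, and the content script in the opened tab would send `{ type: "scraped", data }` once it has collected the data.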
The language I use for the browser extension "background page" and "content script" is pretty much focussed on Google Chrome. The same concepts are available in Safari, Firefox as well. If you want to support IE you're going to have to work out something else. IE10 does not plan to even support extensions.
If the external page and your page are on the same domain, then you should be able to access that external page using JavaScript. Otherwise, the JavaScript won't be allowed to access the external site; the browser's same-origin policy will block it.

Javascript reads previously opened tab html in the same window

I have a task that I do not know how to start; I hope Stack Overflowers can give me some ideas.
I want to read the HTML source code of a tab that was opened earlier and is still open, from my web page.
My approach was to grab the URL of the target page, send that URL to the server to process it, then use the result in my web page. But I am facing the "same domain policy" on the server side. I know that JSONP can be used, but I must use POST in this case (for other reasons). So I think that if the tab (page) has been opened and is still open, there must be some way to read its HTML when my web page is opened.
The flow would be: Page1 is open, the user opens mywebpage.html in the same window, mywebpage.html finds that Page1 is open, then grabs its HTML source and uses it.
Thanks!
Edit:
This is the full story.
What I am planning to build is a Firefox plugin with a button (myPluginButton) on the toolbar.
If the user clicks myPluginButton, the HTML code of the current page is sent to the server, the server parses the HTML and generates a report, and a new tab is opened to display this report.
My current approach is to read the HTML of the current page using newTabBrowser.contentDocument and send it to the server, then do the parsing on the server side. But this approach creates extra traffic. It would be more efficient to send only the URL of the current page to the server, and read and parse the HTML there. However, the same domain policy does not allow me to do this easily.
So my question is: when the user clicks myPluginButton to open a new tab, can this new tab loop over all the open tabs in the browser, read their HTML contents, and generate the report? These tabs are still open, so their HTML contents must be stored somewhere (or am I wrong?).
Thanks.
Browsers have a built-in protection called the same-origin policy that prevents a page from reading the content of another origin (domain, subdomain, port, ...).
If you want to gain access to the current page you can use a bookmarklet.
You ask your users to add it in their bookmarks bar, and each time they want to use it, they don't open a tab but click on the bookmark.
This will load your script in the page, with all access to read the page content.
And oddly enough you can POST from this page to your domain, by posting a FORM to an IFRAME hosted on your domain. But you won't be able to read the response of the POST. You can use a setInterval with a JSONP call to your domain to know if the POST was successful.

monitoring iframe content/status from the parent page

What methods are available to monitor the status of IFRAME page, I know there are security limits but I hope some small notification system is still possible.
My situation is that I have created a parent page that is located on the customer's server, and this page has an iframe page located on my server (my domain). I need to somehow communicate a little between these two:
Can I add JavaScript to the parent page that checks whether my iframe page contains a specific string, or somehow make the iframe page notify the parent page?
Is there, for example, any possibility of making a timer that checks the iframe content from time to time?
I would also accept an answer on how mydomain/client.page can call a callback on customerdomain.intranet.com/parentpage.htm, which has the client in an iframe.
You need to use cross site JavaScript techniques to be able to do this. Here is an example.
Put another file on your server, call it helper.html, and include it in your file served by the customer's server using an iframe. Set the src of the helper.html iframe with GET parameters added, i.e. http://myserver.com/helper.html?param1=a&param2=b. In the helper file, use JavaScript to call a method on the parent's parent ( parent.parent.messageFromIframe(params) ), which is the page on your server itself. Since the helper and the container page are on the same domain, it should work. The technique is popular; for instance, Facebook was using it for their JavaScript API.
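The script inside helper.html might look like the sketch below. `messageFromIframe` is the function name used in this answer and has to be defined by the grandparent page; `relayParams` is a hypothetical name, and the window object is passed in as a parameter only so that the logic can be tried outside a browser.

```javascript
// Parses the helper page's query string and hands the parameters to the
// grandparent window, which is on the same domain as helper.html.
function relayParams(win) {
  const params = Object.fromEntries(new URLSearchParams(win.location.search));
  win.parent.parent.messageFromIframe(params);
  return params;
}

// Browser-only: run when helper.html is loaded inside a frame.
if (typeof window !== "undefined" && window.parent !== window) {
  relayParams(window);
}
```

For helper.html?param1=a&param2=b the grandparent's `messageFromIframe` would receive an object with `param1` set to "a" and `param2` set to "b".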
I got information that this is possible by setting parent.location (from iframe) to have hash data like this "mydomain.com/mypage#mymessage"
By default, security restrictions in the browser will prevent access from/to the document in the iframe if it is in a different domain to the parent page. This is, of course, just as it should be.
I believe this would prevent even checking the current location of the iframe, but that's easily testable. If it's accessible, then you could poll the iframe for its location, and whenever the page in the iframe updates, have it append a random querystring parameter. Comparison of that parameter to the value from the previous poll would tell you if it's changed.
However, as I say, I suspect it's not possible.
Edit: This question suggests it is only possible for the initial src attribute: How do I get the current location of an iframe?
