I have a task and I do not know where to start; I hope Stack Overflow users can give me some ideas.
I want to read, from my web page, the HTML source of a tab that was opened earlier and is still open.
My approach was to grab the URL of the target page, send that URL to the server for processing, then use the result in my web page. But I am running into the "same-origin policy" on the server side. I know that JSONP can be used, but I must use POST in this case (for other reasons). So I think that if the tab (page) has been opened and is still open, there must be some way for me to read its HTML when my web page is opened.
The flow would be: Page1 is open, the user opens mywebpage.html in the same window, mywebpage.html finds that Page1 is open, then grabs its HTML source and uses it.
Thanks!
Edit:
This is the full story.
What I am planning to build is a Firefox plugin with a button (myPluginButton) on the toolbar.
If the user clicks myPluginButton, the HTML of the current page is sent to the server, the server parses the HTML and generates a report, and a new tab is opened to display this report.
My current approach is to read the HTML of the current page using newTabBrowser.contentDocument and send it to the server, then do the parsing on the server side. But this approach creates extra traffic. A more efficient way would be to send only the URL of the current page to the server, and let the server read and parse the HTML. However, the same-origin policy does not allow me to do this easily.
So, my question is whether this is possible: when the user clicks myPluginButton to open a new tab, the new tab loops over all the open tabs in the browser, reads their HTML contents, then generates the report. Since these tabs are still open, their HTML contents must be stored somewhere (or am I wrong?).
Thanks.
Browsers have a built-in protection called the same-origin policy that prevents a page from reading the content of another origin (a different scheme, domain, subdomain, or port).
If you want to gain access to the current page you can use a bookmarklet.
You ask your users to add it to their bookmarks bar, and each time they want to use it, they click the bookmark instead of opening a tab.
This will load your script in the page, with all access to read the page content.
And oddly enough, you can POST from this page to your domain by submitting a FORM that targets an IFRAME hosted on your domain. You won't be able to read the response of the POST, but you can use setInterval with a JSONP call to your domain to find out whether the POST was successful.
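As a rough illustration of the bookmarklet idea, here is a minimal sketch that builds the javascript: URL your users would drag to their bookmarks bar. The function name buildBookmarklet and the script URL are made up for this example; the loader simply appends a script tag pointing at a script hosted on your domain.

```javascript
// Build a bookmarklet URL that, when clicked, injects a script from
// `scriptUrl` into whatever page the user is currently viewing.
// `buildBookmarklet` and the URL used below are hypothetical names.
function buildBookmarklet(scriptUrl) {
  const loader =
    "(function(){" +
    "var s=document.createElement('script');" +
    // JSON.stringify quotes and escapes the URL as a JS string literal.
    "s.src=" + JSON.stringify(scriptUrl) + ";" +
    "document.body.appendChild(s);" +
    "})();";
  // Percent-encode the code so it is safe inside an href attribute.
  return "javascript:" + encodeURIComponent(loader);
}

// Example: put the result in <a href="...">Scrape this page</a>
// on a page where users can drag the link to their bookmarks bar.
const bookmarkletHref = buildBookmarklet("https://example.com/scraper.js");
```

The injected script then runs with the origin of the page the user is on, which is why it can read that page's DOM. Pages with a strict Content-Security-Policy may refuse to load the external script.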
Related
I have the next task - I have a page where we have some interaction logic:
After a user clicks a button, my script redirects the user to another site, where it must populate 2 text fields and then click a button. After that redirects to a new page, it must click another button.
My project is based on ASP.NET MVC4.
My questions are:
Can I do all of this?
If yes, how can I redirect to another page and run my script there?
P.S.: The second website isn't mine, and all I know are the ids of the buttons I need to click.
Elaborating on my comment
You cannot do this in a normal browser. You could write a bookmarklet or two that would navigate and click but there is no script you can write in a web page that will do what you want for security reasons. A long time ago, it was possible in IE to load a banking site into an iFrame and script and monitor user interaction to steal credentials. This has been blocked.
If you save an HTML page with the extension HTA, it can be loaded from the hard disk in Windows and will have relaxed security, so you could load the other site into an iFrame and script the interaction. This is likely not what you want.
The last method is to use, for example, cURL on the server to get the foreign page, insert your values, submit the form to the foreign site and return the result. This is not recommended either.
So the question to you is: why do you need this, and are there other ways to do what you want?
1) location.href = "http://another.page.com"
2) impossible for security purposes
I have sent an email to my client with some links (http://example.com?id=1234).
When the user clicks such a link, it opens a new tab and plays some videos using an iFrame.
If the site is already open, there is no need to open a new window; just launch the video in the window that is already open.
If the site is not yet open, then open a window with the site and play.
How can I find out whether the site (http://example.com) is already open or not?
Is there an option in JavaScript?
There is no way to run any client side code in an HTML formatted email. So this is impossible.
The closest you could come would be to:
use some kind of token to identify a user (e.g. stored in a cookie)
run some heart beat code to see if they are still on the page (e.g. use XMLHttpRequest to request a 1 byte file every 15 seconds using a page id generated when the page was loaded and the user id in the cookie)
check on the server to see if a heart beat from a different page was received recently when a new copy of the page is loaded
serve different content if it is
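The server-side bookkeeping for that heartbeat could look roughly like the sketch below. The class name HeartbeatRegistry, its method names, and the 30 second timeout are assumptions made for illustration; a real server would call beat() from the endpoint that the XMLHttpRequest hits.

```javascript
// Tracks the last heartbeat per (user, page) so the server can tell
// whether the same user already has another copy of the page open.
// All names and the default timeout are hypothetical.
class HeartbeatRegistry {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map(); // userId -> Map(pageId -> timestamp in ms)
  }

  // Record a heartbeat from one page instance of one user.
  beat(userId, pageId, now = Date.now()) {
    if (!this.lastSeen.has(userId)) {
      this.lastSeen.set(userId, new Map());
    }
    this.lastSeen.get(userId).set(pageId, now);
  }

  // True if a *different* page of the same user sent a heartbeat recently.
  hasOtherLivePage(userId, pageId, now = Date.now()) {
    const pages = this.lastSeen.get(userId);
    if (!pages) return false;
    for (const [otherId, seenAt] of pages) {
      if (otherId !== pageId && now - seenAt <= this.timeoutMs) {
        return true;
      }
    }
    return false;
  }
}
```

When a new copy of the page loads, the server would call hasOtherLivePage() with the user id from the cookie and the freshly generated page id, and serve different content if it returns true.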
For security reasons this is not possible directly in JavaScript.
But you can work around this by adding a marker to the URL and then detecting on the server side whether the site is already streaming the video to that computer (match on the URL marker, IP and browser).
In its response, the server can then tell the new page whether to close or not.
On a certain webpage that I was inspecting, I saw some links that didn't point directly to their destination. For example, a button said "Go to Google" but opened "www.examplesite.com/redirect_google" instead of just linking to Google via a plain href.
I wasn't sure whether I trusted that link, so a question came up: "How can I inspect that page to know what kind of scripts it runs?" But as you have already understood, I can't open it in my browser because I get redirected, so where can I type the URL to inspect the page itself?
If the redirect is implemented at the HTTP level, then there's no page to inspect; it's just an HTTP 301 response (or 302, etc.).
If the redirect is via a meta tag or Javascript, then you can request the page via curl without rendering the HTML or having a browser act upon the meta redirect.
In the case of Javascript, you could also disable JS in your browser (methods of how to do that vary depending on the browser you're using).
Using cURL on the command line with the given URL, you will get the source code of the page.
Combined with a programming language, you can simply parse the output to check whether it contains a redirect.
I'm also pretty sure a few tools exist on the Internet to check such behavior on websites, but I don't know any.
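To make the "parse the output to check for a redirect" step concrete, here is a small sketch that classifies a response you have already fetched (for example with curl -i). The function name describeRedirect and the shape of its input object are assumptions for this example, and the script check is only a crude heuristic, not a real JavaScript analysis.

```javascript
// Classify how (if at all) a fetched response redirects.
// `response` is a plain object: { status, headers, body }, where header
// names are assumed to be lowercase. All names here are hypothetical.
function describeRedirect(response) {
  const { status, headers = {}, body = "" } = response;

  // 1. HTTP-level redirect: 3xx status plus a Location header.
  if (status >= 300 && status < 400 && headers.location) {
    return { kind: "http", target: headers.location };
  }

  // 2. <meta http-equiv="refresh" content="0; url=..."> in the HTML.
  const meta = body.match(
    /<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i
  );
  if (meta) {
    const url = meta[0].match(/url\s*=\s*['"]?([^'"\s>]+)/i);
    return { kind: "meta", target: url ? url[1] : null };
  }

  // 3. Heuristic: the page assigns to location or calls replace/assign.
  if (/\blocation(\.href)?\s*=|location\.(replace|assign)\s*\(/.test(body)) {
    return { kind: "script", target: null };
  }

  return { kind: "none", target: null };
}
```

For a script-based redirect the sketch only reports that something touches location; you would still read the page source to see what it actually does.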
Linux/UNIX command line:
$ curl -i www.example.com/redirect_google
There are many variations of this: any small utility that downloads content from URLs without caring what the content is, and that shows you information about the responses, will do (here -i makes curl show the HTTP headers).
But if your concern is that the page may not be trustworthy... well, why this Google redirect page in particular? Any site could try to attack you with some "bad content"...
You can download the whole html file or whatever is stored there with tools like WinHTTrack or WSSniffer for example.
I have an html page that is being accessed via a link that places an external page in the url - e.g.
http://www.mydomain.com/mypage?external-page=encodedURL
It is the responsibility of my page to scrape some data from the URL it is handed.
How can I access the passed-in page using javascript/jquery? I need to be able to pull out the content for certain classes and ids.
Is this a violation of same origin policy? If so, is there some other way to process an external page like this? Seems strange to me that I can hit the web page in a browser or a terminal command and receive the content, but not in a js file.
You can use a browser extension to scrape the external page, then send the data to your site, OR display it within the page, so that it can then be accessed by your page's javascript via the DOM.
You can use a proxy on your domain which fetches the external page and hands it to your javascript whose origin is on your domain, too.
You can use an API for the external page which is accessible.
You can ask the owner of the external page to change it, or change its code yourself if you have access to it, so that it serves pages with the header Access-Control-Allow-Origin: *
I think this is all you can do.
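For the proxy option, a minimal Node.js sketch might look like this. It reuses the external-page query parameter from the question; the function names getExternalPageUrl and proxyHandler are made up, the global fetch is assumed to be available (Node 18 or later), and a real deployment would also restrict which hosts may be fetched.

```javascript
// Extract and validate the `external-page` query parameter.
// Returns the normalized target URL, or null if it is missing,
// malformed, or not http(s). Function names are hypothetical.
function getExternalPageUrl(requestUrl) {
  // searchParams.get() already percent-decodes the value.
  const raw = new URL(requestUrl).searchParams.get("external-page");
  if (!raw) return null;
  let target;
  try {
    target = new URL(raw);
  } catch {
    return null; // not an absolute URL
  }
  if (target.protocol !== "http:" && target.protocol !== "https:") {
    return null; // reject javascript:, file:, etc.
  }
  return target.href;
}

// Request handler to pass to http.createServer(). It fetches the
// external page on the server and returns it from your own origin.
async function proxyHandler(req, res) {
  const target = getExternalPageUrl("http://" + req.headers.host + req.url);
  if (!target) {
    res.writeHead(400, { "Content-Type": "text/plain" });
    res.end("Missing or invalid external-page parameter");
    return;
  }
  try {
    const upstream = await fetch(target);
    const body = await upstream.text();
    res.writeHead(upstream.status, {
      "Content-Type": "text/html; charset=utf-8",
    });
    res.end(body);
  } catch (err) {
    res.writeHead(502, { "Content-Type": "text/plain" });
    res.end("Could not fetch the external page");
  }
}
```

Because the response now comes from your own domain, your page's JavaScript can request it with XMLHttpRequest and parse out the classes and ids it needs.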
EDIT: It only "seems strange" until you realize the intended difference between a user and a process. The user is not thought to be malicious, but a process could be. A process could, for example, grab data from a user's logged-in Gmail session if it had access to the external page, and transmit that data to a server. Since the user at the terminal is probably (but not always!) the one who logged in to that session, the user is not thought to be malicious. But a script whose origin is some website that the user navigates to should not be able to act with the same permissions as that user, because that script is an agent as well and can take actions, but it is not created or directed by the user. That's the strongest reason for the isolation of origins and the same-origin policy.
Example
Execution Context of Bookmarklets, and IFrames
If you are injecting JS into every page via a bookmarklet, then that injected code will behave as if it has the same origin as the rest of the page, or at least the "top frame" of that page. It will execute in the same context as the top frame. If there are nested iframes in the page then you will get an "unsafe attempt to access page x from " error if your bookmarklet tries to inject into them. This is because the bookmarklet has its origin in the top page, and the top page can never access nested iframes on different domains anyway.
So if some part of the site you wish to scrape is in an iframe below the top frame, your bookmarklet will fail to get it.
Transmitting Data using a bookmarklet
If you want to take a url on one page, on your domain, then grab data from that url, on another domain, then display that data back on the same page, you need a way to get the data across. You could use a bookmarklet but the flow would still involve some "user help". It would go something like this:
Load your domain's page, D. User puts a url into an input box. Clicks submit.
Javascript on D opens a new tab/window pointing to the user provided url.
User clicks your scraping bookmarklet on that external page, which collects the desired data, X.
Desired data, X, is sent via Ajax to a "server", S, with session identifier I.
Page D, polls the server S, until it gets notified that some data with session identifier I has been grabbed, then it gets that data and displays it on D.
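The part of server S that matches scraped data to the waiting page can be very small. Here is a minimal in-memory sketch; the class name ScrapeMailbox and its methods are invented for this example, and the HTTP endpoints that would call them are left out.

```javascript
// In-memory "mailbox" on server S, keyed by session identifier I.
// The bookmarklet's Ajax call deposits data; page D polls for it.
// Class and method names are hypothetical.
class ScrapeMailbox {
  constructor() {
    this.bySession = new Map(); // sessionId -> scraped data
  }

  // Called when the bookmarklet sends data X for session I.
  deposit(sessionId, data) {
    this.bySession.set(sessionId, data);
  }

  // Called on each poll from page D. Hands the data over once,
  // then forgets it so later polls report "not ready" again.
  poll(sessionId) {
    if (!this.bySession.has(sessionId)) {
      return { ready: false };
    }
    const data = this.bySession.get(sessionId);
    this.bySession.delete(sessionId);
    return { ready: true, data };
  }
}
```

Page D would keep polling (for example every second with setInterval) until it gets ready: true, then display the data.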
This approach needs a server. You can't use local storage to transmit the information, since local storage is specific to a domain. There is an alternative that does not require a server, but it requires making a browser extension.
Transmitting data using a browser extension
The "background page" of the extension is basically the same as a local server for all the browser tabs; it permits transmitting information across tabs that point to different domains. The "clients" in this setup are the "content scripts", which are loaded into every page (just like a bookmarklet, except that the user does not have to click anything to load them; it happens automatically). The flow would go like this:
Page D again. User inputs url in input box. Clicks submit -> which triggers some code in the extension.
The extension background page instructs a tab to open and targets it to the url.
A content script loads automatically into that tab, checks with the background what data it should get. It gets that data, and sends it, via a message (a json string) to the background page.
The background page pushes that notification and the data on to the original content script on page D, which displays the information.
Optionally, the background page also transmits the information to your server for saving into that user's datastore.
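The background-page side of that flow could be sketched as below, using Chrome's chrome.runtime.onMessage, chrome.tabs.create and chrome.tabs.sendMessage. The message types ("scrape", "what-to-get", "result", "scraped"), the selector field and the function name registerBackground are all assumptions for this example. The chrome object is passed in as a parameter only so the routing logic can be exercised outside a browser.

```javascript
// Background-page message router. In an extension you would call
// registerBackground(chrome). Message shapes are hypothetical.
function registerBackground(chromeApi) {
  // scraped tab id -> { requesterTabId, selector }
  const jobsByTab = new Map();

  chromeApi.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.type === "scrape") {
      // Page D's content script asks for a URL to be scraped.
      chromeApi.tabs.create({ url: message.url }, (tab) => {
        jobsByTab.set(tab.id, {
          requesterTabId: sender.tab.id,
          selector: message.selector,
        });
      });
    } else if (message.type === "what-to-get") {
      // The content script in the new tab asks what it should collect.
      const job = jobsByTab.get(sender.tab.id);
      sendResponse(job ? { selector: job.selector } : null);
    } else if (message.type === "result") {
      // The content script reports the data; forward it to page D's tab.
      const job = jobsByTab.get(sender.tab.id);
      if (job) {
        chromeApi.tabs.sendMessage(job.requesterTabId, {
          type: "scraped",
          data: message.data,
        });
        jobsByTab.delete(sender.tab.id);
      }
    }
  });
}
```

The sketch keeps its state in memory, which fits the persistent "background page" model described here; an extension whose background script can be unloaded would need to store the job table elsewhere.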
The language I use for the browser extension "background page" and "content script" is pretty much focussed on Google Chrome. The same concepts are available in Safari, Firefox as well. If you want to support IE you're going to have to work out something else. IE10 does not plan to even support extensions.
If the external page and your page are on the same domain, then you should be able to access that external page using JavaScript. Otherwise, the JavaScript won't be allowed to access the external site; browsers block cross-origin access under the same-origin policy.
I have a site where I save URLs and I want to process and save the entire DOM (in case the site goes down -- I'll still have access to the content).
The current version of my JavaScript bookmarklet (which only saves the URL and page title) submits a series of GET variables to a PHP page. However, this will not work for the entire DOM because of URL length limits (usually ~15,000 characters, it seems).
I think that using POST would allow me to send more information but I believe that the browser will stop it because of XSS (cross site scripting) concerns.
Is there a way to send a large amount of data (15,000char+) from a javascript bookmarklet?
I'm happy to clarify!
Create a form (in an iframe) -> set its values -> submit -> remove the iframe.
The reason for the iframe is so that the page doesn't navigate away when you submit the form.
There won't be any permission issues.
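A minimal sketch of those steps follows. It uses a common variant in which the form sits in the page and targets a hidden iframe by name, rather than living inside the iframe. The function name postViaHiddenIframe, the frame name prefix and the field names in the usage note are made up for illustration; document is passed in as doc.

```javascript
// POST `fields` to `actionUrl` without navigating the current page.
// Returns a cleanup function that removes the form and the iframe.
// Names are hypothetical; pass the page's `document` as `doc`.
function postViaHiddenIframe(doc, actionUrl, fields) {
  const frameName = "post-target-" + Date.now();

  // Hidden iframe that will receive the server's response.
  const iframe = doc.createElement("iframe");
  iframe.name = frameName;
  iframe.style.display = "none";
  doc.body.appendChild(iframe);

  // Form whose submission is directed into that iframe.
  const form = doc.createElement("form");
  form.method = "POST";
  form.action = actionUrl;
  form.target = frameName;
  form.style.display = "none";

  // One hidden input per field; values can be far longer than a URL allows.
  for (const [name, value] of Object.entries(fields)) {
    const input = doc.createElement("input");
    input.type = "hidden";
    input.name = name;
    input.value = value;
    form.appendChild(input);
  }

  doc.body.appendChild(form);
  form.submit();

  // Call this once the POST has had time to complete.
  return function cleanup() {
    doc.body.removeChild(form);
    doc.body.removeChild(iframe);
  };
}
```

In a bookmarklet this could be called as postViaHiddenIframe(document, "https://example.com/save.php", { url: location.href, html: document.documentElement.outerHTML }), with cleanup scheduled through setTimeout. The page cannot read the cross-origin response inside the iframe, but the data does reach the server.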