How does Google Custom Search Engine solve XSS? - javascript

I have been thinking about building a service that uses an approach similar to the one used by Google CSE (https://developers.google.com/custom-search/docs/js/rendering). I have not been able to understand how Google gets around the cross-site scripting (XSS) restrictions. Is it because they host the JS file that they are able to write into the DIV? Are they using CORS headers? Please share your input if you have experience with this pattern.

It uses a combination of same-origin requests and JSONP. It requests www.googleapis.com/customsearch/v1element and www.google.com/uds via <script> tags, and those scripts are then allowed to request data from www.googleapis.com and www.google.com respectively.
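To make the JSONP part of that answer concrete, here is a minimal sketch of a generic JSONP loader. It is not Google's actual code: the `callback` query parameter name, the `jsonpCallback` prefix, and both function names are assumptions chosen for illustration. The `jsonp` function only works in a browser, while `buildJsonpUrl` is plain URL handling.

```javascript
// Build the URL for a JSONP request. The query string carries the name of a
// global function that the remote server is expected to call with the data.
// The parameter name "callback" is an assumption; services differ.
function buildJsonpUrl(baseUrl, params, callbackName) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  url.searchParams.set("callback", callbackName);
  return url.toString();
}

let jsonpCounter = 0;

// Browser-only: inject a <script> tag and resolve with whatever the remote
// script passes to the temporary global callback.
function jsonp(baseUrl, params) {
  return new Promise((resolve, reject) => {
    const callbackName = "jsonpCallback" + jsonpCounter++;
    const script = document.createElement("script");
    const cleanup = () => {
      delete window[callbackName];
      script.remove();
    };
    window[callbackName] = (data) => {
      cleanup();
      resolve(data);
    };
    script.onerror = () => {
      cleanup();
      reject(new Error("JSONP request failed"));
    };
    script.src = buildJsonpUrl(baseUrl, params, callbackName);
    document.head.appendChild(script);
  });
}
```

Because the response is executed as a script in the embedding page, this pattern only makes sense when the page fully trusts the remote host.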

Related

Browser-based client-side scraping

I wonder whether it's possible to scrape an external (cross-domain) page through the user's IP?
For a shopping comparison site, I need to scrape pages of an e-com site, but several requests from the server would get me banned, so I'm looking for ways to do client-side scraping: that is, request pages from the user's IP and send them to the server for processing.
No, you won't be able to use your clients' browsers to scrape content from other websites using JavaScript, because of a security measure called the same-origin policy.
There should be no way to circumvent this policy and that's for a good reason. Imagine you could instruct the browser of your visitors to do anything on any website. That's not something you want to happen automatically.
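As a rough illustration of what the policy compares, the sketch below checks whether two URLs share an origin, meaning the same scheme, host, and port. The function name `isSameOrigin` and the example URLs are made up for this sketch; the browser performs the real check internally.

```javascript
// Two URLs are same-origin when scheme, host, and port all match.
// URL.prototype.origin serializes exactly those three parts.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Same scheme, host, and (default) port: same origin.
console.log(isSameOrigin("https://shop.example.com/a", "https://shop.example.com/b"));
// Different scheme: cross-origin.
console.log(isSameOrigin("http://shop.example.com/", "https://shop.example.com/"));
// Different host: cross-origin, which is the scraping case described above.
console.log(isSameOrigin("https://compare.example.org/", "https://shop.example.com/"));
```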
However, you could create a browser extension to do that. JavaScript browser extensions can be equipped with more privileges than regular JavaScript.
Adobe Flash has similar security features but I guess you could use Java (not JavaScript) to create a web-scraper that uses your user's IP address. Then again, you probably don't want to do that as Java plugins are considered insecure (and slow to load!) and not all users will even have it installed.
So now back to your problem:
I need to scrape pages of an e-com site but several requests from the server would get me banned.
If the owner of that website doesn't want you to use his service in that way, you probably shouldn't do it. Otherwise you would risk legal implications (look here for details).
If you are on the "dark side of the law" and don't care whether that's illegal or not, you could use something like http://luminati.io/ to use the IP addresses of real people.
Basically browsers are made to avoid doing this…
The solution everyone thinks about first:
jQuery/JavaScript: accessing contents of an iframe
But it will not work in most cases with "recent" browsers (<10 years old)
Alternatives are:
Use the official APIs of the server (if any)
Find out whether the server provides a JSONP service (good luck)
If you are on the same domain, try cross-site scripting (if possible, not very ethical)
Use a trusted relay or proxy (but this will still use your own IP)
Pretend you are a Google web crawler (why not, but not very reliable and no guarantees)
Use a hack to set up the relay / proxy on the client itself. I can think of Java or possibly Flash (will not work on most mobile devices, slow, and Flash has its own cross-site limitations too)
Ask Google or another search engine for the content (you might then have a problem with the search engine if you abuse it…)
Do the job yourself and cache the answers, to take load off their server and decrease the risk of being banned.
Index the site yourself (with your own web crawler), then use your own index of the website (how well this works depends on how often the source changes)
http://www.quora.com/How-can-I-build-a-web-crawler-from-scratch
[EDIT]
One more solution I can think of is going through a YQL service. This is a bit like using a search engine or a public proxy as a bridge to retrieve the information for you.
Here is a simple example of doing so. In short, you get cross-domain GET requests.
Have a look at http://import.io, they provide a couple of crawlers, connectors and extractors. I'm not sure how they get around bans, but they do somehow (we have been using their system for over a year now with no problems).
You could build a browser extension with artoo.
http://medialab.github.io/artoo/chrome/
That would allow you to get around the same-origin policy restrictions. It is all JavaScript and on the client side.

Iframes cannot display https on http site, breaks Google API

Information
I am writing a piece of code that will be going up on a web server that I do not have control over. This web server does not have https. In this code I use the Google JavaScript API. When I put in the example code with all the correct API keys, client IDs and whatnot, I get a "Protocols must match" error on an iframe it tries to create to get OAuth2 information.
This "Protocols must match" error is of course caused by the fact that the web server is http and the OAuth2 URL it is using is https.
Main Question
Is there any way to use Google APIs on a server that does not have https? Is it possible to shut off this security feature and make the https OAuth2 iframe work on an http server?
Note:
The Google API is creating the iframe that is giving me problems.
You shouldn't be creating your own iframe. The Google JS library takes care of this for you.

Javascript API hindered by Cross Domain API calls

I need to provide functionality similar to "Share with Facebook" for my social networking site. Facebook uses nested iframes and also xd_receiver concepts. I want to write a JavaScript API (a JS file hosted on my domain), which can be used by different sites to call my web server APIs in order to share, post or recommend on my social networking site. I have a few questions -
Even though I provide the JS API, and different sites load the JS file using the source, if any API call is made, will it again be a cross-domain call (if I am understanding correctly) and be rejected by the server?
How to overcome such situation?
Is there any other better mechanism to implement this functionality?
Please suggest so that I can proceed with the implementation.
I think the default way is to use JSONP to get around the cross-domain limitation. http://en.wikipedia.org/wiki/JSONP. It might require a change in your API, though. A user requests your API through the src of a script tag, passing in the name of a callback function. Your API then returns a script that passes your JSON response to the specified function.
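On the API side, the change mostly amounts to wrapping the JSON in a call to the requested function. The helper below is a hypothetical sketch of that step (`toJsonpBody` is not part of any library); it also rejects callback names that are not plain identifiers, since echoing an arbitrary string into a script would let a caller inject code.

```javascript
// Turn a JSON-serializable value into a JSONP response body.
// The result should be served with Content-Type: application/javascript.
function toJsonpBody(callbackName, data) {
  // Accept only identifiers, optionally dotted (e.g. "MyApi.onShare"), so the
  // callback parameter cannot smuggle arbitrary script into the response.
  const identifier = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;
  if (!identifier.test(callbackName)) {
    throw new Error("Invalid callback name");
  }
  return `${callbackName}(${JSON.stringify(data)});`;
}

// A request such as /share?callback=handleShare would produce:
console.log(toJsonpBody("handleShare", { ok: true }));
```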
Do you know why they use iframes and not simple get requests with JSONP/Images/scripts?
The answer is security. I cannot write a script that clicks their button which will automatically "like" the page.
Using plain old JavaScript with JSONP would allow the developer to click the button automatically. Do you want that to happen?
The requests are made by the browser, not by the JS file, so they will be cross-domain every time they are made from a site on another domain.
Your server will only reject cross-domain requests if you implement referrer validation.
And you can use JSONP if your API needs custom contents from your site...
To allow cross domain requests, you need to set the following Header in your HTTP Response:
Access-Control-Allow-Origin: *
The implementation will vary depending on the back-end you are using.
If the host in the Origin header of the request is anything but the host of the request, the response must include the listed Origin in the Access-Control-Allow-Origin header. Setting this header to * will allow all origins.
For very specific information on cross origin resource sharing see http://www.w3.org/TR/cors/. If you're not big on reading w3c documents, check out MDN's primer.
Note: Internet Explorer does its own thing with regards to cross domain requests. This answer is a good start if you have issues with IE.
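As one possible shape for this on a Node.js back-end, the sketch below computes the header from the request's Origin and an allow-list. The names `corsHeaders`, `handleRequest`, and `ALLOWED_ORIGINS`, as well as the example origin, are assumptions for illustration; other back-ends set the same header through their own APIs.

```javascript
// Hypothetical allow-list of sites that may call the API from the browser.
const ALLOWED_ORIGINS = ["https://customer.example"];

// Decide which CORS headers to send for a given request Origin.
// Passing "*" as the allow-list opens the resource to every origin.
function corsHeaders(requestOrigin, allowedOrigins) {
  if (allowedOrigins === "*") {
    return { "Access-Control-Allow-Origin": "*" };
  }
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    // Echo the caller's origin; Vary tells caches the response depends on it.
    return { "Access-Control-Allow-Origin": requestOrigin, Vary: "Origin" };
  }
  // No header: the browser will refuse to expose the response to the caller.
  return {};
}

// Request handler in the (req, res) shape used by Node's http module.
function handleRequest(req, res) {
  const headers = corsHeaders(req.headers.origin, ALLOWED_ORIGINS);
  res.writeHead(200, { "Content-Type": "application/json", ...headers });
  res.end(JSON.stringify({ ok: true }));
}
```

In a Node.js program, `handleRequest` could be passed to `http.createServer` from the built-in `http` module.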

Call JavaScript function from remote server

I tried to call a JavaScript function that exists on one server (server1) from another server (server2) and I got this error:
Unsafe JavaScript attempt to access frame with URL https://server1/ from frame with URL https://server2/ . Domains, protocols and ports must match.
I used JSP, Java, JavaScript and Tomcat 7. Is there any way to solve this problem? Any help will be appreciated.
Yes, you must add a cross-origin rule to the response header of your JavaScript file that allows access from your other server.
Otherwise, your browser doesn't let you do that.
You can look at the answer to this question: XmlHttpRequest error: Origin null is not allowed by Access-Control-Allow-Origin
It should tell you how to do it.
Take a look at easyXDM - it provides an RPC feature allowing you to call methods across the Same Origin Policy.
Take a look at one of the demos here
As described you are subject to the Same Origin Policy, this is designed to protect users.
Google have a good write-up: http://code.google.com/p/browsersec/wiki/Part2.
There are several typical approaches to working around this:
jQuery has a getJSON or JSONP type of function. Most other JS libraries have something similar. They use a dynamic script tag, suitable for GET requests to other domains.
Create a servlet on domain1 that proxies to domain2 - this allows unrestricted HTTP methods and use of XMLHttpRequest.
I've not tried http://easyxdm.net/wp/
There are improvements coming, like cross document messaging in HTML5
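For the cross-document messaging approach, a minimal sketch of the receiving side might look like the following. It assumes the page on server1 is loaded in a frame by the page on server2 and exposes a small set of callable functions; the `{ method, args }` message shape and the names `createMessageHandler` and `greet` are invented for this example and are not a standard.

```javascript
// Build a "message" event listener that lets one trusted origin call a
// fixed set of functions and sends each result back to the caller.
function createMessageHandler(trustedOrigin, methods) {
  return function onMessage(event) {
    // Ignore messages from any origin we do not explicitly trust.
    if (event.origin !== trustedOrigin) return;
    const data = event.data || {};
    // Only call functions that were deliberately exposed.
    if (!Object.prototype.hasOwnProperty.call(methods, data.method)) return;
    const args = Array.isArray(data.args) ? data.args : [];
    const result = methods[data.method](...args);
    // Reply only to the trusted origin, never to "*".
    event.source.postMessage({ method: data.method, result }, event.origin);
  };
}

// Browser-only wiring on the server1 page.
if (typeof window !== "undefined") {
  window.addEventListener(
    "message",
    createMessageHandler("https://server2", {
      greet: (name) => "Hello " + name,
    })
  );
}
```

The calling page on server2 would send `{ method: "greet", args: ["Ann"] }` with `frame.contentWindow.postMessage(message, "https://server1")` and listen for the reply in its own `message` handler.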

Avoid x-domain solutions

I'm currently working on a web application that customers can add to their webpages by adding a JavaScript link to a JS file on my server. The application reads all the JavaScript files from my server, but I still get an error when trying to use Ajax to get data from my database. I didn't think that would be a problem because the files are on my server.
Can I fix this or do I have to make a cross-domain solution? I don't have any control over the customers' servers.
Thanks in advance
Mikael
This is not possible: When you execute a remote script, it runs in the context of the containing document.
There are some popular workarounds for this:
Using an iframe, which fixes the cross-domain problem but doesn't integrate well with the remote site (e.g. no custom styling)
Using JSONP to make cross-domain Ajax requests (detailed explanation here)
Using a server-side proxy script (not an option in this scenario)
Using YQL (I'm not familiar with this but it's said to work)
The same origin policy is based on the host document not the script itself.
You need to use a cross domain ajax technique.
