Assume a substantial, MIT-licensed, open-source Node.js web application.
I'd like to establish confidence that the application does not leak information over the internet. One approach would be to read every single line and make sure I understand it, but this sort of detailed code review feels like overkill.
As the application is behind a NAT/firewall, it may be sufficient to establish that the implementation does not initiate any remote connections. It is important to consider the obvious direct initiation of connections (TCP/HTTP/HTTPS/FTP/SSH, etc.) both in the source itself and in any of the dependencies... and any by indirect means - for example, involving the client web browser during operation.
I'm not really worried if the application is insecure in the sense that its access controls are fallible from the network on which access is provided... though I don't object to a heads-up about any clear flaws.
Are there tools that make this sort of assessment straightforward? Would a sandboxing approach be viable - and, if so, what mechanism would be suitable to create such a Node sandbox?
This question is surprisingly broad. You seem to be asking:
Are there any "trojan horses" embedded in my Node.js application or its npm dependencies? Did the developers of the code sneak in any malicious code that might exfiltrate my data?
Is my Node.js web interface secure against cybercreeps?
Is my server (hosting my Node.js app) secure against cybercreeps?
The first rule of information security is this: Nobody can steal information you don't store. If you don't need it, don't store it.
About the trojan horse situation, I have these suggestions about code:
Inspect your own code to the extent you can. Looking at the require() lines of your code is a good start; it alerts you to your own modules that might use outbound networking (a small scanning sketch follows this list).
Use npm audit to take advantage of crowdsourced inspection of your dependencies.
If your source code is on GitHub, they'll do some of that npm audit work for you and pester you with emails about vulnerabilities.
Rework your nodejs web app to use the hapi.js framework instead of what you use now. Developed by paranoiacs at Walmart, it has zero external dependencies.
If you have the time and/or money, use a static code analysis tool to inspect your code: SonarQube, Checkmarx, etc.
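For the require() inspection suggested above, a quick mechanical pass helps before any hand-reading. Here is a minimal sketch, assuming a plain directory tree of .js files; the module list is my own choice, and a grep like this will miss dynamic or obfuscated requires:

// scan.js - naive scan for requires of Node's networking modules.
// Usage: node scan.js <directory>
const fs = require('fs');
const path = require('path');

const suspects = ['http', 'https', 'net', 'tls', 'dgram', 'dns', 'child_process'];

function walk(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) walk(full);
    else if (entry.name.endsWith('.js')) {
      const src = fs.readFileSync(full, 'utf8');
      for (const mod of suspects) {
        if (src.includes(`require('${mod}')`) || src.includes(`require("${mod}")`)) {
          console.log(`${full}: requires ${mod}`);
        }
      }
    }
  }
}

walk(process.argv[2] || '.');

Point it at your own source first, then at node_modules if you have the patience.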
About rigging up your servers to slow down exfiltration of data:
Monitor or log the outbound traffic from your server and inspect the logs; this may help (see the sketch after this list).
Set up your outbound firewall rules to disallow traffic that's not part of your application; Digital Ocean has a tutorial on this. Careful: you can overdo this and break stuff you need.
Keep your sensitive data (in a dbms, maybe) on a separate machine from your web applications. Set up access for that machine with a whitelist that only allows your web applications, and nobody else, to connect to it.
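On the monitoring point above, you can also watch outbound connections from inside the Node process itself. This is a hedged sketch of a preload script that monkey-patches net.Socket.prototype.connect; it is a tripwire for honest code, not a sandbox, since malicious code can undo or bypass the patch:

// log-connects.js - preload with: node --require ./log-connects.js app.js
// Logs every outbound TCP connection the process attempts.
const net = require('net');

const originalConnect = net.Socket.prototype.connect;
net.Socket.prototype.connect = function (...args) {
  // The first argument is either an options object or a port number.
  console.error('[outbound connect]', JSON.stringify(args[0]));
  return originalConnect.apply(this, args);
};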
About whether your web interface can repel cybercreeps while allowing legitimate users:
Read up on anti-cybercreep packages like helmet. Understand them and then use them (a minimal example follows this list).
Consider attacking your web app every few months with a white-hat hacker's tool like Burp Scanner. That tool costs money; its developers routinely update it to add tests for newly discovered vulnerabilities. It detects stuff like the OWASP Top Ten vulnerabilities and more.
Pay attention to your web server logs. (And be aware that web servers facing the public net get many many probes from around the world to see if they're vulnerable.)
Check the security of your TLS (https) with Qualys's SSL Server Test.
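As a minimal sketch of the helmet suggestion above, assuming an Express app (helmet's defaults set a batch of security headers in one call):

const express = require('express');
const helmet = require('helmet');

const app = express();
app.use(helmet()); // sets X-Frame-Options, Strict-Transport-Security, and friends

app.get('/', (req, res) => res.send('hello'));
app.listen(3000);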
About whether your server machines can repel cybercreeps while allowing legitimate administrative access:
Promptly apply security updates from your operating system vendor.
Lock down your machines. Don't run any services you don't need. Turn off, for example, smtp, ftp, dns, ntp, and nfs services on a web server (the services, not the clients for those services). These days this is fairly easy because bare-bones server editions of operating systems don't come with any of that stuff installed or running.
Use nmap to check your servers for open ports, routinely. One of Digital Ocean's excellent tutorials describes this. For a web server, only the ssh and https ports should be open for the safest setup.
Make your public-facing servers "sacrificial". That is, make it so you can build a new copy of the server and its software load from scratch at a moment's notice. That way, if a cybercreep does break into your server, you can just burn it down and start up another one, forcing the creep to start over.
Did I mention? Promptly apply security updates from your operating system vendor.
Maybe you can try SonarQube.
I have no experience installing it, but my workplace uses it for checking code quality.
It shows which lines of code are problematic and gives suggestions on how to fix them.
I remember seeing something related to OWASP in SonarQube, but I have not given it a try yet.
I work with a team whose only way to get a user into their company's database is to navigate through and fill out five or so pages of web forms in their browser. Truly brutal stuff. I've developed web automation scripts in VBScript, Java (with Selenium WebDriver), and iMacros, but all of these solutions are slow. They also depend on the browser, which I'm trying to move away from.
I'm looking for a new platform, possibly some scripting technique/language that will allow me to issue HTTP requests and read HTTP responses, then build my script around there. The script would perform calculations on the HTTP responses, use File I/O and use this data to issue further HTTP requests. Again, I'm just spitballing here. If anyone else has a better solution, I'm all ears!
My question for you is: Accepting the team's limitations (read-only DB access), how would you approach a solution and what tools/languages/platforms would you use to do so?
Broad and ambiguous answers are welcome. Thank you for your time.
I agree with @Grisk on using Node.js/io.js as a platform. It is a powerful tool designed from the ground up for I/O, making it perfect for solving your problem. Additionally, the Node community is extraordinarily vibrant, with npm, the Node.js package manager, hosting thousands of easily accessible modules. To avoid any future confusion: don't mistake Node.js for a language or a backend framework; it is a JavaScript runtime built atop Google's V8 engine, together with a set of built-in modules for building powerful I/O applications. Read up about Node online.
As for your specific problem, I'd say you have two options:
Feign being a browser by crafting the cookies and headers yourself.
Programmatically navigate through the website as you have been doing.
As for the former option, you'd need to manually determine which cookies are sent to the server when forms are submitted on each page, and then in your script generate those cookies and include them in the HTTP request. Check out the Node.js http documentation for more information on customizing the headers of requests.
Your headers will need to look something like this:
var headers = {
  'Host': '<website host address here>',
  'Origin': '<website origin here>',
  'Referer': '<website origin here>',
  'User-Agent': 'Opera/9.52 (X11; Linux i686; U; en)',
  'Cookie': '<cookie sent over by the server here>'
};
I recently came across the node-icloud library, which uses the first method I describe above to provide programmatic access to one's iCloud account. I strongly suggest reading through its code to see how it works.
Additionally, I'd suggest reading up on HTTP headers.
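Putting those pieces together, here is a hedged sketch of submitting one of those form pages with Node's core http module; the host, path, field names, and cookie value are placeholders for whatever your site actually expects:

const http = require('http');
const querystring = require('querystring');

const body = querystring.stringify({ firstName: 'Jane', lastName: 'Doe' });
const sessionCookie = 'JSESSIONID=placeholder'; // captured from an earlier Set-Cookie response header

const req = http.request({
  host: 'intranet.example.com',
  path: '/users/new/page3',
  method: 'POST',
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': Buffer.byteLength(body),
    'Cookie': sessionCookie,
  },
}, (res) => {
  let data = '';
  res.on('data', (chunk) => (data += chunk));
  res.on('end', () => console.log(res.statusCode, data.length, 'bytes'));
});

req.on('error', console.error);
req.write(body);
req.end();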
For the second option, check out PhantomJS and Zombie.js. Phantom is nice because it runs headlessly, without a visible browser. I'm not sure how the speed of these two libraries compares to what you have already been doing, but they are worth testing out.
One last thing: I would recommend building a custom JSON-based DSL for automating interaction with webpages, so you can very easily redesign your browser-interaction workflows.
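To make the DSL idea concrete, here is a hypothetical sketch; the step shape is invented, but the point is that the workflow becomes plain data a small runner can replay, so rearranging pages means editing JSON rather than HTTP plumbing:

// A five-page form workflow as data; a runner maps each step to a request.
const createUserWorkflow = [
  { method: 'GET',  path: '/login' },
  { method: 'POST', path: '/login',       fields: { user: 'jdoe', pass: 'secret' } },
  { method: 'POST', path: '/users/step1', fields: { firstName: '{{firstName}}' } },
  { method: 'POST', path: '/users/step2', fields: { department: '{{department}}' } },
];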
Additionally, if you choose to use Node.js, an understanding of Node streams and the details behind its event loop would be beneficial.
Best of luck!
I would start looking into Node.js as a platform. Its http module is incredibly powerful for writing applications that need to make multiple HTTP requests with unusual structure, and it can communicate easily with a browser or basically anything else you could possibly need. Look at the fs module if you need to do file I/O.
If you wanted to get really fancy and use websockets to build a dynamic webapp that you can use as a front-end for your tool, you could even do that, so there's a lot of flexibility.
I can build all sorts of web applications with common web technologies on both the client and server (JavaScript, PHP, CFML, etc.).
I would like to build some home automation tools and I have no idea how to get from the strictly digital world to the physical world.
Let's say I want a super simple web app to display a bunch of switches in the user interface for some different things in my house. Let's say I'm using X10 hardware (http://www.x10.com/x10-basics.html) that is "listening" for some radio signal.
Is there a way to use web technology to "instruct" my devices (smartphone, tablet, laptop, whatever) to "broadcast signals" to these X10 (or any) physical device in order to make my home more Jetsons-like?
It seems like JavaScript couldn't do any of this because of security stuff, but perhaps a server app running on my local device on my home network could tie into some underlying OS library and do this?
wirelessService = new system.os.superCoolWirelessBroadcasterService();
wirelessService.broadcastSignal("6520 MHz", true); // toaster frequency
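To make that concrete, here is the kind of sketch I'm imagining: a tiny Node server on my home network that shells out to a vendor CLI when the web UI hits an endpoint (heyu is a real command-line X10 controller, but the endpoint and device code here are placeholders):

const http = require('http');
const { execFile } = require('child_process');

http.createServer((req, res) => {
  if (req.url === '/switch/toaster/on') {
    // heyu talks to an X10 interface on a serial port; 'a1' is a house/unit code.
    execFile('heyu', ['on', 'a1'], (err) => {
      res.end(err ? 'failed: ' + err.message : 'toaster on');
    });
  } else {
    res.end('unknown switch');
  }
}).listen(8080);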
All my mobile stuff is an HTML5 front end with a self-hosted ASP.NET Web API backend. I use an HTTPS proxy application for security. But I run my stuff on an intranet. It's very easy in my opinion and very rewarding.
Here are a couple of videos:
https://www.youtube.com/watch?v=_2_JSbEytnM
https://www.youtube.com/watch?v=zOhOEWoED4M
I also integrated Google Glass, which is an app:
https://www.youtube.com/watch?v=vLmPJ9xvfs0
Here you can find a complete listing:
https://www.youtube.com/results?search_query=nick+tullos+home+automation
Here is some of the source code:
https://github.com/NickTullos/CrestJson
Good luck!
You absolutely can. I created several automated processes with ColdFusion. Look at the scheduled tasks section of the ColdFusion administrator.
Many specialized tools, barcode generation or scanner software for example, can be driven through third-party DLLs on Windows with ColdFusion (nothing is perfect, mind you); some even required us to extend Internet Explorer via ActiveX controls. These projects included warehouse housekeeping tools, three-dimensional boxing interfaces, shipped-product checks, payment authorization and refund switches, warehouse scale interfaces, and U.S. Mail/Endicia/UPS manifest generation.
Nowadays, I do many automated import processes with third-party source data: formatted CSV or Excel files sent via FTP, where I scan for and pick up the files for processing.
We also parse raw data from a power inverter and create graphs for review and other statistically useful things for a client. This was not an easy task, because there are things in that technology that I was not equipped for and had to learn (power-inverter speak). Also, the shorthand their technologists used to name data points made some sense to them, but it was immensely obscure and not easy to translate.
I will tell you that one of the hardest interfaces I worked with was a 1996 serial-port-based warehouse scale that we got after the DHL bankruptcy. I thought I would lose my mind. There were baud settings like older modems, and if there was a failure it didn't do anything (no error, nothing).
So consider that interfacing with obscure real-world hardware may or may not be feasible.
ColdFusion is very good at automation because it is a dynamic language with an easy-to-use administrative backend that can access deeper things via Java objects and native .NET support (so anything is possible)!
I have been using SilkJS for a few hobby projects of mine. So far, the performance is amazing, and I absolutely love being able to use JavaScript for both the front-end and the back-end. I am thinking about using it in some commercial projects, but I want to do my due diligence on the viability of such a decision. There are some questions I have, and would like some insight into.
1) What enterprise projects, if any, do you guys know using SilkJS?
2) What resources are available regarding the security of using SilkJS as a web-server, or other V8 based solutions? (history of vulnerabilities, average time to patch, etc).
3) What pitfalls have you guys faced with using SilkJS or other V8 based solutions as a web-server, and how, if possible, have you dealt with it?
4) Does SilkJS handle horizontal scaling well (distributing load across multiple servers)? Is your answer based on theoretical calculations, and field-tested examples?
5) What resources are you aware of regarding the building of a website using SilkJS as the web-server, besides the official website itself?
Before responding, let me first eliminate 80% of the responses I will get with the following constraints:
1) No, I will not use NodeJS. For both business and mental-health reasons, asynchronous call-back frenzied programming is not something I will use. Do not attempt to convince me that I will "get used to it and love it". It's not optimal for the type of projects I am working on. Yes, you heard me - asynchronous is not perfect for everything.
2) I am aware that synchronous programming can be simulated in NodeJS. No, I am not interested in that either. I am not using NodeJS - get over it.
3) I am fully aware that most applications are I/O bound and not CPU bound. As a result, yes, using PHP is usually fine. However, there are certain projects for which CPU optimizations do yield a sizeable return on investment. No, a company does not have to be Facebook for this to be true. This is not intended to be a discussion on "why PHP is okay". It is an exploration of the reliability of SilkJS for more commercial projects.
4) Yes, I know what Java is. No, I am not interested in why that would be great if I wanted to reduce the CPU bottleneck. Once again, this is not intended to be a discussion on "why other languages are okay". It is an exploration of the reliability of SilkJS and V8-based server-side solutions for more enterprise projects.
5) Yes, it is possible to have a best answer to this question. Whoever makes the best case for or against the use of SilkJS in an enterprise environment gets the correct-answer vote.
Also, I am aware that despite my desire to avoid NodeJS, it does utilize V8. In that regard, I am open to security reviews and stability reviews for V8 on the server-side within the context of usage via NodeJS.
As for what I mean by "enterprise", think e-commerce sites with several hundred thousand hits per month, and/or applications for which stability and up-time are essential and which have hundreds of thousands of users.
My goal here is not to bash SilkJS. I absolutely love it, and will continue using it when possible. However, as a professional programmer, I can't just use what I enjoy for every project. So, let the insight commence...
SilkJS should scale exactly like you would scale Apache+PHP: a load balancer in front of a farm of SilkJS servers. Scale a MySQL backend as you already know how.
SilkJS does not do GZIP or SSL. I think it would be a risk to trust an implementation of either or both of those in the wild, against all the various bots (hacker or otherwise), spiders, browsers, custom Perl programs, etc. You can trivially put Apache in front of SilkJS as a reverse proxy to provide those functions.
In fact, you can shard your server-side application and use Apache as a reverse proxy to connect to the proper shard based upon the URL requested.
I think if you post any security or other issues to the SilkJS google group, you will see a patch posted to the github repository in a timely manner.
Other than the SilkJS.net site, you might look for various repos on github that have example programs using SilkJS.
http://www.sencha.com/blog/discover-music-with-sencha-touch-2
That article discusses how Modus Create built an application for NPR using Sencha Touch as the front end and SilkJS as the back end. It says:
"The SilkJS servers are hosted on Amazon’s EC2 cloud, behind a load balancer for both speed and redundancy. Both SilkJS hosts are fed by the NPR API via cURL and are responsible for trimming more than 300KB out of the data package, bringing the average load to less than 200KB before being gzipped for transmission!"
For a couple recent projects on our corporate intranet, I have used a very simple stack of nginx + redis + webdis + client-side javascript to implement some simple data analysis tools. The experience was absolutely wonderful, especially compared to my previous experience with other stacks (including custom c++, apache/mod_perl, ASP.Net MVC, .Net HttpListener, Ruby on Rails, and a bit of Node.js). Given the availability of client-side templating tools and frontend libraries such as jquery-ui, it seems that I could happily implement much more complicated web-apps using such a no-server-side-code stack (perhaps substituting/augmenting redis with couchdb if warranted)...
The major limitation of this stack, of course, is that my database is directly exposed to the network - acceptable in this case on a firewalled corporate network, but not really an option if I wanted to use the same techniques on the internet. I need to have some level of server-side logic to securely handle authentication and user-role management.
Are there any best practices or common development stacks for this? Ideally I'd like something that is lightweight, and gives me a simple framework for filtering the client-side requests through my custom user-role logic before forwarding them on to the database back-end. I'm not interested in any sort of server-side templating, or ActiveRecord-style storage-level abstractions.
I can't comment on a framework.
You've already mentioned the primary weakness of this, especially on the internet, that being security. The problem there is not just authentication. The problem there is essentially the openness of the client, in this case the web browser, and the protocol, notably HTTP using JSON or XML or some other plaintext protocol.
Consider one example. It's quite simple. Imagine an HTTP service that takes an SQL query and returns a collection of JSON representing the rows. This is straightforward to write. You could probably pound out a nascent one in less than an hour from scratch using any tool that gives you SQL access to an RDBMS.
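To make the danger concrete, here is a hedged sketch of that service, assuming the sqlite3 npm package. It is exactly as easy, and exactly as dangerous, as described; treat it as a demonstration only:

// DO NOT DEPLOY: executes whatever SQL the client sends.
const http = require('http');
const sqlite3 = require('sqlite3');

const db = new sqlite3.Database('app.db');

http.createServer((req, res) => {
  let sql = '';
  req.on('data', (chunk) => (sql += chunk));
  req.on('end', () => {
    db.all(sql, (err, rows) => { // the client is the query author
      res.setHeader('Content-Type', 'application/json');
      res.end(JSON.stringify(err ? { error: err.message } : rows));
    });
  });
}).listen(8080);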
Arguably, back in the Golden Days of client-server development, this is exactly what folks did; only instead of some data tunneled over HTTP, folks used a DB-specific driver and sent SQL text directly to the back-end DB.
The problem today is that the protocols are too open. If you implemented that SQL service mentioned above, you would essentially turn your entire application into a SQL injection vector.
You simply cannot secure something like that in the wild. The protocol is open to trivial observation (effectively, every browser today comes with a built-in packet sniffer), along with all of the source code for the application. If you try to encrypt the data, that's all done on the client as well - with the source of the process, as well as any keys involved.
CouchDB, for example, can not be secured this way. If someone has rights to the server, they have rights to all of the data. ALL of the data. The stuff you want them to see, and the stuff you don't.
The solution, naturally, is a service layer: something that speaks at a higher level than simply raw data streams, something that can be secured and can keep secrets from the clients. But that, naturally, takes server-side programming to enable, and it's ostensibly more work, more layers, more data conversion, more of a pain.
Back in the day, folks would write entire systems using nothing but stored procedures in the DB. The procedures would have rights that the users invoking them did not, so you could limit at the server what a user could or could not see or change. You could give them unlimited SELECT capability on a restricted view, perhaps, while a stored procedure would have the rights to actually change data or access some of the hidden columns.
Stored procedures have mostly been replaced by application layers and application servers, with the DB being more and more relegated to "dumb storage". But the concepts are similar.
There's value in some scenarios in publishing data straight to the web, like your analytics example. That's a specific, read-heavy niche. But beyond that, the concept doesn't work well, I fear. Obfuscated JS is hard to read, but not secure.
This is likely why you may have a little difficulty locating such a framework (I haven't looked at all, myself).
I have an application whose primary function works in real time, through websockets or long polling.
However, most of the site is written in a RESTful fashion, which is nice for applications and other clients in the future. I'm thinking about transitioning to a WebSocket API for all site functions, away from REST. That would make it easier for me to integrate real-time features into all parts of the site. Would this make it more difficult to build applications or mobile clients?
I found that some people are already doing stuff like this: SocketStream
Not to say that the other answers here don't have merit; they make some good points. But I'm going to go against the general consensus and agree with you that moving to WebSockets for more than just realtime features is very appealing.
I am seriously considering moving my app from a RESTful architecture to more of an RPC style via websockets. This is not a "toy app", and I'm not talking about only realtime features, so I do have reservations. But I see many benefits in going this route and feel it could turn out to be an exceptional solution.
My plan is to use DNode, Socket.IO, and Backbone. With these tools, my Backbone models and collections can be passed from/to client and server by simply calling functions RPC-style. No more managing REST endpoints, serializing/deserializing objects, and so forth. I haven't worked with SocketStream yet, but it looks worth checking out.
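To illustrate the RPC style, here is a hedged sketch; the function name and payload are invented, but this is the shape of the DNode API:

// Server: expose functions instead of REST endpoints.
const dnode = require('dnode');

const server = dnode({
  saveTask: function (attrs, cb) {
    // persist the Backbone model's attributes, then acknowledge
    cb(null, Object.assign({ id: 42 }, attrs));
  },
});
server.listen(7070);

A client connects and invokes remote.saveTask(model.toJSON(), callback) as if it were local; the serialization and transport disappear from application code.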
I still have a long way to go before I can definitively say this is a good solution, and I'm sure it isn't the best solution for every application, but I'm convinced that this combination would be exceptionally powerful. I admit that there are some drawbacks, such as losing the ability to cache resources. But I have a feeling the advantages will outweigh them.
I'd be interested in following your progress exploring this type of solution. If you have any github experiments, please point me at them. I don't have any yet, but hope to soon.
Below is a list of to-read-later links that I've been collecting. I can't vouch that they are all worthwhile, as I've only skimmed many of them. But hopefully some will help.
Great tutorial on using Socket.IO with Express. It exposes express sessions to socket.io and discusses how to have different rooms for each authenticated user.
http://www.danielbaulig.de/socket-ioexpress/
Tutorial on node.js/socket.io/backbone.js/express/connect/jade/redis with authentication, Joyent hosting, etc:
http://fzysqr.com/2011/02/28/nodechat-js-using-node-js-backbone-js-socket-io-and-redis-to-make-a-real-time-chat-app/
http://fzysqr.com/2011/03/27/nodechat-js-continued-authentication-profiles-ponies-and-a-meaner-socket-io/
Tutorial on using Pusher with Backbone.js (using Rails):
http://blog.pusher.com/2011/6/21/backbone-js-now-realtime-with-pusher
Building an application with Backbone.js on the client and Node.js with Express, Socket.IO, and DNode on the server:
http://andyet.net/blog/2011/feb/15/re-using-backbonejs-models-on-the-server-with-node/
http://addyosmani.com/blog/building-spas-jquerys-best-friends/
Using Backbone with DNode:
http://quickleft.com/blog/backbone-without-ajax-part-ii
http://quickleft.com/blog/backbone-without-ajax-part-1
http://sorensen.posterous.com/introducing-backbone-redis
https://github.com/cowboyrushforth/minespotter
http://amir.unoc.net/how-to-share-backbonejs-models-with-nodejs
http://hackerne.ws/item?id=2222935
http://substack.net/posts/24ab8c
HTTP REST and WebSockets are very different. HTTP is stateless, so the web server doesn't need to remember anything between requests, and you get caching in the web browser and in proxies. If you use WebSockets, your server becomes stateful and needs to hold a connection to the client on the server.
Request-Reply communication vs Push
Use WebSockets only if you need to PUSH data from the server to the client; that communication pattern is not included in HTTP (only by workarounds). PUSH is helpful if events created by one client need to be made available to other connected clients, e.g. in games where users should react to other clients' behaviour. Or if your website is monitoring something, where the server pushes data to the client all the time, e.g. stock markets (live).
If you don't need to PUSH data from the server, it's usually easier to use a stateless HTTP REST server. HTTP uses a simple Request-Reply communication pattern.
I'm thinking about transitioning to a WebSocket api for all site functions
No, you should not do it. There is no harm in supporting both models. Use REST for one-way communication/simple requests, and WebSockets for two-way communication, especially when the server wants to push real-time notifications.
WebSocket is a more efficient protocol than RESTful HTTP, but RESTful HTTP still scores over WebSocket in the areas below.
Create/Update/Delete operations on resources are well defined for HTTP; with WebSockets you have to implement these operations yourself at a low level (see the sketch at the end of this answer).
WebSocket connections scale vertically on a single server, whereas HTTP connections scale horizontally. There are some proprietary, non-standards-based solutions for WebSocket horizontal scaling.
HTTP comes with a lot of good features such as caching, routing, multiplexing, gzipping, etc. These have to be built on top of WebSockets if you choose them.
Search engine optimization works well for HTTP URLs.
Proxies, DNS, and firewalls are not yet fully aware of WebSocket traffic. They allow port 80 but might restrict traffic by snooping on it first.
Security with WebSockets is an all-or-nothing approach.
Have a look at this article for more details.
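On the first point, here is a sketch of the kind of envelope you end up inventing; the message format is made up, though the ws package is a real WebSocket client/server library for Node:

const WebSocket = require('ws');
const socket = new WebSocket('wss://api.example.com');

socket.on('open', () => {
  socket.send(JSON.stringify({
    action: 'update',        // your hand-rolled stand-in for HTTP PUT
    resource: '/users/42',
    body: { name: 'Ada' },
    requestId: 'c1f3',       // needed to correlate the eventual reply
  }));
});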
The only problem I can see with using TCP (WebSockets) as your main web-content delivery strategy is that there is very little reading material out there about how to design your website architecture and infrastructure using TCP.
So you can't learn from other people's mistakes and development is going to be slower. It's also not a "tried and tested" strategy.
Of course, you're also going to lose all the advantages of HTTP (statelessness and caching being the bigger ones).
Remember that HTTP is an abstraction over TCP designed for serving web content.
And let's not forget that SEO and search engines don't do websockets. So you can forget about SEO.
Personally I would recommend against this as there's too much risk.
Don't use WS for serving websites, use it for serving web applications
However, if you have a toy or a personal website, by all means go for it. Try it, be cutting-edge. For a business or company, you cannot justify the risk of doing this.
I learned a little lesson (the hard way). I made a number crunching application that runs on Ubuntu AWS EC2 cloud services (uses powerful GPUs), and I wanted to make a front-end for it just to watch its progress in realtime. Due to the fact that it needed realtime data, it was obvious that I needed websockets to push the updates.
It started with a proof of concept, and worked great. But then when we wanted to make it available to the public, we had to add user sessions, so we needed login features. And no matter how you look at it, the websocket has to know which user it deals with, so we took the shortcut of using the websockets to authenticate the users. It seemed obvious, and it was convenient.
We actually had to spend quite some time making the connections reliable. We started out with some cheap WebSocket tutorials, but discovered that our implementation was not able to automatically reconnect when the connection was broken. That all improved when we switched to Socket.IO. Socket.IO is a must!
Having said all that, to be honest, I think we missed out on some great Socket.IO features. Socket.IO has a lot more to offer, and I am sure, if you take it into account in your initial design, you can get more out of it. In contrast, we just replaced the old websockets with the websocket functionality of Socket.IO, and that was it (no rooms, no channels, ...). A redesign could have made everything more powerful. But we didn't have time for that. That's something to remember for our next project.
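For what it's worth, the reconnection behaviour that bit us is configurable on the client. These are real socket.io-client options, with illustrative values:

const io = require('socket.io-client');

const socket = io('https://app.example.com', {
  reconnection: true,          // the default, but worth being explicit
  reconnectionAttempts: Infinity,
  reconnectionDelay: 1000,     // start retrying after 1 second...
  reconnectionDelayMax: 30000, // ...backing off to 30 seconds
});

socket.on('connect', () => console.log('connected (or reconnected)'));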
Next we started to store more and more data (user history, invoices, transactions, ...). We stored all of it in an AWS DynamoDB database, and AGAIN we used Socket.IO to communicate the CRUD operations from the front-end to the backend. I think we took a wrong turn there. It was a mistake.
Because shortly after we found out that Amazon's cloud services (AWS) offer some great load-balancing/scaling tools for RESTful applications.
We have the impression now that we need to write a lot of code to perform the handshakes of the CRUD operations.
Recently we implemented Paypal integration. We managed to get it to work. But again, all tutorials are doing it with RESTful APIs. We had to rewrite/rethink their examples to implement them with websockets. We got it to work fairly fast though. But it does feel like we are going against the flow.
Having said all that, we are going live next week. We got there in time, everything works. And it's fast, but will it scale?
I would consider using both. Each technology has their merit and there is no one-size fits all solution.
The separation of work goes this way:
WebSockets would be the primary method for an application to communicate with the server where a session is required. This eliminates many hacks that are needed for older browsers (the catch being that older browsers without WebSocket support will force those hacks anyway).
A RESTful API is used for GET calls that are not session-oriented (i.e. no authentication needed) and that benefit from browser caching. A good example of this would be reference data for drop-downs used by a web application. However, such data can change a bit more often than...
HTML and JavaScript. These comprise the UI of the webapp and would generally benefit from being placed on a CDN.
Web services using WSDL are still the best way of handling enterprise-level and cross-enterprise communication, as they provide a well-defined standard for message and data passing. Primarily you'd offload this to a DataPower device to proxy to your web service handler.
All of this happens over the HTTP protocol, which already gives us secure sockets via SSL.
For the mobile application though, websockets cannot reconnect back to a disconnected session (How to reconnect to websocket after close connection) and managing that isn't trivial. So for mobile apps, I would still recommend REST API and polling.
Another thing to watch out for when using WebSockets vs REST is scalability. WebSocket sessions are still managed by the server. RESTful APIs, when done properly, are stateless (meaning there is no server state that needs to be managed), so scalability can grow horizontally (which is cheaper) rather than vertically.
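To show how cheaply the two models can coexist, here is a hedged sketch of one Node process serving both, with Express and Socket.IO sharing an HTTP server; the endpoints are invented:

const express = require('express');
const http = require('http');
const socketio = require('socket.io');

const app = express();
const server = http.createServer(app);
const io = socketio(server);

// RESTful, cacheable, session-free reference data
app.get('/api/countries', (req, res) => res.json(['DE', 'FR', 'US']));

// Session-oriented real-time push
io.on('connection', (socket) => {
  socket.emit('notification', { text: 'welcome' });
});

server.listen(3000);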
Do I want updates from the server?
Yes: Socket.io
No: REST
The downsides to Socket.io are:
Scalability: WebSockets require open connections and a much different Ops setup to web scale.
Learnin: I don't have unlimited time for my learnin. Things have to get done!
I'll still use Socket.io in my project, but not for basic web forms that REST will do nicely.
WebSocket (or long-polling) based transports mostly serve for (near) real-time communication between the server and client. Although there are numerous scenarios where these kinds of transports are required, such as chat or real-time feeds, not all parts of a web application necessarily need a bidirectional connection to the server.
REST is a resource-based architecture which is well understood and offers its own benefits over other architectures. WebSockets incline more toward real-time streams/feeds of data, which would require you to build some kind of server-side logic to prioritize or differentiate between resources and feeds (in case you don't want to use REST).
I assume there will eventually be more WebSocket-centric frameworks like SocketStream, once this transport is more widespread and better understood/documented in the form of data-type/form-agnostic delivery. However, I think this doesn't mean that it would or should replace REST, since WebSockets offer functionality that isn't necessarily required in numerous use cases and scenarios.
I'd like to point out this blog post, which is, to me, the best answer to this question.
In short, YES.
The post contains all the best practices for this kind of API.
That's not a good idea. The standard isn't even finalized yet, support varies across browsers, etc. If you want to do this now, you'll end up needing to fall back to Flash or long polling, etc. In the future it probably still won't make a lot of sense, since the server has to support leaving connections open to every single user. Most web servers are designed instead to excel at quickly responding to requests and closing them as quickly as possible. Heck, even your operating system would have to be tuned to deal with a high number of simultaneous connections (each connection using up more ephemeral ports and memory). Stick to using REST for as much of the site as you can.