Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
I have an application whose primary function works in real time, through websockets or long polling.
However, most of the site is written in a RESTful fashion, which is nice for application s and other clients in the future. However, I'm thinking about transitioning to a websocket API for all site functions, away from REST. That would make it easier for me to integrate real time features into all parts of the site. Would this make it more difficult to build applications or mobile clients?
I found that some people are already doing stuff like this: SocketStream
Not to say that the other answers here don't have merit, they make some good points. But I'm going to go against the general consensus and agree with you that moving to websockets for more than just realtime features is very appealing.
I am seriously considering moving my app from a RESTful architecture to more of an RPC style via websockets. This is not a "toy app", and I'm not talking about only realtime features, so I do have reservations. But I see many benefits in going this route and feel it could turn out to be an exceptional solution.
My plan is to use DNode, SocketIO, and Backbone. With these tools, my Backbone models and collections can be passed around from/to client and server by simply calling a functions RPC-style. No more managing REST endpoints, serializing/deserializing objects, and so forth. I haven't worked with socketstream yet, but it looks worth checking out.
I still have a long way to go before I can definitively say this is a good solution, and I'm sure it isn't the best solution for every application, but I'm convinced that this combination would be exceptionally powerful. I admit that there are some drawbacks, such as losing the ability to cache resources. But I have a feeling the advantages will outweigh them.
I'd be interested in following your progress exploring this type of solution. If you have any github experiments, please point me at them. I don't have any yet, but hope to soon.
Below is a list of to-read-later links that I've been collecting. I can't vouch that they are all worthwhile, as I've only skimmed many of them. But hopefully some will help.
Great tutorial on using Socket.IO with Express. It exposes express sessions to socket.io and discusses how to have different rooms for each authenticated user.
http://www.danielbaulig.de/socket-ioexpress/
Tutorial on node.js/socket.io/backbone.js/express/connect/jade/redis with authentication, Joyent hosting, etc:
http://fzysqr.com/2011/02/28/nodechat-js-using-node-js-backbone-js-socket-io-and-redis-to-make-a-real-time-chat-app/
http://fzysqr.com/2011/03/27/nodechat-js-continued-authentication-profiles-ponies-and-a-meaner-socket-io/
Tutorial on using Pusher with Backbone.js (using Rails):
http://blog.pusher.com/2011/6/21/backbone-js-now-realtime-with-pusher
Build application with backbone.js on the client and node.js with express, socket.io, dnode on the server.
http://andyet.net/blog/2011/feb/15/re-using-backbonejs-models-on-the-server-with-node/
http://addyosmani.com/blog/building-spas-jquerys-best-friends/
http://fzysqr.com/2011/02/28/nodechat-js-using-node-js-backbone-js-socket-io-and-redis-to-make-a-real-time-chat-app/
http://fzysqr.com/2011/03/27/nodechat-js-continued-authentication-profiles-ponies-and-a-meaner-socket-io/
Using Backbone with DNode:
http://quickleft.com/blog/backbone-without-ajax-part-ii
http://quickleft.com/blog/backbone-without-ajax-part-1
http://sorensen.posterous.com/introducing-backbone-redis
https://github.com/cowboyrushforth/minespotter
http://amir.unoc.net/how-to-share-backbonejs-models-with-nodejs
http://hackerne.ws/item?id=2222935
http://substack.net/posts/24ab8c
HTTP REST and WebSockets are very different. HTTP is stateless, so the web server doesn't need to know anything, and you get caching in the web browser and in proxies. If you use WebSockets, your server is becoming stateful and you need to have a connection to the client on the server.
Request-Reply communication vs Push
Use WebSockets only if you need to PUSH data from the server to the client, that communication pattern is not included in HTTP (only by workarounds). PUSH is helpful if events created by other clients needs to be available to other connected clients e.g. in games where users should act on other clients behaviour. Or if your website is monitoring something, where the server pushes data to the client all the time e.g. stock markets (live).
If you don't need to PUSH data from the server, it's usually easier to use a stateless HTTP REST server. HTTP uses a simple Request-Reply communication pattern.
I'm thinking about transitioning to a WebSocket api for all site functions
No. You should not do it. There is no harm if you support both models. Use REST for one way communication/simple requests & WebSocket for two way communication especially when server want to send real time notification.
WebSocket is a more efficient protocol than RESTful HTTP but still RESTful HTTP scores over WebSocket in below areas.
Create/Update/Delete resources have been defined well for HTTP. You have to implement these operations at low level for WebSockets.
WebSocket connections scale vertically on a single server where as HTTP connections scale horizontally. There are some proprietary non standards-based solutions for WebSocket horizontal scaling .
HTTP comes with a lot of good features such as caching, routing, multiplexing, gzipping etc. These have to built on top of Websocket if you chose Websocket.
Search engine optimizations works well for HTTP URLs.
All Proxy, DNS, firewalls are not yet fully aware of WebSocket traffic. They allow port 80 but might restrict traffic by snooping on it first.
Security with WebSocket is all-or-nothing approach.
Have a look at this article for more details.
The only problem I can using TCP (WebSockets) as your main web content delivery strategy is that there is very little reading material out there about how to design your website architecture and infrastructure using TCP.
So you can't learn from other people's mistakes and development is going to be slower. It's also not a "tried and tested" strategy.
Of course your also going to lose all the advantages of HTTP (Being stateless, and caching are the bigger advantages).
Remember that HTTP is an abstraction for TCP designed for serving web content.
And let's not forget that SEO and search engines don't do websockets. So you can forget about SEO.
Personally I would recommend against this as there's too much risk.
Don't use WS for serving websites, use it for serving web applications
However if you have a toy or a personal websites by all means go for it. Try it, be cutting-edge. For a business or company you cannot justify the risk of doing this.
I learned a little lesson (the hard way). I made a number crunching application that runs on Ubuntu AWS EC2 cloud services (uses powerful GPUs), and I wanted to make a front-end for it just to watch its progress in realtime. Due to the fact that it needed realtime data, it was obvious that I needed websockets to push the updates.
It started with a proof of concept, and worked great. But then when we wanted to make it available to the public, we had to add user session, so we needed login features. And no matter how you look at it, the websocket has to know which user it deals with, so we took the shortcut of using the websockets to authenticate the users. It seemed obvious, and it was convenient.
We actually had to spend quiet some time to make the connections reliable. We started out with some cheap websocket tutorials, but discovered that our implementation was not able to automatically reconnect when the connection was broken. That all improved when we switched to socket-io. Socket-io is a must !
Having said all that, to be honest, I think we missed out on some great socket-io features. Socket-io has a lot more to offer, and I am sure, if you take it in account in your initial design, you can get more out of it. In contrast, we just replaced the old websockets with the websocket functionality of socket-io, and that was it. (no rooms, no channels, ...) A redesign could have made everything more powerful. But we didn't have time for that. That's something to remember for our next project.
Next we started to store more and more data (user history, invoices, transactions, ...). We stored all of it in an AWS dynamodb database, and AGAIN, we used socket-io to communicate the CRUD operations from the front-end to the backend. I think we took a wrong turn there. It was a mistake.
Because shortly after we found out that Amazon's cloud services (AWS) offer some great load-balancing/scaling tools for RESTful applications.
We have the impression now that we need to write a lot of code to perform the handshakes of the CRUD operations.
Recently we implemented Paypal integration. We managed to get it to work. But again, all tutorials are doing it with RESTful APIs. We had to rewrite/rethink their examples to implement them with websockets. We got it to work fairly fast though. But it does feel like we are going against the flow.
Having said all that, we are going live next week. We got there in time, everything works. And it's fast, but will it scale ?
I would consider using both. Each technology has their merit and there is no one-size fits all solution.
The separation of work goes this way:
WebSockets would be the primary method of an application to communicate with the server where a session is required. This eliminates many hacks that are needed for the older browsers (the problem is support for the older browsers which will eliminate this)
RESTful API is used for GET calls that are not session oriented (i.e. not authentication needed) that benefit from browser caching. A good example of this would be reference data for drop downs used by a web application. However. can change a bit more often than...
HTML and Javascript. These comprise the UI of the webapp. These would generally benefit being placed on a CDN.
Web Services using WSDL are still the best way of enterprise level and cross-enterprise communication as it provides a well defined standard for message and data passing. Primarily you'd offload this to a Datapower device to proxy to your web service handler.
All of this happen on the HTTP protocol which gives use secure sockets via SSL already.
For the mobile application though, websockets cannot reconnect back to a disconnected session (How to reconnect to websocket after close connection) and managing that isn't trivial. So for mobile apps, I would still recommend REST API and polling.
Another thing to watch out for when using WebSockets vs REST is scalability. WebSocket sessions are still managed by the server. RESTful API when done properly are stateless (which mean there is no server state that needs to be managed), thus scalability can grow horizontally (which is cheaper) than vertically.
Do I want updates from the server?
Yes: Socket.io
No: REST
The downsides to Socket.io are:
Scalability: WebSockets require open connections and a much different Ops setup to web scale.
Learnin: I don't have unlimited time for my learnin. Things have to get done!
I'll still use Socket.io in my project, but not for basic web forms that REST will do nicely.
WebSockets (or long polling) based transports mostly serve for (near) real-time communication between the server and client. Although there are numerous scenarios where these kinds of transports are required, such as chat or some kind of real-time feeds or other stuff, not all parts of some web application need to be necessarily connected bidirectionally with the server.
REST is resource based architecture which is well understood and offers it's own benefits over other architectures. WebSockets incline more to streams/feeds of data in real-time which would require you to create some kind of server based logic in order to prioritize or differentiate between resources and feeds (in case you don't want to use REST).
I assume that eventually there would be more WebSockets centric frameworks like socketstream in the future when this transport would be more widespread and better understood/documented in the form of data type/form agnostic delivery. However, I think, this doesn't mean that it would/should replace the REST just because it offers functionality which isn't necessarily required in numerous use cases and scenarios.
I'd like to point out this blog post that is up to me, the best answer to this question.
In short, YES
The post contains all the best practices for such kind of API.
That's not a good idea. The standard isn't even finalized yet, support varies across browsers, etc. If you want to do this now you'll end up needing to fallback to flash or long polling, etc. In the future it probably still won't make a lot of sense, since the server has to support leaving connections open to every single user. Most web servers are designed instead to excel at quickly responding to requests and closing them as quickly as possibly. Heck even your operating system would have to be tuned to deal with a high number of simultaneous connections (each connection using up more ephemeral ports and memory). Stick to using REST for as much of the site as you can.
Related
What is the best technology to use if I want to make instant user notification,
like StackOverflow has?
I consider SSE and WebSockets. What are pros and cons of each solution?
Should I use socket.io or better to use WebSockets directly?
The main difference is that with SSE you can only receive messages from the server. You cannot send messages to the server. Everything doable with SSE is doable with WebSockets. But not vice versa - WebSocket is capable of sending data to the server. So from that point of view WebSockets win. I can't really see any advantage of SSE (perhaps performance?), but then again I don't have much experience with it.
Note that StackOverflow uses WebSockets. They might have some fallback for older browser, I don't know about that.
As for the third question: perhaps you should ask what language you want to use in the first place? I've been working with WebSockets and Python and it worked really well. You could work with WebSockets directly. The advantage of using socket.io is mostly the fallback (assuming it matters - it does not IMHO): if WebSockets are not available it can automatically switch to other ways of communication (like Flash or long polling). The disadvantage is that it is Node.js (in the sense that you have to restrict yourself to one language) plus there are some performance issues, i.e. socket.io does not scale well beyond one machine.
You might consider using a library like SocketIO that abstracts out the transport layer, so you don't have to worry about the mechanics of how the real-time connection is maintained. This will save you a TON of headaches.
I am using Flask and I want to show the user how many visits that he has on his website in realtime.
Currently, I think a way is to, create an infinite loop which has some delay after every iteration and which makes an ajax request getting the current number of visits.
I have also heard about node.js however I think that running another process might make the computer that its running on slower (i'm assuming) ?
How can I achieve the realtime updates on my site? Is there a way to do this with Flask?
Thank you in advance!
Well, there are many possibilites:
1) Polling - this is exactly what you've described. Infinite loop which makes an AJAX request every now and then. Easy to implement, can be easily done with Flask however quite inefficient - eats lots of resources and scales horribly - making it a really bad choice (avoid it at all costs). You will either kill your machine with it or the notification period (polling interval) will have to be so big that it will be a horrible user experience.
2) Long polling - a technique where a client makes an AJAX request but the server responds to that request only when a notification is available. After receiving the notification the client immediately makes a new request. You will require a custom web server for this - I doubt it can be done with Flask. A lot better then polling (many real websites use it) but could've been more efficient. That's why we have now:
3) WebSockets - truely bidirectional communication. Each client maintains an open TCP connection with the server and can react to incoming data. But again: requires a custom server plus only the most modern browsers support it.
4) Other stuff like Flash or Silverlight or other HTTP tricks (chunked encoding): pretty much the same as no 3). Though more difficult to maintain.
So as you can see if you want something more elegant (and efficient) than polling it requires some serious preparation.
As for processes: you should not worry about that. It's not about how many processes you use but how heavy they are. 1 badly written process can easily kill your machine while 100 well written will work smoothly. So make sure it is written in such a way that it won't freeze your machine (and I assure you that it can be done up to some point defined by number of simultaneous users).
As for language: it doesn't matter whether this is Node.js or any other language (like Python). Pick the one you are feeling better with. However I am aware that there's a strong tendency to use Node.js for such projects and thus there might be more proper libraries out there in the internets. Or maybe not. Python has for example Twisted and/or Tornado specially for that (and probably much much more).
Websocket is an event-driven protocol, which means you can actually use it for truly real-time communication.
Kenneth Reitz wrote an extension named Flask-Sockets that is excellent for websockets:
Article: introducing-flask-sockets
Github: flask-sockets
In my opinion, the best option for achieving real time data streaming to a frontend UI is to use a messaging service like pubnub. They have libraries for any language you are going to want to be using. Basically, your user interfaces subscribe to a data channel. Things which create data then publish to that channel, and all subscribers receive the publish very quickly. It is also extremely simple to implement.
You can use PubNub and specifically PubNub presence meant especially for online presence detection. It provides key features like
Track online and offline status of users and devices in realtime
Occupancy to monitor user and machine presence in realtime
Join/Leave Notification for immediate updates of all client connections
Global Scale with synchronized servers across the PubNub Data Stream Network
PubNub Presence Tutorial provides code that can be downloaded to get presence up and running in a few minutes. Five Ways You Can Use PubNub Presence shows you the different ways you can use PubNub presence.
You can get started by signing up for PubNub and getting your API keys.
For a couple recent projects on our corporate intranet, I have used a very simple stack of nginx + redis + webdis + client-side javascript to implement some simple data analysis tools. The experience was absolutely wonderful, especially compared to my previous experience with other stacks (including custom c++, apache/mod_perl, ASP.Net MVC, .Net HttpListener, Ruby on Rails, and a bit of Node.js). Given the availability of client-side templating tools and frontend libraries such as jquery-ui, it seems that I could happily implement much more complicated web-apps using such a no-server-side-code stack (perhaps substituting/augmenting redis with couchdb if warranted)...
The major limitation of this stack, of course, is that my database is directly exposed to the network - acceptable in this case on a firewalled corporate network, but not really an option if I wanted to use the same techniques on the internet. I need to have some level of server-side logic to securely handle authentication and user-role management.
Are there any best practices or common development stacks for this? Ideally I'd like something that is lightweight, and gives me a simple framework for filtering the client-side requests through my custom user-role logic before forwarding them on to the database back-end. I'm not interested in any sort of server-side templating, or ActiveRecord-style storage-level abstractions.
I can't comment on a framework.
You've already mentioned the primary weakness of this, especially on the internet, that being security. The problem there is not just authentication. The problem there is essentially the openness of the client, in this case the web browser, and the protocol, notably HTTP using JSON or XML or some other plaintext protocol.
Consider one example. It's quite simple. Imagine an HTTP service that takes an SQL query and returns a collection of JSON representing the rows. This is straightforward to write. You could probably pound out a nascent one in less than an hour from scratch using any tool that gives you SQL access to an RDBMS.
Arguably, back in the Golden Days of Client Server development, this is exactly what folks did, only instead of a some data tunneled over HTTP, folks used a DB specific driver and sent SQL text over to the back in DB directly.
The problem today is that the protocols are too open. If you implemented that SQL service mentioned above, you essentially turn your entire application in a SQL injection vector.
You simply can not secure something like that in the wild. The protocol is open to trivial observation (every browser comes with a built in packet sniffer, effectively today), along with all of the source code for the application. If you try to encrypt the data, that's all done on the client as well -- with the source to the process, as well as any keys involved.
CouchDB, for example, can not be secured this way. If someone has rights to the server, they have rights to all of the data. ALL of the data. The stuff you want them to see, and the stuff you don't.
The solution, naturally, is a service layer. Something that speaks at a higher level than simply raw data streams. Something that can be secured, and can keep secrets from the clients. But that, naturally, takes server side programming to enable, and its a ostensibly more work, more layers, more data conversion, more a pain.
Back in the day, folks would write entire systems using nothing but stored procedures in the DB. The procedures would have rights that the users invoking them did not, thus you could limit at the server what a user could or could not see or change. You could given them unlimited SELECT capability on a restricted view, perhaps, while a stored procedure would have rights to actually change data or access some of the hidden columns.
Stored procedures have mostly been replaced by application layers and application servers, with the DB being more and more relegated to "dumb storage". But the concepts are similar.
There's value for some scenarios to publishing data straight to the web, like you analytics example. That's a specific, read heavy niche. But beyond that, the concept doesn't work well, I fear. Obfuscated JS is hard to read, but not secure.
This is likely why you may have a little difficulty locating such a framework (I haven't looked at all, myself).
I'm building a web app for 'brainstorming.' Here's how it works: essentially, a user can come onto the app, and submit a challenge, or click on one that's already there, then think up ideas to resolve that challenge and post them up. I hacked together a basic example here on couchdb: http://wamoyo.iriscouch.com/ideageneration/_design/IdeaGeneration/attachments%2findex.html
I'm going to rebuild it from scratch and all, and I'm hitting up against a challenge that's very unfamiliar to me. I'd like for multiple users to be able to generate ideas for the same challenge at the same time. Kinda like the way google docs allows multiple people to edit a shared document. I have some preliminary thoughts on how to go about this, but I thought I'd ask the expert network here.
I'm fairly comfortable with AJAX, is there a pure AJAX way to make it live and multiuser? Would there be an enormous benefit to going with node.js? What might be some other options?
Thanks soo much!
There are several approaches in making such web pages, using plain ajax polling, using long polling and using web sockets.
Ajax polling - easy to implement, essentially connecting to server recurrently via javascript timer, retrieve data from server and send it back via regular Ajax.
Advantages: easy to implement, works everywhere
Disadvantages: the updates are not in real-time, the data is exchanged only when the timer ticks.
Long polling - the idea is that the connection stays open until it times out, then the connection is reestablished. Can be tricky to implement because of different settings for request timeouts for different web servers, routers, etc.
Web sockets - part of HTML5 umbrella, works only in fairly modern browsers, the protocol changes often which may cause incompatibilities during development and production. Can be used natively with modern browsers and via a Flash plugin with older ones. This technology is most lightweight, because it doesn't incur all the HTTP overhead. Think of it as bi-directional, full-duplex communication channel between a browser and a web server via TCP.
For a detailed discussion, I recommend reading this good post by Scott Hanselman. It tells the story about SignalR, but is applicable to other server-side frameworks.
There is also a podcast by same author, the guest goes fairly deeply into explaining these technologies. Worth listening, IMO.
To answer your question about node.js, please share us your current server technology, so we could get more insight into your stack.
I've been trying to learn a little about SproutCore, following the "Todos" tutorial, and I have a couple of questions that haven't been able to find online.
SproutCore is supposed to move all of the business logic to the client. How is that not insecure? A malicious user could easily tamper with the code (since it's all on the client) and change the way the app behaves. How am I wrong here?
SproutCore uses "DataStores", and some of them can be remote. How can I avoid that a malicious user does not interact with the backend on his own? Using some sort of API key wouldn't work since the code is on the client side. Is there some sort of convention here? Any ideas? This really bugs me.
Thanks in advance!
PS: Anyone thinks Cappuccino is a better alternative? I decided to go with SproutCore because the documentation on Cappuccino seemed pretty bad, although SproutCore's doesn't get any better.
Ian
your concerns are valid. The thing is, they apply to all client side code, no matter what framework. So:
Web applications are complicated things. Moving processing to the client is a good thing, because it speeds up the responsiveness of the application. However, it is imperative that the server validate all data inputs, just like in any other web application.
Additionally, all web applications should use the well known authentication/authorization paradigms that are prevalent in system security. Authentication means you must verify that the user is who they say they are, and they can use the system, with Authorization means that the server must verify that the user can do what they are trying e.g. can they create a new data entry, or edit an existing one. It is good design to not present users with UI options that they are not allowed to perform, but you should not rely on that.
All web applications must do those things.
With respect to the 'interacting with the back end' concern: Again, all web applications have this concern. You can open up firebug/webkit, and look at all the the xhr requests that RIAs use in their operations, and mimic them to try to do something on that system. Again, this concern is dealt with by the authentication/authorization checks that you must implement. Anybody can use any webclient to send a request to the server. It is up to the developer to validate that request.
The DataSources in SproutCore are just an abstraction around how SC apps interact with the server. At the end of the day, however, all SC is doing is making XHR requests to the server, like any other RIA.