I am trying to use worker threads with a worker pool in my application, which is intended to run in a 256 MB Docker container.
My main thread takes around 30 MB of memory, and each worker thread takes around 25 MB (counting the require of third-party node modules). Given this, I would only be able to create a pool of ~7 workers.
But my application requirements are such that it should be able to handle many jobs at a time by keeping many workers up and listening for jobs (around 20 or more).
Is there any way to share third-party modules (lodash, request, etc.) across worker threads, to save the memory each worker spends requiring them?
My initial thought was to try shared memory (SharedArrayBuffer), but that won't work, since it can't hold such complex object structures and functions.
Can anyone suggest a possible solution?
Thanks in advance!
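For context, here's a minimal sketch of the kind of pool I mean, using Node's worker_threads module (the file names, pool size, and lodash usage are just for illustration):

    // main.js - spawn N workers and hand them jobs via messages
    const { Worker } = require('worker_threads');

    const POOL_SIZE = 7; // roughly (256 MB - 30 MB main) / 25 MB per worker

    const workers = [];
    for (let i = 0; i < POOL_SIZE; i++) {
      const worker = new Worker('./worker.js');
      worker.on('message', (result) => console.log('job done:', result));
      workers.push(worker);
    }

    workers[0].postMessage({ job: 'some example job' });

    // worker.js - each worker requires its own copy of every module,
    // which is where the ~25 MB per worker goes
    const { parentPort } = require('worker_threads');
    const _ = require('lodash');

    parentPort.on('message', (msg) => {
      parentPort.postMessage(_.camelCase(msg.job));
    });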
SITUATION:
I am trying to implement caching for rate-limited third-party API requests my website needs to make.
Apparently, basic solutions such as:
https://www.npmjs.com/package/node-cache
won't even share the cache between processes (e.g. cluster workers on separate CPU cores), much less between instances. Is that correct?
And if it is, how can I share the cache between instances so that I have one unified cache for my website across all of them?
After googling for a while, it seems Redis would be a solution. But from what I gathered, I would have to host Redis on its own dedicated instance for the cache to be unique across my website's VM instances?
What if the instance hosting Redis is overloaded and also needs to be auto-scaled to multiple instances ?
QUESTION:
How can I implement shared cache between VM instances of my website ?
You could add to your GAE application a 1st generation standard environment service which would:
act as a caching service for your node.js service(s) (or any other 2nd generation standard environment or flexible environment services), itself using, under the hood, the GAE memcache service, which is only available in those 1st gen standard environments
maybe even make those rate-limited 3rd party API calls itself; it will probably be simpler to properly coordinate the cached results that way
be configured for auto scaling to address the scalability concern
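If you do end up going the Redis route mentioned in the question instead, here is a minimal sketch of a shared cache-aside wrapper. This assumes the node-redis v4 client; callThirdPartyApi is a placeholder for your own rate-limited API call:

    const { createClient } = require('redis');

    // One Redis deployment (or managed cluster) is shared by every VM instance.
    const client = createClient({ url: process.env.REDIS_URL });
    client.on('error', console.error);

    // Placeholder for the real rate-limited third-party API call.
    const callThirdPartyApi = async () => ({ example: true });

    // Cache-aside: check Redis first; on a miss, call the rate-limited
    // API and store the result with a TTL.
    async function getCached(key, ttlSeconds, fetchFromApi) {
      const hit = await client.get(key);
      if (hit !== null) return JSON.parse(hit);

      const fresh = await fetchFromApi();
      await client.set(key, JSON.stringify(fresh), { EX: ttlSeconds });
      return fresh;
    }

    (async () => {
      await client.connect();
      const data = await getCached('api:widgets', 60, callThirdPartyApi);
      console.log(data);
    })();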
Currently I have two node processes that I need to run. One is my own custom app, and the other is iframely, a separate Node app that returns embed codes. Right now, I have my app make requests to, say, http://localhost:8061/iframely?url=.... But now, switching to Heroku, only one process in my app can accept HTTP requests (this is the process designated with web: in the Procfile, as I understand it).
To run iframely alongside my app, do I have to create another app? Or can I have the two processes speak to each other bypassing HTTP? Keep in mind I don't want to heavily restructure iframely.
It sounds from your description as if you have two separate node apps, and each one serves its own purpose.
Regardless of how these apps are implemented, the best way to handle this sort of thing is via multiple heroku apps. This is what they were designed for!
You can think of a Heroku app as a single-purpose web server. If you have one codebase that does something independent of another, create two Heroku apps. If you have 3 codebases that all do different things, make 3 Heroku apps.
In addition to this being the best way to handle this sort of thing in general (you get more reliability, as each service has its own servers), it's also cheaper: you get one free Heroku dyno per app, which means you'll have twice the free web servers you would have otherwise.
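In practice each app keeps its own one-line Procfile (e.g. web: node server.js), and the main app just calls iframely over its public URL instead of localhost. A rough sketch, where the app name my-iframely is made up:

    // In the main app: same request as before, but pointed at the
    // iframely Heroku app's public hostname instead of localhost.
    const https = require('https');

    const target = 'https://my-iframely.herokuapp.com/iframely?url=' +
      encodeURIComponent('https://example.com/some-page');

    https.get(target, (res) => {
      let body = '';
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => console.log(JSON.parse(body)));
    }).on('error', console.error);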
I have designed a meteor.js application and it works great on localhost and even when deployed to the internet. Now I want to create a sign-up site that will spin up new instances of the application for each client who signs up on the back-end. Assuming a meteor.js application and Python or JavaScript for the sign-up site, what high-level steps need to be taken to implement this?
I am looking for a more correct and complete answer that takes the form of my poorly imagined version of this:
Use something like node or python to call a shell script that may or may not run as sudo
That script might create a new folder to hold instance-specific stuff (like client files, config, and/or that instance's database).
The script or python code would deploy an instance of the application to that folder and on a specific port
Python might add configuration information to a tool like Pound to forward a subdomain to a port
Other things....!?
I don't really understand the high level steps that need to be taken here so if someone could provide those steps and maybe even some useful tools or tutorials for doing so I'd be extremely grateful.
I have a similar situation to you but ended up solving it in a completely different way. It is now available as a Meteor smart package:
https://github.com/mizzao/meteor-partitioner
The problem we share is that we wanted to write a meteor app as if only one client (or group of clients, in my case) exists, but that it needs to handle multiple sets of clients without them knowing about each other. I am doing the following:
Assume the Meteor app is programmed for just a single instance
Using a smart package, hook the collections on the server (and possibly the client) so that all operations are 'scoped' only to the instance of the user that is calling them. One way to do this is to automatically attach an 'instance' or 'group' field to each document that is being added.
Doing this correctly requires a lot of knowledge about the internals of Meteor, which I've been learning. However, this approach is a lot cleaner and less resource-intensive than trying to deploy multiple meteor apps at once. It means that you can still code the app as if only one client exists, instead of explicitly doing so for multiple clients. Additionally, it allows you to share resources between the instances that can be shared (i.e. static assets, shared state, etc.)
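To make the hooking idea concrete, here is a rough sketch using the collection-hooks package linked below. The Posts collection and groupId field are just examples, and this glosses over plenty of edge cases:

    // Server-side: transparently scope every document to the calling
    // user's group, so the rest of the app can act as if only one
    // instance exists.
    Posts = new Meteor.Collection('posts');

    Posts.before.insert(function (userId, doc) {
      var user = Meteor.users.findOne(userId);
      doc.groupId = user && user.groupId; // tag the doc with its instance/group
    });

    Posts.before.find(function (userId, selector) {
      var user = Meteor.users.findOne(userId);
      selector.groupId = user && user.groupId; // reads stay inside the group
    });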
For more details and discussions, see:
https://groups.google.com/forum/#!topic/meteor-talk/8u2LVk8si_s
https://github.com/matb33/meteor-collection-hooks (the collection-hooks package; read issues for additional discussions)
Let me remark first that I think spinning up multiple instances of the same app is a bad design choice. If it is a stop gap measure, here's what I would suggest:
Create an archive that can be readily deployed (bundle the app, reinstall fibers if necessary, rezip). When a new instance is created, deploy (unzip) the archive to a new folder using a script.
Create a template of an init script and use forever or daemonize or jesus etc. to start the site on reboot and keep the sites up during normal operation. See Meteor deploying to a VM by installing meteor or How does one start a node.js server as a daemon process? for examples. When a new instance is deployed, populate the template with new values (i.e. port number, database name, folder), copy the filled-out template to init.d, and link it to the runlevel. Alternatively, create one script in init.d that executes other scripts to bring up the site.
Each instance should be listening on its own port, so you'll need a reverse proxy. AFAIK, Apache and Nginx require restarts when you change the configuration, so you'll probably want to look at Hipache https://github.com/dotcloud/hipache. Hipache uses redis to store the configuration information, so adding a new instance only requires adding a key to redis. There is an experimental port of Hipache that brings the functionality to Nginx: https://github.com/samalba/hipache-nginx
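For reference, a sketch of what registering a new instance with Hipache amounts to, following its documented redis schema (the hostname and port here are made up):

    // Deployment script snippet: route client1.example.com to the new
    // instance by pushing an identifier and a backend URL into redis.
    var redis = require('redis');
    var client = redis.createClient();

    client.rpush('frontend:client1.example.com', 'client1', function () {
      client.rpush('frontend:client1.example.com', 'http://127.0.0.1:8631', function () {
        client.quit(); // Hipache picks up the new route without a restart
      });
    });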
What about DNS updates? Once you create a new instance, do you need to add a new record to your DNS configuration?
I don't really have an answer to your question, but I just want to remind you of another potential problem that you may run into, since I see you mentioned Python: in other words, you may be running another web app on Apache/Nginx, etc. The problem is that Meteor is not very friendly when it comes to coexisting with another HTTP server. The project I'm working on was troubled by this issue, and we had to move it to a standalone server after days of hassle with the guys from Meteor. I did not work on the issue myself, so I'm not able to give you more details, but I just looked online and found something similar: https://serverfault.com/questions/424693/serving-meteor-on-main-domain-and-apache-on-subdomain-independently.
Just something to keep in mind...
I have an application whose primary function works in real time, through websockets or long polling.
However, most of the site is written in a RESTful fashion, which is nice for applications and other clients in the future. However, I'm thinking about transitioning to a WebSocket API for all site functions, away from REST. That would make it easier for me to integrate real-time features into all parts of the site. Would this make it more difficult to build applications or mobile clients?
I found that some people are already doing stuff like this: SocketStream
Not to say that the other answers here don't have merit, they make some good points. But I'm going to go against the general consensus and agree with you that moving to websockets for more than just realtime features is very appealing.
I am seriously considering moving my app from a RESTful architecture to more of an RPC style via websockets. This is not a "toy app", and I'm not talking about only realtime features, so I do have reservations. But I see many benefits in going this route and feel it could turn out to be an exceptional solution.
My plan is to use DNode, SocketIO, and Backbone. With these tools, my Backbone models and collections can be passed around from/to client and server by simply calling functions RPC-style. No more managing REST endpoints, serializing/deserializing objects, and so forth. I haven't worked with socketstream yet, but it looks worth checking out.
I still have a long way to go before I can definitively say this is a good solution, and I'm sure it isn't the best solution for every application, but I'm convinced that this combination would be exceptionally powerful. I admit that there are some drawbacks, such as losing the ability to cache resources. But I have a feeling the advantages will outweigh them.
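To illustrate the RPC style I'm describing, here is a minimal DNode sketch. The getTask function and port number are made up; the point is that the client calls a server function directly and gets plain attributes back:

    // server.js - expose functions the client can call directly
    var dnode = require('dnode');

    dnode({
      getTask: function (id, cb) {
        // would normally load a model's attributes from the database
        cb({ id: id, title: 'example task' });
      }
    }).listen(5004);

    // client.js - no REST endpoint, no manual (de)serialization
    var dnode = require('dnode');

    dnode.connect(5004, function (remote) {
      remote.getTask(1, function (task) {
        console.log(task.title); // could feed new Backbone.Model(task)
      });
    });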
I'd be interested in following your progress exploring this type of solution. If you have any github experiments, please point me at them. I don't have any yet, but hope to soon.
Below is a list of to-read-later links that I've been collecting. I can't vouch that they are all worthwhile, as I've only skimmed many of them. But hopefully some will help.
Great tutorial on using Socket.IO with Express. It exposes express sessions to socket.io and discusses how to have different rooms for each authenticated user.
http://www.danielbaulig.de/socket-ioexpress/
Tutorial on node.js/socket.io/backbone.js/express/connect/jade/redis with authentication, Joyent hosting, etc:
http://fzysqr.com/2011/02/28/nodechat-js-using-node-js-backbone-js-socket-io-and-redis-to-make-a-real-time-chat-app/
http://fzysqr.com/2011/03/27/nodechat-js-continued-authentication-profiles-ponies-and-a-meaner-socket-io/
Tutorial on using Pusher with Backbone.js (using Rails):
http://blog.pusher.com/2011/6/21/backbone-js-now-realtime-with-pusher
Building an application with backbone.js on the client and node.js with express, socket.io, and dnode on the server:
http://andyet.net/blog/2011/feb/15/re-using-backbonejs-models-on-the-server-with-node/
http://addyosmani.com/blog/building-spas-jquerys-best-friends/
Using Backbone with DNode:
http://quickleft.com/blog/backbone-without-ajax-part-ii
http://quickleft.com/blog/backbone-without-ajax-part-1
http://sorensen.posterous.com/introducing-backbone-redis
https://github.com/cowboyrushforth/minespotter
http://amir.unoc.net/how-to-share-backbonejs-models-with-nodejs
http://hackerne.ws/item?id=2222935
http://substack.net/posts/24ab8c
HTTP REST and WebSockets are very different. HTTP is stateless, so the web server doesn't need to remember anything about the client, and you get caching in the web browser and in proxies. If you use WebSockets, your server becomes stateful, and you need to maintain a connection to the client on the server.
Request-Reply communication vs Push
Use WebSockets only if you need to PUSH data from the server to the client; that communication pattern is not included in HTTP (only by workarounds). PUSH is helpful if events created by one client need to be made available to other connected clients, e.g. in games where users should react to other clients' behaviour, or if your website is monitoring something where the server pushes data to the client all the time, e.g. stock markets (live).
If you don't need to PUSH data from the server, it's usually easier to use a stateless HTTP REST server. HTTP uses a simple request-reply communication pattern.
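As a concrete example of the PUSH pattern, here is a minimal sketch using the ws package; the stock-price feed is invented:

    // The server pushes updates to every connected client on its own
    // schedule - no client request involved, which plain HTTP can't express.
    const WebSocket = require('ws');
    const wss = new WebSocket.Server({ port: 8080 });

    setInterval(() => {
      const tick = JSON.stringify({ symbol: 'ACME', price: Math.random() * 100 });
      wss.clients.forEach((client) => {
        if (client.readyState === WebSocket.OPEN) client.send(tick);
      });
    }, 1000);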
I'm thinking about transitioning to a WebSocket api for all site functions
No. You should not do it. There is no harm in supporting both models. Use REST for one-way communication/simple requests, and WebSocket for two-way communication, especially when the server wants to send real-time notifications (see the sketch below).
WebSocket is a more efficient protocol than RESTful HTTP, but RESTful HTTP still scores over WebSocket in the areas below.
Create/update/delete operations on resources are well defined for HTTP; you would have to implement these operations at a low level over WebSockets.
WebSocket connections scale vertically on a single server, whereas HTTP connections scale horizontally. There are some proprietary, non-standards-based solutions for WebSocket horizontal scaling.
HTTP comes with a lot of good features such as caching, routing, multiplexing, gzipping, etc. These would have to be built on top of WebSocket if you choose it.
Search engine optimization works well for HTTP URLs.
Proxies, DNS, and firewalls are not yet fully aware of WebSocket traffic. They allow port 80 but might restrict traffic by snooping on it first.
Security with WebSocket is an all-or-nothing approach.
Have a look at this article for more details.
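To sketch what supporting both models can look like, here is a minimal Express server with a WebSocket endpoint attached via the ws package (the route and event shapes are made up):

    // One process serves stateless REST routes and a WebSocket
    // endpoint for real-time notifications side by side.
    const express = require('express');
    const http = require('http');
    const WebSocket = require('ws');

    const app = express();

    // REST: simple request-reply, stateless, cacheable
    app.get('/api/items', (req, res) => res.json([{ id: 1 }]));

    const server = http.createServer(app);
    const wss = new WebSocket.Server({ server });

    // WebSocket: server-initiated, real-time notifications
    function notifyAll(event) {
      const msg = JSON.stringify(event);
      wss.clients.forEach((c) => {
        if (c.readyState === WebSocket.OPEN) c.send(msg);
      });
    }

    wss.on('connection', (socket) => {
      socket.send(JSON.stringify({ type: 'welcome' }));
    });

    server.listen(3000);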
The only problem I can see with using TCP (WebSockets) as your main web content delivery strategy is that there is very little reading material out there about how to design your website architecture and infrastructure using TCP.
So you can't learn from other people's mistakes, and development is going to be slower. It's also not a "tried and tested" strategy.
Of course, you're also going to lose all the advantages of HTTP (statelessness and caching being the bigger ones).
Remember that HTTP is an abstraction over TCP designed for serving web content.
And let's not forget that SEO and search engines don't do websockets. So you can forget about SEO.
Personally I would recommend against this as there's too much risk.
Don't use WS for serving websites, use it for serving web applications
However, if you have a toy or a personal website, by all means go for it. Try it, be cutting-edge. For a business or company, you cannot justify the risk of doing this.
I learned a little lesson (the hard way). I made a number crunching application that runs on Ubuntu AWS EC2 cloud services (uses powerful GPUs), and I wanted to make a front-end for it just to watch its progress in realtime. Due to the fact that it needed realtime data, it was obvious that I needed websockets to push the updates.
It started with a proof of concept and worked great. But then, when we wanted to make it available to the public, we had to add user sessions, so we needed login features. And no matter how you look at it, the websocket has to know which user it deals with, so we took the shortcut of using the websockets to authenticate the users. It seemed obvious, and it was convenient.
We actually had to spend quite some time making the connections reliable. We started out with some cheap websocket tutorials, but discovered that our implementation was not able to automatically reconnect when the connection was broken. That all improved when we switched to socket-io. Socket-io is a must!
Having said all that, to be honest, I think we missed out on some great socket-io features. Socket-io has a lot more to offer, and I am sure that if you take it into account in your initial design, you can get more out of it. In contrast, we just replaced the old websockets with the websocket functionality of socket-io, and that was it (no rooms, no channels, ...). A redesign could have made everything more powerful. But we didn't have time for that. That's something to remember for our next project.
Next we started to store more and more data (user history, invoices, transactions, ...). We stored all of it in an AWS dynamodb database, and AGAIN, we used socket-io to communicate the CRUD operations from the front-end to the backend. I think we took a wrong turn there. It was a mistake.
Because shortly after we found out that Amazon's cloud services (AWS) offer some great load-balancing/scaling tools for RESTful applications.
We have the impression now that we need to write a lot of code to perform the handshakes of the CRUD operations.
Recently we implemented Paypal integration. We managed to get it to work. But again, all tutorials are doing it with RESTful APIs. We had to rewrite/rethink their examples to implement them with websockets. We got it to work fairly fast though. But it does feel like we are going against the flow.
Having said all that, we are going live next week. We got there in time, everything works, and it's fast. But will it scale?
I would consider using both. Each technology has its merits, and there is no one-size-fits-all solution.
The separation of work goes this way:
WebSockets would be the primary method for the application to communicate with the server where a session is required. This eliminates many hacks that were needed for older browsers (the remaining problem is supporting those older browsers themselves).
A RESTful API is used for GET calls that are not session-oriented (i.e. no authentication needed) and that benefit from browser caching. A good example of this would be reference data for drop-downs used by a web application. However, such data can change a bit more often than...
HTML and JavaScript, which comprise the UI of the webapp. These would generally benefit from being placed on a CDN.
Web services using WSDL are still the best way of doing enterprise-level and cross-enterprise communication, as they provide a well-defined standard for message and data passing. Primarily you'd offload this to a DataPower device to proxy to your web service handler.
All of this happens over the HTTP protocol, which already gives us secure sockets via SSL.
For mobile applications, though, websockets cannot reconnect to a disconnected session (How to reconnect to websocket after close connection), and managing that isn't trivial, as the sketch below suggests. So for mobile apps, I would still recommend a REST API and polling.
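Here is a bare-bones browser-side reconnect loop. It only restores the transport; any server-side session state still has to be re-established by hand, which is the hard part (the URL is made up):

    // Reopen the socket whenever it drops.
    function connect() {
      const ws = new WebSocket('wss://example.com/socket');

      ws.onopen = () => {
        // re-authenticate / re-subscribe here to rebuild session state
      };

      ws.onclose = () => {
        setTimeout(connect, 2000); // simple fixed backoff
      };
    }

    connect();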
Another thing to watch out for when using WebSockets vs REST is scalability. WebSocket sessions are still managed by the server. A RESTful API, when done properly, is stateless (which means there is no server state that needs to be managed), so scalability can grow horizontally (which is cheaper) rather than vertically.
Do I want updates from the server?
Yes: Socket.io
No: REST
The downsides to Socket.io are:
Scalability: WebSockets require open connections and a much different ops setup to scale for the web.
Learnin: I don't have unlimited time for my learnin. Things have to get done!
I'll still use Socket.io in my project, but not for basic web forms that REST will do nicely.
WebSocket (or long-polling) based transports mostly serve for (near) real-time communication between the server and client. Although there are numerous scenarios where these kinds of transports are required, such as chat or some kinds of real-time feeds, not all parts of a web application necessarily need a bidirectional connection to the server.
REST is a resource-based architecture which is well understood and offers its own benefits over other architectures. WebSockets lean more toward streams/feeds of data in real time, which would require you to build some kind of server-side logic to prioritize or differentiate between resources and feeds (in case you don't want to use REST).
I assume there will eventually be more WebSocket-centric frameworks like SocketStream once this transport is more widespread and better understood/documented in a data-type/form-agnostic way. However, I don't think this means it would or should replace REST, because WebSockets offer functionality that isn't necessarily required in numerous use cases and scenarios.
I'd like to point out this blog post, which is, to me, the best answer to this question.
In short, YES
The post contains all the best practices for this kind of API.
That's not a good idea. The standard isn't even finalized yet, support varies across browsers, etc. If you want to do this now, you'll end up needing to fall back to Flash or long polling, etc. In the future it probably still won't make a lot of sense, since the server has to support leaving connections open to every single user. Most web servers are designed instead to excel at quickly responding to requests and closing them as quickly as possible. Heck, even your operating system would have to be tuned to deal with a high number of simultaneous connections (each connection using up more ephemeral ports and memory). Stick to using REST for as much of the site as you can.