Background processes in Node.js

Background processes in Node.js - javascript

What is a good aproach to handle background processes in a NodeJS application?
Scenario: After a user posts something to an app I want to crunch the data, request additional data from external resources, etc. All of this is quite time consuming, so I want it out of the req/res loop. Ideal would be to just have a queue of jobs where you can quickly dump a job on and a daemon or task runner will always take the oldest one and process it.
In RoR I would have done it with something like Delayed Job. What is the Node equivalent of this API?

If you want something lightweight, that runs in the same process as the server, I highly recommend Bull. It has a simple API that allows for a fine grained control over your queues.
If you're familiar with Ruby's Resque, there is a node implementation called Node-resque
Bull and Node-resque are all backed by Redis, which is ubiquitous among Node.js worker queues. They would be able to do what RoR's DelayedJob does, it's matter of specific features that you want, and your API preferences.

Background jobs are not directly related to your web service work, so they should not be in the same process. As you scale up, the memory usage of the background jobs will impact the web service performance. But you can put them in the same code repository if you want, whatever makes more sense.
One good choice for messaging between the two processes would be redis, if dropping a message every now and then is OK. If you want "no message left behind" you'll need a more heavyweight broker like Rabbit. Your web service process can publish and your background job process can subscribe.
It is not necessary for the two processes to be co-hosted, they can be on separate VMs, Docker containers, whatever you use. This allows you to scale out without much trouble.

If you're using MongoDB, I recommend Agenda. That way, separate Redis instances aren't running and features such as scheduling, queuing, and Web UI are all present. Agenda UI is optional and can be run separately of course.
Would also recommend setting up a loosely coupled abstraction between your application logic and the queuing / scheduling system so the entire background processing system can be swapped out if needed. In other words, keep as much application / processing logic away from your Agenda job definitions in order to keep them lightweight.

I'd like to suggest using Redis for scheduling jobs. It has plenty of different data structures, you can always pick one that suits better to your use case.
You mentioned RoR and DJ, so I assume you're familiar with sidekiq. You can use node-sidekiq for job scheduling if you want to, but its suboptimal imo, since it's main purpose is to integrate nodejs with RoR.
For worker daemonising I'd recommend using PM2. It's widely used and actively-maintained. It solves a lot of problems (e.g. deployment, monitoring, clustering) so make sure it won't be an overkill for you.

I tried bee-queue & bull and chose bull in the end.
I first chose bee-queue b/c it is quite simple, their examples are easy to understand, while bull's examples are bit complicated. bee's wiki Bee Queue's Origin also resonates with me. But the problem with bee is <1> their issue resolution time is quite slow, their latest update was 10 months ago. <2> I can't find an easy way to pause/cancel job.
Bull, on the other hand, frequently updates their codes, response to issues. Node.js job queue evaluation said bull's weakness is "slow issues resolution time", but my experience is the opposite!
But anyway their api is similar so it is quite easy to switch from one to another.

I suggest to use a proper Node.js framework to build you app.
I think that the most powerful and easy to use is Sails.js.
It's a MVC framework so if you are used to develop in ROR, you will find it very very easy!
If you use it, It's already present a powerful (in javascript terms) job manager.
new sails.cronJobs('0 01 01 * * 0', function () {
sails.log.warn("START ListJob");
}, null, true, "Europe/Dublin");
If you need more info not hesitate to contact me!

Related

Confused about nodes purpose [duplicate]

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am new to this kind of stuff, but lately I've been hearing a lot about how good Node.js is. Considering how much I love working with jQuery and JavaScript in general, I can't help but wonder how to decide when to use Node.js. The web application I have in mind is something like Bitly - takes some content, archives it.
From all the homework I have been doing in the last few days, I obtained the following information. Node.js
is a command-line tool that can be run as a regular web server and lets one run JavaScript programs
utilizes the great V8 JavaScript engine
is very good when you need to do several things at the same time
is event-based so all the wonderful Ajax-like stuff can be done on the server side
lets us share code between the browser and the backend
lets us talk with MySQL
Some of the sources that I have come across are:
Diving into Node.js – Introduction and Installation
Understanding NodeJS
Node by Example (Archive.is)
Let’s Make a Web App: NodePad
Considering that Node.js can be run almost out-of-the-box on Amazon's EC2 instances, I am trying to understand what type of problems require Node.js as opposed to any of the mighty kings out there like PHP, Python and Ruby. I understand that it really depends on the expertise one has on a language, but my question falls more into the general category of: When to use a particular framework and what type of problems is it particularly suited for?

You did a great job of summarizing what's awesome about Node.js. My feeling is that Node.js is especially suited for applications where you'd like to maintain a persistent connection from the browser back to the server. Using a technique known as "long-polling", you can write an application that sends updates to the user in real time. Doing long polling on many of the web's giants, like Ruby on Rails or Django, would create immense load on the server, because each active client eats up one server process. This situation amounts to a tarpit attack. When you use something like Node.js, the server has no need of maintaining separate threads for each open connection.
This means you can create a browser-based chat application in Node.js that takes almost no system resources to serve a great many clients. Any time you want to do this sort of long-polling, Node.js is a great option.
It's worth mentioning that Ruby and Python both have tools to do this sort of thing (eventmachine and twisted, respectively), but that Node.js does it exceptionally well, and from the ground up. JavaScript is exceptionally well situated to a callback-based concurrency model, and it excels here. Also, being able to serialize and deserialize with JSON native to both the client and the server is pretty nifty.
I look forward to reading other answers here, this is a fantastic question.
It's worth pointing out that Node.js is also great for situations in which you'll be reusing a lot of code across the client/server gap. The Meteor framework makes this really easy, and a lot of folks are suggesting this might be the future of web development. I can say from experience that it's a whole lot of fun to write code in Meteor, and a big part of this is spending less time thinking about how you're going to restructure your data, so the code that runs in the browser can easily manipulate it and pass it back.
Here's an article on Pyramid and long-polling, which turns out to be very easy to set up with a little help from gevent: TicTacToe and Long Polling with Pyramid.

I believe Node.js is best suited for real-time applications: online games, collaboration tools, chat rooms, or anything where what one user (or robot? or sensor?) does with the application needs to be seen by other users immediately, without a page refresh.
I should also mention that Socket.IO in combination with Node.js will reduce your real-time latency even further than what is possible with long polling. Socket.IO will fall back to long polling as a worst case scenario, and instead use web sockets or even Flash if they are available.
But I should also mention that just about any situation where the code might block due to threads can be better addressed with Node.js. Or any situation where you need the application to be event-driven.
Also, Ryan Dahl said in a talk that I once attended that the Node.js benchmarks closely rival Nginx for regular old HTTP requests. So if we build with Node.js, we can serve our normal resources quite effectively, and when we need the event-driven stuff, it's ready to handle it.
Plus it's all JavaScript all the time. Lingua Franca on the whole stack.

Reasons to use NodeJS:
It runs Javascript, so you can use the same language on server and client, and even share some code between them (e.g. for form validation, or to render views at either end.)
The single-threaded event-driven system is fast even when handling lots of requests at once, and also simple, compared to traditional multi-threaded Java or ROR frameworks.
The ever-growing pool of packages accessible through NPM, including client and server-side libraries/modules, as well as command-line tools for web development. Most of these are conveniently hosted on github, where sometimes you can report an issue and find it fixed within hours! It's nice to have everything under one roof, with standardized issue reporting and easy forking.
It has become the defacto standard environment in which to run Javascript-related tools and other web-related tools, including task runners, minifiers, beautifiers, linters, preprocessors, bundlers and analytics processors.
It seems quite suitable for prototyping, agile development and rapid product iteration.
Reasons not to use NodeJS:
It runs Javascript, which has no compile-time type checking. For large, complex safety-critical systems, or projects including collaboration between different organizations, a language which encourages contractual interfaces and provides static type checking may save you some debugging time (and explosions) in the long run. (Although the JVM is stuck with null, so please use Haskell for your nuclear reactors.)
Added to that, many of the packages in NPM are a little raw, and still under rapid development. Some libraries for older frameworks have undergone a decade of testing and bugfixing, and are very stable by now. Npmjs.org has no mechanism to rate packages, which has lead to a proliferation of packages doing more or less the same thing, out of which a large percentage are no longer maintained.
Nested callback hell. (Of course there are 20 different solutions to this...)
The ever-growing pool of packages can make one NodeJS project appear radically different from the next. There is a large diversity in implementations due to the huge number of options available (e.g. Express/Sails.js/Meteor/Derby). This can sometimes make it harder for a new developer to jump in on a Node project. Contrast that with a Rails developer joining an existing project: he should be able to get familiar with the app pretty quickly, because all Rails apps are encouraged to use a similar structure.
Dealing with files can be a bit of a pain. Things that are trivial in other languages, like reading a line from a text file, are weird enough to do with Node.js that there's a StackOverflow question on that with 80+ upvotes. There's no simple way to read one record at a time from a CSV file. Etc.
I love NodeJS, it is fast and wild and fun, but I am concerned it has little interest in provable-correctness. Let's hope we can eventually merge the best of both worlds. I am eager to see what will replace Node in the future... :)

To make it short:
Node.js is well suited for applications that have a lot of concurrent connections and each request only needs very few CPU cycles, because the event loop (with all the other clients) is blocked during execution of a function.
A good article about the event loop in Node.js is Mixu's tech blog: Understanding the node.js event loop.

I have one real-world example where I have used Node.js. The company where I work got one client who wanted to have a simple static HTML website. This website is for selling one item using PayPal and the client also wanted to have a counter which shows the amount of sold items. Client expected to have huge amount of visitors to this website. I decided to make the counter using Node.js and the Express.js framework.
The Node.js application was simple. Get the sold items amount from a Redis database, increase the counter when item is sold and serve the counter value to users via the API.
Some reasons why I chose to use Node.js in this case
It is very lightweight and fast. There has been over 200000 visits on this website in three weeks and minimal server resources has been able to handle it all.
The counter is really easy to make to be real time.
Node.js was easy to configure.
There are lots of modules available for free. For example, I found a Node.js module for PayPal.
In this case, Node.js was an awesome choice.

The most important reasons to start your next project using Node ...
All the coolest dudes are into it ... so it must be fun.
You can hangout at the cooler and have lots of Node adventures to brag about.
You're a penny pincher when it comes to cloud hosting costs.
Been there done that with Rails
You hate IIS deployments
Your old IT job is getting rather dull and you wish you were in a shiny new Start Up.
What to expect ...
You'll feel safe and secure with Express without all the server bloatware you never needed.
Runs like a rocket and scales well.
You dream it. You installed it. The node package repo npmjs.org is the largest ecosystem of open source libraries in the world.
Your brain will get time warped in the land of nested callbacks ...
... until you learn to keep your Promises.
Sequelize and Passport are your new API friends.
Debugging mostly async code will get umm ... interesting .
Time for all Noders to master Typescript.
Who uses it?
PayPal, Netflix, Walmart, LinkedIn, Groupon, Uber, GoDaddy, Dow Jones
Here's why they switched to Node.

There is nothing like Silver Bullet. Everything comes with some cost associated with it. It is like if you eat oily food, you will compromise your health and healthy food does not come with spices like oily food. It is individual choice whether they want health or spices as in their food.
Same way Node.js consider to be used in specific scenario. If your app does not fit into that scenario you should not consider it for your app development. I am just putting my thought on the same:
When to use Node.JS
If your server side code requires very few cpu cycles. In other world you are doing non blocking operation and does not have heavy algorithm/Job which consumes lots of CPU cycles.
If you are from Javascript back ground and comfortable in writing Single Threaded code just like client side JS.
When NOT to use Node.JS
Your server request is dependent on heavy CPU consuming algorithm/Job.
Scalability Consideration with Node.JS
Node.JS itself does not utilize all core of underlying system and it is single threaded by default, you have to write logic by your own to utilize multi core processor and make it multi threaded.
Node.JS Alternatives
There are other option to use in place of Node.JS however Vert.x seems to be pretty promising and has lots of additional features like polygot and better scalability considerations.

Another great thing that I think no one has mentioned about Node.js is the amazing community, the package management system (npm) and the amount of modules that exist that you can include by simply including them in your package.json file.

My piece: nodejs is great for making real time systems like analytics, chat-apps, apis, ad servers, etc.
Hell, I made my first chat app using nodejs and socket.io under 2 hours and that too during exam
week!
Edit
Its been several years since I have started using nodejs and I have used it in making many different things including static file servers, simple analytics, chat apps and much more.
This is my take on when to use nodejs
When to use
When making system which put emphasis on concurrency and speed.
Sockets only servers like chat apps, irc apps, etc.
Social networks which put emphasis on realtime resources like geolocation, video stream, audio stream, etc.
Handling small chunks of data really fast like an analytics webapp.
As exposing a REST only api.
When not to use
Its a very versatile webserver so you can use it wherever you want but probably not these places.
Simple blogs and static sites.
Just as a static file server.
Keep in mind that I am just nitpicking. For static file servers, apache is better mainly because it is widely available. The nodejs community has grown larger and more mature over the years and it is safe to say nodejs can be used just about everywhere if you have your own choice of hosting.

It can be used where
Applications that are highly event driven & are heavily I/O bound
Applications handling a large number of connections to other systems
Real-time applications (Node.js was designed from the ground up for real time and to be easy
to use.)
Applications that juggle scads of information streaming to and from other sources
High traffic, Scalable applications
Mobile apps that have to talk to platform API & database, without having to do a lot of data
analytics
Build out networked applications
Applications that need to talk to the back end very often
On Mobile front, prime-time companies have relied on Node.js for their mobile solutions. Check out why?
LinkedIn is a prominent user. Their entire mobile stack is built on Node.js. They went from running 15 servers with 15 instances on each physical machine, to just 4 instances – that can handle double the traffic!
eBay launched ql.io, a web query language for HTTP APIs, which uses Node.js as the runtime stack. They were able to tune a regular developer-quality Ubuntu workstation to handle more than 120,000 active connections per node.js process, with each connection consuming about 2kB memory!
Walmart re-engineered its mobile app to use Node.js and pushed its JavaScript processing to the server.
Read more at: http://www.pixelatingbits.com/a-closer-look-at-mobile-app-development-with-node-js/

Node best for concurrent request handling -
So, Let’s start with a story. From last 2 years I am working on JavaScript and developing web front end and I am enjoying it. Back end guys provide’s us some API’s written in Java,python (we don’t care) and we simply write a AJAX call, get our data and guess what ! we are done. But in real it is not that easy, If data we are getting is not correct or there is some server error then we stuck and we have to contact our back end guys over the mail or chat(sometimes on whatsApp too :).) This is not cool. What if we wrote our API’s in JavaScript and call those API’s from our front end ? Yes that’s pretty cool because if we face any problem in API we can look into it. Guess what ! you can do this now , How ? – Node is there for you.
Ok agreed that you can write your API in JavaScript but what if I am ok with above problem. Do you have any other reason to use node for rest API ?
so here is the magic begins. Yes I do have other reasons to use node for our API’s.
Let’s go back to our traditional rest API system which is based on either blocking operation or threading. Suppose two concurrent request occurs( r1 and r2) , each of them require database operation. So In traditional system what will happens :
1. Waiting Way : Our server starts serving r1 request and waits for query response. after completion of r1 , server starts to serve r2 and does it in same way. So waiting is not a good idea because we don’t have that much time.
2. Threading Way : Our server will creates two threads for both requests r1 and r2 and serve their purpose after querying database so cool its fast.But it is memory consuming because you can see we started two threads also problem increases when both request is querying same data then you have to deal with deadlock kind of issues . So its better than waiting way but still issues are there.
Now here is , how node will do it:
3. Nodeway : When same concurrent request comes in node then it will register an event with its callback and move ahead it will not wait for query response for a particular request.So when r1 request comes then node’s event loop (yes there is an event loop in node which serves this purpose.) register an event with its callback function and move ahead for serving r2 request and similarly register its event with its callback. Whenever any query finishes it triggers its corresponding event and execute its callback to completion without being interrupted.
So no waiting, no threading , no memory consumption – yes this is nodeway for serving rest API.

My one more reason to choose Node.js for a new project is:
Be able to do pure cloud based development
I have used Cloud9 IDE for a while and now I can't imagine without it, it covers all the development lifecycles. All you need is a browser and you can code anytime anywhere on any devices. You don't need to check in code in one Computer(like at home), then checkout in another computer(like at work place).
Of course, there maybe cloud based IDE for other languages or platforms (Cloud 9 IDE is adding supports for other languages as well), but using Cloud 9 to do Node.js developement is really a great experience for me.

One more thing node provides is the ability to create multiple v8 instanes of node using node's child process( childProcess.fork() each requiring 10mb memory as per docs) on the fly, thus not affecting the main process running the server. So offloading a background job that requires huge server load becomes a child's play and we can easily kill them as and when needed.
I've been using node a lot and in most of the apps we build, require server connections at the same time thus a heavy network traffic. Frameworks like Express.js and the new Koajs (which removed callback hell) have made working on node even more easier.

Donning asbestos longjohns...
Yesterday my title with Packt Publications, Reactive Programming with JavaScript. It isn't really a Node.js-centric title; early chapters are intended to cover theory, and later code-heavy chapters cover practice. Because I didn't really think it would be appropriate to fail to give readers a webserver, Node.js seemed by far the obvious choice. The case was closed before it was even opened.
I could have given a very rosy view of my experience with Node.js. Instead I was honest about good points and bad points I encountered.
Let me include a few quotes that are relevant here:
Warning: Node.js and its ecosystem are hot--hot enough to burn you badly!
When I was a teacher’s assistant in math, one of the non-obvious suggestions I was told was not to tell a student that something was “easy.” The reason was somewhat obvious in retrospect: if you tell people something is easy, someone who doesn’t see a solution may end up feeling (even more) stupid, because not only do they not get how to solve the problem, but the problem they are too stupid to understand is an easy one!
There are gotchas that don’t just annoy people coming from Python / Django, which immediately reloads the source if you change anything. With Node.js, the default behavior is that if you make one change, the old version continues to be active until the end of time or until you manually stop and restart the server. This inappropriate behavior doesn’t just annoy Pythonistas; it also irritates native Node.js users who provide various workarounds. The StackOverflow question “Auto-reload of files in Node.js” has, at the time of this writing, over 200 upvotes and 19 answers; an edit directs the user to a nanny script, node-supervisor, with homepage at http://tinyurl.com/reactjs-node-supervisor. This problem affords new users with great opportunity to feel stupid because they thought they had fixed the problem, but the old, buggy behavior is completely unchanged. And it is easy to forget to bounce the server; I have done so multiple times. And the message I would like to give is, “No, you’re not stupid because this behavior of Node.js bit your back; it’s just that the designers of Node.js saw no reason to provide appropriate behavior here. Do try to cope with it, perhaps taking a little help from node-supervisor or another solution, but please don’t walk away feeling that you’re stupid. You’re not the one with the problem; the problem is in Node.js’s default behavior.”
This section, after some debate, was left in, precisely because I don't want to give an impression of “It’s easy.” I cut my hands repeatedly while getting things to work, and I don’t want to smooth over difficulties and set you up to believe that getting Node.js and its ecosystem to function well is a straightforward matter and if it’s not straightforward for you too, you don’t know what you’re doing. If you don’t run into obnoxious difficulties using Node.js, that’s wonderful. If you do, I would hope that you don’t walk away feeling, “I’m stupid—there must be something wrong with me.” You’re not stupid if you experience nasty surprises dealing with Node.js. It’s not you! It’s Node.js and its ecosystem!
The Appendix, which I did not really want after the rising crescendo in the last chapters and the conclusion, talks about what I was able to find in the ecosystem, and provided a workaround for moronic literalism:
Another database that seemed like a perfect fit, and may yet be redeemable, is a server-side implementation of the HTML5 key-value store. This approach has the cardinal advantage of an API that most good front-end developers understand well enough. For that matter, it’s also an API that most not-so-good front-end developers understand well enough. But with the node-localstorage package, while dictionary-syntax access is not offered (you want to use localStorage.setItem(key, value) or localStorage.getItem(key), not localStorage[key]), the full localStorage semantics are implemented, including a default 5MB quota—WHY? Do server-side JavaScript developers need to be protected from themselves?
For client-side database capabilities, a 5MB quota per website is really a generous and useful amount of breathing room to let developers work with it. You could set a much lower quota and still offer developers an immeasurable improvement over limping along with cookie management. A 5MB limit doesn’t lend itself very quickly to Big Data client-side processing, but there is a really quite generous allowance that resourceful developers can use to do a lot. But on the other hand, 5MB is not a particularly large portion of most disks purchased any time recently, meaning that if you and a website disagree about what is reasonable use of disk space, or some site is simply hoggish, it does not really cost you much and you are in no danger of a swamped hard drive unless your hard drive was already too full. Maybe we would be better off if the balance were a little less or a little more, but overall it’s a decent solution to address the intrinsic tension for a client-side context.
However, it might gently be pointed out that when you are the one writing code for your server, you don’t need any additional protection from making your database more than a tolerable 5MB in size. Most developers will neither need nor want tools acting as a nanny and protecting them from storing more than 5MB of server-side data. And the 5MB quota that is a golden balancing act on the client-side is rather a bit silly on a Node.js server. (And, for a database for multiple users such as is covered in this Appendix, it might be pointed out, slightly painfully, that that’s not 5MB per user account unless you create a separate database on disk for each user account; that’s 5MB shared between all user accounts together. That could get painful if you go viral!) The documentation states that the quota is customizable, but an email a week ago to the developer asking how to change the quota is unanswered, as was the StackOverflow question asking the same. The only answer I have been able to find is in the Github CoffeeScript source, where it is listed as an optional second integer argument to a constructor. So that’s easy enough, and you could specify a quota equal to a disk or partition size. But besides porting a feature that does not make sense, the tool’s author has failed completely to follow a very standard convention of interpreting 0 as meaning “unlimited” for a variable or function where an integer is to specify a maximum limit for some resource use. The best thing to do with this misfeature is probably to specify that the quota is Infinity:
if (typeof localStorage === 'undefined' || localStorage === null)
{
var LocalStorage = require('node-localstorage').LocalStorage;
localStorage = new LocalStorage(__dirname + '/localStorage',
Infinity);
}
Swapping two comments in order:
People needlessly shot themselves in the foot constantly using JavaScript as a whole, and part of JavaScript being made respectable language was a Douglas Crockford saying in essence, “JavaScript as a language has some really good parts and some really bad parts. Here are the good parts. Just forget that anything else is there.” Perhaps the hot Node.js ecosystem will grow its own “Douglas Crockford,” who will say, “The Node.js ecosystem is a coding Wild West, but there are some real gems to be found. Here’s a roadmap. Here are the areas to avoid at almost any cost. Here are the areas with some of the richest paydirt to be found in ANY language or environment.”
Perhaps someone else can take those words as a challenge, and follow Crockford’s lead and write up “the good parts” and / or “the better parts” for Node.js and its ecosystem. I’d buy a copy!
And given the degree of enthusiasm and sheer work-hours on all projects, it may be warranted in a year, or two, or three, to sharply temper any remarks about an immature ecosystem made at the time of this writing. It really may make sense in five years to say, “The 2015 Node.js ecosystem had several minefields. The 2020 Node.js ecosystem has multiple paradises.”

If your application mainly tethers web apis, or other io channels, give or take a user interface, node.js may be a fair pick for you, especially if you want to squeeze out the most scalability, or, if your main language in life is javascript (or javascript transpilers of sorts). If you build microservices, node.js is also okay. Node.js is also suitable for any project that is small or simple.
Its main selling point is it allows front-enders take responsibility for back-end stuff rather than the typical divide. Another justifiable selling point is if your workforce is javascript oriented to begin with.
Beyond a certain point however, you cannot scale your code without terrible hacks for forcing modularity, readability and flow control. Some people like those hacks though, especially coming from an event-driven javascript background, they seem familiar or forgivable.
In particular, when your application needs to perform synchronous flows, you start bleeding over half-baked solutions that slow you down considerably in terms of your development process. If you have computation intensive parts in your application, tread with caution picking (only) node.js. Maybe http://koajs.com/ or other novelties alleviate those originally thorny aspects, compared to when I originally used node.js or wrote this.

I can share few points where&why to use node js.
For realtime applications like chat,collaborative editing better we go with nodejs as it is event base where fire event and data to clients from server.
Simple and easy to understand as it is javascript base where most of people have idea.
Most of current web applications going towards angular js&backbone, with node it is easy to interact with client side code as both will use json data.
Lot of plugins available.
Drawbacks:-
Node will support most of databases but best is mongodb which won't support complex joins and others.
Compilation Errors...developer should handle each and every exceptions other wise if any error accord application will stop working where again we need to go and start it manually or using any automation tool.
Conclusion:-
Nodejs best to use for simple and real time applications..if you have very big business logic and complex functionality better should not use nodejs.
If you want to build an application along with chat and any collaborative functionality.. node can be used in specific parts and remain should go with your convenience technology.

Node is great for quick prototypes but I'd never use it again for anything complex.
I spent 20 years developing a relationship with a compiler and I sure miss it.
Node is especially painful for maintaining code that you haven't visited for awhile. Type info and compile time error detection are GOOD THINGS. Why throw all that out? For what? And dang, when something does go south the stack traces quite often completely useless.

Run jobs on FCFS basis in Nodejs from a database

I am developing a NodeJS application wherein a user can schedule a job (CPU intensive) to be run. I am keeping the event loop free and want to run the job in a separate process. When the user submits the job, I make an entry in the database (PostgreSQL), with the timestamp along with some other information. The processes should be run in the FCFS order. Upon some research on stackoverflow, I found people suggesting Bulljs (with Redis), Kue, RabbitMQ, etc. as a solution. My doubt is why do I need to use those when I can just poll the database and get the oldest job. I don't intend to poll the db at a regular interval but instead only when the current job is done executing.
My application does not receive too many simultaneous requests. And also users do not wait for the job to be completed. Instead they logout and are notified through mail when the job is done. What can be the potential drawbacks of using child_process (spawn/exec) module as a solution?

My doubt is why do I need to use those when I can just poll the database and get the oldest job.
How are you planning on handling failures? What if Node.js crashes with a job mid-progress, would that effect your users? Would you then retry a failed job? How do you support back-off? How many attempts before it should completely stop?
These questions are answered in the Bull implementation, RabbitMQ and almost every solution you'll find for your current challenge.
From what I noticed (child_process), it's a lower level implementation (low-level in Node.js), meaning that a lot of the functionality you'll typically require (failover/backoff) isn't included. You'll have to implement this.
That's where it usually becomes more trouble than it's worth, although admittedly managing, monitoring and deploying a Redis server may not be the most optimal solution either.
Have you considered a different approach, how would a periodic CRON job work? (For example).
The challenge with such a system is usually how you plan to handle failure and what impact failure has on your application and end-users.
I will say, in the defense of Bull, for a CPU intensive task I prefer to have a separated instance of the worker process, I can then re-deploy that single process as many times as I need. This keeps my back-end code separated and generally easier to manage, whilst also giving me the ability to easily scale up/down when required.
EDIT: I mention "more trouble than it's worth", if you're looking to really learn how technology like this is developed, go with child process and build your own abstractions on-top, if it's something you need today, use Bull, RabbitMQ or any purpose-built alternative.

What is the best way I can scale my nodejs app?

The basics
Right now a few of my friends and I are trying to develope a browser game made in nodejs. It's a multiplayer top-down shooter, and most of both the client-side and server-side code is in javascript. We have a good general direction that we'd like to go in, and we're having a lot of fun developing the game. One of our goals when making this game was to make it as hard as possible to cheat. Do do that, we have all of the game logic handled server-side. The client only sends their input the the server via web socket, and the server updates the client (also web socket) with what is happening in the game. Here's the start of our problem.
All of the server side math is getting pretty hefty, and we're finding that we need to scale in some way to handle anything more than 10 players (we want to be able to host many more). At first we had figured that we could just scale vertically as we needed to, but since nodejs is single threaded, is can only take advantage of one core. This means that getting a beefier server won't help that problem. Our only solution is to scale horizontally.
Why we're asking here
We haven't been able to find any good examples of how to scale out a nodejs game. Our use case is pretty particular, and while we've done our best to do this by ourselves, we could really benefit from outside opinions and advice
Details
We've already put a LOT of thought into how to solve this problem. We've been working on it for over a week. Here's what we have put together so far:
Four types of servers
We're splitting tasks into 4 different 'types' of servers. Each one will have a specific task it completes.
The proxy server
The proxy server would sit at the front of the entire stack, and be the only server directly accessible from the internet (there could potentially be more of these). It would have haproxy on it, and it would route all connections to the web servers. We chose haproxy because of its rich feature set, reliability, and nearly unbeatable speed.
The web server
The web server would receive the web-requests, and serve all web-pages. They would also handle lobby creation/management and game creation/management. To do this, they would tell the game servers what lobbies it has, what users are in that lobby, and info about the game they're going to play. The web servers would then update the game servers about user input, and the game server would update the web servers (who would then update the clients) of what's happening in the game. The web servers would use TCP sockets to communicate with the game servers about any type of management, and they would use UDP sockets when communicating about game updates. This would all be done with nodejs.
The game server
The game server would handle all the game math and variable updates about the game. The game servers also communicate with the db servers to record cool stats about players in game. This would be done with nodejs.
The db server
The db server would host the database. This part actually turned out to be the easiest since we found rethinkdb, the coolest db ever. This scales easily, and oddly enough, turned out to be the easiest part of scaling our application.
Some other details
If you're having trouble getting your head around our whole getup, look at this, it's a semi-accurate chart of how we think we'll scale.
If you're just curious, or think it might be helpful to look at our game, it's currently hosted in it's un-scaled state here.
Some things we don't want
We don't want to use the cluster module of nodejs. It isn't stable (said here), and it doesn't scale to other servers, only other processors. We'd like to just take the leap to horizontal scaling.
Our question, summed up
We hope we're going in the right direction, and we've done our homework, but we're not certain. We could certainly take a few tips on how to do this the right way.
Thanks
I realize that this is a pretty long question, and making a well thought out answer will not be easy, but I would really appreciate it.
Thanks!!

Following my spontaneous thoughts on your case:
Multicore usage
node.js can scale with multiple cores as well. How, you can read for example here (or just think about it: You have one thread/process running on one core, what do you need to use multiple cores? Multiple threads or multiple processes. Push work from main thread to other threads or processes and you are done).
I personally would say it is childish to develop an application, which does not make use of multiple cores. If you make use of some background processes, ok, but if you until now only do work in the node.js main event loop, you should definitely invest some time to make the app scalable over cores.
Implementing something like IPC is not that easy by the way. You can do, but if your case is complicated maybe you are good to go with the cluster module. This is obviously not your favorite, but just because something is called "experimental" it does not mean it's trashy. Just give it a try, maybe you can even fix some bugs of the module on the way. It's most likely better to use some broadly used software for complex problems, than invent a new wheel.
You should also (if you do not already) think about (wise) usage of nextTick functionality. This allows the main event loop to pause some cpu intensive task and perform other work in the meanwhile. You can read about it for example here.
General thoughts on computations
You should definitely take a very close look at your algorithms of the game engine. You already noticed that this is your bottleneck right now and actually computations are the most critical part of mostly every game. Scaling does solve this problem in one way, but scaling introduces other problems. Also you cannot throw "scaling" as problem solver on everything and expect every problem to disappear.
Your best bet is to make your game code elegant and fast. Think about how to solve problems efficiently. If you cannot solve something in Javascript efficiently, but the problem can easily be extracted, why not write a little C component instead? This counts as a separate process as well, which reduces load on your main node.js event loop.
Proxy?
Personally I do not see the advantage of the proxy level right now. You do not seem to expect large amount of users, you therefore won't need to solve problems like CDN solves or whatever... it's okay to think about it, but I would not invest much time there right now.
Technically there is a high chance your webserver software provides proxy functionality anyway. So it is ok to have it on the paper, but I would not plan with dedicated hardware right now.
Epilogue
The rest seems more or less fine to me.

Little late to the game, but take a look here: http://goldfirestudios.com/blog/136/Horizontally-Scaling-Node.js-and-WebSockets-with-Redis
You did not mention anything to do with memory management. As you know, nodejs doesn't share its memory with other processes, so an in-memory database is a must if you want to scale. (Redis, Memcache, etc). You need to setup a publisher & subscriber event on each node to accept incoming requests from redis. This way, you can scale up x nilo amount of servers (infront of your HAProxy) and utilize the data piped from redis.
There is also this node addon: http://blog.varunajayasiri.com/shared-memory-with-nodejs That lets you share memory between processes, but only works under Linux. This will help if you don't want to send data across local processes all the time or have to deal with nodes ipc api.
You can also fork child processes within node for a new v8 isolate to help with expensive cpu bound tasks. For example, players can kill monsters and obtain quite a bit of loot within my action rpg game. I have a child process called LootGenerater, and basically whenever a player kills a monster it sends the game id, mob_id, and user_id to the process via the default IPC api .send. Once the child process receives it, it iterates over the large loot table and manages the items (stores to redis, or whatever) and pipes it back.
This helps free up the event loop greatly, and just one idea I can think of to help you scale. But most importantly you will want to use an in-memory database system and make sure your game code architecture is designed around whatever database system you use. Don't make the mistake I did by now having to re-write everything :)
Hope this helps!
Note: If you do decide to go with Memcache, you will need to utilize another pub/sub system.

How to decide when to use Node.js?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am new to this kind of stuff, but lately I've been hearing a lot about how good Node.js is. Considering how much I love working with jQuery and JavaScript in general, I can't help but wonder how to decide when to use Node.js. The web application I have in mind is something like Bitly - takes some content, archives it.
From all the homework I have been doing in the last few days, I obtained the following information. Node.js
is a command-line tool that can be run as a regular web server and lets one run JavaScript programs
utilizes the great V8 JavaScript engine
is very good when you need to do several things at the same time
is event-based so all the wonderful Ajax-like stuff can be done on the server side
lets us share code between the browser and the backend
lets us talk with MySQL
Some of the sources that I have come across are:
Diving into Node.js – Introduction and Installation
Understanding NodeJS
Node by Example (Archive.is)
Let’s Make a Web App: NodePad
Considering that Node.js can be run almost out-of-the-box on Amazon's EC2 instances, I am trying to understand what type of problems require Node.js as opposed to any of the mighty kings out there like PHP, Python and Ruby. I understand that it really depends on the expertise one has on a language, but my question falls more into the general category of: When to use a particular framework and what type of problems is it particularly suited for?

I believe Node.js is best suited for real-time applications: online games, collaboration tools, chat rooms, or anything where what one user (or robot? or sensor?) does with the application needs to be seen by other users immediately, without a page refresh.
I should also mention that Socket.IO in combination with Node.js will reduce your real-time latency even further than what is possible with long polling. Socket.IO will fall back to long polling as a worst case scenario, and instead use web sockets or even Flash if they are available.
But I should also mention that just about any situation where the code might block due to threads can be better addressed with Node.js. Or any situation where you need the application to be event-driven.
Also, Ryan Dahl said in a talk that I once attended that the Node.js benchmarks closely rival Nginx for regular old HTTP requests. So if we build with Node.js, we can serve our normal resources quite effectively, and when we need the event-driven stuff, it's ready to handle it.
Plus it's all JavaScript all the time. Lingua Franca on the whole stack.

To make it short:
Node.js is well suited for applications that have a lot of concurrent connections and each request only needs very few CPU cycles, because the event loop (with all the other clients) is blocked during execution of a function.
A good article about the event loop in Node.js is Mixu's tech blog: Understanding the node.js event loop.

I have one real-world example where I have used Node.js. The company where I work got one client who wanted to have a simple static HTML website. This website is for selling one item using PayPal and the client also wanted to have a counter which shows the amount of sold items. Client expected to have huge amount of visitors to this website. I decided to make the counter using Node.js and the Express.js framework.
The Node.js application was simple. Get the sold items amount from a Redis database, increase the counter when item is sold and serve the counter value to users via the API.
Some reasons why I chose to use Node.js in this case
It is very lightweight and fast. There has been over 200000 visits on this website in three weeks and minimal server resources has been able to handle it all.
The counter is really easy to make to be real time.
Node.js was easy to configure.
There are lots of modules available for free. For example, I found a Node.js module for PayPal.
In this case, Node.js was an awesome choice.

The most important reasons to start your next project using Node ...
All the coolest dudes are into it ... so it must be fun.
You can hangout at the cooler and have lots of Node adventures to brag about.
You're a penny pincher when it comes to cloud hosting costs.
Been there done that with Rails
You hate IIS deployments
Your old IT job is getting rather dull and you wish you were in a shiny new Start Up.
What to expect ...
You'll feel safe and secure with Express without all the server bloatware you never needed.
Runs like a rocket and scales well.
You dream it. You installed it. The node package repo npmjs.org is the largest ecosystem of open source libraries in the world.
Your brain will get time warped in the land of nested callbacks ...
... until you learn to keep your Promises.
Sequelize and Passport are your new API friends.
Debugging mostly async code will get umm ... interesting .
Time for all Noders to master Typescript.
Who uses it?
PayPal, Netflix, Walmart, LinkedIn, Groupon, Uber, GoDaddy, Dow Jones
Here's why they switched to Node.

There is nothing like Silver Bullet. Everything comes with some cost associated with it. It is like if you eat oily food, you will compromise your health and healthy food does not come with spices like oily food. It is individual choice whether they want health or spices as in their food.
Same way Node.js consider to be used in specific scenario. If your app does not fit into that scenario you should not consider it for your app development. I am just putting my thought on the same:
When to use Node.JS
If your server side code requires very few cpu cycles. In other world you are doing non blocking operation and does not have heavy algorithm/Job which consumes lots of CPU cycles.
If you are from Javascript back ground and comfortable in writing Single Threaded code just like client side JS.
When NOT to use Node.JS
Your server request is dependent on heavy CPU consuming algorithm/Job.
Scalability Consideration with Node.JS
Node.JS itself does not utilize all core of underlying system and it is single threaded by default, you have to write logic by your own to utilize multi core processor and make it multi threaded.
Node.JS Alternatives
There are other option to use in place of Node.JS however Vert.x seems to be pretty promising and has lots of additional features like polygot and better scalability considerations.

Another great thing that I think no one has mentioned about Node.js is the amazing community, the package management system (npm) and the amount of modules that exist that you can include by simply including them in your package.json file.

My piece: nodejs is great for making real time systems like analytics, chat-apps, apis, ad servers, etc.
Hell, I made my first chat app using nodejs and socket.io under 2 hours and that too during exam
week!
Edit
Its been several years since I have started using nodejs and I have used it in making many different things including static file servers, simple analytics, chat apps and much more.
This is my take on when to use nodejs
When to use
When making system which put emphasis on concurrency and speed.
Sockets only servers like chat apps, irc apps, etc.
Social networks which put emphasis on realtime resources like geolocation, video stream, audio stream, etc.
Handling small chunks of data really fast like an analytics webapp.
As exposing a REST only api.
When not to use
Its a very versatile webserver so you can use it wherever you want but probably not these places.
Simple blogs and static sites.
Just as a static file server.
Keep in mind that I am just nitpicking. For static file servers, apache is better mainly because it is widely available. The nodejs community has grown larger and more mature over the years and it is safe to say nodejs can be used just about everywhere if you have your own choice of hosting.

It can be used where
Applications that are highly event driven & are heavily I/O bound
Applications handling a large number of connections to other systems
Real-time applications (Node.js was designed from the ground up for real time and to be easy
to use.)
Applications that juggle scads of information streaming to and from other sources
High traffic, Scalable applications
Mobile apps that have to talk to platform API & database, without having to do a lot of data
analytics
Build out networked applications
Applications that need to talk to the back end very often
On Mobile front, prime-time companies have relied on Node.js for their mobile solutions. Check out why?
LinkedIn is a prominent user. Their entire mobile stack is built on Node.js. They went from running 15 servers with 15 instances on each physical machine, to just 4 instances – that can handle double the traffic!
eBay launched ql.io, a web query language for HTTP APIs, which uses Node.js as the runtime stack. They were able to tune a regular developer-quality Ubuntu workstation to handle more than 120,000 active connections per node.js process, with each connection consuming about 2kB memory!
Walmart re-engineered its mobile app to use Node.js and pushed its JavaScript processing to the server.
Read more at: http://www.pixelatingbits.com/a-closer-look-at-mobile-app-development-with-node-js/

Node best for concurrent request handling -
So, Let’s start with a story. From last 2 years I am working on JavaScript and developing web front end and I am enjoying it. Back end guys provide’s us some API’s written in Java,python (we don’t care) and we simply write a AJAX call, get our data and guess what ! we are done. But in real it is not that easy, If data we are getting is not correct or there is some server error then we stuck and we have to contact our back end guys over the mail or chat(sometimes on whatsApp too :).) This is not cool. What if we wrote our API’s in JavaScript and call those API’s from our front end ? Yes that’s pretty cool because if we face any problem in API we can look into it. Guess what ! you can do this now , How ? – Node is there for you.
Ok agreed that you can write your API in JavaScript but what if I am ok with above problem. Do you have any other reason to use node for rest API ?
so here is the magic begins. Yes I do have other reasons to use node for our API’s.
Let’s go back to our traditional rest API system which is based on either blocking operation or threading. Suppose two concurrent request occurs( r1 and r2) , each of them require database operation. So In traditional system what will happens :
1. Waiting Way : Our server starts serving r1 request and waits for query response. after completion of r1 , server starts to serve r2 and does it in same way. So waiting is not a good idea because we don’t have that much time.
2. Threading Way : Our server will creates two threads for both requests r1 and r2 and serve their purpose after querying database so cool its fast.But it is memory consuming because you can see we started two threads also problem increases when both request is querying same data then you have to deal with deadlock kind of issues . So its better than waiting way but still issues are there.
Now here is , how node will do it:
3. Nodeway : When same concurrent request comes in node then it will register an event with its callback and move ahead it will not wait for query response for a particular request.So when r1 request comes then node’s event loop (yes there is an event loop in node which serves this purpose.) register an event with its callback function and move ahead for serving r2 request and similarly register its event with its callback. Whenever any query finishes it triggers its corresponding event and execute its callback to completion without being interrupted.
So no waiting, no threading , no memory consumption – yes this is nodeway for serving rest API.

My one more reason to choose Node.js for a new project is:
Be able to do pure cloud based development
I have used Cloud9 IDE for a while and now I can't imagine without it, it covers all the development lifecycles. All you need is a browser and you can code anytime anywhere on any devices. You don't need to check in code in one Computer(like at home), then checkout in another computer(like at work place).
Of course, there maybe cloud based IDE for other languages or platforms (Cloud 9 IDE is adding supports for other languages as well), but using Cloud 9 to do Node.js developement is really a great experience for me.

One more thing node provides is the ability to create multiple v8 instanes of node using node's child process( childProcess.fork() each requiring 10mb memory as per docs) on the fly, thus not affecting the main process running the server. So offloading a background job that requires huge server load becomes a child's play and we can easily kill them as and when needed.
I've been using node a lot and in most of the apps we build, require server connections at the same time thus a heavy network traffic. Frameworks like Express.js and the new Koajs (which removed callback hell) have made working on node even more easier.

If your application mainly tethers web apis, or other io channels, give or take a user interface, node.js may be a fair pick for you, especially if you want to squeeze out the most scalability, or, if your main language in life is javascript (or javascript transpilers of sorts). If you build microservices, node.js is also okay. Node.js is also suitable for any project that is small or simple.
Its main selling point is it allows front-enders take responsibility for back-end stuff rather than the typical divide. Another justifiable selling point is if your workforce is javascript oriented to begin with.
Beyond a certain point however, you cannot scale your code without terrible hacks for forcing modularity, readability and flow control. Some people like those hacks though, especially coming from an event-driven javascript background, they seem familiar or forgivable.
In particular, when your application needs to perform synchronous flows, you start bleeding over half-baked solutions that slow you down considerably in terms of your development process. If you have computation intensive parts in your application, tread with caution picking (only) node.js. Maybe http://koajs.com/ or other novelties alleviate those originally thorny aspects, compared to when I originally used node.js or wrote this.

I can share few points where&why to use node js.
For realtime applications like chat,collaborative editing better we go with nodejs as it is event base where fire event and data to clients from server.
Simple and easy to understand as it is javascript base where most of people have idea.
Most of current web applications going towards angular js&backbone, with node it is easy to interact with client side code as both will use json data.
Lot of plugins available.
Drawbacks:-
Node will support most of databases but best is mongodb which won't support complex joins and others.
Compilation Errors...developer should handle each and every exceptions other wise if any error accord application will stop working where again we need to go and start it manually or using any automation tool.
Conclusion:-
Nodejs best to use for simple and real time applications..if you have very big business logic and complex functionality better should not use nodejs.
If you want to build an application along with chat and any collaborative functionality.. node can be used in specific parts and remain should go with your convenience technology.

Node is great for quick prototypes but I'd never use it again for anything complex.
I spent 20 years developing a relationship with a compiler and I sure miss it.
Node is especially painful for maintaining code that you haven't visited for awhile. Type info and compile time error detection are GOOD THINGS. Why throw all that out? For what? And dang, when something does go south the stack traces quite often completely useless.

What is Node.js? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I don't fully get what Node.js is all about. Maybe it's because I am mainly a web based business application developer. What is it and what is the use of it?
My understanding so far is that:
The programming model is event driven, especially the way it handles I/O.
It uses JavaScript and the parser is V8.
It can be easily used to create concurrent server applications.
Are my understandings correct? If yes, then what are the benefits of evented I/O, is it just more for the concurrency stuff? Also, is the direction of Node.js to become a framework like, JavaScript based (V8 based) programming model?

I use Node.js at work, and find it to be very powerful. Forced to choose one word to describe Node.js, I'd say "interesting" (which is not a purely positive adjective). The community is vibrant and growing. JavaScript, despite its oddities can be a great language to code in. And you will daily rethink your own understanding of "best practice" and the patterns of well-structured code. There's an enormous energy of ideas flowing into Node.js right now, and working in it exposes you to all this thinking - great mental weightlifting.
Node.js in production is definitely possible, but far from the "turn-key" deployment seemingly promised by the documentation. With Node.js v0.6.x, "cluster" has been integrated into the platform, providing one of the essential building blocks, but my "production.js" script is still ~150 lines of logic to handle stuff like creating the log directory, recycling dead workers, etc. For a "serious" production service, you also need to be prepared to throttle incoming connections and do all the stuff that Apache does for PHP. To be fair, Ruby on Rails has this exact problem. It is solved via two complementary mechanisms: 1) Putting Ruby on Rails/Node.js behind a dedicated webserver (written in C and tested to hell and back) like Nginx (or Apache / Lighttd). The webserver can efficiently serve static content, access logging, rewrite URLs, terminate SSL, enforce access rules, and manage multiple sub-services. For requests that hit the actual node service, the webserver proxies the request through. 2) Using a framework like Unicorn that will manage the worker processes, recycle them periodically, etc. I've yet to find a Node.js serving framework that seems fully baked; it may exist, but I haven't found it yet and still use ~150 lines in my hand-rolled "production.js".
Reading frameworks like Express makes it seem like the standard practice is to just serve everything through one jack-of-all-trades Node.js service ... "app.use(express.static(__dirname + '/public'))". For lower-load services and development, that's probably fine. But as soon as you try to put big time load on your service and have it run 24/7, you'll quickly discover the motivations that push big sites to have well baked, hardened C-code like Nginx fronting their site and handling all of the static content requests (...until you set up a CDN, like Amazon CloudFront)). For a somewhat humorous and unabashedly negative take on this, see this guy.
Node.js is also finding more and more non-service uses. Even if you are using something else to serve web content, you might still use Node.js as a build tool, using npm modules to organize your code, Browserify to stitch it into a single asset, and uglify-js to minify it for deployment. For dealing with the web, JavaScript is a perfect impedance match and frequently that makes it the easiest route of attack. For example, if you want to grovel through a bunch of JSON response payloads, you should use my underscore-CLI module, the utility-belt of structured data.
Pros / Cons:
Pro: For a server guy, writing JavaScript on the backend has been a "gateway drug" to learning modern UI patterns. I no longer dread writing client code.
Pro: Tends to encourage proper error checking (err is returned by virtually all callbacks, nagging the programmer to handle it; also, async.js and other libraries handle the "fail if any of these subtasks fails" paradigm much better than typical synchronous code)
Pro: Some interesting and normally hard tasks become trivial - like getting status on tasks in flight, communicating between workers, or sharing cache state
Pro: Huge community and tons of great libraries based on a solid package manager (npm)
Con: JavaScript has no standard library. You get so used to importing functionality that it feels weird when you use JSON.parse or some other build in method that doesn't require adding an npm module. This means that there are five versions of everything. Even the modules included in the Node.js "core" have five more variants should you be unhappy with the default implementation. This leads to rapid evolution, but also some level of confusion.
Versus a simple one-process-per-request model (LAMP):
Pro: Scalable to thousands of active connections. Very fast and very efficient. For a web fleet, this could mean a 10X reduction in the number of boxes required versus PHP or Ruby
Pro: Writing parallel patterns is easy. Imagine that you need to fetch three (or N) blobs from Memcached. Do this in PHP ... did you just write code the fetches the first blob, then the second, then the third? Wow, that's slow. There's a special PECL module to fix that specific problem for Memcached, but what if you want to fetch some Memcached data in parallel with your database query? In Node.js, because the paradigm is asynchronous, having a web request do multiple things in parallel is very natural.
Con: Asynchronous code is fundamentally more complex than synchronous code, and the up-front learning curve can be hard for developers without a solid understanding of what concurrent execution actually means. Still, it's vastly less difficult than writing any kind of multithreaded code with locking.
Con: If a compute-intensive request runs for, for example, 100 ms, it will stall processing of other requests that are being handled in the same Node.js process ... AKA, cooperative-multitasking. This can be mitigated with the Web Workers pattern (spinning off a subprocess to deal with the expensive task). Alternatively, you could use a large number of Node.js workers and only let each one handle a single request concurrently (still fairly efficient because there is no process recycle).
Con: Running a production system is MUCH more complicated than a CGI model like Apache + PHP, Perl, Ruby, etc. Unhandled exceptions will bring down the entire process, necessitating logic to restart failed workers (see cluster). Modules with buggy native code can hard-crash the process. Whenever a worker dies, any requests it was handling are dropped, so one buggy API can easily degrade service for other cohosted APIs.
Versus writing a "real" service in Java / C# / C (C? really?)
Pro: Doing asynchronous in Node.js is easier than doing thread-safety anywhere else and arguably provides greater benefit. Node.js is by far the least painful asynchronous paradigm I've ever worked in. With good libraries, it is only slightly harder than writing synchronous code.
Pro: No multithreading / locking bugs. True, you invest up front in writing more verbose code that expresses a proper asynchronous workflow with no blocking operations. And you need to write some tests and get the thing to work (it is a scripting language and fat fingering variable names is only caught at unit-test time). BUT, once you get it to work, the surface area for heisenbugs -- strange problems that only manifest once in a million runs -- that surface area is just much much lower. The taxes writing Node.js code are heavily front-loaded into the coding phase. Then you tend to end up with stable code.
Pro: JavaScript is much more lightweight for expressing functionality. It's hard to prove this with words, but JSON, dynamic typing, lambda notation, prototypal inheritance, lightweight modules, whatever ... it just tends to take less code to express the same ideas.
Con: Maybe you really, really like coding services in Java?
For another perspective on JavaScript and Node.js, check out From Java to Node.js, a blog post on a Java developer's impressions and experiences learning Node.js.
Modules
When considering node, keep in mind that your choice of JavaScript libraries will DEFINE your experience. Most people use at least two, an asynchronous pattern helper (Step, Futures, Async), and a JavaScript sugar module (Underscore.js).
Helper / JavaScript Sugar:
Underscore.js - use this. Just do it. It makes your code nice and readable with stuff like _.isString(), and _.isArray(). I'm not really sure how you could write safe code otherwise. Also, for enhanced command-line-fu, check out my own Underscore-CLI.
Asynchronous Pattern Modules:
Step - a very elegant way to express combinations of serial and parallel actions. My personal reccomendation. See my post on what Step code looks like.
Futures - much more flexible (is that really a good thing?) way to express ordering through requirements. Can express things like "start a, b, c in parallel. When A, and B finish, start AB. When A, and C finish, start AC." Such flexibility requires more care to avoid bugs in your workflow (like never calling the callback, or calling it multiple times). See Raynos's post on using futures (this is the post that made me "get" futures).
Async - more traditional library with one method for each pattern. I started with this before my religious conversion to step and subsequent realization that all patterns in Async could be expressed in Step with a single more readable paradigm.
TameJS - Written by OKCupid, it's a precompiler that adds a new language primative "await" for elegantly writing serial and parallel workflows. The pattern looks amazing, but it does require pre-compilation. I'm still making up my mind on this one.
StreamlineJS - competitor to TameJS. I'm leaning toward Tame, but you can make up your own mind.
Or to read all about the asynchronous libraries, see this panel-interview with the authors.
Web Framework:
Express Great Ruby on Rails-esk framework for organizing web sites. It uses JADE as a XML/HTML templating engine, which makes building HTML far less painful, almost elegant even.
jQuery While not technically a node module, jQuery is quickly becoming a de-facto standard for client-side user interface. jQuery provides CSS-like selectors to 'query' for sets of DOM elements that can then be operated on (set handlers, properties, styles, etc). Along the same vein, Twitter's Bootstrap CSS framework, Backbone.js for an MVC pattern, and Browserify.js to stitch all your JavaScript files into a single file. These modules are all becoming de-facto standards so you should at least check them out if you haven't heard of them.
Testing:
JSHint - Must use; I didn't use this at first which now seems incomprehensible. JSLint adds back a bunch of the basic verifications you get with a compiled language like Java. Mismatched parenthesis, undeclared variables, typeos of many shapes and sizes. You can also turn on various forms of what I call "anal mode" where you verify style of whitespace and whatnot, which is OK if that's your cup of tea -- but the real value comes from getting instant feedback on the exact line number where you forgot a closing ")" ... without having to run your code and hit the offending line. "JSHint" is a more-configurable variant of Douglas Crockford's JSLint.
Mocha competitor to Vows which I'm starting to prefer. Both frameworks handle the basics well enough, but complex patterns tend to be easier to express in Mocha.
Vows Vows is really quite elegant. And it prints out a lovely report (--spec) showing you which test cases passed / failed. Spend 30 minutes learning it, and you can create basic tests for your modules with minimal effort.
Zombie - Headless testing for HTML and JavaScript using JSDom as a virtual "browser". Very powerful stuff. Combine it with Replay to get lightning fast deterministic tests of in-browser code.
A comment on how to "think about" testing:
Testing is non-optional. With a dynamic language like JavaScript, there are very few static checks. For example, passing two parameters to a method that expects 4 won't break until the code is executed. Pretty low bar for creating bugs in JavaScript. Basic tests are essential to making up the verification gap with compiled languages.
Forget validation, just make your code execute. For every method, my first validation case is "nothing breaks", and that's the case that fires most often. Proving that your code runs without throwing catches 80% of the bugs and will do so much to improve your code confidence that you'll find yourself going back and adding the nuanced validation cases you skipped.
Start small and break the inertial barrier. We are all lazy, and pressed for time, and it's easy to see testing as "extra work". So start small. Write test case 0 - load your module and report success. If you force yourself to do just this much, then the inertial barrier to testing is broken. That's <30 min to do it your first time, including reading the documentation. Now write test case 1 - call one of your methods and verify "nothing breaks", that is, that you don't get an error back. Test case 1 should take you less than one minute. With the inertia gone, it becomes easy to incrementally expand your test coverage.
Now evolve your tests with your code. Don't get intimidated by what the "correct" end-to-end test would look like with mock servers and all that. Code starts simple and evolves to handle new cases; tests should too. As you add new cases and new complexity to your code, add test cases to exercise the new code. As you find bugs, add verifications and / or new cases to cover the flawed code. When you are debugging and lose confidence in a piece of code, go back and add tests to prove that it is doing what you think it is. Capture strings of example data (from other services you call, websites you scrape, whatever) and feed them to your parsing code. A few cases here, improved validation there, and you will end up with highly reliable code.
Also, check out the official list of recommended Node.js modules. However, GitHub's Node Modules Wiki is much more complete and a good resource.
To understand Node, it's helpful to consider a few of the key design choices:
Node.js is EVENT BASED and ASYNCHRONOUS / NON-BLOCKING. Events, like an incoming HTTP connection will fire off a JavaScript function that does a little bit of work and kicks off other asynchronous tasks like connecting to a database or pulling content from another server. Once these tasks have been kicked off, the event function finishes and Node.js goes back to sleep. As soon as something else happens, like the database connection being established or the external server responding with content, the callback functions fire, and more JavaScript code executes, potentially kicking off even more asynchronous tasks (like a database query). In this way, Node.js will happily interleave activities for multiple parallel workflows, running whatever activities are unblocked at any point in time. This is why Node.js does such a great job managing thousands of simultaneous connections.
Why not just use one process/thread per connection like everyone else? In Node.js, a new connection is just a very small heap allocation. Spinning up a new process takes significantly more memory, a megabyte on some platforms. But the real cost is the overhead associated with context-switching. When you have 10^6 kernel threads, the kernel has to do a lot of work figuring out who should execute next. A bunch of work has gone into building an O(1) scheduler for Linux, but in the end, it's just way way more efficient to have a single event-driven process than 10^6 processes competing for CPU time. Also, under overload conditions, the multi-process model behaves very poorly, starving critical administration and management services, especially SSHD (meaning you can't even log into the box to figure out how screwed it really is).
Node.js is SINGLE THREADED and LOCK FREE. Node.js, as a very deliberate design choice only has a single thread per process. Because of this, it's fundamentally impossible for multiple threads to access data simultaneously. Thus, no locks are needed. Threads are hard. Really really hard. If you don't believe that, you haven't done enough threaded programming. Getting locking right is hard and results in bugs that are really hard to track down. Eliminating locks and multi-threading makes one of the nastiest classes of bugs just go away. This might be the single biggest advantage of node.
But how do I take advantage of my 16 core box?
Two ways:
For big heavy compute tasks like image encoding, Node.js can fire up child processes or send messages to additional worker processes. In this design, you'd have one thread managing the flow of events and N processes doing heavy compute tasks and chewing up the other 15 CPUs.
For scaling throughput on a webservice, you should run multiple Node.js servers on one box, one per core, using cluster (With Node.js v0.6.x, the official "cluster" module linked here replaces the learnboost version which has a different API). These local Node.js servers can then compete on a socket to accept new connections, balancing load across them. Once a connection is accepted, it becomes tightly bound to a single one of these shared processes. In theory, this sounds bad, but in practice it works quite well and allows you to avoid the headache of writing thread-safe code. Also, this means that Node.js gets excellent CPU cache affinity, more effectively using memory bandwidth.
Node.js lets you do some really powerful things without breaking a sweat. Suppose you have a Node.js program that does a variety of tasks, listens on a TCP port for commands, encodes some images, whatever. With five lines of code, you can add in an HTTP based web management portal that shows the current status of active tasks. This is EASY to do:
var http = require('http');
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end(myJavascriptObject.getSomeStatusInfo());
}).listen(1337, "127.0.0.1");
Now you can hit a URL and check the status of your running process. Add a few buttons, and you have a "management portal". If you have a running Perl / Python / Ruby script, just "throwing in a management portal" isn't exactly simple.
But isn't JavaScript slow / bad / evil / spawn-of-the-devil? JavaScript has some weird oddities, but with "the good parts" there's a very powerful language there, and in any case, JavaScript is THE language on the client (browser). JavaScript is here to stay; other languages are targeting it as an IL, and world class talent is competing to produce the most advanced JavaScript engines. Because of JavaScript's role in the browser, an enormous amount of engineering effort is being thrown at making JavaScript blazing fast. V8 is the latest and greatest javascript engine, at least for this month. It blows away the other scripting languages in both efficiency AND stability (looking at you, Ruby). And it's only going to get better with huge teams working on the problem at Microsoft, Google, and Mozilla, competing to build the best JavaScript engine (It's no longer a JavaScript "interpreter" as all the modern engines do tons of JIT compiling under the hood with interpretation only as a fallback for execute-once code). Yeah, we all wish we could fix a few of the odder JavaScript language choices, but it's really not that bad. And the language is so darn flexible that you really aren't coding JavaScript, you are coding Step or jQuery -- more than any other language, in JavaScript, the libraries define the experience. To build web applications, you pretty much have to know JavaScript anyway, so coding with it on the server has a sort of skill-set synergy. It has made me not dread writing client code.
Besides, if you REALLY hate JavaScript, you can use syntactic sugar like CoffeeScript. Or anything else that creates JavaScript code, like Google Web Toolkit (GWT).
Speaking of JavaScript, what's a "closure"? - Pretty much a fancy way of saying that you retain lexically scoped variables across call chains. ;) Like this:
var myData = "foo";
database.connect( 'user:pass', function myCallback( result ) {
database.query("SELECT * from Foo where id = " + myData);
} );
// Note that doSomethingElse() executes _BEFORE_ "database.query" which is inside a callback
doSomethingElse();
See how you can just use "myData" without doing anything awkward like stashing it into an object? And unlike in Java, the "myData" variable doesn't have to be read-only. This powerful language feature makes asynchronous-programming much less verbose and less painful.
Writing asynchronous code is always going to be more complex than writing a simple single-threaded script, but with Node.js, it's not that much harder and you get a lot of benefits in addition to the efficiency and scalability to thousands of concurrent connections...

Locked. There are disputes about this answer’s content being resolved at this time. It is not currently accepting new interactions.
I think the advantages are:
Web development in a dynamic language (JavaScript) on a VM that is incredibly fast (V8). It is much faster than Ruby, Python, or Perl.
Ability to handle thousands of concurrent connections with minimal overhead on a single process.
JavaScript is perfect for event loops with first class function objects and closures. People already know how to use it this way having used it in the browser to respond to user initiated events.
A lot of people already know JavaScript, even people who do not claim to be programmers. It is arguably the most popular programming language.
Using JavaScript on a web server as well as the browser reduces the impedance mismatch between the two programming environments which can communicate data structures via JSON that work the same on both sides of the equation. Duplicate form validation code can be shared between server and client, etc.

V8 is an implementation of JavaScript. It lets you run standalone JavaScript applications (among other things).
Node.js is simply a library written for V8 which does evented I/O. This concept is a bit trickier to explain, and I'm sure someone will answer with a better explanation than I... The gist is that rather than doing some input or output and waiting for it to happen, you just don't wait for it to finish. So for example, ask for the last edited time of a file:
// Pseudo code
stat( 'somefile' )
That might take a couple of milliseconds, or it might take seconds. With evented I/O you simply fire off the request and instead of waiting around you attach a callback that gets run when the request finishes:
// Pseudo code
stat( 'somefile', function( result ) {
// Use the result here
} );
// ...more code here
This makes it a lot like JavaScript code in the browser (for example, with Ajax style functionality).
For more information, you should check out the article Node.js is genuinely exciting which was my introduction to the library/platform... I found it quite good.

Node.js is an open source command line tool built for the server side JavaScript code. You can download a tarball, compile and install the source. It lets you run JavaScript programs.
The JavaScript is executed by the V8, a JavaScript engine developed by Google which is used in Chrome browser. It uses a JavaScript API to access the network and file system.
It is popular for its performance and the ability to perform parallel operations.
Understanding node.js is the best explanation of node.js I have found so far.
Following are some good articles on the topic.
Learning Server-Side JavaScript with Node.js
This Time, You’ll Learn Node.js

The closures are a way to execute code in the context it was created in.
What this means for concurency is that you can define variables, then initiate a nonblocking I/O function, and send it an anonymous function for its callback.
When the task is complete, the callback function will execute in the context with the variables, this is the closure.
The reason closures are so good for writing applications with nonblocking I/O is that it's very easy to manage the context of functions executing asynchronously.

Two good examples are regarding how you manage templates and use progressive enhancements with it. You just need a few lightweight pieces of JavaScript code to make it work perfectly.
I strongly recommend that you watch and read these articles:
Video glass node
Node.js YUI DOM manipulation
Pick up any language and try to remember how you would manage your HTML file templates and what you had to do to update a single CSS class name in your DOM structure (for instance, a user clicked on a menu item and you want that marked as "selected" and update the content of the page).
With Node.js it is as simple as doing it in client-side JavaScript code. Get your DOM node and apply your CSS class to that. Get your DOM node and innerHTML your content (you will need some additional JavaScript code to do this. Read the article to know more).
Another good example, is that you can make your web page compatible both with JavaScript turned on or off with the same piece of code. Imagine you have a date selection made in JavaScript that would allow your users to pick up any date using a calendar. You can write (or use) the same piece of JavaScript code to make it work with your JavaScript turned ON or OFF.

There is a very good fast food place analogy that best explains the event driven model of Node.js, see the full article, Node.js, Doctor’s Offices and Fast Food Restaurants – Understanding Event-driven Programming
Here is a summary:
If the fast food joint followed a traditional thread-based model, you'd order your food and wait in line until you received it. The person behind you wouldn't be able to order until your order was done. In an event-driven model, you order your food and then get out of line to wait. Everyone else is then free to order.
Node.js is event-driven, but most web servers are thread-based.York explains how Node.js works:
You use your web browser to make a request for "/about.html" on a
Node.js web server.
The Node.js server accepts your request and calls a function to retrieve
that file from disk.
While the Node.js server is waiting for the file to be retrieved, it
services the next web request.
When the file is retrieved, there is a callback function that is
inserted in the Node.js servers queue.
The Node.js server executes that function which in this case would
render the "/about.html" page and send it back to your web browser."

Well, I understand that
Node's goal is to provide an easy way
to build scalable network programs.
Node is similar in design to and influenced by systems like Ruby's Event Machine or Python's Twisted.
Evented I/O for V8 javascript.
For me that means that you were correct in all three assumptions. The library sure looks promising!

Also, do not forget to mention that Google's V8 is VERY fast. It actually converts the JavaScript code to machine code with the matched performance of compiled binary. So along with all the other great things, it's INSANELY fast.

Q: The programming model is event driven, especially the way it handles I/O.
Correct. It uses call-backs, so any request to access the file system would cause a request to be sent to the file system and then Node.js would start processing its next request. It would only worry about the I/O request once it gets a response back from the file system, at which time it will run the callback code. However, it is possible to make synchronous I/O requests (that is, blocking requests). It is up to the developer to choose between asynchronous (callbacks) or synchronous (waiting).
Q: It uses JavaScript and the parser is V8.
Yes
Q: It can be easily used to create concurrent server applications.
Yes, although you'd need to hand-code quite a lot of JavaScript. It might be better to look at a framework, such as http://www.easynodejs.com/ - which comes with full online documentation and a sample application.

We Keep Coding

JavaScript is the programming language of the Web.