Peak load using Node.js

Peak load using Node.js - javascript

I am creating a web application using Node.js, socket.io & express modules. and I would like to find out concerning peak loads on server. Every user make 5-7 request per second.
Server:
RAM 2GB
CPU 2x2 GHz
How much connections can this server process?
Is it necessary to use Web Workers?
What recommendations can give concerning high load or maybe you know some statistic information.

This is almost impossible to answer question since no one knows what/how you are going to develop. In case you need a general number that you can compare to your system and case, you can check this link for single node process, cluster and multithreaded benchmarks for node and jxcore distro.

Related

Confused about nodes purpose [duplicate]

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am new to this kind of stuff, but lately I've been hearing a lot about how good Node.js is. Considering how much I love working with jQuery and JavaScript in general, I can't help but wonder how to decide when to use Node.js. The web application I have in mind is something like Bitly - takes some content, archives it.
From all the homework I have been doing in the last few days, I obtained the following information. Node.js
is a command-line tool that can be run as a regular web server and lets one run JavaScript programs
utilizes the great V8 JavaScript engine
is very good when you need to do several things at the same time
is event-based so all the wonderful Ajax-like stuff can be done on the server side
lets us share code between the browser and the backend
lets us talk with MySQL
Some of the sources that I have come across are:
Diving into Node.js – Introduction and Installation
Understanding NodeJS
Node by Example (Archive.is)
Let’s Make a Web App: NodePad
Considering that Node.js can be run almost out-of-the-box on Amazon's EC2 instances, I am trying to understand what type of problems require Node.js as opposed to any of the mighty kings out there like PHP, Python and Ruby. I understand that it really depends on the expertise one has on a language, but my question falls more into the general category of: When to use a particular framework and what type of problems is it particularly suited for?

You did a great job of summarizing what's awesome about Node.js. My feeling is that Node.js is especially suited for applications where you'd like to maintain a persistent connection from the browser back to the server. Using a technique known as "long-polling", you can write an application that sends updates to the user in real time. Doing long polling on many of the web's giants, like Ruby on Rails or Django, would create immense load on the server, because each active client eats up one server process. This situation amounts to a tarpit attack. When you use something like Node.js, the server has no need of maintaining separate threads for each open connection.
This means you can create a browser-based chat application in Node.js that takes almost no system resources to serve a great many clients. Any time you want to do this sort of long-polling, Node.js is a great option.
It's worth mentioning that Ruby and Python both have tools to do this sort of thing (eventmachine and twisted, respectively), but that Node.js does it exceptionally well, and from the ground up. JavaScript is exceptionally well situated to a callback-based concurrency model, and it excels here. Also, being able to serialize and deserialize with JSON native to both the client and the server is pretty nifty.
I look forward to reading other answers here, this is a fantastic question.
It's worth pointing out that Node.js is also great for situations in which you'll be reusing a lot of code across the client/server gap. The Meteor framework makes this really easy, and a lot of folks are suggesting this might be the future of web development. I can say from experience that it's a whole lot of fun to write code in Meteor, and a big part of this is spending less time thinking about how you're going to restructure your data, so the code that runs in the browser can easily manipulate it and pass it back.
Here's an article on Pyramid and long-polling, which turns out to be very easy to set up with a little help from gevent: TicTacToe and Long Polling with Pyramid.

I believe Node.js is best suited for real-time applications: online games, collaboration tools, chat rooms, or anything where what one user (or robot? or sensor?) does with the application needs to be seen by other users immediately, without a page refresh.
I should also mention that Socket.IO in combination with Node.js will reduce your real-time latency even further than what is possible with long polling. Socket.IO will fall back to long polling as a worst case scenario, and instead use web sockets or even Flash if they are available.
But I should also mention that just about any situation where the code might block due to threads can be better addressed with Node.js. Or any situation where you need the application to be event-driven.
Also, Ryan Dahl said in a talk that I once attended that the Node.js benchmarks closely rival Nginx for regular old HTTP requests. So if we build with Node.js, we can serve our normal resources quite effectively, and when we need the event-driven stuff, it's ready to handle it.
Plus it's all JavaScript all the time. Lingua Franca on the whole stack.

Reasons to use NodeJS:
It runs Javascript, so you can use the same language on server and client, and even share some code between them (e.g. for form validation, or to render views at either end.)
The single-threaded event-driven system is fast even when handling lots of requests at once, and also simple, compared to traditional multi-threaded Java or ROR frameworks.
The ever-growing pool of packages accessible through NPM, including client and server-side libraries/modules, as well as command-line tools for web development. Most of these are conveniently hosted on github, where sometimes you can report an issue and find it fixed within hours! It's nice to have everything under one roof, with standardized issue reporting and easy forking.
It has become the defacto standard environment in which to run Javascript-related tools and other web-related tools, including task runners, minifiers, beautifiers, linters, preprocessors, bundlers and analytics processors.
It seems quite suitable for prototyping, agile development and rapid product iteration.
Reasons not to use NodeJS:
It runs Javascript, which has no compile-time type checking. For large, complex safety-critical systems, or projects including collaboration between different organizations, a language which encourages contractual interfaces and provides static type checking may save you some debugging time (and explosions) in the long run. (Although the JVM is stuck with null, so please use Haskell for your nuclear reactors.)
Added to that, many of the packages in NPM are a little raw, and still under rapid development. Some libraries for older frameworks have undergone a decade of testing and bugfixing, and are very stable by now. Npmjs.org has no mechanism to rate packages, which has lead to a proliferation of packages doing more or less the same thing, out of which a large percentage are no longer maintained.
Nested callback hell. (Of course there are 20 different solutions to this...)
The ever-growing pool of packages can make one NodeJS project appear radically different from the next. There is a large diversity in implementations due to the huge number of options available (e.g. Express/Sails.js/Meteor/Derby). This can sometimes make it harder for a new developer to jump in on a Node project. Contrast that with a Rails developer joining an existing project: he should be able to get familiar with the app pretty quickly, because all Rails apps are encouraged to use a similar structure.
Dealing with files can be a bit of a pain. Things that are trivial in other languages, like reading a line from a text file, are weird enough to do with Node.js that there's a StackOverflow question on that with 80+ upvotes. There's no simple way to read one record at a time from a CSV file. Etc.
I love NodeJS, it is fast and wild and fun, but I am concerned it has little interest in provable-correctness. Let's hope we can eventually merge the best of both worlds. I am eager to see what will replace Node in the future... :)

To make it short:
Node.js is well suited for applications that have a lot of concurrent connections and each request only needs very few CPU cycles, because the event loop (with all the other clients) is blocked during execution of a function.
A good article about the event loop in Node.js is Mixu's tech blog: Understanding the node.js event loop.

I have one real-world example where I have used Node.js. The company where I work got one client who wanted to have a simple static HTML website. This website is for selling one item using PayPal and the client also wanted to have a counter which shows the amount of sold items. Client expected to have huge amount of visitors to this website. I decided to make the counter using Node.js and the Express.js framework.
The Node.js application was simple. Get the sold items amount from a Redis database, increase the counter when item is sold and serve the counter value to users via the API.
Some reasons why I chose to use Node.js in this case
It is very lightweight and fast. There has been over 200000 visits on this website in three weeks and minimal server resources has been able to handle it all.
The counter is really easy to make to be real time.
Node.js was easy to configure.
There are lots of modules available for free. For example, I found a Node.js module for PayPal.
In this case, Node.js was an awesome choice.

The most important reasons to start your next project using Node ...
All the coolest dudes are into it ... so it must be fun.
You can hangout at the cooler and have lots of Node adventures to brag about.
You're a penny pincher when it comes to cloud hosting costs.
Been there done that with Rails
You hate IIS deployments
Your old IT job is getting rather dull and you wish you were in a shiny new Start Up.
What to expect ...
You'll feel safe and secure with Express without all the server bloatware you never needed.
Runs like a rocket and scales well.
You dream it. You installed it. The node package repo npmjs.org is the largest ecosystem of open source libraries in the world.
Your brain will get time warped in the land of nested callbacks ...
... until you learn to keep your Promises.
Sequelize and Passport are your new API friends.
Debugging mostly async code will get umm ... interesting .
Time for all Noders to master Typescript.
Who uses it?
PayPal, Netflix, Walmart, LinkedIn, Groupon, Uber, GoDaddy, Dow Jones
Here's why they switched to Node.

There is nothing like Silver Bullet. Everything comes with some cost associated with it. It is like if you eat oily food, you will compromise your health and healthy food does not come with spices like oily food. It is individual choice whether they want health or spices as in their food.
Same way Node.js consider to be used in specific scenario. If your app does not fit into that scenario you should not consider it for your app development. I am just putting my thought on the same:
When to use Node.JS
If your server side code requires very few cpu cycles. In other world you are doing non blocking operation and does not have heavy algorithm/Job which consumes lots of CPU cycles.
If you are from Javascript back ground and comfortable in writing Single Threaded code just like client side JS.
When NOT to use Node.JS
Your server request is dependent on heavy CPU consuming algorithm/Job.
Scalability Consideration with Node.JS
Node.JS itself does not utilize all core of underlying system and it is single threaded by default, you have to write logic by your own to utilize multi core processor and make it multi threaded.
Node.JS Alternatives
There are other option to use in place of Node.JS however Vert.x seems to be pretty promising and has lots of additional features like polygot and better scalability considerations.

Another great thing that I think no one has mentioned about Node.js is the amazing community, the package management system (npm) and the amount of modules that exist that you can include by simply including them in your package.json file.

My piece: nodejs is great for making real time systems like analytics, chat-apps, apis, ad servers, etc.
Hell, I made my first chat app using nodejs and socket.io under 2 hours and that too during exam
week!
Edit
Its been several years since I have started using nodejs and I have used it in making many different things including static file servers, simple analytics, chat apps and much more.
This is my take on when to use nodejs
When to use
When making system which put emphasis on concurrency and speed.
Sockets only servers like chat apps, irc apps, etc.
Social networks which put emphasis on realtime resources like geolocation, video stream, audio stream, etc.
Handling small chunks of data really fast like an analytics webapp.
As exposing a REST only api.
When not to use
Its a very versatile webserver so you can use it wherever you want but probably not these places.
Simple blogs and static sites.
Just as a static file server.
Keep in mind that I am just nitpicking. For static file servers, apache is better mainly because it is widely available. The nodejs community has grown larger and more mature over the years and it is safe to say nodejs can be used just about everywhere if you have your own choice of hosting.

It can be used where
Applications that are highly event driven & are heavily I/O bound
Applications handling a large number of connections to other systems
Real-time applications (Node.js was designed from the ground up for real time and to be easy
to use.)
Applications that juggle scads of information streaming to and from other sources
High traffic, Scalable applications
Mobile apps that have to talk to platform API & database, without having to do a lot of data
analytics
Build out networked applications
Applications that need to talk to the back end very often
On Mobile front, prime-time companies have relied on Node.js for their mobile solutions. Check out why?
LinkedIn is a prominent user. Their entire mobile stack is built on Node.js. They went from running 15 servers with 15 instances on each physical machine, to just 4 instances – that can handle double the traffic!
eBay launched ql.io, a web query language for HTTP APIs, which uses Node.js as the runtime stack. They were able to tune a regular developer-quality Ubuntu workstation to handle more than 120,000 active connections per node.js process, with each connection consuming about 2kB memory!
Walmart re-engineered its mobile app to use Node.js and pushed its JavaScript processing to the server.
Read more at: http://www.pixelatingbits.com/a-closer-look-at-mobile-app-development-with-node-js/

Node best for concurrent request handling -
So, Let’s start with a story. From last 2 years I am working on JavaScript and developing web front end and I am enjoying it. Back end guys provide’s us some API’s written in Java,python (we don’t care) and we simply write a AJAX call, get our data and guess what ! we are done. But in real it is not that easy, If data we are getting is not correct or there is some server error then we stuck and we have to contact our back end guys over the mail or chat(sometimes on whatsApp too :).) This is not cool. What if we wrote our API’s in JavaScript and call those API’s from our front end ? Yes that’s pretty cool because if we face any problem in API we can look into it. Guess what ! you can do this now , How ? – Node is there for you.
Ok agreed that you can write your API in JavaScript but what if I am ok with above problem. Do you have any other reason to use node for rest API ?
so here is the magic begins. Yes I do have other reasons to use node for our API’s.
Let’s go back to our traditional rest API system which is based on either blocking operation or threading. Suppose two concurrent request occurs( r1 and r2) , each of them require database operation. So In traditional system what will happens :
1. Waiting Way : Our server starts serving r1 request and waits for query response. after completion of r1 , server starts to serve r2 and does it in same way. So waiting is not a good idea because we don’t have that much time.
2. Threading Way : Our server will creates two threads for both requests r1 and r2 and serve their purpose after querying database so cool its fast.But it is memory consuming because you can see we started two threads also problem increases when both request is querying same data then you have to deal with deadlock kind of issues . So its better than waiting way but still issues are there.
Now here is , how node will do it:
3. Nodeway : When same concurrent request comes in node then it will register an event with its callback and move ahead it will not wait for query response for a particular request.So when r1 request comes then node’s event loop (yes there is an event loop in node which serves this purpose.) register an event with its callback function and move ahead for serving r2 request and similarly register its event with its callback. Whenever any query finishes it triggers its corresponding event and execute its callback to completion without being interrupted.
So no waiting, no threading , no memory consumption – yes this is nodeway for serving rest API.

My one more reason to choose Node.js for a new project is:
Be able to do pure cloud based development
I have used Cloud9 IDE for a while and now I can't imagine without it, it covers all the development lifecycles. All you need is a browser and you can code anytime anywhere on any devices. You don't need to check in code in one Computer(like at home), then checkout in another computer(like at work place).
Of course, there maybe cloud based IDE for other languages or platforms (Cloud 9 IDE is adding supports for other languages as well), but using Cloud 9 to do Node.js developement is really a great experience for me.

One more thing node provides is the ability to create multiple v8 instanes of node using node's child process( childProcess.fork() each requiring 10mb memory as per docs) on the fly, thus not affecting the main process running the server. So offloading a background job that requires huge server load becomes a child's play and we can easily kill them as and when needed.
I've been using node a lot and in most of the apps we build, require server connections at the same time thus a heavy network traffic. Frameworks like Express.js and the new Koajs (which removed callback hell) have made working on node even more easier.

Donning asbestos longjohns...
Yesterday my title with Packt Publications, Reactive Programming with JavaScript. It isn't really a Node.js-centric title; early chapters are intended to cover theory, and later code-heavy chapters cover practice. Because I didn't really think it would be appropriate to fail to give readers a webserver, Node.js seemed by far the obvious choice. The case was closed before it was even opened.
I could have given a very rosy view of my experience with Node.js. Instead I was honest about good points and bad points I encountered.
Let me include a few quotes that are relevant here:
Warning: Node.js and its ecosystem are hot--hot enough to burn you badly!
When I was a teacher’s assistant in math, one of the non-obvious suggestions I was told was not to tell a student that something was “easy.” The reason was somewhat obvious in retrospect: if you tell people something is easy, someone who doesn’t see a solution may end up feeling (even more) stupid, because not only do they not get how to solve the problem, but the problem they are too stupid to understand is an easy one!
There are gotchas that don’t just annoy people coming from Python / Django, which immediately reloads the source if you change anything. With Node.js, the default behavior is that if you make one change, the old version continues to be active until the end of time or until you manually stop and restart the server. This inappropriate behavior doesn’t just annoy Pythonistas; it also irritates native Node.js users who provide various workarounds. The StackOverflow question “Auto-reload of files in Node.js” has, at the time of this writing, over 200 upvotes and 19 answers; an edit directs the user to a nanny script, node-supervisor, with homepage at http://tinyurl.com/reactjs-node-supervisor. This problem affords new users with great opportunity to feel stupid because they thought they had fixed the problem, but the old, buggy behavior is completely unchanged. And it is easy to forget to bounce the server; I have done so multiple times. And the message I would like to give is, “No, you’re not stupid because this behavior of Node.js bit your back; it’s just that the designers of Node.js saw no reason to provide appropriate behavior here. Do try to cope with it, perhaps taking a little help from node-supervisor or another solution, but please don’t walk away feeling that you’re stupid. You’re not the one with the problem; the problem is in Node.js’s default behavior.”
This section, after some debate, was left in, precisely because I don't want to give an impression of “It’s easy.” I cut my hands repeatedly while getting things to work, and I don’t want to smooth over difficulties and set you up to believe that getting Node.js and its ecosystem to function well is a straightforward matter and if it’s not straightforward for you too, you don’t know what you’re doing. If you don’t run into obnoxious difficulties using Node.js, that’s wonderful. If you do, I would hope that you don’t walk away feeling, “I’m stupid—there must be something wrong with me.” You’re not stupid if you experience nasty surprises dealing with Node.js. It’s not you! It’s Node.js and its ecosystem!
The Appendix, which I did not really want after the rising crescendo in the last chapters and the conclusion, talks about what I was able to find in the ecosystem, and provided a workaround for moronic literalism:
Another database that seemed like a perfect fit, and may yet be redeemable, is a server-side implementation of the HTML5 key-value store. This approach has the cardinal advantage of an API that most good front-end developers understand well enough. For that matter, it’s also an API that most not-so-good front-end developers understand well enough. But with the node-localstorage package, while dictionary-syntax access is not offered (you want to use localStorage.setItem(key, value) or localStorage.getItem(key), not localStorage[key]), the full localStorage semantics are implemented, including a default 5MB quota—WHY? Do server-side JavaScript developers need to be protected from themselves?
For client-side database capabilities, a 5MB quota per website is really a generous and useful amount of breathing room to let developers work with it. You could set a much lower quota and still offer developers an immeasurable improvement over limping along with cookie management. A 5MB limit doesn’t lend itself very quickly to Big Data client-side processing, but there is a really quite generous allowance that resourceful developers can use to do a lot. But on the other hand, 5MB is not a particularly large portion of most disks purchased any time recently, meaning that if you and a website disagree about what is reasonable use of disk space, or some site is simply hoggish, it does not really cost you much and you are in no danger of a swamped hard drive unless your hard drive was already too full. Maybe we would be better off if the balance were a little less or a little more, but overall it’s a decent solution to address the intrinsic tension for a client-side context.
However, it might gently be pointed out that when you are the one writing code for your server, you don’t need any additional protection from making your database more than a tolerable 5MB in size. Most developers will neither need nor want tools acting as a nanny and protecting them from storing more than 5MB of server-side data. And the 5MB quota that is a golden balancing act on the client-side is rather a bit silly on a Node.js server. (And, for a database for multiple users such as is covered in this Appendix, it might be pointed out, slightly painfully, that that’s not 5MB per user account unless you create a separate database on disk for each user account; that’s 5MB shared between all user accounts together. That could get painful if you go viral!) The documentation states that the quota is customizable, but an email a week ago to the developer asking how to change the quota is unanswered, as was the StackOverflow question asking the same. The only answer I have been able to find is in the Github CoffeeScript source, where it is listed as an optional second integer argument to a constructor. So that’s easy enough, and you could specify a quota equal to a disk or partition size. But besides porting a feature that does not make sense, the tool’s author has failed completely to follow a very standard convention of interpreting 0 as meaning “unlimited” for a variable or function where an integer is to specify a maximum limit for some resource use. The best thing to do with this misfeature is probably to specify that the quota is Infinity:
if (typeof localStorage === 'undefined' || localStorage === null)
{
var LocalStorage = require('node-localstorage').LocalStorage;
localStorage = new LocalStorage(__dirname + '/localStorage',
Infinity);
}
Swapping two comments in order:
People needlessly shot themselves in the foot constantly using JavaScript as a whole, and part of JavaScript being made respectable language was a Douglas Crockford saying in essence, “JavaScript as a language has some really good parts and some really bad parts. Here are the good parts. Just forget that anything else is there.” Perhaps the hot Node.js ecosystem will grow its own “Douglas Crockford,” who will say, “The Node.js ecosystem is a coding Wild West, but there are some real gems to be found. Here’s a roadmap. Here are the areas to avoid at almost any cost. Here are the areas with some of the richest paydirt to be found in ANY language or environment.”
Perhaps someone else can take those words as a challenge, and follow Crockford’s lead and write up “the good parts” and / or “the better parts” for Node.js and its ecosystem. I’d buy a copy!
And given the degree of enthusiasm and sheer work-hours on all projects, it may be warranted in a year, or two, or three, to sharply temper any remarks about an immature ecosystem made at the time of this writing. It really may make sense in five years to say, “The 2015 Node.js ecosystem had several minefields. The 2020 Node.js ecosystem has multiple paradises.”

If your application mainly tethers web apis, or other io channels, give or take a user interface, node.js may be a fair pick for you, especially if you want to squeeze out the most scalability, or, if your main language in life is javascript (or javascript transpilers of sorts). If you build microservices, node.js is also okay. Node.js is also suitable for any project that is small or simple.
Its main selling point is it allows front-enders take responsibility for back-end stuff rather than the typical divide. Another justifiable selling point is if your workforce is javascript oriented to begin with.
Beyond a certain point however, you cannot scale your code without terrible hacks for forcing modularity, readability and flow control. Some people like those hacks though, especially coming from an event-driven javascript background, they seem familiar or forgivable.
In particular, when your application needs to perform synchronous flows, you start bleeding over half-baked solutions that slow you down considerably in terms of your development process. If you have computation intensive parts in your application, tread with caution picking (only) node.js. Maybe http://koajs.com/ or other novelties alleviate those originally thorny aspects, compared to when I originally used node.js or wrote this.

I can share few points where&why to use node js.
For realtime applications like chat,collaborative editing better we go with nodejs as it is event base where fire event and data to clients from server.
Simple and easy to understand as it is javascript base where most of people have idea.
Most of current web applications going towards angular js&backbone, with node it is easy to interact with client side code as both will use json data.
Lot of plugins available.
Drawbacks:-
Node will support most of databases but best is mongodb which won't support complex joins and others.
Compilation Errors...developer should handle each and every exceptions other wise if any error accord application will stop working where again we need to go and start it manually or using any automation tool.
Conclusion:-
Nodejs best to use for simple and real time applications..if you have very big business logic and complex functionality better should not use nodejs.
If you want to build an application along with chat and any collaborative functionality.. node can be used in specific parts and remain should go with your convenience technology.

Node is great for quick prototypes but I'd never use it again for anything complex.
I spent 20 years developing a relationship with a compiler and I sure miss it.
Node is especially painful for maintaining code that you haven't visited for awhile. Type info and compile time error detection are GOOD THINGS. Why throw all that out? For what? And dang, when something does go south the stack traces quite often completely useless.

How to scale NodeJS while Proxying it through Apache web server?

I am having Apache Web Server as the web server and NodeJS is also running on the same system at a different port. I am reverse Proxying it in order to connect to it and utilising it for different purposes. I would like to know what are the various ways in which this kind of architecture could be scaled up for 10 Million users or so, Is Apache a good choice to go with. If yes please let me know how to manage such a scenario and because I could not go ahead with different architecture.

If you throw enough resources at it (more and more instances behind a load balancer), of course Apache can take it. That's the philosophy of pretty much all web applications: scaling horizontally. But since it creates a new thread for each request, it's efficiency (per instance) is quite low. If it takes you x instances of Apache to sustain your traffic, you can probably switch to nginX(which uses the much more efficient evented model, as opposed to Apache's threaded model) and sustain the same traffic with, perhaps x/2 instances.
nginX has tradeoffs too: far fewer support resources, less uptake in the enterprise, hasn't been around as long as apache, need to recompile if you want to add a new module, etc. It is up to you to decide if it is worth it. Smaller/leaner companies prefer nginx to reduce costs, while big enterprises tend to prefer apache becuase of more support and people who know to configure it.

Handle I/O requests in amazon ec2 instances

After having learnt node, javascript and all the rest the hard way, I am finally about to release my first web app.
So I subscribed to Amazon Web Services and created a micro instance, planning on the first year free tier to allow me to make the app available to the world.
My concern is more about hidden costs. I know that with the free tier comes 1 million I/O requests per month for the Amazon EC2 EBS.
Thing is, I started testing my app one the ec2 instance to check that everything was running fine; and I am already at more than 100, 000 I/O requests. And I have basically been the only one using it so far (37 hours that the instance runs).
So I am quite afraid of what could happen if my app gets some traffic, and I don't want to end up with a huge unexpected bill at the end of the month.
I find it quite surprising, because I mainly serve static stuff, and my server side code consists in :
Receving a search request from a client
1 http request to a website
1 https request to the youtube api
saving the data to a mongoDB
Sending the results to the client
Do you have any advice on how to dramatically reduce my IO?
I do not use any other Amazon services so far, maybe am I missing something?
Or maybe Amazon free tier in not enough in my case, but then what can it be enough for? I mean, my app is really simple after all.
I'd be really glad for any help you could provide me
Thanks!

You did not mention total number of visits to your app. So I am assuming you have fairly less visits.
What are I/O requests ?
A single I/O request is a read/write instruction that reaches the EBS volumes. Beware! Execution of large read/writes is broken into multiple smaller pieces, which is the block size of the volume.
Possible reasons of high I/O:
Your app uses lot of RAM. After you hit the limit, the OS starts swapping memory to and fro from swap area in your disk, constantly.
This is most likely the problem, the mongoDB search. mongoDB searches can be long complex queries internally. From one of the answers to this question, the person was using mySQL and it caused him 1 billion I/O requests in 24 days. So 1 database search can be many I/O requests.
Cache is disabled, or you write/modify lot of files. You mentioned you were testing. Free-teir is just not suitable for developing stuff.
You should read this, in case you want to know what happens after free-tier expires.

I recently ran into a similar situation recording very high I/O request rates for a website with little to no traffic. The culprit seems to be a variation of what #prajwalkman discovered testing Chef deployments on a micro instance.
I am not using Chef, but I have been using boto3, Docker and Git to automatically 'build' test images inside a micro instance. Each time I go through my test script, a new image is built and I was not careful to read the fine print regarding default settings on the VolumeType argument on the boto3 run_instance command. Each test image was being built with the 'standard' volume type which, according to current EBS pricing, bills at the rate of $0.05/million I/Os. Whereas, the 'gp2' general purpose memory has a flat cost of $0.10 per GB per month with no extra charge for I/O.
With a handful of lean docker containers taking up a total of 2GB, on top of the 1.3GB for the amazon-ecs-optimized-ami, my storage is well within the free-tier usage. So, once I fixed the volumetype attribute in the blockdevicemappings settings in my scripts to 'gp2', I no longer had an I/O problem on my server.
Prior to that point, the constant downloading of docker images and git repos produced nearly 10m I/Os in less than a week.

The micro instance and the free tier is meant for testing their offerings, not a free way for you to host your site/web application.
You may have to pay money at the end of the month, but I really doubt if you can get away with paying less by using some other company for hosting. AFAIK AWS really is the rock bottom of the price charts.
As for the IO requests themselves, it's hard to give generic advice. I once was in a situation where my micro instance racked up ridiculous number of IO requests. Turns out testing Chef deployments on EC2 is a bad idea.

I/O Requests have to do with reading and writing blocks to EBS volumes. You can reduce this by using as much in memory caching as possible. Micro instances only have about 613 MB of memory available, so you may not be able to do much here.

Ok, so it seems like I/O requests are related to the EBS volume, and that caching may reduce it.
Something I had not considered though is all the operations I made to get my app running.
I updated the linux image, installed node and npm, several modules, mongodb, ....
This is likely to be the main cause of the I/O.
The number requests hasn't grown much in the last days, where the server stayed mostly idle.

Clustering Node JS in Heavy Traffic Production Environment

I have a web service handling http requests to redirect to specific URLs. Right the CPU is hammered at about 5 million hits per day, but I need to scale it up to handle 20 million plus. This is a production environment so I am a little apprehensive about the new Node Cluster method b/c it is still listed as experimental. I need suggestions on how to cluster Node on handle the traffic on a linux server. Any thoughts?

5 million per day is equivalent to 57.87 per second, and 25 million is 289.4 per second. These numbers are not too much for a single server for your case. If you only want to redirect specif urls, you can go with another alternatives such as nginx that is more fit for that job. However, if you still want to use NodeJS, I think a modern server can handle that load. Look at my blog post as an example of how to use clustering: NodeJS: Simple Clustering Benchmark. If you want to use all of your cores, you should use clustering.

How to decide when to use Node.js?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am new to this kind of stuff, but lately I've been hearing a lot about how good Node.js is. Considering how much I love working with jQuery and JavaScript in general, I can't help but wonder how to decide when to use Node.js. The web application I have in mind is something like Bitly - takes some content, archives it.
From all the homework I have been doing in the last few days, I obtained the following information. Node.js
is a command-line tool that can be run as a regular web server and lets one run JavaScript programs
utilizes the great V8 JavaScript engine
is very good when you need to do several things at the same time
is event-based so all the wonderful Ajax-like stuff can be done on the server side
lets us share code between the browser and the backend
lets us talk with MySQL
Some of the sources that I have come across are:
Diving into Node.js – Introduction and Installation
Understanding NodeJS
Node by Example (Archive.is)
Let’s Make a Web App: NodePad
Considering that Node.js can be run almost out-of-the-box on Amazon's EC2 instances, I am trying to understand what type of problems require Node.js as opposed to any of the mighty kings out there like PHP, Python and Ruby. I understand that it really depends on the expertise one has on a language, but my question falls more into the general category of: When to use a particular framework and what type of problems is it particularly suited for?

I believe Node.js is best suited for real-time applications: online games, collaboration tools, chat rooms, or anything where what one user (or robot? or sensor?) does with the application needs to be seen by other users immediately, without a page refresh.
I should also mention that Socket.IO in combination with Node.js will reduce your real-time latency even further than what is possible with long polling. Socket.IO will fall back to long polling as a worst case scenario, and instead use web sockets or even Flash if they are available.
But I should also mention that just about any situation where the code might block due to threads can be better addressed with Node.js. Or any situation where you need the application to be event-driven.
Also, Ryan Dahl said in a talk that I once attended that the Node.js benchmarks closely rival Nginx for regular old HTTP requests. So if we build with Node.js, we can serve our normal resources quite effectively, and when we need the event-driven stuff, it's ready to handle it.
Plus it's all JavaScript all the time. Lingua Franca on the whole stack.

To make it short:
Node.js is well suited for applications that have a lot of concurrent connections and each request only needs very few CPU cycles, because the event loop (with all the other clients) is blocked during execution of a function.
A good article about the event loop in Node.js is Mixu's tech blog: Understanding the node.js event loop.

I have one real-world example where I have used Node.js. The company where I work got one client who wanted to have a simple static HTML website. This website is for selling one item using PayPal and the client also wanted to have a counter which shows the amount of sold items. Client expected to have huge amount of visitors to this website. I decided to make the counter using Node.js and the Express.js framework.
The Node.js application was simple. Get the sold items amount from a Redis database, increase the counter when item is sold and serve the counter value to users via the API.
Some reasons why I chose to use Node.js in this case
It is very lightweight and fast. There has been over 200000 visits on this website in three weeks and minimal server resources has been able to handle it all.
The counter is really easy to make to be real time.
Node.js was easy to configure.
There are lots of modules available for free. For example, I found a Node.js module for PayPal.
In this case, Node.js was an awesome choice.

The most important reasons to start your next project using Node ...
All the coolest dudes are into it ... so it must be fun.
You can hangout at the cooler and have lots of Node adventures to brag about.
You're a penny pincher when it comes to cloud hosting costs.
Been there done that with Rails
You hate IIS deployments
Your old IT job is getting rather dull and you wish you were in a shiny new Start Up.
What to expect ...
You'll feel safe and secure with Express without all the server bloatware you never needed.
Runs like a rocket and scales well.
You dream it. You installed it. The node package repo npmjs.org is the largest ecosystem of open source libraries in the world.
Your brain will get time warped in the land of nested callbacks ...
... until you learn to keep your Promises.
Sequelize and Passport are your new API friends.
Debugging mostly async code will get umm ... interesting .
Time for all Noders to master Typescript.
Who uses it?
PayPal, Netflix, Walmart, LinkedIn, Groupon, Uber, GoDaddy, Dow Jones
Here's why they switched to Node.

There is nothing like Silver Bullet. Everything comes with some cost associated with it. It is like if you eat oily food, you will compromise your health and healthy food does not come with spices like oily food. It is individual choice whether they want health or spices as in their food.
Same way Node.js consider to be used in specific scenario. If your app does not fit into that scenario you should not consider it for your app development. I am just putting my thought on the same:
When to use Node.JS
If your server side code requires very few cpu cycles. In other world you are doing non blocking operation and does not have heavy algorithm/Job which consumes lots of CPU cycles.
If you are from Javascript back ground and comfortable in writing Single Threaded code just like client side JS.
When NOT to use Node.JS
Your server request is dependent on heavy CPU consuming algorithm/Job.
Scalability Consideration with Node.JS
Node.JS itself does not utilize all core of underlying system and it is single threaded by default, you have to write logic by your own to utilize multi core processor and make it multi threaded.
Node.JS Alternatives
There are other option to use in place of Node.JS however Vert.x seems to be pretty promising and has lots of additional features like polygot and better scalability considerations.

Another great thing that I think no one has mentioned about Node.js is the amazing community, the package management system (npm) and the amount of modules that exist that you can include by simply including them in your package.json file.

My piece: nodejs is great for making real time systems like analytics, chat-apps, apis, ad servers, etc.
Hell, I made my first chat app using nodejs and socket.io under 2 hours and that too during exam
week!
Edit
Its been several years since I have started using nodejs and I have used it in making many different things including static file servers, simple analytics, chat apps and much more.
This is my take on when to use nodejs
When to use
When making system which put emphasis on concurrency and speed.
Sockets only servers like chat apps, irc apps, etc.
Social networks which put emphasis on realtime resources like geolocation, video stream, audio stream, etc.
Handling small chunks of data really fast like an analytics webapp.
As exposing a REST only api.
When not to use
Its a very versatile webserver so you can use it wherever you want but probably not these places.
Simple blogs and static sites.
Just as a static file server.
Keep in mind that I am just nitpicking. For static file servers, apache is better mainly because it is widely available. The nodejs community has grown larger and more mature over the years and it is safe to say nodejs can be used just about everywhere if you have your own choice of hosting.

It can be used where
Applications that are highly event driven & are heavily I/O bound
Applications handling a large number of connections to other systems
Real-time applications (Node.js was designed from the ground up for real time and to be easy
to use.)
Applications that juggle scads of information streaming to and from other sources
High traffic, Scalable applications
Mobile apps that have to talk to platform API & database, without having to do a lot of data
analytics
Build out networked applications
Applications that need to talk to the back end very often
On Mobile front, prime-time companies have relied on Node.js for their mobile solutions. Check out why?
LinkedIn is a prominent user. Their entire mobile stack is built on Node.js. They went from running 15 servers with 15 instances on each physical machine, to just 4 instances – that can handle double the traffic!
eBay launched ql.io, a web query language for HTTP APIs, which uses Node.js as the runtime stack. They were able to tune a regular developer-quality Ubuntu workstation to handle more than 120,000 active connections per node.js process, with each connection consuming about 2kB memory!
Walmart re-engineered its mobile app to use Node.js and pushed its JavaScript processing to the server.
Read more at: http://www.pixelatingbits.com/a-closer-look-at-mobile-app-development-with-node-js/

Node best for concurrent request handling -
So, Let’s start with a story. From last 2 years I am working on JavaScript and developing web front end and I am enjoying it. Back end guys provide’s us some API’s written in Java,python (we don’t care) and we simply write a AJAX call, get our data and guess what ! we are done. But in real it is not that easy, If data we are getting is not correct or there is some server error then we stuck and we have to contact our back end guys over the mail or chat(sometimes on whatsApp too :).) This is not cool. What if we wrote our API’s in JavaScript and call those API’s from our front end ? Yes that’s pretty cool because if we face any problem in API we can look into it. Guess what ! you can do this now , How ? – Node is there for you.
Ok agreed that you can write your API in JavaScript but what if I am ok with above problem. Do you have any other reason to use node for rest API ?
so here is the magic begins. Yes I do have other reasons to use node for our API’s.
Let’s go back to our traditional rest API system which is based on either blocking operation or threading. Suppose two concurrent request occurs( r1 and r2) , each of them require database operation. So In traditional system what will happens :
1. Waiting Way : Our server starts serving r1 request and waits for query response. after completion of r1 , server starts to serve r2 and does it in same way. So waiting is not a good idea because we don’t have that much time.
2. Threading Way : Our server will creates two threads for both requests r1 and r2 and serve their purpose after querying database so cool its fast.But it is memory consuming because you can see we started two threads also problem increases when both request is querying same data then you have to deal with deadlock kind of issues . So its better than waiting way but still issues are there.
Now here is , how node will do it:
3. Nodeway : When same concurrent request comes in node then it will register an event with its callback and move ahead it will not wait for query response for a particular request.So when r1 request comes then node’s event loop (yes there is an event loop in node which serves this purpose.) register an event with its callback function and move ahead for serving r2 request and similarly register its event with its callback. Whenever any query finishes it triggers its corresponding event and execute its callback to completion without being interrupted.
So no waiting, no threading , no memory consumption – yes this is nodeway for serving rest API.

My one more reason to choose Node.js for a new project is:
Be able to do pure cloud based development
I have used Cloud9 IDE for a while and now I can't imagine without it, it covers all the development lifecycles. All you need is a browser and you can code anytime anywhere on any devices. You don't need to check in code in one Computer(like at home), then checkout in another computer(like at work place).
Of course, there maybe cloud based IDE for other languages or platforms (Cloud 9 IDE is adding supports for other languages as well), but using Cloud 9 to do Node.js developement is really a great experience for me.

One more thing node provides is the ability to create multiple v8 instanes of node using node's child process( childProcess.fork() each requiring 10mb memory as per docs) on the fly, thus not affecting the main process running the server. So offloading a background job that requires huge server load becomes a child's play and we can easily kill them as and when needed.
I've been using node a lot and in most of the apps we build, require server connections at the same time thus a heavy network traffic. Frameworks like Express.js and the new Koajs (which removed callback hell) have made working on node even more easier.

If your application mainly tethers web apis, or other io channels, give or take a user interface, node.js may be a fair pick for you, especially if you want to squeeze out the most scalability, or, if your main language in life is javascript (or javascript transpilers of sorts). If you build microservices, node.js is also okay. Node.js is also suitable for any project that is small or simple.
Its main selling point is it allows front-enders take responsibility for back-end stuff rather than the typical divide. Another justifiable selling point is if your workforce is javascript oriented to begin with.
Beyond a certain point however, you cannot scale your code without terrible hacks for forcing modularity, readability and flow control. Some people like those hacks though, especially coming from an event-driven javascript background, they seem familiar or forgivable.
In particular, when your application needs to perform synchronous flows, you start bleeding over half-baked solutions that slow you down considerably in terms of your development process. If you have computation intensive parts in your application, tread with caution picking (only) node.js. Maybe http://koajs.com/ or other novelties alleviate those originally thorny aspects, compared to when I originally used node.js or wrote this.

I can share few points where&why to use node js.
For realtime applications like chat,collaborative editing better we go with nodejs as it is event base where fire event and data to clients from server.
Simple and easy to understand as it is javascript base where most of people have idea.
Most of current web applications going towards angular js&backbone, with node it is easy to interact with client side code as both will use json data.
Lot of plugins available.
Drawbacks:-
Node will support most of databases but best is mongodb which won't support complex joins and others.
Compilation Errors...developer should handle each and every exceptions other wise if any error accord application will stop working where again we need to go and start it manually or using any automation tool.
Conclusion:-
Nodejs best to use for simple and real time applications..if you have very big business logic and complex functionality better should not use nodejs.
If you want to build an application along with chat and any collaborative functionality.. node can be used in specific parts and remain should go with your convenience technology.

Node is great for quick prototypes but I'd never use it again for anything complex.
I spent 20 years developing a relationship with a compiler and I sure miss it.
Node is especially painful for maintaining code that you haven't visited for awhile. Type info and compile time error detection are GOOD THINGS. Why throw all that out? For what? And dang, when something does go south the stack traces quite often completely useless.

We Keep Coding

JavaScript is the programming language of the Web.