reinventing the wheels: Node.JS/Event-driven programming v.s. Functional Programming? - javascript

Now there's all the hype lately about Node.JS, an event driven framework using Javascript callbacks. To my limited understanding, its primary advantage seems to be that you don't have to wait step by step sequentially (for example, you can fetch the SQL results, while calling other functions too).
So my question is: how is this different, or better than just functional languages, like CL, Haskell, Clojure etc? If not better, then why don't people just do functional languages then (instead of reinventing the wheel with Javascript)?
Please note that I have none experience in either Node.JS nor functional programming. So some basic explanation can be helpful.

Having read through the Node.JS docs (and the nice slide deck), I think the other answers here are missing the point about it: Node.JS is based on the idea that the style of writing programs where we expect them to block on I/O is wrong, and that instead we should initiate I/O (such as reading a database, or reading a socket) and pass along a function to handle the result of the I/O along with the request.
So rather than do this:
var result = db.query("select.."); // blocking
// use result
Node.JS is based on the idea of doing this:
db.query("select..", function (result) {
// use result
});
The authors (of Node.JS) point out that this way of programming is very awkward in many systems as the languages don't have closures or anonymous functions and the libraries are primarily blocking I/O. However, Javascript supplies the former (and programmers are used to it given how JS is used in an event like manner in the browser), and Node.JS fills in the later: being an entirely event driven I/O library with no blocking calls at all.
How does this relate to functional programming? All functional programming languages provide closure constructs powerful enough to do what Node.JS is trying to do with Javascript. Most make it even easier to code, as passing closures is generally fundamental to the language.
As for Haskell, using Monads, this sort of thing could be very easy to construct. For example:
doQuery :: DBConnection -> IO ()
doQuery db = do
rows <- query db "select..."
doSomething rows
doSomethingElse rows
These very sequential, imperative lines of code are actually a sequence of closures under control of the IO monad. It is as if in JavaScript you had written:
db.query("select...", function (rows) {
doSomething(rows, function () {
doSomethingElse(rows, function () { /* done */ })
})
})
In essence, when writing monadic code in a functional language, you already are writing it in the form the Node.JS authors want us to write: Where each step of the sequential computation is passed as a closure to the prior one. However, look how much nicer that code looks in Haskell!
Furthermore, you can easily use concurrent Haskell features to achieve this non-blocking operation easily:
forkQuery :: DBConnection -> IO ThreadId
forkQuery db = forkIO $ do
rows <- query db "select..."
doSomething rows
doSomethingElse rows
Don't confuse that forkIO with the expensive process fork you're used to, or even OS process threads. It is basically the same light weight thread of execution that Node.JS is using (only with somewhat richer semantics). You can have 1,000s of 'em just like Node.JS aims for.
So, in short - I think Node.JS is building on a facility that exists in JavaScript, but that is much more natural in functional languages. Furthermore, I think all the elements that are in Node.JS already exist in Haskell and its packages, and then some. For me, I'd just use Haskell!

I don't really know about Node.JS either, but I don't really see any striking similarity between it (from your description) and functional programming. From your description, Node.JS seems to be aimed at aiding asynchronous programming -- as you state "you don't have to wait step by step sequentially", you can do other tasks as one long-running task does its thing.
Functional programming is completely orthogonal to this -- i.e. it doesn't really have any link to asynchronicity. You can have one without the other, or both together, or neither of them. Functional programming is about eliminating side-effects in your programs, and about allowing functions as first-class members of the language, to be manipulated and composed similarly to other values.

This isn't really "reinventing the wheel." Javascript isn't really a functional language per se, but it was based on Lisp and this is the sort of thing it was designed to do. Javascript is really stronger as a Lisp-ish functional language than it is as an OO language in my opinion. That's why frameworks with strongly functional* designs like jQuery fit the language so well.
(* Note: Not pure, obviously, but functional in much the same way as Scheme.)

I haven't used node.js yet, but I'm definitely interested in it and will be trying it out soon. There are already lots of great answers about functional programming and such here, so I won't go into that.
You ask why not use some other language on the server, such as haskell, Closure, etc. For me, the attraction of node.js over others is that it IS javascript. My applications are already heavy in javascript on the client, so it means I could work in one language on both the server and the client.
My hope is that this will streamline and simplify development because I won't need to be switching contexts so drastically. There might even be some reduction in work if some logic that is used both on client and server could be shared (perhaps form validation code and the like).

Javascript's key role in node as a webserver is that it is largely event driven.
I believe functional programming has advantages for concurrency, due to immutability among other things.
not sure how event driven other functional languages are, just wanted to highlight it as part of the advantages of node.

One of the main benifits with the gap between the client and the server shrinking is the ability to reuse code on the client and the server. For instance if you want to have a rich and dynamic AJAX website for modern browsers and a stripped down version for older browsers you can use the same display code to format incomming data on both the client and the server.
Other benifits with this include the HTML5/Google Gears/Adobe Air ability to have a local storage DB and server to run web apps offline, you can have code that would traditionally be on the server stored locally for when the server is not available.

Related

PHP vs JS: How come PHP operations are mostly executed synchronous whilst JS operations are not?

If we look into the way PHP and JS frameworks are written you can see a great difference. JavaScript generally has callbacks/promises for operations that might take an undetermined amount of time. Even though PHP has the possibility to pass functions as a parameter, it's not commonly used.
I'd like to take database operations as an example to state my case:
PHP's Eloquent:
$organization = Organization::where('name', '=', 'test')->first();
Node.JS Sequelize:
Organization.findOne({where: {name: 'test'}})
.then(function (organization) {
// Organization has been fetched here
}
);
As you can see Sequelize uses promises to make the operation async. However, PHP's Eloquent does not.
Apart from framework benchmarks, does this mean the Node.JS is faster/more efficient since it is not blocking? Or is this rather because JS is single-threaded? Maybe I have a wrong perception and is the pattern equally implemented in both languages?
It's possible to write blocking node.js code, and it's possible to write asynchronous PHP code (see the libevent extension, the amphp project or the ReactPHP project). Neither is really idiomatic. This means it's atypical and perhaps not recommended except for specialized use-cases.
PHP can work synchronously because in a typical webserver there are many PHP process running at the same time. Each handling 1 HTTP request at a time.
There are pros and cons to each model. For example, it's harder to do multiple things in node.js if you're doing CPU-bound work, and in PHP you might need more memory usage for higher concurrency, because often you might be waiting on IO in PHP, so most of your PHP processes might end up waiting. So your CPU usage ends up being low, and you have a lot of PHP process waiting around for IO while all of them have a copy of your favourite web framework in memory.
Node.js allows for building some types of applications that are harder to do in PHP because there's no shared memory space. But then again, that advantage might end up being less relevant once your node.js application needs to grow beyond a single process.
Anyway, it's an extremely broad question. In some ways both node.js and PHP can address similar use-cases, but when your use-cases become more exotic one of the two might be better.
If you're mostly a building a web application, you're not expecting massive scale, it probably doesn't really matter that much what you go for.
So ultimately which is the better choice depends on your use-case.

Is it accurate to say JavaScript is a "single-thread" language?

I've heard this kind of statements many times, but personally I think this doesn't quite make sense. I think people are confusing JavaScript as a language specification and JavaScript in practice (browser, Node, etc.). Of course in most cases JavaScript are executed in a single-thread environment; but AFAIK nothing in the language specification requires it to be so. I think this is just like saying Python is "interpreted", while it's in fact entirely a matter of implementation.
So, is it accurate to say JavaScript is a "single-thread" language?
By JavaScript you seem to mean ECMAScript.
There's already multithreading in the browser, built with webworkers, and based on a strong isolation of data : workers only communicate by message passing, nothing is shared.
If you want more intricate multithreading, with data sharing, then that doesn't look possible now. Nothing in ECMAScript explicitely forbids multithreading but you can't do multithreading without
facilities to create "threads" (in a general sense, that could be coroutines)
mutexes and facilities to synchronize accesses
a low level support to ensure for example a property change won't break the data in case of simultaneous accesses. None of the current engines has been designed with this kind of strength (yes, some of them support multiple threads but in isolation).
The fact ECMAScript wasn't designed to include multi-threading is enough to prevent, currently, to support it (other than message-passing isolated multi-threading as is already done but it's a very limited kind of multi-threading).
You have to realize that
data sharing multi-threading is very expensive (not even speaking about simultaneous actions on the DOM)
you would rarely use it in JavaScript
Why do I say you would rarely use it ? Because most of the IO blocking tasks (file reading, requests, db queries, etc.), most of the low level tasks (for example image decoding or page rendering), most of the UI management (with event queue), most of the scheduling (timeouts and intervals) is done outside for you.
Multi-threading behavior is available in both HTML5 and node.js, BUT there is no native threading API in the Javascript language, so I guess the answer to your contrived question (I mean that in the nicest possible way, of course) is "yes, Javascript is a single-threaded language."
AFAIK nothing in the language specification requires it to be so.
In TC39 it says:
At any point in time, there is at most one execution context per agent that is actually executing code.
That appears to me to be the crucial guarantee that you never have to synchronize access to variables in ECMAScript. Which is what I expect is meant when someone says a language is single-threaded.
Of course, most ECMAScript host environments use more than one thread in their host environment implementation for things like garbage collection, etc. And, ECMAScript itself allows for multiple separate contexts of execution that could each be driven by their own thread--though the standard makes it clear you could also just drive all of them with the same thread.
The point, again, is you never have to protect any ECMAScript variable with a mutex, semaphore, or the like (which is why ECMAScript provides no such facilities) since the language promises there will never be two threads of control with simultaneous access to the same context.
I don't know of any JavaScript implementation that violates this promise either, though there certainly could be.

What are some architectural reasons to use node.js aside from scalability?

The most common theme I read about why to use node.js is for high scalability due to it's evented, non-blocking I/O model. I'm trying to understand other non-scalability uses cases (and aside from being used as a general server-side javascript engine).
Does node.js have other use cases if scalability isn't a concern of mine?
If yes to #1, what are they?
Is node.js usage appropriate for any particular type of application architectures? E.g. similar to how some key/value (nosql - ugh I hate that term) databases are useful other than for scalability reasons.
My reason for trying out node is the fact that it is incredibly easy to send JSON data between the server and the client for ajax requests.
If you use something like MongoDB, which stores data as JSON object too, you never have to worry about translating or parsing your data.
If your site uses a lot of ajax, and you're sending your data as JSON objects (rather than XML or plain text) then node.js will save you a fair bit of effort.
I think this blog posts sums it up quite well:
http://debuggable.com/posts/understanding-node-js:4bd98440-45e4-4a9a-8ef7-0f7ecbdd56cb
In short (pro node.js):
Speed
Javascript (especially if you know it already)
Efficiency
node.js is really great. Give it a try! :)
I can think of one reason, but its not very deep. Basically, If you are developing an RIA, your entire stack can be javascript. That might have some value.
But Ill question my own answer, namely the thought is, even if it makes the server side code more accessible to client side developers, they still need to understand how their server stack works. So there is still some learning to.
To be accurate, I think the general server-side JavaScript engine in your question would be V8, while according to its creator Node was built for "scripting network programs."
Based on many of his comments I don't believe he views it as broadly as many of us do, but recognizes where it can go. [I can't speak for anyone else--that's just my interpretation based on the writings and presentations I've seen.]
So it approaches things from a somewhat lower level, makes HTTP a first-class citizen and just happens to be really cool, which I think makes it enough of a "use case" for most of us. ;)
It does have a learning curve and isn't the most stable platform to build on due to its rapid development. I believe time will tell where it's most useful.
For now people are using it for "real-time" apps due to its lightweight, asynchronous nature, as well as general Web development, though IMO its sweet spot remains with its originally stated purpose.
What I like about node.js aside from non-blocking I/O model, scalability and all that "primary reasons" stuff:
Lightweight nature of it's framework. Basics are easy to learn.
Developer community building tons of useful modules and libraries on github which are expanding node.js lightweight core and it's capabilities.
It's really easy and fast to build server side and real-time systems (for example http or socket based) without the knowledge of complex libraries.
I like to use NodeJS to write testing harness because you can write a stub/server/client really quickly. And you can drive your application, with ease. I can easily script a third party back-end server to do performance testing on my application. I also use it to drive my application. I can perform complex client server scenarios using setTimout to cause multiple events to be trigger based on any logic I want and test them at scale.

Should writing Javascript be avoided in favour of GWT/WebSharper or some other abstraction?

I'm curious what's the view on "things that compile into javascript" e.g. GWT, Script# and WebSharper and their like. These seem to be fairly niche components aimed at allowing folks to write javascript without writing javascript.
Personally I'm comfortable writing javascript (using JQuery/Prototype/ExtJS or some other such library) and view things like GWT these as needless abstractions that may end up limiting what a developer needs to accomplish or best-case providing a very long-winded workaround. In some cases you still end up writing javascript e.g. JSNI.
Worse still if you don't know what's going on under the covers you run the risk of unintended consequences. E.g. how do you know GWT is creating closures and managing namespaces correctly?
I'm curious to hear others' opinions. Is this where web programming is headed?
Should JavaScript be avoided in favor of X? By all means!
I will start with a disclaimer: my answer is very biased as I am on the WebSharper developer team. The reason I am on this team in the first place is that I found myself a complete failure in writing pure JavaScript, and then suggested to my company that we try and write a compiler from our favorite language, F#, to JavaScript.
For me, JavaScript is the portable assembly of the web, fulfilling the same role as C does in the rest of the world. It is portable, widely used, and it will stay. But I do not want to write JavaScript, no more than I want to write assembly. The reasons that I do not want to use JavaScript as a language include:
There is no static analysis, it does not even check if functions are called with the right number of arguments. Typos bite!
There is no or a very limited concept of libraries, namespaces, modules, classes, therefore every framework invents their own (a similar situation to that of R5RS Scheme).
The tooling (code editors, debuggers, profilers) is rather poor, and most of it because of (1) and (2): JavaScript is not amenable to static analysis.
There is no or a very limited standard library.
There are many rough edges and a bias to using mutation. JavaScript is a poorly designed language even in the untyped family (I prefer Scheme).
We are trying to address all of these issues in WebSharper. For example, WebSharper in Visual Studio has code completion, even when it exposes third-party JavaScript APIs, like Ext Js. But whether we have or will succeed or fail is not really the point. The point is that it is possible, and, I would hope, very desirable to address these issues.
Worse still if you don't know what's
going on under the covers you run the
risk of unintended consequences. E.g.
how do you know GWT is creating
closures and managing namespaces
correctly?
This is just about writing the compiler the right way. WebSharper, for instance, maps F# lambdas to JavaScript lambdas in a 1-1 manner (in fact, it never introduces a lambda). I would perhaps accept your argument if you mentioned that, say, WebSharper is not yet mature and tested enough and therefore you are hesitant to trust it. But then GWT has been around for a while and should produce correct code.
The bottom line is that strongly typed languages are strictly better than untyped languages - you can easily write untyped code in them if you need to, but you have the option of using the type-checker, which is the programmer's spell-checker. Why wouldn't you? A refusal to do so sounds a bit luddite to me.
Although, I don't personally favor one style over another, I don't think that abstraction from Javascript is the only benefit that these frameworks bring to the table. Surely, in abstracting the entire language, there will be things that become impossible that were previously possible, and vice-versa. The decision to go with a framework such as GWT over writing vanilla JavaScript depends on many factors.
Making this a discussion of JavaScript vs language X is fruitless as each language has its strengths and weaknesses. Instead, do an objective cost-benefit analysis on what is to be gained or lost by going with such a framework, and that can only be done by you and not the SO community unfortunately.
The issue of not knowing what goes on under the hood applies to JavaScript just as much as it does to any translated source. How many people do you think would know exactly what is going on in jQuery when they try to do a comparison such as $("p") == $("p") and get back false as a result. This is not a hypothetical situation and there are several questions on SO regarding the same. It takes time to learn a language or framework, and given sufficient time, developers could just as well understand the compiled source of these frameworks.
Another related aspect to the above question is of trust. We continuously build higher level abstractions upon lower level abstractions, and rely on the fact that the lower level stuff is supposed to work as expected. What was the last time you dug down into the compiled binary of a C++ or Java program just to ensure that it worked correctly? We don't because we trust the compiler.
Moreover, when using such a framework, there is no shame in falling back to JavaScript using JSNI, for example. It's all about solving the problem in the best possible manner with the tools at hand. There is nothing sacred about JavaScript, or Java, or C#, or Ruby, etc. They are all tools for solving problems, and while it may be a barrier for you, it might be a real time-saver and advantageous to someone else.
As for where I think web programming is headed, there are many interesting trends that I think or rather hope will succeed such as JavaScript on the server side. It solves very real problems for me at least in that we can avoid code duplication easily in a web application. Same validations, logic, etc. can be shared on the client and server sides. It also allows for writing a simple (de)serialization mechanism so RPC or RMI communication becomes possible very easily. I mean it would be really nice to be able to write:
account.debit(200);
on the client side, instead of:
$.ajax({
url: "/account",
data: { amount: 200 },
success: function(data) {
..
}
error: function() {
..
}
});
Finally, it's great that we have all this diversity in frameworks and solutions for building web applications as the next generation of solutions can learn from the failures of each and focus on their successes to build even better, faster, and more awesome tools.
I have three big practical issues I have with websharper and other compilers that claim to avoid the pain of Javascript.
If you won’t know Javascript well you can’t understand most of the examples on the web of using the DOM/ExtJs etc., so you have to learn Javascript whatever. (For the some reason all F# programmers must be able to at least read C# or VB.NET otherwise they cannot access most information about the .net framework)
On any web project you need a few web experts that know the DOM and CSS inside out; would such a person be willing to work with F# rather than Javascript?
Being tied into the provider of the compiler, will they be about in 5 years’ time; I want full open source or the tools to be supported by Microsoft.
The 4 big positives I see with these frameworks are:
Shareing code between the server/client
Having fewer languages a programmer needs to know (javascript is a real pain as it looks like Jave/C# but is not anything like them)
The average quality of a F# programmer is a lot better than a jscript programmer.
My opinion for what it's worth is that every framework has its pros/cons and a project team should evaluate their use cases before including one. To me any framework is just a tool to be used to solve a problem, and you should pick the best one for the job.
I prefer to stick to pure JavaScript solutions myself, but that being said I can think of a few cases where GWT would be helpful.
GWT would allow a team to share code between the server/client, reducing the need to write the same code twice (JS and Java). Or if someone was porting a Java client to a web UI, they may find it easier to stick to GWT ( of course then again it may make it harder :-) ).
I know this is a gross over-simplification, because there are many other things that frameworks like GWT offer, but here is how I view it: if you like JavaScript, write JavaScript; if you don't, use GWT or Cappuccino or whatever.
The reason people use frameworks like GWT is not necessarily the abstraction that they give--you can have that with JavaScript frameworks like ExtJS--but rather the fact that they allow you to write web applications in something other than JavaScript. If I were a Java programmer who wanted to write a web application, I would use GWT because I would not have to learn a new language.
It's all preference, really. I prefer to write JavaScript, but many people don't.

What is Node.js? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I don't fully get what Node.js is all about. Maybe it's because I am mainly a web based business application developer. What is it and what is the use of it?
My understanding so far is that:
The programming model is event driven, especially the way it handles I/O.
It uses JavaScript and the parser is V8.
It can be easily used to create concurrent server applications.
Are my understandings correct? If yes, then what are the benefits of evented I/O, is it just more for the concurrency stuff? Also, is the direction of Node.js to become a framework like, JavaScript based (V8 based) programming model?
I use Node.js at work, and find it to be very powerful. Forced to choose one word to describe Node.js, I'd say "interesting" (which is not a purely positive adjective). The community is vibrant and growing. JavaScript, despite its oddities can be a great language to code in. And you will daily rethink your own understanding of "best practice" and the patterns of well-structured code. There's an enormous energy of ideas flowing into Node.js right now, and working in it exposes you to all this thinking - great mental weightlifting.
Node.js in production is definitely possible, but far from the "turn-key" deployment seemingly promised by the documentation. With Node.js v0.6.x, "cluster" has been integrated into the platform, providing one of the essential building blocks, but my "production.js" script is still ~150 lines of logic to handle stuff like creating the log directory, recycling dead workers, etc. For a "serious" production service, you also need to be prepared to throttle incoming connections and do all the stuff that Apache does for PHP. To be fair, Ruby on Rails has this exact problem. It is solved via two complementary mechanisms: 1) Putting Ruby on Rails/Node.js behind a dedicated webserver (written in C and tested to hell and back) like Nginx (or Apache / Lighttd). The webserver can efficiently serve static content, access logging, rewrite URLs, terminate SSL, enforce access rules, and manage multiple sub-services. For requests that hit the actual node service, the webserver proxies the request through. 2) Using a framework like Unicorn that will manage the worker processes, recycle them periodically, etc. I've yet to find a Node.js serving framework that seems fully baked; it may exist, but I haven't found it yet and still use ~150 lines in my hand-rolled "production.js".
Reading frameworks like Express makes it seem like the standard practice is to just serve everything through one jack-of-all-trades Node.js service ... "app.use(express.static(__dirname + '/public'))". For lower-load services and development, that's probably fine. But as soon as you try to put big time load on your service and have it run 24/7, you'll quickly discover the motivations that push big sites to have well baked, hardened C-code like Nginx fronting their site and handling all of the static content requests (...until you set up a CDN, like Amazon CloudFront)). For a somewhat humorous and unabashedly negative take on this, see this guy.
Node.js is also finding more and more non-service uses. Even if you are using something else to serve web content, you might still use Node.js as a build tool, using npm modules to organize your code, Browserify to stitch it into a single asset, and uglify-js to minify it for deployment. For dealing with the web, JavaScript is a perfect impedance match and frequently that makes it the easiest route of attack. For example, if you want to grovel through a bunch of JSON response payloads, you should use my underscore-CLI module, the utility-belt of structured data.
Pros / Cons:
Pro: For a server guy, writing JavaScript on the backend has been a "gateway drug" to learning modern UI patterns. I no longer dread writing client code.
Pro: Tends to encourage proper error checking (err is returned by virtually all callbacks, nagging the programmer to handle it; also, async.js and other libraries handle the "fail if any of these subtasks fails" paradigm much better than typical synchronous code)
Pro: Some interesting and normally hard tasks become trivial - like getting status on tasks in flight, communicating between workers, or sharing cache state
Pro: Huge community and tons of great libraries based on a solid package manager (npm)
Con: JavaScript has no standard library. You get so used to importing functionality that it feels weird when you use JSON.parse or some other build in method that doesn't require adding an npm module. This means that there are five versions of everything. Even the modules included in the Node.js "core" have five more variants should you be unhappy with the default implementation. This leads to rapid evolution, but also some level of confusion.
Versus a simple one-process-per-request model (LAMP):
Pro: Scalable to thousands of active connections. Very fast and very efficient. For a web fleet, this could mean a 10X reduction in the number of boxes required versus PHP or Ruby
Pro: Writing parallel patterns is easy. Imagine that you need to fetch three (or N) blobs from Memcached. Do this in PHP ... did you just write code the fetches the first blob, then the second, then the third? Wow, that's slow. There's a special PECL module to fix that specific problem for Memcached, but what if you want to fetch some Memcached data in parallel with your database query? In Node.js, because the paradigm is asynchronous, having a web request do multiple things in parallel is very natural.
Con: Asynchronous code is fundamentally more complex than synchronous code, and the up-front learning curve can be hard for developers without a solid understanding of what concurrent execution actually means. Still, it's vastly less difficult than writing any kind of multithreaded code with locking.
Con: If a compute-intensive request runs for, for example, 100 ms, it will stall processing of other requests that are being handled in the same Node.js process ... AKA, cooperative-multitasking. This can be mitigated with the Web Workers pattern (spinning off a subprocess to deal with the expensive task). Alternatively, you could use a large number of Node.js workers and only let each one handle a single request concurrently (still fairly efficient because there is no process recycle).
Con: Running a production system is MUCH more complicated than a CGI model like Apache + PHP, Perl, Ruby, etc. Unhandled exceptions will bring down the entire process, necessitating logic to restart failed workers (see cluster). Modules with buggy native code can hard-crash the process. Whenever a worker dies, any requests it was handling are dropped, so one buggy API can easily degrade service for other cohosted APIs.
Versus writing a "real" service in Java / C# / C (C? really?)
Pro: Doing asynchronous in Node.js is easier than doing thread-safety anywhere else and arguably provides greater benefit. Node.js is by far the least painful asynchronous paradigm I've ever worked in. With good libraries, it is only slightly harder than writing synchronous code.
Pro: No multithreading / locking bugs. True, you invest up front in writing more verbose code that expresses a proper asynchronous workflow with no blocking operations. And you need to write some tests and get the thing to work (it is a scripting language and fat fingering variable names is only caught at unit-test time). BUT, once you get it to work, the surface area for heisenbugs -- strange problems that only manifest once in a million runs -- that surface area is just much much lower. The taxes writing Node.js code are heavily front-loaded into the coding phase. Then you tend to end up with stable code.
Pro: JavaScript is much more lightweight for expressing functionality. It's hard to prove this with words, but JSON, dynamic typing, lambda notation, prototypal inheritance, lightweight modules, whatever ... it just tends to take less code to express the same ideas.
Con: Maybe you really, really like coding services in Java?
For another perspective on JavaScript and Node.js, check out From Java to Node.js, a blog post on a Java developer's impressions and experiences learning Node.js.
Modules
When considering node, keep in mind that your choice of JavaScript libraries will DEFINE your experience. Most people use at least two, an asynchronous pattern helper (Step, Futures, Async), and a JavaScript sugar module (Underscore.js).
Helper / JavaScript Sugar:
Underscore.js - use this. Just do it. It makes your code nice and readable with stuff like _.isString(), and _.isArray(). I'm not really sure how you could write safe code otherwise. Also, for enhanced command-line-fu, check out my own Underscore-CLI.
Asynchronous Pattern Modules:
Step - a very elegant way to express combinations of serial and parallel actions. My personal reccomendation. See my post on what Step code looks like.
Futures - much more flexible (is that really a good thing?) way to express ordering through requirements. Can express things like "start a, b, c in parallel. When A, and B finish, start AB. When A, and C finish, start AC." Such flexibility requires more care to avoid bugs in your workflow (like never calling the callback, or calling it multiple times). See Raynos's post on using futures (this is the post that made me "get" futures).
Async - more traditional library with one method for each pattern. I started with this before my religious conversion to step and subsequent realization that all patterns in Async could be expressed in Step with a single more readable paradigm.
TameJS - Written by OKCupid, it's a precompiler that adds a new language primative "await" for elegantly writing serial and parallel workflows. The pattern looks amazing, but it does require pre-compilation. I'm still making up my mind on this one.
StreamlineJS - competitor to TameJS. I'm leaning toward Tame, but you can make up your own mind.
Or to read all about the asynchronous libraries, see this panel-interview with the authors.
Web Framework:
Express Great Ruby on Rails-esk framework for organizing web sites. It uses JADE as a XML/HTML templating engine, which makes building HTML far less painful, almost elegant even.
jQuery While not technically a node module, jQuery is quickly becoming a de-facto standard for client-side user interface. jQuery provides CSS-like selectors to 'query' for sets of DOM elements that can then be operated on (set handlers, properties, styles, etc). Along the same vein, Twitter's Bootstrap CSS framework, Backbone.js for an MVC pattern, and Browserify.js to stitch all your JavaScript files into a single file. These modules are all becoming de-facto standards so you should at least check them out if you haven't heard of them.
Testing:
JSHint - Must use; I didn't use this at first which now seems incomprehensible. JSLint adds back a bunch of the basic verifications you get with a compiled language like Java. Mismatched parenthesis, undeclared variables, typeos of many shapes and sizes. You can also turn on various forms of what I call "anal mode" where you verify style of whitespace and whatnot, which is OK if that's your cup of tea -- but the real value comes from getting instant feedback on the exact line number where you forgot a closing ")" ... without having to run your code and hit the offending line. "JSHint" is a more-configurable variant of Douglas Crockford's JSLint.
Mocha competitor to Vows which I'm starting to prefer. Both frameworks handle the basics well enough, but complex patterns tend to be easier to express in Mocha.
Vows Vows is really quite elegant. And it prints out a lovely report (--spec) showing you which test cases passed / failed. Spend 30 minutes learning it, and you can create basic tests for your modules with minimal effort.
Zombie - Headless testing for HTML and JavaScript using JSDom as a virtual "browser". Very powerful stuff. Combine it with Replay to get lightning fast deterministic tests of in-browser code.
A comment on how to "think about" testing:
Testing is non-optional. With a dynamic language like JavaScript, there are very few static checks. For example, passing two parameters to a method that expects 4 won't break until the code is executed. Pretty low bar for creating bugs in JavaScript. Basic tests are essential to making up the verification gap with compiled languages.
Forget validation, just make your code execute. For every method, my first validation case is "nothing breaks", and that's the case that fires most often. Proving that your code runs without throwing catches 80% of the bugs and will do so much to improve your code confidence that you'll find yourself going back and adding the nuanced validation cases you skipped.
Start small and break the inertial barrier. We are all lazy, and pressed for time, and it's easy to see testing as "extra work". So start small. Write test case 0 - load your module and report success. If you force yourself to do just this much, then the inertial barrier to testing is broken. That's <30 min to do it your first time, including reading the documentation. Now write test case 1 - call one of your methods and verify "nothing breaks", that is, that you don't get an error back. Test case 1 should take you less than one minute. With the inertia gone, it becomes easy to incrementally expand your test coverage.
Now evolve your tests with your code. Don't get intimidated by what the "correct" end-to-end test would look like with mock servers and all that. Code starts simple and evolves to handle new cases; tests should too. As you add new cases and new complexity to your code, add test cases to exercise the new code. As you find bugs, add verifications and / or new cases to cover the flawed code. When you are debugging and lose confidence in a piece of code, go back and add tests to prove that it is doing what you think it is. Capture strings of example data (from other services you call, websites you scrape, whatever) and feed them to your parsing code. A few cases here, improved validation there, and you will end up with highly reliable code.
Also, check out the official list of recommended Node.js modules. However, GitHub's Node Modules Wiki is much more complete and a good resource.
To understand Node, it's helpful to consider a few of the key design choices:
Node.js is EVENT BASED and ASYNCHRONOUS / NON-BLOCKING. Events, like an incoming HTTP connection will fire off a JavaScript function that does a little bit of work and kicks off other asynchronous tasks like connecting to a database or pulling content from another server. Once these tasks have been kicked off, the event function finishes and Node.js goes back to sleep. As soon as something else happens, like the database connection being established or the external server responding with content, the callback functions fire, and more JavaScript code executes, potentially kicking off even more asynchronous tasks (like a database query). In this way, Node.js will happily interleave activities for multiple parallel workflows, running whatever activities are unblocked at any point in time. This is why Node.js does such a great job managing thousands of simultaneous connections.
Why not just use one process/thread per connection like everyone else? In Node.js, a new connection is just a very small heap allocation. Spinning up a new process takes significantly more memory, a megabyte on some platforms. But the real cost is the overhead associated with context-switching. When you have 10^6 kernel threads, the kernel has to do a lot of work figuring out who should execute next. A bunch of work has gone into building an O(1) scheduler for Linux, but in the end, it's just way way more efficient to have a single event-driven process than 10^6 processes competing for CPU time. Also, under overload conditions, the multi-process model behaves very poorly, starving critical administration and management services, especially SSHD (meaning you can't even log into the box to figure out how screwed it really is).
Node.js is SINGLE THREADED and LOCK FREE. Node.js, as a very deliberate design choice only has a single thread per process. Because of this, it's fundamentally impossible for multiple threads to access data simultaneously. Thus, no locks are needed. Threads are hard. Really really hard. If you don't believe that, you haven't done enough threaded programming. Getting locking right is hard and results in bugs that are really hard to track down. Eliminating locks and multi-threading makes one of the nastiest classes of bugs just go away. This might be the single biggest advantage of node.
But how do I take advantage of my 16 core box?
Two ways:
For big heavy compute tasks like image encoding, Node.js can fire up child processes or send messages to additional worker processes. In this design, you'd have one thread managing the flow of events and N processes doing heavy compute tasks and chewing up the other 15 CPUs.
For scaling throughput on a webservice, you should run multiple Node.js servers on one box, one per core, using cluster (With Node.js v0.6.x, the official "cluster" module linked here replaces the learnboost version which has a different API). These local Node.js servers can then compete on a socket to accept new connections, balancing load across them. Once a connection is accepted, it becomes tightly bound to a single one of these shared processes. In theory, this sounds bad, but in practice it works quite well and allows you to avoid the headache of writing thread-safe code. Also, this means that Node.js gets excellent CPU cache affinity, more effectively using memory bandwidth.
Node.js lets you do some really powerful things without breaking a sweat. Suppose you have a Node.js program that does a variety of tasks, listens on a TCP port for commands, encodes some images, whatever. With five lines of code, you can add in an HTTP based web management portal that shows the current status of active tasks. This is EASY to do:
var http = require('http');
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end(myJavascriptObject.getSomeStatusInfo());
}).listen(1337, "127.0.0.1");
Now you can hit a URL and check the status of your running process. Add a few buttons, and you have a "management portal". If you have a running Perl / Python / Ruby script, just "throwing in a management portal" isn't exactly simple.
But isn't JavaScript slow / bad / evil / spawn-of-the-devil? JavaScript has some weird oddities, but with "the good parts" there's a very powerful language there, and in any case, JavaScript is THE language on the client (browser). JavaScript is here to stay; other languages are targeting it as an IL, and world class talent is competing to produce the most advanced JavaScript engines. Because of JavaScript's role in the browser, an enormous amount of engineering effort is being thrown at making JavaScript blazing fast. V8 is the latest and greatest javascript engine, at least for this month. It blows away the other scripting languages in both efficiency AND stability (looking at you, Ruby). And it's only going to get better with huge teams working on the problem at Microsoft, Google, and Mozilla, competing to build the best JavaScript engine (It's no longer a JavaScript "interpreter" as all the modern engines do tons of JIT compiling under the hood with interpretation only as a fallback for execute-once code). Yeah, we all wish we could fix a few of the odder JavaScript language choices, but it's really not that bad. And the language is so darn flexible that you really aren't coding JavaScript, you are coding Step or jQuery -- more than any other language, in JavaScript, the libraries define the experience. To build web applications, you pretty much have to know JavaScript anyway, so coding with it on the server has a sort of skill-set synergy. It has made me not dread writing client code.
Besides, if you REALLY hate JavaScript, you can use syntactic sugar like CoffeeScript. Or anything else that creates JavaScript code, like Google Web Toolkit (GWT).
Speaking of JavaScript, what's a "closure"? - Pretty much a fancy way of saying that you retain lexically scoped variables across call chains. ;) Like this:
var myData = "foo";
database.connect( 'user:pass', function myCallback( result ) {
database.query("SELECT * from Foo where id = " + myData);
} );
// Note that doSomethingElse() executes _BEFORE_ "database.query" which is inside a callback
doSomethingElse();
See how you can just use "myData" without doing anything awkward like stashing it into an object? And unlike in Java, the "myData" variable doesn't have to be read-only. This powerful language feature makes asynchronous-programming much less verbose and less painful.
Writing asynchronous code is always going to be more complex than writing a simple single-threaded script, but with Node.js, it's not that much harder and you get a lot of benefits in addition to the efficiency and scalability to thousands of concurrent connections...
Locked. There are disputes about this answer’s content being resolved at this time. It is not currently accepting new interactions.
I think the advantages are:
Web development in a dynamic language (JavaScript) on a VM that is incredibly fast (V8). It is much faster than Ruby, Python, or Perl.
Ability to handle thousands of concurrent connections with minimal overhead on a single process.
JavaScript is perfect for event loops with first class function objects and closures. People already know how to use it this way having used it in the browser to respond to user initiated events.
A lot of people already know JavaScript, even people who do not claim to be programmers. It is arguably the most popular programming language.
Using JavaScript on a web server as well as the browser reduces the impedance mismatch between the two programming environments which can communicate data structures via JSON that work the same on both sides of the equation. Duplicate form validation code can be shared between server and client, etc.
V8 is an implementation of JavaScript. It lets you run standalone JavaScript applications (among other things).
Node.js is simply a library written for V8 which does evented I/O. This concept is a bit trickier to explain, and I'm sure someone will answer with a better explanation than I... The gist is that rather than doing some input or output and waiting for it to happen, you just don't wait for it to finish. So for example, ask for the last edited time of a file:
// Pseudo code
stat( 'somefile' )
That might take a couple of milliseconds, or it might take seconds. With evented I/O you simply fire off the request and instead of waiting around you attach a callback that gets run when the request finishes:
// Pseudo code
stat( 'somefile', function( result ) {
// Use the result here
} );
// ...more code here
This makes it a lot like JavaScript code in the browser (for example, with Ajax style functionality).
For more information, you should check out the article Node.js is genuinely exciting which was my introduction to the library/platform... I found it quite good.
Node.js is an open source command line tool built for the server side JavaScript code. You can download a tarball, compile and install the source. It lets you run JavaScript programs.
The JavaScript is executed by the V8, a JavaScript engine developed by Google which is used in Chrome browser. It uses a JavaScript API to access the network and file system.
It is popular for its performance and the ability to perform parallel operations.
Understanding node.js is the best explanation of node.js I have found so far.
Following are some good articles on the topic.
Learning Server-Side JavaScript with Node.js
This Time, You’ll Learn Node.js
The closures are a way to execute code in the context it was created in.
What this means for concurency is that you can define variables, then initiate a nonblocking I/O function, and send it an anonymous function for its callback.
When the task is complete, the callback function will execute in the context with the variables, this is the closure.
The reason closures are so good for writing applications with nonblocking I/O is that it's very easy to manage the context of functions executing asynchronously.
Two good examples are regarding how you manage templates and use progressive enhancements with it. You just need a few lightweight pieces of JavaScript code to make it work perfectly.
I strongly recommend that you watch and read these articles:
Video glass node
Node.js YUI DOM manipulation
Pick up any language and try to remember how you would manage your HTML file templates and what you had to do to update a single CSS class name in your DOM structure (for instance, a user clicked on a menu item and you want that marked as "selected" and update the content of the page).
With Node.js it is as simple as doing it in client-side JavaScript code. Get your DOM node and apply your CSS class to that. Get your DOM node and innerHTML your content (you will need some additional JavaScript code to do this. Read the article to know more).
Another good example, is that you can make your web page compatible both with JavaScript turned on or off with the same piece of code. Imagine you have a date selection made in JavaScript that would allow your users to pick up any date using a calendar. You can write (or use) the same piece of JavaScript code to make it work with your JavaScript turned ON or OFF.
There is a very good fast food place analogy that best explains the event driven model of Node.js, see the full article, Node.js, Doctor’s Offices and Fast Food Restaurants – Understanding Event-driven Programming
Here is a summary:
If the fast food joint followed a traditional thread-based model, you'd order your food and wait in line until you received it. The person behind you wouldn't be able to order until your order was done. In an event-driven model, you order your food and then get out of line to wait. Everyone else is then free to order.
Node.js is event-driven, but most web servers are thread-based.York explains how Node.js works:
You use your web browser to make a request for "/about.html" on a
Node.js web server.
The Node.js server accepts your request and calls a function to retrieve
that file from disk.
While the Node.js server is waiting for the file to be retrieved, it
services the next web request.
When the file is retrieved, there is a callback function that is
inserted in the Node.js servers queue.
The Node.js server executes that function which in this case would
render the "/about.html" page and send it back to your web browser."
Well, I understand that
Node's goal is to provide an easy way
to build scalable network programs.
Node is similar in design to and influenced by systems like Ruby's Event Machine or Python's Twisted.
Evented I/O for V8 javascript.
For me that means that you were correct in all three assumptions. The library sure looks promising!
Also, do not forget to mention that Google's V8 is VERY fast. It actually converts the JavaScript code to machine code with the matched performance of compiled binary. So along with all the other great things, it's INSANELY fast.
Q: The programming model is event driven, especially the way it handles I/O.
Correct. It uses call-backs, so any request to access the file system would cause a request to be sent to the file system and then Node.js would start processing its next request. It would only worry about the I/O request once it gets a response back from the file system, at which time it will run the callback code. However, it is possible to make synchronous I/O requests (that is, blocking requests). It is up to the developer to choose between asynchronous (callbacks) or synchronous (waiting).
Q: It uses JavaScript and the parser is V8.
Yes
Q: It can be easily used to create concurrent server applications.
Yes, although you'd need to hand-code quite a lot of JavaScript. It might be better to look at a framework, such as http://www.easynodejs.com/ - which comes with full online documentation and a sample application.

Categories