Node modules undefined in my webpack bundled script - javascript

I am making a web app which, based on a JPEG picture, recognizes characters and renders them in an interactive interface for the user; this includes some async code. There are four JS script files, which all require npm modules, and an HTML view.
In order to test the app client-side, I have decided to bundle the scripts together.
It shows the following error message:
Uncaught ReferenceError: require is not defined
List of my npm modules whose code returns this error at run time:
isexe: requires fs
destroy: fs
tesseractocr: child_process, fs
I have tried:
browserify my scripts into a bundle, but I read that it would not work with async functions ;
webpack the scripts into a bundle, but Node modules like fs and child_process are returning 'undefined' ;
adding a specific Node module, child-process-ctor, to force child_process to be included.
Alas, the same error message is returned.
Questions:
Is bundling the scripts the right approach?
Is the problem that webpack does not transpile fs and child_process correctly?
Which possible solutions should I consider?
Thanks all. This is my 1st question on SO -- any feedback is much welcome!
PS: This might be redundant with Using module "child_process" without Webpack.

Okay this answer is a follow up to my comments, which answer the question more directly. However here I'll go into more detail than is probably necessary, but it will thoroughly answer what you asked. Plus it's educational and I'd say it's pretty fun once you start really digging into it :D
To start at the beginning. As the internet in its early days became more advanced the need for a type of "front end logic" increased and Netscape's response to this demand was to birth a competitive, cutting edge programming language in record time.
And by record time I mean 10 days, and by competitive I mean barely functional.
That's right, JavaScript was born in 10 days (literally). As you can imagine it was a pretty poor language, but it worked well enough that people started using it.
Because it was the programming language of the internet, and because of how fast the internet grew, enough people started to use it that the thought of removing it became scary.
If you changed it you would destroy backward compatibility with millions of websites. The other idea would be to keep it, but also implement a new standard. However it would be hard to justify this because javascript already took a lot of work to upkeep, upkeeping multiple standards would be a nightmare (cough... flash cough).
JavaScript was easy enough for "new" programmers to learn, but the problem was that JavaScript is only one language in a world where PHP, Ruby, MySQL, Mongo, CSS and HTML all rule as dominant kings in their respective kingdoms.
So someone thought it was a good idea to move javascript to the server and thus node.js was born.
However for javascript to mean anything on the server it had to be able to do things that you wouldn't want it to be able to do in your browser. For example, scan your hard drive and edit your files.
If every website you visit could start scanning and uploading everything in your system well....
However if your server software can't edit or read files you need it to well....
You get the idea. It's the same language, but because of security issues node.js has some differences. Mainly the modules it's allowed to use.
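To make that difference concrete, here is a minimal sketch, assuming a CommonJS script. The helper names isNode and readTextFile are made up for illustration and do not come from any library. Modules such as fs and child_process only exist inside Node, so code that needs them has to stay on the server; a bundler can at best stub them out for the browser.

```javascript
// Hypothetical helper: true under Node.js, false in a browser.
function isNode() {
  return typeof process !== 'undefined' &&
    process.versions != null &&
    typeof process.versions.node === 'string';
}

// Hypothetical helper: reads a file on the server, refuses in the browser.
function readTextFile(path) {
  if (!isNode()) {
    throw new Error('fs is unavailable in a browser; do this on the server');
  }
  const fs = require('fs'); // only executed when running in Node
  return fs.readFileSync(path, 'utf8');
}
```

In the asker's situation this would suggest keeping the tesseractocr call in a small Node server and letting the browser script talk to it over HTTP, rather than bundling it.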
Now for the fun part. Can you run node.js client side in a browser? Technically yes. In fact, now that we're dumping entire operating systems into a subset of javascript called asm.js, there really isn't anything javascript can't do with enough processing power.
However, even if you dump the entire node.js engine (which is basically Chrome's V8 JavaScript engine with system libraries wrapped around it) into asm.js, you would still have the same security limitations placed by the "Host" browser, and so your modules could only run within the sandbox the browser provides.
So it is technically just a browser within another browser, running at half the speed with the same security limitations.
Is it something I would recommend doing? Of course not.
Is it something that people haven't already tried before? Of course not.
So I hope that helps answer your question.

Related

Confused about nodes purpose [duplicate]

I am new to this kind of stuff, but lately I've been hearing a lot about how good Node.js is. Considering how much I love working with jQuery and JavaScript in general, I can't help but wonder how to decide when to use Node.js. The web application I have in mind is something like Bitly - takes some content, archives it.
From all the homework I have been doing in the last few days, I obtained the following information. Node.js
is a command-line tool that can be run as a regular web server and lets one run JavaScript programs
utilizes the great V8 JavaScript engine
is very good when you need to do several things at the same time
is event-based so all the wonderful Ajax-like stuff can be done on the server side
lets us share code between the browser and the backend
lets us talk with MySQL
Some of the sources that I have come across are:
Diving into Node.js – Introduction and Installation
Understanding NodeJS
Node by Example (Archive.is)
Let’s Make a Web App: NodePad
Considering that Node.js can be run almost out-of-the-box on Amazon's EC2 instances, I am trying to understand what type of problems require Node.js as opposed to any of the mighty kings out there like PHP, Python and Ruby. I understand that it really depends on the expertise one has on a language, but my question falls more into the general category of: When to use a particular framework and what type of problems is it particularly suited for?
You did a great job of summarizing what's awesome about Node.js. My feeling is that Node.js is especially suited for applications where you'd like to maintain a persistent connection from the browser back to the server. Using a technique known as "long-polling", you can write an application that sends updates to the user in real time. Doing long polling on many of the web's giants, like Ruby on Rails or Django, would create immense load on the server, because each active client eats up one server process. This situation amounts to a tarpit attack. When you use something like Node.js, the server has no need of maintaining separate threads for each open connection.
This means you can create a browser-based chat application in Node.js that takes almost no system resources to serve a great many clients. Any time you want to do this sort of long-polling, Node.js is a great option.
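As a rough sketch of the long-polling idea, using only Node's built-in http module: the LongPollHub class, the createChatServer function and the /poll and /send routes are all invented for this illustration. Each waiting client costs one stored callback rather than a thread or process.

```javascript
const http = require('http');

// Holds the callbacks of clients that are currently waiting for a message.
class LongPollHub {
  constructor() {
    this.waiting = [];
  }

  // Park a client until the next message is published.
  subscribe(callback) {
    this.waiting.push(callback);
  }

  // Deliver a message to every parked client and clear the list.
  publish(message) {
    const clients = this.waiting;
    this.waiting = [];
    clients.forEach((callback) => callback(message));
    return clients.length;
  }
}

// GET /poll parks the response; POST /send releases all parked responses.
function createChatServer(hub) {
  return http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/poll') {
      hub.subscribe((message) => {
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ message }));
      });
    } else if (req.method === 'POST' && req.url === '/send') {
      let body = '';
      req.on('data', (chunk) => { body += chunk; });
      req.on('end', () => {
        const delivered = hub.publish(body);
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ delivered }));
      });
    } else {
      res.writeHead(404);
      res.end();
    }
  });
}
```

To try it, one could call createChatServer(new LongPollHub()).listen(3000), where the port is an arbitrary choice. A real chat server would also need timeouts and cleanup for clients that disconnect while parked.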
It's worth mentioning that Ruby and Python both have tools to do this sort of thing (eventmachine and twisted, respectively), but that Node.js does it exceptionally well, and from the ground up. JavaScript is exceptionally well situated to a callback-based concurrency model, and it excels here. Also, being able to serialize and deserialize with JSON native to both the client and the server is pretty nifty.
I look forward to reading other answers here, this is a fantastic question.
It's worth pointing out that Node.js is also great for situations in which you'll be reusing a lot of code across the client/server gap. The Meteor framework makes this really easy, and a lot of folks are suggesting this might be the future of web development. I can say from experience that it's a whole lot of fun to write code in Meteor, and a big part of this is spending less time thinking about how you're going to restructure your data, so the code that runs in the browser can easily manipulate it and pass it back.
Here's an article on Pyramid and long-polling, which turns out to be very easy to set up with a little help from gevent: TicTacToe and Long Polling with Pyramid.
I believe Node.js is best suited for real-time applications: online games, collaboration tools, chat rooms, or anything where what one user (or robot? or sensor?) does with the application needs to be seen by other users immediately, without a page refresh.
I should also mention that Socket.IO in combination with Node.js will reduce your real-time latency even further than what is possible with long polling. Socket.IO will fall back to long polling as a worst case scenario, and instead use web sockets or even Flash if they are available.
But I should also mention that just about any situation where the code might block due to threads can be better addressed with Node.js. Or any situation where you need the application to be event-driven.
Also, Ryan Dahl said in a talk that I once attended that the Node.js benchmarks closely rival Nginx for regular old HTTP requests. So if we build with Node.js, we can serve our normal resources quite effectively, and when we need the event-driven stuff, it's ready to handle it.
Plus it's all JavaScript all the time. Lingua Franca on the whole stack.
Reasons to use NodeJS:
It runs Javascript, so you can use the same language on server and client, and even share some code between them (e.g. for form validation, or to render views at either end.)
The single-threaded event-driven system is fast even when handling lots of requests at once, and also simple, compared to traditional multi-threaded Java or ROR frameworks.
The ever-growing pool of packages accessible through NPM, including client and server-side libraries/modules, as well as command-line tools for web development. Most of these are conveniently hosted on github, where sometimes you can report an issue and find it fixed within hours! It's nice to have everything under one roof, with standardized issue reporting and easy forking.
It has become the de facto standard environment in which to run Javascript-related tools and other web-related tools, including task runners, minifiers, beautifiers, linters, preprocessors, bundlers and analytics processors.
It seems quite suitable for prototyping, agile development and rapid product iteration.
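The code-sharing point in the list above can be sketched with a form validator that uses no Node-only or browser-only APIs. The function name validateSignupForm and its rules are hypothetical, chosen only for illustration.

```javascript
// Hypothetical shared validator: plain JavaScript, so the same file can be
// loaded by the server and by the browser.
function validateSignupForm(form) {
  const errors = [];
  if (typeof form.username !== 'string' || form.username.trim().length < 3) {
    errors.push('username must be at least 3 characters');
  }
  if (typeof form.email !== 'string' ||
      !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(form.email)) {
    errors.push('email looks invalid');
  }
  return errors;
}

// Export for Node (CommonJS); in a browser, attach to the global object.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateSignupForm };
} else {
  globalThis.validateSignupForm = validateSignupForm;
}
```

The browser can run the check before submitting, and the server can run the very same function again on the received data, since client-side validation alone cannot be trusted.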
Reasons not to use NodeJS:
It runs Javascript, which has no compile-time type checking. For large, complex safety-critical systems, or projects including collaboration between different organizations, a language which encourages contractual interfaces and provides static type checking may save you some debugging time (and explosions) in the long run. (Although the JVM is stuck with null, so please use Haskell for your nuclear reactors.)
Added to that, many of the packages in NPM are a little raw, and still under rapid development. Some libraries for older frameworks have undergone a decade of testing and bugfixing, and are very stable by now. Npmjs.org has no mechanism to rate packages, which has led to a proliferation of packages doing more or less the same thing, out of which a large percentage are no longer maintained.
Nested callback hell. (Of course there are 20 different solutions to this...)
The ever-growing pool of packages can make one NodeJS project appear radically different from the next. There is a large diversity in implementations due to the huge number of options available (e.g. Express/Sails.js/Meteor/Derby). This can sometimes make it harder for a new developer to jump in on a Node project. Contrast that with a Rails developer joining an existing project: he should be able to get familiar with the app pretty quickly, because all Rails apps are encouraged to use a similar structure.
Dealing with files can be a bit of a pain. Things that are trivial in other languages, like reading a line from a text file, are weird enough to do with Node.js that there's a StackOverflow question on that with 80+ upvotes. There's no simple way to read one record at a time from a CSV file. Etc.
I love NodeJS, it is fast and wild and fun, but I am concerned it has little interest in provable-correctness. Let's hope we can eventually merge the best of both worlds. I am eager to see what will replace Node in the future... :)
To make it short:
Node.js is well suited for applications that have a lot of concurrent connections and each request only needs very few CPU cycles, because the event loop (with all the other clients) is blocked during execution of a function.
A good article about the event loop in Node.js is Mixu's tech blog: Understanding the node.js event loop.
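A small sketch of why CPU-heavy functions hurt, assuming nothing beyond built-in timers (the function name demonstrateBlocking is made up): a timer scheduled for 0 ms cannot fire while a synchronous loop is still running, because the event loop only gets control back once the current function returns.

```javascript
// Schedules a 0 ms timer, then keeps the CPU busy for `busyMs` milliseconds.
// Returns what has been recorded by the time the busy loop ends.
function demonstrateBlocking(busyMs) {
  const events = [];
  setTimeout(() => events.push('timer fired'), 0);

  const start = Date.now();
  while (Date.now() - start < busyMs) {
    // Busy-wait: the event loop cannot run any callback during this loop.
  }

  events.push('busy loop finished');
  return events;
}

const recorded = demonstrateBlocking(20);
console.log(recorded.slice()); // the timer callback has not run yet
```

Every other client's callback is delayed in exactly the same way as this timer, which is why each request should need only a few CPU cycles.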
I have one real-world example where I used Node.js. The company where I work got a client who wanted a simple static HTML website. The website sells a single item via PayPal, and the client also wanted a counter showing the number of items sold. The client expected a huge number of visitors to this website. I decided to build the counter using Node.js and the Express.js framework.
The Node.js application was simple: get the number of items sold from a Redis database, increase the counter when an item is sold, and serve the counter value to users via the API.
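A rough sketch of what such a counter service could look like. This is not the author's code: to stay self-contained it uses Node's built-in http module instead of Express and an in-memory variable as a stand-in for Redis, and the route names /api/counter and /api/sold are invented.

```javascript
const http = require('http');

// In-memory stand-in for the Redis counter (assumption: the real app
// would read and increment a Redis key here instead).
function createCounterStore(initial = 0) {
  let sold = initial;
  return {
    get: () => sold,
    increment: () => { sold += 1; return sold; },
  };
}

// Pure routing function so the logic can be exercised without a socket.
function route(store, method, url) {
  if (method === 'GET' && url === '/api/counter') {
    return { status: 200, body: { sold: store.get() } };
  }
  if (method === 'POST' && url === '/api/sold') {
    return { status: 200, body: { sold: store.increment() } };
  }
  return { status: 404, body: { error: 'not found' } };
}

// Thin HTTP wrapper around the routing function.
function createCounterServer(store) {
  return http.createServer((req, res) => {
    const { status, body } = route(store, req.method, req.url);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}
```

Starting it would be a matter of createCounterServer(createCounterStore()).listen(3000), with the port again being an arbitrary choice.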
Some reasons why I chose to use Node.js in this case
It is very lightweight and fast. There have been over 200,000 visits to this website in three weeks, and minimal server resources have been able to handle it all.
The counter is really easy to make to be real time.
Node.js was easy to configure.
There are lots of modules available for free. For example, I found a Node.js module for PayPal.
In this case, Node.js was an awesome choice.
The most important reasons to start your next project using Node ...
All the coolest dudes are into it ... so it must be fun.
You can hangout at the cooler and have lots of Node adventures to brag about.
You're a penny pincher when it comes to cloud hosting costs.
Been there done that with Rails
You hate IIS deployments
Your old IT job is getting rather dull and you wish you were in a shiny new Start Up.
What to expect ...
You'll feel safe and secure with Express without all the server bloatware you never needed.
Runs like a rocket and scales well.
You dream it. You installed it. The node package repo npmjs.org is the largest ecosystem of open source libraries in the world.
Your brain will get time warped in the land of nested callbacks ...
... until you learn to keep your Promises.
Sequelize and Passport are your new API friends.
Debugging mostly async code will get umm ... interesting.
Time for all Noders to master Typescript.
Who uses it?
PayPal, Netflix, Walmart, LinkedIn, Groupon, Uber, GoDaddy, Dow Jones
Here's why they switched to Node.
There is no such thing as a silver bullet. Everything comes with some cost attached. It is like food: if you eat oily food, you compromise your health, and healthy food does not come with the spices of oily food. It is an individual choice whether one wants health or spice in one's food.
In the same way, Node.js should be considered for specific scenarios. If your app does not fit those scenarios, you should not consider it for your app development. Here are my thoughts on this:
When to use Node.JS
If your server-side code requires very few CPU cycles. In other words, you are doing non-blocking operations and do not have a heavy algorithm or job that consumes lots of CPU cycles.
If you come from a JavaScript background and are comfortable writing single-threaded code, just like client-side JS.
When NOT to use Node.JS
When your server requests depend on a heavy, CPU-consuming algorithm or job.
Scalability Consideration with Node.JS
Node.JS itself does not utilize all the cores of the underlying system; it is single-threaded by default, so you have to write your own logic to make use of a multi-core processor.
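One common way to write that logic is Node's built-in cluster module, sketched below. The helper names workerCount, isPrimaryProcess and start are made up for this example, and the restart-on-exit policy is just one possible choice.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// One worker per CPU core, but never fewer than one.
function workerCount(cpus = os.cpus().length) {
  return Math.max(1, cpus);
}

// cluster.isPrimary exists on Node 16+; older versions call it isMaster.
function isPrimaryProcess() {
  return cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
}

// The primary process forks the workers; each worker runs its own server.
function start(port) {
  if (isPrimaryProcess()) {
    for (let i = 0; i < workerCount(); i += 1) {
      cluster.fork();
    }
    // Replace a worker if it dies.
    cluster.on('exit', () => cluster.fork());
  } else {
    http.createServer((req, res) => {
      res.end(`handled by process ${process.pid}\n`);
    }).listen(port);
  }
}
```

Calling start(3000) at the bottom of the script would launch it. Each forked worker re-runs the same file and takes the else branch, and the workers share the listening port.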
Node.JS Alternatives
There are other options to use in place of Node.JS; Vert.x seems pretty promising and has lots of additional features, such as being polyglot and better scalability considerations.
Another great thing that I think no one has mentioned about Node.js is the amazing community, the package management system (npm) and the amount of modules that exist that you can include by simply including them in your package.json file.
My piece: nodejs is great for making real time systems like analytics, chat-apps, apis, ad servers, etc.
Hell, I made my first chat app using nodejs and socket.io in under 2 hours, and that too during exam week!
Edit
It's been several years since I started using nodejs, and I have used it to make many different things, including static file servers, simple analytics, chat apps and much more.
This is my take on when to use nodejs
When to use
When making systems which put emphasis on concurrency and speed.
Sockets only servers like chat apps, irc apps, etc.
Social networks which put emphasis on realtime resources like geolocation, video stream, audio stream, etc.
Handling small chunks of data really fast like an analytics webapp.
For exposing a REST-only API.
When not to use
It's a very versatile web server, so you can use it wherever you want, but probably not in these places.
Simple blogs and static sites.
Just as a static file server.
Keep in mind that I am just nitpicking. For static file servers, apache is better mainly because it is widely available. The nodejs community has grown larger and more mature over the years and it is safe to say nodejs can be used just about everywhere if you have your own choice of hosting.
It can be used where
Applications that are highly event driven & are heavily I/O bound
Applications handling a large number of connections to other systems
Real-time applications (Node.js was designed from the ground up for real time and to be easy to use.)
Applications that juggle scads of information streaming to and from other sources
High traffic, Scalable applications
Mobile apps that have to talk to platform API & database, without having to do a lot of data analytics
Build out networked applications
Applications that need to talk to the back end very often
On the mobile front, prime-time companies have relied on Node.js for their mobile solutions. Here is why:
LinkedIn is a prominent user. Their entire mobile stack is built on Node.js. They went from running 15 servers with 15 instances on each physical machine, to just 4 instances – that can handle double the traffic!
eBay launched ql.io, a web query language for HTTP APIs, which uses Node.js as the runtime stack. They were able to tune a regular developer-quality Ubuntu workstation to handle more than 120,000 active connections per node.js process, with each connection consuming about 2kB memory!
Walmart re-engineered its mobile app to use Node.js and pushed its JavaScript processing to the server.
Read more at: http://www.pixelatingbits.com/a-closer-look-at-mobile-app-development-with-node-js/
Node is best for concurrent request handling
So, let's start with a story. For the last two years I have been working with JavaScript, developing web front ends, and I am enjoying it. The back-end guys provide us with some APIs written in Java or Python (we don't care), and we simply write an AJAX call, get our data and, guess what, we are done. But in reality it is not that easy. If the data we get is incorrect, or there is a server error, we are stuck and have to contact the back-end guys by mail or chat (sometimes on WhatsApp too :)). This is not cool. What if we wrote our APIs in JavaScript and called them from our front end? That would be pretty cool, because if we hit a problem in an API we can look into it ourselves. Guess what, you can do this now. How? Node is there for you.
OK, agreed that you can write your API in JavaScript, but what if I am fine with the problem above? Do you have any other reason to use Node for a REST API?
Here is where the magic begins. Yes, I do have other reasons to use Node for our APIs.
Let's go back to the traditional REST API system, which is based on either blocking operations or threading. Suppose two concurrent requests arrive (r1 and r2), each of which requires a database operation. In a traditional system, this is what happens:
1. Waiting way: Our server starts serving r1 and waits for the query response. After r1 completes, the server starts serving r2 in the same way. Waiting is not a good idea, because we don't have that much time.
2. Threading way: Our server creates two threads, one for each of r1 and r2, and each serves its request after querying the database, so it is fast. But it consumes memory, because we started two threads, and the problem grows when both requests query the same data, because then you have to deal with deadlock-like issues. So it is better than the waiting way, but there are still issues.
Now, here is how Node does it:
3. Node way: When the same concurrent requests arrive in Node, it registers an event with its callback and moves on; it does not wait for the query response for a particular request. So when r1 arrives, Node's event loop (yes, there is an event loop in Node that serves this purpose) registers an event with its callback function and moves on to serve r2, similarly registering its event with its callback. Whenever a query finishes, it triggers its corresponding event and executes its callback to completion without being interrupted.
So no waiting, no threading, no memory consumption: yes, this is the Node way of serving a REST API.
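The Node way described above can be sketched with timers standing in for database queries. The names fakeQuery and handleRequest and the delays are invented for this illustration; no real database is involved.

```javascript
const log = [];

// Stand-in for a database call: finishes after `ms` milliseconds.
function fakeQuery(name, ms, callback) {
  log.push(`${name}: query started`);
  setTimeout(() => callback(`${name}: rows`), ms);
}

// Registers the callback and returns immediately, without waiting.
function handleRequest(name, ms) {
  fakeQuery(name, ms, (result) => {
    log.push(`${name}: responded with "${result}"`);
  });
}

handleRequest('r1', 50); // slow query
handleRequest('r2', 10); // fast query
log.push('both requests registered, nothing has finished yet');
```

Both queries are in flight before either one completes, and because the simulated query for r2 is shorter, its callback runs before the one for r1. A single thread handled both requests without waiting on either.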
My one more reason to choose Node.js for a new project is:
Be able to do pure cloud based development
I have used Cloud9 IDE for a while and now I can't imagine working without it; it covers the whole development lifecycle. All you need is a browser, and you can code anytime, anywhere, on any device. You don't need to check in code on one computer (like at home) and then check it out on another (like at the workplace).
Of course, there may be cloud-based IDEs for other languages or platforms (Cloud 9 IDE is adding support for other languages as well), but using Cloud 9 to do Node.js development is really a great experience for me.
One more thing Node provides is the ability to create multiple V8 instances of Node on the fly using Node's child process API (childProcess.fork(), each requiring 10 MB of memory as per the docs), thus not affecting the main process running the server. So offloading a background job that puts a huge load on the server becomes child's play, and we can easily kill the children as and when needed.
I've been using Node a lot, and most of the apps we build require many simultaneous server connections and thus heavy network traffic. Frameworks like Express.js and the new Koa.js (which removed callback hell) have made working with Node even easier.
Donning asbestos longjohns...
Yesterday my title with Packt Publications, Reactive Programming with JavaScript, came out. It isn't really a Node.js-centric title; early chapters are intended to cover theory, and later code-heavy chapters cover practice. Because I didn't really think it would be appropriate to fail to give readers a webserver, Node.js seemed by far the obvious choice. The case was closed before it was even opened.
I could have given a very rosy view of my experience with Node.js. Instead I was honest about good points and bad points I encountered.
Let me include a few quotes that are relevant here:
Warning: Node.js and its ecosystem are hot--hot enough to burn you badly!
When I was a teacher’s assistant in math, one of the non-obvious suggestions I was told was not to tell a student that something was “easy.” The reason was somewhat obvious in retrospect: if you tell people something is easy, someone who doesn’t see a solution may end up feeling (even more) stupid, because not only do they not get how to solve the problem, but the problem they are too stupid to understand is an easy one!
There are gotchas that don’t just annoy people coming from Python / Django, which immediately reloads the source if you change anything. With Node.js, the default behavior is that if you make one change, the old version continues to be active until the end of time or until you manually stop and restart the server. This inappropriate behavior doesn’t just annoy Pythonistas; it also irritates native Node.js users who provide various workarounds. The StackOverflow question “Auto-reload of files in Node.js” has, at the time of this writing, over 200 upvotes and 19 answers; an edit directs the user to a nanny script, node-supervisor, with homepage at http://tinyurl.com/reactjs-node-supervisor. This problem affords new users with great opportunity to feel stupid because they thought they had fixed the problem, but the old, buggy behavior is completely unchanged. And it is easy to forget to bounce the server; I have done so multiple times. And the message I would like to give is, “No, you’re not stupid because this behavior of Node.js bit your back; it’s just that the designers of Node.js saw no reason to provide appropriate behavior here. Do try to cope with it, perhaps taking a little help from node-supervisor or another solution, but please don’t walk away feeling that you’re stupid. You’re not the one with the problem; the problem is in Node.js’s default behavior.”
This section, after some debate, was left in, precisely because I don't want to give an impression of “It’s easy.” I cut my hands repeatedly while getting things to work, and I don’t want to smooth over difficulties and set you up to believe that getting Node.js and its ecosystem to function well is a straightforward matter and if it’s not straightforward for you too, you don’t know what you’re doing. If you don’t run into obnoxious difficulties using Node.js, that’s wonderful. If you do, I would hope that you don’t walk away feeling, “I’m stupid—there must be something wrong with me.” You’re not stupid if you experience nasty surprises dealing with Node.js. It’s not you! It’s Node.js and its ecosystem!
The Appendix, which I did not really want after the rising crescendo in the last chapters and the conclusion, talks about what I was able to find in the ecosystem, and provided a workaround for moronic literalism:
Another database that seemed like a perfect fit, and may yet be redeemable, is a server-side implementation of the HTML5 key-value store. This approach has the cardinal advantage of an API that most good front-end developers understand well enough. For that matter, it’s also an API that most not-so-good front-end developers understand well enough. But with the node-localstorage package, while dictionary-syntax access is not offered (you want to use localStorage.setItem(key, value) or localStorage.getItem(key), not localStorage[key]), the full localStorage semantics are implemented, including a default 5MB quota—WHY? Do server-side JavaScript developers need to be protected from themselves?
For client-side database capabilities, a 5MB quota per website is really a generous and useful amount of breathing room to let developers work with it. You could set a much lower quota and still offer developers an immeasurable improvement over limping along with cookie management. A 5MB limit doesn’t lend itself very quickly to Big Data client-side processing, but there is a really quite generous allowance that resourceful developers can use to do a lot. But on the other hand, 5MB is not a particularly large portion of most disks purchased any time recently, meaning that if you and a website disagree about what is reasonable use of disk space, or some site is simply hoggish, it does not really cost you much and you are in no danger of a swamped hard drive unless your hard drive was already too full. Maybe we would be better off if the balance were a little less or a little more, but overall it’s a decent solution to address the intrinsic tension for a client-side context.
However, it might gently be pointed out that when you are the one writing code for your server, you don’t need any additional protection from making your database more than a tolerable 5MB in size. Most developers will neither need nor want tools acting as a nanny and protecting them from storing more than 5MB of server-side data. And the 5MB quota that is a golden balancing act on the client-side is rather a bit silly on a Node.js server. (And, for a database for multiple users such as is covered in this Appendix, it might be pointed out, slightly painfully, that that’s not 5MB per user account unless you create a separate database on disk for each user account; that’s 5MB shared between all user accounts together. That could get painful if you go viral!) The documentation states that the quota is customizable, but an email a week ago to the developer asking how to change the quota is unanswered, as was the StackOverflow question asking the same. The only answer I have been able to find is in the Github CoffeeScript source, where it is listed as an optional second integer argument to a constructor. So that’s easy enough, and you could specify a quota equal to a disk or partition size. But besides porting a feature that does not make sense, the tool’s author has failed completely to follow a very standard convention of interpreting 0 as meaning “unlimited” for a variable or function where an integer is to specify a maximum limit for some resource use. The best thing to do with this misfeature is probably to specify that the quota is Infinity:
if (typeof localStorage === 'undefined' || localStorage === null) {
    var LocalStorage = require('node-localstorage').LocalStorage;
    localStorage = new LocalStorage(__dirname + '/localStorage', Infinity);
}
Swapping two comments in order:
People needlessly shot themselves in the foot constantly using JavaScript as a whole, and part of JavaScript being made respectable language was a Douglas Crockford saying in essence, “JavaScript as a language has some really good parts and some really bad parts. Here are the good parts. Just forget that anything else is there.” Perhaps the hot Node.js ecosystem will grow its own “Douglas Crockford,” who will say, “The Node.js ecosystem is a coding Wild West, but there are some real gems to be found. Here’s a roadmap. Here are the areas to avoid at almost any cost. Here are the areas with some of the richest paydirt to be found in ANY language or environment.”
Perhaps someone else can take those words as a challenge, and follow Crockford’s lead and write up “the good parts” and / or “the better parts” for Node.js and its ecosystem. I’d buy a copy!
And given the degree of enthusiasm and sheer work-hours on all projects, it may be warranted in a year, or two, or three, to sharply temper any remarks about an immature ecosystem made at the time of this writing. It really may make sense in five years to say, “The 2015 Node.js ecosystem had several minefields. The 2020 Node.js ecosystem has multiple paradises.”
If your application mainly tethers web apis, or other io channels, give or take a user interface, node.js may be a fair pick for you, especially if you want to squeeze out the most scalability, or, if your main language in life is javascript (or javascript transpilers of sorts). If you build microservices, node.js is also okay. Node.js is also suitable for any project that is small or simple.
Its main selling point is that it allows front-enders to take responsibility for back-end stuff rather than the typical divide. Another justifiable selling point is if your workforce is javascript oriented to begin with.
Beyond a certain point however, you cannot scale your code without terrible hacks for forcing modularity, readability and flow control. Some people like those hacks though, especially coming from an event-driven javascript background, they seem familiar or forgivable.
In particular, when your application needs to perform synchronous flows, you start bleeding over half-baked solutions that slow you down considerably in terms of your development process. If you have computation intensive parts in your application, tread with caution picking (only) node.js. Maybe http://koajs.com/ or other novelties alleviate those originally thorny aspects, compared to when I originally used node.js or wrote this.
I can share a few points on where and why to use Node.js.
For real-time applications like chat or collaborative editing, Node.js is the better choice because it is event-based: the server fires events and pushes data to the clients.
It is simple and easy to understand because it is JavaScript-based, which most people already know.
Most current web applications are moving towards AngularJS and Backbone; with Node it is easy to interact with client-side code because both use JSON data.
Lots of plugins are available.
Drawbacks:-
Node will support most of databases but best is mongodb which won't support complex joins and others.
Compilation Errors...developer should handle each and every exceptions other wise if any error accord application will stop working where again we need to go and start it manually or using any automation tool.
Conclusion:-
Nodejs best to use for simple and real time applications..if you have very big business logic and complex functionality better should not use nodejs.
If you want to build an application along with chat and any collaborative functionality.. node can be used in specific parts and remain should go with your convenience technology.
Node is great for quick prototypes but I'd never use it again for anything complex.
I spent 20 years developing a relationship with a compiler and I sure miss it.
Node is especially painful for maintaining code that you haven't visited for a while. Type info and compile time error detection are GOOD THINGS. Why throw all that out? For what? And dang, when something does go south the stack traces are quite often completely useless.

node.js for cpu intensive operations

There is something that I'm really not understanding about Node.js: pretty much everywhere you can read that node.js is not recommended for HPC (high performance computing) due to its async but single-threaded nature.
You can find node.js pretty much always explained with Express.js to build some really fast web server or service that also allows you to send HTML or JSON in your response after some query to an SQL or NoSQL database.
But here's the thing.
You can also find on npm lots of packages built for time-consuming and intensive operations, like fluent-ffmpeg for video encoding.
Or you can use request and cheerio and build a web scraper.
Npm is also full of command line applications written for node.js (in node.js). Are all of these applications for non-time-consuming operations?
We can also find a lot of frameworks, like next.js, that, at least to me, don't seem like they are doing something so easy.
So, what can we do with node.js?
What does "cpu intensive operations" really mean?
I love using node and javascript to build web servers, services and command line applications too, but sometimes I feel like I have not understood the real potential and the real limits of node.js.
If you look closely at the fluent-ffmpeg package, you'll note that it says:
In order to be able to use this module, make sure you have ffmpeg installed on your system
This is a hint as to what's going on in this case. This package is not reimplementing the entirety of ffmpeg, but instead simply serving as an API to an existing ffmpeg installation.
If you look at the code, you can see that it's actually just spawning a copy of ffmpeg to do the work. This therefore isn't actually running "in node".
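To make the "spawning a copy" idea concrete, here is a minimal sketch of how a wrapper can hand work to an external program with child_process.spawn. It is not the package's actual code: runExternal is a made-up helper name, and a second Node process stands in for ffmpeg so the example runs without ffmpeg installed.

```javascript
const { spawn } = require('child_process');

// runExternal is a hypothetical helper, not part of any real package.
// It starts an external program and collects whatever it prints to stdout.
function runExternal(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject); // e.g. the binary is not installed
    child.on('close', (code) => resolve({ code, output }));
  });
}

// Assumption: a second Node process plays the role of the heavy external tool.
// A real wrapper would pass something like 'ffmpeg' and its arguments instead.
const demo = runExternal(process.execPath, ['-e', 'console.log(6 * 7)']);

demo.then(({ code, output }) => {
  console.log('exit code:', code, 'output:', output.trim());
});
```

The parent process only waits on events from the child, so its event loop stays free while the other program does the CPU-heavy part.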
So that's ffmpeg, what about your other examples? Well, I suspect most of them aren't as CPU heavy as you might think - after all, the entire design of many, many node applications is to deal with HTML and webpages, and a scraper isn't something that takes a lot of processing power to do.
So, "What does 'cpu intensive operations' really mean?" is a pretty subjective question. Some things to note from your source link and real life:
The copyright at the bottom of the page is 2011. That's ancient in javascript development time. This advice was written before many iterations and innovations happened. It's likely not wholly wrong, but it's missing the current point of view we have.
CPU-heavy applications are called out in comparison to their I/O:
very heavy on CPU usage, and very light on actual I/O
Web scrapers are probably not considered "light on actual I/O"
This is a subjective choice. No one can dictate exactly how you should be implementing your application. If they were, they'd be writing it, not you.
The real world is not strictly divided into "CPU-intensive" and not. Many applications start with some requirements that look great for node, and then later, some get added that aren't as perfect a fit, or are even a bad fit. Real world teams can't always reinvent everything whenever a new requirement gets added, so shims like the mentioned ffmpeg package get created.
So how do you know the limits? Again, this is a subjective choice. It's fair to set some hard boundaries, like video encoding, as things that really should not be done in pure javascript. But the space from there to a simple API gets pretty murky depending on the exact requirements and details. If it works and is reasonably performant, it's probably ok! You might get more performance out of another system, but you might also lose your knowledge of the ecosystem and integration with the community.

Client side usage of Stylus (CSS)

the new guy here. I've been looking for a good solution to using Stylus (compiled CSS) client side.
Now, I know the tips regarding not using compiled CSS client side because:
It breaks if JS is not in use.
It takes extra time to compile in a live client environment.
It needs to be recompiled at every single client, which just isn't green.
However, my environment is an extension made for Chrome and Opera. It works in a JS environment and it works offline, so none of 1, 2 or 3 applies. What I'm really looking for here is just a way to write CSS more efficiently with fewer headaches, more variables, nesting and mixins.
I have tried Less, which is the only one of the trio Less, Sass and Stylus which currently works nicely client side. So, does anyone know a good solution for Stylus?
CSS preprocessors aren't actually meant to be run client-side. Some tools (e.g. LESS) provide a development-time client-side (JavaScript) compiler that will compile on the fly; however, this isn't meant for production.
The fact that Stylus/Sass do not provide this by default is actually a good thing and I personally wish that LESS did not; however, at the same time, I do realize that having it opens the door to people that may prefer to have some training wheels which can help them along in the beginning. Everyone learns in a different way so this may be just the feature that can get certain groups of people in the door initially. So, for development, it may be fine, but at the time of this writing, this workflow is not the most performant thing to do in production. Hopefully, at some point, most of the useful features in these tools will be added to native CSS then this will be a moot point.
Right now, my advice would be to deploy the compiled CSS only and use something like watch or guard or live-reload or codekit (or any suitable equivalent file watcher) in development so your stylus files are getting re-compiled as you code.
This page likely has the solution: http://learnboost.github.io/stylus/try.html
It seems to be compiling Stylus on the fly.
Stylus is capable of running in the browser
There's a client branch available in the GitHub repo
I don't totally understand your question, but I'll offer some of the experience I've had with compiled CSS using LESS.
Earlier implementations needed JavaScript to compile the LESS files into CSS in the browser. I've never tried to work this way; it didn't seem that great to me, and as you say, if JS is switched off you're in for a rough time.
Recently I've been using applications to compile the LESS code into valid CSS, which gets around the need for JS to convert the source code.
The first application I used was crunch http://crunchapp.net/ which worked quite well but didn't compile the CSS on the fly.
The application I'm using now is called simpless http://wearekiss.com/simpless and it creates valid CSS on the fly, so as soon as I've hit save in Sublime Text and refreshed the browser I can see my changes to the CSS.
Using this workflow, I'm able to get around the issues you raised above. When I'm done with development I just upload the CSS file output by simpless, which is also heavily minified, which saves time in terms of needing to optimise the CSS further.
I hope I have understood the question correctly, if not apologies.
Cheers,
Stefan

How to decide when to use Node.js?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am new to this kind of stuff, but lately I've been hearing a lot about how good Node.js is. Considering how much I love working with jQuery and JavaScript in general, I can't help but wonder how to decide when to use Node.js. The web application I have in mind is something like Bitly - takes some content, archives it.
From all the homework I have been doing in the last few days, I obtained the following information. Node.js
is a command-line tool that can be run as a regular web server and lets one run JavaScript programs
utilizes the great V8 JavaScript engine
is very good when you need to do several things at the same time
is event-based so all the wonderful Ajax-like stuff can be done on the server side
lets us share code between the browser and the backend
lets us talk with MySQL
Some of the sources that I have come across are:
Diving into Node.js – Introduction and Installation
Understanding NodeJS
Node by Example (Archive.is)
Let’s Make a Web App: NodePad
Considering that Node.js can be run almost out-of-the-box on Amazon's EC2 instances, I am trying to understand what type of problems require Node.js as opposed to any of the mighty kings out there like PHP, Python and Ruby. I understand that it really depends on the expertise one has on a language, but my question falls more into the general category of: When to use a particular framework and what type of problems is it particularly suited for?
You did a great job of summarizing what's awesome about Node.js. My feeling is that Node.js is especially suited for applications where you'd like to maintain a persistent connection from the browser back to the server. Using a technique known as "long-polling", you can write an application that sends updates to the user in real time. Doing long polling on many of the web's giants, like Ruby on Rails or Django, would create immense load on the server, because each active client eats up one server process. This situation amounts to a tarpit attack. When you use something like Node.js, the server has no need of maintaining separate threads for each open connection.
This means you can create a browser-based chat application in Node.js that takes almost no system resources to serve a great many clients. Any time you want to do this sort of long-polling, Node.js is a great option.
It's worth mentioning that Ruby and Python both have tools to do this sort of thing (eventmachine and twisted, respectively), but that Node.js does it exceptionally well, and from the ground up. JavaScript is exceptionally well situated to a callback-based concurrency model, and it excels here. Also, being able to serialize and deserialize with JSON native to both the client and the server is pretty nifty.
I look forward to reading other answers here, this is a fantastic question.
It's worth pointing out that Node.js is also great for situations in which you'll be reusing a lot of code across the client/server gap. The Meteor framework makes this really easy, and a lot of folks are suggesting this might be the future of web development. I can say from experience that it's a whole lot of fun to write code in Meteor, and a big part of this is spending less time thinking about how you're going to restructure your data, so the code that runs in the browser can easily manipulate it and pass it back.
Here's an article on Pyramid and long-polling, which turns out to be very easy to set up with a little help from gevent: TicTacToe and Long Polling with Pyramid.
I believe Node.js is best suited for real-time applications: online games, collaboration tools, chat rooms, or anything where what one user (or robot? or sensor?) does with the application needs to be seen by other users immediately, without a page refresh.
I should also mention that Socket.IO in combination with Node.js will reduce your real-time latency even further than what is possible with long polling. Socket.IO will fall back to long polling as a worst case scenario, and instead use web sockets or even Flash if they are available.
But I should also mention that just about any situation where the code might block due to threads can be better addressed with Node.js. Or any situation where you need the application to be event-driven.
Also, Ryan Dahl said in a talk that I once attended that the Node.js benchmarks closely rival Nginx for regular old HTTP requests. So if we build with Node.js, we can serve our normal resources quite effectively, and when we need the event-driven stuff, it's ready to handle it.
Plus it's all JavaScript all the time. Lingua Franca on the whole stack.
Reasons to use NodeJS:
It runs Javascript, so you can use the same language on server and client, and even share some code between them (e.g. for form validation, or to render views at either end.)
The single-threaded event-driven system is fast even when handling lots of requests at once, and also simple, compared to traditional multi-threaded Java or ROR frameworks.
The ever-growing pool of packages accessible through NPM, including client and server-side libraries/modules, as well as command-line tools for web development. Most of these are conveniently hosted on github, where sometimes you can report an issue and find it fixed within hours! It's nice to have everything under one roof, with standardized issue reporting and easy forking.
It has become the de facto standard environment in which to run Javascript-related tools and other web-related tools, including task runners, minifiers, beautifiers, linters, preprocessors, bundlers and analytics processors.
It seems quite suitable for prototyping, agile development and rapid product iteration.
Reasons not to use NodeJS:
It runs Javascript, which has no compile-time type checking. For large, complex safety-critical systems, or projects including collaboration between different organizations, a language which encourages contractual interfaces and provides static type checking may save you some debugging time (and explosions) in the long run. (Although the JVM is stuck with null, so please use Haskell for your nuclear reactors.)
Added to that, many of the packages in NPM are a little raw, and still under rapid development. Some libraries for older frameworks have undergone a decade of testing and bugfixing, and are very stable by now. Npmjs.org has no mechanism to rate packages, which has led to a proliferation of packages doing more or less the same thing, out of which a large percentage are no longer maintained.
Nested callback hell. (Of course there are 20 different solutions to this...)
The ever-growing pool of packages can make one NodeJS project appear radically different from the next. There is a large diversity in implementations due to the huge number of options available (e.g. Express/Sails.js/Meteor/Derby). This can sometimes make it harder for a new developer to jump in on a Node project. Contrast that with a Rails developer joining an existing project: he should be able to get familiar with the app pretty quickly, because all Rails apps are encouraged to use a similar structure.
Dealing with files can be a bit of a pain. Things that are trivial in other languages, like reading a line from a text file, are weird enough to do with Node.js that there's a StackOverflow question on that with 80+ upvotes. There's no simple way to read one record at a time from a CSV file. Etc.
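For what it's worth, here is a minimal sketch of one way to read a text file line by line using only built-in modules, assuming a Node version recent enough to support async iteration over a readline interface. The readLines helper and the temp file name are made up for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const readline = require('readline');

// readLines is a hypothetical helper: it streams the file and yields one
// line at a time instead of loading the whole file into memory first.
async function readLines(filePath) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  const collected = [];
  for await (const line of rl) {
    collected.push(line);
  }
  return collected;
}

// Write a small sample file so the example is self-contained.
const samplePath = path.join(os.tmpdir(), `lines-demo-${process.pid}.txt`);
fs.writeFileSync(samplePath, 'first\nsecond\nthird\n');

const lines = readLines(samplePath);
lines.then((result) => console.log(result));
```

It works, but it is still noticeably more ceremony than the equivalent in many other languages, which is the point being made above.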
I love NodeJS, it is fast and wild and fun, but I am concerned it has little interest in provable-correctness. Let's hope we can eventually merge the best of both worlds. I am eager to see what will replace Node in the future... :)
To make it short:
Node.js is well suited for applications that have a lot of concurrent connections and each request only needs very few CPU cycles, because the event loop (with all the other clients) is blocked during execution of a function.
A good article about the event loop in Node.js is Mixu's tech blog: Understanding the node.js event loop.
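A small sketch of what "blocked during execution of a function" means in practice: a timer is asked to fire after 10 ms, but a synchronous loop (busyWait, a made-up helper) holds the single thread for 200 ms, so the timer callback cannot run until the loop is done.

```javascript
const start = Date.now();

// Hypothetical helper that burns CPU synchronously for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else on the event loop can run while we are in here
  }
}

// Ask for a callback after 10 ms and record when it actually runs.
const timerDelay = new Promise((resolve) => {
  setTimeout(() => resolve(Date.now() - start), 10);
});

busyWait(200); // the "CPU-heavy request"

timerDelay.then((elapsed) => {
  console.log(`timer asked for 10 ms, fired after ${elapsed} ms`);
});
```

In a server, every other client's request is in the position of that timer while one CPU-heavy handler runs.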
I have one real-world example where I have used Node.js. The company where I work got a client who wanted a simple static HTML website. The website sells one item using PayPal, and the client also wanted a counter showing the number of items sold. The client expected a huge number of visitors to this website. I decided to build the counter using Node.js and the Express.js framework.
The Node.js application was simple: get the number of sold items from a Redis database, increase the counter when an item is sold, and serve the counter value to users via the API.
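To give an idea of how small such a service can be, here is a rough sketch of the same shape. It is not the original application: a plain variable stands in for the Redis value, the built-in http module stands in for Express.js, and the /sold route is invented for the example.

```javascript
const http = require('http');

// Assumption: an in-memory number replaces the value kept in Redis.
let soldItems = 0;

// Hypothetical routes: POST /sold records a sale, anything else just reads.
function handleRequest(req, res) {
  if (req.method === 'POST' && req.url === '/sold') {
    soldItems += 1;
  }
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ sold: soldItems }));
}

const server = http.createServer(handleRequest);

// Uncomment to actually serve requests (the port number is arbitrary):
// server.listen(3000);
```

With a real datastore the handler would await the read and the increment, but the request flow stays the same.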
Some reasons why I chose to use Node.js in this case
It is very lightweight and fast. There have been over 200000 visits to this website in three weeks, and minimal server resources have been able to handle it all.
It is really easy to make the counter real time.
Node.js was easy to configure.
There are lots of modules available for free. For example, I found a Node.js module for PayPal.
In this case, Node.js was an awesome choice.
The most important reasons to start your next project using Node ...
All the coolest dudes are into it ... so it must be fun.
You can hangout at the cooler and have lots of Node adventures to brag about.
You're a penny pincher when it comes to cloud hosting costs.
Been there done that with Rails
You hate IIS deployments
Your old IT job is getting rather dull and you wish you were in a shiny new Start Up.
What to expect ...
You'll feel safe and secure with Express without all the server bloatware you never needed.
Runs like a rocket and scales well.
You dream it. You installed it. The node package repo npmjs.org is the largest ecosystem of open source libraries in the world.
Your brain will get time warped in the land of nested callbacks ...
... until you learn to keep your Promises.
Sequelize and Passport are your new API friends.
Debugging mostly async code will get umm ... interesting .
Time for all Noders to master Typescript.
Who uses it?
PayPal, Netflix, Walmart, LinkedIn, Groupon, Uber, GoDaddy, Dow Jones
Here's why they switched to Node.
There is no such thing as a silver bullet. Everything comes with some cost attached. It is like food: if you eat oily food, you compromise your health, and healthy food does not come with the spices of oily food. It is an individual choice whether one wants health or spices in one's food.
In the same way, Node.js should be considered for specific scenarios. If your app does not fit those scenarios, you should not consider it for your app development. Here are my thoughts on this:
When to use Node.JS
If your server-side code requires very few CPU cycles. In other words, you are doing non-blocking operations and do not have a heavy algorithm or job that consumes lots of CPU cycles.
If you come from a JavaScript background and are comfortable writing single-threaded code, just like client-side JS.
When NOT to use Node.JS
Your server request depends on a heavy, CPU-consuming algorithm or job.
Scalability Consideration with Node.JS
Node.JS itself does not utilize all the cores of the underlying system and is single-threaded by default; you have to write your own logic to utilize a multi-core processor and make it multi-threaded.
Node.JS Alternatives
There are other options to use in place of Node.JS; Vert.x seems pretty promising and has lots of additional features, like being polyglot and better scalability considerations.
Another great thing that I think no one has mentioned about Node.js is the amazing community, the package management system (npm) and the amount of modules that exist that you can include by simply including them in your package.json file.
My piece: nodejs is great for making real time systems like analytics, chat-apps, apis, ad servers, etc.
Hell, I made my first chat app using nodejs and socket.io in under 2 hours, and that too during exam week!
Edit
It's been several years since I started using nodejs, and I have used it to make many different things, including static file servers, simple analytics, chat apps and much more.
This is my take on when to use nodejs
When to use
When making a system which puts emphasis on concurrency and speed.
Sockets only servers like chat apps, irc apps, etc.
Social networks which put emphasis on realtime resources like geolocation, video stream, audio stream, etc.
Handling small chunks of data really fast like an analytics webapp.
As exposing a REST only api.
When not to use
It's a very versatile webserver, so you can use it wherever you want, but probably not in these places.
Simple blogs and static sites.
Just as a static file server.
Keep in mind that I am just nitpicking. For static file servers, apache is better mainly because it is widely available. The nodejs community has grown larger and more mature over the years and it is safe to say nodejs can be used just about everywhere if you have your own choice of hosting.
It can be used where
Applications that are highly event driven & are heavily I/O bound
Applications handling a large number of connections to other systems
Real-time applications (Node.js was designed from the ground up for real time and to be easy to use.)
Applications that juggle scads of information streaming to and from other sources
High traffic, Scalable applications
Mobile apps that have to talk to platform API & database, without having to do a lot of data analytics
Build out networked applications
Applications that need to talk to the back end very often
On the mobile front, prime-time companies have relied on Node.js for their mobile solutions. Here is why:
LinkedIn is a prominent user. Their entire mobile stack is built on Node.js. They went from running 15 servers with 15 instances on each physical machine, to just 4 instances – that can handle double the traffic!
eBay launched ql.io, a web query language for HTTP APIs, which uses Node.js as the runtime stack. They were able to tune a regular developer-quality Ubuntu workstation to handle more than 120,000 active connections per node.js process, with each connection consuming about 2kB memory!
Walmart re-engineered its mobile app to use Node.js and pushed its JavaScript processing to the server.
Read more at: http://www.pixelatingbits.com/a-closer-look-at-mobile-app-development-with-node-js/
Node is best for concurrent request handling.
So, let's start with a story. For the last 2 years I have been working with JavaScript, developing web front ends, and I enjoy it. The back-end guys provide us with some APIs written in Java or Python (we don't care), and we simply write an AJAX call, get our data and guess what! We are done. But in reality it is not that easy. If the data we get is not correct, or there is some server error, then we are stuck and have to contact our back-end guys by mail or chat (sometimes on WhatsApp too :)). This is not cool. What if we wrote our APIs in JavaScript and called those APIs from our front end? Yes, that's pretty cool, because if we face any problem in the API we can look into it. Guess what! You can do this now. How? Node is there for you.
OK, agreed that you can write your API in JavaScript, but what if I am OK with the above problem? Do you have any other reason to use Node for a REST API?
So here is where the magic begins. Yes, I do have other reasons to use Node for our APIs.
Let's go back to our traditional REST API system, which is based on either blocking operations or threading. Suppose two concurrent requests occur (r1 and r2), and each of them requires a database operation. In a traditional system, this is what happens:
1. Waiting way: Our server starts serving request r1 and waits for the query response. After r1 completes, the server starts serving r2 in the same way. Waiting is not a good idea because we don't have that much time.
2. Threading way: Our server creates two threads, one each for requests r1 and r2, and each serves its request after querying the database, so it is fast. But it consumes memory, because we started two threads, and the problem grows when both requests query the same data, because then you have to deal with deadlock-type issues. So it is better than the waiting way, but there are still issues.
Now here is how Node does it:
3. Node way: When the same concurrent requests come in to Node, it registers an event with its callback and moves ahead; it does not wait for the query response for a particular request. So when request r1 comes in, Node's event loop (yes, there is an event loop in Node which serves this purpose) registers an event with its callback function and moves on to serve request r2, similarly registering its event with its callback. Whenever a query finishes, it triggers its corresponding event, and the callback runs to completion without being interrupted.
So no waiting, no threading, no memory consumption: yes, this is the Node way of serving a REST API.
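The Node way described above can be sketched roughly like this. fakeQuery is a made-up stand-in for a database call (a timer instead of a real query), and the durations are arbitrary. The log shows r2 being started before r1's query has come back, all on a single thread.

```javascript
// Hypothetical stand-in for a database query that takes `ms` milliseconds.
function fakeQuery(name, ms, log) {
  log.push(`${name} started`);
  return new Promise((resolve) => {
    // The callback is only registered here; the function returns
    // right away and the event loop is free to take the next request.
    setTimeout(() => {
      log.push(`${name} finished`);
      resolve(name);
    }, ms);
  });
}

const log = [];

// r1's query is slow, r2's is fast; neither blocks the other.
const done = Promise.all([
  fakeQuery('r1', 50, log),
  fakeQuery('r2', 10, log),
]).then(() => log);

done.then((entries) => console.log(entries.join('\n')));
```

Here r2 finishes first even though r1 arrived first, because nothing waits in line behind r1's query.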
My one more reason to choose Node.js for a new project is:
Be able to do pure cloud based development
I have used Cloud9 IDE for a while and now I can't imagine working without it; it covers the whole development lifecycle. All you need is a browser, and you can code anytime, anywhere, on any device. You don't need to check in code on one computer (like at home) and then check it out on another computer (like at the workplace).
Of course, there may be cloud-based IDEs for other languages or platforms (Cloud 9 IDE is adding support for other languages as well), but using Cloud 9 for Node.js development is really a great experience for me.
One more thing Node provides is the ability to create multiple V8 instances of Node on the fly using Node's child process API (childProcess.fork(), each requiring 10mb of memory as per the docs), without affecting the main process running the server. So offloading a background job that puts a huge load on the server becomes child's play, and we can easily kill the child processes as and when needed.
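A minimal sketch of that offloading pattern, under a few assumptions: the worker script is written to a temp file only so the example is self-contained (normally it would be a file in your project), and sumInChild is a made-up helper name. The parent sends a number to the child, the child does the CPU-heavy loop, and the parent kills it once the answer arrives.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Worker code: receives n over the IPC channel and sends back 1 + 2 + ... + n.
const workerSource = `
process.on('message', (n) => {
  let sum = 0;
  for (let i = 1; i <= n; i++) sum += i;
  process.send(sum);
});
`;

// Assumption: a temp directory holds the worker file for this demo.
const workerDir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
const workerPath = path.join(workerDir, 'worker.js');
fs.writeFileSync(workerPath, workerSource);

// Hypothetical helper: run the summation in a separate Node process.
function sumInChild(n) {
  return new Promise((resolve, reject) => {
    const child = fork(workerPath);
    child.on('error', reject);
    child.on('message', (total) => {
      child.kill(); // the job is done, so the child is no longer needed
      resolve(total);
    });
    child.send(n);
  });
}

const result = sumInChild(1000000);
result.then((total) => console.log('sum computed in child:', total));
```

While the child is looping, the parent's event loop stays free to keep serving requests.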
I've been using Node a lot, and most of the apps we build require many server connections at the same time and thus heavy network traffic. Frameworks like Express.js and the new Koajs (which removed callback hell) have made working with Node even easier.
Donning asbestos longjohns...
Yesterday my title with Packt Publications, Reactive Programming with JavaScript, was published. It isn't really a Node.js-centric title; early chapters are intended to cover theory, and later code-heavy chapters cover practice. Because I didn't really think it would be appropriate to fail to give readers a webserver, Node.js seemed by far the obvious choice. The case was closed before it was even opened.
I could have given a very rosy view of my experience with Node.js. Instead I was honest about good points and bad points I encountered.
Let me include a few quotes that are relevant here:
Warning: Node.js and its ecosystem are hot--hot enough to burn you badly!
When I was a teacher’s assistant in math, one of the non-obvious suggestions I was told was not to tell a student that something was “easy.” The reason was somewhat obvious in retrospect: if you tell people something is easy, someone who doesn’t see a solution may end up feeling (even more) stupid, because not only do they not get how to solve the problem, but the problem they are too stupid to understand is an easy one!
There are gotchas that don’t just annoy people coming from Python / Django, which immediately reloads the source if you change anything. With Node.js, the default behavior is that if you make one change, the old version continues to be active until the end of time or until you manually stop and restart the server. This inappropriate behavior doesn’t just annoy Pythonistas; it also irritates native Node.js users who provide various workarounds. The StackOverflow question “Auto-reload of files in Node.js” has, at the time of this writing, over 200 upvotes and 19 answers; an edit directs the user to a nanny script, node-supervisor, with homepage at http://tinyurl.com/reactjs-node-supervisor. This problem affords new users with great opportunity to feel stupid because they thought they had fixed the problem, but the old, buggy behavior is completely unchanged. And it is easy to forget to bounce the server; I have done so multiple times. And the message I would like to give is, “No, you’re not stupid because this behavior of Node.js bit your back; it’s just that the designers of Node.js saw no reason to provide appropriate behavior here. Do try to cope with it, perhaps taking a little help from node-supervisor or another solution, but please don’t walk away feeling that you’re stupid. You’re not the one with the problem; the problem is in Node.js’s default behavior.”
This section, after some debate, was left in, precisely because I don't want to give an impression of “It’s easy.” I cut my hands repeatedly while getting things to work, and I don’t want to smooth over difficulties and set you up to believe that getting Node.js and its ecosystem to function well is a straightforward matter and if it’s not straightforward for you too, you don’t know what you’re doing. If you don’t run into obnoxious difficulties using Node.js, that’s wonderful. If you do, I would hope that you don’t walk away feeling, “I’m stupid—there must be something wrong with me.” You’re not stupid if you experience nasty surprises dealing with Node.js. It’s not you! It’s Node.js and its ecosystem!
The Appendix, which I did not really want after the rising crescendo in the last chapters and the conclusion, talks about what I was able to find in the ecosystem, and provided a workaround for moronic literalism:
Another database that seemed like a perfect fit, and may yet be redeemable, is a server-side implementation of the HTML5 key-value store. This approach has the cardinal advantage of an API that most good front-end developers understand well enough. For that matter, it’s also an API that most not-so-good front-end developers understand well enough. But with the node-localstorage package, while dictionary-syntax access is not offered (you want to use localStorage.setItem(key, value) or localStorage.getItem(key), not localStorage[key]), the full localStorage semantics are implemented, including a default 5MB quota—WHY? Do server-side JavaScript developers need to be protected from themselves?
For client-side database capabilities, a 5MB quota per website is really a generous and useful amount of breathing room to let developers work with it. You could set a much lower quota and still offer developers an immeasurable improvement over limping along with cookie management. A 5MB limit doesn’t lend itself very quickly to Big Data client-side processing, but there is a really quite generous allowance that resourceful developers can use to do a lot. But on the other hand, 5MB is not a particularly large portion of most disks purchased any time recently, meaning that if you and a website disagree about what is reasonable use of disk space, or some site is simply hoggish, it does not really cost you much and you are in no danger of a swamped hard drive unless your hard drive was already too full. Maybe we would be better off if the balance were a little less or a little more, but overall it’s a decent solution to address the intrinsic tension for a client-side context.
However, it might gently be pointed out that when you are the one writing code for your server, you don’t need any additional protection from making your database more than a tolerable 5MB in size. Most developers will neither need nor want tools acting as a nanny and protecting them from storing more than 5MB of server-side data. And the 5MB quota that is a golden balancing act on the client-side is rather a bit silly on a Node.js server. (And, for a database for multiple users such as is covered in this Appendix, it might be pointed out, slightly painfully, that that’s not 5MB per user account unless you create a separate database on disk for each user account; that’s 5MB shared between all user accounts together. That could get painful if you go viral!) The documentation states that the quota is customizable, but an email a week ago to the developer asking how to change the quota is unanswered, as was the StackOverflow question asking the same. The only answer I have been able to find is in the Github CoffeeScript source, where it is listed as an optional second integer argument to a constructor. So that’s easy enough, and you could specify a quota equal to a disk or partition size. But besides porting a feature that does not make sense, the tool’s author has failed completely to follow a very standard convention of interpreting 0 as meaning “unlimited” for a variable or function where an integer is to specify a maximum limit for some resource use. The best thing to do with this misfeature is probably to specify that the quota is Infinity:
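// On the server there is no built-in localStorage, so fall back to node-localstorage.
if (typeof localStorage === 'undefined' || localStorage === null)
{
    var LocalStorage = require('node-localstorage').LocalStorage;
    // The optional second argument is the quota; Infinity effectively removes the limit.
    localStorage = new LocalStorage(__dirname + '/localStorage', Infinity);
}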
Swapping two comments in order:
People constantly and needlessly shot themselves in the foot using JavaScript as a whole, and part of what made JavaScript a respectable language was Douglas Crockford saying, in essence, “JavaScript as a language has some really good parts and some really bad parts. Here are the good parts. Just forget that anything else is there.” Perhaps the hot Node.js ecosystem will grow its own “Douglas Crockford,” who will say, “The Node.js ecosystem is a coding Wild West, but there are some real gems to be found. Here’s a roadmap. Here are the areas to avoid at almost any cost. Here are the areas with some of the richest paydirt to be found in ANY language or environment.”
Perhaps someone else can take those words as a challenge, and follow Crockford’s lead and write up “the good parts” and / or “the better parts” for Node.js and its ecosystem. I’d buy a copy!
And given the degree of enthusiasm and sheer work-hours on all projects, it may be warranted in a year, or two, or three, to sharply temper any remarks about an immature ecosystem made at the time of this writing. It really may make sense in five years to say, “The 2015 Node.js ecosystem had several minefields. The 2020 Node.js ecosystem has multiple paradises.”
If your application mainly ties together web APIs or other I/O channels, give or take a user interface, Node.js may be a fair pick for you, especially if you want to squeeze out the most scalability, or if your main language in life is JavaScript (or something that transpiles to JavaScript). If you build microservices, Node.js is also okay. Node.js is also suitable for any project that is small or simple.
Its main selling point is that it allows front-enders to take responsibility for back-end work rather than keeping the typical divide. Another justifiable selling point is a workforce that is JavaScript-oriented to begin with.
Beyond a certain point, however, you cannot scale your code without terrible hacks for forcing modularity, readability and flow control. Some people like those hacks, though; especially to someone coming from an event-driven JavaScript background, they seem familiar or forgivable.
In particular, when your application needs to perform synchronous flows, you start bleeding over half-baked solutions that slow your development process down considerably. If your application has computation-intensive parts, tread with caution before picking (only) Node.js. Maybe http://koajs.com/ or other novelties alleviate those thorny aspects, compared to when I originally used Node.js or wrote this.
I can share a few points on where and why to use Node.js.
For realtime applications like chat or collaborative editing, Node.js is the better choice because it is event-based: the server fires events and pushes data to the clients.
It is simple and easy to understand, because it is based on JavaScript, which most people already have some idea of.
Most current web applications are moving towards AngularJS and Backbone; with Node it is easy to interact with the client-side code, as both sides use JSON data.
Lots of plugins are available.
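To make the "fire an event and push data to clients" point concrete, here is a minimal in-process sketch using Node's built-in EventEmitter. The names ChatRoom, join and say are hypothetical, and the deliver callback merely stands in for writing to a real socket; a real chat server would sit on top of a WebSocket or similar transport.

```javascript
// Sketch only: event-based fan-out with Node's built-in EventEmitter.
// ChatRoom, join and say are hypothetical names chosen for illustration.
const { EventEmitter } = require('events');

class ChatRoom extends EventEmitter {
  // Register a client; `deliver` stands in for writing to that client's socket.
  join(name, deliver) {
    const listener = (msg) => {
      if (msg.from !== name) deliver(msg); // do not echo back to the sender
    };
    this.on('message', listener);
    return () => this.off('message', listener); // call this to leave the room
  }

  // Fire one event; every joined client's listener runs.
  say(from, text) {
    this.emit('message', { from, text });
  }
}

const room = new ChatRoom();
const aliceInbox = [];
const bobInbox = [];
room.join('alice', (msg) => aliceInbox.push(msg));
const bobLeaves = room.join('bob', (msg) => bobInbox.push(msg));

room.say('alice', 'hello');   // delivered to bob only
bobLeaves();
room.say('alice', 'anyone?'); // bob has left, so nobody receives it
```

The server never polls its clients; it reacts when something happens, which is why this style suits chat and collaborative editing.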
Drawbacks:
Node works with most databases, but the best fit is MongoDB, which does not support complex joins and similar relational features.
No compile-time error checking: the developer has to handle each and every exception. Otherwise, if any error occurs, the application stops working and has to be restarted, either manually or with an automation tool.
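As a rough illustration of the exception-handling drawback, the sketch below wraps a handler so that a thrown exception becomes an error result instead of taking the whole process down. The helper name safely and the sample parseBody handler are assumptions made up for this example; process.on('uncaughtException') is a real Node.js hook, used here only as a last resort to log and exit so that a supervisor can restart the process.

```javascript
// Hypothetical helper: run a handler and turn a thrown exception into an
// error result instead of letting it crash the whole Node.js process.
function safely(handler) {
  return (input) => {
    try {
      return { ok: true, value: handler(input) };
    } catch (err) {
      return { ok: false, error: err.message };
    }
  };
}

// Example handler: JSON.parse throws a SyntaxError on malformed input.
const parseBody = safely((raw) => JSON.parse(raw));

const good = parseBody('{"user":"ann"}');
const bad = parseBody('{not json');

// Last resort for anything that still escapes: log it and exit, so that an
// external supervisor (the "automation tool") can start a fresh process.
process.on('uncaughtException', (err) => {
  console.error('fatal:', err);
  process.exit(1);
});
```

The wrapper only covers synchronous throws; errors from callbacks and promises need their own handling.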
Conclusion:
Node.js is best used for simple and realtime applications. If you have very large business logic and complex functionality, it is better not to use Node.js.
If you want to build an application that includes chat or other collaborative functionality, Node can be used for those specific parts, and the rest can be built with whatever technology you find convenient.
Node is great for quick prototypes, but I'd never use it again for anything complex.
I spent 20 years developing a relationship with a compiler and I sure miss it.
Node is especially painful for maintaining code that you haven't visited for a while. Type info and compile-time error detection are GOOD THINGS. Why throw all that out? For what? And dang, when something does go south, the stack traces are quite often completely useless.

Language recommendations for an efficient web crawler

I'm looking for a language for writing an efficient web crawler. Things I value:
expressive language (don't make me jump through static typing hoops)
useful libraries (a css selector based html parser would be nice)
minimal memory footprint
dependable language runtime & libraries
I tried node.js. I like node in theory. Javascript is very expressive. You can use jQuery to parse html. Node's async nature lets me crawl many urls in parallel without dealing with threads. V8 is nice and fast for parsing.
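For readers unfamiliar with the pattern, crawling many URLs in parallel without threads can be sketched roughly as below. Everything here is an assumption for illustration: crawlAll is a hypothetical helper, fetchPage is whatever promise-returning fetcher you inject, and fakeFetch is a stand-in so the sketch runs without network access.

```javascript
// Hypothetical helper: fetch every URL with at most `limit` requests in flight.
// `fetchPage` is supplied by the caller and must return a promise.
async function crawlAll(urls, fetchPage, limit = 4) {
  const results = new Array(urls.length);
  let next = 0;

  async function worker() {
    while (next < urls.length) {
      const i = next++; // no lock needed: JavaScript runs this without preemption
      try {
        results[i] = { url: urls[i], body: await fetchPage(urls[i]) };
      } catch (err) {
        // One failing page is recorded and does not stop the crawl.
        results[i] = { url: urls[i], error: err.message };
      }
    }
  }

  // Start `limit` workers that pull from the shared queue until it is empty.
  const workers = Array.from({ length: Math.min(limit, urls.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Stand-in fetcher so the sketch runs offline; URLs containing "bad" fail.
const fakeFetch = (url) =>
  new Promise((resolve, reject) => {
    setTimeout(() => {
      if (url.includes('bad')) reject(new Error('boom'));
      else resolve('<html>' + url + '</html>');
    }, 5);
  });
```

Catching errors per URL matters in practice: a single rejected request should show up in the results, not bring down the whole crawl.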
In practice, node isn't working out for me. My process constantly crashes. Bus Errors, exceptions in the event manager ... etc.
I've done a fair bit of Ruby dev, so I wouldn't mind using Ruby 1.9's coroutines (fibers?) as long as I won't face similar issues with VM / library stability.
Additional suggestions?
Use Node.js, and fix whatever is crashing it. It's been running on my Ubuntu box without any problems for months.
For the library, I recommend using YUI3 instead of jQuery. It easily lets you build a web crawler/scraper in a couple of minutes. If you don't believe me, watch this talk from YUIConf2010; it's 40 minutes, but it's all about code.
Dav Glass did a great job of showing how easy it is and how little code you need. Yes, there were some issues with different versions of jsdom in the talk, but the talk was given at the beginning of November, so much of that should have been fixed already.
You can check out all the stuff from the talk at his GitHub page.
And here's his scraper that gets the current news headlines from Digg.
Seriously, it's more than worth the effort to get Node.js running on your system, since in the end you get all the awesomeness of YUI3 on the server side.
I'm pretty sure any language has something built that can handle it. Are you sure node.js isn't crashing because of a problem in your code? Why not use Ruby if you're comfortable with it?
There's also BeautifulSoup (Python), which you might consider if your main hurdle is the HTML parsing.
Go with the language most familiar to you or the language you want to learn most. You can write a web crawler in any language.
I've personally developed crawlers in Java, Ruby, and Perl. All of these languages met your requirements. (Yes, even the crawler in Java had a reasonable memory footprint.) Of these, Java was my favorite because it boasted the most mature HTTP and HTML libraries. If I find myself writing another, I want to try Python next.
The first algorithmic problem you'll face is the task of efficiently identifying the pages you've already visited. This index of URLs can grow very large and must support fast lookups and insertions. A common database index will work in early crawler prototypes but will quickly prove to be the bottleneck.
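For an early prototype, the "have I seen this URL?" check can be sketched with an in-memory set, as below. This is only an illustration of fast lookups and insertions, written in JavaScript rather than any of the languages I used. VisitedIndex and markIfNew are hypothetical names, and a large crawl would likely need something more compact or disk-backed than a plain Set.

```javascript
// Hypothetical in-memory "seen" index for a crawler prototype; a sketch only.
class VisitedIndex {
  constructor() {
    this.seen = new Set(); // hash-based: fast lookups and insertions on average
  }

  // Normalize so trivially different spellings of a URL map to one key.
  static normalize(rawUrl) {
    const url = new URL(rawUrl); // the URL parser lowercases scheme and host
    url.hash = '';               // fragments are never sent to the server
    return url.href;
  }

  // Returns true the first time a URL is offered, false on every repeat.
  markIfNew(rawUrl) {
    const key = VisitedIndex.normalize(rawUrl);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

const index = new VisitedIndex();
const first = index.markIfNew('http://example.com/a');
const repeat = index.markIfNew('HTTP://Example.com/a#top');
const other = index.markIfNew('http://example.com/b');
```

The crawler calls markIfNew before queueing a link and skips anything that returns false.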
Python and BeautifulSoup: easy to learn and very efficient.
