I've been reading through example applications while trying to learn node, and I've noticed that several of them use the readdirSync method when loading models and controllers at boot time.
For instance:
var fs = require('fs');

var models_path = __dirname + '/app/models';
var model_files = fs.readdirSync(models_path);

model_files.forEach(function (file) {
  if (file == 'user.js')
    User = require(models_path + '/' + file);
  else
    require(models_path + '/' + file);
});
This seems anti-node to me; it's the opposite of the "try to make everything async" approach that node favors.
When and why might synchronous file reads like this be a good idea?
More than likely, to keep initialisation simple: asynchronicity for speed's sake doesn't matter here, because we're not yet trying to service many concurrent requests.
Similarly, if you need access to some variable which you're initialising on startup and which will be used for the lifetime of the application, you don't want to have to wrap your entire app in a callback!
Synchronous reads are needed when you have to be certain that all of the data is available before you continue AND you need to keep things in a fixed sequence. In other words, use them when you NEED the process to be blocked and unable to do anything else (for anyone), such as when you are starting up a server (e.g. reading the certificate files for HTTPS).
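For example, here is a minimal sketch of the HTTPS start-up case (the certificate paths are placeholders); blocking here is harmless because the server cannot accept connections until the key and certificate are loaded anyway:
var https = require('https');
var fs = require('fs');

var options = {
  key: fs.readFileSync(__dirname + '/ssl/server-key.pem'),   // placeholder paths
  cert: fs.readFileSync(__dirname + '/ssl/server-cert.pem')
};

https.createServer(options, function (req, res) {
  res.end('ok');
}).listen(443);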
Synchronous reads may be desirable at other times to keep the code simpler, as Len has suggested. But then you are trading simplicity against performance, as you point out. In that case it is better to use one of the many sequencing helper libraries, which greatly simplify your code by taking care of nested callbacks and ordering issues.
And of course, the code you provide as an example is rather dangerous - what happens if the read fails?
Here are 3 of the libraries:
Streamline.js - Allows you to write async js/coffeescript as though it were sync; simply replace the callbacks with '_'. BUT you either have to compile your scripts or run them via a loader.
async - Seems to be the best thought-out and documented, and is recommended by a couple of people who have built real-world apps.
async.js - Chainable, and exposes fs helpers as well (readdir, walkfiles, glob, abspath, copy, rm); focused on fs rather than generic flow control.
This link may also be of use: The Tale of Harry - an explanation of how a mythical programmer moves from traditional programming to callback-based code, and the patterns he ends up using. Also a useful insight into the patterns presented in the async library.
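As a concrete illustration of what a library like async buys you over the synchronous example in the question, here is a rough sketch of reading every file in a directory in parallel, with a single final callback once everything has completed (the directory path is just a placeholder):
var fs = require('fs');
var async = require('async');

var dir = __dirname + '/app/config';

fs.readdir(dir, function (err, files) {
  if (err) throw err; // the failed-read case the synchronous example ignores

  // Read every file in parallel; async.map collects the results in order.
  async.map(files, function (file, done) {
    fs.readFile(dir + '/' + file, 'utf8', done);
  }, function (err, contents) {
    if (err) return console.error(err);
    // contents[i] matches files[i]; safe to continue start-up from here
  });
});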
Related
Straight to the point, I am running an http server in Node.js managing a hotel's check-in/out info where I write all the JSON data from memory to the same file using "fs.writeFile".
The data usually doesn't exceed 145 kB, but since I need to write it every time I get an update from my database, I end up with data loss / badly formed JSON when calls to fs.writeFile happen one right after another.
Currently I have solved this problem using fs.writeFileSync, but I would like to hear about a more sophisticated solution rather than the easy/bad fix of using the sync function.
Using fs.promises results in the same error, since again I have to make multiple calls to fs.promises.writeFile.
According to Node's documentation, calling fs.writeFile or fs.promises.writeFile multiple times on the same file without waiting is not safe, and they suggest using a file stream; however, this is not currently an option for me.
To summarize, I need to wait for fs.writeFile to end normally before attempting any repeated write action, and using the callback is not useful since I don't know a priori when a write action needs to be done.
Thank you very much in advance
I assume you mean you are overwriting or truncating the file while the last write request is still being written. If I were you, I would use the promises API and heed the warning from the documentation:
It is unsafe to use fsPromises.writeFile() multiple times on the same file without waiting for the promise to be settled.
You can await the result in a traditional loop, or very carefully use .then() to "synchronize" your callbacks, but if you're not doing anything else in your event loop except reading from your database and writing to this file, you might as well just use writeFileSync to keep things simple/safe. The asynchronous APIs (callback and Promises) are intended to allow your program to do other things in the meantime; if this is not necessary and the async APIs add troublesome complexity for your code, just use the synchronous APIs. That's true for any node API or library function, not just fs.writeFile.
There are also libraries that will perform atomic filesystem operations for you and abstract away the implementation details, but I think these are probably overkill for you unless you describe your use case in more detail. For example, why you're dumping a database to disk as JSON as fast/frequently as you can, rather than keeping things in memory or using event-based incremental updates (e.g. a real, local database with atomicity and consistency guarantees).
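As a concrete illustration of the .then() approach above, here is a minimal sketch of a write "queue" that chains each new write onto the previous one, so callers can fire-and-forget and the file is never written to concurrently (the file name and data shape are just placeholders):
const fs = require('fs').promises;

const FILE = './rooms.json';
let lastWrite = Promise.resolve(); // tail of the promise chain

function writeToDisk(data) {
  // Each write starts only after the previous one has settled.
  lastWrite = lastWrite
    .then(() => fs.writeFile(FILE, JSON.stringify(data)))
    .catch(err => console.error('write failed:', err));
  return lastWrite;
}

// Callers don't have to await; overlapping calls are serialized.
writeToDisk({ room: 101, status: 'checked-in' });
writeToDisk({ room: 102, status: 'checked-out' });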
Thank you for your response!
Since my app is mainly an http server, yes, I do other things besides simple input/output, although not with a great number of requests. I will review the promises solution again, but the first time I had no luck.
To explain more, I have:
function updateRoom(data) {
  // ...update things in memory...
  writetoDisk();
}
and the function:
function writetoDisk() {
  fs.writeFile(....)
}
Making writetoDisk an async function and using "await" inside it still does not solve the problem, since updateRoom calls writetoDisk without waiting for it to finish.
The ".then" approach cannot be implemented since my updateRoom is being called constantly and dynamically.
If you happen to know a thing or two about async/await, you are more than welcome to explain it to me in a bit more detail. Thanks again nevertheless!
Can anyone help me understand how NodeJS behaves, and what the performance impact is, in the scenario below?
a. A request is made to the REST API endpoint "/api/XXX". In this handler, I return the response after triggering an asynchronous function, like below.
function update(req, res) {
  executeUpdate(req.body); // asynchronous function, not awaited
  res.send(200);
}
b. Here I send the response back without waiting for the function to complete, and that function executes four mongodb updates on different collections.
Questions:
1. As I've read, NodeJS works on a single thread, so how is this asynchronous function executed?
2. If there are multiple requests for the same endpoint, what will the performance impact on NodeJS be?
3. How exactly does NodeJS handle the asynchronous function of each request? Since NodeJS runs on a single thread, is there any possibility of a memory issue?
In short, it depends on what you are doing in your function.
Synchronous functions in node are executed on the main thread, so they cannot be preempted: they run until the end of the function (or until a return statement is encountered) before anything else gets a turn.
Asynchronous functions, on the other hand, hand their work off the main thread (to the libuv thread pool or to the operating system); the JavaScript callback is queued and runs back on the main thread only once that work has completed.
There are, I think, two different parts in the answer to your question.
1. Actual performance - which includes CPU and memory usage, and obviously speed as well.
2. Understanding, as the previous poster said, sync and async.
In dealing with #1, actual performance: the only real way to test it is to create or use a testing environment for your code. In a rudimentary way, depending on the system you are on, you can view some of this information in top (Linux), or Glances will give you a basic idea of performance, but in order to know exactly what is going on you will need to apply one of the various testing environments or write your own tests.
Approaching #2 - It is not only sync and async processes you have to understand, but also the ramifications of both. This includes the use of callbacks and promises.
It really all depends on the current process you are attempting to code. For instance, many Node programmers seem to prefer using promises when they make calls to MongoDB, especially when one requires more than one call based upon the return of the cursor.
There is really no written-in-stone formula for when you use sync or async processes. Avoiding callback hell is something all Node programmers try to do. Catching errors etc. is something you always need to be careful about. As I said some programmers will always opt for Promises or Async when dealing with returns of data. The famous Async library coupled with Bluebird are the choice of many for certain scenarios.
All that being said (and remember, your question is general, so my answer is too): in order to properly know the implications for your performance - memory, CPU and speed, as well as returning information or passing it to the browser - it is a good idea to understand sync, async, callbacks, promises and error catching as well as you can. You will discover certain situations are great for sync (and much faster), while others do require async and/or promises.
Hope this helps somewhat.
In the bluebird docs, they list this as an anti-pattern that stops optimization. They call it argument leaking:
function leaksArguments2() {
  var args = [].slice.call(arguments); // passes the `arguments` object out of the function
}
I do this all the time in Node.js. Is this really a problem? And, if so, why?
Assume only the latest version of Node.js.
Disclaimer: I am the author of the wiki page
It's a problem if the containing function is called a lot (i.e. it is hot). Functions that leak arguments are not supported by the optimizing compiler (Crankshaft).
Normally when a function is hot, it will be optimized. However if the function contains unsupported features like leaking arguments, being a hot function doesn't help and it will continue running slow generic code.
The performance difference between an optimized function and an unoptimized one is huge. For example, consider a function that adds 3 doubles together: http://jsperf.com/213213213 (a 21x difference).
What if it added 6 doubles together? A 29x difference. Generally, the more code a function has, the more severe the penalty for that function running in unoptimized mode.
For node.js, stuff like this is in general actually a huge problem, due to the fact that any cpu time completely blocks the server. Just by optimizing the url parser that is included in node core (my module is 30x faster in node's own benchmarks), the requests per second of mysql-express improve from 70K rps to 100K rps in a benchmark that queries a database.
The good news is that node core is aware of this.
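For completeness, here is the shape of a version that does not leak arguments - copy them in a plain loop (the workaround the wiki recommends), or, on any recent Node, avoid the arguments object entirely with a rest parameter:
// Loop copy: the optimizing compiler can handle this.
function doesntLeakArguments() {
  var args = new Array(arguments.length);
  for (var i = 0; i < arguments.length; i++) {
    args[i] = arguments[i];
  }
  return args;
}

// Modern alternative: rest parameters never materialize `arguments`.
function restArgs(...args) {
  return args;
}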
Is this really a problem
For application code, no. For almost any module/library code, no. For a library such as bluebird that is intended to be used pervasively throughout an entire codebase, yes. If you did this in a very hot function in your application, then maybe yes.
I don't know the details but I trust the bluebird authors as credible that accessing arguments in the ways described in the docs causes v8 to refuse to optimize the function, and thus it's something that the bluebird authors consider worth using a build-time macro to get the optimized version.
Just keep in mind the latency numbers that gave rise to node in the first place. If your application does useful things like talking to a database or the filesystem, then I/O will be your bottleneck and optimizing/caching/parallelizing those will pay vastly higher dividends than v8-level in-memory micro-optimizations such as above.
https://github.com/caolan/async
https://github.com/maxtaco/tamejs
These are two modules. To me, they seem like the same thing, right?
Or...are they used in different situations?
async is a library that provides methods to let you control the flow of your program. For example: "I want to process each item in the array asynchronously and have this function executed after all processing is completed".
TameJS makes you write code that isn't plain JS but gets converted to JS. Its aim is to make asynchronous programming easier to follow.
I personally used TameJS, and there are a few problems with it:
When an error is reported, the line number is the line number of the JS file, not the TJS file that you wrote. Tracking errors is a pain.
There can be bugs that are hard to track down. I remember having a bug with return res.send(200) where the request was not being sent. It has been fixed by now, but it put a very bad taste in my mouth.
I am now using async and find it can make the code very easy to read and understand.
As a final suggestion, perhaps you should try writing your own code to manage the control flow. If you are new to JS, then that would be a very good learning experience to see what these libraries are doing on the inside. Even if you are in a time crunch, it would be best to understand what external libraries do, so you can make the best use of them.
They are completely different although they try to solve roughly the same problem. While async is a very cool flow control library that gives you some helper functions for managing your async code, tamejs is (similar to streamlinejs, which I prefer) a bunch of language additions for pseudo-synchronous code that gets compiled to asynchronous code.
I've got a collection of javascript files from a 3rd party, and I'd like to remove all the unused methods to get size down to a more reasonable level.
Does anyone know of a tool that does this for Javascript? At the very least, one that gives a list of used/unused methods, so I could do the trimming manually? This would be in addition to running something like the YUI Javascript compressor tool...
Otherwise my thought is to write a perl script to attempt to help me do this.
No. Because you can "use" methods in insanely dynamic ways like this.
obj[prompt("Gimme a method name.")]();
Check out JSCoverage. It generates code coverage statistics that show which lines of a program have been executed (and which have been missed).
I'd like to remove all the unused methods to get size down to a more reasonable level.
There are a couple of tools available:
npm install -g fixmyjs
fixmyjs <filename or folder>
A configurable module that uses JSHint (Github, docs) to flag functions that are unused and perform clean up as well.
I'm not sure whether it actually removes unused functions or only flags them. Though it is a great tool for cleanup, it appears to lack compatibility with later versions of ECMAScript (more info below).
There is also the Google Closure Compiler, which claims to remove dead JS, but it is more of a build tool.
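If you do go the Closure Compiler route, a typical invocation looks roughly like this (ADVANCED_OPTIMIZATIONS is the level that actually prunes unused code, and it is only safe when all the calling code is compiled together; the jar name depends on the release you download):
java -jar closure-compiler.jar --compilation_level ADVANCED_OPTIMIZATIONS \
  --js src/*.js --js_output_file build/app.min.js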
Updated
If you are using something like Babel, consider adding ESLint to your text editor; it can warn about unused methods and even unused variables, and it has a --fix CLI option for auto-fixing some errors and style issues.
I like ESLint because it contains multiple plugins for alternate libs (like React warnings if you're missing a prop), allowing you to catch bugs in advance. They have a solid ecosystem.
As an example: on my NodeJS projects, the config I use is based off of the Airbnb Style Guide.
You'll have to write a perl script. Take no notice of the nay-sayers above.
Such a tool could work with libraries that are designed to only make function calls explicitly. That means no delegates or pointers to functions would be allowed, the use of which in any case only results in unreadable "spaghetti code" and is not best practice. Even if it removes some of these hidden functions, you'll discover most if not all of them in testing. The ones you don't discover will be used so infrequently that they will not be worth your time fixing. Don't obsess over perfection; people go mad doing that.
So applying this one restriction to JavaScript (and libraries) will result in incredible reductions in page size and therefore load times, not to mention readability and maintainability. This is already the case for tools that remove unused CSS, such as grunt_CSS and unCSS (see http://addyosmani.com/blog/removing-unused-css/), which report typical reductions down to one tenth the original size.
It's a win/win situation.
It's noteworthy that all interpreters must address this issue of how to manage self-modifying code. For the life of me I don't understand why people want to persist with unrestrained freedom. As noted by Triptych above, JavaScript functions can be called in ways that are literally "insane". This insane flexibility corrupts the fundamental doctrine of separation of code and data, enables real-time code injection, and invalidates any attempt to maintain code integrity. The result is always unreadable code that is impossible to debug, and the side effect on JavaScript - removing the ability to run automatic code pre-optimisation and validation - is much, much worse than any possible benefit.
AND - you'd have to feel pretty insecure about your work to want to deliberately obfuscate it from both your colleagues and yourself. Browser clients that do work extremely well take the "less is more" approach, and the best example I've seen to date is Microsoft's combination of Access Web Forms paired with SharePoint Access Services. The productivity of having a ubiquitous, heavy, tightly managed runtime interpreter client and its server-side clone is absolutely phenomenal.
The future of JavaScript self-modifying code technologies is therefore to bring them back into line to respect the...
KISS principle of code and data: Keep It Separate, Stupid.
Unless the library author kept track of dependencies and provided a way to download the minimal code [e.g. the MooTools Core download], it will be hard to identify 'unused' functions.
The problem is that JS is a dynamic language and there are several ways to call a function.
E.g. you may have a method like
function test()
{
  // ...
}
You can call it directly:
test();
But you can also call it dynamically:
var i = 10;
var hello = i > 1 ? 'test' : 'xyz';
window[hello](); // resolves to test() at runtime - no static analysis will catch this
I know this is an old question, but UglifyJS2 supports removing unused code, which may be what you are looking for.
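With current uglify-js (v3 CLI syntax; the UglifyJS2-era flags differ slightly), the invocation would look roughly like this. Note that --toplevel is required before it will drop unused top-level functions, on the assumption that nothing outside the file calls them:
uglifyjs app.js --compress dead_code,unused --toplevel --output app.min.js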
Also worth noting: eslint supports a rule called no-unused-vars, which does some basic detection of whether functions are used or not. It definitely detects this if you make the function anonymous and store it in a variable (just be aware that, as a variable, the function declaration doesn't get hoisted the way a named declaration would).
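A minimal .eslintrc sketch that enables the rule (the specific options shown are just one reasonable choice, not the only one):
{
  "rules": {
    "no-unused-vars": ["warn", { "vars": "all", "args": "after-used" }]
  }
}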
In the context of detecting unused functions, while it is extreme, you can consider breaking the majority of your functions up into separate modules, because there are packages and tools that help detect unused modules. There is a little segment of sindresorhus's thoughts on tiny modules which might be relevant to that philosophy, though it may be overkill for your use case.
The following would help:
If you have fully covering test cases, running a code coverage tool like istanbul (https://github.com/gotwarlost/istanbul) or nyc (https://github.com/istanbuljs/nyc) would give a hint about untouched functions.
At the very least, the above will help find the covered functions that you may have thought were unused.
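For example, running the test suite under nyc and then reading the report (this assumes the tests are wired up to npm test):
npm install --save-dev nyc
npx nyc --reporter=text --reporter=html npm test
# functions the report shows as never executed are candidates for manual review/removal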