How to avoid node require loading same module twice - javascript

I'm developing a node module my-module which in turn depends on another module other-module. other-module is thus a dependency explicitly listed in my module's package.json.
As my module modifies the behaviour of other-module just by being required, it is important that other-module is loaded only once, and that this one-and-only 'instance' is the one referenced throughout any application that requires both my-module and other-module.
I expected this to hold true according to node's Module Caching Policy but what I've come across while writing a simple test app is this:
If my-module is npm installed before other-module, then the latter is brought in as a dependency of the former. npm installing other-module afterwards brings it into the node_modules hierarchy a second time. Then, when my module requires other-module, node loads my module's 'local' copy, and when the app requires it, node loads it again (this time the version that was installed by the second npm install). This is obviously not the intended result.
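The resulting layout in that first case looks roughly like this (a sketch of the nested node_modules hierarchy being described):

node_modules/
  my-module/
    node_modules/
      other-module/    <- copy loaded by my-module's require
  other-module/        <- copy loaded by the app's require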
If my-module is npm installed after other-module then I end up with only one copy of other-module in node_modules and my test app works as expected.
This behaviour got me looking through node's relevant policies again and sure enough I came across the 'Module Caching Caveats':
Modules are cached based on their resolved filename. Since modules may resolve to a different filename based on the location of the calling module (loading from node_modules folders), it is not a guarantee that require('foo') will always return the exact same object, if it would resolve to different files.
At this point it looks like my module may or may not behave as expected, depending on the order of npm installs.
Are there any best practices I'm missing? Is there any way to avoid this mess without changing the way my module works?

The short answer: you cannot.
As you pointed out, node will load the required module from the most local location. As far as I know this is unique among package managers, and it means you don't have to care about the exact dependency tree of your modules. Node and npm will figure that out for you. In my opinion this is a really good thing.
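If you want to verify which copy a particular file gets, require.resolve shows the filename the cache is keyed on. A small diagnostic sketch, reusing the module name from the question:

// Diagnostic sketch: print where 'other-module' resolves from.
// Run this line both in the app and inside my-module's code;
// two different paths mean node has cached two separate instances.
console.log(require.resolve('other-module'));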
Dependency hell is simply avoided by giving your modules the opportunity to require an exact version of what they need.
I think what you are trying to do, unless I misunderstand your question, is not good node practice. Modules are loaded and assigned to a local variable. Global state should be avoided, as it can lead to rather awkward and untestable code. Additionally, if you did succeed in injecting your modified module into other people's code, there would be no guarantee that their code would still work. This would be like the old Prototype.js days, when it was considered OK to hack around with JavaScript's built-in globals like String or Array, which led to some disastrous code.
However, keep in mind that this is just one person's opinion. If you don't find more answers here, post your question elsewhere, such as the Node.js IRC channel.

I had a similar problem while developing tests with Jest.
The following statement allows you to load the same module again in a different context:
jest.resetModules();
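For example, a small sketch of how that might look in a test file (the './my-module' path is just a placeholder):

beforeEach(() => {
  jest.resetModules(); // clears Jest's module registry so the next require re-evaluates the file
});

test('loads a fresh module instance', () => {
  const freshCopy = require('./my-module'); // placeholder path; a new instance per test
  // ...assert against the fresh instance here
});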

Related

npm installs many dependencies

I bought an HTML template recently, which contains many plugins placed inside a bower_components directory, along with a package.js file. I wanted to install another package I liked, but decided to use npm for this purpose.
When I typed:
npm install pnotify
the node_modules directory was created and contained about 900 directories with other packages.
What are those? Why did they get installed along with my package? I did some research and it turned out that those were needed, but do I really need to deliver my template in production with hundreds of unnecessary packages?
This is a very good question. There are a few things I want to point out.
The V8 engine, Node Modules (dependencies) and "requiring" them:
Node.js is built on the V8 engine, which is written in C++. This means that Node.js itself ultimately rests on C++ code.
Now when you require a dependency, you are really requiring code/functions from a C++ addon or a JS library, because that's how new libraries/dependencies are made.
Libraries have so many functions that you will not use
For example, take a look at the express-validator module, which contains a great many functions. When you require the module, do you use all the functions it provides? The answer is no. People most often require packages like this just to use a single feature, yet all of the functions end up getting downloaded, which takes up unnecessary space.
Think of the node dependencies that are made from other node dependencies as Interpreted Languages
For example, JavaScript engines are written in C/C++, whose compilers were in turn originally written in assembly. Think of it like a tree: you create new branches each time for more convenient usage and, most importantly, to save time. It makes things faster. Similarly, when people create new dependencies, they use/require ones that already exist, instead of rewriting a whole C++ program or JS script, because that makes everything easier.
The problem arises when requiring other npm packages to create a new one
When the authors of a dependency require other dependencies from here and there just to use a few small features from them, they end up pulling all of those in as well (which they usually don't mind, because they mostly do not worry about size, or they'd rather do this than explicitly write a new dependency or a C++ addon), and this takes extra space. For example, you can see the dependencies that the express-validator module uses by accessing this link.
So, when you have big projects that use lots of dependencies, they end up taking a lot of space.
Ways to solve this
Number 1
This requires some Node.js expertise. To reduce the number of downloaded packages, a professional Node.js developer could go to the directories the modules are saved in, open the JavaScript files, take a look at their source code, and delete the functions they will not use, without changing the structure of the package.
Number 2 (Most likely not worth your time)
You could also create your own personal dependencies, written in C++ or, more preferably, JS, which would take up the least space possible (depending on the programmer) but would cost the most time: you trade your time for reduced size. (Note: most dependencies are written in JS.)
Number 3 (Common)
Instead of using option number 2, you could use webpack.
Conclusion & Note
So, basically, there is no running away from downloading all the node packages, but you could use solution number 1 if you believe you can do it, though it also carries the risk of breaking the whole intent of a dependency. (So keep such a modified copy private and use it for specific purposes.) Or just make use of a tool like webpack.
Also, ask this question to yourself: Do those packages really cause you a problem?
No, there is no point in adding about 900 package dependencies to your project just because you want to add a template. But it is up to you!
The heaviness of a template does not reflect on the Node.js ecosystem or its main package system, npm.
It is a fact that the JavaScript community tends to make the smallest possible modules, each responsible for one task, and just one.
That is not a bad thing, I guess, but it can result in a situation where you have a lot of dependencies in your project.
Nowadays disk space is cheap and few people care any more about making efficient/small apps.
As always, it's only a matter of choice.
What is the point of delivering hundreds of packages weighing hundreds of MB for a project of a few kB?
There isn't.
If you intend to provide it to other developers, just gitignore (or remove from the shared package) the node_modules or bower_components directories. Developers simply install the dependencies again as required ;)
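For example, a minimal .gitignore for that (sketch):

# .gitignore
node_modules/
bower_components/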
If it is something as simple as an HTML template or similar, Node would most likely be there just to make your life as a developer easier, providing live reload, compiling/transpiling TypeScript/Babel/SCSS/Sass/Less/CoffeeScript... (the list goes on ;P) etc.
And in that case the dependencies would most likely only be devDependencies and won't be required at all in a production environment ;)
Also, many packages come with separate production and dev dependencies, so you just need to install the production dependencies...
npm install --only=prod
If your project does need many packages in production, and you really, really want to avoid all that, just spend some time and include only the CSS/JS files your project needs (this can be a laborious task).
Update
Production vs default install
Most projects have different dev and production dependencies.
Dev dependencies may include things like Sass and TypeScript compilers, uglifiers (minification), and maybe things like live reload.
Whereas a production install will not have those things, reducing the size of the node_modules directory.
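As a sketch, that split in package.json might look like this (the package names and versions are only examples); npm install --only=prod, shown above, skips the devDependencies section:

{
  "dependencies": {
    "express": "^4.17.1"
  },
  "devDependencies": {
    "sass": "^1.32.0",
    "typescript": "^4.2.0",
    "uglify-js": "^3.13.0"
  }
}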
No node_modules
In some HTML-template kinds of projects, you may not need any node_modules in production, so you can skip doing an npm install.
No access to node_modules
Or in some cases, when the server doing the serving lives in node_modules itself, access to it may be blocked (because there is no need to access these files from the frontend).
What are those? Why did they get installed along with my package?
Dependencies exists to facilitate code reuse through modularity.
... do I need to deliver my template in production with hundreds of unnecessary packages?
One shouldn't be so quick to dismiss this modularity. If you inline your requires and eliminate dead code, you'll lose the benefit of maintenance patches for the dependencies automatically being applied to your code. You should see this as a form of compilation, because... well... it is compilation.
Nonetheless, if you're licensed to redistribute all of your dependencies in this compiled form, you'll be happy to learn those optimisations are performed by compilers which compile JavaScript to JavaScript. The Closure Compiler, as the first example I stumbled across, appears to perform advanced compilation, which means you get dead code removal and function inlining... That seems promising!
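As an illustration, one way to run it is through its npm wrapper; this is just a sketch (the input and output paths are placeholders):

npx google-closure-compiler \
  --compilation_level ADVANCED_OPTIMIZATIONS \
  --js src/app.js \
  --js_output_file dist/app.min.js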
This does, however, have another side effect: when you are required to justify the licensing of all npm modules, having hundreds of them pulled in as dependencies makes that effort considerably more cumbersome.
Very old question, but I happened to come across a very similar situation, just as RA pointed out.
I tried to work with a Node.js framework using VS Code, and the moment I initialized npm using npm init -y, it generated so many different dependencies. In my case, it was the VS Code extension ESLint that I had added prior to running npm init -y. What fixed it for me:
Uninstalled ESLint
Restarted VS Code to apply the uninstallation
Removed the previously generated package.json and node_modules folder
Ran npm init -y again
This solved my problem of starting out with so many dependencies.

Node package priority, global vs local

I've noticed that I have Angular 2 installed globally, and I don't know when I might have done that, or if that's the way it's supposed to be. It doesn't seem like that would be necessary if it's defined in every project.
It makes me wonder what side effects there would be if I had different versions locally and globally. Which one takes priority? What's the best way to remove all of the Angular packages?
Globally installed NPM packages really only impact your command-line environment. Things like pm2 or sequelize insert bin/ stubs into the PATH to make your life easier.
In order to require something, it needs to be present in package.json as well as properly installed.
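If you want to check which copy your code would actually pick up, a quick sketch (the package name is just an example; require always prefers the local node_modules copy over a global install):

# prints the path of the copy require() would load from this directory
node -e "console.log(require.resolve('@angular/core'))"

# compare the locally and globally installed versions
npm ls @angular/core
npm ls -g @angular/core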

Can't find modules required in React

I was going through the React codebase, and I noticed that React's require doesn't quite behave like it does in Node.js. I don't get what's going on here.
Looking at line 19 on ReactClass.js for instance, there's a require('emptyObject'), but emptyObject isn't listed in package.json, nor does it say anywhere where that module's coming from.
https://github.com/facebook/react/blob/master/src/isomorphic/classic/class/ReactClass.js#L19
I did find "emptyObject" on npmjs, but the API there seems different from the one used in React; the .isEmpty grepped in React isn't related to emptyObject.
So where is emptyObject getting loaded from, and how is React's require doing what it's doing? This is not intuitive. At all.
The emptyObject module that React refers to is located at https://github.com/facebook/fbjs/blob/master/packages/fbjs/src/core/emptyObject.js#L9 (note that it doesn't follow the CommonJS module system).
To make it easier for Facebook to share and consume our own JavaScript. Primarily this will allow us to ship code without worrying too much about where it lives, keeping with the spirit of #providesModule but working in the broader JavaScript ecosystem.
From https://github.com/facebook/fbjs#purpose
The way of defining a module by adding #providesModule in the license header and loading those modules with require in Node is called Haste, a customized module system built for Facebook's open source projects.
In fact, unless you would like to understand the inner workings of React or contribute to Facebook's open source projects, you don't need to know this. In other words, it's not recommended to use Haste for your own projects.
Along the same lines, the invariant module being loaded at line 10 of ReactClass.js is declared at https://github.com/facebook/fbjs/blob/master/packages/fbjs/src/__forks__/invariant.js#L9
As far as I know, neither Eclipse nor WebStorm supports Haste, so your IDE can't help. But with Haste, the name of the file and the name of the module must be the same, so you can find a module by searching for the filename, i.e. double Shift in WebStorm or Ctrl+Shift+R in Eclipse. However, the emptyObject you asked about and invariant are not part of React itself, so it's still cumbersome to find their origin.
Otherwise, there is a team I contribute to occasionally that shares and organizes what they learn from hacking on React, and they have linked those requires, by following Haste, to the corresponding origin files, e.g. https://annot.io/github.com/facebook/react/blob/cc3dc21/src/isomorphic/classic/class/ReactClass.js?l=19 You may want to take a look at that.
I noticed how React's require doesn't quite behave like in Nodejs.
Right. Facebook has its own module loader. All modules have unique identifiers, provided by the #providesModule directive in each module. This allows you to use the identifier to load the module, instead of the file path.
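Roughly, such a module carries its identifier in the file's docblock header and is then required by that identifier instead of by a relative path. A sketch modelled on the fbjs emptyObject source linked above (in the actual headers the directive is written as @providesModule):

// emptyObject.js (sketch)
/**
 * @providesModule emptyObject
 */
'use strict';
var emptyObject = {};
module.exports = emptyObject;

// ReactClass.js, or any other Haste module: loaded by identifier, not by path
var emptyObject = require('emptyObject');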
Of course that doesn't work in a Node.js based environment. So when React or any other Facebook project is published to npm, all require calls are rewritten automatically to something that Node understands.
This functionality is provided by fbjs which contains shared dependencies and build helpers for all Facebook projects. This is where you find the emptyObject module.
If you look at React's gulp file, you can see how the module maps are constructed and that a custom Babel plugin is used to convert all require calls.

Using package.json for client side packages, that could be loaded dynamically in browser

I am thinking of extending the format of package.json to include dynamic package (plugin) loading on the client side, and I would like to understand whether this idea contradicts the npm vision or not. In other words, I want to load a bunch of modules that share common metadata in the browser at runtime. Solutions like SystemJS and jspm are good for module management, but what I seek is dynamic package management on the client side.
In more detail, I would like to add a property like "myapp-clientRuntimeDependencies" that would allow specifying dependencies to be loaded by the browser instead of the standard prepackaging (npm install -> browserify-like solution).
package.json example:
{
  "name": "myapp-package",
  "version": "",
  "myapp-clientRuntimeDependencies": {
    "myapp-plugin": "file:myapp-plugin",
    "myapp-anotherplugin": "file:myapp-anotherplugin"
  },
  "peerDependencies": {
    "myapp-core": "1.0.0"
  }
}
The question:
Does this idea contradict the vision of npm and package.json? If yes, then why?
Any feedback from npm community is very much appreciated.
References:
Extending package.json: http://blog.npmjs.org/post/101775448305/npm-and-front-end-packaging
EDIT:
The question was not formulated very well; a better way to ask it is:
What is the most standard way (e.g. handled by some existing tools, likely to be supported by npm) to specify run-time dependencies between 2 dynamically loaded front-end packages in package.json?
What is the most standard way to attach metadata in JSON format to front-end packages, that are loaded dynamically?
I wouldn't say that it conflicts with the vision of package.json, however it does seem to conflict a bit with how it's typically used. As you seem to be aware, package.json is normally used pre-runtime. In order to load something from package.json into your runtime, you'd have to load the package.json into your frontend code. If you're storing configurations that you don't want visible to frontend via a simple view source, this could definitely present a problem.
One thing that didn't quite click with me on this: you said that system.js and jspm are good for module management but that you were looking for package management. In the end, packages and modules tend to be synonymous, as a package becomes a module.
I may be misunderstanding what it is that you're looking for, but from what I can gather, I recommend you take a look at code splitting... which is essentially creating separate JS files that will be loaded dynamically based on what is needed, instead of bundling all your JavaScript into a single file. Here are some docs on how to do this with webpack (I'm sure Browserify does it as well).
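For a rough idea of what that looks like with webpack's dynamic import() syntax (a sketch; './heavy-feature', runFeature and the element selector are placeholders), the imported module is emitted as a separate chunk that is only fetched when this code runs:

const button = document.querySelector('#load-feature'); // placeholder element
button.addEventListener('click', () => {
  import('./heavy-feature')                 // webpack splits this into its own chunk
    .then(({ runFeature }) => runFeature()) // placeholder export from the lazy module
    .catch((err) => console.error('chunk failed to load', err));
});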
If I understand correctly, your question is about using the package.json file to include your own app configuration. In the example you describe, you would use such configuration to let your app know which dependencies can be loaded at runtime.
There is really nothing preventing you from inserting your own fields in the package.json file, except for the risk of conflict with names that are used by npm for other meanings. But if you use a very specific name (like in your example), you should be safe enough. Actually, many lint and build tools have done so already. It is even explicitly written in the post you refer to:
If your tool needs metadata to make it work, put it in package.json. It would be rude to do this without asking, but we’re inviting you to do it, so go ahead. The registry is a schemaless store, so every field you add is on an equal footing to all the others, and we won’t strip out or complain about new fields (as long as they don’t conflict with existing ones).
But if you want to be even safer, you could resort to using a different file (like Bower did, for example).

How can I use npm for front-end dependencies?

I want to ask if it is possible (and generally a good idea) to use npm to handle front-end dependencies (Backbone, jQuery).
I have found that Backbone, jQuery and so on are all available through npm but I would have to set another extraction point (the default is node_modules) or symlink or something else...
Has somebody done this before?
Is it possible?
What do I have to change in package.json?
+1 for using Browserify. We use it here at diy.org and love it. The best introduction and reasoning behind Browserify can be found in the Browserify Handbook. Topics like CommonJS & AMD solutions, build pipelines and testing are covered there.
The main reason Browserify works so well is that it works transparently with npm. As long as a module can be required, it can be Browserified (though not all modules are made to work in the browser).
Basics:
npm install jquery-browserify
main.js
var $ = require('jquery-browserify');
$("img[attr$='png']").hide();
Then run:
browserify main.js > bundle.js
Then include bundle.js in your HTML doc and the code in main.js will execute.
Short answer: sort of.
It is largely up to the module author to support this, but it isn't common. Socket.io is an example of such a supporting module, as demonstrated on their landing page. There are other solutions however. These are the two I actually know anything about:
http://ender.no.de/ - Ender JS, self-described NPM analogue for client modules. A bit too involved for my tastes.
https://github.com/substack/node-browserify - Browserify, a utility that will walk your dependencies and allow you to output a single script by emulating the node.js module pattern. You can use a jake|cake|rake|make build script to spit out your application.js, and even automate it if you want to get fancy. I used this briefly, but decided it was a bit clunky, and became annoying to debug. Also, not all dual-environment npm modules like to be run through browserify.
Personally, I am currently opting for using RequireJS ( http://requirejs.org/ ) and manually managing my modules, similar to how Mozilla does with their BrowserQuest sample application ( https://github.com/mozilla/BrowserQuest ). Note that this comes with the challenge of having to potentially shim modules like backbone or underscore which removed support for AMD style module loaders. You can find an example of what is involved in shimming here: http://tbranyen.com/post/amdrequirejs-shim-plugin-for-loading-incompatible-javascript
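For reference, a shim configuration along the lines of the post linked above might look like this (a sketch; the library paths are placeholders):

// main.js (sketch): tell RequireJS how to load non-AMD builds of Backbone/Underscore
requirejs.config({
  paths: {
    jquery: 'libs/jquery',
    underscore: 'libs/underscore',
    backbone: 'libs/backbone'
  },
  shim: {
    underscore: { exports: '_' },
    backbone: {
      deps: ['underscore', 'jquery'],
      exports: 'Backbone'
    }
  }
});

requirejs(['backbone'], function (Backbone) {
  // Backbone is now usable as an AMD dependency
});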
Really it seems like it is going to hurt no matter what, which is why native module support is such a hot topic.
Our team maintains a tool called Lineman for building front-end projects. The tool is node-based, so a project relies on a lot of npm modules that operate server-side to build your assets, but out of the box it expects to find your client-side dependencies copied and committed to vendor/js.
However, a bunch of folks (myself included) have tried integrating with browserify, and we've run into a lot of complexity and problems, ranging from (a) npm modules being maintained by a third party which are either out of date or add unwanted changes, to (b) actual libraries that start failing when loaded traditionally whenever a top-level function named require is even defined, due to AMD/Require.js baggage.
My short-term recommendation is to hold off and stick with good ol' fashioned script concatenation until the dust settles. Until you have problems big enough or complex enough to warrant it, I suspect you'll spend more time debugging and remediating your build than you otherwise would. And I think most of us agree the best use of your time is focusing on your application code, not its build tools.
You might want to take a look at http://jspm.io/ which is a browser package manager. Has nice ES6 support too.
I personally use webmake for my small projects. It is an alternative to browserify in the way it brings npm dependencies into your browser, and it's apparently lighter.
I didn't have the opportunity to compare in details browserify and webmake, but I noticed webmake doesn't work well with modules internally using global variables such as socket.io (which is full of bloat anyway IMO).
I would be cautious about RequireJS, which has been recommended above. Because it is an AMD loader, your browser will load your JS files asynchronously. This induces more exchanges between your client and server and may degrade the UX for people browsing on mobile networks or under bad WiFi. Moreover, if you manage to keep your JS code simple and tiny, asynchronous loading is not needed at all!
