npm installs many dependencies - javascript

I bought an HTML template recently, which contains many plugins placed inside a bower_components directory, plus a package.json file. I wanted to install another package I liked, but decided to use npm for this purpose.
When I typed:
npm install pnotify
the node_modules directory was created and contained about 900 directories with other packages.
What are those? Why did they get installed along with my package? I did some research and it turned out that those were needed, but do I really need to deliver my template in production with hundreds of unnecessary packages?

This is a very good question. There are a few things I want to point out.
The V8 engine, Node Modules (dependencies) and "requiring" them:
Node.js is built on the V8 engine, which is written in C++. At the very bottom of the stack, then, Node.js and its native addons are C++ code, even though most of the dependencies you install are written in JavaScript on top of that foundation.
When you require a dependency, you are really requiring code/functions from a C++ addon or from a JS library, because that is how new libraries/dependencies are built: on top of existing ones.
Libraries have many functions that you will not use
For example, take a look at the express-validator module, which contains many functions. When you require the module, do you use all of the functions it provides? The answer is no. People most often require a package like this just to use one single feature, yet all of its functions get downloaded anyway, which takes up unnecessary space.
Think of node dependencies that are built from other node dependencies the way you think of interpreted languages
For example, JavaScript engines are written in C/C++, whose compilers in turn ultimately produce assembly. Think of it like a tree: you create new branches each time for more convenient usage and, most importantly, to save time. It makes things faster. Similarly, when people create new dependencies, they use/require ones that already exist instead of rewriting a whole C++ program or JS script, because that makes everything easier.
The problem arises when a new package requires other npm packages
When the authors of a dependency require other dependencies from here and there just to use a few small pieces of them, those dependencies still get downloaded in full (the authors mostly do not worry about the size, and would rather do this than explicitly write a new dependency or a C++ addon), and this takes extra space. For example, you can see the dependencies that the express-validator module uses by accessing this link.
So, when you have big projects that use lots of dependencies, those dependencies end up taking a great deal of space.
Ways to solve this
Number 1
This requires real Node.js expertise. To reduce the number of downloaded packages, an experienced Node.js developer could go into the directories where the modules are saved, open the JavaScript files, study their source code, and delete the functions that will not be used, without changing the structure of the package.
Number 2 (Most likely not worth your time)
You could also create your own personal dependencies, written in C++ or (more realistically) in JS, which would take up the least space possible, depending on the programmer, but would cost the most time, since you would be reducing size instead of doing productive work. (Note: most dependencies are written in JS.)
Number 3 (Common)
Instead of using option number 2, you could use webpack to bundle your code.
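To give a rough idea of what that looks like (the entry and output paths here are assumptions, not part of the original template), a minimal webpack configuration could be:
// webpack.config.js - minimal sketch; adjust entry/output to your project
const path = require('path');

module.exports = {
  mode: 'production',       // enables minification and tree shaking of unused exports
  entry: './src/index.js',  // only code reachable from here ends up in the bundle
  output: {
    filename: 'bundle.js',
    path: path.resolve(__dirname, 'dist'),
  },
};
You then ship dist/bundle.js to production, while node_modules stays on the development machine.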
Conclusion & Note
So, basically, there is no running away from downloading all the node packages, but you could use solution number 1 if you believe you can do it, keeping in mind that it may break the original intent of a dependency (so keep such a trimmed copy private and use it for specific purposes). Or just make use of a tool like webpack.
Also, ask this question to yourself: Do those packages really cause you a problem?

No, there is no point in adding about 900 package dependencies to your project just because you want to add some template. But it is up to you!
The heaviness of one template is no real strain on the Node.js ecosystem or its main package manager, npm.
It is a fact that the JavaScript community tends to make the smallest possible modules, each responsible for one task, and just one.
That is not a bad thing, I guess, but it can result in a situation where you have a lot of dependencies in your project.
Nowadays disk space is cheap and few people care any more about making efficient/small apps.
As always, it's only a matter of choice.

What is the point of delivering hundreds of packages weighing hundreds of MB for a few-kB project?
There isn't.
If you intend to provide it to other developers, just gitignore (or remove from the shared package) the node_modules and bower_components directories. Developers will simply install the dependencies again as required ;)
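For example, a couple of lines in your .gitignore (assuming a git-based workflow) are enough:
node_modules/
bower_components/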
If it is something as simple as an HTML template or similar, Node is most likely there just to make your life as a developer easier by providing live reload and compiling/transpiling TypeScript/Babel/SCSS/Sass/Less/CoffeeScript... (the list goes on ;P).
And in that case the dependencies would most likely only be devDependencies and won't be required at all in a production environment ;)
Also, many packages declare separate production and dev dependencies, so you just need to install the production dependencies...
npm install --only=prod
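As a rough sketch of what that split looks like in package.json (package names and versions below are placeholders, not a recommendation):
{
  "dependencies": {
    "pnotify": "^4.0.0"
  },
  "devDependencies": {
    "sass": "^1.32.0",
    "typescript": "^4.1.0",
    "livereload": "^0.9.0"
  }
}
With a layout like this, npm install --only=prod (or the older npm install --production) skips everything under devDependencies.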
If your project does need many packages in production, and you really, really want to avoid that, just spend some time and include only the CSS/JS files your project actually needs (this can be a laborious task).
Update
Production vs default install
Most projects have different dev and production dependencies.
Dev dependencies may include things like Sass and TypeScript compilers, uglifiers (minification), and maybe things like live reload.
Whereas the production install will not have those things, reducing the size of the node_modules directory.
No node_modules
In some HTML-template kinds of projects, you may not need any node_modules in production at all, so you can skip running npm install.
No access to node_modules
Or in some cases, when the server that serves the app lives inside node_modules itself, access to that directory may be blocked (because there is no need to reach those files from the front end).

What are those? Why did they get installed along with my package?
Dependencies exist to facilitate code reuse through modularity.
... do I need to deliver my template in production with hundreds of unnecessary packages?
One shouldn't be so quick to dismiss this modularity. If you inline your requires and eliminate dead code, you'll lose the benefit of maintenance patches for the dependencies automatically being applied to your code. You should see this as a form of compilation, because... well... it is compilation.
Nonetheless, if you're licensed to redistribute all of your dependencies in this compiled form, you'll be happy to learn those optimisations are performed by compilers which compile JavaScript to JavaScript. The Closure Compiler, the first example I stumbled across, appears to perform advanced compilation, which means you get dead-code removal and function inlining... That seems promising!
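For illustration (the file names are assumptions), the npm distribution of the Closure Compiler can be invoked roughly like this:
npx google-closure-compiler --compilation_level ADVANCED --js app.js --js_output_file app.min.js
The ADVANCED level is what enables the dead-code removal and function inlining mentioned above.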

This does, however, have another side effect when you are required to justify the licensing of all npm modules: when you have hundreds of npm modules pulled in as dependencies, that effort becomes a much more cumbersome task.

Very old question, but I happened to come across a very similar situation, just as RA pointed out.
I tried to work with a Node.js framework using VS Code, and the moment I initialized npm with npm init -y, it generated many different dependencies. In my case, the cause was the VS Code extension ESLint, which I had added prior to running npm init -y. What fixed it for me:
Uninstalled ESLint
Restarted VS Code to apply the uninstallation
Removed the previously generated package.json and node_modules folder
Ran npm init -y again
This solved my problem of starting out with so many dependencies.

Related

Should a svelte package be a dependency or a devDependency?

I know there are already a lot of posts concerning the distinction between dependencies and devDependencies, but I didn't find any that explain it for the case of Svelte, so let's open this one here.
For most Svelte packages, like svelte-material-ui or svelte-routing, the installation guide says to install the package as a dependency. However, since Svelte compiles the package at build time, a new library that uses it doesn't need to install that Svelte package at runtime. So I don't see why it has to be a dependency.
Maybe this question is opinion-based, but it would be nice to have at least a rough idea of what to use.
I believe this is personal opinion. If you're not distributing your code as an NPM package, the distinction should be minimal. See, for example, this related discussion.
In my experience with web projects, it's helpful to distinguish between dependencies that are used for building/testing (devDependencies) vs. those that are "used at runtime" (dependencies). You're right that, with Svelte, none of the literal code is used at runtime, but then everything would be a devDependency, so you don't get a useful separation.
The NPM documentation says that the distinction should be production vs. development/testing.
In SvelteKit (the next version of Sapper) there is one major difference between dependency and devDependency: any module used in a (server-side) endpoint must be a dependency. If not, the project may not work when deployed on a serverless platform, although it will work locally.
Otherwise, I prefer to keep everything as a devDependency. I think it makes sense because Svelte is a compiler, and the packages are only needed at compile-time. However, I don't think it would hurt to just put everything as a dependency.
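To make that concrete, here is a hedged sketch of a package.json following this advice (the package names and versions are only examples):
{
  "devDependencies": {
    "svelte": "^3.38.0",
    "svelte-routing": "^1.6.0",
    "@sveltejs/kit": "next"
  },
  "dependencies": {
    "pg": "^8.6.0"
  }
}
Everything the compiler consumes lives in devDependencies, while pg stands in for a module imported from a server-side endpoint, which, per the SvelteKit note above, must be a regular dependency.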

Which options do exist for defining a Python package with node.js dependencies?

Currently, I have a few (unpublished) Python packages in local use, which I install (for development purposes) with a Bash script on Linux into an activated (otherwise "empty") virtual environment in the following manner:
cd /root/of/python/package
pip install -r requirements_python.txt # includes "nodeenv"
nodeenv -p # pulls node.js and integrates it into my virtual environment
npm i -g npm # update npm itself
cat requirements_node.txt | xargs npm install -g # install the JavaScript CLI dependencies into the virtual environment
pip install -e . # finally, install the Python package itself in editable mode
The background is that I have a number of node.js dependencies, JavaScript CLI scripts, which are called by my Python code.
Pros of current approach:
dead simple: relies on nodeenv for all required plumbing
can theoretically be implemented within setup.py with subprocess.Popen etc
Cons of current approach:
Unix-like platforms with Bash only
"hard" to distribute my packages, say on PyPI
requires a virtual environment
has potentially "interesting" side effects if a package is installed globally
potentially interferes with a pre-existing configuration / "deployment" of nodeenv in the current virtual environment
What is the canonical (if there is any) or just a sane, potentially cross-platform approach of defining node.js dependencies for a Python package, making it publishable?
Why is this question even relevant? JavaScript is not just for web development (any more). There are also interesting (relevant) data processing tools out there. If you do not want to miss / ignore them, well, welcome to this particular form of hell.
I recently came across calmjs, which appears to be what I am looking for. I have not experimented much with it yet and it also appears to be a relatively young project.
I started an issue there asking a similar question.
EDIT (1): Interesting resource: JavaScript versus Research Computing - A Brief Guide for Those Who Regret That This Has Become Necessary
EDIT (2): I started an issue against nodeenv, asking how I could make a project depend on it.
(Disclaimer: I am the author of calmjs)
After mulling over this particular issue for another few days, I think this question actually encapsulates multiple problems, which may or may not be orthogonal to each other depending on one's point of view. Some of them (the list is not exhaustive):
How can a developer ensure that they have all the information required to install the package when they are given one?
How does a project ensure that the ground it is standing on is solid (i.e. it has all the dependencies required)?
How easy is it for the user to install the given project?
How easy is it to reproduce a given build?
For a single-language, single-platform project, the first question is trivially answered - just use whatever package management solution is implemented for that language (i.e. Python - PyPI, Node.js - npm). The other questions generally fall into place.
For a multi-language, multi-platform project, this is where it completely falls apart. Long story short, this is why projects generally have multiple sets of instructions for whatever version of Windows, Mac or Linux (of various mainstream distros) for the installation of their software, especially in binary form, to address the third question so that it is easy for the end user (which usually ends up being doable, but not necessarily easy).
For developers and system integrators, who are definitely more interested in questions 2 and 4, they likely want an automation script for whatever platform they are on. This is kind of what you already got, except it only works on Linux, or wherever Bash is available. Now this also begs the question: How does one ensure Bash is available on the system? Some system administrators may prefer some other form of shell, so we are again back to the same problem, but instead of asking if Node.js is there, we have to ask if Bash is there. So this problem is basically unsolvable unless a line is drawn.
The first question hasn't really been mentioned yet, and I am going to make this fun by asking it in this manner: given a package from npm that requires a Python package, how does one specify a dependency on PyPI? Turns out such a project exists: nopy. I have not used it before, but at a casual glance it provides a specific way to record dependency information in the package.json file, which is the standard way for a Node.js package to convey information about itself. Do note that it has a non-standard way of managing Python packages; however, given that it uses whatever Python is available, it will probably do the right thing if a Python virtual environment is activated. Doing it this way also means that Node.js package dependents may have a way to figure out the required Python dependencies declared by their Node.js dependencies, but note that without something else on top of it (or some other ground/line), there is no way to assert from within the environment that what needs to be done is guaranteed to happen.
Naturally, coming back to Python, this question has been asked before (though not necessarily in a way that is useful to you specifically, as the contexts are all different):
javascript dependencies in python project
How to install npm package from python script?
Django, recommended way to declare and solve JavaScript dependencies in blocks
pip: dependency on javascript library
Anyway, calmjs only solves problem 1 - i.e. it lets developers figure out the Node.js packages they need from a given Python package - and to a lesser extent assists with problem 4, but without the guarantees of 2 and 3 those are not exactly solved.
From a Python dependency-management point of view, there is no way to guarantee that the required external tools are available until their usage is attempted (it will either work or not work, and likewise from Node.js, as explained earlier - and thank you for your question on the issue tracker, by the way). If this particular guarantee is required, many system integrators would make use of their favourite operating-system-level package manager (i.e. dpkg/apt, rpm/yum, or whatever else on Linux, Homebrew on OS X, perhaps Chocolatey on Windows), but again this requires further dependencies to install. Hence, if multiple platforms are to be supported, there is no general solution unless one reduces the scope, or has some kind of standard continuous integration that generates working installation images that are then deployed onto whatever virtualisation services the organisation uses (just an example).
Without all the specific baselines, this question is very difficult to provide a satisfactory answer for all parties involved.
What you describe is certainly not the simplest problem. For Python alone, companies came up with all kinds of packaging methods (e.g. Twitter's pex, Spotify's dh-virtualenv, or even grocker, which shifts Python deployments into container space) - (plug: I did a presentation at PyCon Balkan '18 on Packaging Python applications).
That said, one very hacky way, I could think of would be:
Find a way to compile your Node apps into a single binary. There is pkg (a blogpost about it), which
[...] enables you to package your Node.js project into an executable that can be run even on devices without Node.js installed.
This way the Node tools would be taken care of.
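For example (the entry file and targets here are assumptions), pkg is typically driven from the command line along these lines:
npm install -g pkg
pkg cli.js --targets node14-linux-x64,node14-macos-x64,node14-win-x64 --out-path dist/
This produces standalone executables that embed the Node.js runtime.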
Next, take these binary blobs and add them (somehow) as scripts to your Python package, so that they get distributed along with it and end up somewhere your actual Python code can pick them up and execute them.
Upsides:
Users do not need Node.js on their machine (which is probably expected when you just want to pip install something).
Your package gets more self-contained by including binaries.
Downsides:
Your Python package will include binaries, which is less common.
Containing binaries means that you will have to prepare versions for all platforms. Not impossible, but more work.
You will have to expand your package creation pipeline (Makefile, setup.py, or other) a bit to make this simple and repeatable.
Your package gets significantly larger (which is probably the least of the problems today).

Using package.json for client side packages, that could be loaded dynamically in browser

I am thinking of extending the format of package.json to cover dynamic package (plugin) loading on the client side, and I would like to understand whether this idea contradicts the npm vision or not. In other words, I want to load a bunch of modules that share common metadata at browser runtime. Solutions like SystemJS and jspm are good for module management, but what I seek is dynamic package management on the client side.
Speaking in detail, I would like to add a property like "myapp-clientRuntimeDependencies" that would allow specifying dependencies to be loaded by the browser instead of the standard prepackaging (npm install -> browserify-like solution).
package.json example:
{
  "name": "myapp-package",
  "version": "",
  "myapp-clientRuntimeDependencies": {
    "myapp-plugin": "file:myapp-plugin",
    "myapp-anotherplugin": "file:myapp-anotherplugin"
  },
  "peerDependencies": {
    "myapp-core": "1.0.0"
  }
}
The question:
Does this idea contradict the vision of npm and package.json? If yes, why?
Any feedback from npm community is very much appreciated.
References:
Extending package.json: http://blog.npmjs.org/post/101775448305/npm-and-front-end-packaging
EDIT:
The question was not formulated too well, the better way to ask this is:
What is the most standard way (e.g. handled by some existing tools, likely to be supported by npm) to specify run-time dependencies between 2 dynamically loaded front-end packages in package.json?
What is the most standard way to attach metadata in JSON format to front-end packages, that are loaded dynamically?
I wouldn't say that it conflicts with the vision of package.json; however, it does seem to conflict a bit with how it's typically used. As you seem to be aware, package.json is normally used pre-runtime. In order to load something from package.json into your runtime, you'd have to load the package.json into your frontend code. If you're storing configuration that you don't want visible to the front end via a simple view-source, this could definitely present a problem.
One thing that didn't quite click with me on this: you said that system.js and jspm are good for module management but that you were looking for package management. In the end, packages and modules tend to be synonymous, as a package becomes a module.
I may be misunderstanding what it is that you're looking for, but from what I can gather, I recommend you take a look at code splitting, which essentially means creating separate JS files that are loaded dynamically based on what is needed, instead of bundling all your JavaScript into a single file. Here are some docs on how to do this with webpack (I'm sure Browserify does it as well).
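As a small sketch of what code splitting looks like with a dynamic import (the module path and export name are assumptions):
// The bundler emits a separate chunk for this module and only fetches it when showChart() runs.
async function showChart() {
  const { renderChart } = await import('./charts.js');
  renderChart(document.getElementById('chart'));
}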
If I understand correctly, your question is about using the package.json file to include your own app configuration. In the example you describe, you would use such configuration to let your app know which dependencies can be loaded at runtime.
There is really nothing preventing you from inserting your own fields in the package.json file, except for the risk of conflict with names that are used by npm for other meanings. But if you use a very specific name (like in your example), you should be safe enough. Actually, many lint and build tools have done so already. It is even explicitly written in the post you refer to:
If your tool needs metadata to make it work, put it in package.json. It would be rude to do this without asking, but we’re inviting you to do it, so go ahead. The registry is a schemaless store, so every field you add is on an equal footing to all the others, and we won’t strip out or complain about new fields (as long as they don’t conflict with existing ones).
But if you want to be even safer, you could resort to using a different file (like Bower did, for example).

Is there any way to reorganize node_modules?

There is a big and deep node_modules directory, and there are many sub-folders containing the same modules, located in different subdirectories. Sometimes they are the same versions of the modules; sometimes they differ in minor versions.
Is there a tool that reorganizes node_modules - removing duplicates, hoisting shared modules to the root directory, and otherwise optimizing the whole pile of modules?
The NPM hierarchy is actually fairly complex and they've done a lot of work to optimize it. The most you're losing here is a little disk space. If you really need to prune the package structure for your app then you can take a look at npm dedupe which is built right into npm and does exactly what you're asking for (consolidates duplicates as much as possible).
I know a lot of people are against checking in node_modules directory but in production applications we've found that checking in node_modules makes it so much easier to pin down fatal changes in our application. Sure you usually know that your application broke after the last npm update but if your app is rather large like ours is then that simply isn't granular enough to solve problems quickly and efficiently. So in large production applications (i.e. not libraries being published to NPM) I use npm dedupe to simplify the package structure before checking them in.
If you're writing code that others will consume (via NPM or otherwise) then checking in node_modules isn't the best idea and you should avoid doing so by adding node_modules to your version control system's ignore list. You should also make sure your dependencies in package.json are specific version numbers with as few ranges as possible (please don't put asterisks in place of the version numbers in your production apps :/).
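In other words, prefer pinning concrete versions (the versions shown are arbitrary examples) over "*" or broad ranges:
"dependencies": {
  "express": "4.16.3",
  "lodash": "4.17.10"
}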
If you follow those basic patterns then you can just forget about the node_modules directory and let npm take care of you.

How can I use npm for front-end dependencies?

I want to ask if it is possible (and generally a good idea) to use npm to handle front-end dependencies (Backbone, jQuery).
I have found that Backbone, jQuery and so on are all available through npm but I would have to set another extraction point (the default is node_modules) or symlink or something else...
Has somebody done this before?
Is it possible?
What do I have to change in package.json?
+1 for using Browserify. We use it here at diy.org and love it. The best introduction and reasoning behind Browserify can be found in the Browserify Handbook. Topics like CommonJS & AMD solutions, build pipelines and testing are covered there.
The main reason Browserify works so well is it transparently works w/ NPM. As long as a module can be required it can be Browserified (though not all modules are made to work in the browser).
Basics:
npm install jquery-browserify
main.js
var $ = require('jquery-browserify');
// hide every image whose src ends with "png"
$("img[src$='png']").hide();
Then run:
browserify main.js > bundle.js
Then include bundle.js in your HTML doc and the code in main.js will execute.
Short answer: sort of.
It is largely up to the module author to support this, but it isn't common. Socket.io is an example of such a supporting module, as demonstrated on their landing page. There are other solutions however. These are the two I actually know anything about:
http://ender.no.de/ - Ender JS, self-described NPM analogue for client modules. A bit too involved for my tastes.
https://github.com/substack/node-browserify - Browserify, a utility that will walk your dependencies and allow you to output a single script by emulating the node.js module pattern. You can use a jake|cake|rake|make build script to spit out your application.js, and even automate it if you want to get fancy. I used this briefly, but decided it was a bit clunky, and became annoying to debug. Also, not all dual-environment npm modules like to be run through browserify.
Personally, I am currently opting for using RequireJS ( http://requirejs.org/ ) and manually managing my modules, similar to how Mozilla does with their BrowserQuest sample application ( https://github.com/mozilla/BrowserQuest ). Note that this comes with the challenge of having to potentially shim modules like backbone or underscore which removed support for AMD style module loaders. You can find an example of what is involved in shimming here: http://tbranyen.com/post/amdrequirejs-shim-plugin-for-loading-incompatible-javascript
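For reference, the shimming mentioned there boils down to a configuration block roughly like this (the vendor paths are assumptions about where the files are kept):
// main.js - RequireJS configuration
requirejs.config({
  paths: {
    jquery: 'vendor/jquery',
    underscore: 'vendor/underscore',
    backbone: 'vendor/backbone'
  },
  shim: {
    // non-AMD libraries: declare load order and the global each one exports
    underscore: { exports: '_' },
    backbone: { deps: ['jquery', 'underscore'], exports: 'Backbone' }
  }
});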
Really it seems like it is going to hurt no matter what, which is why native module support is such a hot topic.
Our team maintains a tool called Lineman for building front-end projects. The tool is node-based, so a project relies on a lot of npm modules that operate server-side to build your assets, but out of the box it expects to find your client-side dependencies copied and committed to vendor/js.
However, a bunch of folks (myself included) have tried integrating with browserify, and we've run into a lot of complexity and problems, ranging from (a) npm modules being maintained by a third party which are either out of date or add unwanted changes, to (b) actual libraries that start failing when loaded traditionally whenever a top-level function named require is even defined, due to AMD/Require.js baggage.
My short-term recommendation is to hold off and stick with good ol' fashioned script concatenation until the dust settles. Until you have problems big enough or complex enough to warrant it, I suspect you'll spend more time debugging and remediating your build than you otherwise would. And I think most of us agree the best use of your time is focusing on your application code, not its build tools.
You might want to take a look at http://jspm.io/ which is a browser package manager. Has nice ES6 support too.
I personally use webmake for my small projects. It is an alternative to browserify in the way it brings npm dependencies into your browser, and it's apparently lighter.
I didn't have the opportunity to compare in details browserify and webmake, but I noticed webmake doesn't work well with modules internally using global variables such as socket.io (which is full of bloat anyway IMO).
I would be cautious about RequireJS, which has been recommended above. Because it is an AMD loader, your browser will load your JS files asynchronously. That induces more round trips between your client and server and may degrade the UX for people browsing on mobile networks or bad Wi-Fi. Moreover, if you manage to keep your JS code simple and tiny, asynchronous loading is not needed at all!
