Is there any way to reorganize node_modules? - javascript

There is big and deep node_modules directory. And there are many sub-folders with the same modules that are located in different subdirectories. Sometimes the same versions of the modules, there are sometimes differences in minor versions.
Is there a tool for reorganization of node_modules to remove duplicates, put them in the root directory and somehow still a bunch of modules to optimize this?

The NPM hierarchy is actually fairly complex and they've done a lot of work to optimize it. The most you're losing here is a little disk space. If you really need to prune the package structure for your app then you can take a look at npm dedupe which is built right into npm and does exactly what you're asking for (consolidates duplicates as much as possible).
I know a lot of people are against checking in node_modules directory but in production applications we've found that checking in node_modules makes it so much easier to pin down fatal changes in our application. Sure you usually know that your application broke after the last npm update but if your app is rather large like ours is then that simply isn't granular enough to solve problems quickly and efficiently. So in large production applications (i.e. not libraries being published to NPM) I use npm dedupe to simplify the package structure before checking them in.
If you're writing code that others will consume (via NPM or otherwise) then checking in node_modules isn't the best idea and you should avoid doing so by adding node_modules to your version control system's ignore list. You should also make sure you're dependencies in package.json are specific version numbers with as few ranges as possible (please don't put asterisks in place of the version numbers in your production apps :/).
If you follow those basic patterns then you can just forget about the node_modules directory and let npm take care of you.

Related

Can I review which node_modules I need?

Is there a way to review which node_modules packages are used in my create-react-app project and delete modules which are not in use? Other than manually reviewing all used packages and their dependencies?
I'd like to reduce the amount of space used up by the project. It's committed to a resources repository, and the node_modules folder alone triples the space used.
For repo size (assuming you're using git):
git rm -r --cached ./node_modules
Add the line /node_modules to your .gitignore file
git add .
git commit -m "Remove node_modules from repo"
For build size:
create-react-app is pretty good at tree shaking, so you shouldn't need to worry too much about this. Certain packages may have more modular ways of importing, which can help - usually these will be documented on a package-by-package basis. For example, Material UI icons supports two import syntaxes:
// option 1
import AccessAlarmIcon from '#material-ui/icons/AccessAlarm'
import ThreeDRotation from '#material-ui/icons/ThreeDRotation'
// option 2
import { AccessAlarm, ThreeDRotation } from '#material-ui/icons'
Of these, option 1 is likely to yield the smallest bundle size.
For size on your development machine:
The biggest difference you can make here (assuming you have other projects) is probably switching to something like pnpm, which stores node_modules centrally on your local machine and then links out to them from your project, rather than having many instances of identical modules.
For general housekeeping and tidiness:
You could try using a tool such as depcheck to detect unused dependencies. This is probably overkill, though, unless it's a particular pain point for your project.
Probably the easiest way is not to include the node modules by adding it to the gitignore file (or creating one), but if that's not an option, probably just unsintall the unneeded packages

How can I cleanup least frequently used node packages?

My system has 1000s of node packages installed mostly via yarn and few via npm. Now I want to cleanup least frequently used node packages, how do I proceed? Is there any package which can track the usage of these over time and help me identify those to be cleaned up?
Maybe you could do two things.
First you could use the npm prune command to unbuild packages that are not listed in the packages.json dependencies. This way you can get rid of those that the package manager cant audit, search for vulnerabilities and fix.
Second, you could use webpack in order to list dependencies and set how they will be loaded, in order to optimize your app.
Hope this help you, or at least give you some ideas to do your research :)

npm installs many dependencies

I bought an HTML template recently, which contains many plugins placed inside a bower_components directory and a package.js file inside. I wanted to install another package I liked, but decided to use npm for this purpose.
When I typed:
npc install pnotify
the node_modules directory was created and contained about 900 directories with other packages.
What are those? Why did they get installed along with my package? I did some research and it turned out that those were needed, but do I really need to deliver my template in production with hundreds of unnecessary packages?
This is a very good question. There are a few things I want to point out.
The V8 engine, Node Modules (dependencies) and "requiring" them:
Node.js is built on V8 engine, which is written in C++. This means that Node.js' dependencies are fundamentally written in C++.
Now when you require a dependency, you really require code/functions from a C++ program or js library, because that's how new libraries/dependencies are made.
Libraries have so many functions that you will not use
For example, take a look at the express-validator module, which contains so many functions. When you require the module, do you use all the functions it provides? The answer is no. People most often require packages like this just to use one single benefit of it, although all of the functions end up getting downloaded, which takes up unnecessary space.
Think of the node dependencies that are made from other node dependencies as Interpreted Languages
For example, JavaScript is written in C/C++, whose functions and compilers are in turn originally written in assembly. Think of it like a tree. You create new branches each time for more convenient usage and, most importantly, to save time . It makes things faster. Similarly, when people create new dependencies, they use/require ones that already exist, instead of rewriting a whole C++ program or js script, because that makes everything easier.
Problem arises when requiring other NPMs for creating a new one
When the authors of the dependencies require other dependencies from here and there just to use a few (small amount) benefits from them, they end up downloading them all, (which they don't really care about because they mostly do not worry about the size or they'd rather do this than explicitly writing a new dependency or a C++ addon) and this takes extra space. For example you can see the dependencies that the express-validator module uses by accessing this link.
So, when you have big projects that use lots of dependencies you end up taking so much space for them.
Ways to solve this
Number 1
This requires some expert people on Node.js. To reduce the amount of the downloaded packages, a professional Node.js developer could go to the directories that modules are saved in, open the javascript files, take a look at their source code, and delete the functions that they will not use without changing the structure of the package.
Number 2 (Most likely not worth your time)
You could also create your own personal dependencies that are written in C++, or more preferably js, which would literally take up the least space possible, depending on the programmer, but would take/waste the most time, in order to reduce size instead of doing work. (Note: Most dependencies are written in js.)
Number 3 (Common)
Instead of Using option number 2, you could implement WebPack.
Conclusion & Note
So, basically, there is no running away from downloading all the node packages, but you could use solution number 1 if you believe you can do it, which also has the possibility of screwing up the whole intention of a dependency. (So make it personal and use it for specific purposes.) Or just make use of a module like WebPack.
Also, ask this question to yourself: Do those packages really cause you a problem?
No, there is no point to add about 900 packages dependencies in your project just because you want to add some template. But it is up to you!
The heavyness of a template is not challenging the node.js ecosystem nor his main package system npm.
It is a fact that javascript community tend to make smallest possible module to be responsible for one task, and just one.
It is not a bad thing I guess. But it could result of a situation where you have a lot of dependencies in your project.
Nowadays hard drive memory is cheap and nobody cares any more about making efficient/small apps.
As always, it's only a matter of choice.
What is the point of delivering hundreds of packages weighing hundreds of MB for a few kB project.
There isn't..
If you intend to provide it to other developers, just gitignore (or remove from shared package) node_modules or bower_components directories. Developers simply install dependencies again as required ;)
If it is something as simple as an HTML templates or similar stuff, node would most likely be there just for making your life as a developer easier providing live reload, compiling/transpiling typescript/babel/SCSS/SASS/LESS/Coffee... ( list goes on ;P ) etc.
And in that case dependencies would most likely only be dev_dependencies and won't be required at all in production environment ;)
Also many packages come with separate production and dev dependencies, So you just need to install production dependencies...
npm install --only=prod
If your project does need many projects in production, and you really really wanna avoid that stuff, just spend some time and include css/js files your your project needs(this can be a laborious task).
Update
Production vs default install
Most projects have different dev and production dependencies,
Dev dependencies may include stuff like SASS, typescript etc. compilers, uglifiers (minification), maybe stuff like live reload etc.
Where as production version will not have those things reducing the size node_modules directory.
** No node_modules**
In some html template kind of projects, you may not need any node_modules in production, so you skip doing an npm install.
No access to node_modules
Or in some cases, when server that serves exists in node_modules itself, access to it may be blocked (coz there is no need to access these from frontend).
What are those? Why did they get installed along with my package?
Dependencies exists to facilitate code reuse through modularity.
... do I need to deliver my template in production with hundreds of unnecessary packages?
One shouldn't be so quick to dismiss this modularity. If you inline your requires and eliminate dead code, you'll lose the benefit of maintenance patches for the dependencies automatically being applied to your code. You should see this as a form of compilation, because... well... it is compilation.
Nonetheless, if you're licensed to redistribute all of your dependencies in this compiled form, you'll be happy to learn those optimisations are performed by a compiler which compile Javascript to Javascript. The Closure Compiler, as the first example I stumbled across, appears to perform advanced compilation, which means you get dead code removal and function inlining... That seems promising!
This does however have another side effect when you are required to justify the licensing of all npm modules..so when you have hundreds of npm modules due to dependencies this effort also becomes a more cumbersome task
Very old question but I happened to come across very similar situation just as RA pointed out.
I tried to work with node.js framework using vscode and the moment when I tried to install start npm using npm init -y, it generated so many different dependencies. In my case, it was vscode extension ESlint that I added to prior to running npm init -y
Uninstalling ESlint
Restarted vscode to apply that uninstallation
removed previously generated package.json and node-modules folder
do npm init -y again
This solved my problem of starting out with so many dependencies.

is it ok to have more than one package.json in a project?

Is it ok to have more than one package.json in a project? I'm working in a .NET MVC solution and it has a package.json at the root level. I need to integrate Jasmine/Karma into the solution and this is my first time doing this type of integration. I found a sample project for Jasmine/Karma on the web and I was able to get this running locally. This project has it's own package.json.
It seems like it would be useful to maintain the package.json file for the sample Jasmine/Karma sample project separately from the package.json at the root level of the solution to provide more flexibility and to allow the same properties to be used differently based on context.
Would this be valid or generally considered an ok practice? Or do I need to figure out how to merge the contents of package.json from the sample project into the package.json at the root level of the .NET MVC solution?
In terms of keeping things simple, it is easier to only have to maintain a single package.json file, but there is nothing inherently wrong with multiple package.json files within a repo. Some companies do use mono-repos, for which it would make total sense to have multiple package.json files.
Multiple package.json files give you a lot of flexibility to run different/incompatible versions of dependencies. As a practical example, on one of the projects that I work on we have 2 package.json files, one for the main application code and then another one for our BDD tests. We use chimp for our BDD tests and had issues with running that on the latest node version, whereas we don't want to keep the rest of the app from upgrading just because of this. We also found that the single package.json file got very messy when it was combined at first.
I would say its typically bad practice to have more than one package.json. I would only expect to have to npm install once, and having to deal with two sets of dependency management could lead to issues down the line.

What is the perfect workflow to work on A, B & C in parallel where A depends on B and B on C?

I wonder what is the perfect workflow if one needs to work on project A, B and C in parallel where A depends on B and B depends on C.
Currently I have everything in one repository which speeded up the early development. So my working directory looks like this:
/A/
/A/B
/A/B/C
So A is the project that is driving the development but it also means B and C are in parallel evolvement.
However, I want to release the projects B and C individually as well as they are quite useful for others.
However, I'm torn on how to do this without ruining my development speed. npm is great for distribution and dependency management but during development you definitely don't want to move temporally versions across the internet just to get the files to update in a different folder on your machine :)
On the other hand you also don't want to manually copy them over. Heck, all this I have to switch directories to work on B and now copy it over to /A/B is scary and seems error prone.
So, git submodule seemed like the answer as it's essentially enabling exactly that: You would keep your directory layout just as it is. Then when you make changes to files in B you can directly test them out in A without having to copy something over. And when you think it's ready you can just commit and push from the three different folders. Everything goes into three different repositories automatically. Seems like heaven yet everybody hates git submodule for various reasons.
I have pretty much the same problem when working on grunt plugins. I have the grunt plugin in it's own repository and then when I'm working on it I have to copy it over in one of the projects where I use it to drive the development. And then at the end of the day I copy it back to the grunt plugin working directory, make commits and push them. Thank god, I'm not the author of thousands of grunt plugins so I can deal with that but for this project I'm currently working on, I would definitely like to find a better solution.
So I wonder, what is the answer?
Note that Git submodules point to a specific version of the external repository, i.e. to a specific commit.
To update all Git submodules to the latest version, you’d still have to run a command:
git submodule foreach git pull origin master
Depending on your situation, you could possibly use npm instead of Git submodules. In that case, you simply list your dependencies in package.json, then run npm install in the root of the repository to fetch them. If a dependency is updated and a new version is published, you just run npm update again and it will match the version requirements set in the package.json file. You could also use npm to point to a specific commit, much like how Git submodules work:
{
"devDependencies": {
"dependency-a": "git://github.com/the-user-name/the-project-name.git#b3c24169432a924d05020c85f852d4a1736e66d4"
}
}
Or, if you want to use a bleeding edge version of a dependency, like the master branch of a given Git repository, you could use:
{
"devDependencies": {
"dependency-a": "git://github.com/the-user-name/the-project-name.git#master"
}
}
I've tried all number of these workflows and have decided that git submodules are not the answer. The main reason for me being that they couple projects together at a scm level and it is not intuitive as to where exactly in the revision history this coupling happens. Especially in a detached head scenario.
I rely on npm for depency managment with a combination of npm link and git URLs. npm link is great when I want to test something locally. git URLs are perfect to resolve dependencies during pull requests. Eventually I always want a published module with a version number I can match to a git tag for future reference and issue tracking. I use npm version to do this in one step. Allows for projects to depend on each other at varying levels of coupling during my development cycle.
Would using git subtree be the answer here? This article Alternatives To Git Submodule: Git Subtree did a good job explaining the pros and cons of using git subtree versus git submodules.

Categories