browserify and babelify very slow due to large data js files - javascript

I have a nodejs project which uses large dictionary lists (millions of entries), stored in js files, that look like this:
module.exports = ["entry1", "entry2", "entry3", "entry4", "entry5", etc.];
and then I use them from the other files like this:
var values = require('./filePath');
This works great and it works in the browser too (using browserify), however bundling takes ages - about 10 minutes.
I use the following command to create the bundle:
browserify "./src/myModule.js" --standalone myModule -t [ babelify --presets [ es2015 stage-2 ] --plugins ["transform-es2015-classes", {"loose": true}]
I have tried to avoid parsing of my dictionary js files using --noparse ["path1", "path2", "path3", etc.] but it did not make any difference.
Ideally I would like to just speed up the browserify/babelify process; however, if that's not possible, I would be very happy to find another way (i.e. avoiding require) to store and use my lists, so that they don't slow the process down but, crucially, still work in both node and the browser.

You can bundle the data files separately, so you'll only need to rebundle them when they change. This is possible using the --require (-r) and --external (-x) options.
To create the data bundle, do something like this:
browserify -r ./path/to/data.js -r ./path/to/other/data.js > data-bundle.js
The resulting data-bundle.js will define the require function globally which can be used to obtain any file you listed in the command above. Just make sure you include this bundle in a script tag before your main bundle.
It would be nice to be able to --require a glob pattern, but unfortunately browserify does not support this. If you try to use the shell to expand a pattern, the -r option will only apply to the first file, which sucks. You can probably write a shell script that builds a command from an ls or something, to avoid having to list all of the data files explicitly, but that's beyond the scope of the question, I think.
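For what it's worth, building that command does not have to be a shell script. Here is a minimal Node sketch that produces one -r flag per data file. The function names and the directory layout are my own assumptions, not anything browserify provides:

```javascript
var fs = require('fs');
var path = require('path');

// List every .js file in dataDir (hypothetical directory), sorted for a stable order.
function listDataFiles(dataDir) {
  var base = dataDir.replace(/\/$/, '');
  return fs.readdirSync(dataDir)
    .filter(function (name) { return path.extname(name) === '.js'; })
    .sort()
    .map(function (name) { return base + '/' + name; });
}

// Turn a list of files into ['-r', file1, '-r', file2, ...],
// because each file needs its own -r flag.
function buildRequireArgs(files) {
  return files.reduce(function (args, file) {
    return args.concat('-r', file);
  }, []);
}
```

The resulting array could then be handed to child_process.spawn together with the browserify binary, with the output redirected to data-bundle.js.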
To create your main bundle without rebuilding the data files, simply add an option like this to your command:
-x './path/to/data/*.js'
This tells browserify to basically ignore them and let them be pulled in through the global require function created by your other bundle. As you can see, this does support glob patterns, so it's a bit easier.
Update:
To make the two bundles into one, just put something like this at the end of a shell script that starts with the browserify command that builds your main bundle:
cat data-bundle.js main-bundle.js > bundle.js
rm main-bundle.js
Unfortunately this will always have to write a copy of data-bundle.js to disk, which may be the ultimate cause of the slowdown, as I mentioned in the comments below. Worth giving a shot, though.
If even that doesn't work, there are some other, much more hacky approaches you might take. I'll pass on going into those for now though, because I don't think they're worth it unless you absolutely must have it as one file and have no other way of doing it. :\

If you have files that contain only data, load them separately and don't include them in the build process.
Format your big data files as JSON.
On the server use:
let fs = require('fs');
let yourContent = JSON.parse(fs.readFileSync('path/to/file'));
On client use:
let request = require("client-request"); // do npm install client-request
var options = {
uri: "http://.com/path/to/file",
json: true
}
var req = request(options, function callback(err, response, body) {
console.log(response.statusCode)
if (body) {
let yourContent = body
}
})
Or use whichever other HTTP request library you prefer.
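If you want a single call site for both environments, the two snippets above can be folded into one helper. This is only a sketch: loadList and its parameters are hypothetical names, and the browser branch assumes the global fetch API is available instead of client-request:

```javascript
// Load a JSON list from disk in Node, or over HTTP in the browser.
// Both branches return a promise so callers do not need to care where they run.
function loadList(nodePath, browserUrl) {
  if (typeof window === 'undefined') {
    // Node branch: read the file and parse it.
    var fs = require('fs');
    return new Promise(function (resolve, reject) {
      fs.readFile(nodePath, 'utf8', function (err, text) {
        if (err) return reject(err);
        try {
          resolve(JSON.parse(text));
        } catch (parseError) {
          reject(parseError);
        }
      });
    });
  }
  // Browser branch: assumes fetch exists (an assumption, not from the answer above).
  return fetch(browserUrl).then(function (res) {
    if (!res.ok) throw new Error('HTTP ' + res.status);
    return res.json();
  });
}
```

Usage would look like loadList('path/to/file', '/path/to/file').then(function (values) { ... }), with both paths being placeholders.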

Related

Is there a way to cause the JS engine to load a .js file without explicitly importing something from it?

Maybe I'm trying to do something silly, but I've got a web application (Angular2+), and I'm trying to build it in an extensible/modular way. In particular, I've got various, well, modules for lack of a better term, that I'd like to be able to include or not, depending on what kind of deployment is desired. These modules include various functionality that is implemented via extending base classes.
To simplify things, imagine there is a GenericModuleDefinition class, and there are two modules - ModuleOne.js and ModuleTwo.js. The first defines a ModuleOneDefinitionClass and instantiates an exported instance ModuleOneDefinition, and then registers it with the ModuleRegistry. The second module does an analogous thing.
(To be clear - it registers the ModuleXXXDefinition object with the ModuleRegistry when the ModuleXXX.js file is run (e.g. because some other .js file imports one of its exports). If it is not run, then clearly nothing gets registered - and this is the problem I'm having, as I describe below.)
The ModuleRegistry has some methods that will iterate over all the Modules and call their individual methods. In this example, there might be a method called ModuleRegistry.initAllModules(), which then calls the initModule() method on each of the registered Modules.
At startup, my application (say, in index.js) calls ModuleRegistry.initAllModules(). Obviously, because index.js imports the exported ModuleRegistry symbol, this will cause the ModuleRegistry.js code to get pulled in, but since none of the exports from either of the two Module .js files is explicitly referenced, these files will not have been pulled in, and so the ModuleOneDefinition and ModuleTwoDefinition objects will not have been instantiated and registered with the ModuleRegistry - so the call to initAllModules() will be for naught.
Obviously, I could just put meaningless references to each of these ModuleDefinition objects in my index.js, which would force them to be pulled in, so that they were registered by the time I call initAllModules(). But this requires changes to the index.js file depending on whether I want to deploy it with ModuleTwo or without. I was hoping to have the mere existence of the ModuleTwo.js be enough to cause the file to get pulled in and the resulting ModuleTwoDefinition to get registered with the ModuleRegistry.
Is there a standard way to handle this kind of situation? Am I stuck having to edit some global file (either index.js or some other file it references) so that it has information about all the included Modules so that it can then go and load them? Or is there a clever way to cause JavaScript to execute all the .js files in a directory, so that merely copying the files in would be enough to get them to load at startup?
A clever way to cause Node.js (rather than JavaScript in general) to execute all the .js files in a directory:
var fs = require('fs') // node filesystem
var path = require('path') // node path
function hasJsExtension(item) {
  return item != 'index.js' && path.extname(item) === '.js'
}
function pathHere(item) {
  // resolve to an absolute path; a bare 'file.js' would be looked up in node_modules
  return path.resolve(__dirname, item)
}
fs.readdir(__dirname, function(err, list) {
  if (err) throw err
  list.filter(hasJsExtension).map(pathHere).forEach(function(file) {
    require(file) // require them all
  })
})
Angular is quite different, all the more so if ng serve is what decides whether your app needs a module and, if so, serves the corresponding js file whenever it is needed rather than at first load.
In fact, your situation reminds me of C++, with declarations in header files and implementations in cpp files; maybe you just need a defineAllModules function that runs before initAllModules.
Another approach would be to find out how to exclude those modules from ng serve and include them as scripts in your HTML before the others. They would then be defined (if present, and therefore served) and called by Angular if necessary. The only caveat is the error in the console if a script tag cannot be fetched, but your app will work anyway, if it is supposed to.
Either way, you would be declaring those modules somewhere in ng serve and also in the HTML.
In your particular case, and without wanting to undervalue ng serve: is the total js for your app too heavy to be served at once (minified and so on)? If not, the simplest solution may be one of the many tools that build and rebuild your production all.js from your dev js folder at will or, as you said, after a drag and drop into your folder.
Such a tool is, again, server-side, but even if you can only push/FTP your javascript, you could use it in your preferred dev environment and just push the new version. To see a list of such tools, google 'YourDevEnvironment bundle javascript'.
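As a rough illustration of the "rebuild from the folder" idea, a small build step could generate a file that imports every module file for its side effects, so that dropping ModuleTwo.js into the folder is enough after the next build. This is a sketch under assumptions: the modules live in their own directory, and generateModuleIndex and the output file name are hypothetical:

```javascript
var fs = require('fs');
var path = require('path');

// Build the text of a file that imports each .js file in modulesDir.
// outputName is the generated file itself, which must not import itself.
function generateModuleIndex(modulesDir, outputName) {
  return fs.readdirSync(modulesDir)
    .filter(function (name) {
      return name !== outputName && path.extname(name) === '.js';
    })
    .sort()
    .map(function (name) {
      // A bare import runs the file, which triggers its registration code.
      return "import './" + name + "';";
    })
    .join('\n') + '\n';
}
```

The returned string would be written to the output file before bundling, and index.js would import that one generated file instead of listing each module by hand.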
To do more with ng serve and append static js files under specific conditions, you should use webpack. The first option I see here is to eject your webpack configuration; after that you can specify what Angular should load or not.
With that said, I will give an example:
With the Angular CLI and ng serve, any external javascript files you want to include have to go in the scripts array of the angular-cli.json file. However, you cannot control which files are included and which are not.
With a webpack configuration you can control all of this by passing a flag from your terminal to the webpack config file and doing the processing there.
Example:
// ScriptsWebpackPlugin is assumed to be imported at the top of the ejected webpack config.
function buildPlugins(env) {
  var basePath = "D:\\Tutorial\\Angular\\demo-project";
  var scripts = [
    basePath + "\\node_modules\\bootstrap\\dist\\bootstrap.min.js"
  ];
  if (env && env.commandLineParameter == 'production') {
    // jQuery is only added to the scripts bundle for production builds
    scripts.push(basePath + "\\node_modules\\jquery\\dist\\jquery.min.js");
  }
  return [
    new ScriptsWebpackPlugin({
      "name": "scripts",
      "sourceMap": true,
      "filename": "scripts.bundle.js",
      "scripts": scripts,
      "basePath": basePath
    })
  ];
}
then:
module.exports = (env) => ({
  "plugins": buildPlugins(env),
  // other webpack configuration
});
The scripts.bundle.js bundle will be loaded before your main app bundle, so you can control what gets loaded when you run npm run start instead of ng serve.
To eject your webpack configuration, use ng eject.
Generally speaking, when you need to control some of what ng serve does, you should extract your own webpack config and customize it as you want.

Assemble every module into a single .js file

I want to minimize the number of HTTP requests from the client to load scripts in the browser. This is going to be a pretty general question but I still hope I can get some answers because module management in javascript has been a pain so far.
Current situation
Right now, in development, each module is requested individually from the main html template, like this:
<script src="/libraries/jquery.js"></script>
<script src="/controllers/controllername.js"></script>
...
The server runs on Node.js and sends the scripts as they are requested.
Obviously this is the least optimal way of doing so, since all the models, collections, etc. are also separated into their own files which translates into numerous different requests.
As far as research goes
The libraries I have come across (RequireJS using AMD and CommonJS) can request modules from within the main .js file sent to the client, but require a lot of additional work to make each module compliant with each library:
;(function(factory){
if (typeof define === 'function' && define.amd) define([], factory);
else factory();
}(function(){
// Module code
exports = moduleName;
}));
My goal
I'd like to create a single file on the server that 'concatenates' all the modules together. If I can do so without having to add more code to the already existing modules that would be perfect. Then I can simply serve that single file to the client when it is requested.
Is this possible?
Additionally, if I do manage to build a single file, should I include the open source libraries in it (jQuery, Angular.js, etc.) or request them from an external cdn on the client side?
What you are asking to do, from what I can tell, is to concatenate your js files into one file, so that your main.html would contain this:
<script src="/pathLocation/allMyJSFiles.js"></script>
If my assumption is correct, then the answer is to use one of the following two tools:
Gulp or Grunt
I use Gulp.
You can either use gulp on a case by case basis, which means calling gulp from the command line to execute gulp code, or use a watch to do it automatically on save.
I won't cover installing gulp and its plugins here; I will only show the part of my setup that answers your question.
In my gulp file I would have something like this
var gulp = require('gulp');
var concat = require('gulp-concat');
...maybe more.
Then I have the file paths I need to be reduced into one file.
var onlyProductionJS = [
'public/application.js',
'public/directives/**/*.js',
'public/controllers/**/*.js',
'public/factories/**/*.js',
'public/filters/**/*.js',
'public/services/**/*.js',
'public/routes.js'
];
and I use this info in a gulp task like the one below
gulp.task('makeOneFileToRuleThemAll', function(){
return gulp.src(onlyProductionJS)
.pipe(concat('weHaveTheRing.js'))
.pipe(gulp.dest('public/'));
});
I then run the task in my command line by calling
gulp makeOneFileToRuleThemAll
This call runs the associated gulp task which uses 'gulp-concat' to get all the files together into one new file called 'weHaveTheRing.js' and creates that file in the destination 'public/'
Then just include that new file into your main.html
<script src="/pathLocation/weHaveTheRing.js"></script>
As for including all your files into one file, including your vendor files, just make sure that your vendor code runs first. It's probably best to keep those separate unless you have a sure fire way of getting your vendor code to load first without any issues.
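To make the ordering point concrete, here is a plain Node sketch of what concatenation boils down to. It is not how gulp-concat is implemented, and concatFiles is a hypothetical helper; the only thing it demonstrates is that the output follows the order of the input list, so vendor files have to come first in that list:

```javascript
var fs = require('fs');

// Concatenate the given files, in the given order, into dest.
// Joining with ';\n' guards against a file that lacks a trailing semicolon.
function concatFiles(sources, dest) {
  var combined = sources
    .map(function (file) { return fs.readFileSync(file, 'utf8'); })
    .join(';\n');
  fs.writeFileSync(dest, combined);
  return combined;
}
```

With gulp, the same ordering is expressed by the order of the entries in the array passed to gulp.src.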
UPDATE
Here is my gulp watch task.
gulp.task('startTheWatchingEye', function () {
gulp.watch(onlyProductionJS, ['makeOneFileToRuleThemAll']);
});
Then I start up my server like this (yours may differ)
npm start
// in a different terminal window I then type
gulp startTheWatchingEye
NOTE: you can use ANY movie or show reference you wish! :)
Now just code it up, every time you make a change in the specified files GULP will run your task(s).
If you want to say run Karma for your test runner...
add the following to your gulp file
var karma = require('karma').server;
gulp.task('karma', function(done){
karma.start({
configFile: __dirname + '/karma.conf.js'
}, done);
});
Then add this task karma to your watch I stated above like this...
gulp.task('startTheWatchingEye', function(){
gulp.watch(onlyProductionJS, ['makeOneFileToRuleThemAll', 'karma']);
});
ALSO
Your specific settings may require a few more gulp modules. Usually, you install Gulp globally, as well as each module. Then use them in your various projects. Just make sure that your project's package.json has the gulp modules you need in dev or whatever.
There are different articles on whether to use Gulp or Grunt. Gulp was made after Grunt with a few additions that Grunt was lacking. I don't know if Grunt lacks them anymore. I like Gulp a lot though and find it very useful with a lot of documentation.
Good luck!
I'd like to create a single file on the server that 'concatenates' all the modules together. If I can do so without having to add more code to the already existing modules that would be perfect.
Sure you can. You can use Grunt or Gulp to do that, more specifically grunt-contrib-concat or gulp-concat
Here's an example of a Gruntfile.js configuration to concat every file under a js directory:
grunt.initConfig({
concat: {
dist: {
files: {
'dist/built.js': ['js/**/**.js'],
},
},
},
});
Also, you can minify everything after concatenating, using grunt-contrib-uglify.
Both libraries support source maps so, in the case a bug gets to production, you can easily debug.
You can also minify your HTML files using grunt-contrib-htmlmin.
There's also an extremely useful library called grunt-usemin. Usemin lets you use HTML comments to "control" which files get minified (so you don't have to manually add them).
The drawback is that you have to explicitly include them in your HTML via script tags, so no async loading via javascript (with RequireJS for instance).
Additionally, if I do manage to build a single file, should I include the open source libraries in it (jQuery, Angular.js, etc.) or request them from an external cdn on the client side?
That's debatable. Both have pros and cons. Concatenating vendors assures that, if for some reason, the CDN isn't available, your page works as intended. However the file served is bigger so you consume more bandwidth.
In my personal experience, I tend to include vendor libraries that are absolutely essential for the page to run such as AngularJS for instance.
If I understand you correctly, you could use a task runner such as Grunt to concatenate the files for you.
Have a look at the Grunt Concat plugin.
Example configuration from the docs:
// Project configuration.
grunt.initConfig({
concat: {
dist: {
src: ['src/intro.js', 'src/project.js', 'src/outro.js'],
dest: 'dist/built.js',
}
}
});
Otherwise, as you have stated, a 'module loader' system such as Require JS or Browserify may be the way to go.

How do I package a node module with optional submodules?

I'm writing a javascript library that contains a core module and several optional submodules which extend the core module. My target is the browser environment (using Browserify), where I expect a user of my module will only want to use some of my optional submodules and not have to download the rest to the client--much like custom builds work in lodash.
The way I imagine this working:
// Require the core library
var Tasks = require('mymodule');
// We need yaks
require('mymodule/yaks');
// We need razors
require('mymodule/razors');
var tasks = new Tasks(); // Core mymodule functionality
var yak = tasks.find_yak(); // Provided by mymodule/yaks
tasks.shave(yak); // Provided by mymodule/razors
Now, imagine that the mymodule/* namespace has tens of these submodules. The user of the mymodule library only needs to incur the bandwidth cost of the submodules that she uses, but there's no need for an offline build process like lodash uses: a tool like Browserify solves the dependency graph for us and only includes the required code.
Is it possible to package something this way using Node/npm? Am I delusional?
Update: An answer over here seems to suggest that this is possible, but I can't figure out from the npm documentation how to actually structure the files and package.json.
Say that I have these files:
./lib/mymodule.js
./lib/yaks.js
./lib/razors.js
./lib/sharks.js
./lib/jets.js
In my package.json, I'll have:
"main": "./lib/mymodule.js"
But how will node know about the other files under ./lib/?
It's simpler than it seems -- when you require a package by its name, you get the "main" file. So require('mymodule') returns the exports of "./lib/mymodule.js" (per your package.json "main" prop). To require optional submodules directly, simply require them via their file path.
So to get the yaks submodule: require('mymodule/lib/yaks'). If you wanted to do require('mymodule/yaks') you would need to either change your file structure to match that (move yaks.js to the root folder) or do something tricky where there's a yaks.js at the root and it just does something like: module.exports = require('./lib/yaks');.
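To show how an optional submodule can extend the core once it is required, here is a single-file sketch that simulates the three files. The method bodies are invented for illustration; in a real package each section would live in its own file under ./lib/ and the submodules would begin with var Tasks = require('./mymodule'):

```javascript
// lib/mymodule.js (core): in the package this would end with module.exports = Tasks.
function Tasks() {}

// lib/yaks.js (optional): adds find_yak to the core prototype when it is required.
Tasks.prototype.find_yak = function () {
  return { name: 'yak', shaved: false };
};

// lib/razors.js (optional): same pattern, adds shave.
Tasks.prototype.shave = function (yak) {
  yak.shaved = true;
  return yak;
};
```

Because each submodule only runs when something requires it, Browserify leaves out the ones that are never required.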
Good luck with this yak lib. Sounds hairy :)

Dynamically generating a required module with Browserify and Gulp

Is there a way to invoke Browserify (via Gulp) so that it includes a different file when requireing a module with a given name?
Briefly, the end result I would like is for my Browserify entry point, main.js:
var myPlatformSpecificImplmentation = require('./platform');
// go to town
to use the contents of ./path/to/platform-a.js when I run gulp js:platform-a and ./path/to/platform-b.js when I run gulp js:platform-b.
If I were using RequireJS, this would be as simple as modifying the paths option accordingly:
paths: {
platform: './path/to/platform-a'
}
It would be great if I could somehow generate these modules dynamically via gulp's built-in streaming mechanism. In that case, I could, say, pipe a file into gulp-template and on into Browserify.
Thanks
One solution would be to use my pathmodify plugin like so:
gulpfile.js
var
pathmod = require('pathmodify'),
paths = {a: '/path/to/platform-a.js', b: '/path/to/platform-b.js'};
function platform (version) {
return function () {
return browserify('./main')
.plugin(pathmod(), {mods: [
pathmod.mod.id('app/platform', paths[version])
]})
.bundle()
.pipe(...);
};
}
gulp.task('js:platform-a', platform('a'));
gulp.task('js:platform-b', platform('b'));
main.js
var myPlatformSpecificImplmentation = require('app/platform');
I've illustrated this with your require() string changed to app/platform because that allows the simplest implementation of pathmodify without collisions with other ./platform relative paths in other files. But this can be implemented with pathmodify without risking collision (by testing the parent module [main.js in this case] pathname). If it's important to you to keep the ./platform string I'll illustrate that.
Or you could use a transform. Take a look at makeRequireTransform() in benbria/browserify-transform-tools if you don't want to roll your own.
It would be great if I could somehow generate these modules dynamically via gulp's built-in streaming mechanism. In that case, I could, say, pipe a file into gulp-template and on into Browserify.
That's not out of the question, but it's not really easy to do. To do it without touching disk, you'd need to do something like create / gulp.src() a vinyl file and run it through whatever gulp plugins, then convert it to a stream to feed to browserify.
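If touching disk is acceptable, a much simpler variation is to write a tiny platform.js shim before bundling, so that require('./platform') re-exports whichever implementation was chosen. This is a sketch only: writePlatformShim is a hypothetical helper, not part of gulp, Browserify, or pathmodify:

```javascript
var fs = require('fs');

// Write shimFile so that requiring it re-exports implementationPath.
// implementationPath is resolved relative to the shim, e.g. './path/to/platform-a'.
function writePlatformShim(shimFile, implementationPath) {
  var source = 'module.exports = require(' +
    JSON.stringify(implementationPath) + ');\n';
  fs.writeFileSync(shimFile, source);
  return source;
}
```

Each gulp task would call it with its own implementation path before starting the Browserify bundle, and main.js could keep its require('./platform') unchanged.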

Node.js browserify slow: isn't there a way to cache big libraries?

I'm creating a file that requires huge libraries such as jquery and three.js using browserify. The compiling process takes several seconds, probably because it's recompiling all the libs for each minor change I make. Is there a way to speed it up?
Have you tried using the --insert-globals, --ig, or --fast flags? (they're all the same thing)
The reason it's slow may be that it's scanning all of jquery and three.js for __dirname, __filename, process, and global references.
EDIT:
I just remembered: Browserify will take any pre-existing require functions and fall back to using that.
This means you could build a bundle for your static libs, and then only rebuild the bundle for your app code on change.
This coupled with my pre-edit answer should make it a lot faster.
There are a few options that can help:
--noparse=FILE is a must for things like jQuery and three.js that are huge but don't use require at all.
--detect-globals Set to false if your module doesn't use any node.js globals. Directs browserify not to parse a file looking for process, global, __filename, and __dirname.
--insert-globals Set to true if your module does use node.js globals. This will define those globals without parsing the module and checking to see if they're used.
I was able to speed up my build by externalizing ThreeJS, using noparse with it, and setting it not to create a source map for it.
Use https://github.com/substack/watchify while developing.
If you use grunt, you can use my grunt task : https://github.com/amiorin/grunt-watchify
It caches the dependencies and watches the filesystem. Because of this the build is very fast. You can use it with grunt-contrib-watch and grunt-contrib-connect or alone. You can find a Gruntfile.js example in the github repository.
If you don't use grunt, you can use the original watchify from @substack : https://github.com/substack/watchify
Using watchify is practically a must, as it actually caches your deps between reloads.
My builds dropped from 3-8s to under 1s. (The >3s builds were still using ignoreGlobals, detectGlobals=false, and even noParseing jQuery).
Here's how I use it with gulp and coffeescript:
gulp = require("gulp")
gutil = require("gulp-util")
# assumed: the original snippet does not show where livereload comes from
livereload = require("gulp-livereload")
source = require("vinyl-source-stream")
watchify = require("watchify")
browserify = require("browserify")
coffeeify = require("coffeeify")

gulp.task "watchify", ->
  args = watchify.args
  args.extensions = ['.coffee']
  bundler = watchify(browserify("./coffee/app.coffee", args), args)
  bundler.transform(coffeeify)

  rebundle = ->
    gutil.log gutil.colors.green 'rebundling...'
    bundler.bundle()
      # log errors if they happen
      .on "error", gutil.log.bind(gutil, "Browserify Error")
      # wrap browserify's text stream in a vinyl file named app.js so gulp can pipe it
      .pipe source("app.js")
      .pipe gulp.dest("js")
      .pipe livereload()
    gutil.log gutil.colors.green 'rebundled.'

  bundler.on "update", rebundle
  rebundle()

gulp.task "default", ["watchify", "serve"]
EDIT: here's a JS translation:
var gulp = require("gulp")
var gutil = require("gulp-util")
// assumed: the original snippet does not show where livereload comes from
var livereload = require("gulp-livereload")
var source = require("vinyl-source-stream")
var watchify = require("watchify")
var browserify = require("browserify")
var coffeeify = require("coffeeify")

gulp.task("watchify", function() {
  var args = watchify.args
  args.extensions = ['.coffee']
  var bundler = watchify(browserify("./coffee/app.coffee", args), args)
  bundler.transform(coffeeify)

  function rebundle() {
    gutil.log(gutil.colors.green('rebundling...'))
    bundler.bundle()
      // log errors if they happen
      .on("error", gutil.log.bind(gutil, "Browserify Error"))
      // wrap browserify's text stream in a vinyl file named app.js so gulp can pipe it
      .pipe(source("app.js"))
      .pipe(gulp.dest("js"))
      .pipe(livereload())
    gutil.log(gutil.colors.green('rebundled.'))
  }

  bundler.on("update", rebundle)
  rebundle()
})

gulp.task("default", ["watchify", "serve"])
Update
You can also give persistify a try; it can be used as a drop-in replacement for watchify from the command line and from code.
Original answer below
=======
I'm currently using bundly: https://www.npmjs.com/package/bundly
FULL DISCLOSURE: I wrote it
The main difference with this wrapper is that it provides incremental building. It persists the browserify cache between runs and only parses the files that have changed, without the need for watch mode.
Currently the module does a bit more than only adding the cache, but I'm thinking that the logic that handles the incremental build part could be moved to a plugin, that way it can be used with browserify directly.
Check a demo here: https://github.com/royriojas/bundly-usage-demo
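The general idea behind incremental building can be sketched without any bundler: remember a content hash per file and only reprocess files whose hash changed. This is an illustration of the concept, not how bundly or persistify are actually implemented, and changedFiles is a hypothetical helper:

```javascript
var crypto = require('crypto');
var fs = require('fs');

// Content hash of a single file.
function hashFile(file) {
  return crypto.createHash('sha1').update(fs.readFileSync(file)).digest('hex');
}

// Return the files whose hash differs from the one stored in cache,
// updating cache in place so the next run sees the new state.
function changedFiles(files, cache) {
  return files.filter(function (file) {
    var hash = hashFile(file);
    var changed = cache[file] !== hash;
    cache[file] = hash;
    return changed;
  });
}
```

Persisting the cache object to disk between runs (for example as JSON) is what would make this work without a watch mode.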
I wrote this to solve the problem of slow builds with browserify and commonjs-everywhere. If you run it in "watch" mode then it will automatically watch your input files and incrementally rebuild just any files that changed. Basically instantaneous and will never get slower as your project grows.
https://github.com/krisnye/browser-build
