Minify JavaScript programmatically in-memory

I am building a nifty little "asset-pipeline" for an Express.js application, but I have a problem with the compression step for JavaScript files:
scripts = (fs.readFileSync(file) for file in filelist)
result = scripts.join("\n\n") # concat
Up to now, things are working as expected (the logic itself is written in CoffeeScript). The next step after merging the JS files would be to minify them. But here is my problem: I want to do this "hot" when I start my Express app in production mode, from within a piece of Connect middleware I wrote.
I need a solution that can minify a given blob of JavaScript without writing the result to disk (!); in other words, a function that does the minification and returns the result directly as its return value. (No, no web services either.) It should be usable like this:
minified_result = awesomeMinifyFunction( result )
Raw processing performance isn't that important to me, and neither is the level of compression; I just need something that works this way without hassle.
Does anyone know a suitable solution? Thanks in advance!

I'd suggest you look at one of the JavaScript based minifiers, like UglifyJS2.
npm install uglify-js
It can be used within a Node.js application programmatically:
var UglifyJS = require("uglify-js");
// you could pass multiple files (rather than reading them as strings)
var result = UglifyJS.minify([ "file1.js", "file2.js", "file3.js" ]);
console.log(result.code);
Or you could pass the concatenated source as a string:
var result = scripts.join("\n\n"); // concat
result = UglifyJS.minify(result, {fromString: true});
console.log(result.code);
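For reference, here is a minimal sketch of the awesomeMinifyFunction shape the question asks for. It assumes uglify-js 3.x, where, as far as I know, minify() accepts the source string directly (the fromString option belongs to the 2.x API shown above) and reports failures through result.error instead of throwing. The name minifyInMemory and the injectable minifier parameter are my own additions for illustration.

```javascript
// Sketch only: assumes uglify-js 3.x (npm install uglify-js).
// `minifier` defaults to the real package; it is a parameter purely so the
// function can be tried out with a stand-in object that has a minify() method.
function minifyInMemory(source, minifier = require("uglify-js")) {
  // Nothing is written to disk: source string in, result object out.
  const result = minifier.minify(source);
  if (result.error) {
    // uglify-js 3.x returns parse errors on the result object.
    throw result.error;
  }
  return result.code;
}

// Intended usage inside the middleware (commented out so that this snippet
// does not require the package just to be loaded):
// const minified = minifyInMemory(scripts.join("\n\n"));
```

Throwing on result.error keeps the call site as simple as the one-liner in the question; whether the middleware should instead fall back to the unminified source is a design choice left open here.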

You can write your own function that removes all comments/spaces/blank lines etc.
You can use a regular expression like the one used by rJSmin:
function awesomeMinifyFunction(result)
{
pattern = (
r'([^\047"/\000-\040]+)|((?:(?:\047[^\047\\\r\n]*(?:\\(?:[^\r\n]|\r?'
r'\n|\r)[^\047\\\r\n]*)*\047)|(?:"[^"\\\r\n]*(?:\\(?:[^\r\n]|\r?\n|'
r'\r)[^"\\\r\n]*)*"))[^\047"/\000-\040]*)|(?<=[(,=:\[!&|?{};\r\n])(?'
r':[\000-\011\013\014\016-\040]|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/))*'
r'(?:(?:(?://[^\r\n]*)?[\r\n])(?:[\000-\011\013\014\016-\040]|(?:/\*'
r'[^*]*\*+(?:[^/*][^*]*\*+)*/))*)*((?:/(?![\r\n/*])[^/\\\[\r\n]*(?:('
r'?:\\[^\r\n]|(?:\[[^\\\]\r\n]*(?:\\[^\r\n][^\\\]\r\n]*)*\]))[^/\\\['
r'\r\n]*)*/)[^\047"/\000-\040]*)|(?<=[\000-#%-,./:-@\[-^`{-~-]return'
r')(?:[\000-\011\013\014\016-\040]|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/'
r'))*(?:(?:(?://[^\r\n]*)?[\r\n])(?:[\000-\011\013\014\016-\040]|(?:'
r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/)))*((?:/(?![\r\n/*])[^/\\\[\r\n]*(?'
r':(?:\\[^\r\n]|(?:\[[^\\\]\r\n]*(?:\\[^\r\n][^\\\]\r\n]*)*\]))[^/'
r'\\\[\r\n]*)*/)[^\047"/\000-\040]*)|(?<=[^\000-!#%&(*,./:-@\[\\^`{|'
r'~])(?:[\000-\011\013\014\016-\040]|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)'
r'*/))*(?:((?:(?://[^\r\n]*)?[\r\n]))(?:[\000-\011\013\014\016-\040]'
r'|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/))*)+(?=[^\000-\040"#%-\047)*,./'
r':-@\\-^`|-~])|(?<=[^\000-#%-,./:-@\[-^`{-~-])((?:[\000-\011\013\01'
r'4\016-\040]|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/)))+(?=[^\000-#%-,./:'
r'-@\[-^`{-~-])|(?<=\+)((?:[\000-\011\013\014\016-\040]|(?:/\*[^*]*'
r'\*+(?:[^/*][^*]*\*+)*/)))+(?=\+)|(?<=-)((?:[\000-\011\013\014\016-'
r'\040]|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/)))+(?=-)|(?:[\000-\011\013'
r'\014\016-\040]|(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/))+|(?:(?:(?://[^'
r'\r\n]*)?[\r\n])(?:[\000-\011\013\014\016-\040]|(?:/\*[^*]*\*+(?:[^'
r'/*][^*]*\*+)*/))*)+'
)
return result.match(pattern);
}

I'd recommend taking a look at Asset Rack, which already implements what you're building.

Related

How do you use an npm package with require & module export as a plain JS library

I'm not sure I'm even asking the right question here, sorry, but I think the two general ones are:
In what way do you need to modify a node.js package using require etc to be used as a plain embedded script/library in HTML?
How do you call a class constructor (?) in JS as a function to validate a form field?
I'm trying to use this small JS library NoSwearingPlease (which is an npm package) in an environment with no node or build system – so I'm just trying to call it like you would jQuery or something with a script & src in the HTML, and then utilise it with a small inline script.
I can see a couple of things are required to get this working:
the JSON file needs to be called in a different way (not using require etc)
the checker variable needs to be rewritten, again without require
I attempted using jQuery getJSON but I just don't understand the class & scope bits of the library enough to use it I think:
var noswearlist = $.getJSON( "./noswearing-swears.json" );
function() {
console.log( "got swear list from inline script" );
})
.fail(function() {
console.log( "failed to get swear list" );
})
noswearlist.done(function() {
console.log( "done callback as child of noswearlist variable" );
var checker = new NoSwearing(noswearlist);
console.log(checker);
});
Please halp. Thanks!
No need to modify anything: outside of Node, the class is simply attached to window (the global object):
fetch("https://cdn.jsdelivr.net/gh/ThreeLetters/NoSwearingPlease@master/swears.json").then(response => {
return response.json();
}).then(data => {
var noSwearing = new NoSwearing(data);
console.log(noSwearing.check("squarehead"));
});
<script src="https://cdn.jsdelivr.net/gh/ThreeLetters/NoSwearingPlease@master/index.js"></script>
In the future, you can answer this type of question on your own by looking through the source code and looking up things you don't understand. That being said, here's what I was able to gather doing that myself.
For your first question, if you have no build tools you can't use require, you have to hope your NPM package supports adding the class to the window or has a UMD export (which in this case, it does). If so, you can download the source code or use a CDN like JSDelivr and add a <script> tag to link it.
<script src="https://cdn.jsdelivr.net/gh/ThreeLetters/NoSwearingPlease@master/index.js"></script>
I'm having a hard time deciphering your script (it has a few syntax errors as far as I can tell), so here's what you do if you have a variable ns containing the JSON and the string str that you need to check:
var checker = new NoSwearing(ns);
checker.check(str);
As an aside, you should really use build tools to optimize your bundle size and make using packages a lot easier. And consider dropping jQuery for document.querySelector, fetch/XMLHttpRequest, and other modern JavaScript APIs.

Attempting to Import Module in Child Process (Javascript) and Failing

I'm currently running a heavy computation (i.e. generating a Monte Carlo tree), which is an expensive operation. I only have a few seconds to build as big of a tree as I can, so I am using subprocesses in Node.js in order to build multiple trees, and then aggregate their data together to make a more informed decision.
I understand that subprocesses do not share information/memory, and I need to use modules within these subprocesses that are located in a file, called "Epilog.js" on my machine.
When I run functions that are in epilog.js from the main file, it works just fine. But all of my functions that are in my worker threads return absolutely nothing.
I have tested to make sure that the parameters of the functions I am trying to use in "epilog.js" aren't empty, and they're not. The problem isn't in the parameter.
I have also tested to see what happens if I simply don't import, and instead of just outputting an undefined array, I get an error saying that there is no function called "findroles".
//My main thread.
var fs = require('fs');
eval(fs.readFileSync('epilog.js') + '');
var process = fork('./buildGraph.js');
process.send({library});
//My worker thread.
//buildGraph.js
var fs = require('fs');
eval(fs.readFileSync('epilog.js') + '');
// receive message from master process
process.on('message', async(message) => {
library = message["library"];
console.log(findroles(library));
// findroles(library) is a function that is defined in epilog.js,
//and this outputs an array of "roles" given a parameter,library.
// For some reason this function outputs [], rather than giving me
// all of the roles. If I run this exact line from my main thread,
// it doesn't give any errors and outputs the right array:
// e.g. ['red', 'white'].
});
I expect to get not the empty array, but [red, white], as I do if I were to run the same line in the main thread. Does anyone have an idea as to the inconsistency of the functions? I'm very new to node.js and this isn't a class focused too much on software engineering in JavaScript, so I'd appreciate if someone can dumb down what is going on, as this is all very new to me.
If your script does not find the function called findroles then there is a problem with the importing method. Using the eval function for importing is not the normal way of importing modules. Try something like this:
// buildGraph.js
const epilog = require("./epilog.js");
......
console.log(epilog.findroles(library));
then epilog.js
exports.findroles = function (library) {
// function content
}
You can find more info here:
https://www.w3schools.com/nodejs/nodejs_modules.asp
Based on the documentation and the example here, everything seems correct, but I think the problem comes from this line:
var process = fork('./buildGraph.js');
You might be overriding the original process object.
Try changing it to:
const n = fork('./buildGraph.js');

Node.js changing exports on the fly

changing exports.X in a function seems to not work...
I want to be able to load settings from a file & access them in Node.js. I have this currently, however, the clients connecting to my node application can edit what's in the settings file. Unfortunately as it stands the Node application has to be restarted for the changes to take effect. Is there a way I can reload the module.exports on the fly?
EDIT:
Settings file is literally a JSON string.
My settings module is 'required' in almost every single file, and there's a lot of files... So reloading it per-file basis is out of the question. I do, however, know precisely when someone makes a change to the settings.
If you are using require to load the settings and only referencing the settings from one module, then doing something along the lines of:
delete require.cache[require.resolve(filename)];
will work for you.
If, on the other hand, multiple modules will be referencing these settings, that approach can become a bit unwieldy and open you up to unforeseen bugs. For example, if any of the modules are holding on to a reference to the required settings file, they would each need to somehow learn that the settings had changed and update their references.
To alleviate (not completely solve) the caching issue, you can build your settings interface so that users must access the settings object through a function and/or access individual properties through functions. Even with this model, someone may still decide to cache a setting, causing an obscure failure later down the road.
Using the simplest approach of a single getter for the settings object would look something like this:
var settings = require('./settings.json');
// ... watch for changes and reload by invalidating node's cache
module.exports = function() { return settings; }
Usage:
var settings = require('./path/to/settings');
settings().foo;
There are several libraries that do settings. Depending on your needs, I'm partial to nconf.
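To make the "invalidate node's cache" step concrete, here is a rough sketch of the getter approach with an explicit reload. The temp-file setup and the names reloadSettings and getSettings are only there for illustration; in a real app the reload would be triggered at the point where you know the settings were edited.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical settings file in a temp directory, for demonstration only.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "settings-demo-"));
const settingsPath = path.join(dir, "settings.json");
fs.writeFileSync(settingsPath, JSON.stringify({ foo: 1 }));

let settings = require(settingsPath);

// Drop the cached module entry and load the file again.
function reloadSettings() {
  delete require.cache[require.resolve(settingsPath)];
  settings = require(settingsPath);
}

// Consumers go through the getter, so they always see the latest object.
function getSettings() {
  return settings;
}

const before = getSettings().foo;
fs.writeFileSync(settingsPath, JSON.stringify({ foo: 2 })); // a client edits the file
reloadSettings();
const after = getSettings().foo;
console.log(before, after); // prints "1 2"
```

Because only the settings module holds the reference, every file that calls the getter picks up the change without being reloaded itself.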
I'd set up a file watcher here that checks for changes of a JSON file dynamically. It is not recommended practice to change a JS script once the app is running.
Something like:
var _ = require("lodash");
var fs = require("fs");
var result = {};
fs.watch('my-settings.json',function(event,filename){
fs.readFile(filename,function(err,data){
if(err){
// your error catching
return;
}
_.extend(result,JSON.parse(data));
});
});
module.exports = result;
Now, this comes with lots of caveats. First, fs.watch is not supported consistently on all platforms.
http://nodejs.org/api/fs.html#fs_fs_watch_filename_options_listener
Second, it's really awkward to change a property like this. The expectation is generally that a module's exports do not mutate. I'd instead recommend exposing a method whose result can change based on the state of the file: a getter for the resulting data.
Third, a file watcher can be expensive, memory-wise.
This is better code, IMHO:
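var fs = require("fs");
var filename = 'my-settings.json';
var lastModified; // mtime (in ms) of the version currently cached
var mySetting;
module.exports = {
getSettingAsync : function (callback) {
fs.stat(filename,function(err,stat){
if(err){
// your error catching
return;
}
// compare numeric timestamps; two Date objects are never == to each other
if(stat.mtime.getTime() === lastModified) {
callback(mySetting);
} else {
fs.readFile(filename,function(err,data){
if(err){
// your error catching
return;
}
// this assumes that your data is always correct
mySetting = JSON.parse(data).mySetting;
// remember which version is cached so the next call can skip the read
lastModified = stat.mtime.getTime();
callback(mySetting);
});
}
});
}
};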
In this case, we both check for a JSON file, and expose this as an async method. You could just as easily change the code to use the sync versions if need be and return the value instead of invoking the callback. This version checks when the file was changed, which is cheaper than reading the whole file every time, reads the file if newer and saves you the need to use a potentially buggy file watcher.
By the way, I've not tested this code and it may contain errors as is, but the concept is sound.
But, perhaps the more salient question, why not just store that value in the database?
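For completeness, the synchronous variant mentioned above might look roughly like this. It is a sketch under the same assumptions (a JSON file that is always well-formed); createSettingsReader is a made-up helper name, and the demo bumps the mtime explicitly with fs.utimesSync so that it does not depend on the filesystem's timestamp resolution.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Returns a getter that re-reads the file only when its mtime has changed.
function createSettingsReader(filename) {
  let lastModified = null; // mtimeMs of the cached version
  let cached;
  return function getSettings() {
    const mtimeMs = fs.statSync(filename).mtimeMs;
    if (mtimeMs !== lastModified) {
      cached = JSON.parse(fs.readFileSync(filename, "utf8"));
      lastModified = mtimeMs;
    }
    return cached;
  };
}

// Demo with a hypothetical settings file in a temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "mtime-demo-"));
const file = path.join(dir, "my-settings.json");
fs.writeFileSync(file, JSON.stringify({ mySetting: "a" }));

const getSettings = createSettingsReader(file);
const first = getSettings().mySetting;
const sameObject = getSettings() === getSettings(); // unchanged file, no re-read

fs.writeFileSync(file, JSON.stringify({ mySetting: "b" }));
// Force a clearly different mtime for the demo.
const later = new Date(Date.now() + 5000);
fs.utimesSync(file, later, later);
const second = getSettings().mySetting;

console.log(first, second); // prints "a b"
```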

How do I use uncompressed files in Dojo 1.7?

I've created a Dojo module which depends on dojox/data/JsonRestStore like this:
define("my/MyRestStore",
["dojo/_base/declare", "dojox/data/JsonRestStore"],
function(declare, JsonRestStore) {
var x = new JsonRestStore({
target: '/items',
identifier: 'id'
});
...
which is fine. But now I want to have the uncompressed version of the JsonRestStore code loaded so that I can debug it. I can't find any documentation on how to do this, but since there is a file called 'JsonRestStore.js.uncompressed.js' I changed my code to:
define("my/MyRestStore",
["dojo/_base/declare", "dojox/data/JsonRestStore.js.uncompressed"],
function(declare, JsonRestStore) {
...
thinking that might work.
I can see the JsonRestStore.js.uncompressed.js file being loaded in FireBug, but I get an error when trying to do new JsonRestStore:
JsonRestStore is not a constructor
Should this work?
Is there a way of configuring Dojo to use uncompressed versions of all modules? That's what I really want, but will settle for doing it on a per dependency basis if that's the only way.
Update
I've found a way to achieve what I want to do: rename the JsonRestStore.js.uncompressed.js file to JsonRestStore.js.
However, this seems a bit like a hacky workaround so I'd still be keen to know if there is a better way (e.g. via configuration).
You have two options
1) Create a custom build. The custom build will output a single uncompressed file that you can use for debugging. Think of dojo.js.uncompressed.js, but including all the extra modules that you use.
OR
2) For a development environment, use the dojo source code. This means downloading the Dojo Toolkit SDK and referencing dojo.js from that in the development environment.
For the projects I work on, I do both. I set up the Dojo configuration so that it can be dynamic and I can change which configuration that I want using a query string parameter.
When I am debugging a problem, I will use the first option just to let me step through code and see what is going on. I use the second option when I am writing some significant js and don't want the overhead of the custom build to see my changes.
I describe this a bit more at
http://swingingcode.blogspot.com/2012/03/dojo-configurations.html
I think this happens because the loader derives module names from file conventions. The 1.7 loader is not too robust just yet; I had similar problems until I realized how to separate the '.' and '/' characters.
It's only a qualified guess, but I believe it has to do with how the '.' character in the class name is interpreted: as a sub-namespace separator rather than as part of the module name.
The 'define(/* BLANK */ [ /* DEPENDENCIES */ ], ...)' form, where no first string parameter is given, gets its module name from the filename (basename). The returned declare also has a say, though. So, for your example with JsonRestStore, it's split/parsed as such:
toplevel = dojox
mid = data
modulename = JsonRestStore.js.uncompressed
(Fail: the module resolves as dojox.data.JsonRestStore.js.uncompressed, not dojox.data.JsonRestStore as it should.)
So, three options:
Load uncompressed classes through <script src="{{dataUrl}}/dojox/data/JsonRestStore.js.uncompressed.js"></script> and work with them on dojo.ready
I think modifying the define([], function(){}) in uncompressed.js to define("JsonRestStore", [], function() {}) would do the trick (unconfirmed)
Use the dojo/text loader, see below
define("my/MyRestStore",
["dojo/_base/declare", "dojo/text!dojox/data/JsonRestStore.js.uncompressed.js"],
function(declare, JsonRestStore) {
...
JsonRestStore = eval(JsonRestStore);
// not 100% sure 'define' returns reference to actual class,
// if above renders invalid, try access through global reference, such as
// dojox.dat...

Exclude debug JavaScript code during minification

I'm looking into different ways to minify my JavaScript code including the regular JSMin, Packer, and YUI solutions. I'm really interested in the new Google Closure Compiler, as it looks exceptionally powerful.
I noticed that Dean Edwards packer has a feature to exclude lines of code that start with three semicolons. This is handy to exclude debug code. For instance:
;;; console.log("Starting process");
I'm spending some time cleaning up my codebase and would like to add hints like this to easily exclude debug code. In preparation for this, I'd like to figure out if this is the best solution, or if there are other techniques.
Because I haven't chosen how to minify yet, I'd like to clean the code in a way that is compatible with whatever minifier I end up going with. So my questions are these:
Is using the semicolons a standard technique, or are there other ways to do it?
Is Packer the only solution that provides this feature?
Can the other solutions be adapted to work this way as well, or do they have alternative ways of accomplishing this?
I will probably start using Closure Compiler eventually. Is there anything I should do now that would prepare for it?
Here's the (ultimate) answer for Closure Compiler:
/** @const */
var LOG = false;
...
LOG && log('hello world !'); // compiler will remove this line
...
This will even work with SIMPLE_OPTIMIZATIONS, and no --define= is necessary!
Here's what I use with Closure Compiler. First, you need to define a DEBUG variable like this:
/** @define {boolean} */
var DEBUG = true;
This uses Closure's JS annotations, which you can read about in the documentation.
Now, whenever you want some debug-only code, just wrap it in an if statement, like so:
if (DEBUG) {
console.log("Running in DEBUG mode");
}
When compiling your code for release, add the following to your compilation command: --define='DEBUG=false' -- any code within the debug statement will be completely left out of the compiled file.
A good solution in this case might be js-build-tools which supports 'conditional compilation'.
In short you can use comments such as
// #ifdef debug
var trace = debug.getTracer("easyXDM.Rpc");
trace("constructor");
// #endif
where you define a pragma such as debug.
Then when building it (it has an ant-task)
//this file will not have the debug code
<preprocess infile="work/easyXDM.combined.js" outfile="work/easyXDM.js"/>
//this file will
<preprocess infile="work/easyXDM.combined.js" outfile="work/easyXDM.debug.js" defines="debug"/>
Adding logic to every place in your code where you are logging to the console makes it harder to debug and maintain.
If you are already going to add a build step for your production code, you could always add another file at the top that turns your console methods into no-ops.
Something like:
console.log = console.debug = console.info = function(){};
Ideally, you'd just strip out any console methods, but if you are keeping them in anyway but not using them, this is probably the easiest to work with.
If you use the Closure Compiler in Advanced mode, you can do something like:
if (DEBUG) console.log = function() {}
Then the compiler will remove all your console.log calls. Of course you need to --define the variable DEBUG in the command line.
However, this is only for Advanced mode. If you are using Simple mode, you'll need to run a preprocessor on your source file.
Why not consider the Dojo Toolkit? It has built-in comment-based pragmas to include/exclude sections of code based on a build. Plus, it is compatible with the Closure Compiler in Advanced mode (see link below)!
http://dojo-toolkit.33424.n3.nabble.com/file/n2636749/Using_the_Dojo_Toolkit_with_the_Closure_Compiler.pdf?by-user=t
Even though it's an old question, I stumbled upon the same issue today and found that it can be achieved using CompilerOptions.
I followed this thread.
We run the compiler, from Java, on our server before sending the code to the client. This worked for us in Simple mode.
private String compressWithClosureCompiler(final String code) {
final Compiler compiler = new Compiler();
final CompilerOptions options = new CompilerOptions();
Logger.getLogger("com.google.javascript.jscomp").setLevel(Level.OFF);
if (compressRemovesLogging) {
options.stripNamePrefixes = ImmutableSet.of("logger");
options.stripNameSuffixes = ImmutableSet.of("debug", "dev", "info", "error",
"warn", "startClock", "stopClock", "dir");
}
CompilationLevel.SIMPLE_OPTIMIZATIONS.setOptionsForCompilationLevel(options);
final JSSourceFile extern = JSSourceFile.fromCode("externs.js", "");
final JSSourceFile input = JSSourceFile.fromCode("input.js", code);
compiler.compile(extern, input, options);
return compiler.toSource();
}
It will remove all the calls to logger.debug, logger.dev, etc.
If you're using UglifyJS2, you can use the drop_console argument to remove console.* functions.
I use this in my React apps:
if (process.env.REACT_APP_STAGE === 'PROD')
console.log = function no_console() {};
In other words, console.log will do nothing in the prod environment.
I am with @marcel-korpel. It isn't perfect, but it works: replace the debug instructions before minification. The regular expression works in many places. Watch out for lines that are not terminated by a semicolon.
/console\.[^;]*/gm
Works on:
;;; console.log("Starting process");
console.log("Starting process");
console.dir("Starting process");;;;;
console.log("Starting "+(1+2)+" processes"); iamok('good');
console.log('Message ' +
'with new line'
);
console.group("a");
console.groupEnd();
switch(input){
case 1 : alert('ok'); break;
default: console.warn("Fatal error"); break;
}
Doesn't work on:
console.log("instruction without semicolon")
console.log("semicolon in ; string");
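A quick way to see both the working and the failing cases is to run the pattern through String.prototype.replace. This is only a demonstration of the regular expression above; stripConsoleCalls is a name I made up for the sketch.

```javascript
// The pattern from above: "console." followed by everything up to the next ";".
const pattern = /console\.[^;]*/gm;

function stripConsoleCalls(source) {
  return source.replace(pattern, "");
}

// A properly terminated call is removed, leaving only its semicolon behind.
const terminated = stripConsoleCalls(
  "console.log(\"Starting process\"); iamok('good');"
);
console.log(JSON.stringify(terminated)); // "; iamok('good');"

// A semicolon inside the string ends the match too early and leaves debris.
const inString = stripConsoleCalls('console.log("semicolon in ; string");');
console.log(JSON.stringify(inString)); // "; string\");"

// Without a semicolon, [^;]* also matches the newline and eats the next statement.
const unterminated = stripConsoleCalls(
  'console.log("instruction without semicolon")\nnext();'
);
console.log(JSON.stringify(unterminated)); // ";"
```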
I haven't looked into minification so far, but this behaviour could be accomplished using a simple regular expression:
s/;;;.*//g
This replaces everything in a line after (and including) three semicolons with nothing, so it's discarded before minifying. You can run sed (or a similar tool) before running your minification tool, like this:
sed 's/;;;.*//g' < infile.js > outfile.js
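If the build already runs in Node, the same substitution can be done without sed. A small sketch (stripDebugLines is a hypothetical helper name); since "." does not match line breaks in JavaScript, the replacement stops at the end of each line just like the sed command.

```javascript
// Remove everything from ";;;" to the end of the line, on every line.
function stripDebugLines(source) {
  return source.replace(/;;;.*/g, "");
}

const input = [
  "var x = 1;",
  ';;; console.log("Starting process");',
  'run(x); ;;; console.log("done");',
].join("\n");

console.log(stripDebugLines(input));
```

The emptied lines stay in the output as blank lines; the minifier that runs afterwards takes care of those.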
BTW, if you're wondering whether the packed version or the minified version will be 'better', read this comparison of JavaScript compression methods.
I've used the following self-made stuff:
// Uncomment to enable debug messages
// var debug = true;
function ShowDebugMessage(message) {
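// typeof guards against a ReferenceError while the declaration above is commented out
if (typeof debug !== "undefined" && debug) {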
alert(message);
}
}
So when you've declared the variable debug and set it to true, all ShowDebugMessage() calls will call alert() as well. Just use it in your code and forget about in-place conditions like ifdef or manually commenting out the debug output lines.
I was searching for a built-in option to do this. I have not found that yet, but my favorite answer also does not require any changes to existing source code. Here's an example with basic usage.
Assume HTML file test.html with:
<html>
<script src="hallo.js"></script>
</html>
And hallo.js with:
sayhi();
function sayhi()
{
console.log("hallo, world!");
}
We'll use a separate file, say noconsole.js, having this from the linked answer:
console.log = console.debug = console.info = function(){};
Then we can compile it as follows, bearing in mind that order matters, noconsole.js must be placed first in the arguments:
google-closure-compiler --js noconsole.js hallo.js --js_output_file hallo.out.js
If you cat hallo.out.js you'd see:
console.log=console.debug=console.info=function(){};sayhi();function sayhi(){console.log("hallo, world!")};
And if I test with mv hallo.out.js hallo.js and reload the page, I can see that the console is now empty.
Hope this clarifies it. Note that I have not yet tested this in the ideal mode of compiling all the source code with ADVANCED optimizations, but I'd expect it to also work.
