Safe way to let users register Handlebars helpers in Node.js

I have a Node.js web app that uses Handlebars. Users are asking me to let them register their own Handlebars helpers.
I'm quite hesitant about letting them do it, but I'll give it a go if there is a secure way of doing so.
var Handlebars = require("handlebars");
var fs = require("fs");
var content = fs.readFileSync("template.html", "utf8");
//This helper will be posted by the user
var userHandlebarsHelpers = "Handlebars.registerHelper('foo', function(value) { return 'Foo' + value; });";
//eval(userHandlebarsHelpers); This I do not like! Eval is evil
//Compile handlebars with user submitted Helpers
var template = Handlebars.compile(content);
var handleBarContent = template({ foo: "bar" });
//Save compiled template and some extra code.
Thank you in advance!

Because helpers are just JavaScript code, the only way to safely run arbitrary JavaScript from the outside world on your server is either to run it in an isolated sandbox process or to somehow sanitize the code before you run it.
The former can be done with isolated VMs and external control over the process, but having helper code in an external process is quite a pain, because you then have to develop ways to call it and pass data back and forth.
Sanitizing JavaScript so that it cannot run exploits on your server is a pretty much impossible task when your API set is as large as Node.js's. The browser tightly controls what JavaScript can do in order to keep the underlying system safe. Node.js has none of those safeguards. Someone could put code in one of these helpers to erase the server's entire hard drive, install multiple viruses, or run pretty much any other exploit they wanted to code. So running arbitrary JavaScript will simply not be safe.
Depending upon the exact problems that need to be solved, one can sometimes develop a data-driven approach where, instead of code, the user provides a higher-level set of instructions (map this to that, substitute this with that, replace this with that, display from this set of data, etc.) that is not actually JavaScript, but rather non-executable metadata. That is much more feasible to make safe because you control all the code that acts on this metadata, so you just have to make sure that the code that processes it cannot be tricked into doing something evil.
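As a rough illustration of the data-driven idea, here is a minimal sketch in which users submit JSON specs rather than code. The operation names (prefix, replaceAll, upper) and the spec fields are hypothetical, invented for this example; only operations you have written yourself can ever run.

```javascript
// Hypothetical whitelist of operations; the names are illustrative only.
const OPERATIONS = {
  // Prepend a fixed string to the value.
  prefix: (spec) => (value) => String(spec.text) + String(value),
  // Replace every occurrence of a literal substring (no user-supplied regex).
  replaceAll: (spec) => (value) =>
    String(value).split(String(spec.search)).join(String(spec.replacement)),
  // Upper-case the value.
  upper: () => (value) => String(value).toUpperCase(),
};

// Turn one user-supplied spec (plain JSON, not code) into a helper function.
function buildHelper(spec) {
  if (!Object.prototype.hasOwnProperty.call(OPERATIONS, spec.op)) {
    throw new Error("Unsupported operation: " + spec.op);
  }
  return OPERATIONS[spec.op](spec);
}

// Example metadata as a user might submit it.
const userSpecs = JSON.parse(
  '[{"name":"foo","op":"prefix","text":"Foo"},{"name":"shout","op":"upper"}]'
);

// A Map avoids surprises with special property names such as "__proto__".
const helpers = new Map();
for (const spec of userSpecs) {
  helpers.set(spec.name, buildHelper(spec));
  // With Handlebars loaded, this is where you would call:
  // Handlebars.registerHelper(spec.name, helpers.get(spec.name));
}
```

The trade-off is flexibility: users can only combine the operations you provide, which is exactly what keeps the attack surface small.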

Following @jfriend00's input and after some serious testing, I found a way to do it using the Node.js vm module.
Users will input their helpers with this format:
[[HBHELPER 'customHelper' value]]
value.replace(/[0-9]/g, "");
[[/HBHELPER]]
[[HBHELPER 'modulus' index mod result block]]
if(parseInt(index) % mod === parseInt(result))
block.fn(this);
[[/HBHELPER]]
//This will throw an error when executed "Script execution timed out."
[[HBHELPER 'infiniteLoop' value]]
while(1){}
[[/HBHELPER]]
I translate that block into this and execute it:
Handlebars.registerHelper('customHelper', function(value) {
    //All the code is executed inside the VM
    return vm.runInNewContext('value.replace(/[0-9]/g, "");', {
        value: value
    }, {
        timeout: 1000
    });
});
Handlebars.registerHelper('modulus', function(index, mod, result, block) {
    return vm.runInNewContext('if(parseInt(index) % mod === parseInt(result)) block.fn(this);', {
        index: index,
        mod: mod,
        result: result,
        block: block
    }, {
        timeout: 1000
    });
});
Handlebars.registerHelper('infiniteLoop', function(value) {
    //Error
    return vm.runInNewContext('while(1){}', {
        value: value
    }, {
        timeout: 1000
    });
});
I have run multiple tests so far, trying to delete files, require modules, and run infinite loops. Everything is going perfectly: all of those operations failed.
Running the Handlebars helper callback function in a VM is what made this work for me, because my main problem with running the whole code inside a VM was adding those helpers to my global Handlebars object.
I'll update this if I find a way to exploit it.
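The translation step from the [[HBHELPER]] format to a sandboxed function is not shown in this answer, so the following is only a sketch of how it might look. The names parseHelperBlocks and makeSandboxedHelper and the regular expression are assumptions made for illustration. Keep in mind that the Node.js documentation describes the vm module as not being a security mechanism for untrusted code, so the timeout and the fresh context should be treated as a convenience rather than a hard boundary.

```javascript
const vm = require("vm");

// Matches: [[HBHELPER 'name' param1 param2]] body [[/HBHELPER]]
const HELPER_BLOCK =
  /\[\[HBHELPER\s+'([^']+)'([^\]]*)\]\]([\s\S]*?)\[\[\/HBHELPER\]\]/g;

// Extract the name, parameter list and body of every helper block.
function parseHelperBlocks(source) {
  const helpers = [];
  for (const match of source.matchAll(HELPER_BLOCK)) {
    helpers.push({
      name: match[1],
      params: match[2].trim().split(/\s+/).filter(Boolean),
      body: match[3].trim(),
    });
  }
  return helpers;
}

// Build a function that runs the body in a fresh context on each call.
function makeSandboxedHelper(helper, timeoutMs) {
  return function (...args) {
    const context = {};
    helper.params.forEach((param, i) => {
      context[param] = args[i];
    });
    // The value of the last evaluated statement becomes the return value.
    return vm.runInNewContext(helper.body, context, { timeout: timeoutMs });
  };
}

const sample = [
  "[[HBHELPER 'customHelper' value]]",
  'value.replace(/[0-9]/g, "");',
  "[[/HBHELPER]]",
].join("\n");

const [customHelper] = parseHelperBlocks(sample);
const stripDigits = makeSandboxedHelper(customHelper, 1000);
const cleaned = stripDigits("a1b2c3"); // "abc"
// With Handlebars loaded: Handlebars.registerHelper(customHelper.name, stripDigits);
```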

Related

Rewrite browser JS code to transform global definitions into window properties

I support a very old PHP web framework that uses server-side rendering. I decided to implement Vue for the rendering of some modules, so I compiled a hello world app and realized deployment wouldn't be so simple.
The framework works as a giant SPA, with each module being rendered using the html output of a body() function. The output is replaced in the client's DOM without reloading the page itself.
<script> tags are banned for security reasons and will be sanitized from the resulting html. The only way to deliver JS to the client is by using an eval_js() function.
The problem is rather simple. I need to safely load JS code several times in the same DOM. I cannot load it as-is after app compilation, because from the second time onwards the code is executed (every time a user visits a module, or performs an action) the code will attempt to re-define global variables and kill the whole client.
The solution is also rather simple, just rewrite the JS code such that every global definition is transformed into a window property. This way, even if the same piece of code gets executed several times in the same DOM, it will simply replace window properties rather than attempting to re-define variables.
For example, the following input:
function Yr(t){
const b = t.prototype.hasOwnProperty;
this._init(b);
}
var hOe = sg(uOe, fOe, dOe, !1, null, "e687eb20", null, null);
const vOe = {
name: "AmmFilters",
components: {
AmmOptionSelect: pOe
}
};
new Yr({...}).$mount("#app");
Would be rewritten into:
window.Yr = function(t){
const b = t.prototype.hasOwnProperty;
this._init(b);
}
window.hOe = sg(window.uOe, window.fOe, window.dOe, !1, null, "e687eb20", null, null);
window.vOe = {
name: "AmmFilters",
components: {
AmmOptionSelect: window.pOe
}
}
new window.Yr({...}).$mount("#app");
I initially considered writing my own parser, but then realized that ES6+ syntax is no child's play. The code I will attempt to rewrite is optimized and obfuscated, which means it will contain all sorts of complex syntax, and I must be careful not to turn scoped definitions into window properties.
Any ideas on a tool that already performs this task? The resulting JS code should have no difference from the original, as global scoped variables end up in the window object anyway.
I believe it would be a fairly useful tool for various use cases, so thought about asking before attempting to reinvent the wheel.

Node.js design: multiple async functions writing to database using function passed as a closure

I am writing a standalone web scraper in Node, run from the command line, which looks for specific data on a set of pages, fetches page view data from Google Analytics and saves it all in a MySQL database. Almost everything is ready, but today I found a problem with the way I write data to the db.
To make things easier, let's assume I have an index.js file and two controllers, db and web. Db reads/writes data to the db; web scrapes the pages using a configurable number of PhantomJs instances.
Web exposes one function checkTargetUrls(urls, writer)
where urls is an array with urls to be checked and writer is an optional parameter, called only if it is a function and there is data to be written.
Now the way I pass the writer is obviously wrong, but looks as follows (in index.js):
some code here
....
let pageId = 0;
... some promises code,
which checks validity of urls,
creates new execution in the database, etc.
...
.then(urls => {
    return web.checkTargetUrls(urls,
        function(singleUrl, pageData) {
            ...
            a chain of promisable functions from db controller,
            which first looks up the page id in the db, then
            puts it in the pageId variable and continues with the write to db
            ...
        });
}).then(() => {
    logger.info('All done captain!');
}).catch(err => { logger.error(err); });
As a result, pageId is randomly overwritten by the id of the preceding or succeeding page and invalid data is saved. Inside web there are up to 10 concurrent instances of PhantomJs running, each of which calls the writer function after it has analyzed a page. Excuse my language, but to me an analogy would be having, say, 10 instances of some object that all rely on a singleton for writing, which causes the pageId overwriting problem (I don't know how to express this properly in JS/Node.js terms).
So far I have found one fix to the problem, but it is ugly as it introduces tight coupling. If I put the writer code in a separate module and then load it directly from inside the web controller, all works great. But to me that is a bad design pattern and I would rather do it otherwise.
var writer = require('./writer');
function checkTargetUrls(urls, executionId) {
return new Promise(
function(resolve, reject) {
let poolSize = config.phantomJs.concurrentInstances;
let running = 0;
....
a bit of code goes here
....
if (slots != undefined && slots != null && slots.data.length > 0) {
return writer.write(executionId, singleUrl, slots);
}
...
more code follows
})
}
I have a hard time finding a nicer solution where I could still pass writer as an argument to the checkTargetUrls(urls, writer) function. Can anyone point me in the right direction or suggest where to look for the answer?

The exact problem around your global pageId is not entirely clear to me, but you could reduce coupling by exposing a setWriter function from your 'web' controller.
var writer;
module.exports.setWriter = function(_writer) { writer = _writer };
Then near the top of your index.js, something like:
var web = require('./web');
web.setWriter(require('./writer'));
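One possible reading of the pageId problem is that the writer callback stores each looked-up id in a single variable shared by all concurrent calls. If that is the case, keeping the id inside each call's promise chain would avoid the overwriting while still letting the writer be passed as an argument. The sketch below assumes this; fakeDb, lookupPageId, write and makeWriter are hypothetical stand-ins, not the asker's real db controller.

```javascript
// Hypothetical stand-in for the db controller; the real one would query MySQL.
const fakeDb = {
  written: [],
  lookupPageId(url) {
    // Resolve after a random delay to mimic concurrent, out-of-order lookups.
    const delay = Math.floor(Math.random() * 20);
    return new Promise((resolve) =>
      setTimeout(() => resolve("id:" + url), delay)
    );
  },
  write(pageId, url, pageData) {
    this.written.push({ pageId, url, pageData });
    return Promise.resolve();
  },
};

// The writer keeps pageId inside its own promise chain instead of a shared
// outer variable, so concurrent calls cannot overwrite each other's value.
function makeWriter(db) {
  return function writer(singleUrl, pageData) {
    return db
      .lookupPageId(singleUrl)
      .then((pageId) => db.write(pageId, singleUrl, pageData));
  };
}

// index.js would pass this function to web.checkTargetUrls(urls, writer).
const writer = makeWriter(fakeDb);
const done = Promise.all(
  ["/a", "/b", "/c"].map((url) => writer(url, { views: 1 }))
);
```

Because each invocation has its own pageId parameter, the order in which the lookups finish no longer matters.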

NodeJS, SocketIO and Express logic context build

I have read a lot about Express and Socket.IO, and it is surprising how rarely the examples go beyond a "Hello" sent directly from app.js. Real applications don't work like that. I'm stuck on a design problem that none of the material I found addresses, so I'm hoping that asking here will lead to a solution. :)
I'm refactoring my app (it had many mistakes, such as putting libraries in the global scope). Let's say I've got a large system based on Socket.IO and Node.js. A loader in app.js starts the socket system.
When someone joins the app, it require()s another module, which initializes many socket.on() listeners. These are loaded dynamically from the /*_socket.js files in a folder. Each function in those modules represents a socket listener, which makes it much easier to call from the front end. A call might look like this:
// Will call `user_socket.js` and method `try_to_signin(some params)`
Queries.emit_socket('user.try_to_signin', {some params});
The system itself works really well. But there's a catch: the module that loads all those files and interprets what the front end has sent must also pass along libraries tied to req/res (sessions, cookies, and others), because the called methods are the core of the app and very often need those libraries.
In the previous example we obviously need to check if the user isn't already logged-in.
// The *_socket.js file looks like this :
var $h = require(__ROOT__ + '/api/helpers');
module.exports = function($s, $w) {
    var user_process = require(__ROOT__ + '/api/processes/user_process')($s, $w);
    return {
        my_method_called: function(reference, params, callback) {
            // Stuff using $s, $w, etc.
        }
    };
};
// And it's called this way :
// $s = services (a big object)
// $w = workers (a big object depending on $s)
// They are linked with the req/res from the page when they are instantiated
controller_instance = require('../sockets/'+ controller_name +'_socket')($s, $w);
// After some processes ...
socket_io.on(socket_listener, function (datas, callback) {
// Will call the correct function, etc.
$w.queries.handle_socket($w, controller_name, method_name, datas);
});
The good news : basically, it works.
The bad news: every time I refresh the page, the listeners are registered again, because they are set up in a loop that runs on page load.
Below, this should have been one line :
So I should move all the socket.on('connection'...) code out of the page load, which means running it when the server starts. But I also need the req/res data to be able to load the libraries, and I only get that when the page is loaded!
It's a program design problem. I know I did something wrong, but I don't know where to go from here. I have this big system that "basically" works, but there's a paradox in the way I built it and I can't figure out how to resolve it. I've been stuck for a couple of hours.
How can I refactor so that the libraries that depend on the current req/res are available within a socket.on() call? Is there a trick? Should I completely change the way I did it?
Also, is there another way to do what I want to do ?
Thank you everyone !
NOTE : If I didn't explain well or if you want more code, just tell me :)
EDIT - SOLUTION: As seen in the answer, we can use sockets.once() instead of sockets.on(); there's also the sockets.removeAllListeners() solution, which is less clean.

Try the following.
io.sockets.once('connection', function(socket) {
io.sockets.emit('new-data', {
channel: 'stdout',
value: data
});
});
Use once instead of on.
This problem is similar to the one described at the following link.
https://stackoverflow.com/questions/25601064/multiple-socket-io-connections-on-page-refresh/25601075#25601075
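To illustrate why the listeners pile up and what once changes, here is a small sketch that uses Node's built-in EventEmitter as a stand-in for the Socket.IO server object (an assumption made so the example runs without Socket.IO installed). The onPageLoad function is hypothetical. Note that a once listener is removed after its first event, so it will not fire for later events.

```javascript
const { EventEmitter } = require("events");

// Stand-in for the Socket.IO server object; only listener bookkeeping matters here.
const io = new EventEmitter();

// Hypothetical page-load handler that registers a listener on every call.
function onPageLoad() {
  io.on("connection", function handleConnection() {});
}

onPageLoad();
onPageLoad(); // a page refresh registers a second, identical listener
const listenersAfterRefresh = io.listenerCount("connection"); // 2

// once() removes the listener after its first event.
io.removeAllListeners("connection");
let calls = 0;
io.once("connection", () => {
  calls += 1;
});
io.emit("connection");
io.emit("connection"); // no listener is left, so nothing runs
const listenersAfterOnce = io.listenerCount("connection"); // 0
```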

Node.js changing exports on the fly

changing exports.X in a function seems to not work...
I want to be able to load settings from a file & access them in Node.js. I have this currently, however, the clients connecting to my node application can edit what's in the settings file. Unfortunately as it stands the Node application has to be restarted for the changes to take effect. Is there a way I can reload the module.exports on the fly?
EDIT:
Settings file is literally a JSON string.
My settings module is 'required' in almost every single file, and there's a lot of files... So reloading it per-file basis is out of the question. I do, however, know precisely when someone makes a change to the settings.

If you are using require to load the settings and only referencing the settings from one module, then doing something along the lines of:
delete require.cache[require.resolve(filename)];
will work for you.
If, on the other hand, multiple modules will be referencing these settings, that approach can become a bit unwieldy and open you up to unforeseen bugs. For example, if any of the modules are holding on to a reference to the required settings file, they would each need to somehow learn that the settings had changed and update their references.
To alleviate (not completely solve) the caching issue, you can build your settings interface so that users must access the settings object via a function, require that individual properties are accessed via functions, or both. Even with this model, someone may still decide to cache a setting, causing an obscure failure later down the road.
Using the simplest approach of a single getter for the settings object would look something like this:
var settings = require('./settings.json');
// ... watch for changes and reload by invalidating node's cache
module.exports = function() { return settings; }
Usage:
var settings = require('./path/to/settings');
settings().foo;
There are several libraries that do settings. Depending on your needs, I'm partial to nconf.
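Since the asker knows exactly when the settings change, the getter approach can be paired with an explicit reload call instead of a watcher. The following is only a sketch under that assumption: the temporary file path and the names getSettings and reloadSettings are made up for illustration, and the file is read with fs rather than require so that Node's module cache is not involved.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical settings file, created here only so the sketch runs on its own.
const settingsPath = path.join(
  os.tmpdir(),
  "settings-sketch-" + process.pid + ".json"
);
fs.writeFileSync(settingsPath, JSON.stringify({ foo: 1 }));

let current = JSON.parse(fs.readFileSync(settingsPath, "utf8"));

// Getter: callers always go through this function, so they see the latest object.
function getSettings() {
  return current;
}

// Call this at the moment a client has saved new settings.
function reloadSettings() {
  current = JSON.parse(fs.readFileSync(settingsPath, "utf8"));
  return current;
}

// In a real module you might export both:
// module.exports = getSettings; module.exports.reload = reloadSettings;

fs.writeFileSync(settingsPath, JSON.stringify({ foo: 2 }));
const before = getSettings().foo; // still the old value
reloadSettings();
const after = getSettings().foo; // the new value
fs.unlinkSync(settingsPath);
```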

I'd set up a file watcher here that checks for changes of a JSON file dynamically. It is not recommended practice to change a JS script once the app is running.
Something like:
var _ = require("lodash");
var fs = require("fs");
var result = {};
fs.watch('my-settings.json',function(event,filename){
    // read the known path; the filename argument is not guaranteed to be provided
    fs.readFile('my-settings.json',function(err,data){
        if(err){
            // your error catching
            return;
        }
        _.extend(result,JSON.parse(data));
    });
});
module.exports = result;
Now, this comes with lots of caveats. First, fs.watch does not behave consistently across all platforms.
http://nodejs.org/api/fs.html#fs_fs_watch_filename_options_listener
Second, it's really awkward to change a property like this. The expectation is generally that a module's exports do not mutate. I'd instead recommend exposing a method whose result can change based on the state of the file, i.e. a getter for the resulting data.
Third, a file watcher can be expensive, memory-wise.
This is better code, IMHO:
var fs = require("fs");
var filename = 'my-settings.json';
var lastModified;
var mySetting;
module.exports = {
    getSettingAsync : function (callback) {
        fs.stat(filename,function(err,stat){
            if(err){
                // your error catching
                return;
            }
            // compare timestamps as numbers; two Date objects are never == to each other
            if(stat.mtime.getTime() === lastModified) {
                callback(mySetting);
            } else {
                fs.readFile(filename,function(err,data){
                    if(err){
                        // your error catching
                        return;
                    }
                    // this assumes that your data is always correct
                    mySetting = JSON.parse(data).mySetting;
                    // remember which version of the file is cached
                    lastModified = stat.mtime.getTime();
                    callback(mySetting);
                });
            }
        });
    }
};
In this case, we both check for a JSON file, and expose this as an async method. You could just as easily change the code to use the sync versions if need be and return the value instead of invoking the callback. This version checks when the file was changed, which is cheaper than reading the whole file every time, reads the file if newer and saves you the need to use a potentially buggy file watcher.
By the way, I've not tested this code and it may contain errors as is, but the concept is sound.
But, perhaps the more salient question, why not just store that value in the database?
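For the synchronous variant mentioned above, a minimal sketch might look like the following. The temporary file and the reads counter exist only to make the example self-contained and observable; getSettingSync is a hypothetical name.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical file, created here so the sketch is self-contained.
const filename = path.join(
  os.tmpdir(),
  "my-settings-sketch-" + process.pid + ".json"
);
fs.writeFileSync(filename, JSON.stringify({ mySetting: "first" }));

let lastModifiedMs = null; // mtime of the version currently cached
let mySetting;
let reads = 0; // counts how often the file was actually read

function getSettingSync() {
  const mtimeMs = fs.statSync(filename).mtimeMs;
  if (mtimeMs !== lastModifiedMs) {
    // The file is new or has changed: re-read and remember its timestamp.
    mySetting = JSON.parse(fs.readFileSync(filename, "utf8")).mySetting;
    lastModifiedMs = mtimeMs;
    reads += 1;
  }
  return mySetting;
}

const first = getSettingSync(); // reads the file
const second = getSettingSync(); // served from the cached value
```

The synchronous calls block the event loop for the duration of the stat and read, which is usually acceptable for a small settings file but worth keeping in mind on a busy server.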

Better JavaScript Organisation and Execution The Unobtrusive Way - Self-Executing Anonymous Func

I'm slowly getting a better understanding of JavaScript but I'm stuck on how best to tackle this particular organization/execution scenario.
I come from a C# background and am used to working with namespaces so I've been reading up on how to achieve this with JavaScript. I've taken what was already starting to become a large JavaScript file and split it out into more logical parts.
I've decided on a single file per page for page specific JavaScript with anything common to two or more pages, like reusable utility functions, in another namespace and file.
This makes sense to me at the moment and seems to be a popular choice, at least during the development process. I'm going to use a bundling tool to combine these disparate files for deployment to production anyway so anything that makes development more logical and easier to find code the better.
As a result of my inexperience in dealing with lots of custom JavaScript I had a function defined in the common JavaScript file like this:
common.js
$(document).ready(function () {
var historyUrl = '/history/GetHistory/';
$.getJSON(historyUrl, null, function (data) {
$.each(data, function (index, d) {
$('#history-list').append('<li>' + d.Text + '</li>');
});
});
});
This is obviously far from ideal as it is specific to a single page in the application but was being executed on every page request which is utterly pointless and insanely inefficient if not outright stupid. So that led me to start reading up on namespaces first.
After a bit of a read I have now moved this to a page specific file and re-written it like this:
Moved from common.js to historyPage.js
(function(historyPage, $, undefined) {
historyPage.GetHistory = function () {
var historyUrl = '/history/GetHistory/';
$.getJSON(historyUrl, null, function (data) {
$.each(data, function (index, d) {
$('#history-list').append('<li>' + d.Text + '</li>');
});
});
};
}( window.historyPage = window.historyPage || {}, jQuery ));
I found this pattern on the jQuery Enterprise page. I'm not going to pretend to fully understand it yet, but it seems to be a very popular and flexible way of organizing and executing JavaScript with various different scopes whilst keeping things out of the global scope.
However what I'm now struggling with is how to properly make use of this pattern from an execution point of view. I'm also trying to keep any JavaScript out of my HTML Razor views and work in an unobtrusive way.
So how would I now call the historyPage.GetHistory function only when it should actually execute, i.e. only when a user navigates to the History page on the web site and the results of the function are required?

From looking at the code, it would seem that the easiest test would be to check if the page you are on contains an element with an id of history-list. Something like this:
var $histList = $('#history-list');
if($histList.length > 0){
// EXECUTE THE CODE
}
Though if it really only ever needs to run on one given page, maybe it's just not a good candidate for a shared javascript file.
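For readers unsure how the namespace pattern from the question behaves, here is a small sketch without jQuery. The root variable stands in for window so that the code also runs outside a browser, and getUrl and describe are hypothetical members added for illustration.

```javascript
// `root` stands in for `window` so the sketch also runs outside a browser.
const root = typeof window !== "undefined" ? window : globalThis;

(function (historyPage, undefined) {
  // Private: visible only inside this function.
  var historyUrl = "/history/GetHistory/";

  // Public: attached to the namespace object that was passed in.
  historyPage.getUrl = function () {
    return historyUrl;
  };
}(root.historyPage = root.historyPage || {}));

// A second file can extend the same namespace; `|| {}` reuses the existing object.
(function (historyPage) {
  historyPage.describe = function () {
    return "History is loaded from " + historyPage.getUrl();
  };
}(root.historyPage = root.historyPage || {}));
```

Nothing runs against the page until a member such as historyPage.GetHistory is called explicitly, which is why a per-page trigger like the element check above is still needed.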

Using the code I have detailed above in the question I have gotten it working by doing the following:
In _Layout.cshtml
@if (IsSectionDefined("History"))
{
<script type="text/javascript">
$(document).ready(function () {
@RenderSection("History", required: false)
});
</script>
}
In History.cshtml
@section History
{
historyPage.GetHistory();
}
The code is executing as required only when the user requests the History page on the web site. However, the comment from @Dagg Nabbit above has thrown me a curve ball, in that I thought I was on the right track ... Hmm ...
