NodeJS, SocketIO and Express logic context build - javascript

I have read a lot about Express / SocketIO, and it's crazy how rarely you find any example other than a "Hello" emitted directly from app.js. The problem is that it doesn't work like that in the real world... I'm stuck on a logic problem that seems far removed from anything the web gives me, which is why I wanted to ask here; I'm sure asking will lead to the solution! :)
I'm refactoring my app (because there were many mistakes, like using the global scope to hold libraries, etc.). Let's say I've got a huge system based on SocketIO and NodeJS. There's a loader in app.js which starts the socket system.
When someone joins the app, it require()s another module: this initializes many socket.on() listeners, which are loaded dynamically from the *_socket.js files in a folder. Each function in those modules represents a socket listener, so it's much easier to call from the front-end. It might look like this:
// Will call `user_socket.js` and method `try_to_signin(some params)`
Queries.emit_socket('user.try_to_signin', {some params});
The system itself works really well. But there's a catch: the module that loads all those files and interprets what the front-end has sent also passes along libraries tied to req/res (sessions, cookies, others...), and it must do so, because the called methods are the core of the app and very often need those libraries.
In the previous example we obviously need to check that the user isn't already logged in.
// The *_socket.js file looks like this :
var $h = require(__ROOT__ + '/api/helpers');
module.exports = function($s, $w) {
    var user_process = require(__ROOT__ + '/api/processes/user_process')($s, $w);
    return {
        my_method_called: function(reference, params, callback) {
            // Stuff using $s, $w, etc.
        }
    };
};
// And it's called this way :
// $s = services (a big object)
// $w = workers (a big object depending on $s)
// They are linked with the req/res from the page when they are instantiated
controller_instance = require('../sockets/'+ controller_name +'_socket')($s, $w);
// After some processes ...
socket_io.on(socket_listener, function (datas, callback) {
// Will call the correct function, etc.
$w.queries.handle_socket($w, controller_name, method_name, datas);
});
The good news : basically, it works.
The bad news: every time I refresh the page, the listeners are registered again and double up, because they are set up in a loop that runs on every page load.
Below, this should have been one line :
So I should move all the socket.on('connection'...) setup out of the page load, which means doing it when the server starts... Yes, but I also need the req/res data to be able to load the libraries, and I only get that when the page is loaded!
It's a programming logic problem. I know I did something wrong, but I don't know where to go from here. I've got this big system which "basically" works, but there's a kind of paradox in the way I built it and I can't figure out how to resolve it... I've been stuck for a couple of hours.
How can I refactor so that the libraries depending on the current req/res are available within a socket.on() call? Is there a trick? Should I think about completely changing the way I did it?
Also, is there another way to do what I want to do ?
Thank you everyone !
NOTE : If I didn't explain well or if you want more code, just tell me :)
EDIT - SOLUTION : As seen in the answer below, we can use sockets.once() instead of sockets.on(); there's also the sockets.removeAllListeners() solution, which is less clean.

Try the following:
io.sockets.once('connection', function(socket) {
io.sockets.emit('new-data', {
channel: 'stdout',
value: data
});
});
Use once instead of on.
This problem is similar to the one described in the following link.
https://stackoverflow.com/questions/25601064/multiple-socket-io-connections-on-page-refresh/25601075#25601075
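To see why the listeners pile up, here is a minimal sketch that uses Node's built-in EventEmitter as a stand-in for the socket object. This is an assumption for illustration only: it does not use Socket.IO itself, and `onPageLoad`, `registerOnce` and `handler` are hypothetical names. Calling on() on every page load stacks one more copy of the handler each time, whereas a guard that registers only when nothing is registered yet keeps a single copy.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a long-lived socket object (hypothetical, not the Socket.IO API itself).
const socket = new EventEmitter();
let calls = 0;
const handler = () => { calls += 1; };

// Problem: every simulated page load registers the same listener again.
function onPageLoad() {
  socket.on('user.try_to_signin', handler);
}
onPageLoad();
onPageLoad();
onPageLoad();
const duplicated = socket.listenerCount('user.try_to_signin');

// One emit now runs the handler once per registered copy.
socket.emit('user.try_to_signin');
const callsWithDuplicates = calls;

// One way out: register only if this event has no listener yet.
socket.removeAllListeners('user.try_to_signin');
function registerOnce(eventName, fn) {
  if (socket.listenerCount(eventName) === 0) {
    socket.on(eventName, fn);
  }
}
registerOnce('user.try_to_signin', handler);
registerOnce('user.try_to_signin', handler);
const guarded = socket.listenerCount('user.try_to_signin');
```

Keep in mind that once() runs its handler a single time and then removes it, so it suits one-shot setup better than listeners that must keep responding to later events.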

Related

Attempting to Import Module in Child Process (Javascript) and Failing

I'm currently running a heavy computation (i.e. generating a Monte Carlo tree), which is an expensive operation. I only have a few seconds to build as big of a tree as I can, so I am using subprocesses in Node.js in order to build multiple trees, and then aggregate their data together to make a more informed decision.
I understand that subprocesses do not share information/memory, and I need to use modules within these subprocesses that are located in a file, called "Epilog.js" on my machine.
When I run functions that are in epilog.js from the main file, it works just fine. But all of my functions that are in my worker threads return absolutely nothing.
I have tested to make sure that the parameters of the functions I am trying to use in "epilog.js" aren't empty, and they're not. The problem isn't in the parameter.
I have also tested to see what happens if I simply don't import, and instead of just outputting an undefined array, I get an error saying that there is no function called "findroles".
//My main thread.
var fs = require('fs');
eval(fs.readFileSync('epilog.js') + '');
var process = fork('./buildGraph.js');
process.send({library});
//My worker thread.
//buildGraph.js
var fs = require('fs');
eval(fs.readFileSync('epilog.js') + '');
// receive message from master process
process.on('message', async(message) => {
library = message["library"];
console.log(findroles(library));
// findroles(library) is a function that is defined in epilog.js,
//and this outputs an array of "roles" given a parameter,library.
// For some reason this function outputs [], rather than giving me
// all of the roles. If I run this exact line from my main thread,
// it doesn't give any errors and outputs the right array:
// e.g. ['red', 'white'].
});
I expect to get not the empty array, but [red, white], as I do if I were to run the same line in the main thread. Does anyone have an idea as to the inconsistency of the functions? I'm very new to node.js and this isn't a class focused too much on software engineering in JavaScript, so I'd appreciate if someone can dumb down what is going on, as this is all very new to me.
If your script does not find the function called findroles then there is a problem with the importing method. Using the eval function for importing is not the normal way of importing modules. Try something like this:
// buildGraph.js
const epilog = require("./epilog.js");
......
console.log(epilog.findroles(library));
Then, in epilog.js:
exports.findroles = function (library) {
// function content
}
You can find more info here:
https://www.w3schools.com/nodejs/nodejs_modules.asp
Based on the documentation and the example here, everything seems correct, but I think the problem comes from this line:
var process = fork('./buildGraph.js');
You might be overriding the original process object.
Try changing it to:
const n = fork('./buildGraph.js');
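The shadowing this answer warns about can be shown without spawning anything. The sketch below is only an illustration under stated assumptions: the `{ send() {} }` objects are hypothetical stand-ins for whatever fork() returns, and the two function names are made up. A variable named `process` hides Node's global of the same name for the whole scope it is declared in; any other name leaves the global reachable.

```javascript
// Shadowing: a local variable named `process` hides Node's global inside its scope.
function withShadowing() {
  var process = { send() {} };   // hypothetical stand-in for what fork() returns
  return typeof process.argv;    // the global's argv is no longer reachable here
}

// A different name leaves the global `process` untouched.
function withoutShadowing() {
  const child = { send() {} };   // same stand-in, different name
  child.send({ library: {} });
  return typeof process.argv;    // still reads from the real global process
}

const shadowedType = withShadowing();
const globalType = withoutShadowing();
```

In the question's main file the declaration sits at the top level of the module, so the shadowing would apply to the entire file rather than to one function.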

Node.js design: multiple async functions writing to database using function passed as a closure

I am writing a standalone web scraper in Node, run from the command line, which looks for specific data on a set of pages, fetches page view data from Google Analytics and saves it all in a MySQL database. Almost everything is ready, but today I found a problem with the way I write data to the db.
To make things easier, let's assume I have an index.js file and two controllers, db and web. Db reads/writes data to the db; web scrapes the pages using a configurable number of PhantomJs instances.
Web exposes one function checkTargetUrls(urls, writer)
where urls is an array with urls to be checked and writer is an optional parameter, called only if it is a function and there is data to be written.
Now the way I pass the writer is obviously wrong, but looks as follows (in index.js):
// some code here
// ....
let pageId = 0;
// ... some promises code,
// which checks validity of urls,
// creates new execution in the database, etc.
// ...
.then(urls => {
    return web.checkTargetUrls(urls,
        function(singleUrl, pageData) {
            // ...
            // a chain of promisable functions from db controller,
            // which first looks up the page id in the db, then
            // puts it in the pageId variable and continues with write to db
            // ...
        });
}).then(() => {
    logger.info('All done captain!');
}).catch(err => { logger.error(err); });
As a result, pageId randomly gets overwritten by the id of the preceding or succeeding page, and invalid data is saved. Inside web there are up to 10 concurrent instances of PhantomJs running, which call the writer function after they have analyzed a page. Excuse my language, but to me an analogy for the situation would be having, say, 10 instances of some object that all rely on a singleton for writing, which causes the pageId overwriting problem (I don't know how to express this properly in JS/Node.js terms).
So far I have found one fix for the problem, but it is ugly as it introduces tight coupling. If I put the writer code in a separate module and then load it directly from inside the web controller, all works great. But to me it is a bad design pattern and I would rather do it otherwise.
var writer = require('./writer');
function checkTargetUrls(urls, executionId) {
return new Promise(
function(resolve, reject) {
let poolSize = config.phantomJs.concurrentInstances;
let running = 0;
....
a bit of code goes here
....
if (slots != undefined && slots != null && slots.data.length > 0) {
return writer.write(executionId, singleUrl, slots);
}
...
more code follows
})
}
I have a hard time finding a nicer solution, where I could still pass writer as an argument to the checkTargetUrls(urls, writer) function. Can anyone point me in the right direction or suggest where to look for the answer?
The exact problem around your global pageId is not entirely clear to me but you could reduce coupling by exposing a setWriter function from your 'web' controller.
var writer;
module.exports.setWriter = function(_writer) { writer = _writer };
Then near the top of your index.js, something like:
var web = require('./web');
web.setWriter(require('./writer'));
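One plausible reading of the pageId problem (an assumption, since the question elides the real code) is that every concurrent writer call shares the single outer `pageId` variable. The sketch below simulates that with a hand-rolled queue in place of real database calls; `pageIds`, `later`, `flush` and both `make...Writer` functions are hypothetical names. Keeping the id in a variable local to each call avoids the overwrite while still letting the writer be passed as an argument.

```javascript
// Hypothetical stand-ins: a page-id table and a queue that replays "async" completions in order.
const pageIds = { '/a': 1, '/b': 2 };
const pending = [];
const later = (fn) => pending.push(fn);
function flush() {
  while (pending.length > 0) {
    pending.shift()();
  }
}

// Shared state: one pageId variable for every call, as in the question.
function makeSharedWriter(saved) {
  let pageId = 0;
  return function writer(singleUrl) {
    later(() => {                       // the id lookup finishes...
      pageId = pageIds[singleUrl];
      later(() => saved.push({ url: singleUrl, pageId })); // ...and the write happens later
    });
  };
}

// Per-call state: each invocation keeps its own pageId.
function makeLocalWriter(saved) {
  return function writer(singleUrl) {
    later(() => {
      const pageId = pageIds[singleUrl];
      later(() => saved.push({ url: singleUrl, pageId }));
    });
  };
}

// Start two overlapping writes, then let all queued steps complete.
function run(makeWriter) {
  const saved = [];
  const writer = makeWriter(saved);
  writer('/a');
  writer('/b');
  flush();
  return saved;
}

const sharedResult = run(makeSharedWriter); // '/a' ends up saved with the id of '/b'
const localResult = run(makeLocalWriter);   // each url keeps its own id
```

With real promises the same idea applies: declare the id inside the writer function (or pass it along the promise chain) instead of in the enclosing scope.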

Safe way to let users register Handlebars helpers in nodejs

I have a node js web app that is using handlebars. Users are asking me to let them register their own handlebars helpers.
I'm quite hesitant about letting them do it... but I'll give it a go if there is a secure way of doing so.
var Handlebars = require("handlebars");
var fs = require("fs");
var content = fs.readFileSync("template.html", "utf8");
//This helper will be posted by the user
var userHandlebarsHelpers = "Handlebars.registerHelper('foo', function(value) { return 'Foo' + value; });"
//eval(userHandlebarsHelpers); This I do not like! Eval is evil
//Compile handlebars with user submitted Helpers
var template = Handlebars.compile(content);
var handleBarContent = template({ foo: 'bar' });
//Save compiled template and some extra code.
Thank you in advance!
Because helpers are just Javascript code, the only way you could safely run arbitrary Javascript from the outside world on your server is if you either ran it in an isolated sandbox process or you somehow sanitized the code before you ran it.
The former can be done with isolated VMs and external control over the process, but that makes it quite a pain to have helper code in some external process as you now have to develop ways to even call it and pass data back and forth.
Sanitizing Javascript to be safe from running exploits on your server is a pretty much impossible task when your API set is as large as node.js. The browser has a very tightly controlled set of things that Javascript can do to keep the underlying system safe from what browser Javascript can do. node.js has none of those safeguards. You could put code in one of these helpers to erase the entire hard drive of the server or install multiple viruses or pretty much whatever evil exploit you wanted to code. So, running arbitrary Javascript will simply not be safe.
Depending upon the exact problems that need to be solved, one can sometimes develop a data-driven approach where, instead of code, the user provides some higher-level set of instructions (map this to that, substitute this with that, replace this with that, display from this set of data, etc...) that is not actually Javascript, but rather some non-executable metadata. That is much more feasible to make safe because you control all the code that acts on this metadata, so you just have to make sure that the code that processes the metadata isn't capable of being tricked into doing something evil.
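A minimal sketch of what such a data-driven approach might look like follows. Everything in it is an assumption made for illustration: the operation names (`uppercase`, `prefix`, `truncate`), the shape of the user-submitted JSON and the `buildHelper` function are hypothetical. The user submits plain data, and the server maps it onto a small whitelist of operations it implements itself.

```javascript
// Whitelisted operations the server implements itself (all names are hypothetical).
const OPERATIONS = {
  uppercase: (value) => String(value).toUpperCase(),
  prefix: (value, options) => String(options.text) + String(value),
  truncate: (value, options) => String(value).slice(0, Number(options.length)),
};

// Turn a user-supplied description (plain data, e.g. parsed JSON) into a helper function.
function buildHelper(spec) {
  if (!Object.prototype.hasOwnProperty.call(OPERATIONS, spec.op)) {
    throw new Error('Unknown operation: ' + spec.op);
  }
  const operation = OPERATIONS[spec.op];
  const options = spec.options || {};
  return (value) => operation(value, options);
}

// What a user might submit instead of JavaScript source.
const userSpecs = JSON.parse(
  '[{"name":"foo","op":"prefix","options":{"text":"Foo"}},' +
  '{"name":"shout","op":"uppercase"}]'
);

// Collect the generated helpers by name (no prototype, so odd names cannot clash with it).
const helpers = Object.create(null);
for (const spec of userSpecs) {
  helpers[spec.name] = buildHelper(spec);
}
```

Each generated function could then be handed to Handlebars.registerHelper(name, fn); no user-written code is ever evaluated, only the server's own operations.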
Following @jfriend00's input, and after some serious testing, I found a way to do it using the nodejs vm module.
Users will input their helpers with this format:
[[HBHELPER 'customHelper' value]]
value.replace(/[0-9]/g, "");
[[/HBHELPER]]
[[HBHELPER 'modulus' index mod result block]]
if(parseInt(index) % mod === parseInt(result))
block.fn(this);
[[/HBHELPER]]
//This will throw an error when executed "Script execution timed out."
[[HBHELPER 'infiniteLoop' value]]
while(1){}
[[/HBHELPER]]
I translate that block into this and execute it:
Handlebars.registerHelper('customHelper', function(value) {
//All the code is executed inside the VM
return vm.runInNewContext('value.replace(/[0-9]/g, "");', {
value: value
}, {
timeout: 1000
});
});
Handlebars.registerHelper('modulus', function(index, mod, result, block) {
return vm.runInNewContext('if(parseInt(index) % mod === parseInt(result)) block.fn(this);', {
index: index,
mod: mod,
result: result,
block: block
}, {
timeout: 1000
});
});
Handlebars.registerHelper('infiniteLoop', function(value) {
//Error
return vm.runInNewContext('while(1){}', {
value: value
}, {
timeout: 1000
});
});
I made multiple tests so far, trying to delete files, require modules, infinite loops. Everything is going perfectly, all those operations failed.
Running the handlebar helper callback function in a VM is what made this work for me, because my main problem using VM's and running the whole code inside was adding those helpers to my global Handlebars object.
I'll update if I find a way to exploit it.
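The behaviour described above can be tried in isolation with only the built-in vm module. This is a sketch under stated assumptions: `runUserCode` is a hypothetical wrapper, Handlebars itself is left out, and the 50 ms timeout is chosen only to keep the demonstration short. Note that the Node.js documentation describes vm as not being a security mechanism, so this illustrates the mechanics of the approach rather than guaranteeing isolation.

```javascript
const vm = require('vm');

// Run a user-supplied expression with only `value` visible, under a time limit.
function runUserCode(source, value) {
  return vm.runInNewContext(source, { value: value }, { timeout: 1000 });
}

// The script's last expression becomes the return value.
const stripped = runUserCode('value.replace(/[0-9]/g, "")', 'a1b2c3');

// `require` is not defined inside the new context, so this call throws.
let requireError = null;
try {
  runUserCode('require("fs")', '');
} catch (err) {
  requireError = err;
}

// An endless loop is interrupted once the timeout elapses.
let timeoutError = null;
try {
  vm.runInNewContext('while (true) {}', {}, { timeout: 50 });
} catch (err) {
  timeoutError = err;
}
```

Errors raised inside the new context come from that context's own constructors, so checking `err.name` is more reliable than `instanceof` when telling them apart.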

Node.js changing exports on the fly

changing exports.X in a function seems to not work...
I want to be able to load settings from a file & access them in Node.js. I have this currently, however, the clients connecting to my node application can edit what's in the settings file. Unfortunately as it stands the Node application has to be restarted for the changes to take effect. Is there a way I can reload the module.exports on the fly?
EDIT:
Settings file is literally a JSON string.
My settings module is 'required' in almost every single file, and there's a lot of files... So reloading it per-file basis is out of the question. I do, however, know precisely when someone makes a change to the settings.
If you are using require to load the settings and only referencing the settings from one module, then doing something along the lines of:
delete require.cache[require.resolve(filename)];
will work for you.
If, on the other hand, multiple modules will be referencing these settings, that approach can become a bit unwieldy and open you up to unforeseen bugs. For example, if any of the modules are holding on to a reference to the required settings file, they would each need to somehow learn that the settings had changed and update their references.
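The stale-reference problem can be reproduced with a throwaway file. In this sketch the temporary directory, the file contents and the variable names are all made up for illustration: a reference obtained before the cache entry is deleted keeps pointing at the old object, while a fresh require() after invalidation sees the new contents.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Temporary settings file for the demonstration (hypothetical name and contents).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'settings-demo-'));
const file = path.join(dir, 'settings.json');

fs.writeFileSync(file, JSON.stringify({ foo: 1 }));
const heldReference = require(file);        // some module grabs the settings once

fs.writeFileSync(file, JSON.stringify({ foo: 2 }));
const stillCached = require(file);          // served from require.cache, not re-read

delete require.cache[require.resolve(file)];
const reloaded = require(file);             // re-read from disk after invalidation

const summary = {
  cachedFoo: stillCached.foo,     // value from before the file changed
  reloadedFoo: reloaded.foo,      // value from after the file changed
  heldFoo: heldReference.foo,     // the old object was never updated
};
```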
To alleviate (not completely solve) the caching issue, you can build your settings interface so that its users must access the settings object via a function, and/or must read individual properties via functions. Even with this model, someone may still decide to cache a setting, causing an obscure failure later down the road.
Using the simplest approach of a single getter for the settings object would look something like this:
var settings = require('./settings.json');
// ... watch for changes and reload by invalidating node's cache
module.exports = function() { return settings; }
Usage:
var settings = require('./path/to/settings');
settings().foo;
There are several libraries that do settings. Depending on your needs, I'm partial to nconf.
I'd set up a file watcher here that checks for changes of a JSON file dynamically. It is not recommended practice to change a JS script once the app is running.
Something like:
var _ = require("lodash");
var fs = require("fs");
var result = {};
fs.watch('my-settings.json',function(event,filename){
fs.readFile(filename,function(err,data){
if(err){
// your error catching
}
_.extend(result,JSON.parse(data));
});
});
module.exports = result;
Now, this comes with lots of caveats, first that fs.watch is not always supported by all platforms.
http://nodejs.org/api/fs.html#fs_fs_watch_filename_options_listener
Second, it's really awkward to change a property like this. The expectation is generally that a module's exports do not mutate. I'd instead recommend exposing a method whose result can change based on the state of the file, i.e. a getter for the resulting data.
Third, a file watcher can be expensive, memory-wise.
This is better code, IMHO:
var fs = require("fs");
var filename = 'my-settings.json';
var lastModified;
var mySetting;
module.exports = {
    getSettingAsync : function (callback) {
        fs.stat(filename, function (err, stat) {
            if (err) {
                // your error catching
                return;
            }
            // compare timestamps as numbers: two Date objects are never == to each other
            if (stat.mtime.getTime() === lastModified) {
                callback(mySetting);
            } else {
                fs.readFile(filename, function (err, data) {
                    if (err) {
                        // your error catching
                        return;
                    }
                    // this assumes that your data is always correct
                    mySetting = JSON.parse(data).mySetting;
                    // remember which version of the file is cached
                    lastModified = stat.mtime.getTime();
                    callback(mySetting);
                });
            }
        });
    }
};
In this case, we both check for a JSON file, and expose this as an async method. You could just as easily change the code to use the sync versions if need be and return the value instead of invoking the callback. This version checks when the file was changed, which is cheaper than reading the whole file every time, reads the file if newer and saves you the need to use a potentially buggy file watcher.
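For the synchronous variant mentioned above, a minimal sketch might look like the following. `createSettingsReader` and its internals are hypothetical names, and the whole parsed object is returned rather than a single property; the only Node APIs used are fs.statSync (with its mtimeMs field) and fs.readFileSync.

```javascript
const fs = require('fs');

// Returns a getter that re-reads `filename` only when its modification time changes.
function createSettingsReader(filename) {
  let lastModified = null;
  let cached;
  return function getSettings() {
    const modified = fs.statSync(filename).mtimeMs;
    if (modified !== lastModified) {
      // File is new or has changed since the last read: parse it again.
      cached = JSON.parse(fs.readFileSync(filename, 'utf8'));
      lastModified = modified;
    }
    return cached;
  };
}
```

Usage would be something like `var getSettings = createSettingsReader('my-settings.json');` followed by `getSettings().mySetting` wherever the value is needed. Because the calls block, this suits settings that are read occasionally rather than on every request.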
By the way, I've not tested this code and it may contain errors as is, but the concept is sound.
But, perhaps the more salient question, why not just store that value in the database?

Global Abatement - Using a Single Object Literal

So I just need a sanity check on the way in which I layout my code for an application. I'm always keen to learn better approaches.
I basically use an Object Literal to organise my code meaning that I have one single global variable. Then for each section of the application I create a separate object - something like:
var MYAPP = {
init : function() {
//site wide common js
},
sections : {
homepage : function() {
//homepage js
},
anotherpage : function() {
//another page js
MYAPP.tools.usefultool();
}
},
tools : {
usefultool : function() {
//useful reuseable method
}
}
};
My question is: while this helps code organisation, I'm wondering about objects being initialised but never used. For example, if I'm on a site's homepage I'll just call MYAPP.sections.homepage(). I don't actually need any of the other objects, so I'm wondering: does this structure have a performance implication? Is there a better way? The structure closely follows the great Rebecca Murphey article "Using Objects to Organize Your Code" (http://blog.rebeccamurphey.com/2009/10/15/using-objects-to-organize-your-code).
Thanks!
Yes, there's always a performance hit in unused code as the parser has to actually interpret the code even if it's not executed. But any performance hit here is so minute that you're never going to notice it. The only real hit in unused code like this is in the bandwidth required to download it. If you have a 100kb file downloaded that you never use then you're wasting the time to download that file.
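A small sketch makes the distinction between parsing and executing concrete. The `executed` array is an addition for illustration only; the object mirrors the structure from the question. Defining the literal runs none of the function bodies, and only the section that is actually called does any work.

```javascript
// Records which function bodies have actually run (added for illustration).
const executed = [];

const MYAPP = {
  sections: {
    homepage: function () {
      executed.push('homepage');
    },
    anotherpage: function () {
      executed.push('anotherpage');
      MYAPP.tools.usefultool();
    },
  },
  tools: {
    usefultool: function () {
      executed.push('usefultool');
    },
  },
};

// Defining the object runs none of the function bodies.
const afterDefinition = executed.slice();

// Only the section that is called actually executes.
MYAPP.sections.homepage();
const afterHomepage = executed.slice();
```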
