Simplest approach to Node.js request serialisation

I've got the classic asynchronous/concurrency problem that folks writing a service in Node.js stumble into at some point. I have an object that fetches some data from an RDBMS in response to a request, and emits a fin event (using an EventEmitter) when the row fetching is complete.
As you might expect, when the caller of the service makes several near-simultaneous calls to it, the rows are returned in an unpredictable order. The fin event is fired for rows that do not correspond to the calling function's understanding of the request that produced them.
Here's what I've got going on (simplified for relevance):
var mdl = require('model.js');

dispatchGet: function(req, res, sec, params) {
    var guid = umc.genGUID(36);
    mdl.init(this.modelMap[sec], guid);
    // mdl.load() creates and returns a 'new events.EventEmitter()'
    mdl.load(...).once('fin', function() {
        res.write(...);
        res.end();
    });
}
A simple test shows that the mdl.guid often does not correspond to the guid.
I would have thought that creating a new events.EventEmitter() inside the mdl.load() function would fix this problem by creating a discrete EventEmitter for every request, but evidently that is not the case; I suppose the same rules of object persistence apply to it as to any other object, irrespective of new.
I'm a C programmer by background: I can certainly come up with my own scheme for associating these replies with their requests, using some circular queue or hashing scheme. However, I am guessing this problem has already been solved many times over. My research has revealed many opinions on how best to handle this: various kinds of queuing implementations, Futures, etc.
What I'm wondering is, what's the simplest possible approach to good asynchronous flow control here? I don't want to get knee-deep in some dependency's massive paradigm shift if I don't have to. Is there a relatively simple, canonical, definitive solution, and/or widespread consensus on which third-party module is best?

Could it be that your model.js looks something like this?
module.exports = {
    init: function(model, guid) {
        this.guid = guid;
        ...
    }
};
You have to be aware that the object you're assigning to module.exports there is a shared object, in the sense that every other module that calls require("model.js") will receive a reference to the same object.
So every time you run mdl.init(), the guid property of that object is changed, which would explain your comment that "...a simple test shows that the mdl.guid often does not correspond to the guid".
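This is a consequence of Node's module cache: require runs model.js once, and every subsequent require returns the very same exports object. A quick sketch of the effect (file names follow the question; modelA/modelB are placeholders):

// a.js
var mdl = require('./model.js');
mdl.init(modelA, 'guid-A');

// b.js, running moments later
var mdl = require('./model.js');  // the very same object as in a.js
mdl.init(modelB, 'guid-B');       // overwrites mdl.guid for a.js too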
It really depends on your exact implementation, but I think you'd want to use a class instead:
// model.js
var events = require('events');

var Mdl = function(model, guid) {
    this.model = model;
    this.guid = guid;
};

Mdl.prototype.load = function() {
    // instantiate and return a new EventEmitter scoped to this instance
    var emitter = new events.EventEmitter();
    // ... fetch rows, then: emitter.emit('fin');
    return emitter;
};

module.exports = Mdl;
// app.js
var Mdl = require('model.js');
...
var mdl = new Mdl(this.modelMap[sec], guid);
mdl.load(...)
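With one Mdl instance per request, each 'fin' listener closes over its own mdl and res, so responses can no longer cross wires. A hedged sketch of how dispatchGet might then look (umc.genGUID and this.modelMap come from the question; the elided arguments are kept elided):

dispatchGet: function(req, res, sec, params) {
    var guid = umc.genGUID(36);
    // a fresh model per request; no shared module-level state
    var mdl = new Mdl(this.modelMap[sec], guid);
    mdl.load(...).once('fin', function() {
        // mdl and guid here belong to this request only
        res.write(...);
        res.end();
    });
}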

Related

Attempting to Import Module in Child Process (Javascript) and Failing

I'm currently running a heavy computation (i.e. generating a Monte Carlo tree), which is an expensive operation. I only have a few seconds to build as big of a tree as I can, so I am using subprocesses in Node.js in order to build multiple trees, and then aggregate their data together to make a more informed decision.
I understand that subprocesses do not share information/memory, and I need to use modules within these subprocesses that are located in a file, called "Epilog.js" on my machine.
When I run functions that are in epilog.js from the main file, it works just fine. But all of my functions that are in my worker threads return absolutely nothing.
I have tested to make sure that the parameters of the functions I am trying to use in "epilog.js" aren't empty, and they're not. The problem isn't in the parameter.
I have also tested to see what happens if I simply don't import, and instead of just outputting an undefined array, I get an error saying that there is no function called "findroles".
// My main thread.
var fs = require('fs');
var { fork } = require('child_process');

eval(fs.readFileSync('epilog.js') + '');

var process = fork('./buildGraph.js');
process.send({ library });
// My worker thread.
// buildGraph.js
var fs = require('fs');
eval(fs.readFileSync('epilog.js') + '');

// receive message from master process
process.on('message', async (message) => {
    library = message["library"];
    console.log(findroles(library));
    // findroles(library) is a function defined in epilog.js that
    // outputs an array of "roles" given a parameter, library.
    // For some reason this outputs [] rather than all of the roles.
    // If I run this exact line from my main thread, it gives no
    // errors and outputs the right array, e.g. ['red', 'white'].
});
I expect to get not the empty array but ['red', 'white'], as I do when I run the same line in the main thread. Does anyone have an idea as to why the functions behave inconsistently? I'm very new to Node.js, and this isn't a class focused too much on software engineering in JavaScript, so I'd appreciate it if someone could dumb down what is going on, as this is all very new to me.
If your script does not find the function called findroles, then there is a problem with the import. Using eval for importing is not the normal way of importing modules. Try something like this:
// buildGraph.js
const epilog = require("./epilog.js");
...
console.log(epilog.findroles(library));
Then in epilog.js:
exports.findroles = function (library) {
    // function content
};
You can find more info here:
https://www.w3schools.com/nodejs/nodejs_modules.asp
Based on the documentation and examples, everything seems correct, but I think the problem comes from this line:
var process = fork('./buildGraph.js');
You might be overriding the global process object. Try changing it to:
const n = fork('./buildGraph.js');
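Putting the two answers together, a hedged sketch of what the main thread could look like, with require instead of eval and a non-shadowing name for the child ('library' is a placeholder here, since the question doesn't show where it comes from; this assumes epilog.js exports its functions as the first answer shows):

// main.js
const { fork } = require('child_process');
const epilog = require('./epilog.js');  // assumes epilog.js uses exports.findroles = ...

const library = {};  // placeholder; built elsewhere in the real app

const child = fork('./buildGraph.js');  // 'child', not 'process'
child.send({ library });

console.log(epilog.findroles(library));  // works in the main thread too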

How does live object creation and partial teardown management work in javascript?

What I would like to do is load JavaScript that creates a library of methods in an object, and wait until the object is used for the first time before it is actually defined or compiled. I would like to build references to this object before it is actually fully defined. When I call a method on this object for the first time, before the methods on the object have ever been defined (meaning the object doesn't actually have methods), I would like to define the object and then call the method. Is there a way to do this using standard syntax such as "MyLibrary.sayHello()" if "sayHello()" is not yet defined on the object?
I imagine it would look like this:
var independentVar = "noCommitments";
var MyLibrary = function(user_ini){
    // MyLibrary.init looks like
    // (function(ini){
    //     var a = ini;
    //     return function(){
    //         // Notice the method sayHello defines when called,
    //         // and does not return a reference
    //         return {
    //             b: a, c: "c", sayHello: function(z){ return "Hello" + a + z; }
    //         }
    //     }
    // })(user_ini);
    var d1 = myRequire("MyLibrary.init");
    return {
        **handleAll : function(){ this = d1(); this.("**calledMethod") }
    }
};
var greeting = MyLibrary.sayHello();
alert(greeting);
This is only pseudo-code. If I add a cleanup method, I can then return that object to the uninitialized state of "{**handleAll:function(){/*noContext*/}}". My application/library has a stub and a link this way and can be used immediately from an undefined state. When building modules this can be useful for lowering the number of references to a utility: say a post has a menu of functions, and those functions are shared by all posts; with a mechanism such as the one described here, only the "active post"/"post in focus" will reference the utility. It more or less gives the ability to activate and de-activate modules. The special part is that the modules are already warmed up: they are ready to call functions even though they do not reference them yet. It is similar to live binding, but allows the whole user interface to already be defined, with functions already stubbed out under the exact name they will have when they are usable. A control mechanism for defaults and debounce falls out of this model easily for me.
My question is: is this type of scripting possible natively, or will I have to use some form of compilation, as with TypeScript, CoffeeScript, or others? I understand it is possible if I pass the method I would like to call as a parameter to a singleton factory. I would ultimately like whole applications that can gracefully degrade unused functionality without polluting the code.
What I mean by pollution:
var LibDef = (function(){
    return {
        callUndefined: function(methodName){
            var returnVal = {};
        }
    };
})();

var MySingltonLibrary = moduleSinglton.getLibrary("MyLibrary", LibDef);
var greeting = MySingltonLibrary.callUndefined("sayHello");
// Please use your imagination to consider the complexity in the singlton
The best way I know to tear down an object (releasing any space its functions and members consume on the heap) while maintaining a single reference that lets the object rebuild itself, or rebuild just the function that is called, looks like this. (A very simple model; you may like to use arrays and gradually tear down nested objects internally.)
var twentySecondObj = (function(window, document){
    var base_obj = undefined;
    var externalAPI = undefined;
    // tear down the cached objects after 20 seconds so the next
    // call rebuilds them from scratch
    setTimeout(function(){
        base_obj = undefined;
        externalAPI = undefined;
    }, 20000);
    return function(){
        base_obj = (function(obj){
            if(obj === undefined){
                return {
                    property1: "This is property1",
                    property2: "This is property2"
                };
            }
            return obj;
        })(base_obj);
        externalAPI = (function(api){
            if(api === undefined){
                return {
                    property1: base_obj.property1,
                    property2: base_obj.property2
                };
            }
            return api;
        })(externalAPI);
        return externalAPI;
    };
})(window, document);

console.log(twentySecondObj().property1);
On an additional note, you can use getters and setters to observe access to properties, and can internally present a facade of both functions and properties that reference a build method like the one above; this way it looks like you are accessing a legitimate member of the object. I can think of no option that would let you intercept the moment a brand-new property is set on an object, as in myObj.fooProperty = "foo", and build that property up into a custom object with a getter and setter. If you have a custom type that needs to be set, you will have to know its implementation details to set it, or call a function passing in the property name and value, or use a method similar to what is shown above.
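To illustrate the getter facade idea, here is a minimal sketch using Object.defineProperty to build the backing object lazily on first access; all names here are hypothetical, not from the question:

var Facade = (function(){
    var built = null;          // torn-down state
    function build(){
        // expensive construction happens only on first access
        return { greeting: "Hello" };
    }
    var api = {};
    Object.defineProperty(api, 'greeting', {
        get: function(){
            if (built === null) { built = build(); }
            return built.greeting;
        }
    });
    // a cleanup method returns the facade to its uninitialized state
    api.teardown = function(){ built = null; };
    return api;
})();

console.log(Facade.greeting); // builds on first access, then caches
Facade.teardown();            // releases the built object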
Here is a link to the proposal for adding weak references to JavaScript: https://ponyfoo.com/articles/weakref Weak references would alter how this looks, but would not address everything mentioned in this question. Remapping an object when a property is added, via some type of deep observer, would allow new property members to be enhanced at the time they are set; this would require the observer to run synchronously when the property is set, or else the very next statement after the set must be a call to update the object. I will keep posting here any advances I see that make a "default handler function" available within JavaScript in the future.
WeakRef can absolutely be used for recording and handling object usage. I would really like to move object management into web workers and service workers, so objects can be maintained across all web endpoints on the domain and do not need to be reloaded across requests. Web frameworks would need a modified handle to offload all DOM changes and updates to a worker, essentially a single hook that handles message passing for all hooks. Module loading would then need to include a message-handle name and task-priority metadata so work is properly placed in the least busy or least active worker (slow worker and fast worker). This helps create an API that can offload to cloud functions, which should give us the ability to do more AI, lookups, and offline work that most apps currently do in the cloud, where more processing power is. In this way we can gracefully augment local processing with cloud functions only when local resources or completion times degrade below acceptable speeds, or above acceptable power policy.
https://v8.dev/features/weak-references

Node.js design: multiple async functions writing to database using function passed as a closure

I am writing a standalone web scraper in Node, run from the command line, which looks for specific data on a set of pages, fetches page-view data from Google Analytics, and saves it all in a MySQL database. Almost everything is ready, but today I found a problem with the way I write data to the db.
To make things easier, let's assume I have an index.js file and two controllers: db and web. Db reads/writes data to the db; web scrapes the pages using a configurable number of PhantomJS instances.
Web exposes one function checkTargetUrls(urls, writer)
where urls is an array with urls to be checked and writer is an optional parameter, called only if it is a function and there is data to be written.
Now, the way I pass the writer is obviously wrong, but it looks as follows (in index.js):
// some code here
// ...
let pageId = 0;
// ... some promise code, which checks the validity of the urls,
// creates a new execution in the database, etc.
.then(urls => {
    return web.checkTargetUrls(urls,
        function(singleUrl, pageData) {
            // ... a chain of promisable functions from the db controller,
            // which first looks up the page id in the db, then puts it
            // into the pageId variable and continues with the write to db
        });
}).then(() => {
    logger.info('All done captain!');
}).catch(err => { logger.error(err); });
In effect, pageId randomly gets overwritten by the id of a preceding/succeeding page, and invalid data is saved. Inside web there are up to 10 concurrent PhantomJS instances running, which call the writer function after they have analyzed a page. Excuse my language, but for me an analogy for this situation would be having, say, 10 instances of some object which all rely for writing on a singleton, which causes the pageId overwriting problem (I don't know how to express it properly in JS/Node.js terms).
So far I have found one fix to the problem, but it is ugly as it introduces tight coupling. If I put the writer code in a separate module and then load it directly from inside the web controller, everything works great. But to me this is a bad design pattern, and I would rather do it otherwise.
var writer = require('./writer');

function checkTargetUrls(urls, executionId) {
    return new Promise(function(resolve, reject) {
        let poolSize = config.phantomJs.concurrentInstances;
        let running = 0;
        // ... a bit of code goes here ...
        if (slots != undefined && slots != null && slots.data.length > 0) {
            return writer.write(executionId, singleUrl, slots);
        }
        // ... more code follows ...
    });
}
I have a hard time finding a nicer solution where I could still pass writer as an argument to the checkTargetUrls(urls, writer) function. Can anyone point me in the right direction or suggest where to look for the answer?
The exact problem around your global pageId is not entirely clear to me but you could reduce coupling by exposing a setWriter function from your 'web' controller.
var writer;
module.exports.setWriter = function(_writer) { writer = _writer; };
Then near the top of your index.js, something like:
var web = require('./web');
web.setWriter(require('./writer'));
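Alternatively, if you want to keep passing writer as an argument (the original checkTargetUrls(urls, writer) signature), the overwriting itself can be avoided by never sharing pageId across concurrent pages: keep the id inside each writer invocation's own promise chain. A hedged sketch, where db.findPageId and db.writePageData are hypothetical stand-ins for the db controller's lookup and write functions:

// index.js -- writer passed in as an argument; no shared pageId
web.checkTargetUrls(urls, function writer(singleUrl, pageData) {
    // the looked-up id flows through this chain only, so ten
    // concurrent PhantomJS callbacks can no longer clobber it
    return db.findPageId(singleUrl)
        .then(pageId => db.writePageData(pageId, pageData));
});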

Long-running load operations in Durandal

I'm trying to work out where the best place to run a long-running load operation is using Durandal.
From what I can tell, the general recommendation for loading data is in the ViewModel's activate method, which is what I usually do - something like:
viewModel.activate = function () {
    var loadPromise = myService.loadData();
    return $.when(loadPromise).then(function (loadedData) {
        viewModel.data(loadedData);
    });
};
I know that if I don't return the promise here, then there are usually problems with the bindings, as this question and answer indicate.
However, executing a long-running load operation in the activate method makes the app "freeze" while the load operation completes. For example, what if my load were now something like this?
viewModel.activate = function () {
    // All loads return a promise
    var firstLoad = myService.loadFirstData();
    var secondLoad = myService.loadSecondData();
    var thirdLoad = myService.loadThirdDataWhichTakesAges();
    return $.when(firstLoad, secondLoad, thirdLoad).then(function (one, two, three) {
        viewModel.one(one);
        viewModel.two(two);
        viewModel.three(three);
    });
};
In this scenario, the URL is updated to reflect the page which is being loaded, but the page content still shows the previous page (which is what I mean by "freezes").
Ideally, the URL would change to the new page, and the page content would show the new page too (even though the data for that page has not yet been returned). Then, as each load operation returns, the relevant part of the page would be updated as the data is bound into the view model.
Is there a recommended way for achieving this inside Durandal?
My current solution is to kick-off the load in the activate method, and then populate the data in the viewAttached method:
var loadPromise;

viewModel.activate = function () {
    // All loads return a promise
    var firstLoad = myService.loadFirstData();
    var secondLoad = myService.loadSecondData();
    var thirdLoad = myService.loadThirdDataWhichTakesAges();
    loadPromise = $.when(firstLoad, secondLoad, thirdLoad);
    // Don't return the promise - let activation proceed.
};

viewModel.viewAttached = function () {
    $.when(loadPromise).then(function (one, two, three) {
        viewModel.one(one);
        viewModel.two(two);
        viewModel.three(three);
    });
};
It seems to work, but I remember reading somewhere that relying on viewAttached wasn't a good solution. I'm also not sure if there is potential for a race condition since I'm allowing the activate to proceed.
Any other recommendations?
You don't have to return a promise, but in that case you must handle this in your Knockout bindings so you won't bind to elements that are undefined. You can get rid of that 'return' in activate, but add a property indicating whether the model is still loading. Something like this:
viewModel.isLoading = ko.observable(false);

viewModel.activate = function () {
    viewModel.isLoading(true);
    var loadPromise = myService.loadData();
    $.when(loadPromise).then(function (loadedData) {
        viewModel.data(loadedData);
        viewModel.isLoading(false);
    });
};
And then, in your view, you can have a section that shows up while the view is still loading and one that shows up when loading is done. Something like:
<div data-bind="visible: isLoading()">Loading Data....</div>
<div data-bind="visible: !isLoading()">Put your regular view with bindings here. Loading is done so bindings will work.</div>
Which version of Durandal are you using? In Durandal 2.0.0pre you are allowed NOT to return a promise in activate, so that the composition of the view (without data) can happen immediately.
You might consider refactoring viewModel.one etc. into modules that return a constructor function, so that each of one, two, three is responsible for retrieving its own data. That way your first two calls wouldn't have to wait on loadThirdDataWhichTakesAges. That would make sense in scenarios where one, two, and three do not depend heavily on each other.
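A minimal sketch of that refactoring, assuming RequireJS-style modules as Durandal uses and that the service call returns a promise (the module path and service names are hypothetical):

// partThree.js - a module that returns a constructor function
define(['knockout', 'services/myService'], function (ko, myService) {
    var PartThree = function () {
        this.data = ko.observable();
    };
    PartThree.prototype.activate = function () {
        var self = this;
        // only this widget waits on the slow load; parts one and
        // two can activate and render independently
        return myService.loadThirdDataWhichTakesAges().then(function (three) {
            self.data(three);
        });
    };
    return PartThree;
});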
For reference: I posted a similar question on the Durandal Google Group (effectively asking whether using activate and viewAttached in this manner is an OK idea) and got this reply from Rob Eisenberg:
That will probably work. The problem is that Knockout will destroy databindings on elements if the properties are updated and the element isn't currently in the document. This can happen depending on the timing of the async code. Because of the way composition worked in 1.x, this would cause problems if you didn't return the promise from your activate function. It should work better in viewAttached, but depending on the nature of your composition, the view may be attached to its parent, but still not in the document. It depends on the depth of the composition. So, you could encounter issues with this too if you have this in a deeply composed module. Unfortunately, there isn't a clean way about it in Durandal 1.x due to the knockout behavior. In Durandal 2.x we have reworked composition so that this problem is non-existent and returning the promise is no longer necessary (though you can still do it). Durandal 2.0 will be releasing in about two weeks.

Mocking Postgres for unit tests with Sinon.js in Node.js

I am having trouble getting my head around how I can use Sinon to mock a call to Postgres, which is required by the module I am testing, or whether that is even possible.
I am not trying to test the Postgres module itself, just my object, to ensure it is working as expected and that it is calling what it should be calling in this instance.
I guess the issue is the require setup of Node: my module requires the Postgres module to hit the database, but here I don't want to run an integration test, I just want to make sure my code works in isolation and don't really care what the database is doing; I will leave that to my integration tests.
I have seen some people set up their functions with an optional parameter for the mock/stub/fake, test for its existence, and if it is there use it instead of the required module, but that seems like a smell to me (I am new to Node, so maybe it isn't).
I would prefer to mock this out rather than try to hijack the require, if that is possible.
Some code (please note this is not the real code, as I am working TDD-style and the function doesn't really do anything yet; the function names are real):
TEST SETUP
describe('#execute', function () {
    it('should return data rows when executing a select', function(){
        // Not sure what to do here
    });
});
SAMPLE FUNCTION
PostgresqlProvider.prototype.execute = function (query, cb) {
    var self = this;
    if (self.connection === "")
        return cb(new Error('Connection can not be empty, set Connection using Init function'));
    if (query === null)
        return cb(new Error('Invalid Query Object - Query Object is Null'));
    if (!query.buildCommand)
        return cb(new Error("Invalid Query Object"));
    // Valid connection and query
};
It might look a bit funny to wrap around the Postgres module like this, but there is a design reason: this app will have several "providers" and I want to expose the same API for all of them so I can use them interchangeably.
UPDATE
I decided that my test was too complicated: I was checking whether the connect call had been made AND that data was then returned, which smelt to me, so I stripped it back and split it into two tests:
The Mock Test
it('should call pg.connect when a valid Query object is parsed', function(){
    var mockPg = sinon.mock(pg);
    mockPg.expects('connect').once();  // note: once() is a method call
    Provider.init('ConnectionString');
    Provider.execute(stubQueryWithBuildFunc, null, mockPg);
    mockPg.verify();
});
This works (I think): without the Postgres connector code it fails, and with it it passes (boom :)).
The issue now is with the second method, for which I am going to use a stub (maybe a spy); it is passing 100% when it should fail, so I will pick that up in the morning.
Update 2
I am not 100% happy with the test, mainly because I am not hijacking the client.query method, which is the one that hits the database, but simply hijacking my execute method and forcing it down a path. Still, it allows me to see the result and assert against it to test behaviour, and I would be open to any suggested improvements.
I am using a spy to catch the method and return null and a faux object containing rows, like the real method would pass back. This test will change as I add more Query behaviour, but it gets me over my hurdle.
it('should return data rows when a valid Query object is parsed', function(){
    var fauxRows = [
        { 'id': 1000, 'name': 'Some Company A' },
        { 'id': 1001, 'name': 'Some Company B' }
    ];
    var stubPg = sinon.stub(Provider, 'execute').callsArgWith(1, null, fauxRows);
    Provider.init('ConnectionString');
    Provider.execute(stubQueryWithBuildFunc, function(err, rows){
        rows.should.have.length(2);
    }, stubPg);
    stubPg.called.should.equal(true);
    stubPg.restore();
});
Use pg-pool: https://www.npmjs.com/package/pg-pool
It's about to be added to pg anyway and purportedly makes (mocking) unit-testing easier... from BrianC ( https://github.com/brianc/node-postgres/issues/1056#issuecomment-227325045 ):
Checkout https://github.com/brianc/node-pg-pool - it's going to be the pool implementation in node-postgres very soon and doesn't rely on singletons which makes mocking much easier. Hopefully that helps!
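For illustration, one hedged way to exploit that: have the provider accept a pool as a constructor dependency and stub pool.query in tests. The provider shape and names below are illustrative sketches, not code from the question or the pg-pool docs:

// the provider takes its pool as a dependency instead of requiring pg itself
function PostgresqlProvider(pool) {
    this.pool = pool;
}
PostgresqlProvider.prototype.execute = function (query, cb) {
    // assumes query.buildCommand() produces the SQL text
    this.pool.query(query.buildCommand(), function (err, result) {
        cb(err, result && result.rows);
    });
};

// in the test, no database needed -- stub the pool
var fakePool = { query: sinon.stub().callsArgWith(1, null, { rows: [{ id: 1 }] }) };
var provider = new PostgresqlProvider(fakePool);
provider.execute(stubQueryWithBuildFunc, function (err, rows) {
    rows.should.have.length(1);
});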
I very explicitly replace my dependencies. It's probably not the best solution, but none of the other solutions I saw were that great either.
inject: function (_mock) {
    if (_mock) { real = _mock; }
}
You add this code to the module under test. In my tests I call the inject method and replace the real object. The reason I don't like it 100% is that you have to add extra code only for testing.
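Used from a test, it might look like this (a sketch; 'real' is assumed to be the module-level variable holding the actual pg reference inside the module under test, and the file name is hypothetical):

// in the test file
var provider = require('./postgresql-provider');  // module under test
var fakePg = { connect: sinon.stub() };           // stand-in for the pg module

provider.inject(fakePg);  // from now on the provider talks to fakePg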
The other solution is to read the module file as a string and use vm to load the file manually. When I investigated this I found it a little too complex, so I went with just using the inject function. It's probably worth investigating this approach, though. You can find more information here.
