I'm new to node.js, so before releasing my node.js app, I need to be sure it will work as it should.
Let's say I have an array variable and I initialize it at the beginning of my script:
myArray = [];
then I pull some data from an external API, store it inside myArray, and use the setInterval() method to pull this data again every 30 minutes:
pullData();
setInterval(pullData, 30*60*1000);
pullData() function takes about 2-3 seconds to finish.
Clients will be able to get myArray using this function:
http.createServer(function(request, response){
var path = url.parse(request.url).pathname;
if(path=="/getdata"){
var string = JSON.stringify(myArray);
response.writeHead(200, {'Content-Type': 'text/plain'});
response.end(string);
}
}).listen(8001);
So what I'm asking is: can the following situation happen?
A client tries to get data from this node.js server at the very moment that pullData() is writing into myArray, so that invalid data is sent to the client.
I read some documentation, and my understanding is that while pullData() is running, createServer() will not respond to clients until pullData() finishes its job. Is that right?
I'm really not good at understanding concurrent programming, so I need your confirmation on this, or a better solution if you have one.
EDIT: here is the code of my pullData() function:
var now = new Date();
Date.prototype.addDays = function(days){
var dat = new Date(this.valueOf());
dat.setDate(dat.getDate() + days);
return dat;
}
var endDateTime = now.addDays(noOfDays);
var formattedEnd = endDateTime.toISOString();
var url = "https://api.mindbodyonline.com/0_5/ClassService.asmx?wsdl";
soap.createClient(url, function (err, client) {
if (err) {
throw err;
}
client.setEndpoint('https://api.mindbodyonline.com/0_5/ClassService.asmx');
var params = {
"Request": {
"SourceCredentials": {
"SourceName": sourceName,
"Password": password,
"SiteIDs": {
"int": [siteIDs]
}
},
"EndDateTime" : formattedEnd
}
};
client.Class_x0020_Service.Class_x0020_ServiceSoap.GetClasses(params, function (errs, result) {
if (errs) {
console.log(errs);
} else {
var classes = result.GetClassesResult.Classes.Class;
myArray = [];
for (var i = 0; i < classes.length; i++) {
var name = classes[i].ClassDescription.Name;
var staff = classes[i].Staff.Name;
var locationName = classes[i].Location.Name;
var start = classes[i].StartDateTime.toISOString();
var end = classes[i].EndDateTime.toISOString();
var klasa = new Klasa(name,staff,locationName,start,end);
myArray.push(klasa);
}
myArray.sort(function(a,b){
var c = new Date(a.start);
var d = new Date(b.start);
return c-d;
});
string = JSON.stringify(myArray);
}
})
});
No. Node.js runs your JavaScript on a single thread, so apart from non-blocking calls (i.e. I/O) a function keeps the CPU until it returns. Node.js will not send a half-populated array to a client, as long as you only do one HTTP call to populate your array and fill it in a single synchronous pass inside that call's callback.
Update:
As pointed out by @RyanWilcox, any asynchronous (non-blocking) call gives Node.js a chance to run other code, such as a request handler, between the moment you make the call and the moment its callback runs.
In general: No.
JavaScript is single threaded. While one function is running, no other function can be.
The exception is if you have delays between functions that access the value of an array.
e.g.
var i = 0; // index of the next element to fetch
function getNext() {
async.get(myArray[i], function () {
i++;
if (i < myArray.length) {
getNext()
}
});
}
… in which case the array could be updated between the calls to the asynchronous function.
You can mitigate that by creating a deep copy of the array when you start the first async operation.
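A minimal sketch of that deep-copy idea, in case it helps. The `snapshot` helper and the sample data are made up for illustration; the JSON round trip is only one way to copy and is only suitable for plain data.

```javascript
// Hypothetical helper: take a snapshot so later writes to the live array are not seen.
function snapshot(value) {
  // JSON round trip: fine for plain data (strings, numbers, arrays, objects),
  // but it drops functions and undefined and turns Date objects into strings.
  return JSON.parse(JSON.stringify(value));
}

var liveArray = [{ name: "yoga" }, { name: "pilates" }];
var frozenCopy = snapshot(liveArray);

// Simulate a refresh that happens between two asynchronous steps.
liveArray.length = 0;
liveArray.push({ name: "spinning" });

console.log(frozenCopy.length);  // the copy still has the two original entries
console.log(frozenCopy[0].name);
```

The asynchronous loop would then walk `frozenCopy` instead of the live array, so a refresh in the middle cannot change what it sees.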
JavaScript is a single-threaded language, so you don't have to worry about this kind of concurrency: no two pieces of code execute at the same time. Unlike many other programming languages, JavaScript has a concurrency model based on an event loop. To get the best performance, use non-blocking operations handled by callback functions, promises or events. I assume your external API offers asynchronous I/O functions, which are well suited to node.js.
If your pullData call doesn't take too long, another solution is to cache the data.
Fetch the data only when needed (that is, when a client accesses /getdata). Once it is fetched, cache the data together with a timestamp. When /getdata is called again, check whether the cached data is older than 30 minutes; if so, fetch again.
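Roughly, the timestamp check could look like this. It is only a sketch: `createCache` is a hypothetical helper, and the current time is passed in as a parameter so the logic can be tried without waiting 30 minutes.

```javascript
var MAX_AGE_MS = 30 * 60 * 1000; // 30 minutes, as in the question

// Hypothetical cache object; `now` is a timestamp in milliseconds.
function createCache(maxAgeMs) {
  var value = null;
  var fetchedAt = null;
  return {
    set: function (newValue, now) {
      value = newValue;
      fetchedAt = now;
    },
    // Returns the cached value, or null when nothing is cached or it is too old.
    get: function (now) {
      if (fetchedAt === null || now - fetchedAt > maxAgeMs) {
        return null;
      }
      return value;
    }
  };
}

var cache = createCache(MAX_AGE_MS);
cache.set(["class A"], 0);
console.log(cache.get(1000));           // still fresh
console.log(cache.get(MAX_AGE_MS + 1)); // too old, so null
```

In the /getdata handler you would call `cache.get(Date.now())` and only go to the external API (then `cache.set(...)`) when it returns null.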
Also, serializing the array to JSON...
var string = JSON.stringify(myArray);
...could be done once, outside the /getdata handler, so it does not have to be repeated for each client visiting /getdata. That might make it slightly quicker.
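Something along these lines, as a sketch. `replaceData` and `cachedJson` are names I made up; `replaceData` stands in for the loop inside pullData's callback and the input objects are simplified. The new list is built in a local variable and only published once it is complete, and it is serialized once per refresh instead of once per request.

```javascript
var myArray = [];
var cachedJson = "[]";

// Hypothetical stand-in for the body of pullData's callback:
// `classes` is the list returned by the external API.
function replaceData(classes) {
  // Build and sort the new list in a local variable first...
  var fresh = classes.map(function (c) {
    return { name: c.name, start: c.start };
  });
  fresh.sort(function (a, b) {
    return new Date(a.start) - new Date(b.start);
  });
  // ...then publish it in one step, together with its serialized form.
  myArray = fresh;
  cachedJson = JSON.stringify(fresh);
}

replaceData([
  { name: "late", start: "2020-01-02T10:00:00.000Z" },
  { name: "early", start: "2020-01-01T10:00:00.000Z" }
]);
console.log(cachedJson);
```

The request handler can then simply do `response.end(cachedJson)`.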
I have a web socket that receives data from a web socket server every 100 to 200ms (I have tried both with a shared web worker and with everything in the main.js file).
When new JSON data arrives, my main.js runs filter_json_run_all(json_data), which updates Tabulator.js & Dygraph.js tables & graphs with some custom color coding based on whether values are increasing or decreasing.
1) web socket json data ( every 100ms or less) -> 2) run function filter_json_run_all(json_data) (takes 150 to 200ms) -> 3) repeat 1 & 2 forever
Quickly the timestamp of the incoming json data gets delayed versus the actual time (json_time 15:30:12 vs actual time: 15:31:30) since the filter_json_run_all is causing a backlog in operations.
So it causes users on different PC's to have websocket sync issues, based on when they opened or refreshed the website.
This is only caused by the long filter_json_run_all() function, otherwise if all I did was console.log(json_data) they would be perfectly in sync.
I would be very grateful if anyone has any ideas on how I can prevent this sort of blocking / backlog of incoming JSON websocket data caused by a slow-running JavaScript function :)
I tried using a shared web worker, which works, but it doesn't get around the delay in main.js caused by filter_json_run_all(). I don't think I can move filter_json_run_all() into the worker, since all the graph & table objects are defined in main.js, and I also have callbacks for when I click on a table to update a value manually (bi-directional web socket).
If you have any ideas or tips at all I will be very grateful :)
worker.js:
const connectedPorts = [];
// Create socket instance.
var socket = new WebSocket(
'ws://'
+ 'ip:port'
+ '/ws/'
);
// Send initial package on open.
socket.addEventListener('open', () => {
const package = JSON.stringify({
"time": 123456,
"channel": "futures.tickers",
"event": "subscribe",
"payload": ["BTC_USD", "ETH_USD"]
});
socket.send(package);
});
// Send data from socket to all open tabs.
socket.addEventListener('message', ({ data }) => {
const package = JSON.parse(data);
connectedPorts.forEach(port => port.postMessage(package));
});
/**
* When a new thread is connected to the shared worker,
* start listening for messages from the new thread.
*/
self.addEventListener('connect', ({ ports }) => {
const port = ports[0];
// Add this new port to the list of connected ports.
connectedPorts.push(port);
/**
* Receive data from main thread and determine which
* actions it should take based on the received data.
*/
port.addEventListener('message', ({ data }) => {
const { action, value } = data;
// Send message to socket.
if (action === 'send') {
socket.send(JSON.stringify(value));
// Remove port from connected ports list.
} else if (action === 'unload') {
const index = connectedPorts.indexOf(port);
connectedPorts.splice(index, 1);
}
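});
// Needed when listening with addEventListener (only onmessage starts the port implicitly).
port.start();
});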
Main.js: This is only part of filter_json_run_all, which continues on for about 6 or 7 Tabulator & Dygraph objects. I wanted to give an idea of some of the operations called with setTimeout() etc.
function filter_json_run_all(json_str){
const startTime = performance.now();
const data_in_array = json_str //JSON.parse(json_str.data);
// if ('DATETIME' in data_in_array){
// var milliseconds = (new Date()).getTime() - Date.parse(data_in_array['DATETIME']);
// console.log("milliseconds: " + milliseconds);
// }
if (summary in data_in_array){
if("DATETIME" in data_in_array){
var time_str = data_in_array["DATETIME"];
element_time.innerHTML = time_str;
}
// summary Data
const summary_array = data_in_array[summary];
var old_sum_arr_krw = [];
var old_sum_arr_irn = [];
var old_sum_arr_ntn = [];
var old_sum_arr_ccn = [];
var old_sum_arr_ihn = [];
var old_sum_arr_ppn = [];
var filtered_array_krw_summary = filterByProperty_summary(summary_array, "KWN")
old_sum_arr_krw.unshift(Table_summary_krw.getData());
Table_summary_krw.replaceData(filtered_array_krw_summary);
//Colour table
color_table(filtered_array_krw_summary, old_sum_arr_krw, Table_summary_krw);
var filtered_array_irn_summary = filterByProperty_summary(summary_array, "IRN")
old_sum_arr_irn.unshift(Table_summary_inr.getData());
Table_summary_inr.replaceData(filtered_array_irn_summary);
//Colour table
color_table(filtered_array_irn_summary, old_sum_arr_irn, Table_summary_inr);
var filtered_array_ntn_summary = filterByProperty_summary(summary_array, "NTN")
old_sum_arr_ntn.unshift(Table_summary_twd.getData());
Table_summary_twd.replaceData(filtered_array_ntn_summary);
//Colour table
color_table(filtered_array_ntn_summary, old_sum_arr_ntn, Table_summary_twd);
// remove formatting on fwds curves
setTimeout(() => {g_fwd_curve_krw.updateOptions({
'file': dataFwdKRW,
'labels': ['Time', 'Bid', 'Ask'],
strokeWidth: 1,
}); }, 200);
setTimeout(() => {g_fwd_curve_inr.updateOptions({
'file': dataFwdINR,
'labels': ['Time', 'Bid', 'Ask'],
strokeWidth: 1,
}); }, 200);
// remove_colors //([askTable_krw, askTable_inr, askTable_twd, askTable_cny, askTable_idr, askTable_php])
setTimeout(() => { askTable_krw.getRows().forEach(function (item, index) {
row = item.getCells();
row.forEach(function (value_tmp){value_tmp.getElement().style.backgroundColor = '';}
)}); }, 200);
setTimeout(() => { askTable_inr.getRows().forEach(function (item, index) {
row = item.getCells();
row.forEach(function (value_tmp){value_tmp.getElement().style.backgroundColor = '';}
)}); }, 200);
color_table Function
function color_table(new_arr, old_array, table_obj){
// If length is not equal
if(new_arr.length!=old_array[0].length)
console.log("Diff length");
else
{
// Comparing each element of array
for(var i=0;i<new_arr.length;i++)
//iterate old dict dict
for (const [key, value] of Object.entries(old_array[0][i])) {
if(value == new_arr[i][key])
{}
else{
// console.log("Different element");
if(key!="TENOR")
// console.log(table_obj)
table_obj.getRows()[i].getCell(key).getElement().style.backgroundColor = 'yellow';
if(key!="TIME")
if(value < new_arr[i][key])
//green going up
//text_to_speech(new_arr[i]['CCY'] + ' ' +new_arr[i]['TENOR']+ ' getting bid')
table_obj.getRows()[i].getCell(key).getElement().style.backgroundColor = 'Chartreuse';
if(key!="TIME")
if(value > new_arr[i][key])
//red going down
table_obj.getRows()[i].getCell(key).getElement().style.backgroundColor = 'Crimson';
}
}
}
}
Potential fudge / solution, thanks Aaron :):
function limiter(fn, wait){
let isCalled = false,
calls = [];
let caller = function(){
if (calls.length && !isCalled){
isCalled = true;
if (calls.length >2){
calls.splice(0,calls.length-1)
//remove zero' upto n-1 function calls from array/ queue
}
calls.shift().call();
setTimeout(function(){
isCalled = false;
caller();
}, wait);
}
};
return function(){
calls.push(fn.bind(this, ...arguments));
// let args = Array.prototype.slice.call(arguments);
// calls.push(fn.bind.apply(fn, [this].concat(args)));
caller();
};
}
This is then defined as a constant for a web worker to call:
const filter_json_run_allLimited = limiter(data => { filter_json_run_all(data); }, 300); // 300ms for examples
Web worker calls the limited function when new web socket data arrives:
// Event to listen for incoming data from the worker and update the DOM.
webSocketWorker.port.addEventListener('message', ({ data }) => {
// Limited function
filter_json_run_allLimited(data);
});
If anyone knows how sites like TradingView or other real-time, high-performance data streaming sites achieve low-latency visualisation updates, please comment or reply below :)
I'm reticent to take a stab at answering this for real without knowing what's going on in color_table. My hunch, based on the behavior you're describing is that filter_json_run_all is being forced to wait on a congested DOM manipulation/render pipeline as HTML is being updated to achieve the color-coding for your updated table elements.
I see you're already taking some measures to prevent some of these DOM manipulations from blocking this function's execution (via setTimeout). If color_table isn't already employing a similar strategy, that'd be the first thing I'd focus on refactoring to unclog things here.
It might also be worth throwing these DOM updates for processed events into a simple queue, so that if slow browser behavior creates a rendering backlog, the function actually responsible for invoking pending DOM manipulations can elect to skip outdated render operations to keep the UI acceptably snappy.
Edit: a basic queueing system might involve the following components:
The queue, itself (this can be a simple array, it just needs to be accessible to both of the components below).
A queue appender, which runs during filter_json_run_all, simply adding objects to the end of the queue representing each DOM manipulation job you plan to complete using color_table or one of your setTimeout callbacks. These objects should contain the operation to be performed (i.e. the function definition, uninvoked), and the parameters for that operation (i.e. the arguments you're passing into each function).
A queue runner, which runs on its own interval, and invokes pending DOM manipulation tasks from the front of the queue, removing them as it goes. Since this operation has access to all of the objects in the queue, it can also take steps to optimize/combine similar operations to minimize the amount of repainting it's asking the browser to do before subsequent code can be executed. For example, if you've got several color_table operations that color the same cell multiple times, you can simply perform this operation once with the data from the last color_table item in the queue involving that cell. Additionally, you can further optimize your interaction with the DOM by invoking the aggregated DOM manipulation operations themselves inside a requestAnimationFrame callback, which will ensure that scheduled reflows/repaints happen only when the browser is ready, and is preferable from a performance perspective to DOM manipulation queueing via setTimeout/setInterval.
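To make the three components a bit more concrete, here is a rough sketch. All names (`createRenderQueue`, `applyUpdate`, the cell ids) are hypothetical and not part of Tabulator or Dygraph. The scheduler is passed in as a parameter so the sketch runs outside a browser; in a page you would pass something that calls requestAnimationFrame.

```javascript
// Hypothetical render queue: keeps only the newest pending update per cell
// and applies everything in one batch when the scheduler fires.
function createRenderQueue(applyUpdate, schedule) {
  var pending = new Map(); // cellId -> latest value
  var scheduled = false;

  // Queue runner: applies the whole batch, one update per cell.
  function flush() {
    scheduled = false;
    var batch = pending;
    pending = new Map();
    batch.forEach(function (value, cellId) {
      applyUpdate(cellId, value);
    });
  }

  return {
    // Queue appender: a later update for the same cell overwrites the earlier one.
    push: function (cellId, value) {
      pending.set(cellId, value);
      if (!scheduled) {
        scheduled = true;
        schedule(flush);
      }
    }
  };
}

// Demo with a fake scheduler that just collects the callback.
var applied = [];
var scheduledCallbacks = [];
var queue = createRenderQueue(
  function (cellId, color) { applied.push(cellId + "=" + color); },
  function (cb) { scheduledCallbacks.push(cb); }
);

queue.push("row1.BID", "Chartreuse");
queue.push("row1.BID", "Crimson"); // replaces the pending Chartreuse
queue.push("row2.ASK", "yellow");
scheduledCallbacks.shift()();      // the "frame" fires once
console.log(applied);              // [ 'row1.BID=Crimson', 'row2.ASK=yellow' ]
```

In the browser, `schedule` could be `function (cb) { requestAnimationFrame(cb); }` and `applyUpdate` would set `style.backgroundColor` on the matching cell element. Outdated colors for a cell are then dropped instead of being painted one after another.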
I am using Meteor, which uses MongoDB as its database. I have code that inserts several documents into a collection when users fill out a form. When these documents are inserted, I would like to fire some JavaScript code within the server side directories that sorts through the collection in question for documents with matching fields as the documents just inserted.
My problem is that I do not know how to fire code on the server when the new documents arrive. Would it make sense to Meteor.call a Meteor.method at the end of the code involved with inserting, with the Meteor.method called performing the sorting code I need?
Edit:
As you can see, in the below code I'm not calling any Meteor methods as none exist yet. The vast majority of this code is simply lead up for the insert({}) at the end of the page, so I think it can be safely ignored. The only server side code I have is to declare the possibleGames mongo collection.
I am not sure what you mean by call a plain JavaScript function, my problem is getting any code firing at all.
possibleGames = new Mongo.Collection("possibleGames");
Template.meet_form.events({
"submit .meet_form": function(event, template){
event.preventDefault();
var user = Meteor.userId();
var where = event.target.where.value;
var checkedGames = [];
function gameCheck (game) {
if (game.checked === true){
checkedGames.push(game.value);
};
};
var checkedDays = [];
function dayCheck (day) {
if (day.checked === true){
checkedDays.push(day.value);
};
};
console.log(event.target.where.value)
gameCheck(event.target.dnd);
gameCheck(event.target.savageWorlds);
gameCheck(event.target.shadowRun);
console.log(checkedGames);
dayCheck(event.target.monday);
dayCheck(event.target.tuesday);
dayCheck(event.target.wednesday);
dayCheck(event.target.thursday);
dayCheck(event.target.friday);
dayCheck(event.target.saturday);
dayCheck(event.target.sunday);
console.log(checkedDays);
var whereWhat = [];
for (i = 0; i < checkedGames.length; i++) {
var prepareWhereWhat = where.concat(checkedGames[i]);
whereWhat.push(prepareWhereWhat);
};
console.log(whereWhat);
var whereWhatWhen = [];
for (a = 0; a < whereWhat.length; a++) {
var prepareWWW1 = whereWhat[a];
for (b = 0; b < checkedDays.length; b++) {
var prepareWWW2 = prepareWWW1.concat(checkedDays[b]);
whereWhatWhen.push(prepareWWW2);
};
};
console.log(whereWhatWhen);
for (i = 0; i < whereWhatWhen.length; i++) {
possibleGames.insert({
game: whereWhatWhen[i],
user: user,
created_on: new Date().getTime()
})
}
}
});
You don't need to do a Meteor.call on the server because you're already on the server.
Just call a plain JavaScript function.
If what you want to call from your first Meteor.method is already in another Meteor.method, then refactor that function to extract out the common bit.
Some code would also help if this is still confusing.
I am trying to create posts using a for loop, but when I look at the Parse database only the last object of my array gets stored. This is the code I wrote:
var Reggione = Parse.Object.extend("Reggione");
var creaReggione = new Reggione();
var selectobject = $('#searcharea')[0];
for (var i = 2; i < selectobject.length; i++) {
creaReggione.set("name", selectobject.options[i].text);
creaReggione.save();
}
Thanks, Bye.
Do this by creating an array of new objects, then save them together...
var newObjects = [];
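for (var i = 2; i < selectobject.length; i++) {
// create a fresh object on every pass; reusing one instance would only keep the last name
var creaReggione = new Reggione();
creaReggione.set("name", selectobject.options[i].text);
newObjects.push(creaReggione);
// ...
}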
Parse.Object.saveAll(newObjects);
Remember, if you want something to happen after saveAll completes (like call response.success() if you're in cloud code), then you should use that promise as follows...
Parse.Object.saveAll(newObjects).then(function(result) {
response.success(result);
}, function(error) {
response.error(error);
});
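One thing to watch in loops like this: if the same instance is reused in every iteration, the array ends up holding several references to one object, and only the last value survives. The sketch below shows the difference with a plain stand-in class. `FakeRecord` and the sample names are made up for illustration and this is not the Parse SDK.

```javascript
// Plain stand-in for a Parse.Object subclass; NOT the Parse SDK.
function FakeRecord() {
  this.attributes = {};
}
FakeRecord.prototype.set = function (key, value) {
  this.attributes[key] = value;
};

var names = ["Lazio", "Toscana", "Umbria"];

// Reusing one instance: the array holds three references to the same object.
var shared = new FakeRecord();
var reused = names.map(function (n) {
  shared.set("name", n);
  return shared;
});

// Creating a fresh instance per iteration: three distinct objects.
var distinct = names.map(function (n) {
  var record = new FakeRecord();
  record.set("name", n);
  return record;
});

console.log(reused.map(function (r) { return r.attributes.name; }));
// [ 'Umbria', 'Umbria', 'Umbria' ]
console.log(distinct.map(function (r) { return r.attributes.name; }));
// [ 'Lazio', 'Toscana', 'Umbria' ]
```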
In extension to danh's answer: the reason this does not work is that your loop keeps calling set() and save() on one and the same creaReggione instance.
save() is asynchronous, so the loop carries on and overwrites the name before the earlier saves have finished, and because there is only ever one object, only one row (with the last name) ends up in Parse. As danh pointed out, create a new object per option and use Parse's batch operation to save them to the server in one go:
var newObjects = [];
for (var i = 2; i < selectobject.length; i++) {
// a new object for every option
var creaReggione = new Reggione();
creaReggione.set("name", selectobject.options[i].text);
newObjects.push(creaReggione);
// ...
}
Parse.Object.saveAll(newObjects);
Hope this helps, I'd also recommend taking a look at Parse's callback functions on the save method to get more details on what happened (you can check for errors and success callbacks here to make debugging a little easier)
An example of this would be to extend the previous call with:
Parse.Object.saveAll(newObjects, {
success: function(messages) {
console.log("The objects were successfully saved...")
},
error: function(error) {
console.log("An error occurred when saving the messages array: %s", error.message)
}
})
I hope this is of some help to you
I don't even know how to properly ask this question but I have concerns about the performance (mostly memory consumption) of the following code. I am anticipating that this code will consume a lot of memory because of map on a large set and a lot of 'hanging' functions that wait for external service. Are my concerns justified here? What would be a better approach?
var list = fs.readFileSync('./mailinglist.txt', 'utf8') // say 1.000.000 records; without an encoding you get a Buffer, which has no split()
.split("\n")
.map( processEntry );
var processEntry = function _processEntry(i){
i = i.split('\t');
getEmailBody( function(emailBody, name){
var msg = {
"message" : emailBody,
"name" : i[0]
}
request(msg, function reqCb(err, result){
...
});
}); // getEmailBody
}
var getEmailBody = function _getEmailBody(obj, cb){
// read email template from file;
// v() returns the correct form for person's name with web-based service
v(obj.name, function(v){
cb(obj, v)
});
}
If you're worried about submitting a million http requests in a very short time span (which you probably should be), you'll have to set up a buffer of some kind.
one simple way to do it:
var lines = fs.readFileSync('./mailinglist.txt', 'utf8').split("\n"); // pass an encoding to get a string rather than a Buffer
var entryIdx = 0;
var done = false;
var processNextEntry = function () {
if (entryIdx < lines.length) {
processEntry(lines[entryIdx++]);
} else {
done = true;
}
};
var processEntry = function _processEntry(i){
i = i.split('\t');
getEmailBody( function(emailBody, name){
var msg = {
"message" : emailBody,
"name" : name
}
request(msg, function reqCb(err, result){
// ...
!done && processNextEntry();
});
}); // getEmailBody
}
// getEmailBody didn't change
// you set the ball rolling by calling processNextEntry n times,
// where n is a sensible number of http requests to have pending at once.
for (var i=0; i<10; i++) processNextEntry();
Edit: according to this blog post, node has an internal queue system and will only allow 5 simultaneous requests (that was the default of the HTTP agent's maxSockets per host in older versions; since v0.12 the default is Infinity). You can still use this method to avoid filling up that internal queue with a million items if you're worried about memory consumption.
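The same idea can be pulled out into a small reusable helper. This is only a sketch: `runLimited` is a made-up name, and `worker(item, cb)` stands for whatever does the real work (for example processEntry plus the request) and calls `cb` once when it has finished.

```javascript
// Hypothetical helper: run `worker(item, cb)` for every item,
// with at most `limit` calls in flight at any time.
function runLimited(items, limit, worker, onAllDone) {
  var next = 0;     // index of the next item to start
  var inFlight = 0; // number of workers that have not called back yet

  function startMore() {
    while (inFlight < limit && next < items.length) {
      inFlight++;
      worker(items[next++], function () {
        inFlight--;
        if (next >= items.length && inFlight === 0) {
          onAllDone();
          return;
        }
        startMore();
      });
    }
  }

  if (items.length === 0) {
    onAllDone();
    return;
  }
  startMore();
}

// Demo with a fake worker that only records what was started.
var started = [];
var callbacks = [];
var finished = false;
runLimited(["a", "b", "c"], 2,
  function (item, cb) { started.push(item); callbacks.push(cb); },
  function () { finished = true; });
console.log(started); // only two workers are started up front; "c" waits for a free slot
```

Memory then stays bounded by `limit` pending requests, no matter how many lines the file has.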
Firstly I would advise against using readFileSync, and instead favour the async equivalent. Blocking on IO operations should be avoided as reading from a disk is very expensive, and whilst that's the sole purpose of your code now, I would consider how that might change in the future - and arbitrarily wasting clock cycles is never a good idea.
For large data files I would read them in defined chunks and process them. If you can come up with some schema, either sentinels to distinguish data blocks within the file, or padding to boundaries, then process the file piece by piece.
This is just rough, untested off the top of my head, but something like:
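var fs = require("fs");
function doMyCoolWork(startByteIndex, endByteIndex){
fs.open("path to your text file", 'r', function(err, fd) {
if (err) { throw err; }
var chunkSize = endByteIndex - startByteIndex;
var buffer = Buffer.alloc(chunkSize); // new Buffer(size) is deprecated
// read chunkSize bytes, starting at startByteIndex in the file
fs.read(fd, buffer, 0, chunkSize, startByteIndex, function(err, byteCount) {
if (err) { throw err; }
var data = buffer.toString('utf-8', 0, byteCount);
// process your data here
// close the descriptor before moving on, since every call opens the file again
fs.close(fd, function(){
if(stillWorkToDo){
//recurse
doMyCoolWork(endByteIndex, endByteIndex + 100);
}
});
});
});
}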
Or look into one of the stream library functions for similar functionality.
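One detail when reading in chunks: a line of the mailing list can straddle two chunks, so the tail of each chunk has to be carried over to the next one. A small sketch of that bookkeeping is below. `createLineSplitter` is a hypothetical helper, the sample addresses are invented, and it assumes the chunks are already decoded strings (for example from a stream with setEncoding('utf8')).

```javascript
// Hypothetical helper: feed it chunks of text in order and it calls onLine
// once per complete line, holding back any partial line at the end of a chunk.
function createLineSplitter(onLine) {
  var leftover = "";
  return {
    push: function (chunk) {
      var parts = (leftover + chunk).split("\n");
      leftover = parts.pop(); // the last piece may be an incomplete line
      parts.forEach(function (line) { onLine(line); });
    },
    // Call once after the last chunk to emit a final line without a newline.
    end: function () {
      if (leftover !== "") {
        onLine(leftover);
      }
      leftover = "";
    }
  };
}

var seen = [];
var splitter = createLineSplitter(function (line) { seen.push(line); });
splitter.push("alice\ta@exam");
splitter.push("ple.com\nbob\tb@example.com\ncar");
splitter.push("ol\tc@example.com");
splitter.end();
console.log(seen.length); // 3
```

Each complete line can then be handed to processEntry without ever holding the whole file in memory.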
HTH
PS: JavaScript and Node work extremely well with async and eventing. Using sync is an antipattern in my opinion, and likely to cause headaches in future.
I have only recently started developing for node.js, so forgive me if this is a stupid question - I come from Javaland, where objects still live happily sequentially and synchronous. ;)
I have a key generator object that issues keys for database inserts using a variant of the high-low algorithm. Here's my code:
function KeyGenerator() {
var nextKey;
var upperBound;
this.generateKey = function(table, done) {
if (nextKey > upperBound) {
require("../sync/key-series-request").requestKeys(function(err,nextKey,upperBound) {
if (err) { return done(err); }
this.nextKey = nextKey;
this.upperBound = upperBound;
done(nextKey++);
});
} else {
done(nextKey++);
}
}
}
Obviously, when I ask it for a key, I must ensure that it never, ever issues the same key twice. In Java, if I wanted to enable concurrent access, I would make this synchronized.
In node.js, is there any similar concept, or is it unnecessary? I intend to ask the generator for a bunch of keys for a bulk insert using async.parallel. My expectation is that since node is single-threaded, I need not worry about the same key ever being issued more than once, can someone please confirm this is correct?
Obtaining a new series involves an asynchronous database operation, so if I do 20 simultaneous key requests, but the series has only two keys left, won't I end up with 18 requests for a new series? What can I do to avoid that?
UPDATE
This is the code for requestKeys:
exports.requestKeys = function (done) {
var db = require("../storage/db");
db.query("select next_key, upper_bound from key_generation where type='issue'", function(err,results) {
if (err) { done(err); } else {
if (results.length === 0) {
// Somehow we lost the "issue" row - this should never have happened
done (new Error("Could not find 'issue' row in key generation table"));
} else {
var nextKey = results[0].next_key;
var upperBound = results[0].upper_bound;
db.query("update key_generation set next_key=?, upper_bound=? where type='issue'",
[ nextKey + KEY_SERIES_WIDTH, upperBound + KEY_SERIES_WIDTH],
function (err,results) {
if (err) { done(err); } else {
done(null, nextKey, upperBound);
}
});
}
}
});
}
UPDATE 2
I should probably mention that consuming a key requires db access even if a new series doesn't have to be requested, because the consumed key will have to be marked as used in the database. The code doesn't reflect this because I ran into trouble before I got around to implementing that part.
UPDATE 3
I think I got it using event emitting:
function KeyGenerator() {
var nextKey;
var upperBound;
var emitter = new events.EventEmitter();
var requesting = true;
// Initialize the generator with the stored values
db.query("select * from key_generation where type='use'", function(err, results) {
if (err) { throw err; }
if (results.length === 0) {
throw new Error("Could not get key generation parameters: Row is missing");
}
nextKey = results[0].next_key;
upperBound = results[0].upper_bound;
console.log("Setting requesting = false, emitting event");
requesting = false;
emitter.emit("KeysAvailable");
});
this.generateKey = function(table, done) {
console.log("generateKey, state is:\n nextKey: " + nextKey + "\n upperBound:" + upperBound + "\n requesting:" + requesting + " ");
if (nextKey > upperBound) {
if (!requesting) {
requesting = true;
console.log("Requesting new series");
require("../sync/key-series-request").requestSeries(function(err,newNextKey,newUpperBound) {
if (err) { return done(err); }
console.log("New series available:\n nextKey: " + newNextKey + "\n upperBound: " + newUpperBound);
nextKey = newNextKey;
upperBound = newUpperBound;
requesting = false;
emitter.emit("KeysAvailable");
done(null,nextKey++);
});
} else {
console.log("Key request is already underway, deferring");
var that = this;
emitter.once("KeysAvailable", function() { console.log("Executing deferred call"); that.generateKey(table,done); });
}
} else {
done(null,nextKey++);
}
}
}
I've peppered it with logging outputs, and it does do what I want it to.
As another answer mentions, you will potentially end up with results different from what you want. Taking things in order:
function KeyGenerator() {
// at first I was thinking you wanted these as 'class' properties
// and thus would want to precede them with `this.` rather than declare them as vars,
// but I think you want them as 'private' member variables of the
// class instance. That's dandy, you'll just want to do things differently
// down below
var nextKey;
var upperBound;
this.generateKey = function (table, done) {
if (nextKey > upperBound) {
// truncated the require path below for readability.
// more importantly, renamed parameters to function
require("key-series-request").requestKeys(function(err,nKey,uBound) {
if (err) { return done(err); }
// note that thanks to the miracle of closures, you have access to
// the nextKey and upperBound variables from the enclosing scope
// but I needed to rename the parameters or else they would shadow/
// obscure the variables with the same name.
nextKey = nKey;
upperBound = uBound;
done(null, nextKey++); // error-first callback, matching done(err) above
});
} else {
done(null, nextKey++);
}
}
}
Regarding the .requestKeys function, you will need to somehow introduce some kind of synchronization. This isn't actually terrible in one way because with only one thread of execution, you don't need to sweat the challenge of setting your semaphore in a single operation, but it is challenging to deal with the multiple callers because you will want other callers to effectively (but not really) block waiting for the first call to requestKeys() which is going to the DB to return.
I need to think about this part a bit more. I had a basic solution in mind which involved setting a simple semaphore and queuing the callbacks, but when I was typing it up I realized I was actually introducing a more subtle potential synchronization bug when processing the queued callbacks.
UPDATE:
I was just finishing up one approach as you were writing about your EventEmitter approach, which seems reasonable. See this gist, which illustrates the approach I took. Just run it and you'll see the behavior. It has some console logging to show which calls are getting deferred for a new key block and which can be handled immediately. The primary moving part of the solution is below (note that the keyManager provides the stubbed-out implementation of your require('key-series-request')):
function KeyGenerator(km) {
this.nextKey = undefined;
this.upperBound = undefined;
this.imWorkingOnIt = false;
this.queuedCallbacks = [];
this.keyManager = km;
this.generateKey = function(table, done) {
if (this.imWorkingOnIt){
this.queuedCallbacks.push(done);
console.log('KG deferred call. Pending CBs: '+this.queuedCallbacks.length);
return;
};
var self=this;
if ((typeof(this.nextKey) ==='undefined') || (this.nextKey > this.upperBound) ){
// set a semaphore & add the callback to the queued callback list
this.imWorkingOnIt = true;
this.queuedCallbacks.push(done);
this.keyManager.requestKeys(function(err,nKey,uBound) {
if (err) { return done(err); }
self.nextKey = nKey;
self.upperBound = uBound;
var theCallbackList = self.queuedCallbacks;
self.queuedCallbacks = [];
self.imWorkingOnIt = false;
theCallbackList.forEach(function(f){
// rather than making the final callback directly,
// call KeyGenerator.generateKey() with the original
// callback
setImmediate(function(){self.generateKey(table,f);});
});
});
} else {
console.log('KG immediate call',self.nextKey);
var z= self.nextKey++;
setImmediate(function(){done(z);});
}
}
};
If your Node.js code to calculate the next key didn't need to execute an async operation then you wouldn't run into synchronization issues because there is only one JavaScript thread executing code. Access to the nextKey/upperBound variables will be done in sequence by only one thread (i.e. request 1 will access first, then request 2, then request 3 et cetera.) In the Java-world you will always need synchronization because multiple threads will be executing even if you didn't make a DB call.
However, since your Node.js code makes an async call to get the nextKey, you could get strange results. There is still only one JavaScript thread executing your code, but it would be possible for request 1 to make the call to the DB, and then Node.js might accept request 2 (while request 1 is waiting for data from the DB), and this second request will also make a request to the DB to get keys. Let's say that request 2 gets data from the DB quicker than request 1 and updates the nextKey/upperBound variables with values 100/150. Once request 1 gets its data (say values 50/100), it will update nextKey/upperBound. This scenario wouldn't result in duplicate keys, but you might see gaps in your keys (for example, not all keys 100 to 150 will be used because request 1 eventually reset the values to 50/100).
This makes me think that you will need a way to sync access, but I am not exactly sure what will be the best way to achieve this.
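One possible way to sync access, sketched under some assumptions: only one series request is ever in flight, and callers that arrive in the meantime are parked until it returns. The constructor takes `requestKeys` as a parameter (my choice, so it can be stubbed), it is assumed to call back with `(err, nextKey, upperBound)` and an inclusive upper bound like the function in the question, and the `table` argument is left out for brevity.

```javascript
// Sketch: `requestKeys(cb)` is assumed to call cb(err, nextKey, upperBound).
function KeyGenerator(requestKeys) {
  var nextKey = 1;
  var upperBound = 0;     // empty range, so the first call triggers a request
  var waiting = [];       // callbacks parked while a request is in flight
  var requesting = false;

  function drain(err) {
    if (err) {
      // Fail everyone who was waiting for this request.
      var failed = waiting;
      waiting = [];
      failed.forEach(function (cb) { cb(err); });
      return;
    }
    // Hand out keys to parked callers until the series runs out again.
    while (waiting.length > 0 && nextKey <= upperBound) {
      waiting.shift()(null, nextKey++);
    }
    if (waiting.length > 0) { refill(); }
  }

  function refill() {
    if (requesting) { return; } // only one request at a time
    requesting = true;
    requestKeys(function (err, newNext, newUpper) {
      requesting = false;
      if (!err) { nextKey = newNext; upperBound = newUpper; }
      drain(err);
    });
  }

  this.generateKey = function (done) {
    if (!requesting && nextKey <= upperBound) {
      return done(null, nextKey++);
    }
    waiting.push(done);
    refill();
  };
}

// Demo with a stubbed requestKeys that just collects its callback.
var pendingRequests = [];
var issued = [];
var gen = new KeyGenerator(function (cb) { pendingRequests.push(cb); });
for (var i = 0; i < 5; i++) {
  gen.generateKey(function (err, key) { issued.push(key); });
}
console.log(pendingRequests.length); // five callers, but only one request went out
```

This only serializes access inside one Node.js process. If several processes share the key_generation table, or if two requestKeys calls could overlap, the select followed by update in the question would still need to be made atomic on the database side.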