I'm trying out the Node.js framework on one of my projects.
I'm really seeing the advantages of what they call the "event-driven, non-blocking I/O model"; however, in my project there are moments where I don't necessarily want asynchronous calls, and where I need to complete several operations before launching an asynchronous call.
Especially when I want to factor out common code into functions.
Typically I have the following case:
I know that in several parts of my program I have to check whether a media item exists in my database for a given string or id.
So, as someone who tries to stay organized, I want to create a function that I will call each time I need to check this.
However, I have not found a way to do that with Node.js and pg (the npm PostgreSQL library, https://github.com/brianc/node-postgres/). There is always a callback in the function, and the return value is null because of the callback. Here is an example:
/*
Function which is supposed to check if a media existing
*/
function is_media_existing (url_or_id){
log.debug("is_media_existing : begin of the function", {"Parameter" : url_or_id});
pg.connect(connectionstring, function (err, client, done) {
if (err) {
log.warning("is_media_existing : Problem with Database connection", {
"Parameter": url_or_id,
"Error": err
});
}
if (isNaN(url_or_id)) {
// Case is parameter is not a number (string)
var query = client.query('SELECT COUNT(*) as count FROM media WHERE url = $1::string ', url_or_id);
query.on('error', function (error) {
log.warning("is_media_existing : Problem with Database query (connection to db passed but not query " +
"", {"Parameter": url_or_id, "Error": error});
});
return query;
} else {
// Case is parameter is a int
log.debug("is_media_existing : Type of Parameter is a string");
// Case is parameter is not a number (string)
var query = client.query('SELECT COUNT(*) as count FROM media WHERE id = $1::id ', url_or_id);
query.on('error', function (error) {
log.warning("is_media_existing : Problem with Database query (connection to db passed but not query " +
"", {"Parameter": url_or_id, "Error": error});
});
return query;
}
});
}
// Executing the function
var test = is_media_existing("http://random_url_existing_in_db");
// test is always null as the return is in a callback and the callback is asynchronous
I have the feeling my question touches on the core concepts of Node.js, and perhaps my approach is wrong; I apologize in advance.
I know it's not good to wait for a response before doing something.
But what's the alternative? How can I factor my code into functions when I need the same functionality in several parts of my code?
So if anyone could explain how to do that following programming best practices, it would be great.
Thanks
Anselme
As Cody says, you probably don't want a synchronous function.
The way you should handle the situation in your example is to pass in your own callback, like this:
function is_media_existing (url_or_id, callback){
and then, instead of return query;, use your callback, like this:
callback(query);
or, probably better, follow the Node convention that callback functions take two parameters (err, result), so your callback call would look like this:
callback(null, query);
Here is a rework of your sample
function is_media_existing (url_or_id, callback){ /* callback(err, result) */
log.debug("is_media_existing : begin of the function", {"Parameter" : url_or_id});
pg.connect(connectionstring, function (err, client, done) {
if (err) {
done(err);
log.warning("is_media_existing : Problem with Database connection", {
"Parameter": url_or_id,
"Error": err
});
return callback(err, null);
/* note that this return is simply used to exit the connect's callback and the return value is typically
* not used it is the call to callback() that returns the error value */
}
var qrystr;
if (isNaN(url_or_id)) {
log.debug("is_media_existing : Type of Parameter is a string");
qrystr = 'SELECT COUNT(*) as count FROM media WHERE url = $1;';
} else {
qrystr = 'SELECT COUNT(*) as count FROM media WHERE id = $1;';
}
client.query(qrystr, [url_or_id], function(err, result){
done();
if(err){
/* .. */
}
callback(err, result);
});
});
}
// Executing the function
var test = is_media_existing("http://random_url_existing_in_db", function(err, result){
if(err){
}else {
}
});
If you end up with a hard nest of callbacks, promises are really worth looking into.
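For illustration, here is a minimal sketch (not a drop-in, just the idea) of how the reworked callback version above could be wrapped in a native Promise, so callers can chain .then() instead of nesting callbacks:

// Sketch only: wraps the callback-based is_media_existing in a Promise.
function is_media_existing_p(url_or_id) {
  return new Promise(function (resolve, reject) {
    is_media_existing(url_or_id, function (err, result) {
      if (err) return reject(err);
      resolve(result);
    });
  });
}

// Usage: chain instead of nesting
is_media_existing_p("http://random_url_existing_in_db")
  .then(function (result) {
    // e.g. inspect result.rows[0].count here
  })
  .catch(function (err) {
    // handle the error
  });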
I don't think you really do want a synchronous call. The problem with synchronous calls in Node is that they stop the entire process from doing anything while the synchronous function is running, because they block the event loop. As an example, let's say your sync function takes 2 seconds to complete. Your server will then do nothing for 2 full seconds. Those 2 seconds include everything (accepting new connections, everything else, etc). The reason blocking functions don't exist is because they are (very) bad. Here is an example of how your function will work in an async manner.
is_media_existing("http://random_url_existing_in_db", function(exists){
if (exists){
//do stuff
} else {
//do this other stuff
}
});
Then within is_media_existing you will need to call that callback function when your query completes.
// pseudo-code
function is_media_existing(url, callback){
query('select COUNT(*) as count FROM media WHERE id = $1 ', [url], function(err, result){
if (err)
callback(false)
else
callback(result.count > 0)
})
}
With the new ES6-plus async features and Babel it's simpler. You can npm i -g babel and npm i babel-runtime, then compile and run the following with babel test.js --optional runtime --stage 2 | node. Please read the following example carefully to see how to adapt it to your use case:
let testData = [
{ id: 0, childIds: [1,2]},
{ id: 1, childIds:[] }
];
function dbGet(ids) {
return new Promise( r=> {
// this an example; you could do any db
// query here and call r with the results
r(ids.map((id) => { return testData[id];}));
});
}
async function dbExists(ids) {
let found = await dbGet(ids);
return (found && found.length>0);
}
async function test() {
var exists = await dbExists([0]);
console.log(exists);
}
test().then(f=>{}).catch( e=> {console.log('e',e)});
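To connect this back to the original question: a rough sketch of the media check in the same async/await style might look like the following. Note that queryAsync here is a hypothetical helper you would write yourself (wrapping pg.connect and client.query in a Promise); it is not part of the pg API:

// Sketch only; queryAsync is a hypothetical promise-returning wrapper
// around pg.connect + client.query that you would implement yourself.
async function isMediaExisting(urlOrId) {
  const sql = isNaN(urlOrId)
    ? 'SELECT COUNT(*) as count FROM media WHERE url = $1'
    : 'SELECT COUNT(*) as count FROM media WHERE id = $1';
  const result = await queryAsync(sql, [urlOrId]);
  return Number(result.rows[0].count) > 0;
}

async function check() {
  const exists = await isMediaExisting("http://random_url_existing_in_db");
  console.log(exists);
}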
Related
I ran into an issue whilst attempting to create the logic to add rows to a new table I made on my MySQL database. When adding a row I need to query the database 4 times to check other rows and then add the correct value to the new row. I am using Node.js and the mysql module to accomplish this. While coding I ran into a snag: the code does not wait for the 4 queries to finish before inserting the new row, which then gives the values being looked up a value of 0 every time. After some research I realized a callback function would be in order, looking something like this:
var n = 0;
connection.query("select...", function(err, rows){
if(err) throw err;
else{
if(rows.length === 1) ++n;
}
callback();
});
function callback(){
connection.query("insert...", function(err){
if(err) throw err;
});
}
Note: The select queries can only return one item, so the if condition should not affect this issue.
A callback function with only one query to wait on is clear to me, but I become a bit lost with multiple queries to wait on. The only idea I had was to create a counter variable that increments each time a query's callback fires and is then checked inside the callback: the insert query would be wrapped in an if statement whose condition is that the counter equals the number of queries that need to complete, 4 for my purposes here. I could see this working, but I wasn't sure whether this sort of situation already has a built-in solution or whether there are other, better solutions already developed.
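For reference, the counter idea described above could look roughly like this (a sketch only; the answers below show cleaner alternatives). The "select..." and "insert..." strings stand in for the actual queries:

// Sketch of the counting approach: run the four SELECTs, and only
// run the INSERT once all of them have reported back.
var n = 0;        // value computed from the four checks
var finished = 0; // how many SELECT callbacks have fired so far
var total = 4;

function queryDone() {
  finished++;
  if (finished === total) {
    // all four SELECTs are done; n now holds its final value
    connection.query("insert...", function(err) {
      if (err) throw err;
    });
  }
}

["select...", "select...", "select...", "select..."].forEach(function(sql) {
  connection.query(sql, function(err, rows) {
    if (err) throw err;
    if (rows.length === 1) ++n;
    queryDone();
  });
});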
You need async (https://github.com/caolan/async). You can implement very complex logic with this module.
var data = {}; // You can do this in many ways, but one way is to define a shared object so every function can add its results to it
function firstQueryFunction(callback){
//do your stuff with mysql
data.stuff = rows[0].stuff; //you can store stuff inside your data object
callback(null);
}
function secondQueryFunction(callback){
//do your stuff with mysql
callback(null);
}
function thirdQueryFunction(callback){
//do your stuff with mysql
callback(null);
}
function fourthQueryFunction(callback){
//do your stuff with mysql
callback(null);
}
// These functions will be executed at the same time
async.parallel([
firstQueryFunction,
secondQueryFunction,
thirdQueryFunction,
fourthQueryFunction
], function (err, result) {
//This code will be executed after all previous queries are done (the order doesn't matter).
//For example you can do another query that depends of the result of all the previous queries.
});
As per Gesper's answer, I'd recommend the async library; however, I would probably recommend running the queries in parallel (unless the result of the 1st query is used as input to the 2nd query).
var async = require('async');
function runQueries(param1, param2, callback) {
async.parallel([query1, query2, query3(param1, param2), query4],
function(err, results) {
if(err) {
callback(err);
return;
}
var combinedResult = {};
for(var i = 0; i < results.length; i++) {
combinedResult.query1 = combinedResult.query1 || results[i].query1;
combinedResult.query2 = combinedResult.query2 || results[i].query2;
combinedResult.query3 = combinedResult.query3 || results[i].query3;
combinedResult.query4 = combinedResult.query4 || results[i].query4;
}
callback(null, combinedResult);
});
}
function query1(callback) {
dataResource.Query(function(err, result) {
var interimResult = {};
interimResult.query1 = result;
callback(null, interimResult);
});
}
function query2(callback) {
dataResource.Query(function(err, result) {
var interimResult = {};
interimResult.query2 = result;
callback(null, interimResult);
});
}
function query3(param1, param2) {
return function(callback) {
dataResource.Query(param1, param2, function(err, result) {
var interimResult = {};
interimResult.query3 = result;
callback(null, interimResult);
});
}
}
function query4(callback) {
dataResource.Query(function(err, result) {
var interimResult = {};
interimResult.query4 = result;
callback(null, interimResult);
});
}
Query3 shows the use of parameters being 'passed through' to the query function.
I'm sure someone can show me a much better way of combining the results, but that is the best I have come up with so far; one alternative is sketched below. The reason for using the interim object is that the "results" parameter passed to your callback is an array of results, and it can be difficult to determine which result is for which query.
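For example (a sketch only, not tested against your data layer), async.parallel also accepts an object of named tasks and then keys the results by task name, which removes the need for the interim objects above (each query function could then just call callback(null, result) with its raw result):

// Sketch: named tasks instead of an array, inside runQueries.
async.parallel({
  query1: query1,
  query2: query2,
  query3: query3(param1, param2),
  query4: query4
}, function (err, results) {
  if (err) {
    callback(err);
    return;
  }
  // results.query1, results.query2, results.query3, results.query4
  callback(null, results);
});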
Good luck.
I have a function that will poll a database every x seconds. I am using the Q library, so the function will return a promise. The function will ultimately be used in a long chain of .then()s.
The function does give me the results that I expect, but it continues to run for 30-40 seconds after the results are returned. I cannot figure out why it does not exit right after I return.
var _ = require('lodash');
var pg = require('pg');
var Q = require('q');
connString = 'postgres://somedb_info';
var query = "SELECT * FROM job where jobid='somejobid123123'";
exports.run_poller = function () {
var deferred = Q.defer();
function exec_query(callback) {
pg.connect(connString, function(err, client, done) {
if(err) {
deferred.reject(err);
}
client.query(query, function(err, result) {
done();
if(err) {
return deferred.reject(err);
}
callback(result.rows[0]);
});
});
}
function wait_for(res){
if(res.status == 'COMPLETE') {
return deferred.resolve(res);
} else {
setTimeout(function(){
exec_query(wait_for);
}, 1000);
}
}
exec_query(wait_for);
return deferred.promise;
};
Just to test this I call the function from a main.js file like so:
var poller = require('./utils/poller').run_poller;
poller().then(console.log).catch(function(err) {console.log(err,'*');});
Why doesn't main.js exit right after the data is returned? Is there a better way to achieve this?
I see that pg maintains a connection pool. I would assume that it's either taking a while to time out some shared resources or it just takes a little while to shut down.
You may be able to just call process.exit(0) in your node app to force it to exit sooner if you know you're done with all your work.
Or, you may be able to find configuration settings that affect how the connection pool works.
On this doc page, there's an example like this that might help:
//disconnect client when all queries are finished
client.on('drain', client.end.bind(client));
You should read the doc for .end() to make sure you're using it correctly, as there are some cases where it says it should not be called (though they may not apply if you're done with all activity).
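If it's the pool that is keeping the process alive, something along these lines may also help (a sketch only, assuming a node-postgres version that exposes pg.end() on the module to release all pooled clients):

// In main.js: release the pg pool once all work is finished,
// so the process can exit on its own.
var pg = require('pg');
var poller = require('./utils/poller').run_poller;

poller()
  .then(function (result) {
    console.log(result);
    pg.end(); // close the pooled connections held by the pg module
  })
  .catch(function (err) {
    console.log(err, '*');
    pg.end();
  });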
So right now I'm trying to use Node.js to access files in order to write them to a server and process them.
I've split it into the following steps:
Traverse directories to generate an array of all of the file paths
Put the raw text data from each of the file paths in another array
Process the raw data
The first two steps are working fine, using these functions:
var walk = function(dir, done) {
var results = [];
fs.readdir(dir, function(err, list) {
if (err) return done(err);
var pending = list.length;
if (!pending) return done(null, results);
list.forEach(function(file) {
file = path.resolve(dir, file);
fs.stat(file, function(err, stat) {
if (stat && stat.isDirectory()) {
walk(file, function(err, res) {
results = results.concat(res);
if (!--pending) done(null, results);
});
} else {
results.push(file);
if (!--pending) done(null, results);
}
});
});
});
};
function processfilepaths(callback) {
// reading each file
for (var k in filepaths) { if (arrayHasOwnIndex(filepaths, k)) {
fs.readFile(filepaths[k], function (err, data) {
if (err) throw err;
rawdata[k] = data.toString().split(/ *[\t\r\n\v\f]+/g);
for (var j in rawdata[k]) { if (arrayHasOwnIndex(rawdata[k], j)) {
rawdata[k][j] = rawdata[k][j].split(/: *|: +/);
}}
});
}}
if (callback) callback();
}
Obviously, I want to call the function processrawdata() after all of the data has been loaded. However, using callbacks doesn't seem to work.
walk(rootdirectory, function(err, results) {
if (err) throw err;
filepaths = results.slice();
processfilepaths(processrawdata);
});
This never causes an error. Everything seems to run perfectly except that processrawdata() is always finished before processfilepaths(). What am I doing wrong?
You are having a problem with callback invocation and asynchronously called functions. IMO I'd recommend that you use a library such as after-all to execute a callback once all your functions have executed.
Here's an example; the function done will be called once all the functions wrapped with next have been called.
var afterAll = require('after-all');
// Call `done` once all the functions
// wrapped with next() get called
next = afterAll(done);
// first execute this
setTimeout(next(function() {
console.log('Step two.');
}), 500);
// then this
setTimeout(next(function() {
console.log('Step one.');
}), 100);
function done() {
console.log("Yay we're done!");
}
I think for your problem you can use the async module for Node.js:
async.series([
function(){ ... },
function(){ ... }
]);
To answer your actual question, I need to explain how Node.js works:
Say you call an async operation (say a MySQL db query). Node.js sends "execute this query" to MySQL. Since this query will take some time (maybe some milliseconds), Node.js performs the query using the MySQL async library, getting back to the event loop and doing something else there while waiting for MySQL to get back to us, like handling that HTTP request.
So, in your case both functions are independent and execute almost in parallel.
For more information:
Async.js for use with Node.js
function processfilepaths(callback) {
// reading each file
for (var k in filepaths) { if (arrayHasOwnIndex(filepaths, k)) {
fs.readFile(filepaths[k], function (err, data) {
if (err) throw err;
rawdata[k] = data.toString().split(/ *[\t\r\n\v\f]+/g);
for (var j in rawdata[k]) { if (arrayHasOwnIndex(rawdata[k], j)) {
rawdata[k][j] = rawdata[k][j].split(/: *|: +/);
}}
});
}}
if (callback) callback();
}
Realize that you have:
for
readfile (err, callback) {... }
if ...
Node will call each readFile asynchronously, which only sets up the operation and its callback; then, when it is done issuing all the readFile calls, it reaches the if, before any of the callbacks has probably even had a chance to be invoked.
You need to use either Promises or a control-flow module like async to coordinate this. What you would then do looks like:
async.XXXX(filepaths, processRawData,
function (err, ...) {
// function for when all are done
if (callback) callback();
}
);
Where XXXX is one of the functions from the library like series, parallel, each, etc. The only other thing you need to know is that in your process-raw-data function, async gives you a callback to call when you are done. Unless you really need sequential access (I don't think you do), use parallel so that you can queue up as many I/O events as possible; it should execute faster, maybe only marginally, but it'll better leverage the hardware.
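As a concrete illustration (a sketch only, adapted to the file-reading step above and using the same split logic), async.each can drive the reads and fire a single callback once every file has been processed:

// Sketch: read every file, then invoke the final callback exactly once
// after all reads and per-file processing have completed.
var async = require('async');
var fs = require('fs');

function processfilepaths(filepaths, callback) {
  var rawdata = {};
  async.each(filepaths, function (filepath, next) {
    fs.readFile(filepath, function (err, data) {
      if (err) return next(err);
      rawdata[filepath] = data.toString().split(/ *[\t\r\n\v\f]+/g);
      // the per-line ':' split from the question could happen here as well
      next();
    });
  }, function (err) {
    callback(err, rawdata); // runs only after every file is done (or on the first error)
  });
}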
So basically I am making a database query to get all posts with a certain id, then adding them to a list so I can return it. But the list is returned before the callback has finished.
How do I prevent it from being returned before callback has finished?
exports.getBlogEntries = function(opid) {
var list12 =[];
Entry.find({'opid' : opid}, function(err, entries) {
if(!err) {
console.log("adding");
entries.forEach( function(currentEntry){
list12.push(currentEntry);
});
}
else {
console.log("EEEERROOR");
}
//else {console.log("err");}
});
console.log(list12);
return list12;
};
All callbacks are asynchronous, so we have no guarantee that they will run exactly in the order we wrote them.
To fix this and make the process "synchronous", guaranteeing the execution order, you have two solutions:
First: nest the calls:
instead of this:
MyModel1.find({}, function(err, docsModel1) {
callback(err, docsModel1);
});
MyModel2.find({}, function(err, docsModel2) {
callback(err, docsModel2);
});
use this:
MyModel1.find({}, function(err, docsModel1) {
MyModel2.find({}, function(err, docsModel2) {
callback(err, docsModel1, docsModel2);
});
});
The last snippet above guarantees us that MyModel2 will be executed AFTER MyModel1 is executed.
Second: use a framework such as Async. This framework is awesome and has several helper functions to execute code in series, in parallel, or whatever way we want.
Example:
async.series(
{
function1 : function(callback) {
//your first code here
//...
callback(null, 'some result here');
},
function2 : function(callback) {
//your second code here (called only after the first one)
callback(null, 'another result here');
}
},
function(err, results) {
//capture the results from function1 and function2
//if function1 raise some error, function2 will not be called.
results.function1; // 'some result here'
results.function2; // 'another result here'
//do something else...
}
);
You could use sync database calls, but that would work against the whole concept of Node.js.
The proper way is to pass a callback to the function that queries the database and then call the provided callback inside the database query callback.
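Applied to the function in the question, that could look roughly like this (a sketch only):

// Sketch: the function takes a callback and hands it the entries
// once the query has actually finished.
exports.getBlogEntries = function(opid, callback) {
  Entry.find({'opid': opid}, function(err, entries) {
    if (err) {
      console.log("EEEERROOR");
      return callback(err);
    }
    console.log("adding");
    callback(null, entries);
  });
};

// Caller:
// getBlogEntries(someOpid, function(err, list12) { /* use list12 here */ });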
How do I prevent it from being returned before callback has finished?
The callback is asynchronous, and you cannot avoid that. Hence, you must not return a list.
Instead, offer a callback for when it's filled. Or return a Promise for the list. Example:
exports.getBlogEntries = function(opid, callback) {
Entry.find({'opid': opid}, callback); // yes, that's it.
// Everything else was boilerplate code
};
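And if you prefer the Promise route mentioned above, a minimal sketch (wrapping the same query manually in a native Promise) could be:

// Sketch: return a Promise for the list instead of taking a callback.
exports.getBlogEntries = function(opid) {
  return new Promise(function(resolve, reject) {
    Entry.find({'opid': opid}, function(err, entries) {
      if (err) return reject(err);
      resolve(entries);
    });
  });
};

// Caller:
// getBlogEntries(someOpid).then(function(list12) { /* ... */ }).catch(function(err) { /* ... */ });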
There is an alternative way to handle this scenario: you can use the async module, and when the forEach has finished, make the return call. Please find the code snippet below:
var async = require('async');
exports.getBlogEntries = function(opid) {
var list12 =[];
Entry.find({'opid' : opid}, function(err, entries) {
if(!err) {
console.log("adding");
async.forEachSeries(entries, function(entry, returnFunction){
list12.push(entry);
returnFunction();
}, function(){
console.log(list12);
return list12;
});
}
else{
console.log("EEEERROOR");
}
});
};
I'm using the Node.JS driver for MongoDB, and I'd like to perform a synchronous query, like such:
function getAThing()
{
var db = new mongo.Db("mydatabase", server, {});
db.open(function(err, db)
{
db.authenticate("myuser", "mypassword", function(err, success)
{
if (success)
{
db.collection("Things", function(err, collection)
{
collection.findOne({ name : "bob"}, function(err, thing)
{
return thing;
});
});
}
});
});
}
The problem is, db.open is an asynchronous call (it doesn't block), so getAThing returns "undefined", and I want it to return the results of the query. I'm sure I could use some sort of blocking mechanism, but I'd like to know the right way to do something like this.
ES 6 (Node 8+)
You can utilize async/await
The await operator pauses the execution of an async function until the Promise is resolved, and then returns the resolved value.
This way your code will work in a synchronous style:
async function getUserData() {
const query = MySchema.findOne({ name: /tester/gi });
const userData = await query.exec();
console.log(userData);
}
Older Solution - June 2013 ;)
Now that mongo-sync is available, this is the right way to make a synchronous MongoDB query in Node.js.
I am using this for the same purpose. You can just write a sync method like below:
var Server = require("mongo-sync").Server;
var server = new Server('127.0.0.1');
var result = server.db("testdb").getCollection("testCollection").find().toArray();
console.log(result);
Note: It depends on node-fibers, and there are some issues with it on Windows 8.
Happy coding :)
There's no way to make this synchronous w/o some sort of terrible hack. The right way is to have getAThing accept a callback function as a parameter and then call that function once thing is available.
function getAThing(callback)
{
var db = new mongo.Db("mydatabase", server, {});
db.open(function(err, db)
{
db.authenticate("myuser", "mypassword", function(err, success)
{
if (success)
{
db.collection("Things", function(err, collection)
{
collection.findOne({ name : "bob"}, function(err, thing)
{
db.close();
callback(err, thing);
});
});
}
});
});
}
Node 7.6+ Update
async/await now provides a way of coding in a synchronous style when using asynchronous APIs that return promises (like the native MongoDB driver does).
Using this approach, the above method can be written as:
async function getAThing() {
let db = await mongodb.MongoClient.connect('mongodb://server/mydatabase');
if (await db.authenticate("myuser", "mypassword")) {
let thing = await db.collection("Things").findOne({ name: "bob" });
await db.close();
return thing;
}
}
Which you can then call from another async function as let thing = await getAThing();.
However, it's worth noting that MongoClient provides a connection pool, so you shouldn't be opening and closing it within this method. Instead, call MongoClient.connect during your app startup and then simplify your method to:
async function getAThing() {
return db.collection("Things").findOne({ name: "bob" });
}
Note that we don't call await within the method, instead directly returning the promise that's returned by findOne.
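And if the caller itself isn't an async function, the returned promise can still be consumed directly, for example:

// Consuming getAThing() from non-async code.
getAThing()
  .then(function (thing) {
    console.log(thing);
  })
  .catch(function (err) {
    console.error(err);
  });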
While it's not strictly synchronous, a pattern I've repeatedly adopted and found very useful is to use co and promisify, and yield on asynchronous functions. For mongo, you could rewrite the above:
var query = co( function* () {
var db = new mongo.Db("mydatabase", server, {});
db = promisify.object( db );
db = yield db.open();
yield db.authenticate("myuser", "mypassword");
var collection = yield db.collection("Things");
return yield collection.findOne( { name : "bob"} );
});
query.then( result => {
} ).catch( err => {
} );
This means:
You can write "synchronous"-like code with any asynchronous library
Errors are thrown from the callbacks, meaning you don't need the success check
You can pass the result as a promise to any other piece of code