web3 websocket connection prevents node process from exiting - javascript

I have a node js process that creates a web3 websocket connection, like so:
web3 = new Web3('ws://localhost:7545')
When the process completes (I send it a SIGTERM), it does not exit, but rather hangs forever with no console output.
I registered a listener on SIGINT and SIGTERM to inspect which handles the process still has outstanding, using process._getActiveRequests() and process._getActiveHandles(). I see this:
Socket {
connecting: false,
_hadError: false,
_handle:
TCP {
reading: true,
owner: [Circular],
onread: [Function: onread],
onconnection: null,
writeQueueSize: 0 },
<snip>
_peername: { address: '127.0.0.1', family: 'IPv4', port: 7545 },
<snip>
}
For completeness, here is the code that's listening for the signals:
async function stop() {
  console.log('Shutting down...')
  if (process.env.DEBUG) console.log(process._getActiveHandles())
  process.exit(0)
}
process.on('SIGTERM', async () => {
  console.log('Received SIGTERM')
  await stop()
})
process.on('SIGINT', async () => {
  console.log('Received SIGINT')
  await stop()
})
Looks like web3 is holding a socket open, which makes sense since I never told it to close the connection. Looking through the documentation and googling, it doesn't look like there's a close or end method for the web3 object.
Manually closing the socket in stop above allows the process to successfully exit:
web3.currentProvider.connection.close()
Anyone have a more elegant or officially sanctioned solution? It feels funny to me that you have to manually do this rather than have the object destroy itself on process end. Other clients seem to do this automatically without explicitly telling them to close their connections. Perhaps it is cleaner to tell all the clients created by your node process to close their handles/connections on shutdown anyway, but to me, this was unexpected.

It feels funny to me that you have to manually do this rather than have the object destroy itself on process end
It feels funny because you have probably been exposed to more synchronous than asynchronous programming. Consider the code below:
fs = require('fs')
data = fs.readFileSync('file.txt', 'utf-8');
console.log("Read data", data)
When you run the above, you get this output:
$ node sync.js
Read data Hello World
This is synchronous code. Now consider the asynchronous version of the same:
fs = require('fs')
data = fs.readFile('file.txt', 'utf-8', function(err, data) {
  console.log("Got data back from file", data)
});
console.log("Read data", data);
When you run it, you get the output below:
$ node async.js
Read data undefined
Got data back from file Hello World
Now, if you think as a synchronous programmer, the program should have ended at the last console.log("Read data", data);, but another statement is printed afterwards. Does that feel funny too? Let's add an exit statement to the process:
fs = require('fs')
data = fs.readFile('file.txt', 'utf-8', function(err, data) {
  console.log("Got data back from file", data)
});
console.log("Read data", data);
process.exit(0)
Now when you run the program, it ends at the last statement.
$ node async.js
Read data undefined
But the file is never actually read. Why? Because you never gave the JavaScript engine time to execute the pending callbacks. Ideally, a process finishes automatically when there is no work left for it to do (no pending callbacks, function calls, etc.). This is the way the asynchronous world works. There are some good SO threads and articles you should look into:
https://medium.freecodecamp.org/walking-inside-nodejs-event-loop-85caeca391a9
https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
How to exit in Node.js
Why doesn't my Node.js process terminate once all listeners have been removed?
How does a node.js process know when to stop?
So in the async world, you either tell the process to exit explicitly, or it exits automatically once there are no pending tasks (which you already know how to check: process._getActiveRequests() and process._getActiveHandles()).
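To make the "no pending tasks" rule concrete, here is a minimal sketch that uses a timer as the pending handle. The function name demoTimerRef is made up for illustration; setInterval, unref(), hasRef() and clearInterval are standard Node.js APIs (hasRef() assumes Node 11 or later).

```javascript
// Sketch: a referenced timer is a pending handle that keeps the process alive;
// an unreferenced one is not. demoTimerRef is a hypothetical name.
function demoTimerRef() {
  const timer = setInterval(() => {}, 1000);
  const before = timer.hasRef(); // referenced: the event loop waits for it
  timer.unref();                 // tell Node not to stay alive just for this timer
  const after = timer.hasRef();
  clearInterval(timer);          // remove the handle entirely
  return { before, after };
}

console.log(demoTimerRef());
```

An open socket behaves like the referenced timer: until it is closed (or unref'd), it counts as work the process is still waiting on.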

The provider API for the JavaScript web3 module has gone through some substantial change recently due to the implementation of EIP-1193 and the impending release of Web3 1.0.0.
Per the code, it looks like web3.currentProvider.disconnect() should work. This method also accepts optional code and reason arguments, as described in the MDN reference docs for WebSocket.close(...).
Important: you'll notice that I referenced the source code above and not the documentation. That's because at present the disconnect method is not considered part of the public API. If you use it in your code, you should be sure to add a test case for it, as it could break at any time! From what I can see, WebSocketProvider.disconnect was introduced in web3#1.0.0-beta.38 and is still present in the latest release as of today, which is web3#1.0.0-beta.55. Given that the stable 1.0.0 release is due to drop very soon, I don't think it's likely that this will change much between now and web3#1.0.0, but there's no holds barred when it comes to the structure of internal APIs.
I've discussed making the internal providers public at length with the current maintainer, Samuel Furter, aka nivida on GitHub. I don't fully agree with his decision to keep it internal here, but in his defense he's the only maintainer at present and he's had his hands very full with stabilizing the long-standing work in progress on the 1.0 branch.
As a result of these discussions, my opinion at the moment is that those who need a stable API for their WebSocket provider should write an EIP-1193 compatible provider of their own, and publish it on NPM for others to use. Please follow semver for this, and include a similar disconnect method in your own public API. Bonus points if you write it in TypeScript, as this gives you the ability to explicitly declare class members as public, protected, or private.
If you do this, be aware that EIP-1193 is still in draft status, so you'll need to keep an eye on the EIP-1193 discussions on EthereumMagicians and in the Provider Ring Discord to stay on top of any changes that might occur.

At the end of your node js process, simply call:
web3.currentProvider.connection.close()
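The two approaches from the answers above can be combined into one defensive helper. This is only a sketch: closeProvider is a hypothetical name, not part of web3, and it assumes nothing about the provider beyond the disconnect() and connection.close() calls already mentioned in this thread.

```javascript
// Sketch only: closeProvider is a hypothetical helper, not part of web3.
// It tries the two calls mentioned in the answers above, in order.
function closeProvider(provider) {
  if (!provider) return false;
  if (typeof provider.disconnect === 'function') {
    // 1000 is the WebSocket "normal closure" code; the reason text is arbitrary.
    provider.disconnect(1000, 'process shutting down');
    return true;
  }
  if (provider.connection && typeof provider.connection.close === 'function') {
    provider.connection.close();
    return true;
  }
  return false; // nothing closable was found
}

// Usage inside the question's stop() function, before exiting
// (assumes `web3` is the instance created earlier):
//   closeProvider(web3.currentProvider);
```

Because disconnect is not a documented public method, checking for it at runtime keeps the shutdown path working if it is renamed or removed.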

Related

Async task without waiting for the result

I've got a Node.js / Express application where sometimes I need to perform a non critical async task that doesn't require waiting for the result (for example, a call to save some data in an analytics platform):
router.post("/", function (req, res, next) {
  criticalTask()
    .then(result => {
      res.json({success: true});
      nonCriticalTask();
    })
    .catch(next)
});
Is there a guarantee that the nonCriticalTask() gets executed completely without terminating it in the middle? Are there any restrictions on this?
In the end I couldn't find any documentation on this. After lots of experiments and logging, it seems that nonCriticalTask() is not terminated midway: node keeps executing it, and node doesn't exit while tasks are still executing or handles are in use, e.g. an open DB connection.
So it seems to work for my nonCriticalTask() that does analytics. That being said, it's probably a bad design practice to rely on the node engine for anything critical running in the background like this, and other approaches should be considered, e.g. persistent queues.
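One practical caveat with this fire-and-forget pattern: if nonCriticalTask() returns a promise that rejects, nothing handles the rejection. Here is a minimal sketch of a wrapper that at least reports such failures. fireAndForget and its onError parameter are hypothetical names, not part of Express or Node.

```javascript
// Sketch: start a task without waiting for it, but never leave a rejection unhandled.
// fireAndForget is a hypothetical helper name.
function fireAndForget(task, onError = console.error) {
  // Promise.resolve().then(task) runs the task on the next microtask and
  // also converts a synchronous throw inside it into a rejection.
  Promise.resolve().then(task).catch(onError);
}

// In the route above this would be: fireAndForget(() => nonCriticalTask());
fireAndForget(
  () => { throw new Error('analytics down'); },
  (err) => console.log('non-critical task failed:', err.message)
);
```

This does not make the task any more durable; it only ensures a failure is logged instead of surfacing as an unhandled rejection.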

Synchronous TCP Read in Node.js

Is there a way to do a synchronous read of a TCP socket in node.js?
I'm well aware of how to do it asynchronously by adding a callback to the socket's 'data' event:
socket.on('data', function(data) {
  // now we have the string data to do whatever with
});
I'm also aware that trying to block with a function call instead of registering callbacks goes against node's design, but we are trying to update an old node module that acts as a client for my university while maintaining backwards compatibility. So we currently have:
var someData = ourModule.getData();
Where getData() previously had a bunch of logic behind it, but now we just want to send to the server "run getData()" and wait for the result. That way all logic is server side, and not duplicated client and server side. This module already maintains a TCP connection to the server so we are just piggybacking on that.
Here are the solutions I've tried:
1. Find a blocking read function for the socket, similar to Python's socket library, hidden somewhere within node's net module.
string_from_tcp = socket.recv(1024)
The problem here is that it doesn't seem to exist (unsurprisingly, because it goes against node's ideology).
2. This syncnet module adds what I need, but has no Windows support, so I'd have to add that.
3. Find a function that allows node to unblock the event loop and then return, such that this works:
var theData = null;
clientSocket.on('data', function(data) {
  theData = data;
});
clientSocket.write("we want some data");
while(theData === null) {
  someNodeFunctionThatUnblocksEventLoopThenReturnsHere(); // in this function node can check the tcp socket and call the above 'data' callback, thus changing the value of theData
}
// now theData should be something!
The obvious problem here is that I don't think such a thing exists.
4. Use ECMAScript 6 generator functions:
var stringFromTcp = yield socketRead(1024);
The problem here is that we'd be forcing students to update their JavaScript clients to this new syntax, and understanding ES6 is outside the scope of the courses that use this.
5. Use node-gyp and add to our node module an interface to a C++ TCP library that does support synchronous reads, such as boost's asio. This would probably work, but getting the node module to compile with boost cross-platform has been a huge pain.
So I've come to Stack Overflow to make sure I'm not over-complicating this problem.
In the simplest terms I'm just trying to create a command line JavaScript program that supports synchronous tcp reads.
So any other ideas? And sorry in advance if this seems blasphemous in context of a node project, and thanks for any input.
I ended up going with option 5. I found a small, fast, and easy to build TCP library in C++ (netLink) and wrote a node module wrapper for it, aptly titled netlinkwrapper.
The module builds on Windows and Linux, but as it is a C++ addon you'll need node-gyp configured to build it.
I hope no one else has to screw with Node.js as I did using this module, but if you must block the event loop with TCP calls this is probably your only bet.

Meteor.call (Meteor.methods) seem to run on both client and server--causing issues

I execute Meteor.call from the client side to a Meteor.methods method and as I console.log things, they are logged in both the command prompt and the browser console.
The issue with this is that it actually seems to be executing on the client side--which does not have access to the proper entities. While I get no errors on the command prompt, here's what is shown on the client side:
Exception while simulating the effect of invoking 'createInvite'
Meteor.makeErrorType.errorClass {error: "not-found", reason:
"evaluator_invite__entity_not_found", details: undefined, message:
"evaluator_invite__entity_not_found [not-found]", errorType:
"Meteor.Error"…} Error: evaluator_invite__entity_not_found [not-found]
at Meteor.methods.createInvite (http://localhost:3000/lib/collections/invites.js?505cdea882e0f829d521280c2057403ec134b075:38:15)
Is it actually running on the client side? Should this error be there?
Meteor methods are expected to run in both environments if they are defined globally. This enables a nice feature called latency compensation.
The whole latency compensation concept is off topic, but basically it means simulating on the client what the server would actually do, such as inserting documents in the database right away, to design a fluid UX. You're essentially predicting server behavior before it even happens, making your client experience ultra-responsive by eliminating network latency.
An example of this might be inserting a comment in the database immediately after the user has submitted it. In this case the Meteor method calls executed on both the server and the client will share the same code.
Sometimes it makes absolutely no sense to provide a client-side simulation (or "stub"), for example when designing a method responsible for sending emails.
At other times it makes sense to share part of the code, but you need to access environment-specific objects: for example, a method to insert comments might use Email.send on the server to notify an author that a comment has been added to their post.
Meteor.methods({
  insertComment: function(comment){
    check(comment, [...]);
    if(! this.isSimulation){
      Email.send([...]);
    }
    return Comments.insert(comment);
  }
});
Ultimately, you have to structure your code differently depending on how your method is supposed to behave.
For server-only methods, define them under the server directory, with no client-side counterpart.
For shared methods that can share exactly the same code on both environments, just define them globally.
For shared methods that differ only slightly, you can use Meteor.isClient / Meteor.isServer, or the method-only property isSimulation, to check the current environment and execute specific code accordingly.
For shared methods that share little to no code, I personally use a pattern where I define the Meteor method on both client and server environment and take care of shared behavior like checking arguments validity, then I call a function responsible for actually implementing the correct behavior.
lib/method.js
Meteor.methods({
  method: function(args){
    check(args, [...]);
    // shared behavior (argument checks) above, environment-specific behavior below
    return implementation(args);
  }
});
The trick is to define the implementation separately on client and server; depending on the context, the correct function will get called.
client/method.js
implementation = function(args){
// define your client simulation / stub
};
server/method.js
implementation = function(args){
// define your server implementation
};
Read how to structure your app: http://docs.meteor.com/#/full/structuringyourapp
If you want your methods to be run only on the server, put files with methods into server folder.
If you want latency compensation you can use conditionals in your methods:
Meteor.methods({
  method1: function() {
    if (Meteor.isClient) {
      //Do client side stuff
    }
    if (Meteor.isServer) {
      //Do server side stuff
    }
  }
});

Unique field constraint not enforced after dropDatabase

I have a Node / Mongoose / MongoDB project for which I use Selenium WebDriver for integration tests. Part of my test case setup is to wipe the database before each test. I accomplish this via the command line, using the widely accepted method from "Delete everything in a MongoDB database":
mongo [database] --eval "db.dropDatabase();"
The problem is however that after running this command, Mongo no longer enforces any unique fields, such as the following as defined by my Mongoose Schema:
new Mongoose.Schema
  name :
    type : String
    required : true
    unique : true
If I restart mongod after the dropDatabase then the unique constraints are enforced again.
However since my mongod is run by a separate process, restarting it as part of my test case set up would be cumbersome, and I'm hoping unnecessary. Is this a bug or am I missing something?
As you stated, your mongod (or multiple ones) is a separate process to your application, and as such your application has no knowledge of what has happened when you dropped the database. While the collections and indeed the database will be created when accessed by default, this is not true of all the indexes you defined.
Index creation actually happens on application start-up, as is noted in this passage from the documentation:
When your application starts up, Mongoose automatically calls ensureIndex for each defined index in your schema. ...
It does not automatically happen each time a model is accessed, mostly because that would be excessive overhead: even if the index is in place and the server does nothing, these are still additional calls that you don't want preceding every "read", "update", "insert" or "delete".
So in order to have this "automatic" behavior "fire again", what you need to do is "restart" your application and this process will be run again on connection. In the same way, when your "application" cannot connect to the mongod process, it re-tries the connection and when reconnected the automatic code is run.
Alternately, you can code a function you could call to do this and expose the method from an API so you could link this function with any other external operation that does something like "drop database":
function reIndexAllModels(callback) {
  console.log("Re-Indexing all models");
  async.eachLimit(
    mongoose.connection.modelNames(),
    10,
    function(name,callback) {
      console.log("Indexing %s", name);
      mongoose.model(name).ensureIndexes(callback);
    },
    function(err) {
      if(err) throw err;
      console.log("Re-Indexing done");
      callback();
    }
  );
}
That will cycle through all registered models and re-create all indexes that are assigned to them. Depending on the number of registered models, you probably don't want to run all of these over parallel connections. So to help a little here, I use async.eachLimit to keep the concurrent operations to a manageable number.
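For readers unfamiliar with async.eachLimit, the idea of capping concurrency can be sketched without the library. The following is an illustrative promise-based equivalent, not the async module's implementation. eachWithLimit and the fake indexing worker are hypothetical names.

```javascript
// Sketch: process items with at most `limit` workers in flight at once.
async function eachWithLimit(items, limit, worker) {
  let next = 0; // index of the next unclaimed item, shared by all runners
  async function runner() {
    while (next < items.length) {
      const item = items[next++];
      await worker(item); // a runner claims a new item only after finishing one
    }
  }
  const count = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: count }, () => runner()));
}

// Stand-in for mongoose.model(name).ensureIndexes(): it just waits a little.
const fakeEnsureIndexes = (name) =>
  new Promise((resolve) => setTimeout(() => resolve(name), 5));

eachWithLimit(['User', 'Post', 'Comment'], 2, fakeEnsureIndexes)
  .then(() => console.log('Re-indexing done'));
```

With a limit of 2, the third model is only indexed once one of the first two has finished, which is the behaviour the answer relies on.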
Thanks to Neil Lunn for pointing out the pitfalls of dropDatabase and re-indexing. Ultimately, I went for the following command line solution and removed the dropDatabase command.
mongo [database] --eval "db.getCollectionNames().forEach(function(n){db[n].remove({})});"

Node.js Server Crash Handling

Is there any way I can do some database updation things whenever my Node.js server crashes or is stopped, like try{}catch(){}finally{} in Java? I am a bit of a newbie here.
Are there any events that node emits before it shuts down? If so, I can write my function there.
I have a scenario: if I stop the server manually, I need to update some fields in the database.
The same goes for unhandled crashes.
I have heard about domains in Node.js, but I have no idea how to monitor a whole server using a domain.
An event is emitted when the node process is about to exit:
process.on('exit', function(code) {
  console.log('About to exit with code:', code);
});
http://nodejs.org/api/process.html#process_event_exit
You can't query the database here though, since this handler can only perform synchronous operations. Some possible alternatives:
use database transactions so you never need to do "database updation things" when your app crashes
use a tool like Upstart to automatically restart your process, and then do database fixup stuff whenever your process starts
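Since the 'exit' handler cannot wait for asynchronous work, one further option is to hook the events that precede exit (SIGINT, SIGTERM and uncaughtException), run the asynchronous cleanup there, and only then call process.exit. The sketch below assumes a cleanup function of your own that returns a promise. installShutdownHooks and the proc parameter are hypothetical names introduced for illustration.

```javascript
// Sketch: run async cleanup once, then exit. Not a complete crash-safety solution.
function installShutdownHooks(cleanup, proc = process) {
  let shuttingDown = false;

  async function shutdown(exitCode) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    try {
      await cleanup();
    } catch (err) {
      console.error('Cleanup failed:', err);
      exitCode = 1;
    }
    proc.exit(exitCode);
  }

  proc.on('SIGINT', () => shutdown(0));
  proc.on('SIGTERM', () => shutdown(0));
  proc.on('uncaughtException', (err) => {
    console.error(err);
    shutdown(1);
  });
}

// Usage (updateDatabaseFields is a placeholder for your own async function):
//   installShutdownHooks(() => updateDatabaseFields());
```

This cannot cover SIGKILL or a power failure, so the transaction and fix-up-on-start suggestions above still apply.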
When you are using Node.js, it's bad practice to rely on try / catch because of the large number of asynchronous calls: a try / catch block does not catch errors thrown later inside asynchronous callbacks. The best practice here is to use "promises". See the following link for a good explanation: https://www.promisejs.org/
