Node JS, createServer, and the Event Loop - javascript

Behind the scenes in node, how does the http module's createServer method (and its callback) interact with the event loop? Is it possible to build functionality similar to createServer on my own in userland, or would this require a change to node's underlying system code?
That is, my general understanding of node's event loop is
Event loop ticks
Node looks for any callbacks to run
Node runs those callbacks
Event loop ticks again, process repeats ad infinitum
What I'm still a little fuzzy on is how createServer fits into the event loop. If I do something like this
var http = require('http');
// create an http server and handle with a simple hello world message
var server = http.createServer(function (request, response) {
//...
});
I'm telling node to run my callback whenever an HTTP request comes in. That doesn't seem compatible with the event loop model I understand. It seems like there's some non-userland and non-event loop that's listening for HTTP requests, and then running my callback if one comes in.
Put another way: if I think about implementing my own version of createServer, I can't think of a way to do it, since any callback I schedule will run once. Does createServer just use setTimeout or setInterval to constantly recheck for an incoming HTTP request? Or is there something lower level and more efficient going on? I understand I don't need to fully understand this to write efficient node code, but I'm curious how the underlying system was implemented.
(I tried following along in the node source, but the going is slow since I'm not familiar with the node module system, or the legacy assumptions w/r/t coding patterns deep in the system code)

http.createServer is a convenience method for creating a new http.Server() and attaching the callback as an event listener to the request event. Of course the node http library implements the protocol parsing, as well.
There is no constant polling in the event loop. Node waits for the C++ TCP bindings to receive data on the socket, and the bindings then marshal that data as a buffer up to JavaScript, where it reaches your callback.
If you were to implement your own http parser, you would start with a net.Server object as your base. See node's implementation here: https://github.com/joyent/node/blob/master/lib/_http_server.js#L253
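To make the "start with a net.Server" suggestion concrete, here is a minimal userland sketch. It is not node's actual implementation: `parseRequestHead` and `createTinyServer` are hypothetical names, and the parsing only handles the request line and headers, ignoring bodies, pipelining and malformed input.

```javascript
const net = require('net');

// Hypothetical helper: parse only the request line and headers of an HTTP/1.x request.
function parseRequestHead(text) {
  const [head] = text.split('\r\n\r\n');
  const [requestLine, ...headerLines] = head.split('\r\n');
  const [method, url, version] = requestLine.split(' ');
  const headers = {};
  for (const line of headerLines) {
    const i = line.indexOf(':');
    if (i > 0) headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
  }
  return { method, url, version, headers };
}

// Hypothetical userland stand-in for http.createServer, built on net.Server.
function createTinyServer(onRequest) {
  const server = net.createServer((socket) => {
    let buffered = '';
    socket.setEncoding('utf8');
    socket.on('data', (chunk) => {
      buffered += chunk;
      if (!buffered.includes('\r\n\r\n')) return; // head not complete yet
      const request = parseRequestHead(buffered);
      buffered = '';
      // net.Server is an EventEmitter, so a custom 'request' event can be emitted on it.
      server.emit('request', request, socket);
    });
  });
  if (onRequest) server.on('request', onRequest);
  return server;
}
```

Under those assumptions, something like `createTinyServer((req, socket) => socket.end('HTTP/1.1 200 OK\r\n\r\nhello\n')).listen(8080)` would answer simple requests (the port is arbitrary). Nothing here polls: the 'data' listener only runs when the TCP layer hands over bytes.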

The events library does the generation and handling of events, as mentioned by CrazyTrain in the comments. It has the EventEmitter class, which is used for servers, sockets, streams, etc.
The event loop, as you said, is an infinite loop executing the callbacks on every tick. The callback provided to the http server is an event handler, specifically for the request event.
var server = http.createServer(function (request, response) { /* request handler */ });
Event handlers can be executed multiple times. http.Server is an EventEmitter. The way it handles incoming requests is that it first parses an incoming request. When parsed, it emits the request event. The EventEmitter then executes the callback for request with the parameters supplied.
You are right that EventEmitter is not a part of the event loop. The emitting has to be implemented by the developer of the module or library, which only calls the handlers provided by the user of the module. But most importantly, EventEmitter provides the necessary mechanism to implement events.
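As a small illustration of a handler being registered once and run many times, here is a sketch using the events module directly. `TinyServer` and its `receive` method are made-up names standing in for whatever the module author does when a request has been parsed.

```javascript
const { EventEmitter } = require('events');

// Hypothetical module-side class; the module author decides when to emit.
class TinyServer extends EventEmitter {
  // Pretend a request arrived and was parsed.
  receive(url) {
    this.emit('request', { url });
  }
}

const server = new TinyServer();
const seen = [];

// Registered once by the user of the module, invoked on every emit.
server.on('request', (request) => {
  seen.push(request.url);
});

server.receive('/first');
server.receive('/second');
console.log(seen); // [ '/first', '/second' ]
```

Note that emit() calls the listeners synchronously; the event loop only comes into it when the thing that triggers the emit (a socket read, a timer) is itself asynchronous.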

Related

NodeJS TCP server not working because of while loop

I'm going to illustrate my issue with this simple example.
What boggles my mind is why the server never gets created
and the socket never gets printed.
If I were to remove the while loop everything works.
What do I have to change to make the example below function?
const net = require('net');
net.createServer(socket => {
socket.setEncoding('utf-8');
console.log(socket);
}).listen(4242, '127.0.0.1');
console.log('do some while logic here')
while(true) { }
This happens because your socket creation is not an instant process. It needs to make system calls and so on. In other words, it is asynchronous. The way JavaScript works is that it has a main loop and a callback queue. Basically, the main loop is what is currently executing, and the callback queue holds the things that are waiting to be executed (see the MDN docs on this: https://developer.mozilla.org/en-US/docs/Web/JavaScript/EventLoop).
What happens in your case is that your callback goes to the callback queue and waits to be executed, but it never gets to do so, because your main loop is blocked by the while (true) {} loop. If you want nonblocking behavior, you need to send the things that are inside your while loop to the callback queue instead. One way to do that in JavaScript is to use setTimeout. E.g.
const net = require('net');
net.createServer(socket => {
socket.setEncoding('utf-8');
console.log(socket);
}).listen(4242, '127.0.0.1');
console.log('do some while logic here')
function main() {
// do something here
setTimeout(main);
}
main()
This way you're not going to have a stack overflow issue, and you get nonblocking behavior in place of your while loop.
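The same idea can be applied to a long loop by doing a bounded slice of work per turn and rescheduling the rest. The sketch below is only an illustration: `runInChunks` and its parameters are hypothetical names, and it uses setImmediate rather than setTimeout on the assumption that no delay between slices is wanted.

```javascript
// Hypothetical helper: call work(i) for i in [0, total), at most chunkSize calls per turn.
function runInChunks(total, chunkSize, work, onDone) {
  let next = 0;
  function slice() {
    const end = Math.min(next + chunkSize, total);
    while (next < end) work(next++);
    if (next < total) {
      setImmediate(slice); // yield so pending I/O callbacks can run before the next slice
    } else {
      onDone();
    }
  }
  slice();
}

let processed = 0;
runInChunks(10, 4, () => { processed += 1; }, () => {
  console.log('done, processed', processed); // done, processed 10
});
console.log('after the first slice, processed', processed); // after the first slice, processed 4
```

The second log line prints first, which shows that the call returned after one slice instead of holding the thread until all the work was finished.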
Nodejs is an event driven system that runs your Javascript single threaded. That means that in order for things to work properly, you cannot hog the entire CPU in a while() loop (or any other kind of loop) unless the loop directly contains an await statement that is awaiting an actual promise tied to an asynchronous operation.
This is a basic principle of programming in nodejs and you have to learn how to structure your program logic into the event driven world. You don't show what you're really trying to do, but "polling" anything in a tight loop is generally not the correct way to program an event driven system.
So, in the code you show here:
const net = require('net');
net.createServer(socket => {
socket.setEncoding('utf-8');
console.log(socket);
}).listen(4242, '127.0.0.1');
console.log('do some while logic here')
while(true) { }
Your while loop just spins forever and never allows any events to get processed and therefore your server can never get events about incoming connections. The events will just pile up in the event queue, but you never give nodejs a chance to go back to the event queue to process those events. To do so, you must finish what you're doing and return control back to the system (which is why you can't use the while(true) { } loop).
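You can see the "pile up" effect with a bounded version of the loop. This is a small demonstration sketch, not part of the original code; the 50 ms busy-wait is an arbitrary stand-in for while(true) { } so that the script eventually finishes.

```javascript
let timerFired = false;
setTimeout(() => { timerFired = true; }, 0); // due almost immediately

// Busy-wait for about 50 ms; a bounded stand-in for while (true) {}.
const start = Date.now();
while (Date.now() - start < 50) { /* spin */ }

// The timer is long overdue, but its callback cannot run until this script returns.
console.log('fired during the loop?', timerFired); // fired during the loop? false

setTimeout(() => {
  // By now control has gone back to the event loop, so the first timer has run.
  console.log('fired after returning to the event loop?', timerFired);
}, 10);
```

With an unbounded loop the first log line would never even be reached, and neither timers nor incoming connections would ever be serviced.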
So, you really need to be thinking event-driven programming in nodejs. You set up event listeners and you execute code some time in the future when those events occur. You can artificially create events with setTimeout() or setInterval(), but doing that constantly or with really, really short time durations is just polling and is not an efficient way to program a nodejs server either.
If you show or describe for us what you're really trying to do in the rest of your code, we can advise the most important part of this question which is how to actually write that code in an event-driven fashion.
I repeat, learning how to program in an event-driven fashion is required for an efficient, scalable nodejs server process.

Does the event loop poll for event completion, or does the kernel/OS notify back?

When Node.js starts, it initializes the event loop, processes the provided input script which may make async API calls, schedule timers, or call process.nextTick(), then begins processing the event loop.
There are seven phases and each phase has its own event queue which is based on FIFO.
So the application makes a request, and the event demultiplexer gathers those requests and pushes them to the respective event queues.
For example, if my code makes two requests, one a setTimeout() and the other some API call, the demultiplexer will push the first one to the timer queue and the other to the poll queue.
Once the events are there, the loop watches over those queues and events, and on completion it pushes the registered callback to the call stack, where it is processed.
My question is,
1). Who handles events in event queue to OS?
2). Does event loop polls for event completion in each event queue or does OS notifies back?
3). Where and who decides whether to call native asynchronous API or handle over to a thread pool?
I am on the verge of understanding this, but I have been struggling a lot to grasp the concepts. There is a lot of false information about the node.js event loop and how it handles asynchronous calls using one thread.
Please answer these questions if possible. Below are the references where I could get some better insight from.
https://github.com/nodejs/nodejs.org/blob/master/locale/en/docs/guides/event-loop-timers-and-nexttick.md
https://dev.to/lunaticmonk/understanding-the-node-js-event-loop-phases-and-how-it-executes-the-javascript-code-1j9
how does reactor pattern work in Node.js?
https://www.youtube.com/watch?v=PNa9OMajw9w&t=3s
Who handles events in event queue to OS?
How OS events work depends upon the specific type of event. Disk I/O works one way and Networking works a different way. So, you can't ask about OS events generically - you need to ask about a specific type of event.
Does event loop polls for event completion in each event queue or does OS notifies back?
It depends. Timers, for example, are built into the event loop, and the head of the timer list is checked on each pass through the event loop to see if its time has come. File I/O is handled by a thread pool, and when a disk operation completes, the thread inserts a completion event into the appropriate queue directly, so the event loop will just find it there the next time through the loop.
Where and who decides whether to call native asynchronous API or handle over to a thread pool?
This was up to the designers of nodejs and libuv and varies for each type of operation. The design is baked into nodejs and you can't change it yourself. Nodejs generally uses libuv for cross platform OS access so, in most cases, it's up to the libuv design how it handles different types of OS calls. In general, if all the OSes that nodejs runs on offer a non-blocking, asynchronous mechanism, then libuv and nodejs will use it (like for networking). If they don't (or it's problematic to make them all work similarly), then libuv will build its own abstraction (as with file I/O and a thread pool).
You do not need to know the details of how this works to program asynchronously in nodejs. You make a call and get a callback (or resolved promise) when it's done, regardless of how it works internally. For example, nodejs offers some asynchronous crypto APIs. They happen to be implemented using a thread pool, but you don't need to know that in order to use them.
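Here is a small sketch of that last point: three operations that are handled differently inside node but look the same from the calling code. The labels and the `done` helper are made up for the example, and the comments about which mechanism serves each call reflect my understanding of the current design rather than anything the API guarantees.

```javascript
const fs = require('fs');
const crypto = require('crypto');

const finished = [];

// Hypothetical helper: build a node-style callback that records completion.
function done(label) {
  return (err) => {
    if (err) throw err;
    finished.push(label);
    if (finished.length === 3) {
      console.log('all completed:', finished.slice().sort().join(', '));
    }
  };
}

setTimeout(done('timer'), 10);  // tracked by the event loop's own timer list
fs.stat('.', done('fs.stat'));  // file system call, served by the libuv thread pool
crypto.pbkdf2('secret', 'salt', 1000, 32, 'sha256', done('pbkdf2')); // also thread pool
```

The order in which the three finish is not something to rely on, which is why the final log sorts the labels before printing.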

Synchronous TCP Read in Node.js

Is there a way to do a synchronous read of a TCP socket in node.js?
I'm well aware of how to do it asynchronously by adding a callback to the socket's 'data' event:
socket.on('data', function(data) {
// now we have the string data to do whatever with
});
I'm also aware that trying to block with a function call instead of registering callbacks goes against node's design, but we are trying to update an old node module that acts as a client for my university while maintaining backwards compatibility. So we currently have:
var someData = ourModule.getData();
Where getData() previously had a bunch of logic behind it, but now we just want to send to the server "run getData()" and wait for the result. That way all logic is server side, and not duplicated client and server side. This module already maintains a TCP connection to the server so we are just piggybacking on that.
Here are the solutions I've tried:
1. Find a blocking read function for the socket hidden somewhere within node's net module, similar to the one in Python's socket library:
string_from_tcp = socket.recv(1024)
The problem here is that it doesn't seem to exist (unsurprisingly, because it goes against node's ideology).
2. This syncnet module adds what I need, but has no Windows support, so I'd have to add that.
3. Find a function that allows node to unblock the event loop, then return back, such that this works:
var theData = null;
clientSocket.on('data', function(data) {
theData = data;
});
clientSocket.write("we want some data");
while(theData === null) {
someNodeFunctionThatUnblocksEventLoopThenReturnsHere(); // in this function node can check the tcp socket and call the above 'data' callback, thus changing the value of theData
}
// now theData should be something!
The obvious problem here is that I don't think such a thing exists.
4. Use ECMAScript 6 generator functions:
var stringFromTcp = yield socketRead(1024);
The problem here is that we'd be forcing students to update their JavaScript clients to this new syntax, and understanding ES6 is outside the scope of the courses that use this.
5. Use node-gyp and add to our node module an interface to a C++ TCP library that does support synchronous reads, such as boost's asio. This would probably work, but getting the node module to compile with boost cross platform has been a huge pain. So I've come to Stack Overflow to make sure I'm not over-complicating this problem.
In the simplest terms I'm just trying to create a command line JavaScript program that supports synchronous tcp reads.
So any other ideas? And sorry in advance if this seems blasphemous in context of a node project, and thanks for any input.
I ended up going with option 5. I found a small, fast, and easy to build TCP library in C++ (netLink) and wrote a node module wrapper for it, aptly titled netlinkwrapper.
The module builds on Windows and Linux, but as it is a C++ addon you'll need node-gyp configured to build it.
I hope no one else has to screw with Node.js as I did using this module, but if you must block the event loop with TCP calls this is probably your only bet.
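For anyone who can accept newer syntax (which the question ruled out for its own course), the generator idea from the question can be written today with async/await so that the read at least looks sequential. This is only a sketch and not a synchronous read: `requestReply` is a hypothetical helper, it resolves on the first 'data' chunk only (no message framing), and a PassThrough stream stands in for a real net.Socket so the example runs without a server.

```javascript
const { once } = require('events');
const { PassThrough } = require('stream');

// Hypothetical helper: send a message and wait for the next chunk that arrives.
async function requestReply(socket, message) {
  const reply = once(socket, 'data'); // start listening before writing
  socket.write(message);
  const [data] = await reply;
  return data.toString();
}

// PassThrough echoes whatever is written to it, standing in for a real socket here.
async function main() {
  const fakeSocket = new PassThrough();
  const theData = await requestReply(fakeSocket, 'we want some data');
  console.log(theData); // we want some data
  return theData;
}

const result = main();
```

The event loop keeps running while the function is suspended at await, so this does not block the process the way the C++ wrapper does.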

Event Queuing in NodeJS

NodeJS uses an event-driven model in which only one thread executes the events. I understand the first event executed will be the user JS code. A simple example of a webserver from the nodeJS website is below
var http = require('http');
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end('Hello World\n');
}).listen(1337, '127.0.0.1');
console.log('Server running at http://127.0.0.1:1337/');
The first event executed will perform the above steps. Later on, the event loop will wait for events to be enqueued. My question is: which thread enqueues events? Is there a separate thread that does that? If yes, can multiple threads enqueue events also?
Thank you.
My question is: which thread enqueues events? Is there a separate thread that does that? If yes, can multiple threads enqueue events also?
At its core, the javascript interpreter is a loop around the select() system call. The select() system call informs the OS about the filehandles that your program is interested in and the OS will block execution of your program until there are events on any of those filehandles. Basically, select() will return if there's data on some I/O channel your program is interested in.
In pseudocode, it looks something like this:
while(1) {
select(files_to_write,files_to_read,NULL,NULL,timeout)
process_writable(files_to_write)
process_readable(files_to_read)
timeout = process_timers()
}
This single function allows the interpreter to implement both asynchronous I/O and setTimeout/setInterval. (Technically, this is only partly true. Node.js uses either select or poll or epoll etc. based on what functions are available and what functions are more efficient on your OS)
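The `timeout = process_timers()` step in the pseudocode is the part that makes setTimeout work without any polling, so here is a toy model of just that step. It is not node's real implementation: `ToyTimers` is a made-up class, and time is passed in as a plain number so the behaviour is easy to follow.

```javascript
// Toy model of the timer half of the loop above; not node's real implementation.
class ToyTimers {
  constructor() {
    this.timers = [];
  }

  // Register a callback to run once `now` reaches dueAt.
  add(dueAt, callback) {
    this.timers.push({ dueAt, callback });
    this.timers.sort((a, b) => a.dueAt - b.dueAt);
  }

  // Run every timer that is due at `now`; return ms until the next one, or null if none.
  process(now) {
    while (this.timers.length > 0 && this.timers[0].dueAt <= now) {
      this.timers.shift().callback();
    }
    return this.timers.length > 0 ? this.timers[0].dueAt - now : null;
  }
}

const timers = new ToyTimers();
const ran = [];
timers.add(30, () => ran.push('b'));
timers.add(10, () => ran.push('a'));

console.log(timers.process(0));  // 10
console.log(timers.process(10)); // 20
console.log(timers.process(30)); // null
console.log(ran);                // [ 'a', 'b' ]
```

The returned value is what would be handed to select() as its timeout, so the process sleeps exactly until either I/O arrives or the next timer is due.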
So, who enqueues events? The OS.
There is one exception. Disk I/O on node.js is handled by separate threads (a thread pool). So for this I/O it is those disk I/O threads that enqueue events to the event loop.
It could be implemented without threads. For example, the Tcl programming language (which predates javascript but also has built-in event loop) implements disk I/O in the main thread using features such as kqueue (on BSD based OS like MacOS) or aio (on Linux and several other OSes) or overlapped-i/o (Windows). But the node.js developers simply chose threading to handle disk i/o.
For more on how this works at the C level, see this answer: I know that callback function runs asynchronously, but why?

The mechanics of the JavaScript WebSockets API

I've been trying to understand some code used to open a websocket:
var ws = new WebSocket('ws://my.domain.com');
ws.onopen = function(event) {
...
}
My question is how does the handshaking get started? If it is started in the WebSocket constructor, then how does onopen get called if it isn't set by then? If the WebSocket constructor creates a thread that does the handshaking, then does onopen have to be defined quickly enough before the handshaking is over? If so, that sounds a little dangerous because if the JS virtual machine is slowed the handshaking could be finished before onopen is defined, which means that the event is not handled. Or does setting the onopen function trigger the handshaking?
Could someone explain to me the mechanics of the API please?
It does not look for the onopen function until the end of execution of the current (synchronous) code. That is because the connection (and thus the call to the onopen callback) is asynchronous.
Consider:
let x = false;
setTimeout(function () {
x = true
}, 1000);
while(!x){
console.log('waiting!');
}
The while loop there will never end, though you would probably expect it to end after one second.
If you delay the initialisation of the onopen function by executing time-consuming (but synchronous) code, then it is not dangerous. On the other hand, if you put the initialisation of onopen inside a setTimeout, then there's no guarantee that it's defined by the time the WebSocket connection is ready, as you can't be sure which callback will be executed first.
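The safe case can be simulated without a browser. In this sketch `FakeSocket` is a made-up stand-in for WebSocket whose "connection" completes via a zero-delay timer; a real WebSocket does its handshake over the network, but the open event is likewise only delivered after the current synchronous code has finished.

```javascript
// Hypothetical stand-in for WebSocket: the "connection" completes on a later turn.
class FakeSocket {
  constructor() {
    this.onopen = null;
    // Deferred, so it cannot run before the current synchronous code finishes.
    setTimeout(() => {
      if (this.onopen) this.onopen({ type: 'open' });
    }, 0);
  }
}

const ws = new FakeSocket();

// Slow synchronous work between construction and assignment (about 20 ms).
const begin = Date.now();
while (Date.now() - begin < 20) { /* spin */ }

let opened = false;
ws.onopen = () => {
  opened = true;
  console.log('onopen ran');
};
console.log('opened yet?', opened); // opened yet? false
```

Even though the fake connection was "ready" long before onopen was assigned, the handler still runs, because the deferred callback had to wait for the synchronous code to finish first.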
If you were doing the same thing in C++, you'd use threads for that. In JavaScript, the callback mechanism is not thread-based; it just behaves thread-like (see the endless while loop above).
Single thread executes one code-unit at a time and other code units
are queued until the current code unit is finished executing
source: http://www.slideshare.net/clutchski/writing-asynchronous-javascript-101
It's important to understand that even if you setTimeout something for 1s, it might not execute after one second. If the thread is busy, it might never get executed.
Thus, if you initiate a WebSocket connection and run a loop similar to the one above, waiting for the connection to be ready, the loop might never end.
This behaviour might look strange to programmers not familiar with JS. Therefore, for readability, I define callbacks at the same time as, or immediately after, the functions which need them whenever possible.
If you want to explicitly use threads and concurrent execution, read more about Web Workers.
Reference:
How JavaScript Timers Work
Understanding JavaScript timers
You don't need any setTimeout function. I'm using a library for this and my code looks something like this:
var pushstream = new PushStream({
host: window.location.hostname,
port: window.location.port,
modes: "websocket"
});
pushstream.onmessage = _manageEvent;
function _manageEvent(eventMessage) {
console.log(eventMessage);
}
This gave me a hell of an insight on websockets and how to implement a client in Javascript: https://github.com/wandenberg/nginx-push-stream-module/blob/master/misc/js/pushstream.js
And also the server: https://github.com/wandenberg/nginx-push-stream-module/
It's very well documented. I hope it helps :)
