Make HTTP request inside Web Worker - javascript

I am trying to use web-workers or threads in my node application for the first time. I am using the webworker-threads npm module.
Basically I would like each worker to make requests to a server, measure the response time and send it back to the main thread.
I have tried it many different ways, but I just can't seem to get it working. The basic examples from the docs work, but when I try to require a module ("request" in my case), the workers just seem to stop, without any error messages. I saw in the docs that require doesn't work inside a worker, so I tried "importScripts()", which doesn't work either. When using thread pools I also tried .all.eval(), without success.
Since this is the first time working with web-workers / threads in node, I might misunderstand how to use those things in general. Here is one example I tried:
server.js
var Worker = require('webworker-threads').Worker;
var worker = new Worker('worker.js');
worker.js
console.log("before import");
importScripts('./node_modules/request/request.js');
console.log("after import");
This basic example only prints before import and then stops.

Web workers are native JavaScript only, so you can't achieve what you want with them. Worker threads don't support the Node.js API or npm packages (like http or request.js). For concurrency you don't need any multithread magic; just use async.js or promises. If you want to play with threads, then child_process is the way to go. You could also use an API to manage child processes, like https://github.com/rvagg/node-worker-farm
Considering your example you could write something like this:
main.js
var workerFarm = require('worker-farm')
  , workers = workerFarm(require.resolve('./child'))
  , ret = 0;

var urls = ['https://www.google.com', 'http://stackoverflow.com/', 'https://github.com/'];

urls.forEach(function (url) {
    workers(url, function (err, res, body, responseTime) {
        console.log('Url ' + url + ' finished in ' + responseTime + 'ms');
        // Ugly code here, use async/promise instead
        if (++ret == urls.length)
            workerFarm.end(workers);
    });
});
child.js
var request = require('request');

module.exports = function (url, cb) {
    var start = new Date();
    request(url, function (err, res, body) {
        var responseTime = new Date() - start;
        cb(err, res, body, responseTime);
    });
};
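As a side note on the "you don't need any multithread magic" point above: the request calls are already non-blocking, so you can measure several URLs concurrently from a single process. A minimal sketch, reusing the same urls array and a plain counter (no worker processes involved):
var request = require('request');

var urls = ['https://www.google.com', 'http://stackoverflow.com/', 'https://github.com/'];
var pending = urls.length;

urls.forEach(function (url) {
    var start = Date.now();
    // all three requests are in flight at the same time
    request(url, function (err, res, body) {
        var responseTime = Date.now() - start;
        console.log('Url ' + url + ' finished in ' + responseTime + 'ms');
        if (--pending === 0) console.log('all done');
    });
});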

Related

NodeJS cluster, Is it really needed?

I decided that I wanted to investigate the best possible way to handle a large amount of traffic with a Node.js server, so I ran a small test on two DigitalOcean servers, each with 1GB RAM / 2 CPUs.
No-Cluster server code:
// Include Express
var express = require('express');
// Create a new Express application
var app = express();
// Add a basic route – index page
app.get('/', function (req, res) {
    res.redirect('http://www.google.co.il');
});
// Bind to a port
app.listen(3000);
console.log('Application running');
Cluster server code:
// Include the cluster module
var cluster = require('cluster');
// Code to run if we're in the master process
if (cluster.isMaster) {

    // Count the machine's CPUs
    var cpuCount = require('os').cpus().length;

    // Create a worker for each CPU
    for (var i = 0; i < cpuCount; i += 1) {
        cluster.fork();
    }

// Code to run if we're in a worker process
} else {

    // Include Express
    var express = require('express');

    // Create a new Express application
    var app = express();

    // Add a basic route – index page
    app.get('/', function (req, res) {
        res.redirect('http://www.walla.co.il');
    });

    // Bind to a port
    app.listen(3001);
    console.log('Application running #' + cluster.worker.id);
}
Then I sent stress-test requests to those servers. I expected the cluster server to handle more requests, but it didn't happen; both servers crashed under the same load, even though two Node services were running on the cluster server and only one on the non-cluster server.
Now I wonder why? Did I do anything wrong?
Maybe something else is making the servers reach their breaking point? Both servers crashed at ~800 requests per second.
Your test server doesn't do anything other than a res.redirect(). If your request handlers use essentially no CPU, then you aren't going to be CPU bound at all and you won't benefit from involving more CPUs. Your cluster will be bottlenecked at the handling of incoming connections which is going to be roughly the same with or without clustering.
Now, add some significant CPU usage to your request handler and you should get a different result.
For example, change to this:
// Add a basic route – index page
app.get('/', function (req, res) {
    // spin CPU for 200ms to simulate using some CPU in the request handler
    let start = Date.now();
    while (Date.now() - start < 200) {}
    res.redirect('http://www.walla.co.il');
});
Running tests is a great thing, but you have to be careful what exactly you're testing.
What #jfriend00 says is correct: you aren't actually doing enough heavy lifting to justify clustering. Beyond that, though, you're not actually sharing the load. See here:
app.listen(3001);
You can't bind two services onto the same port and have the OS magically load-balance them[1]; try adding an error handler on app.listen() and see if you get an error, e.g.
app.listen(3001, (err) => { if (err) console.error(err); });
If you want to do this, you'll have to accept everything in your master, then instruct the workers to do the task, then pass the results back to the master again.
It's generally easier not to do this in your Node program though; your frontend will still be the limiting factor. An easier (and faster) way may be to put a special purpose load-balancer in front of multiple running instances of your application (i.e. HAProxy or Nginx).
[1]: That's actually a lie; sorry. You can do this by specifying SO_REUSEPORT when doing the initial bind call, but you can't explicitly specify that in Node, and Node doesn't specify it for you...so you can't in Node.
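For completeness, here is a rough sketch (not the poster's code) of the "accept in the master, hand the work to a worker, pass the result back" idea using the cluster module's built-in messaging. Request/response matching is ignored here and would need to be added in a real implementation:
const cluster = require('cluster');

if (cluster.isMaster) {
    const worker = cluster.fork();
    worker.on('message', (result) => {
        console.log('worker replied with', result);
    });
    worker.send({ n: 42 });            // hand a job to the worker
} else {
    process.on('message', (job) => {   // the worker does the heavy lifting...
        process.send(job.n * 2);       // ...and sends the result back to the master
    });
}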

How to inject module from different app in Node.js

I have two node apps/services that are running together:
1. main app
2. second app
The main app is responsible for showing all the data from the different apps at the end. Right now I have put some code of the second app inside the main app, and it is working, but I want it to be decoupled. I mean that the code of the second app should not live in the main app; instead it should somehow be injected at runtime,
as if the second service registers itself with the main app and injects its code into it.
That code is just two modules. Is it possible to do this in Node.js?
const Socket = require('socket.io-client');
const client = require("./config.json");
module.exports = (serviceRegistry, wsSocket) => {
    var ws = null;

    var consumer = () => {
        var registration = serviceRegistry.get("tweets");
        console.log("Service: " + registration);

        // Check if service is online
        if (registration === null) {
            if (ws != null) {
                ws.close();
                ws = null;
                console.log("Closed websocket");
            }
            return;
        }

        var clientName = `ws://localhost:${registration.port}/`;
        if (client.hosted) {
            clientName = `ws://${client.client}/`;
        }

        // Create a websocket to communicate with the client
        if (ws == null) {
            console.log("Created");
            ws = Socket(clientName, {
                reconnect: false
            });
            ws.on('connect', () => {
                console.log("second service is connected");
            });
            ws.on('tweet', function (data) {
                wsSocket.emit('tweet', data);
            });
            ws.on('disconnect', () => {
                console.log("Disconnected from blog-twitter");
            });
            ws.on('error', (err) => {
                console.log("Error connecting socket: " + err);
            });
        }
    };

    // Check service availability
    setInterval(consumer, 20 * 1000);
};
In the main module I have put this code, and I want to decouple it by injecting it somehow at runtime. An example would be very helpful...
You will have to use the vm module to achieve this. More technical info is here: https://nodejs.org/api/vm.html. Let me explain how you can use it:
You can use the vm.Script API to create compiled JS code from the code which you want to run later. See the description from the official documentation:
Creating a new vm.Script object compiles code but does not run it. The
compiled vm.Script can be run later multiple times. It is important to
note that the code is not bound to any global object; rather, it is
bound before each run, just for that run.
Now, when you want to insert or run this code, you can use the script.runInContext() API.
Another good example from their official documentation:
'use strict';
const vm = require('vm');
let code =
`(function(require) {
    const http = require('http');

    http.createServer((request, response) => {
        response.writeHead(200, {'Content-Type': 'text/plain'});
        response.end('Hello World\\n');
    }).listen(8124);

    console.log('Server running at http://127.0.0.1:8124/');
})`;
vm.runInThisContext(code)(require);
Another example, running a .js file directly:
var fs = require('fs');
var vm = require('vm');

var app = fs.readFileSync(__dirname + '/' + 'app.js', 'utf8'); // read as a string, not a Buffer
vm.runInThisContext(app);
You can use this approach for the conditional code which you want to insert.
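To illustrate the compile-once, run-many-times behaviour described in the quoted documentation above, here is a small sketch (not from the original answer) using vm.Script together with runInContext:
const vm = require('vm');

const script = new vm.Script('count += 1;'); // compiled here, but not run yet
const sandbox = { count: 0 };
vm.createContext(sandbox);                   // contextify the sandbox object

script.runInContext(sandbox);
script.runInContext(sandbox);
console.log(sandbox.count);                  // 2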
You can create a package from one of your apps and then reference the package in the other app.
https://docs.npmjs.com/getting-started/creating-node-modules
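For illustration only (the package and file names here are made up), the second app could expose its consumer module as the package's entry point, and the main app would then require it like any other dependency:
// second-app/index.js — entry point of a hypothetical "second-app" package
module.exports = function register(serviceRegistry, wsSocket) {
    // start whatever the second service needs; kept trivial for this sketch
    console.log('second app registered');
};

// main app — after installing or npm-linking the package;
// serviceRegistry and wsSocket are the same objects used in the question
const registerSecondApp = require('second-app');
registerSecondApp(serviceRegistry, wsSocket);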
There are several ways to decouple two applications. One easy way is with the pub/sub pattern (in case you don't need a response).
(If you have an application that is tightly coupled, it will be very difficult to decouple it unless you do some refactoring.)
zeromq offers a very good implementation of pub/sub and is very fast.
e.g.
import zmq from "zmq";

// consumer (subscriber) process
const socket = zmq.socket('sub');
socket.connect('tcp://127.0.0.1:5545');
socket.subscribe('sendConfirmation');

socket.on('message', function (topic, message) {
    // you can get the data from message.
    // something like:
    const msg = message.toString('ascii');
    const data = JSON.parse(msg);
    // do some actions.
    // .....
});

// don't forget to close the socket.
process.on('SIGINT', () => {
    console.log("... closing the socket ....");
    socket.close();
    process.exit();
});

//-----------------------------------------

import zmq from "zmq";

// publisher process
const socket = zmq.socket('pub');
socket.bind('tcp://127.0.0.1:5545');

socket.send(['sendConfirmation', someData]);

process.on('SIGINT', function () {
    socket.close();
});
This way you could have two different containers (docker) for your modules, just be sure to open the corresponding port.
What I don't understand is why you inject wsSocket and also create a new Socket. What I would probably do is just send the socket id and then use it like:
const socketId = "/#" + data.socketId;
io.sockets.connected[socketId].send("some message");
You could also use another solution like Kafka instead of zmq; just consider that it is slower, but it will keep a log of the messages.
Hope this gives you an idea of how to solve your problem.
You can use the npm link feature.
The linking process consists of two steps:
Declaring a module as a global link by running npm link in the module’s root folder
Installing the linked module in your target module (app) by running npm link <module-name> in the target folder
This works pretty well unless one of your local modules depends on another local module. In this case, linking fails because it cannot find the dependent module. In order to solve this issue, one needs to link the dependent module to the parent module and then install the parent into the app.
https://docs.npmjs.com/cli/link

node.js express response.write() not async in Safari

I have a very simple node.js server that I use to ping some servers I need to keep online.
Using Express I have a very simple endpoint I can access that will perform a loop of requests and report the results.
Using res.write() on each loop, the webpage I load can show me the progress as it's happening.
The problem is, this progress doesn't happen in Safari on either OS X or iOS. It waits until the process is complete and then dumps the whole output in 1 go.
Here's an example of my code:
router.route('/test').get(function (req, res) {
    res.write('<html><head></head><body>');
    res.write('Starting tests...<br />');

    performServerTests(req, res, function (results) {
        // Each loop within performServerTests also uses res.write()
        res.write('<br />Complete');
        res.end('</body></html>');
    });
});
Is there a known reason why Safari would wait for the res.end() call before displaying what it already has, while Chrome shows each res.write() message as it receives it?
Thanks
When using chunked transfers (what you're trying to do), browsers generally wait for a minimum amount of data to be received before they start rendering. The exact size is browser-specific; see Using "transfer-encoding: chunked", how much data must be sent before browsers start rendering it? for some fairly recent data points on this.
Your example could, for example, be written like this (adding some headers to be explicit too):
router.route('/test').get(function (req, res) {
    res.setHeader('Content-Type', 'text/html; charset=UTF-8');
    res.setHeader('Transfer-Encoding', 'chunked');

    res.write('<html><head></head><body>');
    res.write('Starting tests...<br />');

    // pad the response so the browser has enough data to start rendering
    var buf = "";
    for (var i = 0; i < 500; i++) {
        buf += " ";
    }
    res.write(buf);

    performServerTests(req, res, function (results) {
        // Each loop within performServerTests also uses res.write()
        res.write('<br />Complete');
        res.end('</body></html>');
    });
});

How to determine function parameters in Javascript?

I am a Java developer learning Javascript (Node.js).
This is the first piece of code I tried running :
var sys = require("sys"),
my_http = require("http");
my_http.createServer(function(request,response){
response.writeHeader(200, {"Content-Type": "text/plain"});
response.write("Hello World");
response.end();
}).listen(8080);
If there were no documentation, how would I have known that createServer takes a function which takes request and response as parameters? I am asking this because I want to prepare myself for all the undocumented code I will start facing soon. Here is the source for the createServer function:
function createServer(options) {
    var bunyan = require('./bunyan_helper');
    var InternalError = require('./errors').InternalError;
    var Router = require('./router');
    var Server = require('./server');

    var opts = shallowCopy(options || {});
    var server;

    opts.name = opts.name || 'restify';
    opts.log = opts.log || bunyan.createLogger(opts.name);
    opts.router = opts.router || new Router(opts);

    server = new Server(opts);
    server.on('uncaughtException', function (req, res, route, e) {
        if (this.listeners('uncaughtException').length > 1 ||
            res._headerSent) {
            return (false);
        }
        res.send(new InternalError(e, e.message || 'unexpected error'));
        return (true);
    });

    return (server);
}
I understand JavaScript is a dynamically typed language, but I am wondering how people debug or understand each other's code without knowing the types.
Well, the nice thing about JavaScript is that it's interpreted, meaning you always have access to the actual source code itself. For node, you can look in node_modules/blah to read the source, but the vast majority of what is on npm is also open source on GitHub, and you can read the source there too.
In the browser, the developer tools have an auto-format button if you encounter minified code, but in node you usually don't need that, as code is published unminified.
That said, some things are documented well, sometimes documentation is wrong or out of date, and sometimes reading the source code is neither quick nor straightforward. But if something is really problematic for you and is both undocumented and hard to read, you can and should switch to something else on npm because "ain't nobody got time for that".
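One small trick that follows from "the source is always available": you can print a function's source (and its declared parameter count) straight from the REPL. A quick sketch, assuming restify is installed, since that is where the createServer shown in the question comes from:
const restify = require('restify');

console.log(restify.createServer.length);     // number of declared parameters
console.log(restify.createServer.toString()); // the function's own source code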
You must be very familiar with the API when using JavaScript. For example, document.getElementById(id): there is no hint in the code about what the id is, but it is well understood.

Co - generator based lib for node.js. How can I run parallel tasks?

I have tried the generator-based async lib co (GitHub) for node.js.
Here is the code. I use co-express and co-wait.
As you can see, the client waits 10 seconds before it gets a response.
My problem is that if I try to run multiple requests to this URL, each call seems to block the next ones.
How can I run multiple calls to this URL in parallel?
localhost:8000/test
var fs = require('fs');
var co = require('co');
var express = require('express');
var wrapper = require('co-express');
var app = wrapper(express());
var wait = require('co-wait');
app.get('/test', function* (req, res, next) {
    yield waitAndAnswer(res);
});

function* waitAndAnswer(res) {
    yield wait(10000);
    res.send('Done: ' + Date.now());
}
app.listen(8000);
I made a mistake when I tried to call this URL from a single browser. It looks like this is standard browser behaviour, so this code works as expected. Jonathan Ong answered my question:
if you're making calls via your browser, then its a browser thing. if you call it via curl or something, it should be fine. – Jonathan Ong
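For reference (this goes slightly beyond the original exchange): co itself can also run several yieldables in parallel by yielding an array. A small sketch, assuming co 4.x, where co() runs the generator immediately and returns a promise:
var co = require('co');

function delay(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

co(function* () {
    var start = Date.now();
    // yielding an array runs the yieldables in parallel rather than one by one
    yield [delay(1000), delay(1000), delay(1000)];
    console.log('all three finished after ~' + (Date.now() - start) + 'ms'); // ~1000, not 3000
});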
