Multithreading in Node.js with bidirectional communication - javascript

I have an API which fetches rows from the database, and then for each row of the response I have to do CPU-intensive work before returning the response to the client. In a nutshell, this is what I want to achieve:
exports.updateUser = async (req, res) => {
  const users = await fetchUserFromDB(); // 10^4 users
  const updatedUser = [];
  for (const user of users) {
    // Do some CPU-intensive work here; this takes ~100ms per user
    const userDataAfterHeavyOps = await computeUser(user);
    updatedUser.push(userDataAfterHeavyOps);
  }
  return res.send(updatedUser);
}
I have an 8-core system, so I am thinking that worker threads/clustering in Node could improve performance. This is what I found on the internet:
index.js
// run with `node --experimental-worker index.js` on Node.js 10.x
const { Worker } = require('worker_threads')

function runService(workerData) {
  return new Promise((resolve, reject) => {
    // note: the worker file below is named worker.js, so load that
    const worker = new Worker('./worker.js', { workerData });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0)
        reject(new Error(`Worker stopped with exit code ${code}`));
    })
  })
}

async function run() {
  const result = await runService(100)
  console.log(result);
}

run().catch(err => console.error(err))
worker.js
const { workerData, parentPort } = require('worker_threads')
// You can do any heavy stuff here, in a synchronous way
// without blocking the "main thread"
parentPort.postMessage({ hello: workerData })
Is there any way to send an array of users, process them on different threads to improve performance, and then get the computed values back so that the parent can send them to the client?
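One way to extend that snippet to an array of users is to chunk the array and give each core its own worker, collecting the computed chunks with Promise.all. A minimal sketch, assuming a synchronous computeUserSync() helper the worker can require (the file name user-worker.js and that helper are not part of the original post):
const { Worker } = require('worker_threads');
const os = require('os');

function runChunk(users) {
  return new Promise((resolve, reject) => {
    // workerData is structured-cloned, so users must be plain serializable objects
    const worker = new Worker('./user-worker.js', { workerData: users });
    worker.on('message', resolve); // the worker posts back its computed chunk
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

async function computeAllUsers(users) {
  const cores = os.cpus().length;
  const chunkSize = Math.ceil(users.length / cores);
  const chunks = [];
  for (let i = 0; i < users.length; i += chunkSize) {
    chunks.push(users.slice(i, i + chunkSize));
  }
  // One worker per chunk, all running in parallel; chunk order is preserved.
  const computedChunks = await Promise.all(chunks.map(runChunk));
  return computedChunks.flat();
}

// user-worker.js: compute every user in the chunk, then post the array back.
// const { workerData, parentPort } = require('worker_threads');
// const computeUserSync = require('./computeUserSync'); // hypothetical CPU-heavy helper
// parentPort.postMessage(workerData.map(computeUserSync));
The route handler then becomes const updatedUser = await computeAllUsers(users); before res.send(updatedUser).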

Related

Promise is opening too many HTTP Connections (Example)

I have some very simple code that calls an external API:
const promises = [];
for (const elem of data) {
  const carId = Number(elem.carId);
  promises.push(getCarData(carId));
}
console.log('Waiting for all promises to finish');
const result = await Promise.allSettled(promises);

export const getCarData = async (carId: number) => {
  return new Promise(async (resolve, reject) => {
    const car = await API.getCar(carId);
    if (!car) {
      return reject(new Error('Could not find car'));
    }
    return resolve(car);
  });
}
-- API --
import axios from 'axios';
import https from 'https';

const axiosInstance = axios.create({
  httpsAgent: new https.Agent({
    rejectUnauthorized: false
  })
});

static getCar = async (carId: number) => {
  try {
    const response = await axiosInstance.get(`${baseUrl}/v1/car/${carId}.json`);
    const car = response.data;
    return car;
  } catch (err) {
    // something else
  }
}
There are around 10,000 requests in the promise array, one for each element in data. It starts off working well, but after some time I start getting errors from some of the promises:
"connect EMFILE IP:443 - Local (undefined:undefined)"
I checked and it seems to be related to the OS refusing to open more sockets. This is hosted in AWS, so what can I do to slow down the requests and avoid opening so many connections at the same time?
Promise.allSettled is one of the concurrency methods: it starts every request at once. You could write your own concurrency handler or find a package for it (see the p-limit sketch below).
A simple example, assuming each request takes roughly the same time. Note that you must pass functions that create the promises, not already-created promises; once a promise exists, its request is already in flight:
// Run an array of promise-returning functions sequentially.
async function runSeq(tasks) {
  const results = [];
  for (const task of tasks) {
    results.push(await task());
  }
  return results;
}

// Fake 20 request factories.
const tasks = new Array(20).fill(null).map((_v, i) => () => Promise.resolve(i));

// Split into three roughly equal chunks.
const size = Math.ceil(tasks.length / 3);
const chunks = [tasks.slice(0, size), tasks.slice(size, 2 * size), tasks.slice(2 * size)];

// Execute the three chunks concurrently; within each chunk, one request runs at a time.
const results = await Promise.all(chunks.map(chunk => runSeq(chunk)));
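For real traffic, a dedicated limiter is simpler than fixed chunks. A sketch using the p-limit package (p-limit v3 or earlier if you load it with require(); the concurrency of 100 is an arbitrary assumption):
// npm i p-limit@3
const pLimit = require('p-limit');

const limit = pLimit(100); // at most 100 getCarData calls in flight at once

// Wrap each call in limit(); the limiter only starts a task when a slot frees up,
// so the process never tries to open thousands of sockets at the same time.
const promises = data.map(elem => limit(() => getCarData(Number(elem.carId))));
const result = await Promise.allSettled(promises);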

Not able to get results when running HTTPS GET requests on an array of data

I am using the code given below to run multiple HTTPS GET requests against the Wikipedia API.
app.get("/data_results", (req, res) => {
const articlesData = names.map(nameObj => {
let name = nameObj.name;
let articleExtract = "";
let contentURL =
`https://en.wikipedia.org/w/api.php?
action=query&titles=${name}&prop=extracts&format=json&exintro=1&explaintext=false&origin=*`;
// Getting article content
https.get(contentURL, response => {
response.on("data", async data => {
const wikiArticle = JSON.parse(data);
// Mapping the keys of the JSON data of query to its values.
articleExtract = await Object.keys(wikiArticle.query.pages).map(key => wikiArticle.query.pages[key])[0].extract;
nameObj.article = articleExtract.substring(0,350);
})
});
return nameObj;
});
res.send(articlesData);});
This is my names array
[
  { name: 'Indus%20Valley%20Civilisation' },
  { name: 'Ramayana' },
  { name: 'Mahabharata' },
  { name: 'Gautama%20Buddha' }
]
My aim:
Run the HTTPS GET request for every value in the names array sequentially.
Add the article extract with its name and save it in an objects array.
You could also suggest better ways of doing this.
Try this solution. It creates an object like
{
  Mahabharata: response
}
The code
const getValues = async () => {
  const name_array = [
    {name: 'Indus%20Valley%20Civilisation'},
    {name: 'Ramayana'},
    {name: 'Mahabharata'},
    {name: 'Gautama%20Buddha'},
  ];
  const namePromises = name_array.map(value =>
    fetch('baseurl' + value.name).then(res => res.json()),
  );
  const response = await Promise.all(namePromises);
  return response.reduce((obj, responseResults, index) => {
    obj[name_array[index].name] = responseResults;
    return obj; // the accumulator must be returned on each iteration
  }, {});
};
I would suggest you use fetch and Promise.all. Map over your array of names to create the promises as input, and then return the array of resolved promises.
Something like this:
const namePromises = [
  { name: 'Indus%20Valley%20Civilisation' },
  { name: 'Ramayana' },
  { name: 'Mahabharata' },
  { name: 'Gautama%20Buddha' }
].map((value) => fetch(baseurl + value.name).then(res => res.json()));

Promise.all(namePromises).then((response) => {
  console.log(response);
});
Problems:
So, you have several things going on here.
1. http.get() is non-blocking and asynchronous. That means that the rest of your code continues to run while your multiple http.get() operations are running. That means you end up calling res.send(articlesData); before any of the http requests are complete.
2. .map() is not promise-aware or asynchronous-aware, so it will not pause its loop for any asynchronous operation. As such, you're actually running ALL your http requests in parallel (they are all in-flight at the same time). The .map() loop starts them all and then they all finish some time later.
3. The data event for http.get() is not guaranteed to contain all your data or even a whole piece of data. It may just be a partial chunk of data. So, while your code may seem to get a complete response now, different network or server conditions could change all that and reading the result of your http requests may stop functioning correctly.
4. You don't show where names comes from in your request handler. If it's a statically declared array of objects declared in a higher-scoped variable, then this code has a problem: you're trying to modify the objects in that array in a request handler, and multiple requests to this request handler can conflict with one another.
5. You don't show any error handling for errors in your http requests.
6. You're using await in a place it doesn't belong. await only does something useful when you are awaiting a promise.
Solutions:
Managing asynchronous operations in Javascript, particularly when you have more than one to coordinate, is a whole lot easier when using promises instead of the plain callback that http.get() uses. And it's a whole lot easier to use promises when you use an http request interface that already supports promises. My go-to library for built-in promise support for http requests in nodejs is got(). There are many good choices shown here.
You can use got() and promises to control the asynchronous flow and error handling like this:
const got = require('got');

app.get("/data_results", (req, res) => {
  Promise.all(names.map(nameObj => {
    let name = nameObj.name;
    // create separate result object
    let result = { name };
    let contentURL = `https://en.wikipedia.org/w/api.php?action=query&titles=${name}&prop=extracts&format=json&exintro=1&explaintext=false&origin=*`;
    return got(contentURL).json().then(wikiArticle => {
      // Mapping the keys of the JSON data of query to its values.
      let articleExtract = Object.keys(wikiArticle.query.pages).map(key => wikiArticle.query.pages[key])[0].extract;
      result.article = articleExtract.substring(0, 350);
      return result;
    });
  })).then(articlesData => {
    res.send(articlesData);
  }).catch(err => {
    console.log(err);
    res.sendStatus(500);
  });
});
This code attempts to fix all six of the above-mentioned problems while still running all your http requests in parallel.
If your names array is large, running all these requests in parallel may consume too many resources, either locally or on the target server, or it may run into rate limiting on the target server. If that's an issue, you can sequence the http requests so they run one at a time, like this:
const got = require('got');

app.get("/data_results", async (req, res) => {
  try {
    let results = [];
    for (let nameObj of names) {
      let name = nameObj.name;
      let result = { name };
      let contentURL = `https://en.wikipedia.org/w/api.php?action=query&titles=${name}&prop=extracts&format=json&exintro=1&explaintext=false&origin=*`;
      const wikiArticle = await got(contentURL).json();
      let articleExtract = Object.keys(wikiArticle.query.pages).map(key => wikiArticle.query.pages[key])[0].extract;
      result.article = articleExtract.substring(0, 350);
      results.push(result);
    }
    res.send(results);
  } catch (e) {
    console.log(e);
    res.sendStatus(500);
  }
});
If you really want to stick with https.get(), then you can "promisify" it and use it in place of got():
function myGet(url, options = {}) {
  return new Promise((resolve, reject) => {
    https.get(url, options, (res) => {
      res.setEncoding('utf8');
      let rawData = '';
      res.on('data', (chunk) => { rawData += chunk; });
      res.on('end', () => {
        try {
          const parsedData = JSON.parse(rawData);
          resolve(parsedData);
        } catch (e) {
          reject(e);
        }
      });
    }).on('error', reject);
  });
}

app.get("/data_results", async (req, res) => {
  try {
    let results = [];
    for (let nameObj of names) {
      let name = nameObj.name;
      let result = { name };
      let contentURL = `https://en.wikipedia.org/w/api.php?action=query&titles=${name}&prop=extracts&format=json&exintro=1&explaintext=false&origin=*`;
      const wikiArticle = await myGet(contentURL);
      let articleExtract = Object.keys(wikiArticle.query.pages).map(key => wikiArticle.query.pages[key])[0].extract;
      result.article = articleExtract.substring(0, 350);
      results.push(result);
    }
    res.send(results);
  } catch (e) {
    console.log(e);
    res.sendStatus(500);
  }
});
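As one possible "better way" (per the question's last line): the MediaWiki query API accepts multiple pipe-separated titles in a single call, which could collapse the whole loop into one request. This is an untested sketch; check the TextExtracts exlimit parameter and the exact response shape before relying on it:
const got = require('got');

// One request for all names (the names are already URL-encoded);
// exlimit=max asks for an extract for every returned page.
async function fetchAllExtracts(names) {
  const titles = names.map(n => n.name).join('%7C'); // %7C is an encoded "|"
  const contentURL = `https://en.wikipedia.org/w/api.php?action=query&titles=${titles}&prop=extracts&format=json&exintro=1&explaintext=false&exlimit=max&origin=*`;
  const wikiArticle = await got(contentURL).json();
  return Object.values(wikiArticle.query.pages).map(page => ({
    name: page.title,
    article: (page.extract || '').substring(0, 350),
  }));
}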

Run 3 bash scripts simultaneously then send their return values as JSON in REST API NodeJS

I'm trying to run 3 bash scripts, depending on what the JSON list sent from the client specifies, and then return their outputs in a JSON dict which is sent back to the client.
This is the code of the three scripts:
marc#linux:~ $ cat 1.sh
sleep 1
echo -n "a"
marc#linux:~ $ cat 2.sh
sleep 1
echo -n "b"
marc#linux:~ $ cat 3.sh
sleep 1
echo -n "c"
If I executed them synchronously, they would block the event loop for up to 3 seconds (undesirable):
const express = require("express");
const app = express();
app.use(express.json());
const cp = require("child_process");

app.get("/", (request, response) => {
  console.log(request.body);
  var response_data = {};
  if (request.body.includes("script1")) {
    response_data.value1 = cp.execFileSync("./1.sh").toString();
  }
  if (request.body.includes("script2")) {
    response_data.value2 = cp.execFileSync("./2.sh").toString();
  }
  if (request.body.includes("script3")) {
    response_data.value3 = cp.execFileSync("./3.sh").toString();
  }
  // json() already sends the response; a separate send() would throw
  response.status(200).json(response_data);
})

app.listen(8080, () => {
  console.log("ready");
})
And if I executed them asynchronously, they would return after the response had already been sent, so the response would be just {}.
My intended flow is: if I send ["script1", "script3"], it should return {"value1": "a", "value3": "c"} once 1.sh and 3.sh are done executing, without blocking the event loop.
How do I implement callbacks/promises in such scenario?
Use Promise.all() for that:
const run = (script) => new Promise((resolve, reject) => {
  // spawns a child process for the script, without any args or options
  // (execFile's callback signature is (error, stdout, stderr))
  cp.execFile(`./${script.slice(-1)}.sh`, (err, stdout) => {
    if (err) return reject(err);
    // returns the script's stdout through the callback
    resolve(stdout);
  });
});

const runScripts = (req) => new Promise((resolve, reject) => {
  const
    result = {},
    promises = [];
  // creates a promise for each script in the request
  for (const script of req) {
    // pushes it to the array of promises required by Promise.all
    promises.push(new Promise((resolve, reject) => {
      // runs the script
      run(script).then((res) => {
        // gets the script's result
        // and writes it
        result[`value${script.slice(-1)}`] = res;
        resolve();
      }).catch(reject);
    }));
  }
  // uses Promise.all to wait until all scripts are done
  Promise.all(promises).then(() => {
    // when each script is done,
    // finally returns the combined result
    resolve(result);
  }).catch(reject);
});

// pass request.body as req
runScripts(request.body).then((result) => {
  // sends the final result as the response
  response.status(200).json(result);
}).catch((err) => {
  // if something goes wrong
  response.status(500).json('Something broke!');
});
I couldn't check this code because I'm on Windows: if I try to execute the scripts with child_process, it tells me that I'm trying to execute UNKNOWN, even in a bare test with nothing but console.log(cp.execFileSync("./1.sh").toString());. But it worked for you.
So try it and tell me if it works or not.
P.S. Edited for error handling.
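As a side note, util.promisify(cp.execFile) gives the same behavior with much less nesting. A sketch of the whole handler under the same assumptions as above (scripts named 1.sh/2.sh/3.sh, request body is the JSON list):
const util = require('util');
const cp = require('child_process');
const execFile = util.promisify(cp.execFile); // resolves with { stdout, stderr }

app.get('/', async (request, response) => {
  try {
    // keep only the expected script names
    const scripts = request.body.filter(s => /^script[123]$/.test(s));
    // run all requested scripts in parallel without blocking the event loop
    const outputs = await Promise.all(
      scripts.map(s => execFile(`./${s.slice(-1)}.sh`))
    );
    const response_data = {};
    scripts.forEach((s, i) => {
      response_data[`value${s.slice(-1)}`] = outputs[i].stdout;
    });
    response.status(200).json(response_data);
  } catch (err) {
    response.status(500).json('Something broke!');
  }
});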

How to inject a promise into a promise chain with an executor?

What I'm trying to accomplish
I am currently trying to create a wrapper for a db connection (to Neo4j) that works similar to the following:
Instantiate driver
Expose the executor for the driver so a session can be created
Pass my logic
Close the connection
Since there's no destructor in JavaScript, properly closing the session is unnecessarily difficult. The logic for creating and closing a connection is extremely repetitive, and I'm trying to simplify these repetitive scripts so that they're easier to call.
What I've tried.
Inject promise in chain
I thought something like the following could work, but I just cannot seem to get the logic right. Passing session back to my inserted promise is challenging.
const connect = () => {
  var driver;
  var session;
  return Promise.resolve(() => {
    driver = my.driver(uri, creds);
  }).then(() => {
    // insert my promise here, exposing driver.session() function for executor
    // if possible, bind it to the session var so we can properly close it after
    // this would also determine the return value
  }).catch((e) => console.error(e))
    .finally(() => {
      session.close();
      driver.close();
    });
}
Create class wrapper with procedural logic
I then also tried another approach, similar to:
var driver = my.driver(uri, creds);
var session;

function exitHandler(options) {
  // ...
  session.close();
  driver.close();
}

// I have process.on for exit, SIGINT, SIGUSR1, SIGUSR2, and uncaughtException
process.on('exit', exitHandler.bind(null, options));
// ...

class Connection extends Promise<any> {
  constructor(executor: Function) {
    super((resolve, reject) => executor(resolve, reject));
    executor(driver.session.bind(null, this));
  }
}

export default Connection;
And calling it like
// ...
const handler = async () => await new Connection((session) => {
  const s = session();
  // do stuff here
});
The problem with this approach is that the driver is not instantiated before session is used (and so it's undefined). It also feels a little hacky with the process.on calls.
Question
Neither method works (nor any of my other attempts). How can I properly wrap DB connections to ensure they're consistent and to deduplicate my existing code?
A sample of the Neo4j connection script can be found here. This is, essentially, what I'm trying to deduplicate across my scripts (pass everything from line 11 to 42 - inclusive) but have the init of driver, catch, finally, session.close(), driver.close() logic in my wrapper.
Ideally, I would like to expose the session function call so that I can pass parameters to it if needed: See the Session API for more info. If possible, I also want to bind the rxSession reactive session.
A sample of the Neo4j connection script can be found here. This is, essentially, what I'm trying to deduplicate across my scripts (pass everything from line 11 to 42 - inclusive) but have the init of driver, catch, finally, session.close(), driver.close() logic in my wrapper.
OK, the above part of what you are asking is what I was able to best parse and work with.
Taking the code you reference and factoring out lines 11 to 42 such that everything outside of those is shared and everything inside of those is customizable by the caller, this is what I get for the reusable part, designed to be in a module by itself:
// dbwrapper.js
const neo4j = require('neo4j-driver');

const uri = 'neo4j+s://<Bolt url for Neo4j Aura database>';
const user = '<Username for Neo4j Aura database>';
const password = '<Password for Neo4j Aura database>';

const driver = neo4j.driver(uri, neo4j.auth.basic(user, password));
let driverOpen = true;

async function runDBOperation(opCallback, sessOpts = {}) {
  const session = driver.session(sessOpts);
  try {
    // resolve with whatever the caller's callback returns
    return await opCallback(session);
  } catch (e) {
    console.log(e);
    throw e;
  } finally {
    await session.close();
  }
}

async function shutdownDb() {
  if (driverOpen) {
    driverOpen = false;
    await driver.close();
  }
}

process.on('exit', shutdownDb);

module.exports = { runDBOperation, shutdownDb };
Then, you could use this from some other module like this:
const { runDBOperation, shutdownDb } = require('./dbwrapper.js');

runDBOperation(async (session) => {
  const person1Name = 'Alice';
  const person2Name = 'David';
  // To learn more about the Cypher syntax, see https://neo4j.com/docs/cypher-manual/current/
  // The Reference Card is also a good resource for keywords https://neo4j.com/docs/cypher-refcard/current/
  const writeQuery = `MERGE (p1:Person { name: $person1Name })
                      MERGE (p2:Person { name: $person2Name })
                      MERGE (p1)-[:KNOWS]->(p2)
                      RETURN p1, p2`;
  // Write transactions allow the driver to handle retries and transient errors
  const writeResult = await session.writeTransaction(tx =>
    tx.run(writeQuery, { person1Name, person2Name })
  );
  writeResult.records.forEach(record => {
    const person1Node = record.get('p1');
    const person2Node = record.get('p2');
    console.log(
      `Created friendship between: ${person1Node.properties.name}, ${person2Node.properties.name}`
    );
  });
  const readQuery = `MATCH (p:Person)
                     WHERE p.name = $personName
                     RETURN p.name AS name`;
  const readResult = await session.readTransaction(tx =>
    tx.run(readQuery, { personName: person1Name })
  );
  readResult.records.forEach(record => {
    console.log(`Found person: ${record.get('name')}`);
  });
}).then(result => {
  console.log("all done");
}).catch(err => {
  console.log(err);
});
This can be made more flexible or more extensible according to requirements, but obviously the general idea is to keep it simple so that simple uses of the common code don't require a lot of code.
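For example, since runDBOperation forwards sessOpts straight to driver.session(), a caller can request a read-mode session against a specific database. A sketch (option names per the neo4j-driver Session API referenced in the question):
const neo4j = require('neo4j-driver');
const { runDBOperation } = require('./dbwrapper.js');

// The second argument is passed through to driver.session().
runDBOperation(async (session) => {
  const result = await session.readTransaction(tx =>
    tx.run('MATCH (p:Person) RETURN p.name AS name LIMIT 5')
  );
  result.records.forEach(r => console.log(r.get('name')));
}, { database: 'neo4j', defaultAccessMode: neo4j.session.READ });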

How to synchronously serialize api calls in react?

I have a task that requires fetching API data, with the constraint of only one outstanding API request at a time: each must receive a response, or time out, before the next one is issued. Since fetch (or axios) returns a promise, I can't figure out how to wait for each promise to settle before issuing the next fetch.
I'm handed a large array of api url's that must all be resolved in this one-at-a-time manner before continuing.
I’m using create-react-app’s bundled dev server, and Chrome browser.
Curiously, accomplishing this via a node script is easy, because ‘await fetch’ actually waits. Not so in my browser environment, where all the fetch requests blast out at once, returning promises along the way.
Here’s a simple loop that results in the desired behavior as a node script. My question is how to achieve this one-outstanding-request-at-a-time synchronous serialization in the browser environment?
const fetchOne = async (fetchUrl) => {
  try {
    const response = await fetch(fetchUrl, { // Or axios instead
      "headers": {
        'accept': 'application/json',
        'X-API-Key': 'topSecret'
      },
      'method': 'GET'
    });
    const data = await response.json();
    if (response.status == 200) {
      return (data);
    } else {
      // error handling
    }
  } catch (error) {
    // different error handling
  }
}

const fetchAllData = async (fetchUrlArray) => {
  let fetchResponseDataArray = new Array();
  let fetchResponseDataObject = new Object(/*object details*/);
  for (var j = 0; j < fetchUrlArray.length; j++) { // or forEach or map instead
    // Node actually synchronously waits between fetchOne calls,
    // but the react browser environment doesn't wait and instead blasts them all out at once.
    // Question is how to achieve the one-outstanding-request-at-a-time synchronous
    // serialization in the browser environment?
    fetchResponseDataObject = await fetchOne(fetchUrlArray[j]);
    fetchResponseDataArray.push(fetchResponseDataObject);
  }
  return (fetchResponseDataArray);
}
If there's a problem, it's with code you haven't shown (perhaps in one of your components, or maybe in your project configuration).
Here's a runnable example derived from the problem you described, which mocks fetch and an API, showing how to iterate each network request sequentially (and handle potential errors along the way):
Note, handling potential errors at the boundaries where they might occur is a better practice than only having a top level try/catch: by doing so, you can make finer-grained decisions about what to do in response to each kind of problem. Here, each failed request is stored as [url, error] in a separate array so that you can programmatically make decisions if one or more requests failed. (Maybe you want to retry them in a subsequent step, or maybe you want to show something different in the UI, etc.). Note, there's also Promise.allSettled(), which might be useful to you now or in the future.
<div id="root"></div><script src="https://unpkg.com/react#17.0.2/umd/react.development.js"></script><script src="https://unpkg.com/react-dom#17.0.2/umd/react-dom.development.js"></script><script src="https://unpkg.com/#babel/standalone#7.16.4/babel.min.js"></script>
<script type="text/babel" data-type="module" data-presets="env,react">
const {useEffect, useState} = React;
const successChance = {
fetch: 0.95,
server: 0.95,
};
function mockApi (url, chance = successChance.server) {
// Simulate random internal server issue
const responseArgs = Math.random() < chance
? [JSON.stringify({time: performance.now()}), {status: 200}]
: ['Oops', {status: 500}];
return new Response(...responseArgs);
}
function mockFetch (requestInfo, _, chance = successChance.fetch) {
return new Promise((resolve, reject) => {
// Simulate random network issue
if (Math.random() > chance) {
reject(new Error('Network error'));
return;
}
const url = typeof requestInfo === 'string' ? requestInfo : requestInfo.url;
setTimeout(() => resolve(mockApi(url)), 100);
});
}
// Return an object containing the response if successful (else an Error instance)
async function fetchOne (url) {
try {
const response = await mockFetch(url);
if (!response.ok) throw new Error('Response not OK');
const data = await response.json();
return {data, error: undefined};
}
catch (ex) {
const error = ex instanceof Error ? ex : new Error(String(ex));
return {data: undefined, error};
}
}
async function fetchAll (urls) {
const data = [];
const errors = [];
for (const url of urls) {
const result = await fetchOne(url);
if (result.data) data.push([url, result.data]);
else if (result.error) {
// Handle this however you want
errors.push([url, result.error]);
}
}
return {data, errors};
}
function Example () {
const [data, setData] = useState([]);
const [loading, setLoading] = useState(false);
useEffect(() => {
const fetchData = async () => {
setLoading(true);
try {
const {data, errors} = await fetchAll([
'https://my.url/api/0',
'https://my.url/api/1',
'https://my.url/api/2',
'https://my.url/api/3',
'https://my.url/api/4',
'https://my.url/api/5',
'https://my.url/api/6',
'https://my.url/api/7',
'https://my.url/api/8',
'https://my.url/api/9',
]);
setData(data);
}
catch (ex) {
console.error(ex);
}
setLoading(false);
};
fetchData();
}, []);
return (
<div>
<div>Loading: {loading ? '...' : 'done'}</div>
<ul>
{
data.map(([url, {time}]) => (<li
key={url}
style={{fontFamily: 'monospace'}}
>{url} - {time}</li>))
}
</ul>
</div>
);
}
ReactDOM.render(<Example />, document.getElementById('root'));
</script>
