Question: Concurrent queue with max concurrency (per cacheKey) - javascript

This is more of an opinionated question. I do have a working solution, but I'm not 100% comfortable with it, as it has its flaws.
Maybe someone can help me improve it.
Goal:
I have an external API that only allows 4 concurrent calls per user. Our app can impersonate multiple users at once.
So the issue arises when more than 4 calls are made against the API simultaneously (sometimes more than 20 calls are made).
A batched approach with Promise.all and chunking would be very inefficient, as the calls each have a different runtime.
Ideally, the queue would work FIFO: as soon as one call finishes, the next call is started. As it stands, I created 4 separate FIFO queues, each built on something like a Promise chain, and I fill them evenly (when all are running).
The problem is that I do not know how long a request will run,
so picking one of the queues up front can lead to a longer overall runtime than necessary.
The calls are automatically rejected after 30s by the external API, so there is no deadlock here.
Another problem is that I have to return the data provided by the API to a dependency. The queue is called/filled from within a callback function...
This is wrapped in a Promise itself, so we can wait as long as we want,
but storing the values in a cache for later retrieval is not an option either.
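For reference, the behavior described here (FIFO, with the next call starting the moment any of the 4 slots frees up, and results returned straight to the caller) is exactly what a per-key counting semaphore provides. A minimal sketch of that idea, with hypothetical names, shown only for comparison with the actual code below:

class Semaphore {
  private inUse = 0;
  private waiters: (() => void)[] = [];
  constructor(private readonly max: number) {}

  private async acquire(): Promise<void> {
    if (this.inUse < this.max) { this.inUse++; return; }
    // wait FIFO until a finishing task hands its slot over
    await new Promise<void>(resolve => this.waiters.push(resolve));
  }

  private release(): void {
    const next = this.waiters.shift();
    if (next) next(); // hand the slot straight to the oldest waiter
    else this.inUse--;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try { return await task(); }
    finally { this.release(); }
  }
}

// one semaphore per cacheKey, capped at 4 like the class below
const semaphores: Record<string, Semaphore> = {};
function withLimit<T>(cacheKey: string, task: () => Promise<T>): Promise<T> {
  return (semaphores[cacheKey] ??= new Semaphore(4)).run(task);
}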
So long story short, here is the code
class QueuingService {
  /** ~FIFO queue */
  private static queue: Record<string, { promise: Promise<Function>, uuid: string }[]> = {};
  private static MAX_CONCURRENCY = 4;

  /** cacheKey is used as cache key for grouping in the queue */
  public static async addToQueueAndExecute(func: Function, cacheKey: string) {
    let resolver: Function | null = null;
    let promise = new Promise<Function>(resolve => {
      resolver = resolve;
    });
    // in the real world, this is a real key created by nanoId
    let uuid = `${Math.random()}`;
    Array.isArray(this.queue[cacheKey])
      ? this.queue[cacheKey].push({ promise: promise, uuid: uuid })
      : this.queue[cacheKey] = [{ promise: promise, uuid: uuid }];
    // queue in all calls until MAX_CONCURRENCY is reached; after that,
    // shift the first entry and await its promise
    if (this.queue[cacheKey].length > this.MAX_CONCURRENCY) {
      let queuePromise = this.queue[cacheKey].shift();
      if (queuePromise) {
        await queuePromise.promise;
      }
    }
    //console.log("elements in queue:", this.queue[cacheKey].length, cacheKey)
    // technically this wrapping is not necessary, but it makes the code more readable imho
    let result = async () => {
      let res = await func();
      if (resolver) {
        // resolve and clean up
        resolver();
        this.queue[cacheKey] = this.queue[cacheKey].filter(elem => elem.uuid !== uuid);
      }
      //console.log(res, cacheKey, "finished after", new Date().getTime() - enteredAt.getTime(), "ms", "entered at:", enteredAt)
      return res;
    }
    return await result();
  }
}
async function sleep(ms: number) {
  return await new Promise(resolve => window.setTimeout(resolve, ms));
}

async function caller() {
  /* //just for testing with fewer entries
  for (let i = 0; i < 3; i++) {
    let groupKey = Math.random() < 0.5 ? "foo" : "foo";
    QueuingService.addToQueueAndExecute(async () => {
      await sleep(Math.floor(4000 - Math.random() * 2000));
      console.log(i, new Date().getSeconds(), new Date().getMilliseconds(), groupKey);
      return Math.random();
    }, groupKey);
  }
  */
  for (let i = 0; i < 20; i++) {
    // note: both branches yield "foo", so every test call lands in the same group
    let groupKey = Math.random() < 0.5 ? "foo" : "foo";
    let startedAt = new Date();
    QueuingService.addToQueueAndExecute(async () => {
      await sleep(Math.floor(4000 - Math.random() * 2000));
      console.log(i, new Date().getTime() - startedAt.getTime(), groupKey);
      return Math.random();
    }, groupKey);
  }
}
caller();
Also, here is a playground with the code to play around with:
https://www.typescriptlang.org/play?#code/MYGwhgzhAECKCuBTeBLAdgcwMqIE4DcVhFoBvAWACgBIAegCp7oA-AMQElWB5aARySTR6tKtQAOuFPjAAXEhBmyifAYgBc0AEqJgAe1wATADwLJmADRloE3QFsUEddAAKuOw8RHW8NMBkpdNAA+S3hUAw1TdAxoAF8AbQBdIOgAXjJYgG5RCSlZeUV-YGgAWQBBAA0AfQBhLgA5GoBVTU0AUUaATTToABYqUQYmYDBgAAtEAGlEAE9oB2h4RwNoSGgR8cQAa1noADN9aAw3eDFo+bRoGQmVZBJhHPgAIxBlBSViyBnfVYMDABVdAg7mU0AY2gAPHTwOQACj2PmAGm8vn8gUsGwm0xmkRkZgwAEoyKJqCBEDJoLhEBBdCB8HhkYi0ZcAD7QNDwEAgHocrnZGik8nWNz2Rw8xAAdxcIo8XiZAWCsKpNLpJFSKQoAuoytp9NwPR1qv51GosQJ-OglqtltotHQVxuVLA3Il+hABks1wWCzAlMQzugOzmwCdchWTzmaDAaF07AMJLJFLCKBW6QABgASUglWRjAB0uGjBjssIJsTT-IGArKuELMzzDhrddhXogef4d3imKms0SBJJ1AA-A6HO3VF3Rlje3mxEsxrDSML3I4NDZRYhQuENMmVmaBxpW2PO93sYkevFF2uPKuZY5Nynt+E4olKwLbR3BPbndyRlyIKE0H8blymqOpGhadounmGAnU2Aw82gMo9jkfVrlkSwIFeYgHRIPYUFwBRoEQQDcDmItVglMAUApa4SCvRwSRQPZoBbMZRw-RAJ02U88zJTBrmgFJDxA2oGmaVoOhqToiU1E1BQpDjXGXNURzbDiuKnGZEjzCA2OQ0tjRNJiWMU29EAJWS5LASjqNuJAlPXGczIta1XMtWISQ8yg3JtWg9DQFVEF43QMFhAAiRAyVsYiZBge0OLUMLPTYtTxxPac+Iwa4MUnHsZn7SgSVtORxjQIhvzmVtoAlQsxDOTBoPZXQKTQHRqQgMBSMsJ4YXmClbDAHYYBkXR1l0AwSFsfQSCdAwwBeEgUFsMZdATIVlU5Cl0i+H5SzSDUB0TP0YG2myKQRXwDIHYylWpXU8Bkgc6FoQ16VWMF1jJaNFjEJ7XrwK6tWoQ91PSrSehBtLcp4vCQBQ2FIsQWx9qIqK8x3aAAEJUnSHdzQHLyfNc21-MC4LQuVHLuNmSwwrwgKJhWMBkLwJL2UlaAABF8lLPMMHJf4lsQPaAFoiMAvBEAMMoZD5gWhdLcwwtsCA2YiiWqSZmREssGLJelmQCoHKkZHgXBLmVQyvJJE2zcuayqIpDa4cB00qGtygduKC6-AVaBMMQRAxFhFW1A5Wwnge2TbfNijHfZqUHI8W6VXpdUJXQYsJR0+Xot0GEU-u8wVYJAqPa9-Z5UCdZvwBiyqGtBhoFtAArJZzsOOQFHODOBL2SU8HFvEUGpBurQOXBYSOlBUgABkyFAjAAZgXgBqVf6+8nyjuOfOxGxHoc2uAsixLIkjFnvMAFZhzp3RdDCxKDgfse3OBVBMBwAgiCCsA-kBd+iBQTgihMAGEwsK6lnVJqIm1oHa2QDkHWER98x7BAPfSevRZ7YJFigk+YIz70AAEzYNnqXFysCxoBVpEFdBoUUCWFalKbmcICRyxkDgfyBgICKwTlzHmbD+YyBKCgLkHguE8IJOYXepxsQFUoZaGOlw8GFgIbYUsr9XKxGkScfesx5FWkJlaB4W9LSaInlPIUM956LxIWvDeMDt5ChkXouY6QVGn3UefS+N9oB3wfk-e+YUKGuSOu8XAYYZbimYQIkJ1p37RC-oQYgeY-4AiBKoYBkJoRwkgQSaBmiibwIpIg4OeC0EYNhFgnBHi1GlmIaQ8hhSfKkxoeTWEDC+EsOFoI3OPSRbhMibLIRgtoqKxcXI5pbklGlFzPg4sXiplxB0XvSZpi4iaPdlQX8ZJJ4EiAA

Related

Implementing a queue of function calls that runs in parallel to the rest of the code

I'm currently building a program in JavaScript that is making requests of the google sheets API based on activity occurring in a Discord Server (messaging app). However, I've been running into the API RateLimits in cases where multiple users do the same action at the same time, causing too many API Requests in too short of a time.
My idea to get around this is to implement a parallel queue of async function calls, so that whenever I want to make a request of the google API, I queue that function call, and another function or thread or something will keep checking this queue and if there is a function available, it will run that function, wait a little bit, and then check the queue again, and so on.
I'm struggling to figure out how to do this in regular asynchronous (async/await) programming. I've been referring to the following posts/pages, but they all seem focused on a predefined queue that is then dequeued in order - I want to be able to keep adding to the queue even after the functions have started being run.
How do I store javascript functions in a queue for them to be executed eventually
Semaphore-like queue in javascript?
https://www.codementor.io/#edafeadjekeemunotor/building-a-concurrent-promise-queue-with-javascript-1ano2eof0v
Any help or guidance would be very appreciated, thank you!
The simplest option would be to have a queue of promise-returning functions and poll it periodically.
Example:
let queue = []

async function poll() {
  console.log('POLL, queue.length=', queue.length)
  if (queue.length) {
    let result = await queue.shift()()
    console.log('RESULT', result.id)
  }
  setTimeout(poll, 1000)
}

let n = 0

function test() {
  let getter = () => fetch(
    'https://jsonplaceholder.typicode.com/todos/' + (++n)
  ).then(r => r.json())
  queue.push(getter)
}

poll()
<button onclick="test()">click many times</button>
Besides what T.J. Crowder already mentioned about true parallelism in JavaScript, there is the special requirement of being able to continuously add to the queue (enqueue) after the (de)queue-based processing has started. Therefore I doubt there will be any solution based entirely on promises.
Thus, unless one wants to go with permanently running "background" tasks based on setInterval/setTimeout, one has to implement an approach capable of handling callbacks.
One way would be to implement a request class which is capable of dispatching its own (custom) events. This should be possible in both Node and Web API environments (browsers), since the latter provides/supports EventTarget and the former features packages for it.
Possible implementation details are as follows.
Any request-queue is instantiated with an integer batchSize parameter, which indicates the desired number of fetch requests that participate in a single all-settled promise handling.
Once such a promise settles, regardless of any single fetch promise's status, at least one of two custom queue events will be dispatched: the 'new-batch' event type, the 'rejected' event type, or both. Each custom event's detail payload features its type-specific data, e.g. a resolved array for the former and a rejected array for the latter.
Regarding the handling of rejected API calls (the list of rejected fetch URLs), the handling callback could be implemented so that it
- collects/accumulates such data until a certain threshold, at which point one would pass this data to the request-queue's fetch method again;
- prevents fetching the same URL(s) again and again, up to a maximum retry count.
But these proposed features should not be part of the queue implementation; a consumer-side sketch follows below.
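For illustration only, such a consumer-side 'rejected' handler with a capped retry count might look like the following sketch (maxRetries is a made-up threshold, and requestQueue refers to the instance created in main() below):

const retryCounts = new Map<string, number>();
const maxRetries = 3;

requestQueue.addEventListener('rejected', (evt: Event) => {
  // the 'rejected' event's detail carries the array of failed URLs,
  // as dispatched by the queue implementation below
  const { rejected } = (evt as CustomEvent).detail as { rejected: string[] };
  const retryable = rejected.filter(url => {
    const count = (retryCounts.get(url) ?? 0) + 1;
    retryCounts.set(url, count);
    return count <= maxRetries;
  });
  // feed still-retryable URLs back into the queue's sole public method
  if (retryable.length >= 1) {
    requestQueue.fetch(retryable);
  }
});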
// helper functions for logging and creating a list
// of api requests.
function logRequestQueueEvent({ type, detail }) {
  console.log({ [type]: { detail } });
}

function createFetchArrayFromBoundApiCallCount() {
  let { callCount = 0, fetchSize = 12 } = this;
  const result = Array
    .from({ length: fetchSize })
    .map((_, idx) =>
      `https://jsonplaceholder.typicode.com/photos/${ idx + callCount + 1 }`
    );
  this.callCount = callCount + fetchSize;
  return result;
}

// initializing of the main example which uses an
// instance of a custom implemented request-queue
// class which is capable of both
// - fetching continuously
// - and dispatching events.
function main() {
  const requestQueue = new ContinuouslyFetchingRequestQueue(5);

  // a queue instance's three available event types one can subscribe to.
  requestQueue.addEventListener('new-fetch', logRequestQueueEvent);
  requestQueue.addEventListener('new-batch', logRequestQueueEvent);
  requestQueue.addEventListener('rejected', logRequestQueueEvent);

  // as for handling rejected api calls (the list of rejected URLs),
  // - one could implement the handling callback in a way that it
  //   collects/accumulates such data until a certain threshold
  //   where one then would pass this data to the request-queue's
  //   `fetch` method again.
  // - one too could implement functionality which prevents fetching
  //   the same url(s) again and again up to a maximum retry count.
  // but such features should not be part of the queue implementation.
  const createFetchArray = createFetchArrayFromBoundApiCallCount
    .bind({ callCount: 0, fetchSize: 12 });

  document
    .querySelector('[data-request]')
    .addEventListener('click', () =>
      // a queue instance's sole public accessible method.
      requestQueue.fetch(createFetchArray())
    );
}
main();
body { zoom: .9; margin: 0; }
button { display: block; width: 5em; margin: 10px 0; }
.as-console-wrapper { min-height: 100%!important; width: 89%; top: 0; left: auto!important; }
<script>
  // helper function for creating chunks from an array.
  function chunkArray(arr = [], chunkLength = arr.length) {
    chunkLength = Math.abs(chunkLength);

    const result = [];
    while (arr.length >= 1) {
      result.push(
        arr.splice(0, chunkLength)
      );
    }
    return result;
  }

  // `queue` instance related requests and responses handler.
  function handleRequestsAndResponses(queue, fetching, addresses) {
    // for each `addresses` array create an all-settled promise ...
    Promise
      .allSettled(
        addresses.map(url => fetch(url)
          .then(response => response.json())
          .catch(error => ({ error, url }))
        )
      )
      .then(results => {
        // ... where each settled promise item either features
        // the JSON-parsed `value` or a failing `reason`.
        const resolved = results
          .filter(({ status }) => status === 'fulfilled')
          .map(({ value }) => value);

        const rejected = results
          .filter(({ status }) => status === 'rejected')
          .map(({ reason }) => reason.url);

        // since a `queue` instance features inherited
        // `EventTarget` behavior, one can dispatch the
        // above filtered and mapped response arrays as
        // `detail`-payload to a custom-event like 'new-batch'.
        queue
          .dispatchEvent(
            new CustomEvent('new-batch', {
              detail: { resolved, fetching: [...fetching] },
            }),
          );

        // one also could think about dispatching the
        // list of rejected addresses per bundled fetch
        // separately, in case there are any.

        // guard.
        if (rejected.length >= 1) {
          queue
            .dispatchEvent(
              new CustomEvent('rejected', {
                detail: { rejected },
              }),
            );
        }
      });
  }

  // `queue` instance related fetch/request functionality.
  function createBundledFetch(queue, fetching, batchSize) {
    queue
      .dispatchEvent(
        new CustomEvent('new-fetch', {
          detail: { fetching: [...fetching] },
        }),
      );

    // decouple the `queue` related `fetching`
    // reference from the to be started request
    // process by creating a shallow copy.
    const allAddresses = [...fetching];

    // reset/mutate the `queue` related `fetching`
    // reference to an empty array.
    fetching.length = 0;

    // create an array of chunked `addresses` arrays ...
    chunkArray(allAddresses, batchSize)
      .forEach(addresses => setTimeout(
        // ... and invoke each bundled request and
        // response-batch handling as non blocking.
        handleRequestsAndResponses, 0, queue, fetching, addresses,
      ));
  }

  // queue instance method implemented
  // as `this` context aware function.
  function addAddressListToBoundQueueData(...addressList) {
    // assure a flat arguments array (to a certain degree).
    addressList = addressList.flat();

    // guard.
    if (addressList.length >= 1) {
      const { queue, fetching, batchSize } = this;
      fetching.push(...addressList);

      // invoke the bundled fetch creation as non blocking.
      setTimeout(
        createBundledFetch, 0, queue, fetching, batchSize,
      );
    }
  }

  // custom request-queue class which is capable of both
  // - fetching continuously
  // - and dispatching events.
  class ContinuouslyFetchingRequestQueue extends EventTarget {
    constructor(batchSize) {
      super();

      batchSize = Math
        .max(1, Math.min(20, parseInt(batchSize, 10)));

      const fetching = [];
      const queue = this;

      // single/sole public accessible instance method.
      queue.fetch = addAddressListToBoundQueueData
        .bind({ queue, fetching, batchSize });
    }
  }
</script>
<button data-request>add 12 requests</button>
<button onclick="console.clear();">clear console</button>

Aggregate multiple calls then separate result with Promise

Currently I have many concurrent identical calls to my backend, differing only on an ID field:
getData(1).then(...) // Each from a React component in a UI framework, so difficult to aggregate here
getData(2).then(...)
getData(3).then(...)
// creates n HTTP requests... inefficient
function getData(id: number): Promise<Data> {
return backend.getData(id);
}
This is wasteful as I make more calls. I'd like to keep my getData() calls, but then aggregate them into a single getDatas() call to my backend, then return all the results to the callers. I have more control over my backend than the UI framework, so I can easily add a getDatas() call on it. The question is how to "mux" the JS calls into one backend call, then "demux" the result into the callers' promises.
const cache = new Map<number, Promise<Data>>()
let requestedIds = []
let timeout = null;

// creates just 1 http request (per 100ms)... efficient!
function getData(id: number): Promise<Data> {
  if (cache.has(id)) {
    return cache.get(id);
  }
  requestedIds.push(id)
  if (timeout == null) {
    timeout = setTimeout(() => {
      backend.getDatas(requestedIds).then((datas: Data[]) => {
        // TODO: somehow populate many different promises in cache??? but how?
        requestedIds = []
        timeout = null
      })
    }, 100)
  }
  return ???
}
In Java I would create a Map<int, CompletableFuture> and upon finishing my backend request, I would look up the CompletableFuture and call complete(data) on it. But I think in JS Promises can't be created without an explicit result being passed in.
Can I do this in JS with Promises?
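As the answers below demonstrate, the resolve function of a Promise can in fact be captured and invoked later, much like completing a CompletableFuture. A minimal sketch of that "deferred" pattern:

// Minimal "deferred": create the promise first, complete it later.
function createDeferred<T>() {
  let resolve!: (value: T) => void;
  let reject!: (reason?: unknown) => void;
  const promise = new Promise<T>((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}

const d = createDeferred<number>();
d.promise.then(v => console.log('completed with', v));
d.resolve(42); // the JS analogue of CompletableFuture#complete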
A little unclear on what your end goal looks like. I imagine you could loop through your calls as needed; perhaps something like:
if (cache.has(id)) {
  return cache.get(id);
}
// OR
for (let x = 0; x < ids.length; x++) {
  getData(ids[x]);
}
might work. You may be able to add a timing method into the mix if needed.
Not sure what your backend consists of, but I do know GraphQL is a good system for making multiple calls.
It may ultimately be better to handle them all in one request, rather than multiple calls.
The cache can be a regular object mapping ids to promise resolution functions and the promise to which they belong.
// cache maps ids to { resolve, reject, promise, requested }
// resolve and reject belong to the promise, requested is a bool for bookkeeping
const cache = {};
You might need to fire only once, but here I suggest setInterval to regularly check the cache for unresolved requests:
// keep the return value, and stop polling with clearInterval()
// if you really only need one batch, change setInterval to setTimeout
function startGetBatch() {
  return setInterval(getBatch, 100);
}
The business logic calls only getData() which just hands out (and caches) promises, like this:
function getData(id) {
  if (cache[id]) return cache[id].promise;
  cache[id] = {};
  const promise = new Promise((resolve, reject) => {
    Object.assign(cache[id], { resolve, reject });
  });
  cache[id].promise = promise;
  cache[id].requested = false;
  return cache[id].promise;
}
By saving the promise along with the resolver and rejecter, we're also implementing the cache, since the resolved promise will provide the thing it resolved to via its then() method.
getBatch() asks the server in a batch for the not-yet-requested getData() ids, and invokes the corresponding resolve/reject functions:
function getBatch() {
  // collect any ids that have not been requested yet
  const ids = [];
  Object.keys(cache).forEach(id => {
    if (!cache[id].requested) {
      cache[id].requested = true;
      ids.push(id);
    }
  });
  return backend.getDatas(ids).then(datas => {
    Object.keys(datas).forEach(id => {
      cache[id].resolve(datas[id]);
    })
  }).catch(error => {
    // reject via ids; datas is not in scope when the batch call fails
    ids.forEach(id => {
      cache[id].reject(error);
      delete cache[id]; // so we can retry
    })
  })
}
The caller side looks like this:
// start polling
const interval = startGetBatch();
// in the business logic
getData(5).then(result => console.log('the result of 5 is:', result));
getData(6).then(result => console.log('the result of 6 is:', result));
// sometime later...
getData(5).then(result => {
// if the promise for an id has resolved, then-ing it still works, resolving again to the -- now cached -- result
console.log('the result of 5 is:', result)
});
// later, whenever we're done
// (no need for this if you change setInterval to setTimeout)
clearInterval(interval);
I think I've found a solution:
interface PromiseContainer<T> {
  resolve: (value: T) => void;
  reject: (reason?: any) => void;
}

const requests: Map<number, PromiseContainer<Data>> = new Map();
let timeout: number | null = null;

function getData(id: number) {
  // note: a second call with the same id before the flush would overwrite
  // the first resolver; caching the promise per id (as in the answer above) avoids that
  const promise = new Promise<Data>((resolve, reject) => requests.set(id, { resolve, reject }))
  if (timeout == null) {
    timeout = setTimeout(() => {
      backend.getDatas([...requests.keys()]).then(datas => {
        for (let [id, data] of Object.entries(datas)) {
          requests.get(Number(id)).resolve(data)
          requests.delete(Number(id))
        }
      }).catch(e => {
        // requests is a Map, so iterate its values; Object.values(requests) would yield nothing
        for (let container of requests.values()) container.reject(e)
        requests.clear()
      })
      timeout = null
    }, 100)
  }
  return promise;
}
The key was figuring out I could extract the (resolve, reject) from a promise, store them, then retrieve and call them later.

JS: what's a use-case of Promise.resolve()

I am looking at https://www.promisejs.org/patterns/ and it mentions it can be used if you need a value in the form of a promise like:
var value = 10;
var promiseForValue = Promise.resolve(value);
What would be the use of a value in promise form though since it would run synchronously anyway?
If I had:
var value = 10;
var promiseForValue = Promise.resolve(value);
promiseForValue.then(resp => {
  myFunction(resp)
})
wouldn't just using value without it being a Promise achieve the same thing:
var value = 10;
myFunction(10);
Say if you write a function that sometimes fetches something from a server, but other times immediately returns, you will probably want that function to always return a promise:
function myThingy() {
  if (someCondition) {
    return fetch('https://foo');
  } else {
    return Promise.resolve(true);
  }
}
It's also useful if you receive some value that may or may not be a promise. You can wrap it in other promise, and now you are sure it's a promise:
const myValue = someStrangeFunction();
// Guarantee that myValue is a promise
Promise.resolve(myValue).then( ... );
In your examples, yes, there's no point in calling Promise.resolve(value). The use case is when you do want to wrap your already existing value in a Promise, for example to maintain the same API from a function. Let's say I have a function that conditionally does something that would return a promise — the caller of that function shouldn't be the one figuring out what the function returned, the function itself should just make that uniform. For example:
const conditionallyDoAsyncWork = (something) => {
  if (something == somethingElse) {
    return Promise.resolve(false)
  }
  return fetch(`/foo/${something}`)
    .then((res) => res.json())
}
Then users of this function don't need to check if what they got back was a Promise or not:
const doSomethingWithData = () => {
  conditionallyDoAsyncWork(someValue)
    .then((result) => result && processData(result))
}
As a side note, using async/await syntax both hides that and makes it a bit easier to read, because any value you return from an async function is automatically wrapped in a Promise:
const conditionallyDoAsyncWork = async (something) => {
  if (something == somethingElse) {
    return false
  }
  const res = await fetch(`/foo/${something}`)
  return res.json()
}

const doSomethingWithData = async () => {
  const result = await conditionallyDoAsyncWork(someValue)
  if (result) processData(result)
}
Another use case: dead simple async queue using Promise.resolve() as starting point.
let current = Promise.resolve();

function enqueue(fn) {
  current = current.then(fn);
}

enqueue(async () => { console.log("async task") });
Edit, in response to OP's question.
Explanation
Let me break it down for you step by step.
enqueue(task) adds the task function as a callback via promise.then, and replaces the original current promise reference with the newly returned thenPromise.
current = Promise.resolve()
thenPromise = current.then(task)
current = thenPromise
As per the promise spec, if the task function in turn returns yet another promise (call it task() -> taskPromise), then thenPromise will only resolve when taskPromise resolves. thenPromise is practically equivalent to taskPromise; it's just a wrapper. Let's rewrite the above code as:
current = Promise.resolve()
taskPromise = current.then(task)
current = taskPromise
So if you go like:
enqueue(task_1)
enqueue(task_2)
enqueue(task_3)
it expands into
current = Promise.resolve()
task_1_promise = current.then(task_1)
task_2_promise = task_1_promise.then(task_2)
task_3_promise = task_2_promise.then(task_3)
current = task_3_promise
This effectively forms a linked-list-like structure of promises that executes the task callbacks in sequential order, as the demo below shows.
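A tiny runnable demo (illustrative only) confirms that enqueue order is preserved even when earlier tasks take longer:

let current: Promise<unknown> = Promise.resolve();
const enqueue = (fn: () => Promise<void>) => { current = current.then(fn); };
const delay = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

enqueue(async () => { await delay(300); console.log('task 1'); });
enqueue(async () => { await delay(100); console.log('task 2'); });
enqueue(async () => { console.log('task 3'); });
// logs "task 1", "task 2", "task 3" in enqueue order, despite the differing delays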
Usage
Let's study a concrete scenario. Imagine you need to handle websocket messages in sequential order.
Let's say you need to do some heavy computation upon receiving messages, so you decide to send the work off to a worker thread pool. Then you write the processed result to another message queue (MQ).
But here's the requirement: the MQ expects the write order of the messages to match the order in which they came in from the websocket stream. What do you do?
Suppose you cannot pause the websocket stream; you can only handle the messages locally ASAP.
Take One:
websocket.on('message', (msg) => {
  sendToWorkerThreadPool(msg).then(result => {
    writeToMessageQueue(result)
  })
})
This may violate the requirement, because sendToWorkerThreadPool may not return results in the original order: it's a pool, and some threads may return faster if their workload is lighter.
Take Two:
websocket.on('message', (msg) => {
  const task = () => sendToWorkerThreadPool(msg).then(result => {
    writeToMessageQueue(result)
  })
  enqueue(task)
})
This time we enqueue (defer) the whole process, so the task execution order stays sequential. But there's a drawback: we lose the benefit of using a thread pool, because each sendToWorkerThreadPool call only fires after the previous one completes. This model is equivalent to using a single worker thread.
Take Three:
websocket.on('message', (msg) => {
  const promise = sendToWorkerThreadPool(msg)
  const task = () => promise.then(result => {
    writeToMessageQueue(result)
  })
  enqueue(task)
})
The improvement over take two is that we call sendToWorkerThreadPool ASAP, without deferring, but we still enqueue/defer the writeToMessageQueue part. This way we make full use of the thread pool for computation while still ensuring the sequential write order to the MQ.
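One caveat worth noting with this pattern: if any enqueued task rejects, current stays a rejected promise and every later task's callback is skipped. A per-link catch keeps the chain alive; a minimal sketch:

let current: Promise<unknown> = Promise.resolve();

function enqueue(fn: () => Promise<unknown>) {
  // catch per link, so one failed task cannot poison the whole chain
  current = current.then(fn).catch(err => {
    console.error('queued task failed:', err);
  });
}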
I rest my case.

Why is my code executing although it should pause?

I have an API which is limited regarding how many requests per minute (50/minute) I can send to any endpoint provided by that API.
In the following code section, I filter the order objects, each of which has a URL as a property; every object whose URL provides data should be stored in successfulResponses in my app.component.ts.
Promise.all(
  orders.map(order => this.api.getURL(order.resource_url).catch(() => null))
).then(responses => {
  const successfulResponses = responses.filter(response => response != null)
  for (let data of successfulResponses) {
    // some other requests should be sent with data here
  }
});
There are more than 50 orders to check, but I can only check a maximum of 50 orders at once, so I try to handle it in my service. I store the date when the first request is sent. After that, I compare each new request's date with the first one. If the difference is over 60 seconds, I set the stored date to the current one and reset maxReq to 50. If it is under 60, I check whether there are requests left; if yes, I send the request, and if not, I just wait one minute:
sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async getURL() {
  if (!this.date) {
    let date = new Date();
    this.date = date;
  }
  // note: getSeconds() wraps at 60, so this difference can never exceed 60;
  // comparing getTime() values (milliseconds) would measure real elapsed time
  if ((new Date().getSeconds() - this.date.getSeconds() > 60)) {
    this.maxReq = 50;
    this.date = new Date();
    return this.http.get(url, this.httpOptions).toPromise();
  } else {
    if (this.maxReq > 0) {
      this.maxReq -= 1;
      return this.http.get(url, this.httpOptions).toPromise();
    } else {
      console.log("wait");
      await this.sleep(60 * 1000);
      this.maxReq = 50;
      this.date = new Date();
      return this.http.get(url, this.httpOptions).toPromise();
    }
  }
}
However, the code in app.component.ts is not waiting for the function getURL() and executes further code with requests, which leads to the problem that I send 'too many requests too quickly'.
What can I do about that?
I had a similar problem while trying to use promises with multiple async functions. It's an easy thing to forget, but in order to make them all wait, you have to use await on the root line that calls the function in question.
I'm not entirely certain, but my presumption is that your await this.sleep(60*1000); line is indeed waiting for a timeout to occur, but whilst it is doing this, the code that called getURL() is executing the rest of its lines, because it did not have an await (or equivalent, like .then) before getURL().
The way I discovered this in my case was by using a good debugging tool (I used Chrome DevTools's own debugging features). I advise you do the same, adding breakpoints everywhere, and see where your code is going with each line.
Here is a short, rough example to show what I mean:
// This code increments a number from 1 to 2 to 3 and returns it each time after a delay of 1 second.
async function loop() {
  for (let i = 1; i <= 3; i++) {
    console.log('Input start');
    /* The following waits for the result of aSync before continuing.
       Without 'await', it would execute the last line
       of this function whilst aSync's own 'await'
       waited for its result.
       --- This is where I think your code goes wrong. --- */
    await aSync(i);
    console.log('Input end');
  }
}

async function aSync(num) {
  console.log('Return start');
  /* The following waits for the 1-second delay before continuing.
     Without 'await', it would return a pending promise immediately
     each time. */
  let result = await new Promise(
    // I'm not using arrow functions to show what it's doing more clearly.
    function(rs, rj) {
      setTimeout(
        function() {
          /* For those who didn't know, the following passes the number
             into the 'resolved' ('rs') parameter of the promise's executor
             function. Without doing this, the promise would never be fulfilled. */
          rs(num);
        }, 1000
      )
    }
  );
  console.log(result);
  console.log('Return end');
}

loop();

Implementing Lock in Node.js

Preface:
In order to solve my problem, you have to have knowledge in the following areas: thread-safety, Promise, async-await.
For people who are not familiar with TypeScript, it's just normal JavaScript (ES6) with type annotations.
I have a function named excludeItems that accepts an item list (each item is a string), and calls an API (that excludes the item) for each item. It's important not to call the API twice for the same item, not even in different executions of the function, so I save in a local DB the items that are already excluded.
async function excludeItems(items: string[]) {
  var excludedItems = await db.getExcludedItems();
  for (var i in items) {
    var item = items[i];
    var isAlreadyExcluded = excludedItems.find(excludedItem => excludedItem == item);
    if (isAlreadyExcluded) {
      continue;
    }
    await someApi.excludeItem(item);
    await db.addExcludedItem(item);
  }
}
This function is called asynchronously by its client several times instantaneously, meaning the client calls the function say 5 times before the first execution is completed.
A concrete scenario:
excludeItems([]);
excludeItems(['A','B','C']);
excludeItems([]);
excludeItems(['A','C']);
excludeItems([]);
In this case, although Node.js is single-threaded, the critical section problem exists here, and I get wrong results. This is my "excludedItems" collection in my local DB after executing that scenario:
[{item: 'A'},
{item: 'B'},
{item: 'C'},
{item: 'A'},
{item: 'C'}]
As you can see, the last 'A' and 'C' are redundant (meaning that the API was also called twice for these items).
This occurs due to the await statements in the code. Every time an await statement is reached, the function suspends and a new Promise is created under the hood, so although Node.js is single-threaded, the next async function that was waiting to be executed gets executed, and that way the critical section runs concurrently.
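A minimal illustration of that interleaving:

async function worker(name: string) {
  console.log(name, 'read');  // both calls run their synchronous part first
  await Promise.resolve();    // any await yields control back to the event loop
  console.log(name, 'write');
}
worker('A');
worker('B');
// logs: A read, B read, A write, B write: the critical sections interleave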
To solve that problem, I've implemented a locking mechanism:
var excludedItemsLocker = false;

async function safeExcludeItems(items: string[]) {
  while (excludedItemsLocker) {
    await sleep(100);
  }
  try {
    excludedItemsLocker = true;
    var excludedItems: string[] = await db.getExcludedItems();
    for (var i in items) {
      var item = items[i];
      var isAlreadyExcluded = excludedItems.find(excludedItem => excludedItem == item);
      if (isAlreadyExcluded) {
        continue;
      }
      await someApi.excludeItem(item);
      await db.addExcludedItem(item);
    }
  }
  finally {
    excludedItemsLocker = false;
  }
}

async function sleep(duration: number): Promise<Object> {
  return new Promise(function (resolve) {
    setTimeout(resolve, duration);
  });
}
However, this implementation does not work for some reason. I still get more than one (alleged) "thread" in the critical section, meaning it still executes concurrently and my local DB is filled with the same wrong results. BTW, the sleep method works as expected; its purpose is just to give CPU time to the next function call that's waiting to be executed.
Does anybody see what's broken in my implementation?
BTW, I know that I can achieve the same goal without implementing a lock, for example by calling db.getExcludedItems inside the loop, but I want to know why my lock implementation is broken.
If the parameters are:
['A','B','C']
and db.getExcludedItems() returns:
[{item: 'A'},
{item: 'B'},
{item: 'C'}]
Then you are trying to find a string in an array of objects, which will always return undefined:
var isAlreadyExcluded = excludedItems.find(excludedItem => excludedItem == item);
Just a thought, because I can't see any problem with the locking itself; it should work as expected.
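If the DB rows really have the shape {item: 'A'} as in the sample above, comparing against the item property would fix the lookup; a sketch of that one-line change:

// compare the string against each row's item property
// (assuming rows shaped like { item: 'A' }, per the sample data)
var isAlreadyExcluded = excludedItems.find(excludedItem => excludedItem.item == item);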
I encountered a similar problem a while back, and I ended up implementing a ticket system where each "thread" would request a ticket and wait in a queue (I know it's not a thread, but it's easier to say than 'next set of functions in the event loop'). It's an NPM package found at promise-ticket, but the crux of the solution was to have a generator function returning promises that resolve when an EventEmitter emits their ticket number:
const { EventEmitter } = require('events'); // Node's events module

let emitter = new EventEmitter();
let nextTicket = 0;
let currentTicket = 0;
let skips = [];

const promFn = (resolve) => {
  let num = currentTicket++;
  emitter.once(num, () => resolve(num));
};

const generator = (function* () {
  while (true) yield new Promise(promFn);
})();

// Someone takes a ticket from the machine
this.queue = (resolveValue) => {
  let ticketNumber = currentTicket;
  let p = generator.next().value;
  if (resolveValue !== undefined) p = p.then(() => resolveValue);
  if (skips.includes(ticketNumber)) emitter.emit(ticketNumber);
  return p;
};
Using promise-ticket is pretty easy:
const TicketMachine = require("promise-ticket");
const tm = new TicketMachine();
// Queue some tickets
tm.queue("resolve1"); // id=0
tm.queue("resolve2"); // id=1
tm.queue("resolve4"); // id=2
tm.queue("resolve3"); // id=3
// Call those tickets
tm.next(); // resolve1
tm.next(); // resolve2
tm.next(3); // resolve3
tm.next(2); // resolve4
I guess it's not a perfect solution, but IMO it's better than wrapping entire code blocks in callbacks.
I guess for your problem (if you still have it 5 years later), you should query your localstorage and abort if those items have been sent (by calling ticketMachine.next()).
