How to handle a concurrency issue gracefully - javascript

I am working on an app where I need to implement an offline architecture. My implementation saves every request the user performs, such as updating the order status or the delivery status.
The requests have to be processed sequentially, like Req1 (Shipped) -> Req2 (Out For Delivery) -> Req3 (Delivered). I have a method (uploadRequests) that syncs with the server. Whenever the internet is back and the user refreshes the order list with a swipe, I upload the offline data and then call the order list API. It looks like this:
OrderList.js
<List onRefresh={() => uploadRequests()} />
offline-request.js
export async function uploadRequests() {
for (let index = 0; index < sortedRequests.length; index++) {
const keyName = "REQUEST_" + sortedRequests[index];
const { consigmentID, userID, type, request, sync } = await load(keyName);
// Update sync request
const payload = {
type: TYPE.UPDATE_LINE_ITEM,
request: request,
sync: SYNC.IN_PROCESS,
userID: userID,
consigmentID: consigmentID,
};
await save(keyName, payload);
// update sync status to SYNC.IN_PROCESS in the storage
if (sync === SYNC.TO_BE_UPLOAD) {
// make API call to upload this request to server
}
// delete the request from storage once api hit is success
await remove(keyName);
}
}
Problem:
Initially every offline payload has sync status SYNC.TO_BE_UPLOAD. Once syncing starts for a request, I update its status to SYNC.IN_PROCESS, one by one as each request is sent to the server, so that if the user swipes multiple times the same request won't be picked up again.
Say I have saved 30 requests in storage to sync with the server. The user swipes up while the internet is back, and suppose this first run has processed the 1st to 5th requests and set their sync status to IN_PROCESS so another run won't process them. If the user swipes up the list again in the meantime, uploadRequests is called a second time: it picks up the 6th to 30th requests, but the first run will also go on to process the 6th to 30th requests, so both runs process them and those requests are sent twice.
How can I handle this problem gracefully in JavaScript without a flag variable? The main requirement is that every other place where I call the API has to wait for this storage to be cleared first.
The problem with a flag variable: suppose I am on the order list page, I swipe up and it starts uploading requests. If I swipe up again I can ignore that second call using the flag, but if I tap on one order and mark it delivered, that path would also skip the offline requests and make the delivered API call straight away. What I want is for the offline requests to be cleared from storage first, and only then for the new call to be made once the internet is available. This is why I don't want to use a variable like isOfflineRunning = true/false.
Any suggestions on how to solve this?

You can wrap your uploadRequests function with a singleCallOnly helper like this:
const singleCallOnly = fn => {
let lastPromise = null;
return async (...args) => {
if(lastPromise) await lastPromise;
lastPromise = fn(...args);
return lastPromise;
}
}
export const uploadRequests = singleCallOnly(_uploadRequests)
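Because the wrapped function always returns the promise for the run that is currently in flight (or the one it just started), any other code path that must wait for storage to be flushed can simply await the same export before making its own API call. A minimal sketch (markDelivered and api.updateOrderStatus are illustrative names, not from your code):
// somewhere in an order details screen (illustrative)
import { uploadRequests } from './offline-request';

async function markDelivered(orderId) {
  // flush any queued offline requests first
  await uploadRequests();
  // hypothetical API call - replace with your real "delivered" endpoint
  await api.updateOrderStatus(orderId, 'DELIVERED');
}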
You can run the attached snippet for a sample response.
However, as a side note, I'm doubtful about the way you're consuming sortedRequests. With the current approach you have to ensure that no element is added to or removed from sortedRequests while that function is running.
If sortedRequests is like a queue of pending requests, you should consume it like
while(sortedRequests.length) {
const key = sortedRequests.shift();
const keyName = "REQUEST_" + key;
// ... rest of code
}
// Helper functions start
const sleep = async (ms) => new Promise((res) => setTimeout(() => res(ms), ms));
const simulateLatency = () => sleep(50 + Math.floor(Math.random() * 500));
const SYNC = {
TO_BE_UPLOAD: "TO_BE_UPLOAD",
IN_PROCESS: "IN_PROCESS",
};
const TYPE = {
UPDATE_LINE_ITEM: "UPDATE_LINE_ITEM",
};
const requests = Array(10)
.fill()
.map((_, i) => ({
consigmentID: `c${i}`,
userID: `u${i}`,
type: TYPE.UPDATE_LINE_ITEM,
request: "some-request",
sync: SYNC.TO_BE_UPLOAD,
key: `REQUEST_K${i}`,
}));
const load = async (key) => {
// simulate latency
await simulateLatency();
return requests.find((r) => r.key === key);
};
const save = async (keyname, payload) => {
await simulateLatency();
requests.find((r) => r.key === keyname).sync = SYNC.IN_PROCESS;
};
const remove = async (keyname) => {
await simulateLatency();
const idx = requests.findIndex((r) => r.key === keyname);
if (idx < 0) return;
requests.splice(idx, 1);
};
const sortedRequests = Array(5)
.fill()
.map((_, i) => `K${i}`);
// Helper functions end
// YOUR CODE STARTS
async function _uploadRequests() {
while(sortedRequests.length) {
const key = sortedRequests.shift()
const keyName = "REQUEST_" + key;
console.log('start', keyName);
const { consigmentID, userID, type, request, sync } = await load(keyName);
// Update sync request
const payload = {
type: TYPE.UPDATE_LINE_ITEM,
request: request,
sync: SYNC.IN_PROCESS,
userID: userID,
consigmentID: consigmentID,
};
await save(keyName, payload);
// update sync status to SYNC.IN_PROCESS in the storage
if (sync === SYNC.TO_BE_UPLOAD) {
// make API call to upload this request to server
}
// delete the request from storage once api hit is success
await remove(keyName);
console.log('done', keyName);
}
}
const singleCallOnly = fn => {
let lastPromise = null;
return async (...args) => {
if(lastPromise) await lastPromise;
lastPromise = fn(...args);
return lastPromise;
}
}
(async () => {
const uploadRequests = singleCallOnly(_uploadRequests);
uploadRequests();
await sleep(1000);
uploadRequests();
await sleep(1000);
uploadRequests();
})();
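One caveat with the wrapper above (my note, not part of the original answer): if _uploadRequests ever rejects, lastPromise stays rejected and every later call will re-throw while awaiting it. A small variant that resets the reference once a run settles avoids that:
const singleCallOnly = fn => {
  let lastPromise = null;
  return async (...args) => {
    // wait for the in-flight run, but don't re-throw its error here
    if (lastPromise) await lastPromise.catch(() => {});
    const current = fn(...args).finally(() => {
      // only clear the reference if a newer run hasn't replaced it
      if (lastPromise === current) lastPromise = null;
    });
    lastPromise = current;
    return current;
  };
};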

Related

Limit the maximum number of concurrent requests

There is a function:
export function getImage(requestParameters: TRequestParameters): TRequest<TResponse<ImageBitmap | HTMLImageElement>> {
const request = helper.getArrayBuffer(requestParameters);
return {
response: (async () => {
const response = await request.response;
const image = await arrayBufferToCanvasImageSource(response.data);
return {
data: image,
cacheControl: response.cacheControl,
expires: response.expires
};
})(),
cancel: request.cancel
};
}
The function is synchronous, but it returns an object consisting of two fields: response, a Promise that resolves with an object (three fields: data, cacheControl and expires, though that's not what is interesting here), and cancel, a method that cancels the request.
The function works as expected and everything about it is just fine. However, I need to implement an additional constraint. It is necessary to make sure that the number of parallel (simultaneous) requests to the network at any given point in time does not exceed n.
Thus, if n === 0, no request should be made. If n === 1, only one image can be loaded at a time (that is, all images are loaded sequentially). For n === m, m > 1, no more than m images can be loaded simultaneously.
My solution
Based on the fact that the getImage function is synchronous, the line
const request = helper.getArrayBuffer(requestParameters);
is executed immediately when getImage is called. That's not what we want though, we need to postpone the execution of the request itself. Therefore, we will replace the request variable with the requestMaker function, which we will call only when we need it:
export function getImage(requestParameters: TRequestParameters): TRequest<TResponse<ImageBitmap | HTMLImageElement>> {
if (webpSupported.supported) {
if (!requestParameters.headers) requestParameters.headers = {};
requestParameters.headers['Accept'] = 'image/webp,*/*';
}
function requestMaker() {
const request = helper.getArrayBuffer(requestParameters);
return request;
}
return {
response: (async () => {
const response = await requestMaker().response;
const image = await arrayBufferToCanvasImageSource(response.data);
return {
data: image,
cacheControl: response.cacheControl,
expires: response.expires
};
})(),
cancel() {
//
}
};
}
(Let's omit cancel for now, for the sake of simplicity.)
Now the execution of this requestMaker function, which makes the request itself, needs to be postponed until some point.
Suppose now we are trying to solve the problem only for n === 1.
Let's create an array in which we will store all requests that are currently running:
const ongoingImageRequests = [];
Now, inside requestMaker, we will save requests to this variable as soon as they occur, and delete them as soon as we receive a response:
const ongoingImageRequests = [];
export function getImage(requestParameters: TRequestParameters): TRequest<TResponse<ImageBitmap | HTMLImageElement>> {
if (webpSupported.supported) {
if (!requestParameters.headers) requestParameters.headers = {};
requestParameters.headers['Accept'] = 'image/webp,*/*';
}
function requestMaker() {
const request = helper.getArrayBuffer(requestParameters);
ongoingImageRequests.push(request);
request.response.finally(() => ongoingImageRequests.splice(ongoingImageRequests.indexOf(request), 1));
return request;
}
return {
response: (async () => {
const response = await requestMaker().response;
const image = await arrayBufferToCanvasImageSource(response.data);
return {
data: image,
cacheControl: response.cacheControl,
expires: response.expires
};
})(),
cancel() {
//
}
};
}
All that's left now is to add a restriction on launching requestMaker: before starting it, we wait until all the requests already in the array have finished:
const ongoingImageRequests = [];
export function getImage(requestParameters: TRequestParameters): TRequest<TResponse<ImageBitmap | HTMLImageElement>> {
if (webpSupported.supported) {
if (!requestParameters.headers) requestParameters.headers = {};
requestParameters.headers['Accept'] = 'image/webp,*/*';
}
function requestMaker() {
const request = helper.getArrayBuffer(requestParameters);
ongoingImageRequests.push(request);
request.response.finally(() => ongoingImageRequests.splice(ongoingImageRequests.indexOf(request), 1));
return request;
}
return {
response: (async () => {
await Promise.allSettled(ongoingImageRequests.map(ongoingImageRequest => ongoingImageRequest.response));
const response = await requestMaker().response;
const image = await arrayBufferToCanvasImageSource(response.data);
return {
data: image,
cacheControl: response.cacheControl,
expires: response.expires
};
})(),
cancel() {
//
}
};
}
My understanding is this: when getImage is called (from somewhere outside), it immediately returns an object whose response is a Promise that will resolve no earlier than the moment all the other requests already in the queue have completed.
But as it turns out, this solution does not work, and I don't understand why. How can I make it work, at least for n === 1?
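One thing worth noting (my observation, not from the original post): each call to getImage evaluates ongoingImageRequests.map(...) synchronously, before that call's own requestMaker() has pushed anything, so several calls made in the same tick can all see the same (possibly empty) snapshot and proceed together. For comparison, the n === 1 case is often expressed as a tiny promise chain that every new request appends to; a minimal sketch (names are illustrative):
// One-at-a-time gate: every caller chains onto the previous tail,
// so requests start strictly one after another (n === 1).
let tail = Promise.resolve();

function enqueue(startRequest) {
  const result = tail.then(() => startRequest());
  // keep the chain going even if one request fails
  tail = result.catch(() => {});
  return result;
}

// usage sketch, inside getImage:
// const responsePromise = enqueue(() => requestMaker().response);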

How to improve Javascript performance

I know the title is quite generic, but I am inserting 1 million records into AWS DynamoDB and it currently takes ~30 minutes.
I have the 1 million records in memory; I just need to improve the speed of inserting the items. AWS only allows sending batches of 25 records, and all my code is synchronous.
Each object usually holds a very small amount of data (e.g. 3-5 properties with numeric ids).
I read the 1 million entries from a CSV and store them in a data array.
Then I do this:
await DatabaseHandler.batchWriteItems('myTable', data); // data length is 1 Million
Which calls my insert function
const documentClient = new DynamoDB.DocumentClient();
export class DatabaseHandler {
static batchWriteItems = async (tableName: string, data: {}[]) => {
// AWS only allows batches of max 25 items
while (data.length) {
const batch = data.splice(0, 25);
const putRequests = batch.map((elem) => {
return {
PutRequest: {
Item: elem
}
};
});
const params = {
RequestItems: {
[tableName]: putRequests,
},
};
await documentClient.batchWrite(params).promise();
}
}
}
I believe I am making 40,000 HTTP requests, each creating 25 records in the database.
Is there any way to improve this? Even some ideas would be great
Your code is "blocking" in the sense that you wait for the previous batch to finish before sending the next one, so you're not taking advantage of promises. Instead, you can send all your requests at once and let JavaScript's asynchrony do the work for you, which will be significantly faster:
// in your class method:
const proms = []; // <-- create a promise array
while (data.length) {
const batch = data.splice(0, 25);
const putRequests = batch.map((elem) => {
return {
PutRequest: {
Item: elem
}
};
});
const params = {
RequestItems: {
[tableName]: putRequests,
},
};
proms.push(documentClient.batchWrite(params).promise()); // <-- add the promise to our array
}
await Promise.all(proms); // <-- wait for everything to be resolved asynchronously, then be done
This will speed up your requests monumentally, as long as AWS lets you send that many concurrent requests.
I'm not sure how exactly you implemented the code, but to prove that this works, here's a dummy implementation (it takes a little while to run):
const request = (_, t = 5) => new Promise(res => setTimeout(res, t)); // implement a dummy request API
// with your approach
async function a(data) {
while(data.length) {
const batch = data.splice(0, 25);
await request(batch);
}
}
// non-blocking
async function b(data) {
const proms = [];
while(data.length) {
const batch = data.splice(0, 25);
proms.push(request(batch));
}
await Promise.all(proms);
}
(async function time(a, b) {
const dataA = Array(10000).fill(); // create some dummy data (10,000 instead of a million or you'll be staring at this demo for a while)
const dataB = Array(10000).fill(); // separate copy for the second run, since both functions consume (splice) their input
console.time("original");
await a(dataA);
console.timeEnd("original");
console.time("optimized");
await b(dataB);
console.timeEnd("optimized");
})(a, b);
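A caveat worth adding (not from the original answer): DynamoDB may throttle or return unprocessed items if you open tens of thousands of connections at once. A middle ground is to await the batches in capped waves; a rough sketch reusing the same data/tableName/documentClient and splice/batchWrite shape as above, with an illustrative concurrency value:
// Sketch: fire up to `concurrency` batchWrite calls at a time instead of all at once.
const concurrency = 40; // illustrative value - tune against your table's write capacity
while (data.length) {
  const wave = [];
  for (let i = 0; i < concurrency && data.length; i++) {
    const batch = data.splice(0, 25);
    const putRequests = batch.map((elem) => ({ PutRequest: { Item: elem } }));
    wave.push(documentClient.batchWrite({ RequestItems: { [tableName]: putRequests } }).promise());
  }
  await Promise.all(wave); // wait for this wave before starting the next
}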

Parallel HTTP requests in batches with async for loop for each request

I am trying to run parallel requests in batches to an API, using a bunch of keywords in an array, following an article by Denis Fatkhudinov.
The problem I am having is that for each keyword I need to repeat the request with a different page argument, as many times as the number in the pages variable.
I keep getting Cannot read property 'then' of undefined for the return of the chainNext function.
The parallel requests in batches on their own, without the for loop, work great; I am struggling to incorporate the for loop into the process.
// Parallel requests in batches
async function runBatches() {
// The keywords to request with
const keywords = ['many keyword strings here...'];
// Set max concurrent requests
const concurrent = 5;
// Clone keywords array
const keywordsClone = keywords.slice()
// Array for future resolved promises for each batch
const promises = new Array(concurrent).fill(Promise.resolve());
// Async for loop
const asyncForEach = async (pages, callback) => {
for (let page = 1; page <= pages; page++) {
await callback(page);
}
};
// Number of pages to loop for
const pages = 2;
// Recursively run batches
const chainNext = (pro) => {
// Runs itself as long as there are entries left on the array
if (keywordsClone.length) {
// Store the first entry and conveniently also remove it from the array
const keyword = keywordsClone.shift();
// Run 'the promise to be' request
return pro.then(async () => {
// ---> Here was my problem, I am declaring the constant before running the for loop
const promiseOperation = await asyncForEach(pages, async (page) => {
await request(keyword, page)
});
// ---> The recursive invocation should also be inside the for loop
return chainNext(promiseOperation);
});
}
return pro;
}
return await Promise.all(promises.map(chainNext));
}
// HTTP request
async function request(keyword, page) {
try {
// request API
const res = await apiservice(keyword, page);
// Send data to an outer async function to process the data
await append(res.data);
} catch (error) {
throw new Error(error)
}
}
runBatches()
The problem is simply that pro is undefined, because you haven't initialized it.
You basically execute this code:
Promise.all(new Array(concurrent).fill(Promise.resolve()).map(pro => {
// pro is undefined here because the Promise.resolve had no parameter
return pro.then(async () => {})
}));
I'm not completely sure about your idea behind that, but this is your problem in a more condensed version.
I got it working by moving the actual request (promiseOperation) inside the for loop and returning the recursive invocation there too:
// Recursively run batches
const chainNext = async (pro) => {
if (keywordsClone.length) {
const keyword = keywordsClone.shift()
return pro.then(async () => {
await asyncForEach(pages, (page) => {
const promiseOperation = request(keyword, page)
return chainNext(promiseOperation)
})
})
}
return pro
}
Credit for the parallel requests in batches goes to https://itnext.io/node-js-handling-asynchronous-operations-in-parallel-69679dfae3fc
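For reference, the same "N workers pulling from a shared queue" idea can be written without the promise chaining, which some find easier to follow. A rough sketch, assuming the same request(keyword, page) helper and pages count as above:
async function runBatches(keywords, concurrent = 5, pages = 2) {
  const queue = keywords.slice(); // shared queue the workers pull from

  // each worker keeps taking a keyword and requesting every page for it
  const worker = async () => {
    while (queue.length) {
      const keyword = queue.shift();
      for (let page = 1; page <= pages; page++) {
        await request(keyword, page);
      }
    }
  };

  // start `concurrent` workers and wait for all of them to drain the queue
  await Promise.all(Array.from({ length: concurrent }, worker));
}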

Run HTTP requests in chunks

I want to run a large number of HTTP requests in configurable chunks, with a configurable timeout between chunks. The requests are based on data provided in a .csv file.
It doesn't work: I get a TypeError, and when I remove the () after f it doesn't work either.
I would be very grateful for a little help. Probably the biggest problem is that I don't really understand how exactly promises work, but I tried multiple solutions and I wasn't able to achieve what I want.
The timeout feature will probably give me even more of a headache, so I would appreciate any tips on that too.
Can you please help me to understand why it doesn't work?
Here is the snippet:
const rp = require('request-promise');
const fs = require('fs');
const { chunk } = require('lodash');
const BATCH_SIZE = 2;
const QUERY_PARAMS = ['clientId', 'time', 'changeTime', 'newValue'];
async function update(id, time, query) {
const options = {
method: 'POST',
uri: `https://requesturl/${id}?query=${query}`,
body: {
"prop": {
"time": time
}
},
headers: {
"Content-Type": "application/json"
},
json: true
}
return async () => { return await rp(options) };
}
async function batchRequestRunner(data) {
const promises = [];
for (row of data) {
row = row.split(',');
promises.push(update(row[0], row[1], QUERY_PARAMS.join(',')));
}
const batches = chunk(promises, BATCH_SIZE);
for (let batch of batches) {
try {
Promise.all(
batch.map(async f => { return await f();})
).then((resp) => console.log(resp));
} catch (e) {
console.log(e);
}
}
}
async function main() {
const input = fs.readFileSync('./input.test.csv').toString().split("\n");
const requestData = input.slice(1);
await batchRequestRunner(requestData);
}
main();
Clarification for the first comment:
I have a csv file which looks like below:
clientId,startTime
123,13:40:00
321,13:50:00
the file size is ~100k rows
the file contains information about how to update the time for a particular clientId in the database. I don't have access to the database, but I have access to an API which allows updating entries in it.
I cannot make 100k calls at once because my network is limited (I work remotely because of coronavirus), it consumes a lot of memory, and the API may also be limited and could crash if I make all the requests at once.
What I want to achieve:
Load csv into memory, convert it to an Array
Handle the API requests in chunks: for example, take the first two rows from the array, make the API calls based on them, wait 1000 ms, take the next two rows, and continue processing until the end of the array (csv file) - see the sketch just below.
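In other words, something like the following sketch of the chunk-and-wait pattern (sendRequest and the chunk size/delay values are placeholders for whatever the real call looks like):
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function processInChunks(rows, chunkSize = 2, delayMs = 1000) {
  for (let i = 0; i < rows.length; i += chunkSize) {
    const chunk = rows.slice(i, i + chunkSize);
    // sendRequest(row) is assumed to return a promise for one API call
    await Promise.all(chunk.map((row) => sendRequest(row)));
    // then pause before starting the next chunk
    if (i + chunkSize < rows.length) await sleep(delayMs);
  }
}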
Well, it seems like this is a somewhat classic case where you want to process an array of values with some asynchronous operation and, to avoid consuming too many resources or overwhelming the target server, you want no more than N requests in flight at the same time. This is a common problem for which there are pre-built solutions. My go-to solution is a small piece of code called mapConcurrent(). It's analogous to array.map(), but it assumes a promise-returning asynchronous callback and you pass it the max number of items that should ever be in flight at the same time. It then returns a promise that resolves to an array of results.
Here's mapConcurrent():
// takes an array of items and a function that returns a promise
// returns a promise that resolves to an array of results
function mapConcurrent(items, maxConcurrent, fn) {
let index = 0;
let inFlightCntr = 0;
let doneCntr = 0;
let results = new Array(items.length);
let stop = false;
return new Promise(function(resolve, reject) {
function runNext() {
let i = index;
++inFlightCntr;
fn(items[index], index++).then(function(val) {
++doneCntr;
--inFlightCntr;
results[i] = val;
run();
}, function(err) {
// set flag so we don't launch any more requests
stop = true;
reject(err);
});
}
function run() {
// launch as many as we're allowed to
while (!stop && inFlightCntr < maxConcurrent && index < items.length) {
runNext();
}
// if all are done, then resolve parent promise with results
if (doneCntr === items.length) {
resolve(results);
}
}
run();
});
}
Your code can then be structured to use it like this:
function update(id, time, query) {
const options = {
method: 'POST',
uri: `https://requesturl/${id}?query=${query}`,
body: {
"prop": {
"time": time
}
},
headers: {
"Content-Type": "application/json"
},
json: true
}
return rp(options);
}
function processRow(row) {
let rowData = row.split(",");
return update(rowData[0], rowData[1], QUERY_PARAMS.join(','));
}
function main() {
const input = fs.readFileSync('./input.test.csv').toString().split("\n");
const requestData = input.slice(1);
// process this entire array with up to 5 requests "in-flight" at the same time
mapConcurrent(requestData, 5, processRow).then(results => {
console.log(results);
}).catch(err => {
console.log(err);
});
}
You can obviously adjust the number of concurrent requests to whatever number you want. I set it to 5 here in this example.

How to ensure that all operations in a cloud function have been successfully completed?

I am using Firebase Cloud Functions, triggered by the creation of a document in Firestore. When the document is created, I need to perform two different operations in parallel:
update the value of a field in a specific document (not the one which was created and triggered the cloud function)
run a transaction on another document.
So my questions are:
How do I ensure that both of my operations have been successfully completed before ending the cloud function itself?
How do I implement a separate retry mechanism for each of the two operations? (I do not want a common retry mechanism for the whole function, as it could redo the transaction even if it was the other operation that failed.)
Here is my current code:
exports.onCityCreated = functions.firestore
.document('Cities/{cityId}')
.onCreate((snap, context) => {
const db = admin.firestore();
const newCity = snap.data();
const mayorId = newCity.mayorID;
const mayorRef = db.doc('Users/'+ mayorId);
const timestamp = admin.firestore.FieldValue.serverTimestamp();
db.doc('Utils/lastPost').update({timestamp: timestamp}); //First Operation - Timestamp Update
return db.runTransaction(t => { //Second Operation - Transaction
return t.get(mayorRef).then(snapshot => {
var new_budget = snapshot.data().b - 100;
return t.update(mayorRef, {b: new_budget});
})
.then(result => {
return console.log('Transaction success!');
})
.catch(err => {
console.log('Transaction failure:', err);
});
});
});
Whenever you have multiple operations like this, the solution is to use Promise.all(). This takes an array of promises, and in turn returns a promise that resolves when all promises you passed in are resolved.
exports.onCityCreated = functions.firestore
.document('Cities/{cityId}')
.onCreate((snap, context) => {
const db = admin.firestore();
const newCity = snap.data();
const mayorId = newCity.mayorID;
const mayorRef = db.doc('Users/'+ mayorId);
const timestamp = admin.firestore.FieldValue.serverTimestamp();
var p1 = db.doc('Utils/lastPost').update({timestamp: timestamp});
var p2 = db.runTransaction(t => {
return t.get(mayorRef).then(snapshot => {
var new_budget = snapshot.data().b - 100;
return t.update(mayorRef, {b: new_budget});
})
});
return Promise.all([p1, p2]);
});
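The second part of the question (a separate retry for each operation) isn't covered above; one common approach is to wrap each operation in its own small retry helper, so a failure in one does not re-run the other. A rough sketch (the retry count and delay are arbitrary, not from the Firebase SDK):
// Retry an async operation up to `attempts` times, waiting `delayMs` between tries.
async function withRetry(operation, attempts = 3, delayMs = 500) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// usage sketch: each operation retries independently
// const p1 = withRetry(() => db.doc('Utils/lastPost').update({timestamp: timestamp}));
// const p2 = withRetry(() => db.runTransaction(t => { /* ... */ }));
// return Promise.all([p1, p2]);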
