I am relatively new to NodeJS. I was wondering if the await keyword will slow down the entire javascript/nodeJS program?
For example,
If I have many express routers written on a single server file, and one router function calls the 'await' for a promise to resolve, will all others routers and asynchronous functions stay on halt/paused until the promise is resolved? Or just that thread will be paused?
In such a case the await call will cause performance issues to the Javascript program?
No. While await sounds like it is blocking, it is fully asynchronous(non blocking) as it is also implied in the required function signature by async keyword. It is a (lot) nicer way to use promises. So go ahead and use it in your code.
You also mentioned threads, I suggest you ignore thread concept while developing node.js apps and trust the node.js event loop. Just never use blocking IO calls(which are explicitly named so by having 'Sync' in the name).
await waits for a promise to get resolved, but as we know that node is asynchronous in nature, therefore, other requests will be made to the application will have no effect on them they will not wait for the previous request promise to be resolved.
example
route 1 -> it will await and iterate through million rows and return sum in response
route 2 -> it will only return '1' in response
now when we'll call route 1 first then route 2 then you'll see that still you'll get the response from route 2 immediately and when route 1 will get completed you'll get that response.
Related
I've strong background in a JS frameworks. Javascript is single threaded language and during the execution of code when it encounters any async task, the event loop plays the important role there.
Now I've been into Python/Django and things are going good but I'm very curious about how python handle any asynchronous tasks whenever I need to make a database query in django, I simple use the queryset method like
def func(id):
do_somthing()
user = User.objects.get(id=id)
do_somthing_with_user()
func()
another_func()
I've used the queryset method that fetch something from database and is a asynchronous function without await keyword. what will be the behaviour of python there, is it like block the execution below until the queryset response, is it continue the execution for the rest of code on another thread.
Also whenever we need to make an external request from python, we uses requests library method without await keyword,
r = request.get("https://api.jsonplaceholder.com/users")
# will the below code executes when the response of the request arrive?
do_somthing()
.
.
.
do_another_thing()
does python block the execution of code until the response arrives?
In JS the async task is handled by event loop while the rest of the code executes normally
console.log("first")
fetch("https://api.jsonplaceholder.com/users").then(() => console.log("third"))
console.log("second")
the logs in console will be
first
second
third
How Python Handles these type things under the hood?
The await command needs to be used specifically when you are running an asynchronous function that calls synchronous functions. This synchronous function is ran as a coroutine and has an object name coroutine when debugging. To run a coroutine, it must be scheduled on the event loop. Using the
await keyword releases control back to the event loop. Instead of awaiting a coroutine, you can also setup a task to schedule a coroutine to run at a certain time using the asyncio package. See documentation
asyncio.create_task(coro, *, name=None, context=None)
Python does not block execution but rather ensures that it happens at the correct time using CPU parallelism. Using a method like request.get() should not create any forthcoming errors because you are accessing values over the internet and not locally. Network requests move, essentially at the speed of light, where as database requests on the server are limited by the speed of the SSD or HDD. This was more of an issue on old drives where you are literally waiting for a disk to spin, but the idea of properly managing these local resources is still a issue that has to be solved in order for the asynchronous application to work. When making calls to the sql database, make sure to use the decorator #database_sync_to_async above the regular 'synchronous' functions that your asynchronous function calls.
from channels.db import database_sync_to_async
For more information check out the django documentation on asynchronous support.
I've studied sync and async in JavaScript. I'm going to make a crawling program using Puppeteer.
There are many code examples of crawling in Puppeteer.
But, I have one question: Why do they use async in basic Puppeteer example scripts?
Can't I use sync programming in Puppeteer? Is there an issue that I don't know about that makes async necessary?
It doesn't seem useful if I don't use multiple threads (multi-crawling).
For starters, I recommend reading How the single threaded non blocking IO model works in Node.js. This thread motivates the callback and promise-based models Node provides for achieving concurrency.
Whenever the Node process needs to access an out-of-process resource such as the file system or a network socket (as Puppeteer does to communicate with the browser it's connected to), there are two options:
Block the whole process and wait for the response, as fs.readFileSync does.
Use a promise or a callback to be notified of the response and go about other things, as fs.readFile (either via callback or fs.promises) and Puppeteer do.
The first option is a poor choice, with the only advantage being easier syntax to write. Blocking the thread to wait for a resource is like ordering a pizza, then doing nothing until the pizza arrives. You might as well read a book or water your plants while you wait.
Historically, callbacks were originally the only way to write concurrent code in Node. Eventually, promises and then arrived, which were better, but still posed readability burdens. With the advent of async/await, it's no longer difficult to write asynchronous code that reads like synchronous code. Synchronous APIs like fs's __Sync functions that alias an asynchronous API are historical artifacts. It's normal that Puppeteer doesn't offer page.waitForSelectorSync, page.$evalSync, etc.
Now, it's understandable to think that Puppeteer's asynchronous API is pointless in a simple, straight-line script since your Node process doesn't have anything else to do while awaiting responses, but having to type await for each call is the least evil of the available design options for the API.
Simply not awaiting promises isn't an option even when a script is a single sequence of straight-line code. Without await, ordering of operations/results becomes nondeterministic as each promise runs concurrently, independent of the others. This interleaving would be unintended in sequential code, but is a useful tool in cases when concurrency is desired.
For the authors of an asynchronous API where almost all calls are accessesing an external resource, as is the case with Puppeteer, the options are:
Write and maintain two versions of the API, a synchronous and an asynchronous version. No libraries that I know of do this -- it's a major pain with little benefit and plenty of room for misuse.
Write and maintain a synchronous API only to cater to the simple use case at the expense of making the library virtually unusable for anyone that cares about concurrency. Clearly, this is horrible design, like forcing everyone who orders a pizza (in the above real-world example) to do nothing until it arrives.
Write and maintain one asynchronous API, and make clients who don't care about concurrency in a particular program have to write await in front of all the calls. That's what Puppeteer does.
Incidentally, the fact that the browser is in a separate process tends to cause all manner of confusion in Puppeteer beginners. For example, the fact that data is serialized and deserialized (converted to a string) on every call to page.evaluate (and family) means that you can't pass complex structures like DOM nodes across the inter-process gap. You can't access variables you've defined in Node from the body of an evaluate callback without passing them as arguments to the evaluate call, and these variables need to be able to respond correctly to JSON.stringify() (that is, be serializable).
Just 13 hours before this post, someone asked node.js puppeteer "document is not defined" -- they were trying to access the browser process' document object inside of Node.
If you're on Windows, try running a simple Puppeteer Node script that doesn't close the browser, then look at your task manager. On Linux, you can run ps -a. You'll see that there's a Chromium browser and a Node process. The two processes communicate over a socket, which has much higher latency than intra-process communication and involves the operating system's network stack. Every Puppeteer call provides an opportunity for concurrency that'd be lost if Puppeteer's API was synchronous.
Understanding the inter-process gap is critical to success in Puppeteer because it motivates why the API calls are asynchronous, and helps clarify which code is executing in which process.
async is very important for data fetching/crawling. You can imagine this case, you have 1 element is book-container, but inside book-container, it will have book data coming later on UI with API fetch.
const scraperObject = {
url: 'http://book-store.com',
scraper(browser){
let page = browser.newPage();
page.goto(this.url);
page.waitForSelector('.book-container');
page.waitForSelector('.book');
//TODO: save book data after this
});
}
}
With this code snippet, it will run like this
page.goto(this.url) Go to the page with certain URL
page.waitForSelector('.book-container') No async here, so it will try to get .book-container element immediately (of course, it won't be there because the page is possibly still loading due to some network problem)
page.waitForSelector('.book') Similarly, it try to get book data immediately (even though book-container has not been in HTML yet)
To solve this problem, we should have async to WAIT for elements ready in HTML.
const scraperObject = {
url: 'http://book-store.com',
async scraper(browser){
let page = await browser.newPage();
await page.goto(this.url);
await page.waitForSelector('.book-container');
await page.waitForSelector('.book');
//TODO: save book data after this
});
}
}
Explain it again with async/await.
page.goto(this.url) Go to the page with certain URL and wait till the page loaded
page.waitForSelector('.book-container') Wait till .book-container element appears in HTML
page.waitForSelector('.book') Wait till .book element appears in HTML (we can understand that API's data responded)
ive found out that its a best practice for nodeJS / ExpressJS as an API-Endpoint for a ReactJS Software to only use asynchornus functions for mysql-querys etc.
But i really dont get how this should work and why it decrease performance if i didnt use it.
Please imagine the following code:
API Endpoint "/user/get/1"
Fetches every datas from user with id one and responds with a json content. If i use async there is no possibility to respond with the information gathered by the query, because its not fulfilled when the function runs to its end.
If i wrap it in a Promise, and wait until its finished its the same like a synchronus function - isnt it?
Please describe for me whats the difference between waiting for a async function or use sync function directly.
Thanks for your help!
If i wrap it in a Promise, and wait until its finished its the same
like a synchronus function - isnt it?
No, it isn't. The difference between synchronous and async functions in JavaScript is precisely that async code is scheduled to run whenever it can instead of immediately, right now. In other words, if you use sync code, your entire Node application will stop everything to grab your data from the database and won't do anything else until it's done. If you use a Promise and async code, instead, while your response won't come until it's done, the rest of the Node app will still continue running while the async code is getting the data, allowing other connections and requests to be made in the meantime.
Using async here isn't about making the one response come faster; it's about allowing other requests to be handled while waiting for that one response.
I have code that will make a number of concurrent calls to a piece of software that provides a rest interface to query our system. We recently discovered that a sufficiently high number of concurrent calls (somewhere between 30 to 100) can cause the software to fail and actually break the backed functionality.
As a work around I'd like to fix this by having my code block if more then X concurrent calls are outstanding, sending new calls only after the old calls resolve. As I already have a number of outstanding UIs I'd also prefer to change this behavior by updating a shared lower level Request method.
My 'request' method returns a promise, which is ultimately called within a redux Saga in each UI via the 'call' redux-effect. This means I need to be able to return a promise from my Request method, which means figuring some way to return a promise that won't try to resolve a Request until the earlier Requests resolve.
I started to write my own, but there is no easy way for me to 'alert' a promise, to tell it that previous promises have completed. The best way to roll my own I was able to figure out involved using Promise.race and then having the exact same callNextQueued method called for both the resolve and reject method, and generally it just feels kind of ugly conceptually.
Is there a better way to implement this? either an API that already does it, or a clean way to work with promises to role my own logic for starting a new request whenever any old request completes?
I'm using Promises in Angular (4) project and I have a question about them that I couldn't find a response to it in the docs.
When I create a Promise, I basically wait for an async answer from a service/party. But how long should I expect this Promise to stay in pending state?
Is there any mechanism that will terminate it after a while?
How reliable is this concept of waiting/pending?
Let's suppose that I need to get some data from a busy service that can answer even after few minutes of waiting, maybe more, no matter if the computing of the response is a resource intensive process or that service is linked with another one that is responding very slow.
Is there anything on the client side that will somehow terminate my Promise and determine/force to create another one to ask again for my data?
Someone suggested to upgrade to Observables, and I will do that, but for now I want to keep using Promises, at least for some areas of the code.
Tks a lot
A Promise can be in pending state as long as the page is loaded.
You can wrap the call in another Promise where you introduce a timeout like shown in
let wrappingPromise = new Promise((resolve, reject) => {
var error = false;
setTimeout(function(){
reject("some error");
}, 3000);
this.http.get(...).toPromise().then(res => {
if(!error) {
resolve(res.json);
}
});
});
This will cause an error when the timeout is reached.
It will still wait to receive the full response.
An Observable might be able to forward a cancellation and close the connection, so that the result isn't even received anymore when the timeout is reached. This might depend on whether the concrete implementation and the browser used browser API supports that.
new Promise(() => {}) will never settle, like a callback never called.
A promise is a return object you attach callbacks to, instead of passing callbacks into the function. That's all. It is not a control surface of the asynchronous operation that was just started.
Instead, look to the asynchronous API you called for such controls, if it has them.
Creating promises
Most people are consumers of promises returned from asynchronous APIs. There's no reason to create a Promise other than to wrap a legacy callback API. In an ideal world there'd be no need.