I am trying to fetch the HTML script of two webpages using their URLs. This is my code:
const links = {"url1" : "https://.......", "url2" : "https://......"};
var responses = [];
for(const [key,value] of Object.entries(links)){
let resp = fetch('https://api.codetabs.com/v1/proxy?quest='+value)
responses.push(resp);
}
Promise.all(responses)
.then( htmlfiles =>{
htmlfiles.forEach(file=>{
file.text().then(function(data){
gethtmldata(data);
})
})
})
In my function gethtmldata, I am parsing this data in HTML format:
function gethtmldata(html_data){
var parser = new DOMParser();
var htmldoc = parser.parseFromString(html_data, "text/html");
console.log(htmldoc); //shows data of url2, then url1
}
To my utter surprise, the data of url2 gets printed first, then url1. Why?
It should show the html data of url1 then url2. How do I fix this?
Your forEach loop doesn't pause when you call file.text().then(function(data){...}). It starts every .text() call immediately, and each one completes at some point in the future, with no guarantee about which finishes first.
Instead, when you create resp, .push() a Promise that resolves to the .text() data. Promise.all resolves to an array in the same order as the promises you pass in, so the data then arrives in order:
const links = {"url1" : "https://.......", "url2" : "https://......"};
const urls = Object.values(links);
const responses = [];
for(const value of urls){
const txtPromise = fetch('https://api.codetabs.com/v1/proxy?quest='+value).then(resp => resp.text());
responses.push(txtPromise);
}
Promise.all(responses)
.then(htmlData => {
htmlData.forEach(data=>{
gethtmldata(data);
});
});
You can refactor the above by using .map() and async/await like so:
async function fetchHTMLData(urls) {
const promises = urls.map(async url => {
const resp = await fetch('https://api.codetabs.com/v1/proxy?quest='+url);
return resp.text();
});
return Promise.all(promises);
};
async function processHTMLData() {
const links = {"url1" : "https://.......", "url2" : "https://......"};
const urls = Object.values(links);
const htmlArr = await fetchHTMLData(urls);
htmlArr.forEach(htmlStr => {
gethtmldata(htmlStr);
});
}
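To see why collecting the promises and handing them to Promise.all fixes the ordering, here is a minimal sketch that swaps the real fetch for a hypothetical fakeFetchText helper with artificial delays (the helper, the labels, and the delay values are assumptions for illustration only). The slower request is listed first, yet the results come back in the order of the input array.

```javascript
// Hypothetical stand-in for fetch(...).then(resp => resp.text()):
// resolves with `label` after `delayMs` milliseconds.
function fakeFetchText(label, delayMs) {
  return new Promise(resolve => setTimeout(() => resolve(label), delayMs));
}

async function fetchInInputOrder() {
  // Records the order in which the individual promises settle.
  const completionOrder = [];
  const promises = [
    fakeFetchText("url1", 50), // slow
    fakeFetchText("url2", 10), // fast, settles first
  ].map(p => p.then(label => {
    completionOrder.push(label);
    return label;
  }));
  // Promise.all resolves to an array that matches the input order,
  // regardless of which promise settled first.
  const results = await Promise.all(promises);
  return { results, completionOrder };
}

fetchInInputOrder().then(({ results, completionOrder }) => {
  console.log("settled in order:", completionOrder);
  console.log("results in order:", results);
});
```

The same holds for the real fetch calls: only the position in the array passed to Promise.all decides where each result ends up.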
Hi, I have exported data (the hawkers collection) from Firebase using getDocs().
After that, I put each hawker's data as an object into an array called allStall, as shown in the screenshot of the console log below.
Question 1 - How do I access each individual object in my allStall array? I tried to use .map() to access each one, but I am getting nothing.
Note that I already have data inside my allStall array, see the screenshot above.
[Update] map doesn't work in the code below because the field is stallname, not stallName. However, the function needs async + await if it is called from another function.
Question 2 - Why is there [[Prototype]]: Array(0) in my allStall array?
export /*Soln add async*/function getAllStall(){
var allStall = [];
try
{
/*Soln add await */getDocs(collection(db, "hawkers")).then((querySnapshot) =>
{
querySnapshot.forEach((doc) =>
{
var stall = doc.data();
var name = stall.stallname;
var category = stall.category;
var description = stall.description;
var stallData = {
stallName:name,
stallCategory:category,
stallDescription:description
};
allStall.push(stallData);
});});
console.log(allStall);
//Unable to access individual object in Array of objects
allStall.map(stall =>{console.log(stall.stallName);});}
catch (e) {console.error("Error get all document: ", e);}
return allStall;
}
In my main js file, i did the following:
useEffect(/*Soln add async*/() =>
{
getAllStall();
/*Soln:replace the statement above with the code below
const allStall = await getAllStall();
allStall.map((stall)=>console.log(stall.stallname));
*/
}
);
You are getting nothing because allStall is still empty when you read it: you are not waiting for the promise to be fulfilled.
Try this:
export const getAllStall = () => getDocs(collection(db, "hawkers"))
  .then((querySnapshot) =>
    // QuerySnapshot has no .map(); map over its docs array instead
    querySnapshot.docs.map((doc) =>
    {
      // the Firestore field is "stallname" (lowercase n)
      const {stallname, category, description} = doc.data();
      return {
        stallName:stallname,
        stallCategory:category,
        stallDescription:description
      };
    })
  );
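Then change the useEffect like this:
useEffect(() =>
{
  // an effect callback should not be async, so run the async work in an inner function
  const load = async () => {
    const allStats = await getAllStall();
    console.log(allStats);
    allStats.forEach((stall) => console.log(stall));
  };
  load();
}
);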
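The empty array can be reproduced without Firebase. The sketch below uses a hypothetical fakeGetDocs that resolves on a later tick, standing in for getDocs; the function names and sample records are assumptions made for illustration.

```javascript
// Hypothetical stand-in for getDocs(): resolves asynchronously with two records.
function fakeGetDocs() {
  return new Promise(resolve =>
    setTimeout(() => resolve([{ stallname: "A" }, { stallname: "B" }]), 10));
}

// Mirrors the original code: returns before the promise has resolved.
function getAllStallTooEarly() {
  const allStall = [];
  fakeGetDocs().then(docs => docs.forEach(d => allStall.push(d)));
  return allStall; // still empty at this point
}

// Waits for the data before returning it.
async function getAllStallAwaited() {
  const docs = await fakeGetDocs();
  return docs.map(d => ({ stallName: d.stallname }));
}

const early = getAllStallTooEarly();
console.log(early.length); // 0: the .then callback has not run yet

getAllStallAwaited().then(stalls => console.log(stalls.length)); // 2
```

The browser console can still show items when you expand the logged array later, because it displays the array's current contents at the moment you expand it, not at the moment of the console.log call.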
A very big thanks to R4ncid, you have been an inspiration!
And thank you all who commented below!
I managed to get it done with async and await. Latest update: I figured out what was wrong with my previous code too. I commented the solution in my question, which is to add async to the function and await to getDocs.
Also, map doesn't work in the code above because the field is stallname, not stallName. However, the function needs async + await if it is called from another function.
Helper function
export async function getAllStall(){
const querySnapshot = await getDocs(collection(db, "hawkers"));
var allStall = [];
querySnapshot.forEach(doc =>
{
var stall = doc.data();
var name = stall.stallname;
var category = stall.category;
var description = stall.description;
var stallData = {
stallName:name,
stallCategory:category,
stallDescription:description
};
allStall.push(stall);
}
);
return allStall;
}
Main JS file
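useEffect(() =>
{
  // an effect callback should not be async, so run the async work in an inner function
  const load = async () => {
    const allStall = await getAllStall();
    allStall.map((stall)=>console.log(stall.stallname));
  };
  load();
}
);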
Hurray
I'm making a program that consists of three different functions:
downloadPDF: download a PDF from the web
getPDF: read and parse the pdf
getData: loop through getPDF
The problem I'm having is with the third function (getData), which runs getPDF in a for...of loop: it doesn't seem to let getPDF finish before it tries to console.log the result that getPDF returns.
Here are the three functions:
async function downloadPDF(pdfURL, outputFilename) {
let pdfBuffer = await request.get({uri: pdfURL, encoding: null});
console.log("Writing downloaded PDF file to " + outputFilename + "...");
fs.writeFileSync(outputFilename, pdfBuffer);
}
async function getPDF(query, siteName, templateUrl, charToReplace) {
const currentWeek = currentWeekNumber().toString();
await downloadPDF(templateUrl.replace(charToReplace, currentWeek), "temp/pdf.pdf");
var resultsArray = []
let dataBuffer = fs.readFileSync("temp/pdf.pdf");
pdf(dataBuffer).then(function(data) {
pdfContent = data.text;
const splittedArray = pdfContent.split("\n");
const parsedArray = splittedArray.map((item, index) => {
if(item.includes(query)) {
resultsArray.push({result: item, caseId: splittedArray[index-1].split(',', 1)[0], site: siteName});
}
}).filter(value => value);
return(resultsArray);
});
fs.unlinkSync("temp/pdf.pdf"); //deletes the downloaded file
}
async function getData(query, desiredSites) {
var resultsArray = []
for (const value of desiredSites) {
let result = await getPDF(query, sitesList.sites[value].name, sitesList.sites[value].templateUrl, sitesList.sites[value].charToReplace);
console.log(result)
}
}
getData("test", ['a', 'b']);
In the bottom function (getData), the console.log prints undefined.
I'm guessing this has something to do with the promises. Any ideas? Thanks a lot!
In getPDF, chain all your asynchronous calls with await instead of mixing in .then (or use .then throughout). In your version, the return sits inside the .then callback, so getPDF itself never returns the results.
You can mix await with .then, but the code is then harder to follow as a linear sequence. People use await precisely to keep the code linear and easy to maintain.
async function downloadPDF(pdfURL, outputFilename) {
let pdfBuffer = await request.get({ uri: pdfURL, encoding: null });
console.log("Writing downloaded PDF file to " + outputFilename + "...");
fs.writeFileSync(outputFilename, pdfBuffer);
}
async function getPDF(query, siteName, templateUrl, charToReplace) {
const currentWeek = currentWeekNumber().toString();
await downloadPDF(
templateUrl.replace(charToReplace, currentWeek),
"temp/pdf.pdf"
);
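let dataBuffer = fs.readFileSync("temp/pdf.pdf");
const data = await pdf(dataBuffer);
const splittedArray = data.text.split("\n");
// Collect matching lines. The index refers to splittedArray, so the
// previous line (which holds the case id) can be looked up directly.
const resultsArray = [];
splittedArray.forEach((item, index) => {
if (item.includes(query)) {
resultsArray.push({
result: item,
// fall back to an empty string when the match is on the first line
caseId: (splittedArray[index - 1] || "").split(",", 1)[0],
site: siteName,
});
}
});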
fs.unlinkSync("temp/pdf.pdf"); //deletes the downloaded file
return resultsArray;
}
async function getData(query, desiredSites) {
for (const value of desiredSites) {
let result = await getPDF(
query,
sitesList.sites[value].name,
sitesList.sites[value].templateUrl,
sitesList.sites[value].charToReplace
);
console.log(result);
}
}
getData("test", ["a", "b"])
.then(() => console.log("done"))
.catch(console.log);
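A return inside a .then callback returns from the callback only, which is why the outer async function ends up resolving to undefined. Here is a minimal sketch of the difference, with a hypothetical parse function standing in for pdf(dataBuffer) (the name and the sample text are assumptions):

```javascript
// Hypothetical stand-in for pdf(dataBuffer): resolves with an object holding text.
function parse() {
  return Promise.resolve({ text: "line one\nline two" });
}

// The return statement belongs to the callback, so the outer function
// finishes without a return value and its promise resolves to undefined.
async function withThen() {
  parse().then(data => {
    return data.text.split("\n");
  });
}

// Awaiting keeps the value in the outer function's scope, so it can be returned.
async function withAwait() {
  const data = await parse();
  return data.text.split("\n");
}

withThen().then(result => console.log("withThen:", result));   // undefined
withAwait().then(result => console.log("withAwait:", result)); // array of two lines
```

Writing return parse().then(...) in withThen would also work, since the outer function would then return the chained promise.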
The good thing about async is that it speeds up fetching: we get whichever response is available first instead of waiting for each server to respond in turn.
But this messes up the ORDER of the requests, which in this case is important for me.
I want to scrape different parts of a continuous story, so the order is a must.
Here's what I did:
async function getSingle(url) {
await fetch(url).then(function (response) {
return response.text();
}).then(function (html) {
var output = document.querySelector('.output');
output.innerHTML=html;
text += "\n" + document.querySelector('.post-body').innerText;
});
}
It gets everything perfectly, but when I call it for multiple URLs, it returns them unordered.
[Beginner here, so please pardon if it's something really trivial.]
Your function is equivalent to the following (I recommend using async/await throughout):
async function getSingle(url) {
const response = await fetch(url);
const html = await response.text();
const output = document.querySelector('.output');
output.innerHTML = html;
text += "\n" + document.querySelector('.post-body').innerText; // "text" ???
}
To keep the order, call the function in a for loop; the requests are then sent one by one:
const urls = ['url1', 'url2'];
for (const url of urls) {
await getSingle(url);
}
If you want to send the requests in parallel, use Promise.all. You have to refactor your function so that it returns the HTML string:
async function getSingle(url) {
const response = await fetch(url);
const html = await response.text();
return html
}
const htmls = await Promise.all(urls.map(url => getSingle(url)));
for (const html of htmls) {
const output = document.querySelector('.output');
output.innerHTML = html;
text += "\n" + document.querySelector('.post-body').innerText; // "text" ???
}
You could keep track of the order in which the requests were sent; then, when a response arrives, add its HTML to the DOM only once all earlier responses have been added. Something like this (I didn't test it):
let order = [];
let index = 0;
let completed_index = 0;
async function getSingle(url) {
  // remember the position in which this request was sent
  const i = index++;
  const response = await fetch(url);
  order[i] = await response.text();
  // show every response that is now contiguous with the ones already shown
  while (order[completed_index] !== undefined) {
    var output = document.querySelector('.output');
    output.innerHTML = order[completed_index];
    text += "\n" + document.querySelector('.post-body').innerText;
    completed_index++;
  }
}
you might want to clear the array afterwards
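The bookkeeping idea above can be isolated into a small helper that is easier to test than DOM code. This is only a sketch, not the answerer's implementation: createOrderedSink, reserve, and deliver are hypothetical names, and emit stands in for whatever appends the HTML to the page.

```javascript
// Buffers results that arrive out of order and emits them in the order
// in which their slots were reserved.
function createOrderedSink(emit) {
  const buffer = [];   // buffer[i] holds the result for slot i once it arrives
  let nextSlot = 0;    // next slot to hand out
  let nextToEmit = 0;  // first slot that has not been emitted yet

  return {
    // Call when a request is sent; returns the slot index for that request.
    reserve() {
      return nextSlot++;
    },
    // Call when the response for `slot` arrives.
    deliver(slot, value) {
      buffer[slot] = { value };
      // Emit every result that is now contiguous with what was already emitted.
      while (buffer[nextToEmit] !== undefined) {
        emit(buffer[nextToEmit].value);
        delete buffer[nextToEmit]; // release the stored value once emitted
        nextToEmit++;
      }
    },
  };
}

// Usage: responses arrive as 2nd, 1st, 3rd but are emitted as 1st, 2nd, 3rd.
const emitted = [];
const sink = createOrderedSink(value => emitted.push(value));
const a = sink.reserve();
const b = sink.reserve();
const c = sink.reserve();
sink.deliver(b, "part 2");
sink.deliver(a, "part 1");
sink.deliver(c, "part 3");
console.log(emitted);
```

Compared with Promise.all, this shows each part as soon as everything before it has arrived, instead of waiting for the slowest request.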
I'm fetching article list data from an API and then using the Unsplash API to fetch a related image for each article, based on its title.
This is my code:
let url = 'http://127.0.0.1:8000/api';
async function getData(url) {
const res = await fetch(url);
const objects = await res.json();
await Promise.all(objects.map(async (object) => {
const res = await fetch('https://api.unsplash.com/search/photos?client_id=XXX&content_filter=high&per_page=1&query=' + object.title);
const image = await res.json();
object.image_url = image.results[0].urls.small
object.image_alt = image.results[0].alt_description
}));
}
let articles_1 = getData(url + '/articles/index/1/');
let articles_2 = getData(url + '/articles/index/2/');
let articles_3 = getData(url + '/articles/index/3/');
I am showing three different categories at once on the same page. That's why I call that function three times.
Question:
When this function runs, results are shown only after both the article data and the images have been fetched. But I want to show the article data as soon as it has been fetched, and then the images as they arrive, in order to shorten the user's waiting time. How can I achieve this, whether with a Svelte reactive declaration or plain JavaScript?
You would want to separate the two steps into their own functions, so they can be called in sequence.
const endpoint = 'http://127.0.0.1:8000/api';
const getArticles = async (url) => {
  // wait for the response, then parse its body as JSON
  const res = await fetch(url);
  return res.json();
};
const renderArticles = async (articles) => {
// render article set and return it as a DOM object
return articlesDOM;
};
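const getImageForArticle = async (articleNode) => {
  // Assumption: renderArticles stored the title on the node (here as a data-title attribute)
  const title = articleNode.dataset.title;
  const res = await fetch('https://api.unsplash.com/search/photos?client_id=XXX&content_filter=high&per_page=1&query=' + encodeURIComponent(title));
  const data = await res.json();
  const img = new Image();
  img.src = data.results[0].urls.small;
  img.alt = data.results[0].alt_description;
  return {img, articleNode};
};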
const renderImage = async (stuff) => {
const {img, articleNode} = stuff;
// inject your img into your article
};
// now call in sequence
getArticles(endpoint+'/articles/index/1/').then(renderArticles).then(articleNodes => {
const promises = articleNodes.map(articleNode => {
return getImageForArticle(articleNode).then(renderImage);
});
return Promise.all(promises);
});
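As a rough illustration of the two-phase flow, the sketch below replaces the network calls with hypothetical fakeGetArticles and fakeGetImage helpers (the delays and field names are assumptions) and records when each phase runs. The articles are reported as soon as they arrive; each image is reported later, when its own request finishes.

```javascript
const wait = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical stand-ins for the article API and the image search API.
async function fakeGetArticles() {
  await wait(10);
  return [{ title: "cats" }, { title: "dogs" }];
}
async function fakeGetImage(article) {
  await wait(20);
  return { title: article.title, image_url: article.title + ".jpg" };
}

// onArticles runs once the articles are available; onImage runs once per
// image as it arrives, without waiting for the other images.
async function loadInTwoPhases(onArticles, onImage) {
  const articles = await fakeGetArticles();
  onArticles(articles);
  await Promise.all(
    articles.map(article => fakeGetImage(article).then(onImage))
  );
}

const events = [];
loadInTwoPhases(
  articles => events.push("articles:" + articles.length),
  image => events.push("image:" + image.title)
).then(() => console.log(events));
```

In Svelte, the two callbacks could simply assign to component variables, so the article list renders first and each image appears when its assignment happens.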
While I'm not completely sure what you're trying to do (I don't have a minimal working example), here's my best attempt at it:
var url = 'http://127.0.0.1:8000/api';
async function getData(url) {
var data = fetch(url)
.then(data => data.json())
await data.then(data => ArticleFunc(data))
// fetch the image for every article in parallel and display each one as it arrives
await data.then(function(data) {
return Promise.all(data.map(async function(object) {
const image = await fetch('https://api.unsplash.com/search/photos?client_id=XXX&content_filter=high&per_page=1&query=' + object.title)
.then(data => data.json())
object.image_url = image.results[0].urls.small
object.image_alt = image.results[0].alt_description
ImageFunc(object)
}))
})
}
function ArticleFunc(data){
//display article
}
function ImageFunc(data){
//display image
}
getData(url + '/articles/index/1/');
getData(url + '/articles/index/2/');
getData(url + '/articles/index/3/');
Note that this should be treated as pseudocode: again, it is untested due to the absence of a minimal working example.
I'm trying to improve a Firestore get function. I have something like:
return admin.firestore().collection("submissions").get().then(
async (x) => {
var toRet: any = [];
for (var i = 0; i < 10; i++) {
try {
var hasMedia = x.docs[i].data()['mediaRef'];
if (hasMedia != null) {
var docData = (await x.docs[i].data()) as MediaSubmission;
let submission: MediaSubmission = new MediaSubmission();
submission.author = x.docs[i].data()['author'];
submission.description = x.docs[i].data()['description'];
var mediaRef = await admin.firestore().doc(docData.mediaRef).get();
submission.media = mediaRef.data() as MediaData;
toRet.push(submission);
}
}
catch (e) {
console.log("ERROR GETTIGN MEDIA: " + e);
}
}
return res.status(200).send(toRet);
});
The first get is fine, but performance is worst on this line:
var mediaRef = await admin.firestore().doc(docData.mediaRef).get();
I think this is because the calls are not batched.
Would it be possible to do a batch get on an array of mediaRefs to improve performance?
Essentially, I have a collection of documents that hold foreign references, each stored as a string pointing to a path in a separate collection, and fetching those references has proven to be slow.
What about this? I refactored it to use more async/await; hopefully my comments are helpful.
The main idea is to start all the mediaRef retrievals at once and await them together with Promise.all, instead of awaiting them one after another:
async function test(req, res) {
// get all docs
const { docs } = await admin
.firestore()
.collection('submissions')
.get();
// get data property only of docs with mediaRef (data() is synchronous, so no Promise.all is needed)
const datas = docs.map(doc => doc.data()).filter(data => data.mediaRef);
// fetch all media documents concurrently - this is the important change
const mediaRefs = await Promise.all(
datas.map(({ mediaRef }) =>
admin
.firestore()
.doc(mediaRef)
.get(),
),
);
// create return object
const toRet = datas.map((data: MediaSubmission, i) => {
const submission = new MediaSubmission();
submission.author = data.author;
submission.description = data.description;
submission.media = mediaRefs[i].data() as MediaData;
return submission;
});
return res.status(200).send(toRet);
}
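If the number of round trips is the concern, the Node.js server SDK's Firestore class also has a getAll() method that reads several document references in one call. The sketch below shows the shape of that approach against a small in-memory stub so that it runs without a Firebase project. createFakeFirestore, attachMedia, and the sample paths are hypothetical; with the real SDK, db would be admin.firestore(), and it is worth checking the SDK reference for the result ordering and argument rules before relying on them.

```javascript
// Hypothetical in-memory stub exposing only the two calls the sketch needs:
// doc(path) and getAll(...refs). It is not the Firebase SDK.
function createFakeFirestore(store) {
  return {
    doc: path => ({ path }),
    // The stub returns one snapshot per ref, in the order of the refs.
    getAll: async (...refs) =>
      refs.map(ref => ({
        exists: ref.path in store,
        data: () => store[ref.path],
      })),
  };
}

// Reads every referenced media document with a single getAll() call and
// pairs each result with the submission that pointed at it.
async function attachMedia(db, submissions) {
  const withMedia = submissions.filter(s => s.mediaRef);
  if (withMedia.length === 0) return []; // nothing to read, skip the call
  const refs = withMedia.map(s => db.doc(s.mediaRef));
  const snapshots = await db.getAll(...refs);
  return withMedia.map((s, i) => ({
    author: s.author,
    description: s.description,
    media: snapshots[i].data(),
  }));
}

const db = createFakeFirestore({
  "media/m1": { url: "one.png" },
  "media/m2": { url: "two.png" },
});
attachMedia(db, [
  { author: "ann", description: "first", mediaRef: "media/m1" },
  { author: "bob", description: "no media" },
  { author: "cy", description: "third", mediaRef: "media/m2" },
]).then(result => console.log(result));
```

Whether this is faster than Promise.all over individual get() calls depends on the workload, so measuring both is the safer route.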