How can I update my dictionary with nested HTTP request? - javascript

I'm gonna try to explain this as clearly as I can, but it's very confusing to me so bear with me.
For this project, I'm using Node.js with the modules Axios and Cheerio.
I am trying to fetch HTML data from a webshop (similar to Amazon/eBay) and store the product information in a dictionary. I managed to store most things (title, price, image), but the product description is on a different page. To make a request to that page, I'm using the URL I got from the first request, so the requests are nested.
This first part is done with the following request:
let request = axios.get(url)
.then(res => {
// This gets the HTML for every product
getProducts(res.data);
console.log("Got products in HTML");
})
.then(res => {
// This parses the product HTML into a dictionary of product items
parseProducts(productsHTML);
console.log("Generated dictionary with all the products");
})
.then(res => {
// This loops through the products to fetch and add the description
updateProducts(products);
})
.catch(e => {
console.log(e);
})
I'll also provide the way I'm creating product objects, as it might clarify the function where I think the problem occurs.
function parseProducts(html) {
for (item in productsHTML) {
// Store the data from the first request
const $ = cheerio.load(productsHTML[item]);
let product = {};
let mpUrl = $("a").attr("href");
product["title"] = $("a").attr("title");
product["mpUrl"] = mpUrl;
product["imgUrl"] = $("img").attr("src");
let priceText = $("span.subtext").text().split("\xa0")[1].replace(",", ".");
product["price"] = parseFloat(priceText);
products.push(product);
}
}
The problem resides in the updateProducts function. If I console.log the dictionary afterwards, the description is not added. I think this is because the console will log before the description gets added. This is the update function:
function updateProducts(prodDict) {
for (i in prodDict) {
let request2 = axios.get(prodDict[i]["mpUrl"])
.then(res => {
const $ = cheerio.load(res.data);
description = $("div.description p").text();
prodDict[i]["descr"] = description;
// If I console.log the product here, the description is included
})
}
// If I console.log the product here, the description is NOT included
}
I don't know what to try anymore; I guess it can be solved with something like async/await or by putting timeouts in the code. Can someone please help me with updating the products properly and adding the product descriptions? Thank you SO much in advance.

To refactor this with async/await one would do:
async function fetchAndUpdateProducts() {
try {
const response = await axios.get(url);
getProducts(response.data);
console.log("Got products in HTML");
parseProducts(productsHTML);
console.log("Generated dictionary with all the products");
await updateProducts(products);
} catch(e) {
console.log(e);
}
}
fetchAndUpdateProducts().then(() => console.log('Done'));
and
async function updateProducts(prodDict) {
for (const i in prodDict) {
const response = await axios.get(prodDict[i]["mpUrl"]);
const $ = cheerio.load(response.data);
const description = $("div.description p").text();
prodDict[i]["descr"] = description;
}
}
The call to fetchAndUpdateProducts will not finish until the promise returned by updateProducts has resolved, so the descriptions are in place before anything that runs afterwards.
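If the descriptions may be fetched in parallel instead of one after another, a Promise.all variant of updateProducts is an option. This is only a sketch reusing the products array and the "mpUrl"/"descr" keys from the question:
async function updateProducts(prodDict) {
  // Fire one request per product and wait until every description has been added
  await Promise.all(prodDict.map(async (product) => {
    const res = await axios.get(product.mpUrl);
    const $ = cheerio.load(res.data);
    product.descr = $("div.description p").text();
  }));
}
Because the function still returns a promise, the awaited call in fetchAndUpdateProducts behaves the same way; the requests simply run concurrently.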

Related

Is it possible to loop through an API, print separate results, and then combine them into a single variable?

I’m trying to read the sentiment of multiple Reddit posts. I’ve got the idea working using 6 API calls, but I think we can refactor it down to 2 calls.
The wall I’m hitting: is it possible to loop through multiple APIs (one loop for each Reddit post we’re scraping), print the results, and then add them into a single variable?
The last part is where I’m stuck. After looping through the API, I get separate outputs for each loop and I don’t know how to add them into a single variable…
Here’s a simple version of what the code looks like:
import React, { useState, useEffect } from 'react';
function App() {
const [testRedditComments, setTestRedditComments] = useState([]);
const URLs = [
'https://www.reddit.com/r/SEO/comments/tepprk/is_ahrefs_worth_it/',
'https://www.reddit.com/r/juststart/comments/jvs0d1/is_ahrefs_worth_it_with_these_features/',
];
useEffect(() => {
URLs.forEach((URL) => {
fetch(URL + '.json').then((res) => {
res.json().then((data) => {
if (data != null) setTestRedditComments(data[1].data.children);
});
});
});
}, []);
//This below finds the reddit comments and puts them into an array
const testCommentsArr = testRedditComments.map(
(comments) => comments.data.body
);
//This below takes the reddit comments and turns them into a string.
const testCommentsArrToString = testCommentsArr.join(' ');
console.log(testCommentsArrToString);
I've tried multiple approaches to adding the strings together, but I've sunk a bunch of time into it. Does anyone know how this works, or is there a simpler way to accomplish this?
Thanks for your time and if you need any clarification let me know.
-Josh
const URLs = [
"https://www.reddit.com/r/SEO/comments/tepprk/is_ahrefs_worth_it/",
"https://www.reddit.com/r/juststart/comments/jvs0d1/is_ahrefs_worth_it_with_these_features/",
];
Promise.all(
URLs.map(async (url) => {
const resp = await fetch(url + ".json");
return resp.json();
})
).then((res) => console.log(res));
I have used Promise.all to get the responses and attached a React sandbox doing the same.
Based on your requirements, you can either work with the state value directly or prepare your API response before setting it to state.
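As a sketch of preparing the response before setting it to state (assuming the comments live under data[1].data.children with a data.body per comment, as in the question), the posts can be flattened into one array and stored with a single state update:
useEffect(() => {
  Promise.all(
    URLs.map(async (url) => {
      const resp = await fetch(url + ".json");
      return resp.json();
    })
  ).then((results) => {
    // Merge the comments of every post into one array before touching state
    const allComments = results.flatMap((data) => data[1].data.children);
    setTestRedditComments(allComments);
  });
}, []);
The existing map/join step from the question then produces one combined string from all posts.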

Firebase Firestore - Async/Await Not Waiting To Get Data Before Moving On?

I'm new to the "async/await" aspect of JS and I'm trying to learn how it works.
The error I'm getting is on line 10 of the following code. I have created a Firestore database and am trying to listen for and get a certain document from the 'rooms' collection. I am trying to get the data from the doc 'joiner' and use that data to update the innerHTML of other elements.
// References and Variables
const db = firebase.firestore();
const roomRef = await db.collection('rooms');
const remoteNameDOM = document.getElementById('remoteName');
const chatNameDOM = document.getElementById('title');
let remoteUser;
// Snapshot Listener
roomRef.onSnapshot(snapshot => {
snapshot.docChanges().forEach(async change => {
if (roomId != null){
if (role == "creator"){
const usersInfo = await roomRef.doc(roomId).collection('userInfo');
usersInfo.doc('joiner').get().then(async (doc) => {
remoteUser = await doc.data().joinerName;
remoteNameDOM.innerHTML = `${remoteUser} (Other)`;
chatNameDOM.innerHTML = `Chatting with ${remoteUser}`;
})
}
}
})
})
However, I am getting the error:
Uncaught (in promise) TypeError: Cannot read property 'joinerName' of undefined
Similarly if I change the lines 10-12 to:
remoteUser = await doc.data();
remoteNameDOM.innerHTML = `${remoteUser.joinerName} (Other)`;
chatNameDOM.innerHTML = `Chatting with ${remoteUser.joinerName}`;
I get the same error.
My current understanding is that await will wait for the line/function to finish before moving forward, and so remoteUser shouldn't be null before trying to call it. I will mention that sometimes the code works fine, and the DOM elements are updated and there are no console errors.
My questions: Am I thinking about async/await calls incorrectly? Is this not how I should be getting documents from Firestore? And most importantly, why does it seem to work only sometimes?
Edit: Here are screenshots of the Firestore database as requested by @Dharmaraj. I appreciate the advice.
You are mixing the use of async/await and then(), which is not recommended. I propose below a solution based on Promise.all(), which helps in understanding the different arrays involved in the code. You can adapt it with async/await and a for-of loop as @Dharmaraj proposed.
roomRef.onSnapshot((snapshot) => {
// snapshot.docChanges() Returns an array of the documents changes since the last snapshot.
// you may check the type of the change. I guess you maybe don’t want to treat deletions
const promises = [];
snapshot.docChanges().forEach(docChange => {
// No need to use a roomId, you get the doc via docChange.doc
// see https://firebase.google.com/docs/reference/js/firebase.firestore.DocumentChange
if (role == "creator") { // It is not clear from where you get the value of role...
const joinerRef = docChange.doc.collection('userInfo').doc('joiner');
promises.push(joinerRef.get());
}
});
Promise.all(promises)
.then(docSnapshotArray => {
// docSnapshotArray is an Array of all the docSnapshots
// corresponding to all the joiner docs corresponding to all
// the rooms that changed when the listener was triggered
docSnapshotArray.forEach(docSnapshot => {
remoteUser = docSnapshot.data().joinerName;
remoteNameDOM.innerHTML = `${remoteUser} (Other)`;
chatNameDOM.innerHTML = `Chatting with ${remoteUser}`;
})
});
});
However, what is not clear to me is how you differentiate the different elements of the "first" snapshot (i.e. roomRef.onSnapshot((snapshot) => {...})). If several rooms change, the snapshot.docChanges() array will contain several changes and, at the end, you will overwrite the remoteNameDOM and chatNameDOM elements in the last loop.
Or you know upfront that this "first" snapshot will ALWAYS contain a single doc (because of the architecture of your app) and then you could simplify the code by just treating the first and unique element as follows:
roomRef.onSnapshot((snapshot) => {
const roomDoc = snapshot.docChanges()[0];
// ...
});
There are a few mistakes in this:
db.collection() does not return a promise and hence await is not necessary there
forEach does not wait for promises, so an await inside a forEach callback won't pause the surrounding code; for-of is preferred in that case.
Please try the following code:
const db = firebase.firestore();
const roomRef = db.collection('rooms');
const remoteNameDOM = document.getElementById('remoteName');
const chatNameDOM = document.getElementById('title');
let remoteUser;
// Snapshot Listener
roomRef.onSnapshot(async (snapshot) => {
for (const change of snapshot.docChanges()) {
if (roomId != null){
if (role == "creator"){
const usersInfo = roomRef.doc(roomId).collection('userInfo');
const doc = await usersInfo.doc('joiner').get();
remoteUser = doc.data().joinerName;
remoteNameDOM.innerHTML = `${remoteUser} (Other)`;
chatNameDOM.innerHTML = `Chatting with ${remoteUser}`;
}
}
}
})
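To see why the loop construct matters, here is a minimal sketch (doAsyncWork is just a placeholder for any promise-returning call, not part of the code above) comparing await inside forEach with await inside for-of:
async function withForEach(items) {
  items.forEach(async (item) => {
    await doAsyncWork(item); // forEach does not wait for these promises
  });
  console.log("done"); // logs before the items have finished
}

async function withForOf(items) {
  for (const item of items) {
    await doAsyncWork(item); // each iteration finishes before the next one starts
  }
  console.log("done"); // logs only after every item has finished
}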

Pulling Articles from Large JSON Response

I'm trying to code something which tracks the Ontario Immigrant Nominee Program Updates page for updates and then sends an email alert if there's a new article. I've done this in PHP but I wanted to try and recreate it in JS because I've been learning JS for the last few weeks.
The OINP has a public API, but the entire body of the webpage is stored in the JSON response (you can see this here: https://api.ontario.ca/api/drupal/page%2F2020-ontario-immigrant-nominee-program-updates?fields=body)
Looking through the safe_value - the common trend is that the Date / Title is always between <h3> tags. What I did with PHP was create a function that stored the text between <h3> into a variable called Date / Title. Then - to store the article body text I just grabbed all the text between </h3> and </p><h3> (basically everything after the title, until the beginning of the next title), stored it in a 'bodytext' variable and then iterated through all occurrences.
I'm stumped figuring out how to do this in JS.
So far - trying to keep it simple, I literally have:
const fetch = require("node-fetch");
fetch(
"https://api.ontario.ca/api/drupal/page%2F2020-ontario-immigrant-nominee-program-updates?fields=body"
)
.then((result) => {
return result.json();
})
.then((data) => {
let websiteData = data.body.und[0].safe_value;
console.log(websiteData);
});
This outputs all of the body. Can anyone point me in the direction of a library / some tips that can help me :
Read through the entire safe_value response and break down each article (Date / Title + Article body) into an array.
I'm probably then just going to upload each article into a MongoDB and then I'll have it checked twice daily -> if there's a new article I'll send an email notif.
Any advice is appreciated!!
Thanks,
You can use a regex to get the content of tags, e.g.
/<h3>(.*?)<\/h3>/g.exec(data.body.und[0].safe_value)[1]
returns August 26, 2020
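If every date is needed rather than only the first one, the same idea extends to a global match; this is a sketch, where the [1] index picks the capture group between the tags:
const html = data.body.und[0].safe_value;
// Collect the inner text of every <h3> in the body
const dates = [...html.matchAll(/<h3>(.*?)<\/h3>/g)].map((match) => match[1]);
console.log(dates); // e.g. ["August 26, 2020", ...]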
With the use of some regex you can get this done pretty easily.
I wasn't sure exactly what the date / title / content parts were, but the example shows how to parse the HTML.
I also changed the code to async/await; this is more of a personal preference, and the code should work the same with then/catch.
(async () => {
try {
// Make request
const response = await fetch("https://api.ontario.ca/api/drupal/page%2F2020-ontario-immigrant-nominee-program-updates?fields=body");
// Parse response into json
const data = await response.json();
// Get the parsed data we need
const websiteData = data.body.und[0].safe_value;
// Split the html into separate articles (every <h2> is the start of a new article)
const articles = websiteData.split(/(?=<h2)/g);
// Get the data for each article
const articleInfo = articles.map((article) => {
// Everything between the first <h3> tags is the date
const date = /<h3>(.*)<\/h3>/m.exec(article)[1];
// Everything between the first <h4> tags is the title
const title = /<h4>(.*)<\/h4>/m.exec(article)[1];
// Everything between the first <p> and the last </p> is the content of the article
const content = /<p>(.*)<\/p>/m.exec(article)[1];
return {date, title, content};
});
// Show results
console.log(articleInfo);
} catch(error) {
// Show error if there are any
console.log(error);
}
})();
Without comments
(async () => {
try {
const response = await fetch("https://api.ontario.ca/api/drupal/page%2F2020-ontario-immigrant-nominee-program-updates?fields=body");
const data = await response.json();
const websiteData = data.body.und[0].safe_value;
const articles = websiteData.split(/(?=<h2)/g);
const articleInfo = articles.map((article) => {
const date = /<h3>(.*)<\/h3>/m.exec(article)[1];
const title = /<h4>(.*)<\/h4>/m.exec(article)[1];
const content = /<p>(.*)<\/p>/m.exec(article)[1];
return {date, title, content};
});
console.log(articleInfo);
} catch(error) {
console.log(error);
}
})();
I just completed creating a .NET Core worker service for this.
The value you are looking for is "metatags.description.og:updated_time.#attached.drupal_add_html_head..#value"
The idea is that if the last-updated value changes, you send an email notification!
Try this in your JavaScript:
fetch(`https://api.ontario.ca/api/drupal/page%2F2021-ontario-immigrant-nominee-program-updates`)
.then((result) => {
return result.json();
})
.then((data) => {
let lastUpdated = data.metatags["og:updated_time"]["#attached"].drupal_add_html_head[0][0]["#value"];
console.log(lastUpdated);
});
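To turn that into an alert check, one approach is to compare the fetched timestamp with the last one you stored. This is only a sketch; loadStoredTimestamp, saveStoredTimestamp and sendEmailAlert are hypothetical helpers standing in for the MongoDB lookup and mailer mentioned in the question:
async function checkForUpdate() {
  const res = await fetch("https://api.ontario.ca/api/drupal/page%2F2021-ontario-immigrant-nominee-program-updates");
  const data = await res.json();
  const lastUpdated = data.metatags["og:updated_time"]["#attached"].drupal_add_html_head[0][0]["#value"];

  const previous = await loadStoredTimestamp(); // placeholder: read the last value you saved
  if (lastUpdated !== previous) {
    await saveStoredTimestamp(lastUpdated); // placeholder: persist the new value
    await sendEmailAlert(lastUpdated); // placeholder: send the notification however you like
  }
}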
I will be happy to add you to the email list for the app I just created!

Changing object property through $set gives:" TypeError: Cannot read property 'call' of undefined "

First of all, hello.
I'm relatively new to web development and to Vue.js/JavaScript. I'm trying to implement a system that enables users to upload and vote for pictures and videos. In general the whole system worked, but because I got all of my information from the server, the objects used to show the files and their stats weren't reactive. I tried to change the way I set the properties of an object from "file['votes'] = await data.data().votes" to "file.$set('votes', await data.data().votes)". However, now I'm getting the error TypeError: Cannot read property 'call' of undefined. I have no idea why this happens or what this error even means. After searching a lot on the internet I couldn't find anybody with the same problem. Something must be inherently wrong with my approach.
If anybody can give me an explanation for what is happening or can give me a better way to handle my problem, I'd be very grateful.
Thanks in advance for anybody willing to try. Here is the Code section i changed:
async viewVideo() {
this.videoURLS = []
this.videoFiles = []
this.videoTitels = []
var storageRef = firebase.storage().ref();
var videourl = []
console.log("try")
var listRef = storageRef.child('User-Videos/');
var firstPage = await listRef.list({
maxResults: 100
});
videourl = firstPage
console.log(videourl)
if (firstPage.nextPageToken) {
var secondPage = await listRef.list({
maxResults: 100,
pageToken: firstPage.nextPageToken,
});
videourl = firstPage + secondPage
}
console.log(this.videoURLS)
if (this.videoURLS.length == 0) {
await videourl.items.map(async refImage => {
var ii = refImage.getDownloadURL()
this.videoURLS.push(ii)
})
try {
await this.videoURLS.forEach(async file => {
var fale2 = undefined
await file.then(url => {
fale2 = url.substring(url.indexOf("%") + 3)
fale2 = fale2.substring(0, fale2.indexOf("?"))
})
await db.collection("Files").doc(fale2).get().then(async data => {
file.$set('titel', await data.data().titel)
file.$set('date', await data.data().date)
if (file.$set('voted', await data.data().voted)) {
file.$set('voted', [])
}
file.$set('votes', await data.data().votes)
if (file.$set('votes', await data.data().votes)) {
file.$set('votes', 0)
}
await this.videoFiles.push(file)
this.uploadDate = data.data().date
console.log(this.videoFiles)
this.videoFiles.sort(function(a, b) {
return a.date - b.date;
})
})
})
} catch (error) {
console.log(error)
}
}
},
<script src="https://cdnjs.cloudflare.com/ajax/libs/vue/2.5.17/vue.js"></script>
Firstly, file.$set('votes', await data.data().votes) is the wrong syntax to use. It should be this.$set(file, 'votes', data.data().votes). I am guessing that data.data() returns an object with votes as a property.
Your use of await is not necessary here. await db.collection("Files").doc(fale2).get().then(async data => {....
You are already using a promise in the form of the .then block here. Async-await and the then/catch blocks are basically doing the same thing. It's one or the other.
Please check this fantastic post that covers how to deal with asynchronous code in javascript. Learning about the asynchronous nature of javascript is highly essential right now.
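As an illustration of "it's one or the other", the same Firestore read can be written in either style; a sketch using the fale2 document id and the $set correction from above:
// then/catch style
db.collection("Files").doc(fale2).get()
  .then((data) => {
    this.$set(file, "votes", data.data().votes);
  })
  .catch((error) => console.log(error));

// async/await style (inside an async method)
try {
  const data = await db.collection("Files").doc(fale2).get();
  this.$set(file, "votes", data.data().votes);
} catch (error) {
  console.log(error);
}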
There's a fair bit to pick on, and for now my focus is on removing things from your code that are either redundant or may not make it work. I am not focusing on the logic. With more information, I may make necessary edits for the logic.
I will leave comments in the code, where I feel they are necessary
async viewVideo() {
this.videoURLS = []
this.videoFiles = []
this.videoTitels = []
var storageRef = firebase.storage().ref();
var videourl = '' // videourl should be initialised as a string
console.log("try")
var listRef = storageRef.child('User-Videos/');
var firstPage = await listRef.list({ // listRef.list() does return a promise, so this await is needed
maxResults: 100
});
videourl = firstPage
console.log(videourl)
if (firstPage.nextPageToken) {
var secondPage = await listRef.list({ // same as above: list() returns a promise, so await is needed
maxResults: 100,
pageToken: firstPage.nextPageToken,
});
videourl = firstPage + secondPage // videourl is a string here
}
console.log(this.videoURLS)
if (this.videoURLS.length == 0) {
videourl.items.map(async refImage => { //videourl is acting as an object here (something seems off here) - please explain what is happening here
// again await is not needed here as the map function does not return a promise
var ii = refImage.getDownloadURL()
this.videoURLS.push(ii)
})
try {
this.videoURLS.forEach(file => { // await here is not necessary as the forEach method does not return a promise
// The 'async' keyword is not necessary here. It is required to use the await keyword and due to the database call here, ordinarily it wouldn't be out of place, but you deal with that bit of asynchronous code using a `.then` block. It's `async-await` or `.then` and never both.
var fale2 = undefined
file.then(url => { // await is not necessary here as you use `.then`
// Also, does `file` return a promise? That's the only thing I can infer from `file.then`. It looks odd.
fale2 = url.substring(url.indexOf("%") + 3)
fale2 = fale2.substring(0, fale2.indexOf("?"))
})
db.collection("Files").doc(fale2).get().then(data => { // await and async not necessary due to the same reasons outlined above
this.$set(file, 'titel', data.data().titel) // correct syntax according to vue's documentation - https://vuejs.org/v2/guide/reactivity.html#Change-Detection-Caveats
this.$set(file, 'date', data.data().date)
if (this.$set(file, 'voted', data.data().voted)) { // I don't know what's going on here, I will just correct the syntax. I am not focused on the logic at this point
this.$set(file, 'voted', [])
}
this.$set(file, 'votes', data.data().votes)
if (this.$set(file, 'votes', data.data().votes)) {
this.$set(file, 'votes', 0)
}
this.videoFiles.push(file) // await not necessary here as the push method does not return a promise and also is not asynchronous
this.uploadDate = data.data().date
console.log(this.videoFiles)
this.videoFiles.sort(function(a, b) {
return a.date - b.date;
})
})
})
} catch (error) {
console.log(error)
}
}
},
Like I said at the beginning, this first attempt isn't designed to make the logic work. There's a lot going on there that I don't understand. I have focused on removing redundant code and correcting syntax errors. I may be able to look at the logic if more detail is provided.

How to stay within 2 GET requests per second with Axios (Shopify API)

I have about 650 products and each product has a lot of additional information relating to it being stored in metafields. I need all the metafield info to be stored in an array so I can filter through certain bits of info and display it to the user.
In order to get all the metafiled data, you need to make one API call per product using the product id like so: /admin/products/#productid#/metafields.json
So what I have done is get all the product ids, then run a 'for in' loop and make one call at a time. The problem is I run into a 429 error because I end up making more than 2 requests per second. Is there any way to get around this, like with some sort of queuing system?
let products = []
let requestOne = `/admin/products.json?page=1&limit=250`
let requestTwo = `/admin/products.json?page=2&limit=250`
let requestThree = `/admin/products.json?page=3&limit=250`
// let allProducts will return an array with all products
let allProducts
let allMetaFields = []
let merge
$(document).ready(function () {
axios
.all([
axios.get(`${requestOne}`),
axios.get(`${requestTwo}`),
axios.get(`${requestThree}`),
])
.then(
axios.spread((firstResponse, secondResponse, thirdResponse) => {
products.push(
firstResponse.data.products,
secondResponse.data.products,
thirdResponse.data.products
)
})
)
.then(() => {
// all 3 responses into one array
allProducts = [].concat.apply([], products)
})
.then(function () {
for (const element in allProducts) {
axios
.get(
`/admin/products/${allProducts[element].id}/metafields.json`
)
.then(function (response) {
let metafieldsResponse = response.data.metafields
allMetaFields.push(metafieldsResponse)
})
}
})
.then(function () {
console.log("allProducts: " + allProducts)
console.log("allProducts: " + allMetaFields)
})
.catch((error) => console.log(error))
})
When you hit a 429 error, check the Retry-After header and wait for the number of seconds specified there.
You can also use the X-Shopify-Shop-Api-Call-Limit header in each response to see how many requests are left before you exceed the bucket size limit.
See more details here: REST Admin API rate limits
By the way, you're using page-based pagination, which is deprecated and will become unavailable soon.
Use cursor-based pagination instead.
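As a sketch of a simple queuing approach (the 500 ms spacing is an assumption derived from the 2-requests-per-second limit, and sleep is a small helper defined here, not an Axios feature), the metafield calls can be made one at a time with a pause between them:
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchAllMetafields(allProducts) {
  const allMetaFields = [];
  for (const product of allProducts) {
    // One request per product, in sequence rather than all at once
    const response = await axios.get(`/admin/products/${product.id}/metafields.json`);
    allMetaFields.push(response.data.metafields);
    await sleep(500); // stay under roughly 2 requests per second
  }
  return allMetaFields;
}
A more robust version would also check the Retry-After header on a 429 response and wait that long before retrying, as described above.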
