I am trying to get the innerText of https://www.example.com/ in a Node.js application. I tried using the request npm module to fetch the body of the URL, as shown below:
function getBodyText() {
request({
url:'https://www.example.com/'
}, (error, response, body) => {
console.log(body.innerText);
});
}
The above code displays the body of the page I am currently on (https://www.google.com). Am I missing anything?
In your code above, the body value is just a string. innerText, on the other hand, assumes body is a DOM Node.
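A quick way to see the difference is a minimal sketch like the one below. The HTML string is made up and only stands in for the body that request would hand to the callback.

```javascript
// Hypothetical stand-in for the `body` argument that request passes to its callback.
const body = '<html><body><p>Hello</p></body></html>';

// The body is plain text, not a DOM node.
const bodyType = typeof body;

// Strings have no innerText property, so the lookup yields undefined.
const innerText = body.innerText;

console.log(bodyType);
console.log(innerText);
```

Nothing in Node parses the markup automatically, which is why a separate parser is needed before any DOM-style lookup can work.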
In Node, the DOM is not present like it would be in the browser, so in order to access the DOM Nodes that were returned you'll need to load body using the package Cheerio. You can assign the transform property of the request options to load the body into a DOM using cheerio.load(). Then you can use traditional DOM selectors to traverse body.
In order to use the transform option on your request options object, you'll need to switch from request to request-promise. (npm i --save request request-promise) They function nearly identically except that request-promise will return an A+ promise using Bluebird where request uses a more traditional error first callback.
Since Cheerio uses its own implementation of jQuery you can refer to their docs for the differences when interacting with the DOM returned.
const cheerio = require('cheerio')
const request = require('request-promise')
request({
method: 'GET',
uri: 'https://google.com',
transform: body => cheerio.load(body)
})
.then($ => {
console.log($('p').text())
})
If you don't want to switch over to request-promise, you can still do this with request and wrap it in a Promise yourself:
const cheerio = require('cheerio')
const request = require('request')
const getDOMFromURI = uri => {
return new Promise((resolve, reject) => {
request(uri, (err, res, body) => {
if (err) {
return reject(err)
}
return resolve(cheerio.load(body))
})
})
}
getDOMFromURI('https://google.com').then($ => {
console.log($('p').text())
})
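The wrapping pattern used above can be tried without any network access. This is only a sketch: `fakeRequest` and `getBody` are hypothetical names, and `fakeRequest` merely imitates the error-first callback shape of request.

```javascript
// Hypothetical error-first callback API standing in for request(uri, callback).
function fakeRequest(uri, callback) {
  if (!uri) {
    callback(new Error('uri is required'));
    return;
  }
  callback(null, { statusCode: 200 }, `<p>body of ${uri}</p>`);
}

// Wrap the callback API in a Promise: reject on error, resolve with the body.
const getBody = uri =>
  new Promise((resolve, reject) => {
    fakeRequest(uri, (err, res, body) => {
      if (err) {
        return reject(err);
      }
      resolve(body);
    });
  });

const pending = getBody('https://www.example.com/');
pending.then(body => console.log(body));

// The error path surfaces through .catch instead of a callback argument.
getBody('').catch(err => console.log(err.message));
```

The caller gets a Promise back immediately and reads the body in `.then`, exactly as with the cheerio version.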
You have to use a different combination of tools. It seems that you want to scrape a website for data. Please use phantomjs, nightmare, puppeteer, or any other headless browser.
A small example of how to get the title of the first result with puppeteer:
const puppeteer = require('puppeteer');
let scrape = async () => {
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
await page.goto('https://www.google.com.pk/search?q=puppeteer');
await page.waitForSelector('h3'); // wait until a result title is present
const result = await page.evaluate(() => {
let title = document.querySelector('h3').innerText;
return {
title
}
});
await browser.close();
return result;
};
scrape().then((value) => {
console.log(value); // Success!
});
From the docs, you can pass a string as the first argument if you are using the GET method:
const request = require('request');
const { JSDOM } = require('jsdom');
request('https://www.example.com', function (error, response, body) {
  console.log('error:', error); // Print the error if one occurred
  // Print the response status code if a response was received
  console.log('statusCode:', response && response.statusCode);
  if (error) return; // No body to parse if the request failed
  console.log('body:', body); // Print the HTML of the fetched page
  // Parse the HTML string into a DOM so it can be queried
  const dom = new JSDOM(body);
  console.log(dom.window.document.querySelector("p").textContent);
});
see https://www.npmjs.com/package/request
You might also want to try the request-promise module or axios (a widely used library for making HTTP requests).
Once you've got the body back you may need to use JSDOM or some other lib to convert the body into a document object which you can then traverse using normal JS methods or even jQuery / another DOM traversal lib.
Related
I am trying to use the google-sheets API with Express and don't have much experience with JavaScript. I'm attempting to pass a JSON object from Express to React, but whenever I finally send the object, it just renders as empty on the frontend.
I've tried using res.body/res.data, but the object doesn't seem to have either. I've also tried to put as many awaits as I can everywhere to make sure the object is loaded in before sending, but nothing seems to do the trick. If I use res.json or res.send with just the response object, I get a circular structure converting to JSON error. Here is the code I'm working with.
async function docShit() {
// Initialize the sheet - doc ID is the long id in the sheets URL
const doc = new GoogleSpreadsheet(
"--SPREADSHEET ID--"
);
// Initialize Auth - see https://theoephraim.github.io/node-google-spreadsheet/#/getting-started/authentication
await doc.useServiceAccountAuth({
// env var values are copied from service account credentials generated by google
// see "Authentication" section in docs for more info
client_email: process.env.GOOGLE_SERVICE_ACCOUNT_EMAIL,
private_key: process.env.GOOGLE_PRIVATE_KEY,
});
await doc.loadInfo(); // loads document properties and worksheets
const sheet = doc.sheetsByTitle["--WORKSHEET TITLE--"];
const rows = await sheet.getRows(); // can pass in { limit, offset }
return rows;
}
app.get("/home", async (req, res) => {
try {
await docShit()
.then((response) => {
res.send(Promise.resolve(response)); //console log shows the object, but res.send just sends nothing??
})
.catch((err) => console.log(err));
} catch (err) {
console.error(err.message);
}
});
Your code passes a Promise to res.send rather than the rows it resolves to, and a Promise serializes to an empty object. Also, you use await and .then together, although they are alternatives. Try the following:
app.get("/home", async (req, res, next) => {
try {
var response = await docShit();
console.log(response);
/* If response is circular, decide which parts of it you want to send.
The following is just an example. */
res.json(response.map(function(row) {
return {id: row.id, cells: row.cells.map(function(cell) {
return {id: cell.id, value: cell.value};
})};
}));
} catch (err) {
console.error(err.message);
next(err);
}
});
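Both failure modes from the question can be reproduced without Express or the Sheets client. The sketch below uses made-up `sheet` and row objects (an assumption, not the real structure returned by the library) to show why a circular object throws, why a Promise comes out empty, and why copying plain fields works.

```javascript
// Hypothetical row objects that point back at their parent sheet,
// loosely imitating rich objects returned by an API client.
const sheet = { title: 'Sheet1', rows: [] };
sheet.rows.push({ id: 1, value: 'a', sheet });
sheet.rows.push({ id: 2, value: 'b', sheet });

// Serializing the rows directly fails because row.sheet.rows contains the row again.
let circularError = null;
try {
  JSON.stringify(sheet.rows);
} catch (err) {
  circularError = err;
}

// A Promise has no enumerable own properties, so it serializes to an empty object.
const promiseJson = JSON.stringify(Promise.resolve(sheet.rows));

// Copying only the plain fields gives something JSON can represent.
const plainRows = sheet.rows.map(row => ({ id: row.id, value: row.value }));
const plainJson = JSON.stringify(plainRows);

console.log(circularError instanceof TypeError);
console.log(promiseJson);
console.log(plainJson);
```

Which fields to copy depends on what the frontend needs; the two used here are placeholders.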
My fetch is stuck in pending when I query a FastAPI endpoint in local dev.
I followed this blog and a few others: https://damaris-goebel.medium.com/promise-pending-60482132574d
I am using this fetch code (simplified drastically just to get a basic solution working)
function fastapiRequest(path) {
return fetch(`${path}`)
.then((response) => {
return response;
});
}
and assigning the result to a constant, i.e.:
const xxxx = fastapiRequest(
`http://0.0.0.0:8008/xcascasc/Dexaa/Emo.json?Magic=Dexxaa&emotion=expressions`
);
Ideally I want to use useSWR for this, as I'm using Next.js, but first of all I just need it to work :)
A postman query like this works fine to return a value
curl --location --request GET 'http://0.0.0.0:8008/xcaxc/dexxa/emo.json?analysis=statistical&substance=dexxa&emo=powa' \
--header 'x_token: 13wdxaxacxasdc1'
The value shows up like this in console.log:
data show here? Promise {<pending>}
With the initial response being
Response {type: 'cors', url: 'url', redirected: false, status: 200, ok: true, …}
Update based on answers.
Using each of the proposed answers, I am still not getting the data returned appropriately. i.e.,
function fastApiRequest(path) {
console.log("really begins here");
return fetch(`${path}`, { mode: 'cors' })
.then((response) => {
console.log('response', response);
return response.json();
})
.catch((err) => {
throw err;
});
}
async function test() {
console.log('begins');
return await fastApiRequest(
`http://0.0.0.0:8008/xxxx/dex/adea.json?deaed=adedea&adee=deaed&adeada=adeeda`
);
}
const ansa = test();
This is still giving a pending response at the moment.
The backend is built with FastAPI, with the CORS settings below. I'm wondering if I need to give it more time to get the data? (Postman works fine :/)
def get_db():
try:
db = SessionLocal()
yield db
finally:
db.close()
origins = [
"http://moodmap.app",
"http://localhost:3000/dashboard/MoodMap",
"http://localhost:3000",
"http://localhost",
"http://localhost:8080",
]
app.add_middleware(
CORSMiddleware,
allow_origins=origins,
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
max_age=3600,
)
I am running the fastapi code in a docker container as well btw
As per the documentation:
The Response object, in turn, does not directly contain the actual JSON response body but is instead a representation of the entire HTTP response. So, to extract the JSON body content from the Response object, we use the json() method, which returns a second promise that resolves with the result of parsing the response body text as JSON.
.json() is an async method (it returns a Promise itself), so you have to assign the parsed value in the next .then(). So your code can be changed like this.
function fastApiRequest(path) {
  // Return the promise chain so callers can wait for the parsed data
  return fetch(path)
    .then((response) => response.json());
}
fastApiRequest('https://proton.api.atomicassets.io/atomicassets/v1/accounts?limit=10')
  .then((data) => console.log(data));
If you prefer the async/await approach, here is the code:
async function fastApiRequest(path) {
try {
const response = await fetch(path);
const data = await response.json();
return data;
} catch (error) {
console.error(error);
}
}
async function test() {
console.log(await fastApiRequest('https://proton.api.atomicassets.io/atomicassets/v1/accounts?limit=10'))
}
test()
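The behavior of json() can be checked without a server. This sketch assumes a runtime that provides the global Response class (Node 18+ or a browser), and the payload is made up for illustration.

```javascript
// Assumes a runtime with the global Response class (Node 18+ or a browser).
// The payload is made up for illustration.
const response = new Response(JSON.stringify({ mood: 'calm' }), {
  headers: { 'Content-Type': 'application/json' },
});

// json() does not return the data itself; it returns a Promise for it.
const parsed = response.json();
const parsedIsPromise = parsed instanceof Promise;
console.log(parsedIsPromise);

// The parsed value is only available once that Promise settles.
parsed.then(data => console.log(data.mood));
```

Logging `parsed` directly would show a Promise, which is the same symptom as in the question.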
First, you need to parse the response as JSON if it's a JSON API.
function fastapiRequest(path) {
return fetch(`${path}`)
.then((response) => {
return response.json();
});
}
Then you need to await the response, and that has to happen inside an async function:
const xxxx = await fastapiRequest(
`http://0.0.0.0:8008/xcascasc/Dexaa/Emo.json?Magic=Dexxaa&emotion=expressions`
);
When you make an HTTP request using fetch in JavaScript, it returns a Promise. It's not stuck; it just needs to be resolved. You can resolve it with async/await as in the code above, or you can use .then(() => { /* code... */ }). You can also use .catch(() => { /* handle error... */ }) to handle errors.
In your curl command you send x_token as a header. If it's required, you need to pass that header along with your path in fetch too. All the other answers are valid as well.
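A small sketch of what "pending" means in practice, with a timer standing in for the network call. `loadData` is a hypothetical name, and the 10 ms delay is arbitrary.

```javascript
// Hypothetical stand-in for a fetch-based helper: settles after a short delay.
function loadData() {
  return new Promise(resolve => {
    setTimeout(() => resolve({ status: 'ok' }), 10);
  });
}

// The call returns right away with a Promise that has not settled yet.
const result = loadData();
const resultIsPromise = result instanceof Promise;
console.log(resultIsPromise);

// The value is only reachable by awaiting (or chaining .then on) the Promise.
async function main() {
  const data = await result;
  console.log(data.status);
}
main();
```

Assigning the return value to a constant and logging it straight away, as in the question, can only ever show the Promise itself.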
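Passing the header could look roughly like this. It is a sketch only: `buildOptions` is a hypothetical helper, and the token is the value from the curl command, used here as a placeholder.

```javascript
// Build fetch options that carry the same custom header as the curl command.
// The token value comes from the curl example; treat it as a placeholder.
function buildOptions(token) {
  return {
    method: 'GET',
    headers: { x_token: token },
  };
}

// Hypothetical helper: pass the options as the second argument to fetch.
function fastApiRequest(path, token) {
  return fetch(path, buildOptions(token)).then(response => response.json());
}

const options = buildOptions('13wdxaxacxasdc1');
console.log(options.headers.x_token);
```

A call would then be `fastApiRequest(url, token)` from inside an async function or followed by `.then`.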
I am trying to get the distance between 2 locations from the TomTom API.
Protractor will not let me use any of the following:
* fetch - fetch is not defined - please use import
* import - Cannot use import statement outside of module
* when I add
{
"type": "module"
}
to package.json - Protractor stops working, as not all of the code is an ES module
* browser.get - opens the URL with the JSON data, but I cannot extract it.
Is there any other way? I tried to import json to a different file and export response.data, but the module error stops me from doing that also.
Protractor is meant for testing Angular web pages, but you can have the browser execute arbitrary JavaScript. To use fetch there, call it through window:
function getTomTomData() {
//replace this with your tomtom api call, and transform the response
return window.fetch(TOM_TOM_URL);
}
browser.executeScript(getTomTomData).then(res=> {
//do something with the response
});
I did not manage to run node-fetch in my script, as Protractor kept rejecting the import. I managed to sort it out with require('https'):
const https = require('https');
// Resolves with the route length in meters between two coordinate pairs.
let measureDistance = function(pickup, dropoff) {
  let url = `https://api.tomtom.com/routing/1/calculateRoute/${pickup[0]}%2C${pickup[1]}%3A${dropoff[0]}%2C${dropoff[1]}/json?routeType=shortest&avoid=unpavedRoads&key=uwbU08nKLNQTyNrOrrQs5SsRXtdm4CXM`;
  // https.get is callback based, so wrap it in a Promise that callers can await
  return new Promise((resolve, reject) => {
    https.get(url, res => {
      let body = '';
      res.on('data', chunk => {
        body += chunk;
      });
      res.on('end', () => {
        try {
          let json = JSON.parse(body);
          resolve(json.routes[0].summary.lengthInMeters);
        } catch (error) {
          reject(error);
        }
      });
    }).on('error', reject);
  });
};
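The hand-written %2C and %3A escapes in the URL can also be produced by encodeURIComponent. A minimal sketch, where `buildRouteUrl` is a hypothetical helper, the coordinates are arbitrary, and `YOUR_API_KEY` is a placeholder:

```javascript
// Hypothetical helper: builds the calculateRoute URL from two coordinate pairs.
// encodeURIComponent turns "," into %2C and ":" into %3A.
function buildRouteUrl(pickup, dropoff, apiKey) {
  const locations = encodeURIComponent(
    `${pickup[0]},${pickup[1]}:${dropoff[0]},${dropoff[1]}`
  );
  const query = new URLSearchParams({
    routeType: 'shortest',
    avoid: 'unpavedRoads',
    key: apiKey,
  });
  return `https://api.tomtom.com/routing/1/calculateRoute/${locations}/json?${query}`;
}

// Arbitrary coordinates and a placeholder key, for illustration only.
const routeUrl = buildRouteUrl([52.5, 13.4], [48.1, 11.6], 'YOUR_API_KEY');
console.log(routeUrl);
```

Keeping the key in a parameter also makes it easier to load it from configuration instead of embedding it in the source.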
Also I used to put require on top of the file like in Ruby, which seemed to be another issue.
I am trying to pass simple data from my server to a javascript file called on another html page. I am testing sending a single string from the server, but am not sure how to receive it. Server below:
const express = require('express')
const app = express()
const port = 3000
app.use(express.static("./assets"));
app.get('/', (req, res) => {
//res.send('Hello World!')
res.sendFile('./main.html', { root: __dirname });
})
app.listen(port, () => {
console.log(`Example app listening at http://localhost:${port}`)
})
app.get('/get-test', async (_req, res) => {
try {
const test_string = "get-test request string";
return res.send(test_string);
} catch (e) { throw e; }
});
And in another javascript file I have the following:
async function testing() {
const response = await fetch('/get-test');
console.log(response);
}
testing();
The console.log gives me a whole object, and clicking through it I can't seem to find my test_string anywhere:
So I believe the get request worked, but how do I actually access the data inside that I want?
You need to call await response.text() before console.logging it.
So in this function:
async function testing() {
const response = await fetch('/get-test');
console.log(response);
}
testing();
You will need to read the body with await response.text() inside the console.log,
like this:
async function testing() {
const response = await fetch('/get-test');
console.log(await response.text());
}
testing();
Why?
Fetching data is asynchronous in JavaScript, so there is a delay before the result arrives. That is why promises and async / await exist. When you make a GET request with fetch, you need to await the response. Reading the body of the response (with response.text() or response.json()) also returns a promise, which means you need to await that as well.
A little tip
When console.log shows no error but no data either, it might be because of a promise issue.
I have a fastify Node.js app in which I am able to see the text results of a promise right before it is returned to the calling browser JS. When that promise is returned to the browser JS, I only get an empty string out of the promise text. I am assuming that the promises are not chained and this is a new promise that does not have the contents of the other. If that is correct, how would I access the inner promise results?
I have passed promises between modules in the fastify app with no problem getting the results at any point, I just do not understand what I am doing wrong at this point. These are the basics of what I am trying to do on both sides of the call:
// node.js
fastify.get('/promise', async function(request, reply) {
var results = await someFunction(request)
console.log(await results.text()) // this displays results as XML
return results
})
// call to fastify app from browser JS
async function getPromise(params) {
var response = await fetch("http://localhost:3000/promise" + params, { mode: 'no-cors' })
console.log(await response.text()) // this is empty
}
The { mode: 'no-cors' } option is preventing you from accessing the response, because the response is opaque:
An opaque filtered response is a filtered response whose type is "opaque", URL list is the empty list, status is 0, status message is the empty byte sequence, header list is empty, and body is null.
Here is a complete example:
'use strict'
const fetch = require('node-fetch')
const fastify = require('fastify')({ logger: true })
const fastifyCors = require('fastify-cors')
fastify.register(fastifyCors, {
credentials: true,
origin: '*'
})
fastify.get('/promise', async function (request, reply) {
const results = await fetch('http://www.mocky.io/v2/5e738e46300000fd9b2e66ae')
return results.text()
})
fastify.listen(3000)
In the browser:
await fetch("http://localhost:3000/promise").then(res => res.text())
It will print HELLO WORLD