I'm scraping a website and need to send specific cookies, but I don't know exactly how to do that with "fetch".
const url="https://www.example.com";
let response = await fetch(url),
html = await response.text();
let $ = cheerio.load(html)
var example= $('.exampleclass').text();
This scrapes the page fine, but if I have to send specific cookies I don't know how to pass them to fetch.
In Python it was something like this:
response = requests.get(url, headers=headers, cookies=cookies)
Thank you!
You can send the cookies in the request headers with node-fetch. Here is a helper function you can adapt for your purposes:
const cookieMaker = object => {
const cookie = [];
for (const [key, value] of Object.entries(object)) {
cookie.push(`${key}=${value}`);
}
return cookie.join('; ');
};
const fetchText = async (url, cookie) => {
const r = await fetch(url, {
headers: {
Cookie: cookieMaker(cookie),
},
});
return await r.text();
};
fetchText('http://someurl.com', { token: 'abc', myValue: 'def' });
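The helper above joins raw values, so a value containing a space or a semicolon would break the header. A minimal variant that URL-encodes each value is sketched below. It assumes the server expects URL-encoded cookie values, which is common but not guaranteed, and toCookieHeader is a hypothetical name:

```javascript
// Build a Cookie header value from a plain object.
// Assumption: the target server accepts URL-encoded cookie values.
const toCookieHeader = cookies =>
  Object.entries(cookies)
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join('; ');

// Usage: pass the result as the Cookie header of the request, e.g.
// fetch(url, { headers: { Cookie: toCookieHeader({ token: 'abc' }) } });
```

If the site sets cookies with characters it expects back verbatim, skip the encoding and use the original helper.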
GitHub seems to have changed how its API authenticates requests since the end of 2021.
https://developer.github.com/changes/2020-02-10-deprecating-auth-through-query-param/
I have followed numerous resources in which the code below increases the number of requests allowed per hour. That request no longer works. The documentation says to use curl instead; for instance, the following works in a terminal:
curl -u client_id:secret_key https://api.github.com/users/<username>
I want to do this in JavaScript; I am playing around with a GitHub user finder app. Can someone please show me how to get this to work? The code I am using is below.
TL;DR: I can access the GitHub API using the code below and receive a JSON object to display, but it's limited to 60 requests per hour. The GitHub documentation says that query parameters are no longer allowed for authentication as of the end of 2021, so I'm lost. How can I do this in JavaScript now?
const client_id = "df2429c311a306c35233";
const client_secret = "5c23233326680aa21629451a6401d36ec";
const fetchUsers = async (user) => {
const api_call = await fetch(`https://api.github.com/users/${user}?client_id=${client_id}&client_secret=${client_secret}`);
const data = await api_call.json();
return {data};
};
EDIT/UPDATE:
const inputValue = document.querySelector("#search");
const searchButton = document.querySelector(".searchButton");
const nameContainer = document.querySelector(".main__profile-name");
const unContainer = document.querySelector(".main__profile-username");
const reposContainer = document.querySelector(".main__profile-repos");
const urlContainer = document.querySelector(".main__profile-url");
const client_id = "<user_id>";
const client_secret = "<client_secret>";
const headers = {
'Authorization': 'Basic ' + (new Buffer(client_id + ':' + client_secret).toString('base64'))
}
const fetchUsers = async (user) => {
const api_call = await fetch(`https://api.github.com/users/${user}`, {
method: 'GET',
headers: headers
});
const data = await api_call.json();
return {data};
};
const showData = () => {
fetchUsers(inputValue.value).then((res) => {
console.log(res);
nameContainer.innerHTML = `Name: <span class="main__profile-value">${res.data.name}</span>`
unContainer.innerHTML = `Username: <span class="main__profile-value">${res.data.login}</span>`
reposContainer.innerHTML = `Repos: <span class="main__profile-value">${res.data.public_repos}</span>`
urlContainer.innerHTML = `Url: <span class="main__profile-value">${res.data.url}</span>`
})
};
searchButton.addEventListener("click", () => {
showData();
})
The client ID and client secret act as the username and password of Basic authentication. Your API request should therefore carry the following header:
const headers = {
'Authorization': 'Basic ' + btoa(CLIENT_ID + ':' + CLIENT_SECRET)
}
Note that btoa is used here because browsers have no native Buffer. If btoa throws an error, try window.btoa instead. Then pass the headers to fetch like this:
const response = await fetch(url, {
method: 'GET',
headers: headers,
});
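If the same code has to run in both the browser and Node, one option is to pick the encoder at runtime. This is only a sketch: basicAuthHeader is a hypothetical helper name, and it assumes the credentials contain only ASCII characters, since btoa rejects anything outside Latin-1:

```javascript
// Build a Basic auth header from a client ID and secret.
// Uses btoa where it exists (browsers, recent Node) and falls back to Buffer.
// Assumption: both values are plain ASCII.
const basicAuthHeader = (clientId, clientSecret) => {
  const credentials = `${clientId}:${clientSecret}`;
  const encoded = typeof btoa === 'function'
    ? btoa(credentials)
    : Buffer.from(credentials).toString('base64');
  return { Authorization: `Basic ${encoded}` };
};

// Usage:
// fetch(`https://api.github.com/users/${user}`, { headers: basicAuthHeader(id, secret) });
```

Keep in mind that a client secret shipped in browser code is visible to anyone who opens the page.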
I need to interact with the Flickr API from a Cloudflare Worker, but it's turning out to be exceedingly tricky.
My initial idea was to reach for the oauth1.0a library, but unfortunately it requires being passed a synchronous signature function. This is an issue because I need to use WebCrypto on the worker and it only exposes an asynchronous API.
Are there any other libraries I can use? I've currently spent hours trying to manually craft the request but keep getting errors saying the signature is bad.
This is my current attempt using someone's fork of oauth1.0a that adds support for an async signature function. This currently results in an "invalid_signature" response:
import OAuth from 'oauth-1.0a';
const CALLBACK_URL = "https://localhost:3000/oauth/callback";
const encoder = new TextEncoder();
async function signData(baseString: string, keyString: string) {
return await crypto.subtle.importKey(
'raw',
encoder.encode(keyString),
{ name: 'HMAC', hash: 'SHA-1' },
false,
['sign']
).then(key => {
return crypto.subtle.sign(
"HMAC",
key,
encoder.encode(baseString)
);
}).then(signature => {
let b = new Uint8Array(signature);
// base64 digest
return btoa(String.fromCharCode(...b));
});
}
export async function getRequestToken(consumerKey: string, consumerSecret: string) {
const url = "https://www.flickr.com/services/oauth/request_token";
const token = {
key: consumerKey,
secret: consumerSecret
}
const oauth = new OAuth({
consumer: token,
signature_method: 'HMAC-SHA1',
// @ts-ignore
hash_function: signData,
});
const requestData = {
url,
method: 'GET',
data: {
oauth_callback: CALLBACK_URL
}
};
// @ts-ignore
const authorisedRequest = await oauth.authorizeAsync(requestData, token);
let params = new URLSearchParams();
for (let [key, value] of Object.entries(authorisedRequest)) {
params.append(key, value as string);
}
const response = await fetch(requestData.url + `?${params}`, {
method: requestData.method,
});
const body = await response.text();
const parsedBody = oauth.deParam(body);
return parsedBody;
}
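For comparison when debugging a bad signature, here is a rough sketch of the OAuth 1.0a HMAC-SHA1 steps done by hand with WebCrypto. The helper names (percentEncode, buildBaseString, sign) are hypothetical and not part of any library. Two details are easy to get wrong: encodeURIComponent leaves ! ' ( ) * unescaped, and at the request-token step there is no token yet, so the token secret half of the signing key is empty:

```javascript
// RFC 3986 percent-encoding: escape the characters encodeURIComponent leaves alone.
const percentEncode = str =>
  encodeURIComponent(str).replace(/[!'()*]/g,
    c => '%' + c.charCodeAt(0).toString(16).toUpperCase());

// Signature base string: METHOD & encoded URL & encoded, sorted parameter string.
function buildBaseString(method, url, params) {
  const normalized = Object.entries(params)
    .map(([k, v]) => [percentEncode(k), percentEncode(String(v))])
    .sort(([ak, av], [bk, bv]) =>
      ak < bk ? -1 : ak > bk ? 1 : av < bv ? -1 : av > bv ? 1 : 0)
    .map(([k, v]) => `${k}=${v}`)
    .join('&');
  return [method.toUpperCase(), percentEncode(url), percentEncode(normalized)].join('&');
}

// HMAC-SHA1 of the base string, base64-encoded.
// The key is "consumerSecret&tokenSecret"; tokenSecret stays empty for the request token.
async function sign(baseString, consumerSecret, tokenSecret = '') {
  const encoder = new TextEncoder();
  const key = await crypto.subtle.importKey(
    'raw',
    encoder.encode(`${percentEncode(consumerSecret)}&${percentEncode(tokenSecret)}`),
    { name: 'HMAC', hash: 'SHA-1' },
    false,
    ['sign']
  );
  const mac = await crypto.subtle.sign('HMAC', key, encoder.encode(baseString));
  return btoa(String.fromCharCode(...new Uint8Array(mac)));
}
```

The params object must hold every oauth_* parameter being sent (including oauth_callback) except oauth_signature itself. Logging the base string and comparing it with what the library produces usually shows where the two diverge.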
What I'm trying to do:
I'm trying to scrape all the images in a Discord channel and get their URLs by requesting the attachments, but I can't find a way to request them.
Code
const fs = require("fs");
const fetch = require("node-fetch");
function readFileString(path) {
return fs.readFileSync(path, {encoding: "utf8"}).replace(/\r?\n|\r/g, "");
}
const token = readFileString("token.txt");
const channel = process.argv[2];
if(!channel) {
console.error("Usage: node index.js <channel id>");
process.exit(1);
}
const headers = {authorization: token};
async function request(before) {
const options = {
method: "GET",
headers: headers
};
const request = await fetch(
`https://discord.com/api/channels/${channel}/attachments`,
options
);
return await request.json();
}
let result;
async function go() {
let page = await request();
result = page;
while(page.length >= 100) {
page = await request(page[page.length - 1].id);
result = result.concat(page);
}
console.log(`Fetched ${result.length} images`);
fs.writeFileSync("links.json", JSON.stringify(result, null, 2));
}
go();
Output: Console
Fetched undefined images
Output: links.json
{
"message": "404: Not Found",
"code": 0
}
Any help with getting all the image links into the links.json file would be appreciated.
Looking at the docs, it seems the API does not offer a GET endpoint for message attachments.
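Attachments do come back as part of each message object, so one way around this is to page through the channel's messages endpoint and pull the URLs out. The sketch below is untested against a live channel; collectAttachmentUrls, collectAllAttachmentUrls and makeFetchPage are hypothetical names, and it assumes the messages endpoint with its limit and before query parameters:

```javascript
// Pull the attachment URLs out of one page of message objects.
const collectAttachmentUrls = messages =>
  messages.flatMap(message => (message.attachments || []).map(a => a.url));

// Walk the history backwards, one page at a time, until a short page arrives.
// fetchPage(before) is a caller-supplied function that returns an array of messages.
async function collectAllAttachmentUrls(fetchPage) {
  const urls = [];
  let before;
  for (;;) {
    const page = await fetchPage(before);
    urls.push(...collectAttachmentUrls(page));
    if (page.length < 100) return urls;
    before = page[page.length - 1].id; // oldest message on this page
  }
}

// One possible fetchPage, using the messages endpoint instead of /attachments.
const makeFetchPage = (channel, headers) => async before => {
  const query = new URLSearchParams({ limit: '100' });
  if (before) query.set('before', before);
  const res = await fetch(
    `https://discord.com/api/channels/${channel}/messages?${query}`,
    { headers }
  );
  if (!res.ok) throw new Error(`Discord API returned ${res.status}`);
  return res.json();
};
```

Usage would be along the lines of collectAllAttachmentUrls(makeFetchPage(channel, headers)), with the result written to links.json as before. Filtering by content_type or file extension would be needed to keep images only.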
I need to get styles with their exact (not computed) values for my extension. I can use document.styleSheets in some cases, but when the CSS is hosted on another domain I get a CORS error. I found a way to get those styles with the chrome.debugger API, but I'm having difficulty with the implementation:
chrome.debugger.attach(debuggeeId, "1.3", () => {
chrome.debugger.sendCommand(debuggeeId, "Page.enable", null, (r) => {
chrome.debugger.sendCommand(debuggeeId, "Page.getResourceTree", null, (res) => {
// get style URLs from resourceTree object
const cssResources = getCSSResources(res.frameTree.resources);
for(let url of cssResources) {
chrome.debugger.sendCommand(debuggeeId, "Page.getResourceContent", {frameId: toString(tabId), url: url}, (resp) => {
console.log(resp) /// return undefined
})
}
})
})
})
For some reason I am getting undefined from Page.getResourceContent. To clarify: am I getting undefined because of CORS (does it apply here too?) or because I'm calling the chrome.debugger API incorrectly?
The code below gives the same result: no data is returned from the request.
The only problem I see is that frameId is the internal ID of the frame, which you can get from res.frameTree.frame.id; it is not related to tabId.
You might also process all frames recursively, at least those of the same origin, and use modern syntax:
chrome.debugger.attach(debuggeeId, '1.3', async () => {
const send = (cmd, params = null) =>
new Promise(resolve =>
chrome.debugger.sendCommand(debuggeeId, cmd, params, resolve));
await send('Page.enable');
const {frameTree} = await send('Page.getResourceTree');
const frameQueue = [frameTree];
const results = [];
for (const {frame, childFrames, resources} of frameQueue) {
frameQueue.push(...childFrames);
const frameId = frame.id;
for (const {url, type} of resources) {
if (type === 'Stylesheet') {
results.push({
url,
frameUrl: frame.url,
...await send('Page.getResourceContent', {frameId, url}),
});
}
}
}
console.log(results);
});
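Each Page.getResourceContent result carries a content string and a base64Encoded flag. Stylesheets normally arrive as plain text, but if the flag is set the content has to be decoded first. A small helper for that is sketched below; decodeResourceContent is a hypothetical name, and it assumes the stylesheet is UTF-8:

```javascript
// Turn a Page.getResourceContent result into text.
// Assumption: base64-encoded stylesheets are UTF-8 underneath.
const decodeResourceContent = ({ content, base64Encoded }) =>
  base64Encoded
    ? new TextDecoder().decode(Uint8Array.from(atob(content), c => c.charCodeAt(0)))
    : content;

// Usage with the results array built above:
// const cssTexts = results.map(decodeResourceContent);
```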
I'm new to Node.js and I need to upload some PDFs to an external API (Zip Forms).
Right now I have the code below but the PDF pages are blank when they arrive at the destination. I tried saving the PDF locally, using the same binary data that I'm sending to the API, and the PDFs are correctly saved.
I am also using setTimeout here because I cannot find a way to wait until the PDF has been fully written before sending it to the API.
I also tried binary instead of latin1 in readFileSync, but it doesn't change anything.
Code:
const aws = require('aws-sdk');
const https = require('https');
const request = require('request');
const { createWriteStream, readFileSync, writeFileSync } = require('fs');
const s3 = new aws.S3(); // Pass in opts to S3 if necessary
// Look up order and related info.
var order = await Order.findOne({ id })
.populate('agent');
if (createZiplogixTransaction) {
ziplogixTransactionId = await sails.helpers.ziplogix.createZiplogixTransaction.with({
ziplogixContextId: ziplogixContextId,
transactionName: order.propertyStreetAddress + ', ' + order.propertyCity,
// FUTURE: if the transaction helper is updated, include actual order information
// e.g. Primary seller name, property street address, etc.
});
}
if (!order) {
throw 'noSuchOrder';
}
// Permissions
if (this.req.me && this.req.me.accountType !== 'agent' && !ziplogixContextId) {
throw 'forbidden';
}
let savedPdfs = await PdfOrderExternalId.find({ orderId: id });
await PdfOrderExternalId.destroy({
where: { orderId: id }
});
for (const pdf of pdfs) {
let url = await s3.getSignedUrl('getObject', {
Bucket: 'disclosure-pdfs',
Key: pdf.uploadFd,
Expires: 60 * 5
});
let file = createWriteStream(`/tmp/${pdf.slug}.pdf`);
await https.get(url, async (response) => {
await response.pipe(file);
// Need to wait for file to write on disk :|. Doesn't work with await or Promise (Why? IDK)
setTimeout(async () => {
let postData = await readFileSync(`/tmp/${pdf.slug}.pdf`, 'latin1');
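let queryString = `Name=${encodeURIComponent(pdf.displayName)}&Description=${encodeURIComponent(pdf.displayName)}`;
savedPdfs.forEach(item => {
if (item.pdfTemplate === pdf.pdfTemplate) {
// Separate the Id parameter from the previous one with "&".
queryString += `&Id=${item.externalId}`;
}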
}
});
request({
method: 'POST',
url: `${sails.config.custom.ziplogixApiBaseUrl}/transactions/${ziplogixTransactionId}/documents/file?${queryString}`,
headers: {
'X-Auth-ContextID': ziplogixContextId,
'X-Auth-SharedKey': sails.config.custom.ziplogixSharedKey,
'Content-Type': ['application/pdf', 'application/pdf']
},
body: postData
}, async (error, response, body) => {
// code here ...
});
}, 1000);
});
}
await exits.success(Date.now());
Any ideas what I'm doing wrong?
Thank you
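A likely cause of the blank pages, assuming the HTTP client encodes a string body as UTF-8: reading the file with 'latin1' produces a string, and every byte above 0x7F turns into two bytes when that string is re-encoded, which corrupts the PDF's binary streams. Reading without an encoding returns a Buffer that can be sent unchanged. The sketch below also waits for the download with pipeline from stream/promises (Node 15+) instead of setTimeout; downloadPdf and reencodedLength are hypothetical names:

```javascript
const https = require('node:https');
const { createWriteStream } = require('node:fs');
const { readFile } = require('node:fs/promises');
const { pipeline } = require('node:stream/promises');

// Resolve with the response stream once the headers arrive.
const get = url => new Promise((resolve, reject) => {
  https.get(url, resolve).on('error', reject);
});

// Download to disk, wait until the file is fully written, then read it as a Buffer.
async function downloadPdf(url, path) {
  const response = await get(url);
  await pipeline(response, createWriteStream(path)); // resolves after the write finishes
  return readFile(path); // no encoding argument, so this is raw bytes
}

// Illustration only: size of the data after a latin1 string is re-encoded as UTF-8.
const reencodedLength = buffer => Buffer.from(buffer.toString('latin1')).length;
```

The Buffer returned by downloadPdf can then be passed as the request body. For a three-byte input such as 0x25 0xE2 0xE3, reencodedLength gives 5, which shows how the upload grows and breaks.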