I recently got the idea to scrape information from Instagram accounts and their posts, such as the number of comments or likes. While debugging in Chrome, I found under the Network tab that a link such as https://www.instagram.com/instagram/?__a returns JSON with the information I want, but what actually loads when I request it myself is still the normal website HTML.
So far I have tried this code in Python:
import urllib.request
url = "https://www.instagram.com/instagram/?__a"
r = urllib.request.urlopen(url)
print(r.read())
or in JavaScript:
window.onload = function () {
  res = fetch("https://www.instagram.com/instagram/?__a", {
    method: 'get'
  }).then(function (data) {
    return data.json();
  }).catch(function (error) {
    console.log("ERROR".concat(error.toString()));
  });
  console.log(res.user);
};
The problem is that with these functions I only get the website code (HTML). Is there a way to get only the JSON that is loaded in the background? I know people will recommend the Instagram API, but I have neither a website nor a company to register.
I ran into a problem trying to get the API to do what I wanted, and really just needed JSON data, including URLs and captions for images, for a specific account.
Use the following GET request:
https://www.instagram.com/account_name/?__a=1
where account_name is the profile I'm scraping.
It returns all the JSON I needed for my task.
Update 2022:
You can no longer get the JSON output by adding only the query string ?__a=1.
Currently, use the following query string to get profile, video, and post information from Instagram:
https://www.instagram.com/instagram/?__a=1&__d=dis
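A minimal sketch of building that URL and guarding against the HTML fallback described in the question. The helper names (profileJsonUrl, isJsonContentType, fetchProfile) are hypothetical, and since this endpoint is undocumented it may change or require a logged-in session, so treat this as an illustration rather than a guaranteed recipe.

```javascript
// Build the profile URL with the query string from the update above.
function profileJsonUrl(accountName) {
  const url = new URL(`https://www.instagram.com/${encodeURIComponent(accountName)}/`);
  url.searchParams.set("__a", "1");
  url.searchParams.set("__d", "dis");
  return url.toString();
}

// True when a Content-Type header value announces JSON rather than HTML.
function isJsonContentType(contentType) {
  return typeof contentType === "string" &&
    contentType.toLowerCase().includes("application/json");
}

// Assumption: a global fetch is available (browsers, Node.js 18+).
async function fetchProfile(accountName) {
  const res = await fetch(profileJsonUrl(accountName));
  const contentType = res.headers.get("content-type");
  if (!res.ok || !isJsonContentType(contentType)) {
    // The server answered with the normal page (or an error), not JSON.
    throw new Error(`Expected JSON, got status ${res.status} (${contentType})`);
  }
  return res.json();
}
```

Checking the content type first gives a clear error when the server returns the regular HTML page, instead of a confusing JSON parse failure.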
Trying to get the JSON loaded in the background is too much work for a simple problem.
You should use the Instagram API. Just put your own name as the company.
I am working on a Covid-19 tracker project that needs API data from the URL https://thevirustracker.com/free-api?global=stats, as recommended on https://thevirustracker.com/api. For some reason I can neither access that website nor get the required data through the suggested URL.
Instead I'm trying https://corona.lmao.ninja/v2/all?yesterday, but the result has a slightly different structure. I'm finding it hard to write line 48 of my GlobalData.js file (in Visual Studio) so that the "cases" figure from the API GET request is displayed under "Global Data as of Today" in the first Paper section of the app.
Can someone please help me out?
[Image: result of the API GET request]
[Image: GlobalData.js]
The Fetch API always returns a promise, which is why you see a promise in the console.
Try this and you will get the actual data in the console:
fetch('https://corona.lmao.ninja/v2/all?yesterday')
  .then((res) => res.json())
  .then((data) => console.log(data))
  .catch((err) => console.log(err))
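The same idea can be sketched with async/await, pulling out only the "cases" figure the question asks about. This assumes the parsed body is an object with a numeric cases field; the function names casesLabel and loadGlobalCases are hypothetical.

```javascript
// Pull the "cases" figure out of the parsed response body.
// Assumption: the body is an object with a numeric `cases` field.
function casesLabel(data) {
  if (!data || typeof data.cases !== "number") {
    return "n/a";
  }
  return data.cases.toLocaleString("en-US");
}

// Await the promise instead of logging it.
// `fetchImpl` defaults to the global fetch and can be swapped for a stub.
async function loadGlobalCases(fetchImpl = fetch) {
  const res = await fetchImpl("https://corona.lmao.ninja/v2/all?yesterday");
  const data = await res.json();
  return casesLabel(data);
}
```

In a React component, the resolved value could then be stored in state, for example with loadGlobalCases().then(setCases), where setCases is whatever state setter the component already has.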
I am new to API usage. I have managed to use the Google PageSpeed Insights v5 API through JavaScript code, but I cannot for the life of me succeed in doing so for GTmetrix. It seems the only information relating to the GTmetrix API and JavaScript is a link to the RapidAPI website. I simply wish to achieve the same simple retrieval of data from GTmetrix as I have from Google. Is this possible?
Am I simply structuring my request incorrectly when I set it as:
https://gtmetrix.com/api/0.1/?login-user=myemail@email.com&login-pass=MyRanDomApIKeY&location=2&url=https://sitetotest.com
Because when I set my Google PageSpeed Insights request URL to the following, it works:
https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://websitetotest.com&category=performance&strategy=desktop&key=MyRanDomApIKeY
The below code works for Google Page Insights and I am even able to retrieve JSON data in a browser window with a URL such as:
<div id="firstmetric"></div>
<br>
<div id="domSize"></div>
<button>Click Me</button>
<script>
  $('button').click(function () {
    var baseUrl = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=";
    var fieldUrl = "https://websitetotest.com";
    var trailing = "&category=performance&strategy=desktop&key=MyRanDomApIKeY";
    $.getJSON(baseUrl + fieldUrl + trailing, function (data) {
      console.log(data);
      var item = data.lighthouseResult.categories.performance.auditRefs[0].weight;
      var domSize = data.lighthouseResult.audits['dom-size'].displayValue;
      $("#firstmetric").html(item);
      $("#domSize").html(domSize);
    });
  });
</script>
I truly need it spelled out for me, because anything less is going to lead me to ask follow-up questions and put us in a tailspin. :/
As a newbie, I have found JSFiddle a life-saving resource for testing, trying, breaking, and building in my learning process. If it isn't too much to ask, a fiddle would help me get my brain around things.
The parameters you are using, login-user and login-pass, refer to HTTP authentication on the page you are analyzing (GTmetrix passes them to that page during the analysis), not to your GTmetrix API credentials.
The GTmetrix API authenticates with your e-mail as the username and your API key as the password, as described in the API docs.
Another thing to keep in mind is that GTmetrix will not let you make API calls from your web application's frontend, since it disallows CORS requests. Calling it from the frontend of a public website would also expose your GTmetrix API key, which is probably not a good idea.
So you would do it through your backend code. For example, in Node.js:
fetch("https://gtmetrix.com/api/0.1/locations", {
  headers: new Headers({
    "Authorization": 'Basic ' + btoa("[YOUR E-MAIL]" + ":" + "[YOUR API KEY]"),
  }),
}).then(res => res.json())
  .then(response => console.log(response));
This prints the array of locations.
Note that whichever backend language you choose, you need to add the basic authorization header to your API call and encode it properly (that is what the btoa function call does).
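As a sketch of what "encode it properly" means, the header value is the word Basic followed by the base64 encoding of "email:apiKey". The function names below are hypothetical, and the example assumes Node.js 18+ so that fetch is available globally; it uses Buffer in place of btoa, which is a Node-specific alternative.

```javascript
// Build the HTTP Basic value: "Basic " + base64("email:apiKey").
function basicAuthHeader(email, apiKey) {
  const encoded = Buffer.from(`${email}:${apiKey}`, "utf8").toString("base64");
  return `Basic ${encoded}`;
}

// Backend-only call to the locations endpoint used above.
// Assumption: Node.js 18+ for the global fetch.
async function listLocations(email, apiKey) {
  const res = await fetch("https://gtmetrix.com/api/0.1/locations", {
    headers: { Authorization: basicAuthHeader(email, apiKey) },
  });
  if (!res.ok) {
    throw new Error(`GTmetrix request failed with status ${res.status}`);
  }
  return res.json();
}
```

Keeping the credentials on the server (for example in environment variables) and exposing only your own endpoint to the browser avoids both the CORS restriction and the leaked API key.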
Hi guys - Super Quick TL;DR version:
My Circle CI endpoint that returns artifacts as JSON contains a first entry. That entry has a URL to a code-coverage JSON file.
That is the data I need returned via axios, ajax, or the fetch API.
I've tried all three.
In Postman, via a web server, or in the browser console, they all come back with "Not Logged In" sorts of errors.
So this is an issue with tokens, access, etc., I guess?
[ Another person experiencing the same issue - https://discuss.circleci.com/t/circle-token-param-ignored-when-using-api-url-to-fetch-latest-artifact/3197 ]
Has anyone managed to resolve this yet?
Can anyone advise how I can get that data (not just the artifacts) returned if Circle CI's endpoint only works in the browser? And if the URL for that one-level-deeper sort of information doesn't accept tokens?
Ummmm. Help?
Inside The React Component:
componentDidMount() {
  axios.get("https://circleci.com/api/v1.1/project/github/ORG/REPO/Latest/artifacts?circle-token=<My TOKEN>")
    .then(function(result){
      console.log('API result===>', result);
    })
}
What kind of artifact URLs are you using? You should be using the *.circle-artifacts.com/* type URLs.
The endpoint you provided in your question should work fine as long as you lowercase Latest to latest. It should look like this:
componentDidMount() {
  axios.get("https://circleci.com/api/v1.1/project/github/<org>/<repo>/latest/artifacts?circle-token=<my-token>")
    .then(function(result){
      console.log('API result===>', result);
    })
}
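A small sketch of the two steps from the question: building the artifacts URL with the lowercase latest segment, then picking the URL of the first entry. The helper names are hypothetical, and the shape of each artifact entry (an object with a url field) is an assumption based on the question's description.

```javascript
// Build the artifacts URL; note the lowercase "latest" path segment.
function latestArtifactsUrl(org, repo, token) {
  const path = ["github", org, repo].map((part) => encodeURIComponent(part)).join("/");
  return `https://circleci.com/api/v1.1/project/${path}/latest/artifacts` +
    `?circle-token=${encodeURIComponent(token)}`;
}

// Return the URL of the first artifact, or null when the list is empty.
// Assumption: each entry is an object with a `url` field.
function firstArtifactUrl(artifacts) {
  if (!Array.isArray(artifacts) || artifacts.length === 0) {
    return null;
  }
  return artifacts[0].url ?? null;
}
```

With axios, the result of the first request would be passed as firstArtifactUrl(result.data), and the returned URL fetched in a second request.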
I am trying to get the tweets associated with the user @WeAreShootProd, but the JSON and XML are empty.
https://api.twitter.com/1/users/show.xml?screen_name=WeAreShootProd
https://search.twitter.com/search.json?rpp=10&callback=?&q=from:WeAreShootProd
As you can see from this link there are at least 5 tweets
https://twitter.com/WeAreShootProd
Does anyone know why they aren't appearing in the XML or JSON?
Thanks very much
When I tried the URL you provided, I see the following response:
{
  errors: [
    {
      message: "The Twitter REST API v1 is no longer active. Please migrate to API v1.1. https://dev.twitter.com/docs/api/1.1/overview.",
      code: 68
    }
  ]
}
I suspect the problem was that in the JSON call, you had two question marks in the query portion of the URI. But whatever the problem was at the time, the issue is rather moot. That version of the Twitter API is no longer supported.
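A response like the one above can look "empty" if only the expected data fields are read. As a hedged sketch, a client could check for the errors array before using the body; the function name apiErrorMessages is hypothetical, and the body shape is taken from the response shown above.

```javascript
// Collect error messages from a body shaped like the response above.
function apiErrorMessages(body) {
  if (!body || !Array.isArray(body.errors)) {
    return [];
  }
  return body.errors.map((e) => `${e.code}: ${e.message}`);
}
```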
I'm currently researching how to add persistence to a realtime Twitter JSON feed in Node.
I've got my stream set up and it's broadcasting to the client, but how do I go about storing this data in a JSON database such as CouchDB, so I can access the stored JSON when the client first visits the page?
I can't seem to get my head around CouchDB.
var array = {
  "tweet_id": tweet.id,
  "screen_name": tweet.user.screen_name,
  "text" : tweet.text,
  "profile_image_url" : tweet.user.profile_image_url
};
db.saveDoc('tweet', strencode(array), function(er, ok) {
  if (er) throw new Error(JSON.stringify(er));
  util.puts('Saved my first doc to the couch!');
});
db.allDocs(function(er, doc) {
  if (er) throw new Error(JSON.stringify(er));
  //client.send(JSON.stringify(doc));
  console.log(JSON.stringify(doc));
  util.puts('Fetched my new doc from couch:');
});
These are the two snippets I'm using to try to save and retrieve tweet data. The array is one individual tweet, and it needs to be saved to Couch each time a new tweet is received.
I don't understand the id part of saveDoc: when I make it unique, db.allDocs only lists IDs and not the content of each doc in the database, and when it's not unique, it fails after the first db entry.
Can someone kindly explain the correct way to save and retrieve this type of JSON data with CouchDB?
I basically want to load the entire database when the client first views the page. (The database will have fewer than 100 entries.)
Cheers.
You need to insert the documents into the database. You can do this by inserting the JSON that comes from the Twitter API as is, or you can insert one status at a time (in a for loop).
You should create a view that exposes that information. If you saved the JSON directly from Twitter, you are going to need to emit several times in your map function.
These operations (ingestion and querying) are not the same thing, so you should really do them at different times in your program.
You should consider running a background process (maybe something as simple as a setInterval) that updates your database. Or you can use something like clarinet (http://github.com/dscape/clarinet) to parse the Twitter streaming API directly.
I'm the author of nano, and here is one of the tests that does most of what you need:
https://github.com/dscape/nano/blob/master/tests/view/query.js
For the actual query semantics, and to learn a bit more about how CouchDB works, I would suggest you read:
http://guide.couchdb.org/editions/1/en/index.html
If you find it useful, I would suggest you buy the book :)
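To make the "create a view" advice concrete, here is a minimal sketch of a design document with one view. It assumes each stored document is a single tweet with the fields from the question; the names _design/tweets and by_screen_name are hypothetical.

```javascript
// Map function: CouchDB calls it once per document and provides emit().
// Assumption: each stored document is a single tweet with the fields
// used in the question (screen_name, text, profile_image_url).
const mapByScreenName = function (doc) {
  if (doc.screen_name && doc.text) {
    emit(doc.screen_name, { text: doc.text, image: doc.profile_image_url });
  }
};

// Design document; CouchDB expects the map function as a string.
const tweetsDesignDoc = {
  _id: "_design/tweets",
  views: {
    by_screen_name: { map: mapByScreenName.toString() },
  },
};
```

With nano, a document like this could be stored once with db.insert(tweetsDesignDoc) and queried later with db.view("tweets", "by_screen_name"), keeping ingestion and querying as separate steps.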
If you want to use a module to interact with CouchDB, I would suggest cradle or nano.
You can also use the default http module in Node.js to make requests to CouchDB. The downside is that the default http module tends to be a little verbose. There are alternatives that give you a better API for HTTP requests; the request module is really popular.
To get data, you need to make a GET request to a view. If you want to create a document, you have to send a PUT request to your database.
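The two requests can be sketched as plain request descriptions, independent of the HTTP client. The base URL and database name are assumptions supplied by the caller, and the helper names are hypothetical. Using the tweet id as the document id keeps each id unique, and include_docs=true on _all_docs is what returns document content rather than only ids and revisions.

```javascript
// Describe the PUT that creates one document per tweet.
// Assumption: the tweet id is used as the document id, so it must be unique.
function putTweetRequest(baseUrl, dbName, tweet) {
  const docId = encodeURIComponent(String(tweet.id));
  return {
    method: "PUT",
    url: `${baseUrl}/${dbName}/${docId}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      screen_name: tweet.user.screen_name,
      text: tweet.text,
      profile_image_url: tweet.user.profile_image_url,
    }),
  };
}

// URL that lists every document together with its content.
function allDocsUrl(baseUrl, dbName) {
  return `${baseUrl}/${dbName}/_all_docs?include_docs=true`;
}
```

For a database with fewer than 100 entries, a single GET on the allDocsUrl result when the client first loads the page is probably enough; a view becomes more useful once the data needs filtering or sorting.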