Using Wikia API - javascript

I am trying to access the X-men API on wikia, to try and extract the name and image of each character, to then be used on a SPA using javascript.
This is the link to the page on the wiki:
http://x-men.wikia.com/wiki/Category:Characters
I cannot for the life of me figure out how to access the API. It doesn't seem to be RESTful, and that's all I have any experience in.
Has anyone used the Wikia API successfully before? I can get some articles and such, but nothing useful.
(The documentation is shocking, been searching around for hours.)

Probably you have already found a solution, but I think you should write something like this:
import requests
xmen_url = "http://x-men.wikia.com/api/v1/Articles/List?expand=1&category=Characters&limit=10000"
r = requests.get(xmen_url)
response = r.json()
# print(response)
# Number each article and print its title and id
for a, item in enumerate(response['items'], start=1):
    print("{}\t{}\t({})".format(a, item['title'], item['id']))
This will print a list of all the articles in the category Characters (I think there are also some subcategories; you should check). If you want to take a deeper look at the JSON, you can uncomment the commented line.
Hope it helps.
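Since the question asks for the name and image of each character, here is a minimal sketch of pulling both out of the parsed response. It assumes that the expanded items carry a "thumbnail" field holding an image URL; that field name is an assumption and should be checked against the real JSON. The helper name names_and_images and the sample data are made up for illustration.

```python
def names_and_images(response):
    """Return (title, image_url) pairs from a parsed Articles/List response."""
    pairs = []
    for item in response.get("items", []):
        # "thumbnail" is an assumed field name; .get() yields None if it is absent.
        pairs.append((item["title"], item.get("thumbnail")))
    return pairs


# Hand-written sample shaped like the assumed response, not real API output.
sample = {
    "items": [
        {"id": 1, "title": "Character A", "thumbnail": "http://example.com/a.png"},
        {"id": 2, "title": "Character B"},
    ]
}

for title, image in names_and_images(sample):
    print(title, image)
```

With the real data, the same function could be called on the response variable from the snippet above.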

Related

Web scraping in R by first navigating through a JavaScript module

I looked up various questions and answers but unfortunately none of the problems I found dealt with a case that is similar to mine. In a typical question, the JavaScript table builds up directly when the website is loaded. In my case, however, I first have to navigate through the JavaScript module and select several criteria before I get the sought-after result.
This is my case: I have to scrape the exchange rates for various currencies from this website www.globocambio.co. To do that, I have (1) to navigate to “I WANT COLOMBIAN PESO”, (2) select the currency (e.g., “Chilean Peso”), (3) and the collection destination (e.g., “El Dorado International Airport”). Only then the respective exchange rate is being loaded. See this screenshot for illustration. I marked the three selection steps red. Green is the data point that I want to scrape for different currencies.
I am not very familiar with JavaScript but I tried to understand what is going on. Here is what I found out:
Using Chrome DevTools, I investigated the Network activity when loading an exchange rate. There is an XHR called “GetPrice” that requests the price using this URL: https://reservations.globocambio.co/DesktopModules/GlobalExchange/API/Widget/GetPrice and using the following Form Data
ISOAOrigen=CLP&cantidadOrigen=9000&ISOADestino=COP&cantidadDestino=0&centerId=27&operationType=OperationTypesBuying
I understand that the Form Data contains the information that I initially selected manually:
operationType=OperationTypesBuying: this is the “I WANT COLOMBIAN PESO” option
ISOAOrigen=CLP: this is the “Chilean Peso”
centerId=27: this is the “El Dorado International Airport”
The server responds to my request with the following information:
{"MonedaOrigen":{"ISOA":"CLP","Nombre":null,"Margen":0.1630000000,"Tramo":0.0,"Fixing":2.9000000000},"CantidadOrigen":9000.00,"MonedaDestino":{"ISOA":"COP","Nombre":null,"Margen":0.0,"Tramo":0.0,"Fixing":0.0},"CantidadDestino":21845.70,"TipoCambio":2.42730000000000000000,"MargenOrigen":0.0,"TramoOrigen":0.0,"FixingOrigen":0.0,"MargenDestino":0.0,"TramoDestino":0.0,"FixingDestino":0.0,"IdCentro":"27","Comision":null,"ComisionTramoSuperior":null,"ComisionAplicada":{"CodigoMoneda":null,"CodigoTipoMoneda":0,"ComisionFija":0.0,"ComisionVariable":0.0,"TramoInicio":0.0,"TramoFin":null,"Orden":0}}
From this response, "TipoCambio":2.42730000000000000000 is then being written on the website using this line of HTML code: <span id="spTipoCambioCompra">2.427300</span>
This means that "TipoCambio" is the value that I am looking for.
So, I have to communicate somehow via R with the server using the Form Data as input variables. Can anyone tell me how to do this?
I mean, I understand that I have to combine the URL https://reservations.globocambio.co/DesktopModules/GlobalExchange/API/Widget/GetPrice with the Form Data “ISOAOrigen=CLP&cantidadOrigen=9000&ISOADestino=COP&cantidadDestino=0&centerId=27&operationType=OperationTypesBuying” somehow, but I do not know how it works.
Any help will be appreciated!
Update:
I still have no idea how to solve the above issue, yet. However, I try to approach it with small steps.
Using RSelenium, I am currently trying to find out how to click on the option “I WANT COLOMBIAN PESO”. My idea was to use the following code:
library(RSelenium)
remDr <- RSelenium::remoteDriver(remoteServerAddr = "localhost",
port = 4445L,
browserName = "chrome")
remDr$open()
remDr$navigate("https://www.globocambio.co/en/home")
webElem <- remDr$findElement("id", "tabCompra") #What is wrong here?
webElem$clickElement() # Click on "I WANT COLOMBIAN PESO"
But I get an error message after executing webElem <- remDr$findElement("id", "tabCompra"):
Selenium message:no such element: Unable to locate element: {"method":"css selector","selector":"#tabCompra"}
(Session info: chrome=81.0.4044.113)
For documentation on this error, please visit: https://www.seleniumhq.org/exceptions/no_such_element.html
...
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
class: org.openqa.selenium.NoSuchElementException
Further Details: run errorDetails method
What am I doing wrong here?
I solved my problem using selenium in Python:
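from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
# Selenium 3 style constructor; newer releases take a Service object instead of executable_path
driver = webdriver.Firefox(executable_path = '/your_path/geckodriver')
driver.get("https://www.globocambio.co/en/")
# The widget lives inside an iframe, so switch into it before locating elements
driver.switch_to.frame("iframeWidget")
# Click on "I WANT COLOMBIAN PESO"
elem = driver.find_element(By.ID, 'tabCompra')
elem.click()
# Type the origin currency and confirm the suggestion
elem = driver.find_element(By.ID, 'inputddlMonedaOrigenCompra')
elem.click()
elem.send_keys(Keys.CLEAR)
elem.send_keys("Chilean Peso")
elem.send_keys(Keys.ENTER)
elem.send_keys(Keys.ARROW_DOWN)
elem.send_keys(Keys.RETURN)
# Read the text of the result element
elem = driver.find_element(By.ID, 'info-change-compra')
print(elem.text)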
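The direct route described in the question, sending the Form Data to the GetPrice URL and reading "TipoCambio" from the reply, might be sketched as below using only the standard library. This is untested against the live site: it assumes the endpoint accepts a plain form-encoded POST without extra cookies or headers, which may not hold. The function names build_form, extract_rate and fetch_rate are made up for illustration; the field names and values come from the question.

```python
import json
import urllib.parse
import urllib.request

GET_PRICE_URL = "https://reservations.globocambio.co/DesktopModules/GlobalExchange/API/Widget/GetPrice"


def build_form(origin_iso, amount, center_id):
    """Build the form fields listed in the question for a 'buying' request."""
    return {
        "ISOAOrigen": origin_iso,
        "cantidadOrigen": amount,
        "ISOADestino": "COP",
        "cantidadDestino": 0,
        "centerId": center_id,
        "operationType": "OperationTypesBuying",
    }


def extract_rate(body):
    """Pull the exchange rate out of the JSON text the server returns."""
    return json.loads(body)["TipoCambio"]


def fetch_rate(origin_iso, amount=9000, center_id=27):
    """POST the form and return the rate (assumes no cookies or headers are needed)."""
    data = urllib.parse.urlencode(build_form(origin_iso, amount, center_id)).encode("ascii")
    with urllib.request.urlopen(GET_PRICE_URL, data=data, timeout=30) as reply:
        return extract_rate(reply.read().decode("utf-8"))


# Offline check: the encoded form and a trimmed copy of the sample reply.
print(urllib.parse.urlencode(build_form("CLP", 9000, 27)))
print(extract_rate('{"CantidadDestino": 21845.70, "TipoCambio": 2.4273}'))
```

Calling fetch_rate("CLP") would perform the actual request; looping over several currency codes would then collect one rate per currency.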

Codebird.js for Twitter API won't return more than one tweet in single call

Codebird.js is not working when I try to return a list of n tweets by adding to my params object.
It works when I include just the property screen_name to get a single tweet, but when I add count, as below, the response I get is still for only one tweet.
params = {
"screen_name": screenName,
"count": "3"
};
I can't seem to find any codebird.js documentation besides the README.MD on the main GitHub page.
Is my syntax correct? Am I approaching this the correct way by adding to params?
Solved it. Turns out I was using the wrong api endpoint and should have been using statuses/user_timeline.
Note: Check this part of the docs to see how to map this endpoint's string to the right format https://github.com/jublonet/codebird-js#mapping-api-methods-to-codebird-function-calls

How to query JSON with JS API to return JSON properties?

Apologies if this seems basic to some, but I'm new to JS/node.js/JSON and still finding my way. I've searched this forum for an hour but cannot find a specific solution.
I have a basic website setup running off a local Node.js server along with 2x JSON data files with information about 32x local suburbs.
An example of an API GET request URL on the site would be:
.../api/b?field=HECTARES
The structure of the JSON files is like:
JSON Structure
In the JSON file there are 32x Features (suburbs), each with its own list of Properties as shown above. What I am trying to do is use the API 'field' query to push the HECTARES values of each of the 32x Features into a single output variable. The code below is an example of how far I have got:
var fieldStats = [];
var fieldQ = req.query['field'];
for (var i in suburbs.features) {
    var x = suburbs.features[i].properties.HECTARES;
    fieldStats.push(x);
}
As you can see in the above "HECTARES" is hard-coded - I need to be able to pass the 'fieldQ' variable to this code but have no idea how to.
Advice appreciated!
Exactly the same syntax you are using just above:
suburbs.features[i].properties[fieldQ];
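For comparison, the same runtime lookup sketched in Python, where a key held in a variable is passed in brackets in the same way. The suburbs data here is a made-up two-feature stand-in for the real files, and field_values is a hypothetical helper name.

```python
# Made-up stand-in for the real data: two features instead of 32.
suburbs = {
    "features": [
        {"properties": {"NAME": "A", "HECTARES": 120.5}},
        {"properties": {"NAME": "B", "HECTARES": 87.0}},
    ]
}


def field_values(data, field):
    """Collect the property named by `field` from every feature."""
    return [feature["properties"][field] for feature in data["features"]]


# The field name arrives as a string at runtime, like fieldQ in the question.
print(field_values(suburbs, "HECTARES"))
```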

finding the ajax request in dojo

I am working on a crawler to scrape all the data from a website. They use AJAX for pagination. I found this in the href of the page numbers:
javascript:dojo.publish("showResultsForPageNumber",[{pageNumber:"4",pageSize:"15", linkId:"WC_SearchBasedNavigationResults_pagination_link_4_categoryResults"}])
What is happening here? I am not familiar with Dojo. Can anyone help me find the corresponding server script so that I can scrape all the data, including the paginated results?
Update #1
In the console I found the following.
This is the code it is redirected to:
showResultsPage:function(data){
    var pageNumber = data['pageNumber'];
    var pageSize = data['pageSize'];
    pageNumber = dojo.number.parse(pageNumber);
    pageSize = dojo.number.parse(pageSize);
    setCurrentId(data["linkId"]);
    if(!submitRequest()){
        return;
    }
    console.debug(wc.render.getContextById('searchBasedNavigation_context').properties); //line 773
    var beginIndex = pageSize * ( pageNumber - 1 );
    cursor_wait();
    wc.render.updateContext('searchBasedNavigation_context', {"productBeginIndex": beginIndex,"resultType":"products"});
    this.updateHistory();
    MessageHelper.hideAndClearMessage();
},
It's part of the publisher/subscriber mechanism of the Dojo framework and does not say anything about the executed AJAX request.
If you're not familiar with the publisher/subscriber pattern, then let's explain that first. To decouple certain components/parts of the application, this pattern is commonly used.
On one side, someone publishes information, while on the other side (= some other part of the application) someone listens to it.
In this case, the following data is published (= second parameter):
[{
    pageNumber: "4",
    pageSize: "15",
    linkId: "WC_SearchBasedNavigationResults_pagination_link_4_categoryResults"
}]
Not all subscribers in the application need to know about this data, so there is a topic system. In this case, the data is published to a topic called "showResultsForPageNumber" (= first parameter).
To know what happens next, you will have to look through the code for someone who subscribes to that topic. So somewhere in the code you will find something like this:
dojo.subscribe("showResultsForPageNumber", function(data) {
    // Does something with the data, perhaps an AJAX call?
});
To answer your question, look in the code for something like: dojo.subscribe("showResultsForPageNumber", as it will tell you what happens next.
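The publisher/subscriber pattern described above can be illustrated with a minimal, framework-free sketch in Python. This is not how Dojo is implemented; the TopicBus class is a made-up illustration of the idea that published data only reaches callbacks registered for the same topic, with the published array spread over the callback's arguments.

```python
class TopicBus:
    """Tiny topic-based publish/subscribe hub (illustrative only)."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        # Register a callback for one topic.
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, args):
        # Call every subscriber of this topic; other topics are untouched.
        for callback in self._subscribers.get(topic, []):
            callback(*args)


bus = TopicBus()
seen = []

# The subscriber is where a real application might trigger its AJAX call.
bus.subscribe("showResultsForPageNumber", seen.append)

bus.publish("showResultsForPageNumber", [{"pageNumber": "4", "pageSize": "15"}])
bus.publish("someOtherTopic", [{"ignored": True}])

print(seen)
```

Only the first publish reaches the subscriber, which is why finding the matching subscribe call is the way to see what the pagination link actually triggers.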
However, if you're just interested in the AJAX calls, it will be easier to check the network requests. In Google Chrome, Mozilla Firefox and similar browsers, press F12 to open the developer tools, then select the Network tab and activate it if necessary. Now click on the pagination controls and you will see a log of all network traffic, including the request and response data.
Here you are publishing to the topic named "showResultsForPageNumber"; "pageNumber", "pageSize" and "linkId" are properties of the object in your argument array.
See following link: ref1 ref2

Jira Gadget: Configuration isn't saved for reconfiguration

I am writing a gadget for Jira with some configuration options. One of these configuration options is a "project or filter picker".
My problem lies in the part, when I want to reconfigure the gadget's preferences. I have read the code of the timesince-gadget as an example and I think the relevant part is the following:
if (/^jql-/.test(gadget.getPref("projectOrFilterId"))){
    projectAndFilterPicker =
    {
        userpref: "projectOrFilterId",
        type: "hidden",
        value: gadgets.util.unescapeString(this.getPref("projectOrFilterId"))
    };
} else {
    projectAndFilterPicker = AJS.gadget.fields.projectOrFilterPicker(gadget, "projectOrFilterId", args.options);
}
Basically, I've copied the code from the timesince-gadget. Unfortunately, even if the gadget is already configured, the JavaScript always enters the else branch.
One problem is that I have no experience with JQL and don't fully understand the if clause.
But usually (e.g. when calling the rest api and processing the config infos)
gadget.getPref("projectOrFilterId")
returns a string containing the id of the picked project or filter.
The question now is: how can I make my gadget remember the last configuration, as so many other Jira gadgets do?
I really hope anyone can help me with that.
It turns out the answer is even simpler than I thought.
First: In the descriptor you can totally forget the if part from above. Just
var projectAndFilterPicker = AJS.gadget.fields.projectOrFilterPicker(gadget, "projectOrFilterId", args.options);
is needed.
Second: Retrieve the project's or filter's name in your rest resource, which shouldn't be a problem, since you already want to use the processed id. Then return this name back to the view part of your javascript and type in something like
this.projectOrFilterName = args.myrestclasskey.projectOrFilterName;
And tada: reconfiguration will display the old configured name!
I had this problem once when I forgot to specify the option in the Gadget XML file. I solved it by adding this to the XML:
<UserPref name="projectOrFilterId" datatype="hidden"/>
