How to get table data as rows and columns from wikipedia api? - javascript

When I tried to get the table data as JSON, I could not find distinguishable children in the JSON output of the following query:
https://en.wikipedia.org/w/api.php?action=parse&page=List_of_football_clubs_in_India&prop=wikitext&section=3&format=json
I want to get the rows and columns of this table (the text) :-
https://en.wikipedia.org/wiki/List_of_football_clubs_in_India#Assam
The JSON output seems complicated and I don't find a good way to extract text from it.
(I am doing this in JavaScript (Node.js).)
Please help.

I'm not sure what you expect. Your API request is actually returning the wikitext wrapped in a JSON structure. However, the wikitext (which the table is part of) is not JSON, so you cannot interpret it as such.
I'm also not quite sure what information you want. If you want the football clubs in the table, then your only bet is to parse the wikitext (you can also have the API return the parsed HTML to make it "easier") and go through the data yourself. However, this is probably an error-prone and not very fun task.
So, if you want all football clubs of India in a structured data format, I would try Wikidata instead. It lets you query structured data for the information you need (and also gets you the links to Wikipedia articles, if the item has a link to a Wikipedia page). In your use case, it's probably a good idea to try out the Wikidata Query Service.
There you could issue a query like:
SELECT ?itemLabel ?sitelink WHERE {
  ?item wdt:P31 wd:Q476028;
        wdt:P17 wd:Q668.
  ?sitelink schema:isPartOf <https://en.wikipedia.org/>;
            schema:about ?item.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
which queries all football clubs in India and returns the item label as well as the link to the English Wikipedia article:
https://query.wikidata.org/#SELECT%20%3FitemLabel%20%3Fsitelink%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ476028%3B%0A%20%20%20%20%20%20%20%20wdt%3AP17%20wd%3AQ668.%0A%20%20%3Fsitelink%20schema%3AisPartOf%20%3Chttps%3A%2F%2Fen.wikipedia.org%2F%3E%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20schema%3Aabout%20%3Fitem.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D

Assume that res is the data that you get from the wiki.
// This gets the innermost part of the object, which is the text you want
let wikiText = res.parse.wikitext['*'];
// This replaces every run of characters that are neither letters nor whitespace (digits, punctuation, markup) with a space
let pureText = wikiText.replace(/[^a-zA-Z\s]+/g, ' ');
The above code gives you clean access to the text; however, how you separate the columns and rows is up to you.
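One way to do that separation is to walk the wikitext line by line instead of stripping characters. The following is only a minimal sketch: the function names parseWikiTable and cleanCell and the sample table are made up for illustration, and it assumes a simple table with no nested tables and no templates containing pipes.

```javascript
// Strip common inline markup from one cell (a simplification, not a full wikitext parser).
function cleanCell(cell) {
  let text = cell
    .replace(/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/g, '$1') // [[Target|Label]] -> Label, [[Target]] -> Target
    .replace(/'{2,}/g, '');                           // drop ''italic'' and '''bold''' markers
  const bar = text.lastIndexOf('|');
  if (bar !== -1) text = text.slice(bar + 1);         // drop cell attributes such as: style="..." |
  return text.trim();
}

// Returns an array of rows; each row is an array of cell strings.
function parseWikiTable(wikitext) {
  const rows = [];
  let current = null;
  let inTable = false;
  const flush = () => {
    if (current && current.length) rows.push(current);
  };
  for (const rawLine of wikitext.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith('{|')) { inTable = true; current = null; continue; }
    if (!inTable) continue;
    if (line.startsWith('|}')) { flush(); current = null; inTable = false; continue; }
    if (line.startsWith('|-')) { flush(); current = []; continue; }
    if (line.startsWith('|+')) continue;              // table caption, ignored here
    if (line.startsWith('!') || line.startsWith('|')) {
      if (!current) current = [];
      // Header cells may be separated by "!!" or "||", data cells by "||".
      const separator = line.startsWith('!') ? /!!|\|\|/ : /\|\|/;
      for (const cell of line.slice(1).split(separator)) current.push(cleanCell(cell));
    }
  }
  return rows;
}

// Hypothetical sample input, not the real article content.
const sample = [
  '{| class="wikitable"',
  '|-',
  '! Club !! City',
  '|-',
  '| [[Foo FC|Foo]] || Guwahati',
  '|-',
  '| style="text-align:left" | Bar',
  '| Silchar',
  '|}',
].join('\n');

console.log(parseWikiTable(sample));
```

With the real response you would pass res.parse.wikitext['*'] instead of sample. Real articles often use templates and references inside cells, so expect to extend cleanCell for those cases.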

This can be done by setting prop=text and then parsing the returned HTML using JSDOM (an npm package for Node.js).
It seems to slow things down a bit, but I'm not sure whether a faster way exists.

I know this question is old but there is an API for this. You can supply a page title and it will return the tables of your choice in JSON.

Related

Accessing data through gspread comes up blank for lists

I'm experiencing this odd problem with gspread. I'm using it to access a sheet, and I want to take that data and convey it to the frontend of my website.
I put the data into a variable like so:
data = worksheet.get_all_values()
And this works, unless I try and access a cell which has what I believe to be a list in it.
[Screenshot: sheets error]
For example, data[2][5] will give "test", but data[3][5] will give nothing (instead of:
"
Conversation practice
Advice and mentorship
Anything!").
The circled cells are the ones which return nothing. Any ideas?

Receiving high volume data in client browser using jQuery / JavaScript

I am in the process of making a WordPress based application where a student can take the examination on his web-browser. The questions will be randomly selected and served from the question bank stored in a WordPress CMS.
In this regard, the following is important to share:
- Each examination can have as many as 100 multiple-choice questions.
- Each question can have images, and each choice can have associated images.
- Since the examination is time-bound, I cannot send a request to the server every time the student completes a question.
My query is:
How do I send the questions from the server?
- Should I send the whole question set in one go and then have the JavaScript parse all the questions and choices on the client side,
or
- should the client repeatedly request the questions from the server in the background, in chunks of, say, 5 questions each? If this is the better approach, I am not sure how to implement it. Any pointers, please?
Or is there a third approach which I am not aware of?
Please advise with any comments and solutions for the problem.
Thanks in advance.
Depending on the user's selection, send the appropriate JSON data to the client and render it dynamically. But if you want to use XML, then let's talk about it:
I should mention that this comparison is really from the perspective of using them in a browser with JavaScript. It's not the way either data format has to be used, and there are plenty of good parsers which will change the details to make what I'm saying not quite valid.
JSON is both more compact and (in my view) more readable - in transmission it can be "faster" simply because less data is transferred.
In parsing, it depends on your parser. A parser turning the code (be it JSON or XML) into a data structure (like a map) may benefit from the strict nature of XML (XML Schemas disambiguate the data structure nicely) - however in JSON the type of an item (String/Number/Nested JSON Object) can be inferred syntactically, e.g:
// JSON text as it would arrive from the server (JSON.parse expects a string)
myJSON = '{"age": 12, "name": "Danielle"}';
The parser doesn't need to be intelligent enough to realise that 12 represents a number, (and Danielle is a string like any other). So in javascript we can do:
anObject = JSON.parse(myJSON);
anObject.age === 12 // True
anObject.name == "Danielle" // True
anObject.age === "12" // False
In XML we'd have to do something like the following:
<person>
<age>12</age>
<name>Danielle</name>
</person>
(as an aside, this illustrates the point that XML is rather more verbose; a concern for data transmission). To use this data, we'd run it through a parser, then we'd have to call something like:
myObject = parseThatXMLPlease();
thePeople = myObject.getChildren("person");
thePerson = thePeople[0];
thePerson.getChildren("name")[0].value() == "Danielle" // True
thePerson.getChildren("age")[0].value() == "12" // True
Actually, a good parser might well type the age for you (on the other hand, you might well not want it to). What's going on when we access this data is - instead of doing an attribute lookup like in the JSON example above - we're doing a map lookup on the key name. It might be more intuitive to form the XML like this:
<person name="Danielle" age="12" />
But we'd still have to do map lookups to access our data:
myObject = parseThatXMLPlease();
age = myObject.getChildren("person")[0].getAttr("age");

Creating a short, decryptable URL with no back end

I'm creating a front-end React application which has no back-end logic. In this application, users can enter data into multiple form fields. I'm wanting to allow users to link directly to the application with their fields filled in based on data stored in the URL. The problem is, I don't know if there's a way of generating a nice looking URL with JavaScript alone.
Ideally I'm wanting the end-result to look something like this:
http://example.com/#/.../sdf098sdfipodfi0sf3j
...when the user loads up that URL, it should decrypt the string and restore the saved data (how resources like Codepen and JSFiddle allow linking directly to results, but without a back-end or database).
Here's an example of the data I have:
{
"Episodes": [
{
"Id":"1",
"Age":"25",
"SEX":"1",
"Diagnosis":["1","2","3","4","5","6","7","8","9","10"],
"Procedure":["1","2","3","4","5","6","7","8","9","10"]
}
]
}
The problem I'm having is that there can be:
Up to 250 Episodes.
Up to 999 Diagnoses and Procedures within each.
The above requirements are absolute extremes and for the most part there will probably only ever be around 5 or 6 Episodes each with around 10 Diagnoses and 4 or 5 Procedures.
If I stringify and then encode the JSON object and stick that in the URL, I'll end up with an incredibly ugly output riddled with % symbols:
http://example.com/#/.../%7B%22Episodes%22:%5B%7B%22Id%22:%221%22,%22Age%22:%2225%22,%22SEX%22:%221%22,%22Diagnosis%22:%5B%221%22,%222%22,%223%22,%224%22,%225%22,%226%22,%227%22,%228%22,%229%22,%2210%22%5D,%22Procedure%22:%5B%221%22,%222%22,%223%22,%224%22,%225%22,%226%22,%227%22,%228%22,%229%22,%2210%22%5D%7D%5D%7D
If I go further and convert that into Base64 using btoa, I get something which looks more like a shortened URL, but bearing in mind this is with the demo data I've provided above, this becomes way too long:
http://example.com/#/.../JTdCJTIyRXBpc29kZXMlMjI6JTVCJTdCJTIySWQlMjI6JTIyMSUyMiwlMjJBZ2UlMjI6JTIyMjUlMjIsJTIyU0VYJTIyOiUyMjElMjIsJTIyRGlhZ25vc2lzJTIyOiU1QiUyMjElMjIsJTIyMiUyMiwlMjIzJTIyLCUyMjQlMjIsJTIyNSUyMiwlMjI2JTIyLCUyMjclMjIsJTIyOCUyMiwlMjI5JTIyLCUyMjEwJTIyJTVELCUyMlByb2NlZHVyZSUyMjolNUIlMjIxJTIyLCUyMjIlMjIsJTIyMyUyMiwlMjI0JTIyLCUyMjUlMjIsJTIyNiUyMiwlMjI3JTIyLCUyMjglMjIsJTIyOSUyMiwlMjIxMCUyMiU1RCU3RCU1RCU3RA==
Is there a nicer way I can condense a lot of information into a relatively small string which would fit nicely in a URL, without any back-end logic?
Ideally I'm wanting a native JavaScript solution, but I appreciate that may not be possible.
How uniform is the data structure?
I can see that with the example you've given above:
{
"Episodes": [
{
"Id":"1",
"Age":"25",
"SEX":"1",
"Diagnosis":["1","2","3","4","5","6","7","8","9","10"],
"Procedure":["1","2","3","4","5","6","7","8","9","10"]
}
]
}
You could reconstruct that with:
http://example.com/#/.../E1I1A25S1D10P10
But that's still a 15-character shortcode for 1 episode... so if there were 6 episodes, that would increase to around 90 characters.
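If the structure really is that uniform, a lossless variant of the same idea can be sketched as follows. The format here is an assumption made up for illustration: fields are joined with ".", list items with "-" and episodes with "~", all of which are safe in a URL fragment without percent-encoding. It also assumes every value consists of digits only, as in the example data. Note that this is encoding rather than encryption, so anyone can read the values back.

```javascript
// Encode { Episodes: [...] } into a compact, URL-safe string (hypothetical format).
function encodeEpisodes(data) {
  return data.Episodes.map((ep) =>
    [ep.Id, ep.Age, ep.SEX, ep.Diagnosis.join('-'), ep.Procedure.join('-')].join('.')
  ).join('~');
}

// Reverse of encodeEpisodes: rebuild the original object from the string.
function decodeEpisodes(code) {
  if (code === '') return { Episodes: [] };
  const toList = (s) => (s ? s.split('-') : []);
  return {
    Episodes: code.split('~').map((part) => {
      const [Id, Age, SEX, diagnosis, procedure] = part.split('.');
      return { Id, Age, SEX, Diagnosis: toList(diagnosis), Procedure: toList(procedure) };
    }),
  };
}

const data = {
  Episodes: [
    {
      Id: '1',
      Age: '25',
      SEX: '1',
      Diagnosis: ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
      Procedure: ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
    },
  ],
};

const code = encodeEpisodes(data);
console.log(code); // 1.25.1.1-2-3-4-5-6-7-8-9-10.1-2-3-4-5-6-7-8-9-10
console.log(JSON.stringify(decodeEpisodes(code)) === JSON.stringify(data)); // true
```

The string still grows linearly with the number of codes, so the extreme case (250 Episodes with 999 entries each) will not fit in a practical URL with this or any similar scheme.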

Parse.com relations count

I want to query objects from the Parse DB through JavaScript that have exactly one object in some specific relation. How can this criterion be achieved?
I tried something like the code below, but equalTo() acts as a "contains", which is not what I'm looking for. My code so far, which doesn't work:
var query = new Parse.Query("Item");
query.equalTo("relatedItems", someItem);
query.lessThan("relatedItems", 2);
It seems Parse does not provide an easy way to do this.
Without any other fields, if you know all the items then you could do the following:
var innerQuery = new Parse.Query('Item');
innerQuery.containedIn('relatedItems', [all items except someItem]);
var query = new Parse.Query('Item');
query.equalTo('relatedItems', someItem);
query.doesNotMatchKeyInQuery('objectId', 'objectId', innerQuery);
...
Otherwise, you might need to get all records and do filtering.
Update
Because the field is of type Relation, there is no way to include the relation content in the results; you need to do another query to get the relation content.
A workaround might be to add an itemCount column, keep it updated whenever the item relation is modified, and do:
query.equalTo('relatedItems', someItem);
query.equalTo('itemCount', 1);
There are a couple of ways you could do this.
I'm working on a project now where I have cells composed of users.
I currently have an afterSave trigger that does this:
const count = await cell.relation("members").query().count();
cell.put("memberCount",count);
This works pretty well.
There are other ways that I've considered in theory, but I've not used them yet.
The right way would be to hack the ability to use select with dot notation to grab a virtual field called relatedItems.length in the query, but that would probably only work for me because I use Postgres ... Mongo seems to be extremely limited in its ability to do this sort of thing, which is why I would never make a database out of blobs of JSON in the first place.
You could do a similar thing with an afterFind trigger. I'm experimenting with that now. I'm not sure if it will confuse Parse to get an attribute back which does not exist in its schema, but I'll find out by the end of today. I have found that if I jam an artificial attribute into the objects in the trigger, they are returned along with the other data. What I'm not sure about is whether Parse will decide that the object is dirty, or, worse, decide that I'm creating a new attribute and store it to the database ... which could be filtered out with a beforeSave trigger, but not until after the data had all been sent to the cloud.
There is also a place where I had to do several queries from several tables, and would have ended up with a lot of redundant data. So I wrote a cloud function which did the queries and then returned a couple of lists of objects, and a few lists of objectId strings which served as indexes. This worked pretty well for me. And tracking the last load time and sending it back when I needed to update my data allowed me to limit myself to objects which had changed since my last query.

Lucene-like searching through JSON objects in JavaScript

I have a pretty big array of JSON objects (it's a music library with properties like artist, album etc., feeding a jqGrid with loadonce=true) and I want to implement Lucene-like (Google-like) querying over the whole set, but locally, i.e. in the browser, without communication with the web server. Are there any JavaScript frameworks that will help me?
Go through your records to create a one-time index by combining all searchable fields into a single string field called index.
Store these indexed records in an Array.
Partition the Array on index, e.g. all a's in one array and so on.
Use the JavaScript function indexOf() against the index to match the query entered by the user and find records from the partitioned Array.
That was the easy part, but it will support all simple queries in a very efficient manner, because the index does not have to be re-created for every query and the indexOf operation is very efficient. I have used it for searching up to 2000 records. I used a pre-sorted Array. Actually, that's how Gmail and Yahoo Mail work. They store your contacts in the browser in a pre-sorted array with an index that allows you to see the contact names as you type.
This also gives you a base to build on. Now you can write an advanced query parsing logic on top of it. For example, to support a few simple conditional keywords like - AND OR NOT, will take about 20-30 lines of custom JavaScript code. Or you can find a JS library that will do the parsing for you the way Lucene does.
For a reference implementation of above logic, take a look at how ZmContactList.js sorts and searches the contacts for autocomplete.
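The indexing steps above can be sketched roughly as follows. This is a simplified illustration rather than the ZmContactList.js implementation: the function names buildIndex and search are made up, the partitioning step is left out, and a multi-word query is treated as an implicit AND over its terms.

```javascript
// Build a lowercase "index" string per record from the searchable fields.
function buildIndex(records, fields) {
  return records.map((record) => ({
    record,
    index: fields.map((f) => String(record[f] ?? '')).join(' ').toLowerCase(),
  }));
}

// Return the records whose index contains every whitespace-separated term.
function search(indexed, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return indexed
    .filter((entry) => terms.every((term) => entry.index.indexOf(term) !== -1))
    .map((entry) => entry.record);
}

const library = [
  { artist: 'Miles Davis', album: 'Kind of Blue' },
  { artist: 'John Coltrane', album: 'Blue Train' },
  { artist: 'Bill Evans', album: 'Waltz for Debby' },
];

// The index is built once and reused for every query.
const indexed = buildIndex(library, ['artist', 'album']);

console.log(search(indexed, 'blue').map((r) => r.artist));       // [ 'Miles Davis', 'John Coltrane' ]
console.log(search(indexed, 'blue train').map((r) => r.artist)); // [ 'John Coltrane' ]
```

An empty query matches every record in this sketch. OR and NOT keywords would need a small query parser on top of search.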
You might want to check FullProof, it does exactly that:
https://github.com/reyesr/fullproof
Have you tried CouchDB?
Edit:
How about something along these lines (also see http://jsfiddle.net/7tV3A/1/):
var filtered_collection = [];
var query = 'foo';
$.each(collection, function(i, e){
    $.each(e, function(ii, el){
        if (el == query) {
            filtered_collection.push(e);
        }
    });
});
The (el == query) part of course could/should be modified to allow more flexible search patterns than exact match.
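For instance, a case-insensitive substring match could look like the sketch below. It uses plain array methods instead of jQuery, the name filterCollection is made up for illustration, and it adds each object at most once even if several of its fields match.

```javascript
// Keep every object that has at least one field containing the query (case-insensitive).
function filterCollection(collection, query) {
  const needle = String(query).toLowerCase();
  return collection.filter((item) =>
    Object.values(item).some((value) => String(value).toLowerCase().includes(needle))
  );
}

// Hypothetical sample data.
const collection = [
  { a: 'Foo', b: 'bar' },
  { a: 'baz', b: 'food' },
  { a: 'qux', b: 'quux' },
  { a: 'foo', b: 'FOO' },
];

console.log(filterCollection(collection, 'foo').length); // 3
```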
