Self-describing RESTful API - JavaScript

I am building a (RESTful) API (over HTTP) that I want to consume with JavaScript.
I find myself writing things in JavaScript like
function getPost(id)
{
    $.getJSON("/api/post/" + id, function (data) {
        // Success!
    });
}
There must be a smarter way than hardcoding the API in JavaScript. Maybe I could query the API itself for what the getPost URL should look like?
function getApiPostUrl()
{
    $.getJSON("/api/getpost/");
}
returning something like
url: "/api/post/:id"
which JavaScript can parse to obtain the URL for actually getting the post with id = :id. I like this approach.
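For example, the consuming code could look roughly like this (a sketch of what I have in mind; the endpoint and response shape are just illustrations):

// A sketch of the idea: ask the API for the URL template, then fill in
// the id (endpoint and response shape are illustrative).
function getPost(id)
{
    $.getJSON("/api/getpost/", function (meta) {
        var url = meta.url.replace(":id", encodeURIComponent(id));
        $.getJSON(url, function (data) {
            // Success!
        });
    });
}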
Is there a standard way of doing this? I am looking for a good approach so I don't have to invent it all, if a good solution already exists.

Well, by definition, a RESTful API should contain full URIs (Resource Identifiers), not just resource paths. So your question is really about how you design your whole API.
So, for example, your API could contain a http://fqdn/api/posts resource that lists all the posts on your site, e.g.:
[ "http://fqdn/api/posts/1",
  "http://fqdn/api/posts/2",
  "http://fqdn/api/posts/3" ]
and then your JavaScript only iterates over the values in the list, never needing to craft the path for each resource. You only need one known entry point. This is the HATEOAS concept, which uses hyperlinks as the API that identifies the states of the application.
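For example, the client could then do something like this sketch (using jQuery, as in your question):

// A sketch: fetch the entry point, then follow each hyperlink in the
// returned list instead of crafting paths by hand.
$.getJSON("http://fqdn/api/posts", function (postUrls) {
    $.each(postUrls, function (i, url) {
        $.getJSON(url, function (post) {
            // Render the post...
        });
    });
});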
All in all, it's a good idea to think your application through (you can use UML tools such as state machine or sequence diagrams) so that you can cover all your use cases with a simple yet efficient set of sequences defining your API. Then for each sequence it's a good idea to have a single first state, and you can have a single first step linking to all the sequences.
Resources:
ACM Article
RESTful API Design Second Edition slides
RESTful API design blog

Yes, there are quite a few standard ways of doing this. What you want to look for is "hypermedia APIs": APIs with embedded hypermedia elements such as link templates like yours, but also plain links and even more advanced actions (the API analogue of forms).
Here is an example representation using Mason to embed a link template in a response:
{
  "id": 1234,
  "title": "This is issue no. 1",
  "#link-templates": {
    "is:issue-query": {
      "template": "http://issue-tracker.org/mason-demo/issues-query?text={text}&severity={severity}&project={pid}",
      "title": "Search for issues",
      "description": "This is a simple search that does not check attachments.",
      "parameters": [
        {
          "name": "text",
          "description": "Substring search for text in title and description"
        },
        {
          "name": "severity",
          "description": "Issue severity (exact value, 1..5)"
        },
        {
          "name": "pid",
          "description": "Project ID"
        }
      ]
    }
  }
}
The URL template format is standardized in RFC 6570.
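For simple templates like the one above, a hand-rolled expansion is only a few lines (a sketch; a proper RFC 6570 library handles many more cases, such as list and reserved expansion):

// A sketch of naive {name} expansion; not a full RFC 6570 implementation.
function expandTemplate(template, vars) {
    return template.replace(/\{(\w+)\}/g, function (match, name) {
        return name in vars ? encodeURIComponent(vars[name]) : "";
    });
}

var url = expandTemplate(
    "http://issue-tracker.org/mason-demo/issues-query?text={text}&severity={severity}&project={pid}",
    { text: "crash", severity: 3, pid: 42 });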
Mason is not the only media type available for hypermedia APIs. There are also HAL, Siren, Collection+JSON, and Hydra.
And here is a discussion about the benefits of hypermedia.

Your code clearly violates the self-descriptive messages and the hypermedia as the engine of application state (HATEOAS) parts of REST's uniform interface constraint.
According to HATEOAS you should send back hyperlinks, and the client should follow them, so it won't break when the API changes. A hyperlink is not the same as a URL: it contains a URL, an HTTP method, maybe the content type of the body, possibly input fields, and so on.
According to self-descriptive messages you should add semantics to the data, the links, the input fields, etc. The client should understand those semantics and behave accordingly. So, for example, you can attach an API-specific "create-post" link relation to your hyperlink, and the client will understand that it is for creating posts. Your client should always rely on this kind of semantics instead of parsing URLs.
URLs are always API-specific; semantics are not necessarily, so these constraints decouple the client from the API. After that the client won't break on URL changes, or even on data-structure changes, because it will use a standard hypermedia format (for example HAL, JSON-LD, Atom, or even HTML) and semantics (probably RDF) to parse the response body.
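To make this concrete, a client that follows links by relation name could look like this sketch (the _links structure here is a HAL-style assumption, not something your API already exposes):

// A sketch: resolve a hyperlink by its relation name ("create-post")
// instead of hard-coding the URL; the "_links" shape is an assumption.
function follow(resource, rel, body) {
    var link = resource._links[rel];
    return $.ajax({
        url: link.href,
        method: link.method || "GET",
        contentType: "application/json",
        data: body && JSON.stringify(body)
    });
}

// Usage: follow(post, "create-post", { title: "Hello" });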

Related

Wikipedia (API) "List_of" page contents - Parse to JSON

My question is simple: how can I get a JSON structure of all list items on any Wikipedia page whose title begins with "List of"? If that is not feasible through the Wikipedia API, what is the best way to parse the wiki HTML/XML into what I need? (Note: the parsing does not have to be perfect.)
There are roughly 225,000 of these pages and they mostly seem to follow one of these four styles. For example:
https://en.wikipedia.org/wiki/List_of_Star_Trek%3A_The_Next_Generation_episodes
https://en.wikipedia.org/wiki/List_of_car_brands
https://en.wikipedia.org/wiki/List_of_Presidents_of_the_United_States
https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_goalscorers
Specifically, I would like something I can use, like:
Star Trek: Next Generation episodes->
season 1->
Encounter at Farpoint
Encounter at Farpoint
The Naked Now
...
season 2->
The Child
Where Silence Has Lease
Elementary, Dear Data
...
...
...
The closest solution I have come up with so far is Axios calls to the Wikipedia API's parse endpoint, which I would need to run for each section. Furthermore, despite setting the JSON format parameter, I still receive the list items as XML or HTML in the "text" property of the returned JSON. Parsing this becomes difficult across all the different page types. Any suggestions on how to parse the various wiki list types would be helpful if a JSON return is not possible.
Any suggestions to accomplish my goal? I am using Vue.js with Node.js.
Maybe there is a library that could help?
Maybe a GET request on the URL to fetch a full HTML dump would work better?
Maybe there is a wiki dump of just the list pages that I could parse into Firestore?
The concept of Wikidata solves this issue, however it is still nowhere near the maturity level needed to provide much value. In maybe 3-5 years it could avoid this problem altogether.
At this time, the quick and dirty way to answer this question is to just grab all the links on a given Wikipedia page through the API, then either filter programmatically or have the user do so. This works because the vast majority of the Star Trek episodes, presidents, and car brands on a given list are linked to their individual Wikipedia pages.
I used the following API query to get all links on a Wikipedia page (using its pageid):
// Query the MediaWiki API for all links on the page (generator=links).
axios({
  method: 'get',
  url: 'https://en.wikipedia.org/w/api.php',
  params: {
    action: 'query',
    format: 'json',
    prop: 'pageterms|pageimages',
    origin: '*',
    generator: 'links',
    gpllimit: '500',
    redirects: 'true',
    pageids: pageidIn,
    piprop: 'thumbnail',
    formatversion: 2
  }
})
Then I save off response.data.query.pages[i].terms.description and response.data.query.pages[i].title to a class object of results.
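That step could look like this sketch (collectResults is a hypothetical helper name; it assumes the formatversion=2 response shape, where terms.description is an array of strings):

// A sketch of pulling title/description pairs out of the axios response.
function collectResults(response) {
  return response.data.query.pages
    .filter(page => page.terms && page.terms.description)
    .map(page => ({
      title: page.title,
      description: page.terms.description[0]
    }));
}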
Then I added an additional search field so the user can filter their prior results. If they enter "episode", it will get me what I need, since the word "episode" is typically in the response.data.query.pages[i].terms.description field of the page.
The only drawback is that this solution won't return list entries that don't have their own wiki page. But for the sake of simplicity, I will accept that.

Is it considered bad practice to have random/optional properties in a JSON response?

I'm currently developing a Node/Express/MongoDB Restful Service.
Now, since MongoDB doesn't have "columns", it can easily happen that a response from the same endpoint contains a specific property for one resource but not another, e.g.
# GET /users/1
{"name": "Alexander", "nickname": "Alex"}
# GET /users/2
{"name": "Simon"}
While this makes no difference in weakly typed languages like JavaScript, one of my coworkers, who is implementing a client in C#, struggles to parse the JSON string when properties are missing.
IMO, the current approach is better from the API perspective, since it means better performance, less code, and less traffic on the server side. Otherwise I would need to normalize objects before sending them, or even run migrations every time a property gets added. It also avoids sending "virtual" data which doesn't even exist on the resource.
But on the other hand, I also want to build a solid service from the client's perspective, and "normalizing" on the client side is at least as bad as on the server side.
There's also another use case, which works well in JS but will cause problems in C# and relates to the same problem:
# GET /users/1/holidays
{
  "2018-12-25": { "title": "Christmas" },
  "2019-01-01": { "title": "New Year" }
}
I took this approach to automatically prevent multiple entries for the same day. But I could understand if this is really considered bad practice.
Update
As commented by @jfriend00, the second example is not that handy. So I won't use it anymore.
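For reference, flattening the date-keyed map into an array shape could look like this (a sketch; holidaysByDate stands for the object shown above):

// A sketch: turn the date-keyed object into an array that a statically
// typed client can map onto a fixed class ("holidaysByDate" is the
// object from the example above).
const holidays = Object.keys(holidaysByDate).map(date => ({
    date: date,
    title: holidaysByDate[date].title
}));
// => [ { date: "2018-12-25", title: "Christmas" }, ... ]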

How to get Wikipedia article data from Freebase suggest response

I'm looking for a simple way to get data about a university (name, native_name, city, etc.) from the Wikipedia infobox after a user selects a university from Freebase Suggest. The dataset returned from Freebase, however, is very small and unfortunately doesn't include the Wikipedia link.
Currently I am using the "name" property and making an AJAX request to http://live.dbpedia.org/data/"+name+".json. This often works, but some tests showed that the name doesn't always map directly to the correct page. Let me split my question into parts to make myself clear:
Is it possible to configure the Freebase Suggest plugin so that the response includes the Wikipedia link?
OR is there a similar plugin that queries DBpedia directly and is as simple and user-friendly as Freebase's?
OR, as a plan B, is there a way to send a request to "live.dbpedia.org" so that it only returns the JSON after redirects? With the Wikipedia API I can send a "redirects" parameter that does this. But then I'd have to parse the data myself…
The problem with plan B is that nothing guarantees that the Freebase object's name will ever lead me to the correct Wikipedia page, even after the redirects…
I swear I've read a lot of API documentation, but everything is extremely confusing, and I chose not to read long tutorials about RDF, SPARQL, and MQL because I really don't think the solution should be this complicated. I'm asking here because I really hope I'm missing a simple solution…
UPDATE
{
  "id": "/en/babes-bolyai_university",
  "lang": "en",
  "mid": "/m/08c4bf",
  "name": "Babeş-Bolyai University",
  "notable": {
    "id": "/education/university",
    "name": "College/University"
  },
  "score": 37.733559
}
This is the result I get after selecting "Babeş-Bolyai University" in the suggest widget.
SOLUTION
I assumed I can't configure the Suggest widget to return more data, so after getting the Freebase ID of the object I just send another request, this time with a query specifically asking for the Wikipedia ID. I didn't know any MQL and couldn't find the name of the Freebase field holding the Wikipedia ID. Maybe I'm stupid, but the Freebase documentation really confuses me. In any case, Tom Morris' answer and this question helped me build the query that returned what I wanted:
https://www.googleapis.com/freebase/v1/mqlread?query={"id":"/en/babes-bolyai_university","key":{"namespace":"/wikipedia/en_title","value":null}}
The strings in the result come with numeric codes for special Unicode characters, though (in my case Babe$0219-Bolyai_University). I've been able to convert a single code with String.fromCharCode(parseInt("0219", 16)), but if someone knows of a way to convert the whole string, that would be helpful. Otherwise I can just write my own function replacing the "$dddd" pattern with the corresponding character.
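For the record, such a function could look like this (a sketch assuming every escape is "$" followed by exactly four hex digits, per the MQL key escaping scheme):

// A sketch: unescape MQL keys by decoding each "$dddd" hex escape.
function unescapeMqlKey(key) {
    return key.replace(/\$([0-9A-Fa-f]{4})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// unescapeMqlKey("Babe$0219-Bolyai_University") === "Babeș-Bolyai_University"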
Thanks for the help!
There isn't a DBpedia autosuggest comparable to Freebase Suggest, as far as I'm aware.
Anything that's in Freebase can be retrieved with Suggest by using an MQL expression in the output parameter. For simple things, e.g. names and aliases, the MQL is basically just a JSON snippet containing the relevant property name.
EDIT: The output parameter doesn't actually appear to be documented in the context of Suggest, but anything that isn't a Suggest parameter gets passed through transparently to the Freebase Search API, so you can use all of the stuff described here: https://developers.google.com/freebase/v1/search-output You can get as much or as little information as you require returned with each suggestion.
If you do need to query DBpedia, you should be using the Wikipedia/DBpedia key, which is not necessarily the same as the name. For English Wikipedia, the key is in the namespace /wikipedia/en, or, if you want the numeric Wikipedia ID, in the namespace /wikipedia/en_id. Replace the 'en' with the appropriate language code to query other-language Wikipedias. These keys have some non-ASCII characters escaped; if you need to unescape them, see the documentation here: http://wiki.freebase.com/wiki/MQL_key_escaping
You can update the "flyout_service_path" parameter. There is a description in the Freebase Suggest documentation (https://developers.google.com/freebase/v1/suggest). I'm using this configuration to get all the keys of an entity:
$(inputClass).suggest({
    "key": _FREEBASE_API_KEY_BROWSER_APP,
    "flyout_service_path": "/search?filter=(all mid:${id})&output=(notable:/client/summary description type /type/object/key)&key=${key}"
}).bind("fb-select", this.fbSelectedHandler);
In the Freebase response I can now see, in the "output" parameter, the "/type/object/key" property with all the keys of the entity (Wikipedia, etc.).
My question now is how I can get at this data from the output. In the "fb-select" event, the "data" variable doesn't carry these fields.
Some help, please…

Should JSON data in a RESTful response contain the object type information?

Let's say we want to populate some JavaScript models (e.g. Backbone.js models), given a JSON response from a server like this:
{
  "todo": {
    "title": "My todo",
    "items": [
      { "body": "first item." },
      { "body": "second item" }
    ]
  }
}
This data does not contain any type information, so we do not know which model to populate when we see the "todo" key.
Of course, one can create some custom convention to link the keys in the JSON response object to the client-side models. For instance:
{
  "todo": {
    "_type": "Todo",
    "title": "My todo",
    ...
  }
}
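For concreteness, the client side could consume that "_type" key with something like the following sketch (the registry of model classes is hypothetical):

// A sketch: look up the client-side model class by the "_type" field.
var MODELS = { "Todo": TodoModel, "TodoItem": TodoItemModel };

function hydrate(json) {
    var Model = MODELS[json._type];
    return Model ? new Model(json) : null;
}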
While this works for objects, it gets awkward when it comes to lists:
"items": {
"_type": "TodoItem",
"_value": [
{ "body": "first item." },
{ "body": "second item"}
]
}
Before creating these custom rules, the questions are:
Are there any RESTful guidelines on including client-side type information in response data?
If not, is it a good idea to include the client-side type information in the response JSON?
Besides this whole approach of populating models, what are the alternatives?
Edit
While the model type can be derived from the URL, e.g. /todo and /user, the problem with this approach is that the initial population of N models would mean N HTTP requests.
Instead, the initial population can be done from a single big merged tree with only one request. In this case, the model type information in the URL is lost.
A different endpoint (URL) is used for each REST object, so the URL includes the "which model" information.
And each model is a fixed collection of variables with (fixed) types.
So there is usually no need to send dynamic type information over the wire.
Added re the comment from @ali--:
Correct. But you're now asking a different, more precise question: "How do I handle the initial load of Backbone models without causing many HTTP requests?" I'm not sure of the best answer. One way would be to tell Backbone to download multiple collections of models.
That would reduce the number of calls to one per model vs. one per model instance.
A second way would be a non-REST call/response to download the current tree of data from the server. This is a fine idea. The browser client can receive the response and then feed it model by model into Backbone. Be sure to give the user some feedback about what's going on.
Re: nested models, here's an SO question on it.
Consider that, as already said in other answers, in REST each resource has its own endpoint, and thus what you are trying to do (i.e. hide all your models behind a single endpoint) is not fully REST-compliant, IMHO. Not a big deal per se.
Nested collections could be the answer here.
The "wrapper" collection fetches all the models from a single endpoint at init time and pushes them into the respective collections. Of course, you must send the type info in the JSON.
From that point on, each "inner" collection reacts to its own events and deals with its own endpoint.
I don't see huge problems with such an optimization, as long as you are aware of it.
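A minimal sketch of that wrapper, assuming a hypothetical /api/bootstrap endpoint whose items carry a "_type" field, and existing todos and items collections to dispatch into:

// Fetch everything once, then push each item into the collection
// registered for its "_type" (endpoint and field names are assumptions).
var BootstrapCollection = Backbone.Collection.extend({
    url: '/api/bootstrap'
});

var wrapper = new BootstrapCollection();
wrapper.fetch({
    success: function (collection) {
        var targets = { Todo: todos, TodoItem: items };
        collection.each(function (model) {
            var target = targets[model.get('_type')];
            if (target) target.add(model.attributes);
        });
    }
});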
REST has nothing to do with the content sent back and forth; it only deals with how state is transferred. JSON (which is the format you seem to be using) would be the one to indicate what needs to be sent, and as far as I know, it doesn't dictate that.
Including the type info in the JSON payload really depends on the libraries you are using. If including the types makes the JSON easier for you to use, then I would say put them in. If not, leave them out.
It's really useful when you have a model that extends another; indicating which model specifically to use eliminates the confusion.

Requesting data with JavaScript

I am creating a micro site that uses JavaScript for data visualisation. I am working with a back-end developer who will pass me customer data to be displayed on the front end, in order to graph and display different customer attributes (like age, sex, and total $ spent).
The problem I am having is that the developer is asking me what data I want, and I have no idea what to tell them. I don't know what I need or want to request; in the past I have always just taken data or content and marked it up. It's a new project for me and I am feeling a little bit out of my depth.
Edit:
After thinking about this further and working a little with the back-end developer, the specific problem I am having is how to make the actual AJAX requests and update the results on my page. I know specifically that I am looking for things like age, sex, and $ spent, but I need to focus more on how to request them.
If you're using jQuery, you can make asynchronous data requests (AJAX requests) in the JSON format using .getJSON, which makes processing the response quite easy.
You can ask the back-end developer to create a RESTful API which returns whichever data you need in JSON format. As for the data itself, tell him to include whatever you think you will need, or may need in the future. Once you process the JSON data you can determine what you actually use. Don't go overboard and ask him to return stuff you'll never use, though, or you'll just waste bandwidth.
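For example, a request could look like this sketch (the /customers URL and the response shape are assumptions to be agreed with your back-end developer):

// A sketch: fetch customers and render one line each; assumes a response
// like { "customers": [ { "age": ..., "sex": ..., "dollars-spent": ... } ] }.
$.getJSON('/customers', function (data) {
    $.each(data.customers, function (i, customer) {
        $('#results').append(
            '<li>' + customer.age + ', ' + customer.sex +
            ', $' + customer['dollars-spent'] + '</li>');
    });
});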
If you work with JavaScript, then the data format JavaScript understands natively is JSON. If they can provide you with data in JSON format, it would be a good start:
http://en.wikipedia.org/wiki/Json
{
  "customers": [
    {
      "age": "23",
      "sex": "male",
      "dollars-spent": "7"
    },
    {
      "age": "22",
      "sex": "female",
      "dollars-spent": "10000"
    }
  ]
}
I would guess you will need something like a customer ID together with age and sex, so that you can uniquely identify each customer.
