Data structure visualizer for JavaScript

I am looking for a javascript library which renders an arbitrary (but acyclic) JSON data blob into some sort of semi-interactive HTML view. James Padolsey's Prettyprint library comes close, but its output is very verbose ("object" and "array" headers on everything, for instance), is only marginally interactive (the ability to collapse and expand subtrees would be nice, especially) and not particularly customizable. I also found jstree, but it looks like that doesn't do arbitrary JSON data blobs, only ones specifically constructed to be fed to it. Also, a strict treeview is not really right for the data I have; I want more of a key/value presentation (but with some values being nested objects).
I do not need the ability to modify the data structure, just show it in a more-or-less human readable fashion.
Any suggestions?

I have a small project to display JS objects.
It's not very pretty and could use some improvements, but it might help out a little.
It is built on jquery-1.4.2.min.js but should work with older versions.
Files:
http://empirium.dnet.nu/js/object-browser.js
http://empirium.dnet.nu/js/object-browser.css
This is an example on how to use it:
http://empirium.dnet.nu/OBTest.html
Clicking on the bold black type will open and close complex data structures that are not immediately visible.
I hope you have some use for it, and if you have any suggestions please just comment here.
It's not an active project, just something I wrote to do some debugging.
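For anyone who just needs the general idea rather than that specific library, here is a minimal sketch of the same approach: recursively render key/value rows and toggle nested objects on click. It assumes only jQuery; the function name and CSS class names are invented for illustration and are not the object-browser API.

// Minimal, generic sketch of a collapsible key/value renderer (not the object-browser API).
// Assumes jQuery; renderValue() and the CSS class names are invented.
function renderValue(value) {
  if (value === null || typeof value !== 'object') {
    return $('<span class="ob-leaf">').text(String(value)); // leaf: plain text
  }
  var $children = $('<div class="ob-children">');
  $.each(value, function (key, child) {               // works for arrays and objects
    $('<div class="ob-row">')
      .append($('<span class="ob-key">').text(key + ': '))
      .append(renderValue(child))
      .appendTo($children);
  });
  var $toggle = $('<span class="ob-toggle">')
    .text($.isArray(value) ? '[…]' : '{…}')
    .css('cursor', 'pointer')
    .click(function () { $children.toggle(); });      // expand/collapse this subtree
  return $('<span>').append($toggle).append($children);
}

// Usage: $('#output').append(renderValue({ name: 'demo', tags: ['a', 'b'], nested: { x: 1 } }));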

Related

Best way to scrape a set of pages with mixed content

I'm trying to show a list of lunch venues around the office with their menus for today. But the problem is that the websites offering the lunch menus don't always offer the same kind of content.
For instance, some of the websites offer a nice JSON output. Look at this one: it offers the English/Finnish course names separately, and everything I need is available. There are a couple of others like this.
But others don't always have such a nice output. Like this one. The content is laid out in plain HTML, the English and Finnish food names are not exactly ordered, and food properties (L, VL, VS, G, etc.) are just normal text, like the food name.
What, in your opinion, is the best way to scrape all this data in its different formats and turn it into usable data? I tried to make a scraper with Node.js (& phantomjs, etc.), but it only works with one website, and it's not that accurate with the food names.
Thanks in advance.
You may use something like kimonolabs.com; it is much easier to use, and it gives you APIs to update your site.
Remember that it is best for tabular data content.
There may be simple algorithmic solutions to the problem. If there is a list of all available food names, that can be really helpful: you just look for occurrences of each food name inside a document (for today).
If there is no such food list, you may use TF/IDF. TF/IDF lets you calculate the score of a word inside a document relative to both the current document and the other documents. But this solution needs enough data to work.
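A rough sketch of that TF/IDF scoring in JavaScript, assuming the documents are plain strings and using a naive whitespace tokenizer:

// Naive TF/IDF sketch: score a term in one document relative to a set of documents.
function termFrequency(term, doc) {
  var words = doc.toLowerCase().split(/\s+/);
  var count = 0;
  for (var i = 0; i < words.length; i++) {
    if (words[i] === term) count++;
  }
  return count / words.length;
}

function inverseDocumentFrequency(term, docs) {
  var containing = docs.filter(function (d) {
    return d.toLowerCase().indexOf(term) !== -1;
  }).length;
  return Math.log(docs.length / (1 + containing)); // +1 avoids division by zero
}

function tfIdf(term, doc, docs) {
  return termFrequency(term, doc) * inverseDocumentFrequency(term, docs);
}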
I think the best solution is something like this (a rough sketch follows the list):
Creating a list of all available websites that should be scraped.
Writing a driver class for each website's data.
Each driver has the duty of creating the general domain entity from its site's document.
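For illustration, a minimal JavaScript sketch of that driver idea; the site URLs, field names, and the MenuEntry shape are all invented, not taken from any real site:

// Each driver turns its site's raw document into the same normalized entity.
function MenuEntry(venue, courseNameEn, courseNameFi, properties) {
  this.venue = venue;
  this.courseNameEn = courseNameEn;
  this.courseNameFi = courseNameFi;
  this.properties = properties; // e.g. ['L', 'G']
}

// Driver for a site that already returns clean JSON.
var jsonSiteDriver = {
  url: 'http://example-json-site.invalid/menu.json',
  parse: function (rawJson) {
    var data = JSON.parse(rawJson);
    return data.courses.map(function (c) {
      return new MenuEntry('JSON Site', c.name_en, c.name_fi, c.properties || []);
    });
  }
};

// A driver for a plain-HTML site would instead do site-specific DOM scraping here.
var htmlSiteDriver = {
  url: 'http://example-html-site.invalid/lunch',
  parse: function (rawHtml) {
    // ...site-specific extraction, returning MenuEntry objects...
    return [];
  }
};

var drivers = [jsonSiteDriver, htmlSiteDriver];
// For each driver: fetch driver.url, then call driver.parse(raw) to get an array of MenuEntry.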
If you can use PHP, Simple HTML DOM Parser along with Guzzle would be a great choice. These two provide a jQuery-like way to find elements and a nice wrapper around HTTP.
You are touching on a really difficult problem. Unfortunately, there are no easy solutions.
Actually, there are two different parts to solve:
data scraping from different sources
data integration
Let's start with the first problem: data scraping from different sources. In my projects I usually process data in several steps. I have dedicated scrapers for all the specific sites I want, and process them in the following order:
fetch raw page (unstructured data)
extract data from page (unstructured data)
extract, convert and map data into a page-specific model (fully structured data)
map data from the fully structured model to a common/normalized model
Steps 1-2 are scraping oriented and steps 3-4 are strictly data-extraction / data-integration oriented.
While you can implement steps 1-2 relatively easily using your own web scrapers or by utilizing existing web services, data integration is the most difficult part in your case. You will probably require some machine-learning techniques (shallow, domain-specific Natural Language Processing) along with custom heuristics.
In the case of messy input like this one, I would process the lines separately, use a dictionary to get rid of the Finnish/English words, and analyse what is left. But in this case it will never be 100% accurate due to the possibility of human-input errors.
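One way that line-by-line heuristic could look in JavaScript; the dietary-code list and the word dictionary below are placeholders:

// Split each line into known dietary codes, dictionary words and leftovers.
var dietaryCodes = ['L', 'VL', 'VS', 'G', 'M'];
var dictionaryWords = ['keitto', 'soup', 'kana', 'chicken', 'salaatti', 'salad'];

function analyseLine(line) {
  var result = { properties: [], dictionary: [], leftover: [] };
  line.split(/\s+/).forEach(function (word) {
    var bare = word.replace(/[(),]/g, '');
    if (!bare) return;
    if (dietaryCodes.indexOf(bare) !== -1) {
      result.properties.push(bare);                 // e.g. "L", "G"
    } else if (dictionaryWords.indexOf(bare.toLowerCase()) !== -1) {
      result.dictionary.push(bare);                 // recognized word
    } else {
      result.leftover.push(bare);                   // what is left to analyse
    }
  });
  return result;
}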
I am also worried that your stack is not very well suited to such tasks. For this kind of processing I use Java/Groovy along with integration frameworks (Mule ESB / Spring Integration) in order to coordinate the data processing.
In summary: it is a really difficult and complex problem. I would rather accept less input-data coverage than aim to be 100% accurate (unless it is really worth it).

JSON diff of large JSON data, finding some JSON as a subset of another JSON

I have a problem I'd like to solve so that I don't have to spend a lot of manual work analyzing it instead.
I have 2 JSON objects (returned from different web service APIs or HTTP responses). There is intersecting data between the 2 JSON objects, and they share a similar, but not identical, JSON structure. One JSON object (the smaller one) is like a subset of the bigger JSON object.
I want to find all the intersecting data between the two objects. Actually, I'm more interested in the shared parameters/properties within the objects, not really the actual values of the parameters/properties of each object, because I want to eventually use data from one JSON output to construct the other JSON as input to an API call. Unfortunately, I don't have the documentation that defines the JSON for each API. :(
What makes this tougher is that the JSON objects are huge. One spans a page if you print it out in Windows Notepad; the other spans 37 pages. The APIs return the JSON output compressed onto a single line. Normal text compare doesn't do much; I'd have to reformat manually or with a script to break up the objects with newlines, etc. for a text compare to work well. I tried with the Beyond Compare tool.
I could do a manual search/grep, but that's a pain to cycle through all the parameters inside the smaller JSON. I could write code to do it, but I'd also have to spend time on that and test whether the code works. Or maybe there's some ready-made code for that already...
Or I can look for JSON diff tools. I searched for some and came across these:
https://github.com/samsonjs/json-diff or https://tlrobinson.net/projects/javascript-fun/jsondiff
https://github.com/andreyvit/json-diff
Both failed to do what I wanted. Presumably the JSON is either too complex or too large to process.
Any thoughts on the best solution? Or might the best solution for now be manual analysis with grep for each parameter/property?
In terms of a code solution, any language will do. I just need a parser or diff tool that will do what I want.
Sorry, I can't share the JSON data structure with you either; it may be considered confidential.
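Since the ready-made diff tools choked on it, here is a rough sketch of the "shared parameters only" idea in plain JavaScript: collect the key paths of each object (ignoring values) and intersect the two sets. The path notation and the array handling ('[]' plus looking only at the first element) are simplifications.

// Collect all key paths ("a.b.c") of a JSON value, ignoring the values themselves.
function collectPaths(value, prefix, out) {
  prefix = prefix || '';
  out = out || {};
  if (value !== null && typeof value === 'object') {
    if (Array.isArray(value)) {
      var path = prefix ? prefix + '.[]' : '[]';
      out[path] = true;
      collectPaths(value[0], path, out);   // treat the first element as representative
    } else {
      Object.keys(value).forEach(function (key) {
        var p = prefix ? prefix + '.' + key : key;
        out[p] = true;
        collectPaths(value[key], p, out);
      });
    }
  }
  return out;
}

// Intersect the key paths of two parsed JSON objects.
function sharedPaths(a, b) {
  var pathsA = collectPaths(a);
  var pathsB = collectPaths(b);
  return Object.keys(pathsA).filter(function (p) { return pathsB[p]; });
}

// Usage: console.log(sharedPaths(JSON.parse(smallJson), JSON.parse(bigJson)).sort().join('\n'));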
Beyond Compare works well, if you set up a JSON file format in it to use Python to pretty-print the JSON. Sample setup for Windows:
Install Python 2.7.
In Beyond Compare, go under Tools, under File Formats.
Click New. Choose Text Format. Enter "JSON" as a name.
Under the General tab:
Mask: *.json
Under the Conversion tab:
Conversion: External program (Unicode filenames)
Loading: c:\Python27\python.exe -m json.tool %s %t
Note that the second parameter in the command line must be %t; if you enter two %s parameters you will suffer data loss.
Click Save.
Jeremy Simmons has created a better file format package for Beyond Compare, "JsonFileFormat.bcpkg", posted on their forum, which does not require Python or anything else to be installed.
Just download the file and open it with Beyond Compare and you are good to go. So it's much simpler.
JSON File Format
I needed a file format for JSON files.
I wanted to pretty-print & sort my JSON to make comparison easy.
I have attached my bcpackage with my completed JSON File Format.
The formatting is done via jq - http://stedolan.github.io/jq/
Props to Stephen Dolan for the utility: https://github.com/stedolan.
I have sent a message to the folks at Scooter Software asking them to include it in the page with additional formats.
If you're interested in seeing it on there, I'm sure a quick reply to the thread with an up-vote would help them see the value of posting it.
I have a small GPL project that would do the trick for simple JSON. I have not added support for nested entities, as it is more of a simple ObjectDB solution and not actually JSON (despite the fact it was clearly inspired by it).
Long and short, the API is pretty simple. Make a new group, populate it, and then pull a subset via whatever logical parameters you need.
https://github.com/danielbchapman/groups
The API is used basically like this:
SubGroup items = group
    .notEqual("field", "value")
    .lessThan("field2", 50); // ...etc...
There's actually support for basic unions and joins which would do pretty much what you want.
Long and short, you probably want a Set as your data type. Considering your comparisons are probably complex, you need a more complex set of methods.
My only caution is that it is GPL. If your data is confidential, odds are you may not be interested in that license.

Easiest way to edit a huge geoJSON?

I'm sitting here with a huge GeoJSON file that I got from an OpenStreetMap shapefile. However, most of the polygons are unnecessary. These could, in theory, easily be singled out based on certain properties.
But how do I query the GeoJSON file to remove certain elements (features)? Or would it be easier to save the shapefile in another format (I'm working in QGIS)?
Link to sample of json-file: http://dl.dropbox.com/u/15955488/hki_test_sample.json (240 kB)
When you say "query the geoJSON," are you talking about having the source where you get the geoJSON give you a subset of data? There is no widely-implemented standard for "querying" JSON like this, but each site you retrieve from may have its own parameters to reduce the size of data you get.
If you're talking about paring down the data in client-side code, simply looping through the structure and removing properties (with delete) and array items is what you'd have to do.
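A minimal sketch of that client-side paring down for a GeoJSON FeatureCollection; the property name and the keep-condition are placeholders:

// Keep only the features whose properties pass the given test; everything else is dropped.
function filterFeatures(featureCollection, keep) {
  return {
    type: 'FeatureCollection',
    features: featureCollection.features.filter(keep)
  };
}

// Usage, assuming `data` is the parsed GeoJSON and "building" is a property you care about:
// var trimmed = filterFeatures(data, function (f) {
//   return f.properties && f.properties.building === 'yes';
// });
// JSON.stringify(trimmed) can then be written back out.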
Shapefile beats GeoJSON for large (not mega) data. It supports random access to features. To get at the GeoJSON features in a collection you have to read and deserialize the entire file.
Depending on how you want to edit it and what software is available, you have a few options. If you have access to Safe FME, this is by far the best geographic feature manipulation software and will give you tons of options (it can read/write, and convert between, just about any geographic format). If you're just looking for a text editor that can handle the volume of data, I would look at Notepad++: it can hold a lot of text and you can do find/replace using regular expressions. Safe FME can be a little pricey, but you might be able to get a trial.
As Jacob says, just iterate and remove the elements you don't want. I like http://documentcloud.github.com/underscore/#reject for convenience.
If you are going to permanently remove fields just convert it to a shapefile, remove the fields you don't want, and re-export it as GeoJSON.
I realize this question is old, but if anyone comes across this now, I'd recommend TopoJSON.
Convert it to TopoJSON.
By default TopoJSON removes all attributes, but you can flag those you'd like to keep like this:
topojson -o output.topojson -p fieldToKeep,anotherFieldToKeep input.geojson
More info in the TopoJSON command line reference

Is there a table object with sorting functionality freely available for JavaScript?

I am in need of an abstraction for tables in a JavaScript application that is heavily table-based. I plan on making a JavaScript class with the following functionality:
Constructor to create the table from an array of objects
Constructor to create the table from an array of headers and an array of data
Methods for sorting by any of the columns
Methods for splitting the table into pages and traversing them
Methods for tying the object to an HTML element, drawing an HTML table representation inside it
The class will be small, light and contained in a single JavaScript file. Also the class will be free to use for anybody.
But before I possibly reinvent the wheel, I have to ask:
Does something like this exist already? (I couldn't find anything)
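For concreteness, a minimal sketch of the kind of class described above (sorting, paging, rendering into an element); the field names and markup are illustrative only, not an existing library:

// Sketch of a small sortable, pageable table. No HTML escaping; illustration only.
function DataTable(headers, rows) {
  this.headers = headers;        // e.g. ['Name', 'Age']
  this.rows = rows;              // e.g. [['Alice', 30], ['Bob', 25]]
  this.pageSize = 10;
  this.page = 0;
}

DataTable.prototype.sortBy = function (columnIndex, descending) {
  this.rows.sort(function (a, b) {
    if (a[columnIndex] < b[columnIndex]) return descending ? 1 : -1;
    if (a[columnIndex] > b[columnIndex]) return descending ? -1 : 1;
    return 0;
  });
};

DataTable.prototype.currentPage = function () {
  var start = this.page * this.pageSize;
  return this.rows.slice(start, start + this.pageSize);
};

DataTable.prototype.renderInto = function (element) {
  var html = '<table><tr>';
  this.headers.forEach(function (h) { html += '<th>' + h + '</th>'; });
  html += '</tr>';
  this.currentPage().forEach(function (row) {
    html += '<tr>' + row.map(function (c) { return '<td>' + c + '</td>'; }).join('') + '</tr>';
  });
  html += '</table>';
  element.innerHTML = html;
};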
The class will be small, light and contained in a single JavaScript file. Also the class will be free to use for anybody.
Once you have that restriction, I'm pretty sure that the tradeoff between using someone else's API and implementing your own will lead you to 'reinvent the wheel', because the time to understand that API and adapt it to your current code may not be worth it compared to the effort of implementing your own.
Anyway, if you still want to use some API, I've found this post somewhere that may help you. Good luck.
Why do you want to write one more sortable table class? There are some good ones already.
Look at jqGrid for example. It is free open source. You can download it here, see different demos here. Additionally you can find the latest source code of jqGrid on GitHub.

Implementing dynamic ranking mechanism in Javascript

I have to implement this functionality as a part of a webapp I'm working on:
I have a file which contains entries in the form of
key1, val1, val2, val3, val4
key2, bval1, bval2, bval3
where key1 is the key to the values. Each val has a rank, which is its index in this array; e.g. val1 is rank 1, val2 is rank 2, and so on.
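(For reference, a small sketch of parsing that format into a JavaScript object, assuming the file contents are already available as a string, e.g. from an Ajax response:)

// Parse lines like "key1, val1, val2, val3" into { key1: ['val1', 'val2', 'val3'] }.
// The rank of each value is simply its position (index + 1) in the array.
function parseRanking(text) {
  var result = {};
  text.split('\n').forEach(function (line) {
    var parts = line.split(',').map(function (p) { return p.trim(); });
    if (parts.length > 1 && parts[0]) {
      result[parts[0]] = parts.slice(1);
    }
  });
  return result;
}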
Now, I want to make a UI which will allow the user to change the ranks of the values associated with a particular key, and finally write those changes out to the file.
Interacting with a database will be the second part of the project, so I would like to avoid that for now.
Can all of this be accomplished just with JavaScript (or jQuery)?
If yes, how do I model each value and provide up and down arrows to allow the user to change the ranking? Can anyone point me to some resources (or plugins) that I can read and learn from?
Any help would be greatly appreciated.
Although it binds your hands a bit visually, you might want to take a look at jQuery UI's Sortable and Draggable plugins. A demo is here. If you follow their documentation and examples, modeling this becomes trivial.
As for writing it out to a file, you may not need to do that (assuming you're talking about doing this on the server). As long as the list of values is not huge, you can use JSON.stringify to serialize your array/object and write it out persistently to a cookie.
Since you are writing out to a database, you may be specifically targeting Gears/Webkit/iOS browsers, in which case you may want to look into window.localStorage (it will eventually have full support in other browsers, but for now I think it can only safely be used in recent webkit browsers, including iPhone and iPad.)
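A small sketch of that client-side persistence with window.localStorage; the storage key name is arbitrary and the jQuery selector in the usage comment is hypothetical:

// Persist the user's reordered values without a server round-trip.
function saveRanking(key, orderedValues) {
  var all = JSON.parse(window.localStorage.getItem('rankings') || '{}');
  all[key] = orderedValues;                          // ranks are implied by array order
  window.localStorage.setItem('rankings', JSON.stringify(all));
}

function loadRanking(key) {
  var all = JSON.parse(window.localStorage.getItem('rankings') || '{}');
  return all[key] || [];
}

// e.g. in a jQuery UI sortable "update" callback:
// saveRanking('key1', $('#list-key1 li').map(function () { return $(this).text(); }).get());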
Perhaps you can store the data with cookies.
But the "write changes out to a file" part... might not be possible (i don't think it's possible) with JUST javascript/jquery, you would need server-side assist from PHP,Python,Ruby,Perl etc...
