I can make my app "force" Chrome (v 39.0.2171.99 m) to show the HTTP response as JSON (instead of XML).
How do I get the JSON in a tree structure (instead of a string)?
Checking the Preview tab in dev tools doesn't work for me.
I could paste the JSON string into JSONLint, but I want to know a more direct route, if there is one.
I've been using this extension (it's called JSON Viewer and the source is available on GitHub) for years, and it works great.
I don't know who the developer is, but if they ever read this: thanks for taking the time to develop such a time-saving tool!
Related
I have the following input data:
an XML string
an XSLT string
The goal is to render the output HTML (the result of applying the XSLT to the XML data) in React Native (in a WebView, an integrated browser, etc.; where exactly isn't important, I just have to give the user the possibility to read the output).
The problem is that JavaScriptCore doesn't provide an API such as XSLTProcessor() to meet this requirement. I also tried the https://github.com/fiduswriter/xslt-processor project, but I get an error about "decimal-format" support.
BTW, I tried rendering the data in my desktop browsers and it works, so the input I have is good.
Hi, I am a Python newbie and I am scraping a web page.
I am using the Google Chrome developer tools to identify the class of the elements I want to scrape. However, my code returns an empty array of results, whereas the screenshots clearly show that those strings are in the HTML code.
[screenshot: Chrome Developer Tools]
import requests
from bs4 import BeautifulSoup
url = 'http://www.momondo.de/flightsearch/?Search=true&TripType=2&SegNo=2&SO0=BOS&SD0=LON&SDP0=07-09-2016&SO1=LON&SD1=BOS&SDP1=12-09-2016&AD=1&TK=ECO&DO=false&NA=false'
html = requests.get(url)
soup = BeautifulSoup(html.text,"lxml")
x = soup.find_all("span", {"class":"value"})
print(x)
#pprint.pprint (soup.div)
I would very much appreciate your help!
Many thanks!
Converted my comment to an answer...
Make sure the data you are expecting is actually there. Use print(soup.prettify()) to see what was actually returned by the request. Depending on how the site works, the data you are looking for may only exist in the browser after the JavaScript is processed. You might also want to take a look at Selenium.
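As a rough sketch of both steps (reusing the URL and the span/value selector from the question; the second half assumes Selenium and a Chrome driver are installed, which are extra dependencies the question doesn't already have):

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

url = ('http://www.momondo.de/flightsearch/?Search=true&TripType=2&SegNo=2'
       '&SO0=BOS&SD0=LON&SDP0=07-09-2016&SO1=LON&SD1=BOS&SDP1=12-09-2016'
       '&AD=1&TK=ECO&DO=false&NA=false')

# Step 1: look at what the server actually returned (no JavaScript runs here)
html = requests.get(url)
soup = BeautifulSoup(html.text, "lxml")
print(soup.prettify())  # if the spans are missing here, they are added later by JS

# Step 2: if the data only appears after JS runs, drive a real browser instead
driver = webdriver.Chrome()  # assumes a Chrome driver is available on the PATH
driver.get(url)
rendered = BeautifulSoup(driver.page_source, "lxml")
print(rendered.find_all("span", {"class": "value"}))
driver.quit()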
I'm trying to build my own DB with data pulled from a JavaScript variable located at this URL (https://www.numberfire.com/nba/fantasy/fantasy-basketball-projections/). Since the data is only made available in that variable (NF_DATA), I'm not able to query it easily as I would an API.
I'm able to pull the data as I would any JSON object using the Chrome Developer Console as such:
I would like to pull the data with a Python script the same way I was doing it in the Chrome Developer Console (for instance, identifying the exact data by writing NF_DATA['daily_projections']['1'][...] in the script), so I can do the manipulation directly in the script, not in Chrome DevTools. Any recommendations on how to do this? I have tried using BeautifulSoup in Python, but was having trouble grabbing the data without turning the complete output into a string (would that even be a good way to think about it?).
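One hedged possibility (a sketch, not a definitive answer): fetch the page with requests, locate the script tag that assigns NF_DATA, and parse the object literal with the json module. The assignment pattern in the regex and the assumption that the literal is strict JSON are both guesses about how the page embeds the data, so they may need adjusting:

import json
import re
import requests
from bs4 import BeautifulSoup

url = 'https://www.numberfire.com/nba/fantasy/fantasy-basketball-projections/'
soup = BeautifulSoup(requests.get(url).text, "lxml")

nf_data = None
for script in soup.find_all("script"):
    text = script.string or ""
    # Assumption: the page contains something like "NF_DATA = {...};"
    match = re.search(r"NF_DATA\s*=\s*(\{.*\})\s*;", text, re.DOTALL)
    if match:
        nf_data = json.loads(match.group(1))  # works only if the literal is valid JSON
        break

if nf_data is not None:
    # Index into the parsed dict just as in the DevTools console
    print(nf_data['daily_projections']['1'])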
I'm building a game using HTML5 Canvas and JavaScript, and I'm using JSON-formatted tile maps for my levels. The tiles render correctly in Firefox, but when I use Chrome, the JSON fetching fails with an "Origin null is not allowed by Access-Control-Allow-Origin" error. I was using jQuery's $.ajax command, and all my files are in one directory.
I would use this post's solution, but I can't use the web server solution.
Is there any other way to fetch JSON files to be parsed and read from? Something akin to loading an image just by giving the URL? Or is there some way to quickly convert my JSON files into globally available strings so I can parse them with JSON.parse()?
Why is a local web server not an option? Apache is free, can be installed on anything, and is easy to use, IMO. Also, for Chrome specifically, look into --allow-file-access-from-files.
But if nothing else works, maybe you could just add links to the files in script tags, and then add var SomeGlobalObject = ... to the top of each file. You might even be able to do this dynamically by using JS to append the script tag to head. But in the end, instead of using AJAX, you can just do JSON.parse(SomeGlobalObject).
In other words, load the files into the global namespace by adding script tags. Normally this would be considered bad practice, but used ONLY for testing, in the absence of any other options, it may work.
One option which may work for you in Chrome is to invoke the browser with the command-line switch --allow-file-access-from-files. This question addresses the issue: Google Chrome --allow-file-access-from-files disabled for Chrome Beta 8
Another possibility is to fetch the JSON data as a script, setting a global variable to the JSON value.
How does one parse HTML documents which make heavy use of JavaScript? I know there are a few libraries in Python which can parse static XML/HTML files, and I'm basically looking for a program or library (or even a Firefox plugin) which reads HTML+JavaScript, executes the JavaScript bit, and outputs HTML code without JavaScript, so it would look identical if displayed in a browser.
As a simple example
link
should be replaced by the appropriate value the JavaScript function returns, e.g.
link
A more complex example would be a saved Facebook HTML page which is littered with loads of JavaScript code.
Probably related to
How to "execute" HTML+Javascript page with Node.js
but do I really need Node.js and JSDOM? Also slightly related is
Python library for rendering HTML and javascript
but I'm not interested in rendering, just the pure HTML output.
You can use Selenium with Python, as detailed here.
Example:
import xmlrpclib
# Make an object to represent the XML-RPC server.
server_url = "http://localhost:8080/selenium-driver/RPC2"
app = xmlrpclib.ServerProxy(server_url)
# Bump timeout a little higher than the default 5 seconds
app.setTimeout(15)
import os
os.system('start run_firefox.bat')
print app.open('http://localhost:8080/AUT/000000A/http/www.amazon.com/')
print app.verifyTitle('Amazon.com: Welcome')
print app.verifySelected('url', 'All Products')
print app.select('url', 'Books')
print app.verifySelected('url', 'Books')
print app.verifyValue('field-keywords', '')
print app.type('field-keywords', 'Python Cookbook')
print app.clickAndWait('Go')
print app.verifyTitle('Amazon.com: Books Search Results: Python Cookbook')
print app.verifyTextPresent('Python Cookbook', '')
print app.verifyTextPresent('Alex Martellibot, David Ascher', '')
print app.testComplete()
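That example uses the old Selenium RC XML-RPC interface and Python 2 print statements; with a current Selenium release, the same flow (open Amazon, search for "Python Cookbook", check the results) would look roughly like the sketch below. It assumes Firefox plus geckodriver are installed, and the field-keywords name is taken from the example above, so it may have changed on the live site:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # assumes geckodriver is on the PATH
driver.get('http://www.amazon.com/')
print(driver.title)

# Same steps as above: type a query into the search box and submit it
box = driver.find_element(By.NAME, 'field-keywords')
box.send_keys('Python Cookbook')
box.submit()

print(driver.title)
print('Python Cookbook' in driver.page_source)
driver.quit()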
From Mozilla Gecko FAQ:
Q. Can you invoke the Gecko engine from a Unix shell script? Could you send it HTML and get back a web page that might be sent to the printer?
A. Not really supported; you can probably get something close to what you want by writing your own application using Gecko's embedding APIs, though. Note that it's currently not possible to print without a widget on the screen to render to.
Embedding Gecko in a program that outputs what you want may be way too heavy, but at least your output will be as good as it gets.
PhantomJS can be loaded using Selenium:
$ ipython
In [1]: from selenium import webdriver
In [2]: browser=webdriver.PhantomJS()
In [3]: browser.get('http://seleniumhq.org/')
In [4]: browser.title
Out[4]: u'Selenium - Web Browser Automation'
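From there, browser.page_source gives you the page's HTML after its JavaScript has run, which you can then feed into whatever parser you like (BeautifulSoup, lxml, etc.).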