Is there any way to get data from another website that uses AJAX to change its content? So far I have been using PHP Simple HTML DOM Parser, but I can't use it in this case. I want to simulate just one click; after that click, the page loads new data, which I want to parse.
If the website has RSS feeds, try using those instead: http://website.com/rss.xml, or whatever the RSS URL is. Then you can use SimplePie, or any other RSS feed parser, to get the data.
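To illustrate how little work a feed takes compared with scraping AJAX-driven HTML, here is a minimal sketch in Python using only the standard library (not SimplePie, which is a PHP library). The function name `parse_rss_items` and the sample feed are made up for the example; a real feed would be downloaded from the site's RSS URL first.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_xml):
    """Return title, link and description for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


# Hypothetical feed content, standing in for the body fetched from the RSS URL.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example</title>
<item><title>First post</title><link>http://website.com/1</link><description>Hello</description></item>
</channel></rss>"""

if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["title"], entry["link"])
```

This only helps if the feed actually contains the data that the click would otherwise load.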
I have a plugin that stores all of its content in a database, so I cannot put PHP code into the plugin; it won't execute. It does allow JavaScript, though.
So how would I use JavaScript to request a page and get the data, if the page simply returns the content as a string like this:
field1=data|field2=data|field3=data|field4=data
Is there a way to fetch that string using JavaScript or jQuery, so that I can parse it and pull out the data I need?
Since PHP cannot be executed in the plugin, I can build a separate PHP script to output that data. I could still get the data securely, because the request would have to pass a key.
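Parsing that string is a matter of splitting on `|` and then splitting each pair on the first `=`. A minimal sketch of the logic, written in Python for illustration (the helper name `parse_fields` is made up); the same two-step split carries over directly to JavaScript:

```python
def parse_fields(payload):
    """Turn 'field1=data|field2=data' into a dict of field name to value."""
    fields = {}
    for pair in payload.strip().split("|"):
        if not pair:
            continue  # tolerate a leading, trailing or doubled '|'
        # partition splits on the first '=' only, so values may contain '='
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields


if __name__ == "__main__":
    print(parse_fields("field1=data|field2=data|field3=data|field4=data"))
```

Note that this format breaks if a value can itself contain `|`; in that case an encoding such as JSON is safer.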
My textbook provides the URL http://services.faa.gov/airport/status/SFO?format=application/JSON. That link points to a page that returns the content of the original page in JSON format. I want to get another webpage's content as JSON, so I tried copying the method used (the link my professor provided for an assignment uses the same format), and I get nothing: http://www.programmableweb.com/apitag/weather?format=application/JSON. Clicking the link from here leads to a search of the website via a search engine. Copying and pasting that exact same link just takes you to the actual webpage.
If it matters, I'm trying to get JSON data to display in a Chrome extension.
My question is, why can't I just append ?format=application/JSON to any URL to get the JSON format of the webpage?
Because a URL is just data, and there is nothing standard about a query string parameter called "format". The server has to be designed to give you JSON before it can or will do that.
That particular website simply provides a feature where you can get the same data in an alternate format such as JSON. Not all websites provide features like that, and not all of them implement it with the same URL parameter. Some sites may have URLs ending with .html be HTML pages and ones ending with .json provide the same info in JSON. Others might provide a separate API. You might check that website to see if it has a "developers" section that gives information on their API, if they have one.
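Since the URL itself guarantees nothing, one way to tell whether a server actually returned JSON is to look at the Content-Type header of the response. The sketch below is in Python and is only illustrative: `decode_if_json` is a made-up helper, and the header and body are passed in directly rather than fetched.

```python
import json


def decode_if_json(content_type, body):
    """Parse the body only when the server declared a JSON media type, else return None."""
    # Drop parameters such as '; charset=utf-8' and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(body)
    return None


if __name__ == "__main__":
    print(decode_if_json("application/json; charset=utf-8", '{"name": "SFO"}'))
    print(decode_if_json("text/html", "<html></html>"))
```

A site that ignores the `format` parameter will typically keep answering with `text/html`, which is what the second call simulates.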
I have tried PHP Simple HTML DOM on this issue without success.
I have now moved to DOMDocument and DOMXPath, which seems promising.
Here is my issue:
I am trying to scrape data from a page where the data is loaded via a web service request after the page is initially shown. The delay is only milliseconds, but because of it, normal scraping returns a template value instead of the actual data.
I have found the endpoint URL using the Network tab in Chrome's developer tools. If I enter that URL into the browser address bar, the data displays nicely in JSON format. All good.
My problem is that any time the site is revisited or the page is refreshed, the suffix of the endpoint URL is randomly generated, so I can't hard-code this URL into my PHP file. For example, the end of the URL is "?=253648592" on the first visit, but on refresh it could be "?=375482910". The base of the URL is static.
Without getting into headless browsers (I tried and MY head hurts!), is there a way to have XPath find this random URL when the page loads?
Sorry for being so long-winded but I wanted to explain as best I could.
It's probably much easier and faster to just use a regex if you only need one item/value from the HTML. I would like to give an example, but for that I would need a more extensive snippet of the HTML that contains the endpoint you want to fetch.
Is it possible to give a snippet of the HTML that contains the endpoint?
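In the meantime, here is a rough sketch of the regex idea, under the assumption that the full endpoint URL appears literally somewhere in the initial HTML (for example inside a script tag). It is written in Python for illustration; the base URL `https://example.com/api/data` and the sample markup are placeholders, not taken from the real page. The same pattern would work with `preg_match` in PHP.

```python
import re

# Hypothetical static base of the endpoint; replace with the real one.
BASE_URL = "https://example.com/api/data"


def find_endpoint(html, base_url=BASE_URL):
    """Return the first URL made of the static base plus a '?=<digits>' suffix, or None."""
    pattern = re.escape(base_url) + r"\?=\d+"
    match = re.search(pattern, html)
    return match.group(0) if match else None


if __name__ == "__main__":
    sample = '<script>var dataUrl = "https://example.com/api/data?=253648592";</script>'
    print(find_endpoint(sample))
```

If the suffix is generated by JavaScript at runtime and never appears in the HTML source, no regex or XPath query over the downloaded page can find it.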
I want to make a 'recommend' button in my app that fetches the description tag for a given URL and returns it. I was thinking of making this a getScript() request to a certain controller (POST or GET?); when the server returns a response, the script inserts it into the text box.
What is the easiest way to do this without the overhead of something like Nokogiri? This is the only place I'm scraping anything in my whole app, so I'd rather keep it very lightweight.
Also, should I use GET or POST in my controller (according to the Rails way)? Thanks!
By description do you mean the meta tag?
Let Yahoo's YQL service do the heavy lifting for you.
e.g.
SELECT * FROM html WHERE url="http://google.com" AND xpath="//head/meta"
Here it is in their testing console
and this is the URL you would grab to retrieve the response in JSON
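If depending on an external service is not an option, pulling a single meta tag out of a page does not need a heavy parser. The following is a minimal sketch using Python's standard-library `html.parser`, shown only to illustrate the idea (the app in the question is Rails, and the class and function names here are made up). It assumes the page HTML has already been downloaded.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Remember the content of the first <meta name="description"> tag seen."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        # Attribute values keep their original case, so compare case-insensitively.
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")


def extract_description(html):
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


if __name__ == "__main__":
    page = '<html><head><meta name="description" content="A test page"></head></html>'
    print(extract_description(page))
```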
How should I proceed in achieving the following:
I need to get data from another server, from a JSP page that holds the information I want to show in a tooltip. The code for this is working, and I can make the AJAX call and get the response.
My concern is that I need the qTip library's contents to be included in the page itself, since the page doesn't allow cross-domain content. If I just reference the qTip files hosted on my website (whose domain is different from that of the page I am using), the page won't allow it. So is it fine to embed the contents in the main form, or is there a better way?
A similar question was asked:
How to display information returned by ajax call in a tooltip
If you can't reach across domains via AJAX, you can always use an intermediary script (in your case, Java) to output a buffer containing the information you want in the qTip.
The script calls digest.jsp?params=someparameters
digest.jsp fetches the information from whatever domain it needs.
It outputs the information in a buffer as XML or JSON.
With JavaScript, you parse the information and put it in the option attribute.
If that doesn't work for you, or you don't want to do it, you can always fall back on putting the information in the title="" attribute of each option.
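The server-side half of those steps can be sketched as follows. This is an illustrative outline in Python rather than JSP, and every name in it (`build_tooltip_payload`, the `fetch` callable, the `tooltip` key) is hypothetical. The remote request is injected as a function so the sketch runs without a network; in a real intermediary it would be an HTTP call made by the server.

```python
import json


def build_tooltip_payload(params, fetch):
    """Fetch remote data on the server and re-emit it as JSON for the page's own script."""
    # 'fetch' stands in for the server-to-server request; the browser never
    # contacts the other domain, so the cross-domain restriction does not apply.
    remote_text = fetch(params)
    return json.dumps({"params": params, "tooltip": remote_text.strip()})


if __name__ == "__main__":
    def fake_fetch(params):
        # Placeholder for the real remote call.
        return "  Details for option %s \n" % params["id"]

    body = build_tooltip_payload({"id": "42"}, fake_fetch)
    # The client-side script would parse this body and hand the text to the tooltip.
    print(json.loads(body)["tooltip"])
```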