To document some recent events I saved all tweets containing a particular hashtag. I now have about 50,000 tweets that I want to publish. To save bandwidth and server load, I just want to send the raw tweet text to the client and render it there with JavaScript (linking hashtags, usernames and URLs).
Is there already a JavaScript library that can parse a raw tweet and create an HTML representation of it?
twitterlib.render() looks like a good start... assuming you have parsed JSON tweet data:
<script src="twitterlib.js"></script>
<script>
var parsed_tweet_data = getTweetData(...); // get a Tweet JS object...
var html = twitterlib.render(parsed_tweet_data);
// Do something with the rendered html now...
</script>
Here's a twitterlib walkthrough on SlideShare.net (slide 17 has a demo).
Have you considered using the Twitter oEmbed API? It basically lets you request the "official" embedded-tweet HTML programmatically using an anonymous API (no authentication required). This would at least make it easy to meet the display requirements without reinventing the wheel. You can even do this client side, depending on your use case.
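For example, a minimal client-side sketch (the endpoint and parameter names are taken from Twitter's public oEmbed documentation as I remember it, so double-check them; the tweet URL and the tweet-container element are just placeholders):

// Hedged sketch: request the official embed markup for one tweet via the oEmbed endpoint.
// If the endpoint does not allow cross-origin requests from your domain, you may need
// JSONP or a small server-side proxy instead.
var tweetUrl = "https://twitter.com/Interior/status/463440424141459456"; // placeholder tweet URL
fetch("https://publish.twitter.com/oembed?omit_script=true&url=" + encodeURIComponent(tweetUrl))
  .then(function (response) { return response.json(); })
  .then(function (data) {
    // data.html holds the ready-made blockquote markup for the tweet
    document.getElementById("tweet-container").innerHTML = data.html;
  });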
I'm grappling with this same issue, so let us know what you try and how it works for your project.
I'm trying to scrape information from this site using PHP; however, the information I'm looking for seems to be generated through JavaScript or similar. I would be grateful for any suggestions on what approach to take!
This is the remote site that I'm trying to fetch data from: http://www.riksdagen.se/sv/webb-tv/video/debatt-om-forslag/yrkestrafik-och-taxi_H601TU11
The page contains a video, and beneath the heading "Anförandelista" there is a list of names/links pointing to individual time points in the video.
I want to use PHP to automatically fetch the names and links in this list and store them in a database. However, this information is not included in the HTML source, so I fail to retrieve it.
Any ideas on how I can remotely access the information using an automated script? Or in which direction I should look for a solution? Any pointers are very much appreciated.
You can get this info as a JSON response from the API call the page makes. I don't know PHP yet, but a quick Google search shows that handling JSON is possible and fairly straightforward. I give an example Python script at the bottom.
The API call is this:
http://www.riksdagen.se/api/videostream/get/H601TU11
It returns JSON that includes the list of speakers (and the full speech text as well); you can explore the full response by opening the API URL above.
PHP
Looking at this question you could start with something like:
$array = json_decode(file_get_contents('http://www.riksdagen.se/api/videostream/get/H601TU11'));
Example Python, if wanted:
import requests
import pandas as pd

# Call the same API endpoint the page uses and read the JSON response
r = requests.get('http://www.riksdagen.se/api/videostream/get/H601TU11').json()

# Collect speaker name, start time and duration for every entry in the speaker list
results = []
for item in r['videodata'][0]['speakers']:
    start = item['start']
    duration = item['duration']
    speaker = item['text']
    row = [speaker, start, duration]
    results.append(row)

df = pd.DataFrame(results, columns=['Speaker', 'Start', 'Duration'])
print(df)
Example output: a DataFrame with one row per speaker and the columns Speaker, Start and Duration.
You cannot get information that is loaded by JavaScript using a PHP-only solution. cURL, file_get_contents and similar options only give you the server's initial response; they do not execute JavaScript, which runs client side.
For that you will need a headless browser (there are several to choose from: Chromium, Google Chrome with its headless mode, or Selenium WebDriver are just a few of the most popular options).
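For illustration only, here is a minimal Node.js sketch using Puppeteer, one possible headless-Chromium driver (the package choice and the idea of dumping the rendered HTML are my own assumptions, not something prescribed above):

// Render the page in headless Chromium and dump the final HTML, including JS-generated markup.
// Assumes Node.js with the "puppeteer" package installed (npm install puppeteer).
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Wait until network activity settles so the client-side script has filled in the speaker list
  await page.goto('http://www.riksdagen.se/sv/webb-tv/video/debatt-om-forslag/yrkestrafik-och-taxi_H601TU11',
                  { waitUntil: 'networkidle0' });
  const html = await page.content(); // the fully rendered DOM
  console.log(html);                 // pipe this into your PHP/DOM tooling, or scrape it right here
  await browser.close();
})();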
I have used PHP Simple HTML DOM with no success on this issue.
I have now moved on to DOMDocument and DOMXPath, and this does seem promising.
Here is my issue:
I am trying to scrape data from a page where the data is loaded via a web service request after the page initially renders. The delay is only milliseconds, but because of it, normal scraping returns a template placeholder instead of the actual data.
I have found the endpoint URL using the Network tab in Chrome's developer tools. If I enter that URL into the browser address bar, the data displays nicely in JSON format. All good.
My problem is that every time the site is revisited or the page is refreshed, the suffix of the endpoint URL is randomly generated, so I can't hard-code the URL into my PHP file. For example, the end of the URL is "?=253648592" on the first visit, but on refresh it could be "?=375482910". The base of the URL is static.
Without getting into headless browsers (I tried, and MY head hurts!), is there a way to have XPath find this random URL when the page loads?
Sorry for being so long-winded but I wanted to explain as best I could.
It's probably much easier and faster to just use a regex if you only need one item/value from the HTML. I would like to give an example, but for that I would need a larger snippet of the HTML that contains the endpoint you want to fetch.
Is it possible to give a snippet of the HTML that contains the endpoint?
I am creating a website where each user will have their own unique page. Users can visit other users' pages via
http://website/user?user=<username>&session=<session>
Now I want to simplify the above URL to
http://website/user/<username> (something like pinterest or facebook)
I thought I could use mod_rewrite. However, mod_rewrite is server side, and I do not want to include any PHP code. What I do to get the data for a user:
load the basic HTML template and then, based on which user we are talking about, load that user's data asynchronously.
Can I achieve the above in JS? If so, how?
-Ajay
Unfortunately, you can't do exactly this.
A possible solution would be to place your HTML hub page at http://website/user/ and form user URLs like this: http://website/user/#username. JS can get the user name simply with var username = location.href.split("#")[1].
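A minimal sketch of how that could look on the client (the /userdata endpoint and the "profile" element are hypothetical placeholders for whatever your backend and template actually use):

// Read the user name from the fragment and load that user's data asynchronously.
var username = location.href.split("#")[1]; // e.g. "alice" for http://website/user/#alice

var xhr = new XMLHttpRequest();
xhr.open("GET", "/userdata?user=" + encodeURIComponent(username));
xhr.onload = function () {
  var data = JSON.parse(xhr.responseText);
  document.getElementById("profile").textContent = data.displayName; // fill in the template
};
xhr.send();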
By the way, you said that you are not using PHP. How do you parse URL arguments then?
I am trying to retrieve data from an XML file that is not located on my site's server, and then use that data for various things, such as charts. Here is one example: http://forecast.weather.gov/MapClick.php?lat=40.78158&lon=-73.96648&FcstType=dwml. This is an XML file with the weather data for Central Park. I want to retrieve the data in the <value> tag, which is inside the <pressure> tag, so I can create a graph of barometric pressure. I would prefer to do this with JavaScript, but I don't think it's possible when the file isn't on my server.
Note: I do not want a different solution to retrieve the pressure data from somewhere else, because I want to retrieve other pieces of data from other XML files as well.
There's an interesting article about using Yahoo! Pipes to transform XML weather data to JSON and use the result in a web page without the need for any server-side stuff (PHP, curl, etc.).
EDIT
Being new to jQuery myself, I had to dig a little more to find out that (almost) everything described in the first article can be condensed down to:
$.getJSON("<your Yahoo pipes url here>&_callback=?", function (data) {
    alert(data.value.items[0].data[0].parameters.wordedForecast.text[0]);
});
using jQuery's built-in JSONP support.
Pitfall!
Beware that Yahoo expects the callback URL parameter to be named _callback.
The article Cross-domain communications with JSONP is a nice summary which helped a lot in coming up with this answer.
If your JavaScript code is served from a server (as opposed to running on a mobile device), have PHP code load the XML, escape it and insert it into the HTML page. Then you just have to grab that string in your code and process it with DOMParser.
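For the client-side half, a minimal sketch using DOMParser (the xmlString variable and the exact element path are assumptions based on the <pressure>/<value> structure mentioned in the question):

// Parse the XML string that the server embedded in the page and pull out the pressure values.
var parser = new DOMParser();
var doc = parser.parseFromString(xmlString, "application/xml");

var pressureValues = [];
var pressureNode = doc.querySelector("pressure");
if (pressureNode) {
  pressureNode.querySelectorAll("value").forEach(function (valueNode) {
    pressureValues.push(parseFloat(valueNode.textContent));
  });
}
// pressureValues can now be fed into the charting code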
You could use curl to pull the data to your server and act on it from there.
curl -o data.txt "http://forecast.weather.gov/MapClick.php?lat=40.78158&lon=-73.96648&FcstType=dwml"
This will give you the information in a file called data.txt. You could then either parse it server side and serve just the bits of data needed, or make the whole file available to your client, since both are now in the same domain.
I want to get a short string hosted on a server where I do not have access to the data as XML, JSON, etc. I am trying to use either .load or .ajax to do this. I want to be able to parse the data into a JavaScript array. The entire contents of the remote page are text, and I am happy to take all of it and remove what I do not need with a small JavaScript. I have tried:
<script>
$(document).ready(function () {
  $("button").click(function () {
    $.ajax({
      url: "http://url:8888/data",
      success: function (result) {
        $("div").html(result);
      }
    });
  });
});
</script>
I have two questions.
1. Why does this not work?
2. What would be the best way to store the string in a JavaScript var?
I am sure jQuery is working correctly.
The answer would be too long to post here (really). But look these up:
Same Origin Policy
Padded JSON
If you have no control over the remote site, you're out of luck: you will not get any data from it via Ajax (which is actually a feature, not a limitation, of the technology). One way of circumventing the protection would be to build a proxy that simply mirrors the remote service you need to reach and makes it available in the same domain your main HTML came from.
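For illustration, here is a minimal proxy sketch in Node.js; it is just one of many ways to do this (a few lines of PHP with cURL would work just as well), and the target URL is the placeholder from the question:

// Fetch the remote text and re-serve it from your own domain, so the jQuery call above
// can request "/proxy" (same origin) instead of "http://url:8888/data" (cross origin).
const http = require('http');

http.createServer(function (clientReq, clientRes) {
  if (clientReq.url === '/proxy') {
    http.get('http://url:8888/data', function (remoteRes) {
      let body = '';
      remoteRes.on('data', function (chunk) { body += chunk; });
      remoteRes.on('end', function () {
        clientRes.writeHead(200, { 'Content-Type': 'text/plain' });
        clientRes.end(body);
      });
    });
  } else {
    clientRes.writeHead(404);
    clientRes.end();
  }
}).listen(8080); // then point $.ajax at "/proxy" on this server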