Parsing XML / RSS from a URL using JavaScript

Hi, I want to parse XML/RSS from a live URL like http://rss.news.yahoo.com/rss/entertainment using pure JavaScript (not jQuery). I have googled a lot, but nothing has worked for me. Can anyone help with a working piece of code?

(You cannot have googled a lot.) Once you have worked around the Same Origin Policy, and if the resource is served with an XML MIME type (which it is in this case, text/xml), you can do the following:
var x = new XMLHttpRequest();
x.open("GET", "http://feed.example/", true);
x.onreadystatechange = function () {
if (x.readyState == 4 && x.status == 200)
{
var doc = x.responseXML;
// …
}
};
x.send(null);
(See also AJAX, and the XMLHttpRequest Level 2 specification [Working Draft] for other event-handler properties.)
In essence: No parsing necessary. If you then want to access the XML data, use the standard DOM Level 2+ Core or DOM Level 3 XPath methods, e.g.
/* DOM Level 2 Core */
var title = doc.getElementsByTagName("channel")[0].getElementsByTagName("title")[0].firstChild.nodeValue;
/* DOM Level 3 Core */
var title = doc.getElementsByTagName("channel")[0].getElementsByTagName("title")[0].textContent;
/* DOM Level 3 XPath (not using namespaces) */
var title = doc.evaluate('//channel/title/text()', doc, null, 0, null).iterateNext();
/* DOM Level 3 XPath (using namespaces) */
var namespaceResolver = (function () {
var prefixMap = {
media: "http://search.yahoo.com/mrss/",
ynews: "http://news.yahoo.com/rss/"
};
return function (prefix) {
return prefixMap[prefix] || null;
};
}());
var url = doc.evaluate('//media:content/@url', doc, namespaceResolver, 0, null).iterateNext();
(See also JSX:xpath.js for a convenient, namespace-aware DOM 3 XPath wrapper that does not use jQuery.)
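The same DOM Core calls extend naturally to walking every entry in the feed. The following is only a sketch: `listItemTitles` is a hypothetical helper name, and it assumes an RSS 2.0 layout in which each `item` element contains a `title` child.

```javascript
// Hypothetical helper: collects the text of each <item>'s <title>.
// `doc` is assumed to be an XML document such as x.responseXML.
function listItemTitles(doc) {
  var items = doc.getElementsByTagName("item");
  var titles = [];
  for (var i = 0; i < items.length; i++) {
    var titleNode = items[i].getElementsByTagName("title")[0];
    // Skip items that have no <title> child at all.
    if (titleNode) {
      titles.push(titleNode.textContent);
    }
  }
  return titles;
}
```

In the request handler above this could be called as `listItemTitles(x.responseXML)`.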
However, if for some (wrong) reason the MIME type is not an XML MIME type, or if it is not recognized by the DOM implementation as such, you can use one of the parsers built into recent browsers to parse the responseText property value. See pradeek's answer for a solution that works in IE/MSXML. The following should work everywhere else:
var parser = new DOMParser();
var doc = parser.parseFromString(x.responseText, "text/xml");
Proceed as described above.
Use feature tests at runtime to determine the correct code branch for a given implementation. The simplest way is:
if (typeof DOMParser != "undefined")
{
var parser = new DOMParser();
// …
}
else if (typeof ActiveXObject != "undefined")
{
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
// …
}
See also DOMParser and HTML5: DOM Parsing and Serialization (Working Draft).

One big problem you might run into is that, generally, you cannot get data cross-domain. This is a big issue with most RSS feeds.
The common way to deal with loading data cross-domain in JavaScript is called JSONP. Basically, this means that the data you are retrieving is wrapped in a JavaScript callback function. You load the URL with a script tag, and you define the function in your code. So when the script loads, it executes the function and passes the data to it as an argument.
The problem with most xml/rss feeds is that services that only provide xml tend not to provide JSONP wrapping capability.
Before you go any farther, check to see if your data source provides a json format and JSONP functionality. That will make this a lot easier.
Now, if your data source doesn't provide json and jsonp functionality, you have to get creative.
One relatively easy way to handle this is to use a proxy server. Your proxy runs somewhere under your control, and acts as a middleman to get your data. The server loads your XML, and then your JavaScript makes its requests to the proxy instead. If the proxy server runs on the same domain name, then you can just use standard XHR (Ajax) requests and you don't have to worry about cross-domain stuff.
Alternatively, your proxy server can wrap the data in a jsonp callback and you can use the method mentioned above.
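To make the JSONP wrapping concrete, here is a rough sketch of both halves. `wrapAsJsonp`, `runJsonpBody` and the callback name `handleFeed` are hypothetical names chosen for illustration; `runJsonpBody` only simulates what the browser does when it executes the loaded script, so the sketch can run without a page.

```javascript
// Hypothetical proxy-side helper: wraps a payload in a callback invocation.
function wrapAsJsonp(callbackName, data) {
  // Accept only plain identifiers so the name cannot smuggle in code.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ");";
}

// Simulates the browser executing the script body: the callback is made
// visible under the expected name, then the body is evaluated.
function runJsonpBody(body, callbackName, callback) {
  new Function(callbackName, body)(callback);
}

var received = null;
var body = wrapAsJsonp("handleFeed", { title: "Example feed" });
runJsonpBody(body, "handleFeed", function (data) {
  received = data;
});
```

In a real page, `handleFeed` would be a global function and the body would arrive through a script element whose `src` points at the proxy.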
If you are using jQuery, then XHR and JSONP requests are built-in methods, which makes the coding very easy. Other common JS libraries should also support these. If you are coding all of this from scratch, it's a little more work but not terribly difficult.
Now, once you get your data, hopefully it's just JSON. Then there's no parsing needed.
However, if you end up having to stick with an XML/RSS version, and if you're using jQuery, you can simply use jQuery.parseXML http://api.jquery.com/jQuery.parseXML/.

It is better to convert the XML to JSON: http://jsontoxml.utilities-online.info/
After converting, if you need to print the JSON object, check this tutorial:
http://www.w3schools.com/json/json_eval.asp

Related

How to call a fetch request and wait for its answer inside an onBeforeRequest in a web extension

I'm trying to write a web extension that stops the requests from a url list provided locally, fetches the URL's response, analyzes it in a certain way and based on the analysis results, blocks or doesn't block the request.
Is that even possible?
The browser doesn't matter.
If it's possible, could you provide some examples?
I tried doing it with Chrome extensions, but it seems like it's not possible.
I heard it's possible in Mozilla Firefox, though.

I think that this is only possible using the old webRequestBlocking API which Chrome is removing as a part of Manifest v3. Fortunately, Firefox is planning to continue supporting blocking web requests even as they transition to manifest v3 (read more here).
In terms of implementation, I would highly recommend referring to the MDN documentation for webRequest, in particular their section on modifying responses and their documentation for the filterResponseData method.
Mozilla have also provided a great example project that demonstrates how to achieve something very close to what I think you want to do.
Below I've modified their background.js code slightly so it is a little closer to what you want to do:
function listener(details) {
if (mySpecialUrls.indexOf(details.url) === -1) {
// Ignore this url, it's not on our list.
return {};
}
let filter = browser.webRequest.filterResponseData(details.requestId);
let decoder = new TextDecoder("utf-8");
let encoder = new TextEncoder();
filter.ondata = event => {
let str = decoder.decode(event.data, {stream: true});
// Just change any instance of Example in the HTTP response
// to WebExtension Example.
str = str.replace(/Example/g, 'WebExtension Example');
filter.write(encoder.encode(str));
filter.disconnect();
}
// This is a BlockingResponse object, you can set parameters here to e.g. cancel the request if you want to.
// See: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/BlockingResponse#type
return {};
}
browser.webRequest.onBeforeRequest.addListener(
listener,
// 'main_frame' means this will only affect requests for the main frame of the browser (e.g. the HTML for a page rather than the images, CSS, etc. that are loaded afterwards). You might want to look into whether you want to expand this.
{urls: ["*://*/*"], types: ["main_frame"]},
["blocking"]
);
Correction:
The above example only works properly if the response data fits in one chunk. If it is larger (and you still want to inspect the entirety of the response data), you would need to put all of the data into a buffer, and then work on it once all data has been received. See the document here for more information: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/StreamFilter/ondata#webextension_examples (the code section titled "This example combines all buffers into a single buffer" would be of most interest to you I think).
In terms of using this API to block responses, data is only returned from this URL if you call filter.write(), so if you don't like the response, you can simply not call it (just call filter.close()) and an empty response will be returned. You can also only return part of the full response body by filter.write()ing only the bits that you want to return.
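The buffering approach from the correction, combined with the write-or-withhold decision, could be sketched roughly like this. `createChunkCollector` and `looksBlocked` are hypothetical helper names, not part of the webRequest API, and the keyword check merely stands in for whatever analysis you actually need. The trailing comments show one way the pieces might be wired to a StreamFilter.

```javascript
// Hypothetical helper: gathers ArrayBuffer chunks and decodes them at once,
// so multi-byte characters split across chunks are decoded correctly.
function createChunkCollector() {
  var chunks = [];
  return {
    add: function (buffer) {
      chunks.push(new Uint8Array(buffer));
    },
    finish: function () {
      var total = chunks.reduce(function (sum, chunk) {
        return sum + chunk.length;
      }, 0);
      var combined = new Uint8Array(total);
      var offset = 0;
      chunks.forEach(function (chunk) {
        combined.set(chunk, offset);
        offset += chunk.length;
      });
      return new TextDecoder("utf-8").decode(combined);
    }
  };
}

// Hypothetical analysis step: true if the body contains any listed word.
function looksBlocked(bodyText, forbiddenWords) {
  return forbiddenWords.some(function (word) {
    return bodyText.indexOf(word) !== -1;
  });
}

// Possible wiring inside the listener (assumes `filter` came from
// browser.webRequest.filterResponseData(details.requestId)):
//   var collector = createChunkCollector();
//   filter.ondata = function (event) { collector.add(event.data); };
//   filter.onstop = function () {
//     var body = collector.finish();
//     if (!looksBlocked(body, ["forbidden"])) {
//       filter.write(new TextEncoder().encode(body));
//     }
//     filter.close(); // nothing written means an empty response
//   };
```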

JSONP callback in Dart

I have been trying to get basic JSONP working in Dart and I am getting stuck. Reading this blog post as well as this blog suggests that I should use window.on.message.add(dataReceived); to get a MessageEvent and retrieve data from the event.
Dart complains that "There is no such getter 'message' in events". In addition, I looked up different ways of getting a MessageEvent but it seems to be something completely unrelated (WebSockets?) and is not what I actually need.
If anybody can explain what is going on and how to really use JSONP in Dart, that would be awesome!

You no longer need to use what is described in the articles you point to. You can use dart:js:
import 'dart:html';
import 'dart:js';
void main() {
// Create a jsFunction to handle the response.
context['processData'] = (JsObject jsonDatas) {
// call with JSON datas
};
// make the call
ScriptElement script = new Element.tag("script");
script.src = "https://${url}?callback=processData";
document.body.children.add(script);
}

I recently wrote a blog post on this myself as I was running into similar problems.
I first cover a few prerequisites, such as verifying CORS compliance and verifying JSONP support.
I too ended up registering with the updated method:
window.onMessage.listen(dataReceived);
I then had a fairly simple method to dynamically create the script tag in Dart as well (my requirement was that I had to use Dart exclusively and couldn't touch the website source files):
void _createScriptTag()
{
String requestString = """function callbackForJsonpApi(s) {
s.target="dartJsonHandler";
window.postMessage(JSON.stringify(s), '*');
}""";
ScriptElement script = new ScriptElement();
script.innerHtml = requestString;
document.body.children.add(script);
}
I then invoked it from Dart with some simple logic that I wrapped in a method for convenience.
void getStockQuote(String tickerId)
{
String requestString = "http://finance.yahoo.com/webservice/v1/symbols/" + tickerId + "/quote?format=json&callback=callbackForJsonpApi";
ScriptElement script = new ScriptElement();
script.src = requestString;
document.body.children.add(script);
}
If you are using dart:js I find Alexandre's Answer useful and, after upvoting Alexandre, I have updated my post to include the simplified version as well:
context['callbackForJsonpApi'] = (JsObject jsonData)
{
//Process JSON data here...
};
This obviously eliminates the need for the onMessage and _createScriptTag above, and can be invoked the same as before.
I decided to keep both approaches, however, as I have noticed over time the Dart APIs changing and it seems to be a good idea to have a fallback if needed.

The syntax has changed
window.onMessage.listen(dataReceived);

JavaScript: Writing an output file within a limited & secure scenario

I would like to add a function in my javascript to write to a text file in the local directory where the javascript file is located. This means I'm not looking for some insecure way of accessing the user's file system in any way. All I care about is extracting the user's input into an html page that is accessed by my javascript then using that input as data externally. I just need a simple text file. This user input isn't actually text by the way, but rather a bunch of actions using my online game's components that the underlying javascript turns into a text string (so this particular string is what I want to save, not really even anything direct from the user).
I don't want to write to a user's file system, but rather, the file where the javascript (and html) code is located (a folder hosted on a server). Is there any simple way to get some file I/O going?
I know Javascript has a FileReader, is there any way to get it to do this in reverse? Like a FileWriter. GoogleClosure looks like it has a FileWriter, but it doesn't seem to quite work and I can't find any decent examples of how to get it to do this.
If this requires a different language, is there any way I can just get the relevant snippet and insert this into my Javascript file?
(the folder is hosted on a Linux system if that helps)
ADDENDUM: Elias Van Ootegem's solution below is excellent and I would highly recommend looking into it as it's a great example of client-server interaction and getting your system to provide you the data you're looking to extract. Workers are pretty interesting.
But for those of you looking at this post with a question similar to the one I initially had about JavaScript I/O, I found one other workaround, depending on your case. My team's project site made use of a database, MongoDB, that stored some of the user's interaction data if the user had hit a "Save" button. MongoDB, and other database systems, provide a "dumping" function/script that you can call from your local machine/server to put that data into an output file (I was able to put the JSON data into a text file). From that output, you can write a parser to extract and format the data you need, since databases like MongoDB are pretty clear about what format the text will be in (very structured, organized). I wrote a parser in C (with a few libraries I had written to extend the language) to do what I needed, but the idea generalizes to other programming/scripting languages.
I did look at leaving cookies as option as well, and made use of a test program to try it out (it works too!). However, one tradeoff for leaving cookies on a user's local system is that those cookies generally are meant to hold small amounts of data (usually things like username, date created, & expiration date of the cookie) and are dependent upon the user's local machine. Further, while you can extract the data in those cookies from JavaScript, you are back to the initial problem: the data still exists on the web, not on an output file on your server's file system. In the case you need to extract data and want some guarantee this data will exist on your machine, use Elias Van Ootegem's solution.

JavaScript code that is running client-side cannot access the server's filesystem at all, let alone write a file. People often say that, if JS were to have IO capabilities, that would be rather insecure... just imagine how dangerous that would be.
What you could do, is simply build your string, using a Worker that, on closing, returns the full data-string, which is then sent to the server (AJAX call).
The server-side script (Perl, PHP, .NET, Ruby...) can receive this data, parse it and then write the file to disk as you want it to.
All in all, not very hard, but quite an interesting project anyway. Oh, and when using a worker, seeing as it's an online game and everything, perhaps a setInterval to send (a part of) the data every 5000ms might not be a bad idea, either.
As requested - some basic code snippets.
A simple AJAX-setup function:
function getAjax(url,method, callback)
{
var ret;
method = method || 'POST';
url = url || 'default.php';
callback = callback || success;//assuming you have a default function called "success"
try
{
ret = new XMLHttpRequest();
}
catch (error)
{
try
{
ret= new ActiveXObject('Msxml2.XMLHTTP');
}
catch(error)
{
try
{
ret= new ActiveXObject('Microsoft.XMLHTTP');
}
catch(error)
{
throw new Error('no Ajax support?');
}
}
}
ret.open(method, url, true);
ret.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
ret.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
ret.onreadystatechange = callback;
return ret;
}
var getRequest = getAjax('script.php?some=Get&params=inURL', 'GET');
getRequest.send(null);
var postRequest = getAjax('script.php', 'POST', function()
{//passing anonymous function here, but this could just as well have been a named function reference, obviously...
if (this.readyState === 4 && this.status === 200)
{
console.log('Post request complete, answer was: ' + this.response);
}
});
postRequest.send('foo=bar');//set different headers to post JSON.stringified data
Here's a good place to read up on whatever you don't get from the code above. This is, pretty much a copy-paste bit of code, but if you find yourself wanting to learn just a bit more, Here's a great place to do just that.
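The `foo=bar` body above is form-encoded by hand. For arbitrary values, something along these lines can build the body safely; `toFormBody` is a hypothetical helper name, and the sketch assumes a flat object of string-like values.

```javascript
// Hypothetical helper: turns a flat object into a form-encoded string.
function toFormBody(params) {
  return Object.keys(params).map(function (key) {
    // Encode both sides so characters such as & and = cannot break the body.
    return encodeURIComponent(key) + "=" + encodeURIComponent(params[key]);
  }).join("&");
}
```

It would then be used as `postRequest.send(toFormBody({foo: 'bar'}));`.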
WebWorkers
Now these are pretty new, so using them does mean not being able to support older browsers (you could support them by using the event listeners to send each morsel of data to the server), but a worker allows you to bundle, pre-process and structure the data without blocking the "normal" flow of your script. Workers are often presented as a means to sort-of multi-thread JavaScript code. Here's a good intro to them.
Basically, you'll need to add something like this to your script:
var worker = new Worker('preprocess.js');//or whatever you've called the worker
worker.addEventListener('message', function(e)
{
var xhr = getAjax('script.php', 'post');//using default callback
xhr.send('data=' + encodeURIComponent(e.data));//encode so characters like & and = survive form encoding
//worker.postMessage(null);//clear state
}, false);
Your worker, then, could start off like so:
var time, txt = '';
//entry point:
onmessage = function(e)
{
if (e.data === null)
{
clearInterval(time);
txt = '';
return;
}
if (txt === '' && !time)
{
time = setInterval(function()
{
postMessage(txt);
}, 5000);//set postMessage to be called every 5 seconds
}
txt += e.data;//add new text to current string...
}
Server-side, things couldn't be easier:
if ($_POST && $_POST['data'])
{
$file = $_SESSION['filename'] ? $_SESSION['filename'] : 'File'.session_id();
$fh = fopen($file, 'a+');
fwrite($fh, $_POST['data']);
fclose($fh);
}
echo 'ok';
Now all of this code is a bit crude, and most of it cannot be used in its current form, but it should be enough to get you started. If you don't know what something is, google it.
But do keep in mind that, when it comes to JS, MDN is easily the best reference out there, and as far as PHP goes, their own site (php.net/{functionName}) is pretty ugly, but does contain a lot of info, too...

reading server file with javascript

I have a html page using javascript that gives the user the option to read and use his own text files from his PC. But I want to have an example file on the server that the user can open via a click on a button.
I have no idea what is the best way to open a server file. I googled a bit. (I'm new to html and javascript, so maybe my understanding of the following is incorrect!). I found that javascript is client based and it is not very straightforward to open a server file. It looks like it is easiest to use an iframe (?).
So I'm trying (first test is simply to open it onload of the webpage) the following. With kgr.bss on the same directory on the server as my html page:
<IFRAME SRC="kgr.bss" ID="myframe" onLoad="readFile();"> </IFRAME>
and (with file_inhoud, lines defined elsewhere)
function readFile() {
func="readFile=";
debug2("0");
var x=document.getElementById("myframe");
debug2("1");
var doc = x.contentDocument ? x.contentDocument : (x.contentWindow.document || x.document);
debug2("1a"+doc);
var file_inhoud=doc.document.body;
debug2("2:");
lines = file_inhoud.split("\n");
debug2("3");
fileloaded();
debug2("4");
}
Debug function shows:
readFile=0//readFile=1//readFile=1a[object HTMLDocument]//
So statement that stops the program is:
var file_inhoud=doc.document.body;
What is wrong? What is correct (or best) way to read this file?
Note: I see that the file is read and displayed in the frame.
Thanks!

Your best bet, since the file is on your server is to retrieve it via "ajax". This stands for Asynchronous JavaScript And XML, but the XML part is completely optional, it can be used with all sorts of content types (including plain text). (For that matter, the asynchronous part is optional as well, but it's best to stick with that.)
Here's a basic example of requesting text file data using ajax:
function getFileFromServer(url, doneCallback) {
var xhr;
xhr = new XMLHttpRequest();
xhr.onreadystatechange = handleStateChange;
xhr.open("GET", url, true);
xhr.send();
function handleStateChange() {
if (xhr.readyState === 4) {
doneCallback(xhr.status == 200 ? xhr.responseText : null);
}
}
}
You'd call that like this:
getFileFromServer("path/to/file", function(text) {
if (text === null) {
// An error occurred
}
else {
// `text` is the file text
}
});
However, the above is somewhat simplified. It would work with modern browsers, but not some older ones, where you have to work around some issues.
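Once `text` arrives in the callback, the line splitting attempted in the question becomes straightforward. A small sketch, assuming the file may use Windows, old Mac or Unix line endings; `splitLines` is a hypothetical helper name.

```javascript
// Hypothetical helper: splits file text into lines, tolerating
// \r\n, \r and \n endings and ignoring one trailing newline.
function splitLines(text) {
  var lines = text.split(/\r\n|\r|\n/);
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return lines;
}
```

Inside the success branch above, `lines = splitLines(text);` would take the place of the question's `file_inhoud.split("\n")`.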
Update: You said in a comment below that you're using jQuery. If so, you can use its ajax function and get the benefit of jQuery's workarounds for some browser inconsistencies:
$.ajax({
type: "GET",
url: "path/to/file",
success: function(text) {
// `text` is the file text
},
error: function() {
// An error occurred
}
});
Side note:
I found that javascript is client based...
No. This is a myth. JavaScript is just a programming language. It can be used in browsers, on servers, on your workstation, etc. In fact, although JavaScript first shipped in a browser (Netscape Navigator), Netscape offered it for server-side use very soon afterwards.
These days, the most common use (and your use-case) is indeed in web browsers, client-side, but JavaScript is not limited to the client in the general case. And it's having a major resurgence on the server and elsewhere, in fact.

The usual way to retrieve a text file (or any other server side resource) is to use AJAX. Here is an example of how you could alert the contents of a text file:
var xhr;
if (window.XMLHttpRequest) {
xhr = new XMLHttpRequest();
} else if (window.ActiveXObject) {
xhr = new ActiveXObject("Microsoft.XMLHTTP");
}
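xhr.onreadystatechange = function () {
    // Only alert once the request has completed successfully.
    if (xhr.readyState === 4 && xhr.status === 200) {
        alert(xhr.responseText);
    }
};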
xhr.open("GET","kgr.bss"); //assuming kgr.bss is plaintext
xhr.send();
The problem with your ultimate goal however is that it has traditionally not been possible to use javascript to access the client file system. However, the new HTML5 file API is changing this. You can read up on it here.

Catch Javascript syntax errors while using YUI3

I'm using slightly modified sample code provided by the YUI team. When my source responds with something other than JSON (or just has a JSON syntax error) my browser (Safari) aborts script processing, preventing me from notifying the user there was a problem.
I'm definitely no JS guru, so this code may be a lot uglier than it has to be. The code is, roughly:
YUI().use("dump", "node", "datasource-get", "datasource-jsonschema", function(Y) {
var myDataSource = new Y.DataSource.Get({
source:"/some/json/source/?"}),
myCallback = {
success: function(e){
myResponse = e.response;
doSomething(myDataSource);
},
failure: function(e){
Y.get("#errors").setContent("<li>Could not retrieve data: " + e.error.message + "</li>");
}
};
myDataSource.plug(Y.Plugin.DataSourceJSONSchema, {
schema: {
resultListLocator: "blah.list",
resultFields: ["user", "nickname"]
}
});
myDataSource.sendRequest("foo=bar", myCallback);
});
I've tried wrapping the "var myDataSource" block in a try/catch, and I've also tried wrapping the whole YUI().use() block.
Is it possible to catch syntax errors? Do I have to replace the all-in-one DataSource.Get call with separate IO and parse calls?

Since you are requesting a local script, you can use Y.io + Y.JSON.parse inside a try/catch or Y.DataSource.IO + Y.DataSchema.JSON (+ Y.JSON).
The benefit of DataSource.Get is that it avoids the Same Origin Policy. However, it is less secure and less flexible. If it is not necessary, you should avoid using it.
The contract of DataSource.Get is that the server supports JSONP. The way this works is that Get adds a script node to the page with a src=(the url you provided)&callback=someDataSourceFunction.
The browser will request the resource at that url and one of two things will happen:
the server will respond with a JavaScript string in the form of someDataSourceFunction({"all":"your data"}); or
the server will return some text that can't be parsed as JavaScript.
In either event, that string is treated as the contents of a script node--it is parsed and executed. If it cannot be parsed, the browser will throw an error. There's no stopping this. While JSONP is technically not under the spec constraints of true JSON (even invalid JSON should parse and execute), you should always use pure JSON, and always use a server side lib to generate the JSON output (look on http://json.org for a list of libs in every conceivable language). Don't hand-roll JSON. It only leads to hours of debugging.
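If you fetch the response as text instead (the Y.io route mentioned at the top of this answer), a syntax error becomes an ordinary exception that you can catch. The sketch below uses the native `JSON.parse` rather than `Y.JSON.parse`, and `safeParseJson` is a hypothetical helper name.

```javascript
// Hypothetical helper: parses JSON text without letting a syntax error escape.
function safeParseJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (e) {
    // Malformed JSON (or an HTML error page) ends up here.
    return { ok: false, error: e };
  }
}
```

A caller can then check `result.ok` and show the user a message when the parse fails, instead of having the script abort.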

The problem is probably that the error happens at some level in the browser (Javascript parsing) before YUI even gets the occasion to report a failure.
It is notoriously hard to catch this kind of error in Safari, which does not implement window.onerror. In order to catch more errors with my Javascript library, bezen.org, I added try/catch in places where asynchronous code is triggered:
dynamic script loading (equivalent to your JSON download)
setTimeout/setInterval: I wrapped and replaced these browser functions to insert a try/catch which logs errors
You may be interested in having a look at the source code of the corresponding modules, which may be useful to you as is or as hints for the resolution of your problem:
bezen.dom.js Look for safelistener in appendScript method
bezen.error.js Check safeSetTimeout/safeSetInterval and catchError
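The general idea of wrapping asynchronous callbacks can be sketched as follows. This is not the bezen.org code: `makeSafe` and this version of `safeSetTimeout` are hypothetical names written only for illustration, with the error handler passed in explicitly.

```javascript
// Hypothetical helper: returns a wrapper that reports exceptions
// to onError instead of letting them escape uncaught.
function makeSafe(fn, onError) {
  return function () {
    try {
      return fn.apply(this, arguments);
    } catch (e) {
      onError(e);
    }
  };
}

// Hypothetical replacement for setTimeout that guards the callback.
function safeSetTimeout(fn, delay, onError) {
  return setTimeout(makeSafe(fn, onError), delay);
}
```

Note that this only catches runtime errors thrown by the callback; a script that fails to parse never reaches the point where a wrapper could run.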

Maybe try this before you "doSomething":
try
{
var test = YAHOO.lang.JSON.parse(jsonString);
...
}
catch (e)
{
alert('invalid json');
}
