I want to retrieve the favicon URL of a website once it has loaded. How can I implement this in my Firefox extension?
You can use nsIFaviconService, it caches favicons for known pages. Along these lines:
var faviconService = Components.classes["@mozilla.org/browser/favicon-service;1"]
                               .getService(Components.interfaces.nsIFaviconService);
var favicon = faviconService.getFaviconImageForPage(gBrowser.currentURI);
alert(favicon.spec);
Please note that it works with nsIURI objects, not with strings. You can use nsIIOService.newURI() to get an nsIURI object from a string.
Yes, I realize that I am duplicating karthik's answer - but it has no explanation and only a bogus code example.
https://developer.mozilla.org/en/nsIFaviconService
https://developer.mozilla.org/en/Using_the_Places_favicon_service
Please read the page carefully. You can use the service defined below:
nsIServiceManager serviceManager =
    Mozilla.getInstance().getServiceManager();
nsIFaviconService service =
    (nsIFaviconService) serviceManager.getServiceByContractID(
        "@mozilla.org/browser/favicon-service;1", nsIFaviconService.NS_IFAVICONSERVICE_IID);
I am writing a web crawler. I extracted the heading and main discussion from this link, but I cannot find any of the comments in the page source (Ctrl+U, then Ctrl+F for the comment text). I think the comments are written by JavaScript. Can I extract them?
RT uses a service from spot.im for comments.
You need to make two POST requests: first to https://api.spot.im/me/network-token/spotim to get a token, then to https://api.spot.im/conversation-read/spot/sp_6phY2k0C/post/353493/get to get the comments as JSON.
I wrote a quick script to do this:
import requests
import re
import json
def get_rt_comments(article_url):
    spotim_spotId = 'sp_6phY2k0C'  # spotim id for RT
    post_id = re.search('([0-9]+)', article_url).group(0)
    r1 = requests.post('https://api.spot.im/me/network-token/spotim').json()
    spotim_token = r1['token']
    payload = {
        "count": 25,  # number of comments to fetch
        "sort_by": "best",
        "cursor": {"offset": 0, "comments_read": 0},
        "host_url": article_url,
        "canonical_url": article_url
    }
    r2_url = 'https://api.spot.im/conversation-read/spot/' + spotim_spotId + '/post/' + post_id + '/get'
    r2 = requests.post(r2_url, data=json.dumps(payload), headers={'X-Spotim-Token': spotim_token, "Content-Type": "application/json"})
    return r2.json()
if __name__ == '__main__':
    url = 'https://www.rt.com/usa/353493-clinton-speech-affairs-silence/'
    comments = get_rt_comments(url)
    print(comments)
Yes, if it can be viewed with a web browser, you can extract it.
If you look at the source, it is really an iframe that loads a piece of JavaScript. That script creates a new script tag in the document whose source is bundle.js, which contains the actual commenting software. This in turn fetches the comments.
Instead of going through this manually, you could consider using, for example, WebKit to create a headless browser that executes the JavaScript like an ordinary browser. Then you can scrape from that instead of having your crawler fetch the external resources by hand.
Examples of such headless browsers are Spynner, dryscrape, or the PhantomJS-derived PhantomPy (the latter seems to be an abandoned project now).
Can anybody explain how the following javascript variables:
document.referrer
document.location.href
or the HTTP Referer header, could come to be 'javascript:window["contents"]'?
Not only do I not understand how they could be set to a javascript uri - but window.contents isn't a standard DOM attribute in any browser that I know of... (It is window["contents"], not window["content"])
I believe I found the explanation for this.
There are some scripts in the wild which seem to create iframes using code (something) like this:
var contents = '<html>......</html>';
var ifr = document.createElement('iframe');
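document.body.appendChild(ifr); // contentWindow is null until the iframe is attached
ifr.contentWindow.document.open();
ifr.contentWindow.document.write(contents);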
Some particular combination of this sometimes ends up setting either the href of the iframe, or the referrer, to "javascript:window['contents']", i.e. the JavaScript variable which temporarily holds the page data.
(I have not completely pinned down the details, but that's the basic idea.)
I'd like to inject a couple of local .js files into a webpage. I just mean client side, as in within my browser, I don't need anybody else accessing the page to be able to see it. I just need to take a .js file, and then make it so it's as if that file had been included in the page's html via a <script> tag all along.
It's okay if it takes a second after the page has loaded for the stuff in the local files to be available.
It's okay if I have to be at the computer to do this "by hand" with a console or something.
I've been trying to do this for two days, I've tried Greasemonkey, I've tried manually loading files using a JavaScript console. It amazes me that there isn't (apparently) an established way to do this, it seems like such a simple thing to want to do. I guess simple isn't the same thing as common, though.
If it helps, the reason why I want to do this is to run a chatbot on a JS-based chat client. Some of the bot's code is mixed into the pre-existing chat code -- for that, I have Fiddler intercepting requests to .../chat.js and replacing it with a local file. But I have two .js files which are "independent" of anything on the page itself. There aren't any .js files requested by the page that I can substitute them for, so I can't use Fiddler.
Since you're already using a Fiddler script, you can do something like this in the OnBeforeResponse(oSession: Session) function:
if ( oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
     oSession.hostname.Contains("MY.TargetSite.com") ) {
    oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
    // Remove any compression or chunking
    oSession.utilDecodeResponse();
    var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
    // Match the closing HEAD tag, so you can inject a script block just before it.
    var oRegEx = /(<\/head>)/gi;
    // Replace the head-close tag with new-script + head-close
    oBody = oBody.replace(oRegEx, "<script type='text/javascript'>console.log('We injected it');</script></head>");
    // Set the response body to the changed body string
    oSession.utilSetResponseBody(oBody);
}
Working example for www.html5rocks.com :
if ( oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
     oSession.hostname.Contains("html5rocks") ) { // go to html5rocks.com
    oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
    oSession.utilDecodeResponse();
    var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
    var oRegEx = /(<\/head>)/gi;
    oBody = oBody.replace(oRegEx, "<script type='text/javascript'>alert('We injected it')</script></head>");
    oSession.utilSetResponseBody(oBody);
}
Note that you have to turn streaming off in Fiddler: http://www.fiddler2.com/fiddler/help/streaming.asp and I assume you would need to decrypt HTTPS traffic: http://www.fiddler2.com/fiddler/help/httpsdecryption.asp
I have been using FiddlerScript less and less, in favor of Fiddler .NET extensions - http://fiddler2.com/fiddler/dev/IFiddlerExtension.asp
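The replacement step in the FiddlerScript above is ordinary string manipulation, so it can be tried outside Fiddler first. The following is only a sketch: `injectBeforeHeadClose` is a hypothetical helper name, not part of Fiddler, and it assumes the page contains a literal closing head tag.

```javascript
// Hypothetical helper (not a Fiddler API): insert an inline script block
// immediately before the first closing </head> tag of an HTML string.
function injectBeforeHeadClose(html, scriptBody) {
  const tag = "<script type='text/javascript'>" + scriptBody + "</script>";
  // A replacer function is used so that "$&"-style sequences inside
  // scriptBody are inserted literally instead of being treated as
  // replacement patterns. The matched tag is re-emitted unchanged.
  return html.replace(/<\/head>/i, function (match) {
    return tag + match;
  });
}

const page = '<html><head><title>t</title></head><body></body></html>';
const patched = injectBeforeHeadClose(page, "console.log('We injected it');");
```

Unlike the /gi pattern in the script above, this sketch drops the g flag so that only the first closing head tag is touched; whether that matters depends on the pages being rewritten.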
If you are using Chrome then check out dotjs.
It will do exactly what you want!
How about just using jquery's jQuery.getScript() method?
http://api.jquery.com/jQuery.getScript/
Save the normal HTML pages to the file system, add the JS files by hand, and then use Fiddler to intercept those requests so you get your version of the HTML file.
Can someone tell me if there is a jQuery plugin to dynamically create an .ics file from values in the page's div elements, such as:
<div class="start-time">9:30am</div>
<div class="end-time">10:30am</div>
<div class="Location">California</div>
Or is there a plain JavaScript way to dynamically create an .ics file? I basically need to create the .ics file from these values using JavaScript or jQuery, and link the created file to an "ADD TO CALENDAR" link so it gets added to Outlook.
You will need to produce it in ICS format. You will also need to convert the date and time zone, e.g. 20120315T170000Z (yyyymmddThhmmssZ).
var msgData1 = $('.start-time').text();
var msgData2 = $('.end-time').text();
var msgData3 = $('.Location').text();
var icsMSG = "BEGIN:VCALENDAR\nVERSION:2.0\nPRODID:-//Our Company//NONSGML v1.0//EN\nBEGIN:VEVENT\nUID:me@google.com\nDTSTAMP:20120315T170000Z\nATTENDEE;CN=My Self ;RSVP=TRUE:MAILTO:me@gmail.com\nORGANIZER;CN=Me:MAILTO:me@gmail.com\nDTSTART:" + msgData1 +"\nDTEND:" + msgData2 +"\nLOCATION:" + msgData3 + "\nSUMMARY:Our Meeting Office\nEND:VEVENT\nEND:VCALENDAR";
$('.test').click(function(){
    window.open( "data:text/calendar;charset=utf8," + escape(icsMSG));
});
The above sample will create an ICS file for download. The user will have to open it, and Outlook, iCal, or Google Calendar will do the rest.
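The date conversion mentioned above (yyyymmddThhmmssZ) can be sketched as follows. Both helper names are hypothetical, and the sketch assumes the page's "9:30am"-style times are already in UTC and that the calendar day is known from elsewhere; a real page would need its own time zone handling.

```javascript
// Format a Date as an iCalendar UTC timestamp: yyyymmddThhmmssZ.
function toIcsUtc(date) {
  // toISOString() yields e.g. "2012-03-15T17:00:00.000Z"; strip the
  // dashes and colons, then drop the milliseconds.
  return date.toISOString().replace(/[-:]/g, '').replace(/\.\d{3}Z$/, 'Z');
}

// Hypothetical helper: turn "9:30am" plus a calendar day into a Date.
// ASSUMPTION: the clock time is interpreted as UTC.
function parseClock(text, year, month, day) {
  const m = /^(\d{1,2}):(\d{2})\s*(am|pm)$/i.exec(text.trim());
  if (!m) {
    throw new Error('Unrecognized time: ' + text);
  }
  // 12am maps to hour 0 and 12pm to hour 12.
  const hours = (Number(m[1]) % 12) + (m[3].toLowerCase() === 'pm' ? 12 : 0);
  return new Date(Date.UTC(year, month - 1, day, hours, Number(m[2])));
}

const dtStart = toIcsUtc(parseClock('9:30am', 2012, 3, 15)); // "20120315T093000Z"
```

The resulting strings are what would go after DTSTART: and DTEND: in place of the raw div text.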
This is an old question, but I have some ideas that could get you started (or anyone else who needs to do a similar task).
And the JavaScript to create the file content, and open the file:
var filedata = $('.start-time, .end-time, .Location').text();
window.open( "data:text/calendar;charset=utf8," + escape( filedata ) );
Presumably you'd want to add that code to the onclick event of a form button.
I don't have Outlook handy, so I'm not sure if it will automatically recognize the filetype, but it might.
Hope this helps.
From what I have found online and on this site, it is not possible to get this to work in IE as you need to include certain headers to let IE know to download this file.
The window.open method works for Chrome and Firefox but not IE so you may need to restructure your code to use a server-side language to generate and download the ICS file.
More can be found in this question
While this is an older question, I have been looking for a front-end solution as well. I recently stumbled across the ICS.js library, which looks like the answer you're looking for.
This approach worked fine; however, in IE8 the browser couldn't recognize the file type and refused to open it as a calendar item. To get around this I had to generate the file on the server side (exposed via a RESTful service) and set the response headers as follows:
@GET
@Path("generateCalendar/{alias}/{start}/{end}")
@Produces({ "text/v-calendar" })
public Response generateCalendar(
        @QueryParam("alias") final String argAlias,
        @QueryParam("start") final String argStart,
        @QueryParam("end") final String argEnd) {
    ResponseBuilder builder = Response.ok();
    builder.header("content-disposition", "attachment;filename=calendar.ics");
    builder.entity("BEGIN:VCALENDAR\n<........insert meeting details here......>:VCALENDAR");
    return builder.build();
}
This can be served up by calling window.location on the service URL and works on Chrome, Firefox and IE8.
Hope this helps.
Question:
IE and Firefox / Safari seem to deal differently with BASE HREF and Javascript window.location type requests. First, is this an accurate description of the problem? What's going on? And what's the best cross-browser solution to deal with this situation?
Context:
I have a small PHP flat file sitelet (it's actually a usability testing prototype).
I dynamically generate the BASE tag's HREF value in PHP, i.e. if it's running on our company's server, it's:
$basehref = 'http://www.example.com/alpha/bravo/UsabilityTest/';
and on my local dev machine, it's:
$basehref = 'http://ellen.local/delta/echo/foxtrot/UsabilityTest/';
For one of the tasks, I collect some user input, do some transformations on it in Javascript, and send to the server using code like this:
function allDone() {
// elided code for simplicity of stackoverflow question
var URI = "ProcessUserInput.php?";
URI = URI + "alphakeys=" + encodeURI( keys.join(",") );
URI = URI + "&sortedvalues=" + encodeURI( values.join(",") );
window.location = URI;
}
Both the javascript file (containing function allDone()) and the processing PHP script (ProcessUserInput.php) live in a subdirectory of UsabilityTest. In other words, their actual URL is
http://www.example.com/alpha/bravo/UsabilityTest/foxtrot/ProcessUserInput.php
aka
$basehref . '/foxtrot/ProcessUserInput.php'
The Problem
IE's JavaScript basically seems to ignore the BASE HREF. The javascript and the PHP processor live in the same directory, so the call to ProcessUserInput.php works out fine. The input gets processed and everything works fine.
But when I test on Firefox, the JavaScript does appear to use the BASE HREF, because the script's output gets sent to
$basehref . '/ProcessUserInput.php'
This breaks, because ProcessUserInput.php is in a subdirectory of basehref. However, if I add the subdirectory name to the javascript, it no longer works in IE.
Solutions?
I can think of a few ways to solve this:
In Javascript, read the HREF property of the BASE tag and manually prepend to var URI in the javascript, calling a fully-resolved absolute URL
Process the .js file with PHP and insert the $basehref variable into the script
Move the files around
Something else?
I'm sure there must be other ways to solve this too. What's the best way to deal with BASE HREF in JavaScript when IE and Firefox apply it differently in JavaScript?
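The first option in the list above (resolving against the BASE href explicitly) can be sketched like this. `resolveAgainstBase` is a hypothetical helper, and the sketch assumes an environment that provides the standard `URL` constructor, which older IE versions do not.

```javascript
// Hypothetical helper: resolve a relative URL against an explicit base,
// so the result does not depend on how a browser applies <base href>
// to window.location assignments.
function resolveAgainstBase(relative, baseHref) {
  return new URL(relative, baseHref).href;
}

// In a page, baseHref could be read from
// document.getElementsByTagName('base')[0].href instead of hard-coding it.
const base = 'http://www.example.com/alpha/bravo/UsabilityTest/';
const target = resolveAgainstBase('foxtrot/ProcessUserInput.php?alphakeys=a%2Cb', base);
// target is 'http://www.example.com/alpha/bravo/UsabilityTest/foxtrot/ProcessUserInput.php?alphakeys=a%2Cb'
```

Assigning a fully resolved absolute URL like this to window.location leaves nothing for the browser to resolve, which is why it should behave the same regardless of the BASE handling.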
Using the assign method of window.location seems like the most straightforward answer.
Instead of
window.location = URI;
I'm using this:
window.location.assign( URI );
which is doing the right thing in both IE and Firefox.
IE and Firefox / Safari seem to deal differently with BASE HREF and Javascript window.location type requests.
Yes, this is a long-standing difference going back to the early days of Netscape-vs-IE.
IE enforces base-href only at the point a document element is interacted-with. So, you can createElement('a'), set a relative href and click() it*, but the base-href will be ignored; appendChild it to the document containing the base-href, and it'll work.
On the other browsers the base-href is taken as global per-window and always applied. Which is right? It seems to be unspecified. The original JavaScript docs say only that location.href (and hence, location applied as a string):
represents a complete URL
So setting it to a relative URL would seem to be an undefined operation.
(*: link.click() is a non-standard method supported by IE and Opera)
read the HREF property of the BASE tag and manually prepend
Probably what I'd do, yeah, if you're dead set on using <base>.
I believe you want to modify window.location.pathname, not window.location. window.location is a Location object that has multiple properties. As a result, the effect of assigning to it directly is not well defined. However, window.location.pathname is defined as the path relative to the host, which is what you want.
If you want to read up more on the many variables you can change in window.location, I'd check here. According to Mozilla's documentation, changing any variable in window.location should reload the page with a new URL corresponding to those changes.
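To make the "multiple variables" concrete, here is a small sketch using the standard `URL` class, which exposes the same component names as window.location. It is used here only so the example runs outside a page; the URL itself is a made-up one following the question's layout, and the sketch does not show whether assigning to location.pathname avoids the BASE difference.

```javascript
// ASSUMPTION: illustrative URL modeled on the question's directory layout.
const u = new URL(
  'http://www.example.com/alpha/bravo/UsabilityTest/foxtrot/ProcessUserInput.php?alphakeys=a,b#top'
);

// The same component names exist on window.location in a browser.
const parts = {
  protocol: u.protocol,
  host: u.host,
  pathname: u.pathname,
  search: u.search,
  hash: u.hash
};

// Changing only the path leaves the host, query string, and fragment as they were.
u.pathname = '/alpha/bravo/UsabilityTest/done.php';
const changed = u.href;
```

In a page, the equivalent assignment to window.location.pathname triggers a navigation, so it cannot be inspected afterwards the way `changed` can here.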
I had the same problem today. After some research I couldn't find any way to work around this issue in IE9, which is a requirement for my project, so I took the following approach (jQuery-based, but it's really easy to do in plain JavaScript):
href = function(url){
    if ($("base").length > 0 ){
        location.href = $("base").attr("href") + url;
    }else{
        location.href = url;
    }
}
And then, change
location.href= 'emp/start'
to
href('emp/start');
Just add $('base').attr('href') before the link (using jQuery), or
document.getElementsByTagName('base')[0].href
You can always use Vanilla JS :)
var href = document.getElementsByTagName('base')[0].href
I hope this helps.