Combine explicit protocol specification with relative URL - javascript

I have a page that is accessed via HTTP. This page links to another page on the same server using HTTPS. What is the most elegant way, using HTML and/or Javascript, to force a transition to HTTPS while using a relative URL?
Basically, I want the opposite of a protocol-relative URL. I want to explicitly specify HTTPS WITHOUT hardcoding the hostname into the URL.
I'm working on a large legacy site so a solution using unobtrusive javascript with minimal changes to existing markup is ideal.
I realize that enforcing HTTPS is better performed at the destination page, but that isn't an option in this case.

// Rewrites every link on the page to HTTPS on the current host.
$("a").each(function () {
    this.href = "https://" + window.location.host + this.pathname + this.search + this.hash;
});
You could provide a more specific selector to make sure it doesn't mess up any links you didn't intend to change, but I leave that up to you since you know the requirements.

I think you're going to have to build the URL yourself from the pieces on window.location.
var path = anchor.getAttribute("href"); // the raw attribute value, not the resolved URL
var httpsUrl = "https://" +
    window.location.host +
    (path.charAt(0) === '/' ? path : window.location.pathname.replace(/[^\/]*$/, '') + path);
or something like that (especially if there are parameters etc.).
Edit: it's been noted that modern browsers give back the complete URL when you read the "href" property, making this an even easier problem to solve in those cases (you just have to fix the protocol prefix). (Thanks @Daniel!)
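As a minimal sketch of the "just fix the protocol prefix" idea, the standard URL constructor can resolve a relative link against a base and then overwrite only the scheme. The helper name toHttps and the example addresses are hypothetical, and this assumes an environment where URL is available (current browsers and Node.js):

```javascript
// Hypothetical helper: resolve `href` against `base`, then force the https scheme.
function toHttps(href, base) {
    var url = new URL(href, base); // resolves relative paths against the base URL
    url.protocol = "https:";       // host, path, query and hash are left untouched
    return url.href;
}

// In a page, the base would typically be window.location.href, e.g.:
// anchor.href = toHttps(anchor.getAttribute("href"), window.location.href);
console.log(toHttps("cart/checkout?id=7#pay", "http://www.example.com/shop/index.html"));
```

Note that an explicit non-default port in the base URL is carried over unchanged, so a site served over HTTP on a custom port would need the port adjusted as well.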

Related

Parse host from url without subdomains etc

I'm working on a Chrome extension that reads the domain from window.location.hostname. For this extension to work properly, I need to be able to map subdomains and other URL variations to the same host. For example,
I need all of the following URLs:
www.google.com
accounts.google.com
photos.google.se
example.google.co.uk
https://google.com
All of these need to resolve to, in this case, "google", in a way that is reliable and works for any website, including those with quirky subdomain configurations.
This is my current approach, somewhat simplified:
function getSiteName() {
    var url = window.location.hostname.split("."); // returns an array of strings
    for (var i = 0; i < url.length; i++) {
        // domainregex: regex for identifying domains ".com", ".se", ".co.uk" etc.
        if (url[i].match(domainregex)) {
            return url[i - 1]; // usually what I'm after is directly before the domain, thus i-1
        }
    }
}
This approach is a lot of hassle and has proven unreliable at times. Is there a more straightforward way of doing this?
A more reliable way to strip the top-level domain part and get the main domain part is to use the Public Suffix List, which is used by Firefox, Chrome, and other browsers.
Several js parsers of the list data are available if you don't want to write your own.
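To illustrate how a suffix-list lookup works, here is a rough sketch. It assumes a tiny hand-picked sample of suffixes rather than the real list (which has thousands of entries plus wildcard and exception rules that this sketch ignores); the names SAMPLE_SUFFIXES and siteName are hypothetical:

```javascript
// Tiny illustrative sample only; NOT the real Public Suffix List.
var SAMPLE_SUFFIXES = new Set(["com", "se", "uk", "co.uk"]);

// Hypothetical helper: return the label directly before the longest known suffix.
function siteName(hostname, suffixes) {
    suffixes = suffixes || SAMPLE_SUFFIXES;
    var labels = hostname.toLowerCase().split(".");
    // Candidates are tried from longest to shortest, so "co.uk" wins over "uk".
    for (var i = 1; i < labels.length; i++) {
        var candidate = labels.slice(i).join(".");
        if (suffixes.has(candidate)) {
            return labels[i - 1];
        }
    }
    return null; // no known suffix, or the host name is itself a bare suffix
}

console.log(siteName("example.google.co.uk"));
console.log(siteName("accounts.google.com"));
```

For real use, one of the existing parsers of the full list is the safer choice, since the sample set above only covers the host names from the question.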
I had to do this for my fork of edit-my-cookies, so that it is able to switch cookie profiles per site. (https://github.com/AminaG/swap-my-cookies-multisite/blob/master/js/tools.js)
This is what I did, and it works for me. I am not sure it is a complete solution, but I am sure it can help.
var remove_sub_domain = function (v) {
    var is_co = v.match(/\.co\./);
    v = v.split('.');
    v = v.slice(is_co ? -3 : -2);
    v = v.join('.');
    console.log(v);
    return v;
};
It works for:
www.google.com
accounts.google.com
photos.google.se
example.google.co.uk
google.com
If you also want it to work for:
http://gooogle.com
You first need to remove the protocol:
var parser = document.createElement('a');
parser.href = url; // url is the full address, including the protocol
var host = parser.host;
var newurl = remove_sub_domain(host);

Replace domain in url for publishers academia login

When working outside of my university IP domain, I have to use a login server to access publishers websites.
Most URLs of the form http://pubs.domain.org/XXX.html are transformed into http://pubs.domain.org.gateway.university.edu/XXX.html
The problem is: most publishers' websites have a useless search tool, so I use Google and land on the regular website, and using Web of Knowledge outside of the university often fails to connect to the publisher. I have found that manually replacing the URL works as long as I have authenticated within the last hour.
I am looking for a way to do this automatically with a bookmarklet. I have found the following question, which seems to be what I'm looking for, but I have never used JavaScript before and have been unable to adapt it:
Bookmarklet to edit current URL
Thanks!
Try:
javascript:(function() {
    window.location.href =
        location.protocol
        + '//' + location.hostname
        + '.gateway.university.edu'
        + location.pathname
        + location.search;
})();
I've encountered this exact situation myself recently. As you want to append a suffix to the hostname, but otherwise leave the URL intact, you can edit window.location.hostname directly to trigger a reload:
javascript:window.location.hostname += '.gateway.university.edu';
or with closure:
javascript:(function() {
window.location.hostname += '.gateway.university.edu';
})();
window.location.href = window.location.href.replace("://pubs.domain.org","://pubs.domain.org.gateway.university.edu");
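The host-name rewrite that these bookmarklets perform can also be sketched as a plain function, which makes it easy to try outside the browser. The helper name viaGateway is hypothetical, and the sketch assumes the standard URL constructor is available; the guard against appending the suffix twice is an addition, not something the answers above do:

```javascript
// Hypothetical helper: append the gateway suffix to the host name of `href`.
function viaGateway(href, suffix) {
    var url = new URL(href);
    if (!url.hostname.endsWith(suffix)) { // avoid appending the suffix twice
        url.hostname = url.hostname + suffix;
    }
    return url.href; // path, query string and hash are preserved
}

console.log(viaGateway("http://pubs.domain.org/XXX.html", ".gateway.university.edu"));
```

In a bookmarklet, the result would be assigned to window.location.href, with window.location.href passed in as the first argument.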

location.host vs location.hostname and cross-browser compatibility?

Which one of these is more effective for checking whether the user agent is accessing the site via the correct domain?
We would like to show a small js based 'top bar' style warning if they are accessing the domain using some sort of web proxy (as it tends to break the js).
We were thinking about using the following:
var r = /.*domain\.com$/;
if (r.test(location.hostname)) {
// showMessage ...
}
That would take care of any subdomains we ever use.
Which should we use host or hostname?
In Firefox 5 and Chrome 12:
console.log(location.host);
console.log(location.hostname);
.. shows the same for both.
Is that because the port isn't actually in the address bar?
W3Schools says host contains the port.
Should location.host/hostname be validated or can we be pretty certain in IE6+ and all the others it will exist?
As a little memo: the interactive link anatomy
In short (assuming a location of http://example.org:8888/foo/bar#bang):
hostname gives you example.org
host gives you example.org:8888
host just includes the port number if there is one specified. If there is no port number specifically in the URL, then it returns the same as hostname. You pick whether you care to match the port number or not. See https://developer.mozilla.org/en-US/docs/Web/API/Location for more info on the window.location object and the various choices it has for matching (with or without port).
I would assume you want hostname to just get the site name.
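The difference can be checked with a URL object, which exposes the same host, hostname and port properties as location. This is only a small sketch using the example address from above; the variable names are arbitrary:

```javascript
var withPort = new URL("http://example.org:8888/foo/bar#bang");
console.log(withPort.hostname); // example.org
console.log(withPort.host);     // example.org:8888
console.log(withPort.port);     // 8888

// When the port is the default for the scheme, it is dropped during parsing,
// so host and hostname come out identical.
var defaultPort = new URL("http://example.org:80/foo/bar#bang");
console.log(defaultPort.host === defaultPort.hostname); // true
```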
If you insist on using window.location.origin, you can put this at the top of your code, before reading the origin:
if (!window.location.origin) {
    window.location.origin = window.location.protocol + "//" + window.location.hostname + (window.location.port ? ':' + window.location.port : '');
}
Solution
PS: For the record, it was actually the original question. It was already edited :)
Your primary question has been answered above. I just wanted to point out that the regex you're using has a bug. It will also succeed on foo-domain.com, which is not a subdomain of domain.com.
What you really want is this:
/(^|\.)domain\.com$/
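A quick comparison of the two patterns shows the difference. The variable names loose and strict and the helper isOnDomain are hypothetical, and the test host names are made up for illustration:

```javascript
var loose = /.*domain\.com$/;      // pattern from the question
var strict = /(^|\.)domain\.com$/; // corrected pattern

// Hypothetical helper: true for domain.com itself and for its subdomains only.
function isOnDomain(hostname) {
    return strict.test(hostname);
}

["domain.com", "www.domain.com", "foo-domain.com"].forEach(function (host) {
    console.log(host, loose.test(host), isOnDomain(host));
});
```

Only the last host differs between the two: the loose pattern accepts foo-domain.com, while the strict one requires either the start of the string or a dot before domain.com.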
MDN: https://developer.mozilla.org/en/DOM/window.location
It seems that you will get the same result for both, but hostname contains clear host name without brackets or port number.
Just to add a note: Google Chrome has an origin attribute on location, which gives you everything from the protocol to the port number.

Preventing the user from entering javascript instead of a url for use in an href

I need to stop the user putting javascript into what should be a link field. I know I could just check for "javascript:" at the start of the url they enter, but I was wondering if there was some way I could construct the <a> tag to force it to treat the href as an address? I feel like this would be a better solution, as people are always finding ways to get around basic checks.
A funny solution (and very effective if you ask me), is to put http:// in front of urls that don't already start with it. This is a sketch of what I mean:
if(url.slice(0,"http://".length) !== "http://" && url.slice(0,"https://".length) !== "https://") {
url = "http://" + url;
}
Better whitelist than blacklist and check for http(s) protocol, I'd guess.
You could always prepend the http:// or https:// protocol. May require a replace to remove any existing http or https.
Even if you have
http://javascript:alert('test');
the javascript will not run.
First you should recognize that the browser can be manipulated into submitting whatever the user wants, so client-side validation is neither necessary nor sufficient, just convenient (to the user).
Given that, an easy process comes to mind:
Enforce that every URL is absolute by requiring a protocol spec at the beginning of the URL.
Enforce that the protocol is one of {http, https}.
Try this:
function validateUrl(value) {
    return value.match(/^(http|https):\/\//) != null;
}
if (validateUrl(inputField.value)) {
    // value is acceptable
} else {
    // value is not an acceptable URL
}
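As an alternative sketch of the same whitelist idea, the value can be handed to the standard URL parser and the parsed protocol compared against the allowed schemes, instead of matching the string by hand. The helper name isHttpUrl is hypothetical, this assumes the URL constructor is available, and the same check still has to be repeated on the server:

```javascript
// Hypothetical helper: accept only absolute URLs whose scheme is http or https.
function isHttpUrl(value) {
    var url;
    try {
        url = new URL(value); // throws on relative or malformed input
    } catch (e) {
        return false;
    }
    // The parser lower-cases the scheme, so "HTTP://..." is accepted as well.
    return url.protocol === "http:" || url.protocol === "https:";
}

console.log(isHttpUrl("https://example.com/page"));
console.log(isHttpUrl("javascript:alert('test')"));
```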
There is no pure HTML way of forcing the tag to treat the href as a URL.
The only thing (I know of) that you can do is check for javascript in the href attribute.

Reading window.location after setting document.domain in IE(6)

I've got a situation where a script on www.example.com/index.html opens home.example.com/foo.html in a popup window. When the user closes the popup, I want to notify the opener page by calling a JavaScript function on it (which does a few things with the DOM). I use onbeforeunload like this:
// In index.html on www.example.com:
window.fn = function () { /* Perform stuff after foo.html has closed */ };
// In foo.html on home.example.com:
window.onbeforeunload = function () {
    if (window.opener && window.opener.fn) {
        window.opener.fn();
    }
};
This doesn't work because the web pages are on different domains. I can set the document.domain property to overcome this:
document.domain = "example.com";
Unfortunately, this doesn't play well with the web app framework I use on the foo.html side (Apache Wicket), as it includes a script which does something like this:
var src = (window.location.protocol == 'https:') ? something : other;
Apparently, in IE6*, when you set the document domain, the location object becomes write-only, and so trying to read window.location.protocol throws "Access denied".
So, my question is: How do I allow cross-domain Javascript function calls while still allowing my scripts to read the contents of the location object?
I can't go via the server. (The work performed by the function I want to call doesn't really play that way.)
I can't read the window.location.protocol property before setting document.domain and then use that value in the conditional assignment; doing so would require me to rebuild the web framework libraries - not something I want to do.
* Possibly in other versions of IE, too; haven't checked.
Can you use jQuery? There's a nice plugin that allows you to do window.postMessage through an iframe in IE 6-8: http://benalman.com/code/test/js-jquery-postmessage/
You could open your popup from the iframe and pass your object between iframe and parent with postMessage.
"I can't read the window.location.protocol property before setting document.domain and then use that value in the conditional assignment; doing so would require me to rebuild the web framework libraries - not something I want to do."
Can't you read window.location.protocol prior to setting document.domain, and then set window.location.protocol once it becomes write-only? Would that require a rebuild of the framework as well? It is a hack, but so is IE.
