Fix URL in JavaScript - javascript

My web application has an input field where users can enter a link to their website. They often enter incomplete URLs such as e-918kiss.com, and I want to fix these to https://e-918kiss.com/ automatically.
The URL can point to any domain name.
I tried using an a element to parse the URL, but it just treated the input as a path relative to the current page:
const elem = document.createElement('a');
elem.href = 'Twitter.com/mhluska';
console.log(elem.href); // resolved against the current page, not "https://twitter.com/mhluska"
I researched some URL parsing libraries, but they usually just throw errors for invalid URLs, as does the native URL API.
Is there an easy way to try to create a valid link from whatever junk the user might enter?

Regex is your friend. See "What is the best regular expression to check if a string is a valid URL?"
You need at least the TLD of the user's domain; you can't guess the TLD for random strings.
Then check whether the input value matches a valid URL and, if the protocol is missing, prepend "https://" to the string.
PS: HTTPS is the recommended protocol, but there is no guarantee that HTTPS is enabled on your users' websites.
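
A minimal sketch of that idea, assuming the native URL constructor is available (modern browsers and Node.js). The name normalizeUrl and the rule that a hostname must contain a dot are assumptions made for illustration, not part of the question:

```javascript
// Hypothetical helper: returns a normalized http(s) URL string, or null
// when the input cannot reasonably be turned into one.
function normalizeUrl(input) {
  const trimmed = input.trim();
  if (trimmed === '') return null;

  // Prepend https:// only when the input has no scheme of its own.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : 'https://' + trimmed;

  let url;
  try {
    url = new URL(candidate); // throws a TypeError on unparsable input
  } catch (err) {
    return null;
  }

  // Accept only web URLs whose hostname looks like a domain (assumption).
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;
  if (!url.hostname.includes('.')) return null;

  return url.href;
}

console.log(normalizeUrl('Twitter.com/mhluska'));
console.log(normalizeUrl('hello'));
```

As the PS above notes, prepending https:// is a guess: the sketch cannot tell whether the target site actually serves HTTPS.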

Maybe this will help:
<input type="text" id="URL" placeholder="type string..."/>
<script>
// https://stackoverflow.com/a/49849482/6525081
function isValidURL(string) {
  var res = string.match(/(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)/g);
  return res !== null;
}
document.querySelector('#URL').addEventListener('change', function() {
  // runs whenever the input value changes
  const isUrl = isValidURL(this.value);
  if (!isUrl) { // if the value does not already look like a URL,
    this.value = 'http://www.' + this.value + '.com'; // wrap it into one
  }
  // otherwise leave the value untouched
});
</script>

Related

Simplest way to check if the current url contains a subdomain

I'm looking for the simplest way to check whether the user is on the normal domain (domain.com) or on a subdomain (et.domain.com) and display content based on that. If it matters, I'm trying to do this on Shopify.
You can split the hostname on dots (.) and check the number of parts. This only works for single-label TLDs such as .com.
Note: This will not work for domains like google.co.in.
const domain = 'domain.com';
const subDomain = 'et.domain.com'
const isSubdomain = (domain) => domain.split('.').length > 2;
console.log(isSubdomain(domain));
console.log(isSubdomain(subDomain));
You can also use a regular expression.
var isSubdomain = function(url) {
url = url || 'http://www.test-domain.com'; // just for the example
var regex = new RegExp(/^([a-z]+\:\/{2})?([\w-]+\.[\w-]+\.\w+)$/);
return !!url.match(regex); // make sure it returns boolean
}
console.log(isSubdomain("example.com"));
console.log(isSubdomain("http://example.com:4000"));
console.log(isSubdomain("www.example.com:4000"));
console.log(isSubdomain("https://www.example.com"));
console.log(isSubdomain("sub.example.com"));
console.log(isSubdomain("example.co.uk")); //it doesn't work on these very specific cases
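
Both approaches above guess from the shape of the hostname, which is why multi-part TLDs such as .co.uk trip them up. When the base domain is already known, as it is for a single shop, a comparison against it avoids the guessing. A sketch under that assumption; hasSubdomain and its baseDomain parameter are hypothetical names:

```javascript
// Hypothetical helper: true when the host of `input` is a subdomain of `baseDomain`.
function hasSubdomain(input, baseDomain) {
  // The URL constructor needs a scheme, so add one when it is missing.
  const withScheme = input.includes('://') ? input : 'http://' + input;
  const hostname = new URL(withScheme).hostname; // drops port, path and query
  return hostname !== baseDomain && hostname.endsWith('.' + baseDomain);
}

console.log(hasSubdomain('domain.com', 'domain.com'));
console.log(hasSubdomain('et.domain.com', 'domain.com'));
console.log(hasSubdomain('https://www.example.co.uk:4000/path', 'example.co.uk'));
```

In the browser, the first argument would typically be window.location.hostname.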

Fortify Scan issue: Cross-Site Scripting issue when assigning a new URL

In my JavaScript code I am creating a URL to redirect my page to. It works fine, but when I run it through the Fortify Scan, it gives me the following error:
The method reloadParentTab() sends unvalidated data to a web browser on line (line number), which can result in the browser executing malicious code.
I've added a URL validator and ran the newly created URL through it, however, the error is still present.
This is what it currently looks like:
//This is reloadParentTab function, mentioned above
function reloadParentTab() {
if(window.opener && !window.opener.closed) { //checks if parent tab is present
//here we check a variable, irrelevant here, except that it decides whether we
//run window.opener.location.reload() or specify the URL, which is what the scan
//complains about
if (someTriggerValid()) {
var href = window.opener.location.href; //This is the source
if (validURL(href)) { // running the validator
//building the new URL
var newHref = href.substring(0, href.indexOf("tracking.")) +
"tracking.base.open.request.do?dataObjectKey=object.dsaidCase&trackingId=" + caseId;
//running the new URL through the validator, just to show that I tried it both ways
if (validURL(newHref)) {
//this is "Sink", that's where unvalidated data is, supposedly, sent,
//although I am validating it
window.opener.location.href = newHref;
}
}
} else {
window.opener.location.reload();
}
}
}
//And this is the validator
function validURL(str) {
var pattern = new RegExp('^(https?:\\/\\/)?'+ // protocol
'((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|'+ // domain name
'((\\d{1,3}\\.){3}\\d{1,3}))'+ // OR ip (v4) address
'(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*'+ // port and path
'(\\?[;&a-z\\d%_.~+=-]*)?'+ // query string
'(\\#[-a-z\\d_]*)?$','i'); // fragment locator
return !!pattern.test(str);
}
So my question is: what am I missing and what else do I need to do to get this resolved? Any ideas are welcome. Thank you.

Why Blogger URL Related Post Undefined

I created a JavaScript function for related posts in my Blogger template.
Here is my code:
function toHttps(link) {
var protocol=link.replace(/\:/g,'');
if(protocol=='http') {
var url=link.replace('http','https');
return link.replace(url);
}
}
If my original URL is
https://dpawoncatering.blogspot.com/2008/08/nasi-box-murah.html
Why is the result like this?
https://dpawoncatering.blogspot.com/2008/08/undefined?
Naren Murali's answer is correct. I'd just like to add a different way of doing the protocol swap, using JavaScript's own URL parser, that might be interesting for other people.
You can create an a element and assign your URL to its href property to parse it. You can then change the element's protocol property and read the resulting URL back from href:
function toHttps(link) {
var url = document.createElement('a');
url.href = link;
url.protocol = 'https';
return url.href;
}
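
The same protocol swap can also be sketched with the standard URL constructor, which does not need a DOM and therefore also runs in Node.js. The name toHttpsWithUrl is hypothetical; unlike the a element, the constructor throws on input that is not an absolute URL:

```javascript
// Hypothetical helper: swaps http for https using the standard URL parser.
function toHttpsWithUrl(link) {
  const url = new URL(link); // throws a TypeError if link is not an absolute URL
  if (url.protocol === 'http:') {
    url.protocol = 'https:';
  }
  return url.href;
}

console.log(toHttpsWithUrl('http://dpawoncatering.blogspot.com/2008/08/nasi-box-murah.html'));
console.log(toHttpsWithUrl('https://dpawoncatering.blogspot.com/2008/08/nasi-box-murah.html'));
```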
Since the URL already contains https, the code never enters the if block, so nothing is returned and you get undefined. Please check my corrected function, and let me know if you have any issues!
function toHttps(link) {
if(link.indexOf('http://') > -1){
var url=link.replace('http','https');
return url;
}
return link
}
console.log(toHttps('http://dpawoncatering.blogspot.com/2008/08/nasi-box-murah.html'))
console.log(toHttps('https://dpawoncatering.blogspot.com/2008/08/nasi-box-murah.html'))

Is there a better way to do this? (recursively resolving HTML unicode entities)

I'm parsing an untrusted URI, but its URI-hood must be honored. I'm trying to protect against javascript: links, but I feel like I need to recurse on it, since you could have:
javascriptjavascript::
and after stripping out all instances of javascript: get back our old friend javascript: once again.
My other concern is analogously-nested unicode entities. For instance, we could have:
"j&#X41vascript:alert('pwnt')"
...but we could also have:
"j&#&#X5841vascript:alert('pwnt')"
...though I seem to be doing it wrong (whereas a successful attacker obviously won't.)
function resolveEntities(uri) {
var s = document.createElement('span')
, nestTally = uri.match(/&/) ? 0 : 1
, limitReached = false;
s.innerHTML = uri;
while (s.textContent.match(/&/)) {
s.innerHTML = s.textContent;
if(nestTally++ >= 5) {
limitReached = true;
break;
}
}
return encodeURI(s.textContent);
}
Didn't you already ask almost the same question before? Anyway, my suggestion remains the same: use a proper HTML sanitizer.
The particular sanitizer I linked to strips javascript: URLs automatically, but you can also set it up to allow only certain whitelisted URL schemes like Thomas suggests. As he notes, this is a good idea, since it's much safer to only allow schemes like http and https which you know to be safe.
(In particular, whether a given obscure URL scheme is safe or not may depend not only on the user's browser, but also on their OS and on what third-party software they may have installed — a lot of programs like to register themselves as handlers for their own URL schemes.)
Rather than specifying what you want to blacklist (e.g. javascript: URIs), it's better to specify what you want to whitelist (e.g. http and https only). What about something like this:
function sanitizeUri(uri) {
if (!uri.match(/^https?:\/\//)) {
uri = "http://" + uri;
}
return uri;
}
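
The whitelist idea can also be expressed as a check on the parsed scheme instead of a string prefix. This is only a sketch, assuming the standard URL constructor is available; isAllowedUri and ALLOWED_PROTOCOLS are hypothetical names. Unlike sanitizeUri above, it rejects input rather than repairing it:

```javascript
// Schemes considered safe (assumption: only web links are wanted).
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

// Hypothetical helper: true only for absolute URLs with a whitelisted scheme.
function isAllowedUri(uri) {
  let parsed;
  try {
    parsed = new URL(uri); // the parser lowercases the scheme for us
  } catch (err) {
    return false; // not an absolute URL at all
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

console.log(isAllowedUri('https://example.com/page'));
console.log(isAllowedUri('JaVaScRiPt:alert(1)'));
```

This covers the scheme check only. HTML entities in the input still have to be decoded by whatever renders the link, which is why a proper HTML sanitizer remains the safer choice.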

JavaScript issue with matching URL

How can I add JavaScript that checks the URL a visitor is on and redirects to a certain page on the site if a match is found? For example:
The string we want to check for is mydirectory. If someone goes to example.com/mydirectory/anyfile.php or even example.com/mydirectory/index.php, JavaScript should redirect them to example.com/index.php because the URL contains mydirectory; if no match is found, it should not redirect. I'm using the code below:
var search2 = 'mydirectory';
var redirect2 = 'http://example.com/index.php'
if (document.URL.substr(search2) !== -1)
document.location = redirect2
The problem is that it always redirects, even when there is no match. Does anyone know what's going wrong, and is there a faster or better way of doing this?
Use String.indexOf() instead:
if (window.location.pathname.indexOf('searchTerm') !== -1) {
// a match was found, redirect to your new url
window.location.href = newUrl;
}
substr is not what you need in this situation; it extracts a substring out of a string. Use indexOf instead:
if(window.location.pathname.indexOf(search2) !== -1) {
window.location = redirect2;
}
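
Note that indexOf also matches partial names such as mydirectory2. If only a whole path segment should count, the path can be split on slashes first. A sketch under that assumption; pathContainsSegment is a hypothetical name:

```javascript
// Hypothetical helper: true when `segment` is a complete path segment of the URL.
function pathContainsSegment(urlString, segment) {
  const { pathname } = new URL(urlString);
  return pathname.split('/').includes(segment);
}

console.log(pathContainsSegment('http://example.com/mydirectory/anyfile.php', 'mydirectory'));
console.log(pathContainsSegment('http://example.com/mydirectory2/index.php', 'mydirectory'));

// In the browser, the redirect itself would look like:
// if (pathContainsSegment(window.location.href, 'mydirectory')) {
//   window.location.href = 'http://example.com/index.php';
// }
```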
If possible it's better to do this redirect on the server side. It will always work, be more search engine friendly and faster. If your users have JavaScript disabled, they won't get redirected.
