Why does the javascript: protocol decode the URL automatically? - javascript

I am confused about why the javascript: protocol decodes an encoded URL, for example:
press
function myFunction(id)
{
alert(id); //it will generate =cDO4w67epn64o76
}
I am using these strings in encryption and decryption.
Please give me the actual reason and a solution (the reason is very important to me). I know I can replace the = sign, but I am afraid the rest of the encoded string will also be decoded by the wrapper.
Note: in PHP, the values in the $_GET and $_REQUEST global variables are URL-decoded automatically.

Because it's in an href attribute, where URLs are expected, so the browser is "normalizing" the URI-encoding of the "URL" (which is using the javascript pseudo-scheme).
You can put it in a different attribute and then get that, like so:
function myFunction(element) {
console.log(element.getAttribute("data-value")); //it will generate =cDO4w67epn64o76
}
press
...although I discourage using onclick="..." handlers. Instead:
function linkHandler(e) {
console.log(this.getAttribute("data-value"));
e.preventDefault();
}
var links = document.querySelectorAll("a[data-value]");
Array.prototype.forEach.call(
links,
function(link) {
link.addEventListener("click", linkHandler, false);
}
);
press

A URL using the javascript: scheme is still a URL.
You've attempted to store a URL in a JavaScript string in a URL.
When decoding the outside URL into JavaScript, the percent encoded characters are decoded.
To do what you are attempting you need to convert any special characters (like %) in the JavaScript to URL encoding:
test
You should only use this for creating bookmarklets though.
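A minimal sketch of that encoding step, runnable outside the browser. It assumes the handler is called myFunction as in the question; buildJavascriptHref is a hypothetical helper name, and decodeURIComponent is only a stand-in for the percent-decoding the browser applies to the href before it runs the script.

```javascript
// Hypothetical helper: percent-encode a JavaScript snippet so that it
// survives the decoding applied to a javascript: URL.
function buildJavascriptHref(code) {
  return "javascript:" + encodeURIComponent(code);
}

var code = "myFunction('%3DcDO4w67epn64o76')";
var href = buildJavascriptHref(code);

// Approximate what the browser does: percent-decode the part after the scheme.
var executed = decodeURIComponent(href.slice("javascript:".length));

console.log(href);              // javascript:myFunction('%253DcDO4w67epn64o76')
console.log(executed === code); // true: the literal %3D reaches myFunction
```

The % in the original string is written as %25 in the href, so one round of decoding gives back the string that was intended.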
If you want to run JavaScript when something is clicked, then use a click event handler. You could use an onclick attribute, but addEventListener is the modern approach (for values of modern equal to "not the 1990s").
Likewise, if you aren't linking somewhere, don't use a link. Use a button instead.

Related

What is the right way to safely and accurately insert user-provided URL data into an HTML5 document?

Given an arbitrary customer input in a web form for a URL, I want to generate a new HTML document containing that URL within an href. My question is how am I supposed to protect that URL within my HTML.
What should be rendered into the HTML for the following URLs that are entered by an unknown end user:
http://example.com/?file=some_19%affordable.txt
http://example.com/url?source=web&last="f o o"&bar=<
https://www.google.com/url?source=web&sqi=2&url=https%3A%2F%2Ftwitter.com%2F%3Flang%3Den&last=%22foo%22
If we assume that the URLs are already uri-encoded, which I think is reasonable if they are copying it from a URL bar, then simply passing it to attr() produces a valid URL and document that passes the Nu HTML checker at validator.w3.org/nu.
To see it in action, we set up a JS fiddle at https://jsfiddle.net/kamelkev/w8ygpcsz/2/ where replacing the URLs in there with the examples above can show what is happening.
For future reference, this consists of an HTML snippet
<a>My Link</a>
and this JS:
$(document).ready(function() {
$('a').attr('href', 'http://example.com/request.html?data=>');
$('a').attr('href2', 'http://example.com/request.html?data=<');
alert($('a').get(0).outerHTML);
});
So with URL 1, it is not possible to tell if it is URI encoded or not by looking at it mechanically. You can surmise based on your human knowledge that it is not, and is referring to a file named some_19%affordable.txt. When run through the fiddle, it produces
My Link
Which passes the HTML5 validator no problem. It likely is not what the user intended though.
The second URL is clearly not URI encoded. The question becomes what is the right thing to put into the HTML to prevent HTML parsing problems.
Running it thru the fiddle, Safari 10 produces this:
My Link
and pretty much every other browser produces this:
My Link
Neither of these passes the validator. Three complaints are possible: the literal double quote (from un-escaping HTML), the spaces, or the trailing < character (also from un-escaping HTML). It just shows you the first of these it finds. This is clearly not valid HTML.
There are two ways to try to fix this. The first is to HTML-escape the URL before giving it to attr(). This, however, results in every & becoming &amp;, and entities such as &amp; and &lt; become double-escaped by attr(), so the URL in the document is entirely inaccurate. It looks like this:
My Link
The other is to URI-encode it before passing to attr(), which does result in a proper validating URL which actually clicks to the intended destination. It looks like this:
My Link
Finally, for the third URL, which is properly URI encoded, the proper HTML that validates does come out.
My Link
and it does what the user would expect to happen when clicked.
Based on this, the algorithm should be:
if url is encoded then
pass as-is to attr()
else
pass encodeURI(url) to attr()
however, the "is encoded" test seems to be impossible to detect in the affirmative based on these two prior discussions (indeed, see example URL 1):
How to find out if string has already been URL encoded?
How to know if a URL is decoded/encoded?
If we bypass the attr() method and forcibly insert the HTML-escaped version of example URL 2 into the document structure, it would look like this:
My Link
Which seemingly looks like valid HTML, yet fails the HTML5 validator because it unescapes to have invalid URL characters. The browsers, however, don't seem to mind it. Unfortunately, if you do any other manipulation of the object, the browser will re-escape all the &'s anyway.
As you can see, this is all very confusing. This is the first time we're using the browser itself to generate the HTML, and we are not sure if we are getting it right. Previously, we did it server side using templates, and only did the HTML-escape filter.
What is the right way to safely and accurately insert user-provided URL data into an HTML5 document (using JavaScript)?
If you can assume the URL is either encoded or not encoded, you may be able to get away with something along the lines of this. Try to decode the URL, treat an error as the URL not being encoded and you should be left with a decoded URL.
<script>
var inputurl = 'http://example.com/?file=some_19%affordable.txt';
var myurl;
try {
myurl = decodeURI(inputurl);
}
catch(error) {
myurl = inputurl;
}
console.log(myurl);
</script>
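As an alternative to guessing whether the input is already encoded, one option is to let the standard URL parser normalize it. This is a different technique from the decodeURI approach above, not the answerer's method: the URL constructor leaves existing percent escapes alone and percent-encodes characters such as spaces, double quotes, and < in the query string. normalizeUrl is a hypothetical name, and rejecting anything other than http and https is an added assumption.

```javascript
// Hypothetical helper: returns a normalized absolute URL, or null when the
// input cannot be parsed or uses a scheme other than http/https.
function normalizeUrl(input) {
  var url;
  try {
    url = new URL(input); // throws a TypeError on unparseable input
  } catch (error) {
    return null;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return null; // assumption: refuse javascript:, data:, and similar schemes
  }
  return url.href;
}

var unencoded = 'http://example.com/url?source=web&last="f o o"&bar=<';
var encoded = "https://www.google.com/url?source=web&sqi=2&url=https%3A%2F%2Ftwitter.com%2F%3Flang%3Den&last=%22foo%22";

console.log(normalizeUrl(unencoded));             // spaces, quotes and < are percent-encoded
console.log(normalizeUrl(encoded) === encoded);   // true: existing escapes are kept as-is
console.log(normalizeUrl("javascript:alert(1)")); // null
```

The result can then be handed to attr('href', ...). The ambiguous first example (some_19%affordable.txt) passes through unchanged, so the question of what the user actually meant remains open.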

Javascript regex to replace ampersand in all links href on a page

I've been going through and trying to find an answer to this question that fits my need, but either I'm too much of a beginner to make other use cases work, or they're not specific enough for my case.
Basically I want to use JavaScript/jQuery to replace any and all ampersands (&) on a web page that may occur in a link's href with just the word "and". I've tried a couple of different versions of this with no luck:
var link = $("a").attr('href');
link.replace(/&/g, "and");
Thank you
Your current code reads the href of the first matched element and calls replace() on that string, but the result is discarded and the element(s) in the DOM are never updated.
You can instead achieve what you need by providing a function to attr() which will be executed against all elements in the matched set. Try this:
$("a").attr('href', function(i, value) {
return value.replace(/&/g, "and");
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
link
link
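The reason the original attempt appears to do nothing can be seen without any DOM at all: JavaScript strings are immutable, so replace() returns a new string rather than changing the one it is called on. A small sketch, with a made-up href value:

```javascript
var href = "page.html?a=1&b=2"; // example value, not taken from the question

href.replace(/&/g, "and");      // the return value is thrown away
console.log(href);              // page.html?a=1&b=2 (unchanged)

var updated = href.replace(/&/g, "and");
console.log(updated);           // page.html?a=1andb=2
```

The function form of attr() shown above works because jQuery writes each returned string back to its element.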
Sometimes when replacing &amp;, I've found that even though I replaced the &, I still have amp; left over. There is a fix for this:
var newUrl = "@Model.UrlToRedirect".replace(/&/gi, '%').replace(/%amp;/gi, '&');
With this solution you replace & twice and it will work. In my particular problem in an MVC app, window.location.href = @Model.UrlToRedirect, the URL was already partially encoded and had a query string. I tried encoding/decoding, using Uri as the C# class, escape(), everything before coming up with this solution. The problem with using my above logic is that other things could blow up the query string later. One solution is to put a hidden field or input on the form like this:
<input type="hidden" value="@Model.UrlToRedirect" id="url-redirect" />
Then in your JavaScript:
window.location.href = document.getElementById("url-redirect").value;
This way, JavaScript won't take the C# string and change it.

encodeURIComponent() adds too many characters

Either my encodeURIComponent() in JavaScript is adding too many characters or I don't understand exactly how it works.
I am using this line of code:
var encoded = encodeURIComponent(searchTerm);
When I look in the chrome inspect element after passing Abt 12 it shows the encoded variable added to the URL as this:
Abt%252012
I would think it should be this:
Abt%12
So when I pass it through PHP I get really odd results when actually conducting the search.
From the comments, it looks like you are sending the value to the server via a jQuery ajax request. jQuery takes care of parameter encoding, so there is no need for you to encode it yourself.
$.get("website.php", { p: searchTerm, })
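The extra characters are what double encoding looks like. A short sketch, in which the second encodeURIComponent call stands in for the encoding that jQuery applies to the data object:

```javascript
var searchTerm = "Abt 12";

var once = encodeURIComponent(searchTerm); // the space becomes %20
var twice = encodeURIComponent(once);      // the % of %20 becomes %25

console.log(once);  // Abt%2012
console.log(twice); // Abt%252012
```

So Abt%2012 is the correctly encoded value, and Abt%252012 is that value encoded a second time.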

quick Jquery .load chat not working

I have the following jquery:
var msg = $("#newmessage").val();
var user = $("#userchat").val();
var filename = "/inc/chat.php?msg="+msg+"&user="+user;
alert(filename);
$("#chatData").load(filename);
When 'msg' does not have a space in it, #chatData loads fine and posts the variable.
When it does have a space in it, I just get a blank div, with no information in it whatsoever.
If I load up the PHP file that inserts the data into the DB and manually type the same GET data, it works fine.
What's going on?
Try using
encodeURIComponent(msg)
Also consider:
$("#chatData").load('/inc/chat.php',
{ 'msg' : $("#newmessage").val(), 'user' : $("#userchat").val() }
);
URI encoding is done, if needed, by jQuery.
You don't have to worry about URI encoding as the POST method is used since data is provided as an object (source).
In this case POST may be better than GET anyways.
If you were using $_GET in your php file you will need to use either $_REQUEST or $_POST.
You have to encode your message before sending it, using encodeURIComponent(). On the server, PHP has already decoded the values in $_GET, so an extra urldecode() is not needed there.
Doing this will escape/encode special characters that aren't allowed in a URL or that would otherwise break your query string (like an & in your message, which would otherwise start a new argument).
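A minimal sketch of building that query string by hand. buildChatUrl is a hypothetical helper and the message and user name are made-up sample values; URLSearchParams is used only to show what the server would see after decoding.

```javascript
// Hypothetical helper: encode each value separately, then join with & and =.
function buildChatUrl(msg, user) {
  return "/inc/chat.php?msg=" + encodeURIComponent(msg) +
         "&user=" + encodeURIComponent(user);
}

var url = buildChatUrl("fish & chips today", "alice");
console.log(url); // /inc/chat.php?msg=fish%20%26%20chips%20today&user=alice

// Parse the query string back to check that both values survive intact.
var params = new URLSearchParams(url.split("?")[1]);
console.log(params.get("msg"));  // fish & chips today
console.log(params.get("user")); // alice
```

Without the encoding, the & inside the message would be read as the start of a new parameter.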
You can use escape, encodeURI or encodeURIComponent, but escape is deprecated and does not encode non-ASCII characters in a URL-compatible way. encodeURIComponent is supported by every current browser and is the one to use for individual query-string values.
Reference
Take a look at this document, which does a good job of explaining all three.
The space could be causing an issue - try javascript's encodeURIComponent():
var msg = encodeURIComponent($("#newmessage").val());
var user = encodeURIComponent($("#userchat").val());

Making a URL W3C valid AND work in Ajax Request

I have a generic function that returns URLs. (It's a plugin function that returns URLs to resources [images, stylesheets] within a plugin).
I use GET parameters in those URLs.
If I want to use these URLs within an HTML page, to pass W3C validation, I need to mask ampersands as &amp;
/plugin.php?plugin=xyz&amp;resource=stylesheet&amp;....
but if I want to use the URL as the "url" parameter for an AJAX call, the ampersand is not interpreted correctly, screwing up my calls.
Can I do something to get &amp; to work in AJAX calls?
I would very much like to avoid adding parameters to the URL-generating function (intendedUse="ajax" or whatever) or manipulating the URL in JavaScript, as this plugin model will be re-used many times (and possibly by many people) and I want it as simple as possible.
It seems to me that you're running into the problem of having one piece of your application cross multiple layers. In this case it's the plugin.
A URL query string conventionally uses & to separate key/value pairs from one another (the convention comes from the form encoding used by HTML forms rather than from RFC 1738 itself). However, the ampersand is a reserved character in HTML and therefore should be escaped as &amp;. Since escaping the ampersands is an artifact of HTML, your plugin should probably not be escaping them directly. Instead, you should have a function that escapes a canonical URL so that it can be embedded in HTML markup.
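A minimal sketch of that separation, assuming the plugin returns the canonical (unescaped) URL and a separate, hypothetical helper named escapeHtmlAttribute is applied only at the point where the URL is written into markup:

```javascript
// Hypothetical helper: escape a value for use inside a double-quoted
// HTML attribute. The & must be replaced first to avoid double-escaping.
function escapeHtmlAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// The canonical URL, as it would be used directly in an AJAX call.
var canonical = "/plugin.php?plugin=xyz&resource=stylesheet";

// The same URL embedded in markup.
var markup = '<link rel="stylesheet" href="' + escapeHtmlAttribute(canonical) + '">';

console.log(canonical); // /plugin.php?plugin=xyz&resource=stylesheet
console.log(markup);    // <link rel="stylesheet" href="/plugin.php?plugin=xyz&amp;resource=stylesheet">
```

With this split, the AJAX code never sees &amp;, and the HTML output still validates.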
The only place that this is likely to actually happen is if you are:
Using XHTML
Serving it as text/html
Using inline <script>
This is not a happy combination, and the solution is in the spec.
Use external scripts if your script
uses < or & or ]]> or --.
The XHTML media types note includes the same advice, but also provides a workaround if you choose to ignore it.
Try returning JSON instead of just a string; that way your JavaScript can read the URL value as an object, and you shouldn't have that issue. Other than that, try simply HTML-decoding the string, using something like:
function unescapeHTML (str)
{
// The content of a textarea is parsed as plain text, so entities such
// as &amp; are decoded but tags in str are not interpreted as markup.
var textarea = document.createElement('textarea');
textarea.innerHTML = str;
return textarea.value;
};
Obviously you'll want to make sure you remove any reference to DOM elements you might create (which I've not done here to simplify the example).
I use this technique in the AJAX sites I create at my work and have used it many times to solve this problem.
When you have markup of the form:
<a href="?a=1&amp;b=2">
Then the value of the href attribute is ?a=1&b=2. The &amp; is only an escape sequence in HTML/XML and doesn't affect the value of the attribute. This is similar to:
<a href="&lt;&gt;">
Where the value of the attribute is <>.
If, instead, you have code of the form:
<script>
var s = "?a=1&b=2";
</script>
Then you can use a JavaScript function:
<script>
var amp = String.fromCharCode(38);
var s = "?a=1"+amp+"b=2";
</script>
This allows code that would otherwise only be valid HTML or only valid XHTML to be valid in both. (See Dorwald's comments for more info.)
