I have the weirdest thing happening to some integers in my front-end.
JavaScript, true to its legacy, decides to decrement integers if I decide to add them to a string. Not all of them, mind you, only some, because if it were all of them that would be too simple.
Here is an example.
var jid = 10152687447723705;
$('#slide_2_inner').html("<img src='//graph.facebook.com/"+jid+"/picture?width=80&height=80' width='80' height='80' />");
This code fetches the Facebook profile picture of user with ID 10152687447723705 (a friend of mine) and displays it inside the slide_2_inner div. Or at least it's supposed to, because when the HTML shows up, the image location is wrong. Why is it wrong? Because instead of //graph.facebook.com/10152687447723705/picture?width=80&height=80 it's //graph.facebook.com/10152687447723704/picture?width=80&height=80.
So, you see, 10152687447723705 got turned into 10152687447723704, somehow.
When I do this instead...
var jid = "10152687447723705";
$('#slide_2_inner').html("<img src='//graph.facebook.com/"+jid+"/picture?width=80&height=80' width='80' height='80' />");
...the URL is created correctly and the image shows up.
But wait, it gets worse. Let's try the first way again, but with a different Facebook ID:
var jid = 1593894704165626;
$('#slide_2_inner').html("<img src='//graph.facebook.com/"+jid+"/picture?width=80&height=80' width='80' height='80' />");
Guess what? It works! The image shows up.
So, the question is, why is this craziness happening and how do I stop it from happening?
Could it have to do with the fact that one number is odd and the other is even? I don't have enough data to confirm the pattern.
In the example that doesn't work, doing jid = jid.toString() doesn't change anything.
The main thing for me is understanding the root cause. I'm sure I could find some hacky solution, put the integer through some function to turn it into something that will work, but I'd prefer to get to the bottom of this properly.
The maximum integer value that is safe to use in JS is 2^53 - 1 (Number.MAX_SAFE_INTEGER). Integer values higher than that are prone to precision errors.
Facebook IDs can be larger than that, so they have to be represented as strings. That should not be a problem, since one does not use IDs for mathematical computations anyway.
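A minimal sketch of how to check this in a console, using the two IDs from the question (the variable names are only for illustration):

```javascript
// The first Facebook ID from the question, as a numeric literal and as a string.
const asNumber = 10152687447723705; // above Number.MAX_SAFE_INTEGER
const asString = "10152687447723705";

console.log(Number.MAX_SAFE_INTEGER);        // 9007199254740991 (2^53 - 1)
console.log(Number.isSafeInteger(asNumber)); // false
console.log(String(asNumber));               // 10152687447723704, as observed in the question
console.log(asString);                       // 10152687447723705, unchanged

// The second ID from the question is below the limit, so it survives as a number.
console.log(Number.isSafeInteger(1593894704165626)); // true
```

This is also why jid.toString() made no difference: the precision is already lost when the numeric literal is parsed, before any string conversion happens.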
Take this very simple example HTML:
<html>
<body>This is okay &amp; fine, but the encoding of this <a href="http://example.com?a=1&b=2">link</a> seems wrong.</body>
</html>
On examining document.body.innerHTML (e.g. in the browser's JS console, in JS itself, etc.), this is the value I see:
This is okay &amp; fine, but the encoding of this <a href="http://example.com?a=1&amp;b=2">link</a> seems wrong.
This behaviour is the same across browsers but I can't understand it, it seems wrong.
Specifically, the link in the original document is to http://example.com?a=1&b=2, whereas if the value of innerHTML is treated as HTML then it links to http://example.com?a=1&amp;b=2, which is NOT the same (e.g. if I created a new document which actually had innerHTML as its inner HTML, and I clicked on the link, then the browser would be sent to a materially different URL as far as I can see).
(EDIT #3: I'm wrong about the above. Firstly, yes, those two URLs are different; but secondly, the innerHTML which I thought was wrong is right, and it correctly represents the first URL, not the second! See the end of my own answer below.)
This is different from the issue discussed in the question "innerHTML gives me & as &amp; !". In my case (which is the opposite of the case in that question) the original HTML is correct and it looks to me as if it is the innerHTML which is wrong (i.e. because it is HTML which does not represent what the original HTML represented).
(EDIT #2: I was wrong about this, too: it's not really different. But I think it is not widely known that &amp; is the correct way to represent & inside an href, not just within body text. Once you realise that, you can see that these are really the same issue.)
Can anyone explain this?
(EDIT #1+4: This only occurred to me a bit late, after writing my original question, but: "is &amp; actually correct within the href text, and & technically incorrect?" As I said when I first wrote those words, that "seems very unlikely! I've certainly never seen HTML written that way." But however 'unlikely' it seemed, that is the case, and it is the main part of what I wasn't understanding!)
Also related, and it would be useful: can anyone explain how to cleanly get HTML which does correctly represent the target of document links? You definitely can't just un-encode all HTML character references within innerHTML, because (as shown in the example I've used, and also as discussed in "innerHTML gives me & as &amp; !") the ones in the main run of text should be encoded, and just un-encoding everything would make these wrong.
I originally thought this was not a duplicate of "innerHTML gives me & as &amp; !" (as discussed above; and in a way it still isn't, if it's agreed that it's not as obvious or widely known that the same issues apply inside href as in body text). It's still definitely not a duplicate of "A href in innerHTML" (which somewhat unclearly asks about how to set innerHTML using JS).
Most browser tools don't show the actual HTML because it wouldn't be of much help:
HTML is often generated dynamically after page load with the help of CSS and JavaScript.
HTML is often broken and the browser needs to repair it in order to generate the memory representation needed for rendering and other stuff.
So the HTML you see is not the actual source; it's generated on the fly from the current state of the document, which of course includes all the fixes applied (in your case, to the invalid HTML entities).
The following example hopefully illustrates all the combinations:
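const section = document.querySelector("section");
// Raw "&" in the href: invalid markup that the parser repairs.
const invalid = document.createElement("p");
invalid.innerHTML = '<a href="http://example.com/?a=1&b=2">Invalid HTML (dynamic)</a>';
// "&amp;" in the href: valid markup.
const valid = document.createElement("p");
valid.innerHTML = '<a href="http://example.com/?a=1&amp;b=2">Valid HTML (dynamic)</a>';
section.appendChild(valid);
section.appendChild(invalid);
// Serialised HTML of every paragraph, as regenerated by the browser.
const paragraphs = document.querySelectorAll("p");
for (const p of paragraphs) {
  console.log(p.innerHTML);
}
// Decoded attribute values, i.e. the actual URLs.
const links = document.querySelectorAll("a");
for (const a of links) {
  console.log(a.getAttribute("href"));
}
<section>
<p><a href="http://example.com/?a=1&b=2">Invalid HTML (static)</a></p>
<p><a href="http://example.com/?a=1&amp;b=2">Valid HTML (static)</a></p>
</section>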
Is &amp; actually correct within the href text, and & technically incorrect? It seems very unlikely! I've certainly never seen HTML written that way.
There's no such thing as "technically correct", let alone today when HTML is pretty well standardised. (Well, yes, there're two competing standards bodies and specs are continuously evolving, but the basics were set up long ago.)
The & symbol starts a character entity and &b is an invalid character entity. Period.
But it works! Doesn't that mean it's technically correct?
It works because browsers are explicitly designed to deal with completely broken markup, what's known as tag soup, because it was thought that it would ease usage:
<p><strong>Hello, World!</u>
<body><br itspartytime="yeah">
<pink>It works!!!</red>
But HTML entities are just an encoding artefact. That doesn't mean that URLs are not allowed to contain literal ampersands, it just means that, when in an HTML context, they need to be represented as &amp;. It's the same as when you type a backslash in a JavaScript string to escape some quotes: the backslash does not become part of your data.
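As a rough illustration of that encoding step, here is a minimal sketch that builds a link from a raw URL. The helper name escapeAttribute is hypothetical, and the sketch assumes the attribute value is wrapped in double quotes:

```javascript
// Hypothetical helper: encode a raw string for use inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return value
    .replace(/&/g, "&amp;") // must run first, so the entities added below are not escaped again
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const url = "http://example.com/?a=1&b=2"; // the data: contains a literal ampersand
const html = '<a href="' + escapeAttribute(url) + '">link</a>';
console.log(html); // <a href="http://example.com/?a=1&amp;b=2">link</a>
```

When a browser parses that markup, the href attribute value is decoded back to the original URL, so the entity never becomes part of the address.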
Having thought up a possible (but I thought 'unlikely') explanation - which I put in as an edit in the original question - I've realised that it is the answer:
Using a bare & to represent & inside an href is technically incorrect, and &amp; is technically correct.
I gathered this initially from this SO answer https://stackoverflow.com/a/16168585/795690, and I think it is relevant that (as it also says in that answer) the idea that &amp; is the correct way to represent & in an href is not as widely understood as the idea that &amp; is the correct way to represent & in body text.
Once you do understand this, it makes sense that what the browser is doing is right, and that the innerHTML value which comes back represents the link correctly.
EDIT:
@ÁlvaroGonzález gives a much longer answer, and it took me a while to see how everything he says applies, so I thought I'd try to explain what I didn't understand, starting from where I started from, in case it helps someone else!
If you start with raw HTML with <a href="http://example.com/?a=1&b=1"> and then you inspect the DOM in the browser, or look at the value of the href attribute in JS then you see "http://example.com/?a=1&b=1" everywhere. So it looks as if nothing has changed, and nothing was wrong. What I didn't understand is that actually the browser has parsed a technically incorrect href (with invalid entities) to be able to display this to you! (Yes, LOTS of people use this 'broken' format!)
To see this first hand, load this longer HTML example into your browser:
<html>
<body style="font-family: sans-serif">
<p>Now & then http://example.com/?a=1&b=2</p>
<p>Now & then http://example.com/?a=1&b=2</p>
<p>Now & then http://example.com/?a=1&b=2</p>
</body>
</html>
Then, in your JavaScript console, try running this code taken from @ÁlvaroGonzález's answer:
const paragraphs = document.querySelectorAll("p");
for (const p of paragraphs) {
  console.log(p.innerHTML);
}
const links = document.querySelectorAll("a");
for (const a of links) {
  console.log(a.getAttribute("href"));
}
Also try clicking on the links to see where they go.
Once you've made sense of everything that you see there, it is no longer surprising how innerHTML works!
I'm trying to grasp JavaScript DOM-based injection attacks better, so I would appreciate some input on this.
Burp Suite reports this finding with "firm" confidence, which suggests there should be something here.
So the main page loads a .js file with the code below.
Data is read from document.location and passed to eval() via the following statements:
var _9f=document.location.toString();
var _a0=_9f.split("?",2);
var _a1=_a0[1];
var _a2=_a1.split("&");
var sp=_a2[x].split("=");
djConfig[opt]=eval(sp[1]);
If I understand this correctly, it gets the content after '?' in the url, then splits the parameters after '=' and then evals the second array of that. So www.domain.tld?first=nothing&second=payload, is that correct?
Given that it's already inside a JS file, I'd assume I don't need the <script> tags in the payload? I really can't get it to fire anything, so I'm obviously doing it wrong. I would appreciate some input to understand this better; not just a code snippet, some explanation would be great.
...it gets the content after '?' in the url, then splits the parameters after '=' and then evals the second array of that...
Almost. It gets the part of the string after the first ?, splits that into an array of parameters (by splitting on &), then gets the value of the xth parameter (the one at index x), splits it to get its value, and evals that.
This means the page executes code entered into it via the query string, which means Mary can give Joe a URL with code in it that will then execute within the page when Joe opens it, which is a potential security risk for Joe.
Say x is 2. This URL would show an alert: http://example.com/?a=1&b=2&c=alert(42)
var x = 2;
var _9f="http://example.com/?a=1&b=2&c=alert(42)";
var _a0=_9f.split("?",2);
var _a1=_a0[1];
var _a2=_a1.split("&");
var sp=_a2[x].split("=");
/*djConfig[opt]=*/eval(sp[1]);
Here's an example on JSBin: https://output.jsbin.com/cibusixeqe?a=1&b=2&c=alert(42)
How big a risk it is depends on what page this code is in.
Since the code doesn't use decodeURIComponent there are limits on what the code in the query string can be, though they can probably be worked around...
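For comparison, here is a hedged sketch of reading the same parameter without executing it. The helper names readParam and parseConfigValue are hypothetical, and the sketch assumes the values are meant to be plain data (numbers, booleans, strings), which may not match what the original page expects. Unlike the original code, it keeps everything after the first "=" in the value and decodes it.

```javascript
// Hypothetical helper: return the value of the parameter at index x
// from a URL's query string, decoded but never executed.
function readParam(url, x) {
  const query = url.split("?", 2)[1];
  if (query === undefined) return undefined;
  const pair = query.split("&")[x];
  if (pair === undefined) return undefined;
  const eq = pair.indexOf("=");
  const raw = eq === -1 ? "" : pair.slice(eq + 1);
  return decodeURIComponent(raw);
}

// Interpret the text as JSON data only; anything that is not valid JSON,
// such as alert(42), is rejected instead of being run.
function parseConfigValue(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return undefined;
  }
}

const url = "http://example.com/?a=1&b=2&c=alert(42)";
console.log(readParam(url, 2));                   // alert(42), as a plain string
console.log(parseConfigValue(readParam(url, 2))); // undefined
console.log(parseConfigValue(readParam(url, 1))); // 2
```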
I have the following javascript which works fine for the most part. It gets the user that has logged in to the site and returns their DOMAIN\username info. The problem arises when the username starts with a letter that completes a valid escape character (eg. DOMAIN\fname). The \f gets interpolated and all kinds of terrible things happen. I have tried every sort of workaround to try and replace and/or escape/encode the '\'. The problem is the \f does not seem like it is available to process/match against. The string that gets operated on is 'DOMAINname'
// DOMAIN\myusername - this works fine
// DOMAIN\fusername - fails
var userName='<%= Request.ServerVariables("LOGON_USER")%>';
userName = userName.replace('DOMAIN','');
alert("Username: " + userName);
I also see all kinds of weird behaviour if I try to do a workaround using the userName variable, I think this may be because it contains a hidden \f. I've searched high and low for a solution, can't find a thing. Tried to find out if I could remove the DOMAIN\ on the serverside but that doesn't seem available either. Does anyone have a solution or workaround? In the debugger, the initial value of the servervariable is correct but the next immediate call to that variable is wrong. So the interpolated values in the debugger look like this:
var userName='DOMAIN\fusername';
userName; // 'DOMAINusername' in debugger.
Thanks
If you're using ASP.NET (as it looks like you are), use AjaxHelper.JavaScriptStringEncode or HttpUtility.JavaScriptStringEncode to output the string correctly, so that the backslash reaches the browser already escaped.
var userName='<%= HttpUtility.JavaScriptStringEncode(Request.ServerVariables("LOGON_USER"))%>';
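To see why no client-side replace can fix this, here is a small sketch of what the browser actually receives in each case (the literals stand in for the server output and are only illustrative):

```javascript
// In a JavaScript string literal, \f is a form feed (U+000C), not a backslash followed by "f".
const broken = 'DOMAIN\fusername';
console.log(broken.length);         // 15, not 16
console.log(broken.charCodeAt(6));  // 12 (form feed)
console.log(broken.includes("\\")); // false: there is no backslash left to match

// What the server needs to emit instead: the backslash itself escaped.
const fixed = 'DOMAIN\\fusername';
console.log(fixed);                         // DOMAIN\fusername
console.log(fixed.replace('DOMAIN\\', '')); // fusername
```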
This is by far the strangest error I've ever seen.
In my program I have one variable called avgVolMix. It's a decimal variable, and is not NaN (console.log(avgVolMix) prints something like 0.3526246 to console). However, using the variable at all in an assignment statement causes it to corrupt whatever is trying to use it to NaN. Example:
console.log(avgVolMix); // prints a working decimal
var moveRatio = 10 + avgVolMix * 10;
console.log(moveRatio); // prints NaN
I seriously have no idea why this is happening. I've tried everything to fix it; I've converted it to a string and then back, rounded it to 2 decimal places, adding 0.0001 to it - nothing works! This is the only way I can get it "working" right now:
var temp = 0.0;
for(i = 0; i <= avgVolMix; i+=0.1)
temp = i;
This assigns a number that is close to avgVolMix to temp. However, as you can see, it's extremely bad programming. I should also note that this isn't just broken with this one variable, every variable that's associated with a library I'm using does this (I'm working on a music visualizer). Does anyone know why this might be happening?
Edit: I'm not actually able to access the code right now to test any of this stuff, and since this is a company project I'm not comfortable opening up a jsfiddle anyway. I was just wondering if anyone's ever experienced something like this. I can tell you that I got the library in question from here: http://gskinner.com/blog/archives/2011/03/music-visualizer-in-html5-js-with-source-code.html
If it's showing the variable's value as NaN, try converting the variable with parseInt(). Note that for a fractional value like this one, parseFloat() keeps the decimals, whereas parseInt() truncates it to 0. Hope it works; I faced a similar problem and this solved it.
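Without the real code it is hard to say what avgVolMix actually is, so the following is only a sketch of how one might check. The helper inspect and the sample values are hypothetical; the idea is to look at the type before doing arithmetic, since console.log can print something that looks like a number but is not one.

```javascript
// Hypothetical diagnostic helper: report what a value really is before doing arithmetic with it.
function inspect(value) {
  return {
    type: typeof value,
    isArray: Array.isArray(value),
    asNumber: Number(value),
  };
}

console.log(inspect(0.3526246));            // type "number", asNumber 0.3526246
console.log(inspect("0.3526246"));          // type "string", asNumber 0.3526246
console.log(inspect({ value: 0.3526246 })); // type "object", asNumber NaN

// parseInt drops the fractional part; parseFloat keeps it.
console.log(parseInt("0.3526246", 10)); // 0
console.log(parseFloat("0.3526246"));   // 0.3526246
```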
I'm using document.location.hash to preserve state on the page, and I'm putting url-encoded key value pairs up there, separated by "&" chars. So far so good.
However I'm running into an annoying problem on Firefox -- Firefox will quietly url-decode the hash value on the way in, so when you get it out later it's been decoded.
I can patch the problem by detecting when I'm running on firefox and calling encodeURIComponent on everything twice on the way in, but obviously that is hideous and I don't really want to do that.
Here's a simple example, where I encode "=" as "%3D", put it in the hash, and when I get it out later it's been turned back into "=" automatically:
// on the way in::
document.location.hash = "foo=" + encodeURIComponent("noisy=input");
//then later.....
// on the way out:
var hash = document.location.hash;
kvPair = hash.split("=");
if (kvPair.length==2) {
console.log("that is correct.")
} else if (kvPair.length==3) {
console.log("oh hai firefox, this is incorrect")
}
I have my fingers crossed that there's maybe some hidden DOM element that firefox creates that represents the actual (un-decoded) hash value?
but bottom line -- has anyone run into this and found a better solution than just doing browser detection and calling encodeURIComponent twice on Firefox?
NOTE: several other questions I think have the same root cause. Most notably this one:
https://stackoverflow.com/questions/4834609/malformed-uri-in-firefox-not-ie-using-encodeuricomponenet-and-setting-hash
I would strongly advise against using the hash value to preserve state. The hash is supposed to point to an object's fragment-id, as explained in RFC 1630:
This represents a part of, fragment of, or a sub-function within, an
object. (...) The fragment-id follows the URL of the whole object from which it is
separated by a hash sign (#).
Is there anything stopping you from using cookies to preserve the state? Cookies are simple enough to use in JS, are described on the Gecko DOM Reference pages, and would do the trick quietly, without appending values to the URL, which is never pretty.
If you absolutely have to use hash though, you may want to consider replacing '=' with some other character, e.g. ":".
What you could do is change the "=" to something else using
var string = string2.replace(/=/g, "[$equals]")
The global regular expression replaces every "=" in one call. With a plain string pattern such as "=", replace() only changes the first match, so you would have to run it once per "=".
Then same process you had as above.
NB If you require it for further code, you can replace [$equals] back to "=" after splitting the hash into an array.
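A minimal sketch of that round trip, assuming the placeholder never occurs in the real data. The helper names are hypothetical, and whether a browser leaves the placeholder characters untouched in the hash is not verified here:

```javascript
const PLACEHOLDER = "[$equals]"; // placeholder suggested above; assumed not to appear in the data

// split/join swaps every occurrence in one pass.
function hideEquals(value) {
  return value.split("=").join(PLACEHOLDER);
}

function restoreEquals(value) {
  return value.split(PLACEHOLDER).join("=");
}

// On the way in: this string is what would be assigned to document.location.hash.
const hash = "foo=" + hideEquals("noisy=input");

// On the way out: only the key/value separator is left as a real "=".
const kvPair = hash.split("=");
console.log(kvPair.length);            // 2
console.log(restoreEquals(kvPair[1])); // noisy=input
```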