Why does Google unescape their Analytics tracking code? - javascript

Just getting off my JavaScript training wheels.
Why does Google choose to unescape the document.write line in Part 1 below?
Why don't they just write it like this? Maybe unescape is required for some older browser compatibility?
document.write('<script src="'
+ gaJsHost
+ 'google-analytics.com/ga.js" type="text/javascript"></script>');
For reference, the entire Google Analytics tracking code looks like this:
Part 1:
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol)
? "https://ssl."
: "http://www."
);
document.write(unescape("%3Cscript src='"
+ gaJsHost
+ "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"
));
</script>
Part 2:
<script type="text/javascript">
try
{
var pageTracker = _gat._getTracker("UA-0000000-0");
pageTracker._trackPageview();
}
catch(err){}
</script>
I understand what the rest of the code does, just curious about the unescape part.
Edit
The bottom line is, unescape is required. Voted to close this question because it is a duplicate (see answer marked correct).

It means the code will work in XML / XHTML and HTML without having to mess with CDATA
Please see:
https://stackoverflow.com/questions/1224670/what-is-the-advantage-of-using-unescape-on-document-write-to-load-javascript
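To see what the unescape call buys, here is a minimal sketch that rebuilds the Part 1 string outside a browser. The helper name buildGaScriptTag and its protocol parameter are made up for illustration; the snippet only checks that the source text contains no literal angle brackets while the decoded result is ordinary markup.

```javascript
// Hypothetical helper (not part of Google's snippet): builds the same
// string as Part 1 for a given protocol, without touching the DOM.
function buildGaScriptTag(protocol) {
  var gaJsHost = (protocol === "https:") ? "https://ssl." : "http://www.";
  // The source only ever contains %3C and %3E, never literal angle brackets.
  var escaped = "%3Cscript src='" + gaJsHost +
    "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E";
  return { escaped: escaped, markup: unescape(escaped) };
}

var result = buildGaScriptTag("https:");
console.log(result.escaped); // what appears in the page source
console.log(result.markup);  // what document.write would receive
```

The first line logged is safe to embed in HTML or XHTML as-is; the second is the script tag that ends up in the document.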

My understanding is that when the parser finds </script>, even inside a quoted string such as "</script>",
it wrongly treats it as the end of the script block, so the tag cannot be written literally as "</script>".
And Google wants to make sure variables like pageTracker are set before google-analytics.com/*.js loads, so unescaping %3Cscript and %3E%3C/script%3E is the only way for them.
Just my 2 cents; sorry if I'm wrong.

Writing directly into the document without using the '<' or '>' characters means that you don't have to escape them in document formats which interpret these literally. Otherwise, the correct interpretation is that the <script> tags begin inside of the string, which is not what's desired.
Also, note that there's an error in your proposed alternative code (you missed a quote mark after the end of the src attribute).

I think that:
document.write('<script src="'"
will fail HTML Validation as well. Interestingly, it also breaks the Preview on this comment box :)

Related

"%3Cscript" vs "<script"

Every once in a while, I'll see an HTML code snippet with:
%3Cscript
where the %3C replaces the <. Is this because the code was auto-generated or needs to display properly in an editor or was it coded that way explicitly for some reason and needs to keep that form on the HTML webpage? In case it is helpful here is the full beginning of the line of code I was questioning:
document.write(unescape('%3Cscript
Wouldn't the line of code work just fine if you replaced the %3C with a <?
The unescape() Javascript function converts the %3C back to < before it gets written into the document. This is apparently an attempt to avoid triggering scanners that might see the literal <script tag in the source and misinterpret what it means.
When writing javascript in a script tag embedded in html, the sequence </script> cannot appear anywhere in the script because it will end the script tag:
<script type="text/javascript">
var a = "<script>alert('hello world');</script>";
</script>
Is more or less treated as:
<script type="text/javascript">
var a = "<script>alert('hello world');
</script>
";
<script></script>
In the eyes of the html parser.
Like mplungjan said, unescape is a convoluted way to do this; one can simply write <\/script> in a JavaScript string literal to make it work:
<script type="text/javascript">
var a = "<script>alert('hello world');<\/script>";
</script>
This is not related to document.write technically at all, it's just that document.write is a common place where you need "</script>" in javascript string literal.
Also note that "<script>" is indeed totally fine as is. It's just the "</script>" that's the problem which you have cut out from the code.
As mentioned, this is possibly an attempt to fool scanners.
A more useful and important technique is
<\/script> or '...</scr'+'ipt>', which is needed so that the current script block is not ended when document.write-ing a script inline.
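As a quick check that these spellings are interchangeable at runtime, the sketch below compares them. The variable names are invented for the example.

```javascript
// Three ways of spelling the same closing tag inside JavaScript source.
var escapedSlash = "<\/script>";             // backslash before the slash
var splitTag = "</scr" + "ipt>";             // tag name split across two literals
var viaUnescape = unescape("%3C/script%3E"); // percent-encoded angle brackets

console.log(escapedSlash === splitTag);
console.log(splitTag === viaUnescape);
```

Both comparisons should log true: the JavaScript engine sees the same nine-character string in every case, and only the HTML parser's view of the source differs.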

correct syntax for javascript "src" declaration

I'm trying to add a Math.random() call to my embedded JS declaration to prevent the linked JavaScript file from being cached.
The problem is that I keep getting syntax errors. I think I'm close but my debugging efforts are proving futile. The code is as follows:
<script type="text/javascript">
document.write("<script src='links7.js?'+Math.random()+></script>");
</script>
any suggestions?
Issue #1
Escaping/Encoding
When the HTML parser finds <script> it will start parsing the contents until it finds </script> which is present in:
document.write("<script src='links7.js?'+Math.random()+></script>
As such, you'll need to change the source so that it's not parsed as the end of a script element:
document.write("<script src='links7.js?'+Math.random()+></scri" + "pt>");
Ideally, you'd have HTML escaped all your inline JavaScript code, which would also mitigate this issue:
document.write(&quot;&lt;script src='links7.js?'+Math.random()+&gt;&lt;/script&gt;&quot;);
But honestly, doing that gets really annoying really quickly.
A better means of escaping all character content within your inline JavaScripts is to use a CDATA (Character Data) element:
<script type="text/javascript">
<![CDATA[
document.write("<script src='links7.js?'+Math.random()+></script>");
]]>
</script>
Of course, because the CDATA element is within the <script> element, the JS parser tries to execute it as code, and doesn't know what to do with a syntax like < ! [ CDATA [, so to get around this issue, you'll need to wrap the CDATA element in a JS comment:
<script type="text/javascript">
//<![CDATA[
-or-
/* <![CDATA[ */
//]]>
-or-
/* ]]> */
</script>
I recommend the second form, using multi-line comments, so that there won't be any issues with your code if whitespace is mistakenly stripped out for whatever reason (I've seen some Content Management Systems that drop newlines in their Rich Text Editors).
One more caveat with CDATA, the CDATA element, just like the script element, stops wrapping its contents at the first sign of its closing tag (]]>), so be careful to avoid writing something like:
lorem = ipsum[dolor[sit]]>amet;
because it will kill your CDATA element.
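One way to keep that sequence out of inline code is simply to put whitespace between the closing brackets and the comparison operator. A small sketch, with made-up values for the placeholder names used above:

```javascript
// Placeholder values, invented purely for illustration.
var sit = 0;
var dolor = [1];
var ipsum = [0, 5];
var amet = 3;

// Same comparison as above, written with spaces around the operator so the
// two closing brackets are not immediately followed by a greater-than sign.
var lorem = ipsum[dolor[sit]] > amet;
console.log(lorem);
```

The spacing has no effect on how JavaScript evaluates the expression; it only changes what a CDATA-aware parser sees.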
To avoid all of these issues, keep your JavaScripts in external .js files where they belong whenever possible.
I understand for simple one-liners it may be preferable to avoid the external file, in which case you should use the <![CDATA[ ]]> element or even HTML escape your JavaScript.
Issue #2
Quotation
The string you've provided to document.write:
document.write("<script src='links7.js?'+Math.random()+></script>");
is incorrectly formatted to do anything other than define a static string of <script src='links7.js?'+Math.random()+></script>.
If you'd like to execute Math.random() you need to have a string before and a string after:
'<script src="links7.js?' + Math.random() + '"></script>'
You may notice that I swapped the usage of ' and ". I typically like to see " used in HTML attributes, and hate having to escape \" literals, so I use ' for string literals as often as possible in JS.
Additionally you should notice the opening and closing quote for the src attribute. The DOM may be able to give a good guess of when and where an attribute ends, but don't push it just in case it causes some breakage in your code.
Step by step:
Start with what you want the code to look like:
<script src="links7.js?89736459827345"></script>
Then put it in a literal string:
'<script src="links7.js?89736459827345"></script>'
Isolate the random part in a separate string:
'<script src="links7.js?' + '89736459827345' + '"></script>'
Replace the random part with the random expression:
'<script src="links7.js?' + Math.random() + '"></script>'
Put something in the closing script tag, to keep it from messing with the script tag that it will go inside:
'<script src="links7.js?' + Math.random() + '"></scr' + 'ipt>'
Put it in a document.write call:
document.write('<script src="links7.js?' + Math.random() + '"></scr' + 'ipt>');
And the script tag around it:
<script type="text/javascript">
document.write('<script src="links7.js?' + Math.random() + '"></scr' + 'ipt>');
</script>
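The steps above could also be wrapped in a small function. This is only a sketch: the name cacheBustedScriptTag and the optional random parameter are assumptions added for illustration, and it uses the <\/script> escape instead of splitting the tag.

```javascript
// Hypothetical helper: returns the markup for a cache-busted script tag.
// `random` is optional and injectable so the output can be checked
// deterministically; it falls back to Math.random.
function cacheBustedScriptTag(src, random) {
  var token = (random || Math.random)();
  return '<script src="' + src + '?' + token + '"><\/script>';
}

console.log(cacheBustedScriptTag("links7.js"));

// In a browser, inside an inline script block, the tag would be written with:
// document.write(cacheBustedScriptTag("links7.js"));
```

Keeping the string building separate from document.write makes the quoting easy to verify on its own.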
You've got an issue with open/close quotes. Try:
document.write("<script src='links7.js?" + Math.random() + "'><\/script>");
<script type="text/javascript">
document.write("<script src='links7.js?" + Math.random() + "'><\/script>");
</script>
Suggestion: use an IDE with code highlighting.

How do I get Aptana/Firefox to execute my JavaScript rather than show the code?

I'm extremely new to coding and I'm reading a book on it. And I think I have the basics down on this little test project I'm doing, but whenever I test the page I just see the code I used. Here's the entirety of my code.
<script type = "text/javascript">;
//<![CDATA[
// from concat.html
var person = "" ;
person = prompt( "What is your name?") ;
alert("Hi there, ") + person + "!");
//]]>
</script>
Honestly I don't know what the CDATA is for or what concat.html is.
How can I get Firefox to run my JavaScript rather than just show the code?
Try wrapping it in <html> to make the whole page get treated as HTML. Does the file have a .js extension, by any chance?
CDATA is to distinguish code from markup.
Put it in an HTML file.
So, first, save it as scriptname.html - you're embedding JavaScript within an HTML file.
Next, make it valid html - add <html> to the top and </html> to the bottom. And <head> and <body> tags where appropriate - if you don't know what those are, head over to any HTML site to look them up (www.diveintohtml5.org is nice, if you can follow it.)
Better yet, install the Firebug plugin for Firefox or use another browser's JavaScript console. It will allow you to run your code:
http://www.w3resource.com/web-development-tools/execute-JavaScript-on-the-fly-with-Firebug.php

What is the purpose of this JavaScript?

I was playing around with a Python-based HTML parser and parsed Stackoverflow. The parser puked on a line with
HTMLParser.HTMLParseError: bad end tag: "</'+'scr'+'ipt>", at line 649, column 29
The error points to the following lines of javascript in the site's source:
<script type="text/javascript">
document.write('<s'+'cript lang' + 'uage="jav' + 'ascript" src=" [...] ">');
document.write('</'+'scr'+'ipt>');
</script>
([...] replace a long link, which is removed for simplicity)
Out of curiosity, is there a specific reason for what looks to me like artificial 'obfuscation' of the code, i.e. why use the document.write method to concatenate all the chopped up strings?
I think it's to fight adblockers.
... + 'uage="jav' + 'ascript" src="http://ads.stackoverflow.com
It has been written that way so that the browser doesn't think it's the closing tag for <script>, which would cause some problems.
When the HTML parser encounters document.write('</script>');, it thinks it has found the end of the enclosing <script> tag. Breaking the tag up stops the parser from recognising the closing tag.
The other way I've seen this achieved is by escaping the slash, i.e. document.write('<\/script>');.
The correct way to do this is either:
Enclose the body of the script in a <![CDATA[ ... ]]> block (if serving XHTML), or
Put the script in an external file, or
Use the DOM API instead (i.e. create a script node and append that to the document head)
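A minimal sketch of the DOM API option follows. The helper name appendScript and the doc parameter are assumptions made for the example; passing the document in explicitly keeps the function usable outside a browser.

```javascript
// Hypothetical helper: creates a script element and appends it to the
// head of the given document. No markup strings are involved, so there is
// no closing tag for the HTML parser to trip over.
function appendScript(doc, src) {
  var script = doc.createElement("script");
  script.type = "text/javascript";
  script.src = src;
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}

// Usage in a browser ("example.js" is a placeholder URL):
// appendScript(document, "example.js");
```

Note that a script added this way loads asynchronously, unlike one emitted with document.write, so code that depends on it has to wait for it to load.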
Perhaps it's there to stop programs that search specifically for script tags. Ad blockers, for example, look for script tags and object tags.

Javascript external script loading strangeness

I'm maintaining a legacy javascript application which has its components split into 4 JS files.
They are "Default.aspx", "set1.aspx", "set2.aspx" and "set3.aspx". The ASPX pages write out compressed JS from multiple (all different) source files belonging to their respective set, and set the content-type header to "text/javascript".
The application is invoked by adding a reference to the first set and creating the main entry object.
<script src="/app/default.aspx" type="text/javascript"></script>
<script type="text/javascript">
var ax;
// <body onload="OnLoad()">
function OnLoad() {
ax = new MyApp(document.getElementById("axTargetDiv"));
}
</script>
At the end of the first set of scripts (default.aspx) is the following exact code:
function Script(src) {
document.write('<script src="' + src + '" type="text/javascript"></script>');
}
Script("set1.aspx?v=" + Settings.Version);
Which loads the second set of scripts (set1.aspx). And this works without any errors in all major browsers (IE6-8 Firefox Safari Opera Chrome).
However, as I've been working on this script for quite some time, I wanted to simplify function calls in a lot of places and mistakenly inlined the above Script function, resulting in the following code:
document.write('<script src="set1.aspx?v=' + Settings.Version + '" type="text/javascript"></script>');
Which, when tested with a test page, now throws the following error in all browsers:
MyApp is not defined.
This happens at the line: ax = new MyApp(... as Visual Studio JS debugger and Firebug reports it.
I've tried various methods in the first 4 answers posted to this question to no avail. The only thing that lets MyApp load successfully is putting the actual "add script" code (i.e. the document.write('script') line) inside a function.
If I put the document.write line inside a function, it works, otherwise, it doesn't. What's happening?
Splitting and/or escaping the script text does not work.
To see the problem, look at that top line in its script element:
<script type="text/javascript">
document.write('<script src="set1.aspx?v=1234" type="text/javascript"></script>');
</script>
So an HTML parser comes along and sees the opening <script> tag. Inside <script>, normal <tag> parsing is disabled (in SGML terms, the element has CDATA content). To find where the script block ends, the HTML parser looks for the matching close-tag </script>.
The first one it finds is the one inside the string literal. An HTML parser can't know that it's inside a string literal, because HTML parsers don't know anything about JavaScript syntax, they only know about CDATA. So what you are actually saying is:
<script type="text/javascript">
document.write('<script src="set1.aspx?v=1234" type="text/javascript">
</script>
That is, an unclosed string literal and an unfinished function call. These result in JavaScript errors and the desired script tag is never written.
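The behaviour described above can be imitated with a deliberately crude model of the HTML parser: take everything after the opening tag up to the first closing-tag sequence, without any knowledge of JavaScript strings. The function name below is invented, and the sketch is meant to be run outside a page (for example in Node), since it contains the closing tag itself.

```javascript
// Simplified model (an assumption, not a real HTML parser): returns the text
// an HTML parser would hand to the JavaScript engine for the first script block.
function scriptBodySeenByHtmlParser(html) {
  var openEnd = html.indexOf(">", html.indexOf("<script")) + 1;
  var closeStart = html.toLowerCase().indexOf("</script", openEnd);
  return closeStart === -1 ? html.slice(openEnd) : html.slice(openEnd, closeStart);
}

var page = '<script type="text/javascript">' +
  "document.write('<script src=\"set1.aspx?v=1234\" type=\"text/javascript\"></script>');" +
  "</script>";

// Logs the truncated statement: an unclosed string and an unfinished call.
console.log(scriptBodySeenByHtmlParser(page));
```

The logged text stops right after the inner opening tag, which is why the JavaScript engine reports a syntax error and nothing is written.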
A common attempt to solve the problem is:
document.write('...</scr' + 'ipt>');
This is still technically wrong (and won't validate). This is because in SGML, the character sequence that ends a CDATA element is not actually ‘</tagname>’ but just ‘</’ — a sequence that is still present in the line above. Browsers generally are more forgiving and in practice will allow it.
Probably the best solution is to escape the sequence. There are a few possibilities, but the simplest is to use JavaScript string literal escapes ('\xNN'):
document.write('\x3Cscript src="set1.aspx?v=1234\x26w=5678" type="text/javascript"\x3E\x3C/script\x3E');
The above escapes all ‘<’, ‘>’ and ‘&’ characters, which not only stops the ‘</’ sequence appearing in the string, but also allows it to be inserted into an XHTML script block without causing errors.
(In XHTML, there's no such thing as a CDATA element, so these characters would have the same meaning as if included in normal content, and a string '<script>' inside a script block would actually create a nested script element! It's possible to allow <>& in an XHTML script block by using a <![CDATA[ section, but it's a bit ugly and usually better to avoid using those characters in inline script.)
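If such literals need to be produced often, the escaping can be automated. The helper below is a hypothetical sketch, not an established API: it turns a markup string into the source text of a single-quoted JavaScript literal with <, > and & replaced by \xNN escapes. It assumes the input contains no line breaks.

```javascript
// Hypothetical helper: returns JavaScript source for a single-quoted string
// literal whose value is `markup`, with no literal <, > or & in the source.
// Assumption: `markup` contains no newlines or other control characters.
function toInlineSafeLiteral(markup) {
  var replacements = {
    "\\": "\\\\",
    "'": "\\'",
    "<": "\\x3C",
    ">": "\\x3E",
    "&": "\\x26"
  };
  var body = markup.replace(/[\\'<>&]/g, function (ch) {
    return replacements[ch];
  });
  return "'" + body + "'";
}

var literal = toInlineSafeLiteral(
  '<script src="set1.aspx?v=1234&w=5678" type="text/javascript"></script>'
);
console.log("document.write(" + literal + ");");
```

The logged line should match the hand-escaped document.write call shown above and can be pasted into an inline script block in either HTML or XHTML.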
1) Ensure that you do not try to reference MyApp before the script is "actually" included in your page.
2) Try breaking the word "script" in your inline loader like this:
<script type="text/javascript">
document.write('<scr' + 'ipt src="set1.aspx?v=1234" type="text/javascript"></scr' + 'ipt>');
</script>
Alternatively, use this syntax, which I borrowed from the Google Analytics code and have been able to use successfully:
<script type="text/javascript">
document.write(unescape("%3Cscript src='set1.aspx?v=1234' type='text/javascript'%3E%3C/script%3E"));
</script>
You could also try:
var script = document.createElement("script");
script.src = "set1.aspx?v=1234";
script.type = "text/javascript";
document.getElementsByTagName("head")[0].appendChild(script);
Steve
If you can use jQuery, you could use the following:
$.getScript("set1.aspx?v=1234");
This loads the script into the global javascript context.
Make sure you set contenttype of the response to "text/javascript".
Hope this helps...
