JavaScript analytics / conversion script: expected failure rate?

Question for developers who work with 3rd-party analytics tools:
is there an industry-standard or expected failure rate with JavaScript tracking?
Scenario: I have a one-page website. I install Google Analytics, Mixpanel, and Heap third-party JavaScript tracking. The page loads cleanly and error-free. I use AdWords to buy 100 clicks to my site.
Now, according to raw server logs, I receive all 100 visitors. However, my Analytics dashboards report:
GA: 97 unique visitors
Mixpanel: 96 unique visitors
Heap: 99 unique visitors
Report latency isn't an issue (I've waited 48 hours). I don't want to quibble about which analytics tool's definition of a "unique visitor" is best.
What I'm trying to get to the bottom of is this: is there an anticipated error bar I should apply globally to any/all analytics reports? Say that each script loads properly 95% - 99% of the time? (That way I can ignore mismatched numbers so long as they fall into this expected error bar and focus on true outliers.) Additionally, if there's an expected failure rate, I can have greater confidence that, despite the mismatched numbers above, my scripts are reporting properly, and save my IT team a lot of tail-chasing.
File under Anecdotes Not Data: A colleague told me his ecommerce site uses a hosted, JavaScript-based, enterprise-level conversion tracking platform. Based on 400-500 transactions per day, his analytics under-reports conversions consistently by 4-5%. He has several years of data documenting this (99.9% confidence).
What I don't know is, does this hold true globally? Do everyone's analytics scripts misfire, fail to load, or otherwise go CLICK instead of BANG 4-5% of the time?
Here are potential issues I AM aware of:
Script errors
Script conflicts
Timeouts when pulling from a third-party server
User bounces before scripts complete loading
**Not to get all chemtrails on you but:** IF there's an expected fail rate, it's certainly not common knowledge. Nobody I've spoken to at any analytics company admits to consistent failure. Neither do they guarantee 100% accuracy.
So I ask: in your experience, what's the expected accuracy rate of your JavaScript-based, hosted analytics platforms?

I think your fourth bullet may be the most revealing. Paid media will have a very high bounce rate compared to organic/referrer/direct traffic.
What is the bounce rate reported by those various tracking tools? I would wager it is at least 80%, which means the chance of users exiting before the scripts finish loading is high. You could correlate that with your page load time.
One thing you could try (since my expertise is with Google Analytics) is to use the Measurement Protocol to send pageview data on the server-side. Since that no longer requires JavaScript to load, it takes page load time and bounces out of the equation. I would not recommend using this method for production, but it could illuminate the issue.
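As a rough illustration, a server-side pageview is just one HTTP request to the collection endpoint. A minimal sketch, assuming Node 18+ (for the global fetch); the tracking ID and client ID are placeholders:

// Minimal sketch: send a pageview via the GA Measurement Protocol v1.
// UA-XXXXX-Y and the client ID below are placeholder values.
const params = new URLSearchParams({
  v: '1',             // protocol version
  tid: 'UA-XXXXX-Y',  // tracking ID
  cid: '555',         // anonymous client ID (normally read from the _ga cookie)
  t: 'pageview',      // hit type
  dp: '/landing'      // document path
});

fetch('https://www.google-analytics.com/collect', {
  method: 'POST',
  headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
  body: params.toString()
});

Because this fires from your server on every request, it counts every visitor your raw logs count, which makes it a useful baseline for the comparison.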
To summarize, I think your issue is with load abandonment and not necessarily failure of the various tools.

Related

Aren't Javascript analytics scripts susceptible to easy data hacks?

In production environments, JavaScript-based analytics scripts (Google Analytics, Facebook Pixel, etc.) are injected into most web applications, along with the unique ID/pixel ID, in plain JavaScript.
For example, Airbnb uses Google Analytics. I can open up my dev console and run
setInterval(function() {ga('send', 'pageview');}, 1000);
which will cause the analytics pixel to be requested every 1 second, forever. That is 3600 requests an hour from my machine alone.
Now, this can easily be done in a distributed fashion, causing millions of requests per second, completely skewing the Google Analytics data for the pageview event. I understand that the huge amounts of data collected would correct this skewing to a certain extent, but that can easily be compensated for by increasing the number of requests.
My question is this: are there any safeguards to prevent competitors or malicious individuals from destroying the data integrity of applications in this manner? Does GA or Facebook provide such options?
Yes, but the unsafe part doesn't come from the JavaScript. For example, you can use the Measurement Protocol to flood data into an account. Here you can see a lot of people in the same community having trouble with this (and it's quite simple to solve):
https://stackoverflow.com/search?q=spam+google+analytics
All these measurement systems use HTTP calls to fill the data in your "database". If you are able to build the correct call, you can spam anyone, anywhere (but don't do it; don't be evil).
https://developers.google.com/analytics/devguides/collection/protocol/v1/?hl=es-419
This Google Analytics page explains what the Measurement Protocol is; the JavaScript only works as a framework to build and send the hit.
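To make that concrete, the canonical pageview example from the protocol docs is a single GET request (tid and cid are placeholders):

https://www.google-analytics.com/collect?v=1&tid=UA-XXXXX-Y&cid=555&t=pageview&dp=%2Fhome

Anything that can issue that request (curl, Python, a botnet) can write to your property; the JavaScript framework is just a convenience layer on top.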
But not everything is lost.
For example, if you try to do that in your browser with that code, the Google Analytics framework limits you to 1 call per second and 150 per session (or cookie value). Yes, it's not complicated to jump that barrier, but after that other barriers will come.
So if you use the JavaScript framework you are reasonably safe. Now imagine you do the same with Python, sending raw HTTP hits to the Google Analytics server. It's possible, but there are two important things to say:
Google Analytics has a proactive "firewall" to detect spammers and ban them (how and when they do this is not public), but in my case I see far fewer spammers than a few years ago.
There are also a couple of good practices to avoid this. For example, accept only whitelisted domains by creating a filter that allows only traffic to your domain:
https://support.google.com/analytics/answer/1033162?hl=en
It's also very good practice to protect your ecommerce data by using a filter to include only data from a certain store or with a certain parameter, for example "brand == my brand" or "CustomDimension == true". Exclude transactions with products over $1,000 (check your limits and apply proactive filters). All these barriers make it complex to break.
If you do this, you will protect your domain a lot (because it's very complicated for a robot author to know the valid combination of tracking ID + domain), but, you know, any system can be broken. In my experience I've only seen 2 or 3 cases of damage coming from spammers or people who wanted to do harm, and all of those cases could have been prevented with a proactive filter. Usually spammers only spam ads into your account and almost never want to hurt you. With Facebook, Piwik, and other tools it's more or less the same.
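If you run a self-hosted collector (Piwik-style) rather than GA, the same whitelist idea can live in code. A hypothetical sketch of an Express-style middleware; the endpoint, parameter name, and domains are all placeholders:

// Hypothetical middleware for a self-hosted collector: drop hits whose
// reported hostname isn't whitelisted, mirroring GA's include-only filter.
const ALLOWED_HOSTS = new Set(['example.com', 'www.example.com']); // placeholders

function hostnameFilter(req, res, next) {
  const host = String(req.query.dh || '').toLowerCase(); // 'dh' = document hostname
  if (!ALLOWED_HOSTS.has(host)) {
    return res.status(204).end(); // accept silently, record nothing
  }
  next();
}

// app.get('/collect', hostnameFilter, recordHit);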

What's the advantage of client-side analytics over server-side?

I've always used client-side web analytics, which uses JavaScript to track visitor hits to the site, along with all the useful information that gives. But some people have recently told me they prefer server-side analytics because it's faster.
So what I wondered is what are the main advantages of doing it client-side with JavaScript? Which has more features and why?
Server or Client side for Analytics?
Server-side Advantages:
Servers can be set up with infinitely more power than desktop machines and so can crunch "the big numbers".
Performance can be more predictable as the same machines are used for everyone's analysis and generation of results.
Output will not have dependencies on browser / browser version as they just have to display an image.
Output can also be multi-device without any dependencies.
Output can be the same everywhere, both reducing client issues and letting image generation support one output format instead of many.
Client-side Advantages:
If the number of clients is large, say thousands per minute, it can be good to unload the processing to client machines to avoid having them slow down a central server.
Solutions tend to provide more interactivity and faster results as all the data and the logic is on the client.
Once downloaded initially, views can be changed without being online.
If the traffic varies a lot, say a few queries per hour at some times and hundreds per minute at others, client-side processing ensures that a central server is not overloaded by this effort.
Server-side infrastructure will not be needed and so will not cost (the provider) money.
Many companies use both Google Analytics (client side) and Webtrends (server side/client side) to do web analytics.
One thing about Google Analytics is that it doesn't work when the user doesn't allow scripts. Webtrends can crawl your access logs.
Client-side tracking provides more information in comparison with server-side tracking.

End user experience monitoring tools

I have a web application with a great deal of both client-side and server-side logic. It is considered business-critical that this application feel responsive to the end user, for some definition of "feels responsive." ;)
Most website monitoring discussions revolve around keeping an eye on server-side metrics (response time, I/O queue depth, latency, CPU load, etc.), i.e. we tend to treat server performance and responsiveness as though it's a viable "proxy" for what the user is experiencing.
Unfortunately, as we move more and more logic into client-side JavaScript, the correlation decreases and our server metrics become less useful.
I didn't find any good matching SO questions on this. Googling gives a range of commercial products that might be related, but they're generally from the manufacturers' websites, full of unhelpful marketspeak and "please call us for details," so it's hard to know.
Are there any commonly-used tools for this sort of thing, other than rolling your own? Both free and commercial are welcome, although free is obviously better all else being equal.
EDIT: To clarify, I primarily need to gather bulk data on the user experience, including both responsiveness and breakage/script errors. Automatic analysis is a very-nice-to-have, although I'd expect to have to occasionally dig into the data myself regardless of the solution.
There are some freely available tools for performance monitoring. Yahoo open-sourced a script they used called Boomerang, which can measure page load times and other performance metrics for end users; full documentation is on the project site. Google Analytics also offers a basic page load time report.
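Boomerang just needs a beacon URL to post its timing data to, which fits the in-house requirement. A minimal sketch, assuming you serve the script yourself and /beacon is a hypothetical collection endpoint on your own server:

<script src="/js/boomerang.js"></script>
<script>
  // Send page load timings to our own endpoint rather than a third party.
  BOOMR.init({
    beacon_url: "/beacon"  // hypothetical in-house collection endpoint
  });
</script>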
For error monitoring, you'll want to listen for the window.onerror event. I don't know of any scripts that will automatically log it for you, or mine the logs on the server side. If you implement your own, you'll want to be very careful about not pinging the server too often--imagine how many requests it would generate if there was a JS error in your JS error handling code!
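A minimal sketch of that pattern, capped so a recurring error can't flood the server; the /log endpoint is a placeholder:

// Report uncaught JS errors to our own endpoint, at most a few per page view.
var errorsReported = 0;
var MAX_ERROR_REPORTS = 10;

window.onerror = function (message, source, lineno) {
  if (errorsReported >= MAX_ERROR_REPORTS) return false;
  errorsReported++;
  var img = new Image(); // image beacon avoids XHR/CORS setup
  img.src = '/log?msg=' + encodeURIComponent(message) +
            '&src=' + encodeURIComponent(source || '') +
            '&line=' + encodeURIComponent(lineno);
  return false; // don't suppress the default error handling
};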
Bucky (client and server) can perform that task:
http://github.hubspot.com/bucky/
From their website:
"Open-source tool to measure the performance of your web app directly from your users' browsers."
To analyse the data they recommend Graphite or OpenTSDB.
You can try Atatus, which provides Real User Monitoring (RUM) and advanced error tracking for websites and web apps.
https://www.atatus.com/
http://www.whitefrost.com/documents/html/technical/dhtml/funmon.html#part1 tests the performance of JavaScript functions.
You can use Dynatrace Ajax for measuring and profiling JavaScript performance in IE and Firefox. Chrome has built-in tools; take a look at:
http://blog.chromium.org/2011/05/chrome-developer-tools-put-javascript.html
For monitoring the performance of the overall application/site, I would recommend synthetic monitoring using real browsers, also known as web performance monitoring. These are services with robotic agents sitting on backbone ISPs performing the same activity as end users.
We utilize Catchpoint, which supports Selenium scripting. But there are others like Gomez and Keynote out there that have been providing such solutions for years.
You can also check out New Relic - now it has "real user monitoring" integrated - which measures the performance across all browser types. There is a 14 day trial period so you can set it up for free and see if you like it. You'll get visibility into browser rendering speed, DOM processing, the time it spends on the network, all the way back to your app performance on the server.

Does google analytics slow down my website? [closed]

I am at the final stages of my website, and currently I need to find a suitable statistics application/tool.
I have looked into Webalizer, but it seems outdated.
Also, I have looked into Google Analytics, but I am afraid that if I implement it, my website will slow down. The site is already pretty heavy with dynamically displayed database content.
I have read I can put the GA JS code at the bottom of the page so that the page loads first, but I still don't want a slowdown.
You are all much more experienced in statistics than I am, so I believe you can give me some good advice.
I have my own private server (Linux) and I have root access as well (of course).
Do you think I should have a statistics app on the server, without interfering with my website, or should I go the Google way and use Analytics?
Please give me good application names which you have tested etc...
Thanks
Any additional calls to scripts will slow down your site. However, Google Analytics instructs you to place it in a specific place so that it isn't loaded until the page has loaded. (It used to be before the </body> tag but I believe it's now supposed to be the last <script> in the <head> tag.) Don't worry about it too much; the benefits of analytics will far outweigh the extra call to a remote file.
Focus on other optimizations (database queries, CSS sprites, fewer HTTP requests). Analytics is necessary in today's site market and is indispensable; IMO it is not an option to forgo it.
As far as having your own "statistics app," I assume you're talking about building your own proprietary statistics codebase? I would discourage that, because it takes a lot of time and effort and in the end you will not have the same optimizations that Google has employed an entire project's worth of software engineers to make. Remember that while it's always great to create your own product, you don't have to reinvent the wheel, especially when it comes to things like this that have many sensible drop-in solutions that are widely available for free.
With respect to non-Google analytics solutions, one other of note is Clicky. I'm not as experienced with it as I am with GA, but I've heard many reviews that it is more precise and more informative than GA. However, just as an end-user browsing the web I've noticed a lot of times that its calls to Clicky's website do tend to slow down pages, and noticeably so; I cannot really say that I have seen the same effect with GA.
One last thing I would caution against is this: Do not employ more than one analytics solution unless you are trying to find the best one to suit your needs. It's just overkill to run two remotely-hosted analytics solutions on every single one of your pages, so what I would encourage you to do is try out a few for the first few weeks or so of your site (yes, pages will slow down during this trial phase) and then simply stick with the one that you like best. That will also give you the added benefit of being able to see first-hand what the speed implications are on your unique hosting environment for each script.
Here's some other analytics solutions that you might check out:
Piwik
Webtrends
GoingUp!
Yahoo! Web Analytics
Straight from Google's Analytics sign-up page (https://www.google.com/analytics/provision/):
"The appearance of your website will never be affected by your use of Google Analytics - we don't place any images or text on your pages. Likewise, the performance of your pages won't be impacted, with the possible exception of the very first page-load after you have added the tracking code. This first pageview calls the JavaScript on Google's servers, which may take slightly longer than a regular page load. Subsequent pageviews will use cached data and will not be affected."
Use the Asynchronous Snippet of Analytics:
http://code.google.com/apis/analytics/docs/tracking/asyncTracking.html
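For reference, the asynchronous snippet looks roughly like this; UA-XXXXX-X is a placeholder account ID:

var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);
_gaq.push(['_trackPageview']);

(function() {
  // Inject ga.js asynchronously so it never blocks page rendering.
  var ga = document.createElement('script');
  ga.type = 'text/javascript';
  ga.async = true;
  ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
  var s = document.getElementsByTagName('script')[0];
  s.parentNode.insertBefore(ga, s);
})();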
People focus too much on total load times when what is important is render time, in particular progressive rendering. If you use Google Analytics properly, it will load after the page has been shown to the user. So yes, it will add a small overhead to every request, but because the user can already see the page they probably won't even notice. Just go for it.
Webalizer runs server-side off the Apache logs, doesn't it? That's why it appears outdated: it can't collect as much info as JS can. But it doesn't slow the user down at all. You could run Webalizer and Google Analytics together for a bit and see which serves your needs best.
We decided to work around the possibility of Google's servers appearing to slow our site down. Instead of having our users download the ga.js file from Google's servers, we store it locally. The only problem with that approach is that our local copy becomes outdated, so we wrote an application that periodically compares our local file to Google's and updates it accordingly.
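The original application isn't shown; a hypothetical Node.js sketch of the same idea, with placeholder path and schedule:

// Refresh a local copy of ga.js whenever Google's version changes.
const fs = require('fs');
const https = require('https');

const LOCAL_PATH = '/var/www/static/ga.js'; // placeholder path

function refreshLocalGaJs() {
  https.get('https://ssl.google-analytics.com/ga.js', function (res) {
    let body = '';
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      const current = fs.existsSync(LOCAL_PATH) ? fs.readFileSync(LOCAL_PATH, 'utf8') : '';
      if (body && body !== current) {
        fs.writeFileSync(LOCAL_PATH, body); // only rewrite when it actually changed
      }
    });
  });
}

refreshLocalGaJs();
setInterval(refreshLocalGaJs, 24 * 60 * 60 * 1000); // e.g. check once a day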
Andrew
Google Analytics is JavaScript-based and does not run on your server. All processing and storage is done on Google's servers, so it's ideal if you are worried about local resources.

Non-invasive javascript performance agent?

I am seeking to (legitimately) plant instrumentation in my web pages to collect and report information about website performance.
My preference is for something internally hosted. While I expect there are commercial offerings out there (e.g. Google Analytics), I'm keen to find something we can run entirely in-house (it's not a public website and may contain sensitive data).
Also, I'm looking for something that can report back to an independent URL, i.e. not relying on adding a reverse proxy or recording results within existing web server logs. Indeed, I'd prefer something which does not require access to the web server logs at all (other than those for the URL the bug reports back to).
I need to be able to monitor bulk traffic, so tools like PageSpeed and Tamper Data are not appropriate.
I've tried googling but just seem to be getting lots of noise about the performance of JavaScript and web pages rather than how to actually measure it.
TIA
You could use the open source analytics software Piwik and write a plugin for it that sends the performance data to it.
Thanks chiborg. I'd kind of forgotten about this, it was so long ago that I asked. Yes, I was aware of Piwik, but I haven't been very impressed with either its implementation or the quality of its documentation.
I'm currently working on a solution using Boomerang.
