Google Analytics: _trackPageview without page refresh?

Google Analytics: _trackPageview without page refresh? - javascript

I'm reworking some site tracking for a site I'm working with. For the tracking we are currently using Google Analytics, which seems to be working fairly well. However, I'm having some troubles resembling the ones in this question, but it's old and no one answered, so I'm bumping a bit here. :)
Basically, I'm tracking two kinds of things. Raw pageviews (entering a page), and events on the page (lightbox opened, something important clicked, etc). I'm using _trackPageview for both kinds of events, because I need to be able to track some lightbox flows in GA's goal funnel tracking, and as I understand it _trackEvent calls can't be tracked in goal funnels.
The problem here is that it seems like the way GA works, it doesn't really post its data instantly (firebug doesn't show any requests happening, at least), but defers it to a page refresh or something like that. I'm not totally sure what happens, but basically I'm getting all events up to the first one leading to a page refresh all shuffled up in the funnel and looking like they all happened as an exit from the event causing the refresh. (Did that make sense? :) Is there any way of forcing GA to "flush" an event when it happens and not defer it? Or am I using things totally wrong?
EDIT: I was a bit blind reading the firebug logs... It does actually do the request to __utm.gif with the correct data. Makes the funnel being weird even more strange though, so the basic question is still valid.
Thanks

I made a function for this. We wanted to track how many people click on each on of a few links we have so we "track pageviews" for it.
function trackPV(trackerCode, url)
{
var tracker = _gat._getTracker(trackerCode);
if(url)
{
tracker._trackPageview(url);
}
else
{
tracker._trackPageview();
}
}
Basically, you pass in your tracker code (UA-XXXXX) and a url if you'd like to, such as "http://www.example.com/link1", by default it just tracks the page you are on.
Hope this helps.

I believe each call to _trackPageview will submit a unique request to Google Analytics (via parameters to the __utm.gif object). Google Analytics is pretty tough to debug since there is such a lag between the time your send your data, until it is actually visible online. Typically, you will have to wait 4+ hours before your data will show up - so maybe you just need to wait to confirm that your code is working.

Hmmm... I really only have experience with the old GA, but it seems to me that your best course of action is decoding the utm.gif request and seeing if it contains incorrect information. Here's a list of debugging tools that Google recommends.

use "event tracking" . At least check it out in google analytics help.

Related

Code in Google Tag Manager does not working

I have a test tasks and 2 from 3 I've done.
But this one I don't understand how and what I need to do?!
I managed to find syntax error:
At first should be:
...function someFunctionName() {...}
or
(function() {...})()
...second it's anonymous function...
TASK:
This script is executed in GTM and implemented in Google analytics by custom Task.
The script sends information about user behavior to Optimozg server and then to Bigquery (bq.php file processes and forwards data). Optimozg server data is coming in correctly, but the data in Google Analytics does not reach.
What is the reason?
How do you fix it?
Hint:
(test the code on your site instance with GTM)
function(){return function(tracker){if("undefined"===typeof tracker.get("BigQueryStreaming")){var f=tracker.get("sendHitTask"),h=function(){function d(c){var a=!1;try{document.createElement("img").src=e(!0)+"?"+c,a=!0}catch(k){}
return a}
function e(c){var a="https://test.optimozg.com/bq/bq-test.php";c||(a+="?tid="+encodeURIComponent(tracker.get("trackingId")));return a}
return{send:function(c){var a;if(!(a=2036>=c.length&&d(c))){a=!1;try{a=navigator.sendBeacon&&navigator.sendBeacon(e(),c)}catch(g){}}
if(!a){a=!1;var b;try{window.XMLHttpRequest&&"withCredentials" in(b=new XMLHttpRequest)&&(b.open("POST",e(),!0),b.setRequestHeader("Content-Type","text/plain"),b.send(c),a=!0)}catch(g){}}
return a||d(c)}}}();tracker.set("sendHitTask",function(d){h.send(d.get("hitPayload"));tracker.set("BigQueryStreaming",!0)})}}}

Not sure why JS devs should know anything about GTM. They typically don't go there.
But yes, to understand how to use the given code properly, just read this article: https://www.simoahava.com/analytics/customtask-the-guide/ it describes what custom tasks are and how to use them.
Ok, so first make a GTM account. Deploy the GTM code on your site. May as well use a local site. Or, rather, have the GTM code being injected by a local extension to a random site that doesn't have GTM yet. Or maybe use a redirector extension to redirect the request for their GTM to yours, up to you.
After that, you just make a tag in GTM that would send a Universal Analytics pageview. GA4 decided not to bother with custom tasks, unfortunately, so UA only. Then you make a trigger on pageview. You assign the trigger to the tag. Don't forget to publish the workspace at least once for it to be testable. Then you preview. Preview is a CTA in GTM in top right corner, near the publish. Basically a neat GTM debugger. Enter the site where you have your GTM snippet deployed/injected. Make sure preview sees your tag firing on page load. That would mean you did the preparation correctly.
We're doing the Hint section here, by the way. Now you need to make a custom javascript variable in GTM, paste the code snippet as is in there. The reason why it wants the code in an anonymous function is because it will run it as a closure on it's own. So they kinda remove the need of the extra ()(). It's mostly done for people who don't know JS, so don't be surprised.
Ok, you've made the CJS var, now go to your tag, and set your customTag exactly as Simo shows in his article:
Good, now publish your container, go to the site where you have it deployed, open the network tab and reload the page.
Inspect the calls to the BQ and Optimozg endpoints. Now what they ask is, I believe, why the original call that is meant to be sent by the tag is not being send. So if you remove the setting of the customTask, then publish and reload the page, you should see a request to the collect endpoint, which is the GA's endpoint for data tracking. If you re-add the customTask code, it will prevent the normal tag's functionality from execution, so no collect call.
What they want to hear from you is how to make the tag fire the original event alongside their optimozg and bq calls.
Most likely, the answer is pretty simple and elegant, but requires a lot of debugging to reach to. Reading Simo's article will help understanding the significance of setting various tasks.
Uh, ok, I didn't mean to really debug it, but it looks like I found the bug. It's in the var f = tracker.get("sendHitTask") It's being used to store the original sendHitTask function, but it never gets used. Why is that? Basically, you just need to call the function in the new sendHitTask function that you set in the last line. I'm not going to debug it in my GTM, but I'm pretty sure that's the issue. It's kinda begging to be found there.
Also, this is not quite a junior JS dev task. It's a senior tracking implementation task. Basically, about $110/hr in Canada and US. Junior JS devs are around $35/hr, I guess. They're just trying to save money, heh. I was thinking of hiring junior JS devs instead of tracking implementators too, but it's hard to teach how data analysis works in all the different tools.

XSS vulnerability despite encoding

I had a security audit on a website on which I've been working. The audit has shown that one of my parameter, called backurl, wasn't protected enough in my jsp file. This url is put inside the href of a button, button that allows the user to get back to the previous page.
So what I did was to protect it using the owasp library, with the function "forHTMLAttribute". It gives something like this:
<a class="float_left button" href="${e:forHtmlAttribute(param.backUrl)}">Retour</a>
However, a second audit showed that by replacing the value of the parameter by:
javascript:eval(document%5b%27location%27%5d%5b%27hash%27%5d.substring(1))#alert(1234)
The javascript code would be executed and the alert would show, when clicking on the button only.
They said that something that I could do was to hardcode the hostname value in front of the url, but I don't really get how this would help solve the problem. I feel like no matter what I do, solving a XSS vulnerability will just create a new one.
Could someone help me on this? To understand what's happening and where to look at least.
Thanks a lot.

As #Pointy said, the problem is more fundamental here. Accepting untrusted input and rendering that as a link verbatim (or as text), is a security issue, even if you escape the heck out of it. For example, if you allow login?msg=Password+incorrect and that's how you deal with relaying messages - you have a problem. I can make a site with: Click for cute kittens! and you see why this is a problem.
The real solution is to not accept potentially tainted information, period (nevermind escaping!), if that information ends up being rendered without the surrounding context that it is tainted. For example, take twitter. The username and the tweet are potentially tainted. This is no big deal because users of twitter get clued in due to the very design of the website that you're looking at what some rando wrote. If someone tweets 'If you transfer 5 bucks to Jack Dorsey's account at 12345678, he'll give you a twitter blue logo!', there's a reasonable expectation that it's on the users of the site to not be morons and trust it.
Your website's "click here to go to the previous page" is not like that. You can't reasonably expect the users of your site to hang over that button, check their browser's status bar, and figure it out.
Hence, the entire principle is wrong. You simply can't do it this way, period.
Your alternatives are threefold:
Instead of letting the 'previous link' property be a URL param, it needs to be in the session. Websites work with sessions, generally. You can store whatever you want in them, serverside (the HTTP handler code either manually takes e.g. the cookie and uses that to look up on-server info for that user by doing a lookup, or runs on a framework that just provides an HttpSession-style object, which works in that exact fashion).
If you really want to bend over backwards, you can include it as a signed blob. This is creative but I really wouldn't go there.
A quick hack: What if you just include Click to go back as a static link in your web page?

Google Tag Manager and Single Page APP: How to read a DOM Element?

I want to create a variable on my GTM to store a DOM element. I've tried a custom javascript, for example:
function(){
return document.querySelector('.room__price-value').innerText;
}
But nothing, on GTM preview I see always NULL. I think the issue was the single page app.
And I can't involve the programmers.
Any suggestions?
Thanks

Make sure that when in preview, you check the value of your variable on every event on the left.
Make sure when you execute document.querySelector('.room__price-value').innerText; in your console, you actually get the result you're expecting.
Change the code until you get the result in the local console.
Make sure your timing is good, as Eike mentioned in the comment.
What you're doing there seems right. That's pretty much how you do DOM scrape in GTM when front-end devs aren't available. If the suggestions still don't work for you, give us more info on what you're doing, on how your DOM looks like, what your debugging showed and such. As much potentially useful context as you can.

Puppeteer - Scraping FiveThirtyEight Presidential Approval doesn't work

I saw one other SO question, but it had no answers, and was from 2018. Am I supposed to intercept XHR? JSON Get requests? Any help getting started here appreciated.
I try to scrape by selector or xpath, and it always times out saying the xpath doesn't exist. I've tried various functions to wait until the page is well and truly loaded, no dice. Something funky seems to be occurring for puppeteer that it doesn't actually load the page, whereas for me, as a real human, it well and fully loads, and I see an element with the approval rating.

I see it's in an iframe, here is the src: https://projects.fivethirtyeight.com/trump-approval-ratings/promo.html
You can get those with:
document.querySelector('.approve .val').innerHTML
// "42.6"
document.querySelector('.disapprove .val').innerHTML
// "52.7"

jQuery on MTurk, why does Chrome report "Unsafe JavaScript attempt to access frame with URL"?

I'm doing a couple of things with jQuery in an MTurk HIT, and I'm guessing one of these is the culprit. I have no need to access the surrounding document from the iframe, so if I am, I'd like to know where that's happening and how to stop it!
Otherwise, MTurk may be doing something incorrect (they use the 5-character token & to separate URL arguments in the iframe URL, for example, so they DEFINITELY do incorrect things).
Here are the snippets that might be causing the problem. All of this is from within an iframe that's embedded in the MTurk HIT** (and related) page(s):
I'm embedding my JS in a $(window).load(). As I understand it, I need to use this instead of $(document).ready() because the latter won't wait for my iframe to load. Please correct me if I'm wrong.
I'm also running a RegExp.exec on window.location.href to extract the workerId.
I apologize in advance if this is a duplicate. Indeed - after writing this, SO seems to have a made a good guess at this: Debugging "unsafe javascript attempt to access frame with URL ... ". I'll answer this question if I figure it out before you do.
It'd be great to get a good high-level reference on where to learn about this kind of thing. It doesn't fit naturally into any topic that I know - maybe learn about cross-site scripting so I can avoid it?
** If you don't know, an MTurk HIT is the unit of work for folks doing tasks on MTurk. You can see what they look like pretty quick if you navigate to http://mturk.com and view a HIT.
I've traced the code to the following chunk run within jquery from the inject.js file:
try {
isHiddenIFrame = !isTopWindow && window.frameElement && window.frameElement.style.display === "none";
} catch(e) {}

I had a similar issue running jQuery in MechanicalTurk through Chrome.
The solution for me was to download the jQuery JS files I wanted, then upload them to the secure amazon S3 service.
Then, in my HIT, I called the .js files at their new home at https://s3.amazonaws.com.
Tips on how to make code 'secure' by chrome's standards are here:
http://developer.chrome.com/extensions/contentSecurityPolicy.html

This isn't a direct answer to your question, but our lab has been successful at circumventing (read hack) this problem by asking workers click on a button inside the iframe that opens a separate pop-up window. Within the pop-up window, you're free to use jQuery and any other standard JS resources you want without triggering any of AMT's security alarms. This method has the added benefit of allowing workers to view your task in a full-sized browser window instead of AMT's tiny embedded iframes.

We Keep Coding

JavaScript is the programming language of the Web.