Calling tracking JavaScript from AMP pages - javascript

We are using an in-house tracking mechanism for our website. We include our tracking.js file on all of our pages.
Every page sends some info in a JS object to this script, which then sends the information to our tracking application via a Spring controller.
Now, to make pages load faster, we serve some pages as AMP templates.
But AMP does not allow us to use tracking.js.
We tried the iframe tag, but it does not allow HTTP calls (it only allows HTTPS).
Could you please suggest a way to do this? It is very critical, and we cannot move to HTTPS right now due to other limitations.
Thanks
Virendra Agarwal

You can't use tracking.js with AMP, as it is considered an external library. Their How It Works page states that AMP won't allow author-written or third-party JS:
"One thing we realized early on is that many performance issues are
caused by the integration of multiple JavaScript libraries, tools,
embeds, etc. into a page. This isn’t saying that JavaScript
immediately leads to bad performance, but once arbitrary JavaScript is
in play, most bets are off because anything could happen at any time
and it is hard to make any type of performance guarantee. With this in
mind we made the tough decision that AMP HTML documents would not
include any author-written JavaScript, nor any third-party scripts."
Only the components shown in the AMP examples can be used.
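For tracking specifically, the built-in amp-analytics component usually replaces a custom tracking.js: you describe the request declaratively and AMP fires it for you. A minimal sketch, assuming a hypothetical HTTPS collection endpoint (amp-analytics will not call plain HTTP URLs, which matches the validation requirement described in the next answer):

<!-- In the <head>: load the amp-analytics component -->
<script async custom-element="amp-analytics"
    src="https://cdn.ampproject.org/v0/amp-analytics-0.1.js"></script>

<!-- In the <body>: declare what to send and when; the endpoint is a made-up example -->
<amp-analytics>
  <script type="application/json">
  {
    "requests": {
      "pageview": "https://tracking.example.com/collect?page=${canonicalPath}&title=${title}"
    },
    "triggers": {
      "trackPageview": {
        "on": "visible",
        "request": "pageview"
      }
    }
  }
  </script>
</amp-analytics>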

We worked with Google and got it sorted.
You can add your own API to AMP pages after validation by Google.
The API must be served over HTTPS, and all calls must be validated by Google.
Google will then whitelist it for AMP pages, and you can use that code in production.

Related

How to execute external JS file blocked by users' adblockers

We use an external service (Monetate) to serve JS to our site such that we can perform adhoc presentation-layer site updates without going through the process of a site re-deploy - which in our case is a time-consuming, monolithic process which we can only afford to do about once per month.
However, users who use adblockers in the browser do not see some of these presentation-layer updates. This can negatively affect their experience of the site as we sometimes include time-sensitive promotions that those users may not be aware of.
To work around this, I was thinking of duplicating the JavaScript file that Monetate is serving and hosting it on separate infrastructure from the site. That way, if we needed to make updates to it, we could do so as needed without doing a full site re-deploy.
However, I'm wondering if there is some way to work around the blocking of the Monetate JS file and somehow execute the remote Monetate JS file from our own JS code in such a way that adblockers would not be able to block it? This would avoid the need to duplicate the file.
If that file is blocked by adblockers, chances are that it is used to serve ads. In fact, your description of time-sensitive promotions sounds an awful lot like ads, just not for an external provider, but for your own site.
Since adblockers usually match the URL, the easiest solution would indeed be to rehost this file, if possible under a different name. Instead of hosting a static copy, you can also implement a simple proxy with the equivalent of <?php readfile('http://monetate.com/file.js'); ?> or Apache's mod_rewrite. While this will increase load times and can fail if the remote host goes down, it means the client will always get the newest version of the file.
Apart from using a different URL, there is no client-side solution - adblockers are included in the browser (or an extension thereof), and you cannot modify that code for good reasons.
Beware that adblockers may decide to block your URL too, if the script is indeed used to serve ads.
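If PHP is not an option, the same proxy idea is only a few lines in Node.js. This is a rough sketch with invented names (the upstream URL and the local path are placeholders), not a production-ready proxy:

// Minimal pass-through proxy: serve the vendor file from your own URL
// so URL-matching blockers don't recognise it.
const http = require('http');
const https = require('https');

const UPSTREAM = 'https://vendor.example.com/tag.js'; // placeholder for the remote file

http.createServer(function (req, res) {
  if (req.url !== '/assets/site-enhancements.js') {   // innocuous-looking local name
    res.writeHead(404);
    res.end();
    return;
  }
  https.get(UPSTREAM, function (upstream) {
    res.writeHead(upstream.statusCode, { 'Content-Type': 'application/javascript' });
    upstream.pipe(res);                               // client always gets the newest version
  }).on('error', function () {
    res.writeHead(502);                               // remote host is down
    res.end();
  });
}).listen(8080);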
Monetate is probably blacklisted in Adblock, so there is nothing you can do about that.
I think that self-hosting the Monetate script would require keeping it updated by checking for new versions from time to time (maintaining it could become a pain in the ass).
A good solution in my opinion is to inform your users about that limitation with a clear message.
Or, you can get in touch with Monetate and ask for a solution.

What is better: using an iframe or something like jQuery to load an HTML file from an external website?

I want my customers to create their own HTML on my web application and then copy and paste my code into their website, to show the result in the position they want, with a customized size and other options on their page. The output HTML of my web application contains HTML tags and JavaScript code (for example, a web chart created with JavaScript).
I found two ways to do this: one using an iframe and the other using jQuery's .load().
Which is better and safer? Is there any other way?
An iframe is better - if you are running JavaScript, then that script shouldn't execute in the same context as your users' sites: you are asking for a level of trust here that the user shouldn't need to accede to, and your code is all nicely sandboxed so you don't have to worry about the parent document's styles and scripts.
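To illustrate, the snippet you hand your customers can be as small as the following; the widget URL, id and dimensions are made-up examples, and the sandbox attribute is optional extra containment:

<!-- Embed code the customer pastes into their page -->
<iframe src="https://widgets.example.com/chart?id=12345"
        width="400" height="300"
        style="border:0"
        sandbox="allow-scripts allow-same-origin"
        title="Example chart widget">
</iframe>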
As a front-end web developer and webmaster I've often taken the decision myself to sandbox third-party code in iframes. Below are some of the reasons I've done so:
Script would play with the DOM of the document. Once a third-party widget took it upon itself to introduce buggy and performance-intensive PNG fix hacks for IE across every PNG used in img tags and CSS across our site.
Many scripts overwrite the global onload event, robbing other scripts of their initialisation trigger.
Reading local session info and sending it back to their own repositories.
Loading any number of resources and performing CPU-intensive processes, interrupting and weighing down my site's core experience.
The above are all examples of short-sightedness or malice on the part of third parties that you may consider yourself above, but the point is that as one of your service's users I shouldn't need to take the gamble. If I put your code in an iframe, I know it can happily do its own thing and not screw with my site or its users. I can also choose to delay loading and execution to a moment of my choosing (by dynamically loading the iframe at a moment of choice).
To argue the point in terms of your convenience rather than the users':
You don't have to worry about any of the trust issues associated with XSS. You can honestly tell your users they're not exposing themselves to any unnecessary worry by running your tool.
You don't have to make the extra effort to circumvent the effects of CSS and JS on your users' sites.

Recording web page events / ajax calls/results and so on

I'm mostly looking for directions here.
I'm looking to record events that happen within a web page - somewhat similar to your average macro recorder, with the difference that I couldn't care less about exact cursor movement or keyboard input. The kinds of events I would like to record are modification of input fields, hovers, following links, submitting forms, scripts that are launched, AJAX calls, AJAX results, and so on.
I've been thinking of using jQuery to build a little app for this and inserting it on whichever pages I would like to test it on (or, more likely, loading the pages into an iframe or something). However, I cannot adapt the scripts on those pages to work with this, so it has to work regardless of the page content.
So I guess my first question is: Can this be done? Especially in regards to ajax calls and various script execution.
If it can, how would I go about the ajax/script part of it? If it can't, what language should I look into for this task?
Also: maybe there's something out there that can already do what I'm looking for?
Thanks in advance
Two ways I can think of are:
Use an add-on (Firefox) or an extension (Chrome) to inject script tags that load jQuery and your jQuery app.
Set up a proxy (you can use Node.js or some other proxy server) and inject the script tags in the proxy; be sure to adjust the Content-Length header (this is tricky on HTTPS sites).
A much simpler and faster option, where you don't need to capture onload, is to write a JavaScript snippet that loads jQuery and your app by injecting script tags, make that a bookmarklet, and hit the bookmarklet after the page loads (a sketch follows below).
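A sketch of the bookmarklet option (the recorder URL is a placeholder): minified onto one line and prefixed with javascript:, it can be saved as a bookmark and clicked once the target page has loaded.

// Injects jQuery first, then your recording app, into the current page.
(function () {
  function inject(src, onload) {
    var s = document.createElement('script');
    s.src = src;
    if (onload) s.onload = onload;
    document.head.appendChild(s);
  }
  inject('https://code.jquery.com/jquery-3.7.1.min.js', function () {
    inject('https://example.com/recorder-app.js');   // placeholder for your recording app
  });
})();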
Came across this post when looking for a proxy for tag injection.
Yes, it's quite possible to trap (nearly) all the function and method calls made by a browser via code in a JavaScript file loaded in the page - although usually a JavaScript debugger (Firebug?) or HTTP debugger (TamperData / Fiddler) will give you most of what you require with a lot less effort.
On the other hand, if you really want to do this with bulk data / arbitrary sites, then (based on what I've seen so far) you could use a Squid proxy with an ICAP server/eCAP module (not trivial - it will involve a significant amount of programming) or implement the JavaScript via Greasemonkey as a browser extension.
Just to clarify, so far I've worked out how to catch function and method calls (including constructor calls) and proxy them within my own code, but not yet how to deal with processing triggered by directly setting a property (e.g. img.src='http://hackers-r-us.com'), nor how to handle ActiveX neatly.
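As an illustration of the method-proxying idea for the AJAX part specifically, wrapping XMLHttpRequest is enough to see every call and its result. This is a bare sketch, not a complete recorder:

// Wrap XMLHttpRequest so every AJAX call and its response are logged
// before the original behaviour runs.
(function () {
  var origOpen = XMLHttpRequest.prototype.open;
  var origSend = XMLHttpRequest.prototype.send;

  XMLHttpRequest.prototype.open = function (method, url) {
    this._recorded = { method: method, url: url };    // stash the call details for later
    return origOpen.apply(this, arguments);
  };

  XMLHttpRequest.prototype.send = function () {
    var xhr = this;
    xhr.addEventListener('load', function () {
      console.log('AJAX', xhr._recorded, 'status', xhr.status, 'response', xhr.responseText);
    });
    return origSend.apply(this, arguments);
  };
})();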

How do web crawlers handle JavaScript?

Today a lot of content on the Internet is generated using JavaScript (specifically by background AJAX calls). I was wondering how web crawlers like Google's handle them. Are they aware of JavaScript? Do they have a built-in JavaScript engine? Or do they simply ignore all JavaScript-generated content on the page (I guess that's quite unlikely)? Do people use specific techniques for getting content indexed that would otherwise only be available to a normal Internet user through background AJAX requests?
JavaScript is handled by both Bing and Google crawlers. Yahoo uses the Bing crawler data, so it should be handled as well. I didn't look into other search engines, so if you care about them, you should look them up.
Bing published guidance in March 2014 on how to create JavaScript-based websites that work with their crawler (mostly related to pushState); the recommendations are good practice in general (a small pushState sketch follows this list):
Avoid creating broken links with pushState
Avoid creating two different links that link to the same content with pushState
Avoid cloaking. (Here's an article Bing published about their cloaking detection in 2007)
Support browsers (and crawlers) that can't handle pushState.
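A small sketch of what "pushState without broken links" means in practice: every enhanced link keeps a real, crawlable href, and the history stays in sync so back/forward (and a crawler or no-JS browser following the href) still work. loadContent is a placeholder for whatever fetches and swaps in the page content:

// Enhanced navigation that still leaves real URLs for crawlers and no-JS browsers.
document.querySelectorAll('a[data-enhanced]').forEach(function (link) {
  link.addEventListener('click', function (e) {
    e.preventDefault();
    history.pushState({}, '', link.href);   // address bar shows a real, shareable URL
    loadContent(link.href);                 // placeholder: fetch and swap in the content
  });
});

window.addEventListener('popstate', function () {
  loadContent(location.href);               // back/forward keep working
});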
Google later published guidance in May 2014 on how to create JavaScript-based websites that work with their crawler, and their recommendations are also worth following:
Don't block the JavaScript (and CSS) in the robots.txt file.
Make sure you can handle the load of the crawlers.
It's a good idea to support browsers and crawlers that can't handle (or users and organizations that won't allow) JavaScript.
Tricky JavaScript that relies on arcane or specific features of the language might not work with the crawlers.
If your JavaScript removes content from the page, it might not get indexed.
Most of them don't handle Javascript in any way. (At least, all the major search engines' crawlers don't.)
This is why it's still important to have your site gracefully handle navigation without Javascript.
I have tested this by putting pages on my site only reachable by Javascript and then observing their presence in search indexes.
Pages on my site which were reachable only by Javascript were subsequently indexed by Google.
The content was reached through JavaScript with a 'classic' technique of constructing a URL and setting window.location accordingly.
Precisely what Ben S said. And anyone accessing your site with Lynx won't execute JavaScript either. If your site is intended for general public use, it should generally be usable without JavaScript.
Also, related: if there are pages that you would want a search engine to find, and which would normally arise only from JavaScript, you might consider generating static versions of them, reachable by a crawlable site map, where these static pages use JavaScript to load the current version when hit by a JavaScript-enabled browser (in case a human with a browser follows your site map). The search engine will see the static form of the page, and can index it.
Crawlers don't parse JavaScript to find out what it does.
They may be built to recognise some classic snippets like onchange="window.location.href=this.options[this.selectedIndex].value;" or onclick="window.location.href='blah.html';", but they don't bother with things like content fetched using AJAX. At least not yet, and content fetched like that will always be secondary anyway.
So JavaScript should be used only for additional functionality. The main content that you want the crawlers to find should still be plain text in the page, with regular links that the crawlers can easily follow.
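To make that concrete, here is a plain link the crawler can follow, progressively enhanced for JavaScript-enabled browsers; the URLs and the #content container are assumptions for illustration:

<!-- The crawler follows the ordinary href; script-enabled browsers intercept it. -->
<a href="/articles/page-2.html" class="ajax-nav">Next page</a>

<script>
document.addEventListener('click', function (e) {
  var link = e.target.closest('a.ajax-nav');
  if (!link) return;
  e.preventDefault();
  fetch(link.href)
    .then(function (r) { return r.text(); })
    .then(function (html) {
      document.getElementById('content').innerHTML = html; // assumes a #content element
    });
});
</script>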
Crawlers can handle JavaScript or AJAX calls if they are using some kind of framework like HtmlUnit or Selenium.

What are advantages of using google.load('jQuery', ...) vs direct inclusion of hosted script URL?

Google hosts some popular JavaScript libraries at:
http://code.google.com/apis/ajaxlibs/
According to google:
The most powerful way to load the libraries is by using google.load() ...
What are the real advantages of using
google.load("jquery", "1.2.6")
vs.
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.2.6/jquery.min.js"></script>
?
Aside from the benefit of Google being able to bundle multiple files together on the request, there is no perk to using google.load. In fact, if you know all libraries that you want to use (say just jQuery 1.2.6), you're possibly making the user's browser perform one unneeded HTTP connection. Since the whole point of using Google's hosting is to reduce bandwidth consumption and response time, the best decision - if you're just using 1 library - is to call that library directly.
Also, if your site will be using any SSL certificates, you want to plan for this by calling the script via Google's HTTPS connection. There's no downside to calling an https script from an http page, but calling an http script from an https page will cause more obscure debugging problems than you would want to think about.
It allows you to dynamically load the libraries in your code, wherever you want.
Because it lets you switch directly to a new version of the library in the javascript, without forcing you to rebuild/change templates all across your site.
It lets Google change the URL (but they can't since the URL method is already established)
In theory, if you do several google.load()s, Google can bundle them into one file, but I don't think that is implemented.
I find it's very useful for testing different libraries and different methods, particularly if you're not used to them and want to see their differences side by side, without having to download them. It appears that one of the primary reasons to do it would be that it is asynchronous, versus the synchronous script call. You also get some neat stuff that is directly included in the Google loader, like client location. You can get the user's latitude and longitude from it. Not necessarily useful, but it may be helpful if you're planning to have targeted advertising or something of the like.
Not to mention that dynamic loading is always useful. Particularly to smooth out the initial site load. Keeping the initial "site load time" down to as little as possible is something every web designer is fighting an uphill battle on.
You might want to load a library only under special conditions.
Additionally the google.load method would speed up the initial page display. Otherwise the page rendering will freeze until the file has been loaded if you include script tags in your html code.
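For reference, the loader form these answers describe looks roughly like this, assuming the old www.google.com/jsapi loader the question refers to; google.setOnLoadCallback fires once the requested libraries have finished loading:

<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
  // Ask the loader for jQuery, then wait for the callback before using it;
  // page rendering is not blocked while the library downloads.
  google.load("jquery", "1.2.6");
  google.setOnLoadCallback(function () {
    $("body").addClass("js-ready");   // jQuery is guaranteed to be available here
  });
</script>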
Personally, I'm interested in whether there's a caching benefit for browsers that will already have loaded that library as well. Seems like if someone browses to google and loads the right jQuery lib and then browses to my site and loads the right jQuery lib... ...both might well use the same cached jQuery. That's just a speculative possibility, though.
Edit: Yep, at the very least when using the direct script tags to the location, the JavaScript library will be cached if someone has already called for the library from Google (e.g. if it were included by another site somewhere).
If you were to write a boatload of JavaScript that only used the library when a particular event happens, you could wait until the event happens to download the library, which avoids unnecessary HTTP requests for those who don't actually end up triggering the event. However, in the case of libraries like Prototype + Scriptaculous, which downloads over 300kb of JavaScript code, this isn't practical.
