Google Fonts German GDPR IP Address - javascript

I (or a lot of German people) need your help.
In Germany, more and more website operators are receiving a legal letter with a warning and are supposed to pay around €170. The problem is that it doesn't stop there, which means that if you pay the €170, someone else can come right away and warn you again.
It's about Google Fonts. Many WordPress websites use themes that load Google Fonts. A German court has ruled that sending the visitor's IP address to Google by loading Google Fonts is not allowed and violates the visitor's rights.
Since I run a few websites, I'm now looking for a solution, but to be honest I'm coming up against technical limits. So I want to open this thread to discuss possibilities.
I have listed the issues below and reference them by number next to each option.
I can think of the following options:
1. Create a child theme and then load the Google Fonts locally. (Issues: 1, 2, 3, 4)
2. Service worker that rewrites the URLs. (Issue: 5)
3. Nginx rewrite: rewrite the PHP output and replace the Google Fonts URLs. (Issues: 1, 4)
4. More?
Issues:
1. If you have integrated a script (Google Maps, reCAPTCHA, Intercom, ...), Google Fonts can be loaded again by JavaScript.
2. Theme updates.
3. A lot of work with multiple customers.
4. Plugins load elements only on certain pages or only later, so Google Fonts can be loaded again.
5. Only works if the service worker is installed.
I am open for any idea. It looks like Google will not fix this.

There is no easy technical fix. The only long-term fix is to review how you include any third-party content on your websites, in case this embedding causes any visitor personal data to flow to such third parties.
This is not a new development. A lot of the relevant compliance steps already entered the (German) mainstream in the early 2010s when the problem was Facebook's “Like button”. The generally accepted solution for that is that the third party content is not loaded directly. Instead, a placeholder widget is rendered that indicates which content would be available there. Then, the user can give consent with one click and the actual embedded content is loaded.
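A minimal sketch of that placeholder technique in browser JavaScript. The markup is an assumption made for illustration: each placeholder is an element with the class embed-placeholder, a data-src attribute holding the embed URL, and a button inside it. The function names are hypothetical as well.

```javascript
// Hypothetical helper: swaps a consent placeholder for the real embed.
// Assumed markup: <div class="embed-placeholder" data-src="..."><button>Load map</button></div>
function loadEmbedOnConsent(placeholder, doc) {
  const frame = doc.createElement('iframe');
  frame.src = placeholder.dataset.src; // the third-party request only happens now
  frame.width = placeholder.dataset.width || '600';
  frame.height = placeholder.dataset.height || '400';
  placeholder.replaceWith(frame);
  return frame;
}

// Wires up every placeholder button; nothing is requested from the
// third party until the visitor clicks.
function initConsentPlaceholders(doc) {
  doc.querySelectorAll('.embed-placeholder').forEach((placeholder) => {
    const button = placeholder.querySelector('button');
    if (button) {
      button.addEventListener('click', () => loadEmbedOnConsent(placeholder, doc), { once: true });
    }
  });
}

// Only run automatically when a real DOM is available.
if (typeof document !== 'undefined') {
  initConsentPlaceholders(document);
}
```

Whether a single click counts as valid consent for a given embed is a legal question that this sketch does not answer; it only shows that the request can be deferred until the user acts.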
With Google Fonts, no such consent flow is needed or appropriate. All of the fonts on Google Fonts are Open Source licensed – you are allowed to use and distribute them for free, but subject to the license conditions (like making the license notice available to users). So on a technical level, it is easy to self-host the fonts in question.
What is tricky is efficiently rewriting the requests caused by your websites to point to your own servers instead of to the Google servers. You have identified a couple of approaches and have identified their pros and cons. Some comments:
Client-side rewriting sounds very fragile, I'd avoid it.
Server-side rewriting can be very powerful, but would also be somewhat fragile. The main advantage of such rewrites would be that it doesn't just handle Google Fonts embeds from your themes, but also requests inserted by server-side plugins.
Updating the theme is the only reliable long-term solution. Creating a child theme might be a suitable stop-gap measure until the theme developer fixes the problem. Similarly, it may be necessary to temporarily modify WordPress plugins.
I think that as a band-aid, server-side rewrites will be good enough to prevent many automated scanning tools used by these cease-and-desist lawyers from sounding the alarm on your sites.
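As a rough illustration of such a server-side rewrite, here is a sketch of a function that could run on the generated HTML before it is sent out. It assumes you have already downloaded the fonts and serve a stylesheet for them at /fonts/local-fonts.css (a hypothetical path), and it only catches link tags in the HTML, not @import rules or requests added later by JavaScript.

```javascript
// Assumption: a self-hosted stylesheet for the downloaded fonts exists at this path.
const LOCAL_FONT_CSS = '/fonts/local-fonts.css';

// Matches <link> tags whose href points at fonts.googleapis.com or fonts.gstatic.com.
const FONT_LINK = /<link\b[^>]*\bhref=(["'])(?:https?:)?\/\/fonts\.(googleapis|gstatic)\.com[^"']*\1[^>]*>/gi;

function rewriteGoogleFonts(html) {
  let inserted = false;
  return html.replace(FONT_LINK, (tag, quote, host) => {
    // Drop hints pointing at gstatic and any repeated Google Fonts links.
    if (host.toLowerCase() === 'gstatic' || inserted) return '';
    // The first googleapis link becomes the link to the local stylesheet.
    inserted = true;
    return `<link rel="stylesheet" href="${LOCAL_FONT_CSS}">`;
  });
}
```

A regular expression over HTML is exactly the kind of fragility mentioned above, so treat this as a stop-gap and still check the result in a browser.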
However, you have correctly identified that especially JavaScript could cause problems for achieving actual compliance. This is why you should revisit your decisions about what plugins and scripts you have integrated. Loading third party JavaScript has pretty much the same legal consequences as loading fonts from Google, so you should only do it if it's actually necessary for your site (where necessity depends on the user's perspective), or if the user has given valid consent. For example, you can use the placeholder widget technique mentioned above for embedded content like Google Maps or Intercom, whereas loading a Captcha may indeed be strictly necessary on some pages.
For testing these issues, I'd recommend installing Firefox with the uBlock Origin addon, and setting the addon to hard mode. This will block all third-party/cross-origin requests. You can then allowlist those domains that are under your direct control, or are provided by your data processors (who are contractually bound to only use the personal data as instructed by you, and are considered first-party for GDPR purposes), or domains for which you have a legal basis to share the data (e.g. a “legitimate interest” to load stuff that is strictly necessary for your site to work, or to investigate what requests are made when the user gives consent).
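If you want a quick first pass before the browser test, a small script can list the external hosts referenced in a page's HTML. This is only a sketch with a hypothetical function name. It looks at script, img, iframe and link tags in static HTML, so it will miss anything that JavaScript loads later and may over-report harmless links. It does not replace the uBlock Origin check.

```javascript
// Lists hosts other than the page's own host that are referenced
// by script, img, iframe and link tags in the given HTML.
function findThirdPartyHosts(html, pageUrl) {
  const firstParty = new URL(pageUrl).hostname;
  const hosts = new Set();
  const attr = /<(?:script|img|iframe|link)\b[^>]*?\b(?:src|href)=(["'])(.*?)\1/gi;
  for (const match of html.matchAll(attr)) {
    let url;
    try {
      url = new URL(match[2], pageUrl); // resolves relative and protocol-relative URLs
    } catch {
      continue; // skip values that cannot be parsed as URLs
    }
    if (/^https?:$/.test(url.protocol) && url.hostname !== firstParty) {
      hosts.add(url.hostname);
    }
  }
  return [...hosts].sort();
}
```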

IANAL, but the following two sections may be relevant.
Using their APIs. From what I can tell, nothing here explicitly forbids proxying.
API Prohibitions on sublicensing. The last part of the statement, "and offer it for use by third parties", means you're okay as long as you're not offering it for other people to use.
I have a Google Fonts Proxy Docker image which I use for my own stacks. I don't offer my running proxy for use with other services, but that does not mean you can't simply deploy my image on your own servers.
This won't resolve your 3rd party Google services such as Maps though.

Related

Aren't Javascript analytics scripts susceptible to easy data hacks?

On Production environments, Javascript based analytics scripts (Google Analytics, Facebook Pixel etc.), are injected into most web applications, along with the Unique ID/Pixel ID, in plain Javascript.
For example, airbnb uses Google Analytics. I can open up my dev console and run
setInterval(function() {ga('send', 'pageview');}, 1000);
which will cause the analytics pixel to be requested every 1 second, forever. That is 3600 requests an hour from my machine alone.
Now, this can easily be done in a distributed fashion, causing millions of requests per second and completely skewing the Google Analytics data for the pageview event. I understand that the huge amount of data collected would correct this skew to a certain extent, but an attacker can easily compensate for that by increasing the number of requests.
My question is this: are there any safeguards to prevent competitors or malicious individuals from destroying the data integrity of applications in this manner? Does GA or Facebook provide such options?
Yes, but the unsafe part doesn't come from the JavaScript. For example, you can use the Measurement Protocol to flood an account with data. Here you can see a lot of people in the same community having trouble with this (and it's quite simple to solve).
https://stackoverflow.com/search?q=spam+google+analytics
All these measurement systems use HTTP calls to fill the data in your "database". If you are able to build the correct call, you can spam everyone, everywhere (but don't do it, don't be evil).
https://developers.google.com/analytics/devguides/collection/protocol/v1/?hl=es-419
This Google Analytics page explains what the Measurement Protocol is. The JavaScript only works as a framework to build and send the hit.
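To make that concrete, here is a sketch of what a Measurement Protocol v1 pageview hit looks like when it is built by hand. The parameters v, tid, cid, t and dp are the ones described in the linked reference. The function name is hypothetical and the tracking ID in the usage line is a placeholder. Nothing is sent here; the function only builds the URL.

```javascript
// Builds the URL of a Measurement Protocol (v1) pageview hit without sending it.
function buildPageviewHit(trackingId, clientId, pagePath) {
  const params = new URLSearchParams({
    v: '1',          // protocol version
    tid: trackingId, // property ID (placeholder format: UA-XXXXX-Y)
    cid: clientId,   // anonymous client ID
    t: 'pageview',   // hit type
    dp: pagePath,    // document path
  });
  return `https://www.google-analytics.com/collect?${params}`;
}

console.log(buildPageviewHit('UA-XXXXX-Y', '555', '/home'));
// https://www.google-analytics.com/collect?v=1&tid=UA-XXXXX-Y&cid=555&t=pageview&dp=%2Fhome
```

Any HTTP client that can request such a URL can create a hit, which is why the protection has to come from the collecting side and from your filters, not from the JavaScript.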
But not everything is lost.
For example, if you try that code in your browser, the Google Analytics framework limits you to 1 call per second and 150 per session (or cookie value). Yes, it's not complicated to get past that barrier, but after that other barriers will come.
So if the attacker only uses the JavaScript framework, you are fairly safe. Now imagine someone does the same with Python, sending HTTP requests directly to the Google Analytics server. It's possible, but there are two important things to say:
Google Analytics has a proactive "firewall" to detect spammers and ban them. (How and when they do this is not public.) In my case I see a lot fewer spammers than a few years ago.
There are also a couple of good practices to avoid this. For example, only store data from an allowlist of domains by creating a filter that only includes traffic from your domain:
https://support.google.com/analytics/answer/1033162?hl=en
It's also very good practice to protect your e-commerce data by using a filter that only includes data from a certain store or with a certain parameter, for example "brand == my brand" or "CustomDimension == true". Exclude transactions with products over $1,000 (check your limits and apply proactive filters). All these barriers make the data harder to break.
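The logic of such filters can be sketched in a few lines. This is only an illustration of the idea, not how Google Analytics implements filters (you configure those in the admin interface). The hit fields and rule names are assumptions made for the example.

```javascript
// Illustrative only: mirrors the idea of include/exclude filters.
// `hit` and `rules` are hypothetical shapes, not a Google Analytics API.
function keepHit(hit, rules) {
  // Include filter: only accept traffic reported for your own hostnames.
  if (!rules.allowedHostnames.includes(hit.hostname)) return false;
  // Exclude filter: drop transactions above a plausible upper limit.
  if (hit.transactionRevenue !== undefined &&
      hit.transactionRevenue > rules.maxTransactionRevenue) {
    return false;
  }
  return true;
}

const rules = { allowedHostnames: ['www.example.com'], maxTransactionRevenue: 1000 };
console.log(keepHit({ hostname: 'www.example.com' }, rules));  // true
console.log(keepHit({ hostname: 'spam.example.net' }, rules)); // false
```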
If you do this, you will protect your domain a lot (because it's very complicated to guess the valid combination of UA ID and domain when you build a robot), but you know, any system can be broken. In my experience I have only seen 2 or 3 cases of damage coming from spammers or people who want to hurt you, and all of them could have been prevented if I had created a proactive filter. Usually spammers only want to spam ads into your account and almost never want to hurt you. With Facebook, Piwik and other tools, more or less the same happens.

Calling tracking JavaScript from AMP pages

We use an in-house tracking mechanism for our website. We include our tracking.js file on all our pages.
Every page passes some information in a JS object to this script, which later sends the information to our tracking application through a Spring controller.
To make pages load faster, we have moved some pages to AMP templates.
But AMP does not allow us to use tracking.js.
We tried an iframe tag, but it does not allow HTTP calls (it only allows HTTPS calls).
Could you please suggest a way to do this? It is very critical, and we cannot move to HTTPS right now because of other limitations.
Thanks
Virendra Agarwal
You can't use tracking.js with AMP as it is considered as an external library. It's written on their How It Works page that it won't allow author-written/3rd party JS:
"One thing we realized early on is that many performance issues are
caused by the integration of multiple JavaScript libraries, tools,
embeds, etc. into a page. This isn’t saying that JavaScript
immediately leads to bad performance, but once arbitrary JavaScript is
in play, most bets are off because anything could happen at any time
and it is hard to make any type of performance guarantee. With this in
mind we made the tough decision that AMP HTML documents would not
include any author-written JavaScript, nor any third-party scripts."
Only the components on this AMP example can be used.
We worked with Google and got it sorted.
You can add your API to AMP pages after validation by Google.
This API must be served over HTTPS, and all calls must be validated by Google.
Google will then whitelist it for AMP pages, and you can use that code in production.

Speed up slow loading website on mobile due to 3rd party analytics scripts

Speed tests report that my site loads in about 8 seconds on mobile and 2 on desktop.
When looking at the "waterfall" of the asset/script loading it seems to be mostly due to 3rd party analytic scripts like zendesk, hubspot, and google analytics.
Is there any way to optimize a site for mobile when using these types of scripts?
As far as the files go on my site I've optimized them nearly as much as I could. I've even used a cron (probably not a great idea) to cache the google analytics script locally and fetch a newer version every few days.
I've looked into using Tag Manager or something like Segment to optimize the script loading of these files, but I'm not sure if that will actually improve performance or if those services are mostly just for convenience.
I've also looked at service workers for mobile app caching, but I'm not sure if that will help either and don't want to dive into learning how to use them if it won't actually make a difference.
To sum up, is there a way to speed up mobile with multiple 3rd party analytics scripts, or am I just going to have to forgo using some of them, or possibly add a mobile version or AMP version that leaves some or all of them out?
Since you've tagged AMP in your question, I'll answer from the AMP perspective:
AMP solves this problem by using a mediator that connects directly to the endpoints of various analytics providers, replacing their clunky and power hungry JS scripts.
Of course, AMP comes with its own set of constraints and is really only targeted at powering static content, but if you're willing to go that route, learn more about AMP Analytics here.

Widget - Iframe versus JavaScript

I have to develop a widget that will be used by a third party site. This is not an application to be deployed in a social networking site. I can give the site guys a link to be used as the src of an iframe or I can develop it as a JavaScript request.
Can someone please tell me the trade offs between the 2 approaches(IFrame versus JS)?
I was searching about the same question and I found this interesting article:
http://prettyprint.me/prettyprint.me/2009/05/30/widgets-iframe-vs-inline/
Widgets are small web applications that can easily be added to any web
page. They are sometimes called Gadgets and are vastly used in growing
number of web pages, blogs, social sites, personalized home pages such
as iGoogle, my Yahoo, netvibes etc. In this blog I use several
widgets, such as the RSS counter to the right which displays how many
users are subscribed to this blog (don’t worry, it’ll grow, that’s a
new blog ;-) ). Widgets are great in the sense that they are small
reusable piece of functionality that even non-programmers can utilize
to enrich their site.
I’ve written several such widgets over the time both “raw” widgets
that can get embedded in any site as well as iGoogle gadgets which are
more structured, WordPress*, TypePad and Blogger widgets, so I’m happy
to share my experience.
As a widget author, for widgets that run on the client side (simple
embeddable HTML code) you have the choice of writing your widget
inside an iframe or simply inline in the page and make it part of the dom
of the hosting page. The rest of the post discusses the pros and cons
of both methods.
How is it technically done? How to use an iframe or how to implement
an inline widget?
Iframes are somewhat easier to implement. The following example
renders a simple iframe widget:
<iframe src='http://my-great-widget.com/widget' width="100" height="100" frameborder='0'></iframe>
frameborder='0' is used to make sure the iframe doesn’t have a border
so it looks more natural on the page. The
http://my-great-widget.com/widget is responsible for serving the widget
content as a complete HTML page.
Inline gadgets might look like this:
function createMyWidgetHtml() { return "Hello world of widgets"; }
document.getElementById('myWidget').innerHTML = createMyWidgetHtml();
As you can see, the function createMyWidgetHtml() is responsible for
creating the actual widget content and does not necessarily have to
talk to a server to do that. In the iframe example there must be a
server. In the inline example there does not need to be a server,
although if needed, it’s possible to get data from the server, which
actually is a very common case, widgets typically do call server side
code. Using the inline method server side code is invoked by means of
on-demand javascript.
So, to summarize, in the iframe case we simply place an iframe HTML
code and point the source of the iframe to a server location which
actually serves the content of the widget. In the inline case we
create the content locally using javascript. You may of course combine
usage of iframe with javascript as well as use of the inline method
with server side calls, you’re not restricted by that, but the paths
start differently.
So what is the big deal? What’s the difference? There are several
important differences, so here starts the interesting part of the
post.
Security. iFrame widgets are more secure.
What risks do gadgets impose and who’s actually being put at risk? The
user of the site and the site’s reputation are at risk.
With inline gadgets the browser thinks that the source of the gadget
code comes from the hosting site. Let’s assume you’re browsing
your favorite mail application http://my-wonderful-email.com and this
mail application has installed a widget that displays a clock from
http://great-clock-widgets.com/. If that widget is implemented as an
inline widget the browser thinks that the widget’s code originated at
my-wonderful-email.com and not at great-clock-widgets.com and so it’ll
let the widget’s code ultimately get access to the cookies owned by
my-wonderful-email.com and the widget’s evil author will steal your
email. It’s important to realize that browsers don’t care about where
the javascript file is hosted; as long as the code runs on the same
frame, the browser regards all code as originating at the frame’s
domain. So, you as a user get hurt by losing control over your email
account and my-wonderful-email gets hurt by losing its reputation.
If the same clock would have gotten implemented inside an iframe and
the iframe source is different from the page source (which is the
common case, e.g. the page source is my-wonderful-email.com and the
gadget source is great-clock-widgets.com) then the browser would not
allow the clock widgets access to the page cookies, nor will it allow
access to any other part of the hosting document, including the host
page dom. That’s way more secure. As a matter of fact, personal home
pages such as iGoogle don’t even allow inline gadgets, only iframe
gadgets are allowed. (inline gadgets are allowed only in rare cases,
only after thorough inspection by the iGoogle team to make sure
they’re not malicious)
To sum up, iframe widgets are way more secure. However, they are also
way more limited in functionality. Next we’ll discuss what you lose in
functionality.
Look and feel. In the look and feel battle inline gadgets (usually**)
win. The nice thing about them is that they can be made to look as
part of the page. They can inherit CSS styles from the page, including
fonts, colors, text size etc. Iframes, OTOH, must define their CSS from
the grounds up so it’s pretty hard for them to blend nicely in the
page.
But what’s even more important is that iframes must declare what their
size is going to be. When adding an iframe to a page you must include
a width and a height property and if you don’t, the browser will use
some default settings. Now, if your widget is a clock widget that’s
easy enough b/c you know exactly what size you want it to be, but in
many cases you don’t know ahead of time how much space your widget is
going to take. If, for example you’re authoring a widget that displays
a list of some sort and you don’t know how long this list is going to
be or how wide each item is going to be. Usually in HTML this is not a
big deal because HTML is a declarative based language so all you need
to do is tell the browser what you want to display and the browser
will figure out a reasonable layout for it, however with iframe this
is not the case; with iframes browsers demand that you tell them exactly
what the iframe size is and they will not figure it out by themselves. This
is a real problem for widget authors that want to use iframes – if you
require too much space the page will have voids in it and if you
specify too little the page will have scrollbars in it, god forbid.
Look and feel wise, inline wins. But note that this really depends on
your widget application. If all you want to do is a clock, you may get
along with an iframe just as well.
Server side vs. client side. IFrames require you to specify a src URL so
when implementing a widget using an iframe you must have server side
code. This could both be a limitation and a headache to some (owning a
server, domain name etc, dealing with load, paying network bills etc)
but to others this is actually a point in favor of iframes b/c it
lets you completely write your widgets in server side technologies,
so you can write a lot of the code and actually almost all of it using
your favorite server side technology whether it be asp.net, django,
ror, jsp, struts , perl or other dinosaurs. When implementing an
inline gadget you’ll find yourself more and more practicing your
javascript Ninja.
What’s the decision algorithm then? Widget authors: If the widget can
be implemented as an iframe, prefer an Iframe simply for preserving
users security and trust. If a widget requires inlining (and the
medium allows that, e.g. not iGoogle and friends) use inline but dare
not exploit users trust!
Widget installers: When installing a widget in your blog you don’t see
a “safe for users” ribbon on the widgets. How can you tell if the
widget is safe or not? There are two alternatives I can suggest: 1)
trust the vendor 2) read the code. Either you trust the widget
provider and install it anyway or you take the time to read its code
and determine yourself whether it’s trustworthy or not. Reality is
that most site owners don’t bother reading code or are not even aware
of the risk they’re putting their users at, and so widget providers
are blindly trusted. In many cases this is not an issue since blogs
don’t usually hold personal information about their readers. I suspect
things will start changing once there are few high profile exploits
(and I hope it’ll never get to it).
Users: Users are kept in the dark. Just as there are no “safe for
users” ribbons on widgets site owners install, there are no “safe to
use” sites and basically users are kept in the dark and have no idea,
even if they have the technical skills, whether or not the site they
are using contains widgets, whether the widgets are inline or not and
whether they are malicious. Although in theory a trained developer can
inspect the code up-front, before running it in her browser and losing
her email account to a hacker, however this is not practical and there
should be no expectation that users en masse will do that. IMO this is
an unfortunate condition and I only hope attackers will not find a way
of taking advantage of that and doom the wonderful open widget culture
on the web.
Happy widgeting folks!
* Some blog platforms have somewhat different structures for widgets and they may sometimes have both widgets and plugins that may
correlate in their functionality, but for the matter of the discussion
here I’ll loosely use the term widget to discuss the “raw” type which
consists of client side javascript code
** Although in most cases you’d want widgets to inherit styles from the hosting page to make them look consistent with it, sometimes you
actually don’t want the widget to inherit styles from the page, so in
this case iFrames let you start your CSS from scratch.
Why not do both?
I prefer to offer third party sites a script like:
<script type="text/javascript" src="urlToYourScript"></script>
The file on your server looks like:
document.writeln('<iframe src="pathToYourWidget" ' +
  'name="MagicIframe" width="300" height="600" align="left" scrolling="no" ' +
  'marginheight="0" marginwidth="0" frameborder="0"></iframe>');
UPDATE:
One disadvantage of using an iframe that points to a URL on your server is that you do not get a "real" backlink when someone clicks a link inside the iframe that points to your server.
I'm sure many developers/site owners would appreciate a Javascript solution that they can style to their needs rather than using an iframe. If I was going to include a component from a third party, I would rather do it via Javascript because I would have more control.
As far as ease of use, both are similar in simplicity, so no real tradeoff there.
One other thought, make sure you get a SSL cert for whatever domain you're hosting this on and write out the include statement accordingly if the page is served over SSL. In case your site owners have a reason for using SSL, they would surely appreciate this, because Firefox and other browsers will complain when a page is served with a mix of secure/insecure content.
If the widget can be embedded in an iframe, it will be better for the frontend performance of the hosting site as iframes do not block content download. However, as others have commented there are other drawbacks to using iframes.
If you do implement in javascript, please consider frontend performance best practices when developing. In particular, you should look at Non blocking javascript loading. Google analytics and other 3rd party widget providers support this method of loading. It would also help if you can load the javascript at the bottom of the page.
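A sketch of the non-blocking loading pattern, similar in spirit to the snippets that analytics providers hand out. The widget URL is a hypothetical placeholder and the function name is my own. The script element is created dynamically and marked async, so the download does not hold up parsing of the host page.

```javascript
// Injects a script tag without blocking HTML parsing.
// `doc` is passed in so the function can be exercised outside a browser.
function loadScriptAsync(src, doc) {
  const script = doc.createElement('script');
  script.src = src;
  script.async = true; // download in parallel, execute when ready
  // Insert before the first existing script tag, which always exists
  // when this code itself was loaded from a script tag.
  const first = doc.getElementsByTagName('script')[0];
  first.parentNode.insertBefore(script, first);
  return script;
}

// Hypothetical widget URL; only runs when a real DOM is available.
if (typeof document !== 'undefined') {
  loadScriptAsync('https://widgets.example.com/widget.js', document);
}
```

Because the script may execute at any point after the page has been parsed, a widget loaded this way should not rely on document.write.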
Nice to know that it's not to be deployed in a social networking site... that merely leaves the rest of the web ;-)
What would be most useful depends on your widget. IFrames and javascript generally serve quite different purposes, and can be mixed (i.e. javascript inside an iframe, or javascript creating an iframe).
IFrames have sizing issues; if it's supposed to be an exact fit to the page, do you know that it renders the same on all browsers, that the data won't overflow its container, etc.?
IFrames are simple. They can be a simple, static HTML-page.
When using IFrames, you expose your widget quite plainly.
But then again, why not have your third party site simply include the HTML at a given URL? The HTML can then be extended to contain javascript when/if you need it.
Pure Javascript allows for more flexibility but at the cost of some complexity.
The big plus of iframes: all CSS and JS is separated from the host page, so your existing CSS just works. (If you want the host site to style your content to fit in, that's a minus of course.)
The big minus of iframes: they have a fixed width and height and scroll-bars will appear if your content is larger.

Hosted Yui, Google maps, JQuery - an easy way of monitoring website usage?

The Yahoo Javascript library (YUI), JQuery and less so Google maps all allow you to reference their files using the following format:
<script type="text/javascript" src="http://yui.yahooapis.com/2.6.0/build/yahoo-dom-event/yahoo-dom-event.js"></script>
This does a request for the script from their servers, which will also pass to their web server the HTTP referrer. Do Yahoo etc. use this to produce statistics on which websites get what traffic? Or is this a conspiracy theory?
Of course their servers most of the time will be a lot faster than any small company would buy, so using the hosted version of the script makes more sense.
Chris,
I work on the YUI team at Yahoo.
We host only YUI on yui.yahooapis.com; Google hosts YUI and many other libraries on its CDN. I can tell you from the Yahoo side that we don't monitor site usage of YUI from our CDN. We do track general growth of yui.yahooapis.com usage, but we don't track which sites are generating traffic. You're right to suggest that we could track usage -- and we state as clearly as we can in our hosting docs that you should only use this kind of service if the traffic logs generated on our side don't represent a privacy concern for you.
In general, though, I don't regard CDN traffic for library usage to be a reliable measurement of anything. Most YUI usage, even at Yahoo, doesn't use yui.yahooapis.com or Google's equivalent, and I'm sure the same is true for other libraries. And even when a site is using YUI from our servers, we wouldn't have comprehensive traffic data of the kind you'd get from Google Analytics or Yahoo Analytics -- because not all pages would use YUI or the CDN uniformly.
Given the advantages of the hosted service -- including SSL from Google and YUI combo-handling from Yahoo -- I see the CDN as being a big win for most implementers, with little downside.
-Eric
Of course they produce statistics - at minimum they need to know how many resources they spend on hosting these scripts. And it's also nice to know who uses your code.
I don't think it's a bad thing.
And using a hosted version makes even more sense because your visitors might have the script already cached after visiting another site.
Sure, they can easily have statistics about which sites use YUI and how often, and also which parts of the YUI API are more popular (among small sites). However, they cannot know what exactly website visitors do with their libs.
Given that they (Google & Yahoo) index a lot of web pages, they can get even more precise statistics if they analyze their indexes. So you cannot hide that you are using YUI if your site is public.
The same applies to Google maps and jQuery.
