I want to add facebook like buttons to different pages and use different titles, descriptions and images.
The problem now is: Facebook uses meta tags from the page header to determine these values, e.g. <meta property="og:title" content="..." />. I use GWT and therefore only have one host page (e.g. index.html), into which different content is rendered:
"www.myurl.com#blogpost:1" would load the blogpost with id "1". Therefore every blogpost would have the same title, description, image. I could change the metatags with javascript according to which blogpost is requested. But I guess javascript is not executed by the facebook parser. Is there a way to realize different like buttons with only one host page?
I now generate a special link for Facebook. So if my GWT URL looks like "www.myurl.com#blogpost:1", I generate the URL "www.myurl.com/fb/blogpost/1". In a Servlet Filter I then check for URLs starting with "/fb". If I find a request with a URL like that, I just write out the meta tags and a JavaScript forward to my actual page, "www.myurl.com#blogpost:1". The Facebook crawler just sees the meta tags and doesn't follow the JavaScript forward.
Normal users, on the other hand, are forwarded to the regular page. This works pretty well for me. Thanks CBroe for the hint.
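For reference, a minimal sketch of the markup such a filter might write out for "www.myurl.com/fb/blogpost/1" (the title, description and image values are placeholders, not from the original setup):

<!DOCTYPE html>
<html>
<head>
  <meta property="og:title" content="Blog post 1 title" />
  <meta property="og:description" content="Blog post 1 description" />
  <meta property="og:image" content="http://www.myurl.com/images/blogpost1.jpg" />
  <meta property="og:url" content="http://www.myurl.com/fb/blogpost/1" />
  <script type="text/javascript">
    // Real browsers follow this forward; the Facebook crawler does not execute JavaScript.
    window.location.href = "/#blogpost:1";
  </script>
</head>
<body></body>
</html>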
Related
I have a web page that allows a user to choose some options for a widget and then dynamically generates example HTML based on those options. The HTML is put in a div on the page so that the user can see how it looks and copy/paste it to their own site, if they so desire.
I would like to add a "view this example page" link, which opens in a new window and has the example HTML from the div, so that the example can instantly be seen in action.
Is there a way to do this with JavaScript/jQuery?
You can actually use the window.open method, saving a reference to the opened window, and then writing to it.
https://developer.mozilla.org/en-US/docs/Web/API/Window/open
var exampleWin = window.open("", "example");
var docMarkup = "<!doctype html><html><head><title>test</title></head>" +
"<body><p>Hello, world.</p></body></html>";
exampleWin.document.write(docMarkup);
// later you can also do exampleWin.close() if you wish
Try pasting the above code in your browser's developer tools console.
The usual way to accomplish the end goal works a bit differently. You have a web server listening for GET requests at /code (or similar) and it constructs and responds with the appropriate HTML based on the query string. So you can request /code?color=blue, for example.
Constructing documents is what web servers are designed to do. This approach allows you to leverage caching policies, integrate with a wider variety of user authentication and authorization systems, etc.
To display the source code to the user, simply fetch() the appropriate URL and put the contents in a <code> tag. To display the rendered widget, use an <iframe> whose src is the same URL.
If you really want it to be a new window, open() the URL instead of using an iframe. But beware of popup blockers.
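For example, a minimal client-side sketch of that approach, assuming a /code endpoint and hypothetical element IDs:

// Assumes the server exposes /code and returns the generated HTML for the given options.
var url = '/code?' + new URLSearchParams({ color: 'blue' });

// Show the raw markup so the user can copy/paste it.
fetch(url)
  .then(function (res) { return res.text(); })
  .then(function (html) {
    document.querySelector('#example-source').textContent = html;
  });

// Render the same markup live.
document.querySelector('#example-frame').src = url;

The same URL drives both the copy/paste view and the live preview, so the two can never drift apart.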
I have an AngularJS application that is injected into 3rd party sites. It injects dynamic content into a div on the 3rd party page. Google is successfully indexing this dynamic content but does not appear to be crawling links within the dynamic content. The links would look something like this in the dynamic content:
<a href="http://www.example.com/support?title=Example Title&titleId=12345">Link Here</a>
I'm using query parameters for the links rather than an actual url structure like:
http://www.example.com/support/title/Example Title/titleId/12345
I have to use the query parameters as I don't want the 3rd party site to have to change their web server configuration to redirect unfound URLs.
When a link is clicked, I use the $location service to update the URL in the browser, and my Angular application responds accordingly: it shows just the relevant content based on the query params, sets the page title and sets the meta description.
Many of the articles I have read use the $routeProvider in AngularJS and templates, but I'm not sure why this would make a difference to the crawler.
I have read that Google should view URLs with query parameters as separate pages, so I don't believe that should be the issue:
https://webmasters.googleblog.com/2008/09/dynamic-urls-vs-static-urls.html
The only things I have not tried are 1. providing a sitemap with the URLs that have the query parameters and 2. adding static links from other pages to the dynamic links to help Google discover those pages.
Any help, ideas or insights would be greatly appreciated.
This happens because Google's crawlers are not able to get static HTML from your URL, since your pages are rendered dynamically with JavaScript. You can achieve what you want as follows.
Since #! is deprecated, you can tell Google that your pages are rendered with JavaScript by using the following tag in your header:
<meta name="fragment" content="!">
On finding the above tag, Google's bots will request your URLs from your server with the _escaped_fragment_ query parameter, like
http://www.example.com/?_escaped_fragment_=/support?title=Example Title&titleId=12345
Then you need to rebuild your original URL from the _escaped_fragment_ on your server, so it looks like this again:
http://www.example.com/support?title=Example Title&titleId=12345
Then you will need to serve static HTML to the crawler for that URL.
You can do that by using a headless browser to access the URL. PhantomJS is a good option: it renders your page, runs the JavaScript, and then writes the contents to a file, creating an HTML snapshot of your page. You can also save the snapshot on your server, so when Google's bots visit you can serve the snapshot directly instead of re-rendering the page again.
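A minimal PhantomJS sketch along those lines (the output file name and render delay are assumptions, not part of the original answer):

// snapshot.js -- run with: phantomjs snapshot.js "http://www.example.com/support?title=Example Title&titleId=12345"
var page = require('webpage').create();
var fs = require('fs');
var system = require('system');
var url = system.args[1];

page.open(url, function (status) {
  if (status !== 'success') {
    phantom.exit(1);
    return;
  }
  // Give the Angular app a moment to render, then dump the resulting DOM as a snapshot.
  window.setTimeout(function () {
    fs.write('snapshot.html', page.content, 'w');
    phantom.exit(0);
  }, 2000);
});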
The web crawler might run at a higher priority than the AngularJS interpretation of your dynamic links while it loads the page. Using ng-href makes the dynamic link interpretation happen at a higher priority. Hope it works!
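For example (item.title and item.titleId are hypothetical scope properties):

<a ng-href="/support?title={{item.title}}&titleId={{item.titleId}}">{{item.title}}</a>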
If you use URLs with #:
Nothing after the hash in the URL gets sent to your server. Since JavaScript frameworks originally used the hash as a routing mechanism, that's a main reason why Google created this protocol.
Change your URLs to #! instead of just #.
angular.module('myApp').config([
  '$locationProvider',
  function($locationProvider) {
    $locationProvider.hashPrefix('!');
  }
]);
This is how Google and Bing handle AJAX calls.
The documentation is mentioned here.
The overview, as given in the docs, is as follows:
The crawler finds a pretty AJAX URL (that is, a URL containing a #! hash fragment). It then requests the content for this URL from your server in a slightly modified form. Your web server returns the content in the form of an HTML snapshot, which is then processed by the crawler. The search results will show the original URL.
A step-by-step guide is given in the docs.
Since AngularJS is designed for the client side, you will need to configure your web server to summon a headless HTML browser to access your web page and deliver an HTML snapshot for the special Google URL.
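The answer does not prescribe a particular server, but as a rough illustration, a Node/Express middleware that serves pre-rendered snapshots for _escaped_fragment_ requests might look like this (the snapshots/ directory is an assumption):

var express = require('express');
var path = require('path');
var app = express();

// If the crawler sends _escaped_fragment_, serve a pre-rendered snapshot
// instead of the JavaScript app shell.
app.use(function (req, res, next) {
  var fragment = req.query._escaped_fragment_;
  if (fragment === undefined) {
    return next();
  }
  // e.g. /?_escaped_fragment_=/home maps to the snapshot for #!/home
  var snapshotFile = path.join(__dirname, 'snapshots', encodeURIComponent(fragment) + '.html');
  res.sendFile(snapshotFile, function (err) {
    if (err) {
      next(); // fall back to the normal page if no snapshot exists
    }
  });
});

app.use(express.static('public'));
app.listen(3000);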
If you use hashbang URLs, you need to instruct the Angular application to use them instead of regular hash values:
App.config(['$routeProvider', '$locationProvider', function($routes, $location) {
  $location.hashPrefix('!');
  $routes.when('/home', {
    controller : 'IndexCtrl',
    templateUrl : './pages/index.html'
  });
}]);
as mentioned in the code example here
However, if you do not wish to use hashbang URLs but still want to inform Google of your HTML content, you can use this meta tag:
<meta name="fragment" content="!" />
and then configure Angular to use HTML5-mode URLs:
angular.module('HTML5ModeURLs', []).config(['$locationProvider', function($location) {
  $location.html5Mode(true);
}]);
and then install whichever module you chose via your app module:
var App = angular.module('App', ['HashBangURLs']);
//or
var App = angular.module('App', ['HTML5ModeURLs']);
Now you will need a headless browser to access the URL.
You can use PhantomJS to download the contents of the page, run the JavaScript, and then write the contents to a temporary file.
PhantomRunner.js takes any URL as input, downloads and parses the HTML into a DOM, and then checks the data status.
Test each page by using the function defined here
A sitemap can also be generated, as shown in this example.
The best feature is that you can verify your site URL using Google Search Console.
Full attribution goes to the website and the author mentioned in this site.
UPDATE 1
Your crawler needs the pages as -
- com/
- com/category/
- com/category/page/
By default, however, Angular sets your pages up as such:
- com
- com/#/category
- com/#/page
Approach 1
The hashbang allows Angular to know which HTML elements to inject with JS, which can be done as mentioned before; but since it has been deprecated, another solution is the following.
Configure the $locationProvider and set up the base for relative links.
You can use the $locationProvider as mentioned in these docs and set html5Mode to true:
$locationProvider.html5Mode(true);
This lets Angular change the routing and URLs of your pages without refreshing the page.
Set the base in the head of your document: <base href="/">
The $location service will automatically fallback to the hashbang method for browsers that do not support the HTML5 History API.
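Putting those pieces together, a minimal sketch of the configuration (the module and route names are hypothetical):

angular.module('app', ['ngRoute']).config(['$locationProvider', '$routeProvider',
  function ($locationProvider, $routeProvider) {
    // Clean URLs via the HTML5 History API; $location falls back to
    // hashbang URLs in browsers that do not support it.
    $locationProvider.html5Mode(true);
    $locationProvider.hashPrefix('!');

    $routeProvider.when('/category/:page', {
      controller : 'PageCtrl',
      templateUrl : './pages/page.html'
    });
  }
]);

Remember the <base href="/"> tag in the document head, as noted above.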
Full attribution goes to the page and the author
There are also some other measures and tests that you can take care of, as mentioned in this document.
I'm planning to build a site where I can share my handpicked, curated content, and I can't wrap my head around the basic idea of getting that data fed into my site without going through an API.
I first thought maybe I should inspect the source HTML of the page I want to embed on my site and access it with something like $('div.post').find('img').attr('src').
But I can't imagine myself doing that every time so I guess there must be a better way.
It's what Google+ does with their posts. Once you add a URL, after a second it pulls the featured image and a text snippet from the linked page.
Many sites use the Open Graph protocol to expose the meta title, meta description, image etc. for any URL.
For example open: view-source:https://blog.kissmetrics.com/open-graph-meta-tags/ and search for "Open Graph Protocol Meta".
They are contained in the page source. You will have to send a request to the URL you want to crawl and read the appropriate meta tags using regular expressions or an HTML parser.
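For instance, a rough server-side sketch in Node (18+, for the built-in fetch); the regular expression assumes the common property-before-content attribute order and is not a substitute for a real HTML parser:

// Fetch a page and pull out its Open Graph meta tags.
async function getOpenGraph(url) {
  const res = await fetch(url);
  const html = await res.text();

  // Naive pattern: assumes property="og:..." appears before content="...".
  const pattern = /<meta[^>]*property=["'](og:[^"']+)["'][^>]*content=["']([^"']*)["']/gi;
  const tags = {};
  let match;
  while ((match = pattern.exec(html)) !== null) {
    tags[match[1]] = match[2];
  }
  return tags;
}

getOpenGraph('https://blog.kissmetrics.com/open-graph-meta-tags/')
  .then(function (tags) { console.log(tags['og:title'], tags['og:image']); });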
You can't do this with client-side JavaScript alone. You need a server-side script that downloads the page you need and then parses it with a DOM parser.
With PHP you can get the content of a URL with cURL.
See more:
http://php.net/manual/es/book.curl.php
I've read a lot of topics about redirecting Tumblr to WordPress, but I still can’t find a suitable solution.
Here is the problem: I want to redirect jeby.tumblr.com, a Tumblr blog, to the new jeby.it, a WordPress (WIP) blog with a custom domain and web space etc. I’ve already imported all contents, now all I want is to “automagically” redirect every single post from
jeby.tumblr.com/post/[POST ID]/some-slug
to
jeby.it/2012/05/some-slug
I know that the post year and month are available in the Tumblr HTML code, as they are used to compose the permalink. I can’t use .htaccess redirects because the Tumblr blog is hosted by Tumblr.
I’ve done the same thing with Blogspot, where I found a plugin that created the right JavaScript code to paste into the Blogspot template and get automatic redirection.
As you have realized already, you will need to do the redirection client side. How to achieve that is a matter of Tumblr’s theme templating system’s possibilities and limitations.
The year and month of a post are available as {Year} and {MonthNumberWithZero} tokens respectively; that gets you 2012 and 05.
The post slug, however, is not available as a token (not sure if this is an intentional omission – post slugs are optional on Tumblr; you can manually set one, but if you don’t your post just goes by its numeric ID) – so you will have to parse it out of the link inserted by the {Permalink} token.
Redirection can only take place on single-post pages. Unluckily, there is no template based way to make sure you target only these, as the beauty and limitation of Tumblr’s theme DSL is that it is defined as one page: you get a {block:Posts} block and Tumblr does the figuring out how many posts to display inside that. That means that meta http-equiv="Refresh" tags are out of the question, as they would also be included in all non-single-posts pages.
Luckily, Tumblr does let you include arbitrary JavaScript in your template, and that is the way to go:
add the following line inside the {block:Posts} block:
<span class="redirdata">{Year},{MonthNumberWithZero},{Permalink}</span>
Style your redirdata spans with display:none in your CSS to hide them (or style them inline – considering their intended use, there is little point in being dogmatic).
now add a script to your <head> to parse these and redirect to the WordPress URL. As you want to do this for every possible visitor, i.e. as cross-browser compatibly as possible, I recommend using jQuery. Include it with:
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js" type="text/javascript"></script>
or any other hosted source (see the jQuery docs for CDN URLs). Then add the following script:
<script type="text/javascript">
$(function(){
if ( $('.redirdata').length === 1) {
var redirdataTokens = $('.redirdata').text().split(',');
if (redirdataTokens.length === 3) {
var redirectTo = 'http://jeby.it/'+redirdataTokens[0]+'/'+redirdataTokens[1]+'/'+redirdataTokens[2].replace(/.*\//, '');
window.location = redirectTo;
}
}
});
</script>
and your single-post pages should automatically redirect all visitors to the WordPress post page (if they have JavaScript enabled, of course). I am sure you will figure out how to adapt this to tag archives, searches and such, if need be.
Caveat emptor: this will only work if the slugs of your WP posts match the slugs of your Tumblr posts – crucially, also in the case of Tumblr posts with no textual slug (meaning your WP slug needs to match the numeric Tumblr post ID in that case).
I have a website which is fully AJAX-based (hash navigation).
Is there a way to refresh the Open Graph meta tags for AJAX-based websites using JavaScript?
(When I click on a link, the tags and their values should change.)
No. Open Graph markup must be present on HTML pages which are GETable with pure HTTP.
This is because when a user interacts with an OG object (likes it, performs an action, etc.), Facebook will perform an HTTP GET on the OG URL and expect to see OG tags returned in the markup.
The solution is to create canonical URLs for each of your objects. These URLs contain basic HTML markup including OG tags.
On requests to these URLs, if the incoming user agent string contains 'facebookexternalhit', you render the HTML. If it doesn't, you serve a 302 which redirects to your ajax URL. On the ajax URLs, your like buttons and any OG actions you publish should point to the canonical URL object.
Example:
As a user, I'm on http://yoursite.com/#!/artists/monet and I click a like button or publish an action. The like button's href parameter, or the URL of the object when you post the action, should be a web-hittable canonical URL for the object - in this case, perhaps http://yoursite.com/artists/monet
When a user using a browser hits http://yoursite.com/artists/monet you should redirect them to http://yoursite.com/#!/artists/monet, but if the incoming useragent says it is Facebook's scraper, you just return markup which represents the artist Monet.
For real world examples, see Deezer, Rdio and Mog who all use this design pattern.
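A minimal sketch of that server-side check, assuming Node with Express (the route and the markup are illustrative only):

var express = require('express');
var app = express();

app.get('/artists/:slug', function (req, res) {
  var ua = req.get('User-Agent') || '';
  if (ua.indexOf('facebookexternalhit') !== -1) {
    // The scraper gets static markup containing the OG tags for this object.
    res.send('<!DOCTYPE html><html><head>' +
      '<meta property="og:title" content="' + req.params.slug + '" />' +
      '<meta property="og:type" content="website" />' +
      '<meta property="og:url" content="http://yoursite.com/artists/' + req.params.slug + '" />' +
      '</head><body></body></html>');
  } else {
    // Regular browsers are redirected to the ajax URL.
    res.redirect(302, '/#!/artists/' + req.params.slug);
  }
});

app.listen(3000);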
A little bit more investigation led to the following findings:
Let's say you made an application with a hash that looks like this:
http://yoursite.com/#/artists/monet
The Facebook scraper will call your URL without the /#/artists/monet part. This is a problem because you have no way of knowing what information you have to put into the meta og: tags.
Then try the same with the URL Simon suggests:
http://yoursite.com/#!/artists/monet
Now you'll notice that the Facebook scraper respects the Google AJAX crawling specification and converts the #! to ?_escaped_fragment_=, so the URL looks like this:
http://yoursite.com/?_escaped_fragment_=/artists/monet
You can check this out for yourself with the facebook debugger: https://developers.facebook.com/tools/debug
- Upload the PHP script to your server
- Go to the Facebook debugger
- Enter the URL with the /#/ part
- Click 'See exactly what our scraper sees for your URL' - no hash fragment
- Enter the URL again with /#!/
- Click 'See exactly what our scraper sees for your URL' - the hash fragment has been turned into ?_escaped_fragment_=
The script:
<html>
  <head>
    <title>Scraping</title>
  </head>
  <body>
    <?php
      // Dump the request environment so you can see what the scraper actually sends.
      print_r($_SERVER);
    ?>
  </body>
</html>
Or summarized: always use /#!/ (hashbang) deeplinks ;)
I ran a quick test that seems to work. It depends on the fact that the FB scraper doesn't run JavaScript.
As most of my sites are static Single Page Apps with no server logic, I can generate the static pages quickly with tools such as Grunt and Gulp.
If you Share http://wh-share-test.s3-website-eu-west-1.amazonaws.com/test
Facebook will scrape the test page's meta tags; when a user clicks the link, the JS redirects to /#/test so my single page app can react and present the correct view.
Seems hacky, but it works:
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<title>This is a shared item</title>
</head>
<body>
<h1>This is a shared item page.</h1>
<script>
var path = window.location.pathname;
window.location = '/#' + path;
</script>
</body>
</html>