Page created with ReactJS is not indexed by Google - javascript

I have a news section created with ReactJS, where each news post acts as an individual page.
Unfortunately, Google is not indexing these pages because of ReactJS. I tried to use babel-polyfill with webpack, but it's still not working. Also, I'm making my Ajax call BEFORE rendering the DOM.
Any ideas for another workaround for this?

The Google crawler won't wait for async requests to resolve, and because your pages are rendered in the user's browser, they will appear to be empty pages.
You have two options. Either modify your app to render on the server (this is often called a universal app, or an isomorphic app; there are many tutorials for creating these), or pre-render static HTML from your code so that the crawler can see what should be there. There are numerous libraries you can use for this.
The first option is more extensible and probably preferable, but also a bit more complex, so make the choice about what's more appropriate for you.
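To make the second option a bit more concrete, here is a rough sketch of pre-rendering at build time; NewsPost, fetchPosts and the output paths are placeholders for your own code, not something that already exists in your project.

    // build-prerender.js - rough sketch of option two: pre-render static HTML at build time.
    // NewsPost and fetchPosts are placeholders for your own component and data call.
    const fs = require('fs');
    const React = require('react');
    const { renderToString } = require('react-dom/server');
    const NewsPost = require('./src/NewsPost');
    const { fetchPosts } = require('./src/api');

    async function prerender() {
      const posts = await fetchPosts(); // resolve the data up front instead of in the browser
      for (const post of posts) {
        const html = renderToString(React.createElement(NewsPost, { post }));
        // one static file per news post, so the crawler sees real markup instead of an empty root div
        fs.writeFileSync(`./dist/news/${post.slug}.html`, `<!doctype html><div id="root">${html}</div>`);
      }
    }

    prerender();

Libraries such as react-snap or prerender-spa-plugin automate roughly this step, so in practice you rarely have to hand-roll it.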

Google is not indexing them because they are bundled so that they are rendered inside your client's browser. What you ought to do is server-side rendering.
You can find more about it here: https://medium.freecodecamp.org/server-side-rendering-your-react-app-in-three-simple-steps-7a82b95db82e
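For orientation, server-side rendering boils down to something like this minimal Express sketch; App and fetchPost are assumed names for your root component and data call, and a real setup also needs the client bundle to hydrate the server-generated markup.

    // server.js - minimal server-side rendering sketch, not a drop-in solution.
    // App and fetchPost are assumed names for your root component and data call.
    const express = require('express');
    const React = require('react');
    const { renderToString } = require('react-dom/server');
    const App = require('./src/App');
    const { fetchPost } = require('./src/api');

    const app = express();

    app.get('/news/:slug', async (req, res) => {
      const post = await fetchPost(req.params.slug);      // data is resolved on the server
      const markup = renderToString(React.createElement(App, { post }));
      res.send(`<!doctype html>
        <html>
          <head><title>${post.title}</title></head>
          <body>
            <div id="root">${markup}</div>
            <script src="/bundle.js"></script>
          </body>
        </html>`);
    });

    app.listen(3000);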

Related

What's the difference between SSR and fallback = true for dynamic paths in NextJS

So I'm finding it difficult to see the benefits of doing SSR for dynamic paths in NextJs when I can just pre-render a few static paths and use fallback=true to cover my bases on most pages.
Say I have an eCommerce site with 1 million product detail pages, but I only want to pre-render the featured products on the home page (most clicked). If I set fallback to true in getStaticPaths, then the getStaticProps function runs every time a non-featured product page is requested.
So what's the advantage of using SSR when I can just have a fallback that queries the database every time a non-pre-rendered page is called?
Note: I saw a similar question on Stack Overflow, and the answer was that web-crawlers see only the fallback state of your React component that you set for non-pre-rendered pages (so the source code would only read <p>Loading...</p> or something like that), vs the SSR page, which would load all your data for the product directly as the source code. But this doesn't seem to be true in my app.
Thanks for any help.
TLDR: [In NextJs..] Why can't I just use SSG for dynamic paths, with fallback=true in getStaticPaths, instead of SSR?
THANKS ALL
I tried reading the NextJs docs and couldn't find an explanation for the cons of using fallback=true in getStaticPaths.
From the Next.js docs:
By default, Next.js pre-renders every page. This means that Next.js generates HTML for each page in advance, instead of having it all done by client-side JavaScript.
Two Forms of Pre-rendering
Next.js has two forms of pre-rendering: Static Generation and Server-side Rendering. The difference is in when it generates the HTML for a page.
Static Generation is the pre-rendering method that generates the HTML at build time. The pre-rendered HTML is then reused on each request.
Server-side Rendering is the pre-rendering method that generates the HTML on each request.
I put those definitions there to clarify the terms Next.js uses. I believe your question is about fallback: true versus generating HTML on each request (that is, building the page at runtime vs at build time). I think this note you shared is not correct:
Note: I saw a similar question on Stack Overflow, and the answer was that web-crawlers see only the fallback state of your React component that you set for non-pre-rendered pages (so the source code would only read Loading... or something like that), vs the SSR page, which would load all your data for the product directly as the source code. But this doesn't seem to be true in my app.
In each case the populated page is seen by the crawlers.
Using getStaticPaths in your e-commerce example is essentially caching. The pages for popular products are already built; if you build your app locally you can see them inside the .next build folder. In large applications those static assets are stored on a CDN, so whenever the server gets a request the response arrives almost instantly. The customer gets a better user experience, which eventually affects the profit of the e-commerce site.
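For reference, the kind of page being discussed looks roughly like the sketch below; getFeaturedProducts and getProductById are hypothetical stand-ins for your own data layer.

    // pages/products/[id].js - rough sketch of the fallback: true setup in question.
    // getFeaturedProducts and getProductById are hypothetical stand-ins for your data layer.
    import { useRouter } from 'next/router';
    import { getFeaturedProducts, getProductById } from '../../lib/products'; // assumed helpers

    export async function getStaticPaths() {
      const featured = await getFeaturedProducts();                    // only the most-clicked products
      return {
        paths: featured.map((p) => ({ params: { id: String(p.id) } })),
        fallback: true,                                                // everything else builds on first request
      };
    }

    export async function getStaticProps({ params }) {
      const product = await getProductById(params.id);                 // runs at build time for featured pages,
      return { props: { product } };                                   // or on the first request otherwise
    }

    export default function ProductPage({ product }) {
      const router = useRouter();
      if (router.isFallback) return <p>Loading...</p>;                 // shown only in the browser, only once
      return <h1>{product.name}</h1>;
    }

Once that first request has generated the page, later visitors (and crawlers) get the populated HTML, which is part of why the two approaches can look the same to a crawler.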
I think the clearest example is a blogging website like Medium. The most popular blogs will be pre-generated, since the content of the blogs does not change that often. Medium will use CDNs in different parts of the world, so users all around the world will have faster access to the blogs.
Hitting the database is a very expensive operation. The more load you put on the database, the harder it is to maintain the availability, scalability, and reliability of your application.
Also, you might have a good internet connection and a high-end client, so you might access any data quickly, but you have to think about all the people around the world who access the data with low-quality devices or internet connections.

How to dynamically add meta tags for seo and social share in react js app

I am currently working on a news website using React (the backend is an Express.js REST API). This site needs social-share functionality with the image and title of the post. I add meta tags using react-helmet. I tried pre-rendering packages too, but even then the image of the post does not show when sharing. Can I achieve this using the API without server-side rendering? Please help me to do this.
As far as I know, while you can change meta tags with JavaScript (see the answers to this question), most consumers of meta tags (Facebook, Twitter, etc.) request the document from the server but don't execute any JavaScript on that page. That way, the scraper consumes fewer resources and reduces the number of attack vectors. That means you will need some way to return specific dynamic values in the <head> for different URLs from a server. SSR is probably the best way to do that – at this point there are enough different tools/options that it shouldn't be too hard, and you don't need to run two different systems to render markup (one of them server-side), the way you would if you did something like dynamically altering the contents of your index page on the server.
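If full SSR turns out to be more than you need, the bare-minimum version of "return specific dynamic values in the <head> from the server" can be sketched like this; fetchPost and the __OG_*__ placeholders are assumptions for illustration, not existing code.

    // server.js - sketch: serve the static React shell, but fill in the <head> per URL.
    // fetchPost and the __OG_*__ placeholders in index.html are illustrative assumptions.
    const express = require('express');
    const fs = require('fs');
    const { fetchPost } = require('./lib/api'); // hypothetical data helper

    const app = express();
    const template = fs.readFileSync('./build/index.html', 'utf8');

    app.get('/news/:slug', async (req, res) => {
      const post = await fetchPost(req.params.slug);
      const html = template
        .replace('__OG_TITLE__', post.title)      // placeholders added to index.html's <head>
        .replace('__OG_IMAGE__', post.imageUrl);
      res.send(html);                             // scrapers get real og: tags without running JS
    });

    app.use(express.static('./build'));
    app.listen(3000);

This is the "two systems" trade-off mentioned above: the head comes from the server while the body still renders on the client.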
I did this using Next.js (the React framework). It can make the API call and generate the HTML on the server. With this, SEO and sharing with title and image work properly.
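For what it's worth, the Next.js version of that looks roughly like the sketch below; getPost is an assumed wrapper around the REST API.

    // pages/news/[slug].js - rough Next.js sketch: meta tags rendered on the server per request.
    // getPost is an assumed helper that calls the Express REST API.
    import Head from 'next/head';
    import { getPost } from '../../lib/api'; // hypothetical wrapper around the REST API

    export async function getServerSideProps({ params }) {
      const post = await getPost(params.slug);
      return { props: { post } };
    }

    export default function NewsPage({ post }) {
      return (
        <>
          <Head>
            <title>{post.title}</title>
            <meta property="og:title" content={post.title} />
            <meta property="og:image" content={post.imageUrl} />
          </Head>
          <article>{post.body}</article>
        </>
      );
    }

Because getServerSideProps runs on the server, the og: tags are already in the HTML that Facebook's and Twitter's scrapers download.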

Can Googlebot crawl javascript generated content?

We have a web app whose content is generated by JavaScript. Can Google index those pages?
When we investigated this issue, we kept finding solutions on old pages about using "#!" in links.
In our app the links are like this:
domain.com/paris
domain.com/london
When we use these kinds of links, JavaScript populates the content.
Is it wise to use HTML snapshots, or do you have any other suggestions?
Short answer
Yes, they can crawl JavaScript-generated content, as long as you are using pushState.
Detailed answer
It depends on your setup. Google and Bing CAN crawl JavaScript and AJAX-based content if you are using pushState. If you do, they will handle content coming from AJAX calls, updates to the page title or meta tags via JavaScript, and in general any such things.
Most frontend frameworks like Angular, Ember or Backbone already work with pushState, so in these cases you don't need to do anything. Check whatever system you are using to see how it does things. If you are not using pushState, you will need to implement it on your own or use the whole escaped_fragment HTML snapshot deal.
So if you use pushState then yes, search engines can crawl your page just fine. If you don't, then no; you will need to implement pushState or do HTML snapshots.
Bonus info: unfortunately Facebook does not handle pushState, so the Facebook crawler needs either non-dynamic og tags or HTML snapshots.
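For reference, the pushState mechanism described above boils down to something like this sketch, where loadCity and renderCity stand in for your own AJAX and rendering code.

    // Sketch of pushState-based navigation; loadCity and renderCity are placeholders.
    async function showCity(city) {
      const data = await loadCity(city);            // your existing AJAX call
      document.title = data.title;                  // crawlers that execute JS pick this up
      renderCity(data);
    }

    function navigate(city) {
      history.pushState({ city }, '', '/' + city);  // real URL like domain.com/paris, no #!
      showCity(city);
    }

    // Keep back/forward working without pushing extra history entries.
    window.addEventListener('popstate', (event) => {
      if (event.state && event.state.city) showCity(event.state.city);
    });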
"Generated by JavaScript" is ambiguous. That could mean that you are running a JS script on the server or it could mean that you are making an AJAX call with a JS API. The difference appears to matter as far as Googlebot is concerned. But you don't have to take my word for it, as there is empirical proof of what Googlebot will and won't currently cache as far as JavaScript content in the form of live experiments using both the XMLHTTPRequest API and the Fetch API. So, as you can see, server-side rendering is still going to be the best way to go for SEO.

Use AngularJS on website and still get indexed by search engines

I want to rebuild an old website made with plain HTML and add some extra functionality with AngularJS. But since I plan to use ng-view to render templates on my main layout, is it possible to make search engines still find the templates of these subpages?
In a general sense, this is not an Angular problem; it's the same problem with any single-page site that uses JavaScript to generate its HTML.
The general solution would be to detect when it is a crawler accessing your page instead of a person (usually by checking the user agent string), and then use server-side logic to render pages that are suitable for the crawler to process.
Here is one article that discusses this problem:
http://www.webdesignerdepot.com/2013/10/how-to-optimize-single-page-sites-for-search-engines/
but googling (or searching this site) for "google seo single page app" will give you lots of other ideas.
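A very rough sketch of that detection idea on an Express server (assuming a Node backend; the same idea applies to any server stack) is below; the bot pattern and renderFullPage are illustrative only, not a complete solution.

    // Sketch: serve crawler-friendly markup when the user agent looks like a bot.
    // The bot pattern and renderFullPage() are illustrative, not a complete solution.
    const express = require('express');
    const { renderFullPage } = require('./prerender'); // hypothetical server-side renderer
    const app = express();

    const BOT_PATTERN = /googlebot|bingbot|baiduspider|facebookexternalhit/i;

    app.get('*', (req, res, next) => {
      if (BOT_PATTERN.test(req.get('User-Agent') || '')) {
        return res.send(renderFullPage(req.path));  // crawlers get fully rendered HTML
      }
      next();                                       // normal users get the Angular app as usual
    });

    app.use(express.static('./public'));
    app.listen(3000);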

How to force every page to load a certain javascript file?

I've created a pretty basic system here at work that does what Google Analytics does (extremely simplistic in comparison) and it works quite well, but like Google Analytics it requires each page to reference a JavaScript file. Is there any way to make all of our pages that are served from IIS reference this JavaScript file? I would like to capture these stats for every page.
Any ideas?
Thanks
Hmm, it looks like you are looking for this.
If you're dealing with static HTML files, your best bet seems to be this previous question.
If you have an ASP site going, and you already have a header or layout file, I'd recommend putting it in there.
This depends on how you build your web site, but most people do this by adding the reference to their templates, layouts, master pages, or whatever term is used in your development platform.
You don't want every page tracked; for example, pages returning data such as JSON or XML should not be meddled with. This is why it is better to have explicit control over which pages get the analytics JavaScript added to them.
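For context, the file every layout references can be a very small beacon along these lines; the /analytics/hit endpoint is a stand-in for your own stats collector, not an existing URL.

    // track.js - sketch of the script each layout/master page references.
    // The /analytics/hit endpoint is a stand-in for your own stats collector.
    (function () {
      var img = new Image();                        // an image beacon keeps the request simple and non-blocking
      img.src = '/analytics/hit'
        + '?page=' + encodeURIComponent(location.pathname)
        + '&ref=' + encodeURIComponent(document.referrer);
    })();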
