javascript and seo [closed]

I have a JavaScript snippet that clients can put on their webpages that loads some text associated with embedded Flash objects (like SlideShare presentations) on that page. Does Google crawl this type of content? Will this provide any SEO benefit? If not, what else should I consider? I don't want to force people to embed the actual content, since they typically have multiple pages that use this script and there is typically a lot of text. Any suggestions?

Google does execute on-page JavaScript quite well, but the current SEO consensus is that external JavaScript (i.e., asynchronously loaded content) does not count as part of the page.
This means that the script-loaded text is not SEO-valuable.
If you want the text to be valuable it must be on-page, meaning it must be in the HTML of the page, so basically you will have to go with the big (text already included) JS snippet.
But before you rush to make it "SEO-valuable", please be aware that duplicate content is usually not valuable. So if the same text shows up on different pages, it might not be useful to include the text at all.
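As a minimal illustration of the difference (the snippet URL and element IDs here are invented for the example), compare text injected by an external script with the same text shipped inline:

<!-- Variant 1: text loaded by an external script - doubtful SEO value -->
<div id="slide-text"></div>
<script src="https://example.com/slide-text-widget.js" async></script>

<!-- Variant 2: the same text included in the page's HTML - crawlable directly -->
<div id="slide-text">
  <p>Slide 1: Quarterly results overview</p>
  <p>Slide 2: Revenue grew year over year</p>
</div>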

Flash is popular on the Web, but it presents challenges to the search engines in terms of indexing the related content. This creates a gap between the user experience of a site and what the search engines can find on that site.
It used to be that search engines did not index Flash content at all. In June 2008, Google announced that it was offering improved indexing of this content
(http://googlewebmastercentral.blogspot.com/2008/06/improved-flash-indexing.html).
This announcement indicates that Google can index text content and find and follow links within Flash files. However, Google still cannot tell what is contained in images within the Flash file. Here are some reasons why Flash is still not fully SEO-friendly:
Different content is not on different URLs
This is the same problem you encounter with AJAX-based pages. You could have unique frames, movies within movies, and so on that appear to be completely unique portions of the Flash site, yet there’s often no way to link to these individual elements.
The breakdown of text is not clean
Google can index the output files in the SWF file to see words and phrases, but in Flash, a lot of your text is not inside clean <h1> or <p> tags; it is jumbled up into half-phrases for graphical effects and will often be output in the incorrect order. Worse still are text effects that often require "breaking" words apart into individual letters to animate them.
Flash gets embedded
A lot of Flash content is only linked to by other Flash content wrapped inside shell Flash pages. This line of links, where no other internal or external URLs are referencing the interior content, means some very low PageRank/link juice documents. Even if they manage to stay in the main index, they probably won’t rank for anything.
Flash doesn’t earn external links like HTML
An all-Flash site might get a large number of links to the home page, but interior pages almost always suffer. For embeddable Flash content, it is the HTML host page earning those links when they do come.
SEO basics are often missing
Anchor text, headlines, bold/strong text, img alt attributes, and even title tags are not simple elements to properly include in Flash. Developing Flash with SEO in mind is just more difficult than doing it in HTML. In addition, it is not part of the cultural lexicon of the Flash development world.
A lot of Flash isn’t even crawlable
Google has indicated that it doesn't execute external JavaScript calls (which many Flash-based sites use) or index the content from external files called by Flash (which, again, a lot of Flash sites rely on). These limitations could severely impact what a visitor can see versus what Googlebot can index.
Note that it used to be that you could not test the crawlability of Flash, but the Adobe Search Engine SDK does allow you to get an idea as to how the search engines will see your Flash file.

You can have the content on an external page.
If you don't want Google to crawl it, block it with robots.txt
If you don't want Google to index it (possibly a better option), use the X-Robots-Tag header or a noindex meta tag in the head.
Whether you use JavaScript to pull it into the page, iframes, or both really comes down to implementation details and what the included page may need to access on the host page: tracking, sessions, etc.
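As a rough sketch of those two options (the directory name is hypothetical):

# robots.txt at the site root - stops Google from crawling the external content
User-agent: *
Disallow: /embedded-text/

<!-- Or, inside the external page's <head>: crawlable, but kept out of the index -->
<meta name="robots" content="noindex">
<!-- Equivalent HTTP header, handy for non-HTML resources: X-Robots-Tag: noindex -->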

Although Google does not crawl Flash and JavaScript very well, these are not the only things crucial for SEO. Many other things matter, such as keyword density, quality of content, inbound and outbound linking, and titles and content being well managed with proper tags. So if Flash or JavaScript is a necessity, use it, but do not use it in excess.

Google is not efficient at reading or indexing Flash elements. If I had to publish content like SlideShare presentations, I would produce a PDF. PDFs can be indexed with no problem, and they could drive traffic to my website.

Google crawls Flash objects to some extent, but in my experience the best solution (if Flash is unavoidable) is to use SWFObject to provide alternative HTML text. This will make your Flash and your site 100% Google-friendly and, more importantly, user-friendly.
For more information go here:
http://www.adobe.com/devnet/flashplayer/articles/alternative_content.html
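A minimal sketch of that pattern with SWFObject 2 (file names, dimensions, and IDs are placeholders): the div holds crawlable HTML, and SWFObject swaps it for the movie only when Flash is available.

<head>
  <script src="swfobject.js"></script>
  <script>
    // Replace #myContent with movie.swf when Flash 9+ is present;
    // otherwise the alternative HTML below stays visible and indexable.
    swfobject.embedSWF("movie.swf", "myContent", "780", "420", "9.0.0");
  </script>
</head>
<body>
  <div id="myContent">
    <h1>Presentation title</h1>
    <p>The full text of the presentation, readable by users and crawlers alike.</p>
  </div>
</body>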

People read way too much into what Google will think about the technology or specific code they use on their site. If you're on the up-and-up and not using the code to cloak, deceive visitors, or hijack sessions, you're going to be just fine. Will you rank better if you substituted text for all your Flash? Maybe a little, but at the end of the day it's the quality of your content (yes, even if it's not text-based), the number of people who find it useful (via high-quality links), and other small factors.
Eight years ago, the advice against including JS would have been more valid, but it just doesn't matter much anymore; Google treats navigable websites the same and ranks primarily around "quality", not usability tricks or excessive keyword-rich text.

Related

Advanced Customization of Alfresco

I am having trouble figuring out how to do even the simplest things in Alfresco, like typing a simple document. I've been Googling and noticed that customizations can be done through HTML documents. I need help and decided to post a question to a knowledgeable user platform. The following customizations I would like are WAY far-fetched and most likely not even achievable, but I would really appreciate any help that can be provided.
*list items in bold are most important
Anyone could be assigned a login and when they logged in they would have access to and easily view all of the contents of the site (or multiple sites that make up one accessible website?)
All of the items on the website would be a hierarchy, the user facing contents of the site would be a list of links with thumbnails, when one link was clicked it would be another list of links with large thumbnails, when one of those links was clicked a text document would be brought up, that document would contain clickable sections, when one of those sections was clicked it would bring up a page only containing the section clicked:
Links (crafts)
2nd layer of links (modules)
Text and image document with clickable links (single module containing clickable sections)
Section (single sections of module)
The module and section text would also contain images and tables throughout and mixed in the text
If a link (module or section) was used in multiple places, all instances of the link would be linked to each other. If one instance was edited, the others would also change. This setting could be turned off for any individual link if necessary.
Every document should have an easy-to-use live commenting system (something simple like Disqus would work). The comments are most important on the single-section pages but would also be good on the module pages.
An advanced tagging system that would be part of the entire site/website environment. A user could type anything they wanted as a tag and use multiple tags. The tags would be used for their comments on the content (text, sections) but the tags could be searched (most importantly by the administrators of the site) at any time in the whole environment. A popularity of any tag could also be viewed (I'm not sure how that would work, possibly another section of the site or an easy to see column on any text/image document?)
A user could edit their own comment if they wished but would not be able to delete it entirely. Comments would also be date and time stamped.
I know all of this is most likely impossible but if anyone has an idea of Alfresco customizations that could pull any of this off, or of an entirely different secure platform or site that would perform anything similar to this please let me know.
Thank you!
It sounds like you are looking for a Web Content Management (WCM) System. Alfresco is a Document Management (DM) System. You can use Alfresco as a back-end for a custom content-centric solution, but if you are expecting to install it, start it up, and have anything close to what you've listed above, you are barking up the wrong tree.
Everything you've listed is a front-end concern. You can use whatever you want to develop that functionality, but none of it will leverage Alfresco unless you choose to store some of the data in the Alfresco back-end.
You might be better off looking at something in the WCM space, such as Drupal or Wordpress. Or if you want something Java-based, look at Magnolia CMS or Hippo CMS.

Should I inject style tags into the head dynamically or include style tags in the body?

I have some HTML content that gets embedded into a page via a server-side call. So, when the page's HTML is being compiled on the server, a call is made to another server to return some HTML, which is then embedded within a div somewhere in the body. The problem is, this content contains its own CSS. So, I wrote a script to inject style tags into the head on ready, which works fine on desktop browsers. However, on mobile devices there's a fairly significant flash of unstyled content. I know that you're technically not supposed to include style tags in the body, but in this case would it yield better results to just include them in the body instead of injecting them into the head?
In this case, it sounds like the right solution is to fix up your architecture so that the server-side compiler can include CSS for the remote page in the page head. This probably involves separating the CSS of the remote page(s) out of the markup there and then grabbing it as a separate file to be included in the page head during compilation.
Since the right solution is not always feasible, for any of a myriad of reasons, compromise is often required. Leaving the CSS in the remote markup, if it produces the result you desire, could be the best solution for you. Or perhaps some other hack to get the CSS into the head server-side could be appropriate. You need to decide whether it is worth the effort to do any of these things, if they are possible for you to accomplish given your constraints.
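If you do stay with the client-side compromise, a minimal sketch of the injection script the asker describes (assuming jQuery is already on the page; the #embedded-content wrapper id is invented for the example):

// On DOM ready, move any <style> blocks shipped inside the embedded
// fragment up into the <head>, limiting the flash of unstyled content.
$(function () {
  $('#embedded-content style').appendTo('head');
});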
Some discussion here. In my experience a lot of enterprise content does it. Does that mean it's the RIGHT thing to do? I don't know. But it's certainly not frowned upon in my experience.
Source: https://www.w3.org/wiki/The_web_standards_model_-_HTML_CSS_and_JavaScript
Why separate?
Efficiency of code: The larger your files are, the longer they will take to download, and the more they will cost some people to view (some people still pay for downloads by the megabyte.) You therefore don’t want to waste your bandwidth on large pages cluttered up with styling and layout information in every HTML file. A much better alternative is to make the HTML files stripped down and neat, and include the styling and layout information just once in a separate CSS file. To see an actual case of this in action, check out the A List Apart Slashdot rewrite article where the author took a very popular web site and re-wrote it in XHTML/CSS.
Ease of maintenance: Following on from the last point, if your styling and layout information is only specified in one place, it means you only have to make updates in one place if you want to change your site’s appearance. Would you prefer to update this information on every page of your site? I didn’t think so.
Accessibility: Web users who are visually impaired can use a piece of software known as a “screen reader” to access the information through sound rather than sight — it literally reads the page out to them, and it can do a much better job of helping people to find their way around your web page if it has a proper semantic structure, such as headings and paragraphs. In addition keyboard controls on web pages (important for those with mobility impairments that can't use a mouse) work much better if they are built using best practices. As a final example, screen readers can’t access text locked away in images, and find some uses of JavaScript confusing. Make sure that your critical content is available to everyone.
Device compatibility: Because your HTML/XHTML page is just plain markup, with no style information, it can be reformatted for different devices with vastly differing attributes (eg screen size) by simply applying an alternative style sheet — you can do this in a few different ways (look at the mobile articles on dev.opera.com for resources on this). CSS also natively allows you to specify different style sheets for different presentation methods/media types (eg viewing on the screen, printing out, viewing on a mobile device.)
Web crawlers/search engines: Chances are you will want your pages to be easy to find by searching on Google, or other search engines. A search engine uses a “crawler”, which is a specialized piece of software, to read through your pages. If that crawler has trouble finding the content of your pages, or mis-interprets what’s important because you haven’t defined headings as headings and so on, then your rankings in relevant search results will probably suffer.
It’s just good practice: This is a bit of a “because I said so” reason, but talk to any professional standards-aware web developer or designer, and they’ll tell you that separating content, style, and behaviour is the best way to develop a web application.
Additional stackoverflow articles:
Using <style> tags in the <body> with other HTML
Will it be a wrong idea to have <style> in <body>?

Altering a page from another site

Sorry for the vague question name - didn't know how to phrase it.
I have built a PHP engine to parse web pages and extract phone numbers, addresses etc.
This is going to be used by clients to populate an address book by simply entering a new contacts web address.
The problem I am having is usability:
At the moment the script just adds each item (landline number, fax, etc.) to a different list box and the user picks the correct one. From a usability standpoint this is hard work (how do you know which is the correct contact number without looking at the site?).
So my question (finally!):
How would I achieve the functionality of
http://bartaz.github.io/sandbox.js/jquery.highlight.html
on someone else's website (I have no problem writing this functionality).
FOR CLARITY:
I want to show someone else's site (their contact page, for example) on my site, BUT I want to highlight items I have found (so, for example, add a tag around a phone number my PHP script has found).
I am aware that to display a website not on your domain an iFrame would be used - but as I need to alter the page content this is useless.
I also contemplated writing a bookmarklet that could be run on that page - but that means re-writing my parsing engine in JavaScript and exposing some of the tricks that make it accurate.
So I am left with pulling the page via cURL and then trying to fix up JavaScript files, CSS files, etc. that have relative URLs.
Does anyone know how best to achieve this - and any pitfalls that might befall me?
I have tried using Simple HTML DOM Parser - but it is tricky to get consistency, and I also don't know how having two sets of <html> tags, <body> tags, etc. would affect sites.
If anyone has managed this before and could point me to the tools / general methods they used I would be eternally grateful!
PLEASE NOTE - I am very proficient with Google and Stack Overflow and have looked there first!
The ideal HTML solution
The easiest way to work around the relative paths for an arbitrary site would be to use the base href tag to specify the default relative location (just use the URL up to the filename, such as <base href="http://www.example.com/path/to/" /> for the URL http://www.example.com/path/to/page). This should go at the top of the head block.
Then you can alter the site simply by finding the relative parts and wrapping them in your own tag, such as a span. For the formatting of these tags, the easiest way would be to add a style attribute, but you could also try to insert a <style> tag in the <head>.
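A minimal sketch of the rewritten page (the URL, class name, and styles are invented for the example):

<head>
  <!-- Make the fetched page's relative paths resolve against the original site -->
  <base href="http://www.example.com/path/to/" />
  <style>.found-number { background: yellow; }</style>
</head>
<body>
  <p>Call us on <span class="found-number">01234 567890</span></p>
</body>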
Of course, you'll also need to account for badly made webpages without <html>, <head> or <body> tags. You could either wrap the source in a new set of these tags, or just put in your base and style tags, hoping that the browser will work out what to do.
You probably also want to make this interactive, so you should also wrap them with some kind of link, and ideally you'll insert some javascript to handle their actions by ajax. You should also insert your own header at the top of the page, probably floating at the top, so that they know they're using your tool. Just keep in mind that some advanced pages might then conflict with your alterations (though for those cases you could have a link saying 'is this page not displaying correctly?' to take the user to your original basic listbox page as a backup).
The more robust solution
Clearly there are a lot of potential problems with the above, even though it is ideal. If you want to ensure robustness and avoid any problems with custom javascript and css on the page you're trying to alter, you could instead use a similar algorithm to that used in text based browsers such as lynx to reformat the page consistently. Then you can apply your algorithm to highlight the relevant parts of the page, and you can apply your own formatting as well without risk of it not displaying correctly. This way you can frame it really well and maintain your interface.
The problem with this is that you lose the actual look of the original page, but you should keep the context around the numbers and addresses which is the important thing. You would also then be able to use some dynamic javascript to take the user to each number and address consecutively to improve the user experience. Basically, this is rigorous and gives you complete control over the user experience, but you lose the original look of the website which may or may not confuse your users.
Personally, I'd go for the second option, but I'm not sure if anyone's created such a parser before. If not, the simplest thing you could do would be to strip the tags to get it as plain text. The next simplest would be to convert it into some simple text markup format like markdown, then convert it back into html. That way, you'd keep some basic layout such as headings, italicised and bold text, etc.
You definitely don't want to have nested body tags. It might work, but it'll probably mess up your formatting and be inconsistent across browsers.
Here's a resource I found after a quick Google search:
https://github.com/nickcernis/html-to-markdown
There are other HTML-to-markdown scripts, but this was the most robust of the few I found. I'm still not sure whether it can handle badly formatted pages or ones with advanced formatting; try it out yourself.
There are quite a few markdown to html converters though, in fact you could probably make a custom converter yourself quite easily to accommodate your personal needs.

Single Page HTML Site - SEO technique [closed]

I developed a single-page responsive website for my company, http://germin8.com/. Everything is going well; however, I now face a problem with SEO: the site's different sections do not show up in search engines.
I know the cause: being a single-page site, it is not crawler friendly. In order to get the URL to change, I used the history pushState technique and put href links on the menu bar items pointing to the sections... confused, eh?
Sample anchor tag outlink (I thought this was enough for my section to show up in a search engine :-/):
<li><a style="text-decoration:none;color:black;padding-left:30px;" class="scrollTo" id="contactUs_Menu" href="/contact-us">CONTACT</a></li>
Or you may have a look at the source code of the website and follow the anchor tags.
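For reference, the pushState wiring is roughly this (a simplified sketch; the real scroll logic is omitted, and the selector assumes the scrollTo class from the sample above):

// Intercept menu clicks: update the URL without a full page load,
// then scroll the matching section into view.
var links = document.querySelectorAll('a.scrollTo');
for (var i = 0; i < links.length; i++) {
  links[i].addEventListener('click', function (e) {
    e.preventDefault();
    history.pushState(null, '', this.getAttribute('href'));
    // ...scroll to the corresponding section here...
  });
}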
On some research and proof-of-concept work, I came across this AJAX crawling technique by Google (https://developers.google.com/webmasters/ajax-crawling); however, I couldn't understand it, and I also feel loading the site's sections through AJAX would be a lot more work at this stage, since my entire site is a static HTML file (index.php) with nothing rendered dynamically through JavaScript/AJAX.
Can someone who has faced a similar problem suggest the simplest and fastest way for my site's different sections (e.g. Clients, Partners, Contact Us) to show up in the Google search engine?
Thanks in advance, guys :)
Actually, this question is more suitable for https://webmasters.stackexchange.com/, but since it has been raised here, I'll try to answer it to the best of my knowledge.
Unfortunately, there is no shortcut for SEO, and getting search results to work in your favor is a slow and painful process. The basic principle of SEO is doing the simple things right: provide quality content to your users on your website and don't worry too much about the ranking.
That being said, your expectation is slightly unrealistic, for the following reasons:
You are asking Google to index a page that doesn't even exist.
The URL is changed with JavaScript at runtime, which is something no search-engine bot is good at indexing.
However, there are a couple of things that you can improve in terms of SEO (no guarantee on what you have asked):
Make sure you have a sitemap.xml file in the root directory of your website. You need to add individual sub-page links for each URL, like this:
<url>
<loc>http://germin8.com/clients</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
Once you are done with the sitemap.xml file, open your Google Webmaster account (also make sure your Google Analytics account is linked to your Webmaster profile) and validate the structure and schema of the sitemap file.
Write better anchor text: add a title attribute to your anchor tags, avoid inline styles as much as you can, and use a complete URL instead of a relative path for the href attribute (see the sketch after this list).
Google doesn't like slow websites, so you need to focus a lot on the performance of your website; no user likes to watch a webpage load for ages. Make efforts to concatenate, minify, and lint your assets (HTML/CSS/JS). Gzip compression is required as well.
149 requests weighing 4.1 MB is huge. You need to massively reduce the number of HTTP requests you make!
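For instance, a minimal sketch of an improved menu anchor (the title text is invented for the example):

<li><a href="http://germin8.com/contact-us" title="Contact Germin8" class="scrollTo" id="contactUs_Menu">CONTACT</a></li>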
Conclusion
Apart from the above, I don't really see your internal links not being visible in search results as a big problem. Your primary objective is to make sure that your users land on your webpage (this is something you are already doing). After the user enters your territory (website), they have the liberty to navigate to any section of the webpage.
http://webcache.googleusercontent.com/search?q=cache:http://germin8.com&client=firefox-a&rls=org.mozilla:en-US:official&strip=1
I don't see any problem with the indexing of your site. Clients will not show up in normal search, but they will show up in Google Images. You should use alt attributes that best describe the client images you have used. The URL above will give you an idea of how Googlebot sees your site; you can see that all text is indexed by Google, including the heading where clients are listed. Hope this resolves your concern.

Maintain height of the website

I have a client who wants to build a website with a specific height for the content part.
The Question:
Is there any way that, when the text is long and reaches the maximum height of the content part, a new page is created for the remaining text?
To my knowledge, this can't be done.
Thanks for helping guys!
You will probably want to look into something like jQuery paging with tabs
http://code.google.com/p/jquery-ui-tabs-paging/
Unfortunately, you would need to figure out the maximum number of characters you want to allow in the content pane, and anything after that would need to be put into another tab. You can hide the tab and just use a link instead.
Without more knowledge of what you're developing, this is a difficult question to answer. Are you looking to create a different page entirely, or just different sections on a page?
The former can be done using server-side code (e.g. Rails) and dynamically serving out pages (e.g. Google results are split across many pages).
The latter can be done with Javascript and/or CSS. A simple example is:
<div id="the_content" style="overflow:hidden;width:200px;height:100px">
Some really long text...
</div>
This would create a scroll bar and avoid disrupting the flow of the page. In JavaScript (e.g. jQuery), you'll be able to split the content into "tabs".
Does this help?
(Almost) everything is possible, but your intuitions are right in that this can't be done easily or in a way that makes any sense.
If I were in your position, I would go up to the client and present advantages and disadvantages to breaking it up. Advantages include the fact that you'd be able to avoid long pages and that with some solutions to this problem, the page will load faster. Disadvantages include the increased effort (i.e., billable hours) it would take to accomplish this, the lack of precedent for it resulting in users being confused, and losses to SEO (you're splitting keywords amongst n pages).
This way, you're not shooting down the client's idea, and in the likely case the client retreats from his position, he will go away thinking that he's just made a smart choice by himself and everyone goes away happy.
If you're intent on splitting it up into pages, you can do it on the backend by either literally structuring your content into pages or applying some rule (e.g., cut a page off at the first whole paragraph after 1000 characters) to paginate the results. On the frontend, you could use URL hashes to let JavaScript paginate the results. You could even write an extensible library that "paginates" any text node. In fact, I wouldn't be surprised if one existed already.
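A minimal sketch of that frontend approach (the element id and page size are invented, and a real version would cut on paragraph boundaries rather than raw characters):

// Split one long text block into fixed-size pages and show one at a time.
var container = document.getElementById('the_content');
var fullText = container.textContent;
var PAGE_SIZE = 1000; // characters per page - a crude rule for the example
var pages = [];
for (var i = 0; i < fullText.length; i += PAGE_SIZE) {
  pages.push(fullText.slice(i, i + PAGE_SIZE));
}
function showPage(n) {
  container.textContent = pages[n];
  location.hash = 'page-' + (n + 1); // hash-based navigation, back-button friendly
}
showPage(0);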
