Google has become a one-stop information source for the internet. The company has consistently innovated in the field of search, and has expanded into shopping, advertising, and now maps—truly, Google is a force to be reckoned with.
Many web sites are designed without regard for Google or the larger topic of search engine optimization (SEO). Read on for a brief discussion of how to make web pages visible to Google and other web crawlers.
Search = Hard
Search is a difficult problem. An optimal search engine would have two characteristics: perfect recall and perfect precision. These two parameters are the yardsticks of search engines.
Recall is the percentage of all relevant web pages that a search engine returns for a given query. For example, if you are looking for "fuzzy kittens", and there are 100,000 web pages that are relevant, a search engine that provides 90,000 of those pages would have 90% recall.
Precision, on the other hand, is the percentage of relevant results in the total returned results. Returning to the kittens example, suppose the 90,000 relevant results were returned along with 90,000 spurious pages. This would be a 50% precision rate.
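The arithmetic behind the kittens example can be written out in a few lines. This is only a sketch of the two definitions above; the function and variable names are made up for illustration.

```python
def recall(relevant_returned: int, relevant_total: int) -> float:
    """Fraction of all relevant pages that the engine returned."""
    return relevant_returned / relevant_total


def precision(relevant_returned: int, total_returned: int) -> float:
    """Fraction of the returned pages that are relevant."""
    return relevant_returned / total_returned


# Figures from the "fuzzy kittens" example.
relevant_total = 100_000     # relevant pages that exist on the web
relevant_returned = 90_000   # relevant pages the engine returned
spurious_returned = 90_000   # irrelevant pages returned alongside them

kitten_recall = recall(relevant_returned, relevant_total)
kitten_precision = precision(
    relevant_returned, relevant_returned + spurious_returned
)

print(f"recall:    {kitten_recall:.0%}")     # recall:    90%
print(f"precision: {kitten_precision:.0%}")  # precision: 50%
```

Notice that the engine could push recall to 100% simply by returning every page it knows about, at the cost of terrible precision. That trade-off is why both numbers matter.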
Good search engines balance the two to provide the most useful collection of results in the first few pages. Google does this particularly well by not only storing the text of a web page, but storing the relationships between web pages that are expressed through simple links. Google's PageRank system considers a link a vote of confidence; if enough sites "vote" for yours, your site will enjoy a high rank in Google search results.
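To make the "voting" idea concrete, here is a minimal sketch of the textbook form of PageRank, run on a tiny made-up web of four pages. It is not Google's production system: the page names, the damping factor of 0.85 (a commonly cited value), and the fixed iteration count are all assumptions chosen for illustration.

```python
def pagerank(
    links: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Simplified PageRank: every page splits its score among its outlinks."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: keys are pages, values are the pages they link to.
toy_web = {
    "home": ["menu", "contact"],
    "menu": ["home"],
    "contact": ["home"],
    "blog": ["home", "menu"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:8s} {score:.3f}")
```

In this toy web, "home" collects the most votes and ends up on top, while "blog" has no inbound links at all and settles at the baseline score.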
So what about web design?
There are a few common mistakes that can cost a web site valuable Google ranking:
1. A meaningless or nonexistent title.
A page's title is the first thing displayed in search results—why is it neglected? A page title should immediately reflect the content within the page, and possibly refer to the larger site, as well. A title such as "Dinner Menu - Ristorante Gia" is preferred to "Ristorante Gia - Fine Italian Dining" applied to every single page.
This simple mistake can cost a page dearly in terms of ranking. Search engines often count words more heavily towards a page's relevance when they appear in the title or at the very beginning of the page. New layout techniques like CSS can exploit this, and the HTML itself can be arranged so the most relevant information comes first in the source code.
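A quick way to spot this mistake is to collect every page's title and flag the ones that are missing or duplicated. The sketch below uses Python's standard html.parser module; the page paths and markup are hypothetical, and a real audit would fetch the pages from the live site.

```python
from collections import Counter
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def audit_titles(pages: dict[str, str]) -> dict[str, str]:
    """Return a problem description for each page with a weak title."""
    titles = {url: extract_title(html) for url, html in pages.items()}
    counts = Counter(title for title in titles.values() if title)
    problems = {}
    for url, title in titles.items():
        if not title:
            problems[url] = "missing title"
        elif counts[title] > 1:
            problems[url] = f"title shared by {counts[title]} pages"
    return problems


# Hypothetical site: two pages reuse the site-wide title, one has none.
site = {
    "/lunch": "<html><head><title>Ristorante Gia - Fine Italian Dining</title></head></html>",
    "/wine": "<html><head><title>Ristorante Gia - Fine Italian Dining</title></head></html>",
    "/dinner": "<html><head><title>Dinner Menu - Ristorante Gia</title></head></html>",
    "/contact": "<html><head></head><body>Call us</body></html>",
}

title_problems = audit_titles(site)
for url, problem in sorted(title_problems.items()):
    print(url, "->", problem)
```

Only "/dinner" passes: its title is unique and describes the page itself.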
2. Poorly written links.
Meghan's previous article, "No, Click HERE!", discussed the importance of link copywriting. When applied to Google and search engine optimization, link writing becomes crucial. Because Google interprets a link as a "vote of confidence" for a web site, it also pays attention to how the link was written, particularly the words that make up the link text.
Perform a Google search for "Click Here", and the results speak for themselves. How many times have you seen copy like the following:
If you don't have Acrobat reader, click here.
Google bombing demonstrates the importance of a link's text. A Google Bomb manipulates Google's PageRank algorithm by encouraging many, many sites to link somewhere using a very particular phrase. This mass-linking can then cause a web page to display as the number one result on Google for the given phrase. The most famous Google Bomb is "miserable failure", which was directed to point at George W. Bush's biography page (note that the two pages below it are Jimmy Carter's bio and Michael Moore's website). Clearly, link text is important.
When deciding which text should be linked, look at the content from Google's point of view. Which adjectives and nouns best represent the content of the link? Does "click here" or "automotive price guide" describe my link? Proper linking will enable Google to return more relevant results. Intra-site links should have proper text associated with them, rather than "click here" or an image.
Finally, stay away from imperatives if possible. Here are two examples of linked text, one using an imperative approach, and the other using a more semantic one:
This file requires Acrobat reader. If you do not have Acrobat installed, click here.
This file requires Adobe Acrobat reader. If you do not have Acrobat installed, it is available from Adobe's website.
Good links are hard to write. So why bother? Properly written links will help convey relationships and meaning to modern search engines. When search engines work, interested customers will find their way to your web site.
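Weak link text is easy to detect mechanically. The following sketch lists links whose text is generic or empty, using Python's standard html.parser module. The list of generic phrases, the example.com URL, and the choice of which words are linked in the two Acrobat examples are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumed list of phrases that say nothing about the link target.
GENERIC_LINK_TEXT = {"click here", "here", "more", "read more", "link"}


class LinkTextParser(HTMLParser):
    """Collects (href, visible text) pairs for every <a href="..."> element."""

    def __init__(self) -> None:
        super().__init__()
        self._href = None   # href of the link being read, if any
        self._text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href are ignored.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def weak_links(html: str) -> list[tuple[str, str]]:
    """Return links whose text is empty (e.g. image-only) or generic."""
    parser = LinkTextParser()
    parser.feed(html)
    return [
        (href, text)
        for href, text in parser.links
        if not text or text.lower().strip(" .!") in GENERIC_LINK_TEXT
    ]


imperative = (
    "If you do not have Acrobat installed, "
    '<a href="https://example.com/reader">click here</a>.'
)
descriptive = (
    "If you do not have Acrobat installed, it is available from "
    '<a href="https://example.com/reader">Adobe\'s website</a>.'
)

print(weak_links(imperative))   # [('https://example.com/reader', 'click here')]
print(weak_links(descriptive))  # []
```

A check like this only catches the obvious offenders. Whether the remaining link text actually describes its target still takes a human editor.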
3. Inaccessible web sites.
You can run a web site with the most relevant content in the world, but Google will never find that content if its crawler can't browse your site. It's possible for developers to overlook Google as an active web site user; after all, Google never submits support requests or complaints. However, Google is an extremely important visitor to design for.
In the worst-case scenario, Google's web crawler won't make it past your first web page. In practice, Macromedia Flash-only web sites will not reach as many customers, solely because Google has no means of indexing content embedded within a Flash movie. 98% of web users may be able to view a Flash movie, but most won't be able to find it unless Google and other search engines know about it.
Therefore, care must be taken during the engineering of a web site. Any and all content worthy of search should be placed in a location reachable and readable by Google. Site maps should be deployed to ensure every page is visited. Locking out Google will in turn lock out new visitors before they even find a web site.
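One way to think about reachability is to follow plain HTML links from the home page, the way a crawler would, and see which pages are never visited. The sketch below does this with a breadth-first search over a hypothetical link graph; the paths are invented, and it assumes the product pages are linked only from a Flash menu that a crawler cannot read.

```python
from collections import deque


def reachable_pages(links: dict[str, list[str]], start: str) -> set[str]:
    """Follow links breadth-first from the start page, as a crawler would."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def unreachable_pages(links: dict[str, list[str]], start: str) -> set[str]:
    """Pages that exist in the graph but cannot be reached from the start page."""
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return all_pages - reachable_pages(links, start)


# Hypothetical site: keys are pages, values are the plain HTML links they contain.
# The product pages are linked only from a Flash menu, so no HTML link leads to them.
site_links = {
    "/": ["/about"],
    "/about": ["/"],
    "/products": ["/products/widget"],
    "/products/widget": ["/products"],
}

orphans = unreachable_pages(site_links, "/")
print(sorted(orphans))  # ['/products', '/products/widget']

# Adding a site map that links to every page, and linking to it from the
# home page, makes the whole site reachable.
with_site_map = dict(site_links)
with_site_map["/sitemap"] = sorted(site_links)
with_site_map["/"] = site_links["/"] + ["/sitemap"]

print(sorted(unreachable_pages(with_site_map, "/")))  # []
```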
4. Low content per web page.
HTML is not perfect. It's very, very hard to transfer a design flawlessly to the web. Fonts, text sizes, and browsers are all uncertain. The web can be a very frustrating place for the exacting designer.
Such frustrations do not warrant regression. Some web pages choose to preserve design integrity by displaying everything as an image. Plain text and tables of data are sometimes saved into images for display exactly as the designer intended it to be seen.
This practice is not recommended, especially since few web developers properly use the ALT attribute of the IMG tag. The ALT attribute is designed to provide a textual description of the content of an image. This text is useful to people using a text-only browser, and to those who are visually impaired and employ a text-to-speech reader.
With Google, the situation is worse. Google Images may be archiving your text as an image, but Google Search has no way of deciphering what is on your web page, especially if the ALT attributes are neglected. On-line HTML sell sheets are worthless if they are not searchable—all the product-related copy is invisible to Google. The consequences may include brand dilution if another unofficial web site has more textual content. For example, a web site with full text describing the negative effects of a given product may appear before the product's actual web page if there's no text for Google to search.
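Neglected ALT attributes are simple to find automatically. This sketch, again built on Python's standard html.parser module, lists the images on a page that have no ALT text. The sell sheet markup and file names are hypothetical. Note the simplification: an empty ALT can be a reasonable choice for a purely decorative image, but it is flagged here because the concern is images that stand in for text.

```python
from html.parser import HTMLParser


class ImageAltParser(HTMLParser):
    """Records the src of every <img> whose alt text is missing or blank."""

    def __init__(self) -> None:
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt") or ""
        if not alt.strip():
            self.missing_alt.append(attributes.get("src") or "(no src)")


def images_missing_alt(html: str) -> list[str]:
    parser = ImageAltParser()
    parser.feed(html)
    return parser.missing_alt


# Hypothetical sell sheet where the product copy lives inside images.
sell_sheet = """
<h1>Widget 3000</h1>
<img src="specs-table.png">
<img src="logo.png" alt="Company logo">
<img src="pricing.png" alt="">
"""

print(images_missing_alt(sell_sheet))  # ['specs-table.png', 'pricing.png']
```

Even with perfect ALT text, a short description is a poor substitute for a full table of specifications written in HTML, so treat this check as a safety net rather than a fix.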
PDF files can alleviate some of these problems. Google is fully capable of searching PDF files linked from the web. However, a visitor that visits the site solely through a direct PDF link will not have a chance to explore the rest of the web site and potentially re-visit pages. Therefore, using PDF files as a replacement for properly designed web pages is not recommended.
I've explained four common pitfalls and attempted to show how they can be avoided. Proper web design and copywriting can work together to naturally create a strong search ranking on Google, MSN search, Yahoo, and other search engines. The key is to provide Google with as much information as possible, through properly written titles and links, clear access, and textual data presented in HTML rather than in images.
Information is power, and Google's power is tempered by the quality of information that's provided to it via web pages. Properly designed web pages will have the majority of their content accessible to Google, with important elements such as links and titles properly specified. Web design should always cater to Google as well as the end-user. In fact, good web design will contribute to Google's effectiveness. Remember, the web is now too large to be read solely by human beings: software agents are users, too.
A List Apart has an excellent article on using proper XHTML/CSS for SEO. The W3C (World Wide Web Consortium) has suggestions for good page titles, along with a whole host of other web coding tips. Microsoft's Small Business Center has additional tips on being friendly to search engines. Let me know if you have other good articles on the subject.
I am happy to answer any questions; leave a comment below. Thanks for reading.