Sunday 2 June 2013

Set Up URL Structure

A site's URL structure should be as simple as possible. Consider organizing your content so that URLs are constructed logically and in a manner that is most intelligible to humans (when possible, readable words rather than long ID numbers). 

Consider using punctuation in your URLs. The URL http://mehratech.blogspot.in/2012/01/create-your-own-social-website.html is much more useful to us than http://mehratech.blogspot.in/2012/01/create_your_own_social_website.html. We recommend that you use hyphens (-) instead of underscores (_) in your URLs.
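As a rough illustration of the hyphen recommendation, the sketch below turns a post title into a hyphen-separated slug. The function name `slugify` and its ASCII-only rule are assumptions made for this example; a real blogging platform may build slugs differently.

```python
import re


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated URL slug."""
    # Replace every run of characters that is not a-z or 0-9 with one hyphen.
    # Underscores and spaces are both converted this way.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop any hyphen left over at the start or end.
    return slug.strip("-")


print(slugify("Create Your Own Social Website"))
print(slugify("create_your_own_social_website"))
```

Both calls produce the same hyphenated slug, so an existing underscore-style name can be converted with the same helper.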

Overly complex URLs, especially those containing multiple parameters, can cause problems for crawlers by creating unnecessarily high numbers of URLs that point to identical or similar content on your site. As a result, Googlebot may consume much more bandwidth than necessary, or may be unable to completely index all the content on your site.
 


Common causes of this problem
Unnecessarily high numbers of URLs can be caused by a number of issues. These include:

  • Dynamic generation of documents. This can result in small changes because of counters, timestamps, or advertisements.

  • Problematic parameters in the URL. Session IDs, for example, can create massive amounts of duplication and a greater number of URLs.

  • Calendar issues. A dynamically generated calendar might generate links to future and previous dates with no restrictions on start or end dates. For example:
    http://www.mehratech.blogspot.com/calendar.php?d=13&m=8&y=2011
    http://www.mehratech.blogspot.com/calendar/cgi?2008&month=jan
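To make the duplication concrete, here is a minimal sketch that collapses several URL variants into one canonical form by dropping parameters that do not change the page. The parameter names in `IGNORED_PARAMS`, the `canonical_url` helper, and the example.com URLs are all hypothetical; which parameters are safe to ignore depends on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not affect the page content.
IGNORED_PARAMS = {"sessionid", "sid", "ts"}


def canonical_url(url: str) -> str:
    """Drop ignorable query parameters and sort the rest."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Sorting makes ?a=1&b=2 and ?b=2&a=1 compare as equal.
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


urls = [
    "http://example.com/page.php?id=7&sessionid=abc123",
    "http://example.com/page.php?sessionid=zzz999&id=7",
    "http://example.com/page.php?id=7&ts=1370131200",
]
unique = {canonical_url(u) for u in urls}
print(len(urls), "crawlable URLs,", len(unique), "distinct page(s)")
```

A crawler sees three separate addresses here even though they all serve the same page, which is the kind of waste described above.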

Steps to resolve this problem
To avoid potential problems with URL structure, we recommend the following:
  • Consider using a robots.txt file to block Googlebot's access to problematic URLs. Typically, you should consider blocking dynamic URLs, such as URLs that generate search results, or URLs that can create infinite spaces, such as calendars. Using pattern-matching wildcards (* and $) in your robots.txt file lets you block large numbers of URLs with a single rule.

  • Wherever possible, avoid the use of session IDs in URLs. Consider using cookies instead.

  • Whenever possible, shorten URLs by trimming unnecessary parameters.

  • If your site has an infinite calendar, add a nofollow attribute to links to dynamically created future calendar pages.

  • Check your site for broken relative links.
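The robots.txt step can be checked before you publish the file. The sketch below uses Python's standard `urllib.robotparser` to test a few URLs against a set of rules. The rules and the example.com URLs are assumptions for illustration only. Python's parser matches rules by plain path prefix, so the sketch sticks to prefix rules rather than wildcards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block a calendar script and a search results path.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /calendar.php
Disallow: /search
"""

parser = RobotFileParser()
# parse() accepts the file content as a list of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

test_urls = [
    "http://www.example.com/calendar.php?d=13&m=8&y=2011",
    "http://www.example.com/search?q=url+structure",
    "http://www.example.com/2012/01/create-your-own-social-website.html",
]
for url in test_urls:
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(url, "->", verdict)
```

With these rules the calendar and search URLs are reported as blocked, while the ordinary post URL stays crawlable.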
