Thursday, December 4, 2008

One place for changing your site's settings

One of the many useful features of Webmaster Tools is the ability to adjust settings for your site, such as crawl rate or geographic target. We've been steadily adding settings over time and have now gotten to the point where they merit their own page. That's right, Webmaster Tools now provides a single, dedicated page where you can see and adjust all the settings for your site.

The settings that have been moved to the new Settings page are:
1. Geographic Target
2. Preferred domain control
3. Opting in to enhanced image search
4. Crawl rate control





Changing a Setting
Whenever you change a setting, you will be given an option to save or cancel the change.

Please note: The Save/Cancel option is provided on a per-setting basis, so if you change multiple settings, you'll have to click the Save button associated with each one.


Expiration of a setting
Some of the settings are time-bounded. That is, your setting will expire after a certain time period. For example, the crawl rate setting has an expiration period of 90 days. After this period, it's automatically reset to the default setting. Whenever you visit the Settings page, you can view the date that your setting will expire underneath the setting name.


That's all there is to it!

We always like adding features and making our interface clearer based on your suggestions, so keep them coming! Please share your feedback (or ask questions) in the Webmaster Help Forum.

A new look for our Webmaster Help Group

Googlers strongly believe in dogfooding our own products. We manage our work schedules with Google Calendar, publish our blogs on Blogger, and store scads of documentation on Google Sites. So, ever since we launched our first Webmaster Help Group, we've been using Google Groups to facilitate conversations about Webmaster Tools and web search issues.
Today, however, I'm thrilled to announce that our English and Polish Help Groups are getting a makeover. And the changes are more than just skin-deep. Our new Help Forums should make it easier for you to find answers, share resources with others, and have your participation acknowledged.
You can read more about the changes on the Official Google Blog, and then check it out for yourself: English, Polish.
Q: What will happen to the old English and Polish Help Groups?
A: While our old groups are now closed to new posts, they will still be available in read-only mode in case you want to reference any of your favorite posts from the good old days. Many of the most frequently-asked questions (and answers!) have already been transferred to our new Help Forums.
Q: If I was a member of the old group, will I automatically be a member of the new forum?
A: We won't be "transferring" membership from the old groups to the new, so even if you were a member of our Google Groups forum, you'll still need to join the new forum in order to participate. Nicknames and user profiles are also managed separately, so you're welcome to recreate your Google Groups profile in our new forum, or reinvent yourself.
Q: What about the Webmaster Help Groups in other languages?
A: They'll be moving to the new Help Forum format in 2009. Specific dates will be announced in each of the groups as they get closer to their moving date.
Feel free to post any other questions about the new Help Forums in the comments below.

More control of Googlebot's crawl rate

We've upgraded the crawl rate setting in Webmaster Tools so that webmasters experiencing problems with Googlebot can now provide us more specific information. Crawl rate for your site determines the time used by Googlebot to crawl your site on each visit. Our goal is to thoroughly crawl your site (so your pages can be indexed and returned in search results!) without creating a noticeable impact on your server's bandwidth. While most webmasters are fine using the default crawl setting (i.e. no changes needed, more on that below), some webmasters may have more specific needs.

Googlebot employs sophisticated algorithms that determine how much to crawl each site it visits. For the vast majority of sites, it's probably best to choose the "Let Google determine my crawl rate" option, which is the default. However, if you're an advanced user or if you're facing bandwidth issues with your server, you can customize your crawl rate to the speed that best suits your web server(s). The custom crawl rate option lets you tell Googlebot the maximum number of requests per second and the number of seconds between requests that you feel are best for your environment.

Googlebot determines the range of crawl rate values you'll have available in Webmaster Tools. This is based on our understanding of your server's capabilities. This range may vary from one site to another and across time based on several factors. Setting the crawl rate to a lower-than-default value may affect the coverage and freshness of your site in Google's search results. However, setting it to a higher value than the default won't improve your coverage or ranking. If you do set a custom crawl rate, the new rate will be in effect for 90 days, after which it resets to Google's recommended value.
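To make the two custom crawl rate numbers and the 90-day window concrete, here is a minimal sketch. It assumes that "seconds between requests" is simply the reciprocal of "requests per second"; the function names are hypothetical and are not part of Webmaster Tools.

```javascript
// Hypothetical helpers for illustration; not a Webmaster Tools API.

// Assumption: the delay between requests is the reciprocal of the request rate.
function secondsBetweenRequests(requestsPerSecond) {
  if (!(requestsPerSecond > 0)) {
    throw new RangeError("requestsPerSecond must be a positive number");
  }
  return 1 / requestsPerSecond;
}

// A custom crawl rate stays in effect for 90 days, then resets.
const CUSTOM_RATE_LIFETIME_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function customRateExpiry(setOn) {
  return new Date(setOn.getTime() + CUSTOM_RATE_LIFETIME_DAYS * MS_PER_DAY);
}

const expiry = customRateExpiry(new Date(Date.UTC(2008, 11, 4)));
console.log(secondsBetweenRequests(0.5));       // 2
console.log(expiry.toISOString().slice(0, 10)); // 2009-03-04
```

So a custom rate of 0.5 requests per second would correspond to one request every 2 seconds, and a setting saved on December 4, 2008 would lapse on March 4, 2009.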

You may use this setting only for root level sites and sites not hosted on a large domain like blogspot.com (we have special settings assigned for them). To check the crawl rate setting, sign in to Webmaster Tools and visit the Settings tab. If you have additional questions, visit the Webmaster Help Center to learn more about how Google crawls your site or post your questions in the Webmaster Help Forum.


Thursday, November 27, 2008

Date with Googlebot, Part II: HTTP status codes and If-Modified-Since

Our date with Googlebot was so wonderful, but it's hard to tell if we, the websites, said the right thing. We returned 301 permanent redirect, but should we have responded with 302 temporary redirect (so he knows we're playing hard to get)? If we sent a few new 404s, will he ever call our site again? Should we support the "If-Modified-Since" header? These questions can be confusing, just like young love. So without further ado, let's ask the expert, Googlebot, and find out how he judged our response (code).


Supporting the "If-Modified-Since" header and returning 304 can save bandwidth.


-----------
Dearest Googlebot,
  Recently, I did some spring cleaning on my site and deleted a couple of old, orphaned pages. They now return the 404 "Page not found" code. Is this ok, or have I confused you?
Frankie O'Fore

Dear Frankie,
  404s are the standard way of telling me that a page no longer exists. I won't be upset—it's normal that old pages are pruned from websites, or updated to fresher content. Most websites will show a handful of 404s in the Crawl Diagnostics over at Webmaster Tools. It's really not a big deal. As long as you have good site architecture with links to all your indexable content, I'll be happy, because it means I can find everything I need.

  But don't forget, it's not just me who comes to your website—there may be humans seeing these pages too. If you've only got a very simple '404 page not found' message, visitors who aren't as savvy could be baffled. There are lots of ways to make your 404 page more friendly; a quick one is our 404 widget over at Webmaster Tools, which will help direct people to content which does exist. For more information, you can read the blog post. Most web hosting companies, big and small, will let you customise your 404 page (and other return codes too).

Love and kisses,
Googlebot
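Googlebot's advice to Frankie can be sketched as a small request handler: keep sending the 404 status so crawlers know the page is gone, but make the body helpful for human visitors. This is only an illustration; the `pages` map and the `respond` function are hypothetical and not tied to any particular web server.

```javascript
// Hypothetical site content, keyed by path.
const pages = new Map([
  ["/", "<h1>Welcome</h1>"],
  ["/thesis.html", "<h1>My thesis</h1>"],
]);

function respond(path) {
  if (pages.has(path)) {
    return { status: 200, body: pages.get(path) };
  }
  // Keep the 404 status for crawlers, but point human visitors
  // towards content that does exist.
  return {
    status: 404,
    body: '<h1>Page not found</h1><p>Try the <a href="/">home page</a>.</p>',
  };
}

console.log(respond("/thesis.html").status);      // 200
console.log(respond("/deleted-page.html").status); // 404
```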


Hey Googlebot,
  I was just reading your reply to Frankie above, and it raised a couple of questions.
What if I have someone linking to a page that no longer exists? How can I make sure my visitors still find what they're after? Also, what if I just move some pages around? I'd like to better organise my site, but I'm worried you'll get confused. How can I help you?
Yours hopefully,
Little Jimmy


Hello Jimmy,
   Let's pretend there are no anachronisms in your letter, and get to the meat of the matter. Firstly, let's look at links coming from other sites. Obviously, these can be a great source of traffic, and you don't want visitors presented with an unfriendly 'Page not found' message. So, you can harness the power of the mighty redirect.

   There are two types of redirect: 301 and 302. Actually, there are lots more, but these are the two we'll concern ourselves with now. Just like 404, 301 and 302 are different types of response codes you can send to users and search engine crawlers. They're both redirects, but a 301 is permanent and a 302 is temporary. A 301 redirect tells me that whatever this page used to be, now it lives somewhere else. This is perfect for when you're re-organising your site, and also helps with links from offsite. Whenever I see a 301, I'll update all references to that old page with the new one you've told me about. Isn't that easy?

   If you don't know where to begin with redirects, let me get you started. It depends on your webserver, but here are some searches that may be helpful:
Apache: http://www.google.com/search?q=301+redirect+apache
IIS: http://www.google.com/search?q=301+redirect+iis
You can also check your manual, or the README files that came with your server.

   As an alternative to a redirect, you can email the webmaster of the site linking to you and ask them to update their link. Not sure what sites are linking to you? Don't despair - my human co-workers have made that easy to figure out. In the "Links" portion of Webmaster Tools, you can enter a specific URL on your site to determine who's linking to it.

  My human co-workers also just released a tool which shows URLs linking to non-existent pages on your site. You can read more about that here.

Yours informationally,
Googlebot
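For readers who would like to see the shape of a 301 before searching for their own server's syntax, here is a minimal, server-agnostic sketch. The paths and the `redirectFor` function are made up for illustration; real servers such as Apache or IIS have their own configuration for this.

```javascript
// Hypothetical old-to-new URL map for a reorganised site.
const movedPermanently = new Map([
  ["/old-products.html", "/products/"],
  ["/contact-us.htm", "/contact/"],
]);

function redirectFor(path) {
  const target = movedPermanently.get(path);
  if (target === undefined) {
    return null; // not a moved page; serve it (or a 404) as usual
  }
  // 301 says the move is permanent; Location says where the page lives now.
  return { status: 301, headers: { Location: target } };
}

console.log(redirectFor("/old-products.html"));
// { status: 301, headers: { Location: '/products/' } }
```

Visitors following an outdated link from another site land on the new page instead of an unfriendly 'Page not found' message.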



Darling Googlebot,
   I have a problem—I live in a very dynamic part of the web, and I keep changing my mind about things. When you ask me questions, I never respond the same way twice—my top threads change every hour, and I get new content all the time! You seem like a straightforward guy who wants straightforward answers. How can I tell you when things change without confusing you?
Temp O'Rary


Dear Temp,
   I just told little Jimmy that 301s are the best way to tell a Googlebot about your new address, but what you're looking for is a 302.
   Once you're indexed, it's the polite way to tell your visitors that your address is still the right one, but that the content can temporarily be found elsewhere. In these situations, a 302 (or the rarer '307 Temporary Redirect') would be better. For example, orkut redirects from http://orkut.com to http://google.com/accounts/login?service=orkut, which isn't a page that humans would find particularly useful when searching for Orkut***.
It's on a different domain, for starters. So, a 302 has been used to tell me that all the content and linking properties of the URL shouldn't be updated to the target - it's just a temporary page.

  That's why when you search for orkut, you see orkut.com and not that longer URL.

  Remember: simple communication is the key to any relationship.

Your friend,
Googlebot


*** Please note, I simplified the URL to make it easier to read. It's actually much more complex than that.
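The difference between the two letters can be summed up in a toy model of which URL ends up being kept after a redirect. This is only a sketch of the behaviour described above, not Googlebot's actual implementation; `urlToKeep` and the shape of the response object are assumptions.

```javascript
// Toy model only: decides which URL to keep after following a redirect.
function urlToKeep(requestedUrl, response) {
  if (response.status === 301) {
    // Permanent: references to the old URL are updated to the new one.
    return response.location;
  }
  // 302 and 307 are temporary, so the original URL stays the one on record.
  return requestedUrl;
}

console.log(urlToKeep("http://example.com/old", {
  status: 301,
  location: "http://example.com/new",
})); // http://example.com/new

console.log(urlToKeep("http://orkut.com", {
  status: 302,
  location: "http://google.com/accounts/login?service=orkut",
})); // http://orkut.com
```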

Captain Googlebot,
   I am the kind of site who likes to reinvent herself. I noticed that the links to me on my friends' sites are all to URLs I got rid of several redesigns ago! I had set up 301s to my new URLs for those pages, but after that I 301'ed the newer URLs to my next version. Now I'm afraid that if you follow their directions when you come to crawl, you'll end up following a string of 301s so long that by the end you won't come calling any more.
Ethel Binky


Dear Ethel,
   It sounds like you have set up some URLs that redirect to more redirects to... well, goodness! In small amounts, these "repeat redirects" are understandable, but it may be worth considering why you're using them in the first place. If you remove the 301s in the middle and send me straight to the final destination on all of them, you'll save the both of us a bunch of time and HTTP requests. But don't just think of us. Other people get tired of seeing that same old 'contacting.... loading ... contacting...' game in their status bar.

   Put yourself in their shoes—if your string of redirects starts to look rather long, users might fear that you have set them off into an infinite loop! Bots and humans alike can get scared by that kind of "eternal commitment." Instead, try to get rid of those chained redirects, or at least keep 'em short. Think of the humans!

Yours thoughtfully,
Googlebot
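Ethel's clean-up can be sketched in a few lines: given a table of redirects, point every old URL straight at its final destination and complain if a loop turns up. The `flattenRedirects` function and the example paths are hypothetical.

```javascript
// Collapse chained redirects so each old URL maps directly to its final target.
function flattenRedirects(redirects) {
  const flattened = new Map();
  for (const start of redirects.keys()) {
    const seen = new Set([start]);
    let target = redirects.get(start);
    // Keep following the chain while the target is itself redirected.
    while (redirects.has(target)) {
      if (seen.has(target)) {
        throw new Error("Redirect loop involving " + target);
      }
      seen.add(target);
      target = redirects.get(target);
    }
    flattened.set(start, target);
  }
  return flattened;
}

const chained = new Map([
  ["/v1/about", "/v2/about"],
  ["/v2/about", "/v3/about"],
]);
console.log(flattenRedirects(chained).get("/v1/about")); // /v3/about
```

After flattening, a visitor arriving at /v1/about needs one hop instead of two.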


Dear Googlebot,
   I know you must like me—you even ask me for unmodified files, like my college thesis that hasn't changed in 10 years. It's starting to be a real hassle! Is there anything I can do to prevent your taking up my lovely bandwidth?

Janet Crinklenose


Janet, Janet, Janet,
   It sounds like you might want to learn a new phrase—'304 Not Modified'. If I've seen a URL before, I insert an 'If-Modified-Since' in my request's header. This line also includes an HTTP-formatted date string. If you don't want to send me yet another copy of that file, stand up for yourself and send back a normal HTTP header with the status '304 Not Modified'! I like information, and this qualifies too. When you do that, there's no need to send me a copy of the file—which means you don't waste your bandwidth, and I don't feel like you're palming me off with the same old stuff.

   You'll probably notice that a lot of browsers and proxies will say 'If-Modified-Since' in their headers, too. You can be well on your way to curbing that pesky bandwidth bill.

Now go out there and save some bandwidth!
Good ol' Googlebot
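Here is a minimal sketch of the decision Janet's server has to make. It assumes the server knows when the file last changed; `conditionalGet` is a hypothetical function name, and a real server would usually handle this for static files on its own.

```javascript
// Decide between "200 OK" with the file and "304 Not Modified" without it.
function conditionalGet(ifModifiedSince, lastModified, body) {
  const headers = { "Last-Modified": lastModified.toUTCString() };
  const since = ifModifiedSince ? Date.parse(ifModifiedSince) : NaN;
  // HTTP dates have one-second resolution, so compare whole seconds.
  const lastModifiedSeconds = Math.floor(lastModified.getTime() / 1000);
  if (!Number.isNaN(since) && lastModifiedSeconds <= Math.floor(since / 1000)) {
    return { status: 304, headers, body: "" }; // nothing new: send no body
  }
  return { status: 200, headers, body };
}

const thesisChanged = new Date(Date.UTC(1998, 10, 27));
const thesis = "<h1>My college thesis</h1>";

console.log(conditionalGet("Thu, 27 Nov 2008 00:00:00 GMT", thesisChanged, thesis).status); // 304
console.log(conditionalGet(undefined, thesisChanged, thesis).status);                       // 200
```

The 304 response carries no copy of the file, which is where the bandwidth saving comes from.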

-----------

Googlebot has been so helpful! Now we know how to best respond to users and search engines. The next time we get together, though, it's time to sit down for a good long heart-to-heart with the guy (Date with Googlebot: Part III, is coming soon!).



UPDATE: Added a missing link. Thanks to Boris for pointing that out.

Thursday, November 13, 2008

Better targeting your indic language site

A lot has been said about how to start a multi-lingual site and how to better target content through meta tags. Our users have raised a number of interesting questions about creating websites in different languages, like the one below.

ganex:
> How does one do for INDIA.
> As there are many languages spoken here.
> My Site is primarily in English, but my site targets different cities in INDIA.
> For Hyderabad - I want in Urdu & Telugu and for Chennai I want in Tamil
> for Bengaluru I want in Kannada.
> For North I want in Hindi.

We’d like to introduce the transliteration API for Indic languages (languages spoken in India) in addition to our Ajax API for languages. With this API at your disposal, content creation is simplified: it not only helps you integrate transliteration into your website but also lets users visiting your site type in Indic languages.

To include the transliteration API, first load the AJAX API script:

<script type="text/javascript" src="http://www.google.com/jsapi"></script>

This script tag will load the google.load function, which lets you load the individual Google APIs. To load the Google Transliteration API, the call to google.load looks like this:

<script type="text/javascript">
google.load("elements", "1", {
packages: "transliteration"
});
</script>


When it comes to targeting, don't forget to add meta tags in your local language. And for your questions, we have a new addition to our already existing communication channels like the webmaster help groups and webmaster tools (available in 26 languages!). We also have our own official Orkut webmaster community! Here users can share thoughts and discuss webmaster related issues.

Sign up for our Orkut community now and if you have any additional thoughts we'd love to hear about them.

Cheers,

On-Demand Sitemaps for Custom Search

Since we launched enhanced indexing with the Custom Search platform earlier this year, webmasters who submit Sitemaps to Webmaster Tools get special treatment: Custom Search recognizes the submitted Sitemaps and indexes URLs from these Sitemaps into a separate index for higher quality Custom Search results. We analyze your Custom Search Engines (CSEs), pick up the appropriate Sitemaps, and figure out which URLs are relevant for your engines for enhanced indexing. You get the dual benefit of better discovery for Google.com and more comprehensive coverage in your own CSEs.

Today, we're taking another step towards improving your experience with Google webmaster services with the launch of On-Demand Indexing in Custom Search. With On-Demand Indexing, you can now tell us about the pages on your websites that are new, or that are important and have changed, and Custom Search will instantly schedule them for crawl, and index and serve them in your CSEs usually within 24 hours, often much faster.

How do you tell us about these URLs? You guessed it... provide a Sitemap to Webmaster Tools, like you always do, and tell Custom Search about it. Just go to the CSE control panel, click on the Indexing tab, select your On-Demand Sitemap, and hit the "Index Now" button. You can tell us which of these URLs are most important to you via the priority and lastmod attributes that you provide in your Sitemap. Each CSE has a number of pages allocated within the On-Demand Index, and with these attributes, you can tell us which are most important for indexing. If you need greater allocation in the On-Demand index, as well as more customization controls, Google Site Search provides a range of options.
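As a rough illustration of where priority and lastmod go, here is a small sketch that builds a Sitemap in the sitemaps.org XML format, where both values are written as child elements of each url entry. The `buildSitemap` and `escapeXml` functions and the example URLs are hypothetical.

```javascript
// Escape characters that are not allowed as-is inside XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// entries: [{ loc, lastmod (YYYY-MM-DD), priority (0.0 to 1.0) }]
function buildSitemap(entries) {
  const urls = entries.map((entry) =>
    [
      "  <url>",
      `    <loc>${escapeXml(entry.loc)}</loc>`,
      `    <lastmod>${entry.lastmod}</lastmod>`,
      `    <priority>${entry.priority.toFixed(1)}</priority>`,
      "  </url>",
    ].join("\n")
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

console.log(buildSitemap([
  { loc: "http://www.example.com/news?id=1&lang=en", lastmod: "2008-11-13", priority: 1 },
  { loc: "http://www.example.com/archive/", lastmod: "2008-01-05", priority: 0.3 },
]));
```

Pages with a recent lastmod and a high priority are the ones you are flagging as most important within your On-Demand allocation.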


Some important points to remember:
  1. You only need to submit your Sitemaps once in Webmaster Tools. Custom Search will automatically list the Sitemaps submitted via Webmaster Tools and you can decide which Sitemap to select for On-Demand Indexing.
  2. Your Sitemap needs to be for a website verified in Webmaster Tools, so that we can verify ownership of the right URLs.
  3. In order for us to index these additional pages, our crawlers must be able to crawl them. You can use "Webmaster Tools > Crawl Errors > URLs restricted by robots.txt" or check your robots.txt file to ensure that you're not blocking us from crawling these pages.
  4. Submitting pages for On-Demand Indexing will not make them appear any faster in the main Google index, or impact ranking on Google.com.
We hope you'll use this feature to inform us regularly of the most important changes on your sites, so we can respond quickly and get those pages indexed in your CSE. As always, we're listening for your feedback on Custom Search.

Wednesday, November 12, 2008

Google's SEO Starter Guide

Note: The SEO Starter Guide has since been updated.

Webmasters often ask us at conferences or in the Webmaster Help Group, "What are some simple ways that I can improve my website's performance in Google?" There are lots of possible answers to this question, and a wealth of search engine optimization information on the web, so much that it can be intimidating for newer webmasters or those unfamiliar with the topic. We thought it'd be useful to create a compact guide that lists some best practices that teams within Google and external webmasters alike can follow that could improve their sites' crawlability and indexing.

Our Search Engine Optimization Starter Guide covers around a dozen common areas that webmasters might consider optimizing. We felt that these areas (like improving title and description meta tags, URL structure, site navigation, content creation, anchor text, and more) would apply to webmasters of all experience levels and sites of all sizes and types. Throughout the guide, we also worked in many illustrations, pitfalls to avoid, and links to other resources that help expand our explanation of the topics. We plan on updating the guide at regular intervals with new optimization suggestions and to keep the technical advice current.

So, the next time we get the question, "I'm new to SEO, how do I improve my site?", we can say, "Well, here's a list of best practices that we use inside Google that you might want to check out."

Update on July 22, 2009: The SEO Starter Guide is now available in 40 languages!