Thursday, January 25, 2007

A quick word about Googlebombs

Co-written with Ryan Moulton and Kendra Carattini

We wanted to give a quick update about "Googlebombs." By improving our analysis of the link structure of the web, Google has begun minimizing the impact of many Googlebombs. Now we will typically return commentary, discussions, and articles about the Googlebombs instead. The actual scale of this change is pretty small (there are under a hundred well-known Googlebombs), but if you'd like to get more details about this topic, read on.

First off, let's back up and give some background. Unless you read about search engines all day, you might wonder "What is a Googlebomb?" Technically, a "Googlebomb" (sometimes called a "linkbomb" since they're not specific to Google) refers to a prank where people attempt to cause someone else's site to rank for an obscure or meaningless query. Googlebombs very rarely happen for common queries, because the lack of any relevant results for that phrase is part of why a Googlebomb can work. One of the earliest Googlebombs was for the phrase "talentless hack," for example.

People have asked how we feel about Googlebombs, and we have talked about them in the past. Because these pranks are normally for phrases that are well off the beaten path, they haven't been a very high priority for us. But over time, we've seen more people assume that they are Google's opinion, or that Google has hand-coded the results for these Googlebombed queries. That's not true, and it seemed worth trying to correct that misperception. So a few of us who work here got together and came up with an algorithm that minimizes the impact of many Googlebombs.

The next natural question to ask is "Why doesn't Google just edit these search results by hand?" To answer that, you need to know a little bit about how Google works. When we're faced with a bad search result or a relevance problem, our first instinct is to look for an automatic way to solve the problem instead of trying to fix a particular search by hand. Algorithms are great because they scale well: computers can process lots of data very fast, and robust algorithms often work well in many different languages. That's what we did in this case, and the extra effort to find a good algorithm helps detect Googlebombs in many different languages. We wouldn't claim that this change handles every prank that someone has attempted. But if you are aware of other potential Googlebombs, we are happy to hear feedback in our Google Web Search Help Group.

Again, this new algorithm is very limited in scope and impact, but we hope that the affected queries are more relevant for searchers.

Wednesday, January 24, 2007

About badware warnings

Some of you have asked about the warnings we show searchers when they click on search results leading to sites that distribute malicious software. As a webmaster, you may be concerned about the possibility of your site being flagged. We want to assure you that we take your concerns very seriously, and that we are very careful to avoid flagging sites incorrectly. It's our goal to avoid sending people to sites that would compromise their computers. These exploits often result in real people losing real money. Compromised bank accounts and stolen credit card numbers are just the tip of this identity theft iceberg.

If your site has been flagged for badware, we let you know this in webmaster tools. Often, we find that webmasters aren't aware that their sites have been compromised, and this warning in search results is a surprise. Fixing a compromised site can be quite hard. Simply cleaning up the HTML files is seldom sufficient. If a rootkit has been installed, for instance, nothing short of wiping the machine and starting over may work. Even then, if the underlying security hole isn't also fixed, the site may be compromised again within minutes.

We are looking at ways to provide additional information to webmasters whose sites have been flagged, while balancing our need to keep malicious site owners from hiding from Google's badware protection. We aim to be responsive to any misidentified sites too. If your site has been flagged, you'll see information on the appeals process in webmaster tools. If you can't find anything malicious on your site and believe it was misidentified, go to http://stopbadware.org/home/review to request an evaluation. If you'd like to discuss this with us or have ideas for how we can better communicate with you about it, please post in our webmaster discussion forum.

Update: this post has been updated to provide a link to the new form for requesting a review.


Update: for more information, please see our Help Center article on malware and hacked sites.

Friday, January 19, 2007

The Year in Review

Welcome to 2007! The webmaster central team is very excited about our plans for this year, but we thought we'd take a moment to reflect on 2006. We had a great year building communication with you, the webmaster community, and creating tools based on your feedback. Many on the team were able to come out to conferences and meet some of you in person, and we're looking forward to meeting many more of you in 2007. We've also had great conversations and gotten valuable feedback in our discussion forum, and we hope this blog has been helpful in providing information to you.

We said goodbye to the Sitemaps blog and launched this broader blog in August. After doing so, our number of unique monthly visitors more than doubled. Thanks! We got much of our non-Google traffic from other webmaster community blogs and forums, such as the Search Engine Watch blog, Google Blogoscoped, and WebmasterWorld. In December, seomoz.org and the new Searchengineland.com were our biggest non-Google referrers. Social networking sites such as digg.com, reddit.com, del.icio.us, and slashdot.org sent us many of our visitors, and a blog by somebody named Matt Cutts sent a lot of referrals our way as well. And these are the top Google queries that visitors clicked on:


Our most popular post was about the Googlebot activity reports and crawl rate control that we launched in October, followed by details about how to authenticate Googlebot. We have only slightly more Firefox users (46.28%) than Internet Explorer users (46.25%). 89% of you use Windows. After English, our readers most commonly speak French, German, Japanese, and Spanish. And after the United States, our readers primarily come from the UK, Canada, Germany, and France.

Here's some of what we did last year.

January
We expanded into Swedish, Danish, Norwegian, and Finnish.
You could hear Matt on webmaster radio.

February
We launched several new features, including:
  • robots.txt analysis tool
  • page with the highest PageRank by month
  • common words in your site's content and in anchor text to your site
We met many of you at the Google Sitemaps lunch at SES NY.
You could hear me on webmaster radio.

March
We launched a few more features, including:
  • showing the top position of your site for your top queries
  • top mobile queries
  • download options for Sitemaps data, stats, and errors

April
We got a whole new look and added yet more features, such as:
  • meta tag verification
  • notification of violations to the webmaster guidelines
  • reinclusion request form and spam reporting form
  • indexing information (can we crawl your home page? is your site indexed?)
We also added a comprehensive webmaster help center and expanded the webmaster guidelines from 10 languages to 18.
We met more of you at the Google Sitemaps lunch at Boston Pubcon.
Matt talked about the new caching proxy.
We talked to many of you at SES Toronto.

May
Matt introduced you to our new search evangelist, Adam Lasnik.
We hung out with some of you in our hometown at Search Engine Watch Live Seattle and over at SES London.

June

We launched user surveys, to learn more about how you interact with webmaster tools.
We expanded some of our features, such as:
  • Increased the number of crawl errors shown to 100% within the last two weeks
  • Increased the number of Sitemaps you can submit from 200 to 500
  • Expanded query stats so you can see them per property and per country and made them available for subdirectories
  • Increased the number of common words in your site and in links to your site from 20 to 75
  • Added Adsbot-Google to the robots.txt analysis tool
Yahoo! Stores incorporated Sitemaps for their merchants.

July
We expanded into Polish.
We began supporting the <meta name="robots" content="noodp"> tag to allow you to opt out of using Open Directory titles and descriptions for your site in the search results.
We had a great time talking to many of you about international issues at SES Latino in Miami.
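
As a rough illustration of the Open Directory opt-out tag mentioned in the July notes, the following Python sketch checks whether a page's HTML carries the `noodp` directive in a robots meta tag. The names `RobotsMetaParser` and `has_noodp` are hypothetical and chosen for this example; the code only inspects markup and makes no claim about how Google itself processes the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noodp(html_text):
    """Return True if the page opts out of Open Directory titles/descriptions."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noodp" in parser.directives
```

For example, `has_noodp('<meta name="robots" content="noindex, noodp">')` would report the opt-out, while a page with only `content="noindex"` would not.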

August
August was an exciting month for us, as we launched webmaster central! As part of that, we renamed Google Sitemaps to webmaster tools, expanded our Google Group to include all types of webmaster topics, and expanded the help content in our webmaster help center. We also launched some new features, including:
  • Preferred domain control
  • Site verification management
  • Downloads of query stats for all subfolders
In addition, I took over the GoodKarma podcast on webmasterradio for two shows (one all about Buffy the Vampire Slayer!) and we met even more of you at the Google Webmaster Central lunch at SES San Jose.

September
We improved reporting of the cache date in search results.
We provided a way for you to authenticate Googlebot.
And we started updating query stats more often and for a shorter timeframe.
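
One common way to authenticate Googlebot is a reverse DNS lookup on the requesting IP address followed by a forward lookup on the returned host name. The Python sketch below assumes that approach. The function name, the injectable resolver parameters, and the accepted domain suffixes (`.googlebot.com`, `.google.com`) are assumptions made for this illustration, not details taken from this post.

```python
import socket


def is_verified_googlebot(ip_address,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if ip_address passes a reverse-then-forward DNS check.

    The resolver functions are parameters so the check can be exercised
    without network access; by default they use the standard socket module.
    """
    try:
        # Step 1: reverse lookup (IP address -> host name).
        hostname = reverse_lookup(ip_address)[0]
    except OSError:
        return False

    # Step 2: the host name must sit under an expected domain (assumed list).
    # Requiring the leading dot rejects names like "googlebot.com.evil.example".
    normalized = hostname.lower().rstrip(".")
    if not normalized.endswith((".googlebot.com", ".google.com")):
        return False

    try:
        # Step 3: forward lookup (host name -> list of IPv4 addresses).
        forward_ips = forward_lookup(hostname)[2]
    except OSError:
        return False

    # Step 4: the forward lookup must lead back to the original address.
    return ip_address in forward_ips
```

The suffix check alone is not enough, because anyone can publish a reverse DNS record claiming any host name; the forward lookup is what ties the name back to the address. Note that `socket.gethostbyname_ex` resolves IPv4 addresses only, so an IPv6 variant would need a different resolver.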

October
We launched several new features, such as:
  • Crawl rate control
  • Googlebot activity reports
  • Opting in to enhanced image search
  • Display of the number of URLs submitted via a Sitemap
And you could hear Matt being interviewed in a podcast.

November
We launched sitemaps.org, for joint support of the Sitemaps protocol between us, Yahoo!, and Microsoft.
We also started notifying you if we flagged your site for badware. And if you're an English news publisher included in Google News, we made News Sitemaps available to you.
We partied with lots of you at "Safe Bets with Google" at Pubcon Las Vegas.
We introduced you to our new Sitemaps support engineer, Maile Ohye, and our first webmaster trends analyst, Jonathan Simon.

December
We met even more of you at the webmaster central lunch at SES Chicago.

Thanks for spending the year with us. We look forward to even more collaboration and communication in the coming year.