1/5/12 | 9:00:00 AM
Labels: search quality
Today we’re continuing our monthly series with details about many of the improvements we make to search. For the month of December, you’ll find a list of 30 search improvements, 9 of which we’ve blogged about previously. In addition, to have a little fun we’re including a sampling of codenames along with the list.
Codenames make changes easier to talk about and remember, and they can also be a lot of fun. You might remember “Panda” and “Caffeine,” but you probably don’t remember last month’s “Top result selection code rewrite.” That’s why many of the search quality improvements we make have internal codenames.
To give you just one example, our old question-answering feature in search was codenamed “DAFFIE,” which stood for the “Database of All Fact Fiction Information and Exaggeration.” In 2010 the team did a complete overhaul of the system and released a new short answers feature. Amit Singhal, thinking of Daffy Duck, decided to codename the new system “Porky Pig,” because Porky Pig was trying to kill Daffy Duck. The team laughed, thinking that Amit was just confused (everyone knows Elmer Fudd is the hunter). But it turns out Amit was right, as he often is. In 1937, in the original cartoon to feature Daffy Duck, Porky Pig was in fact hunting Daffy.
Here’s the list for December:
- Image Search landing page quality signals. [launch codename “simple”] This is an improvement that analyzes various landing page signals for Image Search. We want to make sure that not only are we showing you the most relevant images, but we are also linking to the highest quality source pages.
- More relevant sitelinks. [launch codename “concepts”, project codename “Megasitelinks”] We improved our algorithm for picking sitelinks. The result is more relevant sitelinks; for example, we may show sitelinks specific to your metropolitan region, which you can control with your location setting.
- Soft 404 detection. Web servers generally return the 404 status code when someone requests a page that doesn’t exist. However, some sites are configured to return other status codes, even though the page content might explain that the page was not found. We call these soft 404s (or “crypto” 404s), and they can be problematic for search engines because we aren’t sure whether we should ignore the pages. This change is an improvement to how we detect soft 404s, especially in Russian, German and Spanish. For all you webmasters out there, the best practice is still to always use the correct response code.
- More accurate country-restricted searches. [launch codename “greencr”] On domains other than .com, users have the option to see only results from their particular country. This is a new algorithm that uses several signals to better determine where web documents are from, improving the accuracy of this feature.
- More rich snippets. We improved our process for detecting sites that qualify for shopping, recipe and review rich snippets. As a result, you should start seeing more sites with rich snippets in search results.
- Better infrastructure for autocomplete. This is an infrastructure change to improve how our autocomplete algorithm handles spelling corrections for query prefixes (the beginning part of a search).
- Better spam detection in Image Search. [launch codename “leaf”] This change improves our spam detection in Image Search by extending algorithms we already use for our main search results.
- Google Instant enhancements for Japanese. For languages that use non-Latin characters, many users use a special IME (Input Method Editor) to enter queries. This change works with browsers that are IME-aware to better handle Japanese queries in Google Instant.
- More accurate byline dates. [launch codename “foby”] We made a few improvements to how we determine what date to associate with a document. As a result, you’ll see more accurate dates annotating search results.
- Live results for NFL and college football. [project codename “Live Results”] We’ve added new live results for NFL.com and ESPN’s NCAA Football results. These results now provide the latest scores, schedules and standings for your favorite football teams.
- Improved dataset for related queries. We are now using an improved dataset on term relationships to find related queries. We sometimes include results for queries that are related to your original search, and this improvement leads to results from more relevant related queries.
- Related query improvements. [launch codename “lyndsy”] Sometimes we fetch results for queries that are related to the original query but have fewer words. We made several changes to our algorithms to make them more conservative and less likely to introduce results without query words.
- Better lyrics results. [launch codename “baschi”, project codename “Contra”] This change improves our result quality for lyrics searches.
- Tweak to +1 button on results page. As part of our continued effort to deliver a beautifully simple user experience across Google products, we’ve made a subtle tweak to how the +1 button appears on the results page. Now the +1 button will only appear when you hover over a result or when the result has already been +1’d.
- Better spell correction in Vietnamese. [project codename “Pho Viet”] We launched a new Vietnamese spelling model. This will help give more accurate spelling predictions for Vietnamese queries.
- Upcoming events at venues. We’ve improved the recently released places panel for event venues. For major venues, we now show up to three upcoming events on the right of the page. Try it for [staples center los angeles] or [paradise rock club boston].
- Improvements to image size signal. [launch codename “matter”] This is an improvement to how we use the size of images as a ranking signal in Image Search. With this change, you’ll tend to see images with larger full-size versions.
- Improved Hebrew synonyms. [launch codename “SweatNovember”, project codename “Synonyms”] This update refines how we handle Hebrew synonyms across multiple languages. Context matters a lot for translation, so this change prevents us from using translated synonyms that are not actually relevant to the query context.
- Safer searching. [launch codename “Hoengg”, project codename “SafeSearch”] We updated our SafeSearch tool to provide better filtering for certain queries when strict SafeSearch is enabled.
- Encrypted search available on new regional domains. Google now offers encrypted search by default on google.com for signed-in users, but it’s not the default on our other regional domains (e.g., google.fr for France). Now users in the UK, Germany and France can opt in to encrypted search by navigating directly to an SSL version of Google Search on their respective regional domains: https://www.google.co.uk, https://www.google.de and https://www.google.fr.
- Faster mobile browsing. [launch codename “old possum”, project codename “Skip Redirect”] Many websites redirect smartphone users to another page that is optimized for smartphone browsers. This change uses the final smartphone destination URL in our mobile search results, so you can bypass all the redirects and load the target page faster.
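For webmasters who want to audit their own sites for the soft 404s described in the list above, here is a minimal sketch in Python. It is not how Google detects soft 404s. The phrase list, the word-count threshold and the function names are all assumptions made up for illustration, and a real check would need to handle more languages and page layouts.

```python
import re

# Hypothetical phrases that often appear on "not found" pages.
# This list is an assumption for illustration, not taken from the post.
NOT_FOUND_PHRASES = (
    "page not found",
    "does not exist",
    "no longer available",
)


def visible_text(html: str) -> str:
    """Crudely strip HTML tags, collapse whitespace and lowercase the text."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip().lower()


def looks_like_soft_404(status_code: int, html: str, max_words: int = 50) -> bool:
    """Guess whether a response is a soft 404.

    A response is flagged when the server reports success (2xx) but the
    page is short and its text says the content is missing.
    """
    # A genuine 404 or 410 is the correct response code, so it is not "soft".
    if status_code in (404, 410):
        return False
    # This sketch only looks at successful responses.
    if not 200 <= status_code < 300:
        return False

    text = visible_text(html)
    # Long pages that merely mention one of the phrases are left alone.
    if len(text.split()) > max_words:
        return False
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)
```

You would feed the function the status code and body from whatever HTTP client you already use. If it flags a URL on your site, the fix is the best practice mentioned above: have the server return the correct response code for that page.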
For completeness, here’s a recap of improvements we’ve already blogged about since last time:
- Flight results on google.com
- Graphing calculator
- Google Goggles 1.7
- Tablet image results carousel view
- Updated maps for UK, Germany, Finland and Sweden
- Faster movie search on mobile
- Public Data Explorer revamp
- Author Stats in Webmaster Tools
- Smartphone Googlebot-Mobile