LaunchpadGooglification

Differences between revisions 2 and 3
Revision 2 as of 2005-04-24 05:47:33
Size: 2775
Editor: intern146
Comment:
Revision 3 as of 2005-04-25 00:52:38
Size: 2919
Editor: intern146
Comment: Add an outstanding issue. (I guess that makes me a contributer...)
  * Contributors: MarkShuttleworth, AndrewBennetts
Making Launchpad Google-friendly

Status

Introduction

This spec identifies issues related to search engines crawling the Launchpad web site, and aims to make sure that the entire site is discoverable from the home page.

Rationale

We have a ton of very interesting content in Launchpad, and we also have a very neat URL scheme. We need to make sure that Google and other search engines can crawl the entire web site, starting from the home page, without depending on outside links to interesting pages.

Scope and Use Cases

Google starts with the home page. From there, it should be possible to walk a list of every product, every project, every distro, every package, every branch, every bug, every bounty and every translation.

Implementation Plan

Currently, we have a few bottlenecks in the process for anyone crawling our site. For example, we don't publish a list of every product, with links to the individual product pages. We only have a search interface for "products", which returns a list of matching products. A search engine can't get past that search box, because it has no idea what to type in and submit. In fact, doing an "empty" search would produce a list of all products, but a search engine will almost certainly never simulate a form post.

We need to identify all such bottlenecks and make sure that we have a way to navigate past them. For example, if the product search page had a link saying "Show All Products" that took one to a list of all products, linked to their product pages in Launchpad, then Google could bypass the form, follow that link, then proceed to index each of the product pages individually.
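The bottleneck can be illustrated with a minimal sketch of a crawler that follows plain links only and never submits forms. The site maps below are hypothetical, and the `/products/+all` URL is an assumed name for the "Show All Products" page, not an existing Launchpad URL.

```python
from collections import deque


def reachable(site, start="/"):
    """Return every URL reachable from `start` by following plain links only."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for link in site.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Hypothetical site maps: each URL maps to the plain links found on that page.
# A search form contributes no links, so /products/ is a dead end without
# a "Show All Products" link.
site_without_link = {
    "/": ["/products/"],
    "/products/": [],
}
site_with_link = {
    "/": ["/products/"],
    "/products/": ["/products/+all"],  # assumed URL for the "show all" page
    "/products/+all": ["/products/alpha", "/products/beta"],
}

print(sorted(reachable(site_without_link)))
print(sorted(reachable(site_with_link)))
```

In the first map the crawl stops at the search page. In the second, the single extra link is enough for the individual product pages to be discovered.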

User Interface Requirements

The following areas are bottlenecks:

  1. the /products/ search page. It is recommended that we implement a "show all products" link which does exactly that.
  2. the /projects/ page. It is recommended that we implement a "show all projects" link which takes the viewer to a page listing all projects with links to the individual project pages.
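A "show all" page of this kind could be sketched as follows. This is an illustration only: the `render_show_all` helper is hypothetical, and it assumes that an individual product page lives at `/products/<name>`, which this spec does not state.

```python
from html import escape
from urllib.parse import quote


def render_show_all(names, base="/products/"):
    """Build an HTML list with one plain, crawlable link per name."""
    items = []
    for name in sorted(names):
        # Assumed URL layout: base path followed by the percent-encoded name.
        href = base + quote(name, safe="")
        items.append(f'<li><a href="{escape(href)}">{escape(name)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(render_show_all(["beta", "alpha"]))
```

The same helper would serve the projects page by passing `base="/projects/"`, again assuming that URL layout.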

Outstanding Issues

  • should these "show all" pages be one-long-page, or should they be batched? It's not likely that a human will have any use for a batched interface of all products in any event, so it may be best to do it all as a single page.
  • is there a performance issue with a web crawler hitting every single page in Launchpad? Are we ready for that load?
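If batching were chosen, the batches would still need to be chained by plain links so that a crawler can walk them all. The sketch below shows one way to do that; the `batch_pages` helper, the `/products/+all` base URL and the `start` query parameter are all assumptions made for illustration. It does not address the load question.

```python
def batch_pages(names, size, base="/products/+all"):
    """Split `names` into batches, each linking to the next via a plain URL."""
    if size < 1:
        raise ValueError("size must be at least 1")
    ordered = sorted(names)
    pages = []
    for start in range(0, len(ordered), size):
        end = start + size
        # The last batch has no "next" link, which ends the crawler's walk.
        next_url = f"{base}?start={end}" if end < len(ordered) else None
        pages.append({
            "url": f"{base}?start={start}",
            "items": ordered[start:end],
            "next": next_url,
        })
    return pages


for page in batch_pages(["a", "b", "c", "d", "e"], size=2):
    print(page["url"], page["items"], page["next"])
```

A single long page is the special case where `size` is at least the number of names, which yields one batch with no "next" link.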

UbuntuDownUnder/BOFs/LaunchpadGooglification (last edited 2008-08-06 16:39:12 by localhost)