PersonalizedPackageRatings

Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

Summary

Show personalized package ratings in gnome-app-install, computed with a recommender systems approach, instead of general popularity.

Recommender systems are a specific type of information filtering (IF) technique that attempt to present to the user information items (movies, music, books, news, web pages) the user is interested in. To do this the user's profile is compared to some reference characteristics. (quoted from Wikipedia)

Rationale

  • Amazon and Netflix use recommender systems to increase their profit. gnome-app-install can use a personalized package rating to make it easier for users to find nice software.

Potential benefits:

  • users might be more likely to use popularity-contest
  • it will be a bit easier for users to find packages

Use Cases

  • John is bored and wants to find nice new software. He uses gnome-app-install to sort all packages by popularity and gets suggestions for "programming packages", although he isn't interested in programming at all. If this spec were implemented, John would get a personalized list of suggestions for nice packages.
  • Sara is bored and searches for a nice game to install. She uses gnome-app-install and sorts the game section by popularity. If this spec were implemented, there would be a somewhat better chance that the game Sara tries appeals to her.
  • Peter is the maintainer of package X and wants to get a rough idea of its popularity. Personalized package ratings might increase the use of popularity-contest, giving him more data to go on.

Scope

Design

  • for users who have popularity-contest enabled, the popularity shown in gnome-app-install can be personalized.
  • no UI changes needed
  • only a very small change in gnome-app-install needed
  • the person who implements this spec needs access to the gathered popcon data
  • the per-user popularity vector can be calculated on the client or on the server
    • client-based calculation might have privacy issues, because the client gets direct access to all available vectors/users.
    • client-based: each client needs to download x * y bits of information, where x is the vector size (number of packages in the vector) and y is the number of users in the popcon database. Client-based calculation doesn't seem like a nice solution.
    • server-based calculation might cost too much CPU power. How much needs to be investigated.
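
To get a feeling for the x * y figure, the download size can be estimated with a few lines of Python. This is only a rough sketch: the function name is made up for illustration, the package count of 20,000 is a hypothetical number, and the 100,000 users are the example figure used elsewhere in this spec.

```python
def client_download_bytes(num_packages, num_users):
    """Size of all install vectors at one bit per (package, user) pair."""
    bits = num_packages * num_users  # x * y bits
    return bits / 8                  # 8 bits per byte


# Hypothetical figures, for illustration only.
size = client_download_bytes(num_packages=20_000, num_users=100_000)
print(f"{size / 1e6:.0f} MB")  # prints "250 MB"
```

Under these assumptions every client would have to fetch about 250 MB before any compression, which supports the doubt about the client-based variant.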

Possible Drawbacks

Privacy Issues

  • privacy issues are the same as for popularity-contest since no other information is necessary.
  • the person who implements this spec needs access to the gathered popcon data

Computational cost / scaling issues

  • The cost can probably be reduced by doing something smart.
  • maybe make the vectors smaller, for example by removing ubuntu-desktop and kubuntu-desktop, keeping only GUI packages, and/or using PCA (principal component analysis).
    • For example, if PCA is used to reduce the user vectors to 500 items, finding the closest neighbours for 100,000 users means considering about 6 MB of uncompressed data. Once the closest neighbours are found, the entire vectors can be brought in again.
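
The 6 MB figure can be checked with a short calculation. It appears to assume one bit per reduced item. That is an interpretation, not something the spec states, and since PCA normally produces real-valued components, the sketch below also shows the size with 32-bit values. The function name is hypothetical.

```python
def reduced_data_megabytes(num_users, components, bits_per_component):
    """Uncompressed size of all reduced user vectors, in megabytes."""
    total_bits = num_users * components * bits_per_component
    return total_bits / 8 / 1_000_000


one_bit = reduced_data_megabytes(100_000, 500, 1)   # 6.25
float32 = reduced_data_megabytes(100_000, 500, 32)  # 200.0
print(one_bit, float32)
```

At one bit per item the data is about 6 MB, and with 32-bit components it grows to about 200 MB, so the storage format of the reduced vectors matters for the server-side cost.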

Implementation

  • multiple techniques are available. It's probably best to implement a few ideas to discover which one works best for this dataset.

A possible straightforward approach

  • represent the packages a user has installed as a vector (1 if the user has installed the package, otherwise 0).
  • find the closest vectors, i.e. those within a specific range under some metric (for example Euclidean distance or the cosine of the angle between vectors).
  • sum the vectors just found and divide the result by the number of vectors in the summation. The resulting vector contains numbers which represent the personalized popularity of each package.
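
The steps above can be sketched as follows. This is a minimal illustration, not a tuned implementation: cosine similarity is picked as the metric, and the function names, the 0.5 threshold and the four-package sample data are assumptions made for the example.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length 0/1 install vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no packages is similar to nobody
    return dot / (norm_a * norm_b)


def personalized_popularity(user, others, min_similarity=0.5):
    """Average the install vectors of all users close enough to `user`."""
    neighbours = [v for v in others
                  if cosine_similarity(user, v) >= min_similarity]
    if not neighbours:
        return [0.0] * len(user)
    # Sum per package, then divide by the number of vectors summed.
    return [sum(column) / len(neighbours) for column in zip(*neighbours)]


# Hypothetical sample data: one column per package.
packages = ["gcc", "gimp", "inkscape", "supertux"]
user = [0, 1, 1, 0]
others = [
    [0, 1, 1, 1],  # similar user who also has supertux
    [0, 1, 1, 0],  # user with an identical package set
    [1, 0, 0, 0],  # programmer, nothing in common
]
scores = personalized_popularity(user, others)
print(dict(zip(packages, scores)))
```

Here the third user (only gcc installed) is not similar to the target user and is left out, so gcc scores 0.0, while supertux, installed by one of the two neighbours, scores 0.5.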

Outstanding Issues

BoF agenda and discussion

[SebastianHeinlein] Perhaps debtags could be a good database. Enrico Zini works on this:

Enrico:

  • General info about debtags: http://www.enricozini.org/2007/paper-debconf7/index.html
  • The debtags svn repo is at svn://svn.debian.org/debtags
  • Various popcon code is at bzr branch http://people.debian.org/~enrico/2007-01/popcon/
  • My blog entries with pointers to more can be found at: http://www.enricozini.org/tags/debtags.html


CategorySpec

PersonalizedPackageRatings (last edited 2008-08-06 16:22:17 by localhost)