Recommendation Systems

(December 10, 2016) Update: I am no longer actively working on recommendation systems, except in the context of visualization recommendation. Please head on over to the website to find out more. This page is no longer actively maintained.

Needletail

Large-Scale Recommendation Systems

Recommendation systems are ubiquitous in e-commerce and internet applications. Most recommender systems, however, have been tuned to achieve low error rates on straightforward prediction tasks (predict the rating for a movie, predict whether an item will be purchased, etc.), as in the Netflix Prize. In many domains, though, recommendation is a multidimensional problem with several constraints.

We have been looking at some of these problems from the perspective of course recommendations, for the CourseRank project at Stanford. (CourseRank was launched as a startup company in 2010, and then acquired by Chegg. The CourseRank technology is being used at over 500 universities in the United States.) Course recommendations pose an especially tricky problem, because there are several dimensions involved.

First, there is the problem of requirements. Every university program stipulates a set of requirements that a student must satisfy in order to graduate. A requirement could read like:

take 2 courses from {a, b, c, d}, but b and c together don't count.
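As a rough illustration, the example requirement above could be checked with a few lines of Python. This is only a sketch under stated assumptions: the function name `requirement_satisfied` and its parameters are hypothetical, and "b and c together don't count" is read here as "b and c taken together earn a single credit toward the requirement". It is not the representation used in CourseRank.

```python
def requirement_satisfied(taken, options=("a", "b", "c", "d"),
                          needed=2, exclusive_pairs=(("b", "c"),)):
    """Check a 'take `needed` courses from `options`' requirement.

    Assumption: each pair in `exclusive_pairs` earns only one credit
    when both of its courses are taken. Pairs are assumed not to
    share courses with one another.
    """
    counted = set(taken) & set(options)
    credit = len(counted)
    for first, second in exclusive_pairs:
        if first in counted and second in counted:
            credit -= 1  # the pair counts once, not twice
    return credit >= needed


# Example usage with the requirement from the text.
print(requirement_satisfied({"a", "b"}))       # a and b both count
print(requirement_satisfied({"b", "c"}))       # b and c together count once
print(requirement_satisfied({"b", "c", "d"}))  # the pair plus d
```

Courses outside the option set (for example an unrelated course `x`) are simply ignored by the intersection, so a full transcript can be passed in directly.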

Second, we looked at the problem of prerequisites. For example, the course linear algebra needs to be taken before calculus. When we make recommendations, we need to make sure that we recommend a package of items such that the prerequisites of every item are present in the package itself. In general, we could have a graph of prerequisites, with OR prerequisites (take either a or b to be eligible to take c) as well as AND prerequisites.
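One way to picture the AND/OR prerequisite structure described above is the following sketch. It assumes a hypothetical encoding in which each course maps to a list of groups: every group must be satisfied (AND), and a group is satisfied by any one of its courses (OR). The name `missing_prerequisites` and the sample data are illustrative assumptions, not taken from the papers listed below.

```python
def missing_prerequisites(package, prereqs):
    """Return, for each course in `package`, the prerequisite groups
    that no course in the package satisfies.

    `prereqs` maps a course to a list of groups (hypothetical encoding):
    all groups must be met (AND); any course in a group meets it (OR).
    An empty result means the package contains its own prerequisites.
    """
    package = set(package)
    missing = {}
    for course in package:
        unmet = [set(group) for group in prereqs.get(course, [])
                 if not (set(group) & package)]
        if unmet:
            missing[course] = unmet
    return missing


# Hypothetical prerequisite graph:
#   c requires (a OR b); d requires (a) AND (c).
prereqs = {"c": [{"a", "b"}], "d": [{"a"}, {"c"}]}

print(missing_prerequisites({"a", "c"}, prereqs))  # nothing missing
print(missing_prerequisites({"a", "d"}, prereqs))  # d still needs c
```

A recommender could use such a check to reject or repair candidate packages, for example by adding a course from each unmet group and checking again.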

Third, people have looked at sequence mining with the aim of generating rules (e.g., a purchase of Diapers implies a purchase of Beer), but no one has used these ideas to form an aggregated recommendation (again, highly relevant for courses). We use probabilities and approximations to solve the problem of temporality in recommendations.
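To make the idea of aggregating precedence information concrete, here is a minimal sketch. It is not the algorithm from the papers listed below: the function names, the toy histories, and the noisy-OR style combination (which treats the evidence from each completed course as independent) are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def precedence_counts(histories):
    """Count how often each course is taken, and how often one course
    precedes another, across ordered course histories."""
    taken = Counter()
    before = Counter()
    for history in histories:
        seen = list(dict.fromkeys(history))  # drop repeats, keep order
        taken.update(seen)
        for earlier, later in combinations(seen, 2):
            before[(earlier, later)] += 1
    return taken, before


def score_candidates(user_history, taken, before):
    """Score each course the user has not taken.

    For each completed course x, before[(x, c)] / taken[x] estimates the
    probability that c follows x. The estimates are combined with a
    noisy-OR, an approximation that assumes independence.
    """
    done = set(user_history)
    scores = {}
    for candidate in taken:
        if candidate in done:
            continue
        miss = 1.0
        for course in done:
            if taken[course]:
                miss *= 1.0 - before[(course, candidate)] / taken[course]
        scores[candidate] = 1.0 - miss
    return scores


# Toy data: three ordered course histories (hypothetical).
histories = [["a", "b", "c"], ["a", "c"], ["b", "a"]]
taken, before = precedence_counts(histories)
scores = score_candidates(["a"], taken, before)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy data, c follows a in two of the three histories containing a, while b follows a in only one, so c ranks above b for a student who has taken only a.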

Lastly, in a Viewpoint article in CACM, we suggested how new companies can significantly benefit from designing a combined back-end infrastructure to support search, recommendations and advertising, and how the three technologies are rapidly converging, differing, if at all, in only a small number of aspects.

  1. PAPER Information Seeking: Convergence of Search, Recommendations and Advertising.
    Hector Garcia-Molina, Georgia Koutrika and Aditya Parameswaran. Communications of the ACM, Viewpoint Article. November 2011
  2. PAPER Recommendation Systems with Complex Constraints: A CourseRank Perspective.
    Aditya Parameswaran, Petros Venetis and Hector Garcia-Molina. ACM Transactions on Information Systems, Volume 29(4). November 2011
  3. PAPER Evaluating, Combining and Generalizing Recommendations with Prerequisites.
    Aditya Parameswaran, Hector Garcia-Molina and Jeffrey D. Ullman. 19th International Conf. on Information and Knowledge Management (CIKM), Toronto, Canada. October 2010
  4. PAPER Recsplorer: Recommendation Algorithms Based on Precedence Mining.
    Aditya Parameswaran, Georgia Koutrika, Benjamin Berkovitz and Hector Garcia-Molina. SIGMOD International Conf. on Management of Data, Indianapolis, USA. June 2010
  5. PAPER Social Sites Research Through CourseRank.
    Benjamin Berkovitz, Filip Kaliszan, Georgia Koutrika, Henry Liou, Aditya Parameswaran, Petros Venetis, Zahra Mohammadi Zadeh and Hector Garcia-Molina. SIGMOD Record, Volume XXX. December 2009
  6. PAPER Recommendations with Prerequisites (Short Paper).
    Aditya Parameswaran and Hector Garcia-Molina. 3rd ACM Conference on Recommender Systems, New York, USA. October 2009