Competitive Collaborative Learning
The quality of the products or resources may vary over time.
Some of the users in the system may be dishonest, manipulating their actions in a Byzantine manner to achieve other goals.
Such algorithms are a fundamental primitive required by applications such as reputation systems in e-commerce, personalized web search, locating resources in peer-to-peer networks, or spam filtering. Note that in all of these applications, it is essential to deal with resources whose quality changes over time (e.g. a seller on eBay may perform perfectly in some transactions but may delay or withhold delivery in others) and to deal with the presence of dishonest agents in the system (again using the eBay example, a seller may cooperate with a ``shill'' whose objective is to boost that seller's reputation).
Our problem can be viewed as unification or generalization of a number of different research frameworks, including -- Online learning theory, specifically the multi-armed bandit problem -- Collaborative filtering and recommendation
We provide algorithms that enable honest users with similar taste to dynamically adapt their choices, so that in the aggregate, their performance nearly matches that of the single best static action.
The main result in this paper is a new algorithm for trust and collaborative filtering, whose convergence time (defined, roughly, as the number of steps required to become constant-competitive) is polylogarithmic in the problem size. Previous methods suffered from a linear convergence time.
Joint work with R. Kleinberg