PDF | PostScript | doi:10.1613/jair.3543
Distributions over rankings are used to model data in a multitude of real world settings such as preference analysis and political elections. Modeling such distributions presents several computational challenges, however, due to the factorial size of the set of rankings over an item set. Some of these challenges are quite familiar to the artificial intelligence community, such as how to compactly represent a distribution over a combinatorially large space, and how to efficiently perform probabilistic inference with these representations. With respect to ranking, however, there is the additional challenge of what we refer to as human task complexity � users are rarely willing to provide a full ranking over a long list of candidates, instead often preferring to provide partial ranking information.
Simultaneously addressing all of these challenges � i.e., designing a compactly representable model which is amenable to efficient inference and can be learned using partial ranking data � is a difficult task, but is necessary if we would like to scale to problems with nontrivial size. In this paper, we show that the recently proposed riffled independence assumptions cleanly and efficiently address each of the above challenges. In particular, we establish a tight mathematical connection between the concepts of riffled independence and of partial rankings. This correspondence not only allows us to then develop efficient and exact algorithms for performing inference tasks using riffled independence based represen- tations with partial rankings, but somewhat surprisingly, also shows that efficient inference is not possible for riffle independent models (in a certain sense) with observations which do not take the form of partial rankings. Finally, using our inference algorithm, we introduce the first method for learning riffled independence based models from partially ranked data.