Abstract
Enhancing diversity in ranking on graphs has been identified as an important retrieval and mining task. Nevertheless, many existing diversified ranking algorithms either do not scale to large graphs because of their time or memory requirements, or lack an intuitive and reasonable diversified ranking measure. In this paper, we propose a new diversified ranking measure on large graphs, which captures both relevance and diversity, and formulate the diversified ranking problem as a submodular set function maximization problem. Based on the submodularity of the proposed measure, we develop an efficient greedy algorithm with linear time and space complexity w.r.t. the size of the graph to achieve near-optimal diversified ranking. In addition, we present a generalized diversified ranking measure and give a near-optimal randomized greedy algorithm with linear time and space complexity for optimizing it. We evaluate the proposed methods through extensive experiments on five real data sets. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithms.
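To illustrate the general idea of greedy submodular maximization mentioned in the abstract, here is a minimal sketch in Python. It is not the measure or the algorithm proposed in the paper: the objective (`coverage`, the number of nodes that are selected or adjacent to a selected node) is a stand-in chosen only because it is a simple monotone submodular function, and the names `coverage`, `greedy_diversified_top_k`, and `example_graph` are hypothetical. The loop is also the naive greedy, which rescans all candidates in every round, so it does not achieve the linear time bound claimed in the abstract.

```python
from typing import Dict, List, Set

# Assumption: an undirected graph stored as adjacency sets keyed by node name.
Graph = Dict[str, Set[str]]


def coverage(graph: Graph, selected: List[str]) -> int:
    """Stand-in objective: nodes that are selected or adjacent to a selected node."""
    covered: Set[str] = set()
    for u in selected:
        covered.add(u)
        covered |= graph.get(u, set())
    return len(covered)


def greedy_diversified_top_k(graph: Graph, k: int) -> List[str]:
    """Pick up to k nodes, each time adding the node with the largest marginal gain."""
    selected: List[str] = []
    covered: Set[str] = set()
    candidates = set(graph)
    for _ in range(min(k, len(graph))):
        best, best_gain = None, -1
        # Sorted iteration makes tie-breaking deterministic (alphabetical).
        for u in sorted(candidates):
            gain = len(({u} | graph[u]) - covered)
            if gain > best_gain:
                best, best_gain = u, gain
        selected.append(best)
        covered |= {best} | graph[best]
        candidates.remove(best)
    return selected


# Hypothetical toy graph: a dense cluster around "a" and a separate edge e-f.
example_graph: Graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"f"},
    "f": {"e"},
}

top2 = greedy_diversified_top_k(example_graph, 2)  # → ['a', 'e']
```

On this toy graph a ranking by degree alone would follow `a` with `b` or `c`, both of which sit in the neighborhood that `a` already covers. The greedy step instead picks `e`, because it is the first candidate whose marginal gain is nonzero. For monotone submodular objectives under a cardinality constraint, this style of greedy selection is known to reach at least a 1 - 1/e fraction of the optimal value, which is the usual sense of "near-optimal" in this setting.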
Original language | English |
---|---|
Article number | 6276206 |
Pages (from-to) | 2133-2146 |
Number of pages | 14 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 25 |
Issue number | 9 |
Publication status | Published - 2013 |
Externally published | Yes |
Keywords
- Diversified ranking
- Flajolet-Martin sketch
- Graph algorithms
- Scalability
- Submodular function