EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval

Abstract: Graph-based ranking models have been widely applied in information retrieval. In this paper, we focus on a well-known graph-based model: the Ranking on Data Manifold model, or Manifold Ranking (MR). In particular, it has been successfully applied to content-based image retrieval because of its outstanding ability to discover the underlying geometrical structure of a given image database. However, manifold ranking is computationally very expensive, which significantly limits its applicability to large databases, especially when the queries are not in the database (new samples). We propose a novel scalable graph-based ranking model called Efficient Manifold Ranking (EMR), which addresses the shortcomings of MR from two main perspectives: scalable graph construction and efficient ranking computation. Specifically, we build an anchor graph on the database instead of a traditional k-nearest neighbor graph, and design a new form of adjacency matrix to speed up the ranking. An approximate method is adopted for efficient out-of-sample retrieval. Experimental results on large-scale image databases demonstrate that EMR is a promising method for real-world retrieval applications.
Graph-based ranking models have been studied in depth and widely applied in information retrieval. In this paper, we focus on applying a novel and efficient graph-based model to content-based image retrieval (CBIR), especially for out-of-sample retrieval on large-scale databases.
Traditional image retrieval systems, such as Google and Yahoo image search, are based on keyword search. In these systems, a user keyword (query) is matched against the context around an image, including the title, manual annotation, web document, etc. These systems do not use information from the images themselves, and they suffer from many problems, such as a shortage of text information and inconsistency between the meaning of the text and the image. Content-based image retrieval is a promising alternative for overcoming these difficulties. CBIR has drawn great attention in the past two decades [1]–[3]. Unlike traditional keyword search systems, CBIR systems use low-level features automatically extracted from images, including global features (e.g., color moment, edge histogram, LBP [4]) and local features (e.g., SIFT [5]). A great deal of research has gone into designing more informative low-level features to represent images, or better metrics (e.g., DPF [6]) to measure perceptual similarity, but their performance is restricted by many conditions and is sensitive to the data. Relevance feedback [7] is a useful tool for interactive CBIR: the user’s high-level perception is captured by weights that are dynamically updated based on the user’s feedback.
Most traditional methods focus heavily on the data features but ignore the underlying structure information, which is of great importance for semantic discovery, especially when the label information is unknown. Many databases have an underlying cluster or manifold structure. Under such circumstances, the assumption of label consistency is reasonable [8], [9]. It means that nearby data points, or points belonging to the same cluster or manifold, are very likely to share the same semantic label. This property is extremely important for exploring semantic relevance when the label information is unknown. In our opinion, a good CBIR system should consider both the images’ low-level features and the intrinsic structure of the image database.
Manifold Ranking (MR) [9], [10], a well-known graph-based ranking model, ranks data samples with respect to the intrinsic geometrical structure collectively revealed by a large number of data points. This is exactly in line with our consideration. MR has been applied in many settings and has shown excellent performance and feasibility on a variety of data types, such as text [11], image [12], [13], and video [14]. By taking the underlying structure into account, manifold ranking assigns each data sample a relative ranking score, instead of an absolute pairwise similarity as in traditional approaches. The score is treated as a similarity metric defined on the manifold, which is more meaningful for capturing the degree of semantic relevance. He et al. [12] first applied MR to CBIR, and significantly improved image retrieval performance compared with state-of-the-art algorithms.
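To make the idea of a ranking score defined on the graph concrete, the following minimal Python sketch runs manifold ranking on a toy 2-D data set. It is an illustration only. It assumes the standard iterative formulation from the MR literature, r ← αSr + (1 − α)y with S = D^(-1/2) W D^(-1/2), which this section does not state explicitly. The function name, the Gaussian edge weights, and the parameters k, sigma, alpha, and iters are hypothetical choices made for the example.

```python
import math


def manifold_ranking(points, query_index, k=2, sigma=1.0, alpha=0.99, iters=200):
    """Toy manifold ranking on a symmetrized k-NN graph (illustrative sketch)."""
    n = len(points)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Build a symmetric k-NN adjacency matrix W with Gaussian weights
    # (assumed weighting scheme; no self-loops).
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: sqdist(points[i], points[j]))[:k]
        for j in nearest:
            w = math.exp(-sqdist(points[i], points[j]) / (2 * sigma ** 2))
            W[i][j] = w
            W[j][i] = w

    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

    # y marks the query; r is updated by r <- alpha * S r + (1 - alpha) * y.
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    r = y[:]
    for _ in range(iters):
        r = [alpha * sum(S[i][j] * r[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return r


# Two parallel rows of points; the query is the first point of the bottom row.
points = [(x, 0.0) for x in range(5)] + [(x, 3.0) for x in range(5)]
scores = manifold_ranking(points, query_index=0)
ranking = sorted(range(len(points)), key=lambda i: -scores[i])
print(ranking)
```

In this toy data, the point (4, 0) is farther from the query (0, 0) than the point (0, 3) in Euclidean distance, yet it receives the higher score because it is connected to the query through the graph, whereas the top row forms a separate component. Note that the update multiplies by an n × n matrix, which is the cost that becomes prohibitive for large databases.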
However, manifold ranking has drawbacks of its own when handling large-scale databases: it is computationally expensive in both the graph construction and the ranking computation stages. In particular, it is unknown how to handle an out-of-sample query (a new sample) efficiently under the existing framework, and recomputing the model for every new query is unacceptable. As a result, the original manifold ranking is inadequate for a real-world CBIR system, in which the user-provided query is always out-of-sample.
In this paper, we extend the original manifold ranking and propose a novel framework named Efficient Manifold Ranking (EMR). We try to address the shortcomings of manifold ranking from two perspectives: the first is scalable graph construction, and the second is efficient computation, especially for out-of-sample retrieval. Specifically, we build an anchor graph on the database instead of the traditional k-nearest neighbor graph, and design a new form of adjacency matrix to speed up the ranking computation. The model has two separate stages: an offline stage for building (or learning) the ranking model and an online stage for handling a new query. With EMR, we can handle a database of 1 million images and perform online retrieval in a short time. To the best of our knowledge, no previous manifold ranking based algorithm has run out-of-sample retrieval on a database at this scale.
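One way to see why an anchor-based adjacency matrix can help is sketched below. The construction is not given in this section, so the sketch makes its own assumptions: each point is described by normalized Gaussian weights over its s nearest anchors (the rows of Z), the adjacency matrix is taken to be W = Z Zᵀ and is never formed explicitly, and the ranking reuses the iterative update of standard manifold ranking rather than any closed form. The anchors are hand-picked here, and all function names and parameter values are hypothetical.

```python
import math


def anchor_weights(point, anchors, s=2, sigma=1.0):
    """Normalized Gaussian weights over the s nearest anchors (assumed scheme)."""
    d2 = [sum((p - a) ** 2 for p, a in zip(point, anchor)) for anchor in anchors]
    nearest = sorted(range(len(anchors)), key=lambda k: d2[k])[:s]
    z = [0.0] * len(anchors)
    for k in nearest:
        z[k] = math.exp(-d2[k] / (2 * sigma ** 2))
    total = sum(z)
    return [v / total for v in z]


def anchor_graph_ranking(points, anchors, query_index, alpha=0.99, iters=300):
    """Iterative ranking with an implicit low-rank adjacency W = Z Z^T."""
    n, d = len(points), len(anchors)
    Z = [anchor_weights(p, anchors) for p in points]  # n rows, d columns

    # Degrees of W without forming it: D_ii = z_i . (sum_j z_j).
    col_sum = [sum(Z[i][k] for i in range(n)) for k in range(d)]
    deg = [sum(Z[i][k] * col_sum[k] for k in range(d)) for i in range(n)]

    # H = D^(-1/2) Z, so that S = H H^T.
    H = [[Z[i][k] / math.sqrt(deg[i]) for k in range(d)] for i in range(n)]

    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    r = y[:]
    for _ in range(iters):
        # S r = H (H^T r): two passes of cost O(n * d) instead of O(n^2).
        t = [sum(H[i][k] * r[i] for i in range(n)) for k in range(d)]
        r = [alpha * sum(H[i][k] * t[k] for k in range(d)) + (1 - alpha) * y[i]
             for i in range(n)]
    return r


# Same toy layout as two parallel rows, with two hand-picked anchors per row.
db_points = [(x, 0.0) for x in range(5)] + [(x, 3.0) for x in range(5)]
db_anchors = [(1.0, 0.0), (3.0, 0.0), (1.0, 3.0), (3.0, 3.0)]
anchor_scores = anchor_graph_ranking(db_points, db_anchors, query_index=0)
print(sorted(range(len(db_points)), key=lambda i: -anchor_scores[i]))
```

The design point is that every per-iteration quantity has size n × d or d, where d is the number of anchors, so the work grows linearly with the database size when d is fixed. Anchor selection, the exact weighting function, and the out-of-sample procedure are left open here because this section does not specify them.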