Collaborative Filtering Based Recommender

A Survey of Collaborative Filtering Based Recommender Systems for Mobile Internet Applications


Abstract—With the rapid development and application of the
mobile Internet, huge amounts of user data are generated and collected
every day. How to take full advantage of these ubiquitous
data is becoming an essential aspect of a recommender system.
Collaborative filtering (CF) has been widely studied and utilized
to predict the interests of mobile users and to make proper
recommendations. In this paper, we first propose a framework for
CF recommender systems based on various user data including
user ratings and user behaviors. Key features of these two kinds
of data are discussed. Moreover, several typical CF algorithms
are classified as memory-based approaches and model-based
approaches and compared. Two case studies are presented in
an effort to validate the proposed framework.
Index Terms—Mobile Internet, Recommender System, Collaborative
Filtering.

INTRODUCTION
Along with the rapid development of the mobile Internet and
cloud computing, massive amounts of data are produced every
day by both people and machines. Our society has already
entered the era of Big Data [1]. Thanks to the various smart
devices and mobile applications, Internet users can acquire
all sorts of information about education, shopping, social
activity, etc. [2] [3] [4] [5]. However, as the volume of data
increases, individuals have to face the problem of excessive
information, which makes it more difficult to make the right
decisions. This phenomenon is known as information overload
[6]. Moreover, limited by the input capabilities of mobile devices,
users are usually unwilling to type many words to describe
what they want. Recommender systems can alleviate these
problems by effectively finding users’ potential requirements
and selecting desirable items from a huge amount of candidate
information. Recommender systems are usually classified into
two categories, i.e., content-based and collaborative filtering
(CF) [7].
A content-based recommender system utilizes the contents of
items and finds the similarities among them. After analyzing
a sufficient number of items that a user has already favored,
a user interest profile is established. Then the
recommender system can search the database and choose
proper items according to this profile. The difficulty of these
algorithms lies in how to find user preferences based on the
contents of items. Many approaches have been developed to
solve this problem in the areas of data mining or machine
learning. For example, in order to recommend books
to a specific reader, a recommender system first obtains all
the books this reader has already read and then analyzes their
contents. Keywords can be extracted from the text with the
help of text mining methods, such as the well-known TF-IDF
[8]. After integrating all the keywords with their respective
weights, a book can be represented by a multi-dimensional
vector. Specific clustering algorithms can be implemented to
find the centers of these vectors which represent the interests
of this reader.
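The content-based pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the method of any cited work: the toy book descriptions, the whitespace tokenizer, and the function names (`tf_idf_vectors`, `centroid`, `cosine`) are all assumptions, and a single centroid stands in for the clustering step.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

def centroid(vectors):
    """Average sparse vectors (a one-cluster stand-in for clustering)."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy catalogue: the reader has already read books 0 and 1.
books = [
    "space travel rockets and planets",
    "planets stars and space exploration",
    "baking bread and pastry recipes",
    "rockets orbit mechanics and space stations",
]
vectors = tf_idf_vectors(books)
profile = centroid([vectors[0], vectors[1]])  # the reader's interest profile
unread = [2, 3]
ranked = sorted(unread, key=lambda i: cosine(profile, vectors[i]), reverse=True)
```

In this toy data the word "and" occurs in every description, so its IDF weight is zero and it contributes nothing to the similarity; the unread books are ranked only by the informative terms they share with the profile.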
On the other hand, collaborative filtering (CF) has become
one of the most influential recommendation algorithms [9].
Unlike the content-based approaches, CF only relies on the
item ratings from each user. It is based on the assumption that
users who have rated the same items with similar ratings are
likely to have similar preferences. CF is specifically designed
to provide recommendations when detailed information about
the users and items is inaccessible. Furthermore, it successfully
mitigates the problem of over-specialization [10], which is
quite common in content-based systems. Over-specialization
is the phenomenon in which recommended items are always much
the same and the diversity of recommendations is neglected.
As CF makes recommendations according to the neighborhood
(people with similar preferences), an item one user has
consumed may be something new to that user's neighbors. These
features are particularly attractive and have led to CF algorithms
being extensively employed in recommender systems.
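To make the neighborhood idea concrete, the following is a minimal sketch of one common user-based formulation: Pearson correlation over co-rated items, followed by a mean-centered, similarity-weighted prediction. The ratings, user names, and function names are hypothetical, and this particular similarity measure and prediction rule are assumptions chosen for illustration rather than the specific algorithms evaluated in this paper.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 5, "B": 5, "C": 1, "D": 5},
    "carol": {"A": 1, "B": 2, "C": 5, "D": 1},
}

def pearson(u, v):
    """Pearson correlation computed over the items both users have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    dev_u = [ratings[u][i] - mean_u for i in common]
    dev_v = [ratings[v][i] - mean_v for i in common]
    num = sum(a * b for a, b in zip(dev_u, dev_v))
    den = math.sqrt(sum(a * a for a in dev_u)) * math.sqrt(sum(b * b for b in dev_v))
    return num / den if den else 0.0

def predict(user, item):
    """Mean-centered, similarity-weighted average of the neighbors' ratings."""
    mean_user = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue  # only users who rated the item can act as neighbors
        sim = pearson(user, other)
        mean_other = sum(their.values()) / len(their)
        num += sim * (their[item] - mean_other)
        den += abs(sim)
    # Fall back to the user's own mean when no informative neighbor exists.
    return mean_user + num / den if den else mean_user

predicted = predict("alice", "D")
```

Here alice has never rated item D. Because her ratings agree closely with bob's (who liked D) and run opposite to carol's (who disliked it), the prediction lands above her own average rating, without using any information about what D actually is.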
However, to the best of our knowledge, very few studies
have revealed the common features of the various CF algorithms
for mobile Internet applications. In addition, most of
the existing surveys merely introduce the principles of CF
algorithms, ignoring the importance of case studies, which can
demonstrate the performance of typical algorithms concretely
and in detail. Therefore, this paper focuses on collaborative
filtering based recommender systems for mobile Internet
applications. In particular, the main contributions of this paper are
highlighted as follows:
• We introduce a general framework for CF recommender
systems. This framework helps developers of recommender
systems utilize the gathered data and generate proper
recommendations. The features of data collected from both
user behaviors and user ratings are also discussed and
compared.
• CF algorithms are classified. The main procedures of CF are
briefly summarized and introduced.
• Two case studies are presented to validate the proposed
framework. Evaluations of representative CF algorithms
are conducted on real-world datasets with detailed
analysis and comparison.
The rest of this paper is organized as follows. Section II
presents the framework of CF. Both the classification and the main
procedures of typical CF algorithms are introduced in Section
III. In Section IV, we conduct two case studies based on real-world
datasets in order to analyze the performance of CF
algorithms. Finally, Section V concludes this paper.
