This page is part of the Proceedings of Wikimania 2007 (Index of presentations)

Mining the Bipolar Orientation of Web Reviews

Creators: ChienChin Chen (Institute of Information Science, Academia Sinica), MengChang Chen (Institute of Information Science, Academia Sinica)
Track: Technical Infrastructure
License: GNU Free Documentation License
Abstract
Web 2.0 encourages Web users to contribute to the construction of knowledge on specific topics. For example, the online encyclopedia Wikipedia consists of thousands of valuable articles contributed by numerous knowledge providers. E-commerce websites such as Amazon allow users to post product reviews, which help subsequent users judge product quality. Knowledge providers usually come from different cultures and have different background knowledge, so the reviews they write may express different opinions. As a result, the aggregated knowledge may contain perspectives of different orientations. Moreover, popular topics or products receive so many reviews that the aggregated knowledge can become too large for users to comprehend. Hence, users would benefit greatly if the perspectives in the reviews were well organized.

Reviews on the Web are represented by a set of Web documents. Mining the perspectives embedded in a set of documents is a popular text mining problem. Mining methods such as the k-means clustering algorithm or latent semantic indexing partition the documents into content-coherent clusters, each of which represents a perspective of the documents. In Web 2.0, however, reviews left by knowledge providers may oppose one another because of cultural differences. Previous mining methods lack a mechanism for identifying opposition in text; we provide a method to discover this bipolar orientation. In this work, we use the statistical technique of principal components analysis (PCA) to find the bipolar orientation embedded in the text. PCA constructs the covariance (or correlation coefficient) matrix of the reviews and treats the eigenvectors of the matrix as meaningful perspectives. The bipolar orientation of each perspective can then be extracted by analyzing the composition of its eigenvector. The results can help users comprehend the aggregated online knowledge.
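
The PCA step described above can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: reviews are represented as bag-of-words term counts, the correlation matrix is computed between reviews, only the dominant eigenvector is examined, and the sign of each review's loading in that eigenvector is taken as its pole. Power iteration stands in for a full eigen-decomposition so that the example needs only the standard library. All function names and the toy reviews are hypothetical.

```python
import math

def term_counts(reviews):
    """Build a term-by-review count matrix (rows: terms, columns: reviews)."""
    tokens = [r.lower().split() for r in reviews]
    vocab = sorted({w for doc in tokens for w in doc})
    return [[doc.count(w) for doc in tokens] for w in vocab]

def correlation_matrix(rows):
    """Pearson correlation between reviews; each term is one observation.

    Assumes every review has non-constant term counts (non-zero variance).
    """
    n_terms, n_docs = len(rows), len(rows[0])
    centered = []
    for j in range(n_docs):
        col = [row[j] for row in rows]
        mean = sum(col) / n_terms
        centered.append([x - mean for x in col])
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(n_docs)]
        for i in range(n_docs)
    ]

def top_eigenvector(matrix, iterations=200):
    """Approximate the dominant eigenvector by power iteration (assumed stand-in)."""
    n = len(matrix)
    vec = [1.0 / (i + 1) for i in range(n)]  # arbitrary non-uniform starting vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

def bipolar_groups(reviews):
    """Split review indices by the sign of their loading in the top eigenvector."""
    vec = top_eigenvector(correlation_matrix(term_counts(reviews)))
    positive = [j for j, v in enumerate(vec) if v > 0]
    negative = [j for j, v in enumerate(vec) if v < 0]
    return positive, negative, vec

# Hypothetical toy reviews: two favourable, two unfavourable, one shared topic word.
reviews = [
    "camera good great excellent",
    "camera good great fine",
    "camera bad poor awful",
    "camera bad poor weak",
]
positive, negative, loadings = bipolar_groups(reviews)
print(positive, negative)
```

On this toy input the two favourable reviews fall on one side of the eigenvector and the two unfavourable reviews on the other. Which side is labelled positive is arbitrary, because an eigenvector is determined only up to sign; real review collections would also need tokenization, term weighting, and inspection of more than one eigenvector.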


