Shuyi Wang

Work place: Business School, Nankai University, Tianjin, China

E-mail: wshuyi@mail.nankai.edu.cn

Research Interests: Bioinformatics, Computer systems and computational processes, Computational Learning Theory

Biography

Shuyi Wang was born in Tianjin, China, in 1982. He received his Master's degree in Computer Science from Tianjin University, Tianjin, China, in 2007. He has been a Ph.D. candidate at the Business School of Nankai University since July 2008. His main publications include "Storing and indexing RDF data in a column-oriented DBMS" (in Proc. of the 2nd International Workshop on Database Technology and Applications, 2010), "SVM-Based Models for Predicting WLAN Traffic" (in Proc. of the IEEE 2006 International Conference on Communications, 2006), and "Throughput Analysis of IEEE 802.11-based Ad hoc Networks in Presence of Selfish Node Networks" (in Proc. of International Symposium on Information Technologies and Communications, 2006). His current research interests are competitive intelligence and information management. He has been an Assistant Professor at the School of Computer Science and Technology, Tianjin University, Tianjin, China, since January 2010. His current research interests are bioinformatics and machine learning. Mr. Wang is a student member of the Association for Computing Machinery (ACM) and the China Computer Federation (CCF).

Author Articles
CHex: An Efficient RDF Storage and Indexing Scheme for Column-Oriented Databases

By Xin Wang, Shuyi Wang, Pufeng Du, Zhiyong Feng

DOI: https://doi.org/10.5815/ijmecs.2011.03.08, Pub. Date: 8 Jun. 2011

As increasingly large RDF data sets are being published on the Web, efficient RDF data management has become an essential factor in realizing the Semantic Web vision. However, most existing RDF storage schemes, which are built on top of row-store relational databases, are constrained in terms of efficiency and scalability. Still, the growing popularity of the RDF format in real-world applications arguably calls for an effort to deal with these drawbacks. In this paper, we propose a novel RDF storage and indexing scheme, called CHex, which uses the triple nature of RDF as an asset to implement sextuple indexing for a column-oriented database system. Using binary association tables (BATs) in the column-oriented data model, RDF data is indexed in six possible ways, one for each possible ordering of the three RDF elements. The sextuple indexing scheme in a column-oriented database not only provides efficient single triple pattern lookups, but also allows fast merge-joins for any pair of triple patterns. To evaluate the performance of our approach, we generate large-scale data sets of up to 13 million triples, and devise benchmark queries that cover important RDF join patterns. The experimental results show that our approach outperforms the row-oriented database systems by up to an order of magnitude and is even competitive with the best state-of-the-art native RDF store.
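
The idea of sextuple indexing described in the abstract can be illustrated with a minimal sketch. This is not the CHex implementation: it uses plain sorted Python lists in place of binary association tables, and the function names (`build_sextuple_index`, `lookup`, `merge_join`) and the sample data are hypothetical, chosen only for illustration. It shows how keeping one sorted copy of the triples per ordering of subject (s), predicate (p), and object (o) lets any triple pattern be answered by a prefix search, and how two patterns that share a variable can then be combined with a merge-join over inputs that are already sorted.

```python
from bisect import bisect_left
from itertools import permutations

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
]

def build_sextuple_index(triples):
    """Map each of the six orderings ('spo', 'pos', ...) to a sorted list
    of triples rearranged into that ordering."""
    position = {"s": 0, "p": 1, "o": 2}
    index = {}
    for order in permutations("spo"):
        name = "".join(order)
        index[name] = sorted(
            tuple(t[position[c]] for c in order) for t in triples
        )
    return index

def lookup(index, order, prefix):
    """Return the rows of index[order] that start with `prefix`,
    locating the first candidate by binary search."""
    rows = index[order]
    start = bisect_left(rows, prefix)
    matches = []
    for row in rows[start:]:
        if row[:len(prefix)] != prefix:
            break
        matches.append(row)
    return matches

def merge_join(left, right):
    """Join two lists of tuples on their first field.
    Both inputs must already be sorted by that field."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            i_end, j_end = i, j
            while i_end < len(left) and left[i_end][0] == key:
                i_end += 1
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            # Emit every pairing of rows that share this key.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append((key, l_row[1:], r_row[1:]))
            i, j = i_end, j_end
    return out

index = build_sextuple_index(TRIPLES)

# Pattern (?x, knows, ?y) joined with (?x, worksAt, ?z) on ?x:
# the 'pso' ordering returns each pattern's rows sorted by subject.
knows = [row[1:] for row in lookup(index, "pso", ("knows",))]
works = [row[1:] for row in lookup(index, "pso", ("worksAt",))]
joined = merge_join(knows, works)
```

Because every ordering is available, the rows for each pattern can always be read in the sort order of the join variable, so no extra sorting step is needed before the merge. The cost in this sketch is six copies of the data.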

Other Articles