CCIndex: a Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries - Network and Parallel Computing
Conference Papers, Year: 2010

CCIndex: a Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries

Abstract

Massive-scale distributed databases such as Google's BigTable and Yahoo!'s PNUTS can be modeled as a Distributed Ordered Table (DOT), which partitions data into regions and supports range queries on the key. Multi-dimensional range queries on DOTs are a fundamental requirement; however, no existing scheme works well when three critical issues are considered together: high performance, low space overhead, and high reliability. This paper introduces the CCIndex scheme, short for Complemental Clustering Index, to address all three issues. CCIndex creates several Complemental Clustering Index Tables for performance, leverages region-to-server information to estimate result size, and supports incremental data recovery. The paper builds a prototype on Apache HBase. Theoretical analysis and micro-benchmarks show that CCIndex consumes 5.3% to 29.3% more space, has the same reliability, and achieves 11.4 times the range query throughput of a secondary index scheme. A synthetic application benchmark shows that CCIndex query throughput is 1.9 to 2.1 times that of MySQL Cluster.
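The general idea in the abstract can be illustrated with a minimal, in-memory Python sketch. It is not the paper's implementation and does not use HBase. It assumes that each index table stores the full row under a key of the form (indexed value, row id), so that a range query on that column becomes one ordered scan. All names here (`OrderedTable`, `CCIndexSketch`, `count`, `query`) are hypothetical. The exact `count` call merely stands in for the result-size estimate that the paper derives from region-to-server information, and updates and recovery are not modeled.

```python
from bisect import bisect_left, bisect_right


class OrderedTable:
    """Toy stand-in for one ordered table: keys kept sorted, range scans by key."""

    def __init__(self):
        self._keys = []   # sorted list of keys
        self._rows = {}   # key -> full row

    def put(self, key, row):
        if key not in self._rows:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._rows[key] = row

    def _span(self, lo, hi):
        return bisect_left(self._keys, lo), bisect_right(self._keys, hi)

    def count(self, lo, hi):
        # Assumption: an exact count replaces the paper's size estimate.
        i, j = self._span(lo, hi)
        return j - i

    def scan(self, lo, hi):
        """Yield full rows whose key lies in [lo, hi], in key order."""
        i, j = self._span(lo, hi)
        for key in self._keys[i:j]:
            yield self._rows[key]


class CCIndexSketch:
    """One clustering index table per indexed column (hypothetical layout)."""

    def __init__(self, indexed_columns):
        self.tables = {col: OrderedTable() for col in indexed_columns}

    def insert(self, row_id, row):
        full = dict(row, id=row_id)
        for col, table in self.tables.items():
            # The full row is stored in every index table, keyed by (value, id).
            table.put((row[col], row_id), full)

    def query(self, ranges):
        """ranges maps an indexed column to an inclusive (lo, hi) pair."""

        def bounds(col):
            lo, hi = ranges[col]
            # (lo,) sorts before every (lo, id); (hi, inf) sorts after every (hi, id).
            return (lo,), (hi, float("inf"))

        # Scan the table expected to return the fewest rows, then filter the rest.
        best = min(ranges, key=lambda c: self.tables[c].count(*bounds(c)))
        hits = self.tables[best].scan(*bounds(best))
        return [r for r in hits
                if all(lo <= r[c] <= hi for c, (lo, hi) in ranges.items())]
```

For a query with ranges on two columns, the sketch scans only the index table with the smaller candidate set and checks the other condition against the rows it reads, with no lookups back into a base table. The price is one full copy of each row per indexed column, which this toy version makes no attempt to reduce.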
Origin: Files produced by the author(s)

Dates and versions

hal-01054987, version 1 (11-08-2014)

Cite

Yongqiang Zou, Jia Liu, Shicai Wang, Li Zha, Zhiwei Xu. CCIndex: a Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries. IFIP International Conference on Network and Parallel Computing (NPC), Sep 2010, Zhengzhou, China. pp.247-261, ⟨10.1007/978-3-642-15672-4_22⟩. ⟨hal-01054987⟩