J Zhou, N Bruno, and W Lin Advanced partitioning techniques for
massively distributed computation Proc ACM SIGMOD International
Conference on Management of Data, pages 13-24, 2012
A Pavlo, C Curino, and S Zdonik Skew-aware automatic database
partitioning in shared-nothing, parallel OLTP systems Proc ACM
SIGMOD International Conference on Management of Data, pages 61-72,
2012
Mohamed Y Eltabakh, Yuanyuan Tian, Fatma Özcan, Rainer Gemulla,
Aljoscha Krettek, and John McPherson CoHadoop: Flexible Data
Placement and Its Exploitation in Hadoop Proc VLDB, 4(9):
575-585, 2011
Distributed Queries
L Amsaleg, M Franklin, A Tomasic, Dynamic Query Operator
Scheduling for Wide-Area Remote Access, Distributed and Parallel
Databases, 6(3): 217-246, 1998
A Halevy, Answering queries using views: A survey, VLDB J,
10(4): 270-294, 2001
R Avnur and J M Hellerstein, Eddies: Continuously adaptive query
processing, Proc ACM SIGMOD Int Conf on Management of Data,
pages 261-272, 2000
M A Shah, J M Hellerstein, S Chandrasekaran, and M J Franklin,
Flux: An adaptive partitioning operator for continuous query
systems, Proc 19th Int Conf On Data Engineering, pages 25-36,
2003
F Tian and D J DeWitt, Tuple routing strategies for distributed
Eddies, Proc 29th Int Conf On Very Large Data Bases, pages
333-344, 2003
J R Thomsen, M L Yiu, and C S Jensen Effective caching of
shortest paths for location-based services Proc ACM SIGMOD
International Conference on Management of Data, pages 313-324,
2012
H Herodotou, N Borisov, and S Babu Query optimization techniques
for partitioned tables Proc ACM SIGMOD International Conference
on Management of Data, pages 49-60, 2011
Distributed Transactions
Alexander Thomson, Thaddeus Diamond, Shu-Chun Weng, Kun Ren, Philip
Shao, Daniel J Abadi: Calvin: fast distributed transactions for
partitioned database systems, Proc ACM SIGMOD Int Conf on
Management of Data, pages 1-12, 2012
Daniel Peng, Frank Dabek: Large-scale Incremental Processing Using
Distributed Transactions and Notifications OSDI, pages 251-264,
2010
Jun Rao, Eugene J Shekita, Sandeep Tata: Using Paxos to Build a
Scalable, Consistent, and Highly Available Datastore Proc VLDB,
4(4): 243-254, 2011
Peter Bailis, Shivaram Venkataraman, Michael J Franklin, Joseph M
Hellerstein, Ion Stoica: Probabilistically Bounded Staleness for
Practical Partial Quorums Proc VLDB, 5(8): 776-787, 2012
A Pavlo, E PC Jones, and S Zdonik On Predictive Modeling for
Optimizing Transaction Execution in Parallel OLTP Systems Proc
VLDB, 5(2): 85-96, 2012
HT Vo, S Wang, D Agrawal, G Chen, BC Ooi LogBase: A Scalable
Log-structured Database System in the Cloud Proc VLDB, 5(10):
1004-1015, 2012
Stacy Patterson, Aaron J Elmore, Faisal Nawab, Divyakant Agrawal,
Amr El Abbadi Serializability, not Serial: Concurrency Control and
Availability in Multi-Datacenter Datastores Proc VLDB, 5(11):
1459-1470, 2012
Ippokratis Pandis, Pınar Tözün, Ryan Johnson, and Anastasia
Ailamaki PLP: Page Latch-free Shared-everything OLTP Proc VLDB,
4(10): 610-621, 2011
Data Replication
Yuri Breitbart, Raghavan Komondoor, Rajeev Rastogi, S Seshadri,
Abraham Silberschatz: Update Propagation Protocols for Replicated
Databases, Proc ACM SIGMOD Int Conf on Management of Data, pages
97-108, 1999
Carlo Curino, Yang Zhang, Evan P C Jones, Samuel Madden: Schism: a
Workload-Driven Approach to Database Replication and Partitioning
Proc VLDB, 3(1): 48-57, 2010
M P Consens, K Ioannidou, J LeFevre, and N Polyzotis Divergent
physical design tuning for replicated databases Proc ACM SIGMOD
International Conference on Management of Data, pages 49-60, 2012
Peter Bailis, Shivaram Venkataraman, Michael J Franklin, Joseph M
Hellerstein, Ion Stoica Probabilistically Bounded Staleness for
Practical Partial Quorums Proc VLDB, 5(8): 776-787, 2012
Sudarshan Kadambi, Jianjun Chen, Brian F Cooper, David Lomax,
Raghu Ramakrishnan, Adam Silberstein, Erwin Tam, and Hector
Garcia-Molina Where in the World is My Data?, Proc VLDB, 4(11):
1040-1050, 2011
Parallel Data Management
F Akal, K Böhm, and H-J Schek, OLAP query evaluation in a
database cluster: A performance study on intra-query parallelism,
Proc 6th East European Conf Advances in Databases and Information
Systems, pages 218-231, 2002
U Röhm, K Böhm, and H-J Schek, OLAP query routing and physical
design in a database cluster, Advances in Database Technology,
Proc 7th Int Conf On Extending Database Technology, pages
254-268, 2000
A Lima, M Mattoso, and P Valduriez, OLAP query processing in a
database cluster, Proc 20th Int Euro-Par Conf, pages 355-362,
2004
C Furtado, A Lima, E Pacitti, P Valduriez and M Mattoso,
Physical and virtual partitioning in OLAP database clusters, Proc
Int Symp Computer Architecture and High Performance Computing,
pages 143-150, 2005
C Furtado, A Lima, E Pacitti, P Valduriez and M Mattoso,
Adaptive hybrid partitioning for OLAP query processing in a database
cluster, Int J High Perf Comput And Networking, 5(4): 251-262,
2008
H Köhler, J Yang, and X Zhou Efficient parallel skyline
processing using hyperplane projections Proc ACM SIGMOD
International Conference on Management of Data, pages 85-96, 2011
P Upadhyaya, YC Kwon, and M Balazinska A latency and
fault-tolerance optimizer for online parallel query plans Proc
ACM SIGMOD International Conference on Management of Data, pages
241-252, 2011
E Soroush, M Balazinska, and D Wang ArrayStore: a storage
manager for complex parallel array processing Proc ACM SIGMOD
International Conference on Management of Data, pages 253-264,
2011
Martina-Cezara Albutiu, Alfons Kemper, Thomas Neumann Massively
Parallel Sort-Merge Joins in Main Memory Multi-Core Database
Systems Proc VLDB, 5(10): 1064-1075, 2012
Database Integration
R J Miller, L M Haas, and M A Hernandez Schema Mapping as
Query Discovery, In Proc Int Conf on Very Large Data Bases,
2000
A Doan, P Domingos, and A Halevy Learning to Match the Schemas
of Databases: A Multistrategy Approach, Machine Learning, 50(3):
279-301, 2003
R McCann, B AlShelbi, Q Le, H Nguyen, L Vu, and A Doan
Maveric: Mapping Maintenance for Data Integration Systems, In Proc
Int Conf on Very Large Data Bases, 2005
H Galhardas, D Florescu, D Shasha, E Simon, and C-A Saita
Declarative Data Cleaning: Language, Models, and Algorithms, In
Proc Int Conf on Very Large Data Bases, 2001
V Raman and J Hellerstein, Potter’s wheel: An interactive data
cleaning system, Proc 27th Int Conf On Very Large Data Bases,
pages 381-390, 2001
S Chaudhuri, K Ganjam, V Ganti, and R Motwani Robust and
Efficient Fuzzy Match for Online Data Cleaning In Proc ACM SIGMOD
Int Conf on Management of Data, 2003
L M Haas, D Kossmann, E L Wimmers, and J Yang Optimizing
Queries Across Diverse Data Sources, In Proc Int Conf on Very
Large Data Bases, pages 276-285, 1997
Zachary G Ives, Daniela Florescu, Marc Friedman, Alon Levy, Daniel
S Weld An Adaptive Query Execution System for Data Integration, In
Proc ACM SIGMOD Int Conf on Management of Data, 1999
Zachary G Ives, Alon Y Halevy, Daniel S Weld Adapting to Source
Properties in Processing Data Integration Queries, Proc ACM SIGMOD
Int Conf on Management of Data, pages 395-406, 2004
L Qian, M J Cafarella, and H V Jagadish Sample-driven schema
mapping Proc ACM SIGMOD Int Conf on Management of Data, pages
73-84, 2012
H Elmeleegy, A Elmagarmid, and J Lee Leveraging query logs for
schema mapping generation in U-MAP Proc ACM SIGMOD Int Conf on
Management of Data, pages 121-132, 2011
B Alexe, B ten Cate, P G Kolaitis, and W-C Tan Designing and
refining schema mappings via data examples Proc ACM SIGMOD Int
Conf on Management of Data, pages 133-144, 2011
M Zhang, M Hadjieleftheriou, B C Ooi, C M Procopiuc, and D
Srivastava Automatic discovery of attributes in relational
databases Proc ACM SIGMOD Int Conf on Management of Data,
pages 109-120, 2011
W Fan, J Li, S Ma, N Tang, and W Yu Interaction between record
matching and data repairing Proc ACM SIGMOD Int Conf on
Management of Data, pages 469-480, 2011
Jiannan Wang, Guoliang Li, Jeffrey Xu Yu, and Jianhua Feng Entity
Matching: How Similar Is Similar, Proc VLDB, 4(10): 622-633,
2011
Vibhor Rastogi, Nilesh N Dalvi, Minos N Garofalakis Large-Scale
Collective Entity Matching Proc VLDB, 4(4): 208-218, 2011
Peer-to-Peer Data Management
A Kementsietsidis, M Arenas, R J Miller: Mapping Data in
Peer-to-Peer Systems: Semantics and Algorithmic Issues, In Proc
ACM SIGMOD Int Conf on Management of Data, pages 325-336, 2003
B Yang, H Garcia-Molina, Comparing Hybrid Peer-to-Peer Systems, In
Proc of 27th International Conference on Very Large Data Bases,
2001
A Crespo, H Garcia-Molina, Routing Indices For Peer-to-Peer
Systems, In Proc International Conference on Distributed Computing
Systems, 2002
S Ratnasamy, P Francis, M Handley, R Karp, S Shenker, A
Scalable Content-Addressable Network, In Proc ACM SIGCOMM Conf on
Applications, Technologies, Architectures, and Protocols for
Computer Communication, 2001
BY Zhao, L Huang, J Stribling, SC Rhea, A D Joseph, and JD
Kubiatowicz, Tapestry: A Resilient Global-Scale Overlay for Service
Deployment, IEEE J on Selected Areas in Comm, 22(1), January
2004
K Aberer, P Cudré-Mauroux, and M Hauswirth, The Chatty Web:
Emergent Semantics Through Gossiping, In Proc 12th Int World Wide
Web Conf, 2003
B Gedik and L Liu, PeerCQ: A Decentralized and Self-Configuring
Peer-to-Peer Information Monitoring System In Proc 23rd Int
Conf on Distributed Computing Systems, 2003
WS Ng, B C Ooi, K-L Tan, and A Zhou, PeerDB: A P2P-based System
for Distributed Data Sharing In Proc 19th Int Conf on Data
Eng, 2003
P Kalnis, WS Ng, B C Ooi, D Papadias, and K-L Tan, An
Adaptive Peer-to-Peer Network for Distributed Caching of OLAP
Results, In Proc ACM SIGMOD Int Conf on Management of Data,
2002
Stream Data Management
C Cranor, T Johnson, O Spatscheck, and V Shkapenyuk Gigascope:
high performance network monitoring with an SQL interface In Proc
ACM SIGMOD Int Conf on Management of Data, pages 647-651, 2003
E Rundensteiner, L Ding, T Sutherland, Y Zhu, B Pielech, and N
Mehta CAPE: continuous query engine with heterogeneous-grained
adaptivity In Proc 30th Int Conf on Very Large Data Bases,
pages 1353-1356, 2004
D Abadi, Y Ahmad, M Balazinska, U Cetintemel, M Cherniack, J-H
Hwang, W Lindner, A Rasin, N Tatbul, Y Xing, and S Zdonik The
design of the Borealis stream processing engine In Proc 1st
Biennial Conf on Innovative Data Syst Res, 2005
L Golab and M T Özsu Update-pattern-aware modeling and
processing of continuous queries In Proc ACM SIGMOD Int Conf on
Management of Data, pages 658-669, 2005
A Ayad and J Naughton Static optimization of conjunctive queries
with sliding windows over unbounded streaming information sources
In Proc ACM SIGMOD Int Conf on Management of Data, pages
419-430, 2004
M Datar, A Gionis, P Indyk, and R Motwani Maintaining stream
statistics over sliding windows In Proc 13th SIAM-ACM Symp on
Discrete Algorithms, pages 635-644, 2002
L Golab and M T Özsu Processing sliding window multi-joins in
continuous queries over data streams In Proc 29th Int Conf on
Very Large Data Bases, pages 500-511, 2003
B Babcock, S Babu, M Datar, and R Motwani Chain: Operator
scheduling for memory minimization in data stream systems In Proc
ACM SIGMOD Int Conf on Management of Data, pages 253-264, 2003
D Carney, U Cetintemel, A Rasin, S Zdonik, M Cherniack, and M
Stonebraker Operator scheduling in a data stream manager In Proc
29th Int Conf on Very Large Data Bases, pages 838-849, 2003
N Tatbul, U Cetintemel, S Zdonik, M Cherniack, and M
Stonebraker Load shedding in a data stream manager In Proc 29th
Int Conf on Very Large Data Bases, pages 309-320, 2003
J-H Hwang, M Balazinska, A Rasin, U Cetintemel, M Stonebraker,
and S Zdonik High-availability algorithms for distributed stream
processing In Proc 21st Int Conf on Data Engineering, pages
779-790, 2005
MapReduce-based Data Management
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung: The Google file
system SOSP, pages 29-43, 2003
K Shvachko, H Kuang, S Radia, R Chansler, The Hadoop Distributed
File System, IEEE 26th Symposium on Mass Storage Systems and
Technologies, 2010
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar,
Andrew Tomkins: Pig latin: a not-so-foreign language for data
processing Proc ACM SIGMOD Int Conf on Management of Data,
pages 1099-1110, 2008
Alan Gates, Olga Natkovich, Shubham Chopra, Pradeep Kamath, Shravan
Narayanam, Christopher Olston, Benjamin Reed, Santhosh Srinivasan,
Utkarsh Srivastava: Building a High-Level Dataflow System on top of
MapReduce: The Pig Experience Proc VLDB, 2(2): 1414-1425, 2009
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan
Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan
Sivasubramanian, Peter Vosshall, and Werner Vogels Dynamo: Amazon’s
Highly Available Key-Value Store SOSP, 2007
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C Hsieh, Deborah
A Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert E
Gruber: Bigtable: A Distributed Storage System for Structured Data,
ACM Trans Comput Syst, 26(2): Article 4, 2008
Brian F Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam
Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel
Weaver, Ramana Yerneni: PNUTS: Yahoo!’s hosted data serving
platform Proc 34th Int Conf on Very Large Data Bases, pages
1277-1288, 2008
Iman Elghandour, Ashraf Aboulnaga ReStore: Reusing Results of
MapReduce Jobs Proc VLDB, 5(6): 586-597, 2012