Private search over big data leveraging distributed file system and parallel processing

Selçuk, Ayşe and Örencik, Cengiz and Savaş, Erkay (2015) Private search over big data leveraging distributed file system and parallel processing. In: Sixth International Conference on Cloud Computing, GRIDs, and Virtualization (CLOUD COMPUTING 2015), Nice, France


In this work, we identify the security and privacy problems associated with a specific Big Data application, namely secure keyword-based search over encrypted cloud data, and emphasize the actual challenges and technical difficulties in the Big Data setting. More specifically, we provide definitions from which privacy requirements can be derived. In addition, we adapt an existing privacy-preserving keyword-based search method to the Big Data setting, in which the data is not only huge but also changing and accumulating very fast. Our proposal is scalable in the sense that it can leverage distributed file systems and parallel programming techniques, such as the Hadoop Distributed File System (HDFS) and the MapReduce programming model, to work with very large data sets. We also propose a lazy idf-updating method that can efficiently handle the relevancy scores of the documents in a dynamically changing, large data set. We empirically show the efficiency and accuracy of the method through an extensive set of experiments on real data.
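The abstract does not spell out how the lazy idf-updating method works, so the following is only a minimal sketch of one plausible reading: inverse document frequency (idf) values are recomputed only after the collection has grown by a chosen fraction since the last refresh, rather than after every inserted document. The class name LazyIdf, the refresh_ratio parameter, and the refresh rule are all assumptions made for illustration, not the paper's method, and the sketch works on plaintext terms and leaves out encryption entirely.

```python
import math


class LazyIdf:
    """Hypothetical lazy idf table: refresh only after enough growth."""

    def __init__(self, refresh_ratio=0.1):
        # Assumed policy: refresh once the collection has grown by more
        # than refresh_ratio relative to its size at the last refresh.
        self.refresh_ratio = refresh_ratio
        self.doc_count = 0          # documents seen so far
        self.doc_freq = {}          # term -> number of documents containing it
        self.idf = {}               # term -> idf as of the last refresh
        self.count_at_refresh = 0   # doc_count when idf was last recomputed

    def add_document(self, terms):
        """Update cheap counters now; recompute idf only when due."""
        self.doc_count += 1
        for term in set(terms):
            self.doc_freq[term] = self.doc_freq.get(term, 0) + 1
        grown = self.doc_count - self.count_at_refresh
        if (self.count_at_refresh == 0
                or grown > self.refresh_ratio * self.count_at_refresh):
            self.refresh()

    def refresh(self):
        """Recompute idf = log(N / df) for every known term."""
        self.idf = {
            term: math.log(self.doc_count / df)
            for term, df in self.doc_freq.items()
        }
        self.count_at_refresh = self.doc_count


index = LazyIdf(refresh_ratio=0.5)
index.add_document(["cloud", "privacy"])   # first document: refresh
index.add_document(["cloud"])              # grew by 100%: refresh
index.add_document(["hadoop"])             # grew by 50%: not above ratio, idf stays stale
stale_has_hadoop = "hadoop" in index.idf
index.add_document(["hadoop"])             # grew by 100%: refresh
```

Under this assumed policy, the per-document work is limited to counter updates, and the full recomputation is amortized over many insertions at the cost of temporarily stale relevancy scores. In a Hadoop deployment, the document-frequency counting would be a natural candidate for a MapReduce job, though how the paper divides the work is not stated in the abstract.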
Item Type: Papers in Conference Proceedings
Uncontrolled Keywords: Cloud computing, Big Data, Keyword Search, Privacy, Hadoop
Subjects: Q Science > QA Mathematics > QA075 Electronic computers. Computer science
Q Science > QA Mathematics > QA076 Computer software
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.
Faculty of Engineering and Natural Sciences
Depositing User: Erkay Savaş
Date Deposited: 22 Dec 2015 15:28
Last Modified: 26 Apr 2022 09:20
