Theory and application of encrypted sequential data processing: search and computation
Thesis posted on 22.05.2021, 09:05 by Hoi Ting Poon
Cloud computing has seen a dramatic rise in adoption in the past decade amid security and privacy concerns. One area of consensus is that encryption is necessary, as anonymization techniques have been shown to be unreliable. However, the processing of encrypted data has proven to be difficult. Briefly, the goal is to maintain security over remotely stored and accessed data while achieving reasonable storage cost and performance. Search, the most basic and central functionality of a privacy-protected cloud storage system, is actively being investigated. Recent works have looked at enabling more specialized search functions. In this thesis, we explore the problem of searching and processing sequential data. We propose three solutions targeting textual data, with emphasis respectively on security, storage cost and performance. Our first solution achieves a high level of security with reduced communication, storage and computational cost by exploiting properties of natural languages. Our second solution achieves a minimal storage cost by taking advantage of the space efficiency of Bloom filters. Both proposals were also the first to enable non-keyword search in phrases. Using a subsequence-based solution, our final phrase search scheme is currently the fastest phrase search protocol in the literature. We also show how sequential data search schemes can be extended to include auditing with minimal additional cost. The solution is capable of achieving proof of retrievability with an unbounded number of audits. A sample application which enables searching and computing over target values of encrypted XML files is also demonstrated. In terms of media, we describe an encrypted cloud media storage solution that simultaneously protects user privacy and enables copyright verification, and is the first to achieve security against dishonest participants. We also describe a framework where practical scalable privacy-protected copyright detection can be performed.
Finally, an application of sequence querying over generic data in the form of an Anti-Virus over encrypted cloud storage is demonstrated. A private scanning solution and a public Anti-Virus as a service solution are described, and we note that the technique can be conceptualized as a generic pattern matching solution on encrypted data. We also include some directions for future work and unexplored applications.