\documentclass[11pt]{article}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{latexsym}
\setlength{\oddsidemargin}{.25in}
\setlength{\evensidemargin}{.25in}
\setlength{\textwidth}{6in}
\setlength{\topmargin}{-0.4in}
\setlength{\textheight}{8.5in}
\newtheorem{theorem}{Theorem}[section]
\newtheorem{fact}[theorem]{Fact}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{remark}[theorem]{Remark}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{conjecture}[theorem]{Conjecture}
\newtheorem{assumption}[theorem]{Assumption}
\newtheorem{exercise}[theorem]{Exercise}
\newcommand{\qed}{\hfill \ensuremath{\Box}}
\newenvironment{proof}{
\vspace*{-\parskip}\noindent\textit{Proof.}}{$\qed$
\medskip
}
\newcommand{\alg}[1]{\mathsf{#1}}
\newcommand{\Prover}{\alg{P}}
\newcommand{\Verifier}{\alg{V}}
\newcommand{\Simulator}{\alg{S}}
\newcommand{\PPT}{\alg{PPT}}
\newcommand{\isom}{\cong}
\newcommand{\from}{\stackrel{\scriptstyle R}{\leftarrow}}
\newcommand{\handout}[5]{
\renewcommand{\thepage}{#1-\arabic{page}}
\noindent
\begin{center}
\framebox{
\vbox{
\hbox to 5.78in { {\bf CS395T Advanced Cryptography} \hfill #2 }
\vspace{4mm}
\hbox to 5.78in { {\Large \hfill #5 \hfill} }
\vspace{2mm}
\hbox to 5.78in { {\it #3 \hfill #4} }
}
}
\end{center}
\vspace*{4mm}
}
\newcommand{\ho}[5]{\handout{#1}{#2}{Instructor:
#3}{Scribe: #4}{Lecture #1: #5}}
\newcommand{\al}{\alpha}
\newcommand{\Z}{\mathbb Z}
\begin{document}
\ho{6}{4/09/09}{Brent Waters} {Jimmy Yang} {Problem Set 3 (cont.)/Data Storage}
%
% NOTE: If you want to scan in hand-drawings for figures, that is fine .
%
\section{Overview}
\hspace*{0.5cm} Today, we went over problems 3 and 4 from the last homework. In particular, we focused on a proof technique called the hybrid method which was needed for solving problem 4. Then we introduced the concepts of Data Storage and Proof-of-Retrievability.
\section{Problem 4}
\hspace*{0.5cm}
Notice that the standard formulation of an adversary's advantage can be transformed into the following equivalent form ($b$ is the simulator's chosen bit and $b^{'}$ is the adversary's guess):
\begin{align*}
Pr[b^{'} = b] & \geq \frac{1}{2} + \epsilon\\
\frac{1}{2} Pr[b^{'} = 1 | b = 1] + \frac{1}{2} Pr[b^{'} = 0 | b = 0] & \geq \frac{1}{2} + \epsilon\\
\frac{1}{2} Pr[b^{'} = 1 | b = 1] + \frac{1}{2} \left(1 - Pr[b^{'} = 1 | b = 0]\right) & \geq \frac{1}{2} + \epsilon\\
\frac{1}{2} Pr[b^{'} = 1 | b = 1] - \frac{1}{2} Pr[b^{'} = 1 | b = 0] & \geq \epsilon\\
Pr[b^{'} = 1 | b = 1] - Pr[b^{'} = 1 | b = 0] & \geq 2\epsilon\\
\end{align*}
In other words, the advantage can also be seen as the difference in probability of the events when the adversary correctly guesses $b^{'} = 1$, and when the adversary incorrectly guesses $b^{'} = 1$ (we don't care too much about constants).
\subsection{Hybrid Technique}
\hspace*{0.5cm} Recall the standard game for CPA security:
\begin{enumerate}
\item Challenger sends public key $PK$ to Adversary
\item Adversary sends $2$ messages to the Challenger: $m_0$ and $m_1$
\item Challenger chooses a random bit $b \in \{0, 1\}$, and sends $Enc(m_b)$ to the Adversary
\item Adversary sends a guess $b^{'}$ of the bit $b$ to the Challenger
\end{enumerate}
With respect to problem 4, consider 2 possible experiments (games) that could occur between the Challenger and the Adversary, depending on the value of $b$ that was chosen in step 3: In game 0 ($b=0$), the Challenger sends $(Enc_1(m_0), Enc_2(m_0))$. In game 1 ($b=1$), the Challenger sends $(Enc_1(m_1), Enc_2(m_1))$. Using the new formulation of the Adversary's advantage given above, this becomes:
\[ Pr[b^{'} = 1 | \mbox{ experiment 1}] - Pr[b^{'} = 1 | \mbox{ experiment 0}] \]
Let's call these probabilities $X_1$ and $X_0$, respectively. Now, the standard starting point for a proof of security ``Assume we have an adversary with advantage $\epsilon \ldots$" can be restated as ``Assume we have an adversary with $X_1 - X_0 \geq 2 \epsilon \ldots$". Now consider a ``hybrid" experiment h where in step 3 of the CPA game, the Challenger instead sends $(Enc_1(m_1), Enc_2(m_0))$. Define $X_h$ similarly as $Pr[b^{'} = 1 | \mbox{ experiment h}]$. By the triangle inequality, we have that either $X_1 - X_h \geq \epsilon$, or $X_h - X_0 \geq \epsilon$ (or possibly both). Assume the former is true. Since the only difference between experiment 1 and experiment h is the fact that we are sending either $Enc_1(m_0)$ or $Enc_1(m_1)$, the Adversary's distinguishing advantage must come from being able to distinguish between encryptions of $m_0$ and $m_1$. Thus, this Adversary can be used to break the first encryption scheme. In the case that the latter is true, we simp!
ly have an Adversary that can break the second encryption scheme.
\subsection{Exercise}
\textit{Note: This section was not part of the lecture. It was entirely inserted by the scribe.}
\vspace*{4pt}
For a better understanding of the hybrid technique, go through the steps needed to solve the following problem, a natural extension to problem 4.
\begin{exercise}[k-Composition Encryption]
Suppose we have $k$ encryption schemes:\\ $((Setup_1, Enc_1, Dec_1), \ldots, (Setup_k, Enc_k, Dec_k))$. Now we define a new composition encryption scheme where $Enc_c(m) = (Enc_1(m), \ldots, Enc_k(m))$ ($Setup_c$ and $Dec_c$ are defined in the obvious way). Show that if all k schemes are CPA secure, then the composition scheme is also CPA secure.
\end{exercise}
Something else to think about: how large can $k$ be?
\section{Data Storage}
\hspace*{0.5cm} Up until now in the class, we have been concerned with the problems of confidentiality (Encryption) and integrity (Signatures) of data flowing across communication networks. Today, we move on to the problem of availability. Consider the following scenario: a client has a lot of data (D) it would like to store on a server (presumably the server charges a fee for this service). Thus, the client would like to be assured that the server is actually storing its data. There has been a lot of debate about how to best model this problem (what does it mean to ``store" data, how to convince the client its data is really being stored, is it acceptable to lose a small fraction of the data, etc), but the algorithms we will be working with in class are:
\begin{itemize}
\item $Store(D) = (VK, SF)$: The client stores its data $D$ with the server. The output of this algorithm is a verification key $VK$ for the client to keep, and a stored file $SF$ for the server to keep.
\item $P(SF) \leftrightarrow V(VK)$: The client (V = verifier) performs an audit on the server (P = prover). This is an interactive protocol in which the client interacts with the server and attempts to ascertain that the server still has the client's data. At the end, V outputs either true or false, depending on whether it was convinced or not that the server still has its data. It should be the case that if $(VK, SF)$ are the output of the $Store$ algorithm, then V will accept.
\end{itemize}
Some of the important parameters are:
\begin{itemize}
\item audit speed: How long does it take for P and V to perform their various computations in the interactive protocol?
\item bandwidth: How much data must be sent between P and V during the audit?
\item prover/verifier storage: How large are the files SF and VK?
\end{itemize}
As a matter of functionality, if the verifier is indeed convinced that the prover still has its data, then we would like to be able to extract the original data D from the audit interaction. The following security property captures this idea:
\begin{definition}[Proof of Retrievability]
If some $P^{'}$ convinces V with non-negligible probability to accept, then there exists an extraction algorithm $Ext$ such that $Ext(P \leftrightarrow V) = D$.
\end{definition}
In the next class we will see some constructions of these algorithms.
\end{document}