\documentclass[11pt]{article}
\usepackage{fullpage}
\usepackage{times}
\usepackage{hyperref,microtype,pdfsync}
\usepackage{amsmath,amsfonts,amssymb,amsthm}
\usepackage{fancyhdr}
\usepackage{mathtools}
\usepackage{algorithmic}
\input{header}
% VARIABLES
\newcommand{\lecturenum}{16}
\newcommand{\lecturetopic}{Zero-Knowledge Proofs}
\newcommand{\scribename}{Alessio Guerrieri}
% END OF VARIABLES
\lecheader
\pagestyle{plain} % default: no special header
\begin{document}
\thispagestyle{fancy} % first page should have special header
% LECTURE MATERIAL STARTS HERE
\section{Recap: Interactive Proofs}
\label{sec:recap:IP}
\begin{definition}
\label{def:ips}
An \emph{interactive proof system} with \emph{soundness error} $s
\in [0,1]$ for a language $L \subseteq \bit^{*}$ is a pair of
algorithms: a (possibly computationally unbounded) prover $P$, and a
ppt verifier $V$, having the following properties:
\begin{enumerate}
\item \emph{Completeness} (``the specified prover convinces the
specified verifier of a true statement''): for all $x \in L$,
$\out_{V}[ P(x) \leftrightarrow V(x) ] = 1$, with probability $1$.
\item \emph{Soundness} (``\emph{no} possibly cheating prover can
convince the specified verifier of a false statement''): for all $x
\not\in L$ and every (possibly unbounded) prover $P^{*}$,
\[ \Pr\left[ \out_{V}[ P^{*}(x) \leftrightarrow V(x) ] = 1 \right]
\leq s. \]
\end{enumerate}
\end{definition}
\newcommand{\gni}{\problem{GNI}}
Last time we studied an interactive proof for the graph nonisomorphism
problem ($\gni$). Remember the definition: $ \gni=\set{(G_0,G_1) :
G_0 \not\equiv G_1}$, where $G_0 \equiv G_1$ if there exists a
bijection $\pi \colon V_0 \to V_1$ such that $(v_0,v_1) \in E_0$ if and
only if $(\pi(v_0),\pi(v_1)) \in E_1$. The protocol is as follows:
\begin{center}
\begin{tabular}{ccc}
$P(G_{0}, G_{1})$ & & $V(G_{0},G_{1})$ \\ \\
& & $b \gets \bit$\\
&& choose random permutation $\pi$ on $G_{b}$'s vertices \\
&$\underleftarrow{H=\pi(G_b)}$ & \\
find $b'$ such that $H\equiv G_{b'}$ &&\\
& $\underrightarrow{\qquad b' \qquad}$ &\\
&& accept iff $b=b'$
\end{tabular}
\end{center}
We already proved that this protocol is complete (if the graphs are not
isomorphic, then the unbounded prover always convinces the verifier)
and that the protocol is sound (if the graphs are isomorphic, then the
best any prover can do is to convince the verifier with one-half
probability).
\section{Zero-Knowledge Proofs}
\label{sec:zero-knowl-proofs}
Notice a curious property of the above protocol: the prover just
replies with $b' = b$, the verifier's own random bit. Intuitively,
therefore, $V$ doesn't seem to get anything from its interaction with
$P$ that it ``doesn't already know itself'' --- aside from the fact
that the theorem is true (because the prover gives a convincing
proof). Properly formalized, this property is called ``zero
knowledge.''
\subsection{Honest-Verifier Zero-Knowledge}
\label{sec:honest-verifier-zero}
\begin{definition}
\label{def:hvzk}
An interactive proof system $(P,V)$ for a language $L$ is said to be
\emph{honest-verifier zero-knowledge} if there exists an nuppt
simulator $\Sim$ such that for all $x \in L$, \[ \view_{V}[ P(x)
\leftrightarrow V(x) ] \statind \Sim(x), \] where $\view_{V}$ is the
``view'' of the verifier in the interaction, consisting of its input
$x$, its random coins $r_V$, and the sequence of the prover's
messages.
\end{definition}
The idea behind the definition is that given an $x \in L$, it is
possible to simulate everything the (honest) verifier ``sees'' in the
interaction \emph{without the help of the unbounded prover}. Said
another way, anything revealed by the prover in the interaction could
have been generated by a simulator that knows nothing more than the
verifier itself.
Let us take a closer look at the definition of $\view_V$. The
verifier's ``view'' is the tuple of all its various inputs throughout
the execution of the protocol. These are precisely its input $x$, its
random coins, and the prover's messages. Observe that given these
values, the messages that $V$ sends to $P$ can be generated
deterministically, so we do not need to include them in the view.
\begin{lemma}
\label{lem:gni-hvzk}
The above protocol for the $\gni$ problem is honest-verifier
zero-knowledge.
\end{lemma}
\begin{proof}
To prove this lemma we define a ppt simulator $\Sim$ that is able to
generate a transcript that is indistinguishable from (actually,
identical to) the view of $V$. The simulator $\Sim(G_{0}, G_{1})$
works as follows: first, choose $b \gets \bit$ and a random
permutation $\pi$ on $G_{b}$'s vertices. Then, output the view $(x
= (G_{0}, G_{1}), r_V=(b,\pi), b' = b)$.
We show that when $(G_0,G_1) \in \gni$, we have that $\Sim(G_0,G_1)
\equiv \view_V[P(x)\leftrightarrow V(x)]$. The random coins
$(b,\pi)$ are distributed identically to $V$'s random coins. And
since we are assuming that $(G_0,G_1) \in \gni$, the prover always
answers with the message $b'=b$. This completes the proof.
\end{proof}
\subsection{Full Zero-Knowledge}
\label{sec:full-zero-knowledge}
The above definition says that the verifier gains no knowledge from
the interaction, as long as it runs the prescribed algorithm $V$. But
what if the verifier tries to gain some knowledge from its interaction
with the prover by \emph{deviating} from the prescribed protocol? We
should consider an arbitrary (but efficient) \emph{cheating} verifier
$V^{*}$, and show that its view can be efficiently simulated as well.
\begin{definition}
\label{def:zk}
An interactive proof system $(P,V)$ for a language $L$ is
\emph{zero-knowledge} if for every nuppt $V^{*}$ there exists an
nuppt simulator $\Sim$ such that for all $x \in L$, \[ \view_{V^{*}}[
P(x) \leftrightarrow V^{*}(x) ] \statind \Sim(x). \]
\end{definition}
Does our protocol for $\gni$ enjoy full zero-knowledge? Let's try to
create a simulator $\Sim$. Since $V^{*}$ is arbitrary, we do not know
how $V^*$ will construct its first message (the graph $H$), so the
only thing $\Sim$ seems to be able to do is to choose random coins
$r_{V^*}$ and run $V^*$ using those random coins to get a message $H$.
This $H$ can be anything (it might not even be a graph at all!). If
it's a malformed message, then $\Sim$ can do exactly what the honest
prover $P$ would do: abort. But what if $H$ is a valid message? Note
that $\Sim$ can't hope to determine the correct graph just by
``inspecting'' the random coins $r_{V^{*}}$, because we have no idea
how $V^{*}$ generates $H$ from its random coins. So how can $\Sim$
know if $H$ is isomorphic to $G_0$ or $G_1$, without solving the graph
isomorphism problem?
The problem here is that the protocol in fact \emph{may not be}
zero-knowledge. Notice that in this protocol, the prover is acting as
an \emph{isomorphism oracle}, and $V^*$ might be able to exploit the
prover to get some extra knowledge that it didn't have before.
Suppose for example that $V^*$ has $H$ hardcoded into it, and wants to
know if $H \equiv G_0$ or $H \equiv G_1$ (or neither). Then $V^*$ can
simply send $H$ to the prover, and from the answer $V^*$ will learn
that $H\equiv G_{b'}$, or that $H$ is isomorphic to neither of $G_0,
G_1$ (if the prover happens to abort).
\section{Graph Isomorphism}
\label{sec:graph-isomorphism}
\newcommand{\gi}{\problem{GI}}
\newcommand{\iso}{\text{iso}}
It is possible to give a zero-knowledge proof for graph
nonisomorphism, but the protocol and proof are quite involved (the
solution effectively forces $V^{*}$ itself to prove that $H$ is indeed
isomorphic to one of $G_{0}$ or $G_{1}$, which effectively eliminates
the problem we identified above). Instead, let's look at a different
problem for which we will be able to give a simple zero-knowledge
proof. We define the \emph{graph isomorphism} problem $\gi =
\set{(G_{0}, G_{1}) : G_{0} \equiv G_{1}}$. This problem is just the
complement of the $\gni$ problem. Can we create a zero-knowledge
interactive proof system for this problem? Before just giving the
solution, let's develop some intuition for how it might work.
Suppose that a spelunker wants to prove that a certain cave contains a
``loop,'' but without giving any other knowledge about the structure
of the cave. A solution might be the following: the verifier stays at
any point of the claimed loop, and the prover moves to some hidden
part of the loop. When the prover announces that he is ready, the
verifier will choose one of the two directions and will challenge the
prover to come to him from that direction. If the prover knows the
loop, he will be able to find a way to the verifier, while if there is
no loop the prover will be able to walk to the verifier from only one
direction and will succeed only half of the time.
We will use this basic idea for the $\gi$ problem. The prover will
generate a random graph $H$ isomorphic to the two graphs, and the
verifier will challenge it to give a permutation proving that $H$ is
isomorphic to $G_b$, for $b$ chosen at random. Let $\iso(G)$ denote
the set of all graphs isomorphic to a graph $G$.
\begin{center}
\begin{tabular}{ccc}
$P(G_0,G_1)$ & & $V(G_0, G_1)$ \\ \\
Choose random $H \gets \iso(G_0) = \iso(G_{1})$ & & \\
& $\underrightarrow{\quad H \quad }$ & \\
&& Choose $b \gets \bit$\\
& $\underleftarrow{\quad b \quad}$ & \\
Choose \emph{random} perm. $\rho$ s.t.~$H=\rho(G_b)$&& \\
& $\underrightarrow{\quad \rho \quad}$ & \\
&& Accept iff $H=\rho(G_b)$
\end{tabular}
\end{center}
Implicitly, $V$ should check that $\rho$ really is a legal permutation
on $G_{b}$'s vertices, and reject if it is not; this is important for
soundness. Note also that it is very important that when replying to
the challenge bit $b$, the honest prover $P$ chooses a \emph{random}
permutation $\rho$ such that $H = \rho(G_{b})$, and not just (for
example) the first one it encounters (since there may be more than
one). This will turn out to be crucial for the proof of
zero-knowledge (Claim~\ref{claim:statind-conditional} below).
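\par\medskip
To make the message flow concrete, here is a small illustrative run.
The specific graphs and permutations are our own choices for the sake
of example and are not part of the protocol. Identify a graph on the
vertex set $\{1,2,3\}$ with its edge set, and take
\[ G_0 = \{\{1,2\},\{2,3\}\}, \qquad G_1 = \{\{1,3\},\{2,3\}\}, \]
which are both paths on three vertices. Suppose the prover's first
message happens to be $H = \{\{1,2\},\{1,3\}\}$. If the challenge is
$b=1$, the only valid reply is the permutation $\rho$ with
$\rho(1)=3$, $\rho(2)=2$, $\rho(3)=1$, since
\[ \rho(G_1) = \{\{\rho(1),\rho(3)\},\{\rho(2),\rho(3)\}\} =
\{\{3,1\},\{2,1\}\} = H. \]
If the challenge is $b=0$, there are two valid replies, because $G_0$
has a nontrivial automorphism (swapping vertices $1$ and $3$): one
maps $1 \mapsto 3$, $2 \mapsto 1$, $3 \mapsto 2$, and the other maps
$1 \mapsto 2$, $2 \mapsto 1$, $3 \mapsto 3$. Both send $G_0$ to $H$,
and the honest prover picks between them uniformly at random.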
\begin{theorem}
\label{thm:gi-zkp}
The above protocol for the $\gi$ problem is a zero-knowledge
interactive proof system.
\end{theorem}
\begin{proof}
First we show completeness: if $G_0 \equiv G_1$, then $G_0 \equiv H
\equiv G_1$ and $P$ can always find an appropriate $\rho$ such that
$V$ accepts.
Next we show soundness, and calculate the probability that $V$
accepts when $G_0 \not\equiv G_1$ and it interacts with some
arbitrary $P^{*}$. For any $H$ output by $P^{*}$, we have that $H
\equiv G_{b'}$ for at most one $b'$. The verifier chooses $b \in
\bit$ at random, and if $b\neq b'$ then there is \emph{no} reply
which will make $V$ accept. We can conclude that the soundness
error is at most $\frac{1}{2}$.
Now we prove that the protocol is zero-knowledge. To do this, let
$V^{*}$ be an arbitrary nuppt algorithm. Unlike in our above proof
of \emph{honest-verifier} zero-knowledge for the $\gni$ problem, we
do not know how $V^{*}$ will choose its challenge bit $b$ --- it
might introduce some bias, or choose $b$ to depend on the prover's
initial message $H$ in some bizarre way. Our simulator $\Sim$ will
work by preparing a message $H$ so that it knows how to answer
\emph{just one} of $V^{*}$'s possible challenges. If $V^{*}$
happens to make that challenge, the simulator answers and outputs
the view; otherwise, it tries again from scratch by re-running
(``rewinding'') the verifier with fresh new choices until it
succeeds.
More formally, $\Sim(G_{0},G_{1})$ works as follows:
\begin{algorithmic}
\REPEAT
\STATE Choose a random permutation $\rho$ and $b \gets \bit$
\STATE Let $H= \rho(G_b)$
\STATE Run $V^*(G_{0}, G_{1})$ with fresh random coins $r_{V^*}$,
and send $H$ as the prover's first message to $V^{*}$
\STATE Receive a bit $b^{*}$ from $V^*$ (or if $V^{*}$ sends a
malformed message, just output the view so far)
\UNTIL{$b^{*}=b$, or we've looped more than $n$ times}
\RETURN $(x = (G_{0}, G_{1}), r_{V^*}, H, \rho)$
\end{algorithmic}
We now need to show that for any $x = (G_{0},G_{1}) \in \gi$, \[
\view_{V^{*}} [ P(x) \leftrightarrow V^{*}(x) ] \statind \Sim(x). \]
This follows by Claims~\ref{claim:statind-conditional} and
\ref{claim:output-iteration} below, because conditioned on $\Sim$
succeeding in some iteration, its output is identically distributed
with the view of $V^{*}$, and it fails in all $n$ iterations with
probability at most $2^{-n}$.
\end{proof}
\begin{claim}
\label{claim:statind-conditional}
For any $x = (G_{0},G_{1}) \in \gi$, conditioned on $\Sim(x)$
succeeding in a particular iteration, its output is identically
distributed with $\view_{V^{*}}[ P(x) \leftrightarrow V^{*}(x) ]$.
\end{claim}
\begin{claim}
\label{claim:output-iteration}
For any $x = (G_{0},G_{1}) \in \gi$, the probability that $\Sim(x)$
succeeds in any particular iteration is at least $1/2$.
\end{claim}
\begin{proof}[Proof of Claim~\ref{claim:statind-conditional}]
The output of $\Sim$ is $((G_{0}, G_{1}), r_{V^*},H,\rho)$. By
construction, $r_{V^*}$ is uniform. Since we choose $\rho$
uniformly at random, we get that $H$ is uniform in
$\iso(G_0)=\iso(G_1)$. Furthermore, conditioned on $H$ and bit $b$,
$\rho$ is a \emph{uniformly random} permutation such that
$H=\rho(G_b)$. This is exactly the prover's distribution. (This is
where we have used the fact that the prover returns a random $\rho$
such that $H = \rho(G_{b})$.)
\end{proof}
\begin{proof}[Proof of Claim~\ref{claim:output-iteration}]
Because $G_0 \equiv G_1$, the graph $H=\rho(G_{b})$ constructed by
the simulator is uniform in $\iso(G_{0}) = \iso(G_{1})$, and
simulator $\Sim$'s bit $b$ is statistically independent from $H$.
Thus, for any challenge bit $b^{*}$ output by $V^*$, we have that
$\Pr[b^{*}=b] = 1/2$.
\end{proof}
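\par\medskip
The way the two claims combine can be spelled out with a short
calculation. This is only a sketch: the symbols $p$ and $\Delta$ are
notation introduced here for illustration, with $p$ the probability
that a single iteration of $\Sim$ succeeds and $\Delta$ the
statistical distance. By Claim~\ref{claim:output-iteration} we have $p
\geq 1/2$, and the iterations use independent random choices, so
\[ \Pr[\text{all $n$ iterations fail}] = (1-p)^{n} \leq 2^{-n}. \]
By Claim~\ref{claim:statind-conditional}, whenever some iteration
succeeds the output is distributed exactly as the real view, so the
two distributions can differ only on the failure event:
\[ \Delta\bigl( \view_{V^{*}}[ P(x) \leftrightarrow V^{*}(x) ],\;
\Sim(x) \bigr) \leq \Pr[\text{all $n$ iterations fail}] \leq 2^{-n}. \]
Moreover, the number of iterations is dominated by a geometric random
variable with success probability $p$, so its expectation is at most
$1/p \leq 2$.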
\section{Enhancements}
\label{sec:enhancements}
A few observations on these results:
\begin{enumerate}
\item \emph{Efficient provers.} The definition of an interactive
proof allows the prover to be unbounded, but how could we hope to
\emph{use} such a prover in practice? To be applicable in the real
world, we would like the prover to be \emph{efficient}. Of course,
if the prover is given just the same inputs as the verifier, then it
can't accomplish anything that the verifier can't accomplish itself.
But in many proof systems, the prover can be made efficient by
giving it an \emph{extra} input to help it do its job. (This input
might represent some extra knowledge that the prover may have generated
in connection with the theorem, and would like to keep secret.) For
example, in the $\gi$ proof we can give the prover an isomorphism
$\sigma$ such that $G_{0} = \sigma(G_{1})$, and the prover can
choose $H = \pi(G_{0})$ for a uniformly random permutation $\pi$.
Then if $V$ asks for an isomorphism between $H$ and $G_0$, the
prover answers $\pi$; if $V$ asks for an isomorphism between $H$ and
$G_1$, the prover answers $(\pi \circ \sigma)$.
An application of zero-knowledge is to identification: a prover who
knows an isomorphism between the graphs (because he constructed
them, for example) can give a convincing proof. Due to the
zero-knowledge property, after seeing some proofs, nobody can
impersonate the prover with an advantage greater than if they hadn't
seen any proofs at all. (To argue this more formally requires the
notion of a \emph{proof of knowledge}, which intuitively says that
not only is the theorem true, but that any (possibly cheating)
prover capable of giving a convincing proof must ``know a reason why''
(i.e., a witness). We will investigate proofs of knowledge in more
detail later.)
\item \emph{Improved soundness error.} Our soundness error is still
$1/2$, which is quite bad. One obvious way to improve the error is
to repeat $N$ \emph{independent} iterations of the proof, and accept
if every iteration is acceptable. It is not too hard to show that
the soundness error then becomes $2^{-N}$. Furthermore, it can be
shown (though it is not trivial!) that the zero-knowledge property
is also preserved under such sequential repetition. A downside is
that this approach is not very efficient in the number of rounds.
Another approach is to run $N$ copies of the proof in
\emph{parallel}, and accept if all iterations are acceptable. This
is clearly more round-efficient, and also reduces the soundness
error to $2^{-N}$ just as with sequential repetition. However, this
parallel protocol scheme may \emph{not} preserve the zero-knowledge
property! To see why, notice that when we proved that the $\gi$
protocol is zero-knowledge, we created a simulator that had to
``predict'' the challenge bit chosen by the cheating verifier
$V^{*}$. To create a simulator for this parallel protocol, it seems
that we would need to guess all $N$ challenge bits at once ---
otherwise, the simulator would not be able to output a full
accepting transcript for the entire protocol. (Recall that $V^{*}$
can choose its $N$ challenge bits in any arbitrary way, perhaps
depending on all $N$ of the prover's initial messages.) In the
sequential approach, a simulator can guess each bit one by one
(without having to ``rewind'' all the way to the start of the
protocol), but in the parallel case all the bits need to be guessed
in one step, which would succeed with only $2^{-N}$ probability.
This would translate to a simulation running time that is
exponential in $N$.
Note that nothing in the above argument rules out a more clever
simulation strategy, so we are \emph{not} saying that the parallel
protocol is \emph{definitely not} zero-knowledge. All we are saying
is that its status is unknown, and a valid proof of the
zero-knowledge property seems hard to obtain.
\end{enumerate}
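\par\medskip
The quantities mentioned in the two items above can be checked with a
few lines of arithmetic. This is a sketch under stated assumptions:
the value $N = 40$ is only an illustrative choice, and the simulation
costs refer to the naive guess-and-rewind strategy with each guess
correct with probability exactly $1/2$. For the efficient prover,
since $G_{0} = \sigma(G_{1})$ and $H = \pi(G_{0})$,
\[ H = \pi(\sigma(G_{1})) = (\pi \circ \sigma)(G_{1}), \]
so $\pi \circ \sigma$ is indeed a valid reply to the challenge $b=1$.
For repetition, with $N = 40$ the soundness error is
\[ 2^{-N} = 2^{-40} \approx 9.1 \times 10^{-13}. \]
For simulation, guessing one bit at a time takes an expected $2$
attempts per iteration, hence about $2N = 80$ attempts in total,
whereas guessing all $N$ parallel challenge bits at once takes an
expected
\[ 2^{N} = 2^{40} \approx 1.1 \times 10^{12} \]
attempts.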
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: