Constructing an AND gate as a quantum operation

Here we consider operations that map the two-bit state $\ket{a,b}$ to the one-qubit state $\ket{a\wedge b}$, for all $a,b\in\{0,1\}$, where $a\wedge b$ represents the binary logical AND operation of $a$ and $b$. We will construct  $k$ matrices $A_1,\dots, A_k$ (where $k\leq 4$, and each $A_j$ is a $2 \times 4$ matrix) such that $\sum_
 {j=1}^{k}A^\dagger_jA_j=I$, and whose quantum operation $\Phi$ computes the above mapping. In other words, for all $a,b\in \{0,1\}$, when $\rho=\ket{a,b}\bra{a,b}$,
\[
 \Phi(\rho):=\SUM{j=1}{k}A_j\rho A^\dagger_j=\ket{a\wedge b}\bra{a\wedge b}.
\]

 Define the following matrices:
\[
 A_1=\begin{pmatrix} 1&0&0&0 \\ 0&0&0&0 \end{pmatrix}, \
 A_2=\begin{pmatrix} 0&1&0&0 \\ 0&0&0&0 \end{pmatrix}, \
 A_3=\begin{pmatrix} 0&0&1&0 \\ 0&0&0&0 \end{pmatrix}, \
 A_4=\begin{pmatrix} 0&0&0&0 \\ 0&0&0&1 \end{pmatrix},
\]
 and observe that
\[ \begin{align*}
 A^\dagger_1A_1=\begin{pmatrix} 1&0&0&0\\0&0&0&0\\0&0&0&0\\
0&0&0&0 \end{pmatrix} , \ &  A^\dagger_2A_2=\begin{pmatrix} 0&0&0&0\\0&1&0&0\\0&0&0&0\\
0&0&0&0 \end{pmatrix} \\
 A^\dagger_3A_3=\begin{pmatrix} 0&0&0&0\\0&0&0&0\\0&0&1&0\\
0&0&0&0 \end{pmatrix} , \ &  A^\dagger_4A_4=\begin{pmatrix} 0&0&0&0\\0&0&0&0\\0&0&0&0\\
0&0&0&1 \end{pmatrix}.
 \end{align*}\]
 Therefore $\sum_{j=1}^{4}A^\dagger_j A_j=I$ is the $4\times 4$ identity matrix, which is necessary to define a valid channel in the Krauss representation

 Now consider the four computational basis states $\ket{a,b}$, for $a,b\in\{0,1\}$, with corresponding density operators given by $\rho_i:=\ket{a,b}\bra{a,b}$ where the index can be represented in terms of $a$ and $b$ as $i=2a+b+1$. More explicitly,

 \[ \begin{align*}
\rho_1:=\ket{0,0}\bra{0,0}=\begin{pmatrix} 1&0&0&0\\0&0&0&0\\0&0&0&0\\
0&0&0&0 \end{pmatrix} , \ &  \rho_2:=\ket{0,1}\bra{0,1}=\begin{pmatrix} 0&0&0&0\\0&1&0&0\\0&0&0&0\\
0&0&0&0 \end{pmatrix} \\
\rho_3:=\ket{1,0}\bra{1,0}=\begin{pmatrix} 0&0&0&0\\0&0&0&0\\0&0&1&0\\
0&0&0&0 \end{pmatrix} , \ &  \rho_4:=\ket{1,1}\bra{1,1}=\begin{pmatrix} 0&0&0&0\\0&0&0&0\\0&0&0&0\\
0&0&0&1 \end{pmatrix}.
 \end{align*}\]

 It then follows that,
\[
 A^\dagger_j \rho_i A_j=\left\{
 \begin{array}{rl} \begin{pmatrix}0 &0 \\
 0 &0 \end{pmatrix}  \ \ &\text{if} \ \ i\neq j,  \\
 \\
\begin{pmatrix}1 &0 \\
 0 &0 \end{pmatrix}  \ \ &\text{if} \ \ i= j<4, \\
 \\
\begin{pmatrix}0 &0 \\
 0 &1 \end{pmatrix}  \ \ &
 \text{if} \ \ i= j=4.
 \end{array}
 \right.
\]

 Therefore,
\[
 \Phi(\rho_i)=\SUM{j=1}{k}A_j\rho_i A^\dagger_j\left\{
 \begin{array}{rl}
\begin{pmatrix}1 &0 \\
 0 &0 \end{pmatrix}  \ \ &\text{if} \ \ i<4, \\
 \\
 \begin{pmatrix}0 &0 \\
 0 &1 \end{pmatrix}  \ \ &
 \text{if} \ \ i=4.
  \end{array}
 \right.
\]

Now, for $a,b\in\{0,1\}$ the logical AND operation is such that $a\wedge b=1$ if and only if $a=b=1$, and $a\wedge b=0$ otherwise. Moreover, 
\[
\ket{a\wedge b}\bra{a\wedge b}=\left\{
\begin{array}{rl}
\ket{0}\bra{0}=\begin{pmatrix}1 &0 \\
 0 &0 \end{pmatrix}  \ \ &\text{if} \ \ a\wedge b=0, \\
 \\
  \ket{1}\bra{1}=\begin{pmatrix}0 &0 \\
 0 &1 \end{pmatrix}  \ \ &\text{if} \ \ a\wedge b=1.
 \end{array}
 \right.
\]

Hence, the quantum operation defined in this manner does indeed compute the logical AND operation
\[
  \Phi(\rho_i)=\SUM{j=1}{k}A_j\rho_i A^\dagger_j=\ket{a\wedge b}\bra{a\wedge b},
\]
 where, as originally defined $\rho_i:=\ket{a,b}\bra{a,b}$ with $i=2a+b+1$ for $a,b\in\{0,1\}$.
 
This operation maps all basis states to pure states, but does it map all pure input states to pure output states?

 Consider the pure state $\ket{\psi}:=\frac{1}{\sqrt{2}}(\ket{0,0}+\ket{1,1})$, with density matrix given by
\[ \begin{align*}
 \rho:=\ket{\psi}\bra{\psi}&=\frac{1}{2}(\ket{0,0}+\ket{1,1})(\bra{0,0}+\bra{1,1})\\
 &=\frac{1}{2}(\ket{0,0}\bra{0,0}+\ket{0,0}\bra{1,1}+\ket{1,1}\bra{0,0}+\ket{1,1}\bra{1,1}) \\
 &=\frac{1}{2}\begin{pmatrix}
 1&0&0&1 \\
 0&0&0&0 \\
 0&0&0&0 \\
 1&0&0&1
 \end{pmatrix}.
 \end{align*} \]

Recall that $\Phi(\ket{0,0}\bra{0,0})=\ket{0}\bra{0}$ and $\Phi(\ket{1,1}\bra{1,1})=\ket{1}\bra{1}$. Moreover it can be seen that $\Phi(\ket{0,0}\bra{1,1})=\Phi(\ket{1,1}\bra{0,0})=\mathbf{0}_2$ (where $\mathbf{0}_2$ is the $2\times 2$ zero matrix) since $\Phi$ only acts nontrivially on  a matrix which has some nontrivial coefficient along its main diagonal.That is, for $a,b,c,d\in\{0,1\}$, $\Phi(\ket{a,b}\bra{c,d})\neq \mathbf{0}_2$ only if $a=c$ and $b=d$. Then by the linearity of the quantum operation $\Phi$,
\[ \begin{align*}
 \Phi(\rho)&=\frac{1}{2}(\Phi(\ket{0,0}\bra{0,0})+\Phi(\ket{0,0}\bra{1,1})+\Phi(\ket{1,1}\bra{0,0})+\Phi(\ket{1,1}\bra{1,1}))\\
 &=\frac{1}{2}(\ket{0}\bra{0}+\ket{1}\bra{1}) \\
 &=\frac{1}{2}\begin{pmatrix}1 &0 \\
 0 &0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix}0 &0 \\
 0 &1 \end{pmatrix} \\
 &=\frac{1}{2}\begin{pmatrix}1 &0 \\
 0 &1 \end{pmatrix}
 \end{align*}\]
 
  Thus, since $\Phi(\rho)$ in this case is a mixed state, but $\rho$ was a pure state, the quantum operation $\Phi$ does not always take pure input states to pure output states.

An analysis of the amplitude damping channel

  Let $p$ be an arbitrary real-valued parameter such that $0<p<1$. We will explore some nice properties of the one-qubit operation, commonly referred to as the amplitude damping channel, defined by the two Krauss operators
\[
  A_0=\begin{pmatrix}1&0\\0&\sqrt{1-p} \end{pmatrix} \ \ \ \text{and} \ \ \ A_1=\begin{pmatrix} 0 &\sqrt{p} \\ 0 & 0 \end{pmatrix} .
\]
  It is easy to verify that $A^\dagger_0A_0+A^\dagger_1A_1 = I$, so the operation $D_\rho$ that maps each $2 \times 2$ density matrix $\rho$ to
\[
  D_p(\rho)=A_0\rho A^\dagger_0 +A_1\rho A^\dagger_1
\]
  is indeed a valid quantum operation.
 
Lets find a state that is a fixed point of $D_p$, which is a one-qubit density matrix $\rho_0$ satisfying $D_p(\rho_0)=\rho_0$.
 
  Let $\rho_0=\begin{pmatrix} a&b\\c&d\end{pmatrix}$ such that $Tr(\rho_0)=a+b=1$ so that $\rho_0$ be such a density matrix. Then
 \[ \begin{align*}
  D_p(\rho_0)&= A_0\rho_0A^\dagger_0+A_1\rho_0A^\dagger_1 \\
  &=\begin{pmatrix} 1&0\\0&\sqrt{1-p} \end{pmatrix}\begin{pmatrix} a&b\\c&d\end{pmatrix}\begin{pmatrix} 1&0\\0&\sqrt{1-p}\end{pmatrix}+\begin{pmatrix} 0 &\sqrt{p} \\ 0 & 0 \end{pmatrix}\begin{pmatrix} a&b\\c&d\end{pmatrix}\begin{pmatrix} 0 &0 \\ \sqrt{p} & 0 \end{pmatrix} \\
  &=\begin{pmatrix} a & b\sqrt{1-p} \\ c\sqrt{1-p} & d(1-p) \end{pmatrix} + \begin{pmatrix} dp &0 \\0 & 0 \end{pmatrix} \\
  &= \begin{pmatrix}a+dp & b\sqrt{1-p} \\ c\sqrt{1-p}& d(1-p) \end{pmatrix} \\
  &= \begin{pmatrix} a&b\\c&d\end{pmatrix}.
  \end{align*}\]
 
  Equating the coefficients of the matrices in the last two lines gives
\[
  \left\{\begin{array}{rl}
  a=&a+dp, \\
 b=& b\sqrt{1-p}, \\
 c=& c\sqrt{1-p}, \\
  d=&d(1-p) \\
  \end{array}
  \right.
\]
  Since $0<p<1$, $1-p\neq 0,1$, and thus $b=c=d=0$. Furthermore, since $\rho_0$ is a density matrix, $Tr(\rho_0)=a+d=a=1$. Thus,
\[
   \rho_0=\begin{pmatrix} 1&0\\0&0\end{pmatrix}=\ket{0}\bra{0}
\]
   is the fixed point satisfying $D_p(\rho_0)=\rho_0$.
 
Now, lets show that the operation $D^{(2)}_p$, corresponding to applying $D_p$ twice in succession, that is $D^{(2)}_p(\rho)=D_p(D_p(\rho))$, is equivalent to applying $D_q$ once for some suitably chosen value $q$.
 
  Before proceeding consider the following identities involving the Krauss operators $A_0$ and $A_1$:
 \[ \begin{align*}
  A_0=A_0^\dagger, \ \ \ \ \
  & A_0^2=A_0^{\dagger2}=\begin{pmatrix}1&0\\0&1-p \end{pmatrix}, \ \ \ \ \  A_1^2=A_1^{\dagger2}=\begin{pmatrix}0&0\\0&0
   \end{pmatrix} \\
 &   A_0A_1=A_1,  \ \ \ \ \ A_1A_0=\begin{pmatrix}0&\sqrt{p^2-p}\\0&0\end{pmatrix}
  \end{align*}\]
 
  Then,
 \[ \begin{align*}
  D_p(D_p(\rho))&=A_0(A_0\rho A^\dagger_0 +A_1\rho A^\dagger_1) A^\dagger_0 +A_1(A_0\rho A^\dagger_0 +A_1\rho A^\dagger_1) A^\dagger_1 \\
  &=A_0A_0\rho A^\dagger_0A^\dagger_0 +A_0A_1\rho A^\dagger_1A^\dagger_0  +A_1A_0\rho A^\dagger_0A^\dagger_1 +A_1A_1\rho A^\dagger_1A^\dagger_1 \\
  &=A_0^2\rho A_0^{\dagger2}+(A_0A_1)\rho (A_0A_1)^\dagger +(A_1A_0)\rho (A_1A_0)^\dagger \\
  &=A_0^2\rho A_0^{\dagger2}+A_1\rho A_1^\dagger +(A_1A_0)\rho (A_1A_0)^\dagger
  \end{align*}\]
 
  Now,  observing that
  \[\begin{align*}
  A_0^2\rho A_0^{\dagger2}&=\begin{pmatrix}1&0\\0&1-p \end{pmatrix}\rho\begin{pmatrix}1&0\\0&1-p \end{pmatrix} \\
   A_1\rho A_1A_1^\dagger +(A_1A_0)\rho (A_1A_0)^\dagger&=\begin{pmatrix} 0 &\sqrt{2p-p^2} \\ 0 & 0 \end{pmatrix}\rho\begin{pmatrix} 0 &\sqrt{2p-p^2} \\ 0 & 0 \end{pmatrix},
  \end{align*}\]
  and by letting $q:=2p-p^2$ allows $D_p(D_p(\rho))$ to be expressed as
\[
  D_p(D_p(\rho))=\begin{pmatrix}1&0\\0&\sqrt{1-q} \end{pmatrix}\rho\begin{pmatrix}1&0\\0&\sqrt{1-q} \end{pmatrix}+\begin{pmatrix} 0 &\sqrt{q} \\ 0 & 0 \end{pmatrix}\rho\begin{pmatrix} 0 &0 \\ \sqrt{q} & 0 \end{pmatrix}=D_q(\rho).
\]
 
  Hence $  D_p(D_p(\rho))=D_q(\rho)$, with $q=\sqrt{2p-p^2}$.
 
Generalizing the case just analyzed, we can also define the operation $D^{(k)}_p$, corresponding to applying $D_p$ $k$ times in succession. What are the Krauss operators of $D^{(k)}_p$ with their matrix entries written as closed-form expressions in terms of $p$ and $k$? 
 
  Calculating the effect of applying $D_p(\rho)$ $k-$times in succession results in the expression
\[
  D_p^{(k)}(\rho)=A_0^kA_0^{k\dagger}+\SUM{n=1}{k}(A_1A_0^{n-1})\rho\SUM{n=1}{k}(A_1A_0^{n-1})^\dagger.
\]
 
  Let
\[
  B_0^{(k)}:=A_0^kA_0^{k\dagger}=\begin{pmatrix}1&0\\0&(1-p)^{k/2} \end{pmatrix} \ \ \text{and} \ \ B_1^{(k)}=\SUM{n=1}{k}(A_1A_0^{n-1})=\begin{pmatrix}a&b\\c&d\end{pmatrix}
\]
  be the two Krauss operators of $D_p^{(k)}$ (here, $B_1^{(k)}$ has been expressed as some matrix with unknown coefficients which are to be determined) so that
\[
  D_p^{(k)}(\rho)= B_0^{(k)}\rho B_0^{(k)\dagger}+ B_1^{(k)} \rho B_1^{(k)\dagger}.
\]
  Since $D_p$ itself is a valid quantum operation, compositions of $D_{p}$ will also always be another valid quantum operation. Thus, the two Krauss operators for $D_p^{(k)}$ must satisfy the relation $I=B_0^{(k)\dagger} B_0^{(k)}+ B_1^{(k)^\dagger}  B_1^{(k)}$. Thus. $B_1^{(k)}$ can be determined more precisely by the identity $ B_1^{(k)\dagger}B_1^{(k)}=I-B_0^{(k)^\dagger}B_0^{(k)}$. Since
\[
  B_0^{(k)^\dagger}B_0^{(k)}=\begin{pmatrix}1&0\\0&(1-p)^k\end{pmatrix} \ \ \text{and} \ \ B_1^{(k)\dagger}B_1^{(k)}= \begin{pmatrix}a^2+c^2&ab+cd\\ab+cd&b^2+d^2\end{pmatrix}
\]
 
then using the previous relation implies
\[
\begin{pmatrix}a^2+c^2&ab+cd\\ab+cd&b^2+d^2\end{pmatrix}
=\begin{pmatrix}0&0\\0&1-(1-p)^{k} \end{pmatrix}.
\]
 
  From this, it is seen that
\[  \begin{align*}
  a^2+c^2=0 &\implies a=c=0 \\
  b^2+d^2&=1-(1-p)^{k},
  \end{align*}\]
  but in the case when $k=1$, $B_1^{(1)}=A_1$ so that $b=\sqrt{p}$ and $b^2+d^2=1-(1-p)=p$ implying that $d=0$. Hence, in general
\[
  B_1^{(k)}=\begin{pmatrix}0&\sqrt{1-(1-p)^k}\\0&0\end{pmatrix} \ \ \text{and} \ \ B_0^{(k)}=\begin{pmatrix}1&0\\0&(1-p)^{k/2} \end{pmatrix}
\]
  are the Krauss operators of $D_p^{(k)}$.
 

Now, lets answer the following question: Is $lim_{k\to \infty}D^(k)_p(\rho)=\rho_0$ for any initial state $\rho$, where $\rho_0$ is the fixed point of $D_p$ that was calculated in part above?
 
  For $0<p<1$, it is always the case that $lim_{k\to\infty}(1-p)^k=0$. Therefore, in the limit,
\[
 lim_{k\to\infty}B_0^{(k)}=\begin{pmatrix}1&0\\0&0 \end{pmatrix} \ \ \text{and} \ \ lim_{k\to\infty}B_1^{(k)}=\begin{pmatrix}0&1\\0&0 \end{pmatrix}.
\]
 Let $\rho$ be an arbitrary density matrix expressed in the general form
\[
\rho=\begin{pmatrix}a&b\\c&d \end{pmatrix},
\]
where the constraint $a+d=1$ is imposed so that $Tr(\rho)=1$ as required for density operators by definition.
Then it follows that
\[\begin{align*}
lim_{k\to \infty}D^(k)_p(\rho)&=\begin{pmatrix}1&0\\0&0 \end{pmatrix}\rho\begin{pmatrix}1&0\\0&0 \end{pmatrix}+\begin{pmatrix}0&1\\0&0 \end{pmatrix}\rho\begin{pmatrix}0&0\\1&0 \end{pmatrix} =\begin{pmatrix}a+d&0\\0&0 \end{pmatrix}=\begin{pmatrix}1&0\\0&0 \end{pmatrix}=\rho_0,
\end{align*}\]
which is precisely equal to the fixed point calculated in above.
 

Converting from Stinespring form to Krauss form

Suppose that you are given a description of a quantum operation that takes an $n$-qubit state $\rho$ as input and produced an $n'$-qubit state $\sigma$ as output, where the description is of the following form (where $n+m=n'+m'$):
 
  1. Append $m$ quits in the state $\ket{0^m}$ to the end of the input state.
  2. Apply an $(n+m)$-qubit unitary operation $U$.
  3. Trace out the first $m'$ qubits (resulting in an $n'$-qubit output).

Lets show how to implement this in Krauss form as
\[
 \rho\mapsto \SUM{j\in S}{}A_j\rho A^\dagger_j, \ \ \ \text{where} \ \ \ \SUM{j\in S}{}A^\dagger_jA_j=I.
\]

For notational purposes, let $L(\mathcal{H})$ denote the space of linear operators on a some Hilbert space $\mathcal{H}$. Moreover, denote $\mathcal{H}^{2^n}$ as the Hilbert space of dimension $2^n$.

Consider positive integers $n,m,n',m', d$ such that $n+m=n'+m'=d$.

 Now, let $\rho\in L(\mathcal{H}_A^{2^n})$, $\ket{0}\bra{0}\in L(\mathcal{H}_B^{2^m})$ so that $\rho\otimes\ket{0}\bra{0}\in L(\mathcal{H}_A^{2^n}\otimes\mathcal{H}_B^{2^m})$.

  Also, let $\sigma\in L(\mathcal{H}_{B'}^{2^{n'}})$,
 and consider another Hilbert space $\mathcal{H}_{A'}^{2^{m'}}$ so that
\[
 \mathcal{H}_A^{2^n}\otimes\mathcal{H}_B^{2^m}\approx\mathcal{H}_{A'}^{2^{m'}}\otimes\mathcal{H}_{B'}^{2^{n'}}\approx\mathcal{H}^{2^d}.
\]
Then in regards to the larger ambient space $\mathcal{H}^{2^d}$, the previously mentioned operators can be expressed as $\rho\otimes I_{2^m}\in\mathcal{H}^{2^d}$ and $I_{2^n}\otimes\ket{0}\bra{0}\in L(\mathcal{H}^{2^d})$, where $I_{2^m}\in L(\mathcal{H}^{2^m})$ and $I_{2^n}\in L(\mathcal{H}^{2^n})$ are the identity operators acting on their respective spaces.

In this way, observe that
\[\begin{align*}
\rho\otimes\ket{0}\bra{0}=(\rho\otimes I_{2^m})(I_{2^n}\otimes\ket{0}\bra{0})=(I_{2^n}\otimes\ket{0})\rho(I_{2^n}\otimes\bra{0})
\end{align*}\]
This shows that the quantum operation of adding the ancilla $\ket{0}\bra{0}$ to the end of $\rho$ has its own Krauss operator as $(I_{2^n}\otimes\ket{0})$.

Let $\Phi(\rho)=\sigma$ be the quantum operation described in the problem statement, and let $U\in\mathcal{H}^{2^d}$ be the aforementioned unitary operation. Then the Stinespring representation of the operation is given by
\[
\Phi(\rho)=Tr_{A'}(U(\rho\otimes \ket{0}\bra{0})U^\dagger)=\sigma.
\]
The partial trace operation $Tr_{A'}$ over the register consisting of the space $\mathcal{H}_{A'}^{2^{m'}}$ is itself a quantum operation with $m'$ Krauss operators $B_j$ given by
\[
B_j=\bra{j}\otimes I_{2^{n'}},
\]
where $0\leq j\leq m'-1$ and the states $\ket{j}$ form the orthonormal basis of $\mathcal{H}_{A'}^{2^{m'}}$. Therefore, the action of the operation $\Phi$ can be more explicitly represented as
\[\begin{align*}
\Phi(\rho)&=Tr_{A'}(U(\rho\otimes \ket{0}\bra{0})U^\dagger) \\
&=\SUM{j=0}{m'-1}B_j(U(\rho\otimes \ket{0}\bra{0})U^\dagger)B_j^\dagger \\
&=\SUM{j=0}{m'-1}(\bra{j}\otimes I_{2^{n'}})U(\rho\otimes \ket{0}\bra{0})U^\dagger(\ket{j}\otimes I_{2^{n'}}) \\
&=\SUM{j=0}{m'-1}(\bra{j}\otimes I_{2^{n'}})U(I_{2^n}\otimes\ket{0})\rho(I_{2^n}\otimes\bra{0})U^\dagger(\ket{j}\otimes I_{2^{n'}})\\
&=\SUM{j=0}{m'-1}A_j\rho A_j^\dagger,\\
\end{align*}\]
where for $0\leq j\leq m'$,
\[
A_j:=(\bra{j}\otimes I_{2^{n'}})U(I_{2^n}\otimes\ket{0})
\]
 are matrices with dimensions $2^{n'}$-by-$2^{n}$. Now consider the following sum
\[ \begin{align*}
 \SUM{j=0}{m'-1}A_j^\dagger A_j&= \SUM{j=0}{m'-1}\left((\bra{j}\otimes I_{2^{n'}})U(I_{2^n}\otimes\ket{0})\right)^\dagger(\bra{j}\otimes I_{2^{n'}})U(I_{2^n}\otimes\ket{0}) \\
 &= \SUM{j=0}{m'-1}(I_{2^n}\otimes\bra{0})U^\dagger(\ket{j}\otimes I_{2^{n'}})(\bra{j}\otimes I_{2^{n'}})U(I_{2^n}\otimes\ket{0})\\
 &=\SUM{j=0}{m'-1}(I_{2^n}\otimes\bra{0})U^\dagger(\ket{j}\bra{j}\otimes I_{2^{n'}})U(I_{2^n}\otimes\ket{0}) \\
  &=(I_{2^n}\otimes\bra{0})U^\dagger\left(\SUM{j=0}{m'-1}(\ket{j}\bra{j}\otimes I_{2^{n'}})\right)U(I_{2^n}\otimes\ket{0})\\
    &=(I_{2^n}\otimes\bra{0})U^\dagger\left(I_{2^m}\otimes I_{2^{n'}}\right)U(I_{2^n}\otimes\ket{0})\\
&=(I_{2^n}\otimes\bra{0})U^\dagger I_{2^d}U(I_{2^n}\otimes\ket{0})\\
&=(I_{2^n}\otimes\bra{0})U^\dagger U(I_{2^n}\otimes\ket{0})\\
&=(I_{2^n}\otimes\bra{0})(I_{2^n}\otimes\ket{0})\\
&=I_{2^n}\otimes\ip{0}{0}\\
&=I_{2^n}.
 \end{align*}\]

These $m'$ matrices $A_j$ are therefore the Krauss operators that define an operation equivalent to $\Phi$ which was originally given in Stinespring form. That is,
\[
\Phi(\rho)=\SUM{j=0}{m'-1}A_j\rho A_j^\dagger=\sigma.
\]


The density matrix is in the eye of the beholder

Consider the following scenario. Alice first flips a biased coin that has outcome $0$ with probability $\cos^2(\pi/8)$ and $1$ with probability $\sin^2(\pi/8)$. If the coin value is $0$ she creates the state $\ket{0}$ and if the coin value is $1$ she creates the state $\ket{1}$. Then Alice sends the state that she created to Bob, but does not send the coin value.

From Alice's perspective (who knows the coin value), the density matrix of the state she created will be either
\[
\rho_A^0=\ket{0}\bra{0}=\begin{pmatrix} 1 &0 \\ 0 & 0 \end{pmatrix} \ \ \ \text{or} \ \ \ \rho_A^1=\ket{1}\bra{1}=\begin{pmatrix} 0 &0 \\ 0 & 1 \end{pmatrix}.
\]
 What is the density matrix of the state from Bob's perspective (who does not know the coin value)?

 Assuming that Bob at least knows that the biased coin results in outcome $0$ with probability $\cos^2(\pi/8)$ and $1$ with probability $\sin^2(\pi/8)$, and also that Alice sends him either the state $\ket{0}\bra{0}$ or the state $\ket{1}\bra{1}$ depending on whether or not the coin value was $0$ or $1$, respectively, then Bob's density matrix $\rho_B$ will be given as the statistical mixture of the two cases:
\[\begin{align*}
 \rho_B=&\cos^2(\pi/8)\ket{0}\bra{0}+\sin^2(\pi/8)\ket{0}\bra{0} \\
 =&\cos^2(\pi/8)\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}+\sin^2(\pi/8)\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}  \\
 =&\begin{pmatrix} \cos^2(\pi/8) & 0 \\ 0 &\sin^2(\pi/8) \end{pmatrix}.
\end{align*}\]

Suppose that, upon receiving the state from Alice, Bob measures it in the computational basis. The measurement process yields a classical bit and an output state $\ket{0}$ or $\ket{1}$. Will Bob's density matrix for the state (with Bob knowing the classical measurement outcome) be the same as Alice's?

 When Bob measures his density matrix $\rho_B$ the resulting density matrix will either be
\[ \begin{align*}
 \rho_B^0=\ket{0}\bra{0} = \rho_A^0 \ \ \ &\text{if Alice flipped} \  0, \\
 \rho_B^1=\ket{1}\bra{1} = \rho_A^1 \ \ \ &\text{if Alice flipped}  \ 1.
 \end{align*}\]

 In either case, Bob's resulting density matrix, $\rho_B^0$ or $\rho_B^1$, after the measurement is the same as Alice's density matrix, $\rho_B^0$ or $\rho_B^1$, depending on the coin value Alice flipped in the original state's construction.

 Suppose that we modify the above scenario to one where Alice flips a fair coin (where outcomes $0$ and $1$ occur with probability $1/2$ each) and if the coin value is $0$ she creates the state $\ket{\psi_0}=\cos(\pi/8)\ket{0}+\sin(\pi/8)\ket{1}$ and if the coin value is $1$ she creates the state $\ket{\psi_1}=\cos(\pi/8)\ket{0}+\sin(\pi/8)\ket{1}$. Alice send the state (but not the coin value) to Bob.

From Alice's perspective (who knows the coin value), the density matrix of the state she created will be either $\rho_A^0= \ket{\psi_0}\bra{\psi_0}$ or  $\rho_A^1\ket{\psi_1}\bra{\psi_1}$ , where \[\begin{align*}
\rho_A^0=&(\cos(\pi/8)\ket{0}+\sin(\pi/8)\ket{1})(\cos(\pi/8)\bra{0}+\sin(\pi/8)\bra{1}) \\
 =&\cos^2(\pi/8)\ket{0}\bra{0}+\cos(\pi/8)\sin(\pi/8)\ket{0}\bra{1}+\cos(\pi/8)\sin(\pi/8)\ket{1}\bra{0}+\sin^2(\pi/8)\ket{1}\bra{1} \\
 =&\cos^2(\pi/8)\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}+\cos(\pi/8)\sin(\pi/8)\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}+ \cos(\pi/8)\sin(\pi/8)\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}+\sin^2(\pi/8)\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}  \\
 =&\begin{pmatrix} \cos^2(\pi/8) & \cos(\pi/8)\sin(\pi/8) \\ \cos(\pi/8)\sin(\pi/8) &\sin^2(\pi/8)\end{pmatrix},
\end{align*}\]
 or
 \[ \begin{align*}
\rho_A^1=&(\cos(\pi/8)\ket{0}-\sin(\pi/8)\ket{1})(\cos(\pi/8)\bra{0}-\sin(\pi/8)\bra{1}) \\
 =&\cos^2(\pi/8)\ket{0}\bra{0}-\cos(\pi/8)\sin(\pi/8)\ket{0}\bra{1}-\cos(\pi/8)\sin(\pi/8)\ket{1}\bra{0}+\sin^2(\pi/8)\ket{1}\bra{1} \\
 =&\cos^2(\pi/8)\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}-\cos(\pi/8)\sin(\pi/8)\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}- \cos(\pi/8)\sin(\pi/8)\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}+\sin^2(\pi/8)\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}  \\
 =&\begin{pmatrix} \cos^2(\pi/8) & -\cos(\pi/8)\sin(\pi/8) \\ -\cos(\pi/8)\sin(\pi/8) &\sin^2(\pi/8)\end{pmatrix}.
\end{align*}\]

 Then the density matrix of the state from Bob's perspective $\rho_B$ (who does not know the coin value) will be the statistical mixture
\[ \begin{align*}
 \rho_B=&\frac{1}{2}\rho_A^0+\frac{1}{2}\rho_A^1 \\
 =&\frac{1}{2}\begin{pmatrix} \cos^2(\pi/8) & \cos(\pi/8)\sin(\pi/8) \\ \cos(\pi/8)\sin(\pi/8) &\sin^2(\pi/8)\end{pmatrix}+\frac{1}{2}\begin{pmatrix} \cos^2(\pi/8) & -\cos(\pi/8)\sin(\pi/8) \\ -\cos(\pi/8)\sin(\pi/8) &\sin^2(\pi/8)\end{pmatrix} \\
 =&\begin{pmatrix} \cos^2(\pi/8) & 0 \\ 0 &\sin^2(\pi/8)\end{pmatrix}.
 \end{align*}\]

Suppose that, upon receiving the state from Alice, Bob measures it in the computational basis, yielding a classical bit and an output state $\ket{0}$ or $\ket{1}$. Bob knows the classical bit outcome from his measurement, but does not reveal this to Alice. Will Bob's density matrix for the output state be the same as Alice's.

 If Bob measures his state $\rho_B$, the resulting state will be either
\[ \begin{align*}
 \rho_B^0=\ket{0}\bra{0}, \ \ \ &\text{with probability} \ \cos^2(\pi/8), \\
& \text{or} \\
 \rho_B^1=\ket{1}\bra{1}, \ \ \ &\text{with probability} \ \sin^2(\pi/8).
 \end{align*}\]

 However, if Alice does not learn of the resulting state produced by Bob's  measurement, than her lack of knowledge yields the density matrix $\rho'_A$ consisting of the statistical mixture given by

 \[\begin{align*}
 \rho'_A=&\cos^2(\pi/8)\ket{0}\bra{0} +\sin^2(\pi/8)\ket{1}\bra{1} \\
 &=\begin{pmatrix} \cos^2(\pi/8) & 0 \\ 0 &\sin^2(\pi/8)\end{pmatrix}.
 \end{align*}\]
  Thus, Bob's density matrix is not the same as Alice's regardless of the measurement outcome: $\rho_B^0\neq \rho'_A$ and $\rho_B^1\neq \rho'_A$.

The trace distance between pure states

In this post, the relationship between two distance measures on density operators will be analyzed for a few specific cases. Namely, the trace distance and the Euclidean distance of two pure states will be studied.

The trace norm is defined as $||M||_{Tr}=Tr(\sqrt{M^\dagger M})$, and the trace distance between two density operators $\rho$ and $\sigma$ is given by $||\rho-\sigma||_{Tr}$.

Let's calculate an expression for the trace distance between $\ket{0}$ and $\cos(\theta)\ket{0}+\sin(\theta)\ket{1}$ as a function of $\theta$.

Let
\[
\rho=\ket{0}\bra{0}=\begin{pmatrix} 1&0\\0&0\end{pmatrix}
\]
 and
\[ \begin{align*}
 \sigma&=(\cos(\theta)\ket{0}+\sin(\theta)\ket{1})(\cos(\theta)\bra{0}+\sin(\theta)\bra{1})\\
 \sigma&=\cos^2(\theta)\ket{0}\bra{0}+\cos(\theta)\sin(\theta)\ket{0}\bra{1}+\cos(\theta)\sin(\theta)\ket{1}\bra{0}+\sin^2(\theta)\ket{1}\bra{1} \\
 \sigma&=\begin{pmatrix}\cos^2(\theta)&\cos(\theta)\sin(\theta)\\ \cos(\theta)\sin(\theta)&\sin^2(\theta)
 \end{pmatrix}.
 \end{align*}\]

In this case, since both $\rho$ and $\sigma$ are density operators, they are Hermitian. Thus, $(\rho-\sigma)^\dagger=(\rho-\sigma)$  implying that $(\rho-\sigma)^\dagger(\rho-\sigma)=(\rho-\sigma)^2$. Then
\[
\rho-\sigma=\begin{pmatrix}1-\cos^2(\theta)&-\cos(\theta)\sin(\theta)\\ -\cos(\theta)\sin(\theta)&-\sin^2(\theta)\end{pmatrix}=\begin{pmatrix}\sin^2(\theta)&-\cos(\theta)\sin(\theta)\\ -\cos(\theta)\sin(\theta)&-\sin^2(\theta)\end{pmatrix},
\]
from which it follows that
\[\begin{align*}
(\rho-\sigma)^2&=\begin{pmatrix}\sin^2(\theta)&-\cos(\theta)\sin(\theta)\\ -\cos(\theta)\sin(\theta)&-\sin^2(\theta)\end{pmatrix}\begin{pmatrix}\sin^2(\theta)&-\cos(\theta)\sin(\theta)\\ -\cos(\theta)\sin(\theta)&-\sin^2(\theta)\end{pmatrix}\\
&=\begin{pmatrix}\sin^4(\theta)+\cos^2(\theta)\sin^2(\theta)&-\sin^2(\theta)\cos(\theta)\sin(\theta)+\sin^2(\theta)cos(\theta)\sin(\theta) \\ -\sin^2(\theta)\cos(\theta)\sin(\theta) +\sin^2(\theta)cos(\theta)\sin(\theta)&\sin^4(\theta)+\cos^2(\theta)\sin^2(\theta)\end{pmatrix} \\
&=\begin{pmatrix}
\sin^2(\theta)&0 \\
0&\sin^2(\theta).
\end{pmatrix}
\end{align*}\]
 Thus,
\[
 \sqrt{(\rho-\sigma)^2}=\begin{pmatrix}
\sqrt{\sin^2(\theta)}&0 \\
0&\sqrt{\sin^2(\theta)}\end{pmatrix}
=\begin{pmatrix}
|\sin(\theta)|&0 \\
0&|\sin(\theta)|\end{pmatrix},
\]
and therefore
\[
||(\rho-\sigma)||_{Tr}=Tr\begin{pmatrix}
|\sin(\theta)|&0 \\
0&|\sin(\theta)|\end{pmatrix}=2|\sin(\theta)|.\]

As another example, lets now calculate an expression for the Euclidean distance between the two points in the Bloch sphere that correspond to the pure states $\ket{0}$ and $\cos(\theta)\ket{0}+\sin(\theta)\ket{1}$.

In the Bloch sphere representation, a general density matrix can be expressed as
\[
\frac{1}{2}(I+c_xX+c_yY+c_zZ),
\]
where $X$, $Y$, $Z$ are the Pauli matrices, and the vector $(c_x,c_y,c_z)$ gives the coordinates of the density matrix on the Bloch sphere. In such a representation, $\rho$ and $\sigma$ can be expressed as follows:
\[\begin{align*}
\rho=&\frac{1}{2}(I+Z),\\
\sigma=&\frac{1}{2}(I+2\cos(\theta)\sin(\theta)X+\cos(2\theta)Z),
\end{align*}\]
so that the
coordinate vectors for the Block sphere representations of $\rho$ and $\sigma$ are given by, respectively,
\[\begin{align*}
v_{\rho}=&(0,0,1) \\
v_{\sigma}=&(2\cos(\theta)\sin(\theta),0,\cos(2\theta)).
\end{align*}\]
 Then the Euclidean distance between $\rho$ and $\sigma$ on the Block sphere is given by
 \[\begin{align*}
 ||v_\rho-v_\sigma||_2&=\sqrt{(-2\cos(\theta)\sin(\theta))^2+(1-\cos(2\theta))^2}\\
 &=\sqrt{4\cos^2(\theta)\sin^2(\theta)+4\sin^4(\theta)} \\
 &=\sqrt{4\sin^2(\theta)(\cos^2(\theta)+\sin^2(\theta))}\\
 &=\sqrt{4\sin^2(\theta)}\\
 &=2|\sin(\theta)|.
 \end{align*}\]

 Hence, $||(\rho-\sigma)||_{Tr}=||v_\rho-v_\sigma||_2=2|\sin(\theta)|$, and the two distance measures agree.

Now we'll repeat the calculations done above for these two states: $\rho=\begin{pmatrix} 1/2 &0 \\ 0 & 1/2 \end{pmatrix}$ and the pure state $\cos(\theta)\ket{0}+\sin(\theta)\ket{1}$ as a function of $\theta$.

 As calculatedabove, the density matrix corresponding to the state $\cos(\theta)\ket{0}+\sin(\theta)\ket{1}$ is given by
\[
 \sigma:=\begin{pmatrix}\cos^2(\theta)&\cos(\theta)\sin(\theta)\\ \cos(\theta)\sin(\theta)&\sin^2(\theta)
 \end{pmatrix}
\]

 Calculating the relevant matrices in order to determine the trace distance $||(\rho-\sigma)||_{Tr}=Tr(\sqrt{(\rho-\sigma)^2})$ yields

 \[\begin{align*}
 \rho-\sigma=\begin{pmatrix}\frac{1}{2}-\cos^2(\theta)&\cos(\theta)\sin(\theta)\\ \cos(\theta)\sin(\theta)&\frac{1}{2}-\sin^2(\theta)
 \end{pmatrix},
 \end{align*}\]
 and after some further calculation
\[ \begin{align*}
 (\rho-\sigma)^2&=\begin{pmatrix}\frac{1}{2}-\cos^2(\theta)&\cos(\theta)\sin(\theta)\\ \cos(\theta)\sin(\theta)&\frac{1}{2}-\sin^2(\theta)
 \end{pmatrix}\begin{pmatrix}\frac{1}{2}-\cos^2(\theta)&\cos(\theta)\sin(\theta)\\ \cos(\theta)\sin(\theta)&\frac{1}{2}-\sin^2(\theta)
 \end{pmatrix}\\
 &=\begin{pmatrix}
 \frac{1}{4}&0\\
 0&\frac{1}{4}
 \end{pmatrix},
 \end{align*}\]
 which implies that
\[
 \sqrt{ (\rho-\sigma)^2}=\begin{pmatrix}
\sqrt{ \frac{1}{4}}&0\\
 0&\sqrt{\frac{1}{4}}
 \end{pmatrix}=\begin{pmatrix}
\frac{1}{2}&0\\
 0&\frac{1}{2}
 \end{pmatrix}
\]
 Therefore,
\[
 ||(\rho-\sigma)||_{Tr}=Tr(\sqrt{(\rho-\sigma)^2})=Tr\begin{pmatrix}
\frac{1}{2}&0\\
 0&\frac{1}{2}
 \end{pmatrix}=1
\]
 is the trace distance between the two state $\rho$ and $\sigma$.

 Now, to calculate the Euclidean distance between the two states in the Block sphere representation observe that

\[ \begin{align*}
\rho=&\frac{1}{2}I,\\
\sigma=&\frac{1}{2}(I+2\cos(\theta)\sin(\theta)X+\cos(2\theta)Z),
\end{align*}\]
so that the
coordinate vectors for the Block sphere representations of $\rho$ and $\sigma$ are given by, respectively,
\[\begin{align*}
v_{\rho}=&(0,0,0) \\
v_{\sigma}=&(2\cos(\theta)\sin(\theta),0,\cos(2\theta)).
\end{align*}\]

 The Euclidean distance is then given by
 \[ \begin{align*}
 ||v_\rho-v_\sigma||_2&=\sqrt{(-2\cos(\theta)\sin(\theta))^2+(-\cos(2\theta))^2}\\
 &=\sqrt{4\cos^2(\theta)\sin^2(\theta)+\cos^2(2\theta)} \\
 &=\sqrt{4\cos^2(\theta)\sin^2(\theta)+(\cos^2(\theta)-\sin^2(\theta))^2}\\
 &=\sqrt{2\cos^2(\theta)\sin^2(\theta)+\cos^4(\theta)+\sin^4(\theta))^2}\\
&=\sqrt{2\cos^2(\theta)\sin^2(\theta)+\cos^4(\theta)+\sin^4(\theta))^2}\\
 &=\sqrt{(\cos^2(\theta)+\sin^2(\theta))^2}\\
 &=\sqrt{1^2}\\
 &=1,
 \end{align*}\]
 and so the trace distance of $\rho$ and $\sigma$ agrees with their Euclidean distance on the Block sphere since $||(\rho-\sigma)||_{Tr}=||v_\rho-v_\sigma||_2=1$.

Inputting superpositions into unitary gates

In what follows, we will consider a few different scenarios where some unitary operator is acted on a superposition of states, and a subsequent measurement is made.

Let $U$ be any $n$-qubit unitary, $\ket{\psi_1}$ and $\ket{\psi_2}$ be orthogonal $n$-qubit states, and $a_1, a_2\in\{0,1\}^n$ such that the following property holds. For each $j\in\{1,2\}$, if $U\ket{\psi_j}$ is measured in the computational basis then the outcome is the  computational basis state  $\ket{a_j}$ with probability $1$. Let $\alpha_1, \alpha_2\in\mathbb{C}$ be such that $|\alpha_1|^2+|\alpha_2|^2=1$.

 These assumptions imply the following constraints on the states $U\ket{\psi_1}$ and
\[
 (\ket{a_1}\bra{a_1}+\ket{a_2}\bra{a_2})U\ket{\psi_j}=\ket{a_j}.
\]
 Therefore, it must be the case that
\[
 U\ket{\psi_1}=\ket{a_1} \ \ \ \text{and} \ \ \ U\ket{\psi_2}=\ket{a_2}.
\]

 Now consider the state $\ket{\psi}=\alpha_1\ket{\psi_1}+\alpha_2\ket{\psi_2}$. Then
 \[\begin{align*}
U \ket{\psi}&=\alpha_1U\ket{\psi_1}+\alpha_2U\ket{\psi_2} \\
&=\alpha_1\ket{a_1}+\alpha_2\ket{a_2}.
 \end{align*}\]
So if $U\ket{\psi}$ is measured in the computational basis, the state $\ket{a_1}$ will be observed with probability
\[\begin{align*}
|\bra{a_1}U \ket{\psi}|^2&=|\alpha_1U\bra{a_1}\ket{\psi_1}+\alpha_2\bra{a_1}U\ket{\psi_2}|^2 \\
&=|\alpha_1\bra{a_1}\ket{a_1}+\alpha_2\bra{a_1}\ket{a_2}|^2 \\
&=|\alpha_1|^2. \\
\end{align*}\]
Likewise, if $U\ket{\psi}$ is measured in the computational basis, the state $\ket{a_2}$ will be observed with probability
\[\begin{align*}
|\bra{a_2}U \ket{\psi}|^2&=|\alpha_1\bra{a_2}U\ket{\psi_1}+\alpha_2\bra{a_2}U\ket{\psi_2}|^2 \\
&=|\alpha_1\bra{a_2}\ket{a_1}+\alpha_2\bra{a_2}\ket{a_2}|^2 \\
&=|\alpha_2|^2.
\end{align*}\]

Now consider a slightly modified scenario. Let $U$ be any $n$-qubit unitary, $\ket{\psi_1}$ and $\ket{\psi_2}$ be orthogonal $n$-qubit states, and $a_1, b_1, a_2, b_2\in\{0,1\}^n$ such that the following property holds. For each $j\in\{1,2\}$, if $U\ket{\psi_j}$ is measured in the computational basis then the outcome is the  computational basis state  the outcome is
\[
\left\{ \begin{array}{ll}
a_j& \ \text{with probability} \ p_j \\
b_j& \ \text{with probability} \ q_j
\end{array},
\right.
\]
where $p_k+q_k=1$. Let $\alpha_1, \alpha_2\in\mathbb{C}$ be such that $|\alpha_1|^2+|\alpha_2|^2=1$.

Here it will be shown by counterexample that it does not necessarily follow that if $U(\alpha_1\ket{\psi_1}+\alpha_2\ket{\psi_2})$ is measured in the computational basis then the outcome is
\[
\left\{
\begin{array}{ll}
a_1& \ \text{with probability} \ p_1|\alpha_1|^2 \\
b_1& \ \text{with probability} \ q_1|\alpha_1|^2 \\
a_2& \ \text{with probability} \ p_2|\alpha_2|^2 \\
b_2& \ \text{with probability} \ q_2|\alpha_2|^2.
\end{array}
\right.
\]

A counter example will now be constructed. Let $\ket{\psi_1}=\ket{+}=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$ and $\ket{\psi_2}=\ket{-}=\frac{1}{\sqrt{2}}(\ket{0}-\ket{1})$. Then $\ket{\psi_1}$ and $\ket{\psi_2}$ are orthogonal since
\[
\ip{\psi_1}{\psi_2}=\frac{1}{2}(\bra{0}+\bra{1})(\ket{0}-\ket{1})=\frac{1}{2}(\ip{0}{0}-\ip{1}{1})-\frac{1}{2}(1-1)=0.
\]
 Consider the unitary operator (Pauli-$z$) defined as  $Z\ket{0}=\ket{0}$ and $Z\ket{1}=-\ket{1}$. Consequently,
 \[ \begin{align*}
Z\ket{\psi_1}={\sqrt{2}}Z(\ket{0}+\ket{1})={\sqrt{2}}(Z\ket{0}+Z\ket{1})={\sqrt{2}}(\ket{0}-\ket{1})=\ket{\psi_2}, \\
Z\ket{\psi_2}={\sqrt{2}}Z(\ket{0}-\ket{1})={\sqrt{2}}(Z\ket{0}-Z\ket{1})={\sqrt{2}}(\ket{0}+\ket{1})=\ket{\psi_1}.
 \end{align*}\]
 Let $a_1=a_2=0$ and $b_1=b_2=1$, with $p_j=q_j=1/2$ so that $p_j+q_j=1$. Then measuring in the computational basis it is seen that
 \[\begin{align*}
 |\bra{a_1}Z\ket{\psi_1}|^2 &= |\ip{0}{\psi_2}|^2=\left|\frac{1}{\sqrt{2}}\bra{0}(\ket{0}-\ket{1})\right|^2=\frac{1}{2}|\ip{0}{0}-\ip{0}{1}|^2=\frac{1}{2}|1|^2=\frac{1}{2}=p_1 \\
  |\bra{a_2}Z\ket{\psi_2}|^2 &= |\ip{0}{\psi_1}|^2=\left|\frac{1}{\sqrt{2}}\bra{0}(\ket{0}+\ket{1})\right|^2=\frac{1}{2}|\ip{0}{0}+\ip{0}{1}|^2=\frac{1}{2}|1|^2=\frac{1}{2}=p_2 \\
   |\bra{b_1}Z\ket{\psi_1}|^2 &= |\ip{1}{\psi_2}|^2=\left|\frac{1}{\sqrt{2}}\bra{1}(\ket{0}-\ket{1})\right|^2=\frac{1}{2}|\ip{1}{0}-\ip{1}{1}|^2=\frac{1}{2}|-1|^2=\frac{1}{2}=q_1 \\
  |\bra{b_2}Z\ket{\psi_2}|^2 &= |\ip{1}{\psi_1}|^2=\left|\frac{1}{\sqrt{2}}\bra{1}(\ket{0}+\ket{1})\right|^2=\frac{1}{2}|\ip{1}{0}+\ip{1}{1}|^2=\frac{1}{2}|1|^2=\frac{1}{2}=q_2
 \end{align*}\]
 It has just been shown that this construction satisfies all the assumptions of the problem statement.

 Now let $\alpha_1=\alpha_2=\frac{1}{\sqrt{2}}$ be such that $|\alpha_1|^2+|\alpha_2|^2=1$, and consider the state
\[
  \ket{\psi}=\alpha_1\ket{\psi_1}+\alpha_2\ket{\psi_2}=\frac{1}{2}(\ket{0}+\ket{1})+\frac{1}{2}(\ket{0}-\ket{1})=\ket{0}.
\]
 Then $Z\ket{\psi}=Z\ket{0}$, which implies that
\[
 \bra{a_1}Z\ket{\psi}=\ip{0}{0}=1.
\]
 Therefore, the state is observed with probability $1$ and not with probability $p_1|\alpha_1|^2=\frac{1}{2}|\frac{1}{\sqrt{2}}|^2=\frac{1}{4}$ as it is falsely claimed. This discrepancy is sufficient to  say that the statement cannot hold in general.


Now, one final scenario will be analyzed. Let $W$ denote a generalized $n$-qubit controlled-$U$ gate (i.e. for all $x,y\in \{0,1\}^n$, $W\ket{x}\ket{y}=\ket{x}U^x\ket{y}$ and let $\ket{\psi_1}, \ket{\psi_2}$ be two orthogonal eigenvectors of $U$. Let $V$ be any $n$-qubit unitary. Also, let $\ket{\phi}$ ne any $n$-qubit initial state for the control qubits of $W$. Suppose that the following property holds. For each $j\in\{1,2\}$, if the first register of $(V\otimes I)W\ket{\phi}\ket{\psi_j}$ is measured in the computational basis then the outcome is $a_j$ with probability $p_j$, and $b_j$ with probability $q_j$. 

 Express $\ket{\phi}$ in the general form
\[
 \ket{\phi}=\SUM{x=0}{2^n-1}\beta_x\ket{x} \ \text{with} \ \SUM{x=0}{2^n-1}|\beta_x|^2=1,
\]
 and let
\[
 U\ket{\psi_j}=e_j\ket{\psi_j} \ \text{which implies} \  U^x\ket{\psi_j}=e_j^x\ket{\psi_j},
\]
 where $e_j$ is the eigenvalue of the state $\ket{\psi}$.

 Then
\[ \begin{align*}
 (V\otimes I)W\ket{\phi}\ket{\psi_j}&=(V\otimes I)W\SUM{x=0}{2^n-1}\beta_x\ket{x}\ket{\psi_j} \\
 &=(V\otimes I)\SUM{x=0}{2^n-1}W\beta_x\ket{x}\ket{\psi_j} \\
 &=\SUM{x=0}{2^n-1}\beta_xV\ket{x}U^x\ket{\psi_j} \\
  &=\SUM{x=0}{2^n-1}\beta_xV\ket{x}e^x_j\ket{\psi_j} \\
    &=\SUM{x=0}{2^n-1}\beta_xe^x_jV\ket{x}\ket{\psi_j} .
 \end{align*}\]
 Now let
\[
 \ket{S_j}:=\SUM{x=0}{2^n-1}\beta_xe^x_jV\ket{x},
\]
 and consider measuring subject to the constraints imposed by the assumptions described above:
\[
 (\ket{a_j}\bra{a_j}\otimes I)\ket{S_j}=\delta_j\ket{a_j} \\
 (\ket{b_j}\bra{b_j}\otimes I)\ket{S_j}=\gamma_j\ket{b_j},
\]
 where $\delta_j,\gamma_j\in \mathbb{C}$ such that $|\delta_j|^2=p_j$ and $|\gamma_j|^2=q_j$. Hence the states $\ket{S_j}$ are given as
\[
 \ket{S_j}=\delta_j\ket{a_j}+\gamma_j\ket{b_j}.
\]

 Now consider the following:
 \[ \begin{align*}
 (V\otimes I)W\ket{\phi}(\alpha_1\ket{\psi_1}_+\alpha_2\ket{\psi_j})&=(V\otimes I)W\SUM{x=0}{2^n-1}\beta_x\ket{x}(\alpha_1\ket{\psi_1}_+\alpha_2\ket{\psi_j})\\
 &=(V\otimes I)\SUM{x=0}{2^n-1}W\beta_x\ket{x}(\alpha_1\ket{\psi_1}_+\alpha_2\ket{\psi_j}) \\
 &=\SUM{x=0}{2^n-1}\beta_xV\ket{x}U^x(\alpha_1\ket{\psi_1}_+\alpha_2\ket{\psi_j}) \\
  &=\SUM{x=0}{2^n-1}\beta_xV\ket{x}(\alpha_1U^x\ket{\psi_1}_+\alpha_2U^x\ket{\psi_j}) \\
  &=\SUM{x=0}{2^n-1}\beta_xV\ket{x}(\alpha_1e_1^x\ket{\psi_1}_+\alpha_2e_2^x\ket{\psi_j}) \\
    &=\alpha_1\SUM{x=0}{2^n-1}\beta_xe^x_1V\ket{x}\ket{\psi_1} +\alpha_2\SUM{x=0}{2^n-1}\beta_xe^x_2V\ket{x}\ket{\psi_2}\\
        &=\alpha_1\ket{S_1}\ket{\psi_1} +\alpha_2\ket{S_2}\ket{\psi_2} \\
        &=\alpha_1( \delta_1\ket{a_1}+\gamma_1\ket{b_1})\ket{\psi_1} +\alpha_2( \delta_2\ket{a_2}+\gamma_2\ket{b_2})\ket{\psi_2} \\
        &=:\ket{\Omega}.
 \end{align*}\]
 Now consider the following measurement outcomes for the state $\ket{\Omega}$:
\[ \begin{align*}
 &(\ket{a_j}\bra{a_j}\otimes I)\ket{\Omega}=\\
 &\alpha_1( \delta_1\ket{a_j}\bra{a_j}\ket{a_1}+\gamma_1\ket{a_j}\bra{a_j}\ket{b_1})\ket{\psi_1} +\alpha_2( \delta_2\ket{a_j}\bra{a_j}\ket{a_2}+\gamma_2\ket{a_j}\bra{a_j}\ket{b_2})\ket{\psi_2}=\alpha_j\delta_j\ket{a_j}\ket{\psi_j}\\
 \\
  &(\ket{b_j}\bra{b_j}\otimes I)\ket{\Omega}=\\
 &\alpha_1( \delta_1\ket{b_j}\bra{b_j}\ket{a_1}+\gamma_1\ket{b_j}\bra{b_j}\ket{b_1})\ket{\psi_1} +\alpha_2( \delta_2\ket{b_j}\bra{b_j}\ket{a_2}+\gamma_2\ket{b_j}\bra{b_j}\ket{b_2})\ket{\psi_2}=\alpha_j\gamma_j\ket{b_j}\ket{\psi_j}
 \end{align*}\]
 Recalling that $|\delta_j|^2=p_j$ and $|\gamma_j|^2=q_j$ as shown above, this yields the probability distribution given as
\[
 \left\{
 \begin{array}{ll}
 a_1 \ \text{with probability} \ p_1|\alpha_1|^2 \\
 b_1 \ \text{with probability} \ q_1|\alpha_1|^2 \\
 a_2 \ \text{with probability} \ p_2|\alpha_2|^2 \\
 b_1 \ \text{with probability} \ q_2|\alpha_2|^2.
 \end{array}
 \right.
\]

Period inversion

Let $p$ and $q$ be integers greater than $1$, and let $pq$ denote their product. Recall that the quantum Fourier transform modulo $pq$ is the $pq$-dimensional unitary operation defined as
\[
F_{pq}\ket{x}=\frac{1}{\sqrt{pq}}\SUM{y=0}{pq-1}\omega_{pq}^{xy}\ket{y},
\]
for each $x\in\mathbb{Z}_{pq}$, where $\omega_{N}=e^{i2\pi/N}$.

Define two quantum states $\ket{\psi_1}$ and $\ket{\psi_2}$ as
\[
\ket{\psi_1}=\frac{1}{\sqrt{q}}\SUM{x=0}{q-1}\ket{xp} \ \ \ \text{and} \ \ \ \ket{\psi_2}=\frac{1}{\sqrt{p}}\SUM{x=0}{p-1}\ket{xq}.
\]

Then
\[\begin{align*}
F_{pq}\ket{\psi_1}&=\frac{1}{q\sqrt{p}}\SUM{x=0}{q-1}\SUM{y=0}{pq-1}\omega_{pq}^{xpy}\ket{y} \\
&=\frac{1}{q\sqrt{p}}\SUM{y=0}{pq-1}\left(\SUM{x=0}{q-1}\omega_{q}^{xy})\right)\ket{y} \\
&=\frac{1}{q\sqrt{p}}\SUM{n=0}{p-1}\left(\SUM{x=0}{q-1}1\right)\ket{nq} \\
&=\frac{1}{q\sqrt{p}}\SUM{n=0}{p-1}q\ket{nq} \\
&=\frac{1}{\sqrt{p}}\SUM{n=0}{p-1}\ket{nq} \\
&=\ket{\psi_1},
\end{align*}\]
where the third and fourth lines follows from the general fact that
\[\SUM{x=0}{N-1}\omega_{N}^{xy}=\left\{
\begin{array}{ll} N \  &\text{if} \ y=0 (\text{mod} \ N) \\
0 &\text{if} \ y\neq 0 (\text{mod} \ N)
\end{array}
\right.,
\]
and that $y=0 (\text{mod} \ q)$ implies that $y=nq$ for some integer $n$.

Let $s=\{0,1,\dots, p-1\}$, and define $\ket{\psi_3}$ (a ``shifted" version of $\ket{\psi_1}$) as
\[\ket{\psi_3}=\frac{1}{\sqrt{q}}\SUM{x=0}{q-1}\ket{s+xp}.\]

Then
\[\begin{align*}
F_{pq}\ket{\psi_3}&=\frac{1}{q\sqrt{p}}\SUM{x=0}{q-1}\SUM{y=0}{pq-1}\omega_{pq}^{(s+xp)y}\ket{y} \\
&=\frac{1}{q\sqrt{p}}\SUM{x=0}{q-1}\SUM{y=0}{pq-1}\omega_{pq}^{sy}\omega_{pq}^{xpy}\ket{y} \\
&=\frac{1}{q\sqrt{p}}\SUM{y=0}{pq-1}\omega_{pq}^{sy}\SUM{x=0}{q-1}\omega_{q}^{xy}\ket{y} \\
&=\frac{1}{q\sqrt{p}}\SUM{n=0}{p-1}\omega_{pq}^{snq}\SUM{x=0}{q-1}1\ket{nq} \\
&=\frac{1}{q\sqrt{p}}\SUM{n=0}{p-1}\omega_{p}^{sn}q\ket{nq} \\
&=\frac{1}{\sqrt{p}}\SUM{n=0}{p-1}\omega_{pq}^{sn}\ket{nq}.
\end{align*}\]

 Let $\alpha_{y}$ denote the amplitude of the state $\ket{y}$ in the super position described in $F_{pq}\ket{\psi_1}$ just derived. Since any state $\ket{y}\neq\ket{nq}$ for some $n\in\{0,1,\dots,p-1\}$ does not appear in the superposition $F_{pq}\ket{\psi_1}$, this implies that the amplitude $\alpha_y$ for such states is $0$ and will be observed with probability $0$. On the contrary, for any state $\ket{y}\neq\ket{nq}$ for some $n\in\{0,1,\dots,p-1\}$, the corresponding amplitude is given by $\alpha_{y}=\frac{1}{\sqrt{p}}\omega_{pq}^{sn}$. Therefore the probability of observing the state $\ket{y}\neq\ket{nq}$ is $|\alpha_y|^2=\frac{1}{\sqrt{p}}\omega_{pq}^{sn}\frac{1}{\sqrt{p}}\overline{\omega}_{pq}^{sn}=\frac{1}{p}$. Hence, if $F_{pq}\ket{\psi_1}$ is measured in the computation basis, only some state $\ket{nq}$ representing an integer multiple of $q$ will be observed with uniform probability of $\frac{1}{p}$.

Simon's problem modulo $m$


Throughout, the following notation and results will be used. Let $\omega_N=e^{i2\pi/N}$. The following sum can be evaluated as so:
\[
\SUM{x=0}{N-1}\omega_N^{xy}
\left\{
\begin{array}{ll}
N \ \ \text{if} \ \ y\equiv0 \ (\text{modulo} \ N) \\
0 \ \ \text{if} \ \ y\equiv0 \ (\text{modulo} \ N).
\end{array}
\right.
\]


 Let $m$ be some $n$-bit number $(2^{n-1}<m<2^m)$ and assume that we are given a black box computing $f : \mathbb{Z}_m\times\mathbb{Z}_m$ that is promised to have the property that $f(a_1, a_2)=f(b_1,b_2)$ if and only if $(a_1,a_2)-(b_1,b_2)\in S$, where $S=\{k(r,1) ; k\in \mathbb{Z}_m\}$ for some unknown $r\in\mathbb{Z}_m$. Let the goal be to compute $r$.

For each $a\in\mathbb{Z}_m$, define $S_a=S+(a,0)=\{(kr+a,k) : k\in\mathbb{Z}_m\}$. It will now be proved that $S_0, S_1, \dots, S_{m-1}$ for a partition of $\mathbb{Z}_m\times\mathbb{Z}_m$.

Here, it will be shown that $a\neq b$ implies that $S_a\cap S_b=\emptyset$, where without loss of generality $a,b\in\mathbb{Z}_m$ are such that $a,b\in\{0,1, \dots, m-1\}$. Moreover, for $x,y\in\mathbb{Z}$, write $x\equiv y$ if $x=y(\text{mod}\ m)$. Thus, assume that $a\neq b$. Consider any $s\in S_a$ so that $s=(kr+a,k)$ for some $k\in\mathbb{Z}_m$. Now suppose for the sake of contradiction that $s\in S_b$ so that $s=(k'r+b,k')$ for some $k'\in\mathbb{Z}_m$. Then $(kr+a, k)=(k'r+b, k')$, which implies that $k\equiv k' \ \text{and} \ kr+a\equiv k'r+b$. Furthermore, since $k\equiv k'$ implies $k-k'\equiv0$, then $kr+a\equiv k'r+a$ implies $a-b\equiv (k-k')r\equiv 0$. This then implies that $a\equiv b$ or that $a=b$ contradicting the original assumption that $a=b$. Thus, for every $s\in S_a$, $s\notin S_b$. By the same reasoning it can similarly be shown that for every $t\in S_b$, $t\notin S_a$. Therefore, if $a\neq b$, then it must be the case that $S_a\cap S_b=\emptyset$

Now it will be shown that $S_0\cup S_1\cup\dots\cup S_{m-1}=\mathbb{Z}_m\times\mathbb{Z}_m$. Since $S_a\subseteq\mathbb{Z}_m\times\mathbb{Z}_m$ for all $a\in\{0,1,\dots, m-1\}$, then if $s\in S_a$, for some $a$, it follows that $s\in\mathbb{Z}_m\times\mathbb{Z}_m$.  Instead, consider an arbitrary $s=(j,k)\in\mathbb{Z}_m\times\mathbb{Z}_m$. Let $c\in\{0,1,\dots, m-1\}$ be such that $c\equiv j-kr$. Observe that  $kr+c\equiv kr+j-kr\equiv j$, and therefore $s=(j,k)=(kr+c,k)\in S_c$. Hence, for any $s\in\mathbb{Z}_m\times\mathbb{Z}_m$ there exists some $S_c$ such that $s\in S_c$. Hence, $S_0\cup S_1\cup\dots\cup S_{m-1}=\mathbb{Z}_m\times\mathbb{Z}_m$.

The results of (i) and (ii) together imply that the sets $S_0, S_1, \dots, S_{m-1}$ form a partition of $\mathbb{Z}_m\times\mathbb{Z_m}$.


Here, we prove that $f(x_1,x_2)=f(y_1,y_2)$ if and only if $(x_1,x_2)$ and $(y_1,y_2)$ are in the same element of the above partition (in other words, $(x_1,x_2),(y_1,y_2)\in S_a$ for some $s$).

First, suppose that $(x_1,x_2),(y_1,y_2)\in S_a$ for some $a$. Then there exists $k,k'\in\mathbb{Z}_m$ such that $(x_1,x_2)=(kr+a, k)$ and $(y_1,y_2)=(k'r+a,k')$. This implies that
\[
\begin{align*}
(x_1,x_2)-(y_1,y_2)&=(kr+a, k)-(k'r+a,k')\\
&=(kr+a-k'r-a,k-k')\\
&=((k-k')r,k-k')\\
&=(k-k')(r,1)\in S.
\end{align*}
\]
Then provided with the promise on $f$, it follows that $f(x_1,x_2)=f(y_1,y_2)$.

Suppose instead that $f(x_1,x_2)=f(y_1,y_2)$. The promise on $f$ then implies that $(x_1,x_2)-(y_1,y_2)\in S$, which means that $(x_1,x_2)-(y_1,y_2)=k(r,1)$ for some $k\in \mathbb{Z}_m$. Since
\[
(x_1,x_2)-(y_1,y_2)=(x_1-y_1,x_2-y_2)=k(r,1)=(kr,k),
\]
this implies that $x_2-y_2=k$. Moreover, since $kr=(x_2-y_2)r=x_1-y_1$ this implies that $x_2r-x_1=y_2r-y_1$. Let $c=x_2r-x_1=y_2r-y_1$. Now note that for any $(z_1,z_2\in S_a$ for some $a$, $(z_1,z_2)=(nr+a,n)$ for some $n$. This is equivalent to the condition $z_1=z_2r+a$. Therefore, since $x_2r-x_1=c$ and $y_2r-y_1=c$, it follows that $x_1=x_2r+c$ and $y_1=y_2r+c$. Hence, $(x_1,x_2),(y_1,y_2)\in S_c$ showing that if $f(x_1,x_2)=f(y_1,y_2)$, $(x_1,x_2)$ and $(y_1,y_2)$ belong to the same set $S_c$.

It has now been shown that $f(x_1,x_2)=f(y_1,y_2)$ if and only if $(x_1,x_2)$ and $(y_1,y_2)$ are in the same element of the above partition.


Consider the state
\[
\ket{\psi}=\frac{1}{m}\displaystyle\sum_{x_1=0}^{m-1}\SUM{x_2=0}{m-1}\ket{x_1}\ket{x_2}\ket{f(x_1,x_2)}.
\]
Since $f(x_1,x_2)=f(y_1,y_2)$ if $(x_1,x_2),(y_1,y_2)\in S_a$ for some $a$. The state $\ket{\psi}$ is an entangled state which can be equivalently expressed as
\[\begin{align*}
\ket{\psi}&=\frac{1}{m}\SUM{a=0}{m-1}\SUM{k=0}{m-1}\ket{kr+a}\ket{k}\ket{f(kr+a,k)}\\
&=\frac{1}{m}\SUM{a=0}{m-1}\SUM{k=0}{m-1}\ket{kr+a}\ket{k}\ket{f_a}.
\end{align*}\]
Here the state of the third register has been labelled with $f_a$, which denotes the value of $f(kr+a,k)$, since this value does not depend on $k$ due to the promise that $f(kr+a,k)=f(k'r+a,k')$ for any $k$ and $k'$.

Then applying the Fourier transform $F_m^\dagger$ to the first two registers yields the state:
\[\begin{align*}
\ket{\overline{\psi}}:=(F_m^\dagger\otimes F_m^\dagger\otimes I)\ket{\psi}&=\frac{1}{m}\SUM{a=0}{m-1}\SUM{k=0}{m-1}F_m^\dagger\ket{kr+a}F_m^\dagger\ket{k}\ket{f_a}\\
&=\frac{1}{m^2}\SUM{a=0}{m-1}\SUM{k=0}{m-1}\SUM{s_1=0}{m-1}\SUM{s_2=0}{m-1}\omega_m^{-(kr+a)s_1}\omega_m^{-ks_2}\ket{s_1}\ket{s_2}\ket{f_a} \\
&=\frac{1}{m^2}\SUM{a=0}{m-1}\SUM{k=0}{m-1}\SUM{s_1=0}{m-1}\SUM{s_2=0}{m-1}\omega_m^{-(kr+a,k)\cdot(s_1,s_2)}\ket{s_1}\ket{s_2}\ket{f_a} \\
&=\frac{1}{m^2}\SUM{a=0}{m-1}\SUM{k=0}{m-1}\SUM{s_1=0}{m-1}\SUM{s_2=0}{m-1}\omega_m^{-k(r,1)\cdot(s_1,s_2)-as_1}\ket{s_1}\ket{s_2}\ket{f_a}
\end{align*}\]

 Now define the sets $R^\perp, R\subseteq \mathbb{Z}_m\times\mathbb{Z}_m$ as
\[\begin{align*}
R^\perp&:=\{(s_1,s_2)\in\mathbb{Z}_m\times\mathbb{Z}_m : (r,1)\cdot(s_1,s_2)=0\} \\
R&:=\{ (s_1,s_2)\in\mathbb{Z}_m\times\mathbb{Z}_m : (r,1)\cdot(s_1,s_2)\neq0\}.
\end{align*}\]
These two sets $R$ and $R^\perp$ partition $\mathbb{Z}_m\times\mathbb{Z}_m$ since for every element $(s_1,s_2)\in\mathbb{Z}_m\times\mathbb{Z}_m$ either $(r,1)\cdot(s_1,s_2)=0$ or $(r,1)\cdot(s_1,s_2)\neq0$. This allows the sum over $s_1$ and $s_2$ in the state for $\ket{\overline{\psi}}$ to be split into two sums. That is,
\[\begin{align*}
\ket{\overline{\psi}}&=\frac{1}{m^2}\SUM{k=0}{m-1}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R^\perp}{}\omega_m^{-k(r,1)\cdot(s_1,s_2)-as_1}\ket{s_1}\ket{s_2}\ket{f_a}+\frac{1}{m^2}\SUM{k=0}{m-1}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R}{}\omega_m^{-k(r,1)\cdot(s_1,s_2)-as_1}\ket{s_1}\ket{s_2}\ket{f_a}.
\end{align*}\]

Consider just the first summation ranging over$(s_1,s_2)\in R^\perp$ contained in $\ket{\overline{\psi}}$. In this case, $(r,1)\cdot(s_1,s_2)=0$, so the summation becomes
\[\begin{align*}
&\frac{1}{m^2}\SUM{k=0}{m-1}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R^\perp}{}\omega_m^{-k(r,1)\cdot(s_1,s_2)-as_1}\ket{s_1}\ket{s_2}\ket{f_a} \\
=&\frac{1}{m^2}\SUM{k=0}{m-1}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R^\perp}{}\omega_m^{-as_1}\ket{s_1}\ket{s_2}\ket{f_a} \\
=&\frac{1}{m}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R^\perp}{}\omega_m^{-as_1}\ket{s_1}\ket{s_2}\ket{f_a}
\end{align*}\]
Now consider the second summation contained in $\ket{\overline{\psi}}$, which ranges over $(s_1,s_2)\in R$. This implies that $(r,1)\cdot(s_1,s_2)\neq0$, so that
\[\begin{align*}
&\frac{1}{m^2}\SUM{k=0}{m-1}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R}{}\omega_m^{-k(r,1)\cdot(s_1,s_2)-\overline{a}s_1}\ket{s_1}\ket{s_2}\ket{f_a}=0,
\end{align*}\]
which implies that no states of the form $\ket{s_1}\ket{s_2}$ where $(s_1,s_2)\in R$ appear in the superposition in $\ket{\overline{\psi}}$. Hence the final state is of the form
\[
 \ket{\overline{\psi}}=\frac{1}{m^2}\SUM{a=0}{m-1}\SUM{(s_1,s_2)\in R^\perp}{}\omega_m^{-as_1}\ket{s_1}\ket{s_2}\ket{f_a},
\]
 and measuring $\ket{\overline{\psi}}$ will yield some state $\ket{s_1}\ket{s_2}$ such that $(s_1,s_2)\in R^\perp$ with probability
\[
\left|\frac{1}{m}\SUM{a=0}{m-1}\omega_m^{-as_1}\right|^2=\frac{1}{m^2}\SUM{a=0}{m-1}|\omega_m^{-as_1}|^2=\frac{1}{m^2}\SUM{a=0}{m-1}|1=\frac{m}{m^2}=\frac{1}{m}.
\]