Pages

Tuesday, 15 April 2014

Homological Product Codes


Abstract

 Using the concept of a chain complex from homology theory a means of constructing CSS codes from the chain complex is described. Moreover, the homological product of two chain complexes can be used to define a product of the respective CSS codes referred to as a homological product code introduced in \cite{Bravyi}. Code parameters of the constructed CSS codes are determined and randomization techniques are used to construct random CSS codes from random chain complexes. The main result is that with high probability a random homological product code on $n$ physical qubits has code parameters given by $[[n,\Omega(n),\Omega(n),O(\sqrt{n})]]$. Hence, homological product codes are one of the first examples of good quantum codes having relatively low stabilizer weight $O(\sqrt{n})$.




Quantum Error Correction

Computation, often considered entirely in the abstract, is a physical process, and although ideal faultless computations are an interesting idea to entertain, the need for active error correction seems inevitable. As physical systems undergoing physical processes, computers are subject to internal and external noise.  This is even more undeniable when comparing quantum computation to classical computation. Despite offering a seemingly more powerful paradigm of computing, controlling quantum computation is a much more challenging task.

The objective of quantum error correction is to encode desired information in a subspace of a larger space through means of redundancy. Such an encoding can serve the purpose of protecting quantum states and transforming quantum states in a hopefully robust manner. Various error correcting codes achieve this is different ways. The primary means used to describe a code is through its code parameters. These parameters are the number of actual physical qubits $n$ used in the system, the number of effectively encoded logical qubits $k$ (where $2^k$ is the dimension of the codespace), and the code distance $d$ which specifies certain limitations on what the code can correct. For a quantum error correcting code, denote its parameters as $[[n,k,d]]$. Other parameters may also be defined in terms of these or independent of them.

"Good'' Codes

Given the parameters $[[n,k,d]]$ of a quantum error correcting code, some notion of what it means for the code to be good or bad ought to be determined. A formal criterion for the properties of a ``good" quantum code is presented here. An important issue to address is how the code parameters scale as the number of physical qubits $n$ increase. A code can be characterized by considering how the various parameters behave in the asymptotic limit as $n$ grows large.  In this regard, consider the ratio $k/n$ called the encoding rate and the ratio $d/n$ called the relative distance. A $[[n,k,d]]$ quantum code is defined to be good if both of the following relations hold between the parameters in the limit as $n$ tends to infinity:
\[
k/n=\Omega(1) \Longleftrightarrow k=\Omega(n)
\]
and
\[
d/n=\Omega(1) \Longleftrightarrow k=\Omega(n).
\]
Thus, a code is good if both the encoding rate and relative distance are at least constant. This is equivalent to the condition that both $k$ and $d$ must grow at least linearly in $n$. Good codes defined in this way capture some desirable properties of a code. For instance, suppose it is required that additional logical qubits be involved in some error correcting scheme, or that the distance $d$ of the code needs to be increased to protect from more severe errors. Then for a good quantum code, both $k$ and $d$ can be increased at least linearly by only adding more physical qubits to the code. Good codes ensure that in order to increase parameters like $k$ and $d$, a large overhead in the number of physical qubits is not necessary in order to achieve the desired increase in parameter values.

Stabilizer Weight: $\mathbf{w}$

In addition to the standard code parameters $[[n,k,d]]$, another parameter known as the stabilizer weight $w$ becomes relevant for practical purposes. The stabilizer weight of a code is a concept that is defined for CSS codes and more generally for stabilizer codes. Intuitively, the stabilizer weight measures the degree of interaction between stabilizer operators and the physical qubits of the code. Each stabilizer operator acts nontrivially on some number of physical qubits, and each physical qubit is generally acted on by some number of stabilizer operators. A stabilizer code is said to have stabilizer weight $w$ (where $w$ is the smallest such number), if no stabilizer operator acts on more than $w$ physical qubits, and no more than $w$ different stabilizer operators act on any particular physical qubit.

In the CSS formalism, code spaces $C^X$ and $C^Z$ are spanned by columns of binary parity check matrices $A^X$ and $ A^Z$, which specify stabilizer operators consisting of only Pauli X or only Pauli Z operators, respectively. In this setting, the code $CSS(C^X,C^Z)$ is said to have stabilizer weight $w$ if every row and column of the parity check matrices $A^X$ and $A^Z$ has weight at most $w$. Here, the weight of a binary vector is given by the number of $1$s that comprise the vector.


LDPC Codes

Consider a code having stabilizer weight $w$. Then if it is the case that $w=O(1)$, the code is called an LDPC code (Low Density Parity Check code). Thus, in an LDPC code each stabilizer operator acts on at most a constant number of physical qubits. Moreover, each physical qubit is only acted on by a constant number of stabilizer operators.

For practical purposes, LDPC codes are ideal when compared to non LDPC codes having stabilizer weight which may scale with $n$. The reason why this is the case is ultimately due to fault tolerant considerations. Error correction generally requires measurements of some form. In the quantum setting, this is quite literally a sensitive issue. In practice operations and gates have a nonzero error probability, and the more gates or operations that are involved, the more this error is expected to accumulate. A common strategy for measuring quantum systems for error correction is to couple the system with an ancilla, and then measure the corresponding ancilla in order to read out syndrome information. This coupling is often achieved through, say, a two-qubit $CNOT$ gate which has the potential of propagating errors throughout the system. In an LDPC code, where the number of operators or qubits involved in direct interaction is small, there will naturally be less coupling required to actively perform error correction. Furthermore, in regards to fault tolerant computation, it has been shown that if the code distance of an LDPC code satisfies $d= \Omega(log \ n)$, then there exists a constant error threshold for stochastic error models \cite{Got}, \cite{Kovalev}.

In the classical framework, LDPC codes have had tremendous success. In fact, there exists classical LDPC codes which are also good in the technical sense. This serves as motivation for generalizing the notion of LDPC codes in the quantum setting. In the table below various quantum LDPC codes are given with their parameters $k,d$, and $w$, and their dependence on $n$. Note however, that none of these codes are good since the code distances of each grow strictly slower than $\Omega(n)$. It can also be seen that some of the codes have an encoding rate $k/n$ which approaches zero asymptotically.

Various examples of LDPC codes. References: [7]\cite{Kitaev}, [11] \cite{Zemor}, [13]\cite{Delfosse}, [6]\cite{Freedman}, [10]\cite{Tillich}, [5]\cite{FreedmanHas}.




Main Results

The principle objective of this paper is to introduce and outline new quantum error correcting codes known as homological product codes. By borrowing the mathematical framework and terminology of homology theory and algebraic topology, in particular the idea of chain complexes, CSS codes can be constructed in a straight forward manner. Moreover, the notion of a homological product of two chain complexes can be used to also define a homological product of CSS codes, in which a new code is constructed using two old codes. How these constructions are carried out, and what the parameters of these resulting codes are, will be discussed in more detail in what follows.

By defining random chain complexes, random CSS codes can be considered. Using these randomized constructions, it will be argued that with high probability the random CSS codes derived from random chain complexes are good codes. Likewise, it will also be argued that the product CSS code resulting from taking a product of two random chain complexes also gives a good code. Regardless, these CSS codes are not $LDPC$, but have stabilizer weight $w=O(\sqrt{n})$. Thus, in a certain sense it can be said that these codes are ``almost" LDPC. Besides a distinctively new way to construct new codes from old codes, the main contribution offered by homological product codes is that they are the first family of good codes to be introduced which have stabilizer weight smaller than $\Omega(n)$.


Homology Theory

In this section, some basic concepts and terminology in homology theory will be introduced. The main mathematical entity from homology theory, known as a chain complex or just complex for short, will be necessary to define the construction of CSS codes. In that regard, the notions to be introduced will be limited and highly specialized to suit these purposes. However, it should be acknowledged that many of these notions can be generalized to a broader setting.

The central object of study in homology theory is a chain complex $(C_i, \partial_i)$, which consists of:
  • A sequence of vector spaces $C_i$.
  • Linear transformations called boundary operators, $\partial_i:C_i\to C_{i+1}$ such that $\partial_i\partial_{i+1}=0$,
which can be schematically visualized as
\[
\dots \overset{\partial_{i+2}}\longrightarrow C_{i+1}\overset{\partial_{i+1}}\longrightarrow C_i \overset{\partial_{i}}\longrightarrow C_{i-1} \overset{\partial_{i-1}}\longrightarrow \dots
\]
 Here, $\partial_i\partial_{i+1}=0$ says that the composition of two consecutive maps is the zero mapping sending all elements to the zero element in the appropriate space. It is worth remarking that the restriction of  $(C_i,\partial_i)$ to vector spaces and linear transformations is a specialization of the definition of a chain complex. More generally, one could consider groups or modules for the $C_i$ and the appropriate structure preserving homomorphisms in their respective categories as the transformations $\partial_i$.

 To understand the consequences of the seemingly peculiar condition $\partial_i\partial_{i+1}=0$
 imposed on the boundary operators, define the following subspaces of $C_i$
\[
 \begin{align*}
im\partial_i &:=\{\partial_i\vec{v}\in C_{i-1} \mid \vec{v}\in C_i\}, \\
ker\partial_i &:=\{\vec{w}\in C_i \mid \partial_i\vec{v}=\vec{0}\in C_{i-1}\}. 
 \end{align*}
\]
Thus, for any $\vec{v}\in C_{i+1}$ we have $\vec{w}:=\partial_{i+1}\vec{v}\in im\partial_{i+1}\subseteq C_i$. Then the condition $\partial_i\partial_{i+1}=0$ implies that $\partial_i\partial_{i+1}\vec{v}=\partial_i\vec{w}=0$ so $\vec{w}\in ker\partial_i$. Hence, for any chain complex it is always the case that $im\partial_{i+1}\subseteq ker\partial_i$. Now since $im\partial_i$ is a vector subspace of $ker\partial_i$, the quotient space $\ker\partial_i/im\partial_{i+1}$ can be defined as the set of equivalence classes $[\vec{w}]$, for $\vec{w}\in ker\partial_i$, where for two classes $[\vec{w}]=[\vec{w}']$ if and only if $\vec{w}-\vec{w}'\in im\partial_{i+1}$. The space $\ker\partial_i/im\partial_{i+1}$,called the $i^{th}$ \emph{homology space}, can be given a vector space structure by defining the vector addition and scalar multiplication as $[\vec{v}]+[\vec{w}]=[\vec{v}+\vec{w}]$ and $\alpha[\vec{w}]=[\alpha\vec{w}]$ for scalar elements $\alpha$ in the underlying field of $C_i$. Understanding these homology spaces in some chain complex is one of the main focuses of homology theory.

In a similar manner, to every complex $(C_i,\partial_i)$ there is an associated cocomplex $(C_i,\partial^{T}_i)$, defined in terms of the transpose maps $\partial^{T}_i: C_{i-1}\to C_i$ (or more generally the adjoints of $\partial_i$)., which satisfy $\partial^{T}_{i+1}\partial^{T}_{i}=0$. This can essentially be understood in terms of the standard complex by simply reversing arrows in the chain diagram:
\[
 \dots \overset{\partial^{T}_{i+2}}\longleftarrow C_{i+1}\overset{\partial^{T}_{i+1}}\longleftarrow C_i \overset{\partial^{T}_{i}}\longleftarrow C_{i-1} \overset{\partial^{T}_{i-1}}\longleftarrow \dots
\]
 In this way, the condition  $\partial^{T}_{i+1}\partial^{T}_{i}=0$ implies that $im\partial^{T}_i \subseteq ker\partial^{T}_{i+1}$ so that the $i^{th}$ cohomology group is given by the quotient $ker\partial^T_{i+1}/im\partial^{T}_i$.


 
CSS codes from Complexes
 
 General Construction: $\mathbf{CSS(C_i,\partial_i)}$
 
  In what follows, when referring to some chain complex $(C_i,\partial_i)$, all vector spaces $C_i$ will be over the binary field $\mathbb{F}_2$ where addition and multiplication are carried out modulo $2$. Furthermore, for the purposes of defining a CSS code from the chain complex, it will suffice to restrict the discussion to short chain complexes of the form
\[
  C_{2}\overset{\partial_{2}}\longrightarrow C_1 \overset{\partial_{1}}\longrightarrow C_{0},
\]  consisting of just three spaces $C_0,C_1,C_2$ and boundary maps $\partial_1,\partial_2$ in the chain satisfying $\partial_1\partial_2=0$. Temporarily refraining from any
  motivation, consider making the following definitions
\[
  C^Z:=im\partial_2 \ \ \text{and} \ \ C^X:=im\partial^{T}_1,
\]
  where $\partial_2: C_2\to C_1$ and $\partial^{T}_1 : C_0\to C_1$. Note that $C^Z,C^X\subseteq C_1$.
 
  Suppose $C_1=\mathbb{F}^{n}_2$ for some $n$, and define the standard inner product between $\vec{v},\vec{w}\in C_1$ as
\[
  <\vec{v},\vec{w}>:=\displaystyle\sum^{n}_{i=1}v_i w_i \ (mod \ 2),
\]
  where $\vec{v}=(v_1,\dots,v_n)$ and $\vec{w}=(w_1,\dots,w_n)$ are the corresponding vector components in some basis. With respect to this inner product and for any subspace $S\subseteq C_1$ define the orthogonal complement of $S$ as the space
\[
  S^\perp:=\{\vec{v}\in C_1 \mid <\vec{v},\vec{w}>=0 \ \text{for all } \ \vec{w}\in S\} .
\]
The space $S^\perp$ is indeed a vector subspace of $C_1$.

It will now be shown that with the definitions given above, $C^Z\subseteq (C^X)^\perp$. For any vectors $\vec{v}\in C^Z:=im\partial_2$ and $\vec{w}\in C^X:=im\partial^{T}_1$, there exists $\vec{x}\in C_2$ and $\vec{y}\in C_0$ such that $\partial_2\vec{x}=\vec{v}$ and $\partial^{T}_1\vec{y}=\vec{w}$. Then the inner product between $\vec{v}$ and $\vec{w}$ satisfies
\[
  <\vec{v},\vec{w}>=<\partial_2\vec{x},\partial^{T}_1\vec{y}>=<\partial_1\partial_2\vec{x},\vec{y}>=<\vec{0},\vec{y}>=0
\]
  since $\partial_1\partial_2=0$. Thus for any $\vec{v}\in C^Z$ it is also the case that $\vec{v}\in(C^X)^\perp$.
 
  The boundary operator condition $\partial_1\partial_2=0$ implies that $C^Z\subseteq(C^X)^{\perp}$, or equivalently that $C^X\subseteq (C^Z)^\perp$. This condition, together with the underlying assumption that the the spaces under consideration are over the field $\mathbb{F}^{n}_2$ (so that $C^X,C^Z\subseteq C_1=\mathbb{F}^{n}_2$) is enough to completely specify a valid CSS code on $n=dim(C_1)$ physical qubits with code spaces given by $C^X$ and $C^Z$. Let $CSS(C^X,C^Z)$ or $CSS(C_i,\partial_i)$ denote the CSS code that results from the complex $(C_i,\partial_i)$. By choosing a basis for the code spaces $C^X$ and $C^Z$, the basis vectors can be used to specify stabilizer generators acting on the $n$ physical qubits consisting of either Pauli $X$ operators or Pauli $Z$ operators, respectively. In this case, since the code spaces are defined in terms of the images of $\partial_2$ and $\partial^{T}_1$, a basis can be specified by taking linearly independent columns of $\partial_2$ as a basis of $C^Z:=im\partial_2$, and linearly independent columns of $\partial^{T}_1$ to form a basis for $C^X=im\partial^{T}_1$. In this way, the essential condition for CSS codes $C^Z\subseteq(C^X)^{\perp}$ implies that all $X$-type stabilizers commute with all $Z$-type stabilizers---a necessary condition in the stabilizer formalism of quantum error correcting codes. More details regarding this construction will be explained in what follows.
 
The Single-Sector Theory: $\mathbf{CSS(C,\partial)}$

 In the general construction of the code $CSS(C_i,\partial_i)$ from a chain complex, three different spaces $C_0,C_1,$ and $C_2$ were at play. The rest of this paper will be restricted to a more specialized setting known as the single-sector theory. In the \emph{single-sector theory}, only a single vector space $C=\mathbb{F}^{n}_2$ and boundary operator $\partial: C\to C$ is considered with $C=C_0,C_1,C_2$. This gives a chain complex $(C,\partial)$ of the form
\[
   C\overset{\partial}\longrightarrow C \overset{\partial}\longrightarrow C ,
\]
  which in the present context can be viewed as
\[
   C\overset{\partial}\longrightarrow C \overset{\partial^{T}}\longleftarrow C.
\]
  Then in this case the code spaces of the corresponding CSS code are defined to be
\[
  C^Z:=im\partial \ \ \text{and} \ \ C^X:=im\partial^{T}.
\]
  Moreover, it can be shown that
\[
  (C^Z)^\perp=ker\partial^T \ \ \text{and} \ \ (C^X)^\perp=ker\partial.
\]
  Let $A^Z$ and $A^X$ be the parity check matrices spanning $C^Z$ and $C^X$, and make the identification
\[
  A^Z=\partial \ \ \text{and} \ \ A^X=\partial^T.
\]
Thus, the columns of $\partial$ span the code space $C^Z$ and the columns of $\partial^T$ (or equivalently the rows of $\partial$) span the code space $C^X$. Note that the columns and rows of $\partial$ may not be linearly independent. If this is the case then not every row of the parity check matrices $A^Z$ and $A^X$ will yield an independent stabilizer. In some conventions party check matrices that generate some CSS code are required to have full rank (where all columns are linearly independent). This restriction will not be imposed here in this paper in defining the parity check matrices in terms of $\partial$. Then in general, since it is possible to have distinct boundary operators $\partial\neq\partial'$, with $im\partial=im\partial'$, the correspondence in this translation of chain complexes $(C,\partial)$ to codes $CSS(C,\partial)$ is many-to-one. That is, different chain complexes may define equivalent CSS codes.

To make the construction of $CSS(C,\partial)$ more explicit consider, for example, the boundary operator $\partial : C\to C$, where $C=\mathbb{F}^{5}_2$, is expressed in some basis as
\[
\partial=\begin{pmatrix}
1&1&1&0&0 \\
0&0&1&1&1 \\
1&1&0&1&1 \\
1&1&1&0&0 \\
0&0&1&1&1 \\
\end{pmatrix} .
\]
Regard each column of $\partial$ as a vector $\vec{v}=(v_1,\dots, v_5)$. Then a $Z$-type stabilizer acting on the $dim(C)=5$ qubit space can be obtained using a column $\vec{v}$ as
\[
S^Z=Z^{v_1}_1Z^{v_2}_2Z^{v_3}_3Z^{v_4}_4Z^{v_5}_5.
\]
Hence, $S^Z$ acts nontrivially on qubits that correspond to the nontrivial support of $\vec{v}$. From the columns of $A^Z=\partial$, three $Z$-type stabilizers are obtained in this way:
\[
   \begin{align*}
S^{Z}_1&=Z_1Z_3Z_4  \\
S^{Z}_2&=Z_2Z_3Z_5  \\
S^{Z}_3&=S^{Z}_1S^{Z}_2,  \\
\end{align*}
\]
Similarly, from the columns of $A^X=\partial^{T}$ three  $X$-type stabilizers are obtained:
\[
   \begin{align*}
S^{X}_1&=X_1X_2X_3 \\
S^{X}_2&=X_3X_4X_5 \\
S^{X}_3&=S^{X}_1S^{X}_2. \\
\end{align*}
\]
For this particular $\partial$, it is seen that there are only two linearly independent columns and rows: $rank(\partial)=rank(\partial^T)=2$. This corresponds to the fact that there are only two independent stabilizer generators of each type. Given some stabilizer group $S$, recall that there are generally multiple ways to specify a generating set for $S$. This freedom implies that a conversion that begins with an arbitrary CSS code and translates to a corresponding chain complex $(C,\partial)$ will also be many-to-one in general.

Justification for why a complex $(C,\partial)$ defined in this setting naturally translates to a corresponding CSS code can be given by observing that the underlying condition $\partial^2=0$ is equivalent to the commutativity constraints demanded by CSS codes and stabilizer codes. Write $\partial=(\partial_{ij})_{ij}$ to express the matrix coefficients of $\partial$ in a particular basis, and note that $A^Z=\partial=(\partial_{ij})_{ij}$ and $A^X=\partial^T=(\partial_{ji})_{ij}$. Then
\[
\partial^2=0  \Longleftrightarrow   A^Z(A^X)^{T}=\displaystyle\sum_{k}^{}\partial_{ik}\partial_{kj}=0  (mod \ 2)   \Longleftrightarrow  S^{X}_iS^{Z}_j=S^{Z}_jS^{X}_i.
\]

Here, the first statement is the relevant constraint in the context of chain complexes, while the second statement can be interpreted as a condition that implies $C^Z\subseteq(C^X)^T$ in the setting of CSS codes. The last statement gives the commutativity constraints imposed on the different types of stabilizer generators that result from the construction described above.


Equivalent Terminology

      Outlined below is analogous terminology of various entities in homology theory and the corresponding concepts in the quantum error correction setting. For a chain complex, $(C,\partial)$, the main spaces of interest are $ker\partial$, $im\partial$, and the quotient $ker\partial/im\partial$, which correspond to $Z$-type operators of the code $CSS(C,\partial)$; and the same spaces for $\partial^T$ that correspond to $X$-type operators of the code. In the table below, the middle column gives the nomenclature used for elements in these spaces.




    
     The relations that hold between the various spaces in the homology setting are analogous to the relationships between the different groups that arise in the stabilizer formalism. Let $S_Z$ be the stabilizer group defined by the $Z$-type part of $CSS(C,\partial)$, and denote $N(S_Z)$ as the normalizer group of $S_Z$ in the Pauli group for $n=dim(C)$ qubits. In this way,  elements of $ker\partial$, called cycles, correspond to elements of $N(S_Z)$. Elements of $im\partial\subseteq ker\partial$ are trivial cycles that correspond to operators that are elements of the stabilizer $S_Z\subseteq N(S_Z)$. Furthermore, $ker\partial\backslash im\partial$ contains all cycles in $ker\partial$ that are not in $im\partial$, which corresponds to the nontrivial logical operators in $N(S_Z)$ that are not in $S_Z$. The homology space $ker\partial/im\partial$ is analogous to the quotient group $N(S_Z)/S_Z$, where in the stabilizer setting different elements of the quotient correspond to an equivalence class of logical operators. Operators in the same class can be distinct operators when expressed in terms of Pauli operators (up to a product of stabilizer elements), but in effect have the same \emph{logical} action on the encoded qubits.
    
      An similar translation between concepts and their interpretations can be made for the $X$-type part of $CSS(C,\partial)$ by simply replacing the boundary operator with the transpose operator $\partial^T$ as indicated in the table. The acquainted reader may recognize some of these concepts coming from homology theory from their experience with the toric code \cite{Kitaev},where historically  speaking, some of the intimate relations between homology theory and error correction were first manifested.


Code Parameters of $\mathbf{CSS(C,\partial)}$

Having developed the framework to construct a code $CSS(C,\partial)$ from a chain complex $(C,\partial)$, let us determine the code parameters $[[n,k,d,w]]$ of $CSS(C,\partial)$. The number of physical qubits $n$ that define the global space of the code is given by
\[
n=dim(C)=dim(\mathbb{F}^{n}_2).
\]   
This can equivalently be determined through $\partial$ by simply counting the number of rows/columns that comprise $\partial$.

For a stabilizer code with $r$ generators the number of encoded qubits is given by $k=n-r$, and since the number of $X$-type and $Z$-type generators are both given by the number $rank(\partial)$ of linearly independent rows/columns of $\partial$ we have
\[
k=n-2rank(\partial).
\]

The stabilizer weight $w$ and distance $d$ of $CSS(C,\partial)$ are not so readily determined from $\partial$. By definition, the stabilizer weight is the smallest $w$ such that every column and row of $\partial$ has weight at most $w$. In regards to the LDPC criterion, $CSS(C,\partial)$ will be an LDPC code if $\partial$ is a ``sparse'' matrix. More specifically, $CSS(C,\partial)$ is LDPC if each row and column of $\partial$ has $O(1)$ nonzero entries.

To determine the distance $d=min\{d^X,d^Z\}$, it is necessary to know the respective distances $d^X$ and $d^Y$, which are defined in this setting to be
\[
\begin{align*}
d^Z:=&min\{weight(\vec{v}) \mid \vec{v} \in ker\partial \backslash  im\partial \} \\
d^X:=&min\{weight(\vec{v}) \mid \vec{v} \in ker\partial^T \backslash  im\partial^T \}.
\end{align*}
\]
Thus, $d^Z$ and $d^X$ represents the minimum weight of nontrivial cycles in $ker\partial$ and $ker\partial^T$, respectively.  It will be argued later that when considering a uniformly random distribution of possible boundary operators $\partial$ the corresponding code $CSS(C,\partial)$ will has linear distance ($d=\Omega(n)$) with high probability.
    
    
Homological Dimension $\mathbf{H(\partial)}$

 Here we introduce the homological dimension of a complex $(C,\partial)$ and its relation to the number of encoded qubits $k$ of the code $CSS(C,\partial)$, which will become more relevant in later discussions. Define the homological dimension $H(\partial)$ of a complex $(C,\partial)$ to be the dimension of the homology space of the complex:
\[
 H(\partial):=dim(ker\partial/im\partial).
\]
 The following relations hold:
\[
 \begin{align*}
 H(\partial)&=dim(ker\partial/im\partial) \\
&=dim(ker\partial)-dim(im\partial) \\
&=(n-dim(im\partial))-dim(im\partial) \\
&=n-2dim(im\partial) \\
&=n-2rank(\partial) \\
  &=k
 \end{align*}
\]

 Hence, the homological dimension of the complex is equal to the number of encoded qubits of the code: $H(\partial)=k$.  The second identity is a general relationship that holds for quotient spaces which essentially follows from the fact that given a basis of a vector subspace $V\subseteq W$, a basis of $W$ can be obtained from a basis of  $V$ and $W/V$.  The third identity follows from the rank-null theorem of linear algebra which states that $dim(ker\partial)+dim(im\partial)=n$. The last two identities follow merely from the definition $rank(\partial):=dim(im\partial)$ and the previous relationship derived for $k$.

   
Homological Product
   
    Given two chain complexes $(C_1,\partial_1)$ and $(C_2,\partial_2)$, a natural construction to consider is some notion of a product defined on the complexes that yields another chain complex $(C,\partial)$. The homological product to be introduced here achieves precisely this. The product complex $(C_1,\partial_1)\times (C_2,\partial_2)=:(C,\partial)$ will express the space $C$ and boundary operator $\partial$ in terms of the spaces $C_1,C_2$ and boundary operators $\partial_1,\partial_2$ of the factors. Moreover, the K\"{u}nneth formula will provide a means of expressing $ker\partial$ in terms of $ker\partial_1$ and $ker\partial_2$, and the homological dimension $H(\partial)$ in terms of $H(\partial_1)$ and $H(\partial_2)$. This theory will then be applied to define a product of two codes $CSS(C_1,\partial_1)$ and $CSS(\partial_2)$ and to calculate its parameters.
  
   
$\mathbf{(C_1,\partial_1)\times (C_2,\partial_2)=(C,\partial)}$
       
    The homological product, denoted by $\times$, of two complexes $(C_1,\partial_1)$ and $(C_2,\partial_2)$ gives a product complex
\[
    (C,\partial):=(C_1,\partial_1)\times (C_2,\partial_2),
\]
    where $C=C_1\otimes C_2$ is given by the tensor product of $C_1$ and $C_2$, and the boundary operator by
\[
    \partial=\partial_1\otimes I +I\otimes \partial_2.
\]
    To verify that $\partial$ is indeed a valid boundary operator observe the following:
\[
    \begin{align*}
    \partial^2&=(\partial_1)^2\otimes I +2\partial_1\otimes\partial_2 + I\otimes (\partial_2)^2 \\
&=0\otimes I +0\cdot\partial_1\otimes\partial_2 + I\otimes 0 \\
&=0
    \end{align*}
\]
    Here, the assumptions $\partial^{2}_1=0$ and $\partial^{2}_2=0$ were used together with $2\equiv 0 \ (mod \ 2)$ since all spaces under consideration are taken to be over the field $\mathbb{F}_2$.
   
   
Kunneth Formula
       
    The Kunneth formula is a standard result in homology theory that, for a product complex $(C,\partial)=(C_1,\partial_1)\times (C_2,\partial_2)$, relates $ker\partial$ and the homological dimension $H(\partial)$ in terms of the corresponding entities of $(C_1,\partial_1)$ and  $(C_2,\partial_2)$. The theorem statement reads:
   
    For any boundary operators $\partial_1,\partial_2$ and $\partial=\partial_1\otimes I +I\otimes \partial_2$,
\[
    ker\partial=ker\partial_1\otimes ker\partial_2+im\partial
\]
    and
\[
  H(\partial)=H(\partial_1)H(\partial_2).
\]

Although a proof of  the Kunneth formula will not be given here, the results will be used to derive certain properties of the product complex and resulting product code. In particular, the identity $H(\partial)=H(\partial_1)H(\partial_2)$ will be useful. The first statement says the any element $\vec{v}\in ker\partial$ can be expressed in the form $\vec{v}=\vec{v_1}\otimes\vec{v_2} +\vec{w}$ where $\vec{v_i}\in ker\partial_i$ and $\vec{w}\in im\partial$, and that any vector of such form is always in $ker\partial$.
   
   

Homological Product Codes
       
    Previously, it was described how a complex $(C,\partial)$ can be used to specify a code $CSS(C,\partial)$. By considering a product complex $(C,\partial)=(C_1,\partial_1)\times (C_2,\partial_2)$ that results from taking the homological product of two chain complexes, and then translating the resulting product complex $(C,\partial)$ into the corresponding code $CSS(C,\partial)$ a homological product code can be defined. In the following sections, the parameters of the product code will be calculated in terms of the parameters of the input codes.
     
   
$\mathbf{CSS(C_1,\partial_1)\times CSS(C_2,\partial_2)=CSS(C,\partial)}$
   
 Consider two codes $CSS(C_1, \partial_1)$ and $CSS(C_2,\partial_2)$ defined by two complexes $(C,\partial_1)$ and $(C_2,\partial_2)$. Define the homological product of $CSS(C_1, \partial_1)$ and $CSS(C_2,\partial_2)$ to be the code
\[
CSS(C,\partial):=CSS(C_1,\partial_1)\times CSS(C_2,\partial_2),
\]
where $(C,\partial)=(C_1,\partial_1)\times (C_2,\partial_2)$ so that $C=C_1\otimes C_2$ and $\partial=\partial_1\otimes I +I\otimes \partial_2$. Even though the resulting product code does share some properties in common with codes constructed through code concatenation, the homological product offers a way to create a new code from two older codes in a way that is distinct from code concatenation. How the parameters of the product code are related to the code parameters of the input codes is the focus of the following sections.
   
     
Parameters of the Product Code $\mathbf{CSS(C,\partial)}$
   
Let $[[n_i,k_i,d_i,w_i]]$ be the code parameters of two codes $CSS(C_i,\partial_i)$ for $i=1,2$, and let $[[n,k,d,w]]$ be the parameters of the resulting product code $CSS(C,\partial)=CSS(C_1,\partial_1)\times CSS(C_2,\partial_2)$. The objective here will be to determine the parameters $[[n,k,d,w]]$ in terms of the parameters of the input codes.

Since $C=C_1\otimes C_2$, the number of physical qubits $n$ in the product code is given by

\[
n=dim(C)=dim(C_1\otimes C_2)=dim(C_1)dim(C_2)=n_1n_2.
\]
The number of encoded qubits $k$ of the product code is calculated through the K\"{u}nneth formula:
\[
k=H(\partial)=H(\partial_1)H(\partial_2)=k_1k_2.
\]
    Hence, the homological product of two codes always increases the number of physical and logical qubits provided that both input codes have parameters $n_i,k_i > 1$.
   
    Recall from the definition that  if $CSS(C_i,\partial_i)$ have stabilizer weights $w_i$, the weight of rows and columns of $\partial_i$ is no greater than $w_i$. Then since taking the tensor product of any operator with the identity does not increase row or column weight, the stabilizer weights of $\partial_1\otimes I$ is no greater than $w_1$ and the stabilizer weight of $I\otimes \partial_2$ is no greater than $w_2$. Therefore the stabilizer weight of the product boundary operator $\partial=\partial_1\otimes I +I\otimes \partial_2$ must satisfy $w\leq w_1+w_2$, implying that the stabilizer weight of the product code is no greater than the sum of the stabilizer weights of the input codes.
   
    Calculating the distance $d$ of the product code $CSS(C,\partial)$ is highly non trivial, and only bounds are known. Let $d^{Z}_i, d^{X}_i$ be the  $Z$ and $X$-type distances of CSS$(C_i,\partial_i)$ for $i=1,2$ so that the distances are given by $d_i=min\{d^{Z}_i, d^{X}_i\}$. Without complete proof we state the following relationship:
\[
    max\{d^{\alpha}_1,d^{\alpha}_2\}\leq d^{\alpha} \leq d^{\alpha}_1d^{\alpha}_2  (\text{for} \alpha=X,Z).
\]
    Then the distance of the product code will be given by $d=min\{d^X,d^Z\}$. To justify the upper bound, consider non trivial cycles $\vec{v}_i\in ker\partial_i\backslash im\partial_i$. Then $\vec{v}_1\otimes\vec{v}_2\in C$ will be a nontrivial cycle in $ker\partial\backslash im\partial$. So if $\vec{v}_i$ has weight greater than $d^{\alpha}_i$ for type $\alpha$, then $\vec{v}_1\otimes\vec{v}_2$ will have weight given by $d^{\alpha}_1d^{\alpha}_2$. Thus, $d^{\alpha}\leq d^{\alpha}_1d^{\alpha}_2$. Proving the lower bound is a little more involved, but can be done by invoking the K\"{u}nneth formula. This bound can be interpreted as stating that the distance $d^{\alpha}$ of the product code in no worse than the best of the distances $d^{\alpha}_1$ and $d^{\alpha}_2$ of the input codes.
   
    The parameters $n$ and $k$ regarding the number of physical and encoded qubits of the product code $CSS(C,\partial)$ are readily determined from the parameters of the input code. On the contrary, the stabilizer weight $w$ and the code distance $d$ of the product code are more subtle matters. Determining exact values for $w$ and $d$ is specific to the more detailed structure of the input boundary operators $\partial_1$ and $\partial_2$ and how they combine to form the product boundary operator $\partial$. However, it will be shown in the following section that with high probability, most  homological product codes have a linear distance: $d=\Omega(n)$.
   
  
Distance Bounds on $\mathbf{CSS(C,\partial)}$
      
  In order to better determine a stronger lower bound on the distance $d$ of a homological product code, it is necessary to make use of probabilistic arguments regarding random chain complexes $(C,\partial)$ with fixed $n=dim(C)$ and homological dimension $k=H(\partial)$.  The main objective of this section is to sketch a proof that the homological product of two random complexes whose corresponding CSS codes have  linear distance will yield a product code that also has linear distance $d=\Omega(\partial)$ with high probability. The statistical nature of this argument, shows that such a result holds in a ``typical case''. First, what is meant by a random complex will be made more precise. Then, a few logical reductions involved in the analysis of the distance bounds will be discussed. The interested reader can find further details regarding the proof in \cite{Bravyi} .


Random Complexes

A way of generating a random ensemble of complexes satisfying certain properties is discussed in what follows. Fix integers $H$ and $L$, and consider an arbitrary complex $(C,\partial)$ having homological dimension $H(\partial)=H$ and $rank(\partial)=L$. Define a \emph{canonical boundary operator} as the block matrix
\[\hat{\partial}=\begin{pmatrix}0&0&0 \\
0&0&I \\
0&0&0 \\
\end{pmatrix},
\]
  where the rows and columns are grouped into blocks of sizes $H$, $L$, $L$. Then it can be shown \cite{Bravyi} that there exists an invertible matrix $U$ such that $\partial=U\hat{\partial}U^{-1}$. In this way, starting with a canonical boundary operator $\hat{\partial}$ a random boundary operator can be generated by choosing a random invertible matrix $U$ having the appropriate dimension of $M:=H+2L$. If such a $U$ is chosen uniformly at random, then the resulting boundary operator $\partial=U\hat{\partial}U^{-1}$ will also be distributed uniformly. Then by starting with random complexes $(C,\partial)$, random codes $CSS(C,\partial)$.

Consider the "encoding rate'' $H/M$, and the limiting scenario where $H,M\to\infty$ while $H/M$ remains constant. In this limit, a random complex $(C,\partial)$ with $M=dim(C)$ and homological dimension $H(\partial)=H$ will translate to a code $CSS(C,\partial)$ that has linear distance $d=\Omega(M)$ with high probability. This can be argued by observing that the number of low weight cycles in $ker\partial$ is small, which implies that with high probability $ker\partial$ does not contain any low weight cycles. Although this result shows that a random code $CSS(C,\partial)$ tends to have linear distance, it is not immediately obvious that the homological product of two random codes will also have linear distance. More work must be done in order to determine this.


Random Product Complexes

  Here, the notion of random complexes discussed in the previous section is extended to random product complexes. Consider two canonical boundary operators $\hat{\partial}_1$ and $\hat{\partial}_2$, and two invertible matrices $U_1$ and $U_2$ chosen independently at random. Then two random boundary operators can be constructed as
\[
  \partial_1=U_1\hat{\partial}_1U^{-1}_1 \ \ \text{and} \ \ \partial_2=U_2\hat{\partial}_1U^{-1}_2,
\]
 and the boundary operator $\partial=\partial_1\otimes I + I\otimes \partial_2$ resulting from the homological product will be randomly distributed. Equivalently, a \emph{canonical boundary operator} can be defined as
\[
 \hat{\partial}=\hat{\partial}_1\otimes I + I\otimes \hat{\partial}_2,
\]
 Then random boundary operator can be constructed through the transformation
\[
  \partial=(U_1\otimes U_2)\hat{\partial}(U^{-1}_1\otimes U^{-1}_2),
\]
  where $U_1$ and $U_2$ are random invertible matrices. Note that from these relations it readily follows that
\[ker\partial=(U_1\otimes U_2)ker\hat{\partial} \ \ \text{and} \ \ im\partial=(U_1\otimes U_2)im\hat{\partial}.\]
 
 

The Event $\mathbf{E_c}$

In what follows, we consider two random complexes $(C_i,\partial_i)$ (as discussed in the previous section) having the same size $M=dim(C_i)$ and homological dimension $H=H(\partial_i)$ for $i=1,2$. Let $r=H/M$ be the encoding rate. The random complex $(C,\partial)=(C_1,\partial_1)\times(C_2,\partial_2)$ then has size $n=M^2$. The desired result pertaining to the code distance of the random product code $CSS(C,\partial)$ can then be stated as follows: for sufficiently small constants $c,r>0$, the random product code has distance $d>cM^2$ with high probability. Hence, for this to be the case it must be shown that every nontrivial cycle $\vec{v}\in ker\partial\backslash im\partial$ satisfies $weight(\vec{v})>cM^2$ with high probability.

In this regard, for some constant $c$, define the event
\[E_c=\{\exists \vec{v}\in ker\partial\backslash im\partial \mid weight(\vec{v})<cM^2\}.
\]
Then if it can be shown that the probability $Pr[E_c]<1/2$ for sufficiently small $c$  and large enough $M$ , the desired result will follow. The objective then comes down to bounding the probability $Pr[E_c]$ from above by an amount that is exponentially small in $M$. Note that this result can be equivalently obtained by considering the complementary event
\[\overline{E}_c=\{\forall \vec{v}\in ker\partial\backslash im\partial \mid weight(\vec{v})>cM^2\},\]
 and showing that the probability $Pr[\overline{E}_c]>1/2$ instead and tends to $1$ in the limiting case. However, in the analysis that follows it will be preferable to work with $E_c$ as opposed to $\overline{E}_c$.

 As described above, consider the random product complex that results from taking the canonical bipartite boundary operator $\hat{\partial}$ and transforming it to $\partial=(U_1\otimes U_2)\hat{\partial}(U^{-1}_1\otimes U^{-1}_2)$ for random invertible matrices $U_1$ and $U_2$. Recall that $ker\partial=(U_1\otimes U_2)ker\partial$. Then the probability of interest can be bounded by
\[Pr[E_c]\leq \displaystyle\sum^{}_{\vec{v}\in ker\hat{\partial}\backslash im\hat{\partial}} Pr[weight( \ (U_1\otimes U_2)\vec{v} \ )<cM^2].\]

 Since $\vec{v}\in C=C_1\otimes C_2$, interpret $\vec{v}$ as a $M\times M$ matrix $v$ with rows corresponding to the space $C_1$ and columns corresponding to $C_2$. Throughout, we will freely interpret $\vec{v}\in C$ as either a vector or a matrix depending on context. In the latter situation when the vector $\vec{v}$ is to be interpreted as a matrix it will be written as $v$ without the $  \vec{}  $ designation. Suppose the matrix $v$ has rank $R$. Then $(U_1\otimes U_2)v$ will be distributed uniformly over all rank $R$ matrices. Now let $\eta(R)$ be the number of rank $R$ "matrices" of size $M\times M$ in $ker\hat{\partial}\backslash im\hat{\partial}$, and $Pr[R]$ be the probability that such a matrix has weight less than $cM^2$. Then
\[Pr[E_c]\leq \displaystyle\sum^{}_{R\geq 1} \eta(R)Pr[R].\]
 Although the quantities $\eta(R)$ and $Pr[R]$ are relatively straightforward to compute, expressing the bound on the probability $Pr[E_c]$ in this way results in a sum that is exponentialy large in $M$.

 This failure in attempting to bound the probability $Pr[E_c]$ is partly due to the possibility that one of the input codes may be bad. If this is the case, then the resulting product code will have exponentially many low weight cycles. In a certain sense, bounding the probability as above exponentially amplifies bad choices of the input codes. In order to overcome this issue, it will be necessary to define a more refined condition that is only satisfied when both input codes are good.


Uniform Low Weight Condition 
  For some $M\times M$ matrix $A$, say that $A$ has \emph{uniform low weight} with constant $c$ if and only if every row and column of $A$ has weight at most $cM$. To denote that $A$ has uniform low weight, we will write $A$ has $ULW(c)$ for short.
 
  Motivated by the uniform low weight condition, for a constant $c>0$ define the event
\[
E^{ULW}_c=\{\exists \vec{v}\in ker\partial\backslash \vec{0} \mid v  \ \  \text{has} \ \   ULW(c)\},
\]
  and now let $\eta(R)$ be the number of rank $R$ matrices in $ker\partial\backslash \vec{0}$ with $Pr[R]$ denoting the probability that a random rank $R$ matrix of size $M$ has $ULW(c)$. Then  
\[
 Pr[E^{ULW}_c]\leq \displaystyle\sum^{}_{R\geq 1} \eta(R)Pr[R].
\]
Although this time the sum bounding the probability is exponentially small in $M$ as desired, the issue now is that the original low weight event $E_c$ does not imply this uniform low weight event $E^{ULW}$. Hence, attempting to bound the probability in this way is also insufficient for our purposes.

To further refine the analysis, let $t$ be some constant such that $c<t<1$. Suppose that $\vec{v}\in ker\partial$ is a nontrivial cycle with $weight(\vec{v})<cM^2$. Observe that, as a matrix, $v$ has at least $(1-t)M$ rows and columns with weight $\frac{c}{t}M$. Let $M'=(1-t)M$, and consider the \emph{reduced matrix} $v_{red}$ of size $M'$ that consists of precisely these rows and columns. Then $v_{red}$ has $ULW(c')$ where $c'=\frac{c}{t(1-t)}$.

 
The relevance of this reduced submatrix satisfying the uniform low weight condition is made apparent through the following lemma proved in \cite{Bravyi} using results from \cite{Terhal}. If the two random input codes have distances greater than$M-M'+1$, and $\vec{v}\in ker\partial$ is cycle such that $v$ has a vanishing reduced matrix $v_{red}$, then $\vec{v}\in im\partial$ is a trivial cycle. This statement can be equivalently interpreted as saying: if $\vec{v}\in ker\partial$ is a nontrivial cycle, then $v$ must contain a nonvanishing reduced matrix $v_{red}$. Note that a similar result holds for cocycles.

By defining a reduced low weight event
\[E^{RULW}_c=\{\exists \vec{v}\in ker\partial\backslash \vec{0} \mid v_{red} \ \    \text{has}   \ \ ULW(c)\},\]
 and applying the aforementioned lemma, it is seen that the originally defined event $E_c$ implies the event $E^{RULW}_c$. Therefore $Pr[E_c]\leq Pr[E^{RULW}_c]$, and any upper bound on $Pr[E^{RULW}_c]$ yields an upper bound on $Pr[E_c]$. Now let $\eta(R)$ be the number of reduced submatrices $v_{red}$ that can be extended to a valid matrix $v$ that corresponds to a cycle $\vec{v}\in ker\partial$, and let $Pr[R]$ be the probability that such a rank $R$ matrix of size $M'$ has $ULW(c)$. Then,
\[
Pr[E^{RULW}_c]\leq \displaystyle\sum^{}_{R\geq 1} \eta(R)Pr[R].
\]

 In \cite{Bravyi}, bounds on the quantity $\eta(R)$ are calculated to be
\[
\eta(R)\leq O(1)\cdot 2^{(M+H)R-R^2} \ \   \text{if} \ \    R\leq H
\]
 and
\[\eta(R)\leq O(1)\cdot 2^{(M+H/2)R-R^2/2}  \ \  \text{if} \ \  R\geq H,\]
 where $H=H(\partial_1)=H(\partial_2)$ is the homological dimension of the input codes used to construct the product code. Furthermore, it is calculated that the probability $Pr(R)$ is bounded by
\[Pr(R)\leq O(1)\cdot 2^{R^2-2(1-\epsilon)MR},\]
 for any $\epsilon>0$ provided that $c$ is chosen accordingly.

 These two results imply that $Pr[E^{RULW}_c]$, and hence $Pr[E_c]$, are bounded above by an exponentially small quantity in $M$. Therefore, with high probability the product code will not contain nontrivial cycles having weight less than $cM^2$. Since the same argument holds for the cocycles of the code, this implies that the product code of size $n=M^2=dim(C)$ has linear distance $d=\Omega(n)$ with high probability as desired.


Conclusion

We have seen here how chain complexes from homology theory can naturally be used to define CSS codes, and how the homological product of two complexes can be used to define a product code with nice parameters. Code parameters for the CSS codes resulting from both a single chain complex and a product of two complexes were derived. By considering random complexes, it was argued that a single random complex yields a good code with linear distance with high probability. Through a more involved analysis it was also shown that under some typical conditions, the code that results through the homological product will also have linear distance with high probability. In summary, if $[[n_i,k_i,d_i,w_i]]$ are the parameters of two CSS codes constructed from two chain complexes, then the parameters of the resulting product code are given by $[[n_1n_2,k_1k_2,d, w]]$ where in general $d\leq d_1d_2$ and $w\leq w_1+w_2$. By taking a pair of random CSS codes with $n_1=n_2$ physical qubits, $k_1=k_2$ logical qubits such that $k_i=cn_i$ for some small constant $c$, the resulting product code on $n=n_1n_2$ physical qubits has (with high probability) parameters of the form $[[n,\Omega(n),\Omega(n), O(\sqrt{n})]]$. Thus, homological product codes are an example of good quantum codes, which are almost LDPC.
 

Monday, 14 April 2014

Linear Optical Quantum Computation: The KLM Proposal


Abstract

A brief overview of quantum optics is given, and some early proposals for quantum computation using optics are discussed. The main focus is invested on the Knill, Laflamme, and Milburn proposal \cite{KLM} which achieves scalable universal quantum computation using only linear optics and the ability to prepare and detect single photon states.

(** All figures displayed in this post are borrowed from arXiv:quant-ph/0512104 **)



Introduction

Computation, although often thought of entirely in the abstract, is a physical process. Thus, a physical system able to undergo controlled transformations is a prerequisite for any successful computation. In quantum computation, the physical nature of the necessary system becomes even more relevant in order to harness sufficient quantum phenomenon. Despite many distinct paradigms for physically realizing quantum computation, the setting of quantum optics offers unique advantages. In quantum optics, photons are deployed as the main carriers of information, and various physical devices such as phase shifters and beam splitters can be used to enact transformations on the photon states to perform computation. The ability to operate at high temperatures, and long decoherence times are examples of some of the advantages gained when working with quantum optics. However, the same properties of light that offer such benefits also bring some disadvantages. In particular, it is generally difficult to make photons interact with one another. Such interactions seem necessary in order to implement multiple qubit entangling gates---a main requirement for achieving universal quantum computation.

In this post, a brief overview of quantum optics is given and various proposals for quantum computing in this optical setting are discussed. The main proposal described here is that of Knill, Laflamme and Milburn (KLM) \cite{KLM}, who offer a scheme for achieving scalable universal quantum computation using only linear optics (LOQC) that overcomes many detrimental features that existed in previous proposals.

Bosonic Modes

In quantum optics, photons are the main physical entities at play.  When referring to photons in what follows, we will mean noninteracting spin-less bosons. In this setting, the electromagnetic field is quantized in which the energy associated to the physical system  is given by the Hamiltonian
\[
\hat{H}=\displaystyle\sum_{k}^{}\hbar\omega(\ani_k\cre_k +\frac{1}{2}),
\]
where $\ani_k$ and $\cre_k$ are the annihilation and creation operators associated to mode $k$. A bosonic mode, $k$, can be considered as a quantum system whose state space is spanned by the number states $\ket{n}_k$, for $n=1,2,3,\dots$, where $n$ represents the number of photons in the mode $k$. The number states of a mode $k$ form a complete orthonormal set so that $\ip{n}{m}=\delta_{nm}$. The state $\ket{0}$ will denote the state in which every mode $k$ is in the state $\ket{0}_k$. In this way the observables of a particular mode $k$ are given by the annihilation and creation operators satisfying
\[
\cre_k\ket{n}_k=\sqrt{n+1}\ket{n+1}_k
\]
and
\[\begin{align*}
\ani_k\ket{n}_k&=\sqrt{n}\ket{n-1}_k   \text{for}   k\geq 1, \\
\ani_k\ket{0}_k&=0.
\end{align*}
\]
Thus, any number state $\ket{n}_k$ can be expressed in terms $\ket{0}_k$ state
\[
\ket{n}_k=\frac{(\cre_k)^n}{\sqrt{n!}}\ket{0}_k.
\]
Moreover, the creation and annihilation operators satisfy the canonical commutation relations
\[
[\ani_k,\cre_{k'}]=\delta_{kk'} \ \ \ \text{and} \ \ \ [\ani_k,\ani_{k'}]=[\cre_k,\cre_{k'}]=0.
\]



Single Photon Creation and Detection

Necessary requirements for the paradigm of quantum optics are the abilities to both create single photons and to detect them reliably. In addition to the linear elements to be introduced later, controlled photon creation and detection are essential resources necessary for efficient linear optical quantum computation. Even though perfect reliable photon creation and detection are difficult to achieve in practice, this will be taken somewhat for granted in describing later protocols for LOQC where it is implicitly assumed that ideal photon creation and detection is readily available.

 In regards to single photon creation, methods can be essentially classified as either deterministic or probabilistic. A probabilistic source, for instance, may involve multiple photon detections where a second photon can be used to signal the creation of the first ``'single" photon. Such schemes are commonly used for quantum information processing. However,  deterministic means which are able to produce single photons on demand in a controlled way are beneficial. One way of achieving this is through spontaneous parametric down conversion (SPDC), in which photons interact with a nonlinear crystal to occasionally create a pair of correlated photons \cite{Kok}.

Although SPDC is commonly used for quantum optical purposes, other means involving single photon emission through the controlled stimulus of various physical systems is also possible \cite{Migdall}. It is also worth mentioning that single photon states can be created without the use of nonlinearities using weak squeezing where the Hamiltonian $\hat{s}_{jk}=\ani_j\ani_k+\cre_j\cre_k$ is applied to the vacuum state in order to produce specific number states in the modes. Experimental realizations of such a technique have been demonstrating in \cite{Hong}.

The matter of photon detection comes in two varieties: being able to distinguish merely between zero and some nonzero number of photons in a mode, or the more refined ability of actually being able to count the precise number of photons in a mode. This latter means of photon detection is required for various applications of LOQC. Such measurements involve some particle detector which destructively determines the presence of photons. One way to approximately count the number of photons in some mode $k$ is to use optical elements to effectively split the mode into $N$ other modes $k_1,k_2,\dots, k_N$, and then use $N$ particle detectors to count the number of photons in each of the modes $k_l$. Now suppose that mode $k$ has $n$ photons, then the probability that any of the modes $k_l$ has more than one photon is given by $1-\frac{N!}{(N-n)!N^n}\leq\frac{n(n+1)}{2N}$. Thus, provided that the number of photons $n$ in the mode is not too large, or if the number of split modes $N$ is sufficiently large, the probability of any of the modes $k_l$ having more than one photon is small. Therefore, with high probability the number of actual photons in the main mode $k$ would be given by the number of photon detectors (out of $N$ many) that detect a photon. Further details pertaining to photon detection and its associated errors can be found in \cite{Kok}.


Linear Optical Elements

The main dynamical elements that induce state transformations in linear optics are phase shifters and beam splitters. An important property of such passive linear optical elements is that they preserve the boson number of the states. That is, if $U$ represents the unitary operator associated with such a state transformation, then $U^\dagger\ket{0}=\ket{0}$. The effect of $U$ on the creation operators can then be described as follows:
\begin{align*}
U\cre_k\ket{0}=U\cre_kU^\dagger\ket{0}=\displaystyle\sum_{j}u_{jk}\cre_{j}\ket{0},
\end{align*}
where $U=(u_{jk})_{jk}$ gives the matrix coefficients of $U$. By letting $\hat{b}^{\dagger}_k=U\cre_k$ represent the outmode mode, an optical element corresponding to the transformation $U$, is said to be \emph{linear} if each output mode is a linear combination on the input modes $\cre_j$:
\[
\hat{b}^{\dagger}_{k}=\displaystyle\sum_{j}u_{jk}\cre_{j}\
\]

The phase shift optical element $P_\phi$, which acts on a single mode $k$, has the effect of introducing a phase factor $e^{in\phi}$ on the state $\ket{n}_k$.   This transformation can be described by the Hamiltonian $\hat{N}_k=\cre_k\ani_k$ so that the unitary representing the action of the phase shift is given by $U(P_\phi)=e^{i\phi\cre_k\ani_k}$. In an optical network, the phase shift element acting in a particular mode will be depicted schematically as shown in the figure below.

A phase shift element $P_\phi$.


The other main optical element predominantly used in linear optics is a beam splitter $B^{(jk)}_{\theta, \phi}$ that is defined to act on two different modes $j$ and $k$ of the system. The action of a beam splitter can be described by the Hamiltonian $\hat{B}_{jk}=e^{i\phi}\cre_j\ani_k+e^{-i\phi}\ani_j\cre_k$, with corresponding unitary matrix given by
\[
U(B_{\theta, \phi})=\begin{pmatrix}
\cos\theta &-e^{i\phi}\sin\theta \\
e^{-i\phi}\sin\theta & \cos\theta
\end{pmatrix},
\]
where mode indices have been suppressed. The schematic symbol for the beam splitter $B^{(jk)}_{\theta, \phi}$ is shown in the figure below.

A beam splitter element $B_{\theta,\phi}$.




Quantum Computation with Linear Optics

Given the paradigm of quantum linear optics discussed in the previous sections, the objective now is to show how quantum computation can be achieved in this setting. This requires a suitable way to encoding qubits using bosonic modes, and a means of implementing quantum gate operations  on the qubits using optical elements consisting of various phase shifters and beam splitters. Again, also implicit in these constructions will be a means of preparing and measuring qubits by means of photon creation and detection. In what follows, after discussing some schemes for quantum computation which fail to do so, a general scheme for quantum computation that does not rely on nonlinear optical elements but yet has efficient scalability properties will be discussed.

Early Proposals

Traditional quantum optics models (e.g. \cite{Chuang1}, \cite{Chuang2}) may use $n$ photons to represent $n$ qubits, and allow the use of nonlinear optical elements to achieve $2$ qubit quantum gates.  One example of a nonlinear element is a Kerr medium, which has the property that its refractive index contains nonlinear terms so that a beam traversing the Kerr medium undergoes a phase shift proportional to the beam's intensity. Moreover, photons in two paths incident to the Kerr medium can induce controlled-phase operations allowing for universal quantum computation when considered together with arbitrary single qubit operations. An early example concerning the implementation of a quantum optical Fredkin gate has been achieved through these means as demonstrated in \cite{Milburn}.

The issue with using nonlinear optical elements through a natural Kerr medium, is the high degree of nonlinearity that must be present in order to implement gates of interest making such methods extremely difficult to implement in practice.
  
   A proposal using only linear optics that can achieve universal computation was given by Adami and Cerf in \cite{Adami}. This proposal uses just a single photon that has access to $2^n$ distinct spatial paths or modes to encode $n$ qubits. Simple gate implementations in this scheme are achieved through particular arrangements of only linear optical elements consisting of phase shifters and beam splitters. The major drawback here is that $2^n-1$ beam splitters are needed to set up the $2^n$ paths necessary to encode $n$ qubits. Since this number is exponential in $n$, general quantum computations done this way will require an exponential amount or resources (beam splitters). Hence, this Adami-Cerf proposal does not possess the desired scaling properties for efficient quantum computation to be accomplished in general. Regardless, in \cite{Adami}, Adami and Cerf use their scheme to provide experimental realizations of the quantum teleportation protocol---a nontrivial quantum computation.


The KLM Proposal

Despite initial beliefs that scalable universal quantum computation may only be achieved when nonlinear optical elements are exploited, in \cite{KLM} Knill, Laflamme, and Milburn offered a scheme that yields efficient quantum computation using only linear optics together with single photon sources and detectors. The key insight made here is that a particular two qubit gate can be executed non-deterministically with some probability of success through the exclusive use of linear optical elements. Furthermore, this gate can be implemented through gate teleportation, which effectively reduces the problem of applying the two qubit gate to the problem of constructing a special state instead. The probability of success associated to such a procedure can be increased arbitrarily through the preparation of richer states and by using quantum error correction to eliminate certain errors. With these tools the KLM scheme offers a scalable means of quantum computation. An outline of the KLM proposal is provided in the following sections.

Qubit Encodings
Consider an optical system with two distinct modes $a$ and $b$ representing distinct spatial paths, and suppose a single photon is present in either of the two modes. The possibilities present for the state of the system can be used to encode a single qubit by letting the two computational basis states be given by
\begin{align*}
\ket{0}_q&:=\ket{0}_a\ket{1}_b=\ket{01}_{ab} \\
\ket{1}_q&:=\ket{1}_a\ket{0}_b=\ket{10}_{ab}.
\end{align*}
Here, the presence of the subscript $q$ for the qubit basis states does not refer to a mode, but is included to denote that the state $\ket{\cdot}_q$ represents a qubit. Such an encoding of qubits is often referred to in the literature as ``dual rail logic". Another possible way to encode an optical qubit is through the polarization properties of photons. If $\ket{H}$ and $\ket{V}$ represent the state of a single photon being horizontally and vertically polarized, respectively, then the computational basis states of the qubit can be alternatively identified as
\[ \ket{0}_q:=\ket{H} \ \ \ \text{and} \ \ \ \ket{1}_q:=\ket{V}.  \]
Although both of these qubit representations offer their own utilities in certain contexts, the dual rail logic encoding with be the used in what follows. However, it should be noted that one representation can be transformed to the other using linear optical elements as discussed in \cite{Myers}.

Single Qubit Gates
A natural consequence of linear optics using only phase shifters and beam splitters is that single qubit operations are easily implemented. Recall that an arbitrary single qubit unitary $U$ can be expressed as a product of rotations around the $Y$ and $Z$ axis of the Bloch sphere. More explicitly
\[U=e^{i\alpha}R_z(\beta)R_y(\gamma)R_z(\delta),\]
where $R_z(\theta)=e^{i\frac{\theta}{2}\pz}$ and $R_y(\phi)=e^{i\frac{\phi}{2}\py}$ are the rotations about the corresponding axis. Here, $\pz$, $\py$ are the standard Pauli operators. The relevance of this decomposition will become apparent after describing the quantum gates corresponding to phase shifters and beam splitters.

By applying a phase shift to, say, the top mode of a dual rail encoded qubit a relative phase is introduced in the state. A schematic representation of such a gate is shown in the following figure.

A phase shift gate acting on the top mode of a dual rail qubit.


 To see this, consider an arbitrary single qubit state $\ket{\psi}_q=\alpha\ket{0}_q+\beta\ket{1}_q$. Then the action of $U(P_\phi)$ on this state is given by
\begin{align*}
U(P_\phi)\ket{\psi}_q&=\alpha U(P_\phi)\ket{0}_q+\beta U(P_\phi)\ket{1}_q \\
&=\alpha U(P_\phi)\ket{0}_a\ket{1}_b+\beta U(P_\phi)\ket{1}_a\ket{0}_b \\
&=\alpha \ket{0}_a\ket{1}_b+\beta e^{i\phi}\ket{1}_a\ket{0}_b \\
&=\alpha\ket{0}_q+\beta e^{i\phi}\ket{1}_q
\end{align*}
Further manipulations of this transformed state gives
\begin{align*}
U(P_\phi)\ket{\psi}_q&=\alpha\ket{0}_q+\beta e^{i\phi}\ket{1}_q \\
&=e^{i\phi/2}(e^{-i\phi/2}\alpha\ket{0}_q+e^{i\phi/2}\beta\ket{1}_q) \\
&=e^{i\phi/2}e^{-i\phi Z_q/2}(\alpha\ket{0}_q+\beta\ket{1}_q) \\
&=e^{i\phi/2}R_z(\phi)(\alpha\ket{0}_q+\beta\ket{1}_q),
\end{align*}
where now $Z_q$ denotes the effective Pauli-$Z$ operation on the encoded qubit. This shows that (up to a physically irrelevant global phase factor $e^{i\phi/2}$) the action of the phase shift $P_\phi$ amounts to a rotation about the $Z$ axis of the Block sphere.

Now consider the beam splitter $B_{\theta,0}$ acting on the two modes of a single dual rail qubit as depicted in the figure.

 A beam splitter acting on the two modes of a single dual rail qubit.

In this case, the matrix representation of $B_{\theta,0}$ is given by
\[
U(B_{\theta, 0})=\begin{pmatrix}
\cos\theta &\sin\theta \\
\sin\theta & \cos\theta
\end{pmatrix},
\]
and its action on an arbitrary single qubit $\ket{\psi}_q=\alpha\ket{0}_q+\beta\ket{1}_q$ is given by
\begin{align*}
U(B_{\theta,0})\ket{\psi}_q&=\alpha U(B_{\theta,0})\ket{0}_q+\beta U(B_{\theta,0})\ket{1}_q \\
&=\alpha U(B_{\theta,0})\ket{01}_{ab}+\beta U(B_{\theta,0})\ket{10}_ab \\
&=\alpha(\cos(\theta)\ket{01}_{ab}-\sin(\theta)\ket{10}_{ab})+\beta(\cos(\theta)\ket{10}_{ab}+\sin(\theta)\ket{01}_{ab}) \\
&=\cos(\theta)(\alpha\ket{01}_{ab}+\beta\ket{10}_{ab})-\sin(\theta)(\alpha\ket{10}_{ab}-\beta\ket{01}_{ab}) \\
&=e^{i\theta Y_q}(\alpha\ket{01}_{ab}+\beta\ket{10}_{ab} \\
&=R_y(-2\theta)\ket{\psi}_q,
\end{align*}
where $Y_q$ represents the Pauli-$Y$ operation on the qubit. Hence, the beam splitter $B_{\theta,0}$ has the effect of performing a rotation of $-2\theta$ about the $Y$ acid of the Block sphere.

With these results in mind, together with the decomposition of an arbitrary single qubit unitary in terms of Bloch sphere rotations about the $Z$ and $Y$ axis described above, it is readily seen that any single qubit unitary can be executed by simply using a phase shift $P_\delta$ and a beam splitter $B_{-\gamma/2,0}$ followed by another phase shift $P_\beta$.

A Two Qubit Gate: $\mathbf{\con{Z}}$
In the previous section it was shown how arbitrary single qubit transformations can be accomplished using only linear optical elements consisting of phase shifts and beam splitters. It is known that in order to achieve universal quantum computation, two-qubit gates are also necessary in addition to the single qubit gates. Specifically, the two-qubit gate must be an entangling gate to ensure universality. A standard conventional choice for such a two-qubit entangling gate is the controlled-NOT gate $\con{X}$. However, for our purposes the controlled-SIGN gate $\con{Z}$ will be implemented instead. Abstractly, the action of the $\con{Z}$ gate is given by $\ket{u}\ket{v}\mapsto(-1)^{u\cdot v}\ket{u}\ket{v}$, where $u,v\in\{0,1\}$. In this way, $\con{Z}$ is related to $\con{X}$ through the relation $\con{Z}=(I\otimes H)(\con{X})(I\otimes H)$, where $H$ is the Hadamard transform.

In order to perform the $\con{Z}$ gate, observe that a basic transformation of the form
\[ \alpha\ket{0}+\beta\ket{1}+\gamma\ket{2} \mapsto \alpha\ket{0}+\beta\ket{1}-\gamma\ket{2} \]
is needed. This operation will be called the nonliner sign shift gate, denoted by $NS_{-1}$, for reasons that are suggested by the name. Since $NS_{-1}$ is a nonlinear gate its action can not be given using only linear optical elements. Instead, the $NS_{-1}$ gate will be implemented nondeterministically (probabilistically) using just linear optics and ancilla modes. Once this is achieved for the $NS_{-1}$ gate,  the probabilistically implemented $NS_{-1}$ gates will be used to construct a probabilistic implementation of $\con{Z}$.

Consider three spatial bosonic modes where mode $1$ contains the main state of interest $\ket{\psi}_1$, and modes $2$ and $3$ serve as ancilla registers initialized in the states $\ket{1}_2$ and $\ket{ 0}_3$. Let the initial state of mode $1$ be in the state $\ket{\psi}_1=\alpha\ket{0}+\beta\ket{1}+\gamma\ket{2}$. Then it can be shown that the following optical network shown in the figure below produces the state $NS_{-1}\ket{\psi}_1$ in the first mode provided that the to ancillary modes are measured to be precisely $\ket{1}_2\ket{0}_3$.
 An optical circuit for the nonlinear sign shift gate $NS_{-1}$.


For the particular action of $NS_{-1}$ the angle parameters of the optical elements are given by
\begin{align*}
&\theta_1=22.5^{\circ},   \phi_1=0^{\circ}\\
&\theta_2=65.5302^{\circ},   \phi_2=0^{\circ} \\
&\theta_3=-22.5^{\circ},   \phi_1=0^{\circ} \\
&\phi_4=180^{\circ}.
\end{align*}
The corresponding unitary matrix $U(NS_{-1})$ acting on the three modes is given by
\[
U(NS_{-1})=\begin{pmatrix}
1-\sqrt{2} & \frac{1}{\sqrt{\sqrt{2}}} & \sqrt{\frac{3}{\sqrt{2}}-2} \\
\frac{1}{\sqrt{\sqrt{2}}} &\frac{1}{2} & \frac{1}{2}-\frac{1}{\sqrt{2}} \\
\sqrt{\frac{3}{\sqrt{2}}-2} & \frac{1}{2}-\frac{1}{\sqrt{2}} & \sqrt{2}-\frac{1}{2}
\end{pmatrix}.
\]
 If the two ancillary modes after measurement are in a state different from $\ket{1}_2\ket{0}_3$, the state in the first mode is not the desired state $NS_{-1}\ket{\psi}$. Since there are $4$ possible output states for the two ancilla modes, and only one of these states (namely $\ket{1}_2\ket{0}_3$) yields the desired outcome in the first mode, the probability of success in this nondeterministic implementation of $NS_{-1}$ is $1/4$.

 Consider two dual rail qubits $\ket{Q_1}=\alpha\ket{0}_q+\beta\ket{1}_q$ and $\ket{Q_2}=\gamma\ket{0}_q+\delta\ket{1}_q$ in arbitrary states a nondeterministic $\con{Z}$ gate can be performed on these two qubits by making using of two $NS_{-1}$ gates as shown in the figure below. Since there are two $NS_{-1}$ gates, and each requires the use of two ancilla modes and has a individual success probability of $1/4$, there are four ancilla modes in total and the success probability of implementing the $\con{Z}$ gate is $1/16$.

 An optical network implementing the controlled-SIGN gate $\con{Z}$.




Gate Teleportation of $\mathbf{\con{Z}}$ 

 It has now been shown how to implement, albeit probabilistically, a two-qubit gate $\con{Z}$  which is universal for quantum computation when considered together with single qubit gates. Although this implementation was accomplished using linear optical elements alone, the inherent probabilistic nature of the gate does not allow for scalable quantum computation due to low success rates. The next key insight utilized in the KLM scheme helps overcome this obstacle by increasing the success probability through the means of gate teleportation. Such a teleportation procedure was originally discussed in \cite{Got}.

 In essence, by harnessing quantum teleportation the problem of needing to apply the probabilistic gate $\con{Z}$ is reduced to the problem of having to prepare a certain state "off-line". This has the advantage of first being able to ensure the proper state is created before carrying on with the gate implementation so that the states in the rest of the system do not get corrupted. In this way, if an error does occur during the state preparation phase it will be known. Then the state preparation can simply be repeated until the desired state is created successfully.

 To understand this more thoroughly, consider two qubits $\ket{\psi_1}_q$ and $\ket{\psi_2}$ on which a $\con{Z}$ is to be applied. This can be accomplished, although it may seem superfluous, by first teleporting the two qubits and then applying $\con{Z}$ as suggested in the figure below.
 Teleportation of two qubits follows by a $\con{Z}$ gate. The region contained in side the dashed/dotted box is the state preparation area.

Since $(\con{Z})^2=I$ consider adding the trivial action of two $\con{Z}$ gates prior to the correction procedure for the teleportation as shown in the following figure.
 Teleportation with the trivial addition of two $\con{Z}$ gates before the measurement phase.


 Focusing on the two middle registers of the network, and applying certain relations for when a Pauli operator is conjugated by a $\con{Z}$, it is seen that these sequences of gates containing $\con{Z}$ operations can equivalently expressed without any $\con{Z}$ gates as
 \[
 (\con{Z})(Z^{m'_1}X^{m_1}\otimes Z^{m'_2}X^{m_2})(\con{Z})=Z^{m'_1}X^{m_1}Z^{m_2}\otimes Z^{m_1}Z^{m'_2}X^{m_2} ,\]
  where $m_1, m'_1$ and  $m_2,m'_2$ are the two classical bits given by the measurements outcomes of the upper and lower teleportations, respectively. This is more clearly depicted in the the following figure.
 Teleportation with a $\con{Z}$ gates in the state preparation area and  only Pauli gates after measurement.

 Thus, it is seen here that there is only a single $\con{Z}$ gate that appears anywhere in the optical circuit. Moreover, this $\con{Z}$ gate has been delegated to the state preparation phase of the teleportation procedure, and the only gates that are to be applied after measurement are simply single qubit Pauli operators. Therefore, the only probabilistic aspect that comes into play when implementing the $\con{Z}$ gate can be thought of has happening ``off-line" so that any failure that may occur in the $\con{Z}$ implementation does not jeopardize the rest of the computation.

 To complete the implementation of $\con{Z}$ using only linear optical elements, it remains to describe an optical network that executes the teleportation task. As described in \cite{Myers}, a general teleportation protocol can be accomplished using the network displayed below.
An optical network describing the standard teleportation protocol.


Now, all of the elements introduced thus far can be combined to give a complete linear optical network for the implementation of $\con{Z}$ as shown:
 The complete optical network for the nondeterministic implementation of $\con{Z}$ using teleportation.



 Increasing the Success probability of $\mathbf{\con{Z}}$

 The successes probability associated to the probabilistic implementation of $\con{Z}$ described in the previous sections cannot be made arbitrarily close to $1$ in a scalable manner. In order to achieve better scaling rates, the state constructed during the state preparation phase of the teleportation protocol can be modified. This modification involves creating  more complex entangled states which involve more bosonic modes as a resource. By creating richer states to be used, the success probability of $\con{Z}$ can be made arbitrarily close to $1$ in a scalable manner. However, the catch here is that the appropriate states required can be difficult and resource intensive to create. The next improvement in the KLM proposal is to avoid having to prepare increasingly complex states, by using active error correction to correct errors that may occur during this stage. By analyzing the particular nature of the errors in this context, appropriate error correcting codes can be deployed that increase the success probability of $\con{Z}$. For the sake of brevity, further details pertaining to these aspects of the KLM proposal will be omitted here. The interested reader is invited to enlighten themself through references (\cite{KLM},\cite{Kok},\cite{Myers},)



Conclusion

 In this present work, a brief general overview of quantum optics was given and various proposals realizing quantum computation through quantum optics were discussed. The main scheme focused on here was the $KLM$ proposal which achieves scalable linear optical quantum computation, using only linear optical elements (such as phase shifters and beam splitters) together with the ability to prepare ancilla modes and make photon detections. Key features of this scheme that allow for scalability are the use of quantum teleportation and error correction to achieve arbitrarily large success probabilities in an efficient manner.

 Experimental demonstrations realizing various aspects of the KLM scheme have been carried out in the field. In \cite{Okamoto},  a controlled-NOT, or $\con{X}$ gate, was successfully implemented. In \cite{Pittman}, a demonstration of quantum error correction using linear optics was described. Some more exotic experiments include the observation of anyonic features in a toric code simulation using LOQC  in \cite{Pachos}, and another more recent demonstration of topological error correction in \cite{Yao}. The latter experiment makes use of the paradigm of \emph{one-way quantum computation} \cite{Walther} that involves the preparation of an initial \emph{cluster state} \cite{Nielsen}, which is then subject exclusively to various measurements conditions on the outcomes of previous measurements. This approach to quantum computation has its own merits, and when implemented with LOQC each paradigm further benefits the other. That is, cluster state/one-way quantum computation offers yet another means for improving LOQC.

Saturday, 29 March 2014

Tranversal gates in permutation-invariant codes and CSS codes

Let $S_n$ be the group of permutations of the $n$-element set $[n]:=\{1 , 2, \dots, n\}$, and let $\pi\in S_n$. Define the unitary $U_\pi: (\mathbb{C}^2)^{\otimes n}\mapsto(\mathbb{C}^2)^{\otimes n}$ by
\[
U_\pi(\ket{\varphi_1}\otimes\cdots\otimes\ket{\varphi_n})=\ket{\varphi_{\pi^{-1}(1)}}\otimes\cdots\otimes\ket{\varphi_{\pi^{-1}(n)}}
\]
for all product states $\ket{\varphi_1}\otimes\cdots\otimes\ket{\varphi_n}$ (and extended linearly to all of $(\mathbb{C})^{\otimes{n}}$). Denote by $(i j)$ the transposition of $i,j\in[n]$ for $i\neq j$, and define the subspace
\[
Q_n=\{\ket{\Psi}\in(\mathbb{C})^2)^{\otimes{n}} \mid U_{(i j)}\ket{\Psi}=\ket{\Psi} \  \text{for all} \ i\neq j \}.
\]

 Let $\vec{b}=b_1b_2\dots b_n \in\{0,1\}^n$ with $b_i\in\{0,1\}$, and let $\ket{\vec{b}}\in(\mathbb{C}^2)^{\otimes n}$ be a $n$-qubit computational basis state. More explicitly, $\ket{\vec{b}}=\bigotimes_{i=1}^{n}\ket{b_i}$, describes the $n$-qubit state where each qubit is either in the state $\ket{0}$ or $\ket{1}$. Now define the map
 \[
  \omega: \{0,1\}^n\to \{0,1,\dots,n\} \ \ \text{given as} \ \  \omega(\vec{b})=\SUM{i=1}{n}b_i,
  \]
  which simply counts the number of $1$s appearing in the bit string $\vec{b}$, and call $\omega(\vec{b})$ the \emph{weight} of $\vec{b}$.
 
It was shown in a previous post that a basis of $Q_n$ is given by the $n+1$ states of the form
 
  \[
  \ket{\omega_k}=\frac{1}{\sqrt{\binom{n}{k}}}\SUM{ \ \ \vec{b} \  : \  \omega(\vec{b})=k}{}\ket{\bar{b}},
    \]
   
  where each basis state $\ket{\omega_k}$, for $k\in\{0,1,\dots, n\}$, consists of an equally weighted superposition of the $\binom{n}{k}$ computational basis states $\ket{\vec{b}}$ with weight $\omega(\vec{b})=k$.


 Consider the two dimensional subspace of $Q_n$ which is spanned by the following two states:
 \[\begin{align*}
 \ket{\overline{0}}&:=\ket{\omega_0}=\TENSOR{i=1}{n}\ket{0}=\ket{\vec{0}}, \\
 \ket{\overline{1}}&:=\frac{1}{\sqrt{2^n-1}}\SUM{k=1}{n}\sqrt{\binom{n}{k}}\ket{\omega_k} \\
 &=\frac{1}{\sqrt{2^n-1}}\SUM{\vec{b}\neq\vec{0}}{}\ket{\vec{b}},
 \end{align*}  \]
 and note that $\ip{\overline{0}}{\overline{1}}=0$. Moreover, since each of $\ket{\overline{0}}$ and $\ket{\overline{1}}$ are given by superpositions of basis states of $Q_n$ each state is left invariant under the action of $U_{(ij)}$ for any transposition $(ij)\in S_n$. Therefore, the two dimensional subspace spanned by $\ket{\overline{0}}$ and $\ket{\overline{1}}$ is a valid subcode of $Q_n$. 

  Now, observe the action of the transversal Hadamard gate applied to each of the $n$ qubits of the state $\ket{\overline{0}}$

\[ \begin{align*}
 H^{\otimes n}\ket{\overline{0}}&= H^{\otimes n}\ket{\vec{0}} \\
  &=\frac{1}{\sqrt{2^n}}\SUM{\vec{b}\in\{0,1\}^n}{}\ket{\vec{b}} \\
  &=\frac{1}{\sqrt{2^n}}\ket{\vec{0}}+\frac{1}{\sqrt{2^n}}\SUM{\vec{b}\neq \vec{0}}{}\ket{\vec{b}} \\
  =&\frac{1}{\sqrt{2^n}}\ket{\vec{0}}+\frac{\sqrt{2^n-1}}{\sqrt{2^n}}\left(\frac{1}{\sqrt{2^n-1}}\SUM{\vec{b}\neq \vec{0}}{}\ket{\vec{b}}\right) \\
  &=\sqrt{2^{-n}}\ket{\overline{0}}+\sqrt{1-2^{-n}}\ket{\overline{1}}. \\
 \end{align*}\]

 Instead, observe the action of $H^{\otimes n}$ on the state $\ket{\overline{1}}$:
\[  \begin{align*}
 H^{\otimes n}\ket{\overline{1}}&= \frac{1}{\sqrt{2^n-1}}\SUM{\vec{b}\neq\vec{0}}{}H^{\otimes n}\ket{\vec{b}} \\
 &= \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\in\{0,1\}^n}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}} \\
 &= \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{0}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}}  \\
  &= \frac{2^n-1}{\sqrt{2^n-1}\sqrt{2^n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}}  \\
  &= \sqrt{1-2^{-n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{b}\neq\vec{0}}{}\SUM{ \ \vec{c}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\ket{\vec{c}}  \\
 &= \sqrt{1-2^{-n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n-1}\sqrt{2^n}}\SUM{\vec{c}\neq\vec{0}}{}\left(\SUM{ \ \vec{b}\neq\vec{0}}{}(-1)^{\vec{b}\cdot\vec{c}}\right)\ket{\vec{c}}  \\
  &= \sqrt{1-2^{-n}}\ket{\vec{0}} + \frac{1}{\sqrt{2^n}}\frac{1}{\sqrt{2^n-1}}\SUM{\vec{c}\neq\vec{0}}{}\left(-1\right)\ket{\vec{c}}  \\
    &= \sqrt{1-2^{-n}}\ket{\overline{0}} - \sqrt{2^{-n}}\ket{\overline{1}}.  \\
 \end{align*}\]

 Thus, in summary the action of $H^{\otimes n}$ on $\ket{\overline{0}}$ and $\ket{\overline{1}}$ is given by:
 \[\begin{align*}
 \ket{\overline{0}}&\overset{H^{\otimes n}}\mapsto \sqrt{2^{-n}}\ket{\overline{0}}+\sqrt{1-2^{-n}}\ket{\overline{1}} \\
  \ket{\overline{1}}&\overset{H^{\otimes n}}\mapsto \sqrt{1-2^{-n}}\ket{\overline{0}}-\sqrt{2^{-n}}\ket{\overline{1}},
 \end{align*}\]

 which is given by the logical operation $\overline{U}$ given by the matrix expressed in the logical basis as
 \[
 \overline{U}=\begin{pmatrix}
 \sqrt{2^{-n}} & \sqrt{1-2^{-n}}\\
  \sqrt{1-2^{-n}}&-\sqrt{2^{-n}} \\
 \end{pmatrix}.
 \]

Transitioning now into the setting of CSS codes, let $H_1, H_2$ be parity check matrices such that $Q=CSS(H_1,H_2)$ is a $[[n,k,d]]$-CSS code. The matrix of stabilizers for this code is given by the binary symplectic matrix
  \[
 S= \left(\begin{array}{c | c}
  0&H_1 \\
  H_2 & 0
  \end{array}\right)
  \]
 
  The action of conjugation by a transversal Hadamard applied to the symplectic matrix $S$ transforms it to 
    \[
 S'=H^{\otimes n}SH^{\otimes n \dagger} \left(\begin{array}{c | c}
  0&H_2 \\
  H_1 & 0
  \end{array}\right)
  \]
 
  Suppose that $H^{\otimes n}$ preserves the code space so that $H^{\otimes n}Q=Q$. This action must send codewords of $Q$ to codewords of $Q$ and only permute the set of stabilizers of the code.  Since $H^{\otimes n}$ effectively turns $X$-type generators into $Z$-type generators, then it must be the case that any row $r_i$ of $H_1$ lies in the span of the rows of $H_2$, and likewise that any row $r'_i$ of $H_2$ lies in the span of the rows of $H_1$. That is, $Row(H_1)\subseteq Row(H_2)$ and $Row(H_2)\subseteq Row(H_1)$, which is equivalent to the condition that
  \[
  Row(H_1)=Row(H_2),
  \] 
  where $row(H_i)$ represents the space spanned by rows of $H_i$.
 
  Now, suppose instead that $Row(H1)=Row(H_2)$. This implies that a row of $H_1$ (or $H_2$) can be expressed in terms of the rows of $H_2$ (or $H_1$). Then after the action of a transversal Hadamard $H^{\otimes n }$, each $X$-type (or $Z$-type) stabilizer will be transformed into a $Z$-type ($X$-type) stabilizer that can be generated by the original set of $Z$-type ($X$-type) stabilizers. Hence, the action of $H^{\otimes n}$ preserves the codespace: $H^{\otimes n}Q=Q$.
 
Therefore, the condition that $Row(H_1)=Row(H_2)$ is both a necessary and sufficient condition for $H^{\otimes n}Q=Q$. If $H_1=H_2$, then trivially $Row(H_1)=Row(H_2)$ so that $H^{\otimes n}Q=Q$.


Suppose that $H=H_1=H_2$ and $n$ is odd. Let $Row(H)$ be the space spanned by the rows of $H$, and assume that $|v|=\sum_{j=1}^{n}=0 \ (mod \ 2)$ for every $v\in Row(H)$. Note that this implies that $\vec{1}\in Row(H)^\perp\backslash Row(H)$, where $\vec{1}=(1,\dots,1)$ ($n$ times).

Therefore, the operators given by $X^{\otimes n}$ and $Z^{\otimes n}$ are in the normalizer of the stabilizer for $Q$, because each stabilizer will commute with $X^{\otimes n}$ and $Z^{\otimes n}$ as the size of the intersection of the supports of any stabilizer with either $X^{\otimes n}$ and $Z^{\otimes n}$ is even. Moreover, $X^{\otimes n}$ and $Z^{\otimes n}$ are not in the stabilizer since every stabilizer acts nontrivially only on an even number of physical qubits whereas $X^{\otimes n}$ and $Z^{\otimes n}$ acts on all $n$ qubits (where $n$ is odd). Hence, the operators $X^{\otimes n}$ and $Z^{\otimes n}$ yield a logical operation on $Q$ on some encoded qubit. Without loss of generality associate these logical operations to act on the first encoded qubit:
\[\begin{align*}
\overline{X}_1&=X^{\otimes n} \\
\overline{Z}_2&=Z^{\otimes n}
\end{align*}\]

Consider then the logical basis states for the $1$st encoded qubit: $\ket{\overline{0}}_1$ and $\ket{\overline{1}}_1$. In this basis, these states can be expressed as:
\[\begin{align*}
\ket{\overline{0}}_1\bra{\overline{0}}_1&=\frac{I+\overline{Z}}{2} \\
\ket{\overline{1}}_1\bra{\overline{1}}_1&=\frac{I-\overline{Z}}{2}.
\end{align*}\]

Then since $HXH^\dagger=Z$ and $HZH\dagger=X$, this implies that $H^{\otimes n}\overline{X}H^{\otimes n \dagger}=\overline{Z}$ and $H^{\otimes n}\overline{Z}H^{\otimes n \dagger}=\overline{X}$. Therefore,
\[\begin{align*}
H^{\otimes n}\left(\frac{I+\overline{Z}}{2}\right)H^{\otimes n \dagger}&=\frac{I+\overline{X}}{2}=\ket{\overline{+}}_1\bra{\overline{+}}_1 \\
H^{\otimes n}\left(\frac{I-\overline{Z}}{2}\right)H^{\otimes n \dagger}&=\frac{I-\overline{X}}{2}=\ket{\overline{-}}_1\bra{\overline{-}}_1,
\end{align*}\]
  where $\ket{\overline{+}}=\frac{1}{\sqrt{2}}(\ket{\overline{0}}_1+\ket{\overline{1}}_1)$ and $\ket{\overline{-}}=\frac{1}{\sqrt{2}}(\ket{\overline{0}}_1-\ket{\overline{1}}_1)$. Hence, the action of $H^{\otimes n}$ on the logical computational basis of the first encoded qubit is given by a logical Hadamard on that encoded qubit:
\[  \begin{align*}
  \ket{\overline{0}}&\overset{\overline{H}}\mapsto\ket{\overline{+}},\\
  \ket{\overline{1}}&\overset{\overline{H}}\mapsto\ket{\overline{-}}.
  \end{align*}\]
 
  In regards to the action of $H^{\otimes n}$ on the rest of the $k-1$ encoded qubits, recall that $H^{\otimes n}$ preserves the code space. Therefore, $H^{\otimes n}$ merely permutes the individual stabilizers. More generally, $H^{\otimes n}$ is in the Clifford group $C_n$. Thus, it must be the case that $H^{\otimes n}=\overline{H}\otimes \overline{C}$, where $\overline{C}$ is some logical Clifford operation since $H^{\otimes n}$ is a Clifford operation.