Following John Preskill’s 1998 Quantum Information and Computation notes.
Introduction
- Landauer’s principle: It takes energy to erase information (since erasure always compresses phase space, such processes are irreversible)
- You can store a bit of information as one molecule in a box. If it’s on the left, it’s on, else it’s off. If you slowly compress the volume in half, you’re guaranteed to be in the LHS. The change in entropy is $k \ln 2$, so there is some associated work (at least $kT \ln 2$ at temperature $T$) that needs to be performed
- Logic gates used to perform computation are typically irreversible
- For instance, NAND is irreversible, since one bit of information is lost for each gate
- NOT is an example of a reversible gate
- Charles Bennett observed that any computation can in principle be done reversibly
- You can construct a Toffoli gate:
- Input is (a,b,c)
- Output is $(a,b, c \oplus (ab))$
- So a and b get mirrored, and the third bit gets flipped if the first two bits are both 1 (otherwise, it mirrors the input)
- A Toffoli gate is universal (provided you throw away the ancilla bits when interpreting the output)
- You can in principle do any computation up to the end, print out a copy of the answer (logically reversible process), then step back the computation back to the beginning
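The Toffoli rule above is small enough to check exhaustively. The following is a minimal sketch in plain Python; the function names `toffoli` and `nand` are illustrative choices, not taken from the notes. It checks that the gate is its own inverse, that it permutes the eight inputs, and that fixing the third bit to 1 reproduces NAND on the third output.

```python
from itertools import product

def toffoli(a, b, c):
    """Return (a, b, c XOR (a AND b)): the third bit flips only when a = b = 1."""
    return a, b, c ^ (a & b)

def nand(a, b):
    """NAND from a Toffoli gate: fix the ancilla c = 1 and read the third output bit."""
    return toffoli(a, b, 1)[2]

inputs = list(product((0, 1), repeat=3))

# Reversibility: applying the gate twice restores every input.
is_self_inverse = all(toffoli(*toffoli(*bits)) == bits for bits in inputs)

# The gate permutes the 8 possible inputs, so no information is lost.
is_permutation = sorted(toffoli(*bits) for bits in inputs) == inputs

# Truth table of the NAND built from the Toffoli gate.
nand_table = {(a, b): nand(a, b) for a, b in product((0, 1), repeat=2)}
```

The first two output bits are the "ancilla" outputs that get thrown away when the gate is read as a NAND.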
Maxwell’s Demon
- The aforementioned ideas allow a resolution of the Maxwell’s demon paradox
- The original formulation is as follows: You have a partitioned box (split into A and B parts) and some demon which observes the molecules in the box
- If a fast particle is moving from A to B and it will cross the partition, then the demon allows it through
- If a fast particle is moving from B to A across the partition, the demon blocks it
- Over time, you will get faster particles on the right and slower ones on the left with minimal work. This has the effect of having heat flow from a cold place to a hot place at no cost, in violation of the 2nd law of thermodynamics
- The resolution to this is that the demon needs to have some memory/ keep information on the molecules in the box. If the demon has a finite memory capacity, eventually, information will need to be erased, which results in work being done
Quantum Influence
- The quantum nature of reality changes the definition of information
- Quantum measurement outcomes are truly random, which has no place in deterministic classical dynamics
- The uncertainty principle implies that the act of acquiring information from a system invariably disturbs the system
- The no-cloning theorem of quantum mechanics states that quantum information cannot be copied with perfect fidelity
- In classical computation, you can clearly copy a state with impunity
- The main fundamental difference of quantum mechanics is Bell’s theorem. This states that quantum mechanics is not a local hidden variable theory. All of the information in a quantum system is encoded in nonlocal correlations that have no classical analog
Quantum Complexity
- Classically, the indivisible unit of information is the bit
- The quantum analog is the qubit, which is a vector in a 2D complex vector space with an inner product
- ie. $|\phi > = a|0> + b | 1>$
- Performing measurements on $|\phi>$ gives a probabilistic output
- Generalizing to N qubits, you need $2^{N}$ basis vectors to completely specify the state
- This can be compactly written as $\Sigma_{x=0}^{2^{N}-1} a_{x} | x>$ where $a_{x}$ are complex numbers
- Thus, any quantum computation consists of applying unitary transformations onto this N qubit representation. You can then measure the state by projecting onto one of the basis vectors
- A quantum computation is probabilistic by nature, so multiple runs are not guaranteed to yield the same result
- All of these operations can be simulated on a classical computer (re: vector representations, matrix multiplications, inner products)
- The trouble arises if you want to simulate a large number of qubits. Even 100 qubits requires manipulating a vector of $2^{100} \approx 10^{30}$ complex numbers
- You can’t divide and conquer the complexity of a quantum system due to the nonlocal correlations imparting the vast majority of the information content
- Standard computers are Turing complete: given unlimited time and memory, they can carry out any computation that a Turing machine can
- Problems can be classified as “hard” or “easy” depending on how much time and memory they consume
- The description of how “hard” a problem is should be universal: it should not depend on the hardware you’re running on
- The standard distinction is between polynomial time algorithms and exponential time algorithms
- No polynomial time algorithm is known for simulating a quantum computer on a classical computer (the direct state-vector simulation takes exponential time and memory)
- In light of this physical reality, a classical Turing machine is not an appropriate model for quantum computers
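To put a number on the state-vector size quoted above, here is a minimal sketch. It assumes each complex amplitude is stored as two 64-bit floats (16 bytes); that storage choice is an assumption of this example, not something the notes specify.

```python
def num_amplitudes(n_qubits):
    """Number of complex amplitudes in an N-qubit state vector."""
    return 2 ** n_qubits

def state_vector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory for the full state vector (assumed: 16 bytes per complex amplitude)."""
    return num_amplitudes(n_qubits) * bytes_per_amplitude

amps_100 = num_amplitudes(100)      # a bit more than 10**30
bytes_30 = state_vector_bytes(30)   # 16 GiB under the 16-byte assumption
```

Each added qubit doubles both numbers, which is why the direct simulation is exponential in the number of qubits.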
Errors
- Information is stored in nonlocal correlations of the system. A large quantum system can couple to its environment, which “spreads out” the correlation to the environment. You can’t measure the environment perfectly, so you inevitably lose information
- Can think of this as decoherence: the environment is continually “measuring” the state, which causes it to drift over time
- A separate problem is that we don’t have perfect quantum gates. Trying to implement perfect unitary transformation ain’t happening. There will always be some error of order $\epsilon$ from the ideal
- You need some sort of error correcting codes to deal with this reality
- Classically, there are a ton of error correcting codes (think Hamming codes)
- The simplest is just repetition: make N copies of your data, then do majority polling to determine the correct output
- Quantum mechanically, more things can go wrong
- You can have the standard bit errors like in classical computing
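- You can have phase errors, where, for instance, $|0> \rightarrow |0>$ and $|1> \rightarrow -|1>$ (the relative sign between the basis states flips)
- Quantum errors are continuous (a phase or amplitude can drift by an arbitrarily small amount), unlike the discrete jumps of classical bit flips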
- You can’t clone a quantum system with perfect fidelity
- You can’t measure the system without disturbing it
- Unsurprisingly, Peter Shor developed the first quantum error-correcting code, which can be thought of as an extension of the N-bit repetition code.
- For simplicity, suppose that we want to encode one qubit as 3 qubits
- So the encoding is $a|0> + b |1> \rightarrow a|000> + b|111>$
- We want to be able to detect errors without destroying this superposition
- If I measure the first qubit and get 0, then all the states collapse to 0 and I lose information about the coefficients a and b
- What if you instead measure pairs of qubits?
- Let the 3 qubit state be represented as $|x,y,z>$, and let $x \oplus y$ denote XOR-ing the bits
- Define the two-bit observable $(y \oplus z, x \oplus z)$ For our 3-qubit system, this observable will dictate which index an error occurred at
- 0 for no error, and then 1,2,3 (in binary) to denote that qubit 1,2,3 has been flipped (going left to right)
- What if there is a small deviation in state?
- Say that the following perturbations occur: $|000> \rightarrow |000> + \epsilon |100>$ and $|111> \rightarrow |111> + \epsilon |011>$
- When you project onto the eigenstate of $(y \oplus z, x \oplus z)$, most of the time (with probability $1-|\epsilon|^{2}$), the state will get reprojected back to the original state. The $|\epsilon|^{2}$ outcome just sets the observable to (0,1), indicating a bit flip in position 1
- This scheme fails with a probability of $|\epsilon|^{4}$ if multiple bit-flips happen
- The above scheme allows us to:
- Make a measurement of the system without damaging information ( you gain information about the error location, but not the exact configuration of your system)
- Small continuous errors either get corrected immediately, or produce a large discrete error which can be corrected
- Avoid the no cloning theorem ($a|000>+b|111>$ is not the same as $(a|0>+b|1>)^{3}$)
- The only remaining source of error is phase errors
- Do a similar trick used to address the other problems:
- Define a set of 9-qubit states $|0> = \frac{1}{2^{\frac{3}{2}}}(|000> + |111>)(|000> + |111>)(|000> + |111>)$ and $|1> = \frac{1}{2^{\frac{3}{2}}}(|000> - |111>)(|000> - |111>)(|000> - |111>)$
- Define a “cluster” as the state within each pair of parentheses
- If a bit flip occurs within a cluster, you can correct it with the aforementioned scheme
- Observe that the phases of the clusters are aligned (ie. they are all + or all -). If a phase flip occurs, then the phases of the clusters won’t be aligned (ie. one is + and another is -)
- You can generate a similar index observable, but now it’s a 6-qubit observable. If this observable is non-zero, then you can detect which cluster has a different sign compared to the others and correct for it
- The most general single-qubit unitary transformation can be expanded to order $\epsilon$ in terms of the Pauli matrices:
- $U = 1+ i\epsilon_{x}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}+ i \epsilon_{y}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}+ i\epsilon_{z} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
- The three terms can be thought of as a bit flip ($\sigma_{x}$), a combination of a bit flip and a phase flip ($\sigma_{y}$), and a phase flip ($\sigma_{z}$) respectively
- The key takeaways of the quantum repetition code are:
- That the errors became digitized: Either you are in a state of no error, or you are in a discrete set of error states that you know how to recover from
- You can measure the error without measuring the data (re: sampling all of the qubits)
- The errors are local (re: uncorrelated), and the encoded information is nonlocal, so as long as you keep your measurements to single qubits, then you can’t get information out of the system
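The 3-qubit bit-flip scheme described above can be sketched with a state stored as a dictionary from bit strings to amplitudes. This is a toy classical simulation under that assumed representation; the helper names (`encode`, `flip`, `syndrome`, `correct`) are illustrative. The point it shows is that the syndrome $(y \oplus z, x \oplus z)$ is the same for every component of the corrupted state, so reading it locates the error without revealing a or b.

```python
def encode(a, b):
    """Encode a|0> + b|1> as a|000> + b|111>."""
    return {(0, 0, 0): a, (1, 1, 1): b}

def flip(state, k):
    """Apply a bit flip to qubit k (1-indexed, counting left to right)."""
    out = {}
    for bits, amp in state.items():
        flipped = list(bits)
        flipped[k - 1] ^= 1
        out[tuple(flipped)] = amp
    return out

def syndrome(state):
    """Return (y XOR z, x XOR z); every component must agree on the value."""
    values = {(y ^ z, x ^ z) for (x, y, z) in state}
    if len(values) != 1:
        raise ValueError("state is not an eigenstate of the syndrome observable")
    return values.pop()

def correct(state):
    """Read the syndrome as a binary index and undo the flip it points to."""
    s = syndrome(state)
    k = 2 * s[0] + s[1]
    return state if k == 0 else flip(state, k)

a, b = 0.6, 0.8j
recovered = [correct(flip(encode(a, b), k)) for k in (1, 2, 3)]
```

Two simultaneous flips would produce a valid but wrong syndrome, which is the failure mode of order $|\epsilon|^{4}$ mentioned in the list.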
States and Ensembles
Quantum Mechanics Axioms
- A state is a ray in Hilbert space
- A Hilbert space is a vector space over the complex numbers
- You can define an inner product which maps an ordered pair of vectors to C that is strictly positive, linear, and skew symmetric (ie. swapping order of vectors is complex conjugation)
- An additional constraint is that there is some notion of a norm: $||\phi|| = <\phi| \phi>^{\frac{1}{2}}$
- A ray is an equivalence class of vectors which differ by a multiplication by a nonzero complex scalar. We choose the representative of this class to have a unit norm
- An observable is a property of a physical system which can be measured
- Must be a linear, self-adjoint operator (ie. $A = A^{\dag}$, where the adjoint is defined by $<\psi| A \phi> = < A^{\dag} \psi | \phi >$)
- As a result of the self-adjoint property, we can write $A = \Sigma_{n} a_{n} P_{n}$ where the $a_{n}$ are real eigenvalues and the projection operators $P_{n}$ satisfy $P_{n}P_{m} = \delta_{n,m} P_{n}$ and $P_{n}^{\dag} = P_{n}$
- We can make a measurement of a state by using the projection operator
- The dynamics of the system is unitary (re: probability preserving) and is governed by $\frac{d}{dt}|\phi> = -i H |\phi>$
- We can describe the time evolution of a state as $|\phi(t)> = U(t) | \phi(0)>$
- If H is time independent, then $U = exp(-i t H)$
Qubits
- A qubit is a state in a 2D Hilbert space that can take the form $a|0> + b|1>$
Symmetries
- Any symmetry of a quantum system must leave the probabilities untouched
- This implies that a symmetry is an automorphism of the Hilbert space which preserves the absolute values of inner products for all members of the space
- Each symmetry maps onto either a unitary or anti-unitary operator
- Anti-unitarity doesn’t matter for continuous symmetries
- Compositions of symmetries should also be a symmetry (up to an overall phase factor). This follows from group theory
- Symmetries should commute with the dynamics of a system
- Let R be the symmetry in question. The following must hold: $U(R) exp(-itH) = exp(-itH) U(R)$
- For a continuous symmetry, we can let R get arbitrarily close to the identity: $R = I + \epsilon T$. This implies that, to first order, $U = 1-i\epsilon Q$ where Q is Hermitian (so that U is unitary to first order). This in turn implies that Q commutes with H
- We call Q the generator
- Since any finite transformation can be written as a product of infinitesimal ones: $R = (1+\frac{\theta}{N} T)^{N} \rightarrow U = exp(-i\theta Q)$, knowing how the infinitesimal symmetry transformations are represented lets you do finite transformations
Rotations
- A finite rotation is given by $R(\hat{n}, \theta) = exp(-i\theta\hat{n}\cdot J)$
- $\hat{n}$ is the axis of rotation, $\theta$ is the rotation angle, and $\vec{J}$ is the angular momentum
- The associated commutation relationship for angular momentum is $[J_{k}, J_{l}] = i \epsilon_{klm} J_{m}$
- The simplest non-trivial irreducible representation of angular momentum is 2D, as given by the Pauli matrices: $J_{k} = \frac{1}{2}\sigma_{k}$
- The Pauli matrices satisfy the anti-commutation relationship: $\sigma_{k}\sigma_{l}+\sigma_{l}\sigma_{k} = 2\delta_{lk} I$
- We can use the Pauli matrices to write any finite rotation as $U(\hat{n},\theta) = exp(-i\frac{\theta}{2} \hat{n}\cdot \vec{\sigma})$
- There is a $2\pi$ ambiguity, which gives rise to spinor representations
- For rotation, you have the action $U(\hat{n} ,\theta = 2\pi) = -1$
- The components of angular momentum transform under rotations like a vector
- $U(R)J_{k} U(R)^{\dag} = R_{kl} J_{l}$
- This implies that if state $|m>$ is an eigenstate of $J_{3}$, then $U(R)|m>$ is an eigenstate of $RJ_{3}$ with the same eigenvalue
- The above implies that we can construct eigenstates of angular momentum along the axis $\hat{n} = < \sin \theta \cos \phi, \sin \theta \sin \phi, \cos \theta >$ by applying a rotation through $\theta$ to the z axis
- This means that all direction measurements can be performed by first rotating the $\hat{n}$ axis to the z axis, and then measuring along z
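The spin-1/2 rotation operator above can be checked numerically. The sketch below uses the closed form $exp(-i\frac{\theta}{2}\hat{n}\cdot\vec{\sigma}) = \cos\frac{\theta}{2} I - i \sin\frac{\theta}{2}\, \hat{n}\cdot\vec{\sigma}$, which follows from $(\hat{n}\cdot\vec{\sigma})^{2} = I$ for a unit vector; that closed form and the helper names are additions of this example rather than something stated in the notes.

```python
import math

def rotation(n, theta):
    """U(n, theta) = cos(theta/2) I - i sin(theta/2) (n . sigma) for a unit vector n."""
    nx, ny, nz = n
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]

def apply(u, v):
    """Multiply a 2x2 matrix by a 2-component state vector."""
    return [u[0][0] * v[0] + u[0][1] * v[1], u[1][0] * v[0] + u[1][1] * v[1]]

def close(x, y, tol=1e-12):
    """Compare two (possibly complex) numbers up to floating-point error."""
    return abs(x - y) < tol

full_turn = rotation((0, 0, 1), 2 * math.pi)    # close to -I: the spinor sign flip
double_turn = rotation((0, 0, 1), 4 * math.pi)  # close to +I

# Rotating spin-up along z by theta about the y axis gives (cos(theta/2), sin(theta/2)),
# the spin-up state along the axis (sin theta, 0, cos theta).
theta = math.pi / 3
tilted = apply(rotation((0, 1, 0), theta), [1, 0])
```

Only a $4\pi$ rotation returns the state vector to itself, while all measurable probabilities already repeat after $2\pi$.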
Density Operator
- In real quantum systems, we don’t have a closed system: there is always some interaction with the environment
- This can cause some axioms of quantum mechanics to appear to be violated
- States are not rays
- Measurements are not orthogonal projections
- Evolution is not unitary
- Say that you have a two qubit system A and B, but you can only observe A. How can you characterize the observations made on A alone?
- We can define the density matrix of the system as $\rho = \Sigma_{i} p_{i} |i><i|$
- Think of this as a representation of the ensemble of possible quantum states, each with their own probability
- We can then define expectation values of observable Q as $tr(Q \rho)$
- This idea extends to any bipartite system: You can calculate expectation values for one subsystem by partial tracing over the other subsystem
- In general, we have that $\rho_{A} = tr_{B}(|\phi><\phi|)$
- A general density matrix (in diagonal form) can also be written as $\rho_{A} = \Sigma_{a} p_{a} |a>< a|$
- The density matrix has the following properties:
- self-adjoint ($\rho_{A} = \rho_{A}^{\dag}$)
- $\rho_{A}$ is positive semidefinite (its eigenvalues are non-negative)
- $tr(\rho_{A}) = 1$
- For so called “pure states”, we have that $\rho^{2} = \rho$
- You can think of a pure state as when the subsystem is also a ray (re: no mixing of components)
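As a concrete instance of the partial trace above, the sketch below computes $\rho_{A}$ for a two-qubit pure state stored as a coefficient matrix `psi[a][b]` (an assumed representation for this example), and uses $tr(\rho^{2})$ to tell a pure reduced state from a mixed one.

```python
import math

def reduced_density_matrix_a(psi):
    """rho_A[a][a2] = sum_b psi[a][b] * conj(psi[a2][b]), the partial trace over B."""
    dim_a, dim_b = len(psi), len(psi[0])
    return [[sum(psi[a][b] * psi[a2][b].conjugate() for b in range(dim_b))
             for a2 in range(dim_a)] for a in range(dim_a)]

def purity(rho):
    """tr(rho^2): equals 1 for a pure state and is smaller for a mixed state."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n)).real

s = 1 / math.sqrt(2)
bell = [[s, 0], [0, s]]       # (|00> + |11>) / sqrt(2), amplitudes psi[a][b]
product = [[1, 0], [0, 0]]    # |0>_A |0>_B

rho_bell = reduced_density_matrix_a(bell)        # close to I/2 (maximally mixed)
rho_product = reduced_density_matrix_a(product)  # |0><0| (still pure)
```

The entangled state leaves A maximally mixed, while the product state leaves A in a pure state with $\rho^{2} = \rho$.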
Bloch Sphere
- We can write any 2x2 self-adjoint matrix in the basis of the Pauli matrices and the identity
- $\rho(\vec{P}) = \frac{1}{2}(I+\vec{P}\cdot \vec{\sigma})$ where $\vec{P}$ is some 3D vector
- The $\frac{1}{2}$ arises because
- $tr(\rho)=1$ and the Pauli matrices are traceless, hence we need to scale down the identity so that $\rho$ is a density operator
- The eigenvalues for $\rho$ must be non-negative (since they are interpreted as probabilities)
- Looking at the determinant of $\rho$ we get $det(\rho) = \frac{1}{4}(1-\vec{P}^{2})$
- The above constrains $\vec{P}^{2} \leq 1$, which corresponds to the unit ball
- So Bloch “Sphere” is a bit of a misnomer
- The actual sphere corresponds to density matrices with a vanishing determinant. Since the trace is 1, the eigenvalues are either 0 or 1. Hence, the boundary of the ball are pure states
- Accordingly, we can write a pure state as: $\rho(\hat{n}) = \frac{1}{2}(I+\hat{n}\cdot \vec{\sigma})$
- We can write this in spherical coordinates as well: $\rho(\theta,\phi) = \frac{1}{2} I + \frac{1}{2} \begin{pmatrix} \cos \theta & \sin \theta exp(-i\phi) \\ \sin \theta exp(i\phi) & -\cos \theta \end{pmatrix}$
- This is easily derived from the vector $|\Phi(\theta,\phi)> = \begin{pmatrix} exp(-i\phi/2) \cos(\theta/2) \\ exp(i\phi/2) \sin(\theta/2)\end{pmatrix}$
- We remember that $\hat{n} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$
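- You can construct the density matrix of a system by measuring $\hat{n}\cdot\vec{\sigma}$ along 3 linearly independent axes $\hat{n}$ (the expectation values $tr(\rho\, \hat{n}\cdot\vec{\sigma}) = \hat{n}\cdot\vec{P}$ determine $\vec{P}$)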
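The Bloch-ball parametrization above can be exercised directly. This is a minimal sketch with illustrative function names: it builds $\rho(\vec{P})$, recovers $\vec{P}$ from the matrix entries (equivalent to reading off $tr(\rho \sigma_{k})$), and evaluates the determinant and eigenvalues, which are $\frac{1}{2}(1 \pm |\vec{P}|)$.

```python
import math

def rho_from_bloch(p):
    """rho(P) = (I + P . sigma) / 2 written out as a 2x2 matrix."""
    px, py, pz = p
    return [[0.5 * (1 + pz), 0.5 * (px - 1j * py)],
            [0.5 * (px + 1j * py), 0.5 * (1 - pz)]]

def bloch_from_rho(rho):
    """Recover P_k = tr(rho sigma_k) from the matrix entries."""
    return (2 * rho[1][0].real, 2 * rho[1][0].imag, (rho[0][0] - rho[1][1]).real)

def determinant(rho):
    """det(rho), which should equal (1 - |P|^2) / 4."""
    return (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real

def eigenvalues(p):
    """Eigenvalues of rho(P): (1 - |P|) / 2 and (1 + |P|) / 2."""
    r = math.sqrt(sum(x * x for x in p))
    return 0.5 * (1 - r), 0.5 * (1 + r)

mixed_p = (0.3, 0.4, 0.0)   # |P| = 0.5, strictly inside the ball
pure_p = (0.0, 0.0, 1.0)    # on the boundary sphere

mixed_rho = rho_from_bloch(mixed_p)
pure_rho = rho_from_bloch(pure_p)
```

Points with $|\vec{P}| = 1$ give eigenvalues 0 and 1 and a vanishing determinant, which is the pure-state boundary described in the list.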
Schmidt Decomposition
- The standard expansion of a bipartite pure state in orthonormal bases $|\alpha_{A}>$ and $|\mu_{B}>$ is $|\phi_{AB}> = \Sigma_{\alpha, \mu} \phi_{\alpha\mu} |\alpha_{A}> \otimes |\mu_{B}>$
- An alternative representation is the so called Schmidt decomposition: $|\phi_{AB}> = \Sigma_{i} \sqrt{p_{i}} |i_{A}> \otimes | i'_{B} >$
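- This is just the SVD of the coefficient matrix $\phi_{\alpha\mu}$
- $|i'_{B}> = p_{i}^{-\frac{1}{2}} \Sigma_{\mu} \phi_{i\mu} |\mu_{B}>$, where $|i_{A}>$ is the basis in which $\rho_{A}$ is diagonal
- The transformation from unprimed to primed can be encoded in a unitary transformation $U_{B}$
- The $\sqrt{p_{i}}$ are the singular values, and the $p_{i}$ are the eigenvalues of $\rho_{A}$
- This decomposition holds for any bipartite pure state (since any matrix has an SVD)
- Calculating the density matrices for A and B using the Schmidt decomposition, we find that $\rho_{A}$ and $\rho_{B}$ share the same non-zero eigenvalues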
- If A and B are different dimensions, the difference is made up with zeros
Entanglement
- Define the Schmidt number as the number of non-zero singular values in a bipartite pure state
- If the Schmidt number is greater than one, then the state is entangled. Otherwise, it’s separable
- So if the state can be written as a tensor product of states of the subsystems (re: $|\phi_{AB}> = |\phi_{A}>\otimes |\phi_{B}>$) then it’s separable
- A separable state is not necessarily uncorrelated: If you have $|\uparrow_{A}>|\uparrow_{B}>$, then the system is separable but correlated
- An entangled state is fundamentally different in that there are non-local quantum correlations (re: there must be interactions between the subsystems)
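For two qubits the Schmidt number can be read off without a full SVD. The sketch below assumes a normalized state stored as a 2x2 coefficient matrix `psi[a][b]`; since $tr(\rho_{A}) = 1$ and $det(\rho_{A}) = |det(\psi)|^{2}$, the eigenvalues of $\rho_{A}$ are $\frac{1}{2}(1 \pm \sqrt{1 - 4|det(\psi)|^{2}})$. The function names and the tolerance are choices made for this example.

```python
import math

def schmidt_probabilities(psi):
    """Eigenvalues p_i of rho_A for a normalized two-qubit state psi[a][b]."""
    det = psi[0][0] * psi[1][1] - psi[0][1] * psi[1][0]
    # Clip at zero to absorb floating-point rounding.
    root = math.sqrt(max(0.0, 1.0 - 4.0 * abs(det) ** 2))
    return 0.5 * (1 + root), 0.5 * (1 - root)

def schmidt_number(psi, tol=1e-12):
    """Count the non-zero Schmidt coefficients (tol is an assumed cutoff)."""
    return sum(1 for p in schmidt_probabilities(psi) if p > tol)

s = 1 / math.sqrt(2)
bell = [[s, 0], [0, s]]                    # (|00> + |11>) / sqrt(2)
up_up = [[1, 0], [0, 0]]                   # |0>|0>: correlated but separable
plus_plus = [[0.5, 0.5], [0.5, 0.5]]       # (|0>+|1>)(|0>+|1>) / 2: separable

numbers = [schmidt_number(psi) for psi in (bell, up_up, plus_plus)]
```

A state is separable exactly when the determinant of its coefficient matrix vanishes, which is the two-qubit version of "Schmidt number equal to one".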
Ensemble Interpretation Ambiguity
- The density operator is self-adjoint, nonnegative and has $tr(\rho) = 1$
- From the above, you can construct a density matrix as a convex linear combination between two other density matrices which still satisfies all the properties:
- $\rho(\lambda) = \lambda \rho_{1} + (1-\lambda) \rho_{2}$ where $0 \leq \lambda \leq 1$
- The above implies that the density operators are a convex subset of the real vector space of dxd Hermitian operators
- Pure states can’t be written as a convex sum of other states
- Define the pure state $\rho = |\phi><\phi|$ and define $|\phi_{\perp}>$ as some vector orthogonal to $|\phi>$
- Suppose that $\rho$ can be written as a convex linear combination
- $<\phi_{\perp}|\rho|\phi_{\perp}> = 0 = \lambda <\phi_{\perp}|\rho_{1}|\phi_{\perp}> + (1-\lambda) <\phi_{\perp}|\rho_{2}|\phi_{\perp}>$
- Both terms are non-negative, so for the above to hold, both must vanish. If $\lambda$ is 0 or 1, the combination is trivial ($\rho = \rho_{2}$ or $\rho = \rho_{1}$)
- Otherwise, $<\phi_{\perp}|\rho_{1}|\phi_{\perp}> = <\phi_{\perp}|\rho_{2}|\phi_{\perp}> = 0$. Since $|\phi_{\perp}>$ can be any vector orthogonal to $|\phi>$, it follows that $\rho_{1} = \rho_{2} = \rho$
- These pure states are extremal points of the set
- You can see this structure on the Bloch sphere (extremal points are on the boundary)
- d=2 is a special case where all of the boundary points are extremal (pure states). This does not hold for d>2, where the boundary also contains mixed states
- The convexity of the set of density matrices has a physical interpretation:
- Calculating the expectation value of some observable M, we have $<M> = tr(\rho(\lambda) M)$
- Suppose that we have two states $\rho_{1}$ and $\rho_{2}$, where the former has a probability of $\lambda$ and the latter has probability of $1-\lambda$
- Taking the expectation value of M on this state also gives $tr(M \rho(\lambda))$
- So this preparation of $\rho$ gives the same expectation values, regardless of the $\rho_{1}$ and $\rho_{2}$ used in preparation
- Since pure states can’t be a convex linear combination of other density matrices, there is an unambiguous way of preparing a pure state
- This ambiguity in mixed states stands in stark contrast to classical systems, where there is a unique preparation method to generate a probability distribution
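The ensemble ambiguity above has a standard two-line example: an equal mixture of spin up/down along z and an equal mixture of spin up/down along x give the same density matrix. The sketch below checks this numerically; the helper names are illustrative.

```python
import math

def projector(v):
    """|v><v| for a 2-component state vector v."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def mix(weighted_states):
    """rho = sum_i p_i |v_i><v_i| for a list of (p_i, v_i) pairs."""
    rho = [[0, 0], [0, 0]]
    for weight, v in weighted_states:
        p = projector(v)
        for i in range(2):
            for j in range(2):
                rho[i][j] += weight * p[i][j]
    return rho

s = 1 / math.sqrt(2)
up_z, down_z = (1, 0), (0, 1)
up_x, down_x = (s, s), (s, -s)

rho_z = mix([(0.5, up_z), (0.5, down_z)])   # 50/50 mixture along z
rho_x = mix([(0.5, up_x), (0.5, down_x)])   # 50/50 mixture along x

max_difference = max(abs(rho_z[i][j] - rho_x[i][j])
                     for i in range(2) for j in range(2))
```

Both preparations give $\frac{1}{2}I$, so no measurement on the qubit alone can tell them apart.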
Faster Than Light Communication?! (No…)
- Suppose that qubit A has the density matrix $\rho_{A} = \frac{1}{2}(|\uparrow_{z}><\uparrow_{z}|+|\downarrow_{z}><\downarrow_{z}|)$
- This density matrix could arise from an entangled bipartite pure state with the Schmidt decomposition $|\phi_{AB}> = \frac{1}{\sqrt{2}}(|\uparrow_{zA}>|\uparrow_{zB}>+|\downarrow_{zA}>|\downarrow_{zB}>)$
- We can realize the ensemble interpretation of A via measuring qubit B
- Since $\rho_{A}$ has degenerate eigenvalues, this Schmidt basis is not unique, so any direction works (just apply a unitary transformation V to A and $V^{*}$ to B to rotate your basis)
- Possible faster than light information propagation:
- Prepare many copies of this entangled state. Alice takes all the A qubits to one location, while Bob takes all the other B qubits to another
- Bob measures all of his qubits along either the z axis or the x axis. Alice, concurrently, measures all of her qubits to try to determine which axis Bob used.
- For now, ignore the fact that the measurements need to be simultaneous (re: Bob calls Alice and tells her to measure something)
- The main problem with this is that density matrix $\rho_{A}$ is the same between the two setups, so it’s fundamentally impossible to distinguish between the two states
- Follows from the convexity of the density matrices set
Quantum Erasers
- The density matrix $\rho_{A} = \frac{1}{2} I$ is an incoherent mixture between $|\uparrow>$ and $|\downarrow>$
- A coherent superposition would be $|\phi> = \frac{1}{\sqrt{2}}( |\uparrow > \pm |\downarrow>)$
- The distinction is that the relative phase of a coherent superposition is observable
- Entanglement causes decoherence
- Imagine an entangled state like that in previous section
- Bob makes a measurement along the x axis and sends his measurement result to Alice
- This forces Alice’s spin to be a pure state along the x axis, which in turn can be interpreted as a coherent superposition of z axis spins
- So even though Alice initially had an incoherent state, Bob’s measurement caused Alice’s state to become coherent
- Another thought experiment: Bob uses a Stern-Gerlach experiment to measure his z axis spin. This precludes Alice from having a coherent superposition along the z axis
- Bob refocuses the two beams of the Stern-Gerlach which then passes through a Stern-Gerlach along the x axis. Alice’s coherence along the z axis is now restored! (re: Alice is in a known x axis configuration, but the z axis position is lost)
- This situation is called a quantum eraser. Coherence of a state can be restored if an orthogonal measurement occurs
- Information is physical: measuring a system fundamentally changes the physical description of the system
HJW Theorem
- How do you extend the quantum eraser to multiple qubits with a more general density matrix?
- Consider the general density matrix: $\rho_{A} = \Sigma_{i} p_{i} |\phi_{i}><\phi_{i}|$ where $\Sigma p_{i} = 1$
- Don’t assume that $|\phi_{i}>$ are orthogonal, but do assume they are normalized
- Imagine some bipartite system such that performing a partial trace over the subsystem B yields $\rho_{A}$
- $|\phi_{1AB}> = \Sigma_{i} \sqrt{p_{i}} |\phi_{iA}>\otimes |\alpha_{iB}>$
- $<\alpha_{i}|\alpha_{j}> = \delta_{ij}$
- $tr_{B}(|\phi_{1AB}><\phi_{1AB}|) = \rho_{A}$
- The construction of this bipartite state is called purification. It can be thought of as representing a mixed state as a pure state in a higher dimensional Hilbert space
- Performing a measurement on B is projecting onto a $|\alpha_{iB}>$ basis, which in turn forces system A to be in the pure state $|\phi_{i}><\phi_{i}|$
- Is there a different ensemble interpretation of $\rho_{A}$ that we can construct by making a different measurement of B?
- Let $\rho_{A} = \Sigma_{\mu} q_{\mu} |\psi_{\mu}>< \psi_{\mu}|$, so a different ensemble of pure states
- There is a similar purification of A:
- $|\phi_{2AB}> = \Sigma_{\mu} \sqrt{q_{\mu}} |\psi_{\mu A}>\otimes |\beta_{\mu B}>$
- How are $|\phi_{1}>$ and $|\phi_{2}>$ related?
- Partial tracing over B yields the same density matrix in both cases. They both have Schmidt decompositions:
- $|\phi_{1}> = \Sigma_{k} \sqrt{\lambda_{k}} |k_{A}> \otimes | k_{1B}>$
- $|\phi_{2}> = \Sigma_{k} \sqrt{\lambda_{k}} |k_{A}> \otimes | k_{2B}>$
- Alternatively: $|\phi_{1AB}> = (I_{A} \otimes U_{B}) | \phi_{2AB}>$
- Since $|k_{1B}>$ and $|k_{2B}>$ are both orthonormal bases for B, there is a unitary transformation $U_{B}$ between the two
- Hence, $|\phi_{1}>$ and $|\phi_{2}>$ are the same purification up to a unitary acting on B alone. You just need to change which direction in B you measure along
- Suppose that we have many ensembles that realize $\rho_{A}$, where the max number of pure states appearing in any of the ensembles is d
- We can then choose a Hilbert space $H_{B}$ of dimension d and a pure state $|\phi_{AB}> \in H_{A} \otimes H_{B}$ such that any of the ensembles can be realized by measuring a suitable observable of B
- This is the HJW theorem
- The general density matrix mixes the pure states incoherently (re: can’t detect the relative phases of these states)
- So you can erase information by making a measurement in B, and restore the coherence by making a different measurement
Fidelity
- Suppose that you have two density operators $\rho$ and $\sigma$. The fidelity is defined as $(tr(\sqrt{\rho^{\frac{1}{2}} \sigma \rho^{\frac{1}{2}}}))^{2}$
- Think of this as a measure of how close two mixed states are (it is a similarity measure rather than a true distance metric)
- This is well defined since $\rho$ and $\sigma$ are positive semidefinite matrices; hence, you can take the square root via the spectral theorem
- The fidelity is bounded between 0 and 1. 1 occurs if and only if $\rho$ and $\sigma$ are identical
- If $\rho = |\phi>< \phi|$ (ie. a pure state), the fidelity becomes $F(\rho,\sigma) = <\phi|\sigma|\phi>$
- If both density matrices are pure states, $\rho = |\phi><\phi|$ and $\sigma = |\psi><\psi|$, the fidelity reduces to the Born rule overlap $|<\phi|\psi>|^{2}$
- An alternative definition of the fidelity is $F(\rho,\sigma) = ||\sigma^{\frac{1}{2}} \rho^{\frac{1}{2}}||_{1}^{2}$
- $||A||_{1} = tr \sqrt{A^{\dag}A}$
- For Hermitian matrices, take the sum of the absolute values of the eigenvalues
- The symmetry between the arguments of the fidelity is more manifest in this notation, since for any Hermitian matrices A and B, we have that $||AB||_{1} = ||BA||_{1}$
- This follows from the fact that BAAB and ABBA have the same non-zero eigenvalues (hence their square roots have the same trace)
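For single qubits the fidelity can be evaluated without matrix square roots. The sketch below assumes the states are given by their Bloch vectors $\vec{r}$ and $\vec{s}$ and uses the closed form $F = \frac{1}{2}(1 + \vec{r}\cdot\vec{s} + \sqrt{(1-\vec{r}^{2})(1-\vec{s}^{2})})$. This is a standard identity for 2x2 density matrices, equivalent to $tr(\rho\sigma) + 2\sqrt{det(\rho)det(\sigma)}$, but it is brought in for this example and is not derived in these notes.

```python
import math

def qubit_fidelity(r, s):
    """Fidelity of two single-qubit states given by Bloch vectors r and s."""
    dot = sum(a * b for a, b in zip(r, s))
    r2 = sum(a * a for a in r)
    s2 = sum(b * b for b in s)
    # Clip at zero to absorb rounding for vectors on the boundary sphere.
    return 0.5 * (1 + dot + math.sqrt(max(0.0, (1 - r2) * (1 - s2))))

up_z, down_z, up_x = (0, 0, 1), (0, 0, -1), (1, 0, 0)
maximally_mixed = (0, 0, 0)
some_mixed = (0.3, 0.0, 0.4)

f_same = qubit_fidelity(some_mixed, some_mixed)        # identical states
f_orthogonal = qubit_fidelity(up_z, down_z)            # orthogonal pure states
f_pure_vs_mixed = qubit_fidelity(up_z, maximally_mixed)
f_z_vs_x = qubit_fidelity(up_z, up_x)                  # |<up_z|up_x>|^2
```

When one state is pure ($|\vec{r}| = 1$) the square-root term drops out and the expression reduces to $<\phi|\sigma|\phi>$, matching the pure-state formula in the list.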
Uhlmann’s Theorem
- How does the fidelity of two density operators relate to the overlap of their purifications?
- Define $|\Phi_{AB}>$ as the purification of the density operator $\rho_{A}$ where $\rho_{A} = tr_{B}(|\Phi><\Phi|)$
- Suppose that $\rho = \Sigma_{i} p_{i} |i><i|$ where $|i_{A}>$ is an orthonormal basis for system A
- The associated purification is then $|\Phi_{\rho}> = \Sigma_{i} \sqrt{p_{i}} |i_{A}> \otimes |i_{B}>$
- By the HJW theorem, the general purification is $|\Phi_{\rho}(V)> = (I \otimes V) |\Phi_{\rho}>$ where V is unitary
- This can also be written as $(\rho^{\frac{1}{2}} \otimes V) |\tilde{\Phi}>$ where $|\tilde{\Phi}> = \Sigma_{i} |i_{A}> \otimes | i_{B}>$, which is an unnormalized maximally entangled state
- Suppose that we have two density operators $\rho$ and $\sigma$ acting on A. What is the inner product of their purifications?
- $<\Phi_{\sigma}(W)|\Phi_{\rho}(V)> = <\tilde{\Phi}| \sigma^{\frac{1}{2}} \rho^{\frac{1}{2}} \otimes W^{\dag} V |\tilde{\Phi}>$
- Using the fact that $U\otimes I |\tilde{\Phi}> = I \otimes U^{T} |\tilde{\Phi}>$, we can write the inner product of the purifications as $<\tilde{\Phi}|\sigma^{\frac{1}{2}}\rho^{\frac{1}{2}}U \otimes I | \tilde{\Phi}> = tr( \sigma^{\frac{1}{2}}\rho^{\frac{1}{2}}U )$ where $U = (W^{\dag}V)^{T}$
- We use the polar decomposition: $A = U' \sqrt{A^{\dag}A}$ with $A = \sigma^{\frac{1}{2}}\rho^{\frac{1}{2}}$, so that $A^{\dag}A = \rho^{\frac{1}{2}}\sigma\rho^{\frac{1}{2}}$. This is the “obvious” fact that any square complex matrix can be factorized as $A=UP$, where U is a unitary matrix and P is a positive semi-definite Hermitian matrix
- so $ tr( \sigma^{\frac{1}{2}}\rho^{\frac{1}{2}}U ) = tr(UU'\sqrt{\rho^{\frac{1}{2}}\sigma \rho^{\frac{1}{2}}})$
- This is very close to the square root of the fidelity. If we select $U = (U')^{-1}$ (by choosing V and W appropriately), then they are the same
- This selection can be thought of as maximizing the inner product of the purifications (this can be seen by diagonalizing $ \sqrt{\rho^{\frac{1}{2}}\sigma \rho^{\frac{1}{2}}}$ into its non-negative eigenvalues $\lambda_{a}$ and eigenvectors $|a>$, and realizing that $|<a|UU'|a>| \leq 1$, with equality for all a when $UU' = I$)
- The final result of Uhlmann’s theorem is that $F(\rho,\sigma) = (tr( \sqrt{\rho^{\frac{1}{2}}\sigma \rho^{\frac{1}{2}}}))^{2} = max_{V,W} | <\Phi_{\sigma}(W)|\Phi_{\rho}(V)>|^{2}$