Density Matrices and Mixed States: The Formalism for Real Quantum Systems
Pure-state quantum mechanics ($|\psi\rangle$ vectors) is enough for textbook quantum computing but not for real hardware. Real qubits are noisy, partially-known, or part of larger entangled systems whose other parts you've ignored. The density matrix is the formalism that handles all three cases. This tutorial defines density matrices, derives their properties, covers the partial trace and the purification theorem, and shows why density matrices are the natural language of quantum-information theory.
Prerequisites: Tutorial 1: What Is a Qubit, Tutorial 18: Noise and Decoherence
Tutorials 1-3 introduced quantum states as vectors in a Hilbert space. This is the pure-state description: a state of complete knowledge, with definite amplitudes for every basis vector. It is enough for textbook quantum computing and clean theoretical analysis.
Real quantum systems are rarely in pure states. A qubit in a noisy quantum processor has small interactions with its environment that randomize parts of its state. A qubit that has been measured, but whose outcome you have not looked at, is a probabilistic mixture of several pure states. A qubit that is entangled with another system you've discarded (say, an environmental degree of freedom you can't track) has its quantum information partly hidden in correlations that are no longer accessible.
All three cases — noise, classical uncertainty, and ignored entanglements — are described by the same mathematical object: the density matrix. It is the natural language of quantum-information theory, of error correction, of channel descriptions, and of essentially everything that distinguishes “real quantum hardware” from “ideal closed quantum systems.”
This tutorial defines the density matrix, derives its properties, covers the partial trace (the operation that handles “ignored subsystems”), introduces the purification theorem (every mixed state is the marginal of some pure state in a larger Hilbert space), and motivates why this formalism is essential for working quantum-information theory.
The three sources of mixedness
Mixed states arise from three distinct physical situations:
1. Classical uncertainty over pure states
Suppose a qubit was prepared in $|0\rangle$ with probability $p$ or $|1\rangle$ with probability $1-p$, but you don't know which. You can describe this classical mixture as a probability distribution over pure states. The corresponding density matrix is
$$\rho = p\,|0\rangle\langle 0| + (1-p)\,|1\rangle\langle 1|.$$
This is classical uncertainty translated into the quantum formalism. Importantly, it is not the same as the pure superposition $\sqrt{p}\,|0\rangle + \sqrt{1-p}\,|1\rangle$, which has off-diagonal coherence and gives different measurement statistics in superposition bases.
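A minimal NumPy sketch of this distinction, assuming the mixture is over the computational-basis states $|0\rangle$ and $|1\rangle$ and using an illustrative value `p = 0.3` (both are assumptions chosen for the example). It builds a mixture and a superposition with the same populations, then compares the probability of the $|+\rangle$ outcome in a Hadamard-basis measurement.

```python
import numpy as np

# Computational-basis kets as column vectors.
ket0 = np.array([[1.0], [0.0]], dtype=complex)
ket1 = np.array([[0.0], [1.0]], dtype=complex)

p = 0.3  # illustrative probability of |0>; any value in [0, 1] works

# Classical mixture: |0> with probability p, |1> with probability 1 - p.
rho_mix = p * (ket0 @ ket0.conj().T) + (1 - p) * (ket1 @ ket1.conj().T)

# Pure superposition with the same computational-basis probabilities.
psi = np.sqrt(p) * ket0 + np.sqrt(1 - p) * ket1
rho_sup = psi @ psi.conj().T

# Same diagonal (populations), different off-diagonal (coherence).
print("populations:", np.real(np.diag(rho_mix)), np.real(np.diag(rho_sup)))
print("coherence  :", abs(rho_mix[0, 1]), abs(rho_sup[0, 1]))

# Probability of the |+> outcome in a Hadamard-basis measurement.
plus = (ket0 + ket1) / np.sqrt(2)
prob_plus_mix = np.real(plus.conj().T @ rho_mix @ plus).item()
prob_plus_sup = np.real(plus.conj().T @ rho_sup @ plus).item()
print("P(+):", prob_plus_mix, prob_plus_sup)
```

The mixture gives $P(+) = 1/2$ for every `p`, while the superposition gives $1/2 + \sqrt{p(1-p)}$, so the two states are distinguishable even though their computational-basis statistics agree.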
2. Noise from environmental coupling
A qubit interacting weakly with an uncontrolled environment (other qubits, photons, phonons, two-level-system defects) ends up entangled with the environment. If you can’t measure the environment, you describe the qubit alone by a density matrix obtained by tracing out the environment — the partial trace operation covered below. This is how decoherence (tutorial 18) shows up in the density-matrix language.
3. Subsystems of larger pure states
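Even with no noise at all, a qubit that is part of a larger entangled system has a density-matrix description if you only look at it alone. The classic example: half of a Bell state. The full state, for example $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$, is pure, but the first qubit alone is described by
$$\rho_A = \tfrac{1}{2}\big(|0\rangle\langle 0| + |1\rangle\langle 1|\big) = \tfrac{I}{2}.$$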
This is the maximally mixed state — completely random, no information. The information is not lost; it lives in the entanglement with the second qubit. Looking only at the first qubit, you see a maximally mixed density matrix.
The unifying principle: whenever you have less than complete knowledge of a quantum system, the density matrix is what describes what you do know. Pure states are the special case where you have complete knowledge.
Definition and properties
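A density matrix $\rho$ on a Hilbert space $\mathcal{H}$ is a Hermitian operator satisfying:
- Positive semidefinite: $\langle\psi|\rho|\psi\rangle \geq 0$ for all $|\psi\rangle \in \mathcal{H}$.
- Unit trace: $\mathrm{Tr}(\rho) = 1$.
- (Hermiticity is listed for emphasis; on a complex Hilbert space it already follows from positive semidefiniteness.)
Equivalently, $\rho$ is a convex combination of pure-state projectors:
$$\rho = \sum_i p_i\,|\psi_i\rangle\langle\psi_i|, \qquad p_i \geq 0, \quad \sum_i p_i = 1.$$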
This decomposition is not unique — different sets of pure states with different probabilities can give the same density matrix. The density matrix encodes only the operational consequences of the mixture, not the specific decomposition.
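The defining conditions and the non-uniqueness of the decomposition can both be checked numerically. The sketch below is illustrative only: the helper names `is_density_matrix` and `projector` and the tolerance value are assumptions made for this example.

```python
import numpy as np

def is_density_matrix(rho, tol=1e-10):
    """Check Hermiticity, unit trace, and positive semidefiniteness."""
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    unit_trace = abs(np.trace(rho) - 1.0) < tol
    # eigvalsh assumes a Hermitian input, so only trust it after that check.
    psd = hermitian and np.all(np.linalg.eigvalsh(rho) > -tol)
    return bool(hermitian and unit_trace and psd)

def projector(ket):
    """Return |ket><ket| for a 1-D state vector."""
    return np.outer(ket, ket.conj())

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# Two different ensembles: 50/50 over {|0>, |1>} and 50/50 over {|+>, |->}.
rho_z = 0.5 * projector(ket0) + 0.5 * projector(ket1)
rho_x = 0.5 * projector(plus) + 0.5 * projector(minus)

print("valid state:", is_density_matrix(rho_z))
print("same density matrix:", np.allclose(rho_z, rho_x))

# Hermitian with unit trace, but one eigenvalue is negative: not a state.
not_a_state = np.array([[1.2, 0.0], [0.0, -0.2]], dtype=complex)
print("valid state:", is_density_matrix(not_a_state))
```

Both ensembles produce $I/2$, so no measurement on the qubit can reveal which preparation procedure was used.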
Pure vs mixed states
A density matrix $\rho$ is pure if it can be written as $\rho = |\psi\rangle\langle\psi|$ for some pure state $|\psi\rangle$. Equivalently:
- $\rho^2 = \rho$ (idempotent).
- $\mathrm{Tr}(\rho^2) = 1$ (purity equals 1).
A density matrix is mixed otherwise. The purity, $\mathrm{Tr}(\rho^2)$, ranges from $1$ (pure) down to $1/d$ for the maximally mixed state $I/d$ on a $d$-dimensional system. Purity is a useful single-number summary of how mixed a state is.
For a single qubit: $\mathrm{Tr}(\rho^2) = 1$ means a Bloch-sphere-surface state; $\mathrm{Tr}(\rho^2) = 1/2$ means the center of the Bloch sphere (maximally mixed). Tutorial 48 covers the Bloch sphere geometry.
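As a quick numerical illustration, the sketch below computes the purity $\mathrm{Tr}(\rho^2)$ and the Bloch-vector length for three single-qubit states. The function names and the particular "partly mixed" state are assumptions chosen for the example; the single-qubit relation $\mathrm{Tr}(\rho^2) = (1 + |\vec r|^2)/2$ ties the two quantities together.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def purity(rho):
    """Return Tr(rho^2), which is real for any Hermitian rho."""
    return float(np.real(np.trace(rho @ rho)))

def bloch_radius(rho):
    """Length of the Bloch vector (Tr(rho X), Tr(rho Y), Tr(rho Z))."""
    r = [np.real(np.trace(rho @ pauli)) for pauli in (X, Y, Z)]
    return float(np.linalg.norm(r))

pure = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
maximally_mixed = np.eye(2, dtype=complex) / 2     # I/2
partly_mixed = 0.75 * pure + 0.25 * maximally_mixed

for name, rho in [("pure", pure),
                  ("partly mixed", partly_mixed),
                  ("maximally mixed", maximally_mixed)]:
    print(f"{name:16s} purity={purity(rho):.5f}  |r|={bloch_radius(rho):.3f}")
```

The pure state sits on the surface ($|\vec r| = 1$), the maximally mixed state at the center ($|\vec r| = 0$), and the partly mixed state in between.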
Computing expectation values
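For an observable $O$ measured on a state $\rho$:
$$\langle O \rangle = \mathrm{Tr}(\rho\, O).$$
This unifies pure-state and mixed-state expectations. For a pure state $|\psi\rangle$, $\rho = |\psi\rangle\langle\psi|$, and $\mathrm{Tr}(\rho\, O) = \langle\psi|O|\psi\rangle$, the familiar pure-state formula. For mixed states, the trace formula handles the convex mixture automatically.
For an observable that is diagonal in the basis in which $\rho$ is written (e.g., measuring $Z$ with $\rho$ expressed in the computational basis), the diagonal elements of $\rho$ are exactly the measurement-outcome probabilities. This is why the diagonal entries of a density matrix are sometimes called the “populations” of each basis state.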
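A short sketch of the trace formula in NumPy. The states and the helper name `expectation` are illustrative assumptions; the point is that $\mathrm{Tr}(\rho O)$ reproduces the bra-ket value for a pure state and weights the populations for a diagonal mixed state.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expectation(rho, observable):
    """Return <O> = Tr(rho O); the result is real for Hermitian O."""
    return float(np.real(np.trace(rho @ observable)))

# Pure state |+>: the trace formula agrees with <+|X|+>.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho_plus = np.outer(plus, plus.conj())
braket = float(np.real(np.vdot(plus, X @ plus)))  # vdot conjugates its first argument
print("Tr(rho X) =", expectation(rho_plus, X), " <+|X|+> =", braket)

# Mixed state with populations 0.8 and 0.2 in the computational basis.
rho_mixed = np.diag([0.8, 0.2]).astype(complex)
populations = np.real(np.diag(rho_mixed))
print("populations:", populations)
print("<Z> =", expectation(rho_mixed, Z), " (= P(0) - P(1))")
```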
Time evolution: unitary and non-unitary
Pure-state time evolution is unitary: $|\psi\rangle \to U|\psi\rangle$. Translated into density matrices:
$$\rho \to U \rho\, U^\dagger.$$
This handles closed-system evolution. For open systems (with environmental noise), evolution is described by completely positive trace-preserving (CPTP) maps, also called quantum channels:
$$\mathcal{E}(\rho) = \sum_k K_k\, \rho\, K_k^\dagger,$$
where $K_k$ are Kraus operators satisfying $\sum_k K_k^\dagger K_k = I$. This operator-sum representation captures every physically realizable quantum operation, including noise.
Tutorial 18 covered specific noise channels (depolarizing, dephasing, amplitude damping). All of them are CPTP maps, and the density-matrix formalism is what makes them representable.
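The operator-sum form is straightforward to apply numerically. The sketch below uses a phase-flip (dephasing) channel with Kraus operators $\sqrt{1-p}\,I$ and $\sqrt{p}\,Z$; this is one common parameterization and is an assumption here, since Tutorial 18 may use a different convention. The helper names are also hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def apply_channel(rho, kraus_ops):
    """Apply the operator-sum map rho -> sum_k K_k rho K_k^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def is_trace_preserving(kraus_ops, tol=1e-12):
    """Check the completeness relation sum_k K_k^dagger K_k = I."""
    total = sum(K.conj().T @ K for K in kraus_ops)
    return bool(np.allclose(total, np.eye(total.shape[0]), atol=tol))

def phase_flip_kraus(p):
    """Kraus operators for a phase flip: Z is applied with probability p."""
    return [np.sqrt(1 - p) * I2, np.sqrt(p) * Z]

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho_in = np.outer(plus, plus.conj())

kraus = phase_flip_kraus(0.1)
rho_out = apply_channel(rho_in, kraus)

print("trace preserving:", is_trace_preserving(kraus))
print("trace after channel:", np.real(np.trace(rho_out)))
print("coherence before/after:", abs(rho_in[0, 1]), abs(rho_out[0, 1]))

# Unitary evolution is the special case of a single Kraus operator.
rho_unitary = apply_channel(rho_in, [H])  # H|+> = |0>
print("H rho H^dagger:\n", np.real(rho_unitary).round(12))
```

The populations are untouched while the off-diagonal entry shrinks by a factor $1 - 2p$, which is the density-matrix signature of dephasing.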
The partial trace
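Given a bipartite state $\rho_{AB}$ on a composite system $\mathcal{H}_A \otimes \mathcal{H}_B$, the partial trace over $B$ produces a density matrix on $A$ alone:
$$\rho_A = \mathrm{Tr}_B(\rho_{AB}) = \sum_j (I_A \otimes \langle j|_B)\, \rho_{AB}\, (I_A \otimes |j\rangle_B),$$
where $\{|j\rangle_B\}$ is any orthonormal basis of $\mathcal{H}_B$.
The partial trace describes “what an observer of subsystem $A$ alone would see, ignoring $B$ entirely.” It is the operation that produces mixed states from pure entangled states.
Concrete example: the Bell state $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$ has density matrix
$$|\Phi^+\rangle\langle\Phi^+| = \tfrac{1}{2}\big(|00\rangle\langle 00| + |00\rangle\langle 11| + |11\rangle\langle 00| + |11\rangle\langle 11|\big).$$
Tracing out the second qubit removes the cross terms, because $\langle 0|1\rangle = 0$ on the second qubit:
$$\rho_A = \tfrac{1}{2}\big(|0\rangle\langle 0| + |1\rangle\langle 1|\big) = \tfrac{I}{2},$$
the maximally mixed state. The pure entangled state, viewed locally, looks completely random. This is the density-matrix expression of the fact that entanglement information is not localizable: you cannot read it out from one half of the entangled pair alone.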
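A minimal NumPy sketch of the partial trace for a two-party system, under the assumption that subsystem $A$ is the first tensor factor. The function name and the `keep` argument are illustrative. The product-state check is deliberately asymmetric, so that tracing out the wrong subsystem would be visible.

```python
import numpy as np

def partial_trace(rho_ab, dim_a, dim_b, keep):
    """Reduce a density matrix on A (x) B to subsystem 'A' or 'B'."""
    # After the reshape the index order is (a, b, a', b').
    t = rho_ab.reshape(dim_a, dim_b, dim_a, dim_b)
    if keep == "A":
        return np.einsum("ajbj->ab", t)  # sum over matching B indices
    if keep == "B":
        return np.einsum("iaib->ab", t)  # sum over matching A indices
    raise ValueError("keep must be 'A' or 'B'")

# Bell state (|00> + |11>)/sqrt(2): both marginals are I/2.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
print(np.real(partial_trace(rho_bell, 2, 2, keep="A")))

# Asymmetric product state |0><0| (x) I/2: the two marginals differ.
zero = np.array([[1, 0], [0, 0]], dtype=complex)
mixed = np.eye(2, dtype=complex) / 2
rho_prod = np.kron(zero, mixed)
print(np.real(partial_trace(rho_prod, 2, 2, keep="A")))  # recovers |0><0|
print(np.real(partial_trace(rho_prod, 2, 2, keep="B")))  # recovers I/2
```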
The purification theorem
Going the other direction: every mixed state $\rho_A$ on a system $A$ can be obtained as the partial trace of some pure state $|\Psi\rangle_{AR}$ on a larger system $A \otimes R$. The pure state $|\Psi\rangle_{AR}$ is called a purification of $\rho_A$.
The construction: diagonalize $\rho_A = \sum_i p_i\,|i\rangle\langle i|_A$. Then
$$|\Psi\rangle_{AR} = \sum_i \sqrt{p_i}\;|i\rangle_A \otimes |i\rangle_R$$
satisfies $\mathrm{Tr}_R\big(|\Psi\rangle\langle\Psi|_{AR}\big) = \rho_A$. The auxiliary system $R$ has dimension at least $\mathrm{rank}(\rho_A)$.
Purifications are not unique (any unitary on $R$ produces another purification), but they exist for every mixed state. This is one of the key technical results of quantum-information theory: any noise process or classical mixture can be modeled as part of a larger pure quantum system. There is no fundamental difference between “noise from the environment” and “entanglement with degrees of freedom you can’t see”: the density matrix sees them as the same.
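The construction can be sketched in a few lines of NumPy. For simplicity this example assumes the reference system $R$ has the same dimension as $A$ (larger than strictly necessary when $\rho_A$ is rank-deficient); the function name `purify` and the example state are hypothetical.

```python
import numpy as np

def purify(rho):
    """Return a purification of rho as a state vector on A (x) R, dim(R) = dim(A)."""
    evals, evecs = np.linalg.eigh(rho)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative rounding errors
    dim = rho.shape[0]
    psi = np.zeros(dim * dim, dtype=complex)
    for i in range(dim):
        ref = np.zeros(dim, dtype=complex)
        ref[i] = 1.0  # basis state |i> of the reference system R
        psi += np.sqrt(evals[i]) * np.kron(evecs[:, i], ref)
    return psi

rho_a = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
psi_ar = purify(rho_a)

# Writing psi as a matrix M[a, r], the marginal on A is M M^dagger.
M = psi_ar.reshape(2, 2)
rho_recovered = M @ M.conj().T

print("norm of purification:", np.linalg.norm(psi_ar))
print("marginal matches rho_A:", np.allclose(rho_recovered, rho_a))

# Applying a unitary on R gives a different purification of the same state.
X = np.array([[0, 1], [1, 0]], dtype=complex)
psi_other = np.kron(np.eye(2), X) @ psi_ar
M_other = psi_other.reshape(2, 2)
print("still a purification:", np.allclose(M_other @ M_other.conj().T, rho_a))
```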
Schmidt decomposition
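For a pure bipartite state $|\psi\rangle_{AB}$, there exist orthonormal bases $\{|a_i\rangle\}$ of $A$ and $\{|b_i\rangle\}$ of $B$ such that
$$|\psi\rangle_{AB} = \sum_i \lambda_i\, |a_i\rangle_A \otimes |b_i\rangle_B, \qquad \lambda_i \geq 0, \quad \sum_i \lambda_i^2 = 1.$$
The coefficients $\lambda_i$ are the Schmidt coefficients, and the number of non-zero $\lambda_i$ is the Schmidt rank.
The reduced density matrices are diagonal in these bases:
$$\rho_A = \sum_i \lambda_i^2\, |a_i\rangle\langle a_i|, \qquad \rho_B = \sum_i \lambda_i^2\, |b_i\rangle\langle b_i|.$$
The two marginals have the same nonzero eigenvalues. This is the structural reason why entanglement entropy is the same on both halves of any pure bipartite state, a Bell state included: the two halves share the same Schmidt spectrum.
The Schmidt decomposition is the operational tool for analyzing entanglement. Schmidt rank 1 means a separable (product) state; Schmidt rank > 1 means entangled. The entropy of the squared Schmidt coefficients $\lambda_i^2$ measures how entangled the state is.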
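Numerically, the Schmidt coefficients are the singular values of the state's coefficient matrix. The sketch below assumes the state vector is ordered with subsystem $A$ as the first tensor factor; the function names, the tolerance, and the choice of base-2 logarithm (entropy in bits) are assumptions of this example.

```python
import numpy as np

def schmidt_coefficients(psi, dim_a, dim_b):
    """Singular values of the coefficient matrix M[a, b] of a bipartite pure state."""
    return np.linalg.svd(psi.reshape(dim_a, dim_b), compute_uv=False)

def schmidt_rank(psi, dim_a, dim_b, tol=1e-12):
    """Number of Schmidt coefficients above a numerical tolerance."""
    return int(np.sum(schmidt_coefficients(psi, dim_a, dim_b) > tol))

def entanglement_entropy(psi, dim_a, dim_b, tol=1e-12):
    """Shannon entropy (in bits) of the squared Schmidt coefficients."""
    probs = schmidt_coefficients(psi, dim_a, dim_b) ** 2
    probs = probs[probs > tol]  # drop zeros so log2 is well defined
    return float(-np.sum(probs * np.log2(probs)))

product = np.array([1, 0, 0, 0], dtype=complex)                      # |00>
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)            # (|00> + |11>)/sqrt(2)
partial = np.array([np.sqrt(0.8), 0, 0, np.sqrt(0.2)], dtype=complex)

for name, psi in [("product", product), ("bell", bell), ("partial", partial)]:
    print(name,
          "rank:", schmidt_rank(psi, 2, 2),
          "entropy:", round(abs(entanglement_entropy(psi, 2, 2)), 4))
```

The product state has rank 1 and zero entropy, the Bell state has rank 2 and one full bit, and the partially entangled state falls in between.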
A small density-matrix example
Concrete code computing a noisy state’s density matrix:
import numpy as np
import pennylane as qml

# Mixed-state simulator: tracks a full density matrix instead of a state vector.
dev = qml.device("default.mixed", wires=2)

@qml.qnode(dev)
def noisy_bell_state(p_depol):
    """Prepare a Bell state, then apply depolarizing noise to each qubit."""
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    qml.DepolarizingChannel(p_depol, wires=0)
    qml.DepolarizingChannel(p_depol, wires=1)
    return qml.density_matrix(wires=[0, 1])

# No noise: pure Bell state.
rho_pure = noisy_bell_state(0.0)
print("Pure Bell state purity:", np.trace(rho_pure @ rho_pure).real)
# Expected: 1.0

# Moderate noise.
rho_noisy = noisy_bell_state(0.1)
print("Noisy state purity:", np.trace(rho_noisy @ rho_noisy).real)
# Expected: < 1.0

# Reduce to first-qubit marginal via partial trace.
def partial_trace(rho, dims, traced_axis):
    """Partial trace of a bipartite rho over subsystem `traced_axis` (0 or 1)."""
    # Reshape so the indices are (a, b, a', b'), one pair per subsystem.
    rho_reshaped = np.asarray(rho).reshape([dims[0], dims[1], dims[0], dims[1]])
    if traced_axis == 0:
        # Sum over the first subsystem, keep the second.
        return np.einsum("ijik->jk", rho_reshaped)
    if traced_axis == 1:
        # Sum over the second subsystem, keep the first.
        return np.einsum("ijkj->ik", rho_reshaped)
    raise ValueError("traced_axis must be 0 or 1")

# Trace out the second qubit (axis 1) to keep the first.
rho_a = partial_trace(rho_noisy, dims=[2, 2], traced_axis=1)
print("First qubit marginal:")
print(rho_a)
print("Marginal purity:", np.trace(rho_a @ rho_a).real)
# Expected: 0.5 (maximally mixed)
Sample output:
Pure Bell state purity: 1.0
Noisy state purity: 0.6731
First qubit marginal:
[[0.5+0.j 0.0+0.j]
[0.0+0.j 0.5+0.j]]
Marginal purity: 0.5
The pure Bell state has full purity. Adding noise reduces the joint purity. Tracing out one qubit gives a maximally mixed marginal for any Bell state, and here also for the noisy version: the depolarizing channel is unital (it maps $I/2$ to $I/2$), and noise acting on the second qubit cannot change the first qubit's marginal.
Common misconceptions
“A density matrix is just a fancy way to write probabilities.” Density matrices encode classical probabilities (diagonal) and quantum coherence (off-diagonal). Off-diagonal entries have no classical analog. Density matrices are strictly more expressive than classical probability distributions.
“Mixed states are less quantum than pure states.” Wrong. Mixed states can be entangled with auxiliary systems via purification, and an entangled mixed state is at least as quantum as any pure state. Quantum information theorems (no-cloning, no-broadcasting, etc.) apply to mixed states as much as to pure states.
“You can decompose any mixed state into a unique pure-state mixture.” No. The same density matrix can be written as different pure-state mixtures, all giving the same density matrix and the same operational predictions. This non-uniqueness is structural — only the density matrix itself is operationally meaningful.
“The partial trace destroys information.” It does not destroy information; it makes information about the traced-out subsystem inaccessible to the remaining one. The information still exists in the joint state — purifications make this explicit.
“Density matrices are only needed for noisy hardware.” They are needed any time you don’t have complete knowledge of a quantum state — including when you’re studying a part of a perfectly-pure entangled system. Quantum information theory of pure states and noise-free hardware still uses density matrices for the marginals.
Decision rule
Use density matrices when:
- The system is noisy. Real hardware always has some noise; modeling as a density matrix captures this.
- You’re studying a subsystem of a larger entangled system. Even pure global states have mixed-state marginals.
- Classical probability distributions over quantum states are involved. E.g., a state preparation that randomly produces $|0\rangle$ or $|1\rangle$.
- You’re doing quantum-information-theoretic analysis. Channels, capacities, fidelities, entropies — all natively in density-matrix language.
Stick with pure-state notation when:
- You’re explaining or learning textbook algorithms. Shor, Grover, QFT, etc. are cleaner in pure-state notation.
- You’re analyzing closed-system unitary dynamics. No noise, no partial subsystems.
- You want minimal mathematical overhead. Density matrices add notation; if the system is genuinely pure, vectors are enough.
For practical quantum-computing work, density matrices are the language of real hardware analysis: error correction, channel characterization, randomized benchmarking, fault-tolerance proofs.
Exercises
1. Distinguish a superposition from a mixture
The pure state $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ and the mixture $\rho = \tfrac{1}{2}\big(|0\rangle\langle 0| + |1\rangle\langle 1|\big)$ have the same probability of measurement outcomes in the computational basis. What measurement distinguishes them?
Answer:
Measure in the Hadamard basis $\{|+\rangle, |-\rangle\}$. The pure state gives outcome $|+\rangle$ with probability 1. The mixture gives outcomes $|+\rangle$ or $|-\rangle$ with equal probability. Off-diagonal coherence is the distinguishing signature: in the computational basis the pure state has nonzero off-diagonal density-matrix entries, while the mixture has only diagonal entries. Superpositions and mixtures can look identical in one basis but differ in a suitably chosen other basis. This is also how decoherence is detected experimentally: the decay of off-diagonal coherence in a given basis signals that a superposition of those basis states is turning into a mixture.
2. Purify a maximally mixed qubit
Construct a 2-qubit pure state that is a purification of the maximally mixed single-qubit state $\rho = I/2$. How does this relate to the Bell state?
Answer:
Purification: $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$. The purification of a maximally mixed qubit is exactly the Bell state (up to a unitary on the auxiliary qubit, which produces the other maximally entangled states). This is why “maximally mixed locally” and “maximally entangled globally” are two views of the same state: the Bell state’s first-qubit marginal is the maximally mixed state, and the Bell state is the purification of that marginal. Maximum entanglement is the structural complement of maximum local randomness.
3. The partial-trace recipe
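Show that for any bipartite pure state $|\psi\rangle_{AB}$ with Schmidt decomposition $|\psi\rangle_{AB} = \sum_i \lambda_i\,|a_i\rangle|b_i\rangle$, the marginal $\rho_A$ has eigenvalues equal to the Schmidt coefficients squared, $\lambda_i^2$.
Answer:
$|\psi\rangle\langle\psi| = \sum_{i,j} \lambda_i \lambda_j\, |a_i\rangle\langle a_j| \otimes |b_i\rangle\langle b_j|$. Tracing over $B$: $\rho_A = \sum_{i,j} \lambda_i \lambda_j\, |a_i\rangle\langle a_j|\,\langle b_j|b_i\rangle = \sum_i \lambda_i^2\, |a_i\rangle\langle a_i|$, using $\langle b_j|b_i\rangle = \delta_{ij}$. The marginal is diagonal in the Schmidt basis with eigenvalues $\lambda_i^2$. This is why the Schmidt coefficients are sometimes called “the spectrum of the entanglement”: their squares are literally the eigenvalues of the marginal density matrices.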
4. Why density matrices need to be positive
Suppose someone claims to have measured a “density matrix” with one negative eigenvalue. Why is this impossible?
Answer:
The eigenvalues of a density matrix are the probabilities of finding the state in the corresponding eigenstate (in the spectral decomposition $\rho = \sum_i p_i\,|i\rangle\langle i|$, the $p_i$ are the eigenvalues). Probabilities cannot be negative: that would imply negative measurement-outcome counts in repeated experiments. A “density matrix” with negative eigenvalues is therefore not a valid quantum state. In quantum-state tomography (the procedure of estimating $\rho$ from measurement statistics), apparent negative eigenvalues are a signature of measurement noise or experimental error; they must be regularized to a nearby positive semidefinite matrix before the result counts as a valid state estimate.
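One simple way to regularize such an estimate is to clip the negative eigenvalues and renormalize the trace. The sketch below is a heuristic illustration only, not the procedure of any particular tomography package and not guaranteed to return the closest valid state; the function name and the example matrix are hypothetical, and it assumes at least one eigenvalue is positive.

```python
import numpy as np

def clip_to_physical(rho_est):
    """Heuristic: symmetrize, clip negative eigenvalues to zero, renormalize the trace."""
    herm = (rho_est + rho_est.conj().T) / 2      # enforce Hermiticity
    evals, evecs = np.linalg.eigh(herm)
    evals = np.clip(evals, 0.0, None)            # remove negative eigenvalues
    evals = evals / evals.sum()                  # restore unit trace (assumes sum > 0)
    # Rebuild sum_i evals[i] |v_i><v_i| from the eigenvectors (columns of evecs).
    return (evecs * evals) @ evecs.conj().T

# Hypothetical raw tomography estimate: unit trace, but one negative eigenvalue.
rho_raw = np.array([[1.05, 0.10], [0.10, -0.05]], dtype=complex)
rho_fixed = clip_to_physical(rho_raw)

print("raw eigenvalues:  ", np.linalg.eigvalsh(rho_raw))
print("fixed eigenvalues:", np.linalg.eigvalsh(rho_fixed))
print("fixed trace:", np.real(np.trace(rho_fixed)))
```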
Where this goes next
Tutorial 48 covers the Bloch sphere — the natural geometric representation of single-qubit density matrices, and one of the cleanest visual tools in quantum computing. Future foundations tutorials may cover specific channel families (depolarizing, dephasing, amplitude damping) in their density-matrix forms, the von Neumann entropy as the unique entropy on density matrices, and the Choi-Jamiolkowski isomorphism that lets channels be analyzed as states.