QAOA for Combinatorial Optimization
QAOA encodes a combinatorial problem as a cost Hamiltonian, prepares a variational state by alternating cost and mixer evolutions, and uses a classical optimizer to find approximate solutions. This tutorial derives the MaxCut case from scratch, runs it in Qiskit, and honestly compares QAOA to classical baselines like Goemans-Williamson.
Prerequisites: Tutorial 13: Variational Quantum Eigensolver
QAOA — the Quantum Approximate Optimization Algorithm — is VQE’s cousin, reshaped for combinatorial problems. It was proposed by Farhi, Goldstone, and Gutmann in 2014 as a near-term quantum heuristic for problems like MaxCut, graph coloring, Max-SAT, and portfolio optimization. The promise: use shallow circuits to produce approximately optimal solutions better than classical heuristics.
Ten years later, the reality is nuanced. QAOA at small constant depth p often matches but doesn't beat strong classical baselines. At larger p, the circuit gets hard to optimize. This tutorial walks through the math, ships a working implementation, and gives you the honest comparison: what QAOA can and can't do in 2026.
The problem framing
Combinatorial optimization: find a bit string x ∈ {0,1}^n that maximizes (or minimizes) a cost function C(x). MaxCut is the canonical instance: given a graph G = (V, E), partition V into two sets to maximize the number of edges crossing the cut.
MaxCut cost function:

C(x) = Σ_{(i,j)∈E} [x_i ≠ x_j]

Edge (i, j) contributes 1 if x_i ≠ x_j (cut) and 0 if they match.
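The cost function is easy to state in code. A minimal scoring helper (the function name is my own) for candidate partitions:

```python
def cut_value(bits, edges):
    """Number of edges crossing the cut defined by a 0/1 assignment."""
    return sum(1 for i, j in edges if bits[i] != bits[j])

# 4-cycle: the alternating assignment cuts all 4 edges
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(cut_value([0, 1, 0, 1], edges))  # 4
print(cut_value([0, 0, 1, 1], edges))  # 2
```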
Encode the cost as a Hamiltonian
Map x_i ∈ {0,1} to z_i = (−1)^{x_i} ∈ {+1, −1} (Pauli Z eigenvalues are ±1). The cost Hamiltonian is:

H_C = Σ_{(i,j)∈E} (I − Z_i Z_j) / 2
Eigenvalues of H_C on computational-basis states equal the MaxCut values of the corresponding bit strings. The highest-energy eigenstate of H_C (not the ground state, since we're maximizing) is the optimal bit string encoded as a basis state.
The QAOA ansatz
Alternate two unitaries p times:
Cost unitary. U_C(γ) = e^{−iγ H_C}. For MaxCut: each edge (i, j) contributes a Z_i Z_j rotation, implementable as two CNOTs around an R_z gate.
Mixer unitary. U_M(β) = e^{−iβ H_M}, where H_M = Σ_i X_i. Just R_x(2β) on every qubit.
Full QAOA-p circuit:

|γ, β⟩ = U_M(β_p) U_C(γ_p) ⋯ U_M(β_1) U_C(γ_1) |+⟩^⊗n

Initial state is the uniform superposition |+⟩^⊗n (easy to prepare with Hadamards). A classical optimizer finds the 2p angles (γ, β) that maximize ⟨γ, β| H_C |γ, β⟩. Then sample the final state to read off good bit strings.
Why this works (intuition)
As p → ∞ with appropriately slow-scheduled γ and β, QAOA becomes the quantum adiabatic algorithm: evolving slowly from the ground state of H_M (uniform superposition) to the extremal eigenstate of H_C (the optimal bit string). QAOA-p is a truncation: p steps of a discretized adiabatic evolution.
At small p, the circuit is shallow enough to run on NISQ hardware, and the cost landscape in (γ, β) is often surprisingly well-structured: you can find good solutions by parameter sweep or gradient descent.
MaxCut on a 4-node graph
Graph: the 4-cycle C₄, edges (0,1), (1,2), (2,3), (3,0). Optimal cut: 4 (alternating color assignment). Expected cut of a uniformly random bit string: 2.
import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter
from qiskit.quantum_info import SparsePauliOp
from qiskit.primitives import StatevectorEstimator, StatevectorSampler
from scipy.optimize import minimize
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
def maxcut_hamiltonian(edges, n) -> SparsePauliOp:
    """H_C = Σ_{(i,j)} (I - Z_i Z_j)/2"""
    terms = []
    for i, j in edges:
        paulis = ["I"] * n
        paulis[i] = "Z"; paulis[j] = "Z"
        terms.append(("".join(reversed(paulis)), -0.5))  # Qiskit Pauli strings are little-endian
        terms.append(("I" * n, 0.5))
    return SparsePauliOp.from_list(terms).simplify()
H_C = maxcut_hamiltonian(edges, n)
def qaoa_circuit(p: int) -> tuple[QuantumCircuit, list[Parameter]]:
    gammas = [Parameter(f"γ{i}") for i in range(p)]
    betas = [Parameter(f"β{i}") for i in range(p)]
    qc = QuantumCircuit(n)
    qc.h(range(n))  # |+⟩^n
    for layer in range(p):
        # Cost unitary: e^{-i γ H_C}
        for i, j in edges:
            qc.cx(i, j); qc.rz(2 * gammas[layer], j); qc.cx(i, j)
        # Mixer: e^{-i β H_M}
        for q in range(n):
            qc.rx(2 * betas[layer], q)
    return qc, gammas + betas
p = 2
ansatz, params = qaoa_circuit(p)
estimator = StatevectorEstimator()
def cost(theta: np.ndarray) -> float:
    bound = ansatz.assign_parameters(dict(zip(params, theta)))
    ev = estimator.run([(bound, H_C, [])]).result()[0].data.evs
    return -float(ev)  # minimize -⟨H_C⟩ to maximize
theta0 = np.array([0.5, 0.5, 0.3, 0.3]) # [γ1, γ2, β1, β2]
result = minimize(cost, theta0, method="COBYLA", options={"maxiter": 200, "rhobeg": 0.2})
print(f"QAOA-{p} expected cut: {-result.fun:.4f}")
print(f"Optimal cut for C₄: 4")
print(f"Random cut (baseline): 2")
# QAOA-2 expected cut: 3.9998 ← basically optimal
At p = 2 on a 4-cycle, QAOA finds the optimal cut essentially exactly. The optimizer converges in ~60 iterations.
Sample the final state for a concrete bit string:
from qiskit.primitives import StatevectorSampler
bound = ansatz.assign_parameters(dict(zip(params, result.x)))
bound_meas = bound.copy(); bound_meas.measure_all()
sampler = StatevectorSampler()
counts = sampler.run([(bound_meas, [])], shots=2048).result()[0].data.meas.get_counts()
for bits, cnt in sorted(counts.items(), key=lambda kv: -kv[1])[:4]:
    z = [1 if b == "0" else -1 for b in reversed(bits)]
    cut_value = sum((1 - z[i] * z[j]) / 2 for i, j in edges)
    print(f"|{bits}⟩: {cnt:>4d} shots, cut = {cut_value}")
# |0101⟩: 1023 shots, cut = 4
# |1010⟩: 1021 shots, cut = 4
# |0110⟩: 2 shots, cut = 2
# ...
The two optimal cuts (|0101⟩ and |1010⟩, the two alternating-color assignments) dominate the distribution.
→ Try a small MaxCut in the playground — paste the cost-unitary QASM and see the diagram.
Benchmarking against Goemans-Williamson
The canonical classical baseline for MaxCut is the Goemans-Williamson (GW) SDP-based approximation algorithm, which achieves a worst-case ratio of α ≈ 0.878: the cut it returns has expected value at least 0.878 times the optimum. This ratio is provably optimal under the Unique Games Conjecture; assuming UGC, no polynomial-time classical algorithm can do better in the worst case.
How does QAOA compare?
- QAOA-1 (p=1): worst-case approximation ratio ≈ 0.6924 on 3-regular graphs (Farhi et al. 2014). Worse than GW.
- QAOA-2 (p=2): approximation ratios around 0.75 on random 3-regular graphs. Still below GW.
- QAOA-p at large p: ratios approach 1 on many benchmark families, but you need depth that exceeds NISQ capabilities.
- Warm-started QAOA (Egger et al. 2020): initialize parameters from the GW solution; empirically beats GW on specific instances. Still requires running GW first.
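To report an approximation ratio at all, you need the true optimum in the denominator. For instances this small you don't need GW for that: brute force is exact. A helper (pure Python, exponential in n, fine below ~20 nodes; the name is my own):

```python
from itertools import product

def maxcut_bruteforce(edges, n):
    """Exact MaxCut optimum by enumerating all 2^n assignments."""
    return max(
        sum(1 for i, j in edges if bits[i] != bits[j])
        for bits in product((0, 1), repeat=n)
    )

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
opt = maxcut_bruteforce(edges, 4)
print(f"optimum = {opt}")  # 4 for the 4-cycle
print(f"QAOA-2 ratio ≈ {3.9998 / opt}")
```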
Scaling up — Max-k-SAT, portfolio optimization
The QAOA framework is fully general. Any problem expressible as

H_C = Σ_α c_α P_α

(where each P_α is a Pauli string) fits. Examples:
- Max-k-SAT. Each clause contributes a term built from a tensor product of (I ± Z)/2 projectors; you build the cost Hamiltonian from the clauses.
- Portfolio optimization. Markowitz mean-variance: maximize μᵀx − q·xᵀΣx over binary asset selections x. The covariance term is quadratic in binary variables, hence expressible as a weighted sum of Z and ZZ terms.
- Graph coloring. Encode each vertex’s color in a one-hot binary vector; penalize constraint violations.
- Traveling Salesman. The standard encoding requires n² qubits for n cities; large overhead. Not competitive below fault tolerance.
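All of these reductions go through the same substitution x_i = (1 − z_i)/2, which turns any quadratic binary (QUBO) objective into a constant plus Z and ZZ terms. A minimal converter (the function and variable names are my own):

```python
import numpy as np

def qubo_to_ising(Q):
    """x^T Q x with x ∈ {0,1}^n  →  const + Σ_i h_i z_i + Σ_{i<j} J_ij z_i z_j,
    with z_i ∈ {+1,-1}, via the substitution x_i = (1 - z_i)/2."""
    Q = np.asarray(Q, dtype=float)
    Qs = (Q + Q.T) / 2                       # symmetrize
    const = Qs.sum() / 4 + np.trace(Qs) / 4  # z_i^2 = 1 folds diagonals into the constant
    h = -Qs.sum(axis=1) / 2
    J = np.triu(Qs, k=1) / 2                 # upper-triangular pair couplings
    return const, h, J

# Objective x0·x1, maximized at x = (1, 1)
const, h, J = qubo_to_ising([[0, 1], [0, 0]])
z = np.array([-1, -1])                       # encodes x = (1, 1)
print(const + h @ z + z @ J @ z)             # 1.0
```

The h_i coefficients become Z_i terms of H_C and the J_ij become Z_i Z_j terms, exactly the form the cost unitary above implements.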
When to use QAOA
Honest guidance for indie developers and small teams:
- Prototyping hybrid quantum-classical workflows. QAOA teaches the full hybrid stack; the lessons transfer.
- Education. Any optimization tutorial is cleaner when the problem has both a classical solver and a quantum one you can compare.
- Research. Algorithmic improvements to QAOA (warm starts, adaptive schedules, symmetry exploitation) are active areas.
- Production: not yet. Classical heuristics (simulated annealing, Gurobi, specialized graph algorithms) beat QAOA at problem sizes big enough to matter. Revisit in 2028-2030 with fault-tolerant hardware.
Parameter landscapes
A practical curiosity: the QAOA landscape is more structured than you'd expect. For MaxCut on random graphs, Farhi et al. showed the optimal (γ, β) concentrate around typical values that generalize across instances. Transferring parameters from one small instance to a larger one often works; this is called parameter transferability or instance concentration.
This means you can:
- Find good parameters on a small instance with a classical simulator.
- Transfer them to hardware for larger instances.
- Skip the expensive optimization on the noisy hardware.
This is the single biggest practical optimization when running QAOA on real hardware.
Exercises
1. Cost Hamiltonian by hand
Write the MaxCut cost Hamiltonian for the triangle graph (three nodes, three edges). Expand and identify the ground state by inspection.
Show answer
H_C = (3I − Z₀Z₁ − Z₁Z₂ − Z₀Z₂)/2. Max cut value = 2 (any bipartite split of 3 nodes cuts 2 edges). Every bit string except 000 and 111 achieves 2. The top eigenspace of H_C is the degenerate 6-dimensional subspace spanned by the non-constant bit strings.
2. QAOA-1 approximation
Implement QAOA at p = 1 for a random 3-regular graph (3-regular graphs need an even node count; try 6 nodes). Measure the average approximation ratio vs GW over 10 random instances.
Show approach
import numpy as np
import networkx as nx
import cvxpy as cp  # for the GW SDP

def goemans_williamson(G):
    n = G.number_of_nodes()
    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0] + [X[i, i] == 1 for i in range(n)]
    obj = 0.5 * sum(1 - X[i, j] for i, j in G.edges())
    cp.Problem(cp.Maximize(obj), constraints).solve()
    # Randomized hyperplane rounding
    V = np.linalg.cholesky(X.value + 1e-8 * np.eye(n))
    r = np.random.randn(n)
    signs = np.sign(V @ r)
    return sum(1 for i, j in G.edges() if signs[i] != signs[j])

Compare the QAOA-1 average cut to the GW average cut over the 10 instances.
3. Mixer alternatives
The default X mixer explores the full bit-string space. For problems with symmetry (e.g., fixed-weight bit strings), a more structured mixer is better. Think: what's the mixer for problems where you need exactly 3 ones among 8 bits? (Hint: look up “XY mixer” or “ring mixer.”)
Show sketch
Use H_M = Σ_i (X_i X_{i+1} + Y_i Y_{i+1})/2, the XY (swap-preserving) mixer. This preserves total Hamming weight, so the state stays in the “exactly 3 ones” subspace throughout the evolution. Useful for graph coloring and portfolio selection problems.
4. Compare depth
Run QAOA-1, QAOA-2, QAOA-4 on the 4-cycle. Tabulate: (a) number of parameters, (b) circuit depth, (c) best expected cut, (d) wall-clock optimization time. Is more depth always better?
Expected
QAOA-1: 2 params, depth ~8, expected cut ~3.0. QAOA-2: 4 params, depth ~16, expected cut 4.0. QAOA-4: 8 params, depth ~32, expected cut 4.0 but optimizer stalls more often due to local minima. More depth doesn’t always help; the cost landscape gets more complex.
What you should take away
- QAOA framework. Encode the cost as a Hamiltonian H_C; alternate U_C(γ) and U_M(β); optimize classically.
- Works for many problems. MaxCut, SAT, portfolio optimization, coloring — anything expressible as a sum of Pauli strings.
- Classical baseline is Goemans-Williamson (0.878 approximation for MaxCut). QAOA-1 is worse; QAOA-p beats it only at impractical depth.
- No quantum advantage demonstrated for real-world optimization as of 2026. Parameter transferability and warm starts are the closest wins.
- Still worth learning for the hybrid-algorithm framework, which generalizes to VQE, QML, and near-term quantum software patterns.
Next: Quantum ML foundations — using parameterized circuits for classification, the encoding problem, and the parameter-shift rule for gradient-based training.