Bernstein-Vazirani and Simon: Learning Hidden Structure in One (or O(n)) Queries
Bernstein-Vazirani learns a hidden bit string in a single query. Simon's algorithm learns a hidden shift with O(n) queries where classical algorithms need exponentially many — and was the direct inspiration for Shor's factoring algorithm. This tutorial derives both from scratch with complete Qiskit implementations.
Prerequisites: Tutorial 8: Deutsch-Jozsa
Deutsch-Jozsa told us quantum computers can beat deterministic classical ones in the black-box model. This tutorial covers two algorithms that push the idea further and make it recognizable as learning: Bernstein-Vazirani retrieves a hidden bit string in one query, and Simon’s algorithm cracks a problem where even randomized classical algorithms need exponentially many queries.
Simon’s algorithm is the unsung hero of quantum computing history. When Peter Shor saw it in 1994, he reportedly realized “the same idea would break RSA if I could just replace the Boolean group with the integers mod .” Shor’s factoring algorithm — the reason governments care about quantum — is a direct descendant.
Bernstein-Vazirani: find the hidden string in one query
Problem. You’re given a Boolean function as a black box, with the promise that there exists a hidden string such that
where the dot product is taken bit-by-bit and summed mod 2. Your job: find .
Classical bound. queries — you have to learn each bit of . Query where is the -th standard basis vector and read off . You can’t do better because each query returns only one bit.
Quantum bound. One query. Same circuit as Deutsch-Jozsa, same structure, different interpretation of the output.
Circuit
Literally identical to Deutsch-Jozsa. Prepare , apply Hadamards on everything, apply the oracle, apply Hadamards on the inputs, measure. The output is .
Why? Trace the state as in Deutsch-Jozsa. After the oracle with phase kickback:
Now apply to the inputs. The Hadamard of a basis state with a sign pattern is:
The inner sum is when and otherwise. So you measure deterministically. One oracle query, classical bits recovered.
OpenQASM for BV with
OPENQASM 2.0;
include "qelib1.inc";
// Bernstein-Vazirani for hidden string s = 1011 (LSB on q[0])
qreg q[5]; // 4 inputs + 1 ancilla
creg c[4];
// Prep
x q[4];
h q[0]; h q[1]; h q[2]; h q[3]; h q[4];
// Oracle f(x) = s · x mod 2, for s = 1011
// Bits of s that are 1 contribute CNOTs to the ancilla:
cx q[0], q[4]; // s_0 = 1
cx q[1], q[4]; // s_1 = 1
// s_2 = 0, skip
cx q[3], q[4]; // s_3 = 1
// Final H on inputs
h q[0]; h q[1]; h q[2]; h q[3];
measure q[0] -> c[0];
measure q[1] -> c[1];
measure q[2] -> c[2];
measure q[3] -> c[3];
Every shot returns 1011 (in Qiskit’s reverse bit order that’s 1101 read right-to-left). → Try it in the playground — paste the QASM above.
Qiskit implementation
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
def bv_circuit(s: str) -> QuantumCircuit:
n = len(s)
qc = QuantumCircuit(n + 1, n)
qc.x(n) # ancilla = |1⟩
qc.h(range(n + 1)) # Hadamards everywhere
# Oracle: one CNOT per '1' bit in s (indexed so s[0] is the MSB)
for i, bit in enumerate(reversed(s)):
if bit == "1":
qc.cx(i, n)
qc.h(range(n)) # Hadamards back on inputs
qc.measure(range(n), range(n))
return qc
s = "1011"
qc = bv_circuit(s)
counts = AerSimulator().run(qc, shots=1).result().get_counts()
print(f"Hidden s = {s}, recovered = {next(iter(counts))}")
# Hidden s = 1011, recovered = 1011
Why the classical vs quantum separation looks so stark
If you hand a classical algorithm an oracle for on bits, it will make exactly 1000 queries to learn . The quantum algorithm makes one. That’s a 1000-to-1 query speedup — exponential in the speedup factor, though not in the scaling (both are polynomial in in this problem).
The deeper win: the speedup holds against randomized classical algorithms too. Unlike Deutsch-Jozsa, there’s no way for a classical algorithm to get around the information-theoretic lower bound of queries — each classical query returns only one bit and you need bits of answer. Quantum parallelism + phase kickback + Hadamard readout returns bits in one go.
Simon’s algorithm: the real exponential speedup
Problem. You’re given a function as a black box, with the promise that there exists a hidden nonzero string such that
In words: is 2-to-1, and every pair of colliding inputs differs by XOR of the hidden . Your job: find .
Classical bound. queries by the birthday paradox — you have to keep querying until you happen to see the same output twice, and then . On that’s already 2 million queries.
Quantum bound. queries on average. That’s exponential separation against even randomized classical algorithms. No asterisks.
Circuit
Simon uses two -qubit registers. Call them the input register and the output register:
- Prepare .
- Apply to the input register:
- Apply the oracle which maps :
- Measure the output register (and discard the result). This step is subtle and important.
Because is 2-to-1, every output comes from exactly two inputs . Measuring the output collapses the input register to an equal superposition over that pair:
You don’t know — that’s hidden randomness — but the structure "" is exactly what exposes .
- Apply to the input register. After a page of algebra:
Every in the superposition has the property . The random phase is irrelevant — when you measure, you get some with .
- Measure the input register. You get a random orthogonal to over .
Classical post-processing
One run of the quantum circuit gives you one random linear constraint . Run the circuit times (or slightly more, to be robust) and you’ll have random linear constraints. Solve them as a linear system over and you get (up to the trivial solution , which the problem promise rules out).
Linear systems over can be solved by Gaussian elimination in classical operations. Total quantum queries: . Total classical post-processing: .
OpenQASM for Simon with
, smallest meaningful instance. Pick a specific : , — collisions at differ-by- pairs.
OPENQASM 2.0;
include "qelib1.inc";
// Simon, s = 11. 2 input qubits, 2 output qubits.
qreg q[4];
creg c[2];
// Prep & Hadamards on input register
h q[0]; h q[1];
// Oracle: f(x0, x1) = (x0 XOR x1, 0). Collisions at x and x XOR 11.
cx q[0], q[2];
cx q[1], q[2];
// Measure output register (qubits 2,3) first, informally
// (skipping the classical measurement step here; the collapse still happens)
// Hadamards back on input
h q[0]; h q[1];
// Measure input register
measure q[0] -> c[0];
measure q[1] -> c[1];
Every shot returns either 00 or 11 (because the only with are and ). → Try it in the playground.
Qiskit implementation with classical post-processing
import numpy as np
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
def simon_circuit(n: int, oracle: QuantumCircuit) -> QuantumCircuit:
qc = QuantumCircuit(2 * n, n)
qc.h(range(n)) # uniform superposition on inputs
qc.compose(oracle, inplace=True) # apply oracle
qc.h(range(n)) # Hadamards back
qc.measure(range(n), range(n))
return qc
def simon_oracle(n: int, s: str) -> QuantumCircuit:
"""Oracle for a Simon function with hidden shift s."""
oc = QuantumCircuit(2 * n)
# Copy x → output register
for i in range(n):
oc.cx(i, n + i)
# Find first '1' bit of s (MSB in our convention)
first_one = None
for i, bit in enumerate(reversed(s)):
if bit == "1":
first_one = i
break
# Controlled on that bit, XOR s into output → creates the 2-to-1 collision pattern
for i, bit in enumerate(reversed(s)):
if bit == "1":
oc.cx(first_one, n + i)
return oc
def solve_f2_system(ys: list[str]) -> str | None:
"""Find a nonzero s such that y · s = 0 mod 2 for every y. Gaussian elimination over F_2."""
n = len(ys[0])
A = np.array([[int(b) for b in y] for y in ys], dtype=np.int8)
# Row-reduce
rows = A.tolist()
row, col = 0, 0
while row < len(rows) and col < n:
pivot = next((i for i in range(row, len(rows)) if rows[i][col]), None)
if pivot is None:
col += 1
continue
rows[row], rows[pivot] = rows[pivot], rows[row]
for i in range(len(rows)):
if i != row and rows[i][col]:
rows[i] = [(a ^ b) for a, b in zip(rows[i], rows[row])]
row += 1; col += 1
# Find a null-space vector
pivots = set()
for r in rows:
for j in range(n):
if r[j]:
pivots.add(j); break
free = [j for j in range(n) if j not in pivots]
if not free:
return None
s = [0] * n
s[free[0]] = 1
for r in rows:
pivot_col = next((j for j in range(n) if r[j]), None)
if pivot_col is None:
continue
s[pivot_col] = sum(r[j] * s[j] for j in range(n)) % 2
return "".join(str(b) for b in s)
# --- Run ---
n = 3
s = "110"
oracle = simon_oracle(n, s)
qc = simon_circuit(n, oracle)
sim = AerSimulator()
ys = []
shots = 0
while len(ys) < n - 1 and shots < 200:
counts = sim.run(qc, shots=1).result().get_counts()
y = next(iter(counts))
shots += 1
if y != "0" * n and y not in ys:
# Sanity check: y · s should be 0 mod 2
if sum(int(a) * int(b) for a, b in zip(y, s)) % 2 == 0:
ys.append(y)
recovered = solve_f2_system(ys)
print(f"Hidden s = {s}, shots used = {shots}, recovered = {recovered}")
# Hidden s = 110, shots used = ~3, recovered = 110
Why Simon matters more than Bernstein-Vazirani
BV’s speedup is dramatic-looking (1 vs ) but polynomial: classical queries is already efficient. Simon’s speedup is exponential ( vs ) in a setting where no classical algorithm — randomized, deterministic, anything — can do better than . This is the first genuine exponential quantum-vs-classical separation in query complexity that holds against randomized algorithms.
Shor’s algorithm is, structurally, Simon’s algorithm with replaced by and the hidden XOR-shift replaced by a hidden multiplicative period. The analogy is precise: Simon’s hidden subgroup is ; Shor’s hidden subgroup is . Same idea, bigger group, bigger trouble for RSA.
Exercises
1. Verify BV algebraically
For on , hand-trace the state through the BV circuit and confirm the final measurement yields with probability 1.
Show answer
After phase kickback, input register is with . The sign of is . Hadamard transform gives . Measurement gives 101 with probability 1.
2. Simon with a specific oracle
Write out the Simon oracle for , , as a circuit. Run it in Qiskit and collect 10 random ‘s. Verify every satisfies .
Show hint
Use simon_oracle(3, "110") from the code above. Run 10 shots, check each outcome: should satisfy . Expected ‘s: .
3. Classical birthday attack
Implement a naive classical algorithm for Simon that just queries at random inputs until it sees a collision. Measure the number of queries needed for over 100 runs. Compare to .
Hint
import random
def classical_simon(f, n, trials=100):
lengths = []
for _ in range(trials):
seen = {}
count = 0
while True:
x = random.randint(0, 2**n - 1)
count += 1
if x in seen: continue
y = f(x)
if y in seen.values():
lengths.append(count); break
seen[x] = y
return sum(lengths) / len(lengths)You’ll see averages around , consistent with the birthday bound.
4. Think: what Simon leaves out
Simon’s algorithm gives you given a specific promise about . What happens if the promise is violated — e.g., is actually 4-to-1? Would the algorithm fail silently or detectably?
Show answer
If is 4-to-1, measuring the output register collapses the input register to an equal superposition over 4 inputs (not 2). The subsequent Hadamard and measurement returns a orthogonal to a hidden subgroup of size 4, not a single shift . The algorithm “succeeds” in returning such ‘s, but the linear system you’d solve to extract is inconsistent — Gaussian elimination fails or returns multiple solutions. So you’d notice, but only in post-processing. This is why real uses of Simon check the promise holds before trusting results.
What you should take away
- Bernstein-Vazirani retrieves an -bit secret in one oracle query (vs classically). Same circuit as Deutsch-Jozsa, different interpretation of the output bit string.
- Simon’s algorithm finds a hidden XOR-shift in queries (vs classically). The first exponential quantum speedup holding against randomized classical algorithms.
- Simon → Shor. The same algorithm with replaced by and XOR-period replaced by multiplicative-period is Shor’s factoring algorithm.
- Classical post-processing matters. Simon returns random linear constraints; you recover the answer by Gaussian elimination over .
Next: Grover’s search. The other great query-complexity algorithm — vs — and the amplitude amplification technique that generalizes it far beyond search.