gates and circuits intermediate · 24 min read · By LIPAI WANG · April 22, 2026

OpenQASM 3 and Your First Real Hardware Run

Qiskit circuits are a convenience. OpenQASM 3 is the portable assembly language underneath — and what you actually send to hardware. This tutorial walks through the OpenQASM 3 syntax that matters, IBM Quantum's free tier, transpilation, and how to interpret noisy results honestly on your first real-hardware run.

Prerequisites: Tutorial 6: Multi-Qubit Gates

Every Qiskit QuantumCircuit can be serialized to OpenQASM 3, the form in which circuits are exchanged with hardware toolchains. QASM is the portable assembly language for quantum: readable by humans, writable by machines, and vendor-neutral across IBM, IonQ, Quantinuum, and anything else that takes the standard. Knowing QASM 3 is the difference between “I wrote a quantum program” and “I understand what the hardware is actually doing.”

This tutorial teaches you the subset of QASM 3 you’ll actually use, then walks you through running your first circuit on IBM Quantum’s free tier, inspecting calibration data, and interpreting the results without deceiving yourself.

Why QASM 3 (not QASM 2 or anything else)

Historical context in one minute: OpenQASM 2 (2017) was the original standard. It got you far, but it lacked general real-time classical control flow, classical data types and arithmetic, and timing and custom calibrations. (It did already have parameterized gate definitions and a single-statement if on a classical register.) QASM 3 (stabilized 2021, hardware-adopted 2023–2024) adds all three. Every new feature on IBM’s Runtime primitives (dynamic circuits, mid-circuit measurement with branching, classical registers as first-class values) lives in QASM 3.

IonQ, Rigetti, Quantinuum, and most simulators accept QASM 3, or a subset of it, as input, either directly or through their cloud front ends. If you want your circuits to be portable across vendors, OpenQASM 3 is the interchange format.

A complete QASM 3 program

Here is the Bell-state circuit in OpenQASM 3:

OPENQASM 3.0;
include "stdgates.inc";

qubit[2] q;
bit[2]   c;

h q[0];
cx q[0], q[1];
c = measure q;

Every piece is worth knowing:

OPENQASM 3.0; — version declaration; always line 1.
include "stdgates.inc"; — pulls in standard gate definitions (H, CX, X, Y, Z, Rx, Ry, Rz, …). Without this you’d have to define h yourself, which is rarely what you want.
qubit[2] q; — declare a quantum register named q with 2 qubits.
bit[2] c; — classical register of 2 bits, the measurement destination.
h q[0]; — Hadamard on qubit 0.
cx q[0], q[1]; — CNOT, control = qubit 0, target = qubit 1.
c = measure q; — measure the full register; results go into classical bits.
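To make the effect of those three operations concrete, here is a minimal pure-Python sketch that tracks the four amplitudes of the two-qubit state by hand. The helper names `apply_h` and `apply_cx` are illustrative, not part of Qiskit or OpenQASM, and the bit ordering (qubit 0 is the least significant bit of the list index) is an assumption chosen to match how Qiskit prints bit strings.

```python
import math

def apply_h(state, qubit):
    """Apply a Hadamard to `qubit` of a statevector stored as a flat list.

    Bit `qubit` of each list index holds that qubit's value.
    """
    s = 1 / math.sqrt(2)
    new = list(state)
    for i in range(len(state)):
        if not (i >> qubit) & 1:            # visit each (|0>, |1>) pair once
            j = i | (1 << qubit)
            a, b = state[i], state[j]
            new[i], new[j] = s * (a + b), s * (a - b)
    return new

def apply_cx(state, control, target):
    """Apply a CNOT: flip `target` in every basis state where `control` is 1."""
    new = list(state)
    for i in range(len(state)):
        if (i >> control) & 1:
            new[i ^ (1 << target)] = state[i]
    return new

state = [1.0, 0.0, 0.0, 0.0]                # start in |00>
state = apply_h(state, 0)                   # h q[0];
state = apply_cx(state, 0, 1)               # cx q[0], q[1];

# Ideal measurement probabilities, keyed as "q1 q0" bit strings
probs = {format(i, "02b"): abs(a) ** 2 for i, a in enumerate(state)}
print(probs)
```

Only 00 and 11 carry probability, each close to one half, which is the ideal distribution the hardware run is later compared against.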

Things you didn’t see in QASM 2:

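Real-time control flow. if (c == 1) { x q[0]; } branches on a classical measurement result mid-circuit. QASM 2 allowed only a single-statement if on a whole classical register; QASM 3 adds else blocks and for and while loops.
Gate modifiers. gate rx(θ) q { ... } defines a custom parameterized gate, which QASM 2 could already do; QASM 3 adds modifiers such as ctrl @, inv @, and pow(k) @ that derive new gates from existing ones.
Classical integer arithmetic. int[32] n; n = 5; lets you do classical computation in the middle of a program.
Timing control. delay[100ns] q[0]; inserts a deterministic delay, for example for decoherence studies.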
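The features listed above can be combined in a single short program. The sketch below holds a hand-written OpenQASM 3 source in a Python string (the variable name `FEEDBACK_QASM` is illustrative). It has not been checked against any particular parser or backend, and real devices support differing subsets of the language, so treat it as a reading exercise rather than a guaranteed-runnable job.

```python
# A hand-written OpenQASM 3 program combining mid-circuit measurement,
# a real-time branch, classical arithmetic, and a timed delay.
FEEDBACK_QASM = """\
OPENQASM 3.0;
include "stdgates.inc";

qubit[2] q;
bit c;
int[32] n = 0;

h q[0];
c = measure q[0];          // mid-circuit measurement
if (c == 1) {              // real-time branch on the result
    x q[1];
    n = n + 1;             // classical arithmetic between quantum operations
}
delay[100ns] q[1];         // fixed idle time on qubit 1
"""

print(FEEDBACK_QASM)
```

Read top to bottom: qubit 0 is put in superposition and measured, and qubit 1 is flipped only on the shots where that measurement returned 1.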

Exporting and importing in Qiskit

Round-trip a Qiskit circuit through QASM 3:

from qiskit import QuantumCircuit, qasm3

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Qiskit → QASM 3 string
qasm_src = qasm3.dumps(qc)
print(qasm_src)
# OPENQASM 3.0;
# include "stdgates.inc";
# bit[2] c;
# qubit[2] q;
# h q[0];
# cx q[0], q[1];
# c[0] = measure q[0];
# c[1] = measure q[1];

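# QASM 3 string → Qiskit (qasm3.loads needs the optional qiskit-qasm3-import package)
qc_restored = qasm3.loads(qasm_src)
# Compare gate counts rather than raw instruction objects, which is less brittle
assert dict(qc_restored.count_ops()) == dict(qc.count_ops())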

For vendor interoperability, dump to QASM 3, pipe into a non-Qiskit tool, and load back. This is how benchmarks like MQT Bench and QASMBench stay language-neutral.

Getting an IBM Quantum account

Free tier: 10 minutes of quantum compute per month, which is an eternity for learning (each circuit shot takes microseconds). Sign up flow:

Go to quantum.ibm.com and create a free account.
On your dashboard, copy your API token. Store it somewhere safe — it’s the credential you’ll paste below.
Note which open plan backends are available. At the time of writing these include ibm_brisbane, ibm_kyiv, and ibm_sherbrooke (127 qubits each, Eagle r3 processors).

Save credentials once

from qiskit_ibm_runtime import QiskitRuntimeService

QiskitRuntimeService.save_account(
    channel="ibm_quantum_platform",
    token="YOUR_TOKEN_HERE",                # paste once, then delete from source
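    instance="YOUR_INSTANCE_CRN",           # placeholder: instance name or CRN from your dashboard (hub/group/project strings belong to the older ibm_quantum channel)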
    overwrite=True,
)

After this, the token lives in ~/.qiskit/qiskit-ibm.json and every subsequent QiskitRuntimeService() call loads it automatically — no more pasting.

Pick a backend and inspect it

from qiskit_ibm_runtime import QiskitRuntimeService

service = QiskitRuntimeService()

# List operational backends
for b in service.backends(operational=True, simulator=False):
    print(f"{b.name:24s}  qubits={b.num_qubits}  queue={b.status().pending_jobs}")
# ibm_brisbane             qubits=127  queue=12
# ibm_sherbrooke           qubits=127  queue=41
# ibm_kyiv                 qubits=127  queue=3     ← pick this one

backend = service.backend("ibm_kyiv")

# Calibration snapshot
props = backend.properties()
q0 = props.qubit_property(0)
print(f"Qubit 0: T1 = {q0['T1'][0]*1e6:.1f} µs, T2 = {q0['T2'][0]*1e6:.1f} µs, "
      f"readout error = {q0['readout_error'][0]:.2%}")
# Qubit 0: T1 = 185.3 µs, T2 = 132.5 µs, readout error = 0.73%

Every real-hardware run should start with a calibration check. $T_1$ (energy relaxation time) and $T_2$ (dephasing time) tell you how long the qubit holds quantum information. Readout error tells you how reliably a measurement turns the physical state back into a classical bit. These numbers drift hour-to-hour — don’t assume Tuesday’s calibrations apply Wednesday.
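A rough way to turn those calibration numbers into intuition is a simple exponential-decay model, where the chance that a qubit has not yet relaxed after time t is exp(-t / T1), and likewise for phase coherence with T2. The sketch below uses the T1 and T2 values from the example printout above; the model itself is a simplifying assumption, and the function name `survival` is illustrative.

```python
import math

T1_US = 185.3   # energy relaxation time from the example printout, in microseconds
T2_US = 132.5   # dephasing time from the example printout, in microseconds

def survival(t_us, time_constant_us):
    """Exponential-decay estimate exp(-t / T) of what survives after t_us."""
    return math.exp(-t_us / time_constant_us)

for t in (1.0, 10.0, 100.0):
    print(
        f"t = {t:6.1f} us: "
        f"T1 survival ~ {survival(t, T1_US):.3f}, "
        f"T2 coherence ~ {survival(t, T2_US):.3f}"
    )
```

Under this model a circuit lasting about a microsecond loses well under 1% to relaxation, while one lasting 100 µs loses a large fraction, which is why circuit duration matters as much as gate count.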

Transpile for the hardware

Before you can run on ibm_kyiv, Qiskit has to rewrite your circuit in terms of the hardware’s native gate set (ecr, id, rz, sx, x on Eagle r3) and respect its qubit connectivity. This is transpilation.

from qiskit import QuantumCircuit, transpile

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Transpile for the specific backend, at optimization level 3 (the most aggressive)
tqc = transpile(qc, backend=backend, optimization_level=3)
print(f"Original gates: {dict(qc.count_ops())}")
print(f"Transpiled:     {dict(tqc.count_ops())}")
# Original gates: {'h': 1, 'cx': 1, 'measure': 2}
# Transpiled:     {'rz': 2, 'sx': 2, 'ecr': 1, 'measure': 2}

ecr is the Echoed Cross-Resonance gate, IBM’s native two-qubit gate. Every CNOT in your circuit becomes one ecr plus a few single-qubit rotations. On a deeper circuit with limited connectivity, SWAP chains would appear — the transpiler inserts them as needed to route qubits together.
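To see that the native single-qubit gates really are enough, the following sketch checks numerically that a Hadamard equals Rz(π/2) · SX · Rz(π/2) up to a global phase. This is one textbook decomposition, verified here with plain 2×2 matrix arithmetic; the gate sequence and angles the transpiler actually emits may differ.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """Rz(theta) = diag(exp(-i*theta/2), exp(+i*theta/2))."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

# sqrt(X) gate matrix
SX = [[(1 + 1j) / 2, (1 - 1j) / 2],
      [(1 - 1j) / 2, (1 + 1j) / 2]]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

candidate = matmul(rz(math.pi / 2), matmul(SX, rz(math.pi / 2)))

# Factor out the global phase, then compare entry by entry
phase = candidate[0][0] / H[0][0]
max_diff = max(abs(candidate[i][j] - phase * H[i][j])
               for i in range(2) for j in range(2))
print(f"|global phase| = {abs(phase):.6f}, max deviation = {max_diff:.2e}")
```

The global phase has magnitude 1 and the deviation is at floating-point noise level, so the two operations are physically indistinguishable.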

Inspect which physical qubits your logical qubits got mapped to:

print(tqc.layout.initial_layout)
# Layout: mapping from logical qubits in the circuit to physical qubits on chip
# e.g., logical 0 -> physical 14, logical 1 -> physical 13

At optimization levels above 0, the transpiler’s layout stage takes the backend’s reported error rates into account when choosing physical qubits. That’s usually what you want.

Run the job

from qiskit_ibm_runtime import SamplerV2 as Sampler

sampler = Sampler(mode=backend)
job = sampler.run([tqc], shots=4096)
print("Job ID:", job.job_id())
# Come back in a few minutes when the queue clears
result = job.result()

pub_result = result[0]
counts = pub_result.data.c.get_counts()
print(counts)
# {'00': 1827, '11': 1803, '01': 239, '10': 227}

Two things to notice immediately:

00 and 11 are dominant. That’s the Bell-state signature: perfect correlation between the two qubits.
01 and 10 are not zero. On a perfect quantum computer they would be exactly zero. On real hardware they appear because of (a) gate errors during H and CX, (b) decoherence during the ~1 µs circuit execution, and (c) readout misclassification.

The off-diagonal fraction $(N_{01} + N_{10})/N_{\text{total}}$ on this run is about 11%, higher than you’d like but not unusual for an early free-tier run on a shared machine. For comparison, on Quantinuum H2 you’d see under 1%.
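The arithmetic behind that figure can be reproduced directly from the example counts. This sketch also computes the ZZ correlation, (N00 + N11 − N01 − N10) / N, which is 1 for an ideal Bell state; the variable names are illustrative.

```python
# Example counts from the hardware run shown above
counts = {"00": 1827, "11": 1803, "01": 239, "10": 227}

total = sum(counts.values())
off_diagonal = (counts["01"] + counts["10"]) / total

# <ZZ> = P(bits agree) - P(bits disagree); equals 1.0 for an ideal Bell state
zz_correlation = (counts["00"] + counts["11"] - counts["01"] - counts["10"]) / total

print(f"shots = {total}")
print(f"off-diagonal fraction = {off_diagonal:.2%}")
print(f"<ZZ> correlation      = {zz_correlation:.3f}")
```

The two numbers carry the same information (the correlation is one minus twice the off-diagonal fraction), but the correlation form is the one usually quoted in entanglement experiments.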

Reading out error rates honestly

How do you distinguish “the algorithm is working but the hardware is noisy” from “the algorithm is wrong”? Three disciplines, in order of effort:

Run on a simulator first. Always. AerSimulator() gives you the ideal answer; if your circuit doesn’t produce Bell-state correlations in simulation, no hardware run will save you.

from qiskit_aer import AerSimulator
sim = AerSimulator()
ideal = sim.run(qc, shots=4096).result().get_counts()   # the logical circuit, before hardware transpilation
print(ideal)
# {'00': 2053, '11': 2043}   ← exactly 0 on '01' and '10'

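Measurement-error mitigation. A post-processing step that inverts the confusion matrix between physical and reported bit strings. In Qiskit Runtime the resilience.measure_mitigation switch belongs to the Estimator options (EstimatorV2) and applies to expectation values; SamplerV2 does not expose it. For Sampler counts, calibrate the confusion matrix and apply the correction yourself (the mthree package is one tool for this). Do it for any quantitative claim.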
Dynamical decoupling. Insert identity-preserving pulses during idle time to fight slow drifts. sampler.options.dynamical_decoupling.enable = True. Costs a bit of extra time; usually worth it.

from qiskit_ibm_runtime import SamplerV2 as Sampler

sampler = Sampler(mode=backend)
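# SamplerV2 options have no `resilience` group (that belongs to EstimatorV2);
# readout errors in the returned counts are corrected in post-processing instead.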
sampler.options.dynamical_decoupling.enable = True
sampler.options.dynamical_decoupling.sequence_type = "XpXm"

job = sampler.run([tqc], shots=4096)

After mitigation, the 01 and 10 fractions typically drop to 3–5% on Eagle r3 — noticeably better, still nonzero.
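The confusion-matrix inversion described above can be sketched in a few lines of plain Python. This version assumes readout errors are independent per qubit, so the two-qubit confusion matrix is a tensor product of two 2×2 matrices. The error rates are hypothetical, not taken from any real calibration, and the function names are illustrative; it is not a description of what any library does internally.

```python
KEYS = ["00", "01", "10", "11"]   # bit strings written as "q1 q0"

def confusion(p01, p10):
    """2x2 readout confusion matrix: column = prepared bit, row = reported bit.

    p01: probability of reporting 1 when 0 was prepared
    p10: probability of reporting 0 when 1 was prepared
    """
    return [[1 - p01, p10], [p01, 1 - p10]]

def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transform(probs, m_q1, m_q0):
    """Apply the tensor-product matrix (m_q1 x m_q0) to a two-bit distribution."""
    return {
        out: sum(
            m_q1[int(out[0])][int(src[0])]
            * m_q0[int(out[1])][int(src[1])]
            * probs.get(src, 0.0)
            for src in KEYS
        )
        for out in KEYS
    }

# Hypothetical per-qubit readout error rates (assumptions for illustration)
A0 = confusion(p01=0.02, p10=0.04)
A1 = confusion(p01=0.01, p10=0.03)

ideal = {"00": 0.5, "11": 0.5}
noisy = transform(ideal, A1, A0)                               # what readout reports
recovered = transform(noisy, inverse_2x2(A1), inverse_2x2(A0))  # mitigated estimate

print("noisy:    ", {k: round(v, 4) for k, v in noisy.items()})
print("recovered:", {k: round(v, 4) for k, v in recovered.items()})
```

With exact probabilities the inversion recovers the ideal distribution. With finite-shot counts it can return small negative values, and it corrects readout errors only, not gate errors or decoherence.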

The mental checklist for every real-hardware run

Does the circuit work on a simulator? If not, stop.
Is the calibration recent (last 24 hours)? Is readout error on the involved qubits under 2%? If not, try another backend or wait.
Did you transpile with optimization_level=3 for the specific backend? The default is lower (1 in older Qiskit releases, 2 in recent ones) and usually leaves more gates in the circuit.
Is your two-qubit gate count under ~100 for a meaningful result on today’s machines? If not, consider simulation or VQE-style ansatz truncation.
Do you have at least 4096 shots? At that count the binomial standard error on a probability near 0.5 is about 0.8%, so a two-sigma band is roughly ±1.5%.
Did you turn on measurement-error mitigation? It’s nearly free.
Did you sanity-check the raw counts before plotting? Outlier shots can mask real effects.

Every real-hardware workflow follows this checklist. Violate any step and you’ll spend hours debugging an effect that isn’t there.
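The shot-count item in the checklist rests on the binomial standard error, sqrt(p(1 − p) / N). A small sketch of that formula and its inverse follows; the function names `shot_noise` and `shots_for` are illustrative.

```python
import math

def shot_noise(p, shots):
    """One-standard-deviation binomial error on an estimated probability p."""
    return math.sqrt(p * (1 - p) / shots)

def shots_for(target_error, p=0.5):
    """Shots needed to reach target_error (one sigma); p = 0.5 is the worst case."""
    return math.ceil(p * (1 - p) / target_error ** 2)

for n in (1024, 4096, 16384):
    print(f"{n:6d} shots -> +/- {shot_noise(0.5, n):.2%} (one sigma, p = 0.5)")

print("shots for 0.5% error:", shots_for(0.005))
```

Because the error falls only as the square root of the shot count, halving it costs four times as many shots.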

Full runnable example

End-to-end, from nothing to hardware-measured Bell state:

from qiskit import QuantumCircuit, transpile
from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2 as Sampler
from qiskit_aer import AerSimulator

# 1. Define the circuit
qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# 2. Simulator sanity check
ideal = AerSimulator().run(qc, shots=4096).result().get_counts()
print("Ideal:", ideal)

# 3. Pick a backend and transpile
service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=2)
tqc = transpile(qc, backend=backend, optimization_level=3)
print(f"Using backend: {backend.name}, transpiled ops: {dict(tqc.count_ops())}")

# 4. Run with dynamical decoupling
#    (SamplerV2 has no resilience.measure_mitigation option; correct readout
#    errors on the returned counts in post-processing if you need to)
sampler = Sampler(mode=backend)
sampler.options.dynamical_decoupling.enable = True
job = sampler.run([tqc], shots=4096)
print(f"Job submitted: {job.job_id()}")

# 5. Wait, fetch, compare
result = job.result()
counts = result[0].data.c.get_counts()
print("Real:", counts)

# 6. Report a single honest metric
total = sum(counts.values())
error_fraction = (counts.get("01", 0) + counts.get("10", 0)) / total
print(f"Off-diagonal fraction (should be 0): {error_fraction:.2%}")

Exercises

1. Read the QASM

Parse this QASM 3 circuit by hand and say what it does:

OPENQASM 3.0;
include "stdgates.inc";

qubit[3] q;
bit[3]   c;

h q[0];
cx q[0], q[1];
cx q[1], q[2];
c = measure q;

Answer:

Creates a 3-qubit GHZ state: H on qubit 0 produces $|+\rangle \otimes |00\rangle$; the cascading CNOTs propagate this to $\tfrac{1}{\sqrt{2}}(|000\rangle + |111\rangle)$; measurement collapses to either 000 or 111 with 50/50 probability.

2. Round-trip through QASM

Write a 5-qubit QFT circuit in Qiskit, export it to QASM 3, print the output, then load it back and verify the round-trip produces an equivalent circuit.

Answer:

from qiskit import QuantumCircuit, qasm3
from qiskit.circuit.library import QFT
from qiskit.quantum_info import Operator
import numpy as np

qc = QuantumCircuit(5)
qc.append(QFT(5), range(5))
src = qasm3.dumps(qc.decompose())
qc2 = qasm3.loads(src)
print(np.allclose(Operator(qc.decompose()).data, Operator(qc2).data))
# True

3. Estimate error contribution

You run a 100-CNOT circuit on a machine with 0.8% per-CNOT error, 0.05% per single-qubit-gate error, and 1.5% readout error per measured qubit. Estimate the probability of a “clean” circuit (no gate error), and the probability of a fully clean measurement for a 5-qubit readout.

Answer:

Gate-level: $0.992^{100} \approx 0.448$ for the CNOTs, times $0.9995^{400} \approx 0.819$ for the single-qubit gates (assuming 4 single-qubit gates per CNOT, 400 total), gives $0.448 \times 0.819 \approx 0.367$. So ~37% of shots are free of gate errors. Readout: $0.985^5 \approx 0.927$, so about 93% of the time all 5 qubits are read out correctly. Joint probability of a perfect shot: $\sim 0.34$. In 4096 shots, you’d expect only ~1400 shots reflecting the ideal circuit. This is why shot budgets matter.
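The same estimate can be checked numerically. The sketch below assumes, as the answer does, that errors are independent and that there are 4 single-qubit gates per CNOT; the variable names are illustrative.

```python
# Error rates from the exercise statement
CNOT_ERROR = 0.008
SINGLE_QUBIT_ERROR = 0.0005
READOUT_ERROR = 0.015

N_CNOT = 100
N_SINGLE = 4 * N_CNOT        # assumption from the answer: 4 single-qubit gates per CNOT
N_MEASURED = 5
SHOTS = 4096

p_cnot = (1 - CNOT_ERROR) ** N_CNOT
p_single = (1 - SINGLE_QUBIT_ERROR) ** N_SINGLE
p_gates = p_cnot * p_single
p_readout = (1 - READOUT_ERROR) ** N_MEASURED
p_clean = p_gates * p_readout

print(f"no CNOT error:         {p_cnot:.3f}")
print(f"no single-qubit error: {p_single:.3f}")
print(f"no gate error at all:  {p_gates:.3f}")
print(f"clean 5-qubit readout: {p_readout:.3f}")
print(f"fully clean shot:      {p_clean:.3f}")
print(f"expected clean shots out of {SHOTS}: {SHOTS * p_clean:.0f}")
```

Changing `N_CNOT` shows how quickly the clean-shot fraction collapses as circuits get deeper.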

4. Write QASM 3 for a Bell measurement in the X basis

Write QASM 3 that prepares $|\Phi^+\rangle$ and measures both qubits in the X-basis (applying H before measurement).

Answer:

OPENQASM 3.0;
include "stdgates.inc";

qubit[2] q;
bit[2]   c;

h q[0];
cx q[0], q[1];
h q[0];
h q[1];
c = measure q;

Expected result on an ideal simulator: 00 or 11 with 50/50 probability (Bell state is correlated in the X basis too).
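The X-basis correlation follows from a short change of basis. A sketch, substituting $|0\rangle = (|+\rangle + |-\rangle)/\sqrt{2}$ and $|1\rangle = (|+\rangle - |-\rangle)/\sqrt{2}$:

```latex
\begin{aligned}
|\Phi^+\rangle
  &= \tfrac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr) \\
  &= \tfrac{1}{2\sqrt{2}}\bigl[(|+\rangle + |-\rangle)(|+\rangle + |-\rangle)
     + (|+\rangle - |-\rangle)(|+\rangle - |-\rangle)\bigr] \\
  &= \tfrac{1}{\sqrt{2}}\bigl(|{+}{+}\rangle + |{-}{-}\rangle\bigr)
\end{aligned}
```

The cross terms cancel, so only $|{+}{+}\rangle$ and $|{-}{-}\rangle$ remain. The final Hadamards map $|+\rangle \to |0\rangle$ and $|-\rangle \to |1\rangle$, which is why the measured strings are 00 and 11.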

What you should take away

OpenQASM 3 is the vendor-neutral assembly language. Read it. Write it when portability matters.
Transpilation is non-negotiable. Always transpile against the specific backend at optimization_level=3.
Calibration drifts. Check $T_1$, $T_2$, and readout error before every real-hardware run.
Simulator first, hardware second, every time. This will save you dozens of hours over your first year.
Measurement-error mitigation and dynamical decoupling are nearly free wins; use both.
The 7-point pre-run checklist is the difference between useful experiments and noise harvesting.

That closes the Gates & Circuits track. You now have enough infrastructure to implement real algorithms. Next up in the Algorithms track: Deutsch-Jozsa (the original quantum speedup), Bernstein-Vazirani, Grover’s search, and the Quantum Fourier Transform — each derived from scratch with runnable code on both simulators and real hardware.
