Quantum Outpost
quantum ml advanced · 23 min read

Is QML Worth It? A Skeptic's Benchmark

Most published QML results test against toy baselines that serious classical ML would demolish. This tutorial runs a bake-off — variational QML, quantum kernels, XGBoost, and a small MLP — on real tabular data, surveys the “dequantization” results that have walked claimed quantum advantages back, and gives an honest recommendation on when to reach for QML and when not to.

Prerequisites: Tutorial 16: Quantum Kernels and Feature Maps

Every quantum-computing content site owes its readers one honest post about QML. This is ours. The pattern in published QML papers is depressingly consistent: a quantum algorithm matches or slightly beats a weak classical baseline (logistic regression, shallow MLP), the paper claims “potential quantum advantage,” and a year later someone benchmarks the same problem against XGBoost and the quantum edge vanishes.

That doesn’t mean QML is useless. It means you need to know the failure modes cold before deploying it, know the success stories that are actually holding up, and have a clear decision rule for when QML is worth the complexity. This tutorial is that reckoning.

The benchmark: QML vs XGBoost on Wisconsin Breast Cancer

The Wisconsin Breast Cancer Diagnostic dataset (569 samples, 30 features, binary classification) is a standard benchmark. A tuned XGBoost gets ~98% test accuracy. Most published QML on this dataset reports ~90-95% and calls it competitive. Let’s actually run it.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
import xgboost as xgb
import pennylane as qml
import pennylane.numpy as pnp

# --- Data ---
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# For classical: standard preprocessing
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# For QML: PCA to 4 features (matches near-term QML qubit budget), scaled to [0, π]
pca = PCA(n_components=4, random_state=0).fit(X_train_s)
q_scaler = MinMaxScaler((0, np.pi)).fit(pca.transform(X_train_s))
X_train_q = q_scaler.transform(pca.transform(X_train_s))
X_test_q = q_scaler.transform(pca.transform(X_test_s))

results = {}

# --- Classical baselines ---
results["LogReg"] = LogisticRegression(max_iter=2000).fit(X_train_s, y_train).score(X_test_s, y_test)
results["RBF SVM"] = SVC(kernel="rbf", C=10, gamma="scale").fit(X_train_s, y_train).score(X_test_s, y_test)
results["MLP (64,32)"] = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0).fit(X_train_s, y_train).score(X_test_s, y_test)
results["XGBoost"] = xgb.XGBClassifier(n_estimators=300, max_depth=5, learning_rate=0.05, eval_metric="logloss").fit(X_train_s, y_train).score(X_test_s, y_test)

# --- Quantum variational classifier (PennyLane) ---
n_qubits = 4
n_layers = 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="autograd")
def circuit(x, weights):
    for layer in range(n_layers):
        qml.AngleEmbedding(x, wires=range(n_qubits), rotation="Y")
        qml.StronglyEntanglingLayers(weights[layer], wires=range(n_qubits))  # weights[layer] has shape (1, n_qubits, 3)
    return qml.expval(qml.PauliZ(0))

def predict_raw(x, weights): return circuit(x, weights)

def qml_loss(weights, X, y):
    preds = [predict_raw(x, weights) for x in X]
    return pnp.mean(pnp.stack([(p - (2 * yi - 1)) ** 2 for p, yi in zip(preds, y)]))

np.random.seed(0)
weights = 0.01 * pnp.array(np.random.randn(n_layers, 1, n_qubits, 3), requires_grad=True)
opt = qml.AdamOptimizer(stepsize=0.1)
for epoch in range(25):
    idx = np.random.choice(len(X_train_q), size=64, replace=False)
    weights, _ = opt.step_and_cost(lambda w: qml_loss(w, X_train_q[idx], y_train[idx]), weights)
qml_preds = np.array([np.sign(predict_raw(x, weights)) for x in X_test_q])
qml_preds = (qml_preds + 1) // 2
results["VQC (PCA→4 qubits)"] = np.mean(qml_preds == y_test)

# --- Quantum kernel SVM (compact version) ---
@qml.qnode(dev)
def kernel_circuit(x, y):
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation="Y")
    qml.adjoint(qml.AngleEmbedding)(y, wires=range(n_qubits), rotation="Y")
    return qml.probs(wires=range(n_qubits))

def qkernel_matrix(X1, X2):
    K = np.zeros((len(X1), len(X2)))
    for i, x in enumerate(X1):
        for j, y in enumerate(X2):
            K[i, j] = kernel_circuit(x, y)[0]        # prob of all zeros = |⟨φ(y)|φ(x)⟩|²
    return K

K_tr = qkernel_matrix(X_train_q, X_train_q)
K_te = qkernel_matrix(X_test_q,  X_train_q)
qsvc = SVC(kernel="precomputed", C=5).fit(K_tr, y_train)
results["Quantum kernel SVM"] = qsvc.score(K_te, y_test)

# --- Report ---
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"  {name:26s}  {acc:.4f}")

Realistic output:

  XGBoost                      0.9766
  RBF SVM                      0.9649
  MLP (64,32)                  0.9649
  LogReg                       0.9591
  VQC (PCA→4 qubits)           0.9298
  Quantum kernel SVM           0.9181

XGBoost wins by 4-5 percentage points. The QML methods are dragged down by the PCA step — compressing 30 features into 4 components to fit the qubit budget discards most of the signal. Even logistic regression on the full 30 features beats the quantum models.

This is the honest pattern across dozens of similar benchmarks. The PCA bottleneck does most of the damage: any quantum advantage would first have to overcome that information loss before it had anything left to offer.
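One way to see the bottleneck concretely is to check how much variance the four retained components actually explain, reusing the same split and scaling as the benchmark above (the exact percentage depends on the split, so treat the number as illustrative):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, _, y_train, _ = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
X_train_s = StandardScaler().fit_transform(X_train)

# Fit the same 4-component PCA used for the quantum models and ask
# what fraction of the (standardized) variance survives it.
pca = PCA(n_components=4, random_state=0).fit(X_train_s)
kept = pca.explained_variance_ratio_.sum()
print(f"variance kept by 4 components: {kept:.1%}")
```

Whatever the exact figure, everything outside those four components is invisible to the quantum models, while XGBoost sees all 30 features.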

Dequantization: when quantum advantages go away

The second big humility lesson in QML history is dequantization. In 2018, Ewin Tang (then an undergrad) constructed a classical algorithm that matched the Kerenidis-Prakash quantum recommendation algorithm — which had been held up as a flagship “exponential quantum speedup” for ML tasks. No quantum hardware, no quantum anything. Pure classical algorithm, same polylog complexity.

This sparked a wave of quantum-inspired classical algorithms that matched (within polylog factors) the best known quantum algorithms for:

  • Recommendation systems (Tang 2018)
  • Low-rank regression (Gilyén-Lloyd-Tang 2018)
  • Principal component analysis (Tang 2021)
  • Semidefinite programming (Gilyén et al. 2019)
  • Support vector machines (Li-Chakrabarti-Wu 2019)

The common thread: these quantum algorithms assumed “quantum-like sampling access” — the ability to sample data entries with probabilities proportional to their squared magnitudes. Tang’s insight was that this access pattern can be implemented classically with simple data structures. Once the classical algorithm is granted the matching sampling primitive, the quantum advantage evaporates.
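To make “sampling access” concrete, here is a minimal sketch (an illustration, not code from any of the papers above) of the classical primitive the dequantized algorithms assume: sampling row indices of a matrix with probability proportional to squared row norms, which a precomputed cumulative sum supports in logarithmic time per query.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))

# ℓ²-norm ("length-squared") sampling: pick row i with probability
# ||A_i||² / ||A||_F² — the access pattern the dequantized algorithms assume.
row_norms_sq = (A ** 2).sum(axis=1)
probs = row_norms_sq / row_norms_sq.sum()

# A precomputed cumulative sum makes each sample one binary search,
# i.e. logarithmic time — no quantum state preparation required.
cum = np.cumsum(probs)
cum[-1] = 1.0  # guard against floating-point undershoot

def sample_row():
    return int(np.searchsorted(cum, rng.random()))

samples = [sample_row() for _ in range(20_000)]
emp = np.bincount(samples, minlength=len(probs)) / len(samples)
print("empirical vs target L1 gap:", float(np.abs(emp - probs).sum()))
```

In the real algorithms this structure is maintained over the input data itself, which is exactly the preprocessing the quantum versions also quietly assumed.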

What’s left of QML exponential advantages?

  • Feature maps with provable classical hardness (Liu-Arunachalam-Temme 2021, built on the discrete logarithm). Narrow, carefully constructed problems; not arbitrary real datasets.
  • Learning quantum states / quantum data. If your data is a quantum system, classical representation is exponentially expensive, and quantum methods can process it directly.
  • Shor-based algorithms. Learning cryptographic functions with hidden structure (period-finding). Not “ML” in the Netflix/ImageNet sense.

For everything else — tabular, image, text, time-series — no known exponential quantum advantage survives a careful analysis.

Polynomial speedups still on the table

Not all hope is gone. Grover-type quadratic speedups for learning subroutines are genuine and don’t dequantize:

  • Quantum Monte Carlo acceleration for loss estimation (Montanaro 2015). Quadratic speedup on estimating the expected loss E[L(θ)] during training.
  • Quantum perceptron with a quadratic speedup in training complexity (Wiebe-Kapoor-Svore 2016).
  • Clustering (Aïmeur-Brassard-Gambs 2013).

These are all quadratic, not exponential. The Grover factor is real, but you need fault-tolerant hardware to see it — NISQ devices don’t support circuits long enough to amortize the overhead. So: real, but not today.
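A back-of-envelope sketch of why “real, but not today”: a quadratic speedup replaces ~N classical loss evaluations with ~√N quantum ones, but each quantum evaluation carries a large constant overhead (error correction, slow logical clock). The overhead figure below is a placeholder assumption, not a measured number.

```python
# Illustrative assumption, not a measured figure: the ratio between the cost
# of one fault-tolerant quantum loss-estimate and one classical sample.
overhead = 1e6

# Classical cost scales like N evaluations; quantum like overhead * sqrt(N).
# Setting N == overhead * sqrt(N) gives the crossover at N = overhead².
crossover_N = overhead ** 2
print(f"quantum wins only for N > {crossover_N:.0e}")
```

The crossover grows with the *square* of the overhead ratio, which is why quadratic speedups are so fragile in practice: every order of magnitude of quantum overhead pushes the break-even point out by two.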

When is QML worth trying?

Honest decision tree for 2026:

  1. Is your data quantum? (Molecular ground states, quantum-sensor readouts, quantum-channel characterization.) → Yes, use QML. Classical representation is exponentially expensive; quantum methods save that. Reach for PennyLane + the quantum-native variational algorithms.

  2. Do you have a PQC training need on today’s NISQ hardware? (E.g., VQE on a real molecule, QAOA on a structured optimization.) → Yes, use QML methods — but for the problem domain, not as a classical-ML replacement. The quantum structure matches the problem, not generic vectors.

  3. Do you have classical tabular / image / text data? → Use classical ML. Gradient-boosted trees like XGBoost and transformer models beat QML on every public benchmark. Revisit in 2030+ when fault-tolerant hardware exists and new algorithms have been proven.

  4. Are you researching QML? → Use QML — it’s a live research area with open questions. But benchmark against XGBoost, not logistic regression.

  5. Are you pitching a client or investor? → Be careful with QML. The cost of a lawsuit over an oversold “quantum advantage” claim is higher than the revenue from the project.

The near-term outlook

Three things that could change the honest assessment over 2026-2030:

  1. Surface-code fault tolerance. When logical-qubit counts reach ~100 and logical error rates are low enough for deep circuits, Grover-based training subroutines become practical. Quadratic speedups on loss estimation would matter for large models.

  2. Quantum generative models. Early results (Born machines, quantum GANs) on specific density-estimation tasks suggest quantum expressivity advantages. Not dequantized yet. Watch this space.

  3. Hybrid pipelines where quantum and classical components each do what they’re best at. This is the most boring and most likely path — no drop-in quantum replacement for classical ML, but integration at specific subroutines where the quantum primitive has a proven edge.

The single sentence to remember

As of April 2026, on classical data, no QML method has convincingly beaten a well-tuned classical baseline on a real task. On quantum data, QML is the right tool. On classical, it’s research infrastructure waiting for a use case.

Exercises

1. Replicate the benchmark

Run the code above on the Wisconsin Breast Cancer dataset. Does your run reproduce the ~98% XGBoost vs ~93% QML gap? What happens with a deeper VQC (10 layers)?

Expected

Typical result: XGBoost wins by 4-6 points. Deeper VQC hits barren plateaus; training stalls; accuracy doesn’t improve or gets worse.

2. Find a dataset where QML wins

Construct (or find) a small tabular dataset where quantum kernel SVM beats RBF SVM by at least 3 percentage points. Can you do it with standard benchmark datasets, or do you have to synthesize one?

Show hint

Parity-structured datasets (where the label depends on XOR of subsets of features) are known to be easier for ZZ feature maps than RBF. sklearn.datasets.make_classification(hypercube=True, n_informative=5, class_sep=0.5, ...) sometimes gets you there. But this is a construction, not a natural finding.
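The parity construction in the hint can be sketched directly (the feature count, parity width, and median thresholds here are arbitrary illustrative choices): binarize each feature at its median, then label each point by the XOR of the first few bits.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 400, 6, 3  # samples, features, parity over the first k binarized features

X = rng.standard_normal((n, d))

# Binarize at the per-feature median, then label by the XOR (sum mod 2)
# of the first k bits — a parity target that RBF kernels handle poorly.
bits = (X > np.median(X, axis=0)).astype(int)
y = bits[:, :k].sum(axis=1) % 2
print(X.shape, round(float(y.mean()), 2))
```

Feed X and y into the quantum-kernel pipeline above and compare against an RBF SVM. That the gap only shows up on datasets engineered this way is the point of the exercise.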

3. Dequantize an advantage

Pick one QML paper’s claimed advantage. Read it critically. Does it benchmark against XGBoost or equivalent? Does the claimed advantage hold as n grows? If you could run a classical algorithm with “sampling access” matching the paper’s quantum assumptions, would the advantage persist?

Suggested reading
  • Biamonte et al. (Nature 2017) — the big overview of QML; read it and critique each claim.
  • Ewin Tang’s blog post “The theoretical vs. experimental gap for quantum machine learning” (2020).
  • Schuld et al. (2021) — “Is quantum advantage the right goal for QML?” — a careful take on the question.

4. Design a responsible QML feature proposal

You’re pitching a QML feature to a product manager. Write the honest one-paragraph pitch: what it does, what it promises, what’s unproven. Don’t oversell; do position.

Skeleton

“Quantum kernel methods are a new family of similarity measures for classification. For data with very specific structure (high-dimensional correlations classical kernels don’t capture efficiently), they could provide better accuracy than SVM-RBF. Current limitations: runtime is O(N²) quantum circuit evaluations per training pass, and no convincing demonstration has been made on general tabular data. We recommend a 3-month exploratory trial on our dataset, benchmarked honestly against XGBoost, with go/no-go criteria agreed upfront.”

What you should take away

  • On classical data, XGBoost usually wins. The QML edge on public benchmarks almost always disappears when you compare to a well-tuned classical baseline.
  • Dequantization is real. Several flagship quantum ML advantages have classical analogs with the same complexity.
  • Quadratic speedups survive but need fault-tolerant hardware to realize.
  • Use QML for quantum data and quantum-structured problems (VQE, QAOA); don’t use it as a general-purpose classical-ML replacement.
  • Demand honest baselines. A paper that benchmarks QML against logistic regression is not providing useful evidence.

This closes the Variational + Quantum ML track. Four tracks down. The remaining material — error correction, hardware comparison, post-quantum cryptography — will be iteration 5’s territory, and PQC is where the most honest short-term indie money lives.

