Statistics Utilities Reference

pytest-quantum ships statistical primitives that underpin its assertions and help you choose the right shot count for reliable tests. All functions are pure numpy/scipy — no quantum SDK is required.

Import from the top-level package:

from pytest_quantum import (
    min_shots,
    recommended_shots,
    fidelity,
    tvd,
    tvd_from_counts,
    chi_square_test,
)

Shot-count calculators

`min_shots`

min_shots(
    epsilon,
    alpha=0.05,
    power=0.80,
) -> int

Returns the minimum number of shots needed to reliably detect a Total Variation Distance (TVD) of epsilon between two distributions.

The formula

Based on two-sample statistical power analysis:

$$ N = \left\lceil \frac{(z_{1-\alpha/2} + z_{\text{power}})^2}{2\varepsilon^2} \right\rceil $$

where $z_p$ is the $p$-th quantile of the standard normal distribution.

With default settings ($\alpha = 0.05$, $\text{power} = 0.80$): $z_{0.975} \approx 1.96$, $z_{0.80} \approx 0.84$.
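As a rough check of the arithmetic, the formula can be transcribed directly using only the Python standard library. This is a sketch of the printed formula, not the library's implementation: `min_shots_from_formula` is a hypothetical name, and its results are not guaranteed to match what `min_shots` returns.

```python
import math
from statistics import NormalDist


def min_shots_from_formula(epsilon, alpha=0.05, power=0.80):
    """Evaluate N = ceil((z_{1-alpha/2} + z_power)^2 / (2 * epsilon^2)).

    Hypothetical helper: a literal transcription of the formula above.
    """
    for name, value in (("epsilon", epsilon), ("alpha", alpha), ("power", power)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie in the open interval (0, 1)")

    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)  # about 1.96 by default
    z_power = standard_normal.inv_cdf(power)              # about 0.84 by default

    return math.ceil((z_alpha + z_power) ** 2 / (2.0 * epsilon ** 2))
```

Because epsilon enters as $1/\varepsilon^2$, halving epsilon roughly quadruples the result.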

Parameters:

- `epsilon`: Minimum detectable TVD. A value of 0.01 means the test can reliably catch a 1% deviation from the expected distribution.
- `alpha`: Significance level (default 0.05, i.e. 95% confidence).
- `power`: Statistical power, the probability of detecting a real error (default 0.80, i.e. 80% power).

Returns: Minimum recommended shot count as an integer.

Raises: ValueError if any argument lies outside the open interval (0, 1).

Worked examples

from pytest_quantum import min_shots

min_shots(0.10)                        # 74   — catch 10% TVD, 95% CI, 80% power
min_shots(0.05)                        # 293  — catch 5% TVD
min_shots(0.01)                        # 7299 — catch 1% TVD
min_shots(0.01, alpha=0.01, power=0.90)  # 11282 — stricter: 99% CI, 90% power

Using in a test

import pytest
from pytest_quantum import assert_measurement_distribution, min_shots

@pytest.mark.quantum
def test_bell_5pct_sensitivity(aer_simulator):
    from qiskit import QuantumCircuit, transpile

    shots = min_shots(epsilon=0.05)   # 293 shots

    qc = QuantumCircuit(2)
    qc.h(0); qc.cx(0, 1); qc.measure_all()
    counts = aer_simulator.run(
        transpile(qc, aer_simulator), shots=shots
    ).result().get_counts()

    assert_measurement_distribution(counts, {"00": 0.5, "11": 0.5})

Choosing epsilon

| Use case | Recommended epsilon |
| --- | --- |
| Smoke test: just check the circuit runs | `0.10` (74 shots) |
| Normal regression test | `0.05` (293 shots) |
| Precise distribution validation | `0.02` (1825 shots) |
| High-precision scientific result | `0.01` (7299 shots) |

Mark high-shot tests with @pytest.mark.quantum_slow and run them with --quantum-slow to keep the default suite fast.

`recommended_shots`

recommended_shots(
    expected_probs,
    min_expected_per_bucket=5,
) -> int

Returns the shot count needed so every non-zero bucket in expected_probs gets at least min_expected_per_bucket expected counts.

The chi-square goodness-of-fit test (used by assert_measurement_distribution) requires expected count ≥ 5 per cell to produce valid p-values. Violating this gives unreliable results and triggers a UserWarning.

recommended_shots targets the rarest outcome: if the rarest outcome has probability $p_{\min}$, you need at least $\lceil k / p_{\min} \rceil$ shots (where $k$ is min_expected_per_bucket).
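The rule above fits in a few lines. The sketch below assumes the calculation is exactly $\lceil k / p_{\min} \rceil$ over the non-zero probabilities; `shots_for_rarest_outcome` is a hypothetical name, and the library function may perform additional validation.

```python
import math


def shots_for_rarest_outcome(expected_probs, min_expected_per_bucket=5):
    """Smallest shot count giving every non-zero outcome at least
    min_expected_per_bucket expected counts (hypothetical helper)."""
    nonzero = [p for p in expected_probs.values() if p > 0]
    if not nonzero:
        raise ValueError("expected_probs is empty or all probabilities are zero")

    p_min = min(nonzero)  # the rarest outcome sets the requirement
    return math.ceil(min_expected_per_bucket / p_min)


bell = shots_for_rarest_outcome({"00": 0.5, "11": 0.5})
uniform_3q = shots_for_rarest_outcome({f"{i:03b}": 1 / 8 for i in range(8)})
```

Here `bell` is 10 and `uniform_3q` is 40, matching $5 / 0.5$ and $5 / 0.125$.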

Parameters:

- `expected_probs`: Dict mapping outcome labels to probabilities. Must sum to 1. Zero-probability outcomes are ignored.
- `min_expected_per_bucket`: Minimum expected count per non-zero bucket (default 5).

Returns: Recommended shot count as an integer.

Raises: ValueError if expected_probs is empty or all probabilities are zero.

Examples

from pytest_quantum import recommended_shots

# Uniform Bell state — rarest outcome has probability 0.5
recommended_shots({"00": 0.5, "11": 0.5})           # 10

# Mostly-uniform, but one rare outcome at 0.1%
recommended_shots({"00": 0.499, "01": 0.001, "11": 0.5})  # 5000

# 3-qubit uniform (min_prob = 1/8)
recommended_shots({f"{i:03b}": 1/8 for i in range(8)})    # 40

Using in a test

from pytest_quantum import assert_measurement_distribution, recommended_shots

def test_ghz_distribution(aer_simulator):
    from qiskit import QuantumCircuit, transpile

    expected = {"000": 0.5, "111": 0.5}
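    # 10 shots satisfy the chi-square rule, but the run below uses at least
    # 500 because so few shots give little power to detect errors.
    shots = recommended_shots(expected)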

    qc = QuantumCircuit(3)
    qc.h(0); qc.cx(0, 1); qc.cx(1, 2); qc.measure_all()
    counts = aer_simulator.run(
        transpile(qc, aer_simulator), shots=max(shots, 500)
    ).result().get_counts()

    assert_measurement_distribution(counts, expected)

Note

recommended_shots guarantees chi-square validity but may return fewer shots than min_shots for detection power. For production tests combine both: use max(recommended_shots(probs), min_shots(epsilon=0.05)) to satisfy both constraints.

Statistical primitives

`fidelity`

fidelity(
    psi,
    phi,
) -> float

Computes the pure-state fidelity $F = |\langle\psi|\phi\rangle|^2$.

Both arrays are flattened and normalised before computation, so minor normalisation errors from simulators do not affect the result.

Returns: Float in [0.0, 1.0]. 1.0 means identical states (up to global phase). 0.0 means orthogonal states.

Raises: ValueError if the arrays differ in size or either has zero norm.
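To make the normalisation step concrete, here is a minimal standard-library sketch of the same quantity for flat sequences of amplitudes. It is an illustration only: `fidelity_sketch` is a hypothetical name, it skips the flattening of multi-dimensional arrays, and it is not the library's numpy implementation.

```python
import math


def fidelity_sketch(psi, phi):
    """Pure-state fidelity |<psi|phi>|^2 for flat amplitude sequences."""
    psi = [complex(a) for a in psi]
    phi = [complex(b) for b in phi]
    if len(psi) != len(phi):
        raise ValueError("state vectors must have the same size")

    norm_psi = math.sqrt(sum(abs(a) ** 2 for a in psi))
    norm_phi = math.sqrt(sum(abs(b) ** 2 for b in phi))
    if norm_psi == 0.0 or norm_phi == 0.0:
        raise ValueError("state vectors must have non-zero norm")

    # Inner product <psi|phi>, divided by both norms so that
    # unnormalised inputs give the same answer as normalised ones.
    overlap = sum(a.conjugate() * b for a, b in zip(psi, phi))
    overlap /= norm_psi * norm_phi

    # Clamp to 1.0 to absorb floating-point rounding.
    return min(1.0, abs(overlap) ** 2)


unnormalised_plus = fidelity_sketch([1, 0], [1, 1])  # |+> given without 1/sqrt(2)
```

Because the overlap is divided by both norms, `unnormalised_plus` comes out at approximately 0.5 even though `[1, 1]` is not normalised.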

import numpy as np
from pytest_quantum import fidelity

zero = np.array([1, 0], dtype=complex)
one  = np.array([0, 1], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

fidelity(zero, zero)    # 1.0 — identical
fidelity(zero, one)     # 0.0 — orthogonal
fidelity(zero, plus)    # 0.5 — |<0|+>|² = 0.5
fidelity(plus, plus)    # 1.0 — identical

Global phase invariance:

psi   = np.array([1, 0], dtype=complex)
psi_j = 1j * np.array([1, 0], dtype=complex)   # global phase i·|0>

fidelity(psi, psi_j)    # 1.0 — global phase is invisible

`tvd`

tvd(
    p,
    q,
) -> float

Computes the Total Variation Distance between two probability distributions:

$$ \text{TVD}(p, q) = \frac{1}{2} \sum_x |p(x) - q(x)| $$

Parameters: p, q — 1-D numpy arrays of probabilities (each sums to 1).

Returns: Float in [0.0, 1.0]. 0.0 means identical; 1.0 means disjoint support.

import numpy as np
from pytest_quantum import tvd

# Identical distributions
tvd(np.array([0.5, 0.5]), np.array([0.5, 0.5]))    # 0.0

# Small deviation
tvd(np.array([0.5, 0.5]), np.array([0.6, 0.4]))    # 0.1

# Disjoint distributions (no shared support)
tvd(np.array([1.0, 0.0]), np.array([0.0, 1.0]))    # 1.0
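
Worked by hand from the definition, the small-deviation case with $p = (0.5, 0.5)$ and $q = (0.6, 0.4)$ gives:

$$ \text{TVD}(p, q) = \frac{1}{2}\left(|0.5 - 0.6| + |0.5 - 0.4|\right) = \frac{1}{2}(0.1 + 0.1) = 0.1 $$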

Interpreting TVD values:

| TVD | Interpretation |
| --- | --- |
| `0.0` | Identical distributions |
| `< 0.05` | Very close: acceptable for most tests |
| `0.05 – 0.15` | Noticeable deviation: may indicate noise or error |
| `> 0.15` | Significant: likely a bug or misconfiguration |
| `1.0` | Completely disjoint: certain error |

`tvd_from_counts`

tvd_from_counts(
    counts_a,
    counts_b,
) -> float

Computes TVD between two shot-count dictionaries. Each dict is normalised to a probability distribution before TVD is calculated. Outcomes present in one dict but absent in the other are treated as having count 0.

Parameters:

- `counts_a`: First counts dict, e.g. {"00": 489, "11": 511}.
- `counts_b`: Second counts dict, e.g. {"00": 501, "11": 499}.

Returns: Float in [0.0, 1.0].

Raises: ValueError if either dict is empty.
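The normalise-then-compare behaviour described above can be sketched with plain dicts. This is an assumption-laden illustration, not the library's code: `tvd_from_counts_sketch` is a hypothetical name, and rejecting dicts whose counts sum to zero is a choice made here.

```python
def tvd_from_counts_sketch(counts_a, counts_b):
    """TVD between two counts dicts, treating missing outcomes as count 0."""
    if not counts_a or not counts_b:
        raise ValueError("both counts dicts must be non-empty")

    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("counts must not sum to zero")

    # Compare over the union of outcomes so a label seen by only one
    # side still contributes its full probability to the distance.
    outcomes = set(counts_a) | set(counts_b)
    return 0.5 * sum(
        abs(counts_a.get(label, 0) / total_a - counts_b.get(label, 0) / total_b)
        for label in outcomes
    )


distance = tvd_from_counts_sketch(
    {"00": 500, "11": 500},
    {"00": 450, "01": 50, "11": 500},
)
```

In this case `distance` is approximately 0.05: the 5% of probability that moved from "00" to "01" is counted once, not twice, because of the factor of one half.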

from pytest_quantum import tvd_from_counts

# Nearly identical Bell distributions
tvd_from_counts(
    {"00": 489, "11": 511},
    {"00": 501, "11": 499},
)
# → 0.012

# One backend sees "01" where the other sees nothing
tvd_from_counts(
    {"00": 500, "11": 500},
    {"00": 450, "01": 50, "11": 500},
)
# → 0.05

Using tvd_from_counts directly (instead of assert_counts_close):

from pytest_quantum import tvd_from_counts

def test_backend_drift(aer_simulator):
    """Fail if two back-to-back runs of the same circuit differ by more than 3% TVD."""
    from qiskit import QuantumCircuit, transpile

    qc = QuantumCircuit(2)
    qc.h(0); qc.cx(0, 1); qc.measure_all()
    qc_t = transpile(qc, aer_simulator)

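    # With 2000 shots per run, sampling noise alone pushes the TVD of an ideal
    # 50/50 distribution past 0.03 in roughly 6% of comparisons, so use more.
    run1 = aer_simulator.run(qc_t, shots=20_000).result().get_counts()
    run2 = aer_simulator.run(qc_t, shots=20_000).result().get_counts()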

    distance = tvd_from_counts(run1, run2)
    assert distance < 0.03, f"Backend drift too large: TVD = {distance:.4f}"

`chi_square_test`

chi_square_test(
    observed,
    expected_probs,
    total_shots=None,
) -> tuple[float, float]

Chi-square goodness-of-fit test for quantum measurement distributions. Tests whether observed counts are consistent with expected_probs.

This is the statistical engine behind assert_measurement_distribution. Use it directly when you need the raw p-value or chi-square statistic.

Parameters:

- `observed`: Either a count dict such as {"00": 489, "11": 511} or a 1-D numpy array of observed counts.
- `expected_probs`: Either a probability dict such as {"00": 0.5, "11": 0.5} (must sum to 1) or a 1-D numpy array of expected probabilities.
- `total_shots`: Required when both inputs are numpy arrays. Ignored for dict inputs, where the total is inferred from observed.

Returns: (statistic, pvalue) — the chi-square statistic and the p-value.

Reject the null hypothesis (that is, declare the observed counts inconsistent with expected_probs) when pvalue falls below your chosen significance level, for example 0.05.

Raises: ValueError — Inconsistent inputs (mismatched keys, missing total_shots for array inputs, observed counts summing to zero).
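To show what the statistic measures, the sketch below computes the Pearson chi-square value by hand for the simplest case of exactly two outcomes, where the p-value has a closed form. It is illustrative only: `chi_square_two_outcomes` is a hypothetical name, and the restriction to two outcomes exists just to stay within the standard library.

```python
import math


def chi_square_two_outcomes(observed, expected_probs):
    """Pearson chi-square statistic and p-value for two outcomes (df = 1)."""
    if set(observed) != set(expected_probs) or len(expected_probs) != 2:
        raise ValueError("expected the same two outcome labels in both dicts")
    if any(prob <= 0 for prob in expected_probs.values()):
        raise ValueError("both expected probabilities must be positive")

    total = sum(observed.values())
    if total == 0:
        raise ValueError("observed counts sum to zero")

    # Sum of (observed - expected)^2 / expected over the outcomes.
    statistic = 0.0
    for label, prob in expected_probs.items():
        expected = prob * total
        statistic += (observed[label] - expected) ** 2 / expected

    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


stat, p = chi_square_two_outcomes(
    {"00": 495, "11": 505},
    {"00": 0.5, "11": 0.5},
)
```

Here `stat` is 0.1 and `p` is about 0.75. For more than two outcomes the p-value needs the general chi-square survival function, for example `scipy.stats.chi2.sf(statistic, df)`.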

Example — dict inputs

from pytest_quantum import chi_square_test

# 1000 shots on a Bell circuit — should give 50/50
stat, p = chi_square_test(
    observed={"00": 495, "11": 505},
    expected_probs={"00": 0.5, "11": 0.5},
)
print(f"χ² = {stat:.4f},  p = {p:.4f}")
# χ² = 0.1000,  p = 0.7518   → consistent

# Biased circuit — clearly wrong distribution
stat, p = chi_square_test(
    observed={"00": 800, "11": 200},
    expected_probs={"00": 0.5, "11": 0.5},
)
print(f"χ² = {stat:.4f},  p = {p:.6f}")
# χ² = 360.0000,  p = 0.000000   → reject null hypothesis

Example — numpy array inputs

import numpy as np
from pytest_quantum import chi_square_test

observed_counts  = np.array([245, 255, 248, 252])    # 4-outcome uniform
expected_uniform = np.array([0.25, 0.25, 0.25, 0.25])

stat, p = chi_square_test(
    observed=observed_counts,
    expected_probs=expected_uniform,
    total_shots=1000,
)
assert p > 0.05   # consistent with uniform distribution

Interpreting p-values

| p-value | Interpretation |
| --- | --- |
| `> 0.05` | Consistent with expected distribution: pass |
| `0.01 – 0.05` | Marginal: consider more shots |
| `0.001 – 0.01` | Significant deviation: likely a bug |
| `< 0.001` | Strong evidence of error |

Degrees of freedom

The chi-square test has k - 1 degrees of freedom, where k is the number of outcome buckets with non-zero expected probability. Outcomes that appear in the counts but have zero expected probability do not add degrees of freedom.

The test requires expected count ≥ 5 per cell. Use recommended_shots to compute the shot count that satisfies this for your distribution.
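A quick way to see whether a planned shot count meets this rule is to list the cells that fall short. The helper below is a hypothetical sketch (`low_expected_cells` is not part of the library) that assumes the expected count of an outcome is simply its probability times the number of shots.

```python
def low_expected_cells(expected_probs, shots, min_expected=5):
    """Map each non-zero outcome whose expected count is below
    min_expected to that expected count."""
    return {
        label: prob * shots
        for label, prob in expected_probs.items()
        if prob > 0 and prob * shots < min_expected
    }


probs = {"00": 0.499, "01": 0.001, "11": 0.5}
too_few = low_expected_cells(probs, shots=1000)    # only "01" falls short
enough = low_expected_cells(probs, shots=10_000)   # no cell falls short
```

An empty result means every non-zero cell has an expected count of at least `min_expected` at that shot count.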
