# Statistics Utilities Reference pytest-quantum ships statistical primitives that underpin its assertions and help you choose the right shot count for reliable tests. All functions are pure numpy/scipy — no quantum SDK is required. Import from the top-level package: ```python from pytest_quantum import ( min_shots, recommended_shots, fidelity, tvd, tvd_from_counts, chi_square_test, ) ``` --- ## Shot-count calculators ### `min_shots` ```python min_shots( epsilon, alpha=0.05, power=0.80, ) -> int ``` Returns the minimum number of shots to reliably detect a Total Variation Distance of `epsilon` between two distributions. **The formula** Based on two-sample statistical power analysis: $$ N = \left\lceil \frac{(z_{1-\alpha/2} + z_{\text{power}})^2}{2\varepsilon^2} \right\rceil $$ where $z_p$ is the $p$-th quantile of the standard normal distribution. With default settings ($\alpha = 0.05$, $\text{power} = 0.80$): $z_{0.975} \approx 1.96$, $z_{0.80} \approx 0.84$. **Parameters** : `epsilon` — Minimum detectable TVD. `0.01` means the test can reliably catch a 1% deviation from the expected distribution. : `alpha` — Significance level (default `0.05` → 95% confidence). : `power` — Statistical power — the probability of detecting a real error (default `0.80` → 80% power). **Returns** : Minimum recommended shot count as an integer. **Raises** : `ValueError` — Any argument is outside its valid range `(0, 1)`. **Worked examples** ```python from pytest_quantum import min_shots min_shots(0.10) # 74 — catch 10% TVD, 95% CI, 80% power min_shots(0.05) # 293 — catch 5% TVD min_shots(0.01) # 7299 — catch 1% TVD min_shots(0.01, alpha=0.01, power=0.90) # 11282 — stricter: 99% CI, 90% power ``` **Using in a test** ```python import pytest from pytest_quantum import assert_measurement_distribution, min_shots @pytest.mark.quantum def test_bell_5pct_sensitivity(aer_simulator): from qiskit import QuantumCircuit, transpile shots = min_shots(epsilon=0.05) # 293 shots qc = QuantumCircuit(2) qc.h(0); qc.cx(0, 1); qc.measure_all() counts = aer_simulator.run( transpile(qc, aer_simulator), shots=shots ).result().get_counts() assert_measurement_distribution(counts, {"00": 0.5, "11": 0.5}) ``` **Choosing epsilon** | Use case | Recommended epsilon | |---|---| | Smoke test — just check the circuit runs | `0.10` (74 shots) | | Normal regression test | `0.05` (293 shots) | | Precise distribution validation | `0.02` (1 825 shots) | | High-precision scientific result | `0.01` (7 299 shots) | Mark high-shot tests with `@pytest.mark.quantum_slow` and run them with `--quantum-slow` to keep the default suite fast. --- ### `recommended_shots` ```python recommended_shots( expected_probs, min_expected_per_bucket=5, ) -> int ``` Returns the shot count needed so every non-zero bucket in `expected_probs` gets at least `min_expected_per_bucket` expected counts. The chi-square goodness-of-fit test (used by `assert_measurement_distribution`) requires expected count ≥ 5 per cell to produce valid p-values. Violating this gives unreliable results and triggers a `UserWarning`. `recommended_shots` targets the **rarest outcome**: if the rarest outcome has probability $p_{\min}$, you need at least $\lceil k / p_{\min} \rceil$ shots (where $k$ is `min_expected_per_bucket`). **Parameters** : `expected_probs` — Dict mapping outcome labels to probabilities. Must sum to 1. Zero-probability outcomes are ignored. : `min_expected_per_bucket` — Minimum expected count per non-zero bucket (default `5`). **Returns** : Recommended shot count as an integer. **Raises** : `ValueError` — `expected_probs` is empty or all probabilities are zero. **Examples** ```python from pytest_quantum import recommended_shots # Uniform Bell state — rarest outcome has probability 0.5 recommended_shots({"00": 0.5, "11": 0.5}) # 10 # Mostly-uniform, but one rare outcome at 0.1% recommended_shots({"00": 0.499, "01": 0.001, "11": 0.5}) # 5000 # 3-qubit uniform (min_prob = 1/8) recommended_shots({f"{i:03b}": 1/8 for i in range(8)}) # 40 ``` **Using in a test** ```python from pytest_quantum import assert_measurement_distribution, recommended_shots def test_ghz_distribution(aer_simulator): from qiskit import QuantumCircuit, transpile expected = {"000": 0.5, "111": 0.5} shots = recommended_shots(expected) # 10 — very cheap for uniform qc = QuantumCircuit(3) qc.h(0); qc.cx(0, 1); qc.cx(1, 2); qc.measure_all() counts = aer_simulator.run( transpile(qc, aer_simulator), shots=max(shots, 500) ).result().get_counts() assert_measurement_distribution(counts, expected) ``` :::{note} `recommended_shots` guarantees chi-square validity but may return fewer shots than `min_shots` for detection power. For production tests combine both: use `max(recommended_shots(probs), min_shots(epsilon=0.05))` to satisfy both constraints. ::: --- ## Statistical primitives ### `fidelity` ```python fidelity( psi, phi, ) -> float ``` Computes the pure-state fidelity $F = |\langle\psi|\phi\rangle|^2$. Both arrays are flattened and normalised before computation, so minor normalisation errors from simulators do not affect the result. **Returns:** Float in `[0.0, 1.0]`. `1.0` means identical states (up to global phase). `0.0` means orthogonal states. **Raises:** `ValueError` if arrays have different sizes or are zero-norm. ```python import numpy as np from pytest_quantum import fidelity zero = np.array([1, 0], dtype=complex) one = np.array([0, 1], dtype=complex) plus = np.array([1, 1], dtype=complex) / np.sqrt(2) fidelity(zero, zero) # 1.0 — identical fidelity(zero, one) # 0.0 — orthogonal fidelity(zero, plus) # 0.5 — |<0|+>|² = 0.5 fidelity(plus, plus) # 1.0 — identical ``` **Global phase invariance:** ```python psi = np.array([1, 0], dtype=complex) psi_j = 1j * np.array([1, 0], dtype=complex) # global phase i·|0> fidelity(psi, psi_j) # 1.0 — global phase is invisible ``` --- ### `tvd` ```python tvd( p, q, ) -> float ``` Computes the Total Variation Distance between two probability distributions: $$ \text{TVD}(p, q) = \frac{1}{2} \sum_x |p(x) - q(x)| $$ **Parameters:** `p`, `q` — 1-D numpy arrays of probabilities (each sums to 1). **Returns:** Float in `[0.0, 1.0]`. `0.0` means identical; `1.0` means disjoint support. ```python import numpy as np from pytest_quantum import tvd # Identical distributions tvd(np.array([0.5, 0.5]), np.array([0.5, 0.5])) # 0.0 # Small deviation tvd(np.array([0.5, 0.5]), np.array([0.6, 0.4])) # 0.1 # Orthogonal distributions tvd(np.array([1.0, 0.0]), np.array([0.0, 1.0])) # 1.0 ``` **Interpreting TVD values:** | TVD | Interpretation | |---|---| | `0.0` | Identical distributions | | `< 0.05` | Very close — acceptable for most tests | | `0.05 – 0.15` | Noticeable deviation — may indicate noise or error | | `> 0.15` | Significant — likely a bug or misconfiguration | | `1.0` | Completely disjoint — certain error | --- ### `tvd_from_counts` ```python tvd_from_counts( counts_a, counts_b, ) -> float ``` Computes TVD between two shot-count dictionaries. Each dict is normalised to a probability distribution before TVD is calculated. Outcomes present in one dict but absent in the other are treated as having count 0. **Parameters:** : `counts_a` — First counts dict, e.g. `{"00": 489, "11": 511}`. : `counts_b` — Second counts dict, e.g. `{"00": 501, "11": 499}`. **Returns:** Float in `[0.0, 1.0]`. **Raises:** `ValueError` if either dict is empty. ```python from pytest_quantum import tvd_from_counts # Nearly identical Bell distributions tvd_from_counts( {"00": 489, "11": 511}, {"00": 501, "11": 499}, ) # → 0.012 # One backend sees "01" where the other sees nothing tvd_from_counts( {"00": 500, "11": 500}, {"00": 450, "01": 50, "11": 500}, ) # → 0.05 ``` **Using `tvd_from_counts` directly (instead of `assert_counts_close`):** ```python from pytest_quantum import tvd_from_counts def test_backend_drift(aer_simulator): """Fail if backend results drift more than 3% TVD day-over-day.""" from qiskit import QuantumCircuit, transpile qc = QuantumCircuit(2) qc.h(0); qc.cx(0, 1); qc.measure_all() qc_t = transpile(qc, aer_simulator) run1 = aer_simulator.run(qc_t, shots=2000).result().get_counts() run2 = aer_simulator.run(qc_t, shots=2000).result().get_counts() distance = tvd_from_counts(run1, run2) assert distance < 0.03, f"Backend drift too large: TVD = {distance:.4f}" ``` --- ### `chi_square_test` ```python chi_square_test( observed, expected_probs, total_shots=None, ) -> tuple[float, float] ``` Chi-square goodness-of-fit test for quantum measurement distributions. Tests whether `observed` counts are consistent with `expected_probs`. This is the statistical engine behind `assert_measurement_distribution`. Use it directly when you need the raw p-value or chi-square statistic. **Parameters** : `observed` — Either a count dict `{"00": 489, "11": 511}` or a 1-D numpy array of observed counts. : `expected_probs` — Either a probability dict `{"00": 0.5, "11": 0.5}` (must sum to 1) or a 1-D numpy array of expected probabilities. : `total_shots` — Required when both inputs are numpy arrays. Ignored when dict inputs are used (total is inferred from `observed`). **Returns:** `(statistic, pvalue)` — the chi-square statistic and the p-value. Reject the null hypothesis (i.e., declare the distributions inconsistent) when `pvalue < significance`. **Raises:** `ValueError` — Inconsistent inputs (mismatched keys, missing `total_shots` for array inputs, observed counts summing to zero). **Example — dict inputs** ```python from pytest_quantum import chi_square_test # 1000 shots on a Bell circuit — should give 50/50 stat, p = chi_square_test( observed={"00": 495, "11": 505}, expected_probs={"00": 0.5, "11": 0.5}, ) print(f"χ² = {stat:.4f}, p = {p:.4f}") # χ² = 0.1000, p = 0.7518 → consistent # Biased circuit — clearly wrong distribution stat, p = chi_square_test( observed={"00": 800, "11": 200}, expected_probs={"00": 0.5, "11": 0.5}, ) print(f"χ² = {stat:.4f}, p = {p:.6f}") # χ² = 360.0000, p = 0.000000 → reject null hypothesis ``` **Example — numpy array inputs** ```python import numpy as np from pytest_quantum import chi_square_test observed_counts = np.array([245, 255, 248, 252]) # 4-outcome uniform expected_uniform = np.array([0.25, 0.25, 0.25, 0.25]) stat, p = chi_square_test( observed=observed_counts, expected_probs=expected_uniform, total_shots=1000, ) assert p > 0.05 # consistent with uniform distribution ``` **Interpreting p-values** | p-value | Interpretation | |---|---| | `> 0.05` | Consistent with expected distribution — pass | | `0.01 – 0.05` | Marginal — consider more shots | | `< 0.01` | Significant deviation — likely a bug | | `< 0.001` | Strong evidence of error | **Degrees of freedom** The chi-square test has `k - 1` degrees of freedom, where `k` is the number of non-zero expected outcome buckets. Adding outcomes with zero expected probability that appear in counts does not add degrees of freedom. The test requires expected count ≥ 5 per cell. Use `recommended_shots` to compute the shot count that satisfies this for your distribution.