osxQ Docs | Apple Silicon Quantum Simulator

osxQ Documentation

Designed for single‑machine research on Apple Silicon, osxQ combines a Python‑first workflow with MLX acceleration and unified memory to run mid‑scale state‑vector circuits — no CUDA required. Validation spans parity with PennyLane, Qulacs, and cuQuantum plus a QASMBench‑style OpenQASM 2.0 suite and a 200+ test corpus with circuit drawings (ASCII/Matplotlib/Quantikz). These docs cover install, the device‑style API, measurement modes (exact vs shot‑based), the reproducibility harness, scaling knobs, and curriculum‑ready examples.

Highlights

Python‑first • MLX‑accelerated
Unified memory (Apple Silicon)
OpenQASM 2.0 suite
Parity vs PennyLane/Qulacs/cuQuantum
Curriculum‑ready tests/tutorials

Overview

For Apple Silicon systems, there has been no comprehensive, CUDA‑free, Apple‑native simulator mapping community benchmarks and OpenQASM suites directly to the platform. osxQ closes this gap. It is a Python‑first, MLX‑accelerated, Apple‑Silicon‑native state‑vector/MPS simulator targeting Metal unified memory (128–512 GB) to run mid‑scale circuits on a single machine — without shader programming or host‑device copy overhead.
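To put the unified-memory figures in context, here is a back-of-the-envelope sketch of how many qubits a dense state vector can hold. It assumes 8 bytes per amplitude (complex64) or 16 bytes (complex128), treats the quoted GB figures as GiB, and ignores workspace buffers, so the results are upper bounds rather than osxQ's documented limits. The function name is illustrative only.

```python
def max_statevector_qubits(memory_gib, bytes_per_amplitude=8):
    """Largest n such that 2**n amplitudes fit in `memory_gib` GiB.

    Assumption: the whole budget is available for one state vector.
    """
    budget = memory_gib * 2 ** 30
    n = 0
    while 2 ** (n + 1) * bytes_per_amplitude <= budget:
        n += 1
    return n


for mem in (128, 512):
    c64 = max_statevector_qubits(mem, bytes_per_amplitude=8)
    c128 = max_statevector_qubits(mem, bytes_per_amplitude=16)
    print(f"{mem} GiB: up to {c64} qubits (complex64), {c128} qubits (complex128)")
```

Each extra qubit doubles the memory, which is why going from 128 to 512 GiB adds only two qubits to the dense limit.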

osxQ reproduces community benchmarks (Yao.jl, PennyLane, Qulacs), provides parity with NVIDIA cuQuantum (cuStateVec), and runs a QASMBench‑style OpenQASM 2.0 suite. The library includes circuit drawing (ASCII, Matplotlib, Quantikz), exact and shot‑based measurement, and 200+ executable tests with analytical checks — mirroring pen‑and‑paper derivations with working code.
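The difference between exact and shot-based measurement can be illustrated without any simulator. The sketch below uses only the Python standard library and a hand-written Bell state; it does not call the osxQ API, and the function names are hypothetical.

```python
import random
from collections import Counter

# Amplitudes of the Bell state (|00> + |11>)/sqrt(2), keyed by bitstring.
amplitudes = {"00": 2 ** -0.5, "01": 0.0, "10": 0.0, "11": 2 ** -0.5}


def exact_probabilities(amps):
    """Exact mode: Born-rule probabilities |a|^2, with no sampling noise."""
    return {bits: abs(a) ** 2 for bits, a in amps.items()}


def sample_counts(amps, shots, seed=0):
    """Shot-based mode: draw `shots` bitstrings from the exact distribution."""
    probs = exact_probabilities(amps)
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    outcomes = rng.choices(list(probs), weights=list(probs.values()), k=shots)
    return Counter(outcomes)


probs = exact_probabilities(amplitudes)
counts = sample_counts(amplitudes, shots=1000)
print("exact:", probs)
print("sampled:", {bits: counts[bits] / 1000 for bits in ("00", "11")})
```

Exact mode returns 0.5 for both 00 and 11, while the sampled frequencies fluctuate around 0.5 and converge as the shot count grows.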

Backends & Architecture

SV — Dense state‑vector simulator with device‑style API, sampling, and OpenQASM execution.
MPS — Matrix Product State engine with two‑site SVD, truncation (dmax, eps), non‑adjacent gates via swap‑network, and per‑run bond CSVs.
MPSD — MPS with TEBD‑optimized diagonal ZZ‑MPO (and optional pair‑sweeps). Produces distinct _mpsd outputs for clean comparison to MPS.

Why compare MPS vs MPSD? It isolates TEBD optimization effects, validates accuracy, reveals bond‑growth/truncation behavior, and guides defaults (dmax, eps, sweeps). Ablations become figure‑ready.
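The roles of dmax and eps can be sketched on a list of singular values from a two-site SVD. The rule below is an assumption chosen for illustration: discard the smallest values while the discarded squared weight stays within eps, then cap the bond at dmax. osxQ's exact criterion is not stated here and may differ.

```python
def truncate_singular_values(svals, dmax, eps):
    """Return (kept singular values, discarded squared weight).

    Hypothetical rule: drop the smallest values while the accumulated
    discarded weight stays <= eps, then enforce the hard cap dmax.
    """
    svals = sorted(svals, reverse=True)
    keep = len(svals)
    discarded = 0.0
    # eps-based truncation: peel off the tail while it is negligible.
    while keep > 1 and discarded + svals[keep - 1] ** 2 <= eps:
        discarded += svals[keep - 1] ** 2
        keep -= 1
    # dmax-based truncation: hard cap on the bond dimension.
    if keep > dmax:
        discarded += sum(s ** 2 for s in svals[dmax:keep])
        keep = dmax
    return svals[:keep], discarded


spectrum = [0.8, 0.6, 1e-6, 1e-7]
kept, lost = truncate_singular_values(spectrum, dmax=128, eps=1e-10)
print(len(kept), lost)      # eps removes only the two negligible values
kept_capped, lost_capped = truncate_singular_values(spectrum, dmax=1, eps=1e-10)
print(len(kept_capped), lost_capped)  # the cap forces a large truncation error
```

In the first call the bond stays exact to within eps. In the second the cap dominates, which is the regime where bond plots flatten and accuracy degrades.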

Paper Parity

To mirror the schedules used in “Comparative Benchmarking of Utility‑Scale Quantum Emulators” (arXiv:2504.14027), enable the --paper-2504 preset, which applies a quasi‑logarithmic qubit grid to the supported keys. All plots standardize the x‑axis label to "Qubits".

# Apply paper‑2504 schedule (4,8,16,24,32,64,128,256; 512/1024 if cap allows)
./bench_with_logging.sh --paper-2504 --with-mps --with-mpsd --mps-dmax 128 --mps-eps 1e-10

# One‑off example
./bench.sh --paper-2504 --circuit qft --simulate-limit 256

Supported keys include: ghz, wstate, qft, qft_entangled/qftentangled, graph_state/graphstate, phase_estimation/qpeexact, phase_estimation_inexact/qpeinexact, ae, quantum_walk/quantum_walk_vchain/qwalk, random/random_circuit, realamp, su2rand, qnn.
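The qubit grid listed in the preset comment can be sketched as a simple filter. How the cap interacts with the grid (here: keep every grid point not exceeding the cap) is an assumption, and the names below are illustrative rather than part of the bench scripts.

```python
# Grid values taken from the preset comment; 512 and 1024 apply only if the cap allows.
PAPER_2504_GRID = (4, 8, 16, 24, 32, 64, 128, 256, 512, 1024)


def paper_grid(cap):
    """Qubit counts from the preset grid that do not exceed `cap`."""
    return [n for n in PAPER_2504_GRID if n <= cap]


print(paper_grid(256))   # [4, 8, 16, 24, 32, 64, 128, 256]
print(paper_grid(1024))  # the full grid, including 512 and 1024
```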

Backends & Flags

# Backend selection
MLXQ_BACKEND=sv|mps

# MPS tuning
MLXQ_MPS_DMAX=128          # bond cap (default 64)
MLXQ_MPS_EPS=1e-10         # truncation epsilon
MLXQ_MPS_EARLY_STOP_BMAX=0 # stop when D ≥ this (0=off)
MLXQ_MPS_STOP_ON_TRUNC=0   # stop when any truncation occurs
MLXQ_MPS_PAIR_SWEEPS=0     # even/odd pair sweeps in TEBD

# MPSD mode (MPS + ZZ‑MPO; distinct _mpsd outputs)
MLXQ_MPSD=1                # enables _mpsd suffix and captions
MLXQ_MPS_USE_MPO_ZZ=1      # diagonal ZZ‑MPO for TEBD

Use ./bench_with_logging.sh --with-mps --with-mpsd to run SV → MPS → MPSD and copy artifacts for papers. For one‑offs, set env vars and call ./bench.sh --circuit <key>.
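For scripted runs it can help to collect these variables in one place. The following is a minimal sketch of a reader for the variables listed above. Only the DMAX default of 64 comes from the text; the other defaults, the class name, and the integer parsing of the 0/1 flags are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MpsEnvConfig:
    backend: str       # "sv" or "mps"
    dmax: int          # bond cap
    eps: float         # truncation epsilon
    early_stop_bmax: int
    stop_on_trunc: bool
    pair_sweeps: int
    mpsd: bool
    use_mpo_zz: bool


def load_config(env=None):
    """Read MLXQ_* variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env

    def flag(name):
        return env.get(name, "0") == "1"

    return MpsEnvConfig(
        backend=env.get("MLXQ_BACKEND", "sv"),          # assumed default
        dmax=int(env.get("MLXQ_MPS_DMAX", "64")),       # default from the docs
        eps=float(env.get("MLXQ_MPS_EPS", "1e-10")),    # assumed default
        early_stop_bmax=int(env.get("MLXQ_MPS_EARLY_STOP_BMAX", "0")),
        stop_on_trunc=flag("MLXQ_MPS_STOP_ON_TRUNC"),
        pair_sweeps=int(env.get("MLXQ_MPS_PAIR_SWEEPS", "0")),
        mpsd=flag("MLXQ_MPSD"),
        use_mpo_zz=flag("MLXQ_MPS_USE_MPO_ZZ"),
    )


cfg = load_config({"MLXQ_BACKEND": "mps", "MLXQ_MPS_DMAX": "128", "MLXQ_MPSD": "1"})
print(cfg)
```

Passing a plain dict instead of os.environ keeps the reader easy to test and makes each run's settings explicit.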

Utilities & Reporting

# Overlay osxQ vs external CSV (e.g., cuStateVec or TN)
python3 tools/overlay_compare.py \
  --ours bench/qft_mps_data.csv \
  --ext mimiq_qft.csv --ext-time-col seconds --ext-scale-ms 1000 \
  --label-ours "osxQ (M‑series)" --label-ext "External" \
  --out bench/qft_overlay.png

# Aggregate MPS summaries into a single report (JSON + Markdown)
python3 tools/mps_report.py --bench bench

# Track D‑growth vs TEBD steps for a single circuit (ad‑hoc)
PYTHONPATH=src python3 tools/mps_dgrowth.py \
  --circuit tfim --n 16 --steps 40 --dt 0.05 --dmax 128 --eps 1e-10 \
  --out bench --plot
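The unit handling suggested by the overlay flags (an external time column in seconds, scaled by 1000 to milliseconds) can be sketched as below. This is one reading of the flags, not the tool's actual code; the column names "qubits" and "wall_ms" are hypothetical, and the numbers are placeholders rather than measurements.

```python
import csv
import io


def load_times_ms(csv_text, time_col, scale_to_ms=1.0, qubit_col="qubits"):
    """Return {qubits: time in ms} from CSV text, scaling the time column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {int(r[qubit_col]): float(r[time_col]) * scale_to_ms for r in rows}


# Placeholder data (not benchmark results): external times in seconds, ours in ms.
external_csv = "qubits,seconds\n16,0.5\n20,2.0\n"
ours_csv = "qubits,wall_ms\n16,250.0\n20,1000.0\n"

ext_ms = load_times_ms(external_csv, "seconds", scale_to_ms=1000)
our_ms = load_times_ms(ours_csv, "wall_ms")

# Compare only qubit counts present in both files.
ratio = {n: ext_ms[n] / our_ms[n] for n in sorted(ext_ms.keys() & our_ms.keys())}
print(ratio)
```

Putting both series in the same unit before plotting avoids a silent factor-of-1000 offset in the overlay.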

Screenshots & Circuit Gallery

Example: rdcircuit001. Random/Yao‑style circuit used in visualization tests.

Algorithms & Benchmarks

Algorithm subjects

QFT (Quantum Fourier Transform)
Phase Estimation (PE)
QAOA (Ising)
VQE (Ising, UCCSD toy)
Time Evolution / Trotterization
Hamiltonian Simulation (toy Pauli models)
Variational Circuit (ansatz sweep)
GHZ, Grover, Random/Yao, QCBM, Steady State

Benchmark keys

hamiltonian_simulation, time_evolution, trotter, steady_state
random_circuit, qcbm, phase_estimation, qft
qaoa, vqe, variational_circuit, grover, ghz
cuquantum_blueqat (vendor parity)

MQTBench additions

ghz, wstate, qft, qft_entangled/qftentangled
qpeexact/qpeinexact, graph_state/graphstate, ae
qwalk/quantum_walk_vchain, random/random_circuit
realamp, su2rand, qnn

Many‑body variants

TFIM (1st/2nd order), Heisenberg/XXZ (random field), long‑range Ising (1/r^α), ladder Heisenberg (2×L)

Why compare MPS and MPSD?

Performance: MPSD replaces dense 4×4 gate applications for ZZ terms with a diagonal MPO, reducing per‑bond cost in TEBD sweeps, especially at higher bond dimension D.
Parity: Both modes apply the same unitary; MPSD changes the kernel, not the model. JSON metadata records mps.mode (mps or mpsd).
Diagnostics: Bond plots flatten once Dmax is reached. Increase --mps-dmax and/or tighten --mps-eps to observe D‑growth.
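The parity claim rests on the fact that a ZZ rotation is diagonal in the computational basis. The sketch below checks this for a single two-qubit gate exp(-iθ Z⊗Z) in pure Python. The sign convention is an assumption, and the code illustrates the idea only, not osxQ's MPO kernel.

```python
import math

Z = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def zz_dense(theta):
    """exp(-i*theta*Z⊗Z) as a 4x4 matrix, using (Z⊗Z)^2 = I."""
    zz = kron(Z, Z)
    return [[(math.cos(theta) if r == c else 0.0) - 1j * math.sin(theta) * zz[r][c]
             for c in range(4)] for r in range(4)]


def zz_diagonal(theta):
    """The same operator stored as four phases; Z⊗Z has eigenvalues +1,-1,-1,+1."""
    return [complex(math.cos(theta), -s * math.sin(theta)) for s in (1, -1, -1, 1)]


def apply_dense(mat, vec):
    """Full matrix-vector product: 16 multiplications."""
    return [sum(mat[r][c] * vec[c] for c in range(4)) for r in range(4)]


def apply_diagonal(diag, vec):
    """Elementwise product: 4 multiplications."""
    return [d * v for d, v in zip(diag, vec)]


vec = [0.5, 0.5j, -0.5, 0.5]
dense_out = apply_dense(zz_dense(0.3), vec)
diag_out = apply_diagonal(zz_diagonal(0.3), vec)
max_diff = max(abs(a - b) for a, b in zip(dense_out, diag_out))
print(max_diff)
```

Both paths give the same output vector up to rounding, while the diagonal path touches each amplitude once.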

Outputs

Per‑bench CSV/JSON (scaling + hardware‑suffixed JSON)
Per‑n bonds CSV (<key>_mps[d]_n<N>_bonds.csv) and per‑bench summary (<key>_mps[d]_summary.csv)
Scaling and bond‑growth plots (_scaling.png, _bonds.png) copied to paper/prx-quantum/images/ and assets/benchmarks/
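The [d] in the file patterns above stands for the optional mpsd suffix. A small helper that builds the names might look like this; the function names are illustrative, and the patterns are copied from the list above.

```python
def bonds_csv_name(key, mode, n):
    """Per-n bonds file, e.g. <key>_mps_n<N>_bonds.csv or <key>_mpsd_n<N>_bonds.csv."""
    if mode not in ("mps", "mpsd"):
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{key}_{mode}_n{n}_bonds.csv"


def summary_csv_name(key, mode):
    """Per-bench summary file, e.g. <key>_mps_summary.csv."""
    if mode not in ("mps", "mpsd"):
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{key}_{mode}_summary.csv"


print(bonds_csv_name("qft", "mps", 16))
print(summary_csv_name("qft", "mpsd"))
```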

Vendor/framework groups

Yao.jl (Random/Yao, QFT, time evolution)
PennyLane (VQE, QAOA, circuit parity & drawing)
Qulacs (reference workloads)
NVIDIA cuQuantum (cuStateVec samples; Blueqat‑style brickwork)

Installation

Requirements: macOS 13.3+, Apple Silicon (M‑series), Python 3.11+, MLX.

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install mlx matplotlib
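Before installing, the stated requirements can be checked with the standard library. This is a minimal sketch with a hypothetical function name. Note that platform.machine() reports x86_64 when Python runs under Rosetta, so an Apple Silicon machine can still fail this check.

```python
import platform
import sys


def meets_requirements(system, machine, mac_version, python_version):
    """Check macOS 13.3+, Apple Silicon (arm64), and Python 3.11+."""
    if system != "Darwin" or machine != "arm64":
        return False
    mac = tuple(int(p) for p in mac_version.split(".")[:2]) if mac_version else (0,)
    return mac >= (13, 3) and tuple(python_version[:2]) >= (3, 11)


ok = meets_requirements(
    platform.system(),
    platform.machine(),
    platform.mac_ver()[0],   # empty string on non-macOS systems
    sys.version_info,
)
print("requirements met" if ok else "requirements not met")
```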

Quick Start

# Run validation tests (pretty output)
export PYTHONPATH=src
export MLXQ_PRINT_ASCII=1   # optional
python3 src/tests/run_core_tests.py

# Run algorithm benchmarks (saves plots)
export PYTHONPATH=src
python3 src/benchmark/bench.py

# Textual TUI (optional)
python3 -m pip install textual
python3 src/benchmark/bench_tui.py

Sample Run

CLI Output

A single‑circuit scaling run of hamiltonian_simulation (cap 30; requested qubits 1→28). The terminal excerpt below ends at 23 qubits.

=== Running: ./bench.sh --circuit hamiltonian_simulation --simulate-limit 30 --qubits 1,2,5,7,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
=== Single-circuit run: hamiltonian_simulation (qubits: 1,2,5,7,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28, cap: 30) ===
=== Running hamiltonian_simulation (qubits: 1,2,5,7,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28, cap: 30) ===
hamiltonian_simulation Scaling Benchmark
Framework: mlx-quantum | Device: apple-silicon-mlx
Testing qubit counts: 1, 2, 5, 7, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
21, 22, 23, 24, 25, 26, 27, 28
Hamiltonian: H = ∑ ZᵢZᵢ₊₁ + 0.5 ∑ Xᵢ (J=1.0, h=0.5)
hamiltonian_simulation |  1q | gates    20 | wall   12.13 ms
hamiltonian_simulation |  2q | gates    60 | wall   13.33 ms
hamiltonian_simulation |  5q | gates   180 | wall   21.64 ms
hamiltonian_simulation |  7q | gates   260 | wall   27.66 ms
hamiltonian_simulation | 10q | gates   380 | wall   46.08 ms
hamiltonian_simulation | 11q | gates   420 | wall   44.06 ms
hamiltonian_simulation | 12q | gates   460 | wall   50.35 ms
hamiltonian_simulation | 13q | gates   500 | wall   58.00 ms
hamiltonian_simulation | 14q | gates   540 | wall   57.36 ms
hamiltonian_simulation | 15q | gates   580 | wall   63.27 ms
hamiltonian_simulation | 16q | gates   620 | wall   86.08 ms
hamiltonian_simulation | 17q | gates   660 | wall  137.84 ms
hamiltonian_simulation | 18q | gates   700 | wall  248.51 ms
hamiltonian_simulation | 19q | gates   740 | wall  514.43 ms
hamiltonian_simulation | 20q | gates   780 | wall 1038.66 ms
hamiltonian_simulation | 21q | gates   820 | wall 2152.24 ms
hamiltonian_simulation | 22q | gates   860 | wall 4399.28 ms
hamiltonian_simulation | 23q | gates   900 | wall 9287.86 ms
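Two patterns in this log can be checked with a short script. The gate counts match 40n − 20, and from 19 qubits on the wall time roughly doubles per added qubit, as expected for dense state-vector simulation. Reading 40n − 20 as 20 Trotter steps of n X rotations plus n − 1 ZZ gates is an inference from the numbers, not something the log states. The rows below are copied from the output above.

```python
import re

LOG = """\
hamiltonian_simulation |  1q | gates    20 | wall   12.13 ms
hamiltonian_simulation | 10q | gates   380 | wall   46.08 ms
hamiltonian_simulation | 19q | gates   740 | wall  514.43 ms
hamiltonian_simulation | 20q | gates   780 | wall 1038.66 ms
hamiltonian_simulation | 21q | gates   820 | wall 2152.24 ms
hamiltonian_simulation | 22q | gates   860 | wall 4399.28 ms
hamiltonian_simulation | 23q | gates   900 | wall 9287.86 ms
"""

ROW = re.compile(r"\|\s*(\d+)q \| gates\s+(\d+) \| wall\s+([\d.]+) ms")
rows = [(int(q), int(g), float(w)) for q, g, w in ROW.findall(LOG)]

# Gate count check: every parsed row satisfies gates == 40*n - 20.
gate_formula_holds = all(g == 40 * q - 20 for q, g, _ in rows)

# Wall-time ratio between consecutive qubit counts (n -> n + 1 only).
ratios = {
    b[0]: b[2] / a[2]
    for a, b in zip(rows, rows[1:])
    if b[0] == a[0] + 1
}

print("gates == 40n - 20:", gate_formula_holds)
for n, r in ratios.items():
    print(f"{n - 1}q -> {n}q: x{r:.2f}")
```

Below roughly 16 qubits the timings grow much more slowly than 2x per qubit, so fixed overhead presumably dominates there; that reading is an interpretation of the log, not a measured breakdown.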

OpenQASM Suite

Place circuits under datasets/qasm/local/. Control caps via environment variables.

Environment Variables

QASM_MAX_QUBITS · QASM_TIMEOUT_MS · QASM_SIMULATE_LIMIT

Bench Controls

MLXQ_MAX_QUBITS · MLXQ_CAP_QFT · MLXQ_CAP_VQE · MLXQ_CAP_PHASE_ESTIMATION
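A qubit cap can be applied to an OpenQASM 2.0 file by summing its qreg declarations. In the sketch below, treating QASM_MAX_QUBITS as "skip circuits above this size" is an assumption about the suite's behavior, the helper names are hypothetical, and the regex does not strip comments.

```python
import os
import re

# OpenQASM 2.0 quantum register declaration: qreg name[size];
QREG = re.compile(r"\bqreg\s+\w+\s*\[\s*(\d+)\s*\]\s*;")


def qasm_qubit_count(source):
    """Total qubits declared across all qreg statements."""
    return sum(int(size) for size in QREG.findall(source))


def within_cap(source, env=None):
    """True if the circuit fits under QASM_MAX_QUBITS (no cap if unset)."""
    env = os.environ if env is None else env
    cap = env.get("QASM_MAX_QUBITS")
    return cap is None or qasm_qubit_count(source) <= int(cap)


SAMPLE = """\
OPENQASM 2.0;
include "qelib1.inc";
qreg q[3];
creg c[3];
h q[0];
cx q[0],q[1];
measure q -> c;
"""

print(qasm_qubit_count(SAMPLE))
print(within_cap(SAMPLE, {"QASM_MAX_QUBITS": "2"}))
```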

Algorithms

QFT, PE, VQE, QAOA, GHZ, Grover, Random/Yao, QCBM, Trotter, Time Evolution

API Sketch

Python examples live under src/mlxq and src/tests. A minimal device‑style workflow looks like:

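from mlxq.device import Device
from mlxq.gates import H, CNOT, RX, MeasureZ

# Build a 3-qubit circuit on a device-style object
dev = Device(n_qubits=3)
dev.apply(H(0))          # Hadamard on qubit 0
dev.apply(CNOT(0, 1))    # entangle qubits 0 and 1
dev.apply(RX(2, 0.2))    # RX rotation with arguments (2, 0.2)

# Request Z expectation values for qubits 0 and 1
exp = dev.expectation([MeasureZ(0), MeasureZ(1)])
print(exp)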
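An independent reference value for this circuit can be computed with a tiny pure-Python state-vector routine, in the spirit of the analytical checks mentioned earlier. The sketch does not use the mlxq package. It assumes RX(2, 0.2) means an X rotation by 0.2 rad on qubit 2, and it maps qubit q to bit q of the basis index, which does not affect the expectation values.

```python
import math


def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q (bit q of the basis index)."""
    out = list(state)
    step = 1 << q
    for i in range(len(state)):
        if not i & step:
            a, b = state[i], state[i | step]
            out[i] = gate[0][0] * a + gate[0][1] * b
            out[i | step] = gate[1][0] * a + gate[1][1] * b
    return out


def apply_cnot(state, control, target):
    """Swap the target-bit pair wherever the control bit is 1."""
    out = list(state)
    c, t = 1 << control, 1 << target
    for i in range(len(state)):
        if i & c and not i & t:
            out[i], out[i | t] = state[i | t], state[i]
    return out


def expect_z(state, q):
    """<Z_q> = sum over basis states of (+1 or -1) * probability."""
    return sum((-1 if i >> q & 1 else 1) * abs(a) ** 2 for i, a in enumerate(state))


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


HADAMARD = [[2 ** -0.5, 2 ** -0.5], [2 ** -0.5, -(2 ** -0.5)]]

state = [0j] * 8
state[0] = 1 + 0j                       # start in |000>
state = apply_1q(state, HADAMARD, 0)
state = apply_cnot(state, 0, 1)
state = apply_1q(state, rx(0.2), 2)

z = [expect_z(state, q) for q in range(3)]
print(z)
```

Analytically, the Bell pair on qubits 0 and 1 gives ⟨Z⟩ = 0 on each of them, and qubit 2 gives ⟨Z⟩ = cos(0.2). These are the values a device-style expectation call would be expected to match under the stated assumptions.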

Get in touch

Questions about osxQ, integrations, or research use? Reach out.

QNeura.ai

osxQ — Apple Silicon Quantum Simulator

shlomo@qneura.ai