Tensor-format-aware visualization for ML model weights, built on arbvis. Renders .safetensors / .gguf / PyTorch .bin / .pth / .pt checkpoints at each tensor's natural element shape — 1 px = 1 element — and stacks transformer blocks vertically so corresponding sub-tensors (q_proj, gate_proj, etc.) line up across every layer. Block-to-block changes — quantization steps, finetune deltas, dead heads — appear as horizontal bands.
For non-tensor files (binaries, JSON, anything else), use arbvis directly. modelweightvis is a thin crate that adds tensor awareness on top of arbvis: it registers FormatPlugin / LayoutPlugin / DiffSourceBuilder impls and CLI dispatch hooks against arbvis's registry, then hands the actual rendering, Hub I/O, tile pyramid, and Space deploy off to arbvis. The modelweightvis binary inherits arbvis's full CLI surface — --out, --space, --3d, --stream, --show-xet-xorbs, etc. — so you don't need to use both. See Relationship to arbvis below for the architectural picture.
modelweightvis hf://meta-llama/Llama-3.2-1B --out ./out
# then open out/index.html in a browserhf:// inputs are fetched directly — no manual download. The output is a Leaflet.js tile pyramid you can zoom across; at maximum zoom, one pixel is one tensor element.
When every input is safetensors with transformer-style tensor names, modelweightvis renders each tensor at its 2D element shape and stacks transformer blocks vertically. Corresponding sub-tensors (e.g. q_proj across every layer) are pixel-aligned, so block-to-block changes line up as horizontal bands.
Override with --layout arch|hilbert:
arch(default) — force architectural, and abort if it can't be built. A model file that matched a format plugin but couldn't be parsed (e.g. an IQ-quant GGUF candle can't decode) has no tensor metadata, so the run errors out instead of silently degrading to a byte view — modelweightvis is for tensor files; use arbvis directly for arbitrary binaries. Non-tensor siblings (config.json,tokenizer.json, README) are ignored, so a full repo still renders. Applies to the 2D render, the--3darch volume, and--diff;--moerenders its own scene layouts.hilbert— force the byte-only Hilbert layout (1 px = 1 byte) for regression checks against arbvis output.
- safetensors (
.safetensors) — single file or sharded index. Header is range-fetched forhf://inputs. - GGUF (
.gguf) — quantized weights are dequantized for diffing. - PyTorch pickle (
.bin/.pth/.pt) — parsed without invoking__reduce__/find_class, so loading untrusted headers is safe. Remote pickle isn't supported (the zip end-of-central-directory lives at the file tail and can't be prefix-fetched).
Diffs match tensors by canonical name across formats, so a GGUF checkpoint diffs cleanly against the corresponding safetensors release.
Tensor elements (not just raw bytes) are colored according to dtype:
- Float values are mapped through a perceptually-uniform brightness scale honoring sign and magnitude.
- Integer-quant elements (GGUF
Q4_K,Q8_0, etc.) are dequantized first, then colored on the float scale. - Padding regions and unused shard slots render as a recognizable non-pure-black so they're distinguishable from real zero-valued elements.
This applies in both the normal arch render and the --show-xet-xorbs xet-coloring path inherited from arbvis (hue from xorb ID, intensity from element value instead of raw byte).
modelweightvis --diff hf://meta-llama/Llama-3.2-1B hf://meta-llama/Llama-3.2-1B-Instruct --out ./outPer-tensor element-wise diff between two checkpoints (local files, directories, or hf:// URLs). Each pixel encodes a signed delta: black for identical, green for values that grew, red for values that shrank, white for non-finite results.
rms(default) — per-tensor RMS-normalized signed delta. Stable across tensors of wildly different scale.abs-log— absolute delta on a log brightness scale. Honest about raw magnitudes.exact— ternary: identical bytes → black; any change → full saturation.
When both arguments are hf:// model URLs, modelweightvis auto-detects whether the second is declared as a finetune of the first via the HF model card (base_model + base_model_relation). In finetune mode, tensors present only on the base side render as grey crosshatch (informational); anything new on the finetune side or with a mismatched shape aborts the run. Pass --finetune to force the relation on, --no-finetune to force it off.
Non-tensor files in a --diff between repos or directories (READMEs, tokenizer configs, etc.) fall back to arbvis's plain-byte / JSON-aware diff path.
modelweightvis --moe hf://Qwen/Qwen1.5-MoE-A2.7B --out ./out
# then open out/index.html — toggle the Summary / CKA tabs (top-right)--moe loads the model once and renders two complementary lenses as separate, tab-switchable scenes in the viewer (the tab switcher is the Leaflet base-layer control top-right). It renders the 2D tabbed viewer (--out or --space); it's incompatible with --3d, whose volume bundle can't host the tabs.
One heatmap panel per FFN weight (gate_proj, up_proj, down_proj) plus the router gate, side by side. Each panel is a layers × experts grid with one colored cell per expert, so dead experts, per-layer magnitude trends, and outliers pop out at a glance. The scalar shown per cell is chosen by --summary-stat:
rms(default) — √(mean(x²)), comparable across tensors of different scale.frobenius— √(sum(x²)), honest about total magnitude.mean-abs— mean(|x|), stable and dominated by typical entries.sparsity— fraction of near-zero entries, surfacing dead / near-dead experts.
Supports HF per-expert safetensors (Qwen-MoE, OLMoE, DeepSeek routed experts), the classic Mixtral block_sparse_moe layout, and the newer fused transformers export (batched mlp.experts.gate_up_proj / down_proj). GGUF fused-expert tensors are not yet supported.
One n_experts × n_experts linear-CKA similarity heatmap per (layer, weight): the diagonal is 1.0 (every expert is self-identical) and bright off-diagonal blocks reveal redundant expert clusters. Uses Gaussian random projection on the input axis (--cka-sample, default 128) to keep a 60-expert × 24-layer model tractable. The CKA scene needs per-expert tensors — on a fused-layout checkpoint it's skipped (with a warning) and only the Summary scene renders.
modelweightvis --moe /path/to/Qwen1.5-MoE --probe --out ./outAdds one behavioral panel to each scene from a routing-faithful forward pass over a probe input — the one signal that reflects which experts the router actually fires on real tokens, not just static weights:
- Summary gets a per-
(layer, expert)routing-frequency panel. - CKA gets a per-layer
n_experts × n_expertsco-activation grid: cell(i, j)is the fraction of probe tokens whose router top-k fired both expertiandj(diagonal(i, i)= experti's own frequency). CKA asks "which experts have similar weights"; co-activation asks "which experts actually fire together." Each layer's panel is per-panel normalized.
Override the bundled probe text with --probe-text "…", --probe-file <path>, or --probe-url <https|hf://…>. The probe input must be a local directory (resolve hf:// repos with hf download … first). Supported architectures: Qwen2MoeForCausalLM, MixtralForCausalLM. Probe failures are non-fatal — the static scenes still render.
modelweightvis inherits arbvis's full CLI surface. The output destinations, Hub I/O, and viewer-side flags work the same on tensor-aware renders:
--out DIR— write the viewer bundle here: a zoomable Leaflet tile pyramid (2D), or the Three.js volume bundle under--3d. Accepts a local dir or anhf://URL.--3d(with--grid N) — render bytes along a 3D Hilbert curve into a cube and emit a Three.js volume + point-cloud viewer instead of the 2D pyramid.--space OWNER/REPO— deploy a Docker Space serving the viewer (works for both 2D and--3d).--stream— keephf://inputs remote and push tiles to the Hub as they're produced.--show-xet-xorbs— color regions by xorb ID for xet-backed inputs (hue per xorb).--tile-format avif|png,--regen-html DIR,--title TEXT,-l/--file-list FILE.
See the arbvis README for the full reference on all of these.
arbvis is the byte-only foundation: Hilbert layout, byte coloring, JSON-aware diff, Hub I/O, tile pyramid, Space deploy, xet xorb path, streaming. It has no knowledge of tensors or model formats — .safetensors and .gguf get the same byte-Hilbert treatment as any other binary.
modelweightvis extends arbvis through its plugin / hook surface — no fork, no patch:
FormatPluginimpls (SafetensorsFormatPlugin,GgufFormatPlugin,PickleFormatPlugin) parse each format's header and stuff aModelInfo(tensors + dtype color ranges) into the source'sextensionsmap.LayoutPluginimpls (ArchLayoutPlugin,MoeSummaryLayoutPlugin,MoeCkaLayoutPlugin) register the architectural canvas and the MoE summary / CKA panel layouts; arbvis's plugin-iterationselect_layoutpicks them by priority.LeafLoader+LeafRendererpair (ArchRegionsLoader,ArchRegionsRenderer) drive per-tensor tile rendering at element granularity.DiffSourceBuilder(TensorDiffBuilder) handles tensor-aware file-pair--diffat priority above arbvis's JSON / plain-byte fallbacks.SourceProviderimpls (MoeSceneProvider,RepoDiffProvider,TensorDiffProvider) turn an invocation into render sources —--moe, a repo-levelhf://--diff, and a directory--diff— each registered above arbvis's byte-diff / normal-bytes built-ins. Finetune detection (HF model card) is resolved inside the diff providers.PrepareSourcesExtension(SourceMetaSidecarHook) fetchesconfig.json/ index sidecars after the sources are built, enriching each with transformer hyperparameters the arch layout reads back.
The modelweightvis binary itself is tiny — it builds an arbvis::Registry::with_defaults(), calls modelweightvis::register_all(&mut registry, &args) (which also wires the parsed CLI flags into the providers and the registry's layout mode), and hands off to arbvis::run. Same renderer, same Hub I/O, same tile pyramid; the tensor awareness comes entirely through the registered plugins.
Which to use:
- modelweightvis — for
.safetensors/.gguf/.binmodel checkpoints, architectural transformer layout,--moe(tabbed summary + CKA scenes) /--probe,--diff-metric,--finetune/--no-finetune,--layout, dtype-aware coloring. Inherits arbvis's full CLI surface. - arbvis — for non-model binaries (any file format), JSON/JSONL diffs, plain-byte diffs, the xet xorb path on arbitrary content. Smaller dependency footprint (no
candle-core/regex/zip/half).
Requires Rust (stable).
cargo build --release
./target/release/modelweightvis <model.safetensors> --out ./outputOr install into your PATH:
cargo install --path .Cargo.toml pins arbvis to a specific commit on znation/arbvis via arbvis = { git = "…", rev = "…" }. Cargo fetches and builds it on first build; no sibling checkout needed. Bump the rev to pick up new arbvis changes.
If you're editing arbvis and want modelweightvis to build against your local checkout instead of the pinned git rev, create a gitignored .cargo/config.toml next to this repo's Cargo.toml:
# .cargo/config.toml — not committed; per-developer override.
[patch."https://github.com/znation/arbvis"]
arbvis = { path = "../arbvis" }Layout:
~/your-code/
├── arbvis/ # https://github.com/znation/arbvis
└── modelweightvis/ # this repo (with .cargo/config.toml above)
cargo build will now resolve arbvis from ../arbvis instead of GitHub. Remove the file (or the [patch] block) to switch back to the pinned rev.
Built on arbvis for everything not tensor-aware. Tensor parsing leans on candle-core (GGUF dequantization, pickle reading), half (FP16 / BF16), zip (PyTorch .bin zip-entry resolution), and regex (transformer-style tensor-name classification).