vLoRA

Shared low-rank subspaces for efficient LoRA adapter management.

One shared basis. Per-task coefficients. Up to 122× compression at scale.

$ pip install vlora-dev
Based on arXiv:2602.06043 · Apache 2.0 · PyTorch ≥ 2.0

Benchmarks

Tested with 8 Lots-of-LoRAs adapters (Mistral-7B, rank 16, 96 layers each).

Variance Explained

The B matrices share structure across tasks much more strongly than the A matrices do.

k   Var. explained (A)   Var. explained (B)
1   0.19                 0.43
2   0.37                 0.73
4   0.69                 0.95
6   1.00                 1.00
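
"Variance explained" here is presumably the usual PCA-style quantity: the share of total variance captured by the top-k components of the stacked matrices. A minimal sketch under that assumption (how the benchmark stacks and centers the matrices is not specified; random tensors stand in for real adapter weights):

import torch

def explained_variance(matrices, k):
    # Stack each LoRA matrix as one row, center across the stack, and take the
    # PCA-style ratio: top-k squared singular values over their total sum.
    X = torch.stack([m.flatten() for m in matrices])
    X = X - X.mean(dim=0, keepdim=True)
    var = torch.linalg.svdvals(X) ** 2
    return (var[:k].sum() / var.sum()).item()

# Random stand-ins for B matrices (d_out x r) gathered across adapters.
bs = [torch.randn(4096, 16) for _ in range(8)]
print(explained_variance(bs, k=6))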

Reconstruction Error

Relative L2 norm of the reconstruction error; near-perfect at k=6.

k   Mean error   Max error
1   0.826        0.938
4   0.387        0.846
6   0.000002     0.000003
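
The reconstruction metric, as typically defined (whether the benchmark averages per-matrix or per-layer errors is not stated here):

import torch

def relative_l2_error(w, w_hat):
    # ||W - W_hat|| / ||W||, computed over the full weight matrix
    return (torch.linalg.norm(w - w_hat) / torch.linalg.norm(w)).item()

w = torch.randn(4096, 16)
w_hat = w + 1e-6 * torch.randn_like(w)
print(relative_l2_error(w, w_hat))   # on the order of 1e-6: near-perfect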

Compression at Scale

The shared basis is a one-time cost; storage grows only slowly with N.

N adapters   Full LoRA    vLoRA     Ratio
8            288 MB       288 MB    1.0×
100          3,600 MB     289 MB    12.5×
1,000        36,000 MB    293 MB    122.8×
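
The scaling follows from a simple storage model: full LoRA grows linearly with N, while vLoRA pays for the shared basis once plus a few kilobytes of loadings per task. A back-of-the-envelope check, with per-adapter and per-task sizes inferred from the table rather than measured (small rounding differences are expected):

PER_ADAPTER_MB = 288 / 8   # ~36 MB per full rank-16 adapter
BASIS_MB = 288.0           # one-time shared basis
PER_TASK_MB = 0.005        # ~5 KB of loadings per task (inferred)

for n in (8, 100, 1_000):
    full = n * PER_ADAPTER_MB
    vlora = BASIS_MB + n * PER_TASK_MB
    print(f"N={n}: full={full:,.0f} MB, vLoRA={vlora:,.0f} MB, ratio={full / vlora:.1f}x")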

The 3-Step Algorithm

Build a shared subspace, project new adapters, and absorb them, all in a few lines of code. A plain-PyTorch sketch of the underlying linear algebra follows the three steps.

1. Initialize: SharedSubspace.from_adapters()

Run SVD on the stacked weight matrices to extract the shared basis across all adapters.

2. Project: subspace.project()

Reduce a new adapter to a small loadings vector: its per-task coefficients against the shared basis.

3. Absorb: subspace.absorb()

Incorporate a new adapter and recompute the basis to include its structure.
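
The mechanics behind those three calls can be sketched in a few lines of plain PyTorch. This illustrates the underlying linear algebra only, not the library's implementation: whether vLoRA builds one basis per layer or a global one, and whether it centers the stacked matrices, is not shown here, and random tensors stand in for trained adapter weights.

import torch

# Stand-ins for the same layer's LoRA B matrix from several trained adapters.
adapter_bs = [torch.randn(4096, 16) for _ in range(8)]
k = 6

# Step 1 (Initialize): SVD of the stacked, flattened matrices yields a shared
# orthonormal basis of k components.
X = torch.stack([b.flatten() for b in adapter_bs])       # (N, 4096 * 16)
basis = torch.linalg.svd(X, full_matrices=False).Vh[:k]  # (k, 4096 * 16)

# Step 2 (Project): a new adapter reduces to k loadings against that basis,
# and reconstruction is just loadings @ basis.
new_b = torch.randn(4096, 16)
loadings = basis @ new_b.flatten()                       # (k,)
recon = (loadings @ basis).reshape(4096, 16)

# Step 3 (Absorb): refit the basis with the new matrix included so it also
# captures the newcomer's structure.
X = torch.cat([X, new_b.flatten().unsqueeze(0)])
basis = torch.linalg.svd(X, full_matrices=False).Vh[:k]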

Key Insight

LoRA adapters across tasks share a common low-rank subspace. Instead of storing N separate adapters, maintain one shared basis and per-task coefficient vectors, achieving over 100× parameter reduction at scale. The shared basis is a one-time cost; each new adapter adds only k loadings per layer.

Beyond Compression

A complete toolkit for LoRA adapter lifecycle — from training to merging to serving.

Adapter Merging

Task arithmetic, TIES, and DARE merging — combine adapters into one with state-of-the-art techniques.

vlora merge adapters/* --method ties
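
For reference, task arithmetic by itself is just a weighted sum of per-task weight deltas; TIES adds trimming and sign election on top, and DARE adds random dropping with rescaling. A minimal sketch of the plain task-arithmetic case (this is not the vlora merge implementation; names and shapes are illustrative):

import torch

def task_arithmetic_merge(deltas, weights=None):
    # Weighted sum of per-task weight deltas, keyed by parameter name.
    weights = weights or [1.0 / len(deltas)] * len(deltas)
    return {
        name: sum(w * d[name] for w, d in zip(weights, deltas))
        for name in deltas[0]
    }

# Illustrative per-layer deltas (for LoRA, delta = (alpha / r) * B @ A).
deltas = [{"q_proj": torch.randn(64, 64)} for _ in range(3)]
merged = task_arithmetic_merge(deltas)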

Instant Adapter Switching

VLoRAModel wraps any PyTorch model — switch adapters with a single call, no reloading.

model.set_task("sentiment")
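
Switching can be this cheap because the shared basis stays resident and only the k-dimensional loadings change per task: reconstructing a delta is one small matrix-vector product plus an add. A toy illustration of that idea (not VLoRAModel's actual implementation; the class and its arguments are made up):

import torch

class ToySwitcher:
    def __init__(self, linear, basis, loadings_by_task):
        self.linear = linear                           # nn.Linear to patch in place
        self.base_weight = linear.weight.data.clone()  # frozen base weights
        self.basis = basis                             # (k, out_features * in_features)
        self.loadings_by_task = loadings_by_task       # task_id -> (k,) tensor

    def set_task(self, task_id):
        # One mat-vec and one add: no checkpoint reload, no disk I/O.
        delta = (self.loadings_by_task[task_id] @ self.basis).reshape_as(self.base_weight)
        self.linear.weight.data = self.base_weight + delta

lin = torch.nn.Linear(16, 16, bias=False)
switcher = ToySwitcher(lin, torch.randn(6, 16 * 16), {"sentiment": torch.randn(6)})
switcher.set_task("sentiment")   # instant, in-memory switch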

Train in the Subspace

100×+ parameter reduction — optimize k loadings per layer instead of full LoRA matrices.

SubspaceTrainer(subspace, "task")
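
The idea in a nutshell: freeze the base weights and the shared basis, and optimize only the k loadings that define the weight delta. A self-contained toy sketch of training in the subspace (single layer, small dimensions; SubspaceTrainer's internals may differ):

import torch

d_in, d_out, k = 64, 64, 6
base = torch.nn.Linear(d_in, d_out, bias=False)
base.weight.requires_grad_(False)                  # base model stays frozen

basis = torch.randn(k, d_out * d_in)               # frozen shared basis
loadings = torch.zeros(k, requires_grad=True)      # the only trainable parameters

opt = torch.optim.Adam([loadings], lr=1e-2)
x, y = torch.randn(32, d_in), torch.randn(32, d_out)

for _ in range(200):
    delta = (loadings @ basis).reshape(d_out, d_in)   # task-specific weight delta
    loss = torch.nn.functional.mse_loss(x @ (base.weight + delta).T, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# A full rank-16 LoRA on this layer would train 16 * (64 + 64) = 2,048
# parameters; here only k = 6 loadings are optimized.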

HuggingFace Trainer

Drop-in VLoRACallback for HF Trainer — train subspace loadings with your existing pipeline.

pip install vlora-dev[hf]

Serving Ready

Export to vLLM, TGI, or Ollama-compatible formats with proper adapter configs.

vlora export --alpha 32
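
If the export produces a standard PEFT-style adapter directory (an assumption based on the "proper adapter configs" claim above), it can be served with vLLM's built-in LoRA support. A sketch with an illustrative base model and paths; check the vLLM docs for the exact API of your version:

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Illustrative: base model name and exported adapter path are assumptions.
llm = LLM(model="mistralai/Mistral-7B-v0.1", enable_lora=True)

outputs = llm.generate(
    ["The movie was surprisingly"],
    SamplingParams(max_tokens=32),
    lora_request=LoRARequest("sentiment", 1, "exported/sentiment"),
)
print(outputs[0].outputs[0].text)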

9 CLI Commands

compress, export, merge, analyze, validate, diff, benchmark, info, add — everything from the terminal.

vlora validate subspace/

Quickstart

Get started in minutes with the vLoRA Python library.

quickstart.py
from vlora import SharedSubspace, load_adapter

# Step 1: Build shared subspace from existing adapters
adapters = [load_adapter(f"adapters/task_{i}") for i in range(5)]
subspace = SharedSubspace.from_adapters(adapters, num_components=16)

# Step 2: Project a new adapter (only stores small loadings vector)
new_adapter = load_adapter("adapters/new_task")
projection = subspace.project(new_adapter, task_id="new_task")
subspace.add_task(projection)

# Step 3: Absorb — recompute basis to include new adapter
subspace.absorb(load_adapter("adapters/another_task"), new_task_id="another")

# Reconstruct any task back to full LoRA weights
weights = subspace.reconstruct("new_task")

# Save / load
subspace.save("shared_subspace/")
subspace = SharedSubspace.load("shared_subspace/")

Run the benchmark yourself:

$ pip install vlora-dev[hub] && python examples/real_adapters.py