Barswap

GPU multiplexer for dense compute systems

Unlock all your GPU hardware on commodity motherboards. Run inference and training across every GPU die you own, regardless of system address space limits.

6+ GPU Dies · 72 GB Total VRAM · &lt;1 ms Swap Latency · 2.5x Speedup vs CPU

The Problem & The Solution

Budget motherboards can't address all your GPU hardware. Barswap fixes that.

The Problem

Multi-GPU cards pack multiple dies per board, but commodity motherboards don't have enough PCIe address space (BAR space) to map them all. The OS sees the hardware but can't use it. Most of your GPU dies sit completely dark.

You bought 72 GB of VRAM. You can use a fraction of it. The hardware works fine — the system just can't reach it.
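The arithmetic behind this limit can be sketched roughly. The figures below reuse this page's numbers (6 dies, 72 GB VRAM); the power-of-two BAR rounding is standard PCIe behavior, but the per-die BAR size and the firmware's decode window are assumed values for illustration, not measurements of any specific board.

```python
# Illustrative sketch: why a commodity board runs out of address space.
# Die count and total VRAM come from this page; the decode window is an
# assumed firmware limit, typical of budget boards rather than measured.

def next_pow2(n: int) -> int:
    """Smallest power of two >= n (PCIe BARs are power-of-two sized)."""
    return 1 << (n - 1).bit_length()

GiB = 1024 ** 3
dies = 6
vram_per_die = 12 * GiB                  # 72 GB total across 6 dies

# With Resizable BAR, each die wants a BAR covering its full VRAM,
# rounded up to a power of two.
bar_per_die = next_pow2(vram_per_die)    # 16 GiB per die
total_bar = dies * bar_per_die           # 96 GiB of MMIO space needed

# Assumed above-4G MMIO decode window on a budget board.
decode_window = 64 * GiB

print(f"BAR space needed: {total_bar // GiB} GiB")
print(f"Window available: {decode_window // GiB} GiB")
print(f"Dies that fit:    {decode_window // bar_per_die} of {dies}")
```

Under these assumptions, only four of the six dies can be mapped at once; the rest exist on the bus but have nowhere to live in the address map.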

The Solution

Barswap is a Linux kernel module that dynamically manages GPU access, letting the system use every die across all your cards. Model weights and compute state persist between switches — no data lost, no reload needed.

Drop it in, load the module, and your full GPU cluster comes online.
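The core idea can be modeled as a toy in a few lines: more dies than mappable address windows, with mappings swapped on demand while each die's state survives being unmapped. This is a conceptual sketch only; every name here is hypothetical, and it is not Barswap's actual interface or eviction policy.

```python
# Toy model of the multiplexing idea: N dies, M < N address windows.
# Mappings are swapped on demand; per-die state persists across swaps.
# Illustrative only -- not Barswap's real API.

class DieMultiplexer:
    def __init__(self, num_dies: int, num_windows: int):
        self.windows = [None] * num_windows            # window -> die
        self.state = {d: {} for d in range(num_dies)}  # never evicted
        self._next_evict = 0                           # round-robin cursor

    def access(self, die: int) -> int:
        """Return the window mapping `die`, swapping one in if needed."""
        if die in self.windows:
            return self.windows.index(die)
        if None in self.windows:                  # free window available
            slot = self.windows.index(None)
        else:                                     # evict round-robin; a
            slot = self._next_evict               # real policy would be
            self._next_evict = (slot + 1) % len(self.windows)  # smarter
        self.windows[slot] = die                  # remap the window
        return slot                               # state untouched

# Six dies, only four mappable at once.
mux = DieMultiplexer(num_dies=6, num_windows=4)
mux.state[5]["weights"] = "resident"   # set while die 5 is unmapped
for d in range(6):                     # touch every die in turn
    mux.access(d)
assert mux.state[5]["weights"] == "resident"   # survived the swaps
```

The point of the toy is the invariant in the last line: a die losing its address window does not lose its contents, which is what lets model weights persist between switches without a reload.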

Zero-copy VRAM · Hot-swap capable · State persistent · Sub-millisecond · Kernel module

Status & Roadmap

From kernel module to full GPU cluster.

Complete

Core GPU Multiplexer

Kernel module working. All GPU dies accessible and validated. Full CUDA compatibility.

Complete

GPU-Accelerated Inference

Running language models on multiplexed GPUs with a significant speedup over CPU-only inference. Outputs validated as bitwise-identical.

In Progress

Inference Engine Integration

Integrating with our inference engine for seamless multi-model serving. Expanding the GPU cluster with additional cards.

Planned

Multi-GPU Concurrent Inference

Serving multiple models simultaneously across all GPU dies with automatic scheduling.

Planned

Training Support

Fine-tuning model adapters across multiplexed GPU dies.

Future

Public Release

Packaged release with installer and configuration tools for supported hardware setups.

The Stack

Barswap is one layer in a fully integrated, zero-dependency AI infrastructure.

ML Framework

Compute primitives, quantization, training.

Barswap

GPU multiplexing. Hardware layer.

Inference Engine

Model serving. Text, audio, vision.

TeamIDE

Desktop IDE with built-in local AI.

Barswap is the hardware layer in a vertically integrated stack. From compute primitives to model serving to the IDE — every layer is built in-house with zero external dependencies.

Stay Updated

This is a solo project. Your support helps cover hardware costs and keeps development going.