NAME
k10s — GPU-aware Kubernetes TUI. See which GPUs are working, which are idle, and why. Vim keybindings. Single binary.
SYNOPSIS
INFO
DESCRIPTION
GPU-aware Kubernetes TUI. See which GPUs are working, which are idle, and why. Vim keybindings. Single binary.
README
k10s: GPU-Aware Kubernetes Toolkit
k10s is two things:
- kitty - a DaemonSet that runs on your Kubernetes cluster and collects node-level GPU and network diagnostics
- k10s - (kittens) a TUI that shows the ML training jobs in your cluster and surfaces misbehaving ranks
The outcomes will be:
- Your idle / misbehaving GPUs are LOUD so you know if you are burning $$
- You know exactly WHY your training job is messed up (straggler ranks, OOM issues, network chokes, etc.)
- You don't have to leave your terminal
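To make "straggler ranks" concrete, here is a minimal sketch of one way a slow rank could be flagged from per-rank step times. It is an illustration only, not how k10s or kitty actually detects stragglers: the `RankSample` struct, the `find_stragglers` function, and the 1.5x-of-median threshold are all assumptions made for this example.

```rust
/// Hypothetical per-rank measurement. The field names are illustrative
/// and do not come from the k10s codebase.
#[derive(Debug, Clone, PartialEq)]
struct RankSample {
    rank: u32,
    step_time_ms: f64,
}

/// Median of a non-empty slice of step times.
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Returns the ranks whose step time is more than `threshold` times the
/// median step time across all ranks (assumed heuristic, not k10s's own).
fn find_stragglers(samples: &[RankSample], threshold: f64) -> Vec<u32> {
    if samples.is_empty() {
        return Vec::new();
    }
    let times: Vec<f64> = samples.iter().map(|s| s.step_time_ms).collect();
    let cutoff = median(&times) * threshold;
    samples
        .iter()
        .filter(|s| s.step_time_ms > cutoff)
        .map(|s| s.rank)
        .collect()
}

fn main() {
    // Made-up numbers: rank 2 takes more than twice as long as its peers.
    let samples = vec![
        RankSample { rank: 0, step_time_ms: 101.0 },
        RankSample { rank: 1, step_time_ms: 99.0 },
        RankSample { rank: 2, step_time_ms: 240.0 },
        RankSample { rank: 3, step_time_ms: 100.0 },
    ];
    let stragglers = find_stragglers(&samples, 1.5);
    println!("straggler ranks: {:?}", stragglers);
}
```

Comparing against the median rather than the mean keeps a single very slow rank from dragging the baseline up and hiding itself.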
These are problems that I have, and that's the main reason for building this. If you have them too, join our Discord and consider becoming a contributor to help shape this tool.
My other motivations for building k10s are on the dev log: why build k10s?
Project Structure
src/crates/
├── kitty/ # daemonset agent
├── tui/ # tui duh
└── e2e/ # End-to-end tests
Quick Start
cargo build # Build all crates
cargo run -p kitty # Run the agent
cargo run -p tui # Run the TUI
cargo run -p e2e # Run e2e tests
cargo build --release # Optimized release build
License
Apache 2.0. See LICENSE for details.
What happened to the Go version?
There was a vibe-coded Go version of the TUI. It is still available here: https://github.com/shvbsle/k10s/tree/archive/go-v0.4.0
It became unmaintainable, so I archived that branch and decided to hand-write the TUI again from scratch in Rust. I talk more about it here: Blog: I'm going back to writing code by hand