Open Source CLI Tool

Infracontext

Infrastructure context for humans and agents.

A CLI tool and data format for documenting your infrastructure in structured YAML files. It gives LLM agents the context they need to perform methodical server triage via SSH, using the USE method (Utilization, Saturation, Errors).

claude-code
$ /ic-triage vm:web-01 "high CPU"
Loading context for vm:web-01...
SSH connectivity validated
Running CPU diagnostics...
Checking configured services...
Finding: PHP-FPM pool consuming 94% CPU

The Problem

LLM agents already know how to troubleshoot Linux. They just don't know your infrastructure.

Without context

  • Agent checks random logs hoping to find errors
  • Runs generic commands with no direction
  • Misses application-specific paths and services
  • Repeats the same discovery every session
  • No memory of past investigations

With Infracontext

  • Agent knows which services to check first
  • Goes directly to relevant logs and endpoints
  • Understands dependencies between nodes
  • Builds on learnings from past incidents
  • Works through the USE method step by step
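
The USE method checks Utilization, Saturation, and Errors for each resource in turn. As a rough sketch of what such a checklist can look like, the table below maps resources to shell commands. The commands are common Linux tools (mpstat and iostat come from the sysstat package) and are an assumption for illustration, not the checks the Infracontext skill actually runs. `USE_CHECKS` and `plan_checks` are hypothetical names.

```python
# Hypothetical USE-method checklist: resource -> check type -> shell command.
# These are common Linux commands, not taken from Infracontext itself.
USE_CHECKS = {
    "cpu": {
        "utilization": "mpstat -P ALL 1 1",
        "saturation": "vmstat 1 2",  # run queue length is the 'r' column
        "errors": "dmesg -T | grep -i 'machine check'",
    },
    "memory": {
        "utilization": "free -m",
        "saturation": "vmstat 1 2",  # swap activity is the 'si'/'so' columns
        "errors": "dmesg -T | grep -i 'out of memory'",
    },
    "disk": {
        "utilization": "iostat -xz 1 2",  # '%util' column
        "saturation": "iostat -xz 1 2",   # average queue size column
        "errors": "dmesg -T | grep -i 'i/o error'",
    },
}


def plan_checks(resources):
    """Return (resource, check, command) tuples in U, S, E order."""
    order = ("utilization", "saturation", "errors")
    return [(r, c, USE_CHECKS[r][c]) for r in resources for c in order]


if __name__ == "__main__":
    for resource, check, command in plan_checks(["cpu", "memory"]):
        print(f"{resource:7} {check:12} {command}")
```

Keeping the plan as data makes the order of checks explicit and easy to review before anything runs over SSH.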

How It Works

Three components, loosely coupled

architecture
Claude Code                          .infracontext/
+---------------------------+        +---------------------------+
|                           |        | projects/prod/            |
|  /ic-triage vm:web-01     |        |   nodes/vm/web-01.yaml    |
|  "high CPU"               |        |   nodes/vm/db-01.yaml     |
|                           |  reads |   relationships.yaml      |
|  1. ic node context ------+--------+-> ssh_alias, services,    |
|  2. SSH to node           |        |   learnings, dependencies |
|  3. USE method checks     |        |                           |
|  4. Query monitoring      | writes |                           |
|  5. Record learnings -----+--------+-> new learning entry      |
|                           |        |                           |
+---------------------------+        +---------------------------+
            |
            | SSH
            v
    +---------------+       +--------------------+
    | Target server |       | Prometheus / Loki  |
    | (diagnostics) |       | CheckMK / Monit    |
    +---------------+       +--------------------+
  1. The ic CLI — manages nodes, relationships, sources, and graph analysis. Works standalone without Claude.
  2. YAML data in .infracontext/ — one file per node, committed to git, human-editable.
  3. Claude Code skill — a markdown prompt that teaches Claude how to use ic and perform triage.
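
The one-file-per-node layout shown in the diagram can be sketched as a small path helper. This assumes node IDs always take the form type:slug, as in vm:web-01. `node_path` is a hypothetical function for illustration, not part of the ic CLI.

```python
from pathlib import PurePosixPath


def node_path(project: str, node_id: str) -> PurePosixPath:
    """Map a node ID such as 'vm:web-01' to its YAML file path.

    Assumes the ID format 'type:slug' seen in the examples.
    """
    node_type, sep, slug = node_id.partition(":")
    if not sep or not node_type or not slug:
        raise ValueError(f"expected 'type:slug', got {node_id!r}")
    return (
        PurePosixPath(".infracontext")
        / "projects"
        / project
        / "nodes"
        / node_type
        / f"{slug}.yaml"
    )


if __name__ == "__main__":
    print(node_path("prod", "vm:web-01"))
```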

Features

Everything you need to give LLM agents real infrastructure context

Source Sync

Import nodes from Proxmox VE clusters or SSH config files. Sync preserves your manual additions (ssh_alias, triage config, learnings).
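
One way to picture a sync that preserves manual additions is a merge that skips user-owned keys. This is a minimal sketch under assumptions: the set of protected keys is taken from the sentence above, `merge_synced` is a hypothetical name, and `cpu_cores` is an invented field. The real sync logic may differ.

```python
# Keys assumed to be owned by the user and never overwritten by a sync.
MANUAL_KEYS = {"ssh_alias", "triage", "learnings"}


def merge_synced(existing: dict, imported: dict) -> dict:
    """Apply freshly imported fields without touching manual additions."""
    merged = dict(existing)
    for key, value in imported.items():
        if key in MANUAL_KEYS and key in existing:
            continue  # keep what the user (or an agent) wrote by hand
        merged[key] = value
    return merged


if __name__ == "__main__":
    on_disk = {"id": "vm:web-01", "name": "old name", "ssh_alias": "acme-web01"}
    from_source = {"id": "vm:web-01", "name": "web-01", "cpu_cores": 4}
    print(merge_synced(on_disk, from_source))
```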

Graph Analysis

Builds a dependency graph with NetworkX to find single points of failure, analyze impact, detect cycles, and list orphan nodes.
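
To make impact analysis and cycle detection concrete, here is a standard-library sketch of both. It deliberately does not use NetworkX, so it is a stand-in for the idea rather than the tool's implementation. The edge data and the names `impacted_by` and `has_cycle` are hypothetical.

```python
from collections import deque

# Hypothetical edges: each node maps to the nodes it depends on.
DEPENDS_ON = {
    "vm:web-01": ["vm:db-01"],
    "vm:web-02": ["vm:db-01"],
    "vm:db-01": [],
}


def impacted_by(graph, failed):
    """Return every node that directly or transitively depends on `failed`."""
    dependents = {}
    for node, deps in graph.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(node)
    seen, queue = set(), deque([failed])
    while queue:
        for node in dependents.get(queue.popleft(), []):
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return seen


def has_cycle(graph):
    """Depth-first search with three states; reaching a grey node means a cycle."""
    white, grey, black = 0, 1, 2
    state = {node: white for node in graph}

    def visit(node):
        state[node] = grey
        for dep in graph.get(node, []):
            dep_state = state.get(dep, white)
            if dep_state == grey:
                return True
            if dep_state == white and visit(dep):
                return True
        state[node] = black
        return False

    return any(state[node] == white and visit(node) for node in graph)


if __name__ == "__main__":
    print(sorted(impacted_by(DEPENDS_ON, "vm:db-01")))
    print(has_cycle(DEPENDS_ON))
```

In this toy graph vm:db-01 is a single point of failure in the informal sense: every other node ends up in its impact set.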

Monitoring Queries

Query Prometheus, Loki, CheckMK, and Monit directly from the CLI. Configured per-node in the YAML.
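
Per-node monitoring settings translate fairly directly into HTTP API requests. The sketch below only builds request URLs for the Prometheus instant-query endpoint and the Loki range-query endpoint; it performs no network calls. The base URLs, the choice of the `up` metric, and the function names are assumptions for illustration, and the ic CLI may build its queries differently.

```python
from urllib.parse import urlencode


def prometheus_query_url(base_url: str, instance: str) -> str:
    """Build an instant-query URL asking whether `instance` is up."""
    promql = f'up{{instance="{instance}"}}'
    params = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def loki_query_url(base_url: str, selector: str, limit: int = 100) -> str:
    """Build a range-query URL for a LogQL stream selector."""
    params = urlencode({"query": selector, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Hypothetical server addresses; instance and selector come from the node YAML.
    print(prometheus_query_url("http://prometheus:9090", "web-prod:9100"))
    print(loki_query_url("http://loki:3100", '{service_name="web-prod"}'))
```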

Access Tiers

Five tiers from local_only (no SSH) to remediate (can make changes). Set per-node or globally.

SOS Report Import

Import Linux sosreport archives into node learnings. Query the stored SOS data for health findings or search it for errors.

Kubernetes Import

Import clusters and nodes from kubectl with capacity, roles, readiness, and member_of relationships.

Living Documentation

Agents record findings as learnings in node YAML. Knowledge accumulates and improves future triage.
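
Recording a learning amounts to appending one entry to the node's learnings list. A minimal sketch, assuming the node has already been parsed into a dict and using the four fields from the Node YAML example (date, context, finding, source). `record_learning` is a hypothetical helper, not the CLI's own code.

```python
from datetime import date


def record_learning(node, context, finding, source="agent", today=None):
    """Append a learning entry to a parsed node dict and return the entry."""
    entry = {
        "date": (today or date.today()).isoformat(),
        "context": context,
        "finding": finding,
        "source": source,
    }
    node.setdefault("learnings", []).append(entry)
    return entry


if __name__ == "__main__":
    node = {"id": "vm:web-01"}
    record_learning(node, "high CPU", "PHP-FPM pool consuming 94% CPU")
    print(node["learnings"])
```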

Local Overrides

Machine-specific settings (SSH aliases, source paths) in a gitignored .infracontext.local.yaml.

Node YAML

Minimum viable node for triage — just an ID and SSH alias

.infracontext/projects/prod/nodes/vm/web-01.yaml
version: "2.0"
id: "vm:web-01"
slug: web-01
type: vm
name: "Production Web Server"
ssh_alias: "acme-web01"

triage:
  services: [nginx, php8.2-fpm]
  context: |
    Peak traffic 5-7pm weekdays.
    If CPU high, check PHP-FPM pool first.

observability:
  - type: prometheus
    instance: web-prod:9100
  - type: loki
    selector: '{service_name="web-prod"}'

learnings:
  - date: "2024-01-20"
    context: "memory leak investigation"
    finding: "Laravel Telescope enabled in prod causes memory growth"
    source: agent

Keep triage hints minimal. Claude discovers standard logs and commands on its own — only document what's non-obvious about your setup.
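
The stated minimum for triage (an ID and an SSH alias) is easy to check mechanically. The sketch below works on an already-parsed node dict. It also assumes that id equals type:slug, which matches the example but is not confirmed as a rule. `triage_problems` is a hypothetical helper and not a substitute for ic doctor.

```python
def triage_problems(node: dict) -> list:
    """Return human-readable problems that would block triage of this node."""
    problems = [
        f"missing {key}" for key in ("id", "ssh_alias") if not node.get(key)
    ]
    node_type, slug = node.get("type"), node.get("slug")
    # Assumed convention: id is "<type>:<slug>", as in "vm:web-01".
    if node.get("id") and node_type and slug and node["id"] != f"{node_type}:{slug}":
        problems.append("id does not match type:slug")
    return problems


if __name__ == "__main__":
    node = {"id": "vm:web-01", "slug": "web-01", "type": "vm"}
    print(triage_problems(node))
```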

The ic CLI

A single command-line tool for documenting infrastructure, managing relationships, and orchestrating AI-driven diagnostics. Built with Python and Typer.

ic describe   Manage nodes, projects, relationships, and sync sources
ic graph      Analyze dependencies, find SPOFs, detect cycles
ic query      Query Prometheus, Loki, CheckMK, Monit, SOS data
ic import     Import from Proxmox, SSH config, SOS reports, Kubernetes
ic doctor     Validate YAML schemas and find orphan nodes
ic init       Initialize .infracontext/ in your project
terminal
# Project management
ic describe project list | create | switch | delete
# Node management
ic describe node list | show | create | edit | delete
ic describe node find <query>
ic describe node context <id>
ic describe node learning <id>
# Relationships and sources
ic describe relationship list | create | wizard
ic describe source add | sync | configure
ic import ssh | sos <path> | kubectl
# Graph analysis
ic graph analyze <id> --upstream | --downstream
ic graph impact <id>
ic graph spof | cycles | orphans
# Monitoring
ic query status | prometheus | loki | checkmk | monit | sos <id>
# Maintenance
ic doctor

Access Tiers

Control what the agent is allowed to do on each node

0  local_only     No SSH. Only local context and monitoring queries.
1  collector      Run a pre-deployed collector script only.
2  unprivileged   Read-only SSH, no sudo.
3  privileged     SSH with sudo for diagnostics.
4  remediate      Can make changes (restart services, edit config).
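
Because the tiers are ordered from 0 to 4, a permission check reduces to an integer comparison. The sketch below models this with an IntEnum. The `access_tier` key, the rule that a per-node setting wins over the global default, and the function names are assumptions for illustration.

```python
from enum import IntEnum


class AccessTier(IntEnum):
    LOCAL_ONLY = 0    # no SSH, local context and monitoring queries only
    COLLECTOR = 1     # pre-deployed collector script only
    UNPRIVILEGED = 2  # read-only SSH, no sudo
    PRIVILEGED = 3    # SSH with sudo for diagnostics
    REMEDIATE = 4     # may make changes


def effective_tier(node: dict, global_tier: AccessTier) -> AccessTier:
    """Use the node's own tier if set, else the global default (assumed precedence)."""
    name = node.get("access_tier")  # hypothetical key name
    return AccessTier[name.upper()] if name else global_tier


def allows(tier: AccessTier, required: AccessTier) -> bool:
    """A tier permits everything at or below its own level."""
    return tier >= required


if __name__ == "__main__":
    tier = effective_tier({"access_tier": "unprivileged"}, AccessTier.LOCAL_ONLY)
    print(tier.name, allows(tier, AccessTier.PRIVILEGED))
```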

Quick Start

Get up and running in four steps

1 Install
terminal
git clone https://github.com/sysinit-at/infracontext.git
cd infracontext
uv sync

# Shell alias (add to .zshrc/.bashrc)
alias ic='uv run --directory /path/to/infracontext ic'
2 Initialize
terminal
cd my-project
ic init
ic describe project create prod
ic describe project switch prod
3 Add nodes
terminal
# Auto-discover via SSH (in Claude Code)
/ic-collect web-prod

# Manual
ic describe node create --type vm --name "web-server"
ic describe node edit vm:web-server

# Bulk import from SSH config, Proxmox, SOS reports, or Kubernetes
ic import ssh --path ~/.ssh/config
ic import sos /path/to/sosreport
ic import kubectl
ic describe source add mypve --type proxmox
ic describe source sync mypve
4 Triage with Claude Code
terminal
# Install skills and agents (create the target directories first if they don't exist)
mkdir -p ~/.claude/commands ~/.claude/agents
ln -s /path/to/infracontext/commands/ic-triage.md ~/.claude/commands/ic-triage.md
ln -s /path/to/infracontext/commands/ic-trace.md ~/.claude/commands/ic-trace.md
ln -s /path/to/infracontext/commands/ic-collect.md ~/.claude/commands/ic-collect.md
ln -s /path/to/infracontext/agents ~/.claude/agents/infracontext

# In Claude Code
/ic-triage vm:web-server "high CPU"
/ic-trace vm:proxy-01 "requests to /api returning 502"

Get Started

Infracontext is open source and free to use. Clone the repository to get started.