Skip to content

Best Claude Code Skills for DevOps: Docker, Kubernetes, Terraform and CI

Claude Code Skills for IT and infra work — Docker, Kubernetes, Terraform, CI/CD, shell hardening, and incident runbooks — with example SKILL.md description lines.

MGMCSA Guru Team June 9, 2026 8 min read
Claude Code Skills for DevOps tasks listed with their SKILL.md triggers

Infrastructure work is where a coding agent either saves you an afternoon or generates a Dockerfile that looks right and quietly does the wrong thing. The difference is whether Claude knows your conventions. A Skill is how you tell it — once — and have it apply every time you touch a manifest, a module, or a pipeline.

This roundup is aimed at the IT and infra side of the house: sysadmins, platform engineers, anyone running Docker, Kubernetes, or Terraform on Windows and WSL. For each area you’ll get what the skill should encode and an example description line, since the description is what decides whether the skill actually fires. If you’re new to the format, start with Claude Code Skills explained; for the broader, non-infra list see the best Claude Code Skills for 2026.

A note before the list: a skill guides Claude, it doesn’t enforce anything. It can tell the agent to plan before applying, but real guardrails come from Claude Code’s permission settings. Keep that distinction in mind, especially around anything that touches production.

Installable DevOps skill plugins to check first

Before you write a custom skill, check whether a maintained plugin already covers the platform. Install only the rows that match your real stack.

DevOps shortlist

Installable DevOps skill plugins

9 platform picks

#1 IaC

terraform

Terraform ecosystem work and infrastructure-as-code automation.

Install

/plugin install terraform@claude-plugins-official

Docs / source

Best for: Use it when Terraform is already part of your workflow, then pair it with your own tagging, module, and state rules.

#2 Cloud platform

aws-core

General AWS application and infrastructure work.

Install

/plugin install aws-core@claude-plugins-official

Docs / source

Best for: Useful if your repos, services, or runbooks are AWS-heavy and you want platform-aware help instead of generic cloud advice.

#3 Serverless

aws-serverless

Lambda, API Gateway, and serverless workflows.

Install

/plugin install aws-serverless@claude-plugins-official

Docs / source

Best for: Best for teams whose deploy and troubleshooting work is centered on AWS serverless services.

#4 Cloud platform

azure

Azure resource inspection, deployment validation, diagnosis, and cost/security guidance.

Install

/plugin install azure@claude-plugins-official

Best for: Worth installing when Azure is your operational home and you want the agent to reason from Azure-specific context.

#5 Edge platform

cloudflare

Workers, Durable Objects, Wrangler, and Cloudflare platform work.

Install

/plugin install cloudflare@claude-plugins-official

Docs / source

Best for: Best for teams shipping apps or edge logic through Cloudflare rather than a traditional cloud runtime.

#6 Database

mongodb

MongoDB schema, query, and database-management workflows.

Install

/plugin install mongodb@claude-plugins-official

Docs / source

Best for: Useful when MongoDB is a real part of the stack and Claude needs database-specific context.

#7 Postgres app stack

prisma / neon

Postgres application stacks that use Prisma or Neon.

Install

/plugin install prisma@claude-plugins-official

Best for: Pick the one that matches your data layer; these are practical only when your schema, migrations, and hosting are already there.

Install neon instead when Neon is the platform you actually use.

#8 CI/CD

buildkite / teamcity-cli

CI/CD teams already on Buildkite or TeamCity.

Install

/plugin install buildkite@claude-plugins-official

Best for: Best when CI history, failed jobs, and pipeline structure are daily operating context.

Install teamcity-cli instead if your pipelines live in TeamCity.

#9 Incident ops

datadog / pagerduty

Incident and observability workflows.

Install

/plugin install datadog@claude-plugins-official

Best for: Use these when logs, traces, dashboards, incidents, or on-call history are part of the coding session.

Install pagerduty instead when incident response lives there.

If the platform plugin covers 80% of what you need, install it and add a small project skill for your local rules. If no plugin matches, build the custom skills below.

Docker skills

What it should encode: your base images and why, multi-stage build structure, how you pin versions, non-root user requirements, label and tag conventions, and what should never end up in an image (secrets, build caches, dev dependencies). Without this, Claude defaults to whatever’s common online, which is rarely what your shop uses.

---
name: dockerfiles
description: Use when writing or editing Dockerfiles or docker-compose files. Enforces multi-stage builds, pinned base image tags, a non-root runtime user, and no secrets in layers.
---

Use it on any repo that ships a container. The payoff is consistency across services — every image built the same way, which makes them easier to scan, patch, and reason about.

Kubernetes skills

What it should encode: your manifest conventions — resource requests and limits on every container, the labels your tooling depends on, namespace rules, probe configuration, and the safety habits you want enforced (never kubectl delete a namespace without confirmation, always --dry-run first). Manifests are verbose and easy to get subtly wrong, so a skill that knows your shape pays off fast.

---
name: k8s-manifests
description: Use when writing or editing Kubernetes YAML manifests. Requires resource limits, liveness and readiness probes, standard team labels, and dry-run before any apply.
---

Terraform skills

What it should encode: module structure, naming and tagging standards, where state lives, required variables and outputs, and the non-negotiable rule of running terraform plan and reading it before any apply. Infrastructure-as-code mistakes are expensive because they’re real resources, so the planning discipline matters more here than almost anywhere.

---
name: terraform
description: Use when writing or changing Terraform configuration. Enforces module and tagging conventions, required tags on every resource, and always running plan and reviewing it before apply.
---

Use it on any repo that manages cloud resources. Pair it with a review habit — Claude proposes the change, you read the plan, then it applies.

CI/CD and GitHub Actions skills

What it should encode: your pipeline structure, how jobs are split, caching strategy, where secrets come from (and that they never get echoed), required status checks, and your conventions for workflow file naming. CI config drifts easily across repos; a skill keeps new pipelines looking like the ones that already work.

---
name: github-actions
description: Use when writing or editing GitHub Actions workflows. Enforces pinned action versions, least-privilege permissions blocks, cached dependencies, and secrets via the secrets context only.
---

Use it whenever you add or change a pipeline. Pinning action versions and least-privilege permissions blocks are the two things most hand-written workflows skip, and a skill makes them automatic.

Shell and PowerShell hardening skills

What it should encode: safe scripting defaults. For bash, set -euo pipefail, quoted variables, and no piping untrusted input into a shell. For PowerShell, strict mode, proper error handling, and avoiding plaintext credentials. This matters doubly on Windows and WSL, where you’re often switching between both shells in one workflow.

---
name: shell-hardening
description: Use when writing bash or PowerShell scripts. Enforces strict error handling, quoted variables, no plaintext secrets, and safe handling of external input.
---
# What the skill steers Claude toward, top of every bash script
set -euo pipefail
IFS=$'\n\t'

Use it for any automation you’ll run unattended. A script that fails loudly on the first error beats one that limps forward after a failed step and corrupts something downstream.

Incident runbook skills

What it should encode: your response process — how to triage, where the dashboards are, the order of escalation, the commands you run to gather state, and what you must capture for the postmortem. A runbook skill turns “what do we do again?” at 3 a.m. into a procedure Claude can walk through with you.

---
name: incident-runbook
description: Use during incident response or when diagnosing a production outage. Walks through triage steps, state-gathering commands, escalation order, and postmortem notes to capture.
---

Use it for the systems you actually get paged on. Keep the steps in the skill in sync with reality — a stale runbook is worse than none.

Building your DevOps skill set

  • One skill per concern — Docker, Terraform, CI — with a distinct trigger
  • Write description lines that name the file type or tool
  • Commit infra skills to .claude/skills/ so the team shares them
  • State safety steps (plan before apply, dry-run first) in the body
  • Pair skills with Claude Code permission settings for real enforcement

Where to start

If you only build one, make it the area where a mistake costs the most — usually Terraform or Kubernetes. Encode the planning discipline, commit it to the repo, and let it apply on every change. Add Docker and CI skills next, since they touch every service.

Keep each skill narrow and its trigger distinct, and remember the division of labor: skills carry your conventions, settings carry your guardrails. For the foundations, see Claude Code Skills explained, and for the wider Windows-and-WSL setup that surrounds all of this, the Claude Code setup for sysadmins.

Frequently asked questions

What DevOps tasks suit a Claude Code Skill?

Anything with strong conventions and high cost for getting it wrong — Dockerfiles, Kubernetes manifests, Terraform modules, CI pipelines, shell hardening, and incident runbooks. A skill encodes your standards so the agent stops generating generic config that doesn't match your environment.

Can a Skill stop Claude from running a risky command?

A skill encodes guidance and procedure, not enforcement. It can tell Claude to always run terraform plan before apply or never delete a namespace without confirmation. For hard enforcement of permissions, use Claude Code's own settings and approval controls — a skill complements them, it doesn't replace them.

Should DevOps skills be project or user scoped?

Infrastructure conventions are usually team-wide, so put them in .claude/skills/ in the repo and commit them. Keep personal helpers — your own shell preferences — in ~/.claude/skills/ so they follow you without leaking into shared repos.

What makes a good description line for a DevOps skill?

Name the file type or tool and the trigger. 'Use when writing or editing Dockerfiles' fires reliably; 'helps with containers' doesn't. Claude matches your request against this text, so be concrete about the artifact and the situation.

Do I need separate skills per tool?

Usually yes. One skill per concern — Docker, Terraform, CI — keeps each trigger distinct and the instructions focused. A single mega-skill for all of DevOps tends to fire at the wrong time or load instructions you don't need.

Sources & further reading

Official vendor documentation referenced while writing this guide.

MG

MCSA Guru Team

IT & Systems Administration

We are working IT pros and system administrators who spend our days in Windows Server, Microsoft 365, and the wider Microsoft stack. MCSA Guru is where we write down the fixes and walkthroughs we wish we had found the first time.

MCSA Guru provides independent, educational IT guidance. Microsoft, Windows, Windows Server, Microsoft 365, Exchange, and Microsoft Teams are trademarks of Microsoft Corporation; Docker is a trademark of Docker, Inc. MCSA Guru is not affiliated with or endorsed by Microsoft or Docker. Always test changes in a safe environment before applying them in production.

Related guides

Fixing something right now?

Jump straight into the guide library or search for the exact error or task you are dealing with.