Custom Kernels for All from Codex and Claude

⚠ Summaries are AI-generated. Please read the original article for full context.

AI Summary

tl;dr: We built an agent skill that teaches coding agents how to write production CUDA kernels. Then we pointed Claude and Codex at two real targets: a diffusers pipeline and a transformers model. The agents produced working kernels for both, with correct PyTorch bindings and benchmarks, end to end.

Read Full Article on HuggingFace ↗