HuggingFace Wed, 18 Feb 2026 16:15:45 GMT by Unknown intermediate

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST

AI Summary

Ayhan Sebin, Saurabh Jha, Rohan Arora, Daby Sow, Mert Cemri, Melissa Pan, Ion Stoica

ITBench HF Space, ITBench HF Dataset, MAST HF Dataset, ITBench GitHub, MAST GitHub

IBM Research and UC Berkeley collaborated to study how agentic LLM systems break in real-world IT automation, for tasks involving incident tri…
