Framework for building LLM agent benchmark environments in a Python-centric way.
Crab is a framework designed by Camel AI for building environments for large language model (LLM) agents and for benchmarking agents in those environments. It supports cross-platform environments that can be deployed in memory, in Docker containers, on virtual machines, or across distributed physical machines. Agent environments and actions are defined through an easy-to-use, Python-centric interface, which keeps the framework flexible across use cases. Crab also includes a benchmarking suite that provides fine-grained evaluation metrics.
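
To make the idea of a Python-centric interface concrete, here is a minimal, self-contained sketch of how an action and an in-memory environment might be defined using only the standard library. It does not import Crab and is not Crab's actual API: the names `Action`, `action`, `InMemoryEnvironment`, `register`, and `step` are all hypothetical and chosen for illustration only.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Action:
    """Hypothetical wrapper that pairs a function with metadata an agent can read."""

    name: str
    description: str
    parameters: dict[str, str]
    func: Callable[..., Any]

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.func(*args, **kwargs)


def action(func: Callable[..., Any]) -> Action:
    """Hypothetical decorator: derive the action schema from the signature and docstring."""
    signature = inspect.signature(func)
    parameters = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    return Action(
        name=func.__name__,
        description=inspect.getdoc(func) or "",
        parameters=parameters,
        func=func,
    )


@dataclass
class InMemoryEnvironment:
    """Hypothetical environment that runs actions in the current Python process."""

    name: str
    actions: dict[str, Action] = field(default_factory=dict)

    def register(self, act: Action) -> None:
        self.actions[act.name] = act

    def step(self, action_name: str, **kwargs: Any) -> Any:
        # Look up the action by name and execute it with the given arguments.
        return self.actions[action_name](**kwargs)


@action
def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b


env = InMemoryEnvironment(name="calculator")
env.register(add)
result = env.step("add", a=2, b=3)
print(result)
```

In this sketch the type hints and docstring of a plain Python function double as the action's schema, which is one common way to keep environment definitions close to ordinary Python code. A Docker-hosted or remote environment would presumably swap the in-process call in `step` for a call over some transport, but that part is outside what this example shows.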