DBRX

Developer(s): MosaicML and Databricks team
Initial release: March 27, 2024
Repository: https://github.com/databricks/dbrx
License: Databricks Open Model License
Website: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

DBRX is an open large language model (LLM) developed by the MosaicML team at Databricks and released on March 27, 2024.[1][2][3] It is a mixture-of-experts (MoE) transformer with 132 billion parameters in total, of which 36 billion (4 of its 16 experts) are active for each token.[4] It was released as both a base foundation model and an instruction-tuned variant.[5]
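DBRX's router implementation is not detailed here, but the "4 of 16 experts active per token" idea can be illustrated with a minimal top-k gating sketch. This is a hypothetical simplification (linear experts, softmax over the selected router scores), not DBRX's actual architecture: only the top-scoring experts run, so most parameters stay inactive for any given token.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, top_k=4):
    """Minimal mixture-of-experts forward pass for one token (illustrative).

    x         : (d,) token hidden state
    gate_w    : (d, n_experts) router weight matrix
    expert_ws : list of n_experts (d, d) expert weight matrices
    Only the top_k experts by router score process the token.
    """
    logits = x @ gate_w                    # one router score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()               # softmax over the selected experts only
    # Weighted sum of the chosen experts' outputs; the other experts are skipped
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
expert_ws = [rng.standard_normal((d, d)) for _ in range(n_experts)]
y = moe_layer(x, gate_w, expert_ws, top_k=4)  # 4 of 16 experts active, as in DBRX
```

Because each token touches only a quarter of the experts, per-token compute scales with the 36B active parameters rather than the full 132B.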

At the time of its release, DBRX outperformed other prominent open models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok on several benchmarks spanning language understanding, programming, and mathematics.[4][6][7]

It was trained over 2.5 months[7] on 3,072 Nvidia H100 GPUs connected by 3.2 terabits per second InfiniBand, at a training cost of roughly $10 million USD.[1]

References