Vectara
Hallucination

HHEM | Flash Update: Google Gemma

See how Google Gemma’s hallucination rate compares with other foundation models on the Hughes Hallucination Evaluation Model (HHEM) leaderboard

3 minutes read

Following the recent release of Gemini 1.5, Google has just unveiled its latest contribution to the landscape of large language models (LLMs): Gemma, a family of open models available in 2B and 7B sizes, each with pretrained and instruction-tuned variants. Using the open-source Hughes Hallucination Evaluation Model (HHEM), we quantify Gemma’s tendency to hallucinate when summarizing a set of facts, a key benchmark for applications built on the Retrieval-Augmented Generation (RAG) architecture.

Our updated leaderboard gives Gemma a hallucination rate of 7.5%, on par with Cohere’s Chat model and right below Llama2 13B. That is 1.9 percentage points better than Mistral 7B’s 9.4% rate, yet 1.9 points short of Llama2 7B’s 5.6%. Gemma’s answer rate of 100.0% also stands out: it produced a summary for every document in the benchmark rather than declining, which supports its suitability as a summarizer.
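To make the two headline numbers concrete, here is a minimal Python sketch of how an answer rate and a hallucination rate could be derived from per-summary consistency scores. It is an illustration under stated assumptions, not the leaderboard's actual pipeline: the `SummaryResult` record, the `leaderboard_metrics` helper, the 0.5 cutoff, and the choice to compute the hallucination rate over answered prompts only are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SummaryResult:
    """One model response to a summarization prompt (hypothetical record type)."""

    # Factual-consistency score in [0, 1] from an evaluator such as HHEM.
    # None means the model declined to produce a summary.
    score: Optional[float]


def leaderboard_metrics(results: Sequence[SummaryResult], threshold: float = 0.5) -> dict:
    """Return the answer rate and hallucination rate as percentages.

    Assumptions (not taken from the article): a summary counts as
    hallucinated when its score is below `threshold`, and the
    hallucination rate is computed over answered prompts only.
    """
    if not results:
        raise ValueError("results must not be empty")

    # Keep only the scores of prompts the model actually answered.
    answered = [r.score for r in results if r.score is not None]
    answer_rate = 100.0 * len(answered) / len(results)

    if answered:
        hallucinated = sum(1 for s in answered if s < threshold)
        hallucination_rate = 100.0 * hallucinated / len(answered)
    else:
        # No summaries were produced, so the rate is undefined.
        hallucination_rate = None

    return {"answer_rate": answer_rate, "hallucination_rate": hallucination_rate}


# Made-up scores for illustration: four answered prompts and one refusal.
example = [SummaryResult(0.9), SummaryResult(0.2), SummaryResult(0.7),
           SummaryResult(0.8), SummaryResult(None)]
metrics = leaderboard_metrics(example)
```

Under these definitions, an answer rate of 100.0% means that no prompt was declined, so the hallucination rate reflects every document in the evaluation set.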

Moreover, the release of Gemma under the liberal “Gemma Terms of Use” is a strategic move by Google to facilitate easier integration into commercial and enterprise systems. This decision mirrors Microsoft’s earlier move with Phi 2, underscoring a significant shift towards open licensing in the LLM domain.

The table below, which reproduces the leaderboard as of February 21, 2024, shows Gemma’s performance in relation to other foundation models:

[Image: HHEM leaderboard table including Gemma]

The implications of Gemma’s release echo the sentiment of our previous analysis: proprietary LLM vendors are under increasing pressure to innovate and adjust their pricing strategies. With Gemma, the message is clear: the race for efficiency, performance, and accessibility in LLMs is heating up, and end users stand to gain the most.

Get the HHEM on HuggingFace