Nobody knows which AI model will give the best result for your prompt until you have prompted them all

Empirical data from Mammouth

Reprompts on LLMs (GPT, Claude, Llama, Mistral & Gemini)

Number of LLMs solicited per prompt    % of total prompts
>= 4                                   7%
>= 3                                   12%
>= 2                                   34%
= 1                                    66%

Reprompts on Image Models (Midjourney, DALL-E 3 & Stable Diffusion)

Number of AI models solicited per prompt    % of total prompts
>= 3                                        19%
>= 2                                        41%
= 1                                         59%

For 66% of daily LLM queries, users solicit one model

For 34% of daily LLM queries, users solicit two or more LLMs

Multi-model prompting is even more popular with image generation tools than with LLMs: 41% of image prompts go to two or more models, versus 34% of LLM prompts
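The shares in the tables are cumulative ("at least k models"), so the share of prompts sent to exactly k models has to be derived by subtraction. Here is a minimal sketch of that arithmetic in Python. The function name `exact_shares` and the dictionary layout are illustrative, not part of any Mammouth tooling, and the ">= 1" row at 100% is an assumption (every prompt reaches at least one model) that matches the 66% + 34% split above.

```python
# Cumulative shares read from the tables: % of prompts sent to AT LEAST k models.
# The k = 1 entry (100%) is an assumption: every prompt reaches at least one model.
LLM_AT_LEAST = {1: 100, 2: 34, 3: 12, 4: 7}
IMAGE_AT_LEAST = {1: 100, 2: 41, 3: 19}


def exact_shares(at_least):
    """Convert 'at least k models' shares into 'exactly k models' shares.

    The largest k stays open-ended ('k or more') because the tables stop there.
    """
    ks = sorted(at_least)
    shares = {}
    # Exactly k = (at least k) - (at least the next threshold).
    for k, next_k in zip(ks, ks[1:]):
        shares[f"= {k}"] = at_least[k] - at_least[next_k]
    shares[f">= {ks[-1]}"] = at_least[ks[-1]]
    return shares


llm_shares = exact_shares(LLM_AT_LEAST)
image_shares = exact_shares(IMAGE_AT_LEAST)

print("LLMs:  ", llm_shares)
print("Images:", image_shares)
```

Read this way, the tables say that 22% of LLM prompts go to exactly two models and 5% to exactly three, while 22% of image prompts go to exactly two models.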

As AI models become more capable, the definition of the best result is becoming more subjective

As LLM performance improves, what differentiates models will progressively shift from objective criteria to subjective ones, which makes multi-LLM prompting even more relevant. That's why we released the LLM Popularity Index at Mammouth.