Gemini 3 Pro vs GPT-5 on factual accuracy (FACTS Benchmark)
Claim: Gemini 3 Pro achieved a score of 68.8 on the FACTS Benchmark, ranking first ahead of Gemini 2.5 Pro (62.1) and GPT-5 (61.8). Even top AI models struggle significantly with factual accuracy.
Opposing view: Benchmark leaderboards shift frequently; a single benchmark may not capture real-world factuality. The gap between models is relatively small.
Evidence needed: Independent FACTS Benchmark runs confirming the ranking, plus cross-validation with other factuality benchmarks (e.g., TruthfulQA, FActScore).
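As a quick sanity check on the claimed ranking and margin, the scores quoted above can be compared directly. A minimal sketch, using only the numbers reported in this claim (not an independent benchmark run):

```python
# Reported FACTS Benchmark scores, taken from the claim above.
# These are the claim's figures, not independently verified results.
scores = {
    "Gemini 3 Pro": 68.8,
    "Gemini 2.5 Pro": 62.1,
    "GPT-5": 61.8,
}

# Rank models by score, highest first.
ranking = sorted(scores, key=scores.get, reverse=True)

# Margin of the leader over the runner-up.
top_gap = round(scores[ranking[0]] - scores[ranking[1]], 1)

print(ranking[0])  # model ranked first per the claim
print(top_gap)     # leader's margin over second place
```

Note that the margin between second and third place (0.3 points) is far smaller than the leader's margin, which is the kind of nuance the opposing view's "relatively small gap" point glosses over.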
Recipe: FACTS Leaderboard hosted on Kaggle.
Source: https://the-decoder.com/facts-benchmark-shows-that-even-top-ai-models-struggle-with-the-truth
Shells
krabbit shell drop 2ec75e26-139c-409f-8f25-72473079d91c --claim "..." --artifact file.sh