Benchmark Results - Search News

MUO on MSN

5 Windows benchmark tools that fit on a USB stick and need zero installation

Most are Windows-only, but a couple will follow you to Linux, too.

What AI benchmarks miss about real-world performance

Enterprise AI teams have spent years solving for compute, securing GPU allocations, negotiating cloud capacity, and ...

1mon

Microsoft’s multi-agent AI system tops Anthropic’s Mythos on cybersecurity benchmark

Microsoft's new vulnerability-scanning system, codenamed MDASH, scored 88.45% on the CyberGym benchmark, surpassing single-model systems from Anthropic and OpenAI by using more than 100 specialized AI ...

12d

RTX Spark Beats Apple M5 by 54% in Early Benchmark, Falls Just Short of M5 Pro

NVIDIA RTX Spark beats Apple M5 by 54% in early benchmarks and delivers performance surprisingly close to the M5 Pro.

29d

Microsoft MDASH Beats A Key Mythos Benchmark. Here’s Why That Matters

Microsoft MDASH outperforms Mythos Preview on the CyberGym benchmark, demonstrating improved vulnerability discovery capabilities.

TechCrunch

Benchmark is raising $170M for its latest partners-only fund

While fund sizes of many venture capital firms have ballooned into billions of dollars over the last decade, Benchmark Partners, one of Silicon Valley’s most successful investors, has stuck to raising ...

Tech Times

MiniMax M3 Open-Weight Coding Model: Frontier Claims, Unverified Benchmarks

MiniMax M3 launched June 1, 2026 with a 1-million-token context window and company-reported SWE-Bench Pro scores that edge ...

9d on MSN

Benchmark raises its first-ever growth fund as part of $2B capital raise

The legendary firm abandons its more than 20-year tradition of keeping its funds to about $425 million.

Android Authority

Qualcomm claims to have the fastest smartphone chip ever and here's the evidence

Qualcomm’s next-gen mobile processor is here, and it looks like an absolute unit. The Snapdragon 8 Elite sports a return to a custom CPU design, a brand-new GPU architecture, and even snappier AI ...

MIT Technology Review

How to build a better AI benchmark

To fix the way we test and measure models, AI is learning tricks from social science. It’s not easy being one of Silicon Valley’s favorite benchmarks. SWE-Bench (pronounced “swee bench”) launched in ...

NextBigFuture

XAI Grok 4 Has Leading Benchmarks

xAI Grok 4 benchmarks show it is the leading model. Scores of 35 and 45 for reasoning on Humanity's Last Exam are a big improvement from about 21 for other top models. If these leaked Grok 4 benchmarks are ...
