6 days ago
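High schooler uses Minecraft to evaluate SOTA models! Claude leads for now, with DeepSeek close behind
AI keeps setting new benchmark records, yet it cannot reliably count how many times the letter "r" appears in "strawberry", and it repeatedly fails questions that humans find simple. This contrast has spurred the rise of creative evaluations, such as MC-Bench, developed by a high school student, which assesses AI capabilities through a Minecraft block-building "arena" mode. This new evaluation paradigm may ...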