A Nature-published study by an international research team has found that current AI benchmarks fail to accurately measure large language models’ core capabilities. Existing tests often mix skills ...
Chinese AI labs are releasing open-weight large language models that rival or surpass leading proprietary systems on key coding benchmarks. Models like Z.ai’s GLM-5.1 and Moonshot AI’s Kimi K2.6 are ...
OpenAI on Monday released a large dataset for evaluating how well large language models answer questions related to health care. Experts lauded the open-source data and detailed evaluation rubrics, ...
A Cairo-based artificial intelligence startup has released Horus 1.0-4B, a fully open-source large language model built in Egypt that outperforms several ...
Chinese artificial intelligence developer DeepSeek today released a new series of open-source large language models. V4, as ...
NEW YORK – Bloomberg today released a research paper detailing the development of BloombergGPT™, a new large-scale generative artificial intelligence (AI) model. This large language model (LLM) has ...