Model Inference API - Search News

DeepSeek open sources DSpark, a new framework to speed up LLM inference by up to 85%

DSpark can make decoding faster, but acceptance quality still determines how much speed the system actually realizes.

Meituan open sources LongCat-2.0, the 1.6T, near-frontier agentic coding model that's been leading OpenRouter — trained entirely on Chinese chips

By registering the LongCat-2.0 repository under the open-source MIT License, Meituan positions the architecture with maximum ...

Tech Bytes: OpenAI and Broadcom unveil Jalapeño inference chip to power next wave of LLMs

The chip has been designed specifically for large language model inference — the stage where trained AI models generate ...

Tech Times

OpenAI’s First Custom AI Chip Targets 50% Cheaper Inference: Jalapeño Unveiled

OpenAI’s first custom AI chip Jalapeño was unveiled today in partnership with Broadcom, claiming roughly 50% lower inference ...

Tech Times

AI Inference and World Model Startups Pull $1.8B in Two Days as Foundation Models Commoditize

AI inference infrastructure investment pulled $1.8 billion in 48 hours as Baseten’s $1.5B round at a $13B valuation and ...

XMax Announces Up to Approximately US$25 million in AI API-Related Service Contracts and Expansion into GPU-as-a-Service

XMax Inc. (Nasdaq: XMAX) ("XMax" or the "Company") today announced a significant commercial milestone in its artificial ...

Fortune India

OpenAI, Broadcom unveil first custom AI inference chip; target deployment by end-2026 after nine-month development cycle

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the ...

China's GLM-5.2 AI model is winning over US firms, Sridhar Vembu explains why

Chinese AI model GLM-5.2 is rapidly gaining attention beyond its home market, with major US technology companies beginning to ...

The Most Expensive Part Of AI Might Not Be The Model

This matters because AI usage is growing fast. Goldman Sachs estimated that global AI infrastructure spending could reach ...

TechFinancials on MSN

OpenAI Debuts First Custom AI Chip, Built By Broadcom

OpenAI and Broadcom today unveiled Jalapeño, OpenAI’s first Intelligence Processor: an accelerator architected around ...

Upbound Launches Modelplane: The Open Source Control Plane for AI Inference

AI inference is undergoing the same transformation that cloud infrastructure experienced a decade ago. Open-weight models have expanded who runs AI — neoclouds, regulated enterprises, and AI-native ...

The Silicon Review

Best Infrastructure Setups for Running AI Models in Production

Learn how to build reliable infrastructure for AI models in production, including hosting, monitoring, containers, scaling, ...
