Cache API - Search News

OpenAI Halves Inference Costs With Software Alone: GPUs Drop to Hundreds

OpenAI engineers cut ChatGPT guest traffic to a few hundred Nvidia GPUs, with no new hardware deployed.

OpenAI inference cost reduction cut ChatGPT guest traffic from tens of thousands of Nvidia GPUs to just a couple hundred, using software optimization alone. Engineers achieved more than 50% savings ...

winbuzzer.com

DeepSeek V4 To Add Peak-Hour Pricing to Its API

DeepSeek will set deepseek-v4-flash compatibility for the deepseek-chat and deepseek-reasoner application programming interface, or API, aliases before July 24 at 15:59 UTC. Around that checkpoint, ...

How I set OpenAI API usage limits to stop agent overspending and other AI billing nightmares

OpenAI API costs can spiral when agents run wild. Here's how to set spend limits, enable hard caps, and avoid surprise AI ...

Meituan open sources LongCat-2.0, the 1.6T, near-frontier agentic coding model that's been leading OpenRouter — trained entirely on Chinese chips

By registering the LongCat-2.0 repository under the open-source MIT License, Meituan positions the architecture with maximum ...

Crypto Briefing

OpenAI cuts inference costs in half with new optimization technique

OpenAI has found a way to reduce its inference costs by roughly 50%, a development that could reshape the economics of running large language models at scale. Inference is the process of actually ...

CCN on MSN

GPT-5.6 is here: OpenAI names its new AI models Sol, Terra and Luna

OpenAI launches GPT-5.6 Sol, Terra, and Luna, with broader ChatGPT and API access arriving in the coming weeks. The preview is ...

Tech Times

Coinbase Cuts AI Spend 50% on Chinese Models: The Legal Risk Its CEO Didn’t Lead With

Enterprise AI cost reduction drove Coinbase to default engineers to Chinese open-weight models GLM 5.2 and Kimi K2.7 Code, ...

Cryptopolitan

Coinbase’s CEO pitches Chinese open weight AI models as solution to rising bills

Coinbase’s CEO has proposed experimenting with cheaper open-weight AI models to keep AI spending in check as token ...

Allegrow Launches Email Verification API to Improve Data Validation and Deliverability

NEW YORK, USA, June 28th, 2026, FinanceWire: Allegrow today announced the launch of its Email Verification API, a ...

Windows Report

OpenAI Announces GPT-5.6 Sol, Terra, and Luna in Limited Preview

OpenAI announced GPT-5.6 Sol, Terra, and Luna in limited preview, with stronger reasoning, new pricing, and broader access coming soon.

NERDBOT

Cached at the Edge: What Global Distribution Means for QR Code Generation Speed

You request a QR code. The server generates it. You wait. That round‑trip latency matters when you are embedding codes in a ...
