Technology · AI · Chips · Innovation

OpenAI Partners with Cerebras for 'Near-Instant' Code Generation with GPT-5.3-Codex-Spark, Diversifying Beyond Nvidia

OpenAI has launched GPT-5.3-Codex-Spark, a coding model designed for near-instantaneous response times, marking the company's first major inference partnership outside its traditional Nvidia-dominated infrastructure. The model runs on hardware from Cerebras Systems, a chipmaker specializing in low-latency AI workloads. The launch comes as OpenAI navigates a complex relationship with Nvidia, faces criticism over ads in ChatGPT, secures a Pentagon contract, and undergoes internal reorganization. An OpenAI spokesperson said GPUs remain foundational, with Cerebras complementing them on workflows that demand extremely low latency, such as real-time coding. Codex-Spark is OpenAI's first model built for real-time coding collaboration, claiming more than 1,000 tokens per second on ultra-low-latency hardware, though the company did not provide specific latency metrics.

VentureBeat

OpenAI on Thursday launched GPT-5.3-Codex-Spark, a stripped-down coding model engineered for near-instantaneous response times. This deployment signifies the company's first significant inference partnership outside its traditional Nvidia-dominated infrastructure. The model operates on hardware provided by Cerebras Systems, a Sunnyvale-based chipmaker renowned for its wafer-scale processors that specialize in low-latency AI workloads.

This partnership emerges at a critical juncture for OpenAI. The company is currently navigating a strained relationship with its long-standing chip supplier, Nvidia. Concurrently, it faces increasing criticism regarding its decision to introduce advertisements into ChatGPT, has recently announced a Pentagon contract, and is experiencing internal organizational upheaval, including the disbandment of a safety-focused team and the resignation of at least one researcher in protest.

An OpenAI spokesperson clarified the strategic importance of this new collaboration to VentureBeat, stating, "GPUs remain foundational across our training and inference pipelines and deliver the most cost effective tokens for broad usage." The spokesperson added, "Cerebras complements that foundation by excelling at workflows that demand extremely low latency, tightening the end-to-end loop so use cases such as real-time coding in Codex feel more responsive as you iterate." This careful articulation, emphasizing the foundational role of GPUs while positioning Cerebras as a complement, highlights OpenAI's delicate balancing act as it diversifies its chip suppliers without alienating Nvidia, which remains the dominant force in AI accelerators.

OpenAI acknowledges that these speed gains come with capability tradeoffs it believes developers will accept. Codex-Spark is presented as OpenAI's first model designed specifically for real-time coding collaboration. The company claims the model can deliver more than 1,000 tokens per second when served on ultra-low-latency hardware. However, OpenAI declined to provide specific latency metrics, such as time-to-first-token figures, saying only that "Codex-Spark is optimized to feel near-instant."
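To put the throughput claim in perspective, here is a back-of-the-envelope sketch of what a constant decode rate of 1,000 tokens per second would mean for streaming a completion. The token counts are illustrative assumptions, and the calculation deliberately ignores time-to-first-token, which OpenAI has not disclosed.

```python
def stream_time_seconds(num_tokens: int, tokens_per_second: float = 1000.0) -> float:
    """Time to stream `num_tokens` at a constant decode rate,
    excluding time-to-first-token (undisclosed)."""
    return num_tokens / tokens_per_second

# Hypothetical workloads: a short inline edit vs. a larger file rewrite.
print(stream_time_seconds(200))   # 0.2 seconds of streaming
print(stream_time_seconds(2000))  # 2.0 seconds of streaming
```

At these rates the streaming portion of a typical inline suggestion stays well under a second, which is consistent with the "near-instant" framing; for real interactions, the undisclosed time-to-first-token would add to these figures.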

Related News

Technology

Seerr: Open-Source Media Request and Discovery Manager for Jellyfin, Plex, and Emby Now Trending on GitHub

Seerr, an open-source media request and discovery manager, has gained attention on GitHub Trending. This tool is designed to integrate with popular media servers such as Jellyfin, Plex, and Emby, providing users with enhanced capabilities for managing and discovering media content. The project is developed by the seerr-team and was published on February 18, 2026.

Technology

Nautilus_Trader: High-Performance Algorithmic Trading Platform and Event-Driven Backtester Trends on GitHub

Nautilus_Trader, developed by nautechsystems, is gaining traction on GitHub Trending as a high-performance algorithmic trading platform. It also features an event-driven backtester, providing a robust solution for developing and testing trading strategies. The project, published on February 18, 2026, is accessible via its GitHub repository.

Technology

gogcli: Command-Line Interface for Google Suite - Manage Gmail, GCal, GDrive, and GContacts from Your Terminal

gogcli is a new command-line interface (CLI) tool designed to bring the power of Google Suite directly to your terminal. Developed by steipete, this utility allows users to manage various Google services, including Gmail, Google Calendar (GCal), Google Drive (GDrive), and Google Contacts (GContacts), all from a unified command-line environment. The project, trending on GitHub, aims to provide a streamlined way to interact with essential Google services without leaving the terminal.