AI News on June 13, 2026

Meituan Showcases AI Innovations at ACL 2026: Advancing Large Model Evaluation and Inference Optimization
Research Breakthrough

Meituan Showcases AI Innovations at ACL 2026: Advancing Large Model Evaluation and Inference Optimization

Meituan's technical team has announced the acceptance of six research papers at ACL 2026, a premier international conference for computational linguistics and natural language processing. These papers represent significant advancements in the field of AI, covering a diverse range of technical directions including large-scale model evaluation, complex process reasoning, and competition-level mathematical thinking optimization. Additionally, the research explores reinforcement learning optimization and generative recommendation systems. This selection underscores Meituan's strategic focus on building a new paradigm for generative AI, emphasizing both the rigorous assessment of model capabilities and the enhancement of inference efficiency for complex tasks.

美团技术团队
LongCat-Video-Avatar 1.5 Open-Sourced: Advancing Digital Human Video Generation to Commercial-Grade Applications
Open Source

LongCat-Video-Avatar 1.5 Open-Sourced: Advancing Digital Human Video Generation to Commercial-Grade Applications

Meituan's technical team has officially open-sourced LongCat-Video-Avatar 1.5, a significant upgrade designed to bridge the gap between experimental research and commercial-grade digital human applications. This latest version introduces comprehensive improvements in lip-sync accuracy, physical plausibility, and long-video stability. Furthermore, the model now supports multi-person interactions and features optimized inference efficiency. By moving beyond high-fidelity research (SOTA) to a practical, production-ready tool, LongCat-Video-Avatar 1.5 is capable of generating natural, high-quality content even in complex commercial environments. This release marks a transition for digital human technology from controlled experimental settings to diverse, real-world scenarios, offering a robust solution for personalized and scalable video content creation.

美团技术团队
Meituan LongCat Team Releases General 365 Benchmark Revealing Reasoning Gaps in Leading AI Models
Industry News

Meituan LongCat Team Releases General 365 Benchmark Revealing Reasoning Gaps in Leading AI Models

The Meituan LongCat team has officially introduced General 365, a new evaluation benchmark designed to test the reasoning capabilities of large language models. In a recent assessment of 26 mainstream models, the benchmark revealed a significant performance gap across the industry. Gemini 3 Pro, currently identified as the strongest model in the test, achieved an accuracy rate of 62.8%. However, the results indicate a broader struggle within the field, as the vast majority of the 26 models tested failed to reach the 60% accuracy threshold, which is considered the passing mark. This release by Meituan's technical team establishes a new standard for measuring AI reasoning, highlighting that even top-tier models have substantial room for improvement in complex cognitive tasks.

美团技术团队
Managing AI Coding Through Agent Evaluation: A 310,000-Line Code Refactoring Case Study
Industry News

Managing AI Coding Through Agent Evaluation: A 310,000-Line Code Refactoring Case Study

As AI-generated code begins to account for over 90% of system development, the primary challenge shifts from increasing coding speed to managing and constraining AI output. Meituan's technical team has shared a comprehensive practice involving the refactoring of 310,000 lines of code using an 'Agent evaluation' mindset. By implementing a structured framework—including technical debt sorting, rule construction, standardized operating procedures (SOP), and a Pre-PR (Pull Request) mechanism—the team successfully transitioned code refactoring from a high-cost, specialized project into a sustainable, daily iterative process. This approach addresses the risk of AI-driven development amplifying system chaos and emphasizes the necessity of unified standards in the era of AI-native programming.

美团技术团队
LARYBench Released: A New Benchmark Defining the ImageNet for Embodied Action Representation and Generalization
Research Breakthrough

LARYBench Released: A New Benchmark Defining the ImageNet for Embodied Action Representation and Generalization

The Meituan Technical Team has officially introduced LARYBench (Latent Action Representation Yielding Benchmark), a systematic evaluation framework designed to guide the learning of general latent action representations from large-scale visual data. Positioned as the 'ImageNet' for the embodied AI field, LARYBench provides a standardized way to measure how well models can understand and execute actions. The benchmark's initial experimental results reveal a significant shift in AI development: general-purpose vision models consistently outperform specialized embodied AI expert models in both action generalization and control precision. Furthermore, the research confirms that sophisticated embodied action representations can naturally emerge from training on extensive human video datasets, offering a scalable path for future robotic intelligence and autonomous systems.

美团技术团队
Meituan LongCat-AudioDiT: Redefining Zero-Shot Voice Cloning by Eliminating Intermediate Mel-Spectrogram Representations in TTS
Research Breakthrough

Meituan LongCat-AudioDiT: Redefining Zero-Shot Voice Cloning by Eliminating Intermediate Mel-Spectrogram Representations in TTS

Meituan's LongCat team has unveiled LongCat-AudioDiT, a novel model that advances the state of zero-shot Text-to-Speech (TTS) voice cloning. The core innovation lies in its departure from traditional intermediate representations, such as Mel-spectrograms, which often introduce cascade errors during the synthesis process. Instead, LongCat-AudioDiT utilizes a diffusion-based architecture that operates directly within the waveform latent space. By learning the fundamental patterns of sound without intermediate steps, the model aims to achieve higher fidelity and more accurate voice replication. This technical breakthrough addresses long-standing bottlenecks in audio generation, positioning LongCat-AudioDiT as a significant development in the field of AI-driven voice synthesis and zero-shot cloning technology.

美团技术团队
Meituan Technical Team Open-Sources LongCat-Flash-Prover to Advance Rigorous AI Mathematical Theorem Proving
Open Source

Meituan Technical Team Open-Sources LongCat-Flash-Prover to Advance Rigorous AI Mathematical Theorem Proving

Meituan's technical team has announced the open-source release of LongCat-Flash-Prover, a specialized AI model designed for mathematical formalization and theorem proving. Unlike traditional AI models that focus primarily on providing correct numerical answers, LongCat-Flash-Prover addresses the critical need for logical rigor in complex reasoning. Mathematical theorem proving requires an uncompromising logical chain where even minor linguistic ambiguities can invalidate a proof. By transitioning from "guessing answers" to "rigorous proving," this model aims to solve the challenges of complex reasoning in AI. This release marks a significant step in moving AI capabilities beyond simple calculation toward structured, formal mathematical validation, providing the community with a tool dedicated to the strict requirements of formal logic.

美团技术团队
Meituan Open-Sources LongCat-Next: A Native Multimodal Model for Physical World AI Perception
Open Source

Meituan Open-Sources LongCat-Next: A Native Multimodal Model for Physical World AI Perception

Meituan's technical team has officially announced the open-source release of LongCat-Next, a native multimodal model designed to bridge the gap between artificial intelligence and the physical world. By treating vision and speech as "native languages" rather than secondary inputs, LongCat-Next represents a significant step toward embodied intelligence. The release includes the core model and its specialized discrete tokenizer, aimed at providing developers with the tools necessary to build AI systems that can perceive, understand, and interact with real-world environments. This move underscores Meituan's commitment to advancing AI capabilities in physical spaces, offering a foundation for future innovations in how machines interpret and act upon visual and auditory data.

美团技术团队
Meituan BI Evolution: Building a Next-Generation Architecture with Metrics Platforms and Enhanced Calculation Engines
Industry News

Meituan BI Evolution: Building a Next-Generation Architecture with Metrics Platforms and Enhanced Calculation Engines

Meituan's data platform team has pioneered a new generation of Business Intelligence (BI) architecture, placing a centralized metrics platform at its core. This strategic shift addresses critical limitations found in traditional BI systems, which often suffer from inconsistent data definitions—commonly known as "data caliber confusion"—and sluggish query performance when handling personalized datasets. By developing and implementing two primary technical capabilities, automatic semantics and enhanced calculation, Meituan has successfully streamlined its data processing workflows. This evolution marks a significant transition from dataset-driven analytics to a more robust, metrics-centric model, ensuring higher data reliability and faster insights for the organization's diverse business operations. The practice underscores Meituan's commitment to solving complex data engineering challenges through architectural innovation.

美团技术团队
OpenMed: The Rise of Local-First Open Source Medical AI on GitHub
Open Source

OpenMed: The Rise of Local-First Open Source Medical AI on GitHub

OpenMed, a new initiative by developer maziyarpanahi, has emerged as a significant open-source project in the medical AI space. Positioned as a "local-first" solution, OpenMed prioritizes data privacy and decentralized processing, addressing critical concerns in healthcare technology. Recently gaining traction on GitHub Trending, the project represents a shift toward transparent, accessible, and secure AI tools for medical applications. By focusing on local execution, OpenMed aims to provide healthcare professionals with powerful AI capabilities without the inherent privacy risks of cloud-based data transmission. This analysis explores the core philosophy of the project and its potential role in the evolving landscape of open-source healthcare technology.

GitHub Trending
PM-Skills: A Comprehensive Marketplace of Over 100 AI Agent Skills and Plugins for Product Management
Open Source

PM-Skills: A Comprehensive Marketplace of Over 100 AI Agent Skills and Plugins for Product Management

The 'pm-skills' repository, recently trending on GitHub and authored by phuryn, offers a robust marketplace featuring over 100 intelligent agent skills, commands, and plugins specifically designed for product managers. This resource serves as a centralized hub for AI-driven tools that span the entire product development lifecycle, including discovery, strategy, execution, launch, and growth. By providing a diverse array of specialized AI capabilities, the project aims to empower product professionals to automate routine tasks and apply intelligent analysis to complex strategic decisions. As AI continues to reshape the landscape of software development and management, repositories like pm-skills provide the necessary infrastructure for PMs to transition into AI-enhanced workflows, ensuring efficiency and data-driven precision from the initial ideation phase to post-launch scaling.

GitHub Trending
Comprehensive Collection of System Prompts and Models for Leading AI Tools Surfaces on GitHub
Industry News

Comprehensive Collection of System Prompts and Models for Leading AI Tools Surfaces on GitHub

A significant new repository titled 'system-prompts-and-models-of-ai-tools' has emerged on GitHub, curated by user x1xhlol. This project serves as a centralized documentation hub for the system prompts and underlying model configurations of a vast array of prominent AI applications. The collection includes high-profile tools such as Cursor, Devin AI, Perplexity, and NotionAI, alongside specialized development environments like Augment Code, Windsurf, and Replit. By aggregating the operational logic and instructional frameworks for both proprietary and open-source AI systems—including v0, Claude Code, and VSCode Agent—the repository provides a rare look into the prompt engineering strategies that drive modern AI-assisted coding, search, and productivity platforms. This release highlights a growing trend toward transparency and community-driven analysis within the AI development ecosystem.

GitHub Trending
NVIDIA Introduces SkillSpector: A Dedicated Security Scanner for AI Agent Skills and Vulnerability Detection
Open Source

NVIDIA Introduces SkillSpector: A Dedicated Security Scanner for AI Agent Skills and Vulnerability Detection

NVIDIA has unveiled SkillSpector, a specialized security tool designed to scan and secure AI agent skills. As autonomous AI agents increasingly rely on modular 'skills' to perform complex tasks, the potential for security breaches grows. SkillSpector addresses this by identifying vulnerabilities, malicious patterns, and inherent security risks within these agentic capabilities. By providing a dedicated scanner, NVIDIA aims to bolster the safety and reliability of AI-driven workflows. This release highlights a critical shift toward proactive security in the AI ecosystem, ensuring that the tools agents use do not become vectors for attacks. The tool is positioned as an essential resource for developers looking to audit the integrity of their AI agents before deployment in sensitive or production environments.

GitHub Trending
Meta's New AI Unit Faces Internal Turmoil as Engineers Describe Working Conditions as Soul-Crushing
Industry News

Meta's New AI Unit Faces Internal Turmoil as Engineers Describe Working Conditions as Soul-Crushing

A recent report from TechCrunch AI reveals significant internal distress within Meta's newly formed AI division. The unit, which was established only months ago and currently employs approximately 6,500 people, is reportedly on the brink of a revolt. Engineering staff within the organization have characterized the work environment in extreme terms, describing it as a "soul-crushing gulag." This development suggests a deep-seated cultural or operational crisis within one of the tech industry's most critical AI initiatives. As Meta continues to scale its artificial intelligence capabilities, the reported dissatisfaction among its massive engineering workforce highlights potential challenges in management and employee retention during rapid organizational expansion.

TechCrunch AI
Autonomous AI Agent Discovers 21 Zero-Day Vulnerabilities in FFmpeg Media Library Following Google and Anthropic Audits
Research Breakthrough

Autonomous AI Agent Discovers 21 Zero-Day Vulnerabilities in FFmpeg Media Library Following Google and Anthropic Audits

A production autonomous security agent developed by depthfirst has identified 21 previously unknown zero-day vulnerabilities within FFmpeg, a critical media processing library used globally. This discovery follows recent security analyses by Google’s Big Sleep team and Anthropic’s Mythos model. The depthfirst agent not only identified these flaws—some of which have existed in the codebase for up to 20 years—but also produced concrete, reproducible Proof of Concept (PoC) inputs and demonstrated a Remote Code Execution (RCE) exploit primitive. Operating at a significantly lower cost than traditional methods ($1,000 vs. $10,000), this breakthrough highlights the increasing capability of AI-driven security systems to audit complex, hardened C codebases that underpin modern digital infrastructure.

Hacker News
NVIDIA Blackwell Ultra NVL72 Sets Performance Record in Industry-First Agentic AI Benchmark AgentPerf
Industry News

NVIDIA Blackwell Ultra NVL72 Sets Performance Record in Industry-First Agentic AI Benchmark AgentPerf

NVIDIA has announced that its Blackwell Ultra NVL72 platform has secured a leading position in the inaugural AgentPerf benchmark, the industry's first standardized test for agentic AI infrastructure. Developed by Artificial Analysis, AgentPerf provides a comprehensive framework for developers, enterprises, and infrastructure providers to compare system performance across agentic AI workloads. In the first round of published results, the NVIDIA Blackwell Ultra NVL72 demonstrated exceptional efficiency, running 20x more agents per megawatt compared to previous NVIDIA systems. This benchmark marks a significant milestone in AI infrastructure evaluation, offering a clear metric for power efficiency and throughput as the industry shifts toward autonomous agentic applications.

NVIDIA Newsroom
Google Files Lawsuit Against Chinese Cybercrime Group Outsider Enterprise for AI-Driven Scam Campaign
Industry News

Google Files Lawsuit Against Chinese Cybercrime Group Outsider Enterprise for AI-Driven Scam Campaign

Google has initiated legal action against a Chinese cybercrime organization known as "Outsider Enterprise." The group is accused of leveraging artificial intelligence to orchestrate a massive scam campaign that targeted hundreds of thousands of individuals. According to the tech giant, the operation was highly efficient, managing to dispatch approximately 2.5 million fraudulent text messages within a brief two-week window. This lawsuit highlights the growing concern over the use of AI in cybercriminal activities and Google's proactive stance in combating large-scale digital fraud. The case underscores the scale at which modern cybercrime operations can function when utilizing automated technologies to reach a vast audience in a short period, marking a significant legal confrontation in the realm of AI-enhanced security threats.

TechCrunch AI
Google Research Explores AI Integration to Enhance User Understanding of Various Skin Conditions
Industry News

Google Research Explores AI Integration to Enhance User Understanding of Various Skin Conditions

Google Research has announced an investigation into the role of artificial intelligence in assisting users with the understanding of skin conditions. Categorized under Health & Bioscience, this research initiative focuses on bridging the gap between complex dermatological information and user-centric health literacy. By exploring how AI can interpret and present data regarding skin health, the project aims to empower individuals with clearer insights into their conditions. While the research is ongoing, the focus remains on the potential for AI to serve as a supportive educational tool within the bioscience sector, highlighting a significant step toward integrating advanced computational models into personal health management and dermatological awareness.

Google Research Blog
Mistral Rumored to Raise €3 Billion at €20 Billion Valuation as AI Competition Intensifies
Funding

Mistral Rumored to Raise €3 Billion at €20 Billion Valuation as AI Competition Intensifies

French artificial intelligence startup Mistral is reportedly in discussions to raise €3 billion in a new funding round. This significant capital injection is expected to value the company at approximately €20 billion (roughly $23.15 billion). If finalized, this valuation would represent a near doubling of the company's previous Series C valuation, which stood at €11.7 billion. The rumored deal highlights the massive investor appetite for high-growth AI firms and positions Mistral as a primary European competitor in the global large language model market. The move underscores the escalating costs and capital requirements necessary to compete at the highest levels of generative AI development.

TechCrunch AI
High-Performance Local Coding Agent on macOS: Leveraging Gemma 4 and Multi-Token Prediction
Industry News

High-Performance Local Coding Agent on macOS: Leveraging Gemma 4 and Multi-Token Prediction

This technical analysis details the successful implementation of a high-speed local coding agent on macOS, specifically utilizing the Gemma 4 26B-A4B model. By integrating llama.cpp with Metal acceleration and the new Multi-Token Prediction (MTP) update, the setup achieves usable real-time performance on an Apple M1 Max. The configuration addresses common developer pain points such as internet reliability and the need for multimodal capabilities, allowing the agent to process screenshots of its own output. With a generation speed of approximately 58.2 tokens per second in baseline tests and significant gains from speculative decoding via an MTP draft model, this setup provides a robust, OpenAI-compatible local alternative for intensive coding tasks and tool-based agent workflows.

Hacker News
The Evolution of Siri: From 'Utterly Disastrous' to a Competitive AI Assistant
Industry News

The Evolution of Siri: From 'Utterly Disastrous' to a Competitive AI Assistant

For over fifteen years, Apple's Siri has occupied a precarious position in the tech world, fluctuating between being marginally useful and functionally unreliable. Users have long expressed frustration over its inability to perform even the most basic tasks, such as setting timers. However, a significant turning point has arrived. According to a recent report by David Pierce for The Verge, Apple has released a new version of Siri that marks a radical departure from its troubled past. This update suggests a major overhaul in Siri's capabilities, potentially transforming it into the high-performing AI assistant users have expected for over a decade. The analysis explores the historical context of Siri's failures and the implications of this 'wild' new version that aims to finally make Siri 'good.'

The Verge
SpaceX, Anthropic, and OpenAI’s Hot IPO Summer: The Rise of the MANGOS Era
Industry News

SpaceX, Anthropic, and OpenAI’s Hot IPO Summer: The Rise of the MANGOS Era

The financial landscape is witnessing a seismic shift as the traditional FAANG dominance yields to a new powerhouse collective known as MANGOS. Comprising Meta (or Microsoft), Anthropic, Nvidia, Google, OpenAI, and SpaceX, this group represents the new vanguard of technological and economic influence. As the IPO market returns to vibrancy in mid-2026, three of these titans—SpaceX, Anthropic, and OpenAI—are preparing for simultaneous public debuts. This concentrated window of initial public offerings serves as a critical stress test for global investors and market valuations. The transition highlights a broader evolution in the tech sector, moving from social media and consumer electronics toward a future defined by artificial intelligence and aerospace exploration.

TechCrunch AI