Managing AI Coding Through Agent Evaluation: Lessons from Meituan’s 310,000-Line Code Refactoring Project
Industry News | AI Coding | Refactoring | Software Engineering

The Meituan technical team has introduced a novel approach to managing AI-driven software development by applying Agent evaluation logic to large-scale code refactoring. With AI now capable of generating over 90% of code, the team argues that the primary challenge has shifted from generation speed to the implementation of effective constraints. Without unified standards, AI risks amplifying technical chaos. By refactoring 310,000 lines of code, Meituan demonstrated a framework involving technical debt sorting, rule construction, a standardized Refactoring SOP, and a Pre-PR mechanism. This system transforms high-cost refactoring projects into continuous, daily iterative actions. The practice highlights the necessity of moving beyond simple code generation toward a structured management model that ensures long-term system maintainability in an AI-centric development environment.

Meituan Technical Team

Key Takeaways

  • Constraint Over Speed: In an environment where AI generates more than 90% of the code, the ability to constrain and guide the AI is more critical than the speed of code production.
  • Agent Evaluation Logic: Meituan utilizes an "Agent evaluation" mindset to manage AI coding, ensuring that the AI's output aligns with specific technical standards and architectural requirements.
  • Systematic Framework: The management approach is built on four pillars: technical debt sorting, rule construction, a standardized Refactoring SOP, and a Pre-PR (Pull Request) mechanism.
  • Continuous Refactoring: The methodology turns code refactoring from a high-cost, periodic "special project" into a sustainable, daily iterative process that is part of the standard development lifecycle.

In-Depth Analysis

The Challenge of AI-Generated Chaos

As AI tools become the primary authors of software, to the point where over 90% of a system's code may be AI-generated, the bottleneck in software engineering shifts. The Meituan technical team points out that while AI significantly accelerates development, it can also "amplify chaos many times over" if left unconstrained. When multiple AI agents or human-AI collaborations produce code without a unified set of standards, technical debt can accumulate faster than anyone can review and repay it. The core issue is no longer how fast code can be written, but how effectively the resulting system can be governed and maintained.

The Agent Evaluation Management Framework

To address the risks of unconstrained AI coding, Meituan implemented a strategy based on "Agent evaluation thinking." This approach treats the AI coder as an autonomous agent that must be measured and restricted by a rigorous set of benchmarks. The practice, applied to a massive project involving 310,000 lines of code, relies on several key components:

  1. Technical Debt Sorting: Before refactoring can begin, the system must identify and categorize existing technical debt. This provides a roadmap for the AI to understand which areas of the codebase require the most attention.
  2. Rule Construction: Establishing clear, machine-readable rules is essential. These rules act as the boundaries within which the AI must operate, ensuring that generated code follows specific architectural and stylistic guidelines.
  3. Refactoring SOP (Standard Operating Procedure): By standardizing the steps required for refactoring, the team ensures consistency across different modules and iterations. This SOP guides the AI through the complex process of updating legacy code without introducing new regressions.
  4. Pre-PR Mechanism: The Pre-PR (Pull Request) mechanism serves as a final gatekeeper. It allows for the automated and manual review of AI-generated changes before they are merged into the main codebase, ensuring that every modification meets the established quality bars.
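
To make the rule construction and Pre-PR steps more concrete, the following is a minimal Python sketch, not Meituan's implementation. It assumes rules can be expressed as named checks over source text and that the gate reports both a pass/fail verdict and a simple score, in the spirit of evaluating an agent against a benchmark. The names `Rule`, `RULES`, `GateResult`, and `pre_pr_gate`, and the three example rules, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    """One machine-readable constraint on a proposed change (hypothetical schema)."""
    name: str
    check: Callable[[str], bool]  # returns True when the source passes the rule


# Illustrative rules only; a real rule set would encode the team's own standards.
RULES: List[Rule] = [
    Rule("no-print-debugging",
         lambda src: re.search(r"^\s*print\(", src, re.M) is None),
    Rule("max-line-length-100",
         lambda src: all(len(line) <= 100 for line in src.splitlines())),
    Rule("no-wildcard-imports",
         lambda src: re.search(r"^\s*from\s+\S+\s+import\s+\*", src, re.M) is None),
]


@dataclass
class GateResult:
    passed: bool           # True only when every rule is satisfied
    score: float           # fraction of rules satisfied, between 0.0 and 1.0
    violations: List[str]  # names of the rules that failed, in rule order


def pre_pr_gate(source: str, rules: List[Rule] = RULES) -> GateResult:
    """Evaluate a candidate change against every rule before a PR is opened."""
    violations = [rule.name for rule in rules if not rule.check(source)]
    score = (len(rules) - len(violations)) / len(rules) if rules else 1.0
    return GateResult(passed=not violations, score=score, violations=violations)


if __name__ == "__main__":
    candidate = "from os import *\n\ndef total(xs):\n    print(xs)\n    return sum(xs)\n"
    print(pre_pr_gate(candidate))
```

In this sketch a change that fails any rule is blocked, and the list of violations can be fed back to the AI as the next instruction, which is one plausible way to keep the correction loop automated before a human reviewer is involved.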

From Special Projects to Daily Iterations

One of the most significant outcomes of Meituan’s practice is the transformation of the refactoring process itself. Traditionally, large-scale refactoring (such as a 310,000-line project) is viewed as a high-cost, high-risk "special project" that requires dedicated time and resources. However, by leveraging AI under a structured management framework, Meituan has successfully integrated refactoring into the daily development flow. This shift allows for the continuous improvement of code quality, where technical debt is addressed incrementally during every iteration rather than being allowed to build up until it requires a massive, disruptive intervention.
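
One way to picture "incremental during every iteration" is a small planner that takes a sorted technical debt backlog and picks only what fits into a fixed budget for the current iteration. The sketch below is an assumption about how such planning could work, not a description of Meituan's tooling. The `DebtItem` fields, the severity scale, the file paths, and the greedy severity-per-hour ranking are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DebtItem:
    path: str
    severity: int        # 1 (minor) to 5 (critical); hypothetical scale
    effort_hours: float  # estimated refactoring effort; assumed to be > 0


def plan_iteration(backlog: List[DebtItem], budget_hours: float) -> List[DebtItem]:
    """Greedily pick the items with the best severity-per-hour that fit the budget."""
    ranked = sorted(backlog, key=lambda d: d.severity / d.effort_hours, reverse=True)
    chosen: List[DebtItem] = []
    remaining = budget_hours
    for item in ranked:
        if item.effort_hours <= remaining:
            chosen.append(item)
            remaining -= item.effort_hours
    return chosen


# Made-up backlog used only to illustrate the planner.
BACKLOG = [
    DebtItem("orders/legacy_pricing.py", severity=5, effort_hours=4.0),
    DebtItem("utils/date_helpers.py", severity=2, effort_hours=0.5),
    DebtItem("payments/retry.py", severity=4, effort_hours=2.0),
    DebtItem("search/ranking.py", severity=3, effort_hours=6.0),
]

if __name__ == "__main__":
    for item in plan_iteration(BACKLOG, budget_hours=4.0):
        print(item.path)
```

With a four-hour budget, the planner selects the two cheapest high-value items and leaves the larger ones for later iterations, which is the "small daily slice" behavior the paragraph above describes.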

Industry Impact

The methodology shared by Meituan provides a blueprint for the future of AI-assisted software engineering. As the industry moves toward "AI-native" development, the focus must shift from the tools of generation to the tools of management and evaluation. Meituan's success in refactoring 310,000 lines of code suggests that the role of the human developer is evolving into that of a "system architect" and "rule setter," who defines the constraints within which AI agents operate. This approach not only mitigates the risks of AI-driven technical debt but also sets a new standard for how large-scale enterprise systems can maintain agility and health in the age of automated programming.

Frequently Asked Questions

Question: Why is AI-generated code considered a risk for "amplifying chaos"?

AI can generate code much faster than humans can review it. Without a unified framework or strict rules, different AI prompts or models might produce inconsistent patterns, redundant logic, or architectural violations, leading to a rapid and disorganized accumulation of technical debt.
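
As a small illustration of how "redundant logic" could be surfaced automatically, the sketch below uses Python's standard `ast` module to group functions whose parameters and bodies are structurally identical even though their names differ. This is an illustrative check chosen for this example, not something the Meituan article describes; it only catches exact structural duplicates, and the function names and sample source are hypothetical.

```python
import ast
from collections import defaultdict
from typing import Dict, List


def find_duplicate_functions(source: str) -> List[List[str]]:
    """Return groups of function names whose arguments and bodies are identical."""
    tree = ast.parse(source)
    groups: Dict[str, List[str]] = defaultdict(list)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # ast.dump omits line numbers by default, so position does not matter.
            fingerprint = ast.dump(node.args) + "".join(
                ast.dump(stmt) for stmt in node.body
            )
            groups[fingerprint].append(node.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Made-up sample: two differently named functions with the same logic.
SAMPLE = '''
def calc_total(items):
    return sum(i.price for i in items)

def compute_sum(items):
    return sum(i.price for i in items)

def count(items):
    return len(items)
'''

if __name__ == "__main__":
    print(find_duplicate_functions(SAMPLE))
```

A check like this could be one entry in a team's rule set, flagging the kind of duplication that tends to appear when separate prompts solve the same problem independently.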

Question: What is the significance of the Pre-PR mechanism in AI coding?

The Pre-PR mechanism acts as a critical quality control layer. It ensures that AI-generated refactoring or new code is automatically validated against the project's rules and standards before it ever reaches the human review stage or the main code repository, reducing the burden on human developers and maintaining system integrity.

Question: How does Meituan's approach change the cost of code refactoring?

By using AI guided by SOPs and evaluation rules, refactoring becomes a continuous, automated, or semi-automated task. This removes the need for expensive, dedicated refactoring phases, making code maintenance a low-cost, integrated part of the daily development cycle.
