
AirDev: How We Taught AI Agents to Ship Production Code

This post was originally published on Airwallex’s Engineering blog.

Agentic coding has rapidly evolved from experimental novelty to essential infrastructure. At Airwallex, we’ve been building AirDev — our internal platform for AI-powered software development — and we’re ready to share what we’ve learned.

This post covers what AirDev is, why we built it, and how we think about AI-assisted development. A follow-up post will dive deeper into the technical architecture.

Why build custom coding agents?

Airwallex powers global payments for businesses worldwide. Our engineering teams maintain a large number of services spanning payment processing, treasury management, compliance systems, and global infrastructure.

Like many fast-growing engineering organizations, we faced the friction of high-stakes monotony: tasks that demand perfect execution but offer no creative problem-solving.

Examples include propagating configuration changes across multiple services, creating new API endpoints based on existing conventions, and replicating infrastructure updates across environments.

While these tasks are critical, they are also repetitive, which makes them perfect candidates for automation.

Off-the-shelf coding assistants help with individual code generation, but they don’t understand our specific patterns, deployment pipelines, or the full context of why a change is needed. We wanted agents that could operate end-to-end: receive a task, explore the codebase, implement a solution, write tests, and open a merge request, all without human intervention.

What is AirDev?

AirDev is our platform for running fully autonomous coding agents. Engineers create a task describing what they need, and AirDev agents handle the rest.

The core workflow:

  1. Task ingestion: Tasks flow in from our project management tools with full business context
  2. Repository discovery: Agents search for and identify repositories relevant to the task
  3. Codebase exploration: Agents clone repositories, analyze existing patterns, and identify relevant files
  4. Task decomposition: For complex tasks, agents break down the work into multiple steps — planning first, then execution — to ensure each step addresses the core problem
  5. Implementation: Agents write code, tests, and configuration following repository conventions
  6. Review pipeline: Changes go through our standard merge request process with human review
  7. Status tracking: Full audit trail of what was changed and why

The key insight: agents operate within the same workflows as human engineers. They follow our commit conventions, branch naming standards, and code review requirements. The merge request from an agent looks like any other MR, because it goes through the same process.
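
For intuition, here is a minimal sketch of what such a pipeline could look like, with the workflow steps marked in comments. Every name in it (the Task shape, the helper functions, the MR URL) is hypothetical; AirDev’s internal APIs are not public.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    # Hypothetical task shape; real AirDev tasks carry richer context.
    id: str
    description: str                          # why the change is needed
    repo_hints: list[str] = field(default_factory=list)

def discover_repositories(task: Task) -> list[str]:
    # 2. Repository discovery: stand-in for searching for relevant repos.
    return task.repo_hints or ["example-service"]

def explore_codebase(repos: list[str]) -> dict[str, list[str]]:
    # 3. Codebase exploration: stand-in for cloning and finding relevant files.
    return {repo: [f"{repo}/src/api.py"] for repo in repos}

def decompose(task: Task) -> list[str]:
    # 4. Task decomposition: plan first, then execute.
    return [f"plan: {task.description}", "implement change", "write tests"]

def implement(step: str, files: dict[str, list[str]]) -> None:
    # 5. Implementation: stand-in for writing code/tests/config to convention.
    print(f"  executing: {step}")

def open_merge_request(task: Task, repos: list[str]) -> str:
    # 6. Review pipeline: the MR enters the same human review flow as any other.
    return f"https://git.example.com/{repos[0]}/-/merge_requests/{task.id}"

def run_task(task: Task) -> str:
    repos = discover_repositories(task)
    files = explore_codebase(repos)
    for step in decompose(task):
        implement(step, files)
    mr_url = open_merge_request(task, repos)
    print(f"audit: {task.id} -> {mr_url}")    # 7. status tracking
    return mr_url

run_task(Task(id="T-1", description="add a currency field to the payout API"))
```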

The impact

The value of AirDev extends far beyond code output. It’s changing how we operate as an engineering organization.

For engineers, it means reclaiming time. Tasks that would typically take several hours — understanding a codebase, implementing changes across multiple services, writing tests, opening merge requests — now complete autonomously. Engineers can redirect that time toward higher-leverage work: system design, complex problem-solving, and mentoring. Cumulatively, that adds up to thousands of engineering hours returned to work that genuinely requires human creativity and judgment.

For the business, the implications are transformative. Features that once required weeks of engineering bandwidth can now move from idea to production in days. Requirements that would have queued behind other priorities get addressed immediately. Infrastructure improvements that teams “never had time for” are now happening continuously.

This isn’t about doing the same things faster; it’s about unlocking capacity that didn’t exist before. Teams can pursue initiatives that previously would have been deprioritized. Product velocity increases without increasing headcount. Technical debt gets addressed in parallel with feature development rather than waiting for dedicated cleanup sprints.

Perhaps most importantly, AirDev changes the economics of software development. The marginal cost of well-defined engineering tasks approaches zero. This fundamentally shifts what’s possible: improvements that weren’t worth the engineering investment suddenly become viable.

We’re still early, but the trajectory is clear. AI agents won’t replace engineers; they’ll amplify what engineering teams can accomplish.

What kinds of tasks work well?

We’ve found that agents excel at certain categories of work:

  • Configuration changes: Updating settings across multiple services consistently
  • API development: Adding new endpoints that follow established patterns
  • Infrastructure automation: Provisioning resources and updating deployment configurations
  • Test coverage: Adding unit tests for existing functionality

The common thread: well-scoped tasks with clear patterns to follow. Agents learn from the existing codebase and replicate what works.

How agents work

Each AirDev agent runs in an isolated environment with access to:

  • Repository tooling: Git operations, file exploration, code search, and language-specific tooling
  • Integration APIs: Project management, documentation, version control systems, and CI/CD pipelines — agents monitor pipeline results and fix issues based on feedback
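
To make that concrete, the tool surface an agent sees could be modeled as a small interface like the sketch below. The method names and signatures are our assumptions, not AirDev’s real API; the point is simply that repository operations and platform integrations are exposed to the model as callable tools.

```python
from typing import Protocol

class AgentTools(Protocol):
    # Repository tooling (illustrative signatures only)
    def clone(self, repo: str) -> str: ...                   # returns a local path
    def search_code(self, repo: str, query: str) -> list[str]: ...
    def read_file(self, path: str) -> str: ...
    def write_file(self, path: str, content: str) -> None: ...

    # Integration APIs (illustrative signatures only)
    def get_task_context(self, task_id: str) -> str: ...     # project management
    def open_merge_request(self, repo: str, branch: str) -> str: ...
    def get_pipeline_status(self, mr_url: str) -> str: ...   # "passed" / "failed"
    def get_pipeline_logs(self, mr_url: str) -> str: ...     # feedback for fixes
```

An agent loop that polls get_pipeline_status and feeds get_pipeline_logs back into the implementation step is one plausible way to realize the monitor-and-fix behavior described above.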

Agents are powered by large language models and operate through a structured prompt system that encodes our engineering standards. When an agent encounters a new repository, it explores the existing patterns before making changes — examining how similar features are implemented, what testing conventions are used, and how configuration is structured.
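
As a deliberately simplified illustration of what a structured prompt system might mean in practice (the section names and ordering below are assumptions, not AirDev’s actual prompts):

```python
def build_prompt(standards: str, repo_patterns: str, task: str) -> str:
    # Assemble the prompt in a fixed order: engineering standards first,
    # then patterns discovered during exploration, then the task itself
    # with its business context (the "why", not just the "what").
    sections = [
        ("Engineering standards", standards),    # commit/branch/review rules
        ("Repository patterns", repo_patterns),  # examples found in the codebase
        ("Task", task),
    ]
    return "\n\n".join(f"## {title}\n\n{body}" for title, body in sections)
```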

The goal is consistency: an agent’s code should be indistinguishable from code written by a human engineer familiar with the repository.

What we’ve learned

Context matters more than instructions. Agents that understand why a change is needed produce better solutions than agents following detailed specifications. Business context leads to better implementation choices.

Patterns beat documentation. Our most successful agents learn from examples in the codebase, not from written guidelines. When adding a new configuration, agents examine how existing configurations are structured and replicate the pattern.
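
As a toy illustration of that idea (the helper below is hypothetical, not AirDev’s code), an agent asked to add a new configuration might first collect a few existing siblings and treat their structure as the spec to replicate:

```python
from pathlib import Path

def collect_examples(repo: Path, pattern: str, limit: int = 3) -> str:
    # Gather a few existing files of the same kind (e.g. "configs/*.yaml")
    # so that their structure, rather than written documentation, defines
    # the pattern the agent replicates.
    examples = sorted(repo.glob(pattern))[:limit]
    return "\n\n".join(
        f"# {path.relative_to(repo)}\n{path.read_text()}" for path in examples
    )
```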

Scope determines success. Agents excel at well-defined tasks: adding fields to data models, implementing API endpoints following existing patterns, enabling configuration options. Vague requests produce vague results.

Human review remains essential. Every merge request goes through human review. Engineers verify the approach, check test coverage, and ensure edge cases are handled. This isn’t a limitation; it’s the design. Agents are collaborators, not replacements.

What’s next

AirDev is under active development. We’re exploring multi-agent coordination for features spanning multiple services, improved context sharing across related tasks, and better feedback loops from code review.

We’re also working on the hard problems: handling ambiguous requirements, recovering gracefully from errors, and knowing when to ask for clarification rather than guessing.

This post is Part 1 of a series on AirDev. Part 2 will cover the technical architecture in detail.

Interested in building the future of financial infrastructure? We’re hiring.
