Note: This article discusses the idea of supporting AI coding agents with TDD. For a detailed implementation of TDD within a project, you can read this case study.
Introduction
The widespread adoption of AI coding agents has introduced a methodology gap. Whether in established organizations, startups, or hobby projects, developers excited about AI coding agents often overlook the importance of TDD (Test-Driven Development) and other best practices. Given how many of these tools are designed and presented, with chat as the main interface, many users are eager to converse with the AI coding agent as if they were talking to an experienced developer. More often than not, the excitement of using a tool that spins out hundreds of lines of code in seconds is replaced by bewilderment and frustration when the code introduces unexpected bugs.
Three years into the AI era, users keep encountering the same issues: AI coding agents are too eager to change code, they lose context along the way, and they introduce logic that seems superfluous. New tools and new models have done little to remedy the situation. This raises the question: how can we use AI coding agents effectively without falling into the whirlpool of prompting and reprompting?
The Iterative Prompting Trap
Consider a typical interaction pattern:
- Developer: "Create a shopping cart that handles inventory"
- AI coding agent: generates basic implementation
- Developer: "It should validate stock levels before adding items"
- AI coding agent: regenerates, potentially losing previous logic
- Developer: "Your change breaks the checkout flow"
- AI coding agent: previous validations disappear in the new implementation
I would argue that most developers using AI coding agents for the first time have encountered this pattern. Though it seems promising at the start, the more complex a project is, the longer a developer spends inside this loop. The longer the loop continues, the higher the likelihood that the AI coding agent will lose the original context. The resulting code often becomes a patchwork of changes, leading to bugs and unexpected behavior. Furthermore, this interaction presents a cognitive focus trap for the developer. Neither working nor resting, the developer is waiting for the next response from the AI coding agent, which hinders their ability to enter a flow state. More often than not, the developer finally decides that it is a better use of their time to write the code themselves, which defeats the purpose of using an AI coding agent in the first place.
Enter TDD
The Power of Test-Driven Development
Test-Driven Development (henceforth TDD) is a software development methodology in which tests lead the development process.
In a deeper sense, TDD requires developers to capture the business logic and expectations of a system as logical expressions. These expressions serve as a powerful anchor during development, creating a clear contract and a shared language.
Furthermore, it ensures that future contributors to the code can understand the intent and expectations which the code is supposed to fulfill.
When using AI coding agents, TDD can be a powerful ally. It provides a structured approach to development, allowing developers to focus on the business logic and expectations rather than getting lost in the iterative prompting trap.
Using Rules, Instructions and Restrictions
Some agreements and best practices, however, are cumbersome to express within tests. For example, if you want your AI coding agent to follow Domain-Driven Design (DDD) principles, encoding them in tests would be quite cumbersome. Another example is when different systems have different naming conventions. Consider developing in TypeScript or Java, where the convention is camelCase, against a Postgres database, where the convention is snake_case. In these cases, it is better to use rules, instructions, and restrictions to guide the AI coding agent.
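The camelCase/snake_case mismatch typically surfaces at the data-access boundary. A minimal sketch of that boundary (the `toCamel` and `mapRow` helpers are illustrative, not from any library) shows why a rule such as "map between conventions only at the repository layer" is easier to state as an instruction than as a test:

```typescript
// Convert a snake_case key to camelCase, e.g. "user_id" -> "userId".
const toCamel = (s: string): string =>
  s.replace(/_([a-z])/g, (_, c: string) => c.toUpperCase());

// Map a raw snake_case database row onto a camelCase TypeScript shape.
function mapRow<T>(row: Record<string, unknown>): T {
  return Object.fromEntries(
    Object.entries(row).map(([k, v]) => [toCamel(k), v])
  ) as T;
}

interface User {
  userId: number;
  createdAt: string;
}

const row = { user_id: 1, created_at: "2024-01-01" };
const user = mapRow<User>(row);
console.log(user.userId, user.createdAt); // 1 2024-01-01
```

A test could verify this one helper, but it cannot verify that the agent applies the convention consistently across every new file it generates; that is what a rule file is for.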
Modern AI-enabled IDEs such as VS Code (with Copilot) and Cursor allow you to include rules (as they are called in Cursor) or instructions (as they are called in VS Code). You can consider different sets of rules, for example:
- Constants: Defines where and how environment variables and constants are used in the project.
- Testing Instructions: Specifies what depth of testing you expect the AI coding agent to write. Would integration testing be sufficient, or do you want every component to be unit tested?
- Design Patterns: Specifies architectural patterns and conventions for implementing features (e.g., DDD).
- Dev Environment: Gives general context about the development environment, such as the use of Docker and devcontainers.
- Entity Naming: Defines naming conventions across the codebase.
- Project Info: Provides project information, for example a GitHub issue or a JIRA ticket.
- Project Structure: Provides a tree structure of the project, helping AI agents understand where to place new files and how to organize code.
- Test Protection: Prohibits the modification of e2e test files by AI agents, ensuring that tests remain stable and reliable.
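As an illustration, an Entity Naming rule might look like the following. The file path and wording are hypothetical; each tool has its own location and format for rule files, so adapt accordingly:

```markdown
<!-- e.g. a Cursor rule file or a VS Code instructions file; path varies by tool -->
# Entity Naming

- Use camelCase for all TypeScript variables, functions, and object fields.
- Use snake_case for all Postgres table and column names.
- Map between the two conventions only in the repository layer.
- Never rename existing database columns to match TypeScript style.
```

Because the rule file is loaded into every session, the convention survives across prompts without being restated each time.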
The Illusion of Speed
A common argument against TDD is that it slows down development. It is true that in some instances, the attitude of "Move Fast and Break Things" has its merits. For example, when you want to quickly prototype an idea or create a proof of concept, TDD might truly be an example of over-engineering. However, as a project matures, the chunks of broken code accumulate and burden the project. Furthermore, within the context of AI coding agents, developers often find themselves trapped in the iterative prompting loop mentioned above, with no clear constraints on what the AI coding agent should do and no clear success criteria. Therefore, development teams should consider investing time in persisting the project agreements and expectations using tests, rules, and best practices.
The Merit Of Accessibility
It is common for individual contributors or small teams working on a project to overlook testing. A developer working on a project for some time has a mental model of the codebase, and can often navigate it with ease. However, as a project grows and new contributors join, the lack of tests often slows down the onboarding of even experienced developers. Furthermore, when a developer "tests" their work by compiling and running the code, they are often unaware of the functionalities outside of the scope of their task. A properly written test suite, however, provides coverage for those functionalities, ensuring that changes to the codebase do not break existing features. This is especially important when using AI coding agents, as they may introduce changes that are not immediately obvious to the developer.
Conclusion: Moving Beyond Vibe Coding
Though vibe coding has its place and merits, developers encounter diminishing returns over the time they invest in prompting and reprompting their AI coding agents. As a project matures, it can significantly benefit from a structured approach to development reinforced by TDD, rules and best practices. By investing time in refining the expectations and agreements of a project, developers can create a more stable and reliable codebase. Tests in this context are not deadweight on a car, but stabilizers for a well-aimed rocket.