Testing with AI

AI is great at writing tests that always pass, even when the code is broken.

The idea

If you ask an AI to "write tests for this function", it often writes Tautological Tests or Mock-Heavy Tests. These tests look beautiful, can give you 100% coverage, and still verify nothing.

To verify a test is actually protecting you, you must Mutate the Code. If you introduce a deliberate bug into the logic and the test stays green, the test is lying to you.
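
As a rough sketch of what mutating by hand looks like, the snippet below runs two tests against an original function and a deliberately broken copy. The names is_adult, is_adult_mutant, and mutation_verdict are hypothetical, made up for illustration, and not part of any library.

```python
def is_adult(age):
    return age >= 18


def is_adult_mutant(age):
    # Deliberate bug: the boundary moved from >= to >
    return age > 18


def weak_test(fn):
    # Only checks a value far away from the boundary
    return fn(30) is True


def boundary_test(fn):
    # Checks both sides of the boundary
    return fn(18) is True and fn(17) is False


def mutation_verdict(test):
    # A test must pass on the original code before a mutant means anything
    assert test(is_adult), "test must be green on the original code"
    return "survived" if test(is_adult_mutant) else "killed"


print("weak_test:", mutation_verdict(weak_test))
print("boundary_test:", mutation_verdict(boundary_test))
```

A verdict of "survived" means the test stayed green on broken code, which is exactly the lie described above. Mutation-testing tools such as mutmut automate this loop.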

App Code: calc_tax()

def calc_tax(price):
    return price * 0.20  # 20% tax

AI Test 1 (Vacuous)

def test_tax():
    # The AI literally re-implements the logic
    assert calc_tax(100) == 100 * 0.20

PASS

AI Test 2 (Concrete)

def test_tax():
    # The AI asserts against a hardcoded reality
    assert calc_tax(100) == 20.0

PASS

Both tests are green. You have 100% coverage. But are you safe?

How it works (Test Quality)

# The Tautological Test
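# AI often looks at the source code and copies its logic into the
# assertion, so the expected value comes from the implementation, not
# from the requirement. Any bug already in the source is copied into
# the test. It gets worse when the test reuses a constant or helper
# from the code under test: change the tax rate to 15% there and the
# test will STILL pass, because it calculates the expected value
# using the new code!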

# The Mock-Heavy Test
# AI loves to mock everything. It will mock the database, the network,
# and the function itself. It ends up testing that "the mock was called",
# not that the system works.
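
A small sketch of the mock-heavy pattern using unittest.mock from the standard library. It assumes a hypothetical checkout_total caller that takes the tax function as a parameter; neither that function nor its broken twin appears in the original code.

```python
from unittest.mock import MagicMock


def calc_tax(price):
    return price * 0.20  # 20% tax


def checkout_total(price, tax_fn=calc_tax):
    """Hypothetical caller: price plus tax."""
    return price + tax_fn(price)


def checkout_total_broken(price, tax_fn=calc_tax):
    # Deliberate bug: tax is subtracted instead of added
    return price - tax_fn(price)


def mock_heavy_test(checkout):
    # The real calc_tax never runs and the result is never inspected
    fake_tax = MagicMock(return_value=20.0)
    checkout(100, tax_fn=fake_tax)
    fake_tax.assert_called_once_with(100)  # proves only that the mock was called
    return True


def behaviour_test(checkout):
    # Runs the real tax function and checks the observable result
    return checkout(100) == 120.0


for checkout in (checkout_total, checkout_total_broken):
    print(checkout.__name__,
          "mock-heavy:", mock_heavy_test(checkout),
          "behaviour:", behaviour_test(checkout))
```

The mock-heavy test passes for both versions because it only observes the call, while the behaviour test fails on the broken one.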

# How to steer AI testing:
# "Write tests for calc_tax. Use concrete hardcoded values for assertions.
# Do NOT re-implement the logic in the test. Test edge cases (0, negative)."
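
Tests in the shape that prompt asks for might look like the sketch below. The expected values for 0 and for negative prices are assumptions read off the current implementation (tax scales linearly with price). If the real requirement is to reject negative prices, the last test should assert an exception instead.

```python
import math


def calc_tax(price):
    return price * 0.20  # 20% tax


def test_tax_typical():
    # Hardcoded expectation: 20% of 100 is 20
    assert calc_tax(100) == 20.0


def test_tax_zero():
    # Edge case: nothing to tax
    assert calc_tax(0) == 0


def test_tax_fractional_price():
    # Floats rarely compare exactly, so allow a tiny tolerance
    assert math.isclose(calc_tax(19.99), 3.998)


def test_tax_negative():
    # ASSUMPTION: negative prices (e.g. refunds) yield negative tax
    assert calc_tax(-100) == -20.0
```

Every expected value here is typed in by hand, so changing the rate inside calc_tax turns at least one of these tests red.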