I Automated Sprint Delivery With GUI Automation (No API Costs)
I use a two-pass development workflow: Gemini builds, Opus reviews. Each sprint has a build pass, a review pass, and a retro pass. Across 14 sprints, that's 42 individual workflow executions.
Running them manually means: open Antigravity, select the workflow, wait for it to finish, open the next workflow, wait again. Per sprint, it takes about 5 minutes of babysitting. Across 14 sprints, that's over an hour of clicking and waiting.
So I built a script that does the clicking for me.
Why not use the API?
Antigravity has an API. But I already pay for the Ultra subscription ($30/month) which gives me unlimited runs through the IDE. Using the API would mean paying per token on top of that. For 42 workflow runs per project, the API cost adds up fast.
GUI automation costs nothing extra. I'm using the subscription I already pay for.
How it works
The sprint runner is a bash script that uses macOS Accessibility APIs (via osascript and AppleScript) to control Antigravity:
- Focus the Antigravity window
- Open the command palette
- Type the workflow name (e.g., /sprint-04-build)
- Press Enter
- Wait for completion (watches for the "done" indicator)
- Send a Slack notification
- Move to the next pass (build → review → retro)
- After retro, advance to the next sprint number
run_workflow() {
  local workflow="$1"
  # Focus Antigravity
  osascript -e 'tell application "Antigravity" to activate'
  sleep 1
  # Open command palette and type workflow
  osascript -e 'tell application "System Events" to keystroke "k" using command down'
  sleep 0.5
  osascript -e "tell application \"System Events\" to keystroke \"${workflow}\""
  sleep 0.3
  osascript -e 'tell application "System Events" to key code 36' # Enter
  # Wait for completion (assumes wait_for_done returns non-zero on timeout)
  if ! wait_for_done; then
    send_slack "✗ ${workflow} failed or timed out"
    return 1
  fi
  # Notify
  send_slack "✓ ${workflow} complete"
}
The wait_for_done function polls the Antigravity window content every 10 seconds, looking for completion signals. It times out after 15 minutes per workflow.
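The body of wait_for_done isn't shown above, so here is a minimal sketch of what such a polling loop could look like. It assumes the window text can be read through System Events and that completion shows up as the word "done". The read_window_text helper and the DONE_PATTERN variable are hypothetical names, and the exact UI elements to query will depend on how Antigravity exposes its content.

```bash
#!/usr/bin/env bash
# Sketch only: read_window_text and DONE_PATTERN are illustrative assumptions.

POLL_INTERVAL="${POLL_INTERVAL:-10}"   # seconds between polls
TIMEOUT="${TIMEOUT:-900}"              # 15 minutes per workflow
DONE_PATTERN="${DONE_PATTERN:-done}"   # hypothetical completion marker

# Hypothetical helper: dump the text of the front Antigravity window.
# Which elements actually hold the status text is a guess.
read_window_text() {
  osascript -e 'tell application "System Events" to tell process "Antigravity" to get value of every static text of window 1' 2>/dev/null
}

# Poll until the completion marker appears; return 1 on timeout.
wait_for_done() {
  local elapsed=0
  while [ "$elapsed" -lt "$TIMEOUT" ]; do
    if read_window_text | grep -qi -- "$DONE_PATTERN"; then
      return 0
    fi
    sleep "$POLL_INTERVAL"
    elapsed=$((elapsed + POLL_INTERVAL))
  done
  return 1
}
```

Returning a non-zero status on timeout lets the caller decide whether to stop the chain or retry, rather than reporting success unconditionally.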
The chain
For each sprint, the runner executes three workflows in sequence:
/sprint-04-build → Gemini writes code
/sprint-04-review → Opus reviews and fixes
/sprint-04-retro → Opus categorises failures, updates PATTERNS.md
After retro completes, it increments the sprint number and starts again. You can specify a range:
./sprint-runner.sh --start 1 --end 14
That runs all 42 workflows. I kick it off before bed and check Slack in the morning.
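The build → review → retro chain and the --start/--end range could be driven by something like the sketch below. The function names workflow_name, run_sprints, and main are hypothetical, and the defaults of 1 are arbitrary. It also assumes that run_workflow returns non-zero on failure and that sprint numbers are passed without leading zeros.

```bash
#!/usr/bin/env bash
# Sketch only: function names and defaults are illustrative assumptions.

# Build a workflow name such as /sprint-04-build from a sprint number and a pass.
workflow_name() {
  printf '/sprint-%02d-%s' "$1" "$2"
}

# Run build, review, retro for each sprint in the range; stop at the first failure.
run_sprints() {
  local sprint="$1" end="$2" pass
  while [ "$sprint" -le "$end" ]; do
    for pass in build review retro; do
      run_workflow "$(workflow_name "$sprint" "$pass")" || return 1
    done
    sprint=$((sprint + 1))
  done
}

# Parse --start and --end, then run the range.
main() {
  local start=1 end=1
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --start) start=${2:?--start needs a value}; shift 2 ;;
      --end)   end=${2:?--end needs a value};     shift 2 ;;
      *) echo "unknown option: $1" >&2; return 2 ;;
    esac
  done
  run_sprints "$start" "$end"
}
```

Calling main "$@" at the bottom of the script would wire this to the command line. Stopping at the first failure means a broken build never feeds into a review, and the run can be resumed later with a different --start.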
Slack notifications
Between each stage, the runner posts to Slack:
[10:32 PM] Sprint 04 build complete. 12 files changed. Starting review.
[10:47 PM] Sprint 04 review complete. 3 FAILs found. Starting retro.
[10:49 PM] Sprint 04 retro complete. PATTERNS.md updated. Moving to Sprint 05.
...
[6:14 AM] All 14 sprints complete. 168 files changed. 9 patterns promoted.
When I wake up, I've got a full changelog. If something failed mid-run (build error, timeout, compilation failure), the Slack message says which sprint and stage it died at, so I can resume from that point.
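For the notifications, one plausible shape for send_slack is a POST to a Slack incoming webhook, which accepts a JSON body with a text field. The sketch below assumes the webhook URL lives in an environment variable called SLACK_WEBHOOK_URL. Both that variable name and the slack_payload helper are hypothetical, and only single-line messages are handled.

```bash
#!/usr/bin/env bash
# Sketch only: SLACK_WEBHOOK_URL and slack_payload are illustrative assumptions.

# Build the JSON body, escaping backslashes and double quotes.
# Assumes single-line messages (newlines are not escaped).
slack_payload() {
  local escaped
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}' "$escaped"
}

# Post one message to the webhook; fails if the URL is unset or Slack rejects it.
send_slack() {
  curl --silent --show-error --fail \
    -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "$1")" \
    "${SLACK_WEBHOOK_URL:?set SLACK_WEBHOOK_URL first}"
}
```

A failure message would go through the same function, for example send_slack "Sprint 04 review timed out", which is what makes it possible to tell from Slack alone where to resume.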
The problems
GUI automation is fragile. If macOS shows a notification over the Antigravity window, the click lands on the wrong element. If the window resizes, coordinates shift. If the Mac goes to sleep (display sleep, not actual sleep), the automation pauses.
My fixes:
- Do Not Disturb alone isn't enough; I also use caffeinate to prevent sleep
- Fixed window position and size before starting
- Retry logic: if a workflow doesn't start within 10 seconds, try again
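Two of these fixes can be sketched in a few lines. The helper names pin_window and retry are hypothetical, the window geometry is arbitrary, and how to detect that a workflow "started" is specific to the app, so it is left to whatever command you pass to retry.

```bash
#!/usr/bin/env bash
# Sketch only: pin_window, retry, and the geometry values are illustrative assumptions.

# Put the Antigravity window at a fixed position and size before a run.
pin_window() {
  osascript <<'EOF'
tell application "System Events" to tell process "Antigravity"
  set position of window 1 to {0, 0}
  set size of window 1 to {1440, 900}
end tell
EOF
}

# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# giving up after ATTEMPTS tries and sleeping DELAY seconds between tries.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For the sleep problem, the whole run can be wrapped as caffeinate -d -i ./sprint-runner.sh --start 1 --end 14, where -d prevents display sleep and -i prevents idle system sleep for as long as the script is running. The retry helper would be used as retry 3 10 some_start_check, where some_start_check stands in for whatever command confirms the workflow began.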
Antigravity updates can break it. If the IDE changes its UI layout, the automation scripts need updating. This has happened twice in three months. Each time it was a 10-minute fix to adjust the selectors.
No parallel execution. GUI automation is inherently serial — one workflow at a time. I can't run Sprint 4 build and Sprint 5 build simultaneously because they'd conflict in the IDE. For true parallelism, I'd need the API.
Is it worth it?
For a single project with 14 sprints, it saves about 90 minutes of manual clicking and waiting. More importantly, it runs overnight. I'm not blocked during the day waiting for builds and reviews to complete.
The script itself is about 200 lines of bash. Took an afternoon to write and test. The ROI hit positive after the second project.
There's something satisfying about watching a $30/month IDE subscription do the work of what would otherwise be hundreds of dollars in API calls. It's held together with AppleScript and sleep timers, and it occasionally breaks when macOS decides to be helpful. But it works. And when I wake up to 14 completed sprints and a PATTERNS.md full of new entries, the duct-tape approach feels justified.