
Build a Match 3 Arcade Game Using Gemini CLI

Daniela Petruzalek
Developer Relations Engineer at Google

The whole reason I became a software developer is that I loved video games as a child. I spent countless hours playing them, and I was deeply intrigued by how they were built. My father tried his best to explain how TVs and computers worked, but it never really sank in.

It was only during my teenage years, when we finally got access to the internet, that I started understanding a bit more. While normal teenagers were filling chat rooms, chasing people on ICQ, and working on their Orkut profiles, I was researching game development tutorials. Those were the good days.

Years went by and I never became a professional game developer. My career took me towards databases, data engineering, backend services, and the cloud. I do not regret my choices. Still, every now and then I wonder how it would feel to build my own indie game.

Guess what? With the rise of agentic coding, building complex applications — including games — has become so accessible that we don’t need to wonder anymore. We can build a fully functioning, cloud-deployed game today, as I am about to show you.

There are two ways you can read this article: as an aspiring game developer wanting to experiment with GenAI, or as a professional developer using game development as a fun way to learn new agentic coding skills. Whichever path you choose, during the course of this article I am going to show you two specific Gemini CLI features: plan mode and sub-agents. But before that, let’s talk a bit about technology.

How to choose the right technology for your project

This has always been an important decision for every software team. Should we use the tools we are familiar with? Should we follow new market trends? Should we build our own? Large companies usually stick to the tools they already know. To justify a change, you need a very compelling reason. That reason might come from the outside, like shifting market costs or a lack of available talent. Or it might come from the inside, such as the high cost of retraining your team to support a new stack.

Agentic coding completely shifts this dynamic. Because AI can handle the boilerplate, your choice of programming language matters much less today than your overall system architecture. For us developers, this is a huge relief. We can switch tech stacks to fit the problem without spending months learning new syntax.

You might be wondering, when the language loses importance, what stays? My answer is: the patterns. The way we structure software, not as a silo, but as a collective of systems. This works at both the macro level (system design) and the micro level (program design). You don’t need to know what every single line of code does, but you do need to know how the different pieces of your software interact, and how to steer the agent toward the correct implementation.

Does this mean we can go back to writing everything in BASIC? No, because a language is never a choice on its own. A language brings with it a specific set of features and an entire ecosystem. We are still bound to choose the technology that best fits what we are trying to achieve. The one criterion that no longer matters is how fluent the team already is in the language. Modern coding agents can easily close that gap, as long as the team has strong software engineering fundamentals.

While one criterion goes out of fashion, new ones appear. In this case, we are going to pay close attention to how easy it is for the coding agent to generate high-quality software in the target language.

For this particular project I chose Go for two main reasons: it is a lightweight language that coding agents handle well (my godoctor MCP also helps!), and it has a mature open source game development ecosystem around Ebitengine.

Could I have done it in Three.js? Yes. However, I really wanted to get as close to the console / arcade game experience as possible, so a compiled game was a must for me. Also, I only care about 2D, so there is no need for big engines like Unity or Unreal. Finally, Ebitengine has commercial games published on the Nintendo store (for Nintendo Switch), which feeds into my dream of eventually publishing a game (of course, not this one).

A word on the strengths of Go: being a compiled language, it helps us catch a great deal of errors earlier in the development process. Python has similar game development capabilities, but because it is interpreted, many errors only surface at run time, which slows down my test cycle. Additionally, Go can be compiled natively for your local machine, or to WebAssembly (WASM) for the web. This means I can also deploy my game as a web service with a few small changes.
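To make the native versus WASM point concrete, here is a minimal sketch of how the same Go source can tell which target it was compiled for. The function name describeTarget and the output file names in the comments are my own assumptions for illustration; the GOOS=js GOARCH=wasm pair is the standard Go toolchain setting for browser builds.

```go
package main

import (
	"fmt"
	"runtime"
)

// describeTarget names the platform a binary was compiled for.
// (Hypothetical helper, not part of the Cloud Crush code.)
func describeTarget(goos, goarch string) string {
	if goos == "js" && goarch == "wasm" {
		return "web (WebAssembly)"
	}
	return "native (" + goos + "/" + goarch + ")"
}

func main() {
	// Native build (output name is an assumption):
	//   go build -o cloudcrush .
	// Web build, same source, different target:
	//   GOOS=js GOARCH=wasm go build -o cloudcrush.wasm .
	fmt.Println("running as", describeTarget(runtime.GOOS, runtime.GOARCH))
}
```

In practice, platform-specific differences are usually kept small and isolated like this, so the bulk of the game code stays identical across both builds.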

The return of the software analyst

While the agent is doing the heavy lifting of writing the Go code and compiling both the server and the WASM binaries, we still have strict responsibilities when it comes to design.

Software engineering is shifting. We are spending less time worrying about clever syntax and more time thinking about high-level patterns.

In a way, it feels like we are returning to the era of the old-school ‘Software Analyst’. Instead of writing every line manually, our main job is translating human requirements into a precise set of instructions so the AI can write the actual code.

I don’t have game development experience per se, but as a gamer and an enthusiast I am familiar with the domain language needed to describe what I want to achieve with my game. By grounding my prompt in certain keywords (for example, arcade game, match 3) or in well-known examples (for example, “I need a track inspired by the 16-bit and 32-bit generations of puzzle games, but with a modern twist”), I can communicate my intentions to the agent better than someone trying to build a game with absolutely no gaming experience.

I’m just leaving this here to highlight a point. Even if coding itself becomes a secondary skill, the ability to describe patterns and features is still a critical software engineering skill. You must know the domain language of your field, be it backend, frontend or anything in between.

Moving from design to implementation with plan mode

Domain language is a start, but writing the perfect one-shot prompt is rarely feasible. In developer relations, we use one-shot prompts all the time in demos and presentations, but what we usually don’t mention is that we often spend hours refining such a prompt before showing it to the public.

Crafting the perfect prompt is a mixture of art and science, and even if you have a deep understanding of the domain language there will always be gaps. Luckily, outside the world of demos and presentations we don’t need to one-shot anything. We also don’t need to work on the prompts alone, as agents can help us with them. This is where plan mode comes in.

In plan mode, Gemini CLI first draws up an implementation plan before writing any code. This gives you an opportunity to go back and forth with the agent, refining the plan and ensuring the implementation is heading in the direction you want.

In a normal conversation, the agent might ask to enter plan mode based on the conversation flow (for example, in response to a prompt that includes a phrase like “let’s make a plan”). If you don’t want to rely on the agent to decide when to enter plan mode, you can always toggle it manually with the /plan command.

In plan mode, the agent not only draws up an implementation plan based on your request but might also ask one or more clarifying questions using the ask_user tool. When the plan is ready, it asks you to review it and gives you the opportunity to steer the plan in any direction, including correcting assumptions and adding or removing features.

For example, a fairly polished, but far from perfect, prompt for my Match 3 game is shown below:

Build a Match-3 game called 'Cloud Crush' in Go using Ebitengine v2.
The entire game screen should have background.png as background.
The play area should be an 8x8 grid with white background. 
On the right side of the play area include a side panel with UI elements 
like player score and how to play instructions.
The side panel should have a solid background colour to help with readability of the UI.

Use standard GCP product logos (e.g. Compute Engine, Cloud Storage, BigQuery, etc.)
as the game gems. These logos are provided in the gcp_sprites.png file.

The logos are saved as 64x64 sprites but scale them as necessary
based on the screen resolution. Implement swapping, clearing 3+ gems, and gravity.

Use ebitengine native font rendering (size 48 for titles and size
24 for normal text) for all text and not the debug print.

The font should be monospaced (golang.org/x/image/font/gofont/gomono).
Keep the UI tidy and harmonic, e.g. centered text should always be
adjusted based on text length, not just guess based on estimates.

While this prompt covers many aspects of the game, it is common for the agent to ask for more details like “what should be the screen resolution” or “would you prefer smooth or static animations”.
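The three mechanics named in the prompt (swapping, clearing 3+ gems, and gravity) are simple enough to sketch without any rendering code. The following is an illustrative sketch, not the code the agent generated for Cloud Crush: the Board type, the function names, and the rule that a swap is only kept when it creates a match are all my assumptions, and callers are assumed to pass in-range coordinates.

```go
package main

import "fmt"

const size = 8 // 8x8 play area, as in the prompt

// Board holds one gem ID per cell; 0 means the cell is empty. Row 0 is the top.
type Board [size][size]int

// findMatches marks every cell that belongs to a horizontal or vertical
// run of three or more identical, non-empty gems.
func findMatches(b *Board) [size][size]bool {
	var hit [size][size]bool
	for r := 0; r < size; r++ {
		for c := 0; c < size; c++ {
			g := b[r][c]
			if g == 0 {
				continue
			}
			// Window of three to the right of (r, c).
			if c+2 < size && b[r][c+1] == g && b[r][c+2] == g {
				hit[r][c], hit[r][c+1], hit[r][c+2] = true, true, true
			}
			// Window of three below (r, c).
			if r+2 < size && b[r+1][c] == g && b[r+2][c] == g {
				hit[r][c], hit[r+1][c], hit[r+2][c] = true, true, true
			}
		}
	}
	return hit
}

// clearMatches empties every marked cell and returns how many were cleared.
func clearMatches(b *Board, hit [size][size]bool) int {
	n := 0
	for r := range hit {
		for c := range hit[r] {
			if hit[r][c] {
				b[r][c] = 0
				n++
			}
		}
	}
	return n
}

// applyGravity lets gems fall straight down into empty cells, column by column.
func applyGravity(b *Board) {
	for c := 0; c < size; c++ {
		write := size - 1 // next row to fill, starting from the bottom
		for r := size - 1; r >= 0; r-- {
			if b[r][c] == 0 {
				continue
			}
			b[write][c] = b[r][c]
			if write != r {
				b[r][c] = 0
			}
			write--
		}
	}
}

// trySwap exchanges two orthogonally adjacent gems and keeps the swap only
// if it produces at least one match; otherwise the board is restored.
func trySwap(b *Board, r1, c1, r2, c2 int) bool {
	dr, dc := r1-r2, c1-c2
	if dr*dr+dc*dc != 1 { // not direct neighbours
		return false
	}
	b[r1][c1], b[r2][c2] = b[r2][c2], b[r1][c1]
	hit := findMatches(b)
	for r := range hit {
		for c := range hit[r] {
			if hit[r][c] {
				return true
			}
		}
	}
	b[r1][c1], b[r2][c2] = b[r2][c2], b[r1][c1] // undo: no match created
	return false
}

func main() {
	var b Board
	b[7] = [size]int{1, 1, 2, 1} // bottom row, remaining cells empty
	b[6][2] = 1                  // a gem 1 sitting above the 2
	if trySwap(&b, 7, 2, 6, 2) {
		cleared := clearMatches(&b, findMatches(&b))
		applyGravity(&b)
		fmt.Println("cleared:", cleared, "bottom row:", b[7])
	}
}
```

Keeping the rules in plain functions like these, separate from drawing code, also makes them easy to unit test. A full game would additionally refill the empty cells at the top and repeat the clear and gravity steps until no matches remain.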

Once we are happy with the level of detail in the plan, we can ask the agent to start building, which exits plan mode. This part is no different from any typical coding task. After a few turns we should have a game running similar to this:

[Image: Cloud Crush gameplay screenshot]

Automating web testing with the browser agent

One of the hardest things to do in game development is testing. You cannot realistically write standard unit tests to verify every possible game state, or to check that your rendering functions draw the right elements on the screen. You could try, but I guarantee it would be a tedious, brittle, and time-consuming process.

This doesn’t mean we shouldn’t write automated tests at all, but there is a line between what should be tested with pure code and what needs human playtesting. For example, unit testing algorithms like collision detection and pathfinding seems fine to me, but validating your UI across different resolutions might be better done by a human (for example, how do you unit test “is this font readable?”).

Or, at least, it was until now… a sub-agent enters the chat

With frontier models’ multimodal capabilities and some clever use of agents, we can actually automate visual verification. In the Gemini CLI, a sub-agent is a specialized persona that runs independently of the main conversation in its own context window. Sub-agents can be used to add all sorts of capabilities to your base coding workflow.

In our testing scenario, we can use an experimental agent bundled with the CLI called @browser_agent. Because it is experimental, you need to enable it manually by editing your settings.json file. For example, this is a minimal settings.json that enables the browser agent with a visual model:

{
  "agents": {
    "overrides": {
      "browser_agent": {
        "enabled": true
      }
    },
    "browser": {
      "visualModel": "gemini-2.5-computer-use-preview-10-2025"
    }
  }
}

Normally, the browser agent navigates a webpage by reading its accessibility tree—the hidden structure that screen readers use. However, our Match 3 game is drawn entirely on a single HTML canvas. To the accessibility tree, it just looks like a giant blank box.

This is where adding a vision model changes the game. Once the agent is configured with a visualModel (like gemini-2.5-computer-use-preview-10-2025), it can effectively see the page. It takes screenshots, analyzes the visual layout, and figures out the exact X and Y coordinates it needs to click on the screen.

Instead of manually clicking through the deployed Cloud Run application, you can type @browser_agent please test the live URL... to instruct it to navigate the site, play a round of the game, and take screenshots of the working screens.

This doesn’t replace human playtesting for game feel, but it automates the visual validation, proving the UI renders correctly without leaving the terminal.

Outsourcing my security anxiety
#

With the implementation working and the UI verified, we cannot forget about security.

I am not an application security expert, making me the wrong person to evaluate the security posture of a web app. However, much like agentic coding mitigated my lack of game engine experience, sub-agents can mitigate my lack of security expertise. As an orchestrator, I do not need to know every cross-site scripting vector; I only need to know how to spin up a specialist with a clean context to look for them.

We can create an isolated execution environment by defining a custom agent in a Markdown file (.gemini/agents/security-auditor.md) that can be summoned using @security_auditor.

---
name: security_auditor
description: Specialized in finding security vulnerabilities in code.
kind: local
tools:
  - read_file
  - grep_search
model: gemini-3-flash-preview
temperature: 0.2
max_turns: 10
---

You are a ruthless Security Auditor. Your job is to analyze code for potential
vulnerabilities.

Focus on:

1.  SQL Injection
2.  XSS (Cross-Site Scripting)
3.  Hardcoded credentials
4.  Unsafe file operations

When you find a vulnerability, explain it clearly and suggest a fix. Do not fix
it yourself; just report it.

We give it a specific system prompt (the body of the Markdown file) and tools like read_file and grep_search (defined in the frontmatter). Because it runs in its own context loop, it does not clutter the main conversation history.

I pointed this auditor at the Cloud Crush codebase to check for hardcoded credentials, unsafe file operations, and deployment risks. Even if a custom security agent does not replace a dedicated human professional, it provides a baseline layer of defence that I would otherwise lack.

A new development workflow

This workflow defines what I consider to be the new standard for software development. We are using agents to write the code, and actively building custom tools, skills and sub-agents to enforce our architectural and quality standards.

And for the careful readers out there, you might have noticed I was intentionally light on step-by-step instructions in this article. This is because we have an entire codelab dedicated to this experience, which you can access using the link below. In this codelab you will be able to try everything discussed in this article by following step-by-step instructions, ultimately building your own version of this Match 3 game.

Codelab: Build a Match 3 Arcade Game With Gemini CLI

Of course, if you have any questions, please feel free to reach out on any of my socials.
