My AI Coding Stack: A 2025 Retrospective

As we close out 2025, I wanted to share my experience with the current landscape of AI coding tools. My experience might differ completely from yours depending on your stack, so let’s start there.

The Context: My Tech Stack

I am a full-stack web developer building simple, functional products. TypeScript is the only language I work comfortably in, alongside the necessary HTML/CSS.

Frontend: Next.js or Astro.
Backend: Mostly Next.js API routes. For standalone APIs, I reach for Bun or Express.
Database: Postgres with Drizzle ORM (my default for almost everything).
UI: Shadcn/ui.

I haven’t given much thought to “no-code” or “low-code” products like Lovable, Replit, Bolt, or v0. While they are fun to try out, I am not their target user. I prefer full control over my environment.

The Daily Driver: Cursor

Cursor has been my default IDE for the entire year. While my recent experience has been marred by some bugs, it honestly still offers the best-in-class auto-complete experience.

When I want to write code myself but need intelligent suggestions, Cursor is unbeatable. Its codebase indexing is superior to the competition, making context retrieval very accurate.

Plan Mode: This is great for mapping out what you want to do before implementation, though it can be buggy.
Composer-1 Model: I use this on the simple pro plan. It has no rate limits and is perfect for single-file tasks or small tweaks.

The Heavy Hitters: OpenAI vs. Claude

OpenAI Codex (GPT-5)

When OpenAI released the GPT-5 Codex model, it was the king of UI development. I used it almost exclusively for a while because of its design sensibilities.

The Good: Best UI and design capabilities.
The Bad: It is incredibly slow. It “thinks” a lot. In my experience, running tasks in Codex would take 20+ minutes on average. The latency eventually made for a poor developer experience.

Claude Code (Opus 4.5)

For most of the year, I avoided Claude Code because I was already paying for Cursor and OpenAI. But with the launch of Opus 4.5, I had to switch.

I’ve tried accessing Opus 4.5 via various tools (Antigravity, Kiro), but the native Claude Code CLI gives the best experience.

The Magic: In my experience, Opus 4.5 simply does not fail. I can give it complex tasks spanning 5 to 20 files, and it implements them correctly on the first try. It is the only model I trust for deep architectural logic.
The Cost: It is expensive and has a surprisingly short context window. I often burn through the context limits in the planning phase alone.
My Rule: I almost never ask Opus to do design or UI tweaks. It’s too expensive for that. I save it for “must-work” feature implementations.

The Specialists: Gemini & Kiro

Gemini-3 (Flash & Antigravity)

There was a lot of hype on X about Gemini-3 before release, and it didn’t disappoint. It has become my go-to for visual work.

Antigravity: My initial experience with Antigravity was horrible. It was buggy and would generate incomplete, boring landing pages. However, it has improved significantly.
Gemini-3-Flash: This is the sweet spot. It creates excellent designs and almost never hits rate limits. For static landing pages, I now prefer using Antigravity powered by Gemini-3-Flash.

Kiro (Spec-Driven Development)

I wasn’t interested in Kiro initially, but their “spec-driven” approach is useful for specific workflows. I mostly used it because I had free credits for Opus 4.5 on the platform.

Kiro shines when I need a verified, step-by-step implementation (e.g., adding a complete Auth system):

It generates a design doc (choosing libs, auth modes).
It implements the DB schema (which I verify).
It adds the API routes (which I verify).
It adds the Frontend components.

Unlike Claude Code, which creates a plan and executes everything at once, Kiro allows me to check the code at every stage.
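
To make the checkpoint idea concrete, here is a minimal TypeScript sketch of a staged run with a review gate after each stage. It is only an illustration of the workflow described above, not how Kiro works internally. The names `Stage`, `Reviewer`, `runStaged`, and `authStages` are hypothetical and made up for this example.

```typescript
// Hypothetical model of a staged, review-as-you-go workflow.
// None of these names come from Kiro; they only illustrate the idea.
interface Stage {
  name: string;
  run: () => string; // returns a short summary of what the stage produced
}

// The reviewer stands in for me reading the output and saying yes or no.
type Reviewer = (stageName: string, output: string) => boolean;

interface StagedResult {
  completed: string[];      // stages that ran and were approved
  stoppedAt: string | null; // stage that was rejected, or null if all passed
}

function runStaged(stages: Stage[], approve: Reviewer): StagedResult {
  const completed: string[] = [];
  for (const stage of stages) {
    const output = stage.run();
    // Stop before the next stage starts if the review fails.
    if (!approve(stage.name, output)) {
      return { completed, stoppedAt: stage.name };
    }
    completed.push(stage.name);
  }
  return { completed, stoppedAt: null };
}

// The auth example from above, with placeholder outputs.
const authStages: Stage[] = [
  { name: "design-doc", run: () => "libs and auth modes chosen" },
  { name: "db-schema", run: () => "users and sessions tables" },
  { name: "api-routes", run: () => "login and logout routes" },
  { name: "frontend", run: () => "login form component" },
];

// Approve everything except the API routes, so the run halts there.
const result = runStaged(authStages, (name) => name !== "api-routes");
console.log(result);
```

The point of the gate is that a rejected stage stops the run before anything later is built on top of it, which is the difference from a plan-then-execute-everything flow.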

My Workflow: End of 2025

So, how does this all fit together?

Default: I open every project in Cursor.
UI Iteration: If I need to redesign a UI and iterate 3-5 times, I use Gemini-3-Flash (via CLI or Antigravity).
Complex Features: For new feature development, Opus 4.5 via Claude Code is the default.
- I ask it to make a plan and write it to my repo.
- I read and tweak the plan manually.
- Implementation hack: If the plan is straightforward, I ask the cheaper models in Cursor to implement it so I don’t use up my Claude rate limits. If it’s critical to get it right on the first attempt, I let Opus 4.5 handle the implementation.
Code Review: I use Cursor’s Bugbot. It auto-reviews PRs and is great at catching fine details, like broken links or easy-to-miss logic errors.
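
The routing above can be written down as a small decision function. This is a rough TypeScript sketch of my own rules of thumb, not an API of any of these tools. The `Task` shape, the `Tool` labels, and `pickTool` are all hypothetical names, and sending a non-trivial but non-critical plan to Opus is my assumption, since the rules above only cover the two clear cases.

```typescript
// Hypothetical task descriptor; field names are illustrative only.
type Task =
  | { kind: "small-tweak" }
  | { kind: "ui-iteration" }
  | { kind: "feature"; planIsStraightforward: boolean; mustWorkFirstTry: boolean }
  | { kind: "code-review" };

// Labels for the tools mentioned in this post.
type Tool =
  | "Cursor (Composer-1)"
  | "Gemini-3-Flash"
  | "Claude Code (Opus 4.5)"
  | "Cursor (cheaper model)"
  | "Cursor Bugbot";

function pickTool(task: Task): Tool {
  switch (task.kind) {
    case "small-tweak":
      return "Cursor (Composer-1)";
    case "ui-iteration":
      return "Gemini-3-Flash";
    case "code-review":
      return "Cursor Bugbot";
    case "feature":
      // Opus writes the plan either way; this decides who implements it.
      // Assumption: anything critical or not straightforward stays with Opus.
      if (task.mustWorkFirstTry || !task.planIsStraightforward) {
        return "Claude Code (Opus 4.5)";
      }
      return "Cursor (cheaper model)";
  }
}

console.log(
  pickTool({ kind: "feature", planIsStraightforward: true, mustWorkFirstTry: false })
);
```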

Before we get into it, here is my tech stack and what I do, because the experience may be completely different depending on which frameworks and languages you use.

I am a web developer. I build simple products with TypeScript, which is the only language I can work comfortably in, plus a little HTML and CSS. For frontends I use Next.js or Astro. If the API is minimal, I usually build it in Next.js itself; when that is not possible, I go with Bun or Express. I use Drizzle ORM for almost every project that needs a database, with Postgres as my default database, and shadcn/ui for UI components.

Since I am a developer, I haven’t given much thought to products like Lovable, Replit, Bolt, v0, etc., though sometimes they are nice for just trying things out. I am not their regular user.

So let’s get into the IDEs. My default IDE, which I have been using for the whole year, is Cursor, and it works best for me. Lately my experience has not been great because of some bugs, but it honestly has the best in-IDE autocomplete. When I want to write code myself but still get good suggestions, Cursor is the best. Cursor also has best-in-class indexing of your entire codebase. Plan mode is good for first mapping out what you want to do and how to do it, and then implementing it, but it has some bugs.

When OpenAI launched their Codex model, it had the best UI development capability. This was when GPT-5 and the gpt-5-codex model were released. Because of its UI and design abilities, I was using gpt-5-codex almost exclusively and getting good results. It is, though, one of the slowest models to work with. It thinks a lot and works much more slowly; in my experience, tasks I ran in Codex regularly took more than 20 minutes on average, which was not a good experience at all.

While using these tools, I kept reading about Claude Code and wanting to try it. For most of the year I didn’t, because I was already paying for Cursor and Codex/OpenAI: I needed the autocomplete and tab suggestions from Cursor and the UI expertise of Codex. But with the launch of Claude Opus 4.5, I had to make the switch. I have tried Opus 4.5 through Claude Code, Antigravity, and Kiro, and Claude Code probably gives the best experience.

This model comes at a cost, though. It has a very short context window; in my workflow I have many times used up the whole context window in the planning phase, before even starting the actual implementation. But Opus 4.5 never fails. I give it tasks like complete feature implementations where changes span 5 to 20 files; it gives me a plan, implements it, and it works on the first try. It just works. I can say this only about Opus. Of course it costs a lot, so you have to be careful. I try to resist the urge to hand every task to Opus, and I almost never ask it for design and UI changes.

For Gemini-3, I was seeing a lot of hype on X about its design and landing page capabilities even before it was released, and it didn’t disappoint. It is a good model. When it launched, I tried it through Gemini CLI first and then through Antigravity. My experience with Antigravity was horrible at first. Everything was breaking. When I asked it to create landing pages, it would create just one or two completely boring, typical sections and stop, while the same prompt through Gemini CLI gave good results. Recently the experience in Antigravity has improved. Now, when I have static landing pages to build, I usually prefer Antigravity with gemini-3-flash. I almost never get rate limited, and gemini-3-flash is actually good at design.

I was not interested in Kiro at all, but sometimes their spec-driven development feature does work. I have used it a lot because Opus 4.5 was available for free with some credits and I wanted to give it a go. Their spec-driven agent works well in some cases, for example when you have to add a complete auth feature to your app. It first generates a design document specifying which libraries we will use, which auth modes we will have, and so on, and then implements the tasks. This lets me complete tasks one by one. For auth, it first adds the DB schema changes and stops; I can look at the DB changes, confirm this is what I want, and then start the next task to add the API routes. Then again I can verify that it is implemented how I like, and then it adds the frontend components. So you can check as you go, as opposed to Claude Code, which for the same prompt will create one plan and, once it starts executing, implement everything at once. So depending on my preference and the type of task, sometimes I like Kiro.

So these are my preferences at the end of 2025. By default I open any project in Cursor, my default IDE. I have the simple Pro plan; their composer-1 model has no rate limits and is good enough for one-file tasks. If I have to redesign some UI and want to iterate on it 3-5 times with different prompts, I use gemini-3-flash through either Gemini CLI or Antigravity. For all complex tasks or any new feature development, Opus 4.5 via Claude Code is my default. I first ask it to make a plan and write it to my repo; I prefer to read it and make changes to it. Sometimes I ask other models in Cursor to review these plans written by Claude and then implement them. If it’s a simple or straightforward plan, I use other models in Cursor to implement it because I don’t want to waste Claude’s rate limits, but if I want to get it correct on the first attempt, I use Opus 4.5.

As for code review agents, Cursor has Bugbot, which auto-reviews all pull requests; I was also using Codex reviews. Bugbot works fine. It does find some fine details, like a link to a page that does not exist or changes that are easy to miss.