Building a SaaS with AI: Day-by-Day Chronicles
This article documents my experiences in attempting to build a SaaS using the latest AI tools.
08/07/2025
Product Variant Page
Which yesterday looked like this
Progress on task #5 viewed through Linear
07/07/2025
The Stats
Let's take a quick look at the stats!
Windsurf provides me with a breakdown of my coding exploits using their IDE.
I have been exclusively using Windsurf for the last month to create https://mybike.imara.ae. Some jump-out stats:
Over 22K lines of code?!!?!?! Although that's spread over a month, it's only 4 days of coding! Looking at the generated code, however, there are several optimisations I would make so that it is more reusable and easier to understand for a developer who needs to maintain it.
As promised, I have not touched a line of code since initiating the project, which shows in Windsurf reporting 99% of the code as generated by Windsurf.
The surprising stat is:
Only 21 prompts (Cascade conversations). This is mainly because I’m using Taskmaster to instruct Windsurf on what to do.
Most of the MCP calls are syncing between Linear and Taskmaster.
Finally…
Credits used: 220. I’m an early bird subscriber to Windsurf and pay $15 per month for 500 prompt credits1. So I used almost half of my monthly allocation to build the site so far. That is a fantastic bargain!
However, in the image you can see under “cascade model usage” that while the vast majority of usage is “Claude 3.7 Sonnet”, 2% is “Claude Sonnet 4 (BYOK)2”.
Let me be clear. For coding like this, Claude Sonnet 4 is unrivalled. Its understanding of context and ability to weigh multiple concerns to derive a solution is unmatched. The 2% usage comes from the times I noticed that Claude 3.7 Sonnet was getting stuck and could not find a solution. In every case, one prompt to Claude Sonnet 4 solved the issue.
You can review the git log if you're interested, which surprisingly has only 22 commits. A prompt per commit 😉.
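To put those numbers side by side, here is a minimal back-of-the-envelope sketch. It only uses the figures quoted above; the line count is rounded down from "over 22K", the pro-rata cost ignores whatever the BYOK calls cost separately, and the function name is my own.

```python
# Figures quoted in the stats above. LINES_OF_CODE is rounded down from "over 22K".
LINES_OF_CODE = 22_000
CODING_DAYS = 4
PROMPTS = 21
COMMITS = 22
CREDITS_USED = 220
CREDITS_PER_MONTH = 500
PLAN_PRICE_USD = 15.00


def usage_summary() -> dict:
    """Derive a few ratios from the raw numbers (hypothetical helper)."""
    share = CREDITS_USED / CREDITS_PER_MONTH
    return {
        # Fraction of the monthly credit allocation consumed.
        "credit_share": share,
        # Pro-rata share of the subscription price for those credits.
        "credit_cost_usd": round(PLAN_PRICE_USD * share, 2),
        "lines_per_day": LINES_OF_CODE / CODING_DAYS,
        "credits_per_prompt": round(CREDITS_USED / PROMPTS, 1),
        "lines_per_commit": round(LINES_OF_CODE / COMMITS),
    }


if __name__ == "__main__":
    for name, value in usage_summary().items():
        print(f"{name}: {value}")
```

On these assumptions, 220 of 500 credits is 44% of the allocation, which is where "almost half" comes from.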
02/07/2025
New Design!
01/07/2025
TLDR;
Site migration looks amateur compared to wolfis.ae
Windsurf Browser changes everything about web development workflow
Created Task #17 to fix our design standards
Something quite obvious about MyBike at the moment:
It doesn't look very good. The difference is evident when you put MyBike and Wolfis side by side:
Wolfis has that premium e-commerce feel. The kind that makes you trust them with your money. MyBike looks like a student project.
This is where Windsurf has implemented a game-changing feature that many people aren't aware of.
Introducing Windsurf Browser:
You can summon it by prompting in Cascade (⌘L) “open windsurf browser”, which will give this option:
Pressing the button will open:
Okay, it appears to be a standard Chromium browser. The difference is that it includes the windsurf-debugging tool, which provides access to the browser's DOM. Windsurf takes advantage of the Chrome DevTools Protocol (CDP).
The interface has a menu that allows you to either select an element or take a screenshot. The reference or image is sent to Windsurf and appears in Cascade as reference links.
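I don't know how Windsurf wires this up internally, so treat the following as a rough sketch of what CDP traffic for "select an element" and "take a screenshot" could look like, not as Windsurf's implementation. The method names (DOM.getDocument, DOM.querySelector, DOM.getOuterHTML, Page.captureScreenshot) come from the public protocol; the helper function, the selector and the node IDs are hypothetical, and nothing here opens a real connection to a browser.

```python
import json


def cdp_message(msg_id: int, method: str, **params) -> str:
    """Serialise one CDP request (hypothetical helper).

    CDP requests are JSON objects with a client-chosen integer id,
    a "Domain.method" name and a params object.
    """
    return json.dumps({"id": msg_id, "method": method, "params": params})


# An assumed sequence for inspecting one element and grabbing a screenshot.
# The nodeId values are placeholders: in a live session they come from the
# responses to the preceding requests.
requests = [
    cdp_message(1, "DOM.getDocument", depth=1),
    cdp_message(2, "DOM.querySelector", nodeId=1, selector=".product-card"),
    cdp_message(3, "DOM.getOuterHTML", nodeId=42),
    cdp_message(4, "Page.captureScreenshot", format="png"),
]

if __name__ == "__main__":
    for request in requests:
        print(request)
```

In a real session each string would be sent over the browser's DevTools WebSocket, and the reply with the matching id would carry the node, HTML or base64 image data.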
The Workflow Efficiency Winner
Instead of the usual dance (write code, switch to browser, inspect element, copy details, switch back, make changes, refresh, repeat), you can do this:
Click the component in the Windsurf Browser
Tell Cascade what you want changed
Watch it happen
I used this functionality to analyse wolfis.ae properly. I navigated to their site, inspected the key elements, and sent visual references directly to Cascade. I used this method to create Task #17, which involves redesigning the UX to resemble Wolfis'.
Welcome Task #17
Why This Matters
This addresses a request that Cursor users have been making: the ability to preview and inspect DOM elements directly in the IDE without breaking flow. Currently, developers often switch to external browsers to inspect elements and send them to chat for changes.
Windsurf Browser eliminates that friction. You stay in a flow state while building better products faster.
27/06/2025
TLDR;
Deployed site: https://mybike.imara.ae
Git repo: https://github.com/judioo/mybike
The reason Task #3 (authentication) was skipped
I started yesterday with a clear to-do list. By the end of the day, my AI project manager had done something unexpected.
Something caught my eye immediately. Taskmaster had completed tasks 1, 2, and 4, skipped task 3 entirely, and proceeded directly to task 5. Why would it do that?
Priorities, Dependencies & Complexity
Examining the task list, you can see the dependencies mapped out. Task 2 depends on Task 1, Task 3 depends on both Tasks 1 and 2, and so on. All of Task 3's dependencies were satisfied, so why was it bypassed whilst Tasks 4 and 5 were prioritised?
How exactly does Taskmaster decide what it should work on next?
Reasoning: AI Vs Human
My first instinct was that it must use some sort of scoring system. Most project management tools rely on frameworks like RICE (Reach, Impact, Confidence, Effort) or the simpler ICE method (Impact, Confidence, Ease). They use formulas that return priority scores:
RICE Score = (Reach × Impact × Confidence) / Effort
These frameworks are everywhere because they're transparent, predictable, and easy to explain to stakeholders. "Task A scored 47.3, Task B scored 52.1, so we're doing B first." Simple.
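As a quick illustration of how mechanical these frameworks are, here is a minimal sketch of both scores. The RICE formula is the one shown above; for ICE I am assuming the common formulation that multiplies the three factors (some teams average them instead). The input numbers are made up.

```python
def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE Score = (Reach x Impact x Confidence) / Effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return reach * impact * confidence / effort


def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE as a simple product of the three factors (assumed formulation)."""
    return impact * confidence * ease


if __name__ == "__main__":
    # Hypothetical task: reaches 500 users, impact 2, 80% confidence, 4 units of effort.
    print(rice_score(reach=500, impact=2, confidence=0.8, effort=4))
    print(ice_score(impact=7, confidence=8, ease=5))
```

Whichever task gets the higher number goes first, which is exactly why these scores are so easy to explain.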
Oddly enough, although Taskmaster doesn't use RICE, it employs a similar method to determine the priority of work.
Initial Task Generation and Prioritisation
Digging around in the Taskmaster code, I found that Taskmaster relies on the PRD and the LLM's analysis (e.g. Claude) to infer task ordering and importance. Crucially, it captures dependencies between tasks, which implicitly sets an execution order. For example, if Task B depends on Task A, Taskmaster will list Task A as a prerequisite of Task B. In practice, tasks with no dependencies (foundational tasks) appear first, listed by lower ID numbers, and tasks that depend on them follow later. Taskmaster also uses the LLM to assign specific tasks a priority from "high" to "low", based on the PRD context, which suggests critical must-haves versus nice-to-haves. Otherwise, the default priority is medium for all tasks.
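To make "dependencies implicitly set an execution order" concrete, here is a small sketch. It is not Taskmaster's code: in Taskmaster the LLM produces the task list, whereas this is a textbook topological sort (Kahn's algorithm) that I am using purely as an illustration. Only the dependencies of Tasks 2 and 3 come from my task list; those of Tasks 4 and 5 are invented.

```python
import heapq


def dependency_order(deps: dict) -> list:
    """Return task IDs so every task comes after its prerequisites.

    Kahn's algorithm; among tasks that are ready at the same time,
    the lowest ID goes first.
    """
    remaining = {task: set(prereqs) for task, prereqs in deps.items()}
    ready = [task for task, prereqs in remaining.items() if not prereqs]
    heapq.heapify(ready)
    order = []
    while ready:
        task = heapq.heappop(ready)
        order.append(task)
        del remaining[task]
        # Unblock every task that was waiting on the one just scheduled.
        for other, prereqs in remaining.items():
            if task in prereqs:
                prereqs.discard(task)
                if not prereqs:
                    heapq.heappush(ready, other)
    if remaining:
        raise ValueError(f"unresolvable dependencies for tasks {sorted(remaining)}")
    return order


# Task 2 depends on 1, Task 3 on 1 and 2 (from my list); 4 and 5 are hypothetical.
DEPS = {1: set(), 2: {1}, 3: {1, 2}, 4: {2}, 5: {4}}

if __name__ == "__main__":
    print(dependency_order(DEPS))
```

Note that dependencies alone would happily schedule Task 3 before Tasks 4 and 5, so they cannot be the whole story.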
TaskMaster AI not only prioritises tasks but also evaluates the complexity of each task, often using a research-oriented AI model (the Perplexity model) to aid in this analysis. There is a built-in complexity analysis feature that can “score” tasks and recommend breaking down those that are too large.
Selecting the Next Task
Among the ready tasks, TaskMaster sorts by the task’s priority level (High > Medium > Low). A task marked as "high" will be ranked above tasks with medium and low priorities. As mentioned, by default, all tasks are set to medium priority. However, if the user or the LLM has flagged some tasks as high priority, those will naturally be suggested first once unblocked. If multiple tasks share the same priority, TaskMaster uses secondary criteria to break ties. It considers the dependency count and the order of task IDs.
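Here is how I picture that selection logic, as a minimal sketch rather than Taskmaster's actual implementation. The class and function names are mine, I am assuming that fewer dependencies wins the tie-break, and the priorities given to Tasks 3, 4 and 5 (and the dependencies of 4 and 5) are hypothetical values chosen to reproduce what I saw.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Task:
    id: int
    status: str = "pending"    # "pending" or "done"
    priority: str = "medium"   # medium is the default described above
    dependencies: List[int] = field(default_factory=list)


def next_task(tasks: List[Task]) -> Optional[Task]:
    """Pick the next ready task: priority, then dependency count, then ID."""
    done = {t.id for t in tasks if t.status == "done"}
    ready = [
        t for t in tasks
        if t.status == "pending" and all(d in done for d in t.dependencies)
    ]
    if not ready:
        return None
    # Assumption: fewer dependencies ranks first when priorities are equal.
    return min(ready, key=lambda t: (PRIORITY_RANK[t.priority], len(t.dependencies), t.id))


if __name__ == "__main__":
    tasks = [
        Task(1, status="done"),
        Task(2, status="done", dependencies=[1]),
        Task(3, priority="medium", dependencies=[1, 2]),  # hypothetical priority
        Task(4, priority="high", dependencies=[2]),       # hypothetical
        Task(5, priority="high", dependencies=[4]),       # hypothetical
    ]
    while (task := next_task(tasks)) is not None:
        print(f"working on task {task.id}")
        task.status = "done"
```

With these assumed priorities the sketch works through Task 4, then Task 5, and only then Task 3, matching the order I observed.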
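And this explains why Task #3 was skipped: its dependencies were satisfied, but Tasks 4 and 5 presumably ranked higher on priority once they were unblocked. All things considered, Taskmaster acts as an intelligent project manager.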
A quick look at how the site looks
18/06/2025
This is going to be a fairly large project, and it's not one you can accomplish with just one prompt. LLMs are improving with larger context windows, but complex builds require coordination across multiple tasks.
Most workflow engines (n8n, Boomi, Airflow) let you manually create build processes. What I need is something that continuously orchestrates and builds the product. I need a Project Manager.
Enter TaskMaster.
Our AI's personal project manager. Organise, research, expand, prioritise, and ship tasks effortlessly. Enjoy permanent context, zero drift & instant clarity.
Partly lifted directly from their landing page propaganda spiel, but it's precisely what I need. One problem: TaskMaster is 100% terminal-based. I need a UI for our important but non-existent board members.
Enter Linear!
I chose Linear as my project management and issue tracking tool due to its focus on speed, simplicity, and a developer-friendly interface, particularly for high-performing product and engineering teams. It offers a streamlined workflow and intuitive design that boosts productivity, especially for teams that find traditional tools cumbersome.
More lifted content, but it's best of breed.
The plan: TaskMaster as our Boomerang orchestrator, creating tasks and subtasks. Linear as our presentation layer - the view non-engineers would use in a team environment.
I configured Windsurf with Linear and TaskMaster-ai MCPs. TaskMaster requires a Product Requirements Document (PRD) to generate initial tasks. So, how do I write that?
In Windsurf, I prompted:
migrating https://wolfis.ae/ to https://mybike.ae
This site needs to be migrated to a new stack basted in nextjs using best practices and development methods.
we need to write a plan.
analysis the site and construct a plan for migrating the entiire site. consider all the pages (use a sitemap if needed)
only consider the domain wolfis.ae any navigation off this page is out of scope
plan only. no code at this stage
The outcome was this document.
Perfect, I have our PRD now, time to start Taskmaster. Before starting, I created a bare minimum Next.js project.
I then prompted Windsurf to initialise Taskmaster, use PRD.md, and create tasks in both Taskmaster and Linear. The prompt also instructed Windsurf to sync all task amendments between the two products.
The outcome was:
and
As you can see, Windsurf could not resist completing several tasks within the YOLO session.
With the product definition complete, I am now ready to start building the product.
By the end of the day, the output was… interesting. The AIs had decided to implement a design system first, just as a typical design agency would.
And it didn't look very good. The colour palette seemed like a random assortment of colours. However, I’ve seen the same from humans that ended up being the foundation of some good sites, so I left it for the day. Let’s see what it does tomorrow.
16/06/2025
I carve out an hour every evening for personal growth. Lately, that's meant experimenting with new AI tools beyond what I’m exposed to daily.
One of the burning questions I have is whether I can build a SaaS exclusively using AI tooling. Yesterday, I finally decided to try, and started building a proof of concept. I was just getting started when this message from Simon Rugg appeared in one of my groups:
Sure, I know a thing or two about cycling, but it’s all from the UK. So, I decided to explore what Dubai has to offer. That’s when I discovered Wolfis, an online and in-store bike shop. On their site, I found they provide pretty much everything a budding cyclist needs, from blogs to rental services and, of course, bikes.
Which made me start thinking: how hard would it be to replicate a site like this using solely AI tools? With this question in mind, I pivoted my project to clone Wolfis.
Introducing MyBike.
The rules are simple:
I can only use prompts to direct the AI tools. While I can prepare environments and advise on fixes, I am not allowed to correct any errors manually. I must leave the AI and its tools to resolve issues.
I can use any AI tooling to achieve the tasks.
Once prompted, I cannot interfere. I must allow it to take its own path. I can add additional instructions, but these must be about the product likeness, not the methods it uses.
I began by using Windsurf as my Integrated Development Environment3 and chose my preferred model.
I renamed my virgin repo and off I went.
1. I hate show-offs, but I pay yearly, so I get it for less than that… sorry
2. BYOK (Bring Your Own Key). It’s a direct result of the OpenAI acquisition that pushed Windsurf to introduce this. As a result, since last month I have become a hybrid IDE user (Cursor for Claude 4 Sonnet, Windsurf for the rest… News flash, Cursor just changed their pricing model).
3. IDE might be the most appropriately named acronym ever. IDEs in their many forms are becoming the web browser of the AI revolution.