In September 2024, we launched the beta of joinstems.com - a platform where music enthusiasts and producers can access official stems from well-known tracks and share remixes. From concept to launch took less than 5 months with a team of just 1.5 developers (well, more like 1.2 if I'm being honest), reaching 10K users and generating nearly 1K remixes in our first weeks.
What makes this project fascinating is that it paralleled the rapid evolution of GenAI development tools. I started working on this project in May with GitHub Copilot for code completion, experimented with Supermaven, and ultimately found my stride with Claude and Cursor. Throughout this journey, I watched GenAI transform from a simple code completion tool into something that felt more like a collaborative development partner.
Despite the endless stream of AI demos and tutorials flooding the internet, I've noticed a lack of practical accounts about using GenAI to ship actual products. After a decade of launching various products, I wanted to share real insights from building joinstems.com - both the wins and the "well, that didn't work" moments. Let's dive into what actually worked, what didn't, and what surprised me along the way.
Brief stack description
First, let's talk stack. We went with T3 as the foundation - a modern TypeScript stack I knew well and trusted for its type safety and great DX. Here's what we're running with:
- Next.js + Tailwind CSS + NextAuth.js as base
- tRPC + tanstack-query for type-safe APIs (type safety becomes even more crucial when co-piloting with AI)
- Postgres + Prisma + Neon for database
- Mux + Wavesurfer.js for audio streaming & visualization (the trickiest part for AI to handle, more on that later)
- react-admin with ra-data-simple-prisma for admin panel (where GenAI really showed its muscles)
- Vercel for hosting
- Digital Ocean Spaces for file storage (after a fun S3 cost surprise 😅)
- Twilio + Resend for communications
- Sentry for error tracking
This stack turned out to be an ideal foundation for AI-assisted development, though not always in ways I expected. Type safety especially proved crucial - it helped catch those occasional AI hallucinations before they became production issues.
But the real magic happened in how these tools complemented GenAI development. The combination of Cursor's code generation with Prisma's schema-first approach and react-admin's patterns created a surprisingly powerful workflow. Let me show you concrete examples where GenAI was transformative for us.
Where GenAI shined
Let's start with the most immediate win - crushing boilerplate code.
Admin Panel generation
We chose react-admin with ra-data-simple-prisma as our foundation, which already provides solid abstractions. Adding GenAI to this stack pretty much 10x'd things from there.
After establishing the initial architecture, the workflow became remarkably straightforward:
- Define the model in the Prisma schema (and even that could be fed directly as part of the instructions)
- Provide context to Claude/Cursor with a prompt like:
```
For admin panel, I am using react-admin and ra-data-simple-prisma.
Master file is @AdminApp.tsx, example of entity wrapper is @Track.tsx, and supporting backend router is @route.ts.
Generate necessary admin pages for the new Notifications model that I just created:
- Check the model in @schema.prisma
- Add route.ts file for db manipulation
- Add @Notification.tsx file for react-admin front-end handling
- Update @AdminApp.tsx and @route.ts to correctly handle navigation
```
The output would include:
- Prisma route handlers
- React-admin frontend components
- Navigation updates
- Type-safe implementations matching existing patterns
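To give a flavor of the output, here's a trimmed-down sketch of the kind of entity wrapper this produced, following react-admin's standard List/Edit/Create components (the field names are illustrative, not our actual Notifications schema):

```tsx
// Notification.tsx - sketch of a generated entity wrapper (illustrative fields).
import {
  List, Datagrid, TextField, BooleanField, DateField,
  Edit, Create, SimpleForm, TextInput, BooleanInput,
} from "react-admin";

export const NotificationList = () => (
  <List>
    <Datagrid rowClick="edit">
      <TextField source="id" />
      <TextField source="message" />
      <BooleanField source="read" />
      <DateField source="createdAt" />
    </Datagrid>
  </List>
);

export const NotificationEdit = () => (
  <Edit>
    <SimpleForm>
      <TextInput source="message" />
      <BooleanInput source="read" />
    </SimpleForm>
  </Edit>
);

export const NotificationCreate = () => (
  <Create>
    <SimpleForm>
      <TextInput source="message" />
    </SimpleForm>
  </Create>
);
```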
What's notable here is the consistency and reliability. After implementing 2-3 models, the AI became remarkably accurate at maintaining our established patterns.
Re-using Patterns Across the App
As an extension of the previous example, one of the most appreciated benefits of GenAI was how it accelerated pattern replication across different features. Think of it like having a developer who not only remembers every pattern you've established but can instantly adapt it to new contexts - without the usual "wait, how did we do this last time?" moments.
My favorite example in this project shows how we handled data loading and updates, combining server-side tRPC calls, client-side infinite scroll, and optimistic updates. The implementation flows like this (see the sketch after the list):
- Fetch initial set of remixes as part of server-rendering
- Pass those initial remixes to client-side component as initial data
- Load more remixes with infinite scroll
- Optimistically update remixes on user actions like upvote, bookmark
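Here's a minimal sketch of that flow, assuming hypothetical `remix.list` (cursor-paginated) and `remix.upvote` procedures on our tRPC router - the names and shapes are illustrative, not our actual code:

```tsx
// Minimal sketch of the remix feed hook. `api` is the tRPC React client
// that T3 scaffolds; procedure and field names here are hypothetical.
import { api } from "~/utils/api";

type Remix = { id: string; title: string; upvotes: number };
type RemixPage = { items: Remix[]; nextCursor: string | null };

const INPUT = { limit: 20 };

export function useRemixFeed(initialPage: RemixPage) {
  // Steps 1-2: remixes fetched during server rendering seed the cache.
  const feed = api.remix.list.useInfiniteQuery(INPUT, {
    initialData: { pages: [initialPage], pageParams: [null] },
    getNextPageParam: (last) => last.nextCursor, // step 3: infinite scroll
  });

  const utils = api.useUtils(); // api.useContext() on older tRPC versions

  // Step 4: optimistic upvote. Patch the cached pages immediately,
  // roll back to the snapshot if the mutation fails.
  const upvote = api.remix.upvote.useMutation({
    onMutate: async ({ remixId }: { remixId: string }) => {
      await utils.remix.list.cancel(INPUT);
      const previous = utils.remix.list.getInfiniteData(INPUT);
      utils.remix.list.setInfiniteData(INPUT, (data) =>
        data
          ? {
              ...data,
              pages: data.pages.map((page) => ({
                ...page,
                items: page.items.map((r) =>
                  r.id === remixId ? { ...r, upvotes: r.upvotes + 1 } : r
                ),
              })),
            }
          : data
      );
      return { previous };
    },
    onError: (_err, _vars, ctx) => {
      if (ctx?.previous) utils.remix.list.setInfiniteData(INPUT, ctx.previous);
    },
    onSettled: () => utils.remix.list.invalidate(INPUT),
  });

  return { feed, upvote };
}
```

The comments version later followed this same skeleton, with the parent remix ID added to the query input.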
When adapting this pattern for comments, instead of manually rewriting everything, I simply showed the AI our remix implementation with a prompt like: "Here's how we handle data loading and updates for remixes. Can you adapt this pattern for comments, keeping in mind they're nested under remixes?"
The AI not only replicated the pattern but handled the nested structure naturally - a task that would have required careful manual adaptation otherwise.
This pattern replication became one of our most powerful use cases for GenAI. What started as a solution for remixes became our standard approach across multiple features, each adaptation taking much less time than before. The key was having that initial pattern well-established - once we had that, GenAI became remarkably good at maintaining consistency while handling feature-specific requirements.
Responsive Design
In my experience on this project, GenAI turned out to be particularly good at adapting layouts to various viewports. It would often be enough to write a desktop version and then explain how I wanted other viewports to look, and the model would suggest a near-perfect solution on the first attempt.
For example, our remix card component on desktop displays the waveform visualization prominently with track details to the right, followed by interaction buttons (like, share, comment) underneath. When I needed to adapt this for mobile, I simply explained something along the lines of:
"For mobile viewport, I want to:
Keep waveform as the main focus but slightly reduce its height
Stack track title and artist info below it
Arrange action buttons in a compact row
Maintain tap-friendly spacing for all interactive elements"
This worked perfectly on the first try - maintaining visual hierarchy while ensuring good mobile usability. The AI understood both the design goals and our established responsive patterns, saving what would typically be multiple rounds of CSS tweaking, now reduced to a single round of minor adjustments.
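For illustration, the resulting structure looked roughly like this - a mobile-first Tailwind layout that switches to the side-by-side desktop arrangement at the `md:` breakpoint (`Waveform` and `ActionButton` are placeholders, not our real components):

```tsx
// Rough shape of the responsive remix card; imports are hypothetical.
import { Waveform, ActionButton } from "~/components";

export function RemixCard(props: {
  remix: { title: string; artist: string; audioUrl: string };
}) {
  const { remix } = props;
  return (
    <div className="flex flex-col gap-2 md:flex-row md:items-start md:gap-4">
      {/* Waveform stays the focal point; slightly shorter on mobile */}
      <Waveform src={remix.audioUrl} className="h-16 w-full md:h-24 md:flex-1" />

      {/* Title and artist stack below the waveform on mobile, sit beside it on desktop */}
      <div className="md:w-56">
        <h3 className="truncate font-semibold">{remix.title}</h3>
        <p className="text-sm text-gray-500">{remix.artist}</p>
      </div>

      {/* Compact action row; h-11/w-11 keeps targets at ~44px, comfortably tappable */}
      <div className="flex gap-3">
        <ActionButton icon="like" className="h-11 w-11" />
        <ActionButton icon="share" className="h-11 w-11" />
        <ActionButton icon="comment" className="h-11 w-11" />
      </div>
    </div>
  );
}
```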
Getting Unstuck
When I felt lazy or didn't know where to start, I would often just prompt for something - simply having a discussion with the AI was enough to get things moving. It's like having a patient collaborator who's always ready to brainstorm, even when you're not sure what you're building yet.
Instead of staring at a blank editor trying to figure out where to begin, I could start with vague prompts like: "I need to build a notification system for new comments. What are the key components we should consider?" or "Looking at our remix feed component - what would be a good first step to add sorting options?"
Even if I ended up not using most of the AI's suggestions, these conversations helped overcome that initial resistance to starting. The AI's responses would often trigger thoughts like "well, that's not exactly how I want to do it, but..." - and suddenly I'm actively problem-solving instead of procrastinating.
This turned out to be particularly valuable for those "I'll do it later" tasks like error handling or accessibility improvements. Having an AI to bounce ideas off of made it easier to tackle these less exciting but crucial aspects of development.
Where GenAI Falls Short
As powerful as GenAI proved to be in accelerating our development, it still has areas where it falls short or remains inconsistent. Here are the main issues I encountered, and how I worked around them.
Performance optimization
Unless explicitly prompted about performance considerations, current models often generate suboptimal code. In our case, this manifested most clearly in over-fetching and unoptimized Prisma queries. The AI would happily generate code that pulls entire records when we only needed specific fields, or create separate queries where a single join would have been more efficient.
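To make this concrete, here's the shape of the problem with hypothetical model names - the query the AI would typically generate versus the one we actually wanted:

```ts
import { prisma } from "~/server/db"; // T3's shared Prisma client

export async function getFeed(trackId: string) {
  // What the AI would happily generate - full rows for every remix,
  // with vote counts fetched in separate follow-up queries:
  //   const remixes = await prisma.remix.findMany({ where: { trackId } });

  // What we actually wanted: only the fields the feed renders, with the
  // vote count pulled in the same query via a relation aggregate.
  return prisma.remix.findMany({
    where: { trackId },
    select: {
      id: true,
      title: true,
      createdAt: true,
      _count: { select: { votes: true } },
    },
  });
}
```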
In some cases, a single raw SQL query would have been a much better solution, but I haven't seen LLMs suggest that approach on their own.
Library-Level Challenges
If you are combining libraries, GenAI models are only as good as the libraries themselves. AI struggles when problems need to be solved at the library level rather than through configuration or usage patterns. A prime example was our integration with Wavesurfer.js for audio visualization: we needed the waveform not to stop the external media element when it got unmounted, and solving that ultimately required patching the library itself.
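For context, the setup looked roughly like this, assuming wavesurfer.js v7's `media` option for driving playback from an external audio element:

```ts
import WaveSurfer from "wavesurfer.js";

// Bind the waveform to an external <audio> element so playback
// is owned by the page, not by the visualization.
const audio = document.querySelector("audio")!;

const ws = WaveSurfer.create({
  container: "#waveform",
  media: audio, // external element drives playback
});

// The catch: calling ws.destroy() on unmount also stopped `audio`.
// No configuration option changed that - the behavior lived in the
// library's teardown path, which is why a patch to the library itself
// was the only real fix.
```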
While the AI could help with basic setup and common usage patterns, when it came to core functionality issues, it kept suggesting different configuration approaches instead of looking at the library code itself. Even when the solution required modifying the library's source code, none of the GenAI models I tried ever suggested this approach, instead generating variations of the same ineffective solutions.
Version-Sensitive Libraries
One particular gotcha: AI often generates code for outdated library versions. I ran into this with react-admin, headless-ui, and tanstack-query, where if the exact library version wasn't specified, outdated syntax would frequently appear. The solution was to explicitly specify versions in our prompts, but interestingly enough, just referencing package.json doesn't always work reliably.
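A typical example with TanStack Query: v5 accepts only a single options object, yet without a version hint the models kept producing the older positional form (`fetchComments` and the hook are illustrative):

```ts
import { useQuery } from "@tanstack/react-query";

declare function fetchComments(remixId: string): Promise<unknown[]>; // illustrative

export function useComments(remixId: string) {
  // What the AI would often generate - positional arguments, removed in v5:
  //   return useQuery(["comments", remixId], () => fetchComments(remixId));

  // What v5 actually accepts - a single options object:
  return useQuery({
    queryKey: ["comments", remixId],
    queryFn: () => fetchComments(remixId),
  });
}
```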
Consistent in Error
An interesting quirk I encountered often: once a model misunderstands our codebase or implementation, it tends to repeat that mistake in subsequent code generation. This creates a sort of "error cascade": even after being corrected, the model would reproduce the mistaken pattern at different points in the chat, or at best fix it inconsistently.
For example, I noticed this frequently with syntax inconsistency. Say the model generated code using outdated syntax for a library. You would correct the syntax after applying the change and/or instruct the model to use the specific version. Yet at some point further down the chat, the outdated syntax would pop up again.
I experimented with various prompt qualities and approaches, yet I haven't been able to consistently resolve this behavior. The only realistic solution I've found is to restart the chat, which is unfortunate, as valuable context often gets lost.
Biggest Personal Takeaway
Generated code is only as good as your prompt. I came across a quote that resonated deeply with my experience: "If you can't write it yourself, you're unlikely to be able to prompt it well." This captures the essence of working with GenAI perfectly - at least at the current stage of the technology. While models will likely evolve beyond this limitation, right now the tool primarily amplifies your existing knowledge rather than replacing it.
Just as we maintain and version our codebase, I've found it crucial to store and continuously iterate on prompts. The most effective prompts often emerge through multiple refinements, and interestingly, I started using AI itself to help improve my prompts. It's a meta-learning cycle: use AI, learn what works, refine prompts, get better results.
I believe that AI prompts themselves will become an important part of intellectual property in the future. They encapsulate not just the what of development, but the how - the patterns, preferences, and accumulated knowledge that make code not just functional, but well-crafted.
Unexpected Surprise
An interesting discovery I made along the way: using Claude directly versus through Cursor yielded noticeably different results, likely due to Cursor's additional prompt optimization for cost efficiency. While both provided valuable assistance, their outputs often differed in style and approach. This wasn't necessarily a good or bad thing - more like having two different collaborators with their own strengths. I'd also recommend trying out Typing Mind and experimenting with different models through APIs to get a better understanding of how GenAI works.
Conclusions
Building joinstems.com with generative AI has been an eye-opening experience. While the technology significantly accelerated our development process, it also highlighted an important reality: we're currently in an AI-assisted software engineering phase, where AI acts more as an amplifier than a replacement. It magnifies both the strengths and weaknesses of developers, making solid engineering fundamentals more crucial than ever.
Despite the impressive capabilities shown in areas like pattern replication and responsive design, it's clear that GenAI tools aren't yet ready to ship even moderately complicated products end-to-end. They excel at specific tasks - crushing boilerplate, adapting established patterns, accelerating initial implementations - but still require careful oversight and an experienced hand to guide them toward production-ready solutions.
However, it's also clear that we're in a new era of software development. In just two years since the release of ChatGPT, the development workflow has changed forever. The speed of progress suggests this is just the beginning.