Building AI Products That Actually Help People: Lessons from Five Launches
Starting with the Problem, Not the Technology
The most common mistake in building AI products is falling in love with the technology and then searching for a problem it can solve. We made this mistake early and learned from it quickly. The products that succeeded were the ones where we started with a genuine human problem, validated that the problem was real and painful enough that people would pay for a solution, and only then asked whether AI was the right tool to address it.
When we built Sumika, our due diligence platform for Japanese real estate, we did not start with the question "what can AI do for real estate?" We started by talking to foreign investors who had lost money on Japanese property purchases because they could not access or understand the information needed to make good decisions. The problem was information asymmetry compounded by language barriers. AI turned out to be a powerful solution because natural language processing could bridge the language gap while structured analysis could organize complex property data into actionable insights. But the starting point was always the person with the problem, not the model with the capability.
Choosing the Right AI Model for the Job
Not every problem needs the most powerful model available. One of the most practical lessons we learned is that matching the model to the task is critical for both performance and economics.
For CaseCraft, our legal template platform, the core task is generating jurisdiction-specific legal documents from user inputs. This requires high accuracy, sensitivity to legal language, and the ability to adapt to specific state laws. We use the most capable models available for document generation because errors in legal documents have real consequences. The cost per generation is higher, but users are creating documents they will file in court, so reliability is non-negotiable.
In contrast, NeuraCraft, our brain training game collection, uses AI primarily for generating puzzle content and adapting difficulty levels. These tasks can tolerate more variation in output quality, and faster, less expensive models perform well. The user experience depends on variety and responsiveness, not on the precision of each individual generation.
The principle is straightforward: understand the cost of errors in your specific context, and choose the model that delivers the right accuracy-to-cost ratio. Overbuilding with the most expensive model everywhere will destroy your unit economics. Underbuilding with cheap models in high-stakes contexts will destroy user trust.
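One way to make this principle concrete is to compare each model's expected cost per request: the price of the call plus the error rate multiplied by what an error costs in that context. The sketch below is illustrative only. The tier names, prices, error rates, and error costs are hypothetical placeholders, not figures from our products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_call_usd: float    # hypothetical price per generation
    expected_error_rate: float  # hypothetical share of outputs with a material error


def expected_cost(tier: ModelTier, cost_of_error_usd: float) -> float:
    """Call price plus the expected cost of errors for one request."""
    return tier.cost_per_call_usd + tier.expected_error_rate * cost_of_error_usd


def choose_tier(tiers: list[ModelTier], cost_of_error_usd: float) -> ModelTier:
    """Pick the tier with the lowest expected cost in this context."""
    return min(tiers, key=lambda t: expected_cost(t, cost_of_error_usd))


# Hypothetical tiers, for illustration only.
TIERS = [
    ModelTier("small-fast", cost_per_call_usd=0.002, expected_error_rate=0.05),
    ModelTier("large-capable", cost_per_call_usd=0.060, expected_error_rate=0.005),
]

# A flawed puzzle costs little; a flawed legal document costs a lot.
puzzle_choice = choose_tier(TIERS, cost_of_error_usd=0.10)
legal_choice = choose_tier(TIERS, cost_of_error_usd=500.0)
```

With these placeholder numbers the cheaper tier wins for puzzle content and the more capable tier wins for legal documents. In practice the error rates have to be measured on your own task, and the cost of an error is usually a judgment call rather than a known number.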
User Trust and Transparency
AI products have a trust problem. Users have been burned by products that overpromise and underdeliver, that hallucinate confidently, or that obscure how they work. Building trust requires deliberate design decisions that prioritize transparency even when it is commercially inconvenient.
In every product we build, we follow three rules for trust:
- Be explicit about what the AI can and cannot do. CaseCraft clearly states that it generates legal templates for informational purposes, not legal advice, and recommends attorney review for complex matters. This is not just a legal disclaimer; it is woven into the user experience at every decision point.
- Show your work. When Sumika generates a property analysis, it cites the data sources and explains the reasoning. Users can trace any conclusion back to the underlying information and decide for themselves whether they agree with the assessment.
- Fail gracefully. When the AI is uncertain, it says so. We designed our systems to express confidence levels and to default to conservative outputs when input data is ambiguous. A system that says "I don't have enough information to assess this risk" is more trustworthy than one that generates a confident-sounding but unsupported conclusion.
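The second and third rules can be sketched as a small presentation layer that refuses to show a conclusion unless it has both sufficient confidence and at least one source to cite. This is a minimal sketch under assumptions: the `Assessment` type, the 0.7 threshold, and the idea that a confidence score arrives from an upstream step are all hypothetical, not a description of how Sumika or CaseCraft is implemented.

```python
from dataclasses import dataclass, field

ABSTAIN_MESSAGE = "I don't have enough information to assess this risk."


@dataclass
class Assessment:
    conclusion: str
    confidence: float  # 0.0 to 1.0, assumed to be supplied by an upstream step
    sources: list[str] = field(default_factory=list)


def present(assessment: Assessment, min_confidence: float = 0.7) -> str:
    """Return a cited conclusion, or a conservative fallback when support is weak."""
    # Default to the conservative output if confidence is low
    # or if there is nothing the user could trace the conclusion back to.
    if assessment.confidence < min_confidence or not assessment.sources:
        return ABSTAIN_MESSAGE
    cited = "; ".join(assessment.sources)
    return (
        f"{assessment.conclusion} "
        f"(confidence {assessment.confidence:.0%}; sources: {cited})"
    )


# Example inputs are invented for illustration.
supported = present(Assessment("Flood risk is low", 0.9, ["hazard map"]))
unsupported = present(Assessment("Flood risk is low", 0.4, ["hazard map"]))
```

The design choice worth noting is that an uncited conclusion is treated the same as a low-confidence one: if the user cannot check it, the system does not assert it.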
Iteration and User Feedback
No AI product reaches its ideal form in the first version. The gap between what you think users need and what they actually need is always larger than you expect. Closing that gap requires a tight feedback loop and the willingness to change direction based on what you learn.
Sacred Trails, our pilgrimage route companion app, illustrates this well. We initially focused on AI-generated route recommendations, thinking that personalized itinerary planning would be the core value proposition. User feedback told us otherwise. What users actually wanted most was practical information: temple hours, transportation connections, accommodation options, and safety tips. The AI was most valuable not for generating creative itineraries but for organizing and presenting factual information in the user's language at the moment they needed it.
We restructured the product around this insight, deprioritizing AI-generated recommendations and focusing on AI-powered information retrieval and contextual presentation. Engagement metrics improved significantly because we were solving the problem users actually had rather than the problem we assumed they had.
Pricing and Accessibility
AI products are expensive to run. Every API call has a cost, and that cost scales with usage. The temptation is to pass these costs directly to users through per-use pricing, but this creates anxiety and discourages the exploration that makes AI products valuable.
We have experimented with multiple pricing models across our products and arrived at a few principles. Subscription models work best for products with regular usage patterns. Per-transaction pricing works for high-value, low-frequency use cases like legal document generation. Free tiers are essential for building trust, because users need to experience the product before they can evaluate whether it is worth paying for.
Tsukisumi Card, our Japanese learning tool, uses a freemium model where core functionality is available for free and premium features require a subscription. This works because the free tier is genuinely useful on its own, which builds the trust and habit formation that converts users to paid plans. A free tier that feels crippled or deliberately limited does more harm than good.
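The cost pressure behind these pricing choices can be made concrete with simple arithmetic: a flat subscription is profitable only while a subscriber's usage stays below a break-even number of API calls. The sketch below uses integer cents to keep the arithmetic exact. The price and per-call cost are hypothetical and do not reflect our actual pricing.

```python
def monthly_margin_cents(price_cents: int, calls: int, cost_per_call_cents: int) -> int:
    """Subscription revenue minus API spend for one subscriber in one month."""
    return price_cents - calls * cost_per_call_cents


def break_even_calls(price_cents: int, cost_per_call_cents: int) -> int:
    """Largest number of calls a subscriber can make before margin goes negative."""
    return price_cents // cost_per_call_cents


# Hypothetical figures: a $10.00 plan and $0.02 per API call.
PRICE_CENTS = 1000
COST_PER_CALL_CENTS = 2

typical_user = monthly_margin_cents(PRICE_CENTS, 200, COST_PER_CALL_CENTS)
heavy_user = monthly_margin_cents(PRICE_CENTS, 800, COST_PER_CALL_CENTS)
limit = break_even_calls(PRICE_CENTS, COST_PER_CALL_CENTS)
```

Under these assumptions a typical subscriber leaves a positive margin and a heavy one does not, which is why subscriptions suit regular, predictable usage and per-transaction pricing suits infrequent, high-value requests.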
What We Have Learned
Five launches have taught us that building AI products that actually help people is less about technical sophistication and more about discipline. Discipline to start with real problems. Discipline to choose the right tool for each task rather than the most impressive one. Discipline to be honest with users about what the technology can do. Discipline to listen to feedback and change direction when the evidence demands it. And discipline to price products in a way that makes them accessible to the people who need them.
At Evelyn AI, every product we build starts with the same question: will this make someone's life meaningfully better? If the answer is not a clear yes, we do not build it. Technology is only as valuable as the problems it solves, and the best AI products are the ones where users forget they are using AI at all because the experience just works.