AI Application Engineer

I ship LLM features into production SaaS, not prototypes.

Python backend, multi-LLM pipelines, full-stack delivery. The model call is the easy part. I build everything around it that survives production: async orchestration, cost control, billing, mobile, infrastructure.

Virko Kask

Selected work

Three things I shipped, and what they taught me.

Live products, not toy projects. Each one is a story about the part that is actually hard: making AI reliable, affordable and shippable.

Solo, end to end

Typeza

A production AI SaaS built alone, from database to App Store.

Problem

Most AI features die between the demo and production. Building a full AI SaaS solo means owning the whole chain, not just the model call: the async pipeline, the billing, the mobile app, the infrastructure that survives a deploy.

Approach

Each analysis fans out 15 LLM calls at once through a Celery and Redis pipeline, then a consensus step weighs whatever returns instead of failing on a slow model. Real-time transcription runs over WebSockets. React web, native SwiftUI iOS, Stripe and Apple IAP, all on AWS ECS managed with Terraform.
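A minimal sketch of the fan-out-and-weigh idea, using asyncio in place of Celery and Redis so it runs standalone. The names `call_model` and `fan_out`, the timeout, and the majority-vote rule are illustrative assumptions, not the production code.

```python
import asyncio
from collections import Counter


async def call_model(delay: float, answer: str) -> str:
    """Hypothetical stand-in for one LLM call; sleeps instead of hitting an API."""
    await asyncio.sleep(delay)
    return answer


async def fan_out(calls, timeout: float, min_results: int) -> str:
    """Run all calls concurrently and vote over whatever finished in time."""
    tasks = [asyncio.ensure_future(c) for c in calls]
    done, pending = await asyncio.wait(tasks, timeout=timeout)

    # Slow models are dropped, not waited for.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    # Keep only calls that returned without raising.
    answers = [t.result() for t in done if t.exception() is None]
    if len(answers) < min_results:
        raise RuntimeError(f"only {len(answers)} of {len(tasks)} calls returned")

    # Assumed consensus rule: simple majority over the surviving answers.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


if __name__ == "__main__":
    calls = [call_model(0.01, "A"), call_model(0.02, "A"), call_model(5.0, "B")]
    print(asyncio.run(fan_out(calls, timeout=0.5, min_results=2)))
```

The point of the design is the `min_results` threshold: one slow or failed model lowers the sample size but does not fail the analysis.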

15 parallel LLM calls per analysis
< 3 min end-to-end processing
4 ECS services, 100% Terraform

Django, Celery, Redis, OpenRouter, React, SwiftUI, AWS ECS, Terraform

Architecture

React / iOS (client) -> Django API (DRF) -> enqueue -> Celery + Redis (task queue) -> fan-out -> 15 parallel LLM calls via OpenRouter (< 3 min) -> weigh -> Consensus (partial-tolerant) -> Report (structured).
Supporting: WebSocket (live transcript), Stripe + Apple IAP (billing), AWS ECS x4 + Terraform (infra).

Lesson

The model call is one line of code. Orchestration, partial-failure handling and billing are the actual product.

View it live: typeza.ai

High-stakes AI

PrepDoc

A dual-LLM consensus engine for personalized health reports.

Problem

In health AI, a single model answering with confidence is a liability. The report cannot invent a supplement interaction that does not exist, or miss one that does. One opinion is not enough to stand behind.

Approach

Two models, Claude and Grok, analyze the same data independently, then challenge each other across three rounds. Agreements become the report. Disagreements are shown to the user honestly, with each model's rating, instead of being papered over. Seven API calls per report, streamed over SSE, with hourly spend caps so a bug cannot drain the budget.
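The split between agreements and disagreements can be sketched as a pure merge step. This is an illustration only: the `Finding` shape, the rating labels and the `merge_findings` name are assumptions, and the debate rounds that would precede this step are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    topic: str
    rating: str  # hypothetical scale, e.g. "safe", "caution", "avoid"


def merge_findings(claude: list[Finding], grok: list[Finding]) -> dict:
    """Split two models' final findings into agreed and disputed topics."""
    a = {f.topic: f.rating for f in claude}
    b = {f.topic: f.rating for f in grok}

    agreed, disputed = {}, {}
    for topic in sorted(a.keys() | b.keys()):
        rating_a, rating_b = a.get(topic), b.get(topic)
        if rating_a == rating_b:
            agreed[topic] = rating_a
        else:
            # Keep both ratings; None means that model did not raise the topic.
            disputed[topic] = {"claude": rating_a, "grok": rating_b}
    return {"agreed": agreed, "disputed": disputed}
```

A topic raised by only one model lands in `disputed` rather than being dropped, which matches the goal of showing disagreement instead of hiding it.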

7 API calls per report
$2 to $4 compute cost per report
~90% margin on AI compute

FastAPI, Next.js, Claude, Grok, SSE, Stripe, PostgreSQL

Architecture

Upload + survey (intake) -> FastAPI (async backend) -> dispatch -> Claude and Grok (2 models, independent) -> Debate (3 rounds, challenge + revise) -> Consensus + diffs (honest disagreement) -> SSE -> Report ($2 to $4 / report).
Supporting: Spend caps (hourly guardrail), Stripe webhooks (fulfillment), Next.js (streaming UI).
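An hourly spend cap of the kind described in the approach could look roughly like this. It is a single-process, in-memory sketch; the class name, the fixed clock-hour window and the injected `clock` are assumptions, and a multi-worker deployment would need a shared counter instead.

```python
import time


class SpendCapExceeded(RuntimeError):
    """Raised when a charge would push the current hour over its limit."""


class HourlySpendCap:
    def __init__(self, limit_usd: float, clock=time.time):
        self.limit_usd = limit_usd
        self._clock = clock      # injectable so the window can be tested
        self._window = None      # index of the current clock hour
        self._spent = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record a cost, or refuse it before the API call is made."""
        window = int(self._clock() // 3600)
        if window != self._window:
            # New clock hour: reset the running total.
            self._window = window
            self._spent = 0.0
        if self._spent + cost_usd > self.limit_usd:
            raise SpendCapExceeded(
                f"hourly cap of ${self.limit_usd:.2f} would be exceeded"
            )
        self._spent += cost_usd
```

Calling `charge` before each model request means a runaway loop stops at the cap instead of at the end of the billing cycle.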

Lesson

One model gives you an answer. Two models give you a confidence level. The disagreements turned out to be the most valuable output.

View it live: prepdoc.ai

Reliability

MLB Scoring Engine

A daily data pipeline that has run untouched for a year and a half.

Problem

Data products do not fail on the clever parts. They fail on the boring ones: the API that changes shape, the run that silently stops, the tab limit nobody planned for. Reliability is the feature.

Approach

Scheduled cron jobs pull three sports data sources every day, score batters over a rolling 14-day window and publish rankings to a Next.js dashboard. No server to babysit, and the job cleans up after itself so it never hits a storage limit.
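The rolling-window step can be sketched in a few lines. The scoring rule here (mean of per-game points) and the `(batter, day, points)` input shape are placeholders chosen for illustration; the real scoring formula is not described on this page.

```python
from collections import defaultdict
from datetime import date, timedelta


def rank_batters(games, as_of: date, window_days: int = 14):
    """Rank batters by mean points over the last `window_days` days, inclusive.

    `games` is an iterable of (batter, day, points) tuples (assumed shape).
    """
    start = as_of - timedelta(days=window_days - 1)

    points_by_batter = defaultdict(list)
    for batter, day, points in games:
        # Anything older than the window is ignored.
        if start <= day <= as_of:
            points_by_batter[batter].append(points)

    scores = {b: sum(p) / len(p) for b, p in points_by_batter.items()}
    # Highest score first; ties broken by name so the output is stable.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Because the window is recomputed from dated rows on every run, a missed day corrects itself on the next run instead of corrupting a running total.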

18+ mo in production
0 maintenance, zero incidents
3 data sources, daily

Python, Statcast API, MLB Stats API, Cron, Next.js, PostgreSQL

Architecture

Statcast, MLB Stats API, Weather (3 sources, daily) -> ingest -> Scoring engine (14-day window) -> Rankings (daily output) -> Next.js dashboard (public view).
Supporting: Cron jobs (scheduled, no server).

Lesson

Boring is good. Reliability is not luck, it is a design choice: jobs that clean up after themselves, fail loudly, and never need a human at 2am.

View it live: venomanalytics.io

Working together

Rates are public, so you can self-select.

I work with startups that need AI in their product but cannot justify a $150k+ full-time hire. Geographic arbitrage, not a discount.

Backend development

$80 / hour

Python APIs, async pipelines, data systems, infrastructure. Django or FastAPI.

Most requested

AI / LLM integration

$120 / hour

Multi-model pipelines, consensus engines, streaming, cost control. The part that survives production.

Project minimum

$5,000 fixed

Smaller than that is rarely worth the ramp-up for either of us.

Productized

AI Report Engine

Questionnaire or upload, an LLM pipeline, then paid personalized reports. I have built this exact shape three times in production. Live in three weeks.

from $8k
fixed scope

Not a fit

  • I integrate models, I do not train them.
  • No WordPress.
  • No pure frontend work.

Proof

Track record, not adjectives.

Live products are a stronger signal than any portfolio copy. Behind them: 10+ years of shipping production code and 10+ projects delivered across SaaS, data and AI.

10+ products shipped to production
10+ yrs shipping production code
4 products built solo, end to end
18+ mo zero-incident data pipeline

EU-registered company (Estonia). Proper contracts (MSA and SOW), clean invoicing in USD or EUR, and W-8BEN-E ready for US clients.

What I reach for

Python, FastAPI, Django, TypeScript, React, Next.js, PostgreSQL, Redis, Celery, AWS, Terraform, Docker, Claude / OpenAI APIs, Stripe

Contact

Tell me what you want to build.

Email me the AI feature you have in mind, your stack and your timeline. I reply within 24 hours with a technical fit assessment. Free, and no sales call required.

Async-first. I will tell you honestly if it is not something I should build.