
Building an AI Photo Animation SaaS from Scratch — Architecture, Payments & Lessons

By Alex, JBA Agency · Published March 21, 2026 · 10 min read
Price per premium animation: $1.50
Free tier: available
Core tech layers: 4
AI engine: Grok AI

The Concept: Bringing Old Photos to Life with AI

Every family has a box of old photographs — grandparents in black and white, faded portraits from the 1950s, moments frozen in time that feel distant because the medium itself is foreign to modern eyes. The premise behind JBA Agency's photo animation tool is simple: use AI to close that distance. Upload a still image, and the AI generates a subtle, realistic animation — a slight head turn, a gentle breath, eyes that blink — transforming a photograph into something that feels alive.

This is not an acquired client project. This is JBA Agency's own product — built internally, dogfooding our own SaaS development capability, and deployed publicly at jbagency.ro/animate/. It serves as both a real utility for users and a live proof of concept for prospective SaaS development clients who want to understand what we can build for them.

"Building your own product is the most honest demonstration of your development capability. We didn't just claim we could build AI SaaS products — we built one, shipped it, and let anyone use it to verify the claim." — Alex, Founder, JBA Agency

The tool supports two use cases: animating still photographs (adding lifelike motion to static images) and colorizing black-and-white photos with AI-generated color. Both capabilities run through the same pipeline, with different parameters passed to the underlying AI model.

Want to see it in action? Try the live AI photo animation tool — free for your first generation.


Tech Stack: Every Decision Explained

Node.js + Express: API server & request routing
PostgreSQL: user accounts, credits, history
Stripe: payments & webhook events
Grok AI API: video generation from stills
Multer: file upload handling
Helmet.js: HTTP security headers

Why Node.js and Express?

For a SaaS built around AI API calls, Node.js's non-blocking I/O model is a natural fit. AI generation requests — particularly video generation from still images — are slow operations: anywhere from 8 to 45 seconds depending on model load. Node.js handles these as async operations without tying up server threads, meaning hundreds of concurrent requests can be in-flight while each waits for the AI API to respond.
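
To make the concurrency point concrete, here is a minimal sketch in which a timer stands in for the AI API. `fakeGenerate` and its 20 ms delay are invented for the demo; the real call takes seconds, but the scheduling behaviour is the same.

```javascript
// Simulated slow AI call (hypothetical stand-in for the real API request).
let inFlight = 0;
let maxInFlight = 0;

function fakeGenerate(id) {
  inFlight += 1;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise((resolve) => {
    setTimeout(() => {
      inFlight -= 1;
      resolve(`video-${id}`);
    }, 20);
  });
}

// Start five "requests" at once; none blocks the others while it waits.
async function runDemo() {
  const results = await Promise.all([1, 2, 3, 4, 5].map((id) => fakeGenerate(id)));
  return { results, maxInFlight };
}
```

All five calls are in flight at the same moment on a single thread, which is the property the animate endpoint relies on while it waits for video generation.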

Express was chosen over alternatives like Fastify or Koa because of ecosystem maturity — the middleware for security (helmet), file uploads (multer), rate limiting (express-rate-limit), and CORS handling all integrate cleanly with Express and are well-documented. For a SaaS MVP that needs to be production-ready quickly, betting on the most widely-understood framework reduces maintenance risk.

Why PostgreSQL?

The data model for this SaaS has a clear relational structure: users have credit balances, credit balances change via Stripe payment events, and each animation generation deducts from a user's balance and creates a history record. These operations need ACID transactions. A datastore without multi-row transactional guarantees would introduce consistency risks: a generation that completes while the credit deduction fails, or a payment webhook that is applied twice, would be catastrophic for a billing-critical system. PostgreSQL's transactions and constraints let us rule these cases out.

Why Grok AI for the Animation Engine?

The animation quality of an AI photo tool lives or dies on the underlying model. Grok AI's video generation API — specifically its image-to-video capability — produces realistic, temporally consistent motion from still photographs. The model understands facial geometry, lighting direction, and natural motion patterns, generating animations that feel physically plausible rather than jittery or uncanny.

Critically, Grok's API pricing model is per-generation, which aligns cleanly with a freemium SaaS business model: the cost of an API call is known in advance, enabling precise margin calculation for each tier. There are no unpredictable token costs that could erode margins at scale.

Architecture: How the System Is Structured

REQUEST FLOW (Animate Tool)

User Browser
   │ POST /api/animate (multipart/form-data)
   ▼
Express Middleware Stack
   ├── Helmet (security headers)
   ├── CORS (origin whitelist)
   ├── Rate Limiter (IP + user-based)
   ├── Auth middleware (JWT validation)
   └── Multer (file parse & validate)
   │
   ▼
Route Handler: /api/animate
   ├── Credit check (PostgreSQL tx open)
   ├── Deduct 1 credit (tx write)
   ├── POST image to Grok AI API
   │    └── await video URL (8–45s)
   ├── Store result in history table
   └── tx commit → return video URL
   │
   ▼
User Browser receives animated video

The key architectural principle: the credit deduction and the AI API call are coupled inside a single database transaction. If the Grok API call fails, the transaction rolls back and the user keeps their credit. If the API succeeds but the database write fails (unlikely, but possible), the transaction also rolls back, so the user still keeps their credit and can retry. No credit is spent without a delivered animation.
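
A minimal sketch of that transaction pattern follows. It is not the production handler. It assumes `client` is a node-postgres client checked out from a pool, `generateVideo` is a hypothetical function that calls the AI API and resolves to a video URL, and the table and column names are illustrative.

```javascript
// Deduct one credit, generate the video, and record it, all in one transaction.
async function animateWithCredit(client, userId, imageBuffer, generateVideo) {
  await client.query('BEGIN');
  try {
    // Deduct only if a credit is available; zero rows means none are left.
    const deducted = await client.query(
      'UPDATE users SET credits = credits - 1 WHERE id = $1 AND credits > 0 RETURNING credits',
      [userId]
    );
    if (deducted.rowCount === 0) {
      throw new Error('No credits remaining');
    }

    const videoUrl = await generateVideo(imageBuffer); // slow step: 8 to 45 s

    await client.query(
      'INSERT INTO history (user_id, video_url) VALUES ($1, $2)',
      [userId, videoUrl]
    );
    await client.query('COMMIT');
    return videoUrl;
  } catch (err) {
    // Any failure undoes the deduction, so the user keeps the credit.
    await client.query('ROLLBACK');
    throw err;
  }
}
```

One trade-off of this design is that the transaction, and therefore one pooled connection, stays open for as long as the generation takes, so the pool size bounds the number of simultaneous generations.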

File Handling with Multer

Multer is configured with strict constraints before a file ever reaches the route handler:

const multer = require('multer');

const upload = multer({
  storage: multer.memoryStorage(),   // keep uploads in memory; never write to disk unvalidated
  limits: {
    fileSize: 10 * 1024 * 1024,      // 10MB max
    files: 1                          // single file per request
  },
  // Reject any upload whose declared MIME type is not an accepted image format
  fileFilter: (req, file, cb) => {
    const allowed = ['image/jpeg', 'image/png', 'image/webp'];
    if (!allowed.includes(file.mimetype)) {
      return cb(new Error('Only JPEG, PNG, and WebP images are accepted'));
    }
    cb(null, true);
  }
});

Images are held in memory (not written to disk) during validation and then streamed directly to the Grok API. This eliminates an entire class of file system security risks and avoids the operational overhead of managing a temporary file directory with cleanup jobs.
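
Because `file.mimetype` reflects what the client declared, an in-memory buffer also makes it cheap to check the file's leading bytes. The sketch below is an optional extra check, not something the article states the tool performs; `sniffImageType` is a hypothetical helper name.

```javascript
// Returns the MIME type implied by the file's magic bytes, or null if the
// bytes match none of the three accepted formats.
function sniffImageType(buf) {
  // JPEG files start with FF D8 FF.
  if (buf.length >= 3 && buf[0] === 0xff && buf[1] === 0xd8 && buf[2] === 0xff) {
    return 'image/jpeg';
  }
  // PNG files start with a fixed 8-byte signature.
  const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (buf.length >= 8 && buf.subarray(0, 8).equals(pngSignature)) {
    return 'image/png';
  }
  // WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
  if (
    buf.length >= 12 &&
    buf.toString('latin1', 0, 4) === 'RIFF' &&
    buf.toString('latin1', 8, 12) === 'WEBP'
  ) {
    return 'image/webp';
  }
  return null;
}
```

In a route handler this would run against `req.file.buffer`, rejecting the request when the result is `null` or disagrees with `req.file.mimetype`.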

Payment Integration: Stripe with Webhooks

Stripe powers the credit purchase flow. Users buy credit packs — not subscriptions — which matches the use case: someone who wants to animate a dozen family photos will buy a credit pack, use it, and may not return for months. A subscription model would create churn friction. Per-purchase credits create a frictionless "pay when you need it" experience.

The Checkout Flow

  1. User clicks "Buy Credits" — frontend calls POST /api/payments/create-checkout
  2. Server creates a Stripe Checkout Session with metadata: { userId, creditsPurchased }
  3. User is redirected to Stripe's hosted checkout page (Stripe handles card data — we never see it)
  4. On successful payment, Stripe fires a checkout.session.completed webhook event to our server
  5. Webhook handler verifies the event signature using stripe.webhooks.constructEvent()
  6. Verified event triggers a PostgreSQL UPDATE: credits = credits + purchased_amount WHERE user_id = ?
  7. User is redirected to a success page; their balance reflects the purchase as soon as the webhook has been processed
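
Step 2 of the flow above could look roughly like the sketch below. It assumes `stripe` is an initialised Stripe SDK client passed in by the caller, that prices are in US dollars, and that the success and cancel URLs shown exist; the URLs and the product name are hypothetical.

```javascript
const PRICE_PER_CREDIT_CENTS = 150; // $1.50 per credit, per the pricing model

// Creates a one-off Checkout Session for a pack of credits.
async function createCreditCheckout(stripe, userId, credits) {
  if (!Number.isInteger(credits) || credits < 1) {
    throw new Error('credits must be a positive integer');
  }
  return stripe.checkout.sessions.create({
    mode: 'payment', // single purchase, not a subscription
    line_items: [{
      quantity: credits,
      price_data: {
        currency: 'usd', // assumption
        unit_amount: PRICE_PER_CREDIT_CENTS,
        product_data: { name: 'Animation credit' }, // hypothetical name
      },
    }],
    // Metadata values are strings; the webhook handler reads these back.
    metadata: { userId: String(userId), creditsPurchased: String(credits) },
    success_url: 'https://jbagency.ro/animate/?checkout=success',   // hypothetical
    cancel_url: 'https://jbagency.ro/animate/?checkout=cancelled',  // hypothetical
  });
}
```

The route handler would return the session's `url` to the frontend, which then redirects the user to Stripe's hosted page.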

Why Webhooks, Not Redirect Callbacks?

"Never trust the client for payment confirmation. A user closing the browser tab, losing connectivity, or manipulating redirect parameters can all cause a redirect-based payment confirmation to fail silently. Webhooks from Stripe's servers, verified by signature, are the only reliable source of truth for payment events." — Alex, JBA Agency

The webhook endpoint uses raw body parsing (not JSON-parsed) to preserve the exact bytes for signature verification. Stripe signs webhooks with a secret; if the computed HMAC doesn't match, the request is rejected with a 400 status before any business logic runs. This prevents spoofed payment confirmations from an attacker who knows the endpoint URL.

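// Webhook signature verification
const express = require('express');
// Assumes the Stripe secret key is supplied via the STRIPE_SECRET_KEY env var
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY);

const app = express();

app.post('/api/payments/webhook',
  express.raw({ type: 'application/json' }),  // raw body required for the signature check
  async (req, res) => {
    const sig = req.headers['stripe-signature'];
    let event;
    try {
      // Throws if the signature does not match the raw payload
      event = stripe.webhooks.constructEvent(
        req.body, sig, process.env.STRIPE_WEBHOOK_SECRET
      );
    } catch (err) {
      return res.status(400).send(`Webhook error: ${err.message}`);
    }
    // handle verified event...
    res.json({ received: true });  // acknowledge receipt so Stripe stops retrying
  }
);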
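
For readers curious what the signature check involves, the sketch below reimplements the documented scheme with Node's built-in `crypto` module: an HMAC-SHA256 over `timestamp.payload`, compared against the `v1` value in the `Stripe-Signature` header. It is for illustration only; in production the SDK's `constructEvent()` should do this work. The function name and the 300-second tolerance default are assumptions of the sketch.

```javascript
const crypto = require('node:crypto');

// Header format: "t=<unix seconds>,v1=<hex hmac>[,v1=<hex hmac>...]"
function verifyStripeStyleSignature(rawBody, header, secret, toleranceSec = 300,
  nowSec = Math.floor(Date.now() / 1000)) {
  const parts = header.split(',').map((p) => p.split('='));
  const timestamp = parts.find(([k]) => k === 't')?.[1];
  const candidates = parts.filter(([k]) => k === 'v1').map(([, v]) => v);
  if (!timestamp || candidates.length === 0) return false;

  // Reject stale events to limit replay of captured requests.
  if (Math.abs(nowSec - Number(timestamp)) > toleranceSec) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${rawBody}`, 'utf8')
    .digest();

  // Constant-time comparison; lengths must match before timingSafeEqual.
  return candidates.some((hex) => {
    const given = Buffer.from(hex, 'hex');
    return given.length === expected.length && crypto.timingSafeEqual(given, expected);
  });
}
```

This also shows why the body must stay unparsed: re-serialising parsed JSON can change the bytes, and the HMAC would no longer match.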

Idempotency: Handling Duplicate Webhooks

Stripe may deliver the same webhook event more than once — network failures, retries, and edge cases can all cause duplication. The webhook handler checks the Stripe event ID against a processed_events table before applying any credit changes. If the event ID already exists, the handler returns 200 (acknowledging receipt) without reprocessing. This ensures a user never receives double credits from a single payment, regardless of how many times Stripe retries delivery.
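
One way to sketch this check is shown below. It is a variant of the approach described: instead of a separate lookup, it lets a unique constraint on `processed_events.event_id` decide, which also covers two deliveries arriving at the same moment. `client` is assumed to be a node-postgres client, and the column names are illustrative.

```javascript
// Returns true if credits were applied, false if the event was a duplicate.
async function applyCheckoutEvent(client, event) {
  const { userId, creditsPurchased } = event.data.object.metadata;
  await client.query('BEGIN');
  try {
    // For a repeated event ID the insert affects zero rows.
    const claimed = await client.query(
      'INSERT INTO processed_events (event_id) VALUES ($1) ON CONFLICT (event_id) DO NOTHING',
      [event.id]
    );
    if (claimed.rowCount === 0) {
      await client.query('ROLLBACK');
      return false; // already processed; the caller still responds 200
    }
    await client.query(
      'UPDATE users SET credits = credits + $1 WHERE id = $2',
      [Number(creditsPurchased), userId]
    );
    await client.query('COMMIT');
    return true;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}
```

Recording the event ID and adding the credits in the same transaction means a crash between the two steps cannot leave an event marked as processed without the credits applied.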

Security: Defense in Depth

Helmet.js Security Headers

Every response from the API server includes a hardened set of HTTP security headers via Helmet.js. Key headers include:

Rate Limiting: Two-Layer Defense

The animate endpoint has two independent rate limiters applied in sequence: one keyed by client IP address and one keyed by the authenticated user.

The user-based limit is stored in PostgreSQL rather than in the Node process's memory, so it persists across server restarts and works correctly in a multi-instance deployment without adding a separate store such as Redis.
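
A database-backed user limit can be sketched as a fixed-window counter. The version below is an assumption about how such a limiter might look, not the tool's actual code: it presumes a hypothetical `rate_limits` table with a unique constraint on `(user_id, window_start)`, and the limit and window size are left as parameters because the article does not state them.

```javascript
// Counts this request against the user's current window and reports whether
// it is still within the limit. `db` is any object with a pg-style query().
async function allowUserRequest(db, userId, limit, windowMs, nowMs = Date.now()) {
  // Round down to the start of the current window.
  const windowStart = new Date(Math.floor(nowMs / windowMs) * windowMs);
  const result = await db.query(
    `INSERT INTO rate_limits (user_id, window_start, hits)
     VALUES ($1, $2, 1)
     ON CONFLICT (user_id, window_start)
     DO UPDATE SET hits = rate_limits.hits + 1
     RETURNING hits`,
    [userId, windowStart]
  );
  return result.rows[0].hits <= limit;
}
```

Because the increment is a single atomic upsert, every server instance sees the same count without any coordination of its own.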

CORS Configuration

The API's CORS policy uses a strict origin whitelist. Only requests originating from jbagency.ro and www.jbagency.ro are accepted. This prevents other websites from making authenticated cross-origin requests to the API on behalf of a logged-in user (CSRF via third-party sites). Preflight OPTIONS requests are handled explicitly and return the correct headers without invoking the auth middleware.
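
The whitelist behaviour can be sketched as a small Express-style middleware. This hand-rolled version is for illustration and assumes the site is served over HTTPS; the `cors` npm package provides the same behaviour through its `origin` option. The allowed headers and methods listed are assumptions.

```javascript
const ALLOWED_ORIGINS = new Set(['https://jbagency.ro', 'https://www.jbagency.ro']);

// Mount before the auth middleware so preflight requests need no token.
function corsWhitelist(req, res, next) {
  const origin = req.headers.origin;
  if (origin && ALLOWED_ORIGINS.has(origin)) {
    res.setHeader('Access-Control-Allow-Origin', origin);
    res.setHeader('Vary', 'Origin'); // caches must key on the request origin
    res.setHeader('Access-Control-Allow-Headers', 'Authorization, Content-Type');
    res.setHeader('Access-Control-Allow-Methods', 'GET, POST, OPTIONS');
  }
  if (req.method === 'OPTIONS') {
    res.statusCode = 204; // preflight ends here, with or without CORS headers
    res.end();
    return;
  }
  next();
}
```

A request from any other origin gets no `Access-Control-Allow-Origin` header, so the browser blocks the calling page from reading the response.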

[ Screenshot: AI Photo Animation tool UI — upload interface and before/after comparison ]

The Business Model: Freemium SaaS with Per-Generation Pricing

The pricing model is deliberately simple. Users get one free animation on account creation — enough to experience the product and judge quality — and then purchase credits to continue.

Free Tier

1 animation credit on signup. No credit card required. No time limit on using the free credit. This is not a trial — it's a permanent free tier with a single credit. The goal is to let users experience genuine value before any payment decision.

Premium — $1.50 per animation

Credits purchased in packs. No subscription, no auto-renewal, no hidden fees. Credits do not expire. Each credit produces one high-quality animated video (up to 8 seconds) or one colorized photo. Premium animations run at higher resolution and longer duration than the free tier.

"Per-generation pricing aligns incentives perfectly. We only earn when a user gets value. There's no subscription to cancel, no annual lock-in. If the product is good, people buy more credits. If it's not, they don't. It's the most honest SaaS pricing model there is." — Alex, JBA Agency

Unit Economics

Each animation generation has a known cost: the Grok API call plus a fraction of server compute. At $1.50 per credit, with the AI API cost well below that figure, the margin per generation is healthy, particularly because infrastructure costs are fixed and margin improves with volume. The free-tier credit costs us about the same to fulfil as a paid credit, but it is limited to one per user, so it functions as a capped acquisition cost rather than an ongoing loss.
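
The arithmetic behind that reasoning is simple enough to write down. In the sketch below only the $1.50 price comes from this article; the API and compute costs are placeholder inputs chosen to show the calculation, not real figures.

```javascript
// All amounts are in cents to avoid floating-point rounding.
function creditEconomics({ priceCents, apiCostCents, computeCostCents }) {
  const costCents = apiCostCents + computeCostCents;
  const marginCents = priceCents - costCents;
  return {
    marginCents,
    marginRatio: marginCents / priceCents,
    // The free credit costs the same to fulfil and is granted once per user,
    // so this is the maximum acquisition spend per signup.
    freeTierCostPerSignupCents: costCents,
  };
}

// Hypothetical cost inputs, for illustration only.
const example = creditEconomics({ priceCents: 150, apiCostCents: 40, computeCostCents: 5 });
```

With those placeholder costs the margin would be 105 cents per credit, and each signup would cost at most 45 cents in free generations.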

Challenges Encountered and How They Were Solved

Challenge 1: Generation Time UX

AI video generation takes between 8 and 45 seconds. Showing a user a loading spinner for 45 seconds without feedback is terrible UX — they'll assume the system crashed. We implemented a polling-based progress system: the generation request returns a job ID immediately, and the frontend polls /api/animate/status/:jobId every 3 seconds, receiving a progress percentage and estimated time remaining. Users see genuine progress, not a spinner, which dramatically improved perceived performance and reduced abandonment during generation.
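
On the frontend, the polling loop might look like the sketch below. `fetchStatus` is a hypothetical function that requests `/api/animate/status/:jobId`, and the response shape (`state`, `progress`, `videoUrl`) and the cap on polls are assumptions, since the article does not specify them.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls until the job finishes, reporting progress along the way.
async function pollJob(jobId, fetchStatus, {
  onProgress = () => {},
  intervalMs = 3000,      // poll every 3 seconds
  maxPolls = 40,          // hypothetical safety cap
  sleep = defaultSleep,
} = {}) {
  for (let i = 0; i < maxPolls; i += 1) {
    const status = await fetchStatus(jobId);
    if (status.state === 'done') return status.videoUrl;
    if (status.state === 'failed') throw new Error('Generation failed');
    onProgress(status.progress); // e.g. update a progress bar
    await sleep(intervalMs);
  }
  throw new Error('Timed out waiting for generation');
}
```

A caller would pass a `fetchStatus` built on `fetch()` and an `onProgress` callback that updates the progress indicator.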

Challenge 2: File Size vs. Quality Tradeoff

Early testing revealed that users uploading very large, high-resolution images (20MB+ DSLR photos) created generation times at the extreme end of the range without proportional quality gains — the AI model operates at its native resolution regardless of input resolution above a certain threshold. We added a server-side image downscaling step (using the sharp library) that resizes inputs above 4096px on the longest dimension before passing them to the API. This reduced average generation time by 31% with no perceptible quality loss.
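
The resize rule itself is a small calculation, sketched here as a pure function. The function name is hypothetical; the 4096px limit is the one described above.

```javascript
const MAX_EDGE = 4096;

// Returns the dimensions to resize to, preserving aspect ratio, or the
// original dimensions when the image is already within the limit.
function targetDimensions(width, height, maxEdge = MAX_EDGE) {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height };
  const scale = maxEdge / longest;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

With sharp, the same rule can be expressed in one call: `sharp(buffer).resize({ width: 4096, height: 4096, fit: 'inside', withoutEnlargement: true })`.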

Challenge 3: Grok API Failure Handling

AI APIs are not 100% reliable. The Grok API occasionally returns 500 errors, particularly under high load. The system implements exponential backoff retry logic: on a 500 or 503 response, the handler waits 2 seconds, then 4 seconds, then 8 seconds before failing permanently. On permanent failure, the transaction is rolled back (user keeps their credit), and the user receives an error message explaining the generation failed with their credit preserved. This transparency — "we tried, it failed, you haven't been charged" — is critical for user trust in a paid service.
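
The retry schedule can be sketched as follows. `callApi` is a hypothetical function resolving to an object with a `status` field, and the sketch reads the schedule as one initial attempt plus up to three retries, which is an interpretation of the description above.

```javascript
const RETRYABLE = new Set([500, 503]);
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries on 500/503 with waits of 2 s, 4 s, then 8 s before giving up.
async function callWithBackoff(callApi, delaysMs = [2000, 4000, 8000], sleep = defaultSleep) {
  for (let attempt = 0; ; attempt += 1) {
    const response = await callApi();
    if (!RETRYABLE.has(response.status)) return response;
    if (attempt === delaysMs.length) {
      // Every wait has been used; the caller rolls back the transaction.
      throw new Error(`Generation failed after ${attempt + 1} attempts`);
    }
    await sleep(delaysMs[attempt]);
  }
}
```

Non-retryable responses, including other error codes, are returned to the caller immediately so that a bad request is not retried pointlessly.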

Challenge 4: Animation Artifacts on Some Photo Types

The AI model performs best on portrait photographs with clear subject-background separation. Group photos, photos with complex backgrounds, and very old daguerreotypes with significant damage produce inconsistent results. Rather than hide this, we added type-specific guidance on the upload page: a short checklist of photo characteristics that produce the best results, and a note on photo types where results may vary. This sets honest expectations and reduces support requests from users who uploaded a 1910 group photograph expecting Hollywood-quality animation.

Results and What This Proves

The animate tool is live and used daily. Early user data shows:

Beyond the metrics, this project demonstrates something specific about JBA Agency's SaaS development capability: we built a complete, production-grade AI SaaS product — with payments, security, database design, AI integration, and a freemium business model — as an internal product, not as a theoretical exercise. Clients who want to build AI-powered SaaS products can see exactly what we'd build for them by using what we built for ourselves.

"The best portfolio piece is a live product with real users and real payments. You can't fake a working Stripe integration or a 99.6% uptime record. Ship something real." — Alex, JBA Agency

If you're evaluating JBA Agency for a SaaS development project, the animate tool is the most direct proof of capability we can offer. Try it at jbagency.ro/animate/. For the full scope of what we build — from SaaS platforms to AI integrations to 24-hour website delivery — see the about page. For AI strategy at the organizational level, explore the Fractional AI Officer service. Questions? Check the FAQ.

The AI photo animation tool is live. Try it free — no credit card required for your first generation.


About the Author

Alex is the founder of JBA Agency, a web development and AI solutions agency based in Romania. He built the animate tool from scratch as JBA Agency's flagship internal product — demonstrating SaaS development, AI integration, payment processing, and security architecture in a single live application. Alex works with clients across Europe to build web platforms, SaaS products, and AI-powered business tools.