How to Stream AI Responses from Rails API to React (SSE Magic ✨)

Ganesh Navale

🤔 Ever wonder how to stream AI responses to React from Rails?

You know that feeling when your users click "Improve with AI" and stare at a loading spinner for 30 seconds? Yeah, not great UX.

Let me show you how to make AI responses appear word-by-word like ChatGPT, without the complexity of WebSockets.


👉 The Problem

Traditional approach:

React app → Rails API → OpenAI → Wait 30 seconds → Full response

What user sees: Loading spinner forever 😴

The user experience:

  • Clicks button
  • Waits... and waits...
  • Wonders if it's broken
  • Considers closing the tab

✨ The Solution: Server-Sent Events (SSE)

Modern approach:

React app → Rails SSE endpoint → Streams tokens → Real-time updates

What user sees: AI typing word-by-word, just like ChatGPT ✨

Why SSE?

  • ✅ Simpler than WebSockets
  • ✅ Works over standard HTTP
  • ✅ Native browser streaming support (ReadableStream)
  • ✅ Perfect for one-way streaming (server → client)

🛠️ Let's Build It

Step 1: Rails API Backend

First, create the SSE helper:

# app/services/sse.rb
# (Rails also ships a similar ActionController::Live::SSE class you could use instead.)
class SSE
  def initialize(stream)
    @stream = stream
  end

  def write(data, event: nil)
    @stream.write("event: #{event}\n") if event
    @stream.write("data: #{data.to_json}\n\n")
  end

  def close
    @stream.close
  end
end
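To see exactly what this helper puts on the wire, you can hand it a StringIO instead of a live response stream (the class is reproduced below so the snippet runs standalone):

```ruby
require 'json'
require 'stringio'

# The SSE helper from above, reproduced so this snippet runs on its own.
class SSE
  def initialize(stream)
    @stream = stream
  end

  def write(data, event: nil)
    @stream.write("event: #{event}\n") if event
    @stream.write("data: #{data.to_json}\n\n")
  end
end

io = StringIO.new
sse = SSE.new(io)
sse.write({ token: "Hello" }, event: 'token')
sse.write({ done: true }, event: 'complete')

puts io.string
# event: token
# data: {"token":"Hello"}
#
# event: complete
# data: {"done":true}
```

Each frame is an optional `event:` line plus a `data:` line, terminated by a blank line. That blank line is what the client splits on.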

Step 2: Create the Controller

# app/controllers/api/v1/ai_improvements_controller.rb
class Api::V1::AiImprovementsController < ApplicationController
  include ActionController::Live

  def create
    response.headers['Content-Type'] = 'text/event-stream'
    response.headers['X-Accel-Buffering'] = 'no' # Disable nginx buffering
    response.headers['Access-Control-Allow-Origin'] = ENV['FRONTEND_URL']

    sse = SSE.new(response.stream)
    content = params[:content]

    # Your AI service that streams tokens
    AiContentImprover.new(content).improve_streaming do |token|
      sse.write({ token: token }, event: 'token')
    end

    sse.write({ done: true }, event: 'complete')

  rescue IOError, ActionController::Live::ClientDisconnected
    # Client disconnected mid-stream - this is normal
  ensure
    sse&.close # &. in case an error fired before sse was assigned
  end
end

Step 3: AI Service with Streaming

# app/services/ai_content_improver.rb
class AiContentImprover
  def initialize(content)
    @content = content
    @client = OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
  end

  def improve_streaming(&block)
    @client.chat(
      parameters: {
        model: "gpt-4-turbo-preview",
        messages: [
          { role: "system", content: "Improve this content. Make it clearer and more engaging." },
          { role: "user", content: @content }
        ],
        stream: proc do |chunk, _bytesize|
          text = chunk.dig("choices", 0, "delta", "content")
          block.call(text) if text
        end
      }
    )
  end
end
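For reference, each chunk the ruby-openai gem yields to the `stream` proc is a plain Hash shaped roughly like the one below (abridged). `dig` returns `nil` for chunks that carry no text, such as the final stop chunk, which is why the `if text` guard matters:

```ruby
# A streamed content chunk (abridged shape):
chunk = {
  "choices" => [
    { "delta" => { "content" => "Hello" }, "finish_reason" => nil }
  ]
}
chunk.dig("choices", 0, "delta", "content")  # => "Hello"

# The final chunk has an empty delta, so dig returns nil:
stop_chunk = { "choices" => [{ "delta" => {}, "finish_reason" => "stop" }] }
stop_chunk.dig("choices", 0, "delta", "content")  # => nil
```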

Step 4: Routes

# config/routes.rb
Rails.application.routes.draw do
  namespace :api do
    namespace :v1 do
      post 'ai_improvements', to: 'ai_improvements#create'
    end
  end
end

⚛️ React Frontend

Now the fun part - consuming the SSE stream in React:

import { useState, useRef, useEffect } from 'react'

function AiAssistant({ content }) {
  const [suggestion, setSuggestion] = useState('')
  const [loading, setLoading] = useState(false)
  const abortControllerRef = useRef(null)

  // Cleanup on unmount
  useEffect(() => {
    return () => {
      if (abortControllerRef.current) {
        abortControllerRef.current.abort()
      }
    }
  }, [])

  const improveContent = async () => {
    // Cancel any existing request
    if (abortControllerRef.current) {
      abortControllerRef.current.abort()
    }

    setLoading(true)
    setSuggestion('')
    abortControllerRef.current = new AbortController()

    try {
      const API_URL = 'http://localhost:3000'
      const response = await fetch(`${API_URL}/api/v1/ai_improvements`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Accept': 'text/event-stream',
        },
        body: JSON.stringify({ content }),
        signal: abortControllerRef.current.signal,
      })

      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`)
      }

      const reader = response.body.getReader()
      const decoder = new TextDecoder()
      let buffer = ''

      while (true) {
        const { value, done } = await reader.read()
        if (done) break

        buffer += decoder.decode(value, { stream: true })
        const lines = buffer.split('\n')
        buffer = lines.pop() || ''

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue

          const data = JSON.parse(line.slice(6))
          if (data.token) {
            setSuggestion(prev => prev + data.token)
          }
        }
      }
    } catch (error) {
      if (error.name !== 'AbortError') {
        console.error('Stream error:', error)
      }
    } finally {
      setLoading(false)
      abortControllerRef.current = null
    }
  }

  return (
    <div className="ai-assistant">
      <button 
        onClick={improveContent} 
        disabled={loading}
        className="improve-btn"
      >
        {loading ? '✨ AI thinking...' : '✨ Improve with AI'}
      </button>

      {suggestion && (
        <div className="suggestion-box">
          <p>{suggestion}</p>
        </div>
      )}
    </div>
  )
}

export default AiAssistant
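The subtlest part of that fetch loop is the buffering: a network chunk can end mid-line, so the last (possibly partial) line is held back until the next chunk arrives. Here is the same logic sketched in Ruby, fed a frame deliberately split across two chunks (the `-1` limit on `split` keeps trailing empty strings, matching JavaScript's `split`):

```ruby
require 'json'

# Mirrors the React loop: accumulate chunks, split on newlines,
# keep the trailing partial line in the buffer for the next round.
def parse_sse_chunks(chunks)
  buffer = ""
  tokens = []
  chunks.each do |chunk|
    buffer += chunk
    lines = buffer.split("\n", -1)
    buffer = lines.pop || ""          # hold back the incomplete last line
    lines.each do |line|
      next unless line.start_with?("data: ")
      data = JSON.parse(line[6..])
      tokens << data["token"] if data["token"]
    end
  end
  tokens
end

# A frame split mid-JSON across two network chunks still parses correctly:
chunks = ["data: {\"token\":\"Hel", "lo\"}\n\ndata: {\"token\":\" world\"}\n\n"]
tokens = parse_sse_chunks(chunks)
# => ["Hello", " world"]
```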

🎨 Add Some Style

.ai-assistant {
  margin: 20px 0;
}

.improve-btn {
  background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
  color: white;
  border: none;
  padding: 12px 24px;
  border-radius: 8px;
  cursor: pointer;
  font-weight: 600;
  transition: transform 0.2s;
}

.improve-btn:hover {
  transform: translateY(-2px);
}

.improve-btn:disabled {
  opacity: 0.6;
  cursor: not-allowed;
}

.suggestion-box {
  margin-top: 16px;
  padding: 16px;
  background: #f7fafc;
  border-left: 4px solid #667eea;
  border-radius: 4px;
  line-height: 1.6;
  white-space: pre-wrap;
}

🎉 The Magic Happens

When the user clicks "✨ Improve with AI":

  1. Browser creates connection using fetch with ReadableStream
  2. Rails streams AI tokens one-by-one
  3. JavaScript appends each token to the UI
  4. User sees AI "typing" in real-time
  5. Connection closes automatically when done

Result:

  • ✨ Token-by-token streaming
  • 📱 Native browser API (no library needed)
  • 🔄 Easy to add retry logic if needed
  • 🚀 Much cleaner than WebSockets for this use case

🔥 Advanced: Add Rate Limiting

Prevent users from spamming your expensive AI endpoint:

# app/controllers/api/v1/ai_improvements_controller.rb
before_action :check_rate_limit

def check_rate_limit
  key = "ai_improve:#{current_user.id}"
  count = Rails.cache.increment(key, 1, expires_in: 1.minute)

  return unless count > 5

  # Set the SSE content type here, since create hasn't run yet
  response.headers['Content-Type'] = 'text/event-stream'
  sse = SSE.new(response.stream)
  sse.write({ error: "⏸️ Slow down! Try again in a minute." }, event: 'error')
  sse.close # committing the response halts the callback chain, so create never runs
end
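If the fixed-window semantics of `Rails.cache.increment` feel opaque, here they are mimicked with a plain Hash (no TTL, purely illustrative): every call bumps the per-user counter, and only the calls past the limit get rejected.

```ruby
# In-memory stand-in for Rails.cache.increment within one window.
counters = Hash.new(0)

rate_limited = lambda do |user_id|
  key = "ai_improve:#{user_id}"
  counters[key] += 1   # Rails.cache.increment(key, 1, expires_in: 1.minute)
  counters[key] > 5    # reject once past the limit
end

results = 6.times.map { rate_limited.call(42) }
# => [false, false, false, false, false, true]
```

The sixth request in the window is the first one rejected; when the cache key expires, the counter resets and the user gets a fresh allowance.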

Update React to handle errors:

// In the stream parsing loop
if (data.error) {
  alert(data.error)
  return
}

💰 Bonus: Cost Tracking

Track your AI usage to avoid surprise bills:

# app/models/ai_usage.rb
class AiUsage < ApplicationRecord
  belongs_to :user

  def self.total_cost_today
    where('created_at >= ?', Time.zone.today).sum(:estimated_cost)
  end
end

# In your controller:
usage = AiUsage.create!(
  user: current_user,
  action: 'content_improvement',
  input_tokens: content.length / 4, # rough estimate
)

# After streaming completes (token_count and calculate_cost are left to you):
usage.update!(output_tokens: token_count, estimated_cost: calculate_cost)
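A rough way to sanity-check the numbers: about 4 characters per token is a common heuristic for English text. The per-1K-token rates below are placeholders, not real OpenAI pricing, so substitute your model's actual prices:

```ruby
# Illustrative rates only - look up your model's current pricing.
INPUT_RATE_PER_1K  = 0.01
OUTPUT_RATE_PER_1K = 0.03

def estimate_cost(input_chars, output_tokens)
  input_tokens = input_chars / 4  # ~4 chars per token heuristic
  (input_tokens * INPUT_RATE_PER_1K + output_tokens * OUTPUT_RATE_PER_1K) / 1000.0
end

estimate_cost(2000, 300)  # ~$0.014 for a 2,000-character prompt and 300 output tokens
```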

💡 Tip: For production, use the rack-cors gem instead of inline headers to properly handle preflight requests.


📊 SSE vs WebSocket vs Polling

| Feature | SSE (fetch) | WebSocket | Long Polling |
| --- | --- | --- | --- |
| Setup complexity | ⭐ Simple | ⭐⭐⭐ Complex | ⭐⭐ Moderate |
| HTTP methods | ✅ All (GET, POST) | N/A | ✅ All |
| Custom headers | ✅ Full control | ⚠️ Limited | ✅ Full control |
| Bidirectional | ❌ Server → client | ✅ Both ways | ❌ Server → client |
| Auto reconnect | ❌ Manual (with fetch) | ❌ Manual | N/A |
| Connection overhead | Low (single HTTP request) | Low (persistent) | High (repeated requests) |
| Scalability | ⭐⭐ Good | ⭐⭐⭐ Excellent | ⭐ Poor |
| Proxy/firewall friendly | ✅ Yes (plain HTTP) | ⚠️ Sometimes blocked | ✅ Yes |
| Browser support | ✅ All modern | ✅ All modern | ✅ All |
| Best for | AI streaming, notifications | Real-time chat, gaming | Legacy systems |

For AI streaming: SSE wins!

Why? AI responses are inherently one-directional (server → client), making WebSocket's bidirectional capability unnecessary overhead. SSE with fetch gives you POST support for sending prompts while streaming responses back.


🚀 What's Next?

Want to level up? Try:

  1. Add Anthropic Claude streaming (different API, same concept)
  2. Implement retry logic for failed streams
  3. Add progress indicators showing % complete
  4. Cache similar requests to save costs
  5. Support multiple AI models (let users choose)

💬 Over to You

  • Have you used SSE before? How was your experience?
  • Would you choose SSE or WebSockets for AI streaming?
  • What other AI features are you building in your Rails + React apps?

Drop a comment below! 👇

Found this helpful? Give it a ❤️ and follow me for more Rails + React + AI tips!

Happy coding! 🚀

#rails #react #ai #sse #openai #streaming #webdev
