Hammer Nexon

How I Handle Edge Cases in a YouTube Transcript Tool

Building ScripTube (scriptube.me) taught me that the "happy path" is about 20% of the engineering work. The other 80% is edge cases.

Here's a tour of the edge cases I've encountered and how I handle them.

Edge Case 1: URL Variations

Users paste URLs in every imaginable format:

https://www.youtube.com/watch?v=dQw4w9WgXcQ
https://youtu.be/dQw4w9WgXcQ
https://youtube.com/watch?v=dQw4w9WgXcQ
http://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=120
https://www.youtube.com/embed/dQw4w9WgXcQ
https://m.youtube.com/watch?v=dQw4w9WgXcQ
youtube.com/watch?v=dQw4w9WgXcQ
dQw4w9WgXcQ
www.youtube.com/watch?v=dQw4w9WgXcQ&list=PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf

Some users paste the video ID directly. Some paste URLs with tracking parameters. Some paste playlist URLs where the video ID is buried in query params.

Solution: A robust URL parser that handles all of these, plus validation that the extracted ID is exactly 11 characters matching [a-zA-Z0-9_-].
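
A minimal sketch of such a parser, assuming the standard `URL` class is available (browsers and Node both provide it). The name `extractVideoId` is a hypothetical helper, not ScripTube's actual code, and it only covers the URL shapes listed above:

```typescript
// Hypothetical helper: returns the 11-character video ID, or null if none is found.
const VIDEO_ID = /^[a-zA-Z0-9_-]{11}$/;

function extractVideoId(input: string): string | null {
  const raw = input.trim();

  // Case 1: the user pasted a bare video ID.
  if (VIDEO_ID.test(raw)) return raw;

  // Case 2: treat everything else as a URL, adding a scheme if it is missing.
  let url: URL;
  try {
    url = new URL(/^https?:\/\//i.test(raw) ? raw : `https://${raw}`);
  } catch {
    return null; // not parseable as a URL
  }

  // Normalize www. and m. prefixes so one host check covers all variants.
  const host = url.hostname.toLowerCase().replace(/^(www\.|m\.)/, '');
  let candidate: string | null = null;

  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    candidate = url.pathname.split('/')[1] ?? null;
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      // Extra params such as &t= or &list= are ignored; only v matters.
      candidate = url.searchParams.get('v');
    } else {
      const match = url.pathname.match(/^\/embed\/([^/]+)/);
      candidate = match?.[1] ?? null;
    }
  }

  // Final validation: exactly 11 characters from [a-zA-Z0-9_-].
  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}
```

Parsing with `URL` instead of one large regex keeps query-parameter handling (timestamps, playlist IDs) out of the pattern. Anything from an unrecognized host returns `null` rather than a guess.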

Edge Case 2: No Transcript Available

Not all videos have transcripts. Reasons:

  • Creator disabled captions
  • Video is too new (auto-generation hasn't run yet)
  • Language not supported by auto-captioning
  • Music/instrumental content with no speech
  • Live streams (sometimes)

Solution: Clear, specific error messages. Not just "Error" — tell the user WHY and WHAT to do:

"This video doesn't have captions available. 
The creator may have disabled them, or auto-generation 
isn't available for this video's language."
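
One way to keep these messages consistent is a lookup table keyed by failure reason. This is only a sketch: the category names and all wording except the first message are illustrative assumptions, not what ScripTube actually returns.

```typescript
// Hypothetical failure categories. The names are illustrative and do not
// come from any particular transcript library.
type TranscriptFailure =
  | 'captions_disabled'
  | 'not_generated_yet'
  | 'unsupported_language'
  | 'age_restricted';

interface FailureMessage {
  why: string;      // what went wrong
  whatToDo: string; // what the user can try next
}

// Example wording only; adjust to the product's own voice.
const FAILURE_MESSAGES: Record<TranscriptFailure, FailureMessage> = {
  captions_disabled: {
    why: "This video doesn't have captions available. The creator may have disabled them.",
    whatToDo: 'Try a different video.',
  },
  not_generated_yet: {
    why: 'This video may be too new for auto-generated captions.',
    whatToDo: 'Try again later.',
  },
  unsupported_language: {
    why: "Auto-generation isn't available for this video's language.",
    whatToDo: 'Check whether the creator has uploaded captions in another language.',
  },
  age_restricted: {
    why: 'This video is age-restricted and requires sign-in on YouTube.',
    whatToDo: 'Transcripts for age-restricted videos are not supported.',
  },
};

// Combine the WHY and the WHAT into one user-facing string.
function describeFailure(reason: TranscriptFailure): string {
  const { why, whatToDo } = FAILURE_MESSAGES[reason];
  return `${why} ${whatToDo}`;
}
```

Typing the table as `Record<TranscriptFailure, FailureMessage>` means the compiler flags any new failure category that lacks a message.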

Edge Case 3: Age-Restricted Videos

YouTube requires sign-in for age-restricted content. The transcript API call fails because it can't authenticate.

Solution: Detect this specific error and tell the user clearly. I considered implementing cookie-based auth to bypass this, but decided against it for privacy reasons.

Edge Case 4: Auto-Generated vs. Manual Captions

Some videos have both. The quality difference is significant: manual captions are usually accurate, while auto-generated ones often contain recognition errors.

Solution: Prefer manual captions when available. Show the user which type they're getting:

const transcriptList = await listTranscripts(videoId);

// Prefer manual over auto-generated
const manual = transcriptList.find(t => !t.isGenerated);
const auto = transcriptList.find(t => t.isGenerated);

// selected is undefined when the video has no transcript at all (Edge Case 2)
const selected = manual ?? auto;
const source = manual ? 'Manual captions' : 'Auto-generated';

Edge Case 5: Multiple Languages

Videos can have transcripts in multiple languages. Users expect to get the language they want.

Solution: Default to the video's primary language, but let users select alternatives. For the API, accept a lang parameter with a fallback chain:

const languagePriority = [
  requestedLang,     // What the user asked for
  'en',              // English fallback
  primaryLang,       // Video's primary language
];
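
A sketch of how that chain might be applied when choosing a track. The `TranscriptTrack` shape and the `languageCode` field are assumptions for illustration (`isGenerated` mirrors the earlier snippet), and falling back to the first available track is my own addition here, not something the chain above specifies:

```typescript
// Assumed shape of one entry in the transcript list.
interface TranscriptTrack {
  languageCode: string;
  isGenerated: boolean;
}

function pickTranscript(
  tracks: TranscriptTrack[],
  requestedLang?: string,
  primaryLang?: string,
): TranscriptTrack | undefined {
  // Same order as the chain above; skip entries the caller did not supply.
  const languagePriority = [requestedLang, 'en', primaryLang].filter(
    (lang): lang is string => typeof lang === 'string' && lang.length > 0,
  );

  for (const lang of languagePriority) {
    const inLang = tracks.filter(t => t.languageCode === lang);
    // Within a language, prefer manual captions over auto-generated ones.
    const match = inLang.find(t => !t.isGenerated) ?? inLang[0];
    if (match) return match;
  }

  // Last resort (assumption): return whatever exists rather than nothing.
  return tracks[0];
}
```

Callers still need to handle `undefined`, which is what comes back when the video has no tracks at all.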

Edge Case 6: Extremely Long Videos

A 4-hour podcast transcript can exceed 40,000 words. Serving this as a single API response causes:

  • Slow response times
  • High memory usage
  • Frontend rendering issues

Solution: Stream the response for long transcripts, and offer a download option instead of displaying everything in the browser.
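
One possible shape for the streaming side is to emit the transcript as newline-delimited JSON in fixed-size batches. This is a sketch under assumptions: the `Segment` type, the batch size of 500, and the NDJSON format are illustrative choices, not a description of ScripTube's API.

```typescript
// Assumed shape of one caption segment.
interface Segment {
  start: number; // seconds from the start of the video
  text: string;
}

// Yields the transcript as NDJSON chunks: one JSON object per line,
// chunkSize segments per chunk.
function* chunkTranscript(segments: Segment[], chunkSize = 500): Generator<string> {
  const size = Math.max(1, Math.floor(chunkSize)); // guard against 0 or negative sizes
  for (let i = 0; i < segments.length; i += size) {
    const lines = segments.slice(i, i + size).map(s => JSON.stringify(s));
    yield lines.join('\n') + '\n';
  }
}
```

With Node's built-in `http` module, a handler could call `res.write(chunk)` for each yielded chunk and then `res.end()`. The client can start rendering after the first chunk arrives instead of waiting for the whole payload. If the segments are already fully in memory on the server, this mainly helps response latency and frontend rendering rather than server memory.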

Edge Case 7: Special Characters and Encoding

Auto-generated captions include HTML entities, Unicode characters, and sometimes garbled text:

you&#39;re going to want to
the café is really nice
[Music]
[Applause]

Solution: Decode HTML entities, normalize Unicode, and optionally strip non-speech markers like [Music]:

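function cleanTranscriptText(text: string): string {
  return text
    // Numeric entities such as &#39; (apostrophe). Up to six digits stays
    // below the 0x10FFFF limit, so fromCodePoint cannot throw.
    .replace(/&#(\d{1,6});/g, (_, code) =>
      String.fromCodePoint(parseInt(code, 10)))
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    // Replace any other named entity except &amp; with a space
    .replace(/&(?!amp;)\w+;/g, ' ')
    // Decode &amp; last so "&amp;lt;" yields "&lt;" rather than "<"
    .replace(/&amp;/g, '&')
    // Strip non-speech markers
    .replace(/\[Music\]/gi, '')
    .replace(/\[Applause\]/gi, '')
    // Normalize Unicode to composed form (NFC)
    .normalize('NFC')
    .replace(/\s+/g, ' ')
    .trim();
}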

Edge Case 8: Rate Limiting Abuse

Within the first week, someone tried to hit the API thousands of times per minute — probably building their own tool on top of mine.

Solution: Multi-layer rate limiting:

  • Per-IP: 10 requests per minute
  • Global: circuit breaker if YouTube starts returning errors
  • CAPTCHA for suspicious patterns
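
The first two layers can be sketched in memory as follows. The per-IP limit of 10 per minute comes from the list above; the breaker thresholds (5 consecutive failures, 30-second cooldown) and both function names are assumptions for illustration. A real deployment behind multiple servers would need shared storage instead of a local `Map`.

```typescript
// Sliding-window limiter: at most `limit` requests per IP in any `windowMs` span.
function createRateLimiter(limit = 10, windowMs = 60_000) {
  const hits = new Map<string, number[]>(); // IP -> timestamps of accepted requests

  return {
    allow(ip: string, now: number = Date.now()): boolean {
      const cutoff = now - windowMs;
      // Keep only the timestamps that still fall inside the window.
      const recent = (hits.get(ip) ?? []).filter(t => t > cutoff);
      const allowed = recent.length < limit;
      if (allowed) recent.push(now);
      hits.set(ip, recent);
      return allowed;
    },
  };
}

// Circuit breaker: stop calling upstream after repeated failures, then retry
// once the cooldown has passed. Thresholds here are illustrative.
function createCircuitBreaker(maxFailures = 5, cooldownMs = 30_000) {
  let consecutiveFailures = 0;
  let openedAt: number | null = null;

  return {
    canRequest(now: number = Date.now()): boolean {
      if (openedAt === null) return true;
      if (now - openedAt >= cooldownMs) {
        // Cooldown is over: close the breaker and let traffic through again.
        openedAt = null;
        consecutiveFailures = 0;
        return true;
      }
      return false;
    },
    recordSuccess(): void {
      consecutiveFailures = 0;
    },
    recordFailure(now: number = Date.now()): void {
      consecutiveFailures += 1;
      if (consecutiveFailures >= maxFailures) openedAt = now;
    },
  };
}
```

A request handler would check `limiter.allow(ip)` first, then `breaker.canRequest()`, and report the upstream result back with `recordSuccess()` or `recordFailure()`. Passing `now` explicitly makes both pieces easy to test without waiting on a real clock.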

The Meta-Lesson

Every one of these edge cases represents a real user who tried to use the tool and either got a confusing error or broken output. Each fix made the tool more reliable for everyone.

The difference between a side project and a product is edge case handling. The happy path is table stakes. Handling the unhappy paths is what makes people trust your tool enough to rely on it.

If you're building something, I'd encourage you to start logging every error and unusual input you receive. Each one is a signpost pointing toward a better product.

Try ScripTube → scriptube.me

