Porting yt-dlp to Bun (yt-dlb)

Recently, Bun migrated to Rust using Claude Code, which resulted in significant performance improvements and a smaller bundle size. From a technical perspective, that is quite impressive. Shrinking a bundle at that scale is no small feat, and so far, I have encountered no regressions on the Bun side. Let’s be honest: what Bun does should have been in Rust the whole time. Most of what Bun does is highly static and deterministic, and there is a very well-defined interface for its core functionality.

But of course, there is the elephant in the room: Claude Code. I almost never participate in internet drama, but this time it reached me: yt-dlp is dropping Bun support, explicitly citing the use of LLMs in the code. I find that stance irrational and silly, especially given the objective improvements across the board (though that probably does not include compile speed, wink). I went to the yt-dlp GitHub issue to see if that was actually the case, and sure enough, that was pretty much the only reason.

Normally I would not care, but I was “vibe coding” a yt-dlp web GUI to manage my watch history, and I was using Bun for its significantly smaller bundle size compared to Node and its overall better developer experience. This change forces my project to either stick with an old Bun version indefinitely or bloat the bundle. Both options are suboptimal. So I raised my concern on the issue, citing essentially these reasons.

I saw the writing on the wall, so I started working on my own fork in TypeScript.

The other day, I saw my comment deleted by the yt-dlp dev, and I was blocked from the project. Honestly, I did not even know you could be blocked on GitHub, but I guess it is an honor to be blocked by one of the more important OSS projects out there.

Personally, I have always had a few issues with the direction of yt-dlp. First, it is a Python project with a hard dependency on a JS runtime because many websites use JS for bot detection. As a result, a minimal bundle of yt-dlp requires both a Python interpreter and a JS runtime, which is ridiculous. Second, yt-dlp is essentially impossible to use as a library. This is not just because of an intentionally opaque codebase, but also because of the complete lack of type hints. yt-dlp requires Python 3.10+, so there is no backward-compatibility excuse for the lack of typing, and it makes the code nearly impossible to integrate.

While those were my primary technical frustrations, the developer’s recent attitude toward AI-assisted coding is, in my opinion, the final straw. No matter your opinion on AI, it is factually great at two things: handling trivial problems and doing the boring grind of programming. yt-dlp is a project with very little architectural complexity, and most of its complexity comes from the fact that APIs change constantly. It is the perfect candidate for AI-assisted maintenance. Imagine having a nightly agent that tests and fixes API failures automatically. Instead, yt-dlp fails constantly because, without type hints, it is extremely difficult to refactor with confidence, even though the changing ecosystem demands constant refactoring.

The greatest sin of yt-dlp is probably its API stability, or lack thereof. You cannot trust the JSON output at all, so many people end up writing convoluted custom parsers just to handle it. The only reason people do not deal with the underlying APIs directly is that there is no universal interface. yt-dlp could be that interface, the way FFmpeg is. Regardless of how you feel about its developers, FFmpeg at least works. yt-dlp refuses to provide one.

Those are my objective beefs with yt-dlp. My subjective ones include a weird obsession with vendoring things without good abstraction. For example, there is a JS interpreter vendored inside the Python code, which does not even work properly, yet it remains there: yt_dlp/jsinterp.py.

I do not have an issue with vendoring in principle, but I take issue with engineering solutions that do not work and still require a JS runtime in the end.

Anyway, hopefully, I have demonstrated why I think the current yt-dlp codebase is in a rough spot. If you take a look yourself, you might agree.

That brings us to the TypeScript port. Almost as a joke, I decided to port the whole thing to TypeScript using AI, and it actually works surprisingly well. In less than an hour, I was able to download a YouTube video. It is not feature-complete yet, but it does what most people actually use the tool for.

Building on Bun, besides removing the need for a second runtime, gives me access to things like lol-html, which is extremely cool. It is also trivial to build a standalone executable, without the GPL obligations that come with yt-dlp's PyInstaller-bundled builds. Whether that is a pro or a con depends on your ideology, but I personally value the convenience. The Bun rewrite will take a while, as I have other stuff to do, but it is going to be a much better solution for me. For a Bun server, it is significantly easier to manage a TypeScript project than an untyped Python one.

I do not have a rigid roadmap, but I am making a few decisions to ensure the project stays healthy. First, I am using Zod for schema validation. Unlike the Python version, which lacks type hints entirely, the port uses Zod to enforce shapes. This makes error handling significantly more transparent because you can actually see what went wrong instead of fighting the codebase. Second, I am prioritizing heavy use of async. JS requires it anyway, which is one of the rare cases where JS actually forces developers to do something reasonable. It will make for a significantly more performant library. Finally, I am enforcing strict typing throughout. This should be the bare minimum for any modern project.
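
To make the validation and async points concrete, here is a minimal, dependency-free sketch of the idea. It is not code from the repo: the VideoInfo fields, parseVideoInfo, fetchRawInfo, and lookupAll are all hypothetical names, and the hand-rolled check is only a stand-in for what a Zod schema's safeParse would do in the real project.

```typescript
// Hypothetical shape of one extractor result; the field names are illustrative only.
interface VideoInfo {
  id: string;
  title: string;
  durationSeconds: number;
}

// Result type loosely modeled on the success/failure split that schema validators return.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; issues: string[] };

// Hand-rolled stand-in for a schema check: collects every problem instead of
// throwing on the first one, so the caller can see exactly what went wrong.
function parseVideoInfo(input: unknown): ParseResult<VideoInfo> {
  if (typeof input !== "object" || input === null) {
    return { success: false, issues: ["root: expected object"] };
  }
  const { id, title, durationSeconds } = input as Record<string, unknown>;
  const issues: string[] = [];
  if (typeof id !== "string") issues.push("id: expected string");
  if (typeof title !== "string") issues.push("title: expected string");
  if (typeof durationSeconds !== "number") {
    issues.push("durationSeconds: expected number");
  }
  if (
    typeof id === "string" &&
    typeof title === "string" &&
    typeof durationSeconds === "number"
  ) {
    return { success: true, data: { id, title, durationSeconds } };
  }
  return { success: false, issues };
}

// Hypothetical async source. A real extractor would await a network request here;
// this one returns canned data so the sketch runs offline.
async function fetchRawInfo(id: string): Promise<unknown> {
  const canned: Record<string, unknown> = {
    a1: { id: "a1", title: "First", durationSeconds: 12 },
    b2: { id: "b2", title: 42 }, // deliberately malformed
  };
  return canned[id];
}

// Look up several ids concurrently, then validate each raw response.
async function lookupAll(ids: string[]): Promise<ParseResult<VideoInfo>[]> {
  const raws = await Promise.all(ids.map((id) => fetchRawInfo(id)));
  return raws.map(parseVideoInfo);
}
```

Calling lookupAll(["a1", "b2"]) resolves to one success and one failure whose issues list names the two bad fields, which is the kind of transparency I want from the real thing.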

This will likely never be a feature-for-feature fork. I have no interest in supporting JS runtimes other than Bun, and I am definitely not supporting the Python plugin feature. The whole concept of a YouTube downloader should have been a library from day one, not a massive, monolithic program with random features and undefined usage.

Feel free to vibe code and contribute to the repo here: https://github.com/yamada-sexta/yt-dlb.

I will set up nightly regression tests and some AI auto-repair when I have time, so it should hopefully just work.