Python Scripts to Download Your Videos & Subtitles Overnight

Batch-download authorized videos and subtitles with yt-dlp; wake up to organized assets.

Queue it, sleep, wake up to clean folders. Use this only for content you own, have explicit permission to download, or Creative Commons works that permit reuse. Respect platform Terms of Service and licenses.

Hook & format breakdown

What problem this solves (with real use cases)


Install the tools (macOS & Windows)

You need Python and yt-dlp, plus FFmpeg for merging streams and converting subtitles.

macOS (Homebrew)

Install Homebrew if you don’t have it:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Install Python and tools:

brew install python yt-dlp ffmpeg

Confirm versions:

python3 --version
yt-dlp --version
ffmpeg -version

Windows (Chocolatey)

Run PowerShell as Administrator, then install Chocolatey:

Set-ExecutionPolicy Bypass -Scope Process -Force; `
[System.Net.ServicePointManager]::SecurityProtocol = `
[System.Net.ServicePointManager]::SecurityProtocol -bor 3072; `
iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))

Install Python and tools:

choco install python yt-dlp ffmpeg -y

Confirm:

python --version
yt-dlp --version
ffmpeg -version

If ffmpeg isn’t found, close and reopen the terminal so PATH updates.


Choose a safe folder structure

Keep things predictable so your editor/scripts can find files every time.

/ingest
  /YYYYMMDD/                 # the video's upload date (yt-dlp's upload_date field)
    /<channel_or_playlist>/
      video_title_<id>.mp4
      video_title_<id>.<lang>.srt
      video_title_<id>.info.json   # optional metadata

Examples

/ingest/20250914/NASA/spacewalk_abc123.mp4
/ingest/20250914/MyChannel/how_to_edit_efg456.en.srt
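
To make the layout concrete, here is a minimal Python sketch that builds the same kind of path by hand. The function name asset_path and its parameters are hypothetical; yt-dlp does this itself from the output template and also sanitizes titles, which this sketch does not attempt.

```python
from pathlib import PurePosixPath


def asset_path(base, upload_date, uploader, title, video_id, ext):
    """Mirror the layout above: base/YYYYMMDD/uploader/title_id.ext."""
    short_title = title[:80]  # same 80-character cap as %(title).80s
    filename = f"{short_title}_{video_id}.{ext}"
    return PurePosixPath(base) / upload_date / uploader / filename


print(asset_path("/ingest", "20250914", "NASA", "spacewalk", "abc123", "mp4"))
# /ingest/20250914/NASA/spacewalk_abc123.mp4
```

Because the video ID is part of the name, two videos with the same title still get distinct files.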

yt‑dlp basics in plain English


Copy‑paste examples

Download a whole (authorized/CC) playlist into dated folders

macOS/Linux

yt-dlp -ciw \
  -f mp4 \
  --write-sub --write-auto-sub --sub-lang "en.*,th.*" --convert-subs srt \
  --download-archive archive.txt \
  --sleep-requests 2 --sleep-interval 2 --max-sleep-interval 5 --retries 10 \
  --paths "home:./ingest" \
  -o "%(upload_date)s/%(uploader)s/%(title).80s_%(id)s.%(ext)s" \
  "https://www.youtube.com/playlist?list=YOUR_PLAYLIST_ID"

Windows (PowerShell)

yt-dlp -ciw `
  -f mp4 `
  --write-sub --write-auto-sub --sub-lang "en.*,th.*" --convert-subs srt `
  --download-archive archive.txt `
  --sleep-requests 2 --sleep-interval 2 --max-sleep-interval 5 --retries 10 `
  --paths "home:.\ingest" `
  -o "%(upload_date)s/%(uploader)s/%(title).80s_%(id)s.%(ext)s" `
  "https://www.youtube.com/playlist?list=YOUR_PLAYLIST_ID"

Download from a text file of URLs (one per line)

macOS/Linux

yt-dlp -ciw -f mp4 --write-auto-sub --convert-subs srt \
  --download-archive archive.txt \
  --paths "home:./ingest" \
  -o "%(upload_date)s/%(uploader)s/%(title).80s_%(id)s.%(ext)s" \
  -a urls.txt

Windows

yt-dlp -ciw -f mp4 --write-auto-sub --convert-subs srt `
  --download-archive archive.txt `
  --paths "home:.\ingest" `
  -o "%(upload_date)s/%(uploader)s/%(title).80s_%(id)s.%(ext)s" `
  -a urls.txt

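Notes: -c continues partial downloads, -i ignores errors, -w avoids overwrites. %(upload_date)s is the video's upload date as YYYYMMDD. -f mp4 selects the best MP4 served as a single file, which on some sites is a lower resolution than the best available stream.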


The simplest Python wrapper

A minimal script that reads your URLs, calls yt‑dlp, and writes a manifest CSV so you can track successes/failures.

  1. Put your playlist or video links in urls.txt (one per line).
  2. Save this as overnight_dl.py next to urls.txt:
#!/usr/bin/env python3
import csv, subprocess, time, pathlib, datetime, sys

BASE = pathlib.Path(__file__).resolve().parent
INGEST = BASE / "ingest"
ARCHIVE = BASE / "archive.txt"
URLS_FILE = BASE / "urls.txt"
LOG_FILE = BASE / "overnight.log"
MANIFEST = BASE / "manifest.csv"

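CMD = [
    "yt-dlp",
    "-ciw",
    "-f", "mp4",
    "--write-sub", "--write-auto-sub", "--sub-lang", "en.*,th.*",
    "--convert-subs", "srt",
    "--download-archive", str(ARCHIVE),
    # --max-sleep-interval is only valid together with --sleep-interval (the minimum)
    "--sleep-requests", "2", "--sleep-interval", "2", "--max-sleep-interval", "5", "--retries", "10",
    # --paths takes [TYPE:]PATH; "home" is the main download folder
    "--paths", f"home:{INGEST}",
    "-o", "%(upload_date)s/%(uploader)s/%(title).80s_%(id)s.%(ext)s",
]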

def run(url: str) -> int:
    with LOG_FILE.open("a", encoding="utf-8") as log:
        log.write(f"\n--- {datetime.datetime.now().isoformat()} START {url}\n")
        log.flush()  # write the START marker before yt-dlp appends its own output
        p = subprocess.run(CMD + [url], stdout=log, stderr=log, text=True)
        log.write(f"--- {datetime.datetime.now().isoformat()} END {url} rc={p.returncode}\n")
        return p.returncode

def main():
    INGEST.mkdir(parents=True, exist_ok=True)
    ARCHIVE.touch(exist_ok=True)

    if not URLS_FILE.exists():
        print("No urls.txt found. Create one with your own/authorized/CC URLs.", file=sys.stderr)
        sys.exit(1)

    rows = []
    for url in URLS_FILE.read_text(encoding="utf-8").splitlines():
        url = url.strip()
        if not url or url.startswith("#"):
            continue
        rc = run(url)
        rows.append({
            "timestamp": datetime.datetime.now().isoformat(timespec="seconds"),
            "url": url,
            "ok": "yes" if rc == 0 else "no"
        })
        time.sleep(1)  # small pause between items

    # Append to (or create) manifest
    write_header = not MANIFEST.exists()
    with MANIFEST.open("a", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=["timestamp", "url", "ok"])
        if write_header:
            w.writeheader()
        w.writerows(rows)

if __name__ == "__main__":
    main()
  3. Run it:

macOS

python3 overnight_dl.py

Windows

python overnight_dl.py

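What to expect

After a run, the script folder holds four things: ingest/ with videos and subtitles sorted into <upload_date>/<uploader>/ subfolders, archive.txt listing finished downloads so reruns skip them, overnight.log with yt-dlp's output for each URL between START and END markers, and manifest.csv with one row per URL (timestamp, url, ok).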


Scheduling overnight

macOS (cron)

Find full paths:

which python3
pwd

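Assume python3 is at /opt/homebrew/bin/python3 and the project lives in /Users/you/OvernightDL; substitute the paths printed above.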

Edit your crontab:

crontab -e

Run every night at 2:00 AM:

0 2 * * * PATH="/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin" /opt/homebrew/bin/python3 /Users/you/OvernightDL/overnight_dl.py >> /Users/you/OvernightDL/overnight.log 2>&1

Keep your Mac awake: System Settings → Battery → prevent sleep during scheduled runs.

Windows (Task Scheduler)
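
One possible way to register the nightly job is the built-in schtasks command. The sketch below only assembles the arguments in Python. The task name, the paths, and the helper name schtasks_command are hypothetical placeholders; check them against your own machine before running anything.

```python
def schtasks_command(task_name, python_exe, script_path, start_time="02:00"):
    """Build a 'schtasks /Create' argument list for a daily run."""
    # /TR takes one string; quote both paths in case they contain spaces.
    task_run = f'"{python_exe}" "{script_path}"'
    return [
        "schtasks", "/Create",
        "/SC", "DAILY",      # run every day
        "/TN", task_name,    # name shown in Task Scheduler
        "/TR", task_run,     # command to run
        "/ST", start_time,   # 24-hour HH:MM
        "/F",                # replace an existing task with the same name
    ]


cmd = schtasks_command(
    "OvernightDL",
    r"C:\Python312\python.exe",                 # hypothetical install path
    r"C:\Users\you\OvernightDL\overnight_dl.py",  # hypothetical script path
)
print(cmd)
# On Windows you could then run: subprocess.run(cmd, check=True)
```

The same task can be created by hand in the Task Scheduler window; the fields map onto the arguments above.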


Subtitles & fallbacks

Good practice


Safety, ethics, and guardrails


Troubleshooting for beginners

“command not found” / 'yt-dlp' is not recognized
Install via Homebrew/Chocolatey, then reopen the terminal. Check: yt-dlp --version, ffmpeg -version, python --version.
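
If you would rather check from Python before an overnight run, a small preflight sketch like the following can report which tools are not on PATH. The function name missing_tools is hypothetical; shutil.which is part of the standard library.

```python
import shutil


def missing_tools(tools=("yt-dlp", "ffmpeg", "ffprobe")):
    """Return the tool names that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```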

“Permission denied” (macOS) / script won’t run
Make scripts executable: chmod +x script.sh. For Python: python3 file.py.

PowerShell blocked scripts (Windows)
Allow local scripts (once):

Set-ExecutionPolicy -Scope CurrentUser -ExecutionPolicy RemoteSigned

Rate limits / timeouts
Add waits and retries:

--sleep-requests 2 --sleep-interval 2 --max-sleep-interval 5 --retries 10 -i

Subtitles didn’t appear
Not all videos have subs. Request both publisher and auto; if none exist, note “no-subs” in your manifest for later transcription.
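
To find those videos after a run, a sketch along these lines could list every video file with no .srt beside it. It assumes subtitles are saved as <video name>.<lang>.srt in the same folder, as in the folder layout earlier; the extension set and the function name are illustrative.

```python
from pathlib import Path

VIDEO_EXTS = {".mp4", ".mkv", ".webm"}  # illustrative; extend as needed


def videos_without_subs(root):
    """List video files that have no sibling <stem>.<lang>.srt file."""
    missing = []
    for video in sorted(Path(root).rglob("*")):
        if not video.is_file() or video.suffix.lower() not in VIDEO_EXTS:
            continue
        # Subtitle names look like clip_abc123.en.srt next to clip_abc123.mp4
        has_srt = any(
            p.suffix == ".srt" and p.name.startswith(video.stem + ".")
            for p in video.parent.iterdir()
        )
        if not has_srt:
            missing.append(video)
    return missing


if __name__ == "__main__":
    for video in videos_without_subs("ingest"):
        print("no-subs:", video)
```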

Post‑processing error: ffprobe/ffmpeg not found
Install FFmpeg (brew install ffmpeg or choco install ffmpeg -y) and reopen your terminal.

Filenames too long (Windows)
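Use shorter templates and safe characters: for example, lower the title cap in the -o template to %(title).50s and add --restrict-filenames or --windows-filenames.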

Weird paths or spaces in names
Wrap paths in quotes: "C:\My Videos\ingest" or "My Clip.mp4".

Still stuck?
Open overnight.log, copy the exact error line, and search it with “yt-dlp” or “ffmpeg”. Most issues have a known fix.


Optional: “enhanced” manifest (more fields)

If you also write info JSON sidecars (--write-info-json), you can extend the script later to parse fields like title, id, uploader, upload_date, duration, and subtitle languages into your CSV. This helps with analytics and edit prep.

Manifest headers idea

timestamp,url,id,uploader,upload_date,title,duration,has_subs,languages,ok
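
As a starting point, this sketch reads one .info.json sidecar and returns the metadata columns for such a row. It assumes the JSON has id, uploader, upload_date, title, and duration keys and a "subtitles" mapping of language code to tracks; treat those key names as assumptions to verify against a real sidecar. The function name is hypothetical.

```python
import json
from pathlib import Path

META_KEYS = ("id", "uploader", "upload_date", "title", "duration")


def row_from_info(info_path):
    """Turn one .info.json sidecar into a dict of manifest columns."""
    info = json.loads(Path(info_path).read_text(encoding="utf-8"))
    # Assumption: "subtitles" maps language code -> list of available tracks.
    languages = sorted((info.get("subtitles") or {}).keys())
    row = {key: info.get(key, "") for key in META_KEYS}
    row["has_subs"] = "yes" if languages else "no"
    row["languages"] = ";".join(languages)
    return row
```

Merge the result with the existing timestamp, url, and ok values before writing the row with csv.DictWriter. Note that this reflects publisher subtitles listed in the metadata, not which files were actually saved.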

FAQ

Can I download private or paid content?
Only if it’s yours and allowed by the platform and license/contract. This guide assumes public videos you own/are authorized to download or CC‑permitted works.

How do I limit quality/size (e.g., 1080p)?
Use format selection:

-f "bv*[height<=1080]+ba/b[height<=1080]"
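
The selector reads left to right: bv* is the best video stream, +ba adds the best audio, and the part after / is a fallback to the best single file, each capped by height. If you set the cap from a script, a tiny helper like this hypothetical one keeps both sides in sync:

```python
def format_selector(max_height):
    """Build a height-capped selector: best video + best audio, else best single file."""
    return f"bv*[height<={max_height}]+ba/b[height<={max_height}]"


print(format_selector(720))
# bv*[height<=720]+ba/b[height<=720]
```

In the wrapper script, the result would replace "mp4" as the value after "-f".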

Can I resume if the connection drops?
Yes. -c continues partial files, -w avoids overwrites, and --download-archive skips videos already recorded in the archive file.
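
To see what a rerun will skip, you can read the archive yourself. This sketch assumes each line of archive.txt holds an extractor name and a video ID separated by a space (for example "youtube abc123"); confirm that against your own file. The function name is hypothetical.

```python
from pathlib import Path


def archived_ids(archive_path):
    """Return the set of (extractor, video_id) pairs recorded in the archive."""
    path = Path(archive_path)
    if not path.exists():
        return set()
    pairs = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:  # ignore blank or malformed lines
            pairs.add((parts[0], parts[1]))
    return pairs


done = archived_ids("archive.txt")
print(f"{len(done)} videos already downloaded")
```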

How do I pick subtitle languages?
--sub-lang "en.*,th.*" grabs English variants and Thai. Add --convert-subs srt for editor-friendly SRT.

Is auto‑captioning reliable?
Varies. Treat auto subs as a starting point; review or re‑transcribe important videos later.

Can I run multiple downloads at once?
Possible, but for beginners use a single queue with light sleeps to avoid throttling.