archiving community contributions on YouTube: unpublished captions, title and description translations and caption credits
tech234a 22ca4edb30 Update gitignore 3 years ago
.gitignore Update gitignore 3 years ago
README.md Update README.md 3 years ago
config.json Add files via upload 3 years ago
discovery.py Discovery: handle being unable to extract channels, unavailable videos 3 years ago
export.py Rename export script, various improvements 3 years ago
requirements.txt Add files via upload 3 years ago
worker.py Worker: initial implementation 3 years ago

README.md

YouTube Community Contributions Archiving Worker

Export YouTube community-contributed captioning drafts to SBV files. Export YouTube community-contributed titles and descriptions to JSON (coming soon).

Setup

Install the requirements listed in requirements.txt (pip install -r requirements.txt). Because the captioning editor is only available to logged-in users, you must supply the values of three session cookies (HSID, SSID, and SID) from any Google account you are logged in to. To get these values, open the developer tools on any youtube.com page, go to the “Application” (Chrome) or “Storage” (Firefox) tab, select “Cookies”, and copy the required values.
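As an illustration of how the three cookie values could be combined into a single Cookie request header, here is a minimal sketch. The function name build_cookie_header and the dictionary input are assumptions made for this example; the README does not say how the worker itself reads or sends the cookies.

```python
# Hypothetical helper, not taken from this repository's code.
REQUIRED_COOKIES = ("HSID", "SSID", "SID")


def build_cookie_header(values):
    """Return a Cookie header string built from the three required cookies.

    `values` maps cookie names to the strings copied from the browser's
    developer tools. Raises ValueError if any required value is absent
    or empty.
    """
    missing = [name for name in REQUIRED_COOKIES if not values.get(name)]
    if missing:
        raise ValueError("missing cookie value(s): " + ", ".join(missing))
    # Cookie pairs in a request header are separated by "; ".
    return "; ".join(f"{name}={values[name]}" for name in REQUIRED_COOKIES)
```

Treat these values like a password: anyone who holds them can act as the logged-in account, so keep them out of version control.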

Usage

Export Captions

Run python3 export.py (formerly ytcc-exporter.py) followed by a space-separated list of YouTube video IDs. All community-contributed captioning drafts, in all languages, will be exported.
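For readers unfamiliar with the SBV output format, the sketch below shows how a list of caption cues could be rendered as SBV text: each cue is a start,end timestamp line in H:MM:SS.mmm form, followed by the caption text and a blank line. The function names and the (start_ms, end_ms, text) tuple shape are assumptions for illustration, not the export script's actual internals.

```python
# Hypothetical helpers, not taken from this repository's code.
def format_timestamp(ms):
    """Convert a millisecond offset to an SBV timestamp (H:MM:SS.mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def to_sbv(cues):
    """Render (start_ms, end_ms, text) tuples as SBV text.

    Cues are separated by one blank line; the result ends with a newline.
    """
    blocks = []
    for start_ms, end_ms, text in cues:
        timing = f"{format_timestamp(start_ms)},{format_timestamp(end_ms)}"
        blocks.append(f"{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

For example, to_sbv([(599, 4160, "Hello")]) produces a timing line of 0:00:00.599,0:00:04.160 followed by the line Hello.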

Discover videos

Run python3 discovery.py followed by a space-separated list of YouTube video IDs. The script prints the video, channel and playlist IDs it discovers, along with whether caption contributions are enabled.
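To make the three ID types concrete, the following sketch pulls video, channel and playlist IDs out of a block of text containing YouTube URLs. It is not how discovery.py works (the README does not describe its method); the function name and the patterns are assumptions based on the commonly observed ID shapes: 11-character video IDs, and channel IDs made of "UC" followed by 22 characters.

```python
import re

# Hypothetical patterns based on commonly observed YouTube ID shapes.
ID_CHARS = r"[A-Za-z0-9_-]"
VIDEO_RE = re.compile(rf"[?&]v=({ID_CHARS}{{11}})(?!{ID_CHARS})")
CHANNEL_RE = re.compile(rf"/channel/(UC{ID_CHARS}{{22}})(?!{ID_CHARS})")
PLAYLIST_RE = re.compile(rf"[?&]list=({ID_CHARS}+)")


def extract_ids(text):
    """Return the sets of video, channel and playlist IDs found in `text`."""
    return {
        "videos": set(VIDEO_RE.findall(text)),
        "channels": set(CHANNEL_RE.findall(text)),
        "playlists": set(PLAYLIST_RE.findall(text)),
    }
```

Returning sets keeps each ID once even when the same link appears several times in the input.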