archiving community contributions on YouTube: unpublished captions, title and description translations and caption credits
# YouTube Community Contributions Archiving Worker

<a href="https://discord.gg/7QxcBvw"><img alt="Discord" src="https://img.shields.io/discord/755014354734153818?style=plastic"></a>

## This project is now complete, and we are working on sorting and finalizing the data. Thank you to everyone who contributed!

Worker for the `Save Community Captions` project: Archiving unpublished YouTube community-contributions.

[Lost? Click here to learn what this is all about!](https://github.com/Data-Horde/ytcc-archive/wiki/General-Information)
## Current Stats

See how much has been archived so far.

* https://atdash.meo.ws/d/attv2/archive-team-tracker-charts-v2?orgId=1&var-project=ext-yt-communitycontribs
* https://tracker.archiveteam.org/ext-yt-communitycontribs/
## Setup

To run these tools you will need to supply session cookies (`SSID`, `HSID`, `SID`). [See the tutorial for more details](https://github.com/Data-Horde/ytcc-archive/wiki/Setup-Tutorial).
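
A quick pre-flight check can catch a missing cookie before a worker is started. The sketch below is only an illustration: it assumes the cookies are passed as environment variables named `HSID`, `SID` and `SSID`, as in the Docker example in this README, and `missing_cookies` is a hypothetical helper that is not part of this repository.

```python
import os

# Cookie names come from the setup text. Reading them from the environment
# is an assumption borrowed from the Docker example in this README.
REQUIRED_COOKIES = ("HSID", "SID", "SSID")


def missing_cookies(env):
    """Return the required cookie names that are absent or empty in `env`."""
    return [name for name in REQUIRED_COOKIES if not env.get(name)]


def describe(env=None):
    """Build a one-line status message for the given (or current) environment."""
    missing = missing_cookies(os.environ if env is None else env)
    if missing:
        return "Missing session cookies: " + ", ".join(missing)
    return "All session cookies are set."
```

Calling `print(describe())` before launching a worker reports which of the three values still need to be set.
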
## Primary Usage

### Heroku⭐️⭐️⭐️ (Minimal Setup! Minimal Maintenance!)

A wrapper repo is available for free and easy deployment and environment configuration, with automatic updates every 24-27.6 hours. You can deploy up to 5 instances of it to a free Heroku account (total maximum monthly runtime: 550 hours), with no credit card verification required, by clicking the button below.

[![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy?template=https://github.com/Data-Horde/ytcc-archive-heroku)
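
To get a feel for the 550-hour limit, the following sketch estimates how long a given number of instances can run continuously. It assumes the 550 hours are pooled across all instances on the account, which is how the "total" figure above reads. `days_until_quota_exhausted` is a hypothetical helper written for this illustration.

```python
FREE_HOURS_PER_MONTH = 550  # total monthly runtime quoted above
MAX_INSTANCES = 5           # instance limit quoted above


def days_until_quota_exhausted(instances, hours=FREE_HOURS_PER_MONTH):
    """Days of continuous runtime before `instances` parallel workers
    consume `hours`, assuming the quota is shared between them."""
    if not 1 <= instances <= MAX_INSTANCES:
        raise ValueError(f"instances must be between 1 and {MAX_INSTANCES}")
    return hours / (instances * 24)


for n in range(1, MAX_INSTANCES + 1):
    print(f"{n} instance(s): about {days_until_quota_exhausted(n):.1f} days")
```

Under that assumption, a single instance would last roughly 22.9 days of continuous runtime and five instances about 4.6 days, so running more instances trades a shorter run for more parallel work.
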
### Archiving Worker⭐️

After completing the above setup steps, simply run:

```bash
python3 worker.py
```
### Docker image⭐️⭐️

Stable Docker Image:

```bash
docker pull fusl/ytcc-archive
```

Run:

```bash
docker container run --restart=unless-stopped --network=host -d --tmpfs /grab/out --name=grab_ext-yt-communitycontribs -e HSID=XXX -e SID=XXX -e SSID=XXX -e TRACKER_USERNAME=Fusl -e PYTHONUNBUFFERED=1 fusl/ytcc-archive
```
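
Because the run command is long, it is easy to drop a space between two `-e` options. One way to avoid that is to assemble the argument list programmatically. The sketch below reuses only the flags and image name from the command above; `docker_run_command` is a hypothetical helper, and the cookie values are placeholders.

```python
import shlex


def docker_run_command(cookies, tracker_username, image="fusl/ytcc-archive"):
    """Build the `docker container run` argument list used in this README.

    `cookies` maps HSID/SID/SSID to their values. Each environment variable
    becomes its own `-e NAME=value` pair, so options cannot run together.
    """
    args = [
        "docker", "container", "run",
        "--restart=unless-stopped",
        "--network=host",
        "-d",
        "--tmpfs", "/grab/out",
        "--name=grab_ext-yt-communitycontribs",
    ]
    env = dict(cookies, TRACKER_USERNAME=tracker_username, PYTHONUNBUFFERED="1")
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


cmd = docker_run_command({"HSID": "XXX", "SID": "XXX", "SSID": "XXX"}, "Fusl")
print(shlex.join(cmd))  # shell-quoted form, suitable for copy and paste
```

The list can also be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting altogether.
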
## Bonus Features

### Export Captions and Titles/Descriptions Manually

This feature requires an [older version of `export.py`](https://github.com/Data-Horde/ytcc-archive/blob/4bbffa6dc3469832609b6e56ae926dcdf7e729ac/export.py). Get this file, Python 3, and the `requests` module (`pip install requests`). Then run `python3 export.py` followed by a space-separated list of YouTube video IDs. All community-contributed captions and titles/descriptions, in all languages, will be exported.
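
When exporting many videos, it can help to validate the IDs before handing them to the script. The sketch below assumes that video IDs are 11 characters drawn from letters, digits, `-` and `_`, which matches the IDs commonly seen in YouTube URLs but is not a documented guarantee. `export_command` is a hypothetical helper, and the script name is a parameter that defaults to the file name in the linked revision.

```python
import re
import sys

# Assumed shape of a YouTube video ID (11 URL-safe characters).
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def export_command(video_ids, script="export.py"):
    """Return an argv list that runs `script` on `video_ids`.

    Raises ValueError if any ID does not look like a video ID, so a typo
    is caught before any requests are made.
    """
    bad = [vid for vid in video_ids if not VIDEO_ID_RE.fullmatch(vid)]
    if bad:
        raise ValueError("Not valid-looking video IDs: " + ", ".join(bad))
    return [sys.executable, script, *video_ids]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the directory that contains the downloaded script.
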
### Discover Videos Manually

Simply run `python3 discovery.py` followed by a space-separated list of YouTube video IDs. A list of discovered video, channel and playlist IDs will be printed, as well as whether caption contributions are enabled.