The little things give you away... A collection of various small helper stuff
JustAnotherArchivist 236278f0b4 Fix decoding of links on Facebook profiles 4 years ago
LICENSE Initial commit 5 years ago
README.md Initial commit 5 years ago
archivebot-blogspot Fix HTTPS handling 5 years ago
archivebot-high-memory Support python3 in any directory instead of just /usr/bin 4 years ago
archivebot-jobid-calculation More snscrape helper tools 5 years ago
archivebot-jobs Suppress output if there are no matched jobs 4 years ago
archivebot-list-stuck-requests Fix line endings 5 years ago
archivebot-monitor-job-queue First set of little things 5 years ago
archivebot-youtube Add helper for AB/chromebot-ing YouTube channels and users 5 years ago
bing-scrape Add Bing, Reddit/Pushshift, and FoolFuuka scrapers 5 years ago
curl-ua Replace curl-archivebot-ua with a more general curl-ua script that supports different UAs selected by aliases 4 years ago
europarl-meps-collect Add script for scraping MEP links from europarl.europa.eu 5 years ago
foolfuuka-search Better workaround for the 5000 results limit; works for FoolFuuka 2.0.1 and up 5 years ago
format-size Split out size formatting 5 years ago
fos-ftp-upload First set of little things 5 years ago
get-crx4chrome-urls First set of little things 5 years ago
ia-derive Add script to queue derive on IA 5 years ago
ia-upload-progress Proper script for tracking size of uploaded data 5 years ago
iasha1check First set of little things 5 years ago
ix.io-upload Allow overriding the "remote filename" 5 years ago
killcx-all-https First set of little things 5 years ago
mastodon-enumerate-users Enumerate users on a Mastodon instance 5 years ago
mastodon-outdated Finding outdated Mastodon instances 5 years ago
pipelines-launch-in-tmux-windows First set of little things 5 years ago
pipelines-monitor-tmux-wget-outcomes Monitor how a pipeline's wget processes are faring 5 years ago
pipelines-stop-gracefully First set of little things 5 years ago
reddit-pushshift-search Add Bing, Reddit/Pushshift, and FoolFuuka scrapers 5 years ago
run-every-five-minutes First set of little things 5 years ago
s3-bucket-list Fix marker loop on some filenames due to lacking HTML entity processing 4 years ago
snscrape-extract Add support for Twitter hashtag extraction 4 years ago
snscrape-facebook-user Silence by default 5 years ago
snscrape-instagram-user Silence by default 5 years ago
snscrape-prepare-commands Add support for Twitter hashtag extraction 4 years ago
snscrape-tmux Update tmux session commands 4 years ago
snscrape-twitter-filter Filter Twitter hashtag scrapes based on account scrapes 5 years ago
snscrape-twitter-hashtag Extract external links from Twitter 5 years ago
snscrape-twitter-user Extract external links from Twitter 5 years ago
snscrape-upload Print Instagram ignore immediately after upload instead of at the end 5 years ago
snscrape-vk-user Silence by default 5 years ago
snscrape-wiki-transfer-merge Helper tools for snscrape and the wiki pages 5 years ago
social-media-extract-profile-link Fix decoding of links on Facebook profiles 4 years ago
tar-many-files-progress First set of little things 5 years ago
tcp-closer Add tcp-closer command 5 years ago
transfer.notkiska.pw-upload Print deletion URL on stderr 4 years ago
uniqify Add uniqify 5 years ago
url-normalise Normalise domain name to lower-case before further processing 4 years ago
warc-size Split out size formatting 5 years ago
warc-tiny Fix detection of multiple transfer encodings 4 years ago
website-extract-social-media Add support for Facebook /pages/category/Category/Name-ID URLs 4 years ago
wget-spider-estimate-size First set of little things 5 years ago
wiki-list-to-main Add ArchiveBot wiki list helper 5 years ago
wiki-recursive-extract-normalise Fix deduplication within each section processing 4 years ago
wiki-sections-sort Add wiki-sections-sort 4 years ago
wiki-website-extract-social-media Add script for automatic social media discovery 4 years ago
wpull1-parallel-progress-monitor First set of little things 5 years ago
wpull1-progress-monitor First set of little things 5 years ago
wpull2-url-origin Fixed version which handles multiple roots correctly 5 years ago
youtube-filter-autogen-channels Add youtube-filter-autogen-channels 4 years ago

README.md

Over the past few years, I’ve written and accumulated a number of useful little things to help with archival-related tasks. This repository collects them. I hope someone finds some of them useful.

License (applies to all programs in this repository)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.