109 commits (5c907488e11b7c164e2b3453559c5217fabf3252)
 

Author SHA1 Message Date
  JustAnotherArchivist 5c907488e1 Handle broken pipe on stdout 4 years ago
  JustAnotherArchivist b38349e91f Fix duplicate slashes 4 years ago
  JustAnotherArchivist f23e4cc71e Retry on internal errors 4 years ago
  JustAnotherArchivist bfe5f59e25 Add marker loop detection 4 years ago
  JustAnotherArchivist 66bdef3247 Take a bucket URL argument instead of hostname + bucketname 4 years ago
  JustAnotherArchivist e385c1d302 Limit curl to 10 seconds 4 years ago
  JustAnotherArchivist 74162445aa Replace curl-archivebot-ua with a more general curl-ua script that supports different UAs selected by aliases 4 years ago
  JustAnotherArchivist 9d712d64d7 Ignore certain URLs on Twitter and Instagram entirely 4 years ago
  JustAnotherArchivist 87826d4844 Use line variable instead of prefix+url 4 years ago
  JustAnotherArchivist 163aacf13c Print deletion URL on stderr 4 years ago
  JustAnotherArchivist 486a593f15 Add support for more weird Facebook URLs 4 years ago
  JustAnotherArchivist 256a94443e Fix deduplication within each section processing 4 years ago
  JustAnotherArchivist 98d77ecc96 Deduplicate output 4 years ago
  JustAnotherArchivist 6ce64baf87 Remove redundant url-normalise after the extraction 4 years ago
  JustAnotherArchivist 318183148e Fix URL extraction from Facebook profile overview pages 4 years ago
  JustAnotherArchivist 869ade27eb Separate names in stderr annotations for the various url-normalise processes 4 years ago
  JustAnotherArchivist 79f0bd4332 Normalise URLs everywhere to reduce duplicates 4 years ago
  JustAnotherArchivist dc4efcfbfb One URL normalisation script to rule them all 4 years ago
  JustAnotherArchivist 0f13a1fadd Add verbosity options, and annotate stderr on wiki-recursive-extract 4 years ago
  JustAnotherArchivist 3ec816cd04 Add script for link extraction from social media profiles 4 years ago
  JustAnotherArchivist 5285c406d9 Add script for recursive website and social media discovery 4 years ago
  JustAnotherArchivist 2be9ca922e Ignore more useless Facebook links 4 years ago
  JustAnotherArchivist c3b0e5543e Add support for facebook.com/pg/something 4 years ago
  JustAnotherArchivist 7c389f1fef Add support for hashbang fragments on Twitter links 4 years ago
  JustAnotherArchivist c56736bc4a Ignore /intent on Twitter 4 years ago
  JustAnotherArchivist 4f34753788 Add support for Instagram posts and ignore spurious links from the CDN 4 years ago
  JustAnotherArchivist ad030f5d21 Add support for Facebook pages and groups 4 years ago
  JustAnotherArchivist cd0b3f6214 Ignore /vi/* on YouTube (video thumbnails) 4 years ago
  JustAnotherArchivist 6f1cca73ad Support hashtags 4 years ago
  JustAnotherArchivist c61efa03f0 Make social media normalisation script snscrape-independent 4 years ago
  JustAnotherArchivist e6008eb971 Add script for automatic social media discovery 4 years ago
  JustAnotherArchivist fed66542fa Support python3 in any directory instead of just /usr/bin 4 years ago
  JustAnotherArchivist 5982e131a4 Stop gracefully when encountering a SIGPIPE 4 years ago
  JustAnotherArchivist c13a1150df Add support for WARC/1.1 4 years ago
  JustAnotherArchivist 376cde7b8c Fix broken block digest calculation on malformed HTTP responses 4 years ago
  JustAnotherArchivist b121cbd958 Write all log messages to stderr 4 years ago
  JustAnotherArchivist ed1270d988 Add support for upper-cased chunk lengths 4 years ago
  JustAnotherArchivist d4826abde2 Add record ID to log messages 4 years ago
  JustAnotherArchivist 4925a912c0 Add youtube-filter-autogen-channels 4 years ago
  JustAnotherArchivist 9b8f223776 Add wiki-sections-sort 4 years ago
  JustAnotherArchivist 552a4147c2 Fix not returning complete body for non-chunked responses 4 years ago
  JustAnotherArchivist 0dc0de6b50 Add support for lists 4 years ago
  JustAnotherArchivist 9d344df8c6 +x 4 years ago
  JustAnotherArchivist f6a7cbfc70 Fix --with-list-urls help message 4 years ago
  JustAnotherArchivist 9743aa7c35 Add s3-bucket-list 4 years ago
  JustAnotherArchivist 91adce786f Add YouTube normalisation script 4 years ago
  JustAnotherArchivist 5ca90c3b7d Update tmux session commands 4 years ago
  JustAnotherArchivist 679923d37d Add support for Twitter hashtag extraction 4 years ago
  JustAnotherArchivist 663383830c Add support for lists 4 years ago
  JustAnotherArchivist d85d142def Handle parameters on Twitter URLs 5 years ago