117 Commits (257b578fbe0ffc1ba069983526628c38051f1b81)
 

Author SHA1 Message Date
  JustAnotherArchivist 257b578fbe Add descending sort 4 years ago
  JustAnotherArchivist 6e7449d137 Support column names in any capitalisation 4 years ago
  JustAnotherArchivist e5e7bdf8af Add more filtering options 4 years ago
  JustAnotherArchivist c611420be9 Remove options from usage line 4 years ago
  JustAnotherArchivist 824eb5e353 Add script for getting an AB job overview table 4 years ago
  JustAnotherArchivist 34c1a58034 Fix detection of multiple transfer encodings 4 years ago
  JustAnotherArchivist 195df08cd5 Fix marker loop on some filenames due to lacking HTML entity processing 4 years ago
  JustAnotherArchivist 3cc3a1ed38 Fix nested tags 4 years ago
  JustAnotherArchivist 5c907488e1 Handle broken pipe on stdout 4 years ago
  JustAnotherArchivist b38349e91f Fix duplicate slashes 4 years ago
  JustAnotherArchivist f23e4cc71e Retry on internal errors 4 years ago
  JustAnotherArchivist bfe5f59e25 Add marker loop detection 4 years ago
  JustAnotherArchivist 66bdef3247 Take a bucket URL argument instead of hostname + bucketname 4 years ago
  JustAnotherArchivist e385c1d302 Limit curl to 10 seconds 4 years ago
  JustAnotherArchivist 74162445aa Replace curl-archivebot-ua with a more general curl-ua script that supports different UAs selected by aliases 4 years ago
  JustAnotherArchivist 9d712d64d7 Ignore certain URLs on Twitter and Instagram entirely 4 years ago
  JustAnotherArchivist 87826d4844 Use line variable instead of prefix+url 4 years ago
  JustAnotherArchivist 163aacf13c Print deletion URL on stderr 4 years ago
  JustAnotherArchivist 486a593f15 Add support for more weird Facebook URLs 4 years ago
  JustAnotherArchivist 256a94443e Fix deduplication within each section processing 4 years ago
  JustAnotherArchivist 98d77ecc96 Deduplicate output 4 years ago
  JustAnotherArchivist 6ce64baf87 Remove redundant url-normalise after the extraction 4 years ago
  JustAnotherArchivist 318183148e Fix URL extraction from Facebook profile overview pages 4 years ago
  JustAnotherArchivist 869ade27eb Separate names in stderr annotations for the various url-normalise processes 4 years ago
  JustAnotherArchivist 79f0bd4332 Normalise URLs everywhere to reduce duplicates 4 years ago
  JustAnotherArchivist dc4efcfbfb One URL normalisation script to rule them all 4 years ago
  JustAnotherArchivist 0f13a1fadd Add verbosity options, and annotate stderr on wiki-recursive-extract 4 years ago
  JustAnotherArchivist 3ec816cd04 Add script for link extraction from social media profiles 4 years ago
  JustAnotherArchivist 5285c406d9 Add script for recursive website and social media discovery 4 years ago
  JustAnotherArchivist 2be9ca922e Ignore more useless Facebook links 4 years ago
  JustAnotherArchivist c3b0e5543e Add support for facebook.com/pg/something 4 years ago
  JustAnotherArchivist 7c389f1fef Add support for hashbang fragments on Twitter links 4 years ago
  JustAnotherArchivist c56736bc4a Ignore /intent on Twitter 4 years ago
  JustAnotherArchivist 4f34753788 Add support for Instagram posts and ignore spurious links from the CDN 4 years ago
  JustAnotherArchivist ad030f5d21 Add support for Facebook pages and groups 4 years ago
  JustAnotherArchivist cd0b3f6214 Ignore /vi/* on YouTube (video thumbnails) 4 years ago
  JustAnotherArchivist 6f1cca73ad Support hashtags 4 years ago
  JustAnotherArchivist c61efa03f0 Make social media normalisation script snscrape-independent 4 years ago
  JustAnotherArchivist e6008eb971 Add script for automatic social media discovery 4 years ago
  JustAnotherArchivist fed66542fa Support python3 in any directory instead of just /usr/bin 4 years ago
  JustAnotherArchivist 5982e131a4 Stop gracefully when encountering a SIGPIPE 4 years ago
  JustAnotherArchivist c13a1150df Add support for WARC/1.1 4 years ago
  JustAnotherArchivist 376cde7b8c Fix broken block digest calculation on malformed HTTP responses 4 years ago
  JustAnotherArchivist b121cbd958 Write all log messages to stderr 4 years ago
  JustAnotherArchivist ed1270d988 Add support for upper-cased chunk lengths 4 years ago
  JustAnotherArchivist d4826abde2 Add record ID to log messages 4 years ago
  JustAnotherArchivist 4925a912c0 Add youtube-filter-autogen-channels 4 years ago
  JustAnotherArchivist 9b8f223776 Add wiki-sections-sort 4 years ago
  JustAnotherArchivist 552a4147c2 Fix not returning complete body for non-chunked responses 4 years ago
  JustAnotherArchivist 0dc0de6b50 Add support for lists 4 years ago