295 Commits (c50a8fd796a26b8c69076bce2e1758b0a0c70b0f)
 

Author SHA1 Message Date
  JustAnotherArchivist c50a8fd796 Fix 'Dictionary mismatch' error when very small dicts are used because the temporary file isn't written to disk before zstdcat gets executed 2 years ago
  JustAnotherArchivist 5bc3d4b020 Fix crash on an empty response 2 years ago
  JustAnotherArchivist 7f25c092d1 Catch other connection errors 2 years ago
  JustAnotherArchivist f8352809f3 Handle ConnectionResetError 2 years ago
  JustAnotherArchivist 0b34268210 Catch socket.timeout, which is a separate exception class from TimeoutError before Python 3.10 2 years ago
  JustAnotherArchivist 0f7a2b32a3 Log number of results on a page 2 years ago
  JustAnotherArchivist 628aeb052f Handle rate limiting 2 years ago
  JustAnotherArchivist d3ea3ce8a0 Switch from urllib to http.client to reuse connections 2 years ago
  JustAnotherArchivist 8f7619ff3a Add retries 2 years ago
  JustAnotherArchivist f98fdd5f01 Fix printing HTTP response line to stdout instead of stderr 2 years ago
  JustAnotherArchivist c9400ac46f Fix recognition of command without optional parts 2 years ago
  JustAnotherArchivist 5ca15a7c94 Add concurrency support 2 years ago
  JustAnotherArchivist 191948cf9d Print number of modified records on requeueing 2 years ago
  JustAnotherArchivist 5121524f83 Log retrieval of showNumPages 2 years ago
  JustAnotherArchivist aba7a1b0b8 Replace resumeKey pagination with page number pagination 2 years ago
  JustAnotherArchivist d57324a26c Add --where for arbitrary conditions 2 years ago
  JustAnotherArchivist fed64387bd Invert count/write logic 2 years ago
  JustAnotherArchivist f914b6afbe Also reset the status_code on requeueing 2 years ago
  JustAnotherArchivist 303bb69c37 Add ia-cdx-search 2 years ago
  JustAnotherArchivist 0b45f7b2ba Swap syntaxes 2 years ago
  JustAnotherArchivist b2c9ea2fa4 Refactor 2 years ago
  JustAnotherArchivist eaf53e1a44 Add alphabetseq 2 years ago
  JustAnotherArchivist c9c8b7e1f7 Add ia-wait-item-tasks 2 years ago
  JustAnotherArchivist b440b35c2f Handle ancient /?v= URLs 2 years ago
  JustAnotherArchivist 0044281b9d Add YouTube channel listing script 2 years ago
  JustAnotherArchivist 1686e04cbe Add a timeout to prevent potentially indefinite blocking 2 years ago
  JustAnotherArchivist 2fc9652ee9 Add support for other instances and full-instance listing 2 years ago
  JustAnotherArchivist b72da478b2 Fix org repo listing on new design/site structure 2 years ago
  JustAnotherArchivist ce7a069af5 Add --jsonl option 2 years ago
  JustAnotherArchivist 9412f0c81c Add azure-storage-list 2 years ago
  JustAnotherArchivist 696e221fc1 Add support for password-protected folders 2 years ago
  JustAnotherArchivist 158c1f1fe0 Fix usage error 2 years ago
  JustAnotherArchivist 53bfe468bf Basic error checks 2 years ago
  JustAnotherArchivist 8c612082b6 Restore MD5 check as the API returns it again 2 years ago
  JustAnotherArchivist 8554c01a84 Fix gofile.io download to the new getFolder endpoint and download server structure 2 years ago
  JustAnotherArchivist a246bad957 Add support for Shorts 2 years ago
  JustAnotherArchivist 6d019e63fc Fix removenonyt performance by using simpler fixed-string patterns instead of a PCRE 2 years ago
  JustAnotherArchivist b27a428787 Fix usage notes from URLs to lines on stdin 2 years ago
  JustAnotherArchivist c4b62c2fea Fix piping when reads return less data than expected 2 years ago
  JustAnotherArchivist dba6d1fb0e Fix stderr printing 2 years ago
  JustAnotherArchivist 6e5a019d9e Always decode stdin with surrogateescape to avoid breaking on binary input 2 years ago
  JustAnotherArchivist e48fb9d1b6 Tighten patterns for user and custom channel URLs so they can handle HTML input more easily 2 years ago
  JustAnotherArchivist 9cbc3f7968 Extract playlist and channel IDs from watch URLs 2 years ago
  JustAnotherArchivist 80bf010433 Percent-decode each line only once 2 years ago
  JustAnotherArchivist f1fcfabafa Add support for reading warc.zst from stdin 2 years ago
  JustAnotherArchivist d5f646f995 Add zstdwarccat 2 years ago
  JustAnotherArchivist 4415c8d5dd Add support for img.youtube.com (old thumbnails) 2 years ago
  JustAnotherArchivist 50a0fcc7b0 Fix performance regression due to 479c2684 2 years ago
  JustAnotherArchivist 479c268441 Fix whitespace handling 2 years ago
  JustAnotherArchivist 56f21d1fc0 Add aggressive video ID v parameter extraction 2 years ago
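
Note on commits 0b34268210, f8352809f3 and 8f7619ff3a: before Python 3.10, socket.timeout is a separate exception class from TimeoutError, so a handler that lists only TimeoutError misses socket timeouts on older interpreters. The following is a minimal sketch of how a retry loop might treat these errors. It is not taken from the repository: the names RETRYABLE, is_retryable and with_retries are hypothetical, and the retry count and delay are arbitrary assumptions.

```python
import socket
import time

# Hypothetical tuple of transient network errors (an assumption, not the
# repository's actual list). On Python >= 3.10 socket.timeout is an alias of
# TimeoutError, so listing both is redundant there but harmless.
RETRYABLE = (TimeoutError, socket.timeout, ConnectionResetError)


def is_retryable(exc):
    """Return True if exc is one of the transient errors listed above."""
    return isinstance(exc, RETRYABLE)


def with_retries(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times, retrying only on RETRYABLE errors.

    Any other exception propagates immediately; the last retryable error is
    re-raised once the attempts are exhausted.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except RETRYABLE:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

Listing both classes keeps the handler correct on interpreters older and newer than 3.10 without a version check.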