130 commits (d7a07d1d9904b84059b61d781c79c1978a9d6fdc)
 

Author SHA1 Message Date
  JustAnotherArchivist d7a07d1d99 Normalise domain name to lower-case before further processing 4 years ago
  JustAnotherArchivist e655080e20 Add support for Facebook /pages/category/Category/Name-ID URLs 4 years ago
  JustAnotherArchivist daa1a95792 Proper URL decoding 4 years ago
  JustAnotherArchivist 1bee1cdcc7 Add support for Facebook /people/Name/ID URLs 4 years ago
  JustAnotherArchivist 00107c0ef0 Add support for YouTube /c/X URLs 4 years ago
  JustAnotherArchivist b59b82041c Add support for wiki list entries with options 4 years ago
  JustAnotherArchivist d5953ca95c Use old Opera UA for Twitter to force the old design 4 years ago
  JustAnotherArchivist 1fa57d41a3 Fix extraction on Wix sites from JSON inside a data attribute 4 years ago
  JustAnotherArchivist 4a742162d0 Suppress output if there are no matched jobs 4 years ago
  JustAnotherArchivist fe72d57d7e Add filtering based on substrings anywhere in the string and on regex 4 years ago
  JustAnotherArchivist cf30a53f82 Add case-insensitive filtering 4 years ago
  JustAnotherArchivist 711e444e8e Highlight jobs that have been inactive for over 6 hours 4 years ago
  JustAnotherArchivist b2919030ab Fix sorting on numerical columns 4 years ago
  JustAnotherArchivist 257b578fbe Add descending sort 4 years ago
  JustAnotherArchivist 6e7449d137 Support column names in any capitalisation 4 years ago
  JustAnotherArchivist e5e7bdf8af Add more filtering options 4 years ago
  JustAnotherArchivist c611420be9 Remove options from usage line 4 years ago
  JustAnotherArchivist 824eb5e353 Add script for getting an AB job overview table 4 years ago
  JustAnotherArchivist 34c1a58034 Fix detection of multiple transfer encodings 4 years ago
  JustAnotherArchivist 195df08cd5 Fix marker loop on some filenames due to lacking HTML entity processing 4 years ago
  JustAnotherArchivist 3cc3a1ed38 Fix nested tags 4 years ago
  JustAnotherArchivist 5c907488e1 Handle broken pipe on stdout 4 years ago
  JustAnotherArchivist b38349e91f Fix duplicate slashes 4 years ago
  JustAnotherArchivist f23e4cc71e Retry on internal errors 4 years ago
  JustAnotherArchivist bfe5f59e25 Add marker loop detection 4 years ago
  JustAnotherArchivist 66bdef3247 Take a bucket URL argument instead of hostname + bucketname 4 years ago
  JustAnotherArchivist e385c1d302 Limit curl to 10 seconds 4 years ago
  JustAnotherArchivist 74162445aa Replace curl-archivebot-ua with a more general curl-ua script that supports different UAs selected by aliases 4 years ago
  JustAnotherArchivist 9d712d64d7 Ignore certain URLs on Twitter and Instagram entirely 4 years ago
  JustAnotherArchivist 87826d4844 Use line variable instead of prefix+url 4 years ago
  JustAnotherArchivist 163aacf13c Print deletion URL on stderr 4 years ago
  JustAnotherArchivist 486a593f15 Add support for more weird Facebook URLs 4 years ago
  JustAnotherArchivist 256a94443e Fix deduplication within each section processing 4 years ago
  JustAnotherArchivist 98d77ecc96 Deduplicate output 4 years ago
  JustAnotherArchivist 6ce64baf87 Remove redundant url-normalise after the extraction 4 years ago
  JustAnotherArchivist 318183148e Fix URL extraction from Facebook profile overview pages 4 years ago
  JustAnotherArchivist 869ade27eb Separate names in stderr annotations for the various url-normalise processes 4 years ago
  JustAnotherArchivist 79f0bd4332 Normalise URLs everywhere to reduce duplicates 4 years ago
  JustAnotherArchivist dc4efcfbfb One URL normalisation script to rule them all 4 years ago
  JustAnotherArchivist 0f13a1fadd Add verbosity options, and annotate stderr on wiki-recursive-extract 4 years ago
  JustAnotherArchivist 3ec816cd04 Add script for link extraction from social media profiles 4 years ago
  JustAnotherArchivist 5285c406d9 Add script for recursive website and social media discovery 4 years ago
  JustAnotherArchivist 2be9ca922e Ignore more useless Facebook links 4 years ago
  JustAnotherArchivist c3b0e5543e Add support for facebook.com/pg/something 4 years ago
  JustAnotherArchivist 7c389f1fef Add support for hashbang fragments on Twitter links 4 years ago
  JustAnotherArchivist c56736bc4a Ignore /intent on Twitter 4 years ago
  JustAnotherArchivist 4f34753788 Add support for Instagram posts and ignore spurious links from the CDN 4 years ago
  JustAnotherArchivist ad030f5d21 Add support for Facebook pages and groups 4 years ago
  JustAnotherArchivist cd0b3f6214 Ignore /vi/* on YouTube (video thumbnails) 4 years ago
  JustAnotherArchivist 6f1cca73ad Support hashtags 4 years ago