44 Commits (v0.2.1)

Author SHA1 Message Date
  JustAnotherArchivist 93df9cd18d Get rid of the temporary extra log file and read the plain file instead 4 years ago
  JustAnotherArchivist 08c3d55376 Add comment on block digest workaround (cf. f14a664b) 4 years ago
  JustAnotherArchivist 413435b7fb Work around warcio not writing the correct WARC-Profile header for revisit records on WARC/1.1 4 years ago
  JustAnotherArchivist 08d96b37c5 Support deep/multiple inheritance from Item 4 years ago
  JustAnotherArchivist 9d8de13775 Add Item.flush_subitems to flush the new subitems to the database while the item is still being processed 4 years ago
  JustAnotherArchivist 50b936b18c Refactor QWARC class to keep relevant variables in instance attributes instead of local variables 4 years ago
  JustAnotherArchivist c5d8d93166 Remove stray whitespace 4 years ago
  JustAnotherArchivist 8ee9b20718 Remove WARC-Target-URI header from warcinfo record 4 years ago
  JustAnotherArchivist f14a664b1c Work around warcio not writing a block digest for warcinfo records (https://github.com/webrecorder/warcio/issues/87) 4 years ago
  JustAnotherArchivist 7d53577522 Add parameter for disabling SSL/TLS certificate validation 4 years ago
  JustAnotherArchivist 7e049423a4 The memory leak has vanished as of CPython 3.7.3 4 years ago
  JustAnotherArchivist bd14ab3901 Fix crash due to closing the log handler on reaching the max WARC size 4 years ago
  JustAnotherArchivist 08117630b0 Remove warcinfo record in each data WARC and refer to the process's warcinfo record in the meta WARC instead 4 years ago
  JustAnotherArchivist 26aab15605 urn:X-qwarc instead of urn:qwarc 4 years ago
  JustAnotherArchivist 50d46ad51c Use log filename in the target URI of the log resource record 4 years ago
  JustAnotherArchivist e093211496 Set content type for resource records 4 years ago
  JustAnotherArchivist ae46b53401 Always write a WARC-Warcinfo-ID header 4 years ago
  JustAnotherArchivist 23fcdd4026 Write microsecond dates for request and response records 4 years ago
  JustAnotherArchivist 3030ad10ab Mark private API accordingly 4 years ago
  JustAnotherArchivist e0b4104d21 Remove log handler before writing log record since that requires closing the stream 4 years ago
  JustAnotherArchivist 6cfd352f68 Write WARC/1.1 files 4 years ago
  JustAnotherArchivist e1ad5c232e Write warcinfo and resource records in meta WARC on firing up qwarc rather than at the end 4 years ago
  JustAnotherArchivist f038cf91db Fix unfound distribution handling 4 years ago
  JustAnotherArchivist a5dfd5c805 Write spec file + its dependencies and command line to meta WARC 4 years ago
  JustAnotherArchivist e99e2304c9 Write meta WARC with log file 4 years ago
  JustAnotherArchivist d751844626 Fix starting another item before stopping on STOP file or memory limit exceedance 4 years ago
  JustAnotherArchivist 2b0778f9b5 Remove leftovers from initial code rewrite 4 years ago
  JustAnotherArchivist 85d78cee13 Add warcinfo record with version information on Python, system, and dependencies 4 years ago
  JustAnotherArchivist 9eaa7be4c8 Python 3.7 compatibility 4 years ago
  JustAnotherArchivist 9cff6bd5c1 Only open a WARC file when necessary to avoid producing empty WARCs at the end 4 years ago
  JustAnotherArchivist 21cf784102 Use setuptools_scm for versioning 4 years ago
  JustAnotherArchivist ab22966fef Add to log which item a message is coming from 4 years ago
  JustAnotherArchivist 6fafd32685 Error when the retries are exceeded 4 years ago
  JustAnotherArchivist 8647d6b396 Use f-strings instead of str.format 4 years ago
  JustAnotherArchivist 5008e6e8cd Deduplicate items 4 years ago
  JustAnotherArchivist 46c95e2157 Disable decoding the response content 4 years ago
  JustAnotherArchivist 91cd20f567 Version 0.1.3 5 years ago
  JustAnotherArchivist 85f6f7bd82 Make qwarc.utils.handle_response_limit_error_retries more useful by passing the deferring handler as an argument 5 years ago
  JustAnotherArchivist ad22a2327a Support adding headers to individual requests 5 years ago
  JustAnotherArchivist 67076f964c Add support for POST requests 5 years ago
  JustAnotherArchivist 57764eb2b0 Version 0.1.2 5 years ago
  JustAnotherArchivist 2d52e78d85 Fix reference to aiohttp.CientError 5 years ago
  JustAnotherArchivist 0f107e988d Version 0.1.1 5 years ago
  JustAnotherArchivist c1574a06c9 Fix sleep task type 5 years ago
  JustAnotherArchivist e0ca88c807 Fix reference to get_rss 5 years ago
  JustAnotherArchivist 984d28ede0 Fix type of --memorylimit, --disklimit, and --warcsplit values 5 years ago
  JustAnotherArchivist 8a8935810d Fix references to memory and disk space check methods 5 years ago
  JustAnotherArchivist 1c8983fc1e Version 0.1.0 5 years ago
  JustAnotherArchivist be5673cfbf Add record deduplication within a process 5 years ago
  JustAnotherArchivist 43f1b5e06e Add LICENSE and README 5 years ago