Fix missing \r\n in data/whirlwind.warc{.wet|.wat}#15

Merged
lfoppiano merged 2 commits into main from
bugfix/correct-line-ending
Feb 11, 2026

@lfoppiano
Collaborator

This PR fixes #13:

  • fix .editorconfig
  • add .gitattributes
  • replace data/whirlwind.warc.wat, data/whirlwind.warc.wet and data/whirlwind.warc with the ones from the Python tour
(base) ➜  jwarc git:(feature/filter-type) java -jar target/jwarc-0.34.1-SNAPSHOT.jar validate --verbose /Users/lfoppiano/development/projects/cc/whirlwind-java/data/whirlwind.warc
Validating /Users/lfoppiano/development/projects/cc/whirlwind-java/data/whirlwind.warc
whirlwind.warc:   offset 0 (length 749) warcinfo application/warc-fields
whirlwind.warc:   offset 749 (length 626) request application/http;msgtype=request
    https://an.wikipedia.org/wiki/Escopete
    HTTP/1.1 GET
    date: 2024-05-18T01:58:10Z
    payload media type: application/octet-stream
whirlwind.warc:   offset 1375 (length 75174) response application/http;msgtype=response
    https://an.wikipedia.org/wiki/Escopete
    HTTP/1.1 200 OK
    date: 2024-05-18T01:58:10Z
    payload media type: text/html
    payload digest pass
    block digest pass
whirlwind.warc:   offset 76549 (length 589) metadata application/warc-fields
    https://an.wikipedia.org/wiki/Escopete
    date: 2024-05-18T01:58:10Z
(base) ➜  jwarc git:(feature/filter-type) java -jar target/jwarc-0.34.1-SNAPSHOT.jar validate --verbose /Users/lfoppiano/development/projects/cc/whirlwind-java/data/whirlwind.warc.wet
Validating /Users/lfoppiano/development/projects/cc/whirlwind-java/data/whirlwind.warc.wet
whirlwind.warc.wet:   offset 0 (length 635) warcinfo application/warc-fields
whirlwind.warc.wet:   offset 635 (length 4860) conversion text/plain
    block digest pass
(base) ➜  jwarc git:(feature/filter-type) java -jar target/jwarc-0.34.1-SNAPSHOT.jar validate --verbose /Users/lfoppiano/development/projects/cc/whirlwind-java/data/whirlwind.warc.wat
Validating /Users/lfoppiano/development/projects/cc/whirlwind-java/data/whirlwind.warc.wat
whirlwind.warc.wat:   offset 0 (length 545) warcinfo application/warc-fields
whirlwind.warc.wat:   offset 545 (length 1697) metadata application/json
    https://an.wikipedia.org/wiki/Escopete
    date: 2024-05-31T01:17:49Z
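
For anyone who wants to check the line endings directly, independent of jwarc, here is a minimal sketch in Python. It is not part of this PR: the function name bare_lf_header_lines is hypothetical, and it assumes an uncompressed WARC file whose Content-Length values are still accurate. It walks the records, skips each record body using Content-Length, and reports the byte offset of every header line that ends in a bare \n instead of \r\n.

```python
def bare_lf_header_lines(data: bytes) -> list[int]:
    """Return byte offsets of WARC header lines that end in LF without a preceding CR.

    Hypothetical helper for illustration; assumes uncompressed input and
    trustworthy Content-Length headers.
    """
    offsets = []
    pos = 0
    while pos < len(data):
        # Skip the blank lines that separate one record from the next.
        while data[pos:pos + 1] in (b"\r", b"\n"):
            pos += 1
        if pos >= len(data):
            break
        content_length = 0
        # Read header lines up to and including the empty line that ends the header.
        while pos < len(data):
            nl = data.find(b"\n", pos)
            if nl == -1:
                return offsets  # truncated header, nothing more to scan
            line = data[pos:nl + 1]
            if not line.endswith(b"\r\n"):
                offsets.append(pos)
            pos = nl + 1
            stripped = line.strip()
            if not stripped:
                break  # end of this record's header block
            name, _, value = stripped.partition(b":")
            if name.strip().lower() == b"content-length":
                content_length = int(value.strip())
        # Skip the record body; line endings inside the payload are not checked.
        pos += content_length
    return offsets
```

With the files replaced as described above, calling it on the bytes of data/whirlwind.warc (for example via pathlib.Path("data/whirlwind.warc").read_bytes()) should return an empty list. If the body was also rewritten by a line-ending conversion, Content-Length no longer matches and the walk can drift, so a non-empty result is a hint to investigate rather than an exact report.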

@lfoppiano lfoppiano merged commit 25af34c into main Feb 11, 2026
1 check passed
@lfoppiano lfoppiano deleted the bugfix/correct-line-ending branch February 11, 2026 14:51

Development

Successfully merging this pull request may close these issues.

Use \r\n in data/whirlwind.warc

2 participants