
ENH: Add lychee link-checker as a Utilities/Maintenance script #5377

Merged: hjmjohnson merged 27 commits into InsightSoftwareConsortium:main from jhlegarreta:AddLinkCheckerGHAWorkflow on May 12, 2026
Conversation

@jhlegarreta
Member

@jhlegarreta jhlegarreta commented May 31, 2025

Add a manual lychee link-checker (Utilities/Maintenance/check-links.sh + lychee.toml) for periodic maintainer-driven runs, instead of the per-PR GHA workflow originally proposed.

Why a manual script instead of a GHA workflow

The original workflow attempt failed CI consistently due to:

  • HTTP 429 rate-limits from github.com on commit URLs in release notes (no per-run workaround that doesn't also accept real broken links).
  • lycheeverse/lychee#1574 — caching across CI runs is not effective.
  • False positives on host-specific bot blocks (999 from LinkedIn, etc.).

In the Feb 2026 discussion on this PR, @jhlegarreta agreed the per-PR CI approach was not workable and authorized a pivot. This commit takes over the branch (with Co-Authored-By: credit) and converts to the script approach.

What's included
| Path | Purpose |
| --- | --- |
| Utilities/Maintenance/check-links.sh | Bash wrapper. Resolves the repo root, requires lychee on PATH, accepts optional path arguments, writes a Markdown report and persistent cache. |
| Utilities/Maintenance/lychee.toml | Configuration with rate-limit-aware accept = [..., 429, 999], commit-URL skip regex, ThirdParty exclusion, and document-only include globs. |
| .gitignore | Adds .lycheecache and .lychee-report.md (local artifacts). |

Intended use: a maintainer runs Utilities/Maintenance/check-links.sh periodically (or scopes it to a subdirectory) and acts on the report. No CI gate is added.

@github-actions bot added the type:Infrastructure and type:Enhancement labels on May 31, 2025
@jhlegarreta
Member Author

Not sure why it complains about the links to the commits in the release notes files.

@dzenanz
Member

dzenanz commented Jun 2, 2025

Many failures are due to HTTP code 429: Network error: Too Many Requests. Can we slow down request rate or something similar?

@jhlegarreta force-pushed the AddLinkCheckerGHAWorkflow branch from ae36dc6 to cb55e14 on June 8, 2025 16:22
@jhlegarreta
Member Author

jhlegarreta commented Jun 8, 2025

It looks like limiting the rate of requests through caching does not work. I am not sure whether the cache has to be built first for it to take effect. I have gone through these:
https://github.com/lycheeverse/lychee-action?tab=readme-ov-file#utilising-the-cache-feature
https://github.com/lycheeverse/lychee/blob/master/docs/TROUBLESHOOTING.md

but have not found how to modify the arguments to make throttling (request pace limiting) work. I have seen other workflows accept 429, e.g.

--accept 403,429,500,502,999

but I guess that effectively means the links at issue are not checked.

Also, checks will not work until this issue is solved:
lycheeverse/lychee#1574

I do not know any other tool that does this job for rst files. This one only seems to work for md files:
https://github.com/tcort/markdown-link-check

@hjmjohnson hjmjohnson marked this pull request as draft January 27, 2026 13:59
@hjmjohnson
Member

@jhlegarreta It seems like this effort has been abandoned due to difficulties. Perhaps it can be turned into a manual script in the Utilities directory that is run periodically, rather than adding it to the CI in a way that will slow down other efforts?

I'm making a pass through issues, trying to reduce the number of stale issues that are unlikely to proceed.

@jhlegarreta
Member Author

@hjmjohnson thanks for the heads-up; I am going through challenging times on my end, so I have been unable to push ITK items forward. My sincere apologies. The approach you propose sounds reasonable. Feel free to close the PR.

@hjmjohnson force-pushed the AddLinkCheckerGHAWorkflow branch from cb55e14 to cc6bfed on May 9, 2026 20:07
hjmjohnson added a commit to jhlegarreta/ITK that referenced this pull request May 9, 2026
Replaces the original GHA workflow attempt (PR InsightSoftwareConsortium#5377) with a manual
script under Utilities/Maintenance/.  Per author + maintainer
discussion on the original PR, integrating lychee into per-PR CI is
not workable: lycheeverse/lychee#1574 and HTTP 429 rate-limits on
github.com / DOI hosts produce too many spurious failures to act on
in CI.

New artifacts:
  - Utilities/Maintenance/check-links.sh — wrapper that resolves the
    repository root, requires lychee on PATH, runs against the supplied
    paths (or the whole tree by default), and writes a Markdown report
    plus a persistent cache.
  - Utilities/Maintenance/lychee.toml — configuration with the rate-
    limit-aware accept list (treats 429/999 as non-broken), commit-URL
    skip pattern that motivated the original CI failures, exclusions
    for ThirdParty trees, and path globs limited to documentation
    file types.
  - .gitignore: ignore the local cache (.lycheecache) and report
    (.lychee-report.md) artifacts so re-running does not pollute the
    working tree.

This script is intended for periodic / on-demand runs by maintainers,
not the per-PR pipeline.

Co-Authored-By: Jon Haitz Legarreta Gorroño <5576557+jhlegarreta@users.noreply.github.com>
@hjmjohnson changed the title from "ENH: Add link checker GHA workflow file" to "ENH: Add lychee link-checker as a Utilities/Maintenance script" on May 9, 2026
@hjmjohnson
Member

@jhlegarreta — thank you for the original work and for the OK to pivot. Force-pushed cc6bfedde3 to AddLinkCheckerGHAWorkflow (lease pinned to your prior tip). The branch now contains a manual Utilities/Maintenance/check-links.sh + lychee.toml instead of the GHA workflow, with you credited via Co-Authored-By: on the commit. Title and body updated to match.

Holler if you'd rather I close this and open the script under a fresh PR — both work; reusing this branch felt cleaner since the conversation history is already here.

@hjmjohnson force-pushed the AddLinkCheckerGHAWorkflow branch from cc6bfed to b9db686 on May 9, 2026 20:18
@github-actions bot added the area:Documentation label on May 9, 2026
@hjmjohnson hjmjohnson marked this pull request as ready for review May 10, 2026 00:59
@greptile-apps
Contributor

greptile-apps Bot commented May 10, 2026

Greptile Summary

This PR adds a manual lychee link-checker maintenance script (Utilities/Maintenance/check-links.sh + lychee.toml) for periodic maintainer-driven use, and accompanies it with a broad sweep of broken-link fixes across 22 documentation files (dead http:// URLs, stale Kitware blog posts, old Doxygen paths, and outdated wiki links).

  • New tooling: check-links.sh wraps lychee with caching, a per-run Markdown report, and optional path scoping; lychee.toml configures rate-limit-aware accept codes, commit-URL skip regex, and extension/path filters. The local cache and report artifacts are ignored via .gitignore.
  • Documentation sweep: ~60 link replacements across release notes, migration guide, FAQ, and contributing docs.
  • One behavioural bug in the script: set -euo pipefail conflicts with the manual status=$? / exit $status pattern. When lychee finds broken links it exits non-zero, so bash terminates immediately and skips the confirmation message.

Confidence Score: 3/5

Safe to merge after fixing the set -e conflict in check-links.sh; all other findings are documentation cosmetics.

The shell script's set -euo pipefail silently swallows the "Report written to..." confirmation and bypasses the explicit exit-code path whenever lychee finds broken links — precisely the case this tool is built for. The lychee.toml exclude-path entry points to a non-existent directory. The documentation changes are straightforward link-rot fixes with no logic implications.

Utilities/Maintenance/check-links.sh (exit-code handling) and Utilities/Maintenance/lychee.toml (stale exclude path).
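
One common way to reconcile errexit with a manual exit-code capture is to attach `|| var=$?` to the command that may fail, so `set -e` never fires on it. The sketch below illustrates that pattern only; it is not necessarily the fix that landed in this PR. `run_checker`, `check_and_report`, and `checker_rc` are hypothetical names, and `run_checker` stands in for the real lychee invocation. Only the report filename comes from the PR description.

```shell
#!/usr/bin/env bash
# The reviewed script uses `set -euo pipefail`; errexit (-e) is the part
# that matters for this pattern.
set -eu

# Hypothetical stand-in for the lychee call: returns the exit code it is
# given, so the pattern can be exercised without lychee installed.
run_checker() {
  return "$1"
}

# Run the checker, always print the confirmation, then propagate the code.
check_and_report() {
  checker_rc=0
  # A command on the left of `||` is exempt from errexit, so a non-zero
  # exit is captured here instead of terminating the script.
  run_checker "$1" || checker_rc=$?
  echo "Report written to .lychee-report.md"
  return "$checker_rc"
}
```

With this shape, the confirmation line is printed whether or not the checker fails, and the caller still sees the checker's exit code.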

Important Files Changed

| Filename | Overview |
| --- | --- |
| Utilities/Maintenance/check-links.sh | New lychee wrapper script; set -euo pipefail conflicts with the manual exit-code capture pattern: the "Report written to" message and exit $status are skipped when lychee finds broken links. |
| Utilities/Maintenance/lychee.toml | New lychee config; exclude_path entry Documentation/Release is stale (release notes live at Documentation/docs/releases/), making the exclusion a silent no-op. |
| .gitignore | Adds .lycheecache and .lychee-report.md gitignore entries; correct and complete. |
| Documentation/docs/releases/1.0.md | Dead links updated; CDash URL is duplicated in the resource list (two entries with identical URL replacing two formerly distinct resources). |
| Documentation/Maintenance/Release.md | Corrects release-notes directory reference from Documentation/ReleaseNotes to Documentation/docs/releases. |
| Documentation/docs/migration_guides/itk_5_migration_guide.md | Doxygen links updated to docs.itk.org; an incomplete "Update scripts" stub section removed cleanly. |
| Documentation/docs/releases/5.0b03.md | Multiple dead links replaced with current GitHub/main-branch equivalents; JIRA tracker noted as decommissioned. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([Maintainer runs check-links.sh]) --> B{lychee on PATH?}
    B -- No --> C[exit 127]
    B -- Yes --> D{lychee.toml present?}
    D -- No --> E[exit 1]
    D -- Yes --> F[cd REPO_ROOT]
    F --> G{Args supplied?}
    G -- No --> H[default scan]
    G -- Yes --> I[use supplied paths]
    H & I --> J[lychee runs with cache and markdown output]
    J -- exit 0 --> K[echo Report written, exit 0]
    J -- exit non-zero --> L[set -e terminates: echo and exit skipped]
    K --> M([Done])
    L --> N([Script exits with lychee code without confirmation])

Reviews (1): Last reviewed commit: "ENH: lychee.toml absorbs gateway-timeout..."

Comment thread Utilities/Maintenance/check-links.sh Outdated
Comment thread Utilities/Maintenance/lychee.toml Outdated
Comment thread Documentation/docs/releases/1.0.md
hjmjohnson and others added 2 commits May 9, 2026 20:07
Replaces the original GHA workflow attempt (PR InsightSoftwareConsortium#5377) with a manual
script under Utilities/Maintenance/.  Per author + maintainer
discussion on the original PR, integrating lychee into per-PR CI is
not workable: lycheeverse/lychee#1574 and HTTP 429 rate-limits on
github.com / DOI hosts produce too many spurious failures to act on
in CI.

New artifacts:
  - Utilities/Maintenance/check-links.sh — wrapper that resolves the
    repository root, requires lychee on PATH, runs against the supplied
    paths (or the whole tree by default), and writes a Markdown report
    plus a persistent cache.
  - Utilities/Maintenance/lychee.toml — configuration with the rate-
    limit-aware accept list (treats 429/999 as non-broken), commit-URL
    skip pattern that motivated the original CI failures, exclusions
    for ThirdParty trees, and path globs limited to documentation
    file types.
  - .gitignore: ignore the local cache (.lycheecache) and report
    (.lychee-report.md) artifacts so re-running does not pollute the
    working tree.

This script is intended for periodic / on-demand runs by maintainers,
not the per-PR pipeline.

Co-Authored-By: Jon Haitz Legarreta Gorroño <5576557+jhlegarreta@users.noreply.github.com>
Updates four URLs flagged by Utilities/Maintenance/check-links.sh that
have well-defined modern equivalents:

- Doxygen `\tparam` reference: stack.nl/~dimitri/doxygen ->
  doxygen.nl/manual (the canonical Doxygen documentation site since
  the project's move).
- SCI Institute Seg3D landing page: cibc-software/seg3d.html ->
  sci.utah.edu/seg3d.
- SCI Institute SCIRun landing page: cibc-software/scirun.html ->
  sci.utah.edu/scirun.
- MITK home: mitk.org/wiki/The_Medical_Imaging_Interaction_Toolkit
  (the wiki page no longer resolves) -> mitk.org/ (the project home).

All four replacements were verified to return HTTP 200.

Other lychee findings (release-notes link rot, ipfs.io connection
resets, Kitware blog post moves) need case-by-case research and are
deferred.
@hjmjohnson force-pushed the AddLinkCheckerGHAWorkflow branch from 33fbffc to e9f9781 on May 10, 2026 01:08
@jhlegarreta
Member Author

> Holler if you'd rather I close this and open the script under a fresh PR - both work; reusing this branch felt cleaner since the conversation history is already here.

@hjmjohnson This is fine. Thanks for all this work.

Comment thread Documentation/docs/migration_guides/itk_5_migration_guide.md
Comment thread Documentation/docs/learn/courses.md
Comment thread Documentation/docs/learn/courses.md
Comment thread Documentation/docs/learn/courses.md
Comment thread Documentation/docs/learn/faq.md Outdated
Comment thread Documentation/docs/releases/5.0b01.md Outdated
Bulk update of broken links flagged by
Utilities/Maintenance/check-links.sh on top of the prior obvious-fixes
commit:

- itk.org/Insight/Doxygen/html/...
    -> docs.itk.org/projects/doxygen/en/stable/...
  (the legacy doxygen path moved to docs.itk.org).

- classitk_1_1Experimental_1_1<Range>.html
    -> classitk_1_1<Range>.html
  for ImageBufferRange, IndexRange, ShapedImageNeighborhoodRange,
  ImageRegionRange.  These classes were promoted out of the
  itk::Experimental namespace; only the un-Experimental URL resolves
  on the modern doxygen build.  HyperrectangularImageNeighborhoodShape
  is touched the same way (the un-Experimental name is the live one).

- namespaceitk_1_1Experimental.html -> namespaceitk.html
  (the namespace was retired; its members live in itk:: now).

- github.com/.../ITK/blob/master/Documentation/ITK5MigrationGuide.md
    -> github.com/.../ITK/blob/main/Documentation/docs/migration_guides/itk_5_migration_guide.md
  (file moved + renamed; anchors in the old URL resolve in the new
  layout).

- Documentation/Maintenance/Release.md: ReleaseNotes folder rename
  -> Documentation/docs/releases.

- courses.md: uu.nl/en/master/medical-imaging/study-programme
    -> uu.nl/en/masters/medical-imaging.

All replacement URLs verified to return HTTP 200 before staging.
Re-running check-links.sh on the touched files reduces error count
from 90 to 17 (residual = dead course pages, ipfs.io infra, HDF5
license relocation, opencollective rate-limit transient — all need
case-by-case research).
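
The mechanical part of these Doxygen replacements can be sketched as a small sed filter. This is an illustration, not the command used for the commit. The function name `rewrite_doxygen_urls` is hypothetical, the `https://` scheme on the new host is an assumption (the commit message lists hosts without a scheme), and it assumes the page filename carries over unchanged under the new prefix.

```shell
# Rewrite legacy ITK Doxygen URLs in text read from stdin.
rewrite_doxygen_urls() {
  sed -E \
    -e 's#https?://(www\.)?itk\.org/Insight/Doxygen/html/#https://docs.itk.org/projects/doxygen/en/stable/#g' \
    -e 's#classitk_1_1Experimental_1_1#classitk_1_1#g' \
    -e 's#namespaceitk_1_1Experimental\.html#namespaceitk.html#g'
}
```

A filter like this only proposes candidates. Each rewritten URL still needs an HTTP check, since some pages (as later commits in this PR note) have no counterpart on the new site.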
Six links in Documentation/docs/releases/5.0b03.md pointed at files
that have moved or been renamed since the 5.0b03 release.  Repoint
each to its current location on main:

- .github/ISSUE_TEMPLATE.md    -> .github/ISSUE_TEMPLATE/ (templates
                                  moved to a directory of files).
- .github/PULL_REQUEST_TEMPLATE.md
                               -> .github/pull_request_template.md
                                  (renamed lowercase).
- Documentation/CodeOfConduct/Motivation.md
                               -> CODE_OF_CONDUCT.md (the standalone
                                  Motivation.md was merged into the
                                  top-level Code of Conduct).
- Documentation/Data.md        -> Documentation/docs/contributing/data.md
                                  (docs reorganised under
                                  Documentation/docs/).
- Documentation/UploadBinaryData.md
                               -> Documentation/docs/contributing/upload_binary_data.md.
- Utilities/UploadBinaryData.sh (script removed)
                               -> Documentation/docs/contributing/upload_binary_data.md
                                  (the canonical doc that supersedes
                                  the now-removed helper script).
- Documentation/ReleaseNotes/  -> Documentation/docs/releases/.

Replacement URLs verified to return HTTP 200.  Two remaining 404s in
this file (atlassian.net JIRA project — decommissioned, the broken
link itself communicates that fact; GitCheatSheet.pdf — file removed
without a successor) are left in place for historical accuracy.
The legacy ITK JIRA project at insightsoftwareconsortium.atlassian.net
was decommissioned by Atlassian; the URL no longer resolves.  Remove
the broken link and rephrase the surrounding sentence to state the
decommissioning directly.
The University of Central Florida (cs.ucf.edu/~bagci/teaching/mic17),
Uppsala (it.uu.se/edu/course/homepage/bild1/vt14), and Western
University (eng.uwo.ca/biomed/courses/courses_9519) course pages have
been retired with no announced successor URLs.  Remove the three
bullets rather than carry permanently broken links.
gdcm.sourceforge.net/Copyright.html no longer resolves; GDCM
development moved to github.com/malaterre/GDCM, where the canonical
copyright file is Copyright.txt at the repo root.
hjmjohnson added 20 commits May 11, 2026 08:41
The Utilities/ITKv5Preparation directory contained one-shot bash
scripts used during the 4 -> 5 migration; the directory was removed
once the migration completed.  The "Update scripts" section pointed at
that now-deleted directory and trailed off mid-sentence; remove it
rather than carry a permanently broken link to scripts that no longer
exist.
The TIFF row of the supported-formats table had a markdown link with
an empty URL ([\`itk::TIFFImageIO\`]()).  Repoint to the actual
TIFFImageIO doxygen page on docs.itk.org.
HDF Group reorganised the HDF5 license layout: COPYING_LBNL_HDF5 was
renamed to LICENSE_LBNL_HDF5 and then consolidated into the single
top-level LICENSE file at the repository root.  The legacy
support.hdfgroup.org/ftp/HDF5/releases/COPYING_LBNL_HDF5 URL no longer
resolves.

Update the licenses.md note to point at
https://github.com/HDFGroup/hdf5/blob/develop/LICENSE, which contains
the LBNL Copyright Notice and Licensing Terms verbatim.
The HDF Group retired the support.hdfgroup.org/HDF5/ landing page; the
canonical HDF5 home is now www.hdfgroup.org/solutions/hdf5/.
itk.org/CourseWare/Training/RegistrationMethodsOverview.pdf no longer
exists.  Repoint the "registration overview" sentence in faq.md to the
Registration chapter of the ITK Software Guide (Book 2, Chapter 3),
which is the canonical successor and is actively maintained.
The Hyperrectangular shape no longer has its own doxygen page on the
modern docs.itk.org build; demoting that markdown link to inline code
keeps the class name visible without pointing at a 404.  The sibling
ShapedImageNeighborhoodRange page does still exist and remains linked.
Documentation/GitCheatSheet.pdf was removed from the repository
without a successor.  Remove the trailing "We also have a Git
cheatsheet for quick reference." sentence rather than leave a
permanent 404; the surrounding prose still points at the Software
Guide and CONTRIBUTING.md as starting points.
Insight Journal moved from the legacy
   InsightJournalManager/view_reviews.php?...&pubid=N
URL scheme to the canonical
   /browse/publication/N
form years ago.  All 14 publication URLs flagged by
Utilities/Maintenance/check-links.sh in
Documentation/docs/releases/3.2.md are mechanically rewritten to the
modern path; each was verified to return HTTP 200 individually.
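
A mechanical rewrite of this kind could look roughly like the sed filter below. It is a sketch under assumptions, not the command used for the commit: the function name `rewrite_ij_urls` is hypothetical, the `https://insight-journal.org` target host is inferred from the publication URLs cited elsewhere in this PR, and the pattern assumes `pubid=N` is the last query parameter, as in the scheme quoted above.

```shell
# Rewrite legacy Insight Journal review URLs (read from stdin) to the
# /browse/publication/N form, keeping only the numeric publication id.
rewrite_ij_urls() {
  sed -E 's#https?://[^[:space:])]*InsightJournalManager/view_reviews\.php\?[^[:space:])]*pubid=([0-9]+)#https://insight-journal.org/browse/publication/\1#g'
}
```

The character class stops at whitespace and `)` so that a Markdown link such as `[text](url)` keeps its closing parenthesis.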
The itk.org wiki was retired and snapshot-archived under
insightsoftwareconsortium.github.io/ITKWikiArchive (gh-pages source at
github.com/InsightSoftwareConsortium/ITKWikiArchive).  Four entries
in the 4.0 release notes still pointed at the dead itk.org/Wiki/...
URLs (or used a malformed escape sequence on the archive URL).
Update them to the verified archive paths:

- Modern_C\%2B\%2B (broken backslash escape)
    -> Modern_C%252B%252B/ (the directory's own name is double-encoded
       in the archive layout).
- itk.org/Wiki/Refactoring_itk::FEM_framework_-_V4
    -> ITK_Release_4/Refactoring_FEM_Framework/.
- itk.org/Wiki/Refactoring_Level-Set_framework_-_V4
    -> ITK/Release_4/Refactoring_Level_Set_Framework/Refactoring_Level_Set_Framework/.
- itk.org/Wiki/GPU_Acceleration_-_V4
    -> ITK_Release_4/GPU_Acceleration/GPU_Acceleration/.

Each replacement was verified to return HTTP 200 individually.
The 1.0 release notes pre-date most of ITK's current infrastructure;
the broken URLs flagged by Utilities/Maintenance/check-links.sh have
well-defined modern successors:

- public.kitware.com/dashboard.php?name=itk (the legacy Kitware
  dashboard) and public.kitware.com/Dart (Dart, the predecessor of
  CDash) -> open.cdash.org/index.php?project=Insight.
- public.kitware.com/Cable (the CABLE C++ wrapping system) ->
  github.com/CastXML/CastXML.  CABLE was succeeded by GCC-XML and
  then by CastXML, which is the wrapping toolchain ITK uses today.
- www.cmake.org/CMake/HTML/Download.html -> cmake.org/download/.
- www.itk.org/HTML/Download.php -> docs.itk.org/en/latest/download.html.
- www.itk.org/HTML/Examples.htm -> examples.itk.org/.
- www.itk.org/mailman/listinfo/insight-users (the legacy mailing
  list) -> discourse.itk.org/ (the modern discussion forum that
  replaced it).

All replacements verified to return HTTP 200 individually.
creatis.insa-lyon.fr/Public/Gdcm/Main.html no longer resolves; GDCM
development moved to github.com/malaterre/GDCM (matches the same
fix in faq.md).
review.source.kitware.com/p/ITK no longer resolves; the Gerrit
instance was decommissioned after the move to GitHub pull requests.
Replace the dead link with a parenthetical noting the migration.
The scanco.ch customer-login FAQ page is behind authentication and
returns 404 to anonymous fetches; the format-name cell stands on its
own without the link.  No public Scanco format reference is currently
discoverable to substitute.
Both 'changes in style' and 'Coding Style Guide' references in
5.0a01.md pointed at Book1ch13.html#x57-259000C in the ITK Software
Guide.  The 5.x edition of the SG renumbered the coding-style chapter
to Book1ch9, and the per-section anchor IDs (`x57-...`) were
regenerated.  Drop the chapter-specific anchor and link to the
current chapter file (HTTP 200 verified); readers can navigate the
chapter TOC for the relevant section.
Kitware retired the kitware.com/blog/home/post/N URL scheme; the
specific posts referenced in the 4.8 release notes were never
captured by the Wayback Machine and are not discoverable via the
modern Kitware site search.  Replace each broken "Details:" link
with the canonical ITK successor where one exists, and drop the
link entirely where no successor is available (the surrounding
prose already describes the topic):

- post/888 (CastXML wrapping replaces GCCXML)
    -> https://github.com/CastXML/CastXML
- post/912 (Emscripten / JavaScript build)
    -> https://wasm.itk.org/ (the canonical itk-wasm successor)
- post/890 (Software Guide HTML edition)
    -> https://itk.org/ITKSoftwareGuide/html/
- post/904 (cross-compilation/packaging),
  post/887 (Raspberry Pi),
  post/893 (Android),
  post/883 (MXE/MinGW-w64),
  post/891 (POWER8),
  post/899 (UpdateThirdPartyFromUpstream.sh / Git subtree)
    -> "Details:" link removed; topic line retained.
Same Kitware-blog URL retirement as the 4.8 fixup; canonical
successors used where available, dead link dropped otherwise:

- post/942 (AnisotropicDiffusionLBR web-browser reproducibility)
    -> https://wasm.itk.org/ (the itk-wasm in-browser runtime that
       evolved out of the original Emscripten experiments).
- post/997 (External Modules outside the ITK source tree)
    -> https://docs.itk.org/en/latest/contributing/module_workflows.html
       (the canonical module workflows doc).
- post/939 (Option to export all library symbols on Windows)
    -> "Details:" link removed; topic line retained.
… URLs

Two hdl.handle.net handles consistently return HTTP 500; the matching
publications were located in the modern insight-journal.org catalog
and verified to return HTTP 200:

- 10380/320 (SplitComponents new-class entry in 4.6 release notes)
    -> insight-journal.org/browse/publication/774
       ("An ITK Class that Splits Multi-Component Images").
- 1926/3596 (SLIC super-pixel filter in 5.0b01 release notes,
  referenced twice in the same file)
    -> insight-journal.org/browse/publication/989
       ("Scalable Simple Linear Iterative Clustering (SSLIC) Using a
       Generic and Parallel Approach", Lowekamp et al.).

Each replacement was confirmed by matching publication metadata in
the IJ /browse listing.
Both jeffro.net/mind and caddlab.rad.unc.edu/software/MIND have been
offline for years and have no Wayback Machine snapshot.  Drop the
hyperlinks, keep the URLs as inline code so the historical record of
where the project was hosted is preserved, and add a one-clause note
that the URLs are no longer reachable.
visual.nlm.nih.gov no longer hosts the 2010 ITKv4 kick-off meeting
agenda; the Internet Archive captured the page on 2012-03-13.
Repoint the link to the Wayback snapshot so the historical
information remains reachable.
After all link-rot fixes for ITK's documentation, the periodic
check-links.sh run still surfaced ~14 "errors" that were not link rot
but transient infrastructure responses from the maintainer's network:

- 504 from eth.limo gateway in front of content-link-upload.itk.eth.limo.
- 522 from Cloudflare in front of opencollective.org.
- TCP-level resets and connection-failed signals from ipfs.io,
  monai.io, dicom.nema.org.

Extend the accept list to cover 504 and 522, and turn on
accept_timeouts so the residual reachability artifacts don't drown the
report.  Real link rot (404, 5xx other than 504/522, etc.) still
surfaces normally.
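
The effect of the extended accept list can be modelled with a tiny status classifier. This is a sketch of the policy described above, not lychee's implementation. The names `TOLERATED_CODES` and `is_tolerated_noise` are hypothetical, and the list contains only the codes named in this PR; the elided entries of the real accept list are not reproduced.

```shell
# Codes the PR describes as tolerated noise rather than link rot.
# The elided entries of the real accept list are not reproduced here.
TOLERATED_CODES="429 504 522 999"

# Succeeds (exit 0) when the given HTTP status is in the tolerated set.
is_tolerated_noise() {
  for tolerated in $TOLERATED_CODES; do
    if [ "$1" = "$tolerated" ]; then
      return 0
    fi
  done
  return 1
}
```

Under this model a 404 or a 500 still counts as a finding, while a 429 from a rate-limited host does not. The trade-off, raised earlier in the thread, is that a tolerated response means the link was not actually verified.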
@hjmjohnson force-pushed the AddLinkCheckerGHAWorkflow branch from e9f9781 to 8bee451 on May 11, 2026 13:43
@hjmjohnson hjmjohnson self-requested a review May 12, 2026 15:23
Member

@hjmjohnson hjmjohnson left a comment

Thanks @jhlegarreta for initiating this effort.

@hjmjohnson hjmjohnson merged commit 7d30daa into InsightSoftwareConsortium:main May 12, 2026
22 of 23 checks passed
@jhlegarreta jhlegarreta deleted the AddLinkCheckerGHAWorkflow branch May 12, 2026 15:54
Labels

area:Documentation Issues affecting the Documentation module type:Enhancement Improvement of existing methods or implementation type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots
