Skip to content

Add advance health check for database#361

Merged
drtechie merged 17 commits intoPSMRI:mainfrom
DurgaPrasad-54:main
Mar 2, 2026
Merged

Add advance health check for database#361
drtechie merged 17 commits intoPSMRI:mainfrom
DurgaPrasad-54:main

Conversation

@DurgaPrasad-54
Copy link
Copy Markdown
Contributor

@DurgaPrasad-54 DurgaPrasad-54 commented Feb 23, 2026

📋 Description

JIRA ID:

This PR implements /health and /version endpoints in the Common-API as part of Issue PSMRI/AMRIT#102

What’s changed

  • Update /health endpoint
  • Update /version endpoint

Summary by CodeRabbit

  • New Features

    • Build now emits a git properties file with branch, commit, version, and build time.
    • Version endpoint now returns build metadata as JSON.
    • Health checks enhanced with per-component status, severity reporting, timestamps, and background diagnostics.
  • Security

    • /version and /health endpoints are exempted from authentication for public access.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 23, 2026

Warning

Rate limit exceeded

@DurgaPrasad-54 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 11 minutes and 13 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Build now emits git metadata during Maven initialize; /version returns JSON built from generated git.properties; health checks gained a background MySQL diagnostics scheduler, richer per-component status and lifecycle shutdown; auth filter now exempts /health and /version from JWT checks.

Changes

Cohort / File(s) Summary
Build Configuration
pom.xml
Added io.github.git-commit-id:git-commit-id-maven-plugin (v9.0.2) to run revision in initialize, generate ${project.build.outputDirectory}/git.properties, include only git.branch, git.commit.id.abbrev, git.build.version, git.build.time, and disable failures when Git info is absent.
Version Endpoint
src/main/java/com/iemr/common/controller/version/VersionController.java
Refactored /version to @GetMapping producing JSON and returning ResponseEntity<Map<String,String>> populated from git.properties with fallback to "unknown"; added helper to load properties and updated logging.
Health Check Service
src/main/java/com/iemr/common/service/health/HealthService.java
Reworked health logic: added scheduled background MySQL diagnostics (stuck processes, long transactions, deadlocks, slow queries, connection usage), cached DB severity, timeout-bounded SELECT 1 DB connectivity checks, Redis connectivity checks, structured per-component response with checkedAt, and a @PreDestroy shutdownDiagnostics() lifecycle method.
Authentication Filter
src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java
Extended shouldSkipAuthentication to treat ${contextPath}/health and ${contextPath}/version as public endpoints so they bypass JWT validation.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Scheduler
    participant HealthService
    participant MySQL as MySQL Database
    participant Cache as Severity Cache
    participant Redis as Redis

    Note over Scheduler,HealthService: Background diagnostics (startup & periodic)
    Scheduler->>HealthService: trigger runAdvancedMySQLDiagnostics
    HealthService->>MySQL: run diagnostic queries (stuck, long tx, deadlocks, slow queries)
    MySQL-->>HealthService: diagnostic results
    HealthService->>HealthService: compute worst-case severity
    HealthService->>Cache: update cached DB severity

    Note over Client,HealthService: Health check request flow
    Client->>HealthService: GET /health
    HealthService->>MySQL: SELECT 1 (bounded timeout)
    MySQL-->>HealthService: connectivity result
    HealthService->>Cache: read cached DB severity
    HealthService->>Redis: check connectivity
    Redis-->>HealthService: connectivity result
    HealthService->>Client: return per-component map (status, severity, checkedAt)
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through builds to fetch the git trace,
Wove health-check threads in a quiet, safe space,
Background diagnostics nibble through night,
Version maps glow with commit-time light,
I leave the endpoints open — swift as a hare's pace! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add advance health check for database' accurately reflects the main change: adding advanced database health checks to the HealthService. It aligns with the primary objective and is concise and clear.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java (1)

239-256: ⚠️ Potential issue | 🟡 Minor

Duplicate /user/refreshToken entry with inconsistent match semantics.

Line 243 already checks path.equals(contextPath + "/user/refreshToken"), and line 254 adds path.startsWith(contextPath + "/user/refreshToken"). The startsWith variant is strictly more permissive (it would also match /user/refreshTokenEvil) and makes the equals check on line 243 redundant. Remove the duplicate and decide which semantic is intended — equals is safer.

Proposed fix
 				|| path.startsWith(contextPath + "/user/validateSecurityQuestionAndAnswer")
 				|| path.startsWith(contextPath + "/user/logOutUserFromConcurrentSession")
-				|| path.startsWith(contextPath + "/user/refreshToken")
 				|| path.equals(contextPath + "/health")
 				|| path.equals(contextPath + "/version");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java` around
lines 239 - 256, The shouldSkipAuthentication method contains duplicate checks
for "/user/refreshToken": one using equals (path.equals(contextPath +
"/user/refreshToken")) and another using startsWith (path.startsWith(contextPath
+ "/user/refreshToken")); remove the redundant check and choose the intended
semantic (prefer equals for strict match). Edit the shouldSkipAuthentication
function to eliminate the duplicate and keep only the intended condition (either
the equals variant for exact match or the startsWith variant if prefix matching
is required), ensuring no other logic references rely on the removed form.
🧹 Nitpick comments (3)
src/main/java/com/iemr/common/controller/version/VersionController.java (1)

58-76: Consider caching git properties — they never change at runtime.

loadGitProperties() re-reads git.properties from the classpath on every /version call. Since these values are immutable after build time, loading them once (e.g., in @PostConstruct or a lazy-init field) would be cleaner and avoids repeated I/O, even if the endpoint is low-traffic.

Also, the logger.info("version Controller Start/End") calls on lines 61 and 74 will fire on every request to a public, unauthenticated endpoint — consider lowering to debug.

Proposed refactor — load once at startup
 `@RestController`
 public class VersionController {
 
 	private final Logger logger =
 			LoggerFactory.getLogger(this.getClass().getSimpleName());
 	
 	private static final String UNKNOWN_VALUE = "unknown";
+	private final Map<String, String> versionInfo;
+
+	public VersionController() {
+		Map<String, String> info = new LinkedHashMap<>();
+		try {
+			Properties gitProperties = loadGitProperties();
+			info.put("buildTimestamp", gitProperties.getProperty("git.build.time", UNKNOWN_VALUE));
+			info.put("version", gitProperties.getProperty("git.build.version", UNKNOWN_VALUE));
+			info.put("branch", gitProperties.getProperty("git.branch", UNKNOWN_VALUE));
+			info.put("commitHash", gitProperties.getProperty("git.commit.id.abbrev", UNKNOWN_VALUE));
+		} catch (Exception e) {
+			logger.error("Failed to load version information", e);
+			info.put("buildTimestamp", UNKNOWN_VALUE);
+			info.put("version", UNKNOWN_VALUE);
+			info.put("branch", UNKNOWN_VALUE);
+			info.put("commitHash", UNKNOWN_VALUE);
+		}
+		this.versionInfo = Map.copyOf(info);
+	}
 
 	`@Operation`(summary = "Get version information")
 	`@GetMapping`(value = "/version", produces = MediaType.APPLICATION_JSON_VALUE)
 	public ResponseEntity<Map<String, String>> versionInformation() {
-		Map<String, String> response = new LinkedHashMap<>();
-		try {
-			logger.info("version Controller Start");
-			Properties gitProperties = loadGitProperties();
-			response.put("buildTimestamp", gitProperties.getProperty("git.build.time", UNKNOWN_VALUE));
-			response.put("version", gitProperties.getProperty("git.build.version", UNKNOWN_VALUE));
-			response.put("branch", gitProperties.getProperty("git.branch", UNKNOWN_VALUE));
-			response.put("commitHash", gitProperties.getProperty("git.commit.id.abbrev", UNKNOWN_VALUE));
-		} catch (Exception e) {
-			logger.error("Failed to load version information", e);
-			response.put("buildTimestamp", UNKNOWN_VALUE);
-			response.put("version", UNKNOWN_VALUE);
-			response.put("branch", UNKNOWN_VALUE);
-			response.put("commitHash", UNKNOWN_VALUE);
-		}
-		logger.info("version Controller End");
-		return ResponseEntity.ok(response);
+		return ResponseEntity.ok(versionInfo);
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/controller/version/VersionController.java`
around lines 58 - 76, The versionInformation() method currently calls
loadGitProperties() on every request and logs "version Controller Start/End" at
INFO; change this to load git.properties once at startup and use DEBUG for
request tracing: create a private field (e.g., gitPropertiesCache or
cachedGitInfo) populated in a `@PostConstruct` method or as a lazy-initialized
singleton that calls loadGitProperties() once and falls back to UNKNOWN_VALUE on
error, update versionInformation() to read from the cache instead of calling
loadGitProperties(), and change logger.info(...) to logger.debug(...) to avoid
noisy INFO logs for the public endpoint.
src/main/java/com/iemr/common/service/health/HealthService.java (2)

50-62: Unused constant RESPONSE_TIME_SLOW_MS.

RESPONSE_TIME_SLOW_MS (line 56) is declared but never referenced anywhere in this file. Remove it to avoid confusion, or implement the response-time check it was intended for.

Remove dead constant
     private static final String LOG_EVENT_POOL_EXHAUSTED = "MYSQL_POOL_EXHAUSTED";
-    private static final long RESPONSE_TIME_SLOW_MS    = 2000; // > 2s → SLOW
     private static final int  STUCK_PROCESS_THRESHOLD  = 5;    // > 5 stuck → WARNING
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines
50 - 62, The constant RESPONSE_TIME_SLOW_MS is declared but never used; either
remove the unused constant from HealthService or implement the intended
response-time check using it (e.g., in any method that evaluates query latency)
so the value is referenced. Locate RESPONSE_TIME_SLOW_MS in HealthService and
either delete that declaration along with any related comment, or add logic
(using RESPONSE_TIME_SLOW_MS) where query/response times are assessed (e.g., in
methods that compute slow query metrics) to classify responses as SLOW when >
RESPONSE_TIME_SLOW_MS.

202-355: Background diagnostic approach is well-structured overall.

The per-check isolation with individual try-catch blocks, structured log events with tags, and the escalation model are all sound. A few minor observations beyond what's already flagged:

  1. CHECK 2 (long-running transactions, line 253): Escalates directly to CRITICAL, which maps to DOWN in the health endpoint. A single long transaction (e.g., a legitimate batch job > 30s) would mark the entire DB as DOWN. Consider using WARNING here instead, or making the threshold configurable.

  2. CHECK 5 (lines 304-337): Threads_connected is the server-wide thread count, not the application's pool usage. In shared-database deployments, other apps' connections inflate this metric and could trigger false alerts. The log message says "connection pool" but this is actually server-level connection usage — consider clarifying the log or adding a note.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines
202 - 355, runAdvancedMySQLDiagnostics: for CHECK 2 (InnoDB long-running
transactions / LOG_EVENT_LOCK_WAIT) avoid mapping a single long transaction
directly to CRITICAL/DOWN—change the escalation to WARNING (use
escalate(worstSeverity, "WARNING")) or make the threshold/severity configurable
(introduce a config flag used by the check) so legitimate long-running jobs
don't mark DB as DOWN; for CHECK 5 (Threads_connected / LOG_EVENT_POOL_EXHAUSTED
/ LOG_EVENT_CONN_USAGE) update the log messages and variable names to reflect
these are server-level connection metrics (e.g., mention "server-level
Threads_connected (may include other apps)" instead of "connection pool") and/or
add a comment explaining this limitation so alerts aren't misleading in
shared-database deployments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 141-200: checkDatabaseConnectivity currently returns the
cachedDbSeverity (which defaults to "INFO") on a successful SELECT 1 while Redis
returns "OK", causing inconsistent healthy severities; modify
checkDatabaseConnectivity so that on successful connectivity it sets severity to
"OK" (and uses resolveDatabaseStatus("OK") for status) instead of using
cachedDbSeverity, and ensure any related logic in runAdvancedMySQLDiagnostics
continues to update cachedDbSeverity for degraded states only so healthy checks
remain "OK".
- Around line 284-301: The slow query check uses the cumulative MySQL
"Slow_queries" counter and will permanently warn after the first slow query; add
a long field (e.g., previousSlowQueryCount) to HealthService and update the
CHECK 4 block to compute a delta = slowQueries - previousSlowQueryCount, only
warn/escalate (use LOG_EVENT_SLOW_QUERIES and escalate(...)) when delta > 0,
include the delta and cumulative value in the log, and then assign
previousSlowQueryCount = slowQueries so subsequent runs only flag new slow
queries (follow the same delta-tracking pattern used by previousDeadlockCount).

---

Outside diff comments:
In `@src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java`:
- Around line 239-256: The shouldSkipAuthentication method contains duplicate
checks for "/user/refreshToken": one using equals (path.equals(contextPath +
"/user/refreshToken")) and another using startsWith (path.startsWith(contextPath
+ "/user/refreshToken")); remove the redundant check and choose the intended
semantic (prefer equals for strict match). Edit the shouldSkipAuthentication
function to eliminate the duplicate and keep only the intended condition (either
the equals variant for exact match or the startsWith variant if prefix matching
is required), ensuring no other logic references rely on the removed form.

---

Nitpick comments:
In `@src/main/java/com/iemr/common/controller/version/VersionController.java`:
- Around line 58-76: The versionInformation() method currently calls
loadGitProperties() on every request and logs "version Controller Start/End" at
INFO; change this to load git.properties once at startup and use DEBUG for
request tracing: create a private field (e.g., gitPropertiesCache or
cachedGitInfo) populated in a `@PostConstruct` method or as a lazy-initialized
singleton that calls loadGitProperties() once and falls back to UNKNOWN_VALUE on
error, update versionInformation() to read from the cache instead of calling
loadGitProperties(), and change logger.info(...) to logger.debug(...) to avoid
noisy INFO logs for the public endpoint.

In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 50-62: The constant RESPONSE_TIME_SLOW_MS is declared but never
used; either remove the unused constant from HealthService or implement the
intended response-time check using it (e.g., in any method that evaluates query
latency) so the value is referenced. Locate RESPONSE_TIME_SLOW_MS in
HealthService and either delete that declaration along with any related comment,
or add logic (using RESPONSE_TIME_SLOW_MS) where query/response times are
assessed (e.g., in methods that compute slow query metrics) to classify
responses as SLOW when > RESPONSE_TIME_SLOW_MS.
- Around line 202-355: runAdvancedMySQLDiagnostics: for CHECK 2 (InnoDB
long-running transactions / LOG_EVENT_LOCK_WAIT) avoid mapping a single long
transaction directly to CRITICAL/DOWN—change the escalation to WARNING (use
escalate(worstSeverity, "WARNING")) or make the threshold/severity configurable
(introduce a config flag used by the check) so legitimate long-running jobs
don't mark DB as DOWN; for CHECK 5 (Threads_connected / LOG_EVENT_POOL_EXHAUSTED
/ LOG_EVENT_CONN_USAGE) update the log messages and variable names to reflect
these are server-level connection metrics (e.g., mention "server-level
Threads_connected (may include other apps)" instead of "connection pool") and/or
add a comment explaining this limitation so alerts aren't misleading in
shared-database deployments.

Comment thread src/main/java/com/iemr/common/service/health/HealthService.java
Comment thread src/main/java/com/iemr/common/service/health/HealthService.java Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/main/java/com/iemr/common/service/health/HealthService.java (2)

258-281: Stuck process check: warn-level log but OK severity for counts 1–5.

When stuckCount is between 1 and STUCK_PROCESS_THRESHOLD (5), the code logs at WARN level but returns SEVERITY_OK, which means no escalation. This is a bit inconsistent — either the situation is worth alerting on (and should return at least INFO or a lower-tier signal) or the log should be at INFO/DEBUG level.

If this is intentional observability without escalation, consider logging at INFO instead of WARN to avoid noisy log-based alerts that don't match the reported severity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines
258 - 281, performStuckProcessCheck logs a WARN when any stuckCount > 0 but
still returns SEVERITY_OK for counts <= STUCK_PROCESS_THRESHOLD, causing
inconsistent noisy warnings; change the non-escalating branch so that when
stuckCount > 0 && stuckCount <= STUCK_PROCESS_THRESHOLD you call logger.info
(not logger.warn) and retain returning SEVERITY_OK, and keep logger.warn +
return SEVERITY_WARNING only when stuckCount > STUCK_PROCESS_THRESHOLD (refer to
performStuckProcessCheck, STUCK_PROCESS_THRESHOLD, STUCK_PROCESS_SECONDS,
logger.warn/info, SEVERITY_OK, SEVERITY_WARNING).

79-79: Remove unused constant RESPONSE_TIME_SLOW_MS.

This constant is declared at line 79 but never used anywhere in the codebase. Remove it to keep the code clean.

🧹 Proposed fix
-    private static final long RESPONSE_TIME_SLOW_MS    = 2000; // > 2s → SLOW
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/service/health/HealthService.java` at line 79,
Remove the unused constant RESPONSE_TIME_SLOW_MS from HealthService to clean up
dead code: locate the declaration "private static final long
RESPONSE_TIME_SLOW_MS = 2000;" in the HealthService class and delete that field,
then run a build/compile to confirm no references remain (if any exist, replace
usages with the appropriate existing threshold or refactor to use the existing
RESPONSE_TIME_* constants).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 174-183: When a successful SELECT 1 proves DB connectivity, avoid
reporting DOWN based on background diagnostics: in the block that reads
cachedDbSeverity and sets result.put(FIELD_STATUS...) (using
cachedDbSeverity.get() and resolveDatabaseStatus(severity)), check if
resolveDatabaseStatus(severity) would return DOWN and, if so, override/cap the
severity to DEGRADED before putting values into result (e.g., set severity =
"DEGRADED" and use resolveDatabaseStatus("DEGRADED") for FIELD_STATUS); make the
same change in the similar block around lines 392-398 so that DOWN is reserved
for actual connection failures handled in the catch branch.
- Around line 283-304: performLongTransactionCheck currently marks any single
long-running InnoDB transaction (> STUCK_PROCESS_SECONDS) as CRITICAL; change it
to use a count threshold similar to STUCK_PROCESS_THRESHOLD (e.g.,
STUCK_TRANSACTION_THRESHOLD) and implement graduated escalation: if lockCount ==
0 return SEVERITY_OK; if 0 < lockCount <= STUCK_TRANSACTION_THRESHOLD return
SEVERITY_WARNING (and log with LOG_EVENT_LOCK_WAIT, lockCount and threshold); if
lockCount > STUCK_TRANSACTION_THRESHOLD return SEVERITY_CRITICAL (and log
likewise). Keep existing exception logging behavior but include lockCount-based
decision logic inside performLongTransactionCheck and add/define the new
STUCK_TRANSACTION_THRESHOLD constant.

---

Duplicate comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 330-353: The current delta logic in performSlowQueryCheck uses
previousSlowQueryCount.getAndSet(slowQueries) which can produce a false-positive
warning on first run if previousSlowQueryCount defaults to 0; change the
initialization/flow so the first observed slowQueries value is only used to seed
previousSlowQueryCount (no warning) and subsequent runs compute delta normally.
Specifically, in performSlowQueryCheck, after reading slowQueries from
ResultSet, detect an uninitialized sentinel (e.g., previousSlowQueryCount == 0
or a dedicated negative sentinel) and set previousSlowQueryCount to slowQueries
without logging; otherwise compute delta = slowQueries - previousSlow and only
log/warn when delta > 0 (retain STATUS_VALUE and LOG_EVENT_SLOW_QUERIES
references).

---

Nitpick comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 258-281: performStuckProcessCheck logs a WARN when any stuckCount
> 0 but still returns SEVERITY_OK for counts <= STUCK_PROCESS_THRESHOLD, causing
inconsistent noisy warnings; change the non-escalating branch so that when
stuckCount > 0 && stuckCount <= STUCK_PROCESS_THRESHOLD you call logger.info
(not logger.warn) and retain returning SEVERITY_OK, and keep logger.warn +
return SEVERITY_WARNING only when stuckCount > STUCK_PROCESS_THRESHOLD (refer to
performStuckProcessCheck, STUCK_PROCESS_THRESHOLD, STUCK_PROCESS_SECONDS,
logger.warn/info, SEVERITY_OK, SEVERITY_WARNING).
- Line 79: Remove the unused constant RESPONSE_TIME_SLOW_MS from HealthService
to clean up dead code: locate the declaration "private static final long
RESPONSE_TIME_SLOW_MS = 2000;" in the HealthService class and delete that field,
then run a build/compile to confirm no references remain (if any exist, replace
usages with the appropriate existing threshold or refactor to use the existing
RESPONSE_TIME_* constants).

Comment thread src/main/java/com/iemr/common/service/health/HealthService.java
Comment thread src/main/java/com/iemr/common/service/health/HealthService.java
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
src/main/java/com/iemr/common/service/health/HealthService.java (2)

102-103: Both delta counters start at 0, causing a false-positive WARNING on the first diagnostic run.

Innodb_deadlocks and Slow_queries are server-lifetime cumulative counters. On first run after service startup, previousDeadlockCount and previousSlowQueryCount are 0, so any non-zero current value (accumulated before this service started) registers as a new delta and emits a WARNING — triggering alert noise on every deployment.

Consider seeding the counters from MySQL during the first diagnostic run using a sentinel value (e.g., -1L) to skip delta comparison:

♻️ Proposed approach
-    private final AtomicLong previousDeadlockCount  = new AtomicLong(0);
-    private final AtomicLong previousSlowQueryCount = new AtomicLong(0);
+    // -1L = "first run, seed only — do not emit a delta warning"
+    private final AtomicLong previousDeadlockCount  = new AtomicLong(-1L);
+    private final AtomicLong previousSlowQueryCount = new AtomicLong(-1L);

Then in each delta check:

-                long previousDeadlocks = previousDeadlockCount.getAndSet(currentDeadlocks);
-                if (currentDeadlocks > previousDeadlocks) {
+                long previousDeadlocks = previousDeadlockCount.getAndSet(currentDeadlocks);
+                if (previousDeadlocks >= 0 && currentDeadlocks > previousDeadlocks) {

Apply the same guard to performSlowQueryCheck.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines
102 - 103, Previous delta counters previousDeadlockCount and
previousSlowQueryCount are initialized to 0 causing false positives on first
run; change their initial sentinel to -1L and update the diagnostic logic in the
methods that compute deltas (the deadlock check and performSlowQueryCheck) to
treat a previous value of -1 as "first run"—on first run set
previousDeadlockCount/previousSlowQueryCount to the current MySQL-retrieved
count and skip emitting a WARNING or computing a delta, otherwise compute delta
normally and update the previous counter after the check.

79-79: RESPONSE_TIME_SLOW_MS is declared but never used — either wire it in or remove it.

If the intent is to classify a slow-but-alive DB response (e.g., response time > 2 s → SLOW/WARNING), the elapsed-time measurement and comparison logic is missing from checkDatabaseConnectivity. Otherwise, remove the dead constant.

#!/bin/bash
# Confirm RESPONSE_TIME_SLOW_MS is not referenced anywhere else in the codebase.
rg -n "RESPONSE_TIME_SLOW_MS"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/java/com/iemr/common/service/health/HealthService.java` at line 79,
RESPONSE_TIME_SLOW_MS is declared but unused; either remove the constant or wire
it into checkDatabaseConnectivity by measuring DB call elapsed time (capture
start/end around the DB ping in checkDatabaseConnectivity), compare the elapsed
millis to RESPONSE_TIME_SLOW_MS and return/mark a SLOW or WARNING health state
accordingly (use the existing health result/enum used by
checkDatabaseConnectivity), or if you choose deletion simply remove the
RESPONSE_TIME_SLOW_MS constant and any related dead code; update unit tests if
present to reflect the chosen behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 177-181: The health check currently only sets
stmt.setQueryTimeout(3) but does not bound dataSource.getConnection(), so
getConnection() can block up to the pool's connectionTimeout and violate the
"/health must never block > 3s" guarantee; either ensure the connection pool
(HikariCP) configuration sets connectionTimeout ≤ 3000 ms or wrap the entire
connection+query block in a timed task (e.g., ExecutorService.submit(() -> { try
(Connection c = dataSource.getConnection(); Statement s = c.createStatement()) {
s.setQueryTimeout(3); s.execute("SELECT 1"); } })).get(3, TimeUnit.SECONDS) and
handle TimeoutException by failing fast and closing resources, referencing
dataSource.getConnection(), stmt.setQueryTimeout(...), and
performConnectionUsageCheck to keep consistent behavior.
- Around line 269-276: The current logic logs sub-threshold stuck MySQL process
counts using logger.warn while still returning SEVERITY_OK; change the logging
level for the non-escalating branch so that when stuckCount > 0 but <=
STUCK_PROCESS_THRESHOLD (use the same check around stuckCount and
STUCK_PROCESS_THRESHOLD) you call logger.info instead of logger.warn and keep
returning SEVERITY_OK, and reserve logger.warn (or leave as-is) only in the
branch where stuckCount > STUCK_PROCESS_THRESHOLD that returns SEVERITY_WARNING;
update the logging call that references LOG_EVENT_STUCK_PROCESS and
STUCK_PROCESS_SECONDS accordingly.
- Line 31: Update the incompatible import in HealthService: replace the javax
annotation import with the Jakarta equivalent by changing the import of
PreDestroy from javax.annotation.PreDestroy to jakarta.annotation.PreDestroy so
the HealthService class compiles under Spring Boot 3.2.2 (locate the import
statement in the HealthService file and update it accordingly).

---

Duplicate comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 183-186: When the background diagnostic cachedDbSeverity is
CRITICAL but the live "SELECT 1" check succeeds, cap the reported severity to
DEGRADED before resolving status so we don't report DOWN; update the block that
currently does result.put(FIELD_STATUS, resolveDatabaseStatus(severity)) and
result.put(FIELD_SEVERITY, severity) to first map SEVERITY_CRITICAL ->
SEVERITY_DEGRADED (e.g., compute a cappedSeverity variable from
cachedDbSeverity.get()), use cappedSeverity when calling resolveDatabaseStatus
and when putting FIELD_SEVERITY, and keep reporting STATUS_DOWN only in the
existing catch branch that handles exceptions.

---

Nitpick comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 102-103: Previous delta counters previousDeadlockCount and
previousSlowQueryCount are initialized to 0 causing false positives on first
run; change their initial sentinel to -1L and update the diagnostic logic in the
methods that compute deltas (the deadlock check and performSlowQueryCheck) to
treat a previous value of -1 as "first run"—on first run set
previousDeadlockCount/previousSlowQueryCount to the current MySQL-retrieved
count and skip emitting a WARNING or computing a delta, otherwise compute delta
normally and update the previous counter after the check.
- Line 79: RESPONSE_TIME_SLOW_MS is declared but unused; either remove the
constant or wire it into checkDatabaseConnectivity by measuring DB call elapsed
time (capture start/end around the DB ping in checkDatabaseConnectivity),
compare the elapsed millis to RESPONSE_TIME_SLOW_MS and return/mark a SLOW or
WARNING health state accordingly (use the existing health result/enum used by
checkDatabaseConnectivity), or if you choose deletion simply remove the
RESPONSE_TIME_SLOW_MS constant and any related dead code; update unit tests if
present to reflect the chosen behavior.

Comment thread src/main/java/com/iemr/common/service/health/HealthService.java Outdated
Comment thread src/main/java/com/iemr/common/service/health/HealthService.java
Comment thread src/main/java/com/iemr/common/service/health/HealthService.java
@sonarqubecloud
Copy link
Copy Markdown

@drtechie drtechie merged commit 4b44045 into PSMRI:main Mar 2, 2026
2 checks passed
vanitha1822 added a commit that referenced this pull request Apr 20, 2026
* Update application.properties

* add column in create BeneficiaryModel

* Elasticsearch implementation for Beneficiary Search (#324)

* fix: implement functionality to search beneficiaries with Elasticsearch

* fix: remove unwanted import

* fix: update pom.xml

* fix: change the response code

* variable added

* update language

* update language

* Downgrade version from 3.6.1 to 3.6.0

* Elastic Search Implementation for Advanced Search (#327)

* fix: cherry-pick commits for advanced search

* fix: cherry-pick commit for token issue - mobile application

* fix: add the missing properties

* fix: add function to retrieve userid

* fix: move the fetch Userid to jwtUtil

* Remove empty line in application.properties

* fix:signature check for mmu

* Update application.properties

* Update application.properties

* fix: retrive any user without deleted

* implement state wise hide un hide form fields

* implement state wise hide un hide form fields

* implement state wise hide un hide form fields

* enhance welcome sms code

* fix hide unhide form issue

* docs: add DeepWiki badge and documentation link

* Add DeepWiki badge to README

Added DeepWiki badge to README for better visibility.

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* fix hide unhide form issue

* chore(swagger): automate swagger sync to amrit-docs (#354)

* chore(swagger): automate swagger sync to amrit-docs

* chore(swagger): automate swagger sync to amrit-docs

* chore(swagger): automate swagger sync to amrit-docs

* Update the swagger json github workflow (#359)

* chore(swagger): automate swagger sync to amrit-docs

* chore(swagger): automate swagger sync to amrit-docs

* chore(swagger): automate swagger sync to amrit-docs

* fix(swagger): update the workflow and fix the running issue

* fix(swagger): fix the swagger json workflow

* chore(swagger): add fixed branch name in workflow

* chore(ci): prevent multiple swagger sync PRs by using fixed branch

* chore(swagger): add Dev/UAT/Demo servers to OpenAPI config

* chore(swagger): avoid default server URLs

* chore(swagger): remove field injection and inject URLs into OpenAPI bean

* Add /health endpoint and standardize /version response (#331)

* Add /health endpoint and standardize /version response

* Add license headers and Javadocs to health and version controllers

* Enhance /health endpoint to check Database and Redis connectivity

* Improve /health endpoint HTTP status handling and logging

* Enhance database health check with validation query

* Refactor health controller to constructor injection and constants

* Refactor: Extract business logic to HealthService to keep controller lean

* Refactor: Extract business logic to HealthService to keep controller lean

* Fix: Use ObjectProvider for optional health dependencies

* Add advance health check for database (#361)

* chore(swagger): automate swagger sync to amrit-docs

* chore(swagger): automate swagger sync to amrit-docs

* chore(swagger): automate swagger sync to amrit-docs

* fix(swagger): update the workflow and fix the running issue

* fix(swagger): fix the swagger json workflow

* chore(swagger): add fixed branch name in workflow

* chore(ci): prevent multiple swagger sync PRs by using fixed branch

* chore(swagger): add Dev/UAT/Demo servers to OpenAPI config

* chore(swagger): avoid default server URLs

* chore(swagger): remove field injection and inject URLs into OpenAPI bean

* feat(health,version): update version and health endpoints and add advance check for database

* fix(health): normalize severity and fix slow query false positives

* fix(health): avoid false CRITICAL on single long-running MySQL transaction

* fix(health): enforce 3s DB connection timeout via HikariCP

* Merge Release-3.8.0 (3.6.1) to Main (#379)

* Move code to 3.6.1 to 3.8.0 (#372)

* fix: cors spell fixes and import of packages updates

* fix: deployment issue fix

* feat: amm-1959 dhis token for cho report re-direction

* fix: beneficiary history on revisit (#320)

* fix: call type mapper (#322)

* Elasticsearch implementation for Beneficiary Search (#324)

* fix: implement functionality to search beneficiaries with Elasticsearch

* fix: remove unwanted import

* fix: update pom.xml

* fix: change the response code

* variable added

* Elastic Search Implementation for Advanced Search (#327)

* fix: cherry-pick commits for advanced search

* fix: cherry-pick commit for token issue - mobile application

* fix: add the missing properties

* fix: add function to retrieve userid

* fix: move the fetch Userid to jwtUtil

* fix:signature check for mmu

* fix: retrive any user without deleted

* fix: update KM filepath

* FLW-713 Remove All File Upload Options (#350)

* FLW-713 Remove All File Upload Options

* Fix UserServiceRoleRepo dependency issue and codeRabit comment

* fixed coderabit comment

* fix userMappingId issue

* Add SMS functionality in release-3.6.1 (#358)

* Enable SMS Functionality in MMU App to Send Prescriptions  (#325)

* fix: sms template save and map mmu (#306)

* Vb/sms (#307)

* fix: sms template save and map mmu

* fix: enable mms for mmu prescription

* Enable SMS Functionality in MMU App to Send Prescriptions  (#325)

* fix: sms template save and map mmu (#306)

* Vb/sms (#307)

* fix: sms template save and map mmu

* fix: enable mms for mmu prescription

---------

Co-authored-by: Vishwanath Balkur <118195001+vishwab1@users.noreply.github.com>

---------

Co-authored-by: 5Amogh <amoghavarsh@navadhiti.com>
Co-authored-by: Vanitha S <116701245+vanitha1822@users.noreply.github.com>
Co-authored-by: Sachin Kadam <152252767+sac2kadam@users.noreply.github.com>
Co-authored-by: vanitha1822 <vanitha@navadhiti.com>
Co-authored-by: Saurav Mishra <80103738+SauravBizbRolly@users.noreply.github.com>

* fix: add OTP rate limiting to prevent OTP flooding on sendConsent endpoint (#373)

- Add OtpRateLimiterService with Redis-backed per-mobile rate limits (3/min, 10/hr, 20/day)
- Add OtpRateLimitException for 429 responses
- Integrate rate limiter in BeneficiaryOTPHandlerImpl and BeneficiaryConsentController
- Add otp.ratelimit.* properties to common_ci and common_docker profiles
- Update common_example.properties with new OTP rate limit config

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Health api (#376)

* Cherry-pick health and version API enhancements to release-3.6.1 (#371)

* feat(health,version): update version and health endpoints and add advance check for database

* fix(health): normalize severity and fix slow query false positives

* fix(health): avoid false CRITICAL on single long-running MySQL transaction

* fix(health): enforce 3s DB connection timeout via HikariCP

* Release 3.6.1 (#374)

* feat(health,version): update version and health endpoints and add advance check for database

* fix(health): normalize severity and fix slow query false positives

* fix(health): avoid false CRITICAL on single long-running MySQL transaction

* fix(health): enforce 3s DB connection timeout via HikariCP

* feat(health): add healthcontroller and fix versioncontroller issues

* fix: build error (#375)

---------

Co-authored-by: KOPPIREDDY DURGA PRASAD <144464542+DurgaPrasad-54@users.noreply.github.com>
Co-authored-by: Vanitha S <116701245+vanitha1822@users.noreply.github.com>

---------

Co-authored-by: Vishwanath Balkur <118195001+vishwab1@users.noreply.github.com>
Co-authored-by: 5Amogh <amoghavarsh@navadhiti.com>
Co-authored-by: Sachin Kadam <152252767+sac2kadam@users.noreply.github.com>
Co-authored-by: Saurav Mishra <80103738+SauravBizbRolly@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: KOPPIREDDY DURGA PRASAD <144464542+DurgaPrasad-54@users.noreply.github.com>

* fix: video consultation functionality

* fix: pom version update

* fix: add cti-server-ip

* fix: comment unwanted code

* fix: update videocall url property

* fix: update cti-server-ip

* docs: add CLAUDE.md for Claude Code guidance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: KM issue

* fix: KM issue

* fix: remove unwanted imports

* fix: conflicts

* fix: update the temp path

* Fix the OpenKM Issue (#389)

* fix: remove km in application.properties

* fix: update all the properties to fetch from env

* fix: update path

* fix: KM issue

* fix: get file from km

* fix: build issue

* fix: build issue

* fix: remove unwanted imports

* fix: build issue

* fix: remove commented line

* Enable KM configuration in common_example.properties

Uncomment KM configuration properties for OpenKM.

* Fix ConfigProperties to resolve env variable placeholders via Spring Environment (#390)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update sms issue

* fix: build issue

* fix: update condition

* fix: edit ben issue

* fix: phone number issue for sms

* fix: update the url with jwt token

* fix: jitsi authorization issue

* fix: skip auth

* fix: hash key updation

* fix: jwt type in header for authorization

* fix: update file path

* fix: vc recording path updation

* fix: update video call recording functionality

* fix: remove unwanted codes

* fix: coderabbit comments

---------

Co-authored-by: Saurav Mishra <80103738+SauravBizbRolly@users.noreply.github.com>
Co-authored-by: Saurav Mishra <saurav.mishra@bizbrolly.com>
Co-authored-by: Sachin Kadam <152252767+sac2kadam@users.noreply.github.com>
Co-authored-by: Mithun James <drtechie@users.noreply.github.com>
Co-authored-by: Amoghavarsh <93114621+5Amogh@users.noreply.github.com>
Co-authored-by: vishwab1 <vishwanath@navadhiti.com>
Co-authored-by: SnehaRH <77656297+snehar-nd@users.noreply.github.com>
Co-authored-by: DurgaPrasad-54 <prasad8790237@gmail.com>
Co-authored-by: KOPPIREDDY DURGA PRASAD <144464542+DurgaPrasad-54@users.noreply.github.com>
Co-authored-by: Vaishnav Bhosale <vaishnavbharatbhosale@gmail.com>
Co-authored-by: Vishwanath Balkur <118195001+vishwab1@users.noreply.github.com>
Co-authored-by: 5Amogh <amoghavarsh@navadhiti.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: SnehaRH <sneha@navadhiti.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants