[improve][client] Enhance connection and timeout logging #539

Merged
BewareMyPower merged 5 commits into apache:main from bigo-sg:improve-connection-logging
Mar 3, 2026

@zhanglistar

  • Add endpoint IP:port details to connection failure messages
  • Log resolved endpoints for DNS lookup debugging
  • Include physical address in connection timeout errors
  • Add broker remote address to network request timeout warnings
  • Log producer queue sizes during send timeouts for backpressure debugging

These improvements provide better context for troubleshooting
network connectivity issues, DNS resolution problems, and
producer backpressure scenarios.
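
The kind of message this change aims for can be sketched as follows. This is a minimal illustration only, not the code from the PR: the helper names `describeConnectFailure` and `describeSendTimeout`, their parameters, and the message wording are all hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the client library): builds a
// connection-failure message that names the resolved IP:port endpoint
// alongside the logical address that was requested.
std::string describeConnectFailure(const std::string& logicalAddress, const std::string& ip,
                                   unsigned short port, const std::string& reason) {
    std::ostringstream os;
    os << "Failed to connect to " << logicalAddress << " (resolved endpoint " << ip << ":" << port
       << "): " << reason;
    return os.str();
}

// Hypothetical helper: builds a send-timeout message that includes the broker
// address and the producer queue occupancy, so backpressure is visible.
std::string describeSendTimeout(const std::string& topic, const std::string& brokerAddress,
                                std::size_t pendingMessages, std::size_t maxPendingMessages) {
    std::ostringstream os;
    os << "Send timed out on topic " << topic << " (broker " << brokerAddress
       << ", pending messages " << pendingMessages << "/" << maxPendingMessages << ")";
    return os.str();
}
```

With messages shaped like this, a single log line identifies which endpoint failed and whether the producer queue was full at the time, without cross-referencing other logs.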



@BewareMyPower BewareMyPower left a comment

Could you share the original logs to see if the current logs are enough? For example, you added warning logs to handleXxxTimeout, but it actually sets the ResultTimeout on the corresponding promise, which could be processed in the callback.

@zhanglistar

> Could you share the original logs to see if the current logs are enough? For example, you added warning logs to handleXxxTimeout, but it actually sets the ResultTimeout on the corresponding promise, which could be processed in the callback.

Yes, I have seen the original logs, and it's hard to tell from them which broker timed out. For example, the log is:

FailedCallBack cluster=75 topic=bigo_adx_win_75 app_id=75 noraml=0 retry=0 reason: Local: Message timed out

@BewareMyPower
Copy link
Contributor

I've got it. So you have configured a list of brokers, but one of them might be stuck for a long time and the client side didn't fail over to a different broker.
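
The point of the exchange above can be sketched roughly as follows: the application callback receives only a result code, so the broker address is lost unless the timeout handler logs it. This is an illustrative sketch under stated assumptions, not the client's actual implementation; `Result`, `PendingRequest`, and `handleRequestTimeout` are hypothetical stand-ins.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for the client's result codes.
enum class Result { Ok, Timeout };

// The application callback sees only the result, not the broker address.
using SendCallback = std::function<void(Result)>;

// Hypothetical pending request that remembers which broker it was sent to.
struct PendingRequest {
    std::string brokerAddress;
    SendCallback callback;
};

// Hypothetical timeout handler: completes the request with Result::Timeout and
// returns the warning text a client could log at this point. This is the only
// place where the broker address is still known.
std::string handleRequestTimeout(const PendingRequest& request) {
    std::string warning = "Request to " + request.brokerAddress + " timed out";
    if (request.callback) {
        request.callback(Result::Timeout);
    }
    return warning;
}
```

In this sketch, a callback that logs only the result would print something like a bare "timed out" message, which matches the problem described in the reply: the failing broker cannot be identified from the application-side log alone.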


@BewareMyPower BewareMyPower left a comment

Please also fix the format check via clang-format-11.

@BewareMyPower BewareMyPower added this to the 4.2.0 milestone Mar 3, 2026
@BewareMyPower BewareMyPower merged commit 070132b into apache:main Mar 3, 2026
14 checks passed
@BewareMyPower BewareMyPower modified the milestones: 4.2.0, 4.1.0 Mar 5, 2026