Reduce likelihood of race conditions on keep-alive timeout calculatio…#52653
zanettea wants to merge 1 commit into nodejs:main
ping @mweberxyz
lib/_http_agent.js
Outdated
let serverHintTimeout = ( NumberParseInt(hint) * 1000 ) - 1000;
serverHintTimeout = serverHintTimeout > 0 ? serverHintTimeout : 0;

if (serverHintTimeout < agentTimeout) {
This conditional needs to be fixed - agentTimeout defaults to 0, so the serverHintTimeout is never being set.
In this case the socket shouldn't be reused at all. Is setting socket.setTimeout(0) the right way to do it?
Take a look at socket docs - socket.setTimeout(0) doesn't mean immediate, it means never.
Sorry, I have limited understanding of the network internals. In undici ( https://github.com/nodejs/undici/pull/291/files ) if the keepAliveTimeout goes down to 0 they flag the connection as reset:
if (!keepAliveTimeout || keepAliveTimeout < 1e3) {
client[kReset] = true
}
I don't know how to achieve the same result here. Maybe:
socket.setKeepAlive(false);
?
I also have a limited understanding of network internals, but I think:
if (!agentTimeout || serverHintTimeout < agentTimeout) {
is what you want.
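A minimal sketch of the suggested condition, pulled out into a standalone function so the three cases are easy to see. `pickSocketTimeout` is a hypothetical helper for illustration and does not exist in lib/_http_agent.js; it assumes, as noted above, that an unset agent timeout is represented as 0.

```javascript
'use strict';

// Hypothetical helper, not part of lib/_http_agent.js.
// Returns the idle timeout (in ms) to apply to a reusable socket, or null
// when the timeout already configured by the agent should be left alone.
function pickSocketTimeout(agentTimeout, serverHintTimeout) {
  // agentTimeout === 0 means no agent timeout is configured, so the
  // server hint is the only bound available and should be applied.
  if (!agentTimeout || serverHintTimeout < agentTimeout) {
    return serverHintTimeout;
  }
  return null;
}

console.log(pickSocketTimeout(0, 4000));     // agent timeout unset
console.log(pickSocketTimeout(10000, 4000)); // hint shorter than agent timeout
console.log(pickSocketTimeout(2000, 4000));  // agent timeout already shorter
```

With the original `serverHintTimeout < agentTimeout` check alone, the first case would never apply the hint, which is the problem raised at the top of this thread.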
socket.setTimeout(server.keepAliveTimeout);
// Increase the internal timeout wrt the advertised value to reduce likeliwood of ECONNRESET errors
// due to race conditions between the client and server timeout calculation
socket.setTimeout(server.keepAliveTimeout + 1000);
I don't think this server change is necessary.
Might be the cause of all the test failures.
This fix protects clients that use the hint timeout as is (the current Node.js implementation) without adjusting it for network or CPU-load delays.
If server.keepAliveTimeout is set to 0 (never time out) this will change it to 1 second.
If server.keepAliveTimeout == 0, the if condition on line 1012 is false. The timeout on the socket is set only if server.keepAliveTimeout != 0.
You're correct, sorry I missed that.
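To make the point above concrete, here is a small sketch of the server-side guard. `serverIdleTimeout` and `SERVER_BUFFER_MS` are hypothetical names for illustration; the real code calls `socket.setTimeout()` inline rather than through a helper.

```javascript
'use strict';

// Hypothetical constant: the 1 second margin this PR adds on the server side.
const SERVER_BUFFER_MS = 1000;

// Hypothetical helper, not part of Node.js.
// Returns the idle timeout (in ms) the server would set on the socket,
// or null when no timeout is set at all.
function serverIdleTimeout(keepAliveTimeout) {
  // keepAliveTimeout === 0 means "never time out": the guard is false,
  // so setTimeout() is skipped and the margin is never added.
  if (!keepAliveTimeout) {
    return null;
  }
  return keepAliveTimeout + SERVER_BUFFER_MS;
}

console.log(serverIdleTimeout(0));
console.log(serverIdleTimeout(5000));
```

A `keepAliveTimeout` of 0 therefore stays "no timeout" instead of silently becoming 1 second.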
lib/_http_agent.js
Outdated
const serverHintTimeout = NumberParseInt(hint) * 1000;
// Let the timer expires before the announced timeout to reduce
// the likelihood of ECONNRESET errors
let serverHintTimeout = ( NumberParseInt(hint) * 1000 ) - 1000;
If the server responds with a 1 second keepalive, this will disable the timeout because serverHintTimeout will be 0. Maybe go with - 500 in place of - 1000?
serverHintTimeout = serverHintTimeout > 0 ? serverHintTimeout : 0;
if (serverHintTimeout === 0) {
  // cannot safely reuse the socket because the server timeout is too short
  canKeepSocketAlive = false;
I have updated the PR to just skip keep-alive if the timeout is too short. In my local build the tests now pass.
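A rough sketch of the updated client-side behavior, for readers following along. `adjustKeepAliveHint` and `CLIENT_BUFFER_MS` are hypothetical names, and the header parsing is an assumption that only handles a leading `timeout=N` parameter; the actual agent code is structured differently.

```javascript
'use strict';

// Hypothetical constant: the 1 second margin subtracted on the client side.
const CLIENT_BUFFER_MS = 1000;

// Hypothetical helper, not part of lib/_http_agent.js.
// `header` is the raw value of the Keep-Alive response header.
function adjustKeepAliveHint(header) {
  const match = /^timeout=(\d+)/.exec(header);
  if (!match) {
    // No usable hint: nothing to adjust.
    return { canKeepSocketAlive: true, timeoutMs: null };
  }
  const hintedMs = Number.parseInt(match[1], 10) * 1000;
  // Expire before the announced timeout, never going below zero.
  const timeoutMs = Math.max(hintedMs - CLIENT_BUFFER_MS, 0);
  // A result of 0 means the server timeout is too short to reuse safely.
  return { canKeepSocketAlive: timeoutMs > 0, timeoutMs };
}

console.log(adjustKeepAliveHint('timeout=5'));
console.log(adjustKeepAliveHint('timeout=1'));
```

The second call is the `timeout=1` case discussed above: the adjusted timeout is 0, so the socket is not kept alive rather than being given `setTimeout(0)`, which would disable the timeout entirely.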
8407fd7 to bc34680
Found the same issue in dotnet -- they went with a 1 second offset as well, so that was a good choice. 👍
ba0d895 to 2124b55
reduce likelihood of race conditions on keep-alive timeout calculation between http1.1 servers and clients and honor server keep-alive timeout when agentTimeout is not set
Fixes: nodejs#47130
Fixes: nodejs#52649
To me, it looks like this should be a testable behavior change.
I had a quick look, and I guess that if test-http-client-keep-alive-hint.js were a better test it would actually fail now, because the case it tests now behaves quite differently (there is no keep-alive because the hint is just 1 second).
I think a test somewhat along these lines would pass on main and fail on this branch:
'use strict';

const common = require('../common');
const assert = require('assert');
const http = require('http');

const server = http.createServer(
  { keepAliveTimeout: common.platformTimeout(60000) },
  function(req, res) {
    req.resume();
    res.writeHead(200, { 'Connection': 'keep-alive', 'Keep-Alive': 'timeout=1' });
    res.end('FOO');
  }
);

server.listen(0, common.mustCall(() => {
  let shouldStillBeAlive = true;
  setTimeout(() => {
    shouldStillBeAlive = false;
  }, common.platformTimeout(500)); // 500ms buffer

  const req = http.get({ port: server.address().port }, (res) => {
    assert.strictEqual(res.statusCode, 200);
    res.resume();
  });

  req.on('socket', (socket) => {
    socket.on('close', common.mustCall(() => {
      if (shouldStillBeAlive) {
        assert.fail('socket prematurely closed');
      }
      server.close();
    }));
  });
}));

// This timer should never go off as the agent will parse the hint and terminate earlier
setTimeout(common.mustNotCall(), common.platformTimeout(3000)).unref();

I guess it would be good to write a similar test (hopefully one that relies less on timers) that verifies that we are actually latency-adjusting the hint-based keep-alive (the adjustment is currently hard-coded to 1s).
} else if (state.outgoing.length === 0) {
  if (server.keepAliveTimeout && typeof socket.setTimeout === 'function') {
    socket.setTimeout(server.keepAliveTimeout);
    // Increase the internal timeout wrt the advertised value to reduce likeliwood of ECONNRESET errors
// Increase the internal timeout wrt the advertised value to reduce likeliwood of ECONNRESET errors
// Increase the internal timeout wrt the advertised value to reduce likelihood of ECONNRESET errors
const serverHintTimeout = NumberParseInt(hint) * 1000;

if (serverHintTimeout < agentTimeout) {
// Let the timer expires before the announced timeout to reduce
// Let the timer expires before the announced timeout to reduce
// Let the timer expire before the announced timeout to reduce
Hey, this valuable PR fixes many similar issues for
jakecastelli left a comment
Could you please add a test?
Hi, I'd like to help this land to resolve similar issues. Can I continue the work and add @zanettea as a co-author? As far as I can tell, only test cases need to be added.
Added a 1-second threshold to the keep-alive timeout client-side
Added a 1-second threshold to the keep-alive timeout server-side (the socket timeout expires 1 second after the announced timeout)
It would probably be better to use a configurable threshold, like undici's keepAliveTimeoutThreshold (nodejs/undici#291)
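One possible shape for a configurable threshold, sketched under assumptions: the option name `keepAliveTimeoutThreshold` is borrowed from undici, `createHintAdjuster` is a hypothetical helper, and `http.Agent` does not accept such an option in this PR.

```javascript
'use strict';

// Hypothetical factory, not an existing Node.js API.
// keepAliveTimeoutThreshold is the margin (in ms) subtracted from the
// server's advertised keep-alive timeout; 1000 matches the hard-coded
// value used in this PR.
function createHintAdjuster({ keepAliveTimeoutThreshold = 1000 } = {}) {
  return function adjust(hintSeconds) {
    const timeoutMs = hintSeconds * 1000 - keepAliveTimeoutThreshold;
    // Clamp at 0, which callers should treat as "do not reuse the socket".
    return timeoutMs > 0 ? timeoutMs : 0;
  };
}

const defaultAdjust = createHintAdjuster();
const halfSecondAdjust = createHintAdjuster({ keepAliveTimeoutThreshold: 500 });

console.log(defaultAdjust(1));
console.log(halfSecondAdjust(1));
```

With a 500 ms threshold, a server advertising `timeout=1` would still allow the socket to be reused for a short window, which is the trade-off raised earlier in the review.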