readline: add support for async iteration #23916
TimothyGu wants to merge 2 commits into nodejs:master
550bb04 to b631b4c
|
Should we benchmark this comparatively with the |
lib/readline.js
Would it make sense to expose this as a separate method? Converting to a stream might be an issue for multiple people.
I consider this an implementation detail of the @@asyncIterator method. A major reason why the performance of this method isn't on par with the 'line' event, as you have noted in #23916 (comment), is the double buffering necessitated by the intermediate stream, so I'd rather not expose the stream at the moment.
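To make the double buffering concrete, here is a minimal sketch of the intermediate-stream approach described above. It is an illustration under assumptions, not the code in this PR: `linesViaStream` and `demoLinesViaStream` are hypothetical names, and the `closed` flag is an extra guard added for this sketch.

```javascript
'use strict';
const { Readable } = require('stream');
const readline = require('readline');

// Hypothetical helper (not part of readline): re-exposes the 'line' events of
// `rl` through an object-mode Readable and returns that stream's iterator.
function linesViaStream(rl) {
  let closed = false; // extra guard for this sketch

  const lines = new Readable({
    objectMode: true,
    // Called when the consumer wants more data: let the input flow again.
    read() {
      if (!closed) rl.resume();
    },
    destroy(err, callback) {
      rl.off('line', onLine);
      rl.off('close', onClose);
      if (!closed) rl.close();
      callback(err);
    },
  });

  const onLine = (line) => {
    // Second buffer: readline has already buffered the chunk to split it,
    // and every line is queued again inside `lines`.
    // push() returning false means the queue is full, so pause the input.
    if (!lines.push(line)) rl.pause();
  };
  const onClose = () => {
    closed = true;
    lines.push(null); // no more lines: end the stream
  };

  rl.on('line', onLine);
  rl.on('close', onClose);
  return lines[Symbol.asyncIterator]();
}

// Usage sketch with an in-memory input instead of a file.
async function demoLinesViaStream() {
  const input = new Readable({ read() {} });
  input.push('first\nsecond\nthird\n');
  input.push(null);
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  const seen = [];
  for await (const line of linesViaStream(rl)) seen.push(line);
  return seen;
}

demoLinesViaStream().then((seen) => console.log(seen));
```

Every line passes through two queues here (readline's own line splitting and the object-mode stream), which is the overhead the comment above refers to.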
|
I've compiled the branch and tested with a 1 GB file (22 514 395 lines):
Scripts and results:

'use strict';
const fs = require('fs');
const readline = require('readline');
let counter = 0;
let dummy;
console.time('event');
const rl = readline.createInterface({
  input: fs.createReadStream('big-file.txt', 'utf8'),
  crlfDelay: Infinity,
});
rl.on('line', (line) => {
  counter++;
  dummy = line;
}).on('close', () => {
  console.timeEnd('event');
  console.log(`Lines: ${counter}, last line length: ${dummy.length}`);
});

'use strict';
const fs = require('fs');
const readline = require('readline');
(async function main() {
  let counter = 0;
  let dummy;
  console.time('asyncIterator');
  const rl = readline.createInterface({
    input: fs.createReadStream('big-file.txt', 'utf8'),
    crlfDelay: Infinity,
  });
  for await (const line of rl) {
    counter++;
    dummy = line;
  }
  console.timeEnd('asyncIterator');
  console.log(`Lines: ${counter}, last line length: ${dummy.length}`);
})();

So the event implementation is currently 3 times as fast as the async iterator implementation. Maybe we should warn about this.
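For anyone who wants to repeat the comparison without a 1 GB file, here is a rough sketch that runs both approaches over an in-memory input. All helper names (`makeInput`, `countWithEvents`, `countWithAsyncIterator`, `compare`) are hypothetical, and it assumes a Node.js build that includes this PR. Because the input is a single in-memory chunk rather than a file, the timings are not comparable to the numbers above; it only shows the shape of the measurement.

```javascript
'use strict';
const { Readable } = require('stream');
const readline = require('readline');

// Hypothetical helper: an in-memory stream holding `n` identical lines,
// standing in for big-file.txt.
function makeInput(n) {
  const input = new Readable({ read() {} });
  input.push('some line of text\n'.repeat(n));
  input.push(null);
  return input;
}

// Counts lines with the 'line' event; resolves with count and elapsed ms.
function countWithEvents(n) {
  return new Promise((resolve) => {
    let counter = 0;
    const start = process.hrtime.bigint();
    const rl = readline.createInterface({
      input: makeInput(n),
      crlfDelay: Infinity,
    });
    rl.on('line', () => {
      counter++;
    }).on('close', () => {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      resolve({ counter, ms });
    });
  });
}

// Counts lines with `for await`; returns count and elapsed ms.
async function countWithAsyncIterator(n) {
  let counter = 0;
  const start = process.hrtime.bigint();
  const rl = readline.createInterface({
    input: makeInput(n),
    crlfDelay: Infinity,
  });
  // eslint-disable-next-line no-unused-vars
  for await (const line of rl) {
    counter++;
  }
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { counter, ms };
}

// Runs both measurements one after the other.
async function compare(n) {
  const event = await countWithEvents(n);
  const asyncIterator = await countWithAsyncIterator(n);
  return { event, asyncIterator };
}

compare(100000).then((result) => console.log(result));
```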
|
Can you check reading the same file with |
|
@mcollina Do you mean to compare async iterating over unsplit chunks vs async iterating over split lines? If so, I get a ~4x factor:
Scripts and results:

const fs = require('fs');
(async function main() {
  let counter = 0;
  let dummy;
  console.time('asyncIteratorChunks');
  for await (const chunk of fs.createReadStream('big-file.txt', 'utf8')) {
    counter++;
    dummy = chunk;
  }
  console.timeEnd('asyncIteratorChunks');
  console.log(`Chunks: ${counter}, last chunk length: ${dummy.length}`);
})();

'use strict';
const fs = require('fs');
const readline = require('readline');
(async function main() {
  let counter = 0;
  let dummy;
  console.time('asyncIteratorLines');
  const rl = readline.createInterface({
    input: fs.createReadStream('big-file.txt', 'utf8'),
    crlfDelay: Infinity,
  });
  for await (const line of rl) {
    counter++;
    dummy = line;
  }
  console.timeEnd('asyncIteratorLines');
  console.log(`Lines: ${counter}, last line length: ${dummy.length}`);
})();
|
@mcollina Or do you mean to compare async iterating over unsplit chunks vs the event implementation for unsplit chunks? If so, I get a 1:1 factor, i.e. the same speed:
Scripts and results:

'use strict';
const fs = require('fs');
let counter = 0;
let dummy;
console.time('eventChunks');
const readable = fs.createReadStream('big-file.txt', 'utf8');
readable.on('data', (chunk) => {
  counter++;
  dummy = chunk;
}).on('close', () => {
  console.timeEnd('eventChunks');
  console.log(`Chunks: ${counter}, last chunk length: ${dummy.length}`);
});

'use strict';
const fs = require('fs');
(async function main() {
  let counter = 0;
  let dummy;
  console.time('asyncIteratorChunks');
  for await (const chunk of fs.createReadStream('big-file.txt', 'utf8')) {
    counter++;
    dummy = chunk;
  }
  console.timeEnd('asyncIteratorChunks');
  console.log(`Chunks: ${counter}, last chunk length: ${dummy.length}`);
})();
|
#23901 has landed, so it seems those commits can be excluded to simplify reviews.
|
And beware of #23929: we may have conflicts.
|
@vsemozhetbyt thanks for those benchmarks, those are quite interesting. Specifically the fact that using stream iteration is now essentially on par with Related to |
Co-authored-by: Ivan Filenko <[email protected]> Fixes: nodejs#18603 Refs: nodejs#18904
Documentation changes
b631b4c to f8ff7c7
|
@vsemozhetbyt @mcollina I've updated this PR to address the documentation comments. Please take a look.
|
Docs LGTM. Thank you!
|
Maybe cc @nodejs/streams ?
|
@devsnek maybe? Anyway, this could land because it's older than a week. I'd recommend waiting 2 days and then landing if no one objects.
Co-authored-by: Ivan Filenko <[email protected]> Fixes: nodejs#18603 Refs: nodejs#18904 PR-URL: nodejs#23916 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Gus Caplan <[email protected]>
Notable Changes:
* console,util:
  * `console` functions now handle symbols as defined in the spec.
    nodejs#23708
  * The inspection `depth` default is now back at 2.
    nodejs#24326
* dgram,net:
  * Added ipv6Only option for `net` and `dgram`.
    nodejs#23798
* http:
  * Choosing between the http parsers is now possible via a runtime flag.
    nodejs#24739
* readline:
  * The `readline` module now supports async iterators.
    nodejs#23916
* repl:
  * The multiline history feature is removed.
    nodejs#24804
* tls:
  * Added min/max protocol version options.
    nodejs#24405
  * The X.509 public key info now includes the RSA bit size and the
    elliptic curve. nodejs#24358
* url:
  * `pathToFileURL()` now supports LF, CR and TAB.
    nodejs#23720
* Windows:
  * Tools are not installed using Boxstarter anymore.
    nodejs#24677
  * The install-tools scripts are now included in the dist.
    nodejs#24233
* Added new collaborator:
  * [antsmartian](https://github.com/antsmartian) - Anto Aravinth.
    nodejs#24655
PR-URL: nodejs#24854
PR-URL: #26472 Refs: #23916 Reviewed-By: Ruben Bridgewater <[email protected]> Reviewed-By: Matteo Collina <[email protected]>
const TOTAL_LINES = 18;

(async () => {
  const readable = new Readable({ read() {} });
Looking at this now (to solve another bug), I think this backpressure behaviour is confusing: other consumers can listen to 'line' on the stream, and it's surprising that we pause() it for them.
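One possible direction, sketched here purely as an illustration of the concern and not as what readline does, is an iterator that queues lines itself and never calls `pause()`, so other 'line' listeners keep receiving events. `linesWithoutPause` and `demoTwoConsumers` are hypothetical names. The trade-off is that the queue is unbounded when the consumer is slower than the input.

```javascript
'use strict';
const { Readable } = require('stream');
const readline = require('readline');

// Hypothetical helper (not part of readline): iterates over lines without
// ever calling rl.pause(), so other 'line' listeners are not affected.
// Trade-off: `queue` grows without bound if the consumer is slow.
function linesWithoutPause(rl) {
  const queue = [];
  let closed = false;
  let wake = null; // resolver of the promise the consumer is waiting on

  const notify = () => {
    if (wake !== null) {
      wake();
      wake = null;
    }
  };
  const onLine = (line) => {
    queue.push(line);
    notify();
  };
  const onClose = () => {
    closed = true;
    notify();
  };
  // Attach eagerly so no line is missed before iteration starts.
  rl.on('line', onLine);
  rl.on('close', onClose);

  return (async function* generate() {
    try {
      while (true) {
        // Drain everything that arrived while we were suspended.
        while (queue.length > 0) yield queue.shift();
        if (closed) return;
        await new Promise((resolve) => {
          wake = resolve;
        });
      }
    } finally {
      rl.off('line', onLine);
      rl.off('close', onClose);
    }
  })();
}

// Usage sketch: a plain 'line' listener and the iterator share one interface.
async function demoTwoConsumers() {
  const input = new Readable({ read() {} });
  input.push('a\nb\nc\n');
  input.push(null);
  const rl = readline.createInterface({ input, crlfDelay: Infinity });

  let seenByListener = 0;
  rl.on('line', () => {
    seenByListener++;
  });

  const seenByIterator = [];
  for await (const line of linesWithoutPause(rl)) seenByIterator.push(line);
  return { seenByListener, seenByIterator };
}

demoTwoConsumers().then((result) => console.log(result));
```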
Rewritten version of #18904, using more existing streams mechanisms.
Depends on #23901 for some of the edge case tests (relevant commits included within this PR).
Co-authored-by: Ivan Filenko [email protected]
Fixes: #18603
Refs: #18904
/cc @mcollina @devsnek @prog1dev
Checklist
`make -j4 test` (UNIX), or `vcbuild test` (Windows) passes