Skip to content

Serialize dependency graph directly from DepGraph#80957

Merged
bors merged 1 commit into
rust-lang:masterfrom
tgnottingham:direct_serialize_depgraph
Jan 19, 2021
Merged

Serialize dependency graph directly from DepGraph#80957
bors merged 1 commit into
rust-lang:masterfrom
tgnottingham:direct_serialize_depgraph

Conversation

@tgnottingham

Copy link
Copy Markdown
Contributor

Reduce memory usage by serializing dep graph directly from DepGraph,
rather than copying it into SerializedDepGraph and serializing that.

@rust-highfive

Copy link
Copy Markdown
Contributor

r? @lcnr

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 12, 2021
@tgnottingham

tgnottingham commented Jan 12, 2021

Copy link
Copy Markdown
Contributor Author

@rustbot label T-compiler A-incr-comp I-compiletime I-compilemem

@bors try @rust-timer queue

@rustbot rustbot added A-incr-comp Area: Incremental compilation I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 12, 2021
@rust-timer

Copy link
Copy Markdown
Collaborator

Awaiting bors try build completion.

@tgnottingham

Copy link
Copy Markdown
Contributor Author

Hm, I think the try request was dropped, maybe because the regular CI checks hadn't passed yet.

@bors try @rust-timer queue

@rust-timer

Copy link
Copy Markdown
Collaborator

Awaiting bors try build completion.

@bors

bors commented Jan 12, 2021

Copy link
Copy Markdown
Collaborator

⌛ Trying commit 3ae463ae9db28561c67bbbad2735c751508ef9e0 with merge 140e2d824700dd32cc8a5a4fee0ca320c3d12360...

@bors

bors commented Jan 13, 2021

Copy link
Copy Markdown
Collaborator

☀️ Try build successful - checks-actions
Build commit: 140e2d824700dd32cc8a5a4fee0ca320c3d12360 (140e2d824700dd32cc8a5a4fee0ca320c3d12360)

@rust-timer

Copy link
Copy Markdown
Collaborator

Queued 140e2d824700dd32cc8a5a4fee0ca320c3d12360 with parent 7a9b552, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 13, 2021
@rust-timer

Copy link
Copy Markdown
Collaborator

Finished benchmarking try commit (140e2d824700dd32cc8a5a4fee0ca320c3d12360): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 13, 2021
@tgnottingham tgnottingham changed the title Serialize dependency graph directly from DepGraph to reduce memory usage Serialize dependency graph directly from DepGraph Jan 13, 2021
Reduce memory usage by serializing dep graph directly from `DepGraph`,
rather than copying it into `SerializedDepGraph` and serializing that.
@tgnottingham tgnottingham force-pushed the direct_serialize_depgraph branch from 3ae463a to 09067db Compare January 13, 2021 06:20
@lcnr

lcnr commented Jan 13, 2021

Copy link
Copy Markdown
Contributor

r? @michaelwoerister maybe

@michaelwoerister

Copy link
Copy Markdown
Member

Thanks for the PR, @tgnottingham! This looks good to me in general. I'll review properly next week.

@michaelwoerister

Copy link
Copy Markdown
Member

Thanks for the PR, @tgnottingham! This looks like a step in the right direction.

One thing I noticed while reading the code was that quite a bit of this data could be mapped from disk into memory as is (at least on little endian platforms) and already provides random access (DepNodes and Fingerprints don't use leb128 because that would actually waste space). So in the future, we might want to just define a proper binary file format for the dep-graph and make decoding lazy -- or partially lazy, depending on whether we can afford the additional disk space for encoding edge indices with 4 bytes instead of using leb128.

Anyway, getting rid of the memory spike is the right thing to do in any case.

@bors r+

@bors

bors commented Jan 19, 2021

Copy link
Copy Markdown
Collaborator

📌 Commit 09067db has been approved by michaelwoerister

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 19, 2021
@tgnottingham

tgnottingham commented Jan 19, 2021

Copy link
Copy Markdown
Contributor Author

Thanks @michaelwoerister!

we might want to just define a proper binary file format for the dep-graph and make decoding lazy -- or partially lazy, depending on whether we can afford the additional disk space for encoding edge indices with 4 bytes instead of using leb128.

As a reference point, the style-servo-opt incr-full benchmark creates almost 16 million edges. The LEB128 encoding of the u32 numbers 0 to that number of edges is ~58MB, and the raw encoding is ~60MB. I was really surprised to see that, but it checks out. The majority (~13 million) of those edges use 4 byte LEB128 encodings, so the savings aren't substantial.

Edit: realized the above is a bad comparison, as the edges have values from 0 to number of nodes, not number of edges. But, if you assume the ~16 million edges have values equally distributed between 0 and 2.75 million (number of nodes in same benchmark), then the LEB128 encoding is ~48MB. So the savings are a bit more significant.

This isn't representative of the actual edge data space requirements, but it suggests that raw-encoding might not be so bad.

@bors

bors commented Jan 19, 2021

Copy link
Copy Markdown
Collaborator

⌛ Testing commit 09067db with merge c5a96fb...

@bors

bors commented Jan 19, 2021

Copy link
Copy Markdown
Collaborator

☀️ Test successful - checks-actions
Approved by: michaelwoerister
Pushing c5a96fb to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 19, 2021
@bors bors merged commit c5a96fb into rust-lang:master Jan 19, 2021
@rustbot rustbot added this to the 1.51.0 milestone Jan 19, 2021
@tgnottingham tgnottingham deleted the direct_serialize_depgraph branch January 20, 2021 18:31
@michaelwoerister

Copy link
Copy Markdown
Member

It would be awesome if we added -Zself-profile events for tracking this kind of data (i.e. file sizes) and made perf.rlo show them. That would make experimenting with things like this more convenient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-incr-comp Area: Incremental compilation I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. I-compiletime Issue: Problems and improvements with respect to compile times. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants