Now it follows the Wasm structured control flow model completely.
It's like the operand stack, but for blocks.
Because we resolve the addresses in cache_bytecode(), we don't need to do it at runtime.
Background
In WebAssembly, jump instructions do not contain an explicit address. This makes the compiler much easier to implement, because it does not have to "fix up" the jump address when jumping forward.
Calculating the destination address is the responsibility of the runtime virtual machine.
This design is a necessity rather than a preference: since Wasm uses variable-length encoding for integers, filling in an offset later could change its encoded size and shift a whole chunk of instructions.
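To illustrate why a late fixup is awkward under variable-length encoding, here is a minimal sketch of unsigned LEB128, the integer encoding Wasm uses. The function name `encode_uleb128` is hypothetical and chosen only for this example. An offset of 127 fits in one byte while 128 needs two, so a jump offset that is not known yet cannot be given a fixed-size slot in advance.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte has the high bit clear
            return bytes(out)


# The encoded size depends on the value, so patching in a larger offset
# later would move every instruction that follows it.
print(len(encode_uleb128(127)), len(encode_uleb128(128)))
```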
Another reason Wasm uses structured control flow is safety. A bug in the bytecode cannot break predictable behavior by jumping to a random address, since the jump target is determined by the structure. The target is inferred from the block nesting level, which is easily verified by checking that it is less than the block stack size.
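The nesting-level check can be sketched as a single pass over the code. This is an illustration under assumptions, not Mascal's validator: instructions are modelled as hypothetical tuples such as `("block",)`, `("loop",)`, `("end",)` and `("jmp", depth)`, with depth 0 meaning the innermost enclosing block (the Wasm convention, which may differ from the operand layout used in this PR).

```python
def validate(code):
    """Return True if blocks are balanced and every jump names an enclosing block."""
    depth = 0  # current size of the block stack
    for op, *args in code:
        if op in ("block", "loop"):
            depth += 1
        elif op == "end":
            if depth == 0:
                return False  # End without a matching Block or Loop
            depth -= 1
        elif op == "jmp":
            # The only check a jump needs: its depth must be below the stack size.
            if not 0 <= args[0] < depth:
                return False
    return depth == 0  # every block must be closed


good = [("block",), ("loop",), ("jmp", 1), ("jmp", 0), ("end",), ("end",)]
bad = [("block",), ("jmp", 1), ("end",)]  # depth 1 does not exist here
print(validate(good), validate(bad))
```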
Also, structured control flow is much easier to read and reason about. Raw jump addresses are harder to understand and debug, since the control structure information is lost when compiling from AST to bytecode.
I like this design so much that I stole the idea of structured control flow. Now the bytecode looks like a hybrid of a higher-level language and assembly.
Structural flow instructions
We have the following new "instructions", in quotes because they are not found in real CPUs.
There is a stack of control blocks, pushed by each Block or Loop instruction and popped by End.
Jump instructions jump forward in a Block and backward in a Loop.
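A minimal sketch of how an interpreter might resolve these jumps at runtime with a block stack follows. Everything in it is assumed for illustration: the tuple encoding, the lowercase mnemonics, the extra `emit`, `dec` and `jz` helper instructions, and the Wasm-style depth operand (0 is the innermost block). It assumes well-formed code and is not Mascal's interpreter.

```python
def run(code, counter):
    """Run a toy program; returns the values emitted while counting down."""
    blocks = []  # block stack: (kind, index of the block/loop instruction)
    out = []
    pc = 0
    while pc < len(code):
        op, *args = code[pc]
        if op in ("block", "loop"):
            blocks.append((op, pc))
        elif op == "end":
            blocks.pop()
        elif op == "emit":
            out.append(counter)
        elif op == "dec":
            counter -= 1
        elif op == "jmp" or (op == "jz" and counter == 0):
            depth = args[0]
            kind, start = blocks[-1 - depth]
            if kind == "loop":
                # Backward jump: drop inner blocks, keep the loop itself.
                del blocks[len(blocks) - depth:]
                pc = start + 1
            else:
                # Forward jump: drop the target block too, then scan
                # ahead for the End that closes it.
                del blocks[len(blocks) - depth - 1:]
                level = depth + 1
                while level:
                    pc += 1
                    if code[pc][0] in ("block", "loop"):
                        level += 1
                    elif code[pc][0] == "end":
                        level -= 1
                pc += 1
            continue
        pc += 1
    return out


program = [
    ("block",),
    ("loop",),
    ("jz", 1),    # leave the outer block when the counter reaches zero
    ("emit",),
    ("dec",),
    ("jmp", 0),   # back to the start of the loop
    ("end",),
    ("end",),
]
print(run(program, 3))
```

The forward scan in the `block` branch is the runtime cost that the benchmark below tries to measure.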
We follow the WebAssembly VM model, where loops and branches are implemented as blocks (structured control flow).
The block can be one of the following:
Block (jump forward)
Block
...
End
Loop (jump backward)
Loop
...
End
If (skip forward conditionally)
If
...
End
If/Else (skip to else clause conditionally)
If
...
Else
...
End
Any jump instruction (Jmp, Jt and Jf) inside these blocks transfers control to a new address: either the beginning of the block (Loop) or its end (all the other block types).
Compiler generated code
In practice, Block and Loop always come in pairs. If you compile a for, loop or while loop, the instructions would look like this:
The inner part is the body of the loop control structure.
Jmp _ 1 jumps to the end of the block, escaping the loop, which implements break. Jmp _ 2 jumps to the beginning of the loop, which implements continue.
Disassembly improvement
Now the disassembly is much more ergonomic. It shows the control flow structure by indentation.
It was like this before.
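The indentation-based disassembly described above can be sketched in a few lines. The tuple encoding and lowercase mnemonics are assumptions made for this example, and the real disassembler output may be formatted differently.

```python
def disassemble(code):
    """Render instructions one per line, indented by block nesting depth."""
    lines = []
    depth = 0
    for op, *args in code:
        if op == "end":
            depth -= 1  # End lines up with the Block or Loop it closes
        text = " ".join([op, *map(str, args)])
        lines.append("  " * depth + text)
        if op in ("block", "loop"):
            depth += 1  # everything until the matching End is nested
    return "\n".join(lines)


print(disassemble([("block",), ("loop",), ("jmp", 0), ("end",), ("end",)]))
```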
Benchmark
Since structural flow puts the burden of calculating jump addresses on the bytecode interpreter, we would like to know how much it impacts performance.
Here is the latest measurement of Mandelbrot set ASCII art rendering time, compared with other languages. Error bars are the standard deviation of 5 runs.
The relevant ones are named Mascal.
The difference is too small to measure accurately, so I increased the number of iterations from 256 to 1024 so that the task takes longer, and extracted only the measurements relevant to Mascal.
The difference is still marginal, but the speed ordering is consistently Mascal strflow < Mascal strflow-cached < Mascal bytecode. This makes sense given when each variant calculates jump addresses, as listed below:
As you go down, the calculation moves closer to runtime, so it puts more workload on execution.
Also note that I added Mascal varlen, an implementation of variable-length bytecode instructions in the varlen branch. It is the most compact representation, but not necessarily the fastest.
Conclusion
The performance is marginally better if we calculate the jump addresses at compile time (as real CPU instructions do), but the overhead can be minimized by caching the jump addresses at load time.
Do we still want to apply this change? We want variable-length instructions to minimize the cache memory required for instructions, which would require structural control flow. However, experiments show that variable-length encoding adds significant runtime overhead, which cancels out the benefit of memory locality.
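A minimal sketch of what caching jump addresses at load time could look like is shown here. The name `cache_jump_targets`, the tuple encoding and the Wasm-style depth operand are assumptions for this example. It is not the actual cache_bytecode() implementation, only one way to do the same kind of single pass.

```python
def cache_jump_targets(code):
    """Map the index of every jmp to the index it transfers control to."""
    targets = {}
    open_blocks = []  # entries: (kind, start index, pending forward jumps)
    for pc, (op, *args) in enumerate(code):
        if op in ("block", "loop"):
            open_blocks.append((op, pc, []))
        elif op == "jmp":
            kind, start, pending = open_blocks[-1 - args[0]]
            if kind == "loop":
                targets[pc] = start + 1   # backward target is already known
            else:
                pending.append(pc)        # forward target is known at End
        elif op == "end":
            _, _, pending = open_blocks.pop()
            for jump_pc in pending:
                targets[jump_pc] = pc + 1  # resume right after the End
    return targets


code = [("block",), ("loop",), ("jmp", 1), ("jmp", 0), ("end",), ("end",)]
print(cache_jump_targets(code))
```

With such a table built once per loaded function, the interpreter can replace the forward scan with a dictionary lookup, while the compiler still never has to patch an address.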