I've been reviewing bytecodealliance/wasm-tools#1146, which starts implementing implementation imports for the component model, and it's raising questions about internal details that I wanted to bring up at the design level. Before this PR, only two forms of imports were supported for components:
```wasm
(import "foo" (func ...))
(import (interface "foo:bar/baz") (func ...))
```
With the recently specified implementation imports, the above PR adds support for new forms of imports:
```wasm
;; from before
(import "foo" (func ...))
(import (interface "foo:bar/baz") (func ...))

;; new
(import "foo" (integrity "xx") (func ...))
(import "foo" (url "xx") (func ...))
(import "foo" (relative-url "xx") (func ...))
(import (locked-dep "foo:bar/baz") (func ...))
(import (unlocked-dep "foo:bar/baz") (func ...))
```
Throughout these refactorings, and previously when (interface ...) imports were added, the internal data structures of much of the tooling around the component model have ignored this metadata and instead treated imports as a map of "string to thing". The same is done for instantiation, where instantiation arguments are provided as a list of "string to thing". Each import form then has a canonical string associated with it that is used internally. This canonical string is what disallows overlap between imports, but it also loses context like (url ...) and (integrity ...), which I believe is ok for the current use cases of the tooling (e.g. the url or integrity doesn't affect validation):
```wasm
(import "foo" (func ...))                         ;; name = "foo"
(import (interface "foo:bar/baz") (func ...))     ;; name = "foo:bar/baz"
(import "foo" (integrity "xx") (func ...))        ;; name = "foo"
(import "foo" (url "xx") (func ...))              ;; name = "foo"
(import "foo" (relative-url "xx") (func ...))     ;; name = "foo"
(import (locked-dep "foo:bar/baz") (func ...))    ;; name = "foo:bar/baz"
(import (unlocked-dep "foo:bar/baz") (func ...))  ;; name = "foo:bar/baz"
```
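To make the "canonical string" idea concrete, here is a minimal sketch of how I picture it. The `ImportName` enum, `canonical`, and `insert_import` are hypothetical names made up for illustration; they are not the actual wasmparser types, just an assumption about the general shape.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of the import forms listed above.
/// This is NOT wasmparser's representation, only an illustration.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum ImportName {
    Plain(String),
    Interface(String),
    Integrity { name: String, integrity: String },
    Url { name: String, url: String },
    RelativeUrl { name: String, url: String },
    LockedDep(String),
    UnlockedDep(String),
}

impl ImportName {
    /// The canonical string used as the map key. Everything else
    /// (url, integrity, and the kind of import) is dropped here.
    fn canonical(&self) -> &str {
        match self {
            ImportName::Plain(n)
            | ImportName::Interface(n)
            | ImportName::LockedDep(n)
            | ImportName::UnlockedDep(n) => n.as_str(),
            ImportName::Integrity { name, .. }
            | ImportName::Url { name, .. }
            | ImportName::RelativeUrl { name, .. } => name.as_str(),
        }
    }
}

/// Records an import in a "string to thing" map, rejecting overlapping names.
/// The "thing" is just a type description string in this sketch.
fn insert_import(
    map: &mut BTreeMap<String, String>,
    import: &ImportName,
    ty: &str,
) -> Result<(), String> {
    let key = import.canonical().to_string();
    if map.contains_key(&key) {
        return Err(format!("duplicate import name `{key}`"));
    }
    map.insert(key, ty.to_string());
    Ok(())
}

fn main() {
    let mut imports = BTreeMap::new();
    let with_url = ImportName::Url { name: "foo".into(), url: "xx".into() };
    let plain = ImportName::Plain("foo".into());

    println!("{:?}", insert_import(&mut imports, &with_url, "func"));
    // Same canonical string, so this one is rejected as an overlap.
    println!("{:?}", insert_import(&mut imports, &plain, "func"));
    // Only the name survives in the map; the url is gone.
    println!("{:?}", imports);
}
```

The point is that the map key is enough to detect overlap, but nothing in the map can tell you afterwards which form the import was written in.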
So far so good, but a problem arises at the next step of integrating this change into tooling. There are a number of places where this intermediate "string to thing" representation is re-encoded as a component. For example, wasm-compose uses the results of wasmparser validation to create a new component. It walks over the imports of each subcomponent and generates imports in an outer component based on the union of the subcomponents' imports (e.g. you import foo, I import bar, and when we're composed the outer component imports foo and bar). With implementation imports this starts to break down, because the results of validation don't carry all the metadata for imports, such as urls/integrity, or even a differentiator for the kind of import (e.g. interface vs locked-dep).
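Roughly, the composition step described above looks like the following sketch. This is not wasm-compose's actual code; `ValidatedImports`, `outer_imports`, and `reencode` are hypothetical names, and a real implementation would also need to check that types are compatible when two subcomponents import the same name.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for what validation hands back:
/// canonical import name -> type description.
type ValidatedImports = BTreeMap<String, String>;

/// The outer component imports the union of the subcomponents' imports.
/// Type compatibility checks are omitted in this sketch.
fn outer_imports(subcomponents: &[ValidatedImports]) -> ValidatedImports {
    let mut outer = BTreeMap::new();
    for imports in subcomponents {
        for (name, ty) in imports {
            outer.entry(name.clone()).or_insert_with(|| ty.clone());
        }
    }
    outer
}

/// Re-encodes the map as text. All it has is the name, so it can only
/// ever emit the plain `(import "name" ...)` form.
fn reencode(imports: &ValidatedImports) -> Vec<String> {
    imports
        .iter()
        .map(|(name, ty)| format!("(import \"{name}\" ({ty} ...))"))
        .collect()
}

fn main() {
    // Suppose "foo" was originally written as (import "foo" (url "xx") (func ...)).
    // After validation only the canonical name is left.
    let yours: ValidatedImports =
        BTreeMap::from([("foo".to_string(), "func".to_string())]);
    let mine: ValidatedImports =
        BTreeMap::from([("bar".to_string(), "func".to_string())]);

    for line in reencode(&outer_imports(&[yours, mine])) {
        println!("{line}");
    }
}
```

The outer component ends up importing both foo and bar as expected, but the url that was on foo has no way to make it into the re-encoded output.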
Previously this sort of worked, because the structure of the name could be used to infer the kind of import. For example, if the name had a / or : then it was required to be an interface import; otherwise it was a kebab-name import. Now, though, there are many more fields to infer, and some forms are not syntactically distinguished by their string at all (e.g. (interface "a:b/c") and (locked-dep "a:b/c")).
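As a sketch of the old inference and where it falls over (the `InferredKind` enum and `infer_kind` function are hypothetical, written only to illustrate the rule described above):

```rust
/// The only two kinds that could previously be inferred from a name.
#[derive(Debug, PartialEq, Eq)]
enum InferredKind {
    Kebab,
    Interface,
}

/// The old rule: a `/` or `:` in the name means an interface import,
/// anything else is a kebab-name import.
fn infer_kind(name: &str) -> InferredKind {
    if name.contains('/') || name.contains(':') {
        InferredKind::Interface
    } else {
        InferredKind::Kebab
    }
}

fn main() {
    // Fine for the two original forms.
    println!("{:?}", infer_kind("foo"));
    println!("{:?}", infer_kind("foo:bar/baz"));

    // But (interface "a:b/c") and (locked-dep "a:b/c") both reduce to the
    // string "a:b/c", so a locked-dep would be re-encoded as an interface.
    println!("{:?}", infer_kind("a:b/c"));
}
```

No amount of looking at the string alone can separate those last two cases, and the same goes for recovering a url or integrity value.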
So far I believe we've been roughly trying to keep an equivalence where "map of string to thing" is a valid way to view the imports and exports of a component, with the binary encoding being stricter in order to provide more semantic meaning and enumerate the accepted forms. The change for implementation imports, however, feels like it's pushing toward "map of string to thing" no longer being a valid representation for component imports.
Thus, I'm opening this issue for further discussion. I'm curious whether others think I'm approaching this completely the wrong way. Or are we trying to stuff too much into imports? Or is "map of string to thing" no longer desired, and implementations should all be refactored?
I originally started typing all this up to resolve an ambiguity between (interface "a:b/c") and (locked-dep "a:b/c"), perhaps by making their string representations syntactically different, or something like that. I realize, though, that this still doesn't account for integrity, which wasm-compose wouldn't be able to preserve today either. I'm not actually sure how best to support that myself, which is why I'm thinking a bit more broadly here at the end of typing this.