Skip to content

Add support for containers/storage#63

Merged
runcom merged 19 commits intocontainers:masterfrom
nalind:add-storage-transport
Dec 15, 2016
Merged

Add support for containers/storage#63
runcom merged 19 commits intocontainers:masterfrom
nalind:add-storage-transport

Conversation

@nalind
Copy link
Copy Markdown
Member

@nalind nalind commented Aug 31, 2016

This patch adds containers/storage as a backend type called 'oci-storage', using a storage.Mall initialized with the library's default settings, or using an externally-initialized storage.Mall.

An image's blobs are stored either as oci-storage layers (if they look like filesystem layers) or (alongside the manifest and signatures) as named big data items associated with the image.

Record-keeping (so that we can remember which blobs were layers and which weren't) are encoded as a JSON object which is stored as the oci-storage image's metadata.

The type of compression which we detected around a layer blob when it was imported is encoded as a field in a JSON object which is stored as the oci-storage layer object's metadata, and is reapplied if we need to export the layer later.

If we find ourselves importing an image with the same ID as an image that we already have, we return an error. If we find ourselves importing an image which contains layers which we already have, we flag
an error if its contents differ from the one we already have, otherwise we accept it.

If we find ourselves importing an image that wants to be tagged with a name that is already in use, the name is assigned to the new image and the old image remains otherwise unmodified.

@nalind nalind force-pushed the add-storage-transport branch 5 times, most recently from 2ae8dd9 to 3cbb921 Compare August 31, 2016 15:32
Comment thread storage/storage_image.go
if (n > 0) && archive.IsArchive(header[:n]) {
// It's a filesystem layer. If it's not the first one in the
// image, we assume that the most recently added layer is its
// parent.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit like the ImageDestination interface should be better typed / more specific about the content being transferred; knowing about configs / layers / layer orders

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the order matters for filesystem blobs when this library is being used to store them, but I don't think the rest of the transport types care, since I'm pretty sure they don't need to look at the contents. Still, it's probably worth clarifying that we depend on the ordering being right in types.go.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, looking at the current genericImage.BlobDigests, the order is not currently preserved on input. That can change of course, and I think docker-daemon: will need this to change as well.


For these kinds of API changes, it would be very convenient to have both the definition of the interfaces, the transports implementing the interfaces, and the core of skopeo copy, the primary user, in a single repo — then we could change the ImageSource/ImageDestination interfaces without breaking anything else and without needing to keep a skopeo branch+PR in sync.

#56 + containers/skopeo#175 does that, so if this PR ended up changing the types, perhaps it would be practical to base it on that?

(Warning: #56 is being rebased as other PRs are merged; but I don't expect it to substantially change.)

@runcom Note that #56 currently depends on #52 , primarily for the types.SystemContext API changes; I could split the API changes from #52 so that #52 is only the lookaside implementation, which would allow merging the API changes + copy.go before reviewing the lookaside—if that helped you in any way.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@runcom Note that #56 currently depends on #52 , primarily for the types.SystemContext API changes; I could split the API changes from #52 so that #52 is only the lookaside implementation, which would allow merging the API changes + copy.go before reviewing the lookaside—if that helped you in any way.

that would be awesome and I can go ahead and review #56 and maybe getting it before this (#63) so Nalin's work can benefit from it.

wdyt?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@runcom OK, #64 contains the API changes , and #56 re-merged to not include the lookaside. skopeo PRs update will follow.

@nalind the API churn in #64 will affect this I'm afraid, but it should be easy to adapt.

@mtrmac
Copy link
Copy Markdown
Collaborator

mtrmac commented Aug 31, 2016

Thanks!

For now just a few comments from a quick skim, I haven’t read all of the code.

Structurally it is a bit surprising to have both ImageSource and ImageDestination as the same object, though it doesn’t hurt anything at all.

@runcom
Copy link
Copy Markdown
Member

runcom commented Aug 31, 2016

@nalind tests failures seems related (missing or wrong vendoring of containers/storage?)

@mtrmac
Copy link
Copy Markdown
Collaborator

mtrmac commented Aug 31, 2016

@nalind tests failures seems related (missing or wrong vendoring of containers/storage?)

Yeah, containers/image does not do any vendoring, but make test-skopeo users (only) the libraries already vendored in skopeo. containers/skopeo#179 could help with that

@nalind nalind force-pushed the add-storage-transport branch 15 times, most recently from 1295c5b to c111f3e Compare September 8, 2016 14:17
Update tests for parsing reference strings, and to match other changes
made as part of the review.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add HasBlob() and ReapplyBlob() methods to the ImageDestination
interface, for checking if a blob has already been Put() to a
destination, and allowing the destination implementation to offer the
caller a chance to avoid having to re-Get() the source blob.

This means that copyLayers() no longer keeps track of which blobs it's
copied and skips any -- it handes that down to copyLayer().  The
copyLayer() function is now the lone method of an object that caches the
diffIDs of blobs that it's copied, and skips copying a layer if the
destination blob is already present (HasBlob() says "yes") AND the
blob's diffID is already known.

The "directory" and "oci" destinations implement HasBlob() as a stat()
check for the file they'd create, and make ReapplyBlob() a no-op.

The "docker/daemon" destination now caches the digests of blobs that
it's already added to the output tarball, and checks that in HasBlob(),
making ReapplyBlob() also a no-op.

The "docker"/"openshift" destination HasBlob() checks if the destination
blob is present using an HTTP HEAD request, as PutBlob() does, making
its ReapplyBlob() also a no-op.

Lastly, the "storage" destination HasBlob() checks the image's blobs
list; ReapplyBlob() generates a diff of the most recent layer that
resulted from applying a blob with the specified digest and hands it off
to PutBlob().

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
@nalind nalind force-pushed the add-storage-transport branch from 7d14a9f to 706129b Compare December 14, 2016 22:36
nalind added a commit to nalind/containers-image that referenced this pull request Dec 14, 2016
Fix stylistic nits from PR review of containers#63

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Try to make it clearer that HasBlob() gets to indicate whether or not
ReapplyBlob() is a viable alternative to reapplying a possibly-duplicate
blob using PutBlob(), and that ReapplyBlob() necessarily has to be
called only after HasBlob() has returned success.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
In the schema2 tests, don't bother fleshing out an implementation of
HasBlob or ReapplyBlob that won't be called in the tests.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Change the list of previously-pushed blobs that we keep in a
docker/daemon destination image into a map, to make lookups faster.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add Size(), for returning an approximation of the size, in bytes, that
an image occupies in its current location.  Make the default
implementation return (-1, nil), and make sure that we check for an
empty error in storageImage, which tries to return a useful value.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Remove a couple of type casts which recent changes made redundant (oops)
and a couple of redundant length checks around for loops.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
The requireSize argument to diffLayer is always false, so remove it and
logic that it triggered.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Calls to ParseReference() eventually end up calling GetStore(), and
GetStore() needs more information than a reference can tell it, so a few
test cases that appear to be valid at first glance still fail to parse.

Add some more detail on why that happens: in order to match a valid
store, that store needs to have been previously opened by the calling
application, but since the store is in a temporary directory created
just for the test, and we aren't opening that location with GetStore()
first, it's expected to fail.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Convert the helper functions which copy.Image() used, often passing in
several arguments, into methods of a helper object, which can remember
most of them.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Fix stylistic nits from PR review of containers#63

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
When HasBlob() returns false, always return a non-nil error.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Return more context in error values instead of merely logging them and
returning a more generic error.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Don't use the name of a builtin ("close") as a variable name.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
@nalind nalind force-pushed the add-storage-transport branch from 706129b to 0616be9 Compare December 14, 2016 22:37
nalind added a commit to nalind/cri-o that referenced this pull request Dec 14, 2016
Vendor a copy of containers/storage, the proposed version of
containers/image that knows how to use it, and new dependencies that
they pull in.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
nalind added a commit to nalind/cri-o that referenced this pull request Dec 14, 2016
Vendor a copy of containers/storage, the proposed version of
containers/image that knows how to use it, and new dependencies that
they pull in.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
@runcom
Copy link
Copy Markdown
Member

runcom commented Dec 15, 2016

@nalind make lint is failing https://travis-ci.org/containers/image/builds/184081374#L349

I believe we're pretty good to go after you fix that

@runcom
Copy link
Copy Markdown
Member

runcom commented Dec 15, 2016

@nalind just have the last question: this is adding a transport just for containers/storage but would it be possible to still use a specific transport and just stream/pass blobs to contianers/storage? I mean, can contianers/storage and image be used fully decoupled?

like:

streamOrPathToBlob = src.GetBlob
storage.AddLyaer(streamOrPathToBlob)

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
@nalind nalind force-pushed the add-storage-transport branch from 9f0ba20 to fa91766 Compare December 15, 2016 16:45
@nalind
Copy link
Copy Markdown
Member Author

nalind commented Dec 15, 2016

Re-pushed without the Makefile changes.

Comment thread image/memory.go
if err != nil {
return -1, err
}
return int64(len(s)), nil
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only the length of the manifest, which does not really make sense. Considering that memoryImage is only returned by types.Image.UpdatedImage, it should say something semantically consistent with sourcedImage.Size.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh. Opened #199 to just return -1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants