Data array is flattened in numcodecs which reduced the compression ratio that ZFP can provide on multi-dimension arrays #303

@halehawk

Description

Minimal, reproducible code sample, a copy-pastable example if possible

In ensure_contiguous_ndarray:

    if arr.flags.c_contiguous or arr.flags.f_contiguous:
        # can flatten without copy
        arr = arr.reshape(-1, order='A')
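
To make the effect of that branch concrete, here is a minimal sketch using NumPy only. It does not call numcodecs; the array shape and the variable names are assumptions chosen for illustration. It shows that the reshape is a zero-copy view, but the dimensionality is gone by the time a codec would see the data.

```python
import numpy as np

# Illustrative 3-D array (shape chosen arbitrarily for this sketch).
arr = np.zeros((4, 5, 6), dtype="f4")

# Same condition and reshape as in the snippet above.
if arr.flags.c_contiguous or arr.flags.f_contiguous:
    flat = arr.reshape(-1, order="A")

flat_shape = flat.shape                    # 1-D: the (4, 5, 6) structure is lost
shares_memory = np.shares_memory(arr, flat)  # the flatten itself makes no copy
print(flat_shape, shares_memory)
```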

Problem description

ZFP is a lossy compressor that is expected to achieve a better compression ratio on multi-dimensional arrays than on flattened data. Currently, zfpy.encode calls ensure_contiguous_ndarray, which flattens the array before feeding it to the ZFP compressor, so ZFP achieves a lower compression ratio than expected. I tried adding a switch in ensure_contiguous_ndarray that skips the flattening when the codec is ZFP. However, this only works for C-order arrays. It does not work for F-order arrays, because the ZFP decoder only returns C-order arrays. When the decoder returns a C-order multi-dimensional array for an array that was originally F-order, the ZFP codec cannot pass the F-order encode and decode test.
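
The F-order mismatch can be sketched with NumPy alone. This is an illustration under assumptions, not the numcodecs test itself: `c_dec` is a hypothetical stand-in for a decoder that always returns C-order data, and the comparison assumes the check flattens both arrays in memory order (order='A') before comparing them.

```python
import numpy as np

# A small 2-D array stored in Fortran (column-major) order.
f_arr = np.asfortranarray(np.arange(6, dtype="f8").reshape(2, 3))

# Flattening in memory order, as ensure_contiguous_ndarray does.
flat = f_arr.reshape(-1, order="A")

# Hypothetical stand-in for a decoder that always returns a C-order array
# with the original shape (no real ZFP call is made here).
c_dec = np.ascontiguousarray(f_arr)

# Element by element, the two arrays hold the same values...
elementwise_equal = np.array_equal(f_arr, c_dec)

# ...but their memory-order flattenings differ, so a comparison that
# flattens with order='A' reports a mismatch.
memory_order_equal = np.array_equal(
    f_arr.reshape(-1, order="A"), c_dec.reshape(-1, order="A")
)
print(flat.tolist(), elementwise_equal, memory_order_equal)
```

One possible direction, stated here only as an assumption, is for the codec to record the original memory order and restore it after decoding, for example with np.asfortranarray.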

Version and installation information

Please provide the following:

  • numcodecs.__version__: 0.9.0
  • Python: 3.7, 3.8
  • Operating system (Linux/Windows)
  • Installed with pip
