ListArray with 64bit instead of 32bit offset #3845

@maartenbreddels

I'm starting to appreciate Arrow more and more, and many of the new ideas I have for vaex align fully with Arrow. I've recently been adding string support to vaex and am trying to follow the Arrow spec, but I've run into an issue with the int32 offset limitation of ListArray.

Because the offsets are cumulative, this limits the total data held by a single array to ~2GB (which is not crazy), but it also complicates matters such as string concatenation, where the result may exceed the limit and force a switch to a ChunkedArray. The memory saved by using 32-bit offsets (4 bytes per element) is not huge, and I am not sure it is worth the added code complexity.
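To make the concatenation problem concrete, here is a minimal pure-Python sketch of an Arrow-style offsets layout. It does not use pyarrow, and the helper names (`build_offsets`, `concat_offsets`, `fits_int32`) are made up for illustration. It assumes the usual variable-length layout, where element `i` spans `data[offsets[i]:offsets[i + 1]]`.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def build_offsets(strings):
    """Return (offsets, data) for a list of strings, Arrow-style."""
    offsets = [0]
    chunks = []
    for s in strings:
        encoded = s.encode("utf-8")
        chunks.append(encoded)
        # Offsets are cumulative: each one is the running byte total.
        offsets.append(offsets[-1] + len(encoded))
    return offsets, b"".join(chunks)


def concat_offsets(a, b):
    """Offsets of the concatenation: b's offsets shift by a's last offset."""
    shift = a[-1]
    return a + [o + shift for o in b[1:]]


def fits_int32(offsets):
    """True if the largest (last) offset is representable as int32."""
    return offsets[-1] <= INT32_MAX


offsets, data = build_offsets(["vaex", "", "arrow"])

# Synthetic offsets standing in for two arrays of 1.5 GB each
# (no data is actually allocated here).
big_a = [0, 1_500_000_000]
big_b = [0, 1_500_000_000]
combined = concat_offsets(big_a, big_b)

# Size of the offsets buffer for n elements with 32-bit vs 64-bit offsets.
n = 1_000_000
bytes_32 = (n + 1) * 4
bytes_64 = (n + 1) * 8
```

Each of the two synthetic arrays fits within int32 offsets on its own, but their concatenation has a final offset of 3,000,000,000, which does not. That is the case that forces a ChunkedArray today. The extra cost of 64-bit offsets is 4 bytes per element, regardless of how large the elements are.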

I was wondering whether the 32-bit limitation is set in stone or could still change, and whether other people have strong opinions about it.

cc @xhochy
