Hi.
I have to inflate a .csv.gz file which should yield a 4 GB CSV with 25 million rows.
When I use an app or the gzip command line, I get the full file without issue.
When I use Zlib::GzipReader, only the first row is returned.
> Zlib::GzipReader.open("adresses-france.csv.gz") { |gz| print gz.read }
id;id_fantoir;numero;rep;nom_voie;code_postal;code_insee;nom_commune;code_insee_ancienne_commune;nom_ancienne_commune;x;y;lon;lat;type_position;alias;nom_ld;libelle_acheminement;nom_afnor;source_position;source_nom_voie;certification_commune;cad_parcelles
=> nil
The file is provided by the French government:
There are many other files in the directory (for each region) but I cannot reproduce the issue with other files.
This service also provides a similar file in Addok format (https://adresse.data.gouv.fr/data/ban/adresses/latest/addok/adresses-addok-france.ndjson.gz), which should yield a 3 GB file with 2 million rows, but Zlib::GzipReader returns only the first 25k rows.
Is there any limit to what Zlib can support (size, number of rows, ...)?
Or does the problem come from the compressed file itself?
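
One possible explanation, which is an assumption and not something confirmed for these files, is that the archive is made of several concatenated gzip members. The gzip command line decompresses all members, while Zlib::GzipReader stops at the end of the first one. The sketch below reproduces that behaviour on a small synthetic archive (the sample rows and the helper names are made up for illustration) and compares it with Zlib::GzipReader.zcat, available since Ruby 2.7, which reads until the end of the IO.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: build a synthetic archive made of two concatenated
# gzip members (same effect as `cat a.gz b.gz > both.gz`).
# The sample rows are invented, not taken from the real file.
def two_member_gzip
  Zlib.gzip("id;numero\n") + Zlib.gzip("1;12\n2;14\n")
end

# Default behaviour: GzipReader stops at the end of the first member.
def read_first_member(bytes)
  Zlib::GzipReader.new(StringIO.new(bytes)).read
end

# Zlib::GzipReader.zcat (Ruby 2.7+) keeps decompressing member after
# member until the end of the IO is reached.
def read_all_members(bytes)
  Zlib::GzipReader.zcat(StringIO.new(bytes))
end

archive = two_member_gzip
puts read_first_member(archive).bytesize  # only the header row
puts read_all_members(archive).bytesize   # header row plus data rows
```

If the real file behaves the same way, passing a block to Zlib::GzipReader.zcat on a File opened in binary mode should avoid loading the whole 4 GB into memory, since the decompressed data is then yielded piece by piece instead of being returned as one string.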