GSLA - update handler to handle non-gzipped files #264
Codecov Report
@@ Coverage Diff @@
## master #264 +/- ##
==========================================
+ Coverage 87.98% 88.04% +0.06%
==========================================
Files 50 50
Lines 3221 3229 +8
Branches 536 537 +1
==========================================
+ Hits 2834 2843 +9
Misses 243 243
+ Partials 144 143 -1
mhidas left a comment
I just made a few suggestions to simplify handling of input file and file collection using recently added aodncore functionality.
    netcdf_collection = self.file_collection.filter_by_attribute_id('file_type', FileType.NETCDF)
    netcdf_file = netcdf_collection[0]
    netcdf_file.publish_type = PipelineFilePublishType.NO_ACTION
Since you already know it's a single netCDF file being handled, you can do this more simply...
-   netcdf_collection = self.file_collection.filter_by_attribute_id('file_type', FileType.NETCDF)
-   netcdf_file = netcdf_collection[0]
-   netcdf_file.publish_type = PipelineFilePublishType.NO_ACTION
+   self.file_collection.set_publish_types(PipelineFilePublishType.NO_ACTION)
    netcdf_file = netcdf_collection[0]
    netcdf_file.publish_type = PipelineFilePublishType.NO_ACTION

    gzip_path = os.path.join(self.temp_dir, os.path.basename(self.input_file + '.gz'))
There's a shortcut available here too... (file_basename property)
-   gzip_path = os.path.join(self.temp_dir, os.path.basename(self.input_file + '.gz'))
+   gzip_path = os.path.join(self.temp_dir, self.file_basename + '.gz')
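
As a quick sanity check that the shortcut builds the same path, here is a small standard-library sketch comparing the two forms. It assumes that the `file_basename` property is equivalent to `os.path.basename(self.input_file)`; the function names and example paths are hypothetical and are not part of the handler.

```python
import os


def gzip_path_original(temp_dir, input_file):
    # Form used before the suggestion: append '.gz', then take the basename.
    return os.path.join(temp_dir, os.path.basename(input_file + '.gz'))


def gzip_path_shortcut(temp_dir, input_file):
    # Assumption: stands in for the handler's file_basename property.
    file_basename = os.path.basename(input_file)
    # Form used after the suggestion: take the basename, then append '.gz'.
    return os.path.join(temp_dir, file_basename + '.gz')
```

Appending the suffix before or after taking the basename gives the same result for an ordinary file path, so the change should be purely cosmetic.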
    netcdf_file_gz = PipelineFile(gzip_path, file_update_callback=self._file_update_callback)
    netcdf_file_gz.publish_type = PipelineFilePublishType.HARVEST_UPLOAD

    self.file_collection.add(netcdf_file_gz)
And again... (see aodn/python-aodncore#209)
-   netcdf_file_gz = PipelineFile(gzip_path, file_update_callback=self._file_update_callback)
-   netcdf_file_gz.publish_type = PipelineFilePublishType.HARVEST_UPLOAD
-   self.file_collection.add(netcdf_file_gz)
+   self.add_to_collection(gzip_path, publish_type=PipelineFilePublishType.HARVEST_UPLOAD)

Thanks @mhidas, I followed your suggestions.

👍 I haven't really looked at the unit tests, but they all seem to be passing, so it should be OK. Is this ready to merge then?

It is! Thanks. Oh yeah, I looked at them quite a lot!

Add logic to transform a received NetCDF file into a .nc.gz
Historically, files were always sent as *.nc.gz, but as of April 2021 files might be pushed as *.nc.
To be consistent with the existing dataset and gogoduck, we transform this .nc into a .nc.gz.
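
For illustration, the .nc to .nc.gz step could look roughly like the sketch below, which uses only the Python standard library. This is not the handler's actual code: the helper name `ensure_gzipped` and its arguments are hypothetical, and the real handler works with `self.input_file` and `self.temp_dir` inside the aodncore pipeline.

```python
import gzip
import os
import shutil


def ensure_gzipped(src_path, dest_dir):
    """Return the path of a gzipped copy of src_path (hypothetical helper).

    Files that already end in '.gz' are returned unchanged; anything else
    is compressed into dest_dir as '<basename>.gz'.
    """
    if src_path.endswith('.gz'):
        # Already in the historical *.nc.gz form, nothing to do.
        return src_path

    gz_path = os.path.join(dest_dir, os.path.basename(src_path) + '.gz')
    # Stream the file through gzip so large NetCDF files are not read into memory at once.
    with open(src_path, 'rb') as f_in, gzip.open(gz_path, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
    return gz_path
```

The original file is left in place, which matches the review thread above, where the uncompressed NetCDF file gets `NO_ACTION` and only the gzipped copy is published.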