netcdf.NetCDFDataset
kedro_datasets_experimental.netcdf.NetCDFDataset
NetCDFDataset(
    *,
    filepath,
    temppath=None,
    load_args=None,
    save_args=None,
    fs_args=None,
    credentials=None,
    metadata=None
)
Bases: AbstractDataset
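NetCDFDataset loads and saves data as NetCDF (.nc) files. The file can live on the local filesystem or on remote storage reachable through fsspec; remote files are copied to a local temporary directory before they are read.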
Example usage for the YAML API:
single-file:
  type: netcdf.NetCDFDataset
  filepath: s3://bucket_name/path/to/folder/data.nc
  save_args:
    mode: a
  load_args:
    decode_times: False

multi-file:
  type: netcdf.NetCDFDataset
  filepath: s3://bucket_name/path/to/folder/data*.nc
  load_args:
    concat_dim: time
    combine: nested
    parallel: True
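Example usage for the Python API:
import tempfile
from pathlib import Path

import xarray as xr
from kedro_datasets_experimental.netcdf import NetCDFDataset

# Stand-in for pytest's tmp_path fixture, which the original snippet relied on.
tmp_path = Path(tempfile.mkdtemp())

# Build a small one-variable dataset to round-trip through a NetCDF file.
ds = xr.DataArray(
    [0, 1, 2], dims=["x"], coords={"x": [0, 1, 2]}, name="data"
).to_dataset()
dataset = NetCDFDataset(
    filepath=tmp_path / "data.nc",
    save_args={"mode": "w"},
)
dataset.save(ds)
reloaded = dataset.load()
assert ds.equals(reloaded)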
Parameters:
- filepath (str) – Filepath in POSIX format to a NetCDF file, prefixed with a protocol like s3://. If no prefix is provided, the file protocol (local filesystem) is used. The prefix can be any protocol supported by fsspec. The filepath can also be a glob, in which case it is used to read multiple NetCDF files.
- temppath (str | None, default: None) – Local temporary directory, used when reading from remote storage, since NetCDF files cannot be read directly from remote storage.
- load_args (dict[str, Any] | None, default: None) – Additional options for loading NetCDF file(s). For a single file, see xarray.open_dataset (https://xarray.pydata.org/en/stable/generated/xarray.open_dataset.html). For multiple files, see xarray.open_mfdataset (https://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html). All defaults are preserved.
- save_args (dict[str, Any] | None, default: None) – Additional options for saving NetCDF file(s). For all available arguments, see xarray.Dataset.to_netcdf (https://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html). All defaults are preserved.
- fs_args (dict[str, Any] | None, default: None) – Extra arguments to pass into the underlying filesystem class constructor (e.g. {"cache_regions": "us-east-1"} for s3fs.S3FileSystem).
- credentials (dict[str, Any] | None, default: None) – Credentials required to get access to the underlying filesystem. E.g. for GCSFileSystem it should look like {"token": None}.
- metadata (dict[str, Any] | None, default: None) – Any arbitrary metadata. This is ignored by Kedro, but may be consumed by users or external plugins.
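The protocol rule described for filepath (an explicit prefix such as s3://, otherwise the local file protocol) can be illustrated with a small standard-library sketch. The helper name split_protocol is hypothetical and this is not the parsing code Kedro or fsspec actually uses; it only shows the fallback behaviour stated above.

```python
def split_protocol(filepath: str) -> tuple[str, str]:
    """Hypothetical helper: split a filepath into (protocol, path).

    Falls back to the "file" protocol when no "<protocol>://" prefix is present.
    """
    if "://" in filepath:
        protocol, path = filepath.split("://", 1)
        return protocol, path
    return "file", filepath


remote = split_protocol("s3://bucket_name/path/to/folder/data.nc")
local = split_protocol("data/01_raw/data.nc")
```

Here remote carries the s3 protocol, while local falls back to file because the string has no prefix.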
Source code in kedro_datasets_experimental/netcdf/netcdf_dataset.py
_is_multifile
instance-attribute
_is_multifile = (
    True
    if "*" in str(PurePosixPath(self._filepath).stem)
    else False
)
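The multi-file check shown for _is_multifile only inspects the stem of the path, that is, the final path component without its suffix. A standalone sketch of the same expression, wrapped in a hypothetical function name is_multifile, makes the consequences easier to see:

```python
from pathlib import PurePosixPath


def is_multifile(filepath: str) -> bool:
    """Hypothetical wrapper around the expression used for _is_multifile."""
    # Only the stem (file name minus suffix) is searched for "*".
    return "*" in str(PurePosixPath(filepath).stem)


glob_in_name = is_multifile("bucket_name/path/to/folder/data*.nc")
plain_file = is_multifile("bucket_name/path/to/folder/data.nc")
glob_in_folder = is_multifile("bucket_name/path/to/fol*der/data.nc")
```

Under this check, a wildcard in the file name marks the dataset as multi-file, while a wildcard that appears only in a directory component does not.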
_load_args
instance-attribute
_load_args = {
    **self.DEFAULT_LOAD_ARGS,
    **(load_args or {}),
}
_save_args
instance-attribute
_save_args = {
    **self.DEFAULT_SAVE_ARGS,
    **(save_args or {}),
}
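Both _load_args and _save_args combine class-level defaults with the dictionary supplied by the user. A minimal sketch of that merge pattern follows, assuming plain dict unpacking. The default value used here is made up for illustration, since the real contents of DEFAULT_LOAD_ARGS are not shown on this page, and merge_args is a hypothetical helper.

```python
from typing import Any, Optional

# Hypothetical defaults, for illustration only.
DEFAULT_LOAD_ARGS: dict[str, Any] = {"decode_times": True}


def merge_args(
    defaults: dict[str, Any], user_args: Optional[dict[str, Any]]
) -> dict[str, Any]:
    """Return a new dict in which user-supplied keys override the defaults."""
    # `user_args or {}` turns None into an empty dict so unpacking never fails.
    return {**defaults, **(user_args or {})}


merged = merge_args(DEFAULT_LOAD_ARGS, {"decode_times": False, "parallel": True})
untouched = merge_args(DEFAULT_LOAD_ARGS, None)
```

Because later keys win in a dict display, user values take precedence, and the defaults dictionary itself is never mutated.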
_storage_options
instance-attribute
_storage_options = {
    **self._credentials,
    **self._fs_args,
}
__del__
__del__()
Clean up the temporary directory.
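The source of __del__ is not reproduced on this page, so the following is only a sketch of what cleaning up local temporary copies could look like, using the standard library. The helper remove_temp_copies and its parameters are hypothetical; it assumes the temporary copies keep the stem and suffix of the original file name, with the stem possibly containing a wildcard.

```python
import glob
import os
import tempfile
from pathlib import Path


def remove_temp_copies(temppath: Path, stem: str, suffix: str = ".nc") -> list[str]:
    """Hypothetical helper: delete temp files matching stem + suffix.

    Returns the sorted names of the files that were removed.
    """
    pattern = str(temppath / f"{stem}{suffix}")
    removed = []
    for match in sorted(glob.glob(pattern)):
        os.remove(match)
        removed.append(os.path.basename(match))
    return removed


# Demonstrate on a throwaway directory with two NetCDF-named files and one other file.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    for name in ("data1.nc", "data2.nc", "notes.txt"):
        (tmp_dir / name).touch()
    removed = remove_temp_copies(tmp_dir, "data*")
    remaining = sorted(p.name for p in tmp_dir.iterdir())
```

Only files matching the pattern are deleted; unrelated files in the same directory are left alone.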
_describe
_describe()
_exists
_exists()
_invalidate_cache
_invalidate_cache()
Invalidate underlying filesystem caches.
load
load()
save
save(data)