Extract and analyze metadata from NetCDF files...
This skill provides tools for extracting metadata from NetCDF files into structured CSV format. Extract variable names, dimensions, shapes, data types, units, and all NetCDF attributes for documentation, analysis, and understanding NetCDF file contents.
Use this skill when:
Binary format that xarray can read directly. Comes in two versions:
NetCDF3 classic: engine='scipy' with xarray
NetCDF4/HDF5: engine='h5netcdf' with xarray
Text representation of NetCDF files. Must be converted to binary using ncgen:
ncgen -o output.nc input.nc.cdl
Ensure the project has these dependencies installed:
xarray - NetCDF file reading
scipy - Backend for NetCDF3 classic format
h5netcdf (optional) - Backend for NetCDF4/HDF5 format
Install with:
uv add xarray scipy h5netcdf
The skill includes scripts/extract_netcdf_metadata.py which extracts all variable metadata to CSV.
Usage:
# Process all .nc files in a directory
uv run python scripts/extract_netcdf_metadata.py
# Process specific files
uv run python scripts/extract_netcdf_metadata.py file1.nc file2.nc
Output: Creates .metadata.csv files alongside each .nc file with the same basename.
CSV Contents:
variable_name - NetCDF variable identifier
dimensions - Dimension names (comma-separated)
shape - Array shape as tuple
dtype - Data type (float32, int8, etc.)
ndim - Number of dimensions
size - Total number of elements
long_name - Human-readable description (if present)
units - Measurement units (if present)
For custom metadata extraction or analysis:
import xarray as xr
# Open NetCDF file (use engine='scipy' for NetCDF3)
ds = xr.open_dataset('file.nc', engine='scipy')
# Access metadata
print(ds) # Overview of entire dataset
print(ds.dims) # Dimensions
print(ds.data_vars) # Data variables
# Access specific variable
var = ds['variable_name']
print(var.dims) # Variable dimensions
print(var.shape) # Variable shape
print(var.dtype) # Data type
print(var.attrs) # All attributes
# Access specific attributes
if 'long_name' in var.attrs:
    print(var.attrs['long_name'])
if 'units' in var.attrs:
    print(var.attrs['units'])
ds.close()
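Building on the snippet above, the following is a minimal sketch of how the fields listed under CSV Contents could be collected into rows and written next to the source file. It is not the bundled script: the names `FIELDS`, `variable_row`, and `write_metadata_csv` are hypothetical, and iterating over `ds.variables` (which includes coordinate variables) is an assumption about what should be reported.

```python
import csv
from pathlib import Path

# Column order mirrors the fields listed under "CSV Contents".
FIELDS = ["variable_name", "dimensions", "shape", "dtype",
          "ndim", "size", "long_name", "units"]


def variable_row(name, var):
    """Build one CSV row from an xarray-style variable (hypothetical helper)."""
    return {
        "variable_name": name,
        "dimensions": ",".join(str(d) for d in var.dims),
        "shape": str(tuple(var.shape)),
        "dtype": str(var.dtype),
        "ndim": var.ndim,
        "size": var.size,
        # Optional attributes fall back to an empty cell.
        "long_name": var.attrs.get("long_name", ""),
        "units": var.attrs.get("units", ""),
    }


def write_metadata_csv(nc_path, engine="scipy"):
    """Write <basename>.metadata.csv next to nc_path and return its path."""
    import xarray as xr  # imported here so variable_row is usable without xarray

    nc_path = Path(nc_path)
    csv_path = nc_path.with_suffix(".metadata.csv")  # data.nc -> data.metadata.csv
    with xr.open_dataset(nc_path, engine=engine) as ds:
        rows = [variable_row(name, var) for name, var in ds.variables.items()]
    with open(csv_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return csv_path
```

Opening the dataset in a `with` block closes the file even if a variable raises during inspection, which replaces the explicit `ds.close()` call.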
When working with .nc.cdl files, convert them first:
import subprocess
from pathlib import Path
cdl_file = Path("input.nc.cdl")
nc_file = cdl_file.with_suffix("").with_suffix(".nc")
subprocess.run(
    ["ncgen", "-o", str(nc_file), str(cdl_file)],
    check=True
)
Then read with xarray as normal.
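The conversion step can be wrapped in a small helper that also reports a missing `ncgen` up front. This is a sketch under stated assumptions: `nc_path_for` and `convert_cdl` are hypothetical names, and the output is assumed to belong next to the CDL file.

```python
import shutil
import subprocess
from pathlib import Path


def nc_path_for(cdl_path):
    """Map data.nc.cdl (or data.cdl) to data.nc in the same directory."""
    stem = Path(cdl_path).with_suffix("")  # drop the trailing .cdl
    if stem.suffix == ".nc":
        return stem
    return stem.with_name(stem.name + ".nc")


def convert_cdl(cdl_path):
    """Run ncgen on cdl_path and return the path of the binary file."""
    if shutil.which("ncgen") is None:
        raise RuntimeError("ncgen not found on PATH; install the NetCDF tools")
    nc_path = nc_path_for(cdl_path)
    subprocess.run(["ncgen", "-o", str(nc_path), str(cdl_path)], check=True)
    return nc_path
```

Checking `shutil.which` first turns a bare `FileNotFoundError` from `subprocess.run` into a message that names the missing tool.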
# Convert if CDL
ncgen -o data.nc data.nc.cdl
# Extract metadata
uv run python scripts/extract_netcdf_metadata.py data.nc
Result: data.metadata.csv created in the same directory.
# Convert all CDL files in directory
for f in *.nc.cdl; do
    ncgen -o "${f%.cdl}" "$f"
done
# Extract metadata from all
uv run python scripts/extract_netcdf_metadata.py *.nc
Extract metadata from multiple files, then compare the CSV files to identify:
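One way to run such a comparison, sketched with only the standard library. The function names are hypothetical, and the columns compared here (dimensions, dtype, units) are illustrative choices, not a fixed list from the skill.

```python
import csv


def load_metadata(csv_path):
    """Return {variable_name: row} for one .metadata.csv file."""
    with open(csv_path, newline="") as fh:
        return {row["variable_name"]: row for row in csv.DictReader(fh)}


def compare_metadata(a, b, columns=("dimensions", "dtype", "units")):
    """Compare two mappings produced by load_metadata."""
    only_a = sorted(set(a) - set(b))  # variables missing from b
    only_b = sorted(set(b) - set(a))  # variables missing from a
    changed = {}
    for name in sorted(set(a) & set(b)):
        diffs = {c: (a[name].get(c), b[name].get(c))
                 for c in columns if a[name].get(c) != b[name].get(c)}
        if diffs:
            changed[name] = diffs
    return only_a, only_b, changed
```

A typical call would be `compare_metadata(load_metadata("a.metadata.csv"), load_metadata("b.metadata.csv"))`, with both file names standing in for real outputs.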
The NetCDF file is in classic format but xarray is using the wrong backend.
Fix: Use engine='scipy':
ds = xr.open_dataset(file, engine='scipy')
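When the format of a file is not known in advance, the leading bytes can hint at which backend to request. The sketch below relies on the published file signatures (`CDF` plus a version byte for classic files, the 8-byte HDF5 signature for NetCDF4); `guess_engine` is a hypothetical helper, and it assumes the HDF5 signature sits at offset 0, which is the common case but not guaranteed.

```python
def guess_engine(path):
    """Suggest an xarray engine from the file's leading magic bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head[:3] == b"CDF" and head[3:4] in (b"\x01", b"\x02"):
        return "scipy"      # NetCDF3 classic or 64-bit offset
    if head == b"\x89HDF\r\n\x1a\n":
        return "h5netcdf"   # NetCDF4 files are HDF5 containers
    return None             # unknown: leave the choice to xarray
```

The result can be passed straight through, as in `xr.open_dataset(path, engine=guess_engine(path))`, since `engine=None` is xarray's default.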
The ncgen tool is not installed.
Fix: Install NetCDF tools:
# macOS
brew install netcdf
# Ubuntu/Debian
apt install netcdf-bin
xarray requires a backend to read NetCDF files.
Fix: Install scipy for NetCDF3:
uv add scipy
Or h5netcdf for NetCDF4:
uv add h5netcdf
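To find out which of these backends is missing before opening any file, the import system can be queried without loading the packages. A minimal sketch; `BACKENDS` and `missing_backends` are hypothetical names.

```python
from importlib.util import find_spec

# Package name -> the format it reads, as described in the dependency list.
BACKENDS = {"scipy": "NetCDF3 classic", "h5netcdf": "NetCDF4/HDF5"}


def missing_backends(packages=BACKENDS):
    """Return the backend packages that cannot be imported."""
    return [name for name in packages if find_spec(name) is None]
```

Each name returned corresponds to one `uv add` command from the fixes above.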
Command-line tool that extracts variable metadata from NetCDF files to CSV format. Run directly without reading into context. The script:
Writes output with a .metadata.csv extension alongside the original files