Regrid between rectilinear grids
[1]:
%matplotlib inline
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import numpy as np
import xarray as xr
import xesmf as xe
Prepare data
Input data
We regrid xarray’s built-in demo data. This data is also used by xarray plotting tutorial.
[2]:
ds = xr.tutorial.open_dataset(
"air_temperature"
) # use xr.tutorial.load_dataset() for xarray<v0.11.0
ds
[2]:
<xarray.Dataset> Dimensions: (lat: 25, time: 2920, lon: 53) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 Data variables: air (time, lat, lon) float32 ... Attributes: Conventions: COARDS title: 4x daily NMC reanalysis (1948) description: Data is from NMC initialized reanalysis\n(4x/day). These a... platform: Model references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
It is the air temperature data over US with 2920 time frames. Let’s plot the first frame:
[3]:
dr = ds["air"] # get a DataArray
[4]:
ax = plt.axes(projection=ccrs.PlateCarree())
dr.isel(time=0).plot.pcolormesh(ax=ax, vmin=230, vmax=300)
ax.coastlines()
[4]:
<cartopy.mpl.feature_artist.FeatureArtist at 0x7f3923d7f400>
Input grid
Its grid resolution is \(2.5^\circ \times 2.5^\circ\):
[5]:
ds["lat"].values, ds["lon"].values
[5]:
(array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. ,
47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5,
20. , 17.5, 15. ], dtype=float32),
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. ,
222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5,
245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. ,
267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5,
290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. ,
312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ],
dtype=float32))
Output grid
Say we want to downsample it to \(1.0^\circ \times 1.5^\circ\). Just define the output grid as an xarray Dataset
. Notice here that we take care of passing some attributes to the coordinate variables. This ensures xESMF and it’s underlying helper, cf-xarray, understand which is which.
[6]:
ds_out = xr.Dataset(
{
"lat": (["lat"], np.arange(16, 75, 1.0), {"units": "degrees_north"}),
"lon": (["lon"], np.arange(200, 330, 1.5), {"units": "degrees_east"}),
}
)
ds_out
[6]:
<xarray.Dataset> Dimensions: (lat: 59, lon: 87) Coordinates: * lat (lat) float64 16.0 17.0 18.0 19.0 20.0 ... 70.0 71.0 72.0 73.0 74.0 * lon (lon) float64 200.0 201.5 203.0 204.5 ... 324.5 326.0 327.5 329.0 Data variables: *empty*
Perform regridding
Make a regridder by xe.Regridder(grid_in, grid_out, method)
. grid
is just an xarray Dataset
containing lat
and lon
values. In most cases, 'bilinear'
should be good enough. For other methods see Comparison of 5 regridding algorithms.
[7]:
regridder = xe.Regridder(ds, ds_out, "conservative")
regridder # print basic regridder information.
[7]:
xESMF Regridder
Regridding algorithm: conservative
Weight filename: conservative_25x53_59x87.nc
Reuse pre-computed weights? False
Input grid shape: (25, 53)
Output grid shape: (59, 87)
Periodic in longitude? False
The regridder says it can transform data from shape (25, 53)
to shape (59, 87)
.
Regrid the DataArray
is straightforward:
[8]:
dr_out = regridder(dr, keep_attrs=True)
dr_out
[8]:
<xarray.DataArray 'air' (time: 2920, lat: 59, lon: 87)> array([[[296.1936 , 296.4933 , 296.64383, ..., 296.6239 , 296.57 , 296.35767], [295.9 , 296.09998, 296.19998, ..., 295.9 , 295.9 , 295.43332], [295.9 , 296.09998, 296.19998, ..., 295.9 , 295.9 , 295.43332], ..., [243.79999, 244.26666, 244.5 , ..., 233.63335, 235.29999, 237.96663], [243.79999, 244.26666, 244.5 , ..., 233.63335, 235.29999, 237.96663], [241.87102, 242.6313 , 243.01498, ..., 233.68292, 235.44838, 237.66943]], [[296.26776, 296.8064 , 297.0761 , ..., 296.19287, 296.17752, 296.2103 ], [296.19998, 296.53333, 296.69998, ..., 295.56668, 295.5 , 295.23334], [296.19998, 296.53333, 296.69998, ..., 295.56668, 295.5 , 295.23334], ... [249.89 , 249.49 , 249.29 , ..., 241.69 , 242.48999, 243.68997], [249.89 , 249.49 , 249.29 , ..., 241.69 , 242.48999, 243.68997], [246.84814, 246.2443 , 245.9487 , ..., 243.05266, 243.60286, 244.30954]], [[297.2945 , 297.6252 , 297.79266, ..., 296.21606, 296.0664 , 295.7324 ], [296.09 , 296.62332, 296.88998, ..., 295.69 , 295.69 , 295.35666], [296.09 , 296.62332, 296.88998, ..., 295.69 , 295.69 , 295.35666], ..., [249.89 , 249.49 , 249.29 , ..., 239.82333, 240.29 , 241.22331], [249.89 , 249.49 , 249.29 , ..., 239.82333, 240.29 , 241.22331], [246.3288 , 245.82306, 245.57744, ..., 241.16115, 241.18028, 241.57034]]], dtype=float32) Coordinates: * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 * lon (lon) float64 200.0 201.5 203.0 204.5 ... 324.5 326.0 327.5 329.0 * lat (lat) float64 16.0 17.0 18.0 19.0 20.0 ... 70.0 71.0 72.0 73.0 74.0 Attributes: long_name: 4xDaily Air temperature at sigma level 995 units: degK precision: 2 GRIB_id: 11 GRIB_name: TMP var_desc: Air temperature dataset: NMC Reanalysis level_desc: Surface statistic: Individual Obs parent_stat: Other actual_range: [185.16 322.1 ] regrid_method: conservative
The horizontal shape is now (59, 87)
, as expected. The regridding operation broadcasts over extra dimensions (time
here), so there are still 2920 time frames. lon
and lat
coordinate values are updated accordingly, and the value of the extra dimension time
is kept the same as input.
Important note: Extra dimensions must be on the left, i.e. (time, lev, lat, lon)
is correct but (lat, lon, time, lev)
would not work. Most data sets should have (lat, lon)
on the right (being the fastest changing dimension in the memory). If not, use DataArray.transpose
or numpy.transpose
to preprocess the data.
Check results on 2D map
The regridding result is consistent with the original data, with a much finer resolution:
[9]:
ax = plt.axes(projection=ccrs.PlateCarree())
dr_out.isel(time=0).plot.pcolormesh(ax=ax, vmin=230, vmax=300)
ax.coastlines()
[9]:
<cartopy.mpl.feature_artist.FeatureArtist at 0x7f391ace9eb0>
Check broadcasting over extra dimensions
xESMF tracks coordinate values over extra dimensions, since horizontal regridding should not affect them.
[10]:
dr_out["time"]
[10]:
<xarray.DataArray 'time' (time: 2920)> array(['2013-01-01T00:00:00.000000000', '2013-01-01T06:00:00.000000000', '2013-01-01T12:00:00.000000000', ..., '2014-12-31T06:00:00.000000000', '2014-12-31T12:00:00.000000000', '2014-12-31T18:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 Attributes: standard_name: time long_name: Time
[11]:
# exactly the same as input
xr.testing.assert_identical(dr_out["time"], ds["time"])
We can plot the time series at a specific location, to make sure the broadcasting is correct:
[12]:
plt.subplot(2, 1, 1)
dr.sel(lon=260, lat=40).plot() # input data
plt.subplot(2, 1, 2)
dr_out.sel(lon=260, lat=40).plot() # output data
[12]:
[<matplotlib.lines.Line2D at 0x7f391ac417c0>]