Regrid between rectilinear grids

[1]:
%matplotlib inline
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import numpy as np
import xarray as xr
import xesmf as xe

Prepare data

Input data

We regrid xarray’s built-in demo data. This data is also used by xarray plotting tutorial.

[2]:
ds = xr.tutorial.open_dataset('air_temperature') # use xr.tutorial.load_dataset() for xarray<v0.11.0
ds
[2]:
<xarray.Dataset>
Dimensions:  (lat: 25, lon: 53, time: 2920)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float32 ...
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

It is the air temperature data over US with 2920 time frames. Let’s plot the first frame:

[3]:
dr = ds['air']  # get a DataArray
[4]:
ax = plt.axes(projection=ccrs.PlateCarree())
dr.isel(time=0).plot.pcolormesh(ax=ax, vmin=230, vmax=300);
ax.coastlines();
../_images/notebooks_Rectilinear_grid_8_0.png

Input grid

Its grid resolution is \(2.5^\circ \times 2.5^\circ\):

[5]:
ds['lat'].values, ds['lon'].values
[5]:
(array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. ,
        47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5,
        20. , 17.5, 15. ], dtype=float32),
 array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. ,
        222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5,
        245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. ,
        267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5,
        290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. ,
        312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ],
       dtype=float32))

Output grid

Say we want to downsample it to \(1.0^\circ \times 1.5^\circ\). Just define the output grid as an xarray Dataset:

[6]:
ds_out = xr.Dataset({'lat': (['lat'], np.arange(16, 75, 1.0)),
                     'lon': (['lon'], np.arange(200, 330, 1.5)),
                    }
                   )
ds_out
[6]:
<xarray.Dataset>
Dimensions:  (lat: 59, lon: 87)
Coordinates:
  * lat      (lat) float64 16.0 17.0 18.0 19.0 20.0 ... 70.0 71.0 72.0 73.0 74.0
  * lon      (lon) float64 200.0 201.5 203.0 204.5 ... 324.5 326.0 327.5 329.0
Data variables:
    *empty*

Perform regridding

Make a regridder by xe.Regridder(grid_in, grid_out, method). grid is just an xarray DataSet containing lat and lon values. In most cases, 'bilinear' should be good enough. For other methods see Comparison of 5 regridding algorithms.

[7]:
regridder = xe.Regridder(ds, ds_out, 'bilinear')
regridder  # print basic regridder information.
Create weight file: bilinear_25x53_59x87.nc
[7]:
xESMF Regridder
Regridding algorithm:       bilinear
Weight filename:            bilinear_25x53_59x87.nc
Reuse pre-computed weights? False
Input grid shape:           (25, 53)
Output grid shape:          (59, 87)
Output grid dimension name: ('lat', 'lon')
Periodic in longitude?      False

The regridder says it can transform data from shape (25, 53) to shape (59, 87).

Regrid the DataArray is straightforward:

[8]:
dr_out = regridder(dr)
dr_out
[8]:
<xarray.DataArray 'air' (time: 2920, lat: 59, lon: 87)>
array([[[296.133997, 296.386693, ..., 296.433989, 296.199246],
        [295.978009, 296.182748, ..., 296.078024, 295.720987],
        ...,
        [243.27991 , 243.775195, ..., 235.339996, 237.632418],
        [242.240033, 242.879123, ..., 235.419992, 237.496416]],

       [[296.253996, 296.702038, ..., 296.039983, 296.017731],
        [296.21799 , 296.567677, ..., 295.680026, 295.51389 ],
        ...,
        [243.299945, 243.614048, ..., 232.720032, 234.519234],
        [242.700014, 243.038004, ..., 233.159978, 234.719252]],

       ...,

       [[297.189986, 297.623798, ..., 295.329987, 295.054879],
        [296.590054, 297.095963, ..., 295.170007, 294.798957],
        ...,
        [249.069873, 248.668228, ..., 242.790036, 243.748677],
        [247.430058, 246.932094, ..., 243.389967, 244.108711]],

       [[297.049976, 297.387852, ..., 295.989995, 295.690696],
        [296.410046, 296.844012, ..., 295.790018, 295.490757],
        ...,
        [248.929853, 248.553059, ..., 240.530029, 241.237198],
        [247.010071, 246.584907, ..., 241.009973, 241.453222]]])
Coordinates:
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
  * lon      (lon) float64 200.0 201.5 203.0 204.5 ... 324.5 326.0 327.5 329.0
  * lat      (lat) float64 16.0 17.0 18.0 19.0 20.0 ... 70.0 71.0 72.0 73.0 74.0
Attributes:
    regrid_method:  bilinear

The horizontal shape is now (59, 87), as expected. The regridding operation broadcasts over extra dimensions (time here), so there are still 2920 time frames. lon and lat coordinate values are updated accordingly, and the value of the extra dimension time is kept the same as input.

Important note: Extra dimensions must be on the left, i.e. (time, lev, lat, lon) is correct but (lat, lon, time, lev) would not work. Most data sets should have (lat, lon) on the right (being the fastest changing dimension in the memory). If not, use DataArray.transpose or numpy.transpose to preprocess the data.

Check results on 2D map

The regridding result is consistent with the original data, with a much finer resolution:

[9]:
ax = plt.axes(projection=ccrs.PlateCarree())
dr_out.isel(time=0).plot.pcolormesh(ax=ax, vmin=230, vmax=300);
ax.coastlines();
../_images/notebooks_Rectilinear_grid_24_0.png

Check broadcasting over extra dimensions

xESMF tracks coordinate values over extra dimensions, since horizontal regridding should not affect them.

[10]:
dr_out['time']
[10]:
<xarray.DataArray 'time' (time: 2920)>
array(['2013-01-01T00:00:00.000000000', '2013-01-01T06:00:00.000000000',
       '2013-01-01T12:00:00.000000000', ..., '2014-12-31T06:00:00.000000000',
       '2014-12-31T12:00:00.000000000', '2014-12-31T18:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Attributes:
    standard_name:  time
    long_name:      Time
[11]:
# exactly the same as input
xr.testing.assert_identical(dr_out['time'], ds['time'])

We can plot the time series at a specific location, to make sure the broadcasting is correct:

[12]:
plt.subplot(2,1,1)
dr.sel(lon=260, lat=40).plot()  # input data
plt.subplot(2,1,2)
dr_out.sel(lon=260, lat=40).plot()  # output data
[12]:
[<matplotlib.lines.Line2D at 0x7ff41e255cc0>]
../_images/notebooks_Rectilinear_grid_30_1.png

Clean-up

xESMF saves the regridder to the current directory so you don’t need to re-compute it next time (see Save time by reusing regridder). If you don’t need it anymore, you can just delete it:

[13]:
regridder.clean_weight_file()  # regridder.c + TAB would bring-up the command
Remove file bilinear_25x53_59x87.nc