{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Analysis using `Xarray`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook demonstrates some features of `xarray` that are useful for climate data analysis, including:\n", "\n", "1. Reading in multiple files at a time\n", "2. Averaging over dimensions to calculate an average in space, time, or over ensemble members\n", "3. Calculating a climatology and anomalies for monthly data\n", "4. Writing data to a netCDF file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Monthly Data\n", "\n", "In this notebook, we will work with monthly data as an example.\n", "\n", "We will return to the CMIP5 data, this time for surface temperature (`ts`), which corresponds to sea surface temperature over the ocean, from the RCP4.5 scenario produced by the NCAR/CCSM4 model. This time, we will read in all of the ensemble members at once.\n", "\n", "The data are located on the COLA servers in the following directory:\n", "```/shared/cmip5/data/rcp45/atmos/mon/Amon/ts/NCAR.CCSM4/r*i1p1/```\n", "\n", "The filename is:\n", "```ts_Amon_CCSM4_rcp45_r*i1p1_200601-210012.nc```\n", "\n", "The ensemble members are indicated by r1i1p1, r2i1p1, r3i1p1, r4i1p1, r5i1p1, and r6i1p1 in the directory and filename.\n", "\n", "In `xr.open_mfdataset`, we can simply use `*` in the filename and directory name to match all of the ensemble members."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "\n", "import numpy as np\n", "import xarray as xr\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Read multiple files using `xr.open_mfdataset`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set the path and filename, using `*` for the ensemble members." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "path='/shared/cmip5/data/rcp45/atmos/mon/Amon/ts/NCAR.CCSM4/r*i1p1/'\n", "fname='ts_Amon_CCSM4_rcp45_r*i1p1_200601-210012.nc'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read the data." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/shared/cmip5/data/rcp45/atmos/mon/Amon/ts/NCAR.CCSM4/r*i1p1/ts_Amon_CCSM4_rcp45_r*i1p1_200601-210012.nc\n" ] } ], "source": [ "ds=xr.open_mfdataset(path+fname,concat_dim='ensemble',combine='nested',decode_times=True)\n", "print(path+fname)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When reading the data, we need to tell `xarray` how to combine the files. Here, `combine='nested'` together with `concat_dim='ensemble'` tells it to concatenate the files along a new dimension called `ensemble`, with one entry per ensemble member." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset>\n", "Dimensions: (bnds: 2, ensemble: 6, lat: 382, lon: 288, time: 1140)\n", "Coordinates:\n", " * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8\n", " * lat (lat) float64 -90.0 -89.06 -89.06 -88.12 ... 89.06 89.06 90.0\n", " * time (time) object 2006-01-16 12:00:00 ... 2100-12-16 12:00:00\n", "Dimensions without coordinates: bnds, ensemble\n", "Data variables:\n", " time_bnds (ensemble, time, bnds) object dask.array<chunksize=(1, 1140, 2), meta=np.ndarray>\n", " lat_bnds (ensemble, lat, bnds) float64 dask.array<chunksize=(1, 382, 2), meta=np.ndarray>\n", " lon_bnds (ensemble, lon, bnds) float64 dask.array<chunksize=(1, 288, 2), meta=np.ndarray>\n", " ts (ensemble, time, lat, lon) float32 dask.array<chunksize=(1, 1140, 382, 288), meta=np.ndarray>\n", "Attributes:\n", " institution: NCAR (National Center for Atmospheric Resea...\n", " institute_id: NCAR\n", " experiment_id: rcp45\n", " source: CCSM4\n", " model_id: CCSM4\n", " forcing: Sl GHG Vl SS Ds SA BC MD OC Oz AA\n", " parent_experiment_id: historical\n", " parent_experiment_rip: r1i1p1\n", " branch_time: 2005.0\n", " contact: cesm_data@ucar.edu\n", " references: Gent P. R., et.al. 
2011: The Community Clim...\n", " initialization_method: 1\n", " physics_version: 1\n", " tracking_id: 635969e3-0203-402b-a58b-e3630cb58a30\n", " acknowledgements: The CESM project is supported by the Nation...\n", " cesm_casename: b40.rcp4_5.1deg.001\n", " cesm_repotag: ccsm4_0_beta49\n", " cesm_compset: BRCP45CN\n", " resolution: f09_g16 (0.9x1.25_gx1v6)\n", " forcing_note: Additional information on the external forc...\n", " processed_by: strandwg on mirage0 at 20111021\n", " processing_code_information: Last Changed Rev: 428 Last Changed Date: 20...\n", " product: output\n", " experiment: RCP4.5\n", " frequency: mon\n", " creation_date: 2011-10-21T21:56:36Z\n", " history: 2011-10-21T21:56:36Z CMOR rewrote data to c...\n", " Conventions: CF-1.4\n", " project_id: CMIP5\n", " table_id: Table Amon (26 July 2011) 976b7fd1d9e1be31d...\n", " title: CCSM4 model output prepared for CMIP5 RCP4.5\n", " parent_experiment: historical\n", " modeling_realm: atmos\n", " realization: 1\n", " cmor_version: 2.7.1" ], "text/plain": [ "
<xarray.Dataset>\n", "Dimensions: (bnds: 2, lat: 382, lon: 288, time: 1140)\n", "Coordinates:\n", " * lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8\n", " * lat (lat) float64 -90.0 -89.06 -89.06 -88.12 ... 89.06 89.06 90.0\n", " * time (time) datetime64[ns] 2006-01-16T12:00:00 ... 2100-12-16T12:00:00\n", "Dimensions without coordinates: bnds\n", "Data variables:\n", " lat_bnds (lat, bnds) float64 dask.array<chunksize=(382, 2), meta=np.ndarray>\n", " lon_bnds (lon, bnds) float64 dask.array<chunksize=(288, 2), meta=np.ndarray>\n", " ts (time, lat, lon) float32 dask.array<chunksize=(1140, 382, 288), meta=np.ndarray>" ], "text/plain": [ "