Introduction to plotting
Sign up to the DEA Sandbox to run this notebook interactively from a browser
Compatibility: Notebook currently compatible with both the
NCI
andDEA Sandbox
environmentsProducts used: ga_ls8cls9c_gm_cyear_3
Prerequisites: Users of this notebook should have a basic understanding of:
How to run a Jupyter notebook
The basic structure of the DEA satellite datasets
Inspecting available DEA products and measurements
How to load data from DEA
Background
Data visualisation is an important component of working with Earth Observation data. The xarray
Python package provides a range of straightforward data plotting options that allow users to quickly generate simple plots from multi-dimensional datasets. To generate more complex and informative plots, the DEA Notebooks repository also provides a custom plotting module with additional easy-to-use functionality.
Description
This introductory notebook demonstrates how to visualise DEA satellite data returned from running a datacube query. The notebook demonstrates commonly used xarray
plotting methods, as well as custom functions provided in the dea_tools.plotting module.
Topics covered in this notebook include:
View an area of interest prior to querying the datacube
Querying the datacube and loading data
Plotting single band data (e.g. a single satellite band)
Selecting and plotting individual timesteps
Plotting multiple timesteps
Customising plot appearance
Plotting three-band true or false colour imagery
Plotting single timesteps
Plotting multiple timesteps
Customising plot appearance
Getting started
To run this introduction to plotting data loaded from the datacube, run all the cells in the notebook starting with the “Load packages” cell. For help with running notebook cells, refer back to the Jupyter Notebooks notebook.
Load packages
The first step is to run %matplotlib inline
, which ensures figures plot correctly in the Jupyter notebook. Next, load the datacube
package to enable loading data, and a selection of custom DEA functions from the dea_tools.plotting module.
[1]:
%matplotlib inline
import datacube
import sys
sys.path.insert(1, '../Tools/')
from dea_tools.plotting import display_map, rgb
Connect to the datacube
The next step is to connect to the datacube database. The resulting dc
datacube object can then be used to load data. The app
parameter is a unique name used to identify the notebook that does not have any effect on the analysis.
[2]:
dc = datacube.Datacube(app="05_Plotting")
Analysis parameters
The following variables are required to establish a query for this notebook: - lat_range
: The latitude range to analyse (e.g. (-27.715, -27.755)
). For reasonable load times, keep this to a range of ~0.1 degrees or less. - lon_range
: The longitude range to analyse (e.g. (153.42, 153.46)
). For reasonable load times, keep this to a range of ~0.1 degrees or less. - time_range
: The date range to analyse (e.g. ("2013", "2017")
).
[3]:
lat_range = (-27.58, -27.666)
lon_range = (153.3, 153.4)
time_range = ("2013", "2017")
View the queried location
Before running a query and extracting and analysing data, it is useful to double-check that your location is correct. The display_map()
function shows your selected area as a red rectangle on an interactive map. Clicking on any point of the map will reveal the latitude and longitude coordinates of that point.
[4]:
display_map(x=lon_range, y=lat_range)
[4]:
Query and view data
The variables specified above are used to load data from the DEA datacube through the dc.load()
, which was introduced in the Loading data notebook. This introduction will continue to use the GeoMAD product, as used in the Loading data notebook.
[5]:
ds = dc.load(product="ga_ls8cls9c_gm_cyear_3",
x=lon_range,
y=lat_range,
time=time_range)
ds
[5]:
<xarray.Dataset> Size: 18MB Dimensions: (time: 5, y: 372, x: 375) Coordinates: * time (time) datetime64[ns] 40B 2013-07-02T11:59:59.999999 ... 20... * y (y) float64 3kB -3.165e+06 -3.165e+06 ... -3.176e+06 * x (x) float64 3kB 2.066e+06 2.066e+06 ... 2.077e+06 2.077e+06 spatial_ref int32 4B 3577 Data variables: nbart_blue (time, y, x) int16 1MB 236 244 235 230 224 ... 286 275 276 287 nbart_green (time, y, x) int16 1MB 354 369 358 350 343 ... 392 374 371 394 nbart_red (time, y, x) int16 1MB 284 286 278 273 255 ... 353 334 326 346 nbart_nir (time, y, x) int16 1MB 2484 2567 2580 2588 ... 1959 2048 2223 nbart_swir_1 (time, y, x) int16 1MB 1069 1093 1054 1006 ... 1230 1149 1200 nbart_swir_2 (time, y, x) int16 1MB 470 473 454 432 411 ... 657 581 530 543 sdev (time, y, x) float32 3MB 0.0006154 0.0001635 ... 0.001664 edev (time, y, x) float32 3MB 116.4 86.19 80.38 ... 162.9 156.3 bcdev (time, y, x) float32 3MB 0.02473 0.01709 ... 0.03776 0.03605 count (time, y, x) int16 1MB 8 8 8 8 8 8 8 ... 11 11 11 11 11 11 11 Attributes: crs: EPSG:3577 grid_mapping: spatial_ref
Plotting single band images
The xarray
package provides built-in methods for plotting individual data variables or measurements.
To do this, identify the band to plot. In this example, the nbart_swir_1
satellite band is used. To plot a single band, the data must be an xarray.DataArray
(to revise the difference between xarray.Dataset
and xarray.DataArray
objects, refer back to the Loading data notebook):
[6]:
ds.nbart_swir_1
[6]:
<xarray.DataArray 'nbart_swir_1' (time: 5, y: 372, x: 375)> Size: 1MB array([[[1069, 1093, 1054, ..., 38, 37, 37], [1085, 1074, 1021, ..., 37, 36, 36], [1067, 1036, 972, ..., 34, 34, 34], ..., [1249, 1294, 882, ..., 1097, 1092, 1139], [ 810, 801, 745, ..., 1095, 1066, 1145], [1123, 882, 865, ..., 1073, 1016, 1086]], [[1128, 1139, 1110, ..., 46, 45, 47], [1193, 1141, 1077, ..., 47, 45, 47], [1221, 1144, 1053, ..., 47, 45, 47], ..., [1355, 1344, 906, ..., 1107, 1105, 1177], [ 873, 895, 805, ..., 1100, 1070, 1171], [1184, 934, 871, ..., 1086, 1021, 1088]], [[1015, 1044, 1014, ..., 34, 33, 35], [1055, 1045, 977, ..., 33, 33, 30], [1078, 1014, 944, ..., 34, 33, 31], ..., [1237, 1343, 977, ..., 1066, 1038, 1095], [ 799, 828, 832, ..., 1072, 1045, 1114], [1225, 885, 843, ..., 1056, 999, 1067]], [[1068, 1071, 1039, ..., 38, 38, 38], [1114, 1081, 1012, ..., 37, 37, 37], [1140, 1061, 987, ..., 37, 37, 36], ..., [1287, 1343, 980, ..., 1107, 1088, 1161], [ 873, 914, 954, ..., 1101, 1059, 1157], [1227, 943, 896, ..., 1090, 1029, 1103]], [[1071, 1068, 1016, ..., 54, 55, 54], [1063, 1048, 975, ..., 55, 57, 56], [1094, 1032, 946, ..., 56, 58, 57], ..., [1257, 1269, 900, ..., 1250, 1193, 1228], [ 898, 880, 899, ..., 1244, 1176, 1238], [1232, 969, 939, ..., 1230, 1149, 1200]]], dtype=int16) Coordinates: * time (time) datetime64[ns] 40B 2013-07-02T11:59:59.999999 ... 201... * y (y) float64 3kB -3.165e+06 -3.165e+06 ... -3.176e+06 -3.176e+06 * x (x) float64 3kB 2.066e+06 2.066e+06 ... 2.077e+06 2.077e+06 spatial_ref int32 4B 3577 Attributes: units: 1 nodata: -999 crs: EPSG:3577 grid_mapping: spatial_ref
Selecting and plotting a single timestep
The returned object header above specifies that ds.swir1
is a xarray.DataArray
with five timesteps (i.e. <xarray.DataArray 'swir1' (time: 5, y: 446, x: 450)>
). To make a plot for a single timestep only, select the desired timestep using one of the following options:
.isel()
: This stands for “index selection”, which selects individual timesteps from a dataset based on the sequence of loaded timesteps. Counting in Python begins at 0, so to select the first timestep in thexarray.DataArray
we can specify.isel(time=0)
:
[7]:
first_timestep = ds.nbart_swir_1.isel(time=0)
first_timestep
[7]:
<xarray.DataArray 'nbart_swir_1' (y: 372, x: 375)> Size: 279kB array([[1069, 1093, 1054, ..., 38, 37, 37], [1085, 1074, 1021, ..., 37, 36, 36], [1067, 1036, 972, ..., 34, 34, 34], ..., [1249, 1294, 882, ..., 1097, 1092, 1139], [ 810, 801, 745, ..., 1095, 1066, 1145], [1123, 882, 865, ..., 1073, 1016, 1086]], dtype=int16) Coordinates: time datetime64[ns] 8B 2013-07-02T11:59:59.999999 * y (y) float64 3kB -3.165e+06 -3.165e+06 ... -3.176e+06 -3.176e+06 * x (x) float64 3kB 2.066e+06 2.066e+06 ... 2.077e+06 2.077e+06 spatial_ref int32 4B 3577 Attributes: units: 1 nodata: -999 crs: EPSG:3577 grid_mapping: spatial_ref
.sel()
: This selects data using real-world coordinate labels liketime
. For example, from the Coordinates section, the first timestep (i.e. the year 2013) is selected from thexarray.DataArray
by specifying.sel(time='2013')
:
[8]:
first_timestep = ds.nbart_swir_1.sel(time='2013')
first_timestep
[8]:
<xarray.DataArray 'nbart_swir_1' (time: 1, y: 372, x: 375)> Size: 279kB array([[[1069, 1093, 1054, ..., 38, 37, 37], [1085, 1074, 1021, ..., 37, 36, 36], [1067, 1036, 972, ..., 34, 34, 34], ..., [1249, 1294, 882, ..., 1097, 1092, 1139], [ 810, 801, 745, ..., 1095, 1066, 1145], [1123, 882, 865, ..., 1073, 1016, 1086]]], dtype=int16) Coordinates: * time (time) datetime64[ns] 8B 2013-07-02T11:59:59.999999 * y (y) float64 3kB -3.165e+06 -3.165e+06 ... -3.176e+06 -3.176e+06 * x (x) float64 3kB 2.066e+06 2.066e+06 ... 2.077e+06 2.077e+06 spatial_ref int32 4B 3577 Attributes: units: 1 nodata: -999 crs: EPSG:3577 grid_mapping: spatial_ref
After selecting a timestep, the .plot()
method can be used to display the chosen nbart_swir_1
data:
[9]:
first_timestep.plot()
[9]:
<matplotlib.collections.QuadMesh at 0x7ff3d0c7faf0>
Plotting multiple timesteps
It is often useful to produce plots for a single measurement across time, for example to compare change between satellite observations or summary datasets. To plot multiple images, skip the isel()
step above and plot the entire xarray.DataArray
directly.
To plot multiple timesteps in one figure, it is necessary to instruct the .plot()
function to put each timestep in a different column. This is done by specifying .plot(col="time")
:
[10]:
ds.nbart_swir_1.plot(col="time")
[10]:
<xarray.plot.facetgrid.FacetGrid at 0x7ff3a7e1c280>
This kind of plotting is called “facetted plotting”. For more information, refer to the xarray documentation
Customising plot appearance
The plots above are dark and difficult to see clearly. To improve the appearance of xarray
plots, use the robust=True
argument to optimise the plot colours by clipping extreme values or outliers. This will use the 2nd and 98th percentiles of the data to compute the color limits:
[11]:
ds.nbart_swir_1.plot(col="time", robust=True)
[11]:
<xarray.plot.facetgrid.FacetGrid at 0x7ff3a7a76c50>
Plots can be further customised by adding custom colour maps/styles using the cmap
parameter.
When choosing a colour map for a plot, it is important to choose a set of colours that are perceived logically by the human eye. The best colour maps are “perceptually uniform”: these colour maps increase logically from dark to light colours, where equal increases in lightness/darkness correspond to equal changes in data values. Some best-practice perceptually uniform colour maps include:
"viridis", "plasma", "inferno", "magma", "cividis"
For further reading about perceptually uniform colour maps in data visualisation, refer to the matplotlib documentation
It is also important to consider colour blindness when selecting a colour map. xarray
supports many colour maps from the “colorbrewer” family of colour maps which are optimised for colour blindness. You can use the interactive online tool to browse all available colour maps, or choose from one of the following commonly used options:
"Greys", "Purples", "Blues", "Greens", "Oranges", "Reds",
"YlOrBr", "YlOrRd", "OrRd", "PuRd", "RdPu", "BuPu",
"GnBu", "PuBu", "YlGnBu", "PuBuGn", "BuGn", "YlGn"
For a full list of available colour maps you can refer to this list.
The example cell below plots the data with the perceptually uniform magma
colour map:
[12]:
ds.nbart_swir_1.plot(col="time", robust=True, cmap="magma")
[12]:
<xarray.plot.facetgrid.FacetGrid at 0x7ff3a7bfb2e0>
Plotting true or false colour RGB images
Although xarray
makes it easy to plot single band images, plotting a three band colour photo-like image is less straightforward.
To make this easier, the dea-notebooks repository provides a custom rgb()
function that is designed for plotting three band images. The rgb()
function maps three data variables/measurements from the loaded dataset to the red, green and blue channels that are used to make a three-colour image.
Providing the nbart_red
, nbart_green
and nbart_blue
measurements from a dataset will produce a true colour image (akin to how humans view the landscape). Providing nbart_nir
, nbart_red
and nbart_green
measurements or any other set of three satellite bands from a dataset will produce a false colour image.
Hence, the rgb()
function can be used to visualise the data returned by a query. It requires the minimum input of:
ds:
Thexarray.Dataset
objectbands:
Three bands for display (these must be measurements found in the dataset)index:
The timestep to view, default is0
Plotting a single timestep
The time dimension of an xarray.Dataset
describes how many timesteps exist for the loaded data. In the rgb()
function, the index
variable is asking for which timesteps to view (similar to the isel()
example above). Remember: counting in Python begins at 0 so to view the earliest timesteps set index=0
:
[13]:
# View a red, green, blue (true colour) image of the first timestep
rgb(ds, bands=["nbart_red", "nbart_green", "nbart_blue"], index=0)
It is possible to change the input bands to plot a false colour image, which can provide different insights in a landscape. The false colour band combination (nbart_swir_1
, nbart_nir
, nbart_green
) emphasises growing vegetation in green, and water in deep blue:
[14]:
# View a swir1, nir, green (false colour) image of the first timestep
rgb(ds, bands=['nbart_swir_1', 'nbart_nir', 'nbart_green'], index=0)
Plotting multiple timesteps
As discussed in the single band example above, it can be useful to visualise multiple timesteps in a single plot (e.g. to compare change over time).
The rgb()
function can also do this, as long as a list of timesteps to view is provided to the index
argument, e.g. index=[X1, X2, ...]
. The example cell below plots the first and fifth image in the dataset using index=[0, 4]
(remembering that counting in Python starts at 0):
[15]:
# View a true colour image for the first and fifth timesteps
rgb(ds, bands=['nbart_red', 'nbart_green', 'nbart_blue'], index=[0, 4])
It is also possible to use rgb()
to plot all timesteps in a dataset using the col="time"
syntax demonstrated in the single band example above:
[16]:
# Plot all timesteps in the dataset
rgb(ds, bands=['nbart_red', 'nbart_green', 'nbart_blue'], col="time")
Customising plot appearance
By default, rgb()
generates plots with robust=True
to improve the appearance of the images by clipping out the darkest and brightest 2% of pixels, using the 2nd and 98th percentiles of the data to compute the colour limits
If this default provides poor results, the plot’s colour stretch can be customised using the percentile_stretch
parameter. This clips the most extreme minimum and maximum values in the dataset, improving the contrast and appearance of the plot.
For example, specifying percentile_stretch=[0.05, 0.95]
will clip out the darkest and brightest 5% of pixels, focusing the colour stretch on the remaining 90% of less extreme values:
[17]:
rgb(ds,
bands=['nbart_red', 'nbart_green', 'nbart_blue'],
index=0,
percentile_stretch=[0.05, 0.95])
Recommended next steps
To continue working through the notebooks in this beginner’s guide, the following notebooks are designed to be worked through in the following order:
Plotting (this notebook)
For advanced plotting and visualisation options, see:
The dea_tools.plotting module containing functions used for plotting and visualising DEA data.
Once you have you have completed the beginner tutorials, join advanced users in exploring:
The “DEA products” directory in the repository, where you can explore DEA products in depth.
The “How_to_guides” directory, which contains a recipe book of common techniques and methods for analysing DEA data.
The “Real_world_examples” directory, which provides more complex workflows and analysis case studies.
Additional information
License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Australia data is licensed under the Creative Commons by Attribution 4.0 license.
Contact: If you need assistance, please post a question on the Open Data Cube Discord chat or on the GIS Stack Exchange using the open-data-cube
tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on
GitHub.
Last modified: June 2024
Compatible datacube version:
[18]:
print(datacube.__version__)
1.8.6
Tags
Tags: sandbox compatible, NCI compatible, landsat 8, annual geomedian, plotting, dea_plotting, rgb, beginner, data visualisation, colour maps, isel, sel