An Overview of Malaria Atlas Project (MAP): Landcover dataset using Google Earth Engine

Authors

Table of Contents

1. Purpose

2. Dataset Description

3. Data I/O

4. Metadata Display and Basic Visualization

5. Use Case Examples

6. Create Binder Environment

7. Issues

8. References

Notebook Purpose

The purpose of this notebook is to provide an introductory overview to using the Malaria Atlas Project (MAP) Landcover data. An important tool in our arsenal to collectively address climate change and ecological degradation is the ability to monitor large-scale temporal and spatial changes in land cover. This gives us a broader picture of how the Earth's land has changed, and provides an important step in understanding how human activities are driving and influencing these changes.

The dataset allows us to visualize global land cover classification each year from 2001-2013 according to the International Geosphere-Biosphere Programme (IGBP) classification scheme, and can be broadly used for any large-scale land classification analysis where the metric of interest falls in this time period.

This notebook will use the Google Earth Engine (GEE) API to walk through loading in the dataset. We will explore the changes in South Sumatran forest cover from 2001 to 2012 using the MAP data and compare these results to the underlying MODIS MCD12Q1 data. We will then use only the MODIS MCD12Q1 data to analyze the effects of the 2019/2020 fires on the landcover of Kangaroo Island as a method of comparing the MAP data product with its underlying MODIS data.

Dataset Description

Using the International Geosphere-Biosphere Programme (IGBP) layer from the MODIS annual landcover product (MCD12Q1) as the foundational layer, Henry Gibson and Daniel Weiss from the Malaria Atlas Project at the University of Oxford created the MAP dataset, enabling visualization of the global change in landcover covering a 12-year period from 2001 to 2013. The MODIS layer has an annual temporal resolution and an approximate spatial resolution of 500m. It was aggregated to an approximate spatial resolution of 5000m and converted to a fractional product indicating the integer percentage of each output pixel covered by each of the 17 landcover classes. Each cluster of pixels from the higher-resolution MODIS dataset is also assigned an overall class based on the dominant pixel type in that cluster. For example, a 5km by 5km area in the MODIS data with pixels consisting of 50% water, 30% snow and ice, and 20% urban and built-up areas will be assigned an overall class of water in the MAP dataset. Read on its own, the overall class will systematically overestimate the actual area of the dominant class.
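The dominant-class rule can be illustrated with a small, self-contained sketch. The function name and the dictionary layout are hypothetical and are not taken from the MAP processing code; the percentages are those of the example in the text.

```python
def dominant_class(fractions):
    """Return the class with the largest percent cover in one output pixel."""
    return max(fractions, key=fractions.get)


# Percent cover of each class inside one output pixel (example from the text).
pixel = {"Water": 50, "Snow_And_Ice": 30, "Urban_And_Built_Up": 20}

overall = dominant_class(pixel)
# Treating the pixel as 100% of its overall class overstates that class
# by this many percentage points.
overestimate = 100 - pixel[overall]

print(overall, overestimate)
```

Here half of the pixel is not water at all, which is why analyses of class area are better served by the fractional bands than by the overall class.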

We will also briefly describe the MODIS MCD12Q1 annual landcover product. This data product is the foundational layer for the MAP landcover data and provides an interesting baseline with which we can compare our results, to better understand the consistency between data products and their derivatives. MCD12Q1 contains 5 different classification schemes for landcover (IGBP, UMD, LAI, BGC, and Annual Plant Functional Types Classification). The MAP dataset uses the first of these classification schemes, the Annual International Geosphere-Biosphere Programme (IGBP) classification. There are 17 different land cover classes in this scheme (see here), each using a different threshold to classify the pixels.

img1.png

Fig 1. Malaria Atlas Project - Landcover Pixel Classification Scheme | source

Dataset Input/Output

First, we import our necessary packages.
We use Earth Engine (ee) to access our data through the GEE API, geemap, matplotlib and cartopy for visualization/mapping, along with pandas and numpy for data manipulation.

Next, we initialize our notebook's communication with GEE. Since we have already authenticated, we leave the authentication call commented out.

Now, we load in all of our data.
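A minimal sketch of the initialization and loading steps might look as follows. The asset IDs are read off the catalog pages listed in the references; the helper name `load_collections` is hypothetical, and the import is placed inside the function only so that the snippet can be loaded without Earth Engine installed.

```python
# Asset IDs taken from the GEE catalog pages listed in the references.
MAP_ID = "Oxford/MAP/IGBP_Fractional_Landcover_5km_Annual"
MODIS_ID = "MODIS/006/MCD12Q1"


def load_collections():
    """Initialize Earth Engine and return the (MAP, MODIS) image collections."""
    import ee  # lazy import: only required when this function is called

    # ee.Authenticate()  # uncomment on first use; credentials are cached afterwards
    ee.Initialize()  # assumption: newer client versions may also need a project argument
    return ee.ImageCollection(MAP_ID), ee.ImageCollection(MODIS_ID)
```

Calling `load_collections()` once at the top of the notebook yields both collections for the use cases that follow.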

Metadata Display and Basic Visualization

Metadata

We're arbitrarily choosing 2012 as our year of interest to make a simple visualization.
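One way to pick a single year out of an annual collection is to filter by date and take the first image. The sketch below assumes `collection` is an `ee.ImageCollection`; both helper names are hypothetical.

```python
def year_bounds(year):
    """Return (start, end) ISO dates covering one calendar year; end is exclusive."""
    return f"{year}-01-01", f"{year + 1}-01-01"


def image_for_year(collection, year):
    """Pick the first image of an ee.ImageCollection that falls within `year`."""
    start, end = year_bounds(year)
    # filterDate treats the start as inclusive and the end as exclusive.
    return collection.filterDate(start, end).first()
```

The band names of the returned image can then be listed with `bandNames().getInfo()` to inspect the metadata.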

Simple Visualization

Use Case Examples

Use Case #1

We are interested in exploring global histories of deforestation. We select Sumatra as a use-case example, since we know that this area has experienced large-scale deforestation, largely due to palm oil and pulp plantations$^4$.

To visualize trends in land cover change, we first build a data frame of band percent cover per year of observation (2002-2012), where band names represent land cover categories. We then graph the individual bands of the data frame across a time axis to directly compare their fluctuations. Finally, we map the Evergreen Needleleaf Forest band over Sumatra using three distinct layers, for the years 2002, 2006, and 2012. Mapping multiple layers lets us switch between forest landcover for different years directly on the map display, which makes comparison easier because the visible differences between separate images are subtle.
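The data-frame step can be sketched as below. This is an assumption about how such a table could be assembled, not the notebook's original code: `mean_cover` uses `reduceRegion` with a mean reducer to get one value per band over a region, and `cover_table` (a hypothetical helper) arranges those per-year dictionaries into rows. The demo numbers are placeholders, not results from the dataset.

```python
def cover_table(stats_by_year):
    """Turn {year: {band: percent}} into rows sorted by year, one key per band."""
    bands = sorted({band for stats in stats_by_year.values() for band in stats})
    rows = []
    for year in sorted(stats_by_year):
        row = {"year": year}
        for band in bands:
            # Bands missing for a given year are recorded as None.
            row[band] = stats_by_year[year].get(band)
        rows.append(row)
    return rows


def mean_cover(image, region, scale=5000):
    """Mean value of every band of an ee.Image over an ee.Geometry, as a dict."""
    import ee  # lazy import: only needed when querying Earth Engine

    stats = image.reduceRegion(
        reducer=ee.Reducer.mean(), geometry=region, scale=scale, maxPixels=1e9
    )
    return stats.getInfo()


# Illustrative placeholder values only, not results from the dataset.
demo = {2006: {"Croplands": 12.0}, 2002: {"Croplands": 10.0, "Savannas": 4.0}}
rows = cover_table(demo)
```

The resulting rows can be passed to `pandas.DataFrame(rows).set_index("year")`, whose `plot()` method draws one line per band across the time axis.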

Although our results display visual evidence of deforestation, there is concern that the dataset's resolution (5km) is too coarse to capture informative nuances of deforestation. To explore this issue further, we compare our original visualization with the dataset's foundational layer, MODIS MCD12Q1, which has a finer resolution of 500m.

Our data below suggest that between 2001 and 2012 deforestation in Sumatra was minimal, with forest cover changing from 55% to 52%. We know, however, that deforestation is estimated to be much higher: according to the World Wildlife Fund, Sumatra lost 50% of its forest in the past 22 years$^5$. For our second use case, we look at the underlying data, which has a higher resolution, to see if we can make out this trend.
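To make the size of this gap concrete, the change from 55% to 52% can be expressed both in percentage points and relative to the 2001 forest cover. The variable names are illustrative; note that the WWF figure covers a different (22-year) period, so the two numbers are not directly comparable.

```python
forest_2001 = 55.0  # percent forest cover reported for 2001
forest_2012 = 52.0  # percent forest cover reported for 2012

absolute_change = forest_2012 - forest_2001            # percentage points
relative_change = 100 * absolute_change / forest_2001  # percent of the 2001 forest

print(f"{absolute_change:+.1f} points, {relative_change:+.1f}% relative")
```

A loss of 3 percentage points corresponds to roughly 5.5% of the 2001 forest, an order of magnitude below the 50% loss cited above.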

Below, we map just the forest landcover changes, where greener areas correspond to more forested areas.

Use Case #2

In our second use case, we dive into the MODIS data underlying the MAP landcover data used above to see if we can better discern deforestation in Sumatra. We use this data because, as described in our dataset description, it has a much finer spatial resolution (and a longer temporal coverage).

Similar to our analysis with the previous data, let's explore a couple of years to look at forest coverage.
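Unlike the fractional MAP bands, MCD12Q1 stores one class code per pixel, so forest coverage has to be derived from the codes. The sketch below assumes the IGBP band is named `LC_Type1` and that codes 1 to 5 are the five forest classes, as listed on the catalog page; the helper names are hypothetical.

```python
# IGBP class codes 1-5 in the MCD12Q1 LC_Type1 band are the five forest classes.
FOREST_CODES = range(1, 6)


def is_forest(code):
    """True if an LC_Type1 class code denotes one of the IGBP forest classes."""
    return code in FOREST_CODES


def forest_mask(image):
    """Binary ee.Image: 1 where LC_Type1 is a forest class, 0 elsewhere."""
    lc = image.select("LC_Type1")
    return lc.gte(FOREST_CODES.start).And(lc.lte(FOREST_CODES.stop - 1))
```

Applying `forest_mask` to the images for two different years gives layers that can be compared directly on the map.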

Use Case #3

Our third use case similarly utilizes the MCD12Q1 dataset to examine landcover effects of the 2019-2020 bushfire on Kangaroo Island, Australia.

This historic fire burned nearly half of Kangaroo Island, killing two people and 60,000 livestock and destroying 87 homes$^6$. The mapping below is structured like our previous use cases; however, we filter layers for the years 2018 and 2020 to highlight the stark landcover contrast that occurs over a small temporal range.
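Beyond comparing the two maps by eye, the before/after contrast can be summarized by tallying which class each pixel moved from and to. The sketch below works on plain lists of class codes, however they were exported; the function name is hypothetical and the five sample pixels are made up for illustration (in the IGBP scheme, 5 is Mixed Forests, 8 is Woody Savannas, and 9 is Savannas).

```python
from collections import Counter


def transitions(before, after):
    """Count (class_before, class_after) pairs for pixels whose class changed."""
    if len(before) != len(after):
        raise ValueError("both years must cover the same pixels")
    return Counter((b, a) for b, a in zip(before, after) if b != a)


# Illustrative class codes for five pixels, not values read from MCD12Q1.
lc_2018 = [8, 8, 5, 9, 8]
lc_2020 = [9, 9, 9, 9, 8]
changed = transitions(lc_2018, lc_2020)
```

Sorting the result with `changed.most_common()` lists the most frequent transitions first.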

It is worth noting that in both our second and third use case examples, we see increases in Savannas coverage (defined as tree cover between 10-30%, canopy >2m). Contextually, we can attribute these transitions to deforestation and fire, respectively, but the imagery itself is not sufficient to distinguish between these causes.

Issues

There were some issues with the MAP dataset that may make it unsuitable for some analyses. To begin with, the colors are improperly mapped to their respective landcover classes, which is noticeable on the visualized map and the built-in legend. Greenland and Antarctica are classified as cropland, and one of the most extensive urban areas on the planet apparently extends through Northern Canada. We tried several different methods to update the existing legend or create a new one, but were unsuccessful.

The spatial resolution is also not clearly documented on Google Earth Engine, and it wasn't until we visualized the data that we noticed that it is 5000m instead of 500m as in the underlying MCD12Q1 dataset. This causes each pixel to dramatically overestimate the extent of the dominant land class. For example, much of inland Canada appears to be underwater. However, it should be noted that even though each pixel is classified as one landcover class, it still retains the relative percentages of each of the other landcover classes in the dataframe. This enables the researcher to perform analyses on the non-dominant landcover classes for each pixel, and greatly expands the range of analyses for which this dataset can be used.

The temporal resolution is also extremely coarse - yearly - making this dataset potentially unsuitable for analyses needing finer temporal and spatial scales.

References

  1. Landcover Data, Google Earth Engine - https://developers.google.com/earth-engine/datasets/catalog/Oxford_MAP_IGBP_Fractional_Landcover_5km_Annual#description
  2. Malaria Atlas Project - https://malariaatlas.org/
  3. MODIS MCD12Q1 - https://developers.google.com/earth-engine/datasets/catalog/MODIS_006_MCD12Q1#bands
  4. WWF International - https://wwfint.awsassets.panda.org/downloads/deforestation_fronts_factsheet___sumatra.pdf
  5. WWF International - https://wwf.panda.org/discover/knowledge_hub/where_we_work/sumatra/#:~:text=About%2012%20million%20hectares%20of,tigers%20left%20in%20the%20wild
  6. Wildfire Today - https://wildfiretoday.com/2021/03/25/report-released-for-the-bushfire-that-burned-much-of-kangaroo-island-in-south-australia/