Extracting data


Now that we know how to (a) load ee.Image()s and filter ee.ImageCollection()s and (b) create and download ee.FeatureCollection()s, this short section shows how to bring the two together. Specifically, we show how to extract the values of a single image at the locations of a feature collection, and then use the functions described earlier to download the result. As an example we again use our two points stored in fc and the GFW data.

import ee
import geemap
try:
    # Reuse stored credentials if they exist
    ee.Initialize()
except Exception:
    # Otherwise authenticate first, then initialize
    ee.Authenticate()
    ee.Initialize()
fc
    • type: FeatureCollection
    • columns:
      • ID: Integer
      • system:index: String
    • features:
      • type: Feature
        • id: 0
        • geometry: Point, coordinates [-60, -20]
        • properties: ID = 1
      • type: Feature
        • id: 1
        • geometry: Point, coordinates [-61, -21]
        • properties: ID = 2
gfw = ee.Image("UMD/hansen/global_forest_change_2023_v1_11").select(['treecover2000'])
gfw
    • type: Image
    • id: UMD/hansen/global_forest_change_2023_v1_11
    • version: 1711144720400111
    • bands:
      • id: treecover2000
        • crs: EPSG:4326
        • crs_transform: [0.00025, 0, -180, 0, -0.00025, 80]
        • data_type: PixelType (min 0, max 255, precision int)
        • dimensions: [1440000, 560000]
    • properties:
      • system:asset_size: 1358313037330
      • system:footprint: LinearRing [[-180, -90], [180, -90], [180, 90], [-180, 90], [-180, -90]]
      • system:time_end: 1672444800000
      • system:time_start: 946684800000

The code needed for this is straightforward:

vals = gfw.sampleRegions(collection=fc, properties=['ID'], scale=30, tileScale=16, geometries=False)

You can find more detail in the documentation, but here is a brief description of the function and its arguments:

  • gfw.sampleRegions(): samples the image (all of its bands; here only treecover2000, which we selected above) at the pixels that intersect each region, and returns the result as a feature collection

  • collection: the features (regions) for which we want to extract values

  • properties: a list of properties from the original feature collection that should be copied to the output

  • scale: the nominal spatial resolution in meters at which the values are extracted. Since the GFW data are derived from Landsat, we use 30 m. For Sentinel-2 data, for example, we would use 10 for its 10 m bands

  • tileScale: a scaling factor that makes Earth Engine split the computation into smaller tiles. Larger values can help when a request with many points or a large overall extent would otherwise run out of memory

  • geometries: whether the geometries should be included in the output. With many points or complex geometries this increases the data size substantially, so we set it to False here.
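
To get a feeling for what "sampling an image at a point" means, the following minimal sketch maps our two example points onto the native pixel grid of the treecover2000 band, using the crs_transform and dimensions shown in the image metadata above. The helper name lonlat_to_pixel is hypothetical and not part of Earth Engine. Earth Engine handles the projection and the requested scale itself, so this is only an illustration of the idea and not a description of what sampleRegions() does internally.

```python
import math

# Affine transform of the treecover2000 band, copied from the image metadata:
# (x_scale, x_shear, x_origin, y_shear, y_scale, y_origin)
CRS_TRANSFORM = (0.00025, 0, -180, 0, -0.00025, 80)
DIMENSIONS = (1440000, 560000)  # (width, height) in pixels


def lonlat_to_pixel(lon, lat, transform=CRS_TRANSFORM):
    """Return the (column, row) of the native-grid pixel containing a point."""
    x_scale, _, x_origin, _, y_scale, y_origin = transform
    # Work with whole pixels per degree to avoid float round-off on pixel edges.
    px_per_deg_x = round(1 / x_scale)
    px_per_deg_y = round(1 / -y_scale)
    col = math.floor((lon - x_origin) * px_per_deg_x)
    row = math.floor((y_origin - lat) * px_per_deg_y)
    return col, row


# The two example points from fc, keyed by their ID property.
for point_id, (lon, lat) in {1: (-60, -20), 2: (-61, -21)}.items():
    col, row = lonlat_to_pixel(lon, lat)
    inside = 0 <= col < DIMENSIONS[0] and 0 <= row < DIMENSIONS[1]
    print(point_id, (col, row), "inside image" if inside else "outside image")
```

Each point falls into exactly one pixel of the grid, which is why we get one value per point and band.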

Using our data converters from the previous chapter, we can now convert the result into a pandas DataFrame, and we are done :-)

df = ee.data.computeFeatures({
    'expression': vals,
    'fileFormat': 'PANDAS_DATAFRAME'})
df
    geo  ID  treecover2000
0  None   1             53
1  None   2              0
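
Because we set geometries=False, the geo column contains only None. As a small optional tidy-up, sketched here on a stand-in DataFrame that copies the values from the output above (so it runs without an Earth Engine connection), we could drop that column and index the table by the point ID. The variable name tidy is only illustrative.

```python
import pandas as pd

# Stand-in for the DataFrame returned above (values copied from the output).
df = pd.DataFrame({
    "geo": [None, None],
    "ID": [1, 2],
    "treecover2000": [53, 0],
})

# 'geo' is empty because geometries=False, so drop it and index by point ID.
tidy = df.drop(columns="geo").set_index("ID")
print(tidy)
```

With the ID as index, the extracted values can be joined back to any other table that describes the same points.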