Accuracy assessment and area estimation

Learning goals

  • Improve your understanding of area-adjusted accuracy assessment
  • Assess the map errors of your forest type classification
  • Calculate error-adjusted class area estimates


Independent map accuracy assessment aims at getting an unbiased estimate of map accuracy. Generally, we identify a number of sample locations and compare the predicted class in the map with “true” class, which we determine using high quality reference information (e.g., very high resolution imagery or field surveys). The following section requires understanding of the basic concepts of map accuracy assessment, i.e. the confusion matrix, overall, producer’s, and user’s accuracy. Read more here if you don´t know what these are.

Forest type map with stratified random sample

Why is area-adjustment necessary?

Sample-based estimates of map accuracy are statistical estimates - we infer information about the total population (i.e. all pixels in the map) from a sub-population (i.e. the sample pixels). In order to get a statistically sound estimate of map accuracy, we need to pay attention to the way our sample locations are selected. Sample selection should be random and all pixels should have an inclusion probability greater than zero.

Typical sampling schemes include pure random, systematic random, and stratified random sampling. When is it useful to use stratified sampling over a systematic or pure random sample? Imagine a map forest loss in an area dominated by forest, where deforestation accounts for roughly 0.2% of the area. In order to get a sufficiently large sample (let´s assume 60 samples) for the deforestation class using pure random sampling, we would approximately need to sample 30,000 locations. Stratified random sample helps us to overcome this issue by precisely allocating the number of desired samples within each class (or region) of interest.

Illustration of pure random, systematic, and stratified random sampling designs. Source:

Let´s bring this into the context of our study site. A map of our study region reveals imbalances in class extent, e.g., the deciduous forest class accounts for only 7% of the study area, while coniferous forests account for 40%.

Class name Proportion (\(w_i\))
Deciduous forest \(0.07\)
Mixed forest \(0.25\)
Coniferous forest \(0.40\)
Non-forest \(0.28\)

Stratified random sampling helps us to validate the map using a sufficient amount of samples for each for the four classes, let´s say 50. Below, you find a confusion matrix of the map validation. Rows here relate to the map classes \(i\) and columns indicate reference classes \(j\). Note that the sum of samples per class differs quite substantially between map (\(n_i\)) and reference (\(n_j\)).

Class name Deciduous forest Mixed forest Coniferous forest Non-forest \(n_i\)
Deciduous forest \(39\) \(5\) \(1\) \(5\) \(50\)
Mixed forest \(15\) \(19\) \(10\) \(6\) \(50\)
Coniferous forest \(0\) \(7\) \(39\) \(4\) \(50\)
Non-forest \(11\) \(1\) \(4\) \(34\) \(50\)
\(n_j\) \(65\) \(32\) \(54\) \(49\) \(200\)

From this table, we can derive the unadjusted accuracy values, such as the overall accuracy as follows:

\(OA_{unadjusted} = \frac{(39 + 19 + 39 + 34)}{200} = 0.655 = 65.5\%\).

The above estimate of overall accuracy implicitly weighs each class according to the number of correctly classified samples in it. The more samples, the higher the influence on the overall accuracy score. The unadjusted overall accuracy is thus independent of the true area proportions of the map classes.

Furthermore, the unadjusted overall accuracy does not account for the sampling bias introduced by the stratified sampling. You can imagine this bias as varying sampling densities between the map classes. Let´s assume our map contains a total of 10,000 pixels, we have mapped 700 pixels as deciduous forest and 4,000 pixels as coniferous forest. By verifying 50 samples of each class, we assessed \(\frac{50}{700} = 0.071 = 7.1\%\) of the deciduous forest class pixels, while only \(\frac{50}{4,000} = 0.013 = 1.3\%\) of the coniferous forest class were investigated. The other way around, one sample of the deciduous forest class is considered representative for \(\frac{700}{50} = 14\) pixels, while one sample of the coniferous forest class represents \(\frac{4,000}{50} = 80\) pixels.

An assumption: The samples drawn from the deciduous forest stratum represent a much smaller area in our map (7%), as compared to the coniferous forest class (40%). In terms of overall accuracy, this means that the 39 correctly classified pixels of the deciduous forest class are probably less “relevant” than the 39 correctly classified pixels in the coniferous class. According to the numbers above, one sample of coniferous forest shold thus “weigh” \(\frac{80}{14} = 5.7\) times as much as a deciduous forest pixel. We have to account for this bias when estimating map accuracies from a stratified random sample.

How to do it?

This can be accomplished by populating the confusion matrix with probabilities of encountering the combination of map class \(i\) and reference class \(j\), expressed as:

\(p_{ij} = \frac{n_{ij}}{n_i} * w_i\)

\(n_{ij}\) = Number of pixels belonging to map class \(i\) and reference class \(j\)

\(n_i\) = Total number of pixels of map class \(i\)

\(w_i\) = Proportion of map class \(i\)

In the resulting matrix, each cell value represents the probability of occurrence of map class \(i\) and reference class \(j\).

Class name Deciduous forest Mixed forest Coniferous forest Non-forest
Deciduous forest \(\frac{39}{50}0.07\) \(\frac{5}{50}0.07\) \(\frac{1}{50}0.07\) \(\frac{5}{50}0.07\)
Mixed forest \(\frac{15}{50}0.25\) \(\frac{19}{50}0.25\) \(\frac{10}{50}0.25\) \(\frac{6}{50}0.25\)
Coniferous forest \(\frac{0}{50}0.40\) \(\frac{7}{50}0.40\) \(\frac{39}{50}0.40\) \(\frac{4}{50}0.40\)
Non-forest \(\frac{11}{50}0.28\) \(\frac{1}{50}0.28\) \(\frac{4}{50}0.28\) \(\frac{34}{50}0.28\)

Which yields the confusion matrix populated with probabilities:

Class name Deciduous forest Mixed forest Coniferous forest Non-forest \(\sum p_{i}\)
Deciduous forest \(0.055\) \(0.007\) \(0.001\) \(0.007\) \(0.070\)
Mixed forest \(0.075\) \(0.095\) \(0.050\) \(0.030\) \(0.250\)
Coniferous forest \(0.000\) \(0.056\) \(0.312\) \(0.032\) \(0.400\)
Non-forest \(0.062\) \(0.006\) \(0.022\) \(0.190\) \(0.280\)
\(\sum p_{j}\) \(0.192\) \(0.164\) \(0.385\) \(0.259\) \(1.000\)

This table sums to 1 and the row sums correspond to the map class proportions from above. We can now confirm our assumption from above. The 39 correct pixels of the deciduous forest class represent a smaller map proportion as compared to the 39 correctly classified pixels of the coniferous forest class (5.5% vs. 31.2%). The coniferous forest samples are therefore more relevant for the area-adjusted overall map accuracy than the deciduous forest samples. More precisely, they are \(\frac{0.312}{0.055} = 5.7\) times more important.

Based on this probabilistic confusion matrix, we can calculate the overall and class-wise accuracy scores as usual, while implicitly accounting for different class proportions and the sampling bias.

\(OA_{adjusted} = 0.055 + 0.095 + 0.312 + 0.190 = 0.652\)

We can notice that our overall accuracy did not change a lot after area-adjustment. Now let´s have a look at the user´s and producer´s accuracy of the deciduous forest class. First, we calculate the unadjusted accuracy scores:

\(UA_{unadjusted} = \frac{39}{50} = 0.78 = 78\%\)

\(PA_{unadjusted} = \frac{39}{65} = 0.60 = 60\%\)

Let´s do the same using the probability matrix:

\(UA_{adjusted} = \frac{0.055}{0.070} = 0.78 = 78\%\)

\(PA_{adjusted} = \frac{0.055}{0.192} = 0.29 = 29\%\)

We can see that the user´s accuracy stays the same after adjustment. This makes sense, as we do not incorporate the sampling bias when considering only samples drawn from one map stratum. For the producer´s accuracy, we combine information from samples across all strata. We hence automatically consider the different class proportions and sampling densities when calculating the producer´s accuracy from the confusion matrix populated with probabilities. The producer´s accuracy is relatively low after adjustment, owing to the fact that we likely omit a lot of deciduous forest, which we falsely classified as mixed forest or non-forest.

Error-adjusted area estimates

There are several methods to obtain class-wise area estimates. They can either be derived directly from the map by multiplying the map´s class proportions with the total size of the study area. Assuming a study area of \(S = 10,000 ha\), our area estimates for deciduous forest would thereby be:

\(area_{DF} = w_{i=DF} * S\)

\(area_{DF} = 0.070 * 10,000 ha = 700 ha\)

However, we already know that the reference data at hand is of better quality as compared to our map. Therefore class-area should preferably be estimated from the reference data:

\(area_{DF} = \sum_{p_{j=DF}}^i * S\)

\(area_{DF} = 0.192 * 10,000 ha = 1,920 ha\)

So the unbiased area estimate for deciduous forest in our study region is 1,920 ha, as compared to the map-based estimate of 700 ha. Here again, the substantial omission of deciduous forest in our map was taken into account, which in turn largely increases the area estimate for this class.

Wrapping it up, area adjustment can make quite a substantial difference for overall accuracy and producer´s accuracy. User´s accuracy remains unaffected. Producing area estimates directly from the reference data is advised in order to produce more accurate estimations of land cover, land use, or land change.

All procedures which were only briefly described here and are explained in more detail in Olofsson et al. (2014). We recommend this read, as it is documenting the current state-of-the-art in regards to accuracy assessment in high detail. Another source of background information is the technical documentation of the AREA² tools for accuracy assessment. An R implementation of the area adjusted accuracy assessment is provided in the mapac package.


In this assignment, you will perform an area-adjusted accuracy assessment of your forest type map and estimate error-adjusted area for each forest type class. In a first step, the accuracy scores will be investigated in an Excel spreadsheet. In case you do not yet have a forest type map, please use the one provided here.

1) Producing reference data

Load your forest type map in R. Create a stratified reference sample using sampleStratified(). We want to have a stratified equalized sample of 25 pixels per class. Specify the size parameter accordingly. Specify sp = TRUE to receive a spatial points object and na.rm = TRUE to ignore unclassified regions. Write the points to disk using writeOGR().

Load the point shapefile in QGIS and add a field classID of type Integer. Visit each point and determine the class label according to the high resolution imagery in GoogleEarth. Install and use the Send2GE plugin in QGIS to identify the precise pixel location. Enter the class labels in the attribute table. Work efficiently through the points and save your changes regularly.

Class name classID
Deciduous forest 1
Mixed forest 2
Coniferous forest 3
Non-forest 4

2) Area-adjusted accuracy assessment

Assess the class proportions of your map in R. To do so, use the freq() function to get pixel counts. Use these for calculating class proportions (0-1). Remember to exclude NAs, so the proportions of your 4 classes sum up to 1.

Load the reference point shapefile in R and create a confusion matrix from the resulting data.frame using table(). Copy & paste the values from your confusion matrix as well as the class proportions into the Excel spreadsheet. Next, answer the following questions:

  1. Which class has the highest / lowest user´s accuracy?
  2. Which class has the highest / lowest producer´s accuracy?
  3. How does the overall accuracy differ after area-adjustment? Why?
  4. How do the map-based area estimates differ from those obtained using the reference data?

3) Knowledge transfer

Implement three basic components of the accuracy assessment in R.

  1. Generate the confusion matrix containing probabilities. In this matrix, each cell value represents the probability of occurrence on map class \(i\) and reference class \(j\):

\[p_{ij} = \frac{n_{ij}}{n_i} * w_i\]

\(n_{ij}\) = Number of pixels belonging to map class \(i\) and reference class \(j\)

\(n_i\) = Total number of pixels of map class \(i\)

\(w_i\) = Proportion of map class \(i\)

  1. Calculate overall accuracy and class-wise user´s and producer´s accuracy from the confusion matrix.

  2. Produce error-adjusted area estimates from the confusion matrix.

To do this, track the underlying formulas of the respective sections in the Excel sheet and transfer them to R. Use helper functions such as diag(), sum() and apply(). Avoid for loops.

Compare the results with the values in the Excel table. If they differ, something went wrong. Check again!

Advanced assignment: Compare the accuracy of SVM and RF classification

Assess the area-adjusted accuracy estimates of the results of assignment 5.6).

There is a nice R package developed by one of the EOL colleagues, which implemented functions for stratified estimation of area and map accuracy which can also be employed for the case where sampling strata do not correspond to the map strata, as described here. You can install it as follows:

# install devtools packages
if (!library(devtools)) install.packages("devtools")

# load devtools package

# install mapac package from gitlab
install_gitlab(repo='pflugmad/mapac', host='', quiet=F, force=T)

Please find the documentation here. Use the aa_stratified() function in mapac to compare overall and class-wise user´s and producer´s accuracies of the two maps. Can you find a notable difference in the performance of RF versus SVM classification? If so, which?

Reading materials

In this paper, Hansen et al. (2013) presented the first global Landsat-based map of forest cover change. Please make sure to read the paper and the supplementary materials.

In order to make the discussion more vivid, we simulate a discussion on the applicability of the Global Forest Data as a baseline for REDD+ policy decisions. We therefore divide the classroom into two groups. Group A are “proponents” of the global product, whereas group B are its “opponents”. Students of team 1, 2 and 3 belong to group A (proponents), whereas students in teams 4 and 5 belong to group B (opponents).

Please prepare arguments for your group based on the core article and the supplementary material, as well as additional scientific literature if you wish (e.g. case studies). Arguments can target conceptual issues, methodological components, or other scientific evidence arguing for or against the suitability of the Global Forest Data in the context of REDD+.

Copyright © 2020 Humboldt-Universität zu Berlin. Department of Geography.