Abstract
Accurate mapping of oil palm is important for understanding its past and future impact on the environment. We propose to map and count oil palms by estimating tree densities per pixel for large-scale analysis. This allows for fine-grained analysis, for example regarding different planting patterns. To that end, we propose a new, active deep learning method to estimate oil palm density at large scale from Sentinel-2 satellite images, and apply it to generate complete maps for Malaysia and Indonesia. What makes the regression of oil palm density challenging is the need for representative reference data that covers all relevant geographical conditions across a large territory. Specifically for density estimation, generating reference data involves counting individual trees. To keep the associated labelling effort low, we propose an active learning (AL) approach that automatically chooses the most relevant samples to be labelled. Our method relies on estimates of the epistemic model uncertainty and of the diversity among samples, making it possible to retrieve an entire batch of relevant samples in a single iteration. Moreover, our algorithm has linear computational complexity and is easily parallelisable to cover large areas. We use our method to compute the first oil palm density map with 10 m Ground Sampling Distance (GSD), for all of Indonesia and Malaysia and for two different years, 2017 and 2019. The maps have a mean absolute error of ±7.3 trees/ha, estimated from an independent validation set. We also analyse density variations between different states within a country and compare them to official estimates. According to our estimates there are, in total, >1.2 billion oil palms in Indonesia covering >15 million ha, and >0.5 billion oil palms in Malaysia covering >6 million ha.
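The step from a per-pixel density map to a tree count can be illustrated with a minimal sketch. It assumes that densities are expressed in trees/ha (the unit used for the reported error) and that pixels are square at the stated 10 m GSD, so that each pixel covers 0.01 ha. The function names and the tile values are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
def pixel_area_ha(gsd_m: float) -> float:
    """Area of one square pixel in hectares (1 ha = 10,000 m^2)."""
    return gsd_m * gsd_m / 10_000.0


def count_trees(density_map, gsd_m: float = 10.0) -> float:
    """Integrate per-pixel densities (trees/ha) into an expected tree count."""
    area = pixel_area_ha(gsd_m)
    return sum(d * area for row in density_map for d in row)


# Hypothetical 2x3 tile of predicted densities in trees/ha (made-up values).
tile = [
    [100.0, 150.0, 0.0],
    [50.0, 200.0, 100.0],
]
total = count_trees(tile)
print(f"Estimated palms in tile: {total:.1f}")
```

Under these assumptions, a pixel with a density of 100 trees/ha contributes one tree to the total, and summing over any region (a plantation, a state, a country) yields its estimated count.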
Highlights
Oil palm is the third largest oil crop in the world by planted area, and accounted for 35% of the vegetable oil production in the world in 2019 (United States Department of Agriculture, 2021).
This study focuses on a geographical area that comprises the countries of Malaysia and Indonesia.
Using different sample sizes B, we evaluate how much active learning improves performance compared to the original base dataset, and compare it against two alternative approaches. Naïve selection: here we add the same number of samples as with active learning, but pick random clusters of samples that are geographically close, simulating a naïve annotator.
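The naïve selection baseline described above can be sketched as follows. This is only one plausible reading of "random clusters of samples that are geographically close": a random centre is drawn, its nearest unselected neighbours are added, and the process repeats until B samples are chosen. The function name, the `cluster_size` parameter, and the use of Euclidean distance on projected coordinates are assumptions made for illustration, not details from the paper.

```python
import math
import random


def naive_cluster_selection(coords, batch_size, cluster_size, seed=0):
    """Pick `batch_size` sample indices as random, geographically compact clusters.

    coords: list of (x, y) positions, assumed to be in a projected system
    where Euclidean distance is meaningful (an assumption of this sketch).
    """
    rng = random.Random(seed)
    remaining = set(range(len(coords)))
    selected = []
    while len(selected) < batch_size and remaining:
        # Draw a random cluster centre among the samples not yet selected.
        centre = rng.choice(sorted(remaining))
        # Rank unselected samples by distance to the centre (centre comes first).
        nearest = sorted(
            remaining, key=lambda i: math.dist(coords[i], coords[centre])
        )
        take = nearest[: min(cluster_size, batch_size - len(selected))]
        selected.extend(take)
        remaining.difference_update(take)
    return selected


# Hypothetical candidate pool: a 5x5 grid of sample locations.
coords = [(float(x), float(y)) for x in range(5) for y in range(5)]
picks = naive_cluster_selection(coords, batch_size=6, cluster_size=3)
print(picks)
```

Because each cluster is spatially compact, the selected samples tend to be redundant, which is what makes this a useful contrast to a selection that accounts for diversity.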
Summary
With the highest yield per hectare of any fat oil, oil palm is an attractive economic alternative in many tropical countries (Meijaard et al, 2018). Large-scale oil palm production in Malaysia and Indonesia is a potential driver of deforestation (Austin et al, 2019; Gaveau et al, 2019). Several works link oil palm development to long-lasting effects on the environment, including loss of biodiversity (Margono et al, 2014), poor air quality and high greenhouse gas emissions (Noojipady et al, 2017; Van der Werf et al, 2009). Balancing the economic and social benefits of oil palm plantations against their environmental impact is a challenging task. We refer the reader to Meijaard and Sheil (2019) for a thorough analysis of the ethics around the palm oil industry.