STATISTICAL AND MACHINE LEARNING MODELS APPLIED TO PREDICTING THE PRODUCTIVITY OF THE DWARF COCONUT TREE IN THE EASTERN AMAZON
Growth analysis; climate; Cocos nucifera L.; fruit development; artificial intelligence.
The Green Dwarf Coconut (Cocos nucifera L.) is a tropical crop of great economic and nutritional importance, particularly in humid regions. Because it is a perennial crop, it is exposed to climatic variations and extreme weather events at every stage of its development cycle, especially after inflorescence opening, and these affect its productivity. In this thesis, we used statistical and machine learning models to estimate the productivity of the Green Dwarf Coconut and to assess the impact of meteorological variables and extreme climate events on yield. In the first chapter, we analyzed high-yield (April) and low-yield (November) harvests based on nine years of experimental and meteorological data. We considered temperature, humidity, precipitation, wind, solar radiation, and radiation balance, as well as derived variables such as vapor pressure deficit, evapotranspiration, and soil water deficit/excess. We tested multiple linear regression (MLR) models and machine learning algorithms, including multilayer perceptron neural networks (MLP), support vector regression (SVR), and random forest (RF). Model performance was evaluated using the root mean square error (RMSE), coefficient of determination (R²), and mean absolute error (MAE), and the models were interpreted using Shapley Additive Explanations (SHAP). The machine learning models outperformed MLR, with MLP being more suitable for high-productivity periods and RF for low-productivity periods. The most influential factors in high-productivity periods were solar radiation and water excess during fruit maturation, while relative humidity and vapor pressure deficit were the key determinants in low-productivity periods.
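The three evaluation metrics named above follow standard definitions and can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the function names and the observed and predicted values are hypothetical and are not taken from the thesis data or code.

```python
import math

def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical yields (arbitrary units), for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]

print(rmse(observed, predicted))  # 1.0
print(mae(observed, predicted))   # 1.0
print(r2(observed, predicted))    # 0.8
```

Dividing the RMSE by the mean observed productivity gives a relative error, which is one way to read a statement such as "an RMSE equivalent to 20% of the average productivity".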
In the second chapter, we assessed the impact of extreme climate events on the productivity of the Green Dwarf Coconut in northeastern Pará from 2015 to 2023, distinguishing between the rainy period (PC, December to July) and the less rainy period (PMC, August to November). We analyzed meteorological variables and extreme climate events, including extreme maximum temperature (HT), extreme precipitation (HEP, 90th percentile), and low precipitation (LP, 10th percentile). We developed predictive models using MLR and RF. RF performed better, achieving an RMSE equivalent to 20% of the average productivity, but it generalized poorly to the test set, possibly due to overfitting. Including lagged productivity (P_t-1) as a predictor had a significant influence on the models. During the PC, extreme precipitation events and water excess after the fifth month of inflorescence development contributed to increased productivity, whereas in the PMC, low precipitation events reduced yield. In some cases, high precipitation mitigated the negative effects of low water availability. Our results highlight the importance of agrometeorological modeling and machine learning as tools for estimating the productivity of the Green Dwarf Coconut and understanding the impacts of climate variations. Identifying the most influential variables enables the development of adaptive strategies to mitigate productivity losses and enhance crop stability in the face of climate change.
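As a rough illustration of the percentile-based event definitions and the lagged predictor mentioned above, the sketch below flags values above the 90th percentile (HEP) and below the 10th percentile (LP) and builds a one-step lag of a productivity series. It is a minimal sketch under assumptions: the helper names, the linear-interpolation percentile, and the sample values are hypothetical, and the abstract does not state the time step or the exact procedure used in the thesis.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

def flag_extremes(precip):
    """Return (HEP flags, LP flags): values above the 90th / below the 10th percentile."""
    p10 = percentile(precip, 10)
    p90 = percentile(precip, 90)
    hep = [x > p90 for x in precip]
    lp = [x < p10 for x in precip]
    return hep, lp

def lagged(series):
    """One-step lag (P_t-1): the first entry has no predecessor, so it is None."""
    return [None] + list(series[:-1])

# Hypothetical precipitation totals (mm), for illustration only.
precip = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
hep, lp = flag_extremes(precip)
print(sum(hep), sum(lp))  # 1 1

# Hypothetical productivity series and its lagged version.
productivity = [5, 6, 7]
print(lagged(productivity))  # [None, 5, 6]
```

Event counts per period (PC or PMC) and the lagged series could then be used as predictors alongside the meteorological variables.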