FEATURED ARTICLE

A machine learning-based exploration of resilience and food security

Alexis H. Villacis¹ | Syed Badruddoza² | Ashok K. Mishra³

¹Department of Agricultural, Environmental, and Development Economics, The Ohio State University, Columbus, Ohio, USA
²Department of Agricultural and Applied Economics, Texas Tech University, Lubbock, Texas, USA
³Morrison School of Agribusiness, W.P. Carey School of Business, Arizona State University, Mesa, Arizona, USA

Correspondence: Alexis H. Villacis, Department of Agricultural, Environmental, and Development Economics, The Ohio State University, Columbus, OH 43210, USA. Email: villacis.9@osu.edu

Editor in charge: Gopinath Munisamy

[Correction added on 14 November 2024, after first online publication: The article classification has been updated in this version.]

Abstract

Leveraging advancements in remote data collection and using the Food Insecurity Experience Scale (FIES) as a proxy measure of resilience, we show that machine learning models (such as Gradient Boosting Classifier, eXtreme Gradient Boosting, and Artificial Neural Networks) can predict resilience with relatively high accuracy (up to 81%). Key household-level predictors include access to financial institutions, asset ownership, the adoption of agricultural mechanization as evidenced by the use of tractors, the number of crops cultivated, and ownership of nonfarm enterprises. Our analysis offers insights to researchers and policymakers interested in the development of targeted interventions to bolster household resilience.

KEYWORDS
Ethiopia, Food Insecurity Experience Scale, Malawi, Nigeria, predictive performance, Uganda

JEL CLASSIFICATION
C52, C83, O12, Q18

Life doesn't get easier or more forgiving; we get stronger and more resilient.
Steve Maraboli (2009) Life, the Truth, and Being Free.
Received: 10 July 2023 | Accepted: 15 August 2024
DOI: 10.1002/aepp.13475

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2024 The Author(s). Applied Economic Perspectives and Policy published by Wiley Periodicals LLC on behalf of Agricultural & Applied Economics Association.
Appl Econ Perspect Policy. 2024;46:1479–1505. wileyonlinelibrary.com/journal/aepp

Resilience has emerged as a prominent policy priority for sustainability and development (Jones et al., 2021). Humanitarian and development agencies, along with researchers and practitioners, are increasingly emphasizing resilience to develop long-term strategies that tackle the effects of climate change, conflict, and epidemics (Knippenberg et al., 2019). Defined, from a normative standpoint, as “the ability to achieve and maintain an acceptable standard of well-being even in the face of shocks and stressors” (Barrett & Constas, 2014), the concept of resilience is now prominently featured in large-scale sustainable development investments. These investments aim to support households and communities in coping with diverse shocks and stressors that undermine poverty reduction and food security efforts (Walsh-Dilley et al., 2016). The increasing emphasis on sustainable development investments (Mullan et al., 2018) underscores the necessity for robust empirical evidence to elucidate the interplay between well-being and shocks.
Advancements in remote data collection, earth observations, and big data analytics offer promising avenues for gaining new insights into the dynamics of resilience (Knippenberg et al., 2019). Prominently, the application of machine learning algorithms provides opportunities to identify more accurate predictors of resilience and vulnerable areas of concern (Jones et al., 2021; Lieslehto et al., 2022).

While machine learning algorithms have gained traction in predicting food insecurity (Balashankar et al., 2023; Foini et al., 2023; Hossain et al., 2019; Martini et al., 2022; Villacis et al., 2023; Yeh et al., 2020), their application in forecasting household resilience remains limited. However, leveraging machine learning models and big data holds the potential to improve the precision of resilience prediction, thereby offering valuable insights for decision-making processes aimed at supporting households in their recovery from adverse shocks.

The present study expands the currently limited knowledge of machine learning-based examinations of resilience (Garbero & Letta, 2022; Knippenberg et al., 2019) by presenting novel evidence from smallholder farmers from various African countries. By employing comprehensive data obtained from the Harmonized Phone Surveys conducted by the World Bank Living Standards Measurement Study (LSMS) in Ethiopia, Malawi, Nigeria, and Uganda in 2020, we leverage the exogenous shocks induced by the COVID-19 pandemic to explore the potential of machine learning in enhancing the understanding of farm-household resilience dynamics.

Building upon the definition from Barrett and Constas (2014), we proxy “an acceptable standard of well-being” with “an acceptable level of food security,” and subsequently utilize the Food Insecurity Experience Scale (FIES) for our purposes. Our efforts focus on employing various machine learning models to forecast resilience status and identify key predictors of resilience.
Unlike Knippenberg et al. (2019) but in accordance with Garbero and Letta (2022), we frame resilience prediction as a classification task rather than a regression problem. This choice is motivated by the primary objective of identifying households that demonstrate resilience in the face of exogenous shocks and determining the key features associated with their resilience. Results show that, within the context of our selected group of African nations, resilience can be predicted with relatively high accuracy (between 78% and 81%) using machine learning models. More importantly, we find that key household-level predictors of resilience include access to financial institutions, ownership of assets, risk and income diversification as evidenced by the number of crops cultivated and the ownership of nonfarm enterprises, and finally, the adoption of agricultural mechanization, as evidenced by the use of tractors for agricultural activities.

Our study makes two distinct contributions to the existing knowledge on resilience-building strategies in the face of shocks. Firstly, from an academic standpoint, we expand upon existing literature by employing novel proxy measures of resilience as well as novel machine learning models to forecast resilience. While previous studies utilized the Coping Strategy Index (CS) (Knippenberg et al., 2019) and the Ability to Recover from Shocks Index (ATR) (Garbero & Letta, 2022), we utilize the FIES as a proxy measure.
Additionally, we incorporate robust machine learning models such as Gradient Boosting Classifier, eXtreme Gradient Boosting, and Artificial Neural Networks, which have not been extensively explored in previous resilience prediction research.¹ We train these models to use current household features to predict their resilience to shocks.

Secondly, the identification of significant predictors of resilience provides valuable insights for both researchers and policymakers. Our findings enhance our understanding of the predictors of resilience within our selected group of African countries, deepening knowledge regarding the factors associated with households' ability to withstand and recover from shocks. From a policy perspective, these results hold practical implications as policymakers can utilize the identified predictors to guide the development and implementation of research programs aimed at better understanding and enhancing household resilience.

The remainder of this paper is structured as follows. Section 1 outlines the context and essential definitions, describes the data utilized, and presents summary statistics. Section 2 examines the machine learning methods and approaches employed in this study. In Section 3, we present our research findings, emphasizing the use of machine learning methods for resilience prediction and identifying key predictors. Lastly, Section 4 concludes the paper and discusses the policy implications derived from our findings.

CONTEXT, DEFINITIONS, AND DATA

To explore the potential of machine learning in advancing the analysis of household resilience, we leverage the shocks induced by the COVID-19 pandemic. In addition to the detrimental health outcomes such as morbidity and mortality, the pandemic led to travel restrictions, quarantine measures, business closures, and school suspensions in various regions (Hsiang et al., 2020).
These adverse shocks had substantial economic implications, resulting in a global economic contraction (Blake & Wadhwa, 2020). Consequently, food security and access to essential medicines and staple foods were impacted in low-income countries (Josephson et al., 2021).

Of interest in this study is the food security status of households during exogenous shocks like the pandemic. Given that food security is one of the most widely recognized indicators of well-being (Pinstrup-Andersen, 2009), changes in food security status resulting from the adverse shocks of the pandemic provide an ideal context to investigate resilience.²

To study changes in the food security status of households during the pandemic, we use data from high-frequency phone surveys conducted in Ethiopia, Malawi, Nigeria, and Uganda during 2020. The phone surveys were supported by the World Bank since the outset of the pandemic, motivated by the suspension of regular in-person data collection (Rudin-Rush et al., 2022).³ Households interviewed by phone represent a subset of the complete study sample that the World Bank team previously interviewed in person in each country as part of its Living Standards Measurement Study.

The choice to focus on Ethiopia, Malawi, Nigeria, and Uganda in this study was based on the following criteria: (i) the presence of an official first case of COVID-19 reported in the country during February, March, or April of 2020, (ii) the availability of publicly accessible survey data containing food security information from the early stages of the pandemic (May or June 2020), and (iii) the existence of a survey follow-up that contained food security information and was close in timing with other countries surveyed, to maximize the sample size. Figure 1 illustrates how Ethiopia, Malawi, Nigeria, and Uganda fulfilled these criteria, with a first round of data collection performed during May–June of 2020 and a follow-up round performed during August–November of 2020.⁴

To facilitate the utilization of data derived from the high-frequency phone surveys and enable cross-country comparisons, the World Bank LSMS team harmonized the variables obtained from these surveys. This harmonization process involved adhering to standardized definitions and ensuring consistent variable names. The variables encompassed various aspects such as demography, housing, household consumption expenditure, agriculture, and food security (World Bank, 2021).

To measure the food security status of households, the phone surveys used the FIES. The FIES is a metric used to assess the severity of food insecurity at the household or individual level. It relies on direct yes/no responses to eight questions regarding access to adequate food. The FIES serves as a scale encompassing a broad range of social, psychological, and health-related conditions, much like other established instruments used to measure unobservable traits. In Table 1, we provide the English version of the Food Insecurity Experience Scale Survey Module (FAO, 2017).⁵

The collective analysis of the eight questions of the FIES produces a quantitative tool to assess the prevalence of food insecurity.
Specifically, the FIES methodology yields two indicators: (i) the prevalence of severe food insecurity and (ii) the prevalence of moderate or severe food insecurity (combining moderate and severe levels). Individuals experiencing moderate food insecurity often consume low-quality diets and may reduce the quantity of food they typically consume at certain times during the year. On the other hand, those experiencing severe food insecurity endure days without eating due to a lack of financial means or resources to acquire food (FAO, 2017).

In our analysis, households are deemed food insecure if their probability of being severely and/or moderately/severely food insecure is equal to or greater than 50% (World Bank, 2021).⁶ This criterion delineates four possible scenarios for households across two periods: (i) remaining food secure in both periods, (ii) transitioning from food security in the first period to food insecurity in the second period, (iii) remaining food insecure in both periods, and (iv) moving from food insecurity in the first period to food security in the second period. In Table 2, we show the distribution of households in our sample according to the four possible scenarios of food security dynamics described above.

FIGURE 1 Timeline of events in Ethiopia, Malawi, Nigeria, and Uganda during 2020. The date of the official first case of COVID-19 reported in each country was sourced from Roberts et al. (2021). Data collection dates were obtained from descriptions provided by the Microdata Library of the World Bank.
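The classification rule described above can be illustrated with a minimal sketch. It assumes each household has one food-insecurity probability per survey round and applies the 50% cutoff stated in the text; the function names, the scenario labels, and the four example households are hypothetical and are not drawn from the survey data.

```python
from collections import Counter

THRESHOLD = 0.5  # a probability at or above this value counts as food insecure


def is_food_insecure(prob, threshold=THRESHOLD):
    """Apply the 'equal to or greater than 50%' rule to one survey round."""
    return prob >= threshold


def food_security_scenario(prob_round1, prob_round2):
    """Map a household's two-round status to one of the four scenarios."""
    first = is_food_insecure(prob_round1)
    second = is_food_insecure(prob_round2)
    if not first and not second:
        return "continuous food security"          # scenario (i)
    if not first and second:
        return "transition to food insecurity"     # scenario (ii)
    if first and second:
        return "persistence in food insecurity"    # scenario (iii)
    return "recovery to food security"             # scenario (iv)


# Hypothetical households: (probability in round 1, probability in round 2).
households = {
    "hh_a": (0.10, 0.20),
    "hh_b": (0.30, 0.75),
    "hh_c": (0.90, 0.50),
    "hh_d": (0.60, 0.05),
}

scenarios = {hh: food_security_scenario(p1, p2) for hh, (p1, p2) in households.items()}
counts = Counter(scenarios.values())  # tallies analogous to one row of Table 2

for hh, label in scenarios.items():
    print(hh, "->", label)
```

Note that household `hh_c` sits exactly at the cutoff in the second round and is therefore still counted as food insecure, which is how the "equal to or greater than" wording resolves boundary cases.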
We construct a binary indicator to represent resilience, as informed by the FIES indicators and the normative framework on resilience previously discussed. Non-resilience is straightforwardly indicated by persistence in or transitions to food insecurity (scenarios 2 and 3 described above, columns (3) and (4) of Table 2). However, the definition of resilience warrants discussion. A strict interpretation would identify resilience solely with continuous food security (scenario 1 described above, column (1) of Table 2), whereas a broad interpretation would encompass both consistent food security and recovery from food insecurity to food security (scenarios 1 and 4 described above, columns (1) and (2) of Table 2). A graphical representation of these interpretations is depicted in Figure 2. Our machine learning models will analyze resilience under both interpretations to provide a more comprehensive and holistic understanding of resilience prediction.⁷

TABLE 1 Food insecurity experience scale survey module.

| Question | Standard label | Question wording |
|---|---|---|
| 1 | WORRIED | During the last 30 DAYS, was there a time when you were worried you would not have enough food to eat because of a lack of money or other resources? |
| 2 | HEALTHY | Still thinking about the last 30 DAYS, was there a time when you were unable to eat healthy and nutritious food because of a lack of money or other resources? |
| 3 | FEWFOODS | Was there a time when you ate only a few kinds of foods because of a lack of money or other resources? |
| 4 | SKIPPED | Was there a time when you had to skip a meal because there was not enough money or other resources to get food? |
| 5 | ATELESS | Still thinking about the last 30 DAYS, was there a time when you ate less than you thought you should because of a lack of money or other resources? |
| 6 | RANOUT | Was there a time when your household ran out of food because of a lack of money or other resources? |
| 7 | HUNGRY | Was there a time when you were hungry but did not eat because there was not enough money or other resources for food? |
| 8 | WHOLEDAY | During the last 30 DAYS, was there a time when you went without eating for a whole day because of a lack of money or other resources? |

TABLE 2 Distribution of households according to their food security dynamics.

| Country | (1) Continuous food security | (2) Recovery from food insecurity to food security | (3) Persistence in food insecurity | (4) Transition from food security to food insecurity | (5) Total |
|---|---|---|---|---|---|
| Ethiopia | 1522 | 328 | 383 | 174 | 2407 |
| Malawi | 345 | 262 | 773 | 140 | 1520 |
| Nigeria | 304 | 188 | 1079 | 158 | 1729 |
| Uganda | 988 | 582 | 258 | 67 | 1895 |
| Total | 3159 | 1360 | 2493 | 539 | 7551 |

Regarding the potential predictors of resilience, we utilize a comprehensive set of variables whose details and summary statistics can be found in Table 3. The selection of variables was broad and inclusive, guided by data availability (from the phone surveys conducted by the World Bank LSMS team) and relevance to the research question, with the aim of maximizing predictive accuracy. The variables encompass a wide range of domains, including socioeconomics, demographics, income sources, assets, agricultural and livestock activities, and labor, thus providing a comprehensive view of the factors that may influence resilience.⁸

For the interested reader, we present in Figures S1 and S2, of the Supporting Information, a visual examination of the distinctions between resilient and non-resilient households under both interpretations discussed previously.
The figures showcase bar graphs representing the deviations of standardized predictors from the overall mean. They also show the groupwise 95% confidence intervals (resilient vs. non-resilient). These visualizations offer a comparison of the predictor variables, enabling readers to grasp the differences between the two groups.

FIGURE 2 Different interpretations of resilience. A strict interpretation of resilience identifies resilience solely with continuous food security. A broad interpretation of resilience identifies resilience encompassing both consistent food security and recovery from food insecurity to food security.

TABLE 3 Summary statistics: sample based on a broad interpretation of resilience.

| Variables | Ethiopia Mean (1) | Ethiopia SD (2) | Malawi Mean (3) | Malawi SD (4) | Nigeria Mean (5) | Nigeria SD (6) | Uganda Mean (7) | Uganda SD (8) | All Mean (9) | All SD (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| Asset index | −0.16 | 1.61 | −0.03 | 1.62 | 0.88 | 1.62 | −0.58 | 1.55 | 0 | 1.68 |
| Total land size owned (hectares) | 0.31 | 1.67 | 0.15 | 0.42 | 0.69 | 1.27 | 0.82 | 1.26 | 0.49 | 1.33 |
| Ownership of dwelling (Yes = 1) | 0.53 | 0.5 | 0.63 | 0.48 | 0.61 | 0.49 | 0.84 | 0.36 | 0.65 | 0.48 |
| Access to improved water source (Yes = 1) | 0.16 | 0.37 | 0.97 | 0.16 | 0.18 | 0.38 | 0.59 | 0.49 | 0.44 | 0.5 |
| Access to improved toilet (Yes = 1) | 0.59 | 0.49 | 0.52 | 0.5 | 0.68 | 0.47 | 0.32 | 0.47 | 0.53 | 0.5 |
| Account from financial institutions (Yes = 1) | 0.78 | 0.41 | 0.41 | 0.49 | 0.63 | 0.48 | 0.59 | 0.49 | 0.62 | 0.48 |
| Change in number of males aged 15–64 | 0.05 | 0.48 | 0.01 | 0.38 | 0.04 | 0.31 | −0.01 | 0.27 | 0.02 | 0.38 |
| Change in number of females aged 15–64 | 0.05 | 0.47 | 0.01 | 0.4 | 0.04 | 0.35 | 0 | 0.3 | 0.03 | 0.39 |
| Change in overall HH size | 0.06 | 0.59 | 0.15 | 0.88 | 0.17 | 0.77 | 0.01 | 0.63 | 0.09 | 0.71 |
| Ownership of any ruminant (large or small) (Yes = 1) | 0.3 | 0.46 | 0.19 | 0.39 | 0.34 | 0.47 | 0.49 | 0.5 | 0.33 | 0.47 |
| Ownership of camelid (Yes = 1) | 0.01 | 0.12 | 0 | 0 | 0 | 0.03 | 0 | 0 | 0.01 | 0.07 |
| Ownership of equine (Yes = 1) | 0.12 | 0.33 | 0 | 0.04 | 0.01 | 0.08 | 0 | 0.02 | 0.04 | 0.2 |
| Ownership of poultry (Yes = 1) | 0.17 | 0.37 | 0.36 | 0.48 | 0.27 | 0.44 | 0.42 | 0.49 | 0.29 | 0.45 |
| Ownership of livestock (Yes = 1) | 0.34 | 0.48 | 0.44 | 0.5 | 0.44 | 0.5 | 0.64 | 0.48 | 0.46 | 0.5 |
| Cash crop cultivation (Yes = 1) | 0.07 | 0.26 | 0.08 | 0.27 | 0.17 | 0.38 | 0.24 | 0.43 | 0.14 | 0.35 |
| Number of crops cultivated | 1.57 | 3.76 | 2.81 | 3.14 | 2.87 | 3.2 | 2.86 | 2.64 | 2.44 | 3.31 |
| Sale of crop (Yes = 1) | 0.13 | 0.33 | 0.35 | 0.48 | 0.4 | 0.49 | 0.53 | 0.5 | 0.34 | 0.47 |
| Postharvest crop loss (Yes = 1) | 0.03 | 0.16 | 0.14 | 0.35 | 0.03 | 0.17 | 0.01 | 0.09 | 0.05 | 0.21 |
| Use of tractor (Yes = 1) | 0.04 | 0.2 | 0 | 0.03 | 0.09 | 0.28 | 0.8 | 0.4 | 0.23 | 0.42 |
| Use of any fertilizer (organic or inorganic) (Yes = 1) | 0.17 | 0.37 | 0.57 | 0.5 | 0.34 | 0.47 | 0.12 | 0.32 | 0.27 | 0.45 |
| Use of pesticides, fungicides or herbicides (Yes = 1) | 0.07 | 0.26 | 0.08 | 0.27 | 0.33 | 0.47 | 0.14 | 0.34 | 0.15 | 0.36 |
| Use of exchange and/or free labor (Yes = 1) | 0.13 | 0.33 | 0.02 | 0.15 | 0.27 | 0.45 | 0 | 0.03 | 0.11 | 0.31 |
| Use of hired labor (Yes = 1) | 0.12 | 0.33 | 0.02 | 0.15 | 0.56 | 0.5 | 0.31 | 0.46 | 0.25 | 0.43 |
| Working adults working in wage work (%) | 23.6 | 34.7 | 18.5 | 28.8 | 15.2 | 27.4 | 20.5 | 31.02 | 19.9 | 31.2 |
| Working adults working in nonfarm family enterprise (%) | 13.2 | 27.6 | 26.2 | 33.7 | 30.2 | 34.4 | 16.9 | 27.82 | 20.6 | 31.4 |
| Ownership of nonfarm family enterprise (Yes = 1) | 0.3 | 0.46 | 0.5 | 0.5 | 0.61 | 0.49 | 0.02 | 0.14 | 0.34 | 0.47 |
| Rental income (Yes = 1) | 0.09 | 0.29 | 0.09 | 0.29 | 0.06 | 0.23 | 0.13 | 0.34 | 0.09 | 0.29 |
| Received remittance or assistance (Yes = 1) | 0.21 | 0.41 | 0.54 | 0.5 | 0.38 | 0.49 | 0.4 | 0.49 | 0.36 | 0.48 |
| Resilience, broad interpretation (Yes = 1) | 0.77 | 0.42 | 0.4 | 0.49 | 0.28 | 0.45 | 0.83 | 0.38 | 0.6 | 0.49 |
| Observations | 2407 | | 1520 | | 1729 | | 1895 | | 7551 | |

Note: These summary statistics describe each of the key variables in our analysis and describe the composition of our sample based on our broad interpretation of resilience. For summary statistics of the sample based on our strict interpretation of resilience, see Table S1. The Asset Index is a comprehensive index representing the assets of the household. The total land size owned is limited to agricultural land. Rental income covers income from shop, store, house, car, truck, other vehicles, land, agricultural tools, and transport of animals. Working adults working in wage work includes casual and permanent work.

EMPIRICAL FRAMEWORK

This section describes the machine learning approach and algorithms. We model the binary indicator of resilience ($y$) as some function ($f$) of household socioeconomic and demographic variables ($X$), agricultural and labor variables ($Z$), and country-level control variables ($c$). The equation can be written as:

$$ y = f(X, Z, c), \tag{1} $$

where $y$ represents the binary variable indicating household resilience (=1 if the household is resilient and 0 otherwise) following the two interpretations described above. A data-driven approach is more appropriate for our case since the exact functional relationship between resilience and its predictors is unknown.

Our motivation for using a data-driven approach stems from the exploratory nature of our analysis. We aim to discern patterns from the observed data, in contrast to a confirmatory analysis where one tests hypotheses derived from a structural or econometric model. The task at hand is twofold. First is feature extraction: we seek to identify the characteristics of households that were resilient during the pandemic compared to those that were not. The second task is prediction: we aim to evaluate how well the household features identified by our models can predict resilience in a blind out-of-sample test set.

For a task like this, machine learning models have been shown to have an advantage in extracting complex relationships in a data-driven manner and a high out-of-sample predictive capacity (Athey & Imbens, 2019; Bajari et al., 2015; Baylis et al., 2021; Mullainathan & Spiess, 2017; Storm et al., 2020). These data-driven models reduce the dependency on researchers' prior assumptions about the functional form. They are also more flexible since the parameters are optimally chosen via a grid search.

The learning aspect of machine learning involves the iterative process of tuning hyperparameters based on the prediction errors observed in repeated subsamples. The full dataset was randomly divided into training (80%) and testing (20%) samples. The training sample was utilized to train the model using tenfold cross-validation (CV).
This process involved randomly partitioning the training sample into 10 equal-sized subsamples, with nine subsamples used for training and one subsample for validation. The process was repeated 10 times, so that each subsample served once as the validation set, each time updating the hyperparameters to minimize prediction errors in the validation subsample. Each CV cycle thus trains temporary models on nine subsamples and validates them on the remaining one, and the best hyperparameters are selected based on the aggregated results from all 10 validations. Once the optimal hyperparameters were determined through this CV process, a final model was trained using the entire training dataset (80% of the full data) with these selected hyperparameters. Finally, the trained model was evaluated using the held-out testing sample, which was not used during the training phase. This approach was implemented to ensure the robustness of the model (Vabalas et al., 2019; Zhang & Ling, 2018).

We train five popular models for this analysis: Logistic Regression, Random Forest (RF), Gradient Boosting Classifier (GBC), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Networks (ANNs). These models were selected for their superior predictive capabilities compared to other models such as classification trees or support vector machines (Amin et al., 2021; Athey & Imbens, 2019; Bajari et al., 2015; Dreiseitl & Ohno-Machado, 2002; Villacis et al., 2023). Next, we provide an overview of each model.

Logistic Regression

The logit or Logistic Regression model is a widely used statistical method for modeling binary outcomes. It belongs to the class of generalized linear models (GLMs) and is particularly suitable for situations where the response variable takes on one of two categorical values. In the logit model, the probability of the binary outcome is modeled as a function of the predictor variables using the logistic function.
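As a rough illustration of the split, tune, refit, and evaluate procedure described earlier in this section, the sketch below applies it to a small logit model written in plain Python. Everything specific in it is an assumption made for illustration: the synthetic data, the L2 penalty grid (`penalty_grid`) standing in for the hyperparameters being tuned, the gradient-descent fitter, and all function names. It is not the authors' data, software, or hyperparameter grid.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_prob(w, row):
    """Predicted probability of y = 1; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))


def fit_logit(X, y, penalty, epochs=100, step=0.1):
    """L2-penalized logit fitted by batch gradient descent (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            err = predict_prob(w, row) - label
            grad[0] += err
            for j, xj in enumerate(row, start=1):
                grad[j] += err * xj
        # The intercept (j == 0) is left unpenalized.
        w = [wj - step * (gj / len(X) + (penalty * wj if j else 0.0))
             for j, (wj, gj) in enumerate(zip(w, grad))]
    return w


def accuracy(w, X, y):
    hits = sum((predict_prob(w, row) >= 0.5) == bool(label) for row, label in zip(X, y))
    return hits / len(y)


def take(data, idx):
    return [data[i] for i in idx]


# Synthetic data: two standardized predictors and a noisy binary outcome.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if 1.5 * a - b + rng.gauss(0, 0.5) > 0 else 0 for a, b in X]

# 80/20 random split into training and held-out testing samples.
order = list(range(len(X)))
rng.shuffle(order)
train_idx, test_idx = order[:160], order[160:]

# Tenfold CV on the training sample: nine folds train, one validates.
folds = [train_idx[i::10] for i in range(10)]
penalty_grid = [0.0, 0.01, 0.1, 1.0]
cv_scores = {}
for penalty in penalty_grid:
    fold_scores = []
    for k, valid_idx in enumerate(folds):
        fit_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        w = fit_logit(take(X, fit_idx), take(y, fit_idx), penalty)
        fold_scores.append(accuracy(w, take(X, valid_idx), take(y, valid_idx)))
    cv_scores[penalty] = sum(fold_scores) / len(fold_scores)

# Refit on the full training sample with the best penalty, then test once.
best_penalty = max(cv_scores, key=cv_scores.get)
final_w = fit_logit(take(X, train_idx), take(y, train_idx), best_penalty)
test_accuracy = accuracy(final_w, take(X, test_idx), take(y, test_idx))
print("best penalty:", best_penalty, "test accuracy:", round(test_accuracy, 3))
```

The held-out indices in `test_idx` are touched only in the last step, which mirrors the requirement that the testing sample play no role in tuning.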
The logit model provides estimates of the regression coefficients, representing the change in the log odds of the outcome associated with a one-unit increase in the corresponding predictor, holding other predictors constant. These coefficients can be exponentiated to obtain odds ratios, indicating the multiplicative effect of the predictor on the odds of success.

Random Forest

Random Forest (RF) is an ensemble machine learning algorithm that combines the predictions of multiple decision trees to improve the accuracy of predictions (Breiman, 2001). The algorithm has become widely used in classification and regression tasks due to its robustness and ability to handle high-dimensional data (Chernozhukov et al., 2017; Wager & Athey, 2018). The Random Forest algorithm operates through a series of steps to create a collection of decision trees. First, it randomly selects subsets of the training data through bootstrap sampling, which involves randomly drawing observations with replacement. This creates a new training set in which some observations are repeated and others are left out.

Next, the algorithm randomly selects a subset of features (predictors) from the full set of features. This randomness in feature selection introduces diversity among the trees and reduces correlation, leading to better overall performance. Each decision tree is built using the selected subset of data and features.
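The two sources of randomness just described can be sketched in a few lines. This is a simplified illustration that follows the text in drawing one feature subset per tree; common implementations redraw the candidate features at every split instead. The feature names, the sample size of 10 rows, and the helper functions are hypothetical.

```python
import random


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(feature_names, n_selected, rng):
    """Pick n_selected distinct features at random, without replacement."""
    return rng.sample(feature_names, n_selected)


rng = random.Random(42)
n_rows = 10
# Hypothetical predictor names, loosely inspired by Table 3.
features = ["asset_index", "land_size", "financial_account",
            "n_crops", "tractor_use", "nonfarm_enterprise"]

tree_inputs = []
for tree_id in range(3):
    rows = bootstrap_indices(n_rows, rng)
    # Rows never drawn are "out of bag" for this tree.
    out_of_bag = sorted(set(range(n_rows)) - set(rows))
    cols = random_feature_subset(features, 2, rng)
    tree_inputs.append((rows, out_of_bag, cols))
    print(f"tree {tree_id}: rows={rows} out_of_bag={out_of_bag} features={cols}")
```

Because each tree sees a different mix of rows and predictors, the trees' errors are less correlated, which is the property the ensemble relies on when it aggregates their predictions.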
At each node of the tree, the algorithm searches for the best split point among the selected features, using a criterion such as the Gini index⁹ for classification or variance for regression. This recursive splitting process continues until a stopping condition is met, which can be defined by a minimum node size or a maximum depth.

Repeating these steps produces the ensemble of decision trees, and a hyperparameter controls the number of trees in the forest. For classification problems like the one in Equation (1), the prediction is determined by the majority vote, or mode, of the predictions from all the trees. One of the key advantages of the Random Forest algorithm is its ability to handle high-dimensional data and missing values. It also provides estimates of feature importance, allowing us to assess the relative importance of different predictors in the model. The Random Forest algorithm is less prone to overfitting compared to individual decision trees. The randomization in feature selection and data sampling helps reduce variance and provides more robust predictions (Athey & Imbens, 2019).

Gradient Boosting and eXtreme Gradient Boosting

GBC and XGBoost are popular machine learning models that use the principle of boosting to improve prediction accuracy by combining multiple simple models, typically decision trees. The main idea behind these methods is to build models sequentially, with each new model focusing on correcting the errors made by the ones before it. This process begins with an initial guess, which could be the average of the target values for a regression problem or a log-odds ratio for classification, setting the stage for further refinement. As the sequence progresses, each new model (e.g., a decision tree) is fitted on pseudo-residuals, that is, the negative gradients of the loss function with respect to the model predictions (Hastie et al., 2009).
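The sequential fitting on pseudo-residuals can be made concrete with a minimal sketch for binary classification under log loss, where the pseudo-residual of an observation is its label minus its current predicted probability. The toy data, the use of one-split trees (stumps) on a single feature, the step size `learning_rate`, and the number of rounds are all assumptions chosen for brevity. Leaf values here are plain means of the pseudo-residuals, a simplification; production libraries typically refine leaf values further.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(x, residuals):
    """One-split regression tree: threshold minimizing squared error on residuals."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left leaf value, right leaf value)


def predict_stump(stump, xi):
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def log_loss(y, scores):
    return -sum(yi * math.log(sigmoid(s)) + (1 - yi) * math.log(1 - sigmoid(s))
                for yi, s in zip(y, scores)) / len(y)


# Toy data: one predictor and a binary outcome (hypothetical values).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 1, 0, 1]

learning_rate = 0.3   # scales each tree's contribution
n_rounds = 20

# Initial guess: the log-odds of the base rate.
base_rate = sum(y) / len(y)
initial_score = math.log(base_rate / (1 - base_rate))
scores = [initial_score] * len(y)
losses = [log_loss(y, scores)]

stumps = []
for _ in range(n_rounds):
    # Pseudo-residuals: negative gradient of log loss w.r.t. the current scores.
    residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
    stump = fit_stump(x, residuals)
    stumps.append(stump)
    scores = [s + learning_rate * predict_stump(stump, xi) for s, xi in zip(scores, x)]
    losses.append(log_loss(y, scores))

print("log loss before boosting:", round(losses[0], 4))
print("log loss after boosting: ", round(losses[-1], 4))
```

Each round nudges the scores in the direction that reduces the loss, so the training loss recorded in `losses` does not increase from one round to the next in this setup.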
The contribution of each tree to the final model is controlled by a learning rate, a parameter that determines how quickly the model approaches the optimum structure. Regularization techniques such as shrinkage (reducing the step size toward the ultimate model), limiting tree complexity, and random subsampling are used in GBC to prevent overfitting and improve model generalization.

XGBoost builds on the foundation of gradient boosting by introducing several optimizations aimed at improving the efficiency, speed, and scalability of the model (Chen & Guestrin, 2016). It introduces regularization parameters, for example, one for tree pruning and another for step size shrinkage, to reduce overfitting by penalizing complexity, thereby improving the model's generalization capability. A key feature of XGBoost is its sparsity-aware algorithm, designed to handle missing data and zero-valued entries efficiently, learning the optimal branching direction for missing values to boost model accuracy. Additionally, XGBoost incorporates advanced options such as monotonic constraints and feature interaction constraints, allowing for tailored model adjustments to suit specific domain requirements. To expedite the identification of optimal split points in trees, it employs a compressed column-based data structure and a quantile sketch algorithm, which together streamline the process of finding split candidates and significantly curtail training duration.
The integration of parallel processing further accelerates tree construction, while a suite of features for model evaluation and optimization, including built-in cross-validation and early stopping, enhances the overall effectiveness and precision of the model (Chen & Guestrin, 2016).

The main differences between traditional GBC and XGBoost lie in XGBoost's advancements in the boosting framework. XGBoost goes further by improving regularization and computational efficiency, especially in processing sparse data, and designing an architecture that speeds up computations (Bentéjac et al., 2021). While GBC is relatively simpler, it might struggle with handling large datasets or complex models due to its demand for computational and memory resources; XGBoost is more flexible in these scenarios through its optimized tree construction, effective tree pruning, parallel processing, and superior handling of missing values.

Artificial Neural Networks (ANNs)

ANNs operate by mimicking the structure and function of the human brain's neural networks, consisting of interconnected artificial neurons or “units” organized into layers. These layers include an input layer, one or more hidden layers, and an output layer. The heart of ANNs lies in their artificial neurons, which simulate the behavior of biological neurons. Each unit receives weighted inputs, which are the outputs of the previous layer's units, performs a computation, and transmits its output to the next layer. Activation functions, such as the sigmoid, the rectified linear unit (ReLU), or the hyperbolic tangent, are applied to the weighted sum of inputs to introduce nonlinearity and transform it into an output. The activation function determines the neuron's response to the input and plays a crucial role in shaping the network's behavior (Jain et al., 1996).
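The computation performed by each unit, a weighted sum of inputs passed through an activation function, can be sketched as a forward pass through a tiny network. The weights, biases, and layer sizes below are hypothetical values chosen for readability, and the helper `dense_layer` is introduced only for illustration.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """One layer: every unit applies the activation to its weighted input sum."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


x = [1.0, 2.0]  # two hypothetical input predictors

# Hidden layer with two ReLU units (weights are illustrative, not estimated).
hidden = dense_layer(x, weights=[[0.5, 0.25], [-1.0, 0.25]], biases=[0.0, 0.0],
                     activation=relu)

# Output layer with one sigmoid unit, read as a predicted probability.
output = dense_layer(hidden, weights=[[2.0, 1.0]], biases=[-2.0],
                     activation=sigmoid)

print("hidden activations:", hidden)
print("predicted probability:", output[0])
```

The second hidden unit receives a negative weighted sum and is switched off by ReLU, which is the kind of nonlinearity the activation function introduces.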
This layered structure allows ANNs to learn hierarchical representations of input data and capture complex relationships. During training, ANNs adjust the weights to minimize a defined objective or loss function through a process that involves a forward pass, where inputs propagate through the network, and outputs are compared to the desired outcomes. The resulting prediction errors are used to iteratively update the weights, aiming to reduce the overall loss. Backpropagation, the primary training algorithm for ANNs, calculates the gradients of the loss function with respect to the weights, and gradient descent uses these gradients to guide the weight updates, enabling the network to learn from the training data.

Various techniques are employed to enhance the performance and generalization of ANNs. Regularization methods such as L1 and L2 regularization prevent overfitting by adding penalty terms to the loss function. Dropout, another regularization technique, randomly deactivates a fraction of units during training to improve network robustness, ensuring the network learns more general patterns that are not dependent on the presence of specific neurons.

Feature extraction with Shapley Additive exPlanations values

Predictive algorithms assess the importance of predictors by observing the increase in prediction error when each predictor is permuted, with a higher error signifying greater importance. However, this measure of importance is model-specific and may not reliably account for predictor interdependencies (Lundberg & Lee, 2017). Shapley values (Shapley, 1953), stemming from cooperative game theory, provide a nuanced explanation of each predictor's contribution, taking into account interactions with other predictors. These values, known as Shapley Additive exPlanations (SHAP), explain how each predictor influences the deviation of the actual prediction from the mean. Introduced by Lloyd Shapley in 1953 and adapted for machine learning, Shapley values are calculated as the mean marginal contribution of a predictor across all possible predictor combinations, offering a more holistic and interpretable assessment than traditional importance factors (Lundberg & Lee, 2017).

Formally, the Shapley value at an observation $i$ for a predictor $x_j$ from the set of all predictors $K$ is denoted as:

$$\phi_{ij} = \sum_{S \subseteq K \setminus \{j\}} \frac{|S|!\,\left(|K| - |S| - 1\right)!}{|K|!} \left[ \hat{y}_i\left(x_{S \cup \{j\}}\right) - \hat{y}_i\left(x_S\right) \right]. \qquad (2)$$

These values are calculated by summing over all subsets $S \subseteq K \setminus \{j\}$ of the set of predictors that exclude the $j$-th predictor. The difference $\hat{y}_i\left(x_{S \cup \{j\}}\right) - \hat{y}_i\left(x_S\right)$ represents the prediction gap caused by adding the $j$-th predictor to the model. Each gap is weighted by the number of permutations of predictors in which it can occur, relative to all permutations, as given by the ratio of factorials in the formula. The Shapley value for a predictor is the aggregate of these weighted prediction gaps (Lundberg & Lee, 2017).

Thus, the dimension of the Shapley value matrix corresponds to the number of observations by the number of predictors, as Shapley values are computed for each predictor with respect to each observation. Therefore, the sum of Shapley values across all predictors for a given observation should equal the difference between the prediction for that specific observation and the average prediction over the dataset (or the model's expected value if a baseline has been defined).
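For a handful of predictors, Equation (2) can be evaluated exactly by brute force, which also makes the additivity property easy to check. The sketch below is illustrative only: the toy prediction function, the observation, and the baseline are hypothetical, and the prediction for a subset $S$ is obtained by setting predictors outside $S$ to their baseline values, which is one common convention and an assumption here rather than the procedure used in the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one observation, enumerating all subsets S."""
    k = len(x)

    def value(subset):
        # Assumption: predictors outside the subset take their baseline value.
        z = [x[j] if j in subset else baseline[j] for j in range(k)]
        return predict(z)

    phi = []
    for j in range(k):
        others = [m for m in range(k) if m != j]
        total = 0.0
        for size in range(k):
            # Weight |S|! (|K| - |S| - 1)! / |K|! from Equation (2).
            weight = factorial(size) * factorial(k - size - 1) / factorial(k)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical prediction function with an interaction between z[0] and z[2]."""
    return 2.0 * z[0] + 1.0 * z[1] + 3.0 * z[0] * z[2]


x = [1.0, 1.0, 1.0]          # observation to explain (made up)
baseline = [0.0, 0.0, 0.0]   # reference values (made up)

phi = shapley_values(toy_model, x, baseline)
print("Shapley values:", phi)
print("sum of Shapley values:", sum(phi))
print("prediction minus baseline prediction:", toy_model(x) - toy_model(baseline))
```

In this toy case the interaction term is split equally between the two predictors involved. The number of subsets grows exponentially with the number of predictors, which is why exact enumeration is impractical for larger models.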
This ensures that the contributions of all predictors, together with the baseline, add up to the actual prediction for each observation.

Shapley values show the relationship between predictors and the outcome by quantifying the change in the predicted value associated with each predictor. A positive Shapley value implies that the predictor's inclusion increases the predicted outcome relative to the average prediction. Conversely, a negative Shapley value denotes a decrease in the predicted outcome when the predictor is included. A zero Shapley value indicates no change from the overall average prediction. The magnitude of a Shapley value signifies the strength of a predictor's impact. Due to the computational intensity of considering all predictor combinations and orderings, we compute Shapley values exclusively for the model with the best out-of-sample performance.

Model evaluation

Evaluating the performance of machine learning models is crucial in assessing their effectiveness. Several performance metrics are commonly used to measure the quality of classification models, including Accuracy, Precision, Recall, F1 Score, and Cohen's Kappa (Amin et al., 2021; Villacis et al., 2023).

Accuracy is a widely used metric that measures the overall correctness of predictions made by a classification model. Higher accuracy indicates a higher proportion of correct predictions, and it is calculated as the ratio of correctly classified instances to the total number of instances:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad (3)$$

where TP, TN, FP, and FN stand for True Positive, True Negative, False Positive, and False Negative, respectively.

Precision measures the proportion of correctly predicted positive instances out of all instances predicted as positive. It quantifies the model's ability to avoid false positives. Thus, higher precision indicates fewer false positives among the predicted positives. The measure of precision is important in applications where false positives are costly or undesirable, and it is calculated as follows:

$$\text{Precision} = \frac{TP}{TP + FP}. \qquad (4)$$

Recall, also known as sensitivity or true positive rate, measures the proportion of correctly predicted positive instances out of all actual positive instances. It quantifies the model's ability to identify positive instances. Thus, higher recall indicates a lower false negative rate. The measure of recall is important in applications where false negatives are costly or undesirable, and it is calculated as follows:

$$\text{Recall} = \frac{TP}{TP + FN}. \qquad (5)$$

F1 Score combines precision and recall into a single metric via the harmonic mean. It provides a balanced measure between precision and recall, and it is particularly useful when classes are imbalanced. The F1 score is important in applications where both false positives and false negatives need to be minimized, and it is calculated as follows:

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \qquad (6)$$

Cohen's Kappa (Cohen, 1960) is a statistical measure that assesses the agreement between two raters (in this case, the model prediction and the true category) beyond chance agreement. It considers the observed agreement and the expected agreement due to chance, and it is calculated as follows:

$$\text{Kappa} = \frac{p_o - p_e}{1 - p_e},$$

where $p_o$ and $p_e$ represent observed and expected probabilities, respectively.
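As a worked example, the five metrics can be computed directly from the four cells of a confusion matrix. The counts below are hypothetical and are not taken from the study; the chance agreement $p_e$ is computed from the row and column totals, following Cohen's standard definition.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, F1, and Cohen's Kappa from confusion counts."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    p_o = accuracy  # observed agreement between predictions and true classes
    # Expected agreement by chance, from predicted and actual class totals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_o - p_e) / (1 - p_e)

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa}


# Hypothetical confusion matrix for 100 households.
metrics = classification_metrics(tp=40, tn=30, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.70 while Kappa is 0.40, showing how Kappa discounts the agreement that would be expected by chance alone.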
Kappa values range from −1 to 1, where 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance (Amin et al., 2021; Warrens, 2015).

Preprocessing

To ensure the accuracy and reliability of our machine learning models, we took two essential steps in preparing the predictor variables for training. First, we centered and scaled the variables. This helps to remove any potential biases caused by varying scales of the variables and ensures that all variables contribute equally to the model. By doing this, we prevent any variable from dominating the prediction process simply because it has larger values. Second, we checked the correlation among the variables in use. High correlations between variables may affect model stability and interpretation and mislead importance factors. Figures S3 and S4 of the Supporting Information show the correlation among the variables we use. None of the variable pairs had a correlation close to one or negative one.

Addressing class imbalance

When a variable exhibits imbalanced classes, it means that one class contains significantly more samples than the other. For instance, under the broad and strict interpretations of resilience, the average values of the resilience indicator are 0.60 and 0.51, respectively (see Tables 3 and S1). This implies that there are more resilient households than non-resilient ones.
However, such an imbalance can pose challenges as machine learning models may become biased toward the majority class, leading to suboptimal performance for the minority class (Amin et al., 2021). To address this issue, we employ the Synthetic Minority Over-sampling Technique (SMOTE) (Chawla et al., 2002). The SMOTE function is used to balance the classes by creating synthetic observations of the minority class. This is achieved by selecting examples that are close to each other in the feature space, drawing a line between the examples in the feature space, and drawing a new sample at a point along that line. In the context of our analysis, SMOTE is applied to the training data to balance the classes (see footnote 10).

Hyperparameter tuning

The logit model does not require hyperparameter tuning. In Random Forest, the primary hyperparameters being tuned are the number of decision trees in the forest, the maximum depth of each decision tree, and the minimum number of samples required to split an internal node. To find the optimal combination of these hyperparameters, we define a grid of values for each parameter and systematically evaluate the model's performance across all possible combinations. Similarly, in GBC and XGBoost, the hyperparameters under consideration include the number of boosting rounds, the maximum depth of each individual tree, and the step size shrinkage (learning rate) for each boosting round. For ANNs, we focus on tuning the sizes of the hidden layers in the network, the activation function used in these layers, and the L2 regularization parameter. Overall, the process involves defining grids of parameter values for each model and systematically evaluating the models' performance across various combinations. This allows us to identify the set of hyperparameter values that maximizes the models' predictive capabilities.
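The grid search described above amounts to enumerating every combination of candidate values and keeping the best-scoring one. The sketch below shows that loop under stated assumptions: the grid values are hypothetical and not those used in the study, and `toy_score` is a made-up stand-in for the cross-validated accuracy that would be obtained by actually training a model with each combination.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid for a boosting model (values are illustrative only).
param_grid = {
    "n_trees": [100, 300, 500],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1],
}


def toy_score(params):
    """Stand-in for cross-validated accuracy; its peak is placed arbitrarily."""
    return (-abs(params["n_trees"] - 300) / 1000
            - abs(params["max_depth"] - 5) / 10
            - abs(params["learning_rate"] - 0.1))


best_params, best_score = grid_search(param_grid, toy_score)
n_combinations = len(list(product(*param_grid.values())))
print(f"evaluated {n_combinations} combinations; best: {best_params}")
```

Because the number of combinations is the product of the grid sizes, adding a hyperparameter or a candidate value multiplies the number of models that must be trained.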
RESULTS

In this section, we present our research findings, showcasing the use of machine learning models for resilience prediction and subsequently identifying key predictors. First, in Table 4, we present results of the performance statistics for the different machine learning models discussed in Section 2 and employed for resilience prediction. Second, Figures 3 and 4 provide the results of feature extraction and identify the key predictors of resilience.

Predictive performance

This subsection discusses the performance statistics for different machine learning models in predicting resilience among sampled African households. Recall that resilience is measured as a binary variable, where a value of 1 indicates a resilient household, and 0 denotes a lack of resilience. The predictor variables used in our analysis include household income, assets, demographics, agricultural and labor activities, and country-level factor variables. To evaluate the performance of the machine learning models in predicting household resilience, we present the corresponding performance statistics in Table 4. This includes measures of accuracy, precision, recall, F1-score, and Cohen's Kappa for each model, respectively, for model validation and out-of-sample testing tasks.

TABLE 4 Performance of machine learning models.

| Panel | Response | Model | Accuracy | Recall | Precision | F1 | Kappa |
|---|---|---|---|---|---|---|---|
| A: Validation performance | Resilience status: Broad interpretation | GBC | 0.744 | 0.78 | 0.791 | 0.79 | 0.469 |
| | | Random Forest | 0.733 | 0.77 | 0.781 | 0.78 | 0.445 |
| | | XGBoost | 0.723 | 0.781 | 0.762 | 0.77 | 0.419 |
| | | Logistic | 0.692 | 0.705 | 0.762 | 0.73 | 0.37 |
| | | ANN | 0.682 | 0.729 | 0.737 | 0.73 | 0.339 |
| | Resilience status: Strict interpretation | GBC | 0.759 | 0.776 | 0.759 | 0.77 | 0.518 |
| | | Random Forest | 0.748 | 0.745 | 0.758 | 0.75 | 0.496 |
| | | XGBoost | 0.747 | 0.757 | 0.75 | 0.75 | 0.494 |
| | | ANN | 0.721 | 0.739 | 0.721 | 0.73 | 0.441 |
| | | Logistic | 0.714 | 0.735 | 0.714 | 0.72 | 0.427 |
| B: Test performance | Resilience status: Broad interpretation | GBC | 0.775 | 0.804 | 0.817 | 0.81 | 0.534 |
| | Resilience status: Strict interpretation | GBC | 0.81 | 0.843 | 0.796 | 0.82 | 0.618 |

Note: Accuracy, precision, recall, F1 score, and Cohen's Kappa are commonly used performance metrics in machine learning (the higher the better). Accuracy measures overall correctness; Precision focuses on true positives out of all positives predicted; Recall captures the true positive rate; F1 score combines Precision and Recall; Cohen's Kappa assesses agreement beyond chance. The validation performance is from tenfold cross-validation, and the test performance is based on the 20% out-of-sample data from Ethiopia, Malawi, Nigeria, and Uganda. See Section 2 for more details. Table S5 shows the gains of ML models.

FIGURE 3 Top 20 predictors of household resilience under the broad definition. The figure contains scatterplots for each predictor, illustrating respective SHAP values. Each dot on the scatterplot represents a SHAP value corresponding to an observation, with the color denoting the feature or predictor's value: Red indicates high values, while blue denotes low values. The horizontal position of a dot reflects the impact of that value on the model's prediction, with positive (negative) values suggesting a higher (lower) likelihood of resilience. The top 20 predictors are presented here based on their mean absolute SHAP values, reported on the y-axis. These mean values are multiplied by the sign of correlation between the predictor and its SHAP values, indicating the direction of association between resilience and the predictor. For detailed results, please refer to Section 3.

FIGURE 4 Top 20 predictors of household resilience under the strict definition. The figure contains scatterplots for each predictor, illustrating respective SHAP values. Each dot on the scatterplot represents a SHAP value corresponding to an observation, with the color denoting the feature or predictor's value: Red indicates high values, while blue denotes low values. The horizontal position of a dot reflects the impact of that value on the model's prediction, with positive (negative) values suggesting a higher (lower) likelihood of resilience. The top 20 predictors are presented here based on their mean absolute SHAP values, reported on the y-axis. These mean values are multiplied by the sign of correlation between the predictor and its SHAP values, indicating the direction of association between resilience and the predictor. For detailed results, please refer to Section 3.

Table 4 shows that the GBC stands out as the top-performing model in predicting household resilience, according to the results from a tenfold cross-validation. It exhibits the highest overall accuracy, with scores of 0.744 and 0.759 for broad and strict definitions of resilience, respectively. This measure of accuracy indicates GBC's effectiveness in correctly predicting both resilient and non-resilient households, outperforming more complex models such as XGBoost and ANN. GBC excels in accuracy and demonstrates a well-balanced performance in terms of precision and recall, where precision is the ratio of true positives to all positive predictions and recall, or the true positive rate, quantifies how many actual positives were identified correctly. This balance results in high F1 scores of 0.785 and 0.767 under the broad and strict interpretations of resilience, respectively; as the harmonic mean of precision and recall, F1 is a robust indicator of model reliability.

Cohen's Kappa, which evaluates the agreement between the observed and predicted classifications beyond chance, further solidifies GBC's superiority with scores of 0.469 and 0.518 under broad and strict interpretations of resilience. This suggests that the GBC model performs considerably better than random chance. Therefore, we select GBC as the final model for the out-of-sample prediction of household resilience.

When testing the out-of-sample performance using the held-out 20% of the data, the GBC maintains its strong performance with accuracy scores of 0.775 and 0.81 for broad and strict resilience definitions, respectively. That is, about four out of five households are correctly predicted to be resilient or non-resilient in the test dataset.
GBC also continues to exhibit consistent precision and recall across both interpretations. The model's agreement beyond chance, as indicated by the Kappa statistic, reaches 0.534 and 0.618, respectively (Table 4), underscoring the model's robustness (see footnote 11).

It is important to note that our model predicts the strict interpretation of resilience slightly more accurately than the broad interpretation of resilience. This is because the broad interpretation includes households that transitioned from food insecure to food secure, which introduces some noise into the prediction process. Nevertheless, the similar performance metrics across definitions of resilience, as shown in Table 4, indicate that our model is robust across different interpretations of resilience.

Table S5 shows a comparison of different models based on their gains over a naive baseline (e.g., the mean) and Logistic Regression in predicting resilience. The results indicate that machine learning models generally offer considerable improvements over both a naive baseline (about 50% for GBC) and Logistic Regression (about 16% for GBC), especially in terms of predictive accuracy. While the ML methods deployed in this study provide robust predictive capabilities, it is imperative to consider their limitations in interpretability and the higher demands on computational resources and expertise. The choice of model should be guided not only by predictive accuracy but also by the specific needs and constraints of the application context.

For our task, however, the GBC provides the most balanced and highest performance across various metrics, namely accuracy in terms of identifying resilient households as resilient and non-resilient households as non-resilient. These attributes mark the GBC as the model of choice for the subsequent analysis phase, where we will explore and extract the features that characterize resilient households.
Feature extraction

Figures 3 and 4 present SHAP summary plots derived from the GBC, illustrating the top 20 predictors of household resilience during the pandemic. Each dot corresponds to the SHAP value for a predictor for a particular observation, with its value shown on the x-axis. Interested readers can refer to Figure S5 to understand how these dots contribute to the overall prediction through SHAP values. To prevent overlapping, some dots are jittered to demonstrate the concentration of data points.

The color of the dots indicates the value of the predictor, with red representing higher values and blue indicating lower ones. A pattern of red dots to the right of zero suggests that higher values of a predictor have higher SHAP values, that is, the predictor has a positive correlation with the SHAP values (+). Hence, its association with household resilience is positive. Conversely, blue dots to the right of zero would imply a negative (−) correlation between the predictor and its SHAP values. Hence, its association with household resilience is negative. The mean absolute SHAP values are multiplied by the sign of this correlation and placed in parentheses for improved readability. These values are relative and do not have much interpretability in their absolute sense.

The country-level control variable emerges as a significant determinant under both definitions of resilience. This is expected given its encapsulation of country-specific unobserved heterogeneities that can influence household food security outcomes.
Indeed, country-level factors are substantial during a global event such as a pandemic, which affects regions in disparate ways based on local policy, healthcare infrastructure, and economic resilience.

We are more interested in household-level predictors. These predictors are similar under both interpretations of resilience (see Figures 3 and 4). For instance, the possession of an account with a financial institution presents the strongest positive association with resilience (+0.0033 and +0.0039, respectively, under the broad and strict interpretations of resilience). This indicates that households with access to financial institutions tended to be resilient in the data (Belayeth Hussain et al., 2019; Islam et al., 2016; Sakyi-Nyarko et al., 2022). Similarly, the adoption of agricultural mechanization, as evidenced by the use of tractors, aligns positively with resilience, indicating that the use of a tractor is one of the most identifiable features of resilient households (Amare & Endalew, 2016; Daum & Birner, 2020; Emami et al., 2018). Following this, the asset index emerges as a strong predictor of household resilience, likely reflecting the capacity to withstand shocks, as discussed in the existing literature (Ansah et al., 2019; Hidrobo et al., 2018; Manlosa et al., 2019). Other important features of resilient households include wage work, access to improved toilets, rental income, land size, ownership of equines, and cash crop cultivation, in that order. These predictors collectively reflect the economic resources, stable income, and ownership characteristics of households, which can be useful in sustaining their food consumption patterns during the pandemic.

On the other hand, higher values of some predictors are associated with lower SHAP values, that is, the association between the predictor and the response variable is negative, as indicated by negative signs in parentheses (Figures 3 and 4).
For example, ownership of nonfarm family enterprises, receipt of remittances or assistance, and the use of exchange or free labor have many negative SHAP values with their corresponding dots in red, suggesting that higher values of these predictors are negatively associated with household resilience.

Upon closer inspection of these predictors, the negative association for the operation of a nonfarm family enterprise potentially reflects the broader economic downturn's impact on such businesses. The practice of cultivating a larger variety of crops is also slightly negatively correlated with resilience. Although speculative, this suggests that crop diversification at the household level, typically a risk mitigation strategy, may have presented challenges during the pandemic due to market disruptions. It is also intuitive that households receiving remittances or assistance were less likely to remain resilient during the pandemic, as these sources of support may have been insufficient or disrupted. Similarly, households utilizing exchange or free labor demonstrate a low predicted chance of resilience, perhaps because opportunities like these may have been adversely affected by the pandemic.

Among other factors, the use of fertilizer and ownership of a dwelling exhibit negative and positive associations, respectively, with household resilience. However, their respective SHAP values are heavily concentrated around zero, indicating that they may not exert a strong influence on the prediction task.
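The signed scores reported in parentheses in Figures 3 and 4 combine two quantities: the mean absolute SHAP value of a predictor and the sign of the correlation between the predictor and its SHAP values. A minimal sketch of that calculation follows. The predictor names and all numbers are hypothetical, and the sign of the Pearson correlation is taken from the sign of the covariance, which is equivalent whenever both series vary.

```python
from statistics import mean


def correlation_sign(xs, ys):
    """Sign (+1, 0, or -1) of the covariance, hence of the Pearson correlation."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return (cov > 0) - (cov < 0)


def signed_importance(feature_values, shap_values):
    """Mean absolute SHAP value multiplied by the sign of the correlation."""
    magnitude = mean(abs(s) for s in shap_values)
    return correlation_sign(feature_values, shap_values) * magnitude


# Hypothetical binary predictors and SHAP values for four households.
has_account = [0, 0, 1, 1]
shap_account = [-0.004, -0.002, 0.003, 0.005]

has_enterprise = [0, 1, 0, 1]
shap_enterprise = [0.002, -0.004, 0.004, -0.002]

scores = {
    "account": signed_importance(has_account, shap_account),
    "enterprise": signed_importance(has_enterprise, shap_enterprise),
}
# Rank predictors by magnitude, keeping the sign for the direction of association.
ranking = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
print(scores)
print("ranking:", ranking)
```

Ranking by magnitude while reporting the sign mirrors how the figures order the predictors and indicate the direction of association.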
The feature extraction analysis highlights several key predictors that play a crucial role in predicting resilience within our study region. Notably, factors related to financial inclusion, agricultural mechanization (e.g., utilizing tractors), stable income, and economic resources emerge as significant positive predictors of resilience. Conversely, reliance on external assistance, engagement in nonfarm enterprises, and crop diversification are identified as negative predictors.

Dimension reduction

Note that the feature selection with SHAP values is based on the observed variables only, under the assumption of appropriate preprocessing, no claim of causal inference, and no high collinearity among predictors (Fryer et al., 2021; Kumar et al., 2020; Lundberg & Lee, 2017; Marcílio & Eler, 2020). The SHAP values from our analysis highlight the relative importance of each predictor in detecting household resilience. As indicated by Figures 3 and 4, country-level characteristics emerge as the most influential factor, with the importance of each subsequent predictor diminishing. The diminishing values of importance allow us to refine our predictor set. For instance, as shown in Figure 3 (broad interpretation of resilience), the mean absolute SHAP value declines from 0.0087 for the country variable to 0.0033, and then decreases steadily until “Total land size owned (ha).” Beyond this point, such as with the addition of “rental income,” there is no substantial gain in prediction accuracy, indicating an optimal stopping point for the inclusion of predictors.

The out-of-sample results of this model, with the number of predictors reduced from 29 to only 9, are presented in Table 5. The accuracy stands at 0.772 compared to the previously derived 0.775 (Table 4). Although recall declines slightly, precision increases, keeping the F1 score and Kappa statistic approximately the same.
Thus, we can predict the broad interpretation of household resilience using only nine predictors while still achieving about 77% out-of-sample prediction accuracy.

TABLE 5 Performance of machine learning models with reduced dimension.

| Response | Model | Accuracy | Recall | Precision | F1 | Kappa |
|---|---|---|---|---|---|---|
| Test performance | | | | | | |
| Resilience status: Broad interpretation | GBC | 0.772 | 0.783 | 0.826 | 0.804 | 0.531 |
| Resilience status: Strict interpretation | GBC | 0.79 | 0.815 | 0.783 | 0.798 | 0.579 |

Note: Accuracy, precision, recall, F1 score, and Cohen's Kappa are commonly used performance metrics in machine learning (the higher the better). Accuracy measures overall correctness; Precision focuses on true positives out of all positives predicted; Recall captures the true positive rate; F1 score combines Precision and Recall; Cohen's Kappa assesses agreement beyond chance. The validation performance is from tenfold cross-validation, and the test performance is based on the 20% out-of-sample data from Ethiopia, Malawi, Nigeria, and Uganda. See Section 2 for more details.

A similar exercise for Figure 4 (strict interpretation of resilience) shows that the mean absolute SHAP values rapidly decrease up to the variable “access to improved toilet,” then decline more gradually. Overall, Figures 3 and 4 indicate that the main difference between the top predictors of the broad and strict interpretations of resilience is the ninth predictor: the “access to improved toilet” variable is relevant for the strict interpretation, while “land ownership” is more relevant for the broad interpretation.
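The ranking step behind this dimension reduction can be sketched as follows: average the absolute SHAP value of each predictor across households, sort the predictors by that average, and keep the top k. This is a minimal illustration, not the authors' implementation. The function names, the feature names, and the SHAP values below are all hypothetical, and in practice the SHAP matrix would come from an explainer applied to the fitted model.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, largest first.

    shap_values: list of rows, one per household; each row holds one
                 SHAP value per feature, in the order of feature_names.
    Returns a list of (feature_name, mean_abs_shap) tuples.
    """
    n_rows = len(shap_values)
    importance = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in shap_values) / n_rows
        importance.append((name, mean_abs))
    return sorted(importance, key=lambda item: item[1], reverse=True)


def select_top_k(ranked, k):
    """Keep the names of the k most important features."""
    return [name for name, _ in ranked[:k]]


# Hypothetical SHAP values for three households and four predictors
# (made-up numbers for illustration; not results from the study).
feature_names = ["country", "bank_account", "tractor_use", "rental_income"]
shap_values = [
    [0.010, -0.004, 0.002, 0.000],
    [-0.008, 0.003, -0.003, 0.001],
    [0.009, -0.002, 0.001, 0.000],
]

ranked = rank_by_mean_abs_shap(shap_values, feature_names)
selected = select_top_k(ranked, k=3)
print(ranked)
print(selected)
```

The cutoff k is a judgment call. In the text it is chosen where the mean absolute SHAP values level off and adding further predictors no longer improves out-of-sample accuracy.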
In Table 5, we also present the out-of-sample performance of the GBC using these nine predictors for the strict interpretation of resilience. Although the statistics generally decrease from the full model (as seen in Table 4), the decline is minimal. Accuracy decreases from 81% to 79%. Recall and precision also decrease, affecting the F1 score and lowering the Kappa statistic from 0.618 to 0.579. Nonetheless, the results indicate that these nine variables can predict the strict interpretation of resilience with high accuracy, nearly four out of five times.

In summary, the nine predictors (country-level control, account from financial institutions, use of tractors, asset index, number of crops cultivated, ownership of a nonfarm family enterprise, recipient status of remittance or assistance, percentage of working adults in wage work, and access to an improved toilet) are major indicators of strict resilience. Note that land ownership, which was relevant in predicting broad resilience, is not as relevant in predicting strict resilience. A possible explanation is that land ownership may have characterized households transitioning from food insecurity to food security, thus explaining their resilience. Still, it is less telling of the resilience among households who remained food secure throughout.

DISCUSSION AND POLICY IMPLICATIONS

In this study, we propose a machine learning framework to predict resilience, and therefore well-being dynamics, in regions affected by detrimental shocks stemming from the recent global pandemic. By integrating relevant concepts from the food security literature, our approach contributes to the comprehension of resilience, a concept that has gained significant importance in the field of development economics and has become a focal point for international development and humanitarian agencies in the past decade.
The paper starts with a discussion of how we frame resilience in the face of adverse shocks using a normative approach and drawing upon methodologies for assessing food security proposed by the Food and Agricultural Organization (FAO). By anchoring our definition of resilience as a normative outcome and indexing it to the Sustainable Development Goals (SDGs; see endnote 12), we ensure that our approach remains a pro-poor concept (Barrett et al., 2021; Barrett & Constas, 2014). Subsequently, our application tests a battery of different machine learning algorithms and explores their capabilities in predicting household resilience status. We find that the machine learning models are able to identify eight out of ten resilient households.

Our dimension reduction analysis has identified eight major household features, given the country characteristics, that can help distinguish households likely to be resilient during a disease-induced pandemic from those that may not be. In this context, recognizing and emphasizing the significance of these predictors can substantially aid in preemptively identifying households that could be vulnerable during a crisis. Moreover, these factors are critical for households across different countries. While there will inevitably be country-level differences, these factors warrant increased focus at the household level to better cushion future shocks.

Specifically, for smallholder farmers from the selected group of African nations in our study, having access to financial institutions emerges as the most crucial predictor of resilience,
followed by the adoption of agricultural mechanization, as evidenced by the use of tractors. Complementing these predictors are risk and income diversification strategies, as evidenced by the number of crops cultivated, the ownership of nonfarm enterprises, and asset ownership.

These results from our feature extraction exercise and identification of key predictors of resilience are in line with a growing literature linking resilience with financial inclusion (Arouri et al., 2015; Belayeth Hussain et al., 2019; Islam et al., 2016; Islam & Maitra, 2012; Jordan, 2015; Khandker et al., 2012; Sakyi-Nyarko et al., 2022), resilience with agricultural mechanization (Amare & Endalew, 2016; Daum & Birner, 2020; Emami et al., 2018), and resilience with household assets (Ansah et al., 2019; Gilligan & Hoddinott, 2007; Guo, 2011; Hidrobo et al., 2018; Little & Ahmad, 2002; Manlosa et al., 2019). However, it is important to note that the characterization of resilient and non-resilient households presented in this study must be interpreted with caution. While our models identify features helpful for prediction, these characteristics should be interpreted in the context of the limitations of our data and methodology.

Leveraging multi-country data, our study extends beyond the scope of prior single-country analyses, enhancing the external validity of our findings. This approach is pivotal in substantiating the role of household assets and access to financial institutions as critical factors in determining household resilience, corroborating existing literature. However, our results also underline the importance of considering country-specific nuances in resilience studies.
The correlations observed through our predictive modeling exercises should not be misconstrued as causality, a distinction of paramount importance when considering the implications of our findings for policy formulation. As such, we aim to provide valuable insights to policymakers that, while informed by our analysis, acknowledge the limitations of our study: (i) regarding causality and (ii) recognizing the methodological constraints inherent in predictive modeling. Based on this context, the policy implications discussed next should be viewed as exploratory, guiding future empirical inquiries rather than prescribing definitive actions.

The empirical associations we identify call for targeted pilot initiatives, focusing on key household-level predictors such as financial access, asset ownership, diversification in risk and income, and the uptake of agricultural mechanization. Rigorous evaluation of these interventions, including randomized controlled trials (RCTs), will be essential for elucidating their effects on resilience, potentially paving the way to establishing causal links. This necessitates policy measures that foster multidisciplinary collaborations combining quantitative and qualitative assessment methods to gain a deep understanding of the mechanisms at play. In addition, policymakers can embrace adaptive frameworks that can evolve based on emerging evidence. This approach would allow for the refinement of strategies in light of ongoing research findings, including those that may establish causality in the future.

Emphasizing continued investment in data collection and analytical capabilities that leverage machine learning and other advanced statistical methods is another important avenue policymakers should prioritize (Villacis & Badruddoza, 2023).
This can improve the understanding of resilience dynamics over time, informing more nuanced policy interventions that can be adjusted as more is learned about the causal relationships between household-level factors and resilience.

The significance of country-level control variables in our results highlights the importance of considering country-specific factors when designing and implementing policy interventions. While our findings do not directly measure the effectiveness of specific policies, they suggest that policymakers should account for the unique socioeconomic and political contexts of each country to enhance the relevance and potential impact of their interventions. While specific household-level predictors are highlighted in our study, policymakers must consider resilience as a multifaceted concept that requires a holistic approach. This means designing policies that not only address the direct predictors identified here but also consider broader social, economic, and environmental systems. Finally, engaging stakeholders and communities from the beginning of any intervention based on our findings is critical. This approach can help to uncover additional insights, foster local ownership of initiatives, and improve the effectiveness of resilience-building efforts.

It is also important to acknowledge that in our specific setting, we implicitly focus on assessing short-term resilience, given that our data points span a period of approximately 7 months.
To overcome this limitation, future research could employ machine learning methods to investigate long-term dynamics by utilizing big data covering extended periods of time. Such an exercise would align with Constas et al.'s (2014) definition of resilience as “the capacity that ensures adverse stressors and shocks do not have long-lasting adverse development consequences.” However, the availability of data may pose constraints on the feasibility of such endeavors. Lastly, the utilization of a normative approach to resilience in our study, anchored in the context of food insecurity, may have inherent limitations. This normative approach could potentially constrain the measurement of resilience and conflate different phenomena (Barrett et al., 2021). Therefore, it is crucial to consider this caveat for the broader application of our results, as their applicability will depend on the specific objectives and intended goals of policy interventions.

ACKNOWLEDGMENTS

The senior authorship is shared between Villacis and Badruddoza. The authors thank Chris Barrett, two anonymous reviewers, and Gopinath Munisamy, managing editor at Applied Economic Perspectives and Policy, for constructive comments on a previous draft of this manuscript. We are also grateful to participants at the 2024 Southern Agricultural Economics Association (SAEA) and the 2024 Agricultural and Applied Economics Association (AAEA) Annual Meetings for constructive feedback that helped us improve this paper. The views expressed here are those of the authors and do not necessarily reflect those of donors or the authors' institutions. All errors are our own and the usual disclaimers apply.

ENDNOTES

1 Knippenberg et al. (2019) used LASSO and Random Forest algorithms. Garbero and Letta (2022) employed Classification Trees, Bootstrap Aggregating (bagging), Random Forests, k-nearest Neighbor, and Support Vector Machine algorithms.
A number of studies show that gradient boosting and Neural Network models are more robust (Amin et al., 2021; Bajari et al., 2015; Jain et al., 1996; Mullainathan & Spiess, 2017).
2 Other examples of well-being outcomes include expenditures, consumption, income, assets, poverty, food security indicators (including, but not limited to, dietary diversity indices, coping strategies indices), health indicators (child health, anthropometry, morbidity, mortality), happiness and life satisfaction, equality, marginalization, safety and security, experiences of conflict or violence (Barrett et al., 2021).
3 See the Supporting Information for more information on the high-frequency phone surveys (HFPS) conducted by the World Bank.
4 Additional phone survey rounds are available for these and other countries. For more information, see World Bank (2021).
5 In addition to considering compromised diet quality and reduced food quantity, the eight questions used to construct the FIES scale also capture psycho-social elements associated with anxiety or uncertainty regarding the ability to procure enough food, a facet that other measures do not (FAO, 2017).
6 Table S2 presents the distribution of food-insecure households per country across the two rounds of data collected based on the FIES indicators.
7 For our analysis of the strict interpretation of resilience, we specifically omit households that moved from food insecurity to food security in the latter period.
8 All but three of the predictor variables were collected only during the first round of surveys. The three predictor variables also collected in the second round of surveys are: (i) the number of males aged 15–64, (ii) the number of females aged 15–64, and (iii) overall household size. For our analysis, we combine the information of these variables from the two periods and code them as (i) change in the number of males aged 15–64, (ii) change in the number of females aged 15–64, and (iii) change in the overall household size. See summary statistics in Table 3.
9 The Gini Index is calculated for each group of observations as Gini Index = $1 - (\text{probability of yes})^2 - (\text{probability of no})^2$, and hence gives a measure of how mixed the classes are in that group. A Gini Index of 0 means perfect purity (all instances belong to the same class), and an Index of 0.5 means maximum impurity (instances randomly assigned to classes) (Raileanu & Stoffel, 2004).
10 We also ran the models without class balancing and found the results to be similar, with a slightly greater asymmetry in precision and recall (please see Table S3). The high validation performance but low test performance of models indicate that balancing classes was an appropriate step for this data.
11 The out-of-sample performance metrics for the remaining (suboptimal) models are placed in Table A4.
12 In SDG 2, countries commit to “End hunger, achieve food security and improved nutrition and promote sustainable agriculture” by 2030 (FAO, 2017).

REFERENCES

Amare, Dagninet, and Wolelaw Endalew. 2016. “Agricultural Mechanization: Assessment of Mechanization Impact Experiences on the Rural Population and the Implications for Ethiopian Smallholders.” Engineering and Applied Sciences 1(2): 39–48.
Amin, Modhurima Dey, Syed Badruddoza, and Jill J. McCluskey. 2021. “Predicting Access to Healthful Food Retailers with Machine Learning.” Food Policy 99: 101985.
Ansah, Isaac Gershon Kodwo, Cornelis Gardebroek, and Rico Ihle. 2019. “Resilience and Household Food Security: A Review of Concepts, Methodological Approaches and Empirical Evidence.” Food Security 11(6): 1187–1203.
Arouri, Mohamed, Cuong Nguyen, and Adel Ben Youssef. 2015. “Natural Disasters, Household Welfare, and Resilience: Evidence from Rural Vietnam.” World Development 70: 59–77.
Athey, Susan, and Guido W. Imbens. 2019. “Machine Learning Methods that Economists Should Know About.” Annual Review of Economics 11: 685–725.
Bajari, Patrick, Denis Nekipelov, Stephen P. Ryan, and Miaoyu Yang. 2015. “Machine Learning Methods for Demand Estimation.” American Economic Review 105(5): 481–85.
Balashankar, Ananth, Lakshminarayanan Subramanian, and Samuel P. Fraiberger. 2023. “Predicting Food Crises Using News Streams.” Science Advances 9(9): eabm3449.
Barrett, Christopher B., and Mark A. Constas. 2014. “Toward a Theory of Resilience for International Development Applications.” Proceedings of the National Academy of Sciences of the United States of America 111(40): 14625–30.
Barrett, Christopher B., Kate Ghezzi-Kopel, John Hoddinott, Nima Homami, Elizabeth Tennant, Joanna Upton, and Tong Wu. 2021. “A Scoping Review of the Development Resilience Literature: Theory, Methods and Evidence.” World Development 146: 105612.
Baylis, K., T. Heckelei, and H. Storm. 2021. “Chapter 83 - Machine Learning in Agricultural Economics.” In Handbook of Agricultural Economics, Vol 5, edited by C. B. Barrett and D. R. Just, 4551–4612. Oxford, UK: Elsevier.
Belayeth Hussain, A. H. M., Noraida Endut, Sumonkanti Das, Mohammed Thanvir Ahmed Chowdhury, Nadia Haque, Sumena Sultana, and Khandaker Jafor Ahmed. 2019. “Does Financial Inclusion Increase Financial Resilience? Evidence from Bangladesh.” Development in Practice 29(6): 798–807.
Bentéjac, Candice, Anna Csörgő, and Gonzalo Martínez-Muñoz. 2021. “A Comparative Analysis of Gradient Boosting Algorithms.” Artificial Intelligence Review 54: 1937–67.
Blake, Paul, and Divyamshi Wadhwa. 2020. “Year in Review: The Impact of COVID-19 in 12 Charts.” https://blogs.worldbank.org/voices/2020-year-review-impact-covid-19-12-charts.
Breiman, Leo. 2001. “Random Forests.” Machine Learning 45: 5–32.
Chawla, Nitesh V., Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. “SMOTE: Synthetic Minority Over-Sampling Technique.” Journal of Artificial Intelligence Research 16: 321–357.
Chen, Tianqi, and Carlos Guestrin. 2016. “Xgboost: A Scalable Tree Boosting System.” 785–794.
Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, and Whitney Newey. 2017. “Double/Debiased/Neyman Machine Learning of Treatment Effects.” American Economic Review 107(5): 261–65.
Cohen, Jacob. 1960. “A Coefficient of Agreement for Nominal Scales.” Educational and Psychological Measurement 20(1): 37–46.
Constas, Mark, Tim Frankenberger, and John Hoddinott. 2014. Resilience Measurement Principles: Toward an Agenda for Measurement Design 1. Resilience Measurement Technical Working Group, Technical Series: Food Security Information Network.
Daum, Thomas, and Regina Birner. 2020. “Agricultural Mechanization in Africa: Myths, Realities and an Emerging Research Agenda.” Global Food Security 26: 100393.
Dreiseitl, Stephan, and Lucila Ohno-Machado. 2002. “Logistic Regression and Artificial Neural Network Classification Models: A Methodology Review.” Journal of Biomedical Informatics 35(5–6): 352–59.
Emami, Mohammad, Morteza Almassi, Hossein Bakhoda, and Issa Kalantari. 2018. “Agricultural Mechanization, a Key to Food Security in Developing Countries: Strategy Formulating for Iran.” Agriculture & Food Security 7: 1–12.
FAO. 2017. “The Food Insecurity Experience Scale: Measuring Food Insecurity Through People's Experiences.” https://www.fao.org/3/i7835e/i7835e.pdf
Foini, Pietro, Michele Tizzoni, Giulia Martini, Daniela Paolotti, and Elisa Omodei. 2023. “On the Forecastability of Food Insecurity.” Scientific Reports 13(1): 2793.
Fryer, Daniel, Inga Strümke, and Hien Nguyen. 2021. “Shapley Values for Feature Selection: The Good, the Bad, and the Axioms.” IEEE Access 9: 144352–60.
Garbero, Alessandra, and Marco Letta. 2022. “Predicting Household Resilience With Machine Learning: Preliminary Cross-Country Tests.” Empirical Economics 63(4): 2057–70.
Gilligan, Daniel O., and John Hoddinott. 2007. “Is There Persistence in the Impact of Emergency Food Aid? Evidence on Consumption, Food Security, and Assets in Rural Ethiopia.” American Journal of Agricultural Economics 89(2): 225–242.
Guo, Baorong. 2011. “Household Assets and Food Security: Evidence From the Survey of Program Dynamics.” Journal of Family and Economic Issues 32: 98–110.
Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer.
Hidrobo, Melissa, John Hoddinott, Neha Kumar, and Meghan Olivier. 2018. “Social Protection, Food Security, and Asset Formation.” World Development 101: 88–103.
Hossain, Marup, Conner Mullally, and M. Niaz Asadullah. 2019. “Alternatives to Calorie-Based Indicators of Food Security: An Application of Machine Learning Methods.” Food Policy 84: 77–91.
Hsiang, Solomon, Daniel Allen, Sébastien Annan-Phan, Kendon Bell, Ian Bolliger, Trinetta Chong, Hannah Druckenmiller, et al. 2020. “The Effect of Large-Scale Anti-Contagion Policies on the COVID-19 Pandemic.” Nature 584(7820): 262–67.
Islam, Asadul, Chandana Maitra, Debayan Pakrashi, and Russell Smyth. 2016. “Microcredit Programme Participation and Household Food Security in Rural Bangladesh.” Journal of Agricultural Economics 67(2): 448–470.
Islam, Asadul, and Pushkar Maitra. 2012. “Health Shocks and Consumption Smoothing in Rural Households: Does Microcredit Have a Role to Play?” Journal of Development Economics 97(2): 232–243.
Jain, Anil K., Jianchang Mao, and K. Moidin Mohiuddin. 1996. “Artificial Neural Networks: A Tutorial.” Computer 29(3): 31–44.
Jones, Lindsey, Mark A. Constas, Nathanial Matthews, and Simone Verkaart. 2021. “Advancing Resilience Measurement.” Nature Sustainability 4(4): 288–89.
Jordan, Joanne Catherine. 2015. “Swimming Alone? The Role of Social Capital in Enhancing Local Resilience to Climate Stress: A Case Study From Bangladesh.” Climate and Development 7(2): 110–123.
Josephson, Anna, Talip Kilic, and Jeffrey D. Michler. 2021. “Socioeconomic Impacts of COVID-19 in Low-Income Countries.” Nature Human Behaviour 5(5): 557–565.
Khandker, Shahidur R., M. A. Baqui Khalily, and Hussain A. Samad. 2012. “Seasonal Hunger and Its Mitigation in North-West Bangladesh.” The Journal of Development Studies 48(12): 1750–64.
Knippenberg, Erwin, Nathaniel Jensen, and Mark Constas. 2019. “Quantifying Household Resilience With High Frequency Data: Temporal Dynamics and Methodological Options.” World Development 121: 1–15.
Kumar, I. E., S. Venkatasubramanian, C. Scheidegger, and S. A. Friedler. 2020. “Problems with Shapley-Value-Based Explanations as Feature Importance Measures.” In Proceedings of the 37th International Conference on Machine Learning (ICML 2020), edited by H. Daume and A. Singh, 5447–56. International Machine Learning Society (IMLS).
Lieslehto, Johannes, Noora Rantanen, Lotta-Maria A. H. Oksanen, Sampo A. Oksanen, Anne Kivimäki, Susanna Paju, Milla Pietiäinen, et al. 2022. “A Machine Learning Approach to Predict Resilience and Sickness Absence in the Healthcare Workforce During the COVID-19 Pandemic.” Scientific Reports 12(1): 8055.
Little, Peter D., and Abdel Ghaffar Muhammad Ahmad. 2002. “Building Assets for Sustainable Recovery and Food Security.” Broadening Access and Strengthening Input Market Systems.
Lundberg, S. M., and S.-I. Lee. 2017. “A Unified Approach to Interpreting Model Predictions.” Advances in Neural Information Processing Systems 30: 4765–74. https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html.
Manlosa, Aisa O., Jan Hanspach, Jannik Schultner, Ine Dorresteijn, and Joern Fischer. 2019. “Livelihood Strategies, Capital Assets, and Food Security in Rural Southwest Ethiopia.” Food Security 11: 167–181.
Maraboli, S. 2009. Life, the Truth, and Being Free. Port Washington, NY: A Better Today Publishing.
Marcílio, W. E., and D. M. Eler. 2020. “From Explanations to Feature Selection: Assessing SHAP Values as Feature Selection Mechanism.” In 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 340–47. IEEE.
Martini, Giulia, Alberto Bracci, Lorenzo Riches, Sejal Jaiswal, Matteo Corea, Jonathan Rivers, Arif Husain, and Elisa Omodei. 2022. “Machine Learning Can Guide Food Security Efforts When Primary Data Are Not Available.” Nature Food 3(9): 716–728.
Mullainathan, Sendhil, and Jann Spiess. 2017. “Machine Learning: An Applied Econometric Approach.” Journal of Economic Perspectives 31(2): 87–106.
Mullan, M., L. Danielson, B. Lasfargues, N. C. Morgado, and E. Perry. 2018. Climate-Resilient Infrastructure: Policy Perspectives. OECD Environment Policy Paper No. 14. Paris: OECD.
Pinstrup-Andersen, Per. 2009. “Food Security: Definition and Measurement.” Food Security 1(1): 5–7.
Raileanu, Laura Elena, and Kilian Stoffel. 2004. “Theoretical Comparison between the Gini Index and Information Gain Criteria.” Annals of Mathematics and Artificial Intelligence 41: 77–93.
Roberts, David L., Jeremy S. Rossman, and Ivan Jarić. 2021. “Dating First Cases of COVID-19.” PLoS Pathogens 17(6): e1009620.
Rudin-Rush, Lorin, Jeffrey D. Michler, Anna Josephson, and Jeffrey R. Bloem. 2022. “Food Insecurity during the First Year of the COVID-19 Pandemic in Four African Countries.” Food Policy 111: 102306.
Sakyi-Nyarko, Carlos, Ahmad Hassan Ahmad, and Christopher J. Green. 2022. “The Gender-Differential Effect of Financial Inclusion on Household Financial Resilience.” The Journal of Development Studies 58(4): 692–712.
Shapley, L. S. 1953. “A Value for n-Person Games.” In Contributions to the Theory of Games. Vol. 2 of Annals of Mathematics Studies, edited by H. W. Kuhn and A. W. Tucker, 307–318. Princeton, NJ: Princeton University Press.
Storm, Hugo, Kathy Baylis, and Thomas Heckelei. 2020. “Machine Learning in Agricultural and Applied Economics.” European Review of Agricultural Economics 47(3): 849–892.
The World Bank. 2021. “COVID-19 High Frequency Phone Survey of Households 2020 - World Bank LSMS Harmonized Dataset.” https://microdata.worldbank.org/index.php/catalog/4072/study-description.
Vabalas, Andrius, Emma Gowen, Ellen Poliakoff, and Alexander J. Casson. 2019. “Machine Learning Algorithm Validation with a Limited Sample Size.” PLoS One 14(11): e0224365.
Villacis, Alexis H., and Syed Badruddoza. 2023. “Using Artificial Intelligence to Predict and Prevent Future Food Insecurity.” Georgetown Journal of International Affairs 24(2): 191–97.
Villacis, Alexis H., Syed Badruddoza, Ashok K. Mishra, and Joaquin Mayorga. 2023. “The Role of Recall Periods when Predicting Food Insecurity: A Machine Learning Application in Nigeria.” Global Food Security 36: 100671.
Wager, Stefan, and Susan Athey. 2018. “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests.” Journal of the American Statistical Association 113(523): 1228–42.
Walsh-Dilley, M., W. Wolford, and J. McCarthy. 2016. “Rights for Resilience: Food Sovereignty, Power, and Resilience in Development Practice.” Ecology and Society 21(1): 11. https://doi.org/10.5751/ES-07981-210111.
Warrens, Matthijs J. 2015. “Five Ways to Look at Cohen's Kappa.” Journal of Psychology & Psychotherapy 5(4): 1.
Yeh, Christopher, Anthony Perez, Anne Driscoll, George Azzari, Zhongyi Tang, David Lobell, Stefano Ermon, and Marshall Burke. 2020. “Using Publicly Available Satellite Imagery and Deep Learning to Understand Economic Well-Being in Africa.” Nature Communications 11(1): 2583.
Zhang, Ying, and Chen Ling. 2018. “A Strategy to Apply Machine Learning to Small Datasets in Materials Science.” npj Computational Materials 4(1): 25.
SUPPORTING INFORMATION

Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Villacis, Alexis H., Syed Badruddoza, and Ashok K. Mishra. 2024. “A Machine Learning-Based Exploration of Resilience and Food Security.” Applied Economic Perspectives and Policy 46(4): 1479–1505. https://doi.org/10.1002/aepp.13475