Early View
ORIGINAL ARTICLE
Open Access

Using machine learning to unveil relevant predictors of adherence to recommended health-protective behaviors during the COVID-19 pandemic in Denmark

Lau Lilleholt (Corresponding Author)

Department of Psychology, University of Copenhagen, Copenhagen, Denmark

Copenhagen Center for Social Data Science (SODAS), University of Copenhagen, Copenhagen, Denmark

Correspondence

Lau Lilleholt, Department of Psychology and Copenhagen Center for Social Data Science (SODAS), University of Copenhagen, Øster Farimagsgade 2A, 1353 Copenhagen, Denmark.

Email: [email protected]

Gretchen B. Chapman

Department of Social and Decision Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

Robert Böhm

Department of Psychology, University of Copenhagen, Copenhagen, Denmark

Copenhagen Center for Social Data Science (SODAS), University of Copenhagen, Copenhagen, Denmark

Faculty of Psychology, University of Vienna, Vienna, Austria

Ingo Zettler

Department of Psychology, University of Copenhagen, Copenhagen, Denmark

Copenhagen Center for Social Data Science (SODAS), University of Copenhagen, Copenhagen, Denmark

First published: 08 June 2024

Abstract

What were relevant predictors of individuals' proclivity to adhere to recommended health-protective behaviors during the COVID-19 pandemic in Denmark? Applying machine learning (namely, lasso regression) to a repeated cross-sectional survey spanning 10 months comprising 25 variables (Study 1; N = 15,062), we found empathy toward those most vulnerable to COVID-19, knowledge about how to protect oneself from getting infected, and perceived moral costs of nonadherence to be strong predictors of individuals' self-reported adherence to recommended health-protective behaviors. We further explored the relations between these three factors and individuals' self-reported proclivity for adherence to recommended health-protective behaviors as they unfold between and within individuals over time in a second study, a Danish panel study comprising eight measurement occasions spanning eight months (N = 441). Results of this study suggest that the relations largely occurred at the trait-like interindividual level, as opposed to at the state-like intraindividual level. Together, the findings provide insights into what were relevant predictors for individuals' overall level of adherence to recommended health-protective behaviors (in Denmark) as well as how these predictors might (not) be leveraged to promote public adherence in future epidemics or pandemics.

INTRODUCTION

In response to the outbreak of the coronavirus disease 2019 (COVID-19), governments worldwide recommended and/or mandated several health-protective behaviors (HPB), such as hand sanitizing, mask wearing, and physical distancing. To aid governments in promoting and maintaining high public adherence (or for pure scientific purposes), researchers rushed to explore what made individuals adhere to these and other recommended HPB. Taking stock, a multitude of factors has been found to relate to individuals' proclivity to adhere to recommended HPB. Prominent examples include gender (Galasso et al., 2020), age (Bailey et al., 2021), pandemic efficacy (i.e. self-efficacy in handling the pandemic; Beeckman et al., 2020), personality characteristics (Zettler et al., 2022), empathy (Pfattheicher et al., 2020), institutional trust (Bargain & Aminjonov, 2020), and COVID-19-related risk perceptions (Dryhurst et al., 2020). Given that most studies have only focused on a few factors at a time (for exceptions, see, e.g., Clark et al., 2020; Wright et al., 2021), it remains unclear, however, which of the many factors are, when considered simultaneously, relevant predictors of individuals' proclivity to adhere to recommended HPB.

Knowing which factors are relevant predictors of individuals' proclivity to adhere to recommended HPB is useful for two reasons. First, it puts findings into perspective by making it possible to critically assess the comparative predictive importance of different factors. Second, it provides researchers and policymakers a succinct overview of which factors they probably should (not) focus on to promote and maintain high public adherence to recommended HPB in future epidemics or pandemics.

Against this background, we herein applied machine learning (namely, lasso regression) to identify which of more than 20 factors that have previously been linked to individuals' proclivity to adhere to recommended HPB were relevant predictors of this tendency during the COVID-19 pandemic in Denmark. We applied this method to data from a repeated cross-sectional survey, which assessed both the factors and recommended HPB throughout the COVID-19 pandemic in Denmark over a 10-month period (N = 15,062; Study 1). Following this, we explored the nature of the relations, as they unfold over time, between individuals' proclivity to adhere to recommended HPB and the three most relevant predictors of this tendency identified in Study 1, using data from a separate Danish panel survey spanning a period of 8 months, comprising participants from the same overall sample population (N = 441; Study 2). Specifically, relying on several random intercept cross-lagged panel models (RI-CLPMs; Hamaker et al., 2015), we explored whether the relations occurred at the trait-like interindividual level, the state-like intraindividual level, or both. By doing so, we provide new insights into how these factors might influence individuals' proclivity to adhere to recommended HPB (in Denmark), as well as how they may or may not be leveraged to promote and maintain high public adherence in future epidemics or pandemics.

Whereas the idea of using machine learning to predict individuals' proclivity to adhere to recommended HPB is not new, the present investigation, in addition to using Danish samples, extends previous research in three ways. First, the present investigation relies on data spanning almost a full year of the COVID-19 pandemic (10 months) rather than just a few months at its onset (Hajdu et al., 2022; Pavlović et al., 2022; Taye et al., 2023; van Lissa et al., 2022). This is crucial because a longer timespan makes it possible to investigate to what extent the comparative predictive importance of different factors changed over the course of the pandemic. Second, whereas previous research has focused mainly on maximizing the predictive accuracy of more complex machine learning models (e.g. random forests; Hajdu et al., 2022; Pavlović et al., 2022; Taye et al., 2023; van Lissa et al., 2022), the present investigation seeks to both maximize predictive accuracy and explicitly identify relevant predictors of individuals' proclivity to adhere to recommended HPB among a set of potential predictors using a simpler and thus more readily interpretable machine learning model, namely, lasso regression (Tibshirani, 1996). As a simple linear regression model that simultaneously performs both feature selection (i.e. identification of relevant predictors) and regularization to reduce overfitting and enhance model accuracy and interpretability, lasso regression is well-suited for this task (Tibshirani, 1996). Third, moving beyond the mere task of identifying relevant predictors among a set of potential predictors and building a predictive model of individuals' proclivity to adhere to recommended HPB, the present investigation further explores how the relations between individuals' proclivity to adhere to recommended HPB and, from all considered predictors of this tendency herein, the three most relevant predictors unfold over time both between and within individuals. 
To this end, we rely on RI-CLPMs, an extension of traditional cross-lagged panel models (CLPMs), which effectively separate trait-like interindividual associations from state-like intraindividual ones, allowing for a critical exploration of how two or more variables interrelate over time at both the trait-like interindividual and the state-like intraindividual level (Usami et al., 2019). Knowing whether the relevant predictors found herein are over time related to individuals' proclivity to adhere to recommended HPB at the trait-like interindividual level, the state-like intraindividual level, or both is crucial because it has important implications for the types of interventions that governments should seek to implement. In particular, if the relations mainly occur at the trait-like interindividual level, governments should, if possible, primarily seek to implement interventions that permanently change citizens' levels in the predictors of interest. If the relations mainly occur at the state-like intraindividual level, by contrast, governments should rather seek to implement interventions that change citizens' current state levels in the predictors of interest.
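
To make the trait-like versus state-like distinction concrete, the decomposition behind a bivariate RI-CLPM can be sketched as follows. The notation is illustrative and chosen here for exposition (it is not taken from the article): x and y stand for a predictor and adherence for person i at occasion t, μ and π are occasion-specific grand means, κ and ω are the stable person-level random intercepts, and x* and y* are the within-person deviations.

$$ x_{it}={\mu}_t+{\kappa}_i+{x}_{it}^{\ast },\kern2em y_{it}={\pi}_t+{\omega}_i+{y}_{it}^{\ast } $$

$$ {x}_{it}^{\ast }={a}_x{x}_{i,t-1}^{\ast }+{c}_{xy}{y}_{i,t-1}^{\ast }+{u}_{it},\kern2em {y}_{it}^{\ast }={a}_y{y}_{i,t-1}^{\ast }+{c}_{yx}{x}_{i,t-1}^{\ast }+{v}_{it} $$

Under this sketch, the covariance between the random intercepts κ and ω captures the trait-like interindividual association, whereas the cross-lagged coefficients c (together with the covariance between the residuals u and v) capture the state-like intraindividual associations.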

STUDY 1

Building on previous research showing numerous links between, on the one hand, various perceptions, emotions, and personality characteristics, and, on the other hand, individuals' proclivity to adhere to recommended HPB, the main aim of Study 1 was to identify relevant predictors of individuals' overall level of adherence to recommended HPB (in Denmark). Specifically, we considered 23 factors that have all been linked to individuals' proclivity to adhere to recommended HPB during the COVID-19 pandemic and received scientific attention. Table S1 provides an overview of the considered factors, a brief theoretical explanation why each factor was included, as well as references to previous research linking each factor to individuals' proclivity to adhere to recommended HPB during the COVID-19 pandemic. Adding to this, we considered two factors that, to the best of our knowledge, have not been linked to individuals' overall level of adherence to recommended HPB previously but that are likely to be predictors, too. That is, we considered the perceived potential reputational and moral costs associated with nonadherence to recommended HPB.

The perceived potential reputational costs of nonadherence refer to the negative reputational consequences that individuals may face for not adhering to recommended HPB, such as being viewed negatively or even publicly shamed by others.¹ Given that most individuals appear to care about their reputation (Manrique et al., 2021), it seems likely that perceived potential reputational costs of nonadherence lead to a general increase in individuals' overall proclivity for adherence. Supporting this idea, previous research has shown that individuals are more likely to adhere to injunctive social norms (that is, “rules or beliefs as to what constitutes morally approved and disapproved conduct”; Cialdini et al., 1990, p. 1015) in situations where not doing so might put their reputation at risk (e.g. being viewed as a liar; Abeler et al., 2019) and/or lead them to suffer from public shaming (e.g. being called out as a free-rider; Jacquet et al., 2011).

Similarly, the perceived moral costs of nonadherence refer to the negative moral emotions (e.g. feeling like a bad or immoral person) that individuals who view adherence as a moral obligation are likely to experience when they deliberately refrain from, or unintentionally fail to, adhere to recommended HPB. Given that most individuals care about maintaining a positive view of themselves as good and moral beings (Shalvi et al., 2011), it seems likely that any perceived moral costs of nonadherence increase individuals' overall proclivity to adhere to recommended HPB by making it psychologically costly to fail or deliberately refrain from doing so. Supporting this idea, previous research has shown that individuals often refrain from breaking injunctive social norms (e.g. being honest) even when there are no reputational costs or other forms of punishment associated with doing so (Abeler et al., 2019). This suggests that individuals are likely to be intrinsically motivated to maintain a positive self-image and avoid the moral costs that are likely to follow from not adhering to such norms.

In sum, the perceived potential reputational and moral costs of nonadherence are likely to influence individuals' proclivity to adhere to recommended HPB by changing the perceived psychological costs and benefits of doing so.

Taken together, Study 1 considered 25 predictors of individuals' proclivity to adhere to recommended HPB. Yet, while this is a substantial number of predictors to consider in a single regression model, it is obviously not an exhaustive list of all potentially relevant predictors of individuals' proclivity to adhere to recommended HPB. To better understand the implications of the results presented herein, it is necessary to take into account how and why these 25 predictors were selected. First, the 25 predictors considered herein were selected because they were assessed in each of the repeated cross-sectional surveys that Study 1 relies on (see Methods). While the 25 predictors were thus partly selected due to their availability, it is important to note that the respective surveys were set up in collaboration with the World Health Organization (WHO) and that all factors included were chosen by experienced field experts. Conceptually, the survey aimed to assess factors in the realm of individuals' perceptions, emotions, knowledge, and behavioral adaptions to the COVID-19 pandemic in an evidence-based way.² In turn, while the inclusion and exact measurement of each of the 25 factors considered herein were decided (in accordance with health experts from the WHO) on short notice, the factors overall arguably capture many, although not all, of the various individual difference constructs that were relevant for individuals' responses to COVID-19, including their proclivity to adhere to recommended HPB during the pandemic. Second, the 25 predictors were selected because they all were consistently assessed over a long period of time (i.e. 10 months), making it possible to investigate to what extent the comparative predictive importance of these predictors changed over the course of the COVID-19 pandemic (i.e. we excluded other potential predictors that were not assessed over the full period, such as vaccination status, which could not be included at the beginning). Finally, the 25 predictors were selected because all but two of them (namely, potential reputational and moral costs of nonadherence) had been consistently related to individuals' proclivity to adhere to recommended HPB in prior research (see Table S1), and had thus been suggested as relevant.

Method

Data

The data underlying Study 1 comes from the Danish COSMO survey (https://psychology.ku.dk/cosmo-dk). Between March 2020 and September 2021, COSMO Denmark assessed Danish citizens' perceptions, emotions, knowledge, and behavioral adaptions to the COVID-19 pandemic, using a mixture of weekly, biweekly, and monthly repeated cross-sectional and panel surveys. Across surveys, COSMO Denmark additionally collected information on participants' age, gender, education, employment status, chronic disease status, and personality characteristics in terms of the six HEXACO dimensions (Honesty-Humility, Emotionality, Extraversion, Agreeableness vs. Anger, Conscientiousness, and Openness to Experience; Ashton & Lee, 2007). More specifically, for Study 1, we used data from 24 measurement occasions of the repeated cross-sectional Danish COSMO survey (namely, all measurement occasions between 2020-11-02 and 2021-09-26), in which the 25 predictors of interest were assessed each time.

Sampling procedure

The sampling procedure for the repeated cross-sectional Danish COSMO survey was as follows (for more details, see Zettler et al., 2021): In 2020, following data handling approval from the Faculty of Social Sciences at the University of Copenhagen (#514-0136/20-2000), the last author received contact information for two representative samples regarding age and gender of approximately 100,000 adult Danish citizens each from Statistics Denmark (https://www.dst.dk/en). From these samples, random nonoverlapping subsets of 5,250–8,500 individuals were invited via the official digital mail system in Denmark (https://www.e-boks.com/danmark/en) every other week from 2020-11-02 to 2021-09-26 to participate in the repeated cross-sectional Danish COSMO survey. The survey was set up and run in formr (Arslan et al., 2020).

Participants

A total of 15,475 respondents participated in the 24 measurement occasions of the repeated cross-sectional Danish COSMO survey considered herein, with each participant providing data on exactly one of the 24 occasions. Of these, 413 were excluded because they reported experiencing some form of technical issue while filling out the survey (e.g. they lost their connection to the internet), resulting in a final sample of N = 15,062 participants (54.67% female, 45.12% male, 0.21% other, Mage = 56.55, SDage = 15.46 years).

Measures

All measures were presented in Danish using Likert-type scales with different anchors as response options. An overview of all items/scales and anchors can be found in Table S2 in the Supporting Information (https://osf.io/as9mx/). Participants' perceived financial security was assessed with a five-item scale (Cronbach's α = .93) developed by Munyon et al. (2020), and their general risk preferences were measured using the general risk question developed by Dohmen et al. (2011). Following Weinstein (2000), participants' cognitive risk perceptions regarding COVID-19 were approximated by taking the product of one item measuring their assessment of how serious it would be for them to get infected with COVID-19 and one item asking them to judge their own likelihood of contracting the disease. Participants' affective risk perceptions regarding COVID-19 were assessed with the same six items (α = .78) as in Lilleholt et al. (2023). The prevalence of negative affect was measured using six items assessing participants' feelings of boredom, loneliness, isolation, pessimism, dissatisfaction, and stress (α = .84). Participants' worries about the potential consequences of the pandemic were assessed using seven items asking them how concerned they were about different negative outcomes such as the healthcare system being overburdened or small businesses going bankrupt (α = .74; Zettler et al., 2021). Feelings of empathy toward those most vulnerable to COVID-19 were assessed via three items (α = .86) from Pfattheicher et al. (2020). The overall level of institutional trust in politicians, police, experts, healthcare professionals, and different public and private institutions' ability to handle the pandemic in a safe and efficient manner was assessed via six items (α = .83). Pandemic fatigue was assessed via the six-item (α = .83) pandemic fatigue scale (Lilleholt et al., 2023). 
Pandemic efficacy was assessed via one item asking participants to indicate how easy it would be for them to avoid getting infected given the current circumstances of the pandemic. Similarly, participants' knowledge about how to protect themselves from getting infected was measured with a single item asking participants to rate the proficiency of their own knowledge regarding this subject. The six HEXACO dimensions, that is, Honesty-Humility (α = .37), Emotionality (α = .32), Extraversion (α = .63), Agreeableness vs. Anger (α = .46), Conscientiousness (α = .50), and Openness to Experience (α = .53), were assessed using the Brief HEXACO-Inventory (De Vries, 2013).³ In line with the Health Belief Model (Janz & Becker, 1984), the perceived barriers for adherence were assessed with three items asking participants how exhausting, expensive, and time consuming they found it to adhere to the recommended HPB (α = .59). The perceived potential reputational costs associated with not adhering to the recommended HPB were measured with three newly developed items (e.g. “I am afraid that others will get mad at me if I do not comply with the COVID-19 measures”; α = .81) and so were the perceived moral costs of nonadherence (e.g. “I would feel bad about myself if I did not comply with the COVID-19 measures”; α = .79). Finally, participants' overall proclivity to adhere to recommended HPB was assessed via one general question (“I follow the instructions from the authorities to prevent the spread of the novel coronavirus (COVID-19) in my country”) and five more specific items, namely, “I wash my hands often or use hand disinfectant,” “I make sure to cough or sneeze in my sleeve rather than in my hands,” “I try to limit the amount of physical contact I have with others (e.g. handshakes, kisses on the cheek, hugs),” “I pay extra attention to cleaning at the moment,” and “I keep a distance to the elderly and/or people that I know to suffer from a chronic illness” (α = .77). 
We focused on these six indicators of adherence to recommended HPB because they were all measured on a similar 7-point Likert-type scale, making it easy to combine them into an overall index. Another and perhaps even more important reason for focusing on these indicators is that the specific items referred to the five general HPB that the Danish government and health authorities recommended throughout the COVID-19 pandemic, meaning that most Danes knew about them (see SST, 2020). While the repeated cross-sectional Danish COSMO survey also assessed mask wearing and vaccination, we refrained from including these HPB in the overall adherence index because they were measured on slightly different scales, and because the rules and recommendations regarding these HPB by the Danish authorities changed during the course of the COVID-19 pandemic and, in turn, the measurement occasions considered herein.
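
The scale-level quantities mentioned above (Cronbach's α, a multi-item index, and the product-based cognitive risk score) can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text: the responses are made up, the function names are hypothetical, and the index is taken to be the unweighted mean of the items, which is only one common way of combining them.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item responses per participant."""
    k = len(rows[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the participants' sum scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def scale_score(row):
    """Unweighted mean of one participant's item responses (assumed scoring rule)."""
    return mean(row)


def cognitive_risk(seriousness, likelihood):
    """Product of perceived seriousness and perceived likelihood of infection."""
    return seriousness * likelihood


# Made-up responses on a 7-point scale: 4 participants x 3 items.
responses = [[7, 6, 7], [5, 5, 4], [3, 4, 3], [6, 6, 6]]
alpha = cronbach_alpha(responses)
index = [scale_score(row) for row in responses]
```

With these toy responses the items move together closely, so α comes out high; with real survey data the same function would be applied to the item columns of each scale.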

An overview of all variables assessed in the repeated cross-sectional Danish COSMO survey—across all measurement occasions and irrespective of whether considered herein or not—can be found via https://psychology.ku.dk/cosmo-dk.

Analysis

Our analytical strategy followed a two-step approach. First, we conducted five ordinary least squares (OLS) regressions to conceptually replicate the observed links between individuals' proclivity to adhere to recommended HPB and the 25 different factors considered herein. More specifically, we conducted one OLS regression covering all 24 waves of the repeated cross-sectional Danish COSMO survey considered herein as well as four OLS regressions each covering a subset of six waves for the purpose of exploring the stability of the results over time (see Table S3 for an overview of the sample characteristics of the full sample and each of the four subsamples). Second, we performed five lasso regressions to identify the most relevant predictors (see McNeish, 2015) of individuals' proclivity to adhere to recommended HPB and to explore their comparative predictive importance over the course of the pandemic. That is, we again conducted one lasso regression covering all 24 waves of the repeated cross-sectional Danish COSMO survey considered herein as well as four lasso regressions each covering a subset of six waves.

Lasso regression is an extension of OLS regression in which the coefficient estimates are shrunk toward zero using what in statistical parlance is known as L1 regularization (James et al., 2013). In simple terms, L1 regularization shrinks the size of the regression coefficients by adding a penalty term to the OLS cost function that is proportional to the sum of the absolute values of the coefficients (Tibshirani, 1996). In mathematical terms, the lasso cost function can be written as follows:
$$ {C}^{Lasso}\left({\beta}_1,\dots, {\beta}_p\right)={\left\Vert Y-\sum \limits_{j=1}^p{X}_j{\beta}_j\right\Vert}^2+\lambda \sum \limits_{j=1}^p\left|{\beta}_j\right| $$ (1)
in which the first part is the OLS cost function and the second part is the penalty term (McNeish, 2015). Importantly, as shown in Equation (1), the strength of the penalty term, and thus the amount of shrinkage imposed, is governed by a regularization parameter 𝜆 (James et al., 2013). When 𝜆 = 0, no shrinkage is imposed, and the lasso cost function is identical to the OLS cost function. As 𝜆 increases, so does the amount of shrinkage imposed, effectively forcing coefficient estimates for variables with little or no predictive value to be exactly zero. In effect, this means that lasso regression can be used to obtain regularized regression coefficients and to identify relevant predictors by “selecting out” predictors with coefficient estimates of zero (McNeish, 2015).
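
How the penalty in Equation (1) forces some coefficients to exactly zero can be seen in a minimal coordinate-descent sketch. This is an illustration under stated assumptions, not the procedure used for the reported analyses: the data are made up, the function names are hypothetical, and predictors and outcome are assumed to be centred so that no intercept is needed. Note also that glmnet scales the squared-error term differently (by 1/(2n) for Gaussian models), so 𝜆 values from this sketch are not comparable to glmnet's.

```python
def soft_threshold(a, t):
    """Shrink a toward zero by t; return exactly 0.0 when |a| <= t."""
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise ||y - X b||^2 + lam * sum(|b_j|) by cyclic coordinate descent.

    X is a list of rows and y a list of outcomes, both assumed centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of predictor j with the residual that leaves j out.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            # The penalty enters as a threshold of lam / 2 for this cost function.
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta


# Made-up data: a strong first predictor and a weak second predictor.
X = [[-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [1.0, 1.0]]
y = [-1.75, -2.25, 1.75, 2.25]

beta_ols = lasso_coordinate_descent(X, y, lam=0.0)    # no shrinkage
beta_lasso = lasso_coordinate_descent(X, y, lam=4.0)  # weak predictor dropped
```

With 𝜆 = 0 both coefficients match the OLS solution; with 𝜆 = 4 the strong predictor's coefficient is shrunk and the weak predictor's coefficient becomes exactly zero, which is the "selecting out" behavior described above.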

Prior to fitting the lasso regressions, we split the data into training and test sets using an 80%–20% split. We then trained the lasso regressions on the training sets and evaluated their predictive performance on the test sets. In evaluating the predictive performance of the lasso regressions, we report both the root mean squared error (RMSE) and the coefficient of determination (R²). To find optimal values for the regularization parameter, 𝜆, we used k-fold cross-validation with 10 folds (James et al., 2013). All analyses were conducted in R version 4.2.2. For the lasso regressions, we used the glmnet package (Friedman et al., 2010).
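
The evaluation workflow described above can be sketched in a few lines of Python. The sketch is illustrative only: the helper names, the random seed, and the toy numbers are assumptions, and R² is computed here as one minus the ratio of residual to total sum of squares, which may differ from the exact definition used for the reported values (the squared correlation between observed and predicted values is another common choice).

```python
import math
import random


def train_test_split(n, test_share=0.2, seed=0):
    """Return shuffled (train, test) index lists for n observations."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_share)
    return idx[n_test:], idx[:n_test]


def k_folds(indices, k=10):
    """Deal the given indices into k folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root mean squared error of the predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """One minus residual sum of squares over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 80%-20% split of 100 hypothetical observations, then 10 folds within training.
train_idx, test_idx = train_test_split(100)
folds = k_folds(train_idx, k=10)

# Toy held-out outcomes and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 2.0]
```

In a full analysis, each candidate 𝜆 would be fitted on nine folds and scored on the tenth, the 𝜆 with the lowest average error would be refitted on the whole training set, and RMSE and R² would then be computed once on the untouched test set.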

Results

OLS regressions

Across the 24 waves of the repeated cross-sectional Danish COSMO survey, we found significant relations between individuals' proclivity to adhere to recommended HPB and all but four factors included in Study 1 (Figure 1). In particular, we did not find a link between, on the one hand, individuals' self-reported proclivity for adherence, and, on the other hand, their employment status, chronic disease status, general risk preferences, or perceived pandemic efficacy (all ps > .05). In turn, the vast majority of links that have been reported between various perceptions, emotions, and personality characteristics and individuals' proclivity to adhere to recommended HPB during the COVID-19 pandemic could be replicated herein. Moreover, we found the perceived potential reputational costs of nonadherence to be negatively related and the perceived potential moral costs of nonadherence to be positively related to individuals' overall level of adherence to recommended HPB (all ps < .001). Thus, the relation between the perceived potential reputational costs of nonadherence and individuals' overall proclivity for adhering to recommended HPB was opposite to what one might have expected.

Figure 1. Results from ordinary least squares regression predicting adherence across all waves (Study 1). Note. Figure 1 shows standardized β coefficients with 95% confidence intervals. Continuous variables are mean-centered and scaled by 1 standard deviation. The reference class for chronic disease status is “Don't know.” ***p < .001; **p < .01; *p < .05. All p-values are two-tailed and not corrected for multiple testing.

Looking across the four subsamples each covering six waves of the repeated cross-sectional Danish COSMO survey, we only found the following 11 factors to be consistently related to individuals' proclivity to adhere to recommended HPB: age, cognitive risk perceptions, affective risk perceptions, worries, empathy, institutional trust, knowledge, conscientiousness, and perceived moral costs of (non)adherence (all positively), pandemic fatigue (negatively), as well as self-identifying as female (Figure S1). Taken together, these findings suggest that while a substantial number of factors were consistently related to individuals' proclivity to adhere to recommended HPB throughout the COVID-19 pandemic, others were related to this tendency only at certain times.

Lasso regressions

Turning to the lasso regressions, we found 14 of the 25 factors considered herein to be predictive of individuals' proclivity to adhere to recommended HPB across the 24 waves of the repeated cross-sectional Danish COSMO survey (Figure 2). More specifically, results indicated that individuals' chronic disease status, education, pandemic efficacy, emotionality, financial security, negative affect, perceived potential reputational costs of nonadherence, general risk preferences, employment status, perceived barriers of adherence, and gender were not predictive. By contrast, we found in ascending order that individuals' levels in openness to experience, agreeableness versus anger, extraversion, honesty-humility, cognitive risk perceptions regarding COVID-19, pandemic fatigue, worries, age, conscientiousness, institutional trust, affective risk perceptions regarding COVID-19, empathy, knowledge, and perceived moral costs of nonadherence all had some predictive power. Among the 14 factors retained by the lasso regression (i.e. factors with coefficient estimates different from zero), some factors appeared as better predictors than others with regard to individuals' overall level of adherence to recommended HPB. More precisely, we found empathy toward those most vulnerable to COVID-19, knowledge about how to protect oneself from getting infected, and perceived moral costs of nonadherence to be the three strongest predictors (all positive). Notably, as shown in Figure 2, these three factors were also the only factors that (descriptively) turned out to be more predictive of individuals' proclivity to adhere to recommended HPB than the mere passage of time.

Figure 2. Results from lasso regression predicting adherence across all waves (Study 1). Note. Figure 2 shows standardized β coefficients. Continuous variables are mean-centered and scaled by 1 standard deviation. The reference class for chronic disease status is “Don't know.” Coefficients > 0 are marked in blue. Coefficients ≤ 0 are marked in red.

Evaluating the out-of-sample predictive performance of the lasso regression based on the 24 waves of the repeated cross-sectional Danish COSMO survey, we found the RMSE to be fairly low and R² to be high. In particular, we found the RMSE to be .79 and the R² to be .38, indicating that the lasso regression had good predictive accuracy and explained a substantial amount of variance in individuals' proclivity to adhere to recommended HPB.

Looking across the four subsamples each covering six waves of the repeated cross-sectional Danish COSMO survey, we only found the following nine factors to be consistently retained by the lasso regressions: conscientiousness, age, pandemic fatigue, worries, affective risk perceptions regarding COVID-19, institutional trust, empathy, knowledge, and perceived moral costs of (non)adherence (Figure S2). Notably, across all four subsamples, the perceived moral costs of (non)adherence was the single most predictive factor of the 25 factors considered herein, while both empathy and knowledge consistently were among the four most relevant predictors, with empathy sometimes being slightly less predictive than institutional trust. Overall, these findings suggest that only a handful of factors were consistently predictive of individuals' proclivity to adhere to recommended HPB during the COVID-19 pandemic in Denmark. At the same time, these results further suggest that the comparative predictive importance of the 25 factors considered herein remained relatively stable and that empathy, knowledge about how to protect oneself from getting infected, and the perceived moral costs of nonadherence consistently were among the most relevant predictors of individuals' proclivity to adhere to recommended HPB throughout the COVID-19 pandemic in Denmark.

Considering the out-of-sample predictive performance of the lasso regressions based on each of the four subsamples, we again found the RMSE to be fairly low and R² to be high. In particular, we found the RMSE to range from .77 to .84, and the R² to range from .30 to .36. Each of the lasso regressions based on the four subsamples thus had good predictive accuracy and explained a substantial amount of variance in individuals' proclivity to adhere to recommended HPB.

STUDY 2

Having identified three particularly relevant predictors of individuals' proclivity to adhere to recommended HPB in Denmark (i.e. empathy, knowledge, and perceived moral costs of nonadherence), we proceeded to explore the nature of the relations between these three predictors and individuals' overall proclivity for adherence to recommended HPB as they unfold over time between and within individuals. More specifically, relying on several RI-CLPMs (Hamaker et al., 2015), we explored whether the relations, as they unfold over time, between individuals' proclivity to adhere to recommended HPB and their feelings of empathy toward those most vulnerable to COVID-19, knowledge about how to protect themselves from getting infected, and perceived moral costs of not adhering to these behaviors, occurred at the trait-like interindividual level, the state-like intraindividual level, or both.

Method

Data

The data underlying Study 2 also come from COSMO Denmark, but from a separate panel survey that was conducted in parallel with the repeated cross-sectional Danish COSMO survey. Specifically, we relied on data from eight monthly waves of this panel survey (viz., from 2020-11-16 to 2021-06-27), in which participants' empathy, knowledge, and perceived moral costs of nonadherence (as well as other variables) were assessed repeatedly.

Sampling procedure

Following the same procedure as for the repeated cross-sectional Danish COSMO survey, the last author received contact information for a representative sample regarding age and gender of approximately 100,000 adult Danish citizens from Statistics Denmark in 2018. From this sample, a random subset of 15,000 Danish citizens was invited to take part in the Danish COSMO panel survey via the official digital mail system in Denmark in March 2020. A total of 2,546 respondents participated in the first wave of the panel survey and were thus invited to participate in the subsequent waves of the survey (for more information, see https://psychology.ku.dk/cosmo-dk). The first eight measurement occasions of the panel survey were conducted on a weekly basis, whereas the subsequent 11 measurement occasions were conducted monthly. Importantly, we only used eight of the monthly measurement occasions because the remaining measurement occasions did not assess all three independent variables of interest. Like the repeated cross-sectional Danish COSMO survey, the panel survey was set up and run in formr (Arslan et al., 2020).

Participants

Across the eight monthly measurement occasions considered herein, and after removing any observations in which the participants reported experiencing technical difficulties while filling out the survey (obs. = 87), data were available from 441 participants (52.80% female, 47.20% male, Mage = 58.23, SDage = 14.00 years) and 3528 observations. On average, these 441 participants participated in six of the eight measurement occasions. The final data set thus contains different levels of missing data for a total of 294 participants who did not complete all eight measurement occasions considered herein, while 147 participants had complete data for all eight waves. Figure S3 provides an overview of the missing data patterns.

Measures

Participants' proclivity to adhere to recommended HPB and their feelings of empathy toward those most vulnerable to COVID-19, knowledge about how to protect themselves from getting infected, and perceived moral costs of not adhering to these behaviors were assessed with the same items and scales as in Study 1. An overview of all variables assessed in the Danish COSMO panel survey—across all measurement occasions and irrespective of whether considered herein or not—can be found via https://psychology.ku.dk/cosmo-dk.

Analysis

To explore the nature of the relations between individuals' proclivity to adhere to recommended HPB and their feelings of empathy toward those most vulnerable to COVID-19, knowledge about how to protect themselves from getting infected, and perceived moral costs of not adhering to these behaviors, as they unfold over time between and within individuals, we estimated three RI-CLPMs (Hamaker et al., 2015). RI-CLPM is an extension of the traditional CLPM that allows for the analysis of reciprocal relations between two or more variables over time (Usami et al., 2019). Notably, as opposed to traditional CLPMs, which are known to conflate interindividual and intraindividual associations (Hamaker et al., 2015; Usami et al., 2019), the RI-CLPM effectively separates these associations from each other, making it possible to investigate to what extent the relation(s) between two or more variables over time occur at the trait-like interindividual level, the state-like intraindividual level, or both. Mathematically, RI-CLPMs can be described as follows:
$$ x_{it} = \mu_t + \kappa_i + p_{it} $$ (2a)
$$ y_{it} = \pi_t + \omega_i + q_{it} $$ (2b)
where $x_{it}$ and $y_{it}$ are two distinct variables measured at multiple timepoints ($t$) for individual ($i$), $\mu_t$ and $\pi_t$ are temporal group means, $\kappa_i$ and $\omega_i$ are trait-like interindividual deviations from these means, and the residuals $p_{it}$ and $q_{it}$ are state-like intraindividual deviations from the individual's expected level (i.e. $\mu_t + \kappa_i$ and $\pi_t + \omega_i$; Hamaker et al., 2015; Usami et al., 2019). The state-like intraindividual deviations are modeled as follows:
$$ p_{it} = \alpha_t p_{i,t-1} + \beta_t q_{i,t-1} + u_{it} $$ (2c)
$$ q_{it} = \delta_t q_{i,t-1} + \gamma_t p_{i,t-1} + v_{it} $$ (2d)
where $\alpha_t$ and $\delta_t$ are autoregressive parameters that represent the state-like intraindividual carry-over effect (i.e. if $\alpha_t$ and/or $\delta_t$ are positive, it implies that occasions on which an individual scored above their expected level are likely to be followed by occasions on which they still score above their expected level, and vice versa), $\beta_t$ and $\gamma_t$ are cross-lagged parameters that indicate the extent to which the variables of interest influence each other at the state-like intraindividual level, and $u_{it}$ and $v_{it}$ are residuals that are assumed to be normally distributed and correlated (Hamaker et al., 2015; Usami et al., 2019). To provide a concrete example, the cross-lagged parameter $\gamma_t$ indicates the degree to which deviations from an individual's expected level on $y$ (i.e. $q_{it} = y_{it} - \{\pi_t + \omega_i\}$) can be predicted from preceding deviations from the same individual's expected level on $x$ (i.e. $p_{i,t-1} = x_{i,t-1} - \{\mu_{t-1} + \kappa_i\}$), while controlling for the individual's preceding deviation from their expected level on $y$ (i.e. $q_{i,t-1} = y_{i,t-1} - \{\pi_{t-1} + \omega_i\}$).
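
The decomposition in Equations 2a to 2d can be illustrated with a small simulation. The Python sketch below is a hypothetical example, not the authors' lavaan model: it sets the temporal group means to zero, draws correlated random intercepts, generates state-like deviations with autoregressive effects but no cross-lagged effects, and (as a simplification of the model) uses uncorrelated residuals. All parameter values are arbitrary assumptions. Person-mean centering is used as a crude stand-in for the latent decomposition that a properly fitted RI-CLPM would provide.

```python
import random


def simulate_riclpm(n_people=2000, n_waves=8, alpha=0.3, delta=0.3,
                    beta=0.0, gamma=0.0, seed=1):
    """Draw (x, y) trajectories from Equations 2a-2d with mu_t = pi_t = 0."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_people):
        kappa = rng.gauss(0, 1)                      # trait part of x
        omega = 0.6 * kappa + 0.8 * rng.gauss(0, 1)  # trait part of y, correlated with kappa
        p, q = rng.gauss(0, 1), rng.gauss(0, 1)      # state deviations at wave 1
        x_i, y_i = [kappa + p], [omega + q]
        for _ in range(n_waves - 1):
            # Both updates use the previous wave's p and q (tuple assignment).
            p, q = (alpha * p + beta * q + rng.gauss(0, 1),
                    delta * q + gamma * p + rng.gauss(0, 1))
            x_i.append(kappa + p)
            y_i.append(omega + q)
        xs.append(x_i)
        ys.append(y_i)
    return xs, ys


def pearson(a, b):
    """Pearson correlation between two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


xs, ys = simulate_riclpm()

# Between-person (trait-like) association: correlation of person means.
mean_x = [sum(r) / len(r) for r in xs]
mean_y = [sum(r) / len(r) for r in ys]
between_r = pearson(mean_x, mean_y)

# Within-person (state-like) lagged association: x deviation at t vs y deviation at t + 1.
dev_x_lag, dev_y_next = [], []
for x_i, y_i, mx, my in zip(xs, ys, mean_x, mean_y):
    for t in range(len(x_i) - 1):
        dev_x_lag.append(x_i[t] - mx)
        dev_y_next.append(y_i[t + 1] - my)
within_lagged_r = pearson(dev_x_lag, dev_y_next)

print(round(between_r, 2), round(within_lagged_r, 2))
```

Under these assumed parameters, the two variables are clearly related between persons while the lagged within-person association is close to zero, which is the kind of pattern the RI-CLPM is designed to tell apart from a genuine cross-lagged effect.
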
The fit of the RI-CLPMs was evaluated based on four indices, namely, robust versions of the comparative fit index (CFI), the Tucker–Lewis index (TLI), the root mean squared error of approximation (RMSEA), and the standardized root mean squared residual (SRMR). In line with the recommendations put forward by Hu and Bentler (1999), we considered the fit of the RI-CLPMs as satisfactory if CFI > .95, TLI > .95, RMSEA < .06, and SRMR < .08. To handle the issue of missing data, we estimated the RI-CLPMs using full information maximum likelihood (FIML; Enders & Bandalos, 2001). In short, FIML handles missing data by computing a casewise likelihood function using only those variables that are observed for case i. Assuming multivariate normality, the casewise likelihood of the observed data is obtained by maximizing the following function:
$$ \log L_i = K_i - \frac{1}{2}\log\left|\Sigma_i\right| - \frac{1}{2}\left(x_i - \mu_i\right)^{\prime}\Sigma_i^{-1}\left(x_i - \mu_i\right) $$ (3)
where $K_i$ is a constant that depends on the number of complete data points for case $i$, $x_i$ is the observed data for case $i$, and $\mu_i$ and $\Sigma_i$ contain the parameter estimates of the mean vector and covariance matrix, respectively, for the variables that are complete for case $i$ (Enders & Bandalos, 2001). The casewise likelihood functions are accumulated across the full sample and maximized as follows:
$$ \log L\left(\mu, \Sigma\right) = \sum_{i=1}^{N} \log L_i $$ (4)

Importantly, FIML yields unbiased estimates under the assumption of data missing completely at random (MCAR), as well as the less restrictive and arguably more realistic assumption of data missing at random (MAR; Enders & Bandalos, 2001). All analyses were conducted in R version 4.2.2. The RI-CLPMs were estimated using lavaan (Rosseel, 2012).
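
To illustrate Equations 3 and 4, the following Python sketch evaluates the casewise log-likelihood using only the variables observed for each case and sums the result across cases. It is a minimal illustration under assumed values, not lavaan's implementation: the mean vector, covariance matrix, and data are hypothetical, missing values are coded as `None`, and the constant $K_i$ is taken to be $-\frac{k_i}{2}\log(2\pi)$, with $k_i$ the number of observed variables for case $i$. Only the evaluation of the likelihood is shown, not its maximization.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def casewise_loglik(x, mu, sigma):
    """Equation 3 for one case; entries of x that are None count as missing."""
    obs = [j for j, v in enumerate(x) if v is not None]
    if not obs:
        return 0.0  # a case with no observed values contributes nothing
    d = [x[j] - mu[j] for j in obs]
    sub = [[sigma[r][c] for c in obs] for r in obs]  # rows/columns of observed variables
    low = cholesky(sub)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(obs)))
    # Forward substitution solves low z = d, so that z'z = d' sub^{-1} d.
    z = []
    for i in range(len(obs)):
        z.append((d[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    k_i = -0.5 * len(obs) * math.log(2.0 * math.pi)
    return k_i - 0.5 * log_det - 0.5 * quad


def total_loglik(data, mu, sigma):
    """Equation 4: sum the casewise log-likelihoods over all cases."""
    return sum(casewise_loglik(x, mu, sigma) for x in data)


# Hypothetical parameter values and data; None marks a missing observation.
mu = [0.0, 0.0]
sigma = [[1.0, 0.5], [0.5, 2.0]]
data = [[1.0, None], [0.5, -0.2], [None, 1.5]]

ll = total_loglik(data, mu, sigma)
print(round(ll, 4))
```

Cases with missing values still contribute to the total through the variables they do have, which is why no participant needs to be dropped for incomplete waves.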

Results

The fits of the RI-CLPMs concerning empathy (CFI = .980, TLI = .970, RMSEA = .066, SRMR = .069), knowledge about how to protect oneself from getting infected (CFI = .981, TLI = .972, RMSEA = .058, SRMR = .062), and perceived moral costs of nonadherence (CFI = .972, TLI = .959, RMSEA = .076, SRMR = .072) were satisfactory overall.
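
For transparency, the reported indices can be compared with the Hu and Bentler (1999) cutoffs stated in the Analysis section. The short Python sketch below merely restates the numbers reported above and checks each against its cutoff; the variable and function names are hypothetical, and the check adds no information beyond the reported values.

```python
# Cutoffs as stated in the Analysis section (Hu & Bentler, 1999).
CUTOFFS = {
    "CFI": lambda v: v > 0.95,
    "TLI": lambda v: v > 0.95,
    "RMSEA": lambda v: v < 0.06,
    "SRMR": lambda v: v < 0.08,
}

# Fit indices exactly as reported for the three RI-CLPMs.
REPORTED = {
    "empathy": {"CFI": 0.980, "TLI": 0.970, "RMSEA": 0.066, "SRMR": 0.069},
    "knowledge": {"CFI": 0.981, "TLI": 0.972, "RMSEA": 0.058, "SRMR": 0.062},
    "moral costs": {"CFI": 0.972, "TLI": 0.959, "RMSEA": 0.076, "SRMR": 0.072},
}


def unmet_cutoffs(indices):
    """Return the names of the fit indices that miss their cutoff."""
    return sorted(name for name, value in indices.items()
                  if not CUTOFFS[name](value))


summary = {model: unmet_cutoffs(indices) for model, indices in REPORTED.items()}
for model, unmet in summary.items():
    print(model, "->", unmet if unmet else "all cutoffs met")
```

CFI, TLI, and SRMR meet the cutoffs in all three models, whereas RMSEA exceeds .06 for the empathy and moral costs models, which is consistent with describing the fits as satisfactory overall rather than on every index.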

Across all three RI-CLPMs, we found the variance for the random intercepts to be significant (all ps < .001), suggesting that stable trait-like interindividual differences regarding individuals' empathy, knowledge about how to protect themselves from getting infected, perceived moral costs of nonadherence, and overall proclivity to adhere to recommended HPB were present. Moreover, as shown in Figures S4–S6, there was a substantial positive correlation (.47 ≤ r ≤ .65) between the random intercepts in all three RI-CLPMs (all ps < .001). Combined, this indicates that individuals who in general felt more empathic toward those most vulnerable to COVID-19, who were more knowledgeable about how to protect themselves from getting infected, and who perceived the moral costs of nonadherence to be higher also reported adhering more to recommended HPB on average. More concretely, this means that the relations between these three variables and individuals' proclivity to adhere to recommended HPB as they unfolded over time seem to have occurred at the trait-like interindividual level to a large extent.

Looking at the state-like intraindividual part of the three RI-CLPMs, we observed some autoregressive effects for empathy, knowledge about how to protect oneself from getting infected, perceived moral costs of nonadherence, and individuals' proclivity to adhere to recommended HPB (Figures S4–S6). This indicates that individuals who at some measurement occasions showed higher/lower levels on these four factors relative to their own expected level were likely to show higher/lower levels on the same factors relative to their own expected level at the next measurement occasion as well.

Regarding the reciprocal cross-lagged effects, we only found very limited evidence that individuals who at one measurement occasion, relative to their own expected level, felt more empathic toward those most vulnerable to COVID-19, thought of themselves as more knowledgeable on how to avoid getting infected, or perceived the moral costs of nonadherence as higher were likely to report higher levels of adherence to recommended HPB, relative to their own expected level, at the next measurement occasion. More specifically, we only observed one out of eight statistically significant (i.e. p < .05) cross-lagged effects from empathy to adherence to recommended HPB, two from knowledge about how to protect oneself from getting infected, and one from the perceived moral costs of nonadherence. Similarly, we found very few significant cross-lagged effects in the opposite directions. That is, we only observed one out of eight statistically significant (i.e. p < .05) cross-lagged effects from adherence to recommended HPB to empathy, one to knowledge about how to protect oneself, and two to the perceived moral costs of nonadherence. Taken together, these findings suggest that the relations between individuals' proclivity to adhere to recommended HPB and their feelings of empathy toward those most vulnerable to COVID-19, knowledge about how to protect themselves from getting infected, and perceived moral costs of not adhering to these behaviors, as they unfolded over time, occurred at the state-like intraindividual level to a very limited extent only.

DISCUSSION

What were relevant predictors of individuals' proclivity to adhere to recommended HPB during the COVID-19 pandemic in Denmark? Applying machine learning (namely, lasso regression) to a relatively large data set including 25 factors that have been suggested by other research (across countries) as relevant for adherence, we found empathy toward those most vulnerable to COVID-19, knowledge about how to protect oneself from getting infected, and the perceived moral costs of nonadherence to be the three most relevant predictors of individuals' self-reported proclivity for adherence to recommended HPB overall. Moreover, considering the time-dependent comparative predictive importance of these and several other factors, we consistently found empathy toward those most vulnerable to COVID-19, knowledge about how to protect oneself from getting infected, and the perceived moral costs of nonadherence to be among the most relevant predictors of individuals' proclivity to adhere to recommended HPB throughout the COVID-19 pandemic in Denmark. Possible explanations for these findings are that: empathy toward those most vulnerable to COVID-19 continuously motivated citizens to constrain the spread of the pandemic and thus adhere to recommended HPB; knowledge about how to protect oneself generally made it easier for citizens to understand the usefulness of different recommended HPB and thus increased their motivation to apply them; and the perceived moral costs of nonadherence made it psychologically costly and thus intrinsically unattractive for citizens who viewed adherence as a moral obligation to deliberately refrain from or unintentionally fail to comply with the recommendations.

Having identified these three factors out of 25 factors, we further explored whether their relations to individuals' proclivity for adherence to recommended HPB as they unfold over time occurred at the trait-like interindividual level, the state-like intraindividual level, or both. Based on three RI-CLPMs, we found that the relations, as they unfold over time, largely existed at the trait-like interindividual level, and only to a very limited extent at the state-like intraindividual level.

As compared with previous research using more complex machine learning techniques (e.g. random forest) to predict individuals' proclivity to adhere to recommended HPB, the simple lasso regressions used herein performed well in terms of predictive accuracy. More specifically, whereas previous research using more complex machine learning techniques has reported R² values ranging from <.01 to .59 (Hajdu et al., 2022; Pavlović et al., 2022; van Lissa et al., 2022), we observed R² values in the range of .30 to .38. The R² values reported herein are, despite the simple method used (i.e. lasso regression), thus in the (slightly above) average range of what other research has found, indicating that while the 25 factors considered herein clearly are not all that matters for individuals' proclivity to adhere to recommended HPB, they do capture a substantial amount of variance in individuals' self-reported proclivity for adherence.

With regard to the ranking of the comparative predictive importance of the factors considered herein and elsewhere to predict individuals' proclivity to adhere to recommended HPB (see Hajdu et al., 2022; Pavlović et al., 2022; Taye et al., 2023; van Lissa et al., 2022), there appears to be little agreement regarding which factors are the most predictive ones. This is interesting because it suggests that while some factors, such as age (Hajdu et al., 2022; Pavlović et al., 2022; Taye et al., 2023), seem to be predictive of individuals' proclivity to adhere to recommended HPB across contexts, most factors appear to be predictive only contextually. In other words, which factors are the most relevant predictors of individuals' proclivity to adhere to recommended HPB is likely to depend on the (sociocultural) context in which this question is studied as well as the specific data and methods used.

Limitations

Some limitations of our investigation should be acknowledged. First, our findings are likely to be dependent on the nature of the COVID-19 pandemic. As a consequence, the findings presented herein might not readily transfer to future pandemics that may differ in their scope, severity, or other aspects from that of the COVID-19 pandemic. Second, while we considered 25 factors in Study 1, there are arguably many more relevant predictors of individuals' proclivity to adhere to recommended HPB, some of which may be even more relevant predictors than the ones considered herein. Third, the findings presented herein might be specific to the Danish context in which both studies were conducted. In other words, the generalizability of the findings presented herein might be limited to Denmark or to countries that are similar to Denmark in certain respects, such as potentially Finland, Norway, or Sweden. Fourth, because our results exclusively rely on self-report data, it is unclear to what extent the 25 factors considered herein are predictive of individuals' actual proclivity to adhere to recommended HPB. Yet, as self-reports of past behavior have been shown to predict actual behavior (Parry et al., 2021), it seems likely that our results at least conceptually capture the predictive relations between the 25 factors considered herein and individuals' actual proclivity to adhere to recommended HPB. Addressing these limitations, future research should seek to test whether the factors identified herein as relevant predictors of individuals' self-reported proclivity to adhere to recommended HPB are also predictive of individuals' actual level of adherence across different samples, countries, and pandemics.

Theoretical and practical implications

Despite these limitations, our findings have several theoretical and practical implications. From a theoretical perspective, our findings suggest that stable trait-like interindividual differences are likely to matter more for individuals' proclivity to adhere to recommended HPB over time as compared with corresponding state-like intraindividual fluctuations—at least, when it comes to empathy, knowledge about how to protect oneself from getting infected, and the perceived moral costs of nonadherence (which seem to be particularly relevant predictors). The fact that we found virtually no reciprocal relations between individuals' proclivity to adhere to recommended HPB and their empathy, knowledge, and perceived moral costs of nonadherence at the state-like intraindividual level further suggests that if any causal relations between these three factors and individuals' proclivity for adherence exist, they are likely to exist at the trait-like interindividual level. Importantly, though, it might still be that causal relations (also) exist on the state-like intraindividual level at a more fine-grained temporal resolution (e.g. weekly, daily, or hourly) than the one considered herein (i.e. monthly). Irrespective of this, given the correlational nature of our findings at the trait-like interindividual level, we cannot rule out the possibility that the observed relations between individuals' proclivity to adhere to recommended HPB and their empathy, knowledge, and perceived moral costs of nonadherence are driven by unobserved variables or represent a case of reverse causality.

From a practical perspective, our findings suggest that governments (and arguably Danish governments in particular) who want (or need) to promote and maintain high levels of public adherence to recommended HPB in future epidemics or pandemics should carefully consider whether to implement interventions aimed at changing empathy, knowledge about how to protect oneself from getting infected, and perceived moral costs of nonadherence at the state-like intraindividual level. If anything, governments should rather strive to develop interventions aimed at stimulating lasting changes in these factors at the trait-like interindividual level either before or at the beginning of the pandemic. Using knowledge about how to protect oneself from getting infected as an example, governments may, for instance, seek to educate citizens on this matter already before new pandemics break out by making it part of the core curriculum in primary and/or secondary schools. Adding to this, governments may further seek to roll out massive information campaigns at the onset of new pandemics, as opposed to spending a vast amount of resources on continuously educating/reminding citizens on how to best protect themselves throughout the pandemic—unless, of course, new and important knowledge on this matter becomes available.

ACKNOWLEDGMENTS

This research project was funded by grants from both the Lundbeck Foundation (R349-2020-592) and the Faculty of Social Sciences, University of Copenhagen (Denmark), to Robert Böhm and Ingo Zettler.

    CONFLICT OF INTEREST STATEMENT

    The authors have no conflict of interest to disclose.

    ETHICS STATEMENT

    Data collection for the surveys underlying Studies 1 and 2 was approved by the Faculty of Social Sciences of the University of Copenhagen (#514-0136/20-2000) and followed a publicly available study protocol (https://www.psycharchives.org/en/item/8a92091d-a1b6-42ac-ae53-7ca70ed2ccc2). Participation was voluntary, and informed consent was obtained from all participants. All analyses reported herein are exploratory. Data and analyses scripts are available via the Open Science Framework (OSF): https://osf.io/as9mx/. Please note that, in line with the European General Data Protection Regulation, the raw data underlying Study 2 cannot be shared; therefore, we provide a synthetic version of this dataset created with the synthpop package in R (Nowok et al., 2016).

    DATA AVAILABILITY STATEMENT

    Data and analyses scripts are available via the Open Science Framework (OSF): https://osf.io/as9mx/. Please note that, in line with the European General Data Protection Regulation, the raw data underlying Study 2 cannot be shared; therefore, we provide a synthetic version of this dataset created with the synthpop package in R (Nowok et al., 2016).

    • 1 Note that in some contexts, there are arguably no potential reputational costs associated with not adhering to the recommended HPB and that in some contexts there might even be potential reputational costs associated with adhering to recommended HPB. At a conference hosted by conspiracy theorists who happen not to believe in the existence of COVID-19, for instance, one might be publicly shamed for practicing social distancing. Yet, across most contexts, and especially in the context of the present research conducted in Denmark where public support for the government's response to COVID-19 was strong (Jørgensen et al., 2021), it seems reasonable to assume that there are potential reputational costs associated with not adhering rather than adhering to the recommended HPB.
    • 2 For more information, see: https://www.who.int/europe/tools-and-toolkits/who-tool-for-behavioural-insights-on-covid-19 and https://www.psycharchives.org/en/item/8a92091d-a1b6-42ac-ae53-7ca70ed2ccc2.
    • 3 Note that the explicit aim of the Brief HEXACO-Inventory is to assess the six HEXACO dimensions both broadly and briefly, resulting in some low internal consistency estimates per factor (see De Vries, 2013).