Introduction

The gender inequalities that women face put their health at risk. Women are more likely than men to face barriers in accessing health services. These barriers include, but are not limited to, restrictions on women’s decision-making power and mobility, discriminatory attitudes towards women, and a lack of awareness of women’s health needs. As a result, women face greater health risks and receive less treatment and supportive assistance. Estimates from the World Health Organization (WHO) show that 18% of women have a disability, compared with 14.2% of men [1]. Persons with disabilities experience ableism and stigmatization in all aspects of life, which contributes to their poorer physical and mental health. In addition, because they face more limitations in daily functioning than other persons, persons with disabilities often rely on caregivers in daily life.

Gender inequalities affect the health outcomes not only of women with disabilities but also of female caregivers. According to “The gender pay gap in the health and care sector”, a joint report by the International Labour Organization (ILO) and WHO, women in the health and care sector face a larger gender pay gap than in other economic sectors, earning on average 24% less than their male counterparts [2]. The WHO report “Fair share for health and care: gender and the undervaluation of health and care work” shows that chronic underinvestment in health and care is exacerbating the global care crisis. Women make up 67% of the global paid health and care workforce and perform 76% of global unpaid care activities, which reduces women’s participation in the paid labour market and hinders gender equality, and gender inequality in health and care work negatively affects women and health outcomes [3]. Compared with offspring caregivers, spousal caregivers are more likely to live with their care recipients [4] and experience more social isolation [5], higher financial and physical burden, and more depressive symptoms [6]. Compared with other informal caregivers, spousal caregivers are more likely to be the sole caregivers at the end of life of persons with disabilities [7] and are most often engaged in high-intensity caregiving [8]. Overall, negative experiences dominate among spousal caregivers, especially those who care for severely disabled older adults in China [9].

Disability inclusion is critical to achieving health for all, and countries have an obligation to address the health inequities faced by persons with disabilities. In 2016, the Chinese government launched a targeted disability inclusion action, long-term care insurance (LTCI), in 15 cities. LTCI is part of China’s national health policy and systems research agenda on disability. It prioritizes health equity for persons with disabilities and provides a continuum of care, in the form of cash or person-centered basic life care and basic medical care services at home or in institutions, with approximately 70% reimbursement from the LTCI fund. In addition, LTCI brings family caregivers, especially women caregivers, into the paid labour force and economically empowers them.

As recipients, persons with disabilities were directly affected by LTCI. Previous literature found that LTCI not only reduced instrumental activities of daily living (IADL) and activities of daily living (ADL) scores among middle-aged and older adults, with urban residents benefiting more [10, 11], but also improved their self-rated health [12] and reduced their depressive symptoms [13,14,15,16]. Compared with healthy older adults, older persons with disabilities experienced a more statistically significant positive effect of LTCI on depression, mental state and episodic memory, through reduced healthcare costs and increased daily companionship and social interaction [17]. LTCI beneficiaries were 8.8% more likely to self-report better health and had hospital stays 2.72 days longer than non-beneficiaries [18]. Previous literature also found that LTCI not only reduced outpatient visits, hospital stays, hospital costs, and health insurance expenditures by 8.1%, 41.0%, 17.7%, and 11.4%, respectively [19], but also improved mortality, survival time, and ADL in older adults with disabilities, and that the effect of LTCI on mortality reduction was more pronounced in areas with abundant care resources [20].

LTCI also affected caregivers’ health outcomes, because it brought family caregivers into the paid labour force and economically empowered them. Previous literature found that caregiving was a stressful role [21], with higher rates of depressive symptoms among caregivers than among non-caregivers [22], and that higher role overload was associated with more depressive symptoms and lower psychological well-being among spousal caregivers of older adults with disabilities [23]. A cross-country study found that respite care and nursing allowances reduced the deterioration of self-rated health among family caregivers [24], because the care burden on family caregivers fell after they used formal care provided by LTCI [25]; the probability and intensity of informal care use were reduced by 5.7% and 17.4%, respectively, although there was no statistically significant policy effect for older adults with high or low incomes [26]. Previous literature also found that LTCI not only reduced the burden on informal caregivers but also increased their participation in the labor market; the reduction in care burden was more pronounced among spouses, and LTCI primarily benefited informal carers who provided care for older people who were low-income or farmers [27]. Compared with urban spousal caregivers in China, rural spousal caregivers gained greater health benefits from LTCI [28]. However, a study from Germany found that the home care allowance provided by LTCI did not affect caregivers’ physical health [29].

The effect of LTCI also spilled over to persons without disabilities who were not caregivers. Previous literature found that LTCI had a beneficial health effect on non-caregiver family members, with greater benefits for family members with lower education levels and lower household incomes [30]. LTCI improved self-rated health among older adults, because LTCI might have a reassuring effect [31], while among middle-aged and older adults, LTCI was effective only for urban residents and not for rural residents [32]. LTCI also statistically significantly reduced depression in middle-aged and older adults [11, 33], as well as out-of-pocket inpatient and outpatient costs [12] and the number of outpatient visits, hospitalizations and days in hospital [11, 34]. However, previous studies also found that the effect of LTCI on depression might not be significant, because improvement in depression might take longer [12], and that LTCI produced no statistically significant reduction in the number of chronic diseases [35].

Efforts to achieve health for all must focus on reaching the people most often left behind, such as marginalized, stigmatized and geographically isolated people, with a particular focus on those in situations of increased vulnerability. Compared with men, women in many societies are disadvantaged by discrimination rooted in socio-cultural factors, such as unequal power relationships and social norms that reduce women’s education and paid employment opportunities, which makes women’s health outcomes a particular concern. Persons with disabilities and caregivers are also in situations of increased vulnerability compared with others. Women’s health vulnerability relative to men is evident; however, the health vulnerability of women in different roles is inconsistent and unclear. Therefore, the objective of our study was to evaluate the effect of LTCI on health outcomes for women in different roles (rather than for women compared with men), including female recipients (i.e., women with disabilities), female caregivers, and female non-recipients and female non-caregivers, using the staggered difference-in-differences (DID) method and nationally representative health survey data, and then to examine the heterogeneity of this effect across geographic regions and between urban and rural areas.

Our contributions are as follows. Firstly, compared with the current literature, we looked beyond the single perspective of recipients or caregivers to women, a previously neglected group with a higher prevalence of disability, a majority share of the global care workforce, and poorer health outcomes, and evaluated the effect of LTCI on women’s health from the comprehensive perspective of recipients, caregivers, and non-recipients and non-caregivers. Secondly, as a practical contribution, our study demonstrated that the effect of LTCI on women’s health varied across roles, geographic regions, and urban and rural areas, which can help deliver differentiated health interventions for recipients, caregivers, and non-recipients and non-caregivers when LTCI is implemented in developing countries, with a focus on marginalized, stigmatized, and geographically isolated groups of women. Thirdly, as a social contribution, our study demonstrated that LTCI achieved tripartite welfare improvements for female recipients, female caregivers, and female non-recipients and female non-caregivers, which provides policy implications for reaching the people most often left behind and in situations of vulnerability, as well as for minimizing health inequalities among women.

Methods

Data

Our data were drawn from the China Health and Retirement Longitudinal Study (CHARLS). CHARLS collected a high-quality, nationally representative sample of Chinese residents aged 45 and over, using multi-stage stratified probability-proportional-to-size (PPS) sampling. The CHARLS questionnaire covered basic personal and community information, family structure and financial support, health status and physical measurements, medical service utilization, work, retirement and pension. The baseline wave was fielded in 2011 and included 10,000 households and 17,500 individuals in 150 counties/districts and 450 villages/resident committees. The sample was followed up every two or three years thereafter, and the 2020 wave is currently the latest. However, many items collected in earlier waves had missing values in the 2020 wave, owing to the addition of a coronavirus disease pandemic module and the simplification of other modules. Therefore, we retained data from the 2011, 2013, 2015 and 2018 waves of CHARLS.

Regarding sample screening, we retained women aged 45 and over and dropped observations with missing values on the relevant variables. Recipients in our study were persons with disabilities, identified as those who had difficulty with any ADL or IADL. ADL was the number of items with which respondents had difficulty among six basic activities (bathing, dressing, eating, getting in and out of bed, going to the toilet, and controlling urination); it ranged from 0 to 6, with higher scores indicating worse ADL. IADL was the number of items with which respondents had difficulty among five instrumental activities (financial management, taking medicine, shopping, cooking, and housework); it ranged from 0 to 5, with higher scores indicating worse IADL. Caregivers in our study were spousal caregivers, who helped their spouse (not parents, offspring, or other people) with any ADL or IADL. Non-recipients and non-caregivers were those who were neither recipients nor spousal caregivers. To eliminate interference, we excluded cities that implemented LTCI on their own without the approval of the central government. Finally, we obtained 16,707 observations of women aged 45 and over: 3,962 in the 2011 wave, 3,833 in the 2013 wave, 4,064 in the 2015 wave and 4,848 in the 2018 wave.
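The role definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item keys (for example `bed_transfer`, `helps_spouse`) are hypothetical names rather than CHARLS variable names, and the rule that disability status takes precedence when a woman is both disabled and a spousal caregiver is an assumption, since the text does not state how such overlaps were handled.

```python
# Hypothetical item keys; these are NOT actual CHARLS variable names.
ADL_ITEMS = ("bathing", "dressing", "eating", "bed_transfer",
             "toileting", "continence")
IADL_ITEMS = ("finances", "medication", "shopping", "cooking", "housework")


def difficulty_count(record, items):
    """Count the items on which the respondent reports difficulty."""
    return sum(1 for item in items if record.get(item, False))


def classify_role(record):
    """Assign one of the three roles described in the text.

    Assumption: a woman with any ADL/IADL difficulty is treated as a
    recipient even if she also helps her spouse.
    """
    adl = difficulty_count(record, ADL_ITEMS)     # ranges from 0 to 6
    iadl = difficulty_count(record, IADL_ITEMS)   # ranges from 0 to 5
    if adl > 0 or iadl > 0:
        return "recipient"
    if record.get("helps_spouse", False):
        return "caregiver"
    return "non_recipient_non_caregiver"
```

For instance, a record with difficulty in bathing and shopping has an ADL score of 1 and an IADL score of 1 and is classified as a recipient.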

Measures

The dependent variables were health outcomes, measured with three representative indicators: self-rated health [12, 31, 36, 37], depression [13,14,15, 28], and chronic diseases [35]. Self-rated health corresponded to the CHARLS question “How would you rate your health status?”. The response options were reverse-coded to “1 very poor, 2 poor, 3 fair, 4 good, and 5 very good”, and we then dichotomized the responses into a dummy variable, where “1 very poor” and “2 poor” were coded as “0 poor” and all other responses as “1 good”. Depression levels were calculated from the 10-item Center for Epidemiologic Studies Depression Scale (CESD-10), with scores ranging from 0 to 30; a score of 10 or above was considered depressed, and higher scores indicated more severe depression. Chronic diseases referred to the number of chronic diseases from which respondents suffered.
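The outcome coding described above can be illustrated as follows. The function names are hypothetical, and the sketch assumes that each CESD-10 item has already been scored 0 to 3 with any positively worded items reverse-scored beforehand, which is consistent with the stated 0 to 30 range but is not spelled out in the text.

```python
def recode_self_rated_health(score):
    """Dichotomize the reverse-coded 1-5 scale: 1-2 -> 0 (poor), 3-5 -> 1 (good)."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("self-rated health must be an integer from 1 to 5")
    return 0 if score <= 2 else 1


def cesd10_total(item_scores):
    """Sum ten item scores (each 0-3) into a total between 0 and 30.

    Assumption: items are already reverse-scored where needed.
    """
    if len(item_scores) != 10 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected ten item scores, each between 0 and 3")
    return sum(item_scores)


def is_depressed(total):
    """Apply the cut-off from the text: a total of 10 or above counts as depressed."""
    return total >= 10
```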

The independent variable was the implementation of LTCI in pilot cities, which was built from two components, Treat and Post. Treat distinguished pilot from non-pilot cities, and Post distinguished the periods before and after LTCI. The LTCI variable took the value 0 or 1. A value of 1 indicated that the city was a national LTCI pilot city and had implemented LTCI by the survey year. A value of 0 indicated that the city was not a national LTCI pilot city or had not implemented LTCI by the survey year. Figure 1 reports the evolution of the 15 national LTCI pilot cities. Only a few cities implemented LTCI in 2016 or earlier, and most did so in 2017 or later.
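A minimal sketch of how the 0/1 LTCI variable could be derived from Treat and Post is given below. The mapping `pilot_start_year` and the city labels used with it are hypothetical placeholders; the actual pilot cities and start years are those shown in Fig. 1.

```python
def ltci_indicator(city, survey_year, pilot_start_year):
    """Return Treat * Post for one city-year observation.

    pilot_start_year: dict mapping each pilot city to the first year in
    which LTCI was in force (hypothetical input, not taken from the paper).
    """
    treat = 1 if city in pilot_start_year else 0
    post = 1 if treat and survey_year >= pilot_start_year[city] else 0
    return treat * post
```

With a hypothetical mapping `{"city_A": 2016, "city_B": 2017}`, the 2018 wave gives 1 for both cities, the 2015 wave gives 0 for both, and any city absent from the mapping gives 0 in every wave.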

Fig. 1 Evolution of 15 national LTCI pilot cities

All analyses included a series of control variables associated with women’s health outcomes in previous studies, particularly social determinants of health, risk factors, and health system factors [17, 28, 32, 33, 38,39,40,41,42]. The control variables were age, public pension, education, employment, living conditions, social participation, residency arrangements, number of children, marriage status, smoking, alcohol consumption, future ADL help, and social medical insurance.

Table 1 reports the definitions and descriptive statistics of the main variables for the 16,707 women. Overall, 70.92% of women rated their health as good, their average depression score was 9.29, and their average number of chronic diseases was 1.78. About 6.21% of women lived in cities that had implemented the LTCI policy. Their average age was 60.77 years, 33.99% received a public pension, and only 22.15% had an education level of middle school or above. In total, 56.77% of women had worked in any job in the past year, 61.61% lived in rural villages, and 48.09% had participated in any social activity within the past month. Furthermore, 52.71% of women lived with their children, the average number of children was 2.82, and only 0.07% of women were unmarried. In addition, 5.36% of women currently smoked and 13.81% had consumed alcohol within the past 12 months. Only 1.26% of women felt that professionals would be able to help them with their ADL needs in the future, while 93% participated in social medical insurance.

Table 1 Definition and descriptive statistics (N = 16,707)

Statistical analysis

The staggered difference-in-differences (DID) method is a quasi-experimental technique for constructing a counterfactual framework [43]. Because treatment timing was staggered, we specified the staggered DID model following previous work [44]:

$$\mathrm{HEA}_{ict}=\alpha_{1}+\theta_{1}\,\mathrm{Treat}_{ic}*\mathrm{Post}_{ct}+\lambda_{1}Z_{ict}+\eta_{c}+\mu_{t}+\epsilon_{ict}\tag{1}$$

where \(\mathrm{HEA}_{ict}\) denotes the health outcomes of woman i living in city c at time t. \(\mathrm{Treat}_{ic}\) represents the treated group status (i.e., pilot list status) of woman i living in city c. \(\mathrm{Post}_{ct}\) represents the LTCI post status of city c at time t. \(\theta_{1}\) measures the effect of LTCI on health outcomes in women aged 45 and over. \(\lambda_{1}\) is the coefficient vector on the control variables \(Z_{ict}\). \(\eta_{c}\) and \(\mu_{t}\) represent city and year fixed effects. \(\epsilon_{ict}\) represents random disturbances that affect health. Finally, standard errors were clustered at the city level to correct for possible autocorrelation and heteroscedasticity.
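To make the structure of Eq. (1) concrete, the following pure-Python sketch estimates the coefficient on Treat*Post by ordinary least squares with city and year dummies. It is a simplified illustration rather than the authors' estimation code: the controls and the city-clustered standard errors are omitted, the helper names are hypothetical, and in practice an econometrics package would be used instead of a hand-written solver.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def twfe_did(rows):
    """Estimate theta_1 from rows of (city, year, treat_post, outcome).

    Regressors: intercept, Treat*Post, city dummies and year dummies
    (first city and first year dropped as reference categories).
    Controls Z and clustered standard errors are left out of this sketch.
    """
    cities = sorted({r[0] for r in rows})
    years = sorted({r[1] for r in rows})

    def design(city, year, d):
        x = [1.0, float(d)]
        x += [1.0 if city == c else 0.0 for c in cities[1:]]
        x += [1.0 if year == y else 0.0 for y in years[1:]]
        return x

    X = [design(r[0], r[1], r[2]) for r in rows]
    y = [float(r[3]) for r in rows]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return beta[1]  # coefficient on Treat*Post
```

On noise-free simulated data in which the outcome equals a city effect plus a year effect plus a constant treatment effect, the function recovers that treatment effect up to floating-point error.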

The parallel trend assumption is a key prerequisite for the staggered DID method: it requires that trends in women’s health outcomes in pilot and non-pilot cities were parallel before the implementation of LTCI. Therefore, using the event-study method proposed in previous work [45], we established a parallel trend test model:

$$\mathrm{HEA}_{ict}=\alpha_{1}+\theta_{t}\sum_{-3}^{3}\mathrm{Treat}_{ic}*\mathrm{Post}_{ct}+\lambda_{1}Z_{ict}+\eta_{c}+\mu_{t}+\epsilon_{ict}\tag{2}$$

where \(\theta_{t}\) reflects the disparity in health outcomes between pilot and non-pilot cities at time t relative to the implementation of the LTCI policy. Because there were few observations 4 years before LTCI and 3 years after LTCI, we aggregated the data from 4 years before LTCI into year -3 and the data from 3 years after LTCI into year 3, and treated year -4 as the base year. The other variables are defined as in Eq. (1).
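The aggregation of sparse event-time periods into the end points of the window can be sketched as below. This is an assumption-laden illustration: it treats relative time as the survey year minus the policy start year, clips it to the window from -3 to 3, and leaves the choice of omitted reference period to the analyst; the function names are hypothetical.

```python
def event_time(survey_year, policy_year, lower=-3, upper=3):
    """Relative year since LTCI started, clipped to [lower, upper]."""
    rel = survey_year - policy_year
    return max(lower, min(upper, rel))


def event_dummies(survey_year, policy_year, lower=-3, upper=3):
    """One 0/1 indicator per event-time bin.

    Never-treated cities (policy_year is None) get zeros in every bin,
    mirroring Treat = 0 in the interaction term.
    """
    bins = {k: 0 for k in range(lower, upper + 1)}
    if policy_year is not None:
        bins[event_time(survey_year, policy_year, lower, upper)] = 1
    return bins
```

For a city that started LTCI in 2016, the 2011 wave is five years before the policy and is therefore placed in the -3 bin, while the 2018 wave falls in bin 2.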

Results

Main results

Table 2 reports the results of the staggered DID model. In the full sample of women, the coefficient of Treat*Post was 0.0289 (significant at the 5% level) for self-rated health, -0.7727 (significant at the 1% level) for depression, and -0.0447 (not significant) for chronic diseases. The coefficient of Treat*Post was 0.0664 (significant at the 5% level) for self-rated health in the sample of female recipients, -0.3767 (significant at the 10% level) for chronic diseases in the sample of female caregivers, and -0.7754 (significant at the 1% level) for depression in the sample of female non-recipients and female non-caregivers; it was not significant in the other columns.

The results indicated that, compared with women in non-pilot cities, LTCI statistically significantly increased self-rated health and reduced depression levels among women in pilot cities, and improved the health of women in different roles by increasing self-rated health in female recipients, reducing the number of chronic diseases in female caregivers, and reducing depression levels in female non-recipients and female non-caregivers. However, there was no evidence that LTCI statistically significantly improved the other health outcomes of women in these roles in pilot cities.

Table 2 The results of staggered DID model in women with different roles

Parallel trend and robustness tests

Table 3 reports the parallel trend test results obtained with the event-study method, using 95% confidence intervals and year -4 as the base year. Overall, the coefficients of Treat*Post were not statistically significant in years -3, -2 and -1 (except for self-rated health among all women in year -3 and among female recipients in year -1), whereas from year 0 onward they were mostly statistically significant and carried the expected treatment effect signs. Therefore, our sample generally passed the parallel trend test.

Table 3 Parallel trend test

To assess the potential bias associated with staggered DID and increase the credibility of the results [46], we conducted three robustness checks. Firstly, because some covariates may bias two-way fixed effects estimates even when treatment timing does not vary, we reported the results of the stacked regression without covariates for the full sample of women, to gauge the robustness of the effect estimates and how much they depend on the inclusion of controls. Secondly, we redefined the dependent variables in the samples of women with different roles: self-rated health was redefined as a continuous variable, the number of chronic diseases as the prevalence of chronic disease comorbidities, and depression level as the prevalence of depression. Thirdly, we conducted a placebo test by randomly selecting the treated group and the pilot time and repeating this random selection 500 times.
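The random reassignment behind the placebo test can be sketched as follows. The functions are hypothetical helpers, not the authors' code, and two details are assumptions: pseudo start years are drawn from all survey waves except the first so that every pseudo-treated city has at least one pre-period, and the seed is fixed only to make the sketch reproducible.

```python
import random


def placebo_assignments(cities, years, n_treated, n_iter=500, seed=0):
    """Draw n_iter pseudo-treatment assignments.

    Each draw maps n_treated randomly chosen cities to a random pseudo
    start year (assumption: never the first wave).
    """
    rng = random.Random(seed)
    city_list = sorted(cities)
    candidate_years = sorted(years)[1:]
    draws = []
    for _ in range(n_iter):
        fake_treated = rng.sample(city_list, n_treated)
        draws.append({c: rng.choice(candidate_years) for c in fake_treated})
    return draws


def placebo_indicator(city, year, fake_start):
    """Pseudo Treat*Post for one city-year under a placebo assignment."""
    return 1 if city in fake_start and year >= fake_start[city] else 0


def share_as_extreme(placebo_coefs, actual_coef):
    """Share of placebo coefficients at least as large in magnitude as the actual one."""
    hits = sum(abs(b) >= abs(actual_coef) for b in placebo_coefs)
    return hits / len(placebo_coefs)
```

Each pseudo assignment would be passed through the same regression as the main model, and the resulting 500 coefficients compared with the actual estimate; a small share of placebo coefficients as extreme as the actual one suggests that the main result is unlikely to arise by chance.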

Table 4 reports the results of robustness tests 1 and 2. The results indicated that LTCI still statistically significantly increased women’s self-rated health and reduced their depression, increased self-rated health for female recipients, reduced chronic diseases for female caregivers, and reduced depression for female non-recipients and female non-caregivers, consistent with the results of the staggered DID model.

Table 4 Robustness test 1 and test 2

Figure 2 reports the kernel density distribution of the regression coefficients across the 500 simulation iterations for women, female recipients, female caregivers, and female non-recipients and female non-caregivers. The placebo coefficients were concentrated around 0 and approximately normally distributed, whereas the actual estimated coefficients (vertical dotted lines) lay far from 0 as clear outliers. This indicates that random factors did not drive our results, so the placebo test was passed. Therefore, our results were generally robust.

Fig. 2 Placebo test in women with different roles

Analysis of heterogeneity

We explored the geographical and social heterogeneity of LTCI’s effect on women’s health outcomes. Table 5 reports the geographical heterogeneity results for women. In the west and northeast of China, compared with women in non-pilot cities, LTCI statistically significantly reduced depression (Treat*Post coefficient -0.6324, significant at the 5% level) and chronic diseases (Treat*Post coefficient -0.1544, significant at the 5% level) among women in pilot cities. In the east and central regions, LTCI statistically significantly reduced depression (Treat*Post coefficient -0.7628, significant at the 1% level) among women in pilot cities compared with women in non-pilot cities. In other words, the improvement in health outcomes was more pronounced among women in the west and northeast.

Table 5 Geographical heterogeneity of women

Table 6 reports the social heterogeneity results for women. In rural villages, compared with women in non-pilot cities, LTCI statistically significantly increased self-rated health (Treat*Post coefficient 0.0464, significant at the 5% level) and reduced depression (Treat*Post coefficient -1.2605, significant at the 1% level) among women in pilot cities. However, there was no evidence that LTCI statistically significantly affected health outcomes for women living in urban communities in pilot cities compared with women in non-pilot cities. In other words, the improvement in health outcomes was more pronounced among women in rural villages.

Table 6 Social heterogeneity of women

Discussion

Using a nationally representative sample and the staggered DID method, we demonstrated that, compared with women in non-pilot cities, LTCI statistically significantly increased self-rated health and reduced depression among women in pilot cities, and improved the health of women in different roles. Specifically, compared with their female counterparts in non-pilot cities, LTCI statistically significantly increased self-rated good health among female recipients, reduced chronic diseases among female caregivers, and reduced depression among female non-recipients and female non-caregivers in pilot cities.

Possible explanations are as follows. For recipients, who were more vulnerable than persons without disabilities, the implementation of LTCI improved the affordability, accessibility, and quality of care services, alleviated their disability levels [10, 11], reduced medical costs [19], and increased the daily companionship and social interaction they received [17], thereby improving the health outcomes of female recipients. Our results were consistent with previous studies [12,13,14,15,16]. For caregivers, caregiving was a stressful role [21], and negative experiences dominated among spousal caregivers, with higher financial and physical burden, depressive symptoms [6] and high-intensity caregiving [8]. However, the implementation of LTCI reduced their caregiving burden [25], especially for spousal caregivers [27, 30], and reduced the probability and intensity of informal care use [26], thereby improving the health outcomes of female caregivers. For non-disabled women who did not provide care, even though they did not receive care services or cash grants, LTCI reassured them by providing protection against the risk of future disability, reduced their probability of providing care, and increased the duration of their nighttime sleep [30], thereby improving the health outcomes of female non-recipients and female non-caregivers. Our results were supported by previous studies [12, 18, 24, 28, 33].

We also demonstrated that the effect of LTCI on women’s health outcomes was geographically and socially heterogeneous. Compared with women in the east and central regions and women in urban communities, the positive effect of LTCI on women’s health outcomes was more pronounced among women in the west and northeast and women in rural villages. Potential explanations are as follows. Firstly, we believe the positive effect was more pronounced in the west and northeast because regional economic differences produced differences in the care burden on families. Compared with the east and central regions, where the economic level in China is higher, the west and northeast have a lower economic level and limited purchasing power for formal care services, resulting in a heavier care burden on families. After the implementation of LTCI, families of persons with disabilities in the west and northeast could obtain care services or cash benefits from LTCI, which greatly alleviated their care burden.

Additionally, we believe the positive effect of LTCI on women’s health outcomes was more pronounced in rural villages because of the rural-urban gap in chronic underinvestment in health and care work. Chronic underinvestment was worse in rural villages than in urban communities, so women in rural villages may undertake more unpaid care work and the care burden on their families was heavier. It was estimated that 51–67% of the rural population could not obtain adequate basic health services, that in some countries the number of health workers available to the rural population was 10 times lower than the number available to the urban population [47], and that the rate of medical rehabilitation service utilization in urban areas was almost twice that in rural villages [48]. After the implementation of LTCI, the care burden on families in rural villages was greatly alleviated, and caregivers were included more equitably in the paid labour force by obtaining care services or cash benefits.

Our study has limitations. The data on women’s health outcomes in CHARLS were self-reported and may be susceptible to recall bias. In addition, many factors influence women’s health outcomes, and we controlled for only some of them. These limitations could be addressed in future studies.

Conclusions

In summary, our study found that after the implementation of LTCI in China, health outcomes among women, including female recipients, female caregivers, and female non-recipients and female non-caregivers, were statistically significantly improved, and that the effect of LTCI on women’s health outcomes was geographically and socially heterogeneous. Our findings highlight the importance of delivering differentiated health interventions for recipients, caregivers, and non-recipients and non-caregivers when implementing LTCI, and of minimizing geographical and social health inequalities among women.