Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), responsible for the COVID-19 pandemic, has drawn global attention not only because of the immediate symptoms caused by the virus itself but also because of subsequent physical and mental health sequelae1. Since 2020, there have been over 761 million reported cases of SARS-CoV-2 infection, with ~6.8 million deaths among infected individuals2. Despite its relatively low fatality rate of ~1.3% (ref. 3), ~10% of patients with SARS-CoV-2 infection report persistent, long-lasting comorbidities after infection, termed post-acute COVID-19 condition4. Post-acute COVID-19 condition refers to persistent or new-onset health outcomes that last more than a month after SARS-CoV-2 infection, including both short-term (4–12 weeks) and long-term (>12 weeks) symptoms and sequelae5.

Given the nature of SARS-CoV-2, the infection can trigger adverse effects on the respiratory system and post-acute COVID-19 respiratory sequelae. One previous study highlighted the association between acute respiratory complication or post-acute respiratory sequelae and COVID-19 (ref. 6). Acute respiratory complication is an umbrella term for illnesses of sudden onset that affect the respiratory system7,8. Post-acute respiratory sequelae refers to a broad category of long-term non-infectious respiratory diseases that affect the lungs and airways7,8. For perspective, the influenza virus is also a well-known viral inducer of respiratory failure9. However, insufficient attention has been given to the impact of SARS-CoV-2 infection on acute respiratory complications or post-acute respiratory sequelae in comparison with influenza, a common respiratory viral infection.

Therefore, by using a binational, large-scale, long-term, population-based database with more than 22 million participants in South Korea and Japan, we aimed to investigate the impact of SARS-CoV-2 infection on the development of acute respiratory complication or post-acute respiratory sequelae. We also examined whether COVID-19 vaccinations offer protection against COVID-19-related respiratory outcomes. Furthermore, we compared complications following SARS-CoV-2 infection with those following influenza infection.

Results

In the main cohort, there were a total of 10,027,506 participants with a mean age of 48.4 (standard deviation [SD], 13.4) years, of whom 49.9% (5,000,621/10,027,506) were female (Table S1). The replication cohort included 4,909,861 participants with a mean age of 46.8 (SD, 11.9) years, of whom 38.3% (1,882,174/4,909,861) were female (Table S2). Table 1 shows the baseline characteristics of the 1:5 propensity score-matched cohort of South Korea. After 1:5 propensity score matching based on SARS-CoV-2 infection, the matched cohort comprised 82.9% (1,918,150/2,312,748) of participants without SARS-CoV-2 infection and 17.1% (394,598/2,312,748) of participants with SARS-CoV-2 infection.

Table 1 Baseline characteristics for 1:5 propensity score–matched cohort (COVID-19 vs. general population) in South Korea (main)

In the 1:3 propensity score-matched replication cohort, 74.4% (2,318,505/3,115,606) of participants without SARS-CoV-2 infection and 25.6% (797,101/3,115,606) of participants with SARS-CoV-2 infection were included in our final analyses (Table S3). The standardized mean differences (SMD) of all matching covariates in both multi-to-one propensity score-matched main and replication cohorts were smaller than 0.1 (Table 1).
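
The balance criterion above (SMD smaller than 0.1) can be illustrated for a single binary covariate. This is a minimal sketch that assumes the common pooled-variance definition of the SMD; the function name `smd_binary` and the example proportions are hypothetical and are not values taken from Table 1.

```python
import math


def smd_binary(p_exposed: float, p_control: float) -> float:
    """Standardized mean difference for a binary covariate.

    Assumed definition: (p1 - p0) / sqrt((p1(1 - p1) + p0(1 - p0)) / 2).
    """
    pooled_var = (p_exposed * (1 - p_exposed) + p_control * (1 - p_control)) / 2
    if pooled_var == 0:
        # Both groups are constant; the SMD is 0 if they agree, unbounded otherwise.
        if p_exposed == p_control:
            return 0.0
        return math.copysign(math.inf, p_exposed - p_control)
    return (p_exposed - p_control) / math.sqrt(pooled_var)


# Hypothetical proportions of one covariate in the exposed and matched control groups.
smd = smd_binary(0.50, 0.49)
print(f"SMD = {smd:.3f}; balanced (|SMD| < 0.1): {abs(smd) < 0.1}")
```

A one-percentage-point difference in a covariate near 50% prevalence gives an SMD of about 0.02, well inside the 0.1 threshold.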

In the main and replication cohorts, individuals with SARS-CoV-2 infection had a higher adjusted hazard ratio (HR) for post-acute respiratory sequelae compared to the general population (main: HR, 1.68 [95% confidence interval (CI), 1.62–1.75]; replication: HR, 3.32 [95% CI, 3.27–3.37]; Table 2). Furthermore, patients with SARS-CoV-2 infection had an increased risk for acute respiratory complication compared to non-infected controls (main: HR, 8.06 [95% CI, 6.92–9.38]; replication: HR, 4.17 [95% CI, 3.90–4.45]). When directly comparing the risk for acute respiratory complication between SARS-CoV-2 and influenza infections, SARS-CoV-2 infection was significantly associated with an increased risk (main: HR, 4.32 [95% CI, 2.73–6.83]; replication: HR, 6.51 [95% CI, 5.38–7.87]; Tables S4–S6).
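
For readers who want to relate a reported HR to its interval, the sketch below recovers an approximate standard error on the log scale. It assumes the reported intervals are Wald intervals that are symmetric around log(HR), which the text does not state explicitly; the function names are hypothetical.

```python
import math


def log_scale_se(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate SE of log(HR) from a 95% CI, assuming a Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def wald_z(hr: float, lower: float, upper: float) -> float:
    """Approximate Wald z statistic for the null hypothesis HR = 1."""
    return math.log(hr) / log_scale_se(lower, upper)


# Reported main-cohort estimate for post-acute respiratory sequelae.
hr, lower, upper = 1.68, 1.62, 1.75
print(f"SE(log HR) ~ {log_scale_se(lower, upper):.4f}")
print(f"z ~ {wald_z(hr, lower, upper):.1f}")
# Under the symmetry assumption, the geometric midpoint of the interval
# should sit close to the point estimate.
print(f"sqrt(lower * upper) = {math.sqrt(lower * upper):.2f}")
```

The narrow interval reflects the very large matched cohort rather than a large effect size.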

Table 2 HR (95% CI) for the post-acute respiratory sequelae or acute respiratory complications after SARS-CoV-2 infection in the propensity score-matched cohorts of South Korea (main) and Japan (replication)

Relative to the general population, patients with SARS-CoV-2 infection had a significantly increased risk for several subtypes of post-acute respiratory sequelae, including chronic respiratory failure (main: HR, 8.92 [95% CI, 4.92–16.17]; replication: HR, 7.55 [95% CI, 6.35–8.97]), chronic obstructive pulmonary disease (COPD), emphysema, asthma, pulmonary sarcoidosis, and interstitial lung disease (main: HR, 10.38 [95% CI, 8.75–12.31]; replication: HR, 4.75 [95% CI, 4.54–4.97]) (Table 3). Notably, the risk for acute respiratory complications, including aspergillosis pneumonia (main: HR, 6.85 [95% CI, 3.48–13.50]; replication: HR, 4.97 [95% CI, 4.26–5.79]), pneumothorax, and acute respiratory failure (main: HR, 112.04 [95% CI, 64.00–196.16]; replication: HR, 6.49 [95% CI, 6.32–6.65]), was also increased in patients with SARS-CoV-2 infection compared to the general population. This tendency toward increased risk for several subtypes of respiratory diseases was also observed in comparison with patients with influenza infection and in the overlap-weighted cohort (Tables S7–S10). Estimates of marginal prevalence showed that patients with COVID-19 had a higher prevalence compared to the general population (Tables S11 and S12).

Table 3 HR (95% CI) for the post-acute respiratory sequelae or acute respiratory complications subtypes after SARS-CoV-2 infection in the propensity score-matched cohorts in South Korea (main) and Japan (replication)

The risk of acute respiratory complication decreased with the number of SARS-CoV-2 vaccinations, from individuals who had received one vaccination (HR, 0.51 [95% CI, 0.38–0.68]) to those with two or more vaccinations (HR, 0.24 [95% CI, 0.19–0.30]). Interestingly, mixed types of vaccination showed the lowest risk of developing post-acute respiratory sequelae of all SARS-CoV-2 vaccination methods (HR, 0.18 [95% CI, 0.08–0.38]). The risk of acute respiratory complications was higher in patients with moderate to severe COVID-19 symptoms (HR, 39.54 [95% CI, 33.54–46.62]). Both the original strain and the delta variant of SARS-CoV-2 were associated with a higher risk of acute respiratory complications (original strain: HR, 9.21 [95% CI, 7.19–11.80]; delta variant: HR, 7.44 [95% CI, 6.13–9.03]). In addition, the risk for post-acute respiratory sequelae exhibited a similar pattern (Tables 4 and S13).

Table 4 Subgroup analysis (COVID-19 vs. general population) of HR (95% CI) of the post-acute respiratory sequelae or acute respiratory complications after SARS-CoV-2 infection stratified by vaccination, COVID-19 severity, and SARS-CoV-2 strain in the cohort of South Korea (main)

Table 5 shows the risk of developing acute respiratory complications or post-acute respiratory sequelae according to the time elapsed since SARS-CoV-2 infection, compared to the general population. The analysis of post-acute respiratory sequelae excluded data from the first month after SARS-CoV-2 infection. The first 3 months after infection with SARS-CoV-2 had the highest risk of developing post-acute respiratory sequelae (main: HR, 2.51 [95% CI, 2.38–2.64]; replication: HR, 4.40 [95% CI, 4.30–4.51]). With increasing time since SARS-CoV-2 infection, the risk of post-acute respiratory sequelae significantly decreased, but the risk remained even after 6 months (main: HR, 1.10 [95% CI, 1.01–1.19]; replication: HR, 2.67 [95% CI, 2.61–2.73]). The time attenuation effect after SARS-CoV-2 infection was likewise significant when compared with influenza infection (Table S14). Similar associations were observed in the stratification analysis according to sex, age, household income, Charlson comorbidity index (CCI), body mass index (BMI), alcohol consumption, physical activity, region of residence, income level, and polymerase chain reaction (PCR) test in the propensity score-matched cohorts (Tables S15–S24).

Table 5 Time attenuation effect analysis of HR (95% CI) for the risk of post-acute respiratory sequelae after SARS-CoV-2 infection in South Korea (main cohort) and Japan (replication cohort)

Discussion

Findings of this study

This is the first study to use population-based, binational, large-scale databases from South Korean and Japanese nationwide cohorts to examine the association of SARS-CoV-2 infection with acute respiratory complication or post-acute respiratory sequelae. First, the risk of acute respiratory complication or post-acute respiratory sequelae was significantly increased in participants with SARS-CoV-2 infection compared to the general population. Second, SARS-CoV-2 infection was associated with a significantly increased risk for several specific post-acute respiratory sequelae, including chronic respiratory failure, COPD, emphysema, asthma, and interstitial lung disease, compared to the general population. In addition, several acute respiratory complications, including aspergillosis pneumonia, pneumothorax, acute respiratory failure, and pulmonary embolism, also showed a notable increase in risk after SARS-CoV-2 infection compared to the general population. Third, people who were vaccinated, especially those with multiple or mixed vaccinations, had a lower risk of developing post-acute respiratory sequelae than patients infected with SARS-CoV-2 without vaccination. Fourth, the risk of post-acute respiratory sequelae and acute respiratory complications increased with the severity of COVID-19. Fifth, SARS-CoV-2 infection was associated with an increase in post-acute respiratory sequelae and acute respiratory complications regardless of the strain type. Lastly, the risk of post-acute respiratory sequelae diminished with time following SARS-CoV-2 infection yet persisted beyond 6 months post-infection.

Comparisons with previous studies

Some previous studies have examined the relationship between COVID-19 and respiratory complications. However, previous research was conducted in countries such as Brazil (n = 88)10, Palestine (n = 705)11, and the Netherlands (n = 257)12, with small cohorts and without a general population group as controls or an influenza cohort as an external comparator. In addition, previous studies did not consider the effects of specific respiratory disease subtypes, vaccination, and COVID-19 severity on the association between respiratory outcomes and SARS-CoV-2 infection13. Therefore, our study is distinct from other studies in that we compared the association of COVID-19 and respiratory diseases with that of influenza by using population-based binational cohorts of a generalizable scale (main cohort, total N = 10,027,506; replication cohort, total N = 4,909,861).

Possible explanations

Influenza and COVID-19 have similar symptoms such as fever, cough, shortness of breath, and sore throat14. However, people who are infected with SARS-CoV-2 may have more severe symptoms and take longer to recover than those infected with influenza15. The longer duration and greater severity of COVID-19 may have a greater influence on patients and their overall immune system16,17. Severe COVID-19 can lead to lasting changes in hematopoietic stem and progenitor cells and immune cell phenotypes in individuals while they are recovering from COVID-19 (ref. 16). Also, T cells can be impaired in severe disease, which can be associated with intense activation and lymphopenia18. Therefore, SARS-CoV-2 infection can make patients more vulnerable to developing other respiratory diseases. This aligns with our finding that SARS-CoV-2 infection has a greater influence on developing acute respiratory complications or post-acute respiratory sequelae than influenza.

Unlike influenza, SARS-CoV-2 infection induces a fibrosis-associated transcriptional profile in pulmonary macrophages, characterized by elevated levels of transforming growth factor beta 1 and transforming growth factor beta induced, as well as other proteins like macrophage mannose receptor 1 and cluster of differentiation 163 (refs. 19,20). This gene expression pattern enhances the profibrotic functions of macrophages20, potentially leading to acute respiratory distress syndrome. Aspergillosis pneumonia, an infection caused by inhaling spores of the fungus Aspergillus, which is prevalent in the natural environment21, does not usually develop. However, many countries use antibiotics for COVID-19 treatment22,23,24, and overuse of antibiotics can heighten the risk of aspergillosis pneumonia because antibiotics may disturb the immune system25,26. This may explain our finding that specific acute respiratory complications and post-acute respiratory sequelae showed a notable increase in risk after SARS-CoV-2 infection compared to influenza virus infection.

Many studies have confirmed that vaccination significantly reduces the infection rate of SARS-CoV-2 and the severity of COVID-19 symptoms27,28. The efficacy of vaccination is more profound in preventing severe cases and deaths29. The reduced severity due to vaccination may positively affect immune resilience30, thereby decreasing the incidence of acute respiratory complication or post-acute respiratory sequelae. Given that the efficacy of vaccination drops in the first 6 months, booster vaccination might be essential to sustain protective effects29. This is consistent with our finding that multiple vaccinations decrease the development of acute respiratory complications or post-acute respiratory sequelae. Furthermore, many studies have shown that mixing types of vaccination for COVID-19 may lower the risk of SARS-CoV-2 infection31. This approach could also mitigate the development of future respiratory complications.

After recovery from COVID-19, the immune system undergoes reconstruction32. However, elevated interferon-responsive genes in monocytes can still be found 4 months after infection33, which implies that the immune system has not fully recovered by that point and that patients require continued attention. Our findings show that as time passes after initial infection with SARS-CoV-2, the risk of developing acute respiratory complications or post-acute respiratory sequelae gradually decreases. However, the risk for respiratory sequelae in post-acute COVID-19 condition persisted beyond 6 months post-infection.

Limitations and strengths

This is the first study to utilize binational, large-scale, population-based databases to examine the risk for respiratory sequelae in acute or post-acute COVID-19 conditions in patients with SARS-CoV-2 infection. However, some limitations must be taken into consideration. First, although the database used is highly credible and covers 98% of the Korean population and 40% of the Japanese population, individuals who could be vulnerable to influenza and COVID-19, such as immigrants and undocumented immigrants, are left out of the database34,35,36. Likewise, the JMDC database does not include the entire Japanese population and may have potential bias. Second, our data are limited to the East Asian population, specifically South Korea and Japan. Therefore, our findings are difficult to generalize to other ethnic groups. Third, the K-COV-N and JMDC datasets are heterogeneous. Therefore, we opted against merging the datasets, using the K-COV-N data for the main cohort and the JMDC data for the replication cohort. In addition, we used different lists of covariates for the main and replication cohorts because of the difference in data structure. Fourth, the dataset we utilized has a risk of underdiagnosis of SARS-CoV-2 and influenza infection. There is a possibility of overlooking patients who were infected with SARS-CoV-2 or influenza but did not take a PCR test or visit a hospital to receive treatment. However, to assess the potential underdiagnosis, we analyzed the HR of post-acute respiratory sequelae and short-term acute respiratory complications among participants who had undergone PCR tests. Fifth, the HR and its 95% CI for the risk of asthma may differ from previous research because of differences in study design, including the study population or the definition of exposure. Sixth, the propensity score-matched cohort had differential missingness between those infected with SARS-CoV-2 and the general population for national health examination information variables (BMI, blood pressure, fasting blood glucose, glomerular filtration rate, smoking status, alcohol consumption, and aerobic physical activity; >40% versus <1%), due to their exclusion from matching criteria.

Policy implications

This binational, large-scale, population-based cohort study further emphasizes the risks related to SARS-CoV-2 infection, the importance of vaccination, efficient vaccination methods, and post-acute COVID-19 conditions, with an emphasis on acute respiratory complications or post-acute respiratory sequelae. These findings indicate a need for targeted health policies to manage public health. To minimize adverse respiratory outcomes after SARS-CoV-2 infection, governments should consider policies that allow vaccine types to be mixed and matched for individuals. Individuals should continue to be monitored even after full recovery from COVID-19 to manage post-acute COVID-19 conditions.

In conclusion, this study emphasizes that the risk of developing acute respiratory complications or post-acute respiratory sequelae in post-COVID-19 condition is associated with infection of SARS-CoV-2, and the risk was more pronounced with increasing COVID-19 severity. People who were vaccinated had a lower risk of developing acute respiratory complications or post-acute respiratory sequelae than those without vaccination. While the risk of acute respiratory complications or post-acute respiratory sequelae decreases with time post-SARS-CoV-2 infection, it remains evident beyond 6 months. Therefore, our findings suggest that the potential risk of respiratory sequelae in acute or post-acute COVID-19 conditions accentuates the imperative for continued vigilance and response to SARS-CoV-2.

Methods

Data source

Utilizing large-scale, population-based binational cohorts, this study incorporated a South Korean nationwide claim-based cohort (K-COV-N cohort; total N = 10,027,506) for the main cohort and a Japanese claim-based cohort (JMDC cohort; total N = 4,909,861) for the replication cohort (Fig. 1)37. Both the K-COV-N and the JMDC cohorts were constructed through data derived from a universal health insurance claims system. This study received approvals from the Korea Disease Control and Prevention Agency (KDCA), National Health Insurance Service (NHIS; KDCA-NHIS-2022-1-632), JMDC (PHP-00002201-04), and the Institutional Review Board of Kyung Hee University (KHSIRB-23-241). Under the terms of the approval, patient consent was not required to use routine health records for our study.

Fig. 1 Study population in the main cohort (South Korea) and replication cohort (Japan).

K-COV-N cohort for main cohort

For the main cohort, we utilized the NHIS database, which is a large-scale, nationwide, general, population-based cohort in South Korea covering 98% of the population34. The NHIS and the KDCA provided data for the cohort constructed for study purposes, which includes participants ≥20 years old with a record of medical examination from January 1, 2018, to December 31, 2021 (total N = 10,027,506). The dataset consists of national health examination information, death records, and health insurance data (including individual demographic information, outpatient/inpatient records, and pharmaceutical data) from the NHIS, together with COVID-19 vaccination data, SARS-CoV-2 test results, and COVID-19-related outcomes from the KDCA. The constructed K-COV-N database has the following characteristics: (1) the Korean government has established an extensive healthcare system to provide coverage for individuals infected with SARS-CoV-2; (2) all patient-related data were anonymized by the Korean government34,36; and (3) according to a prior study, the diagnostic records from the NHIS had a predictive accuracy of 82% (ref. 38).

Previous diagnostic history was assessed during the pre-observation period from 2018 to 2019, and the follow-up observation period was between 2020 and 2021. Follow-up ended on December 31, 2021, at death, or at development of a primary outcome, whichever came first (Fig. S1). We excluded participants who met the following criteria: (1) insufficient demographic information and those who died before (excluded n = 3,967,482); and (2) previous history of chronic respiratory disease in the pre-observation period (excluded n = 710,468).
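
The censoring rule described above can be sketched as follows. This is an illustrative reading that takes the earliest of the three stated end points; the function names and the use of person-days are assumptions, not details from the study protocol.

```python
from datetime import date
from typing import Optional

STUDY_END = date(2021, 12, 31)  # administrative end of follow-up stated in the text


def follow_up_end(death: Optional[date], outcome: Optional[date]) -> date:
    """Earliest of study end, death, and first primary outcome."""
    candidates = [STUDY_END]
    candidates += [d for d in (death, outcome) if d is not None]
    return min(candidates)


def person_days(index_date: date, death: Optional[date], outcome: Optional[date]) -> int:
    """Days of follow-up contributed from the index date (never negative)."""
    return max((follow_up_end(death, outcome) - index_date).days, 0)


# Hypothetical participant: infected on 2021-01-01, outcome recorded on 2021-03-01.
print(person_days(date(2021, 1, 1), None, date(2021, 3, 1)))
```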

Exposures

Exposure to SARS-CoV-2 was defined as an infection validated using a real-time reverse transcriptase polymerase chain reaction (RT-PCR) assay or antigen test on nasal and pharyngeal swabs, as approved by the KDCA39. Patients necessitating intensive care, oxygen therapy, extracorporeal membrane oxygenation, renal replacement, or cardiopulmonary resuscitation were classified as having moderate to severe COVID-19 (ref. 40). All other cases were categorized as having mild COVID-19 (ref. 41). COVID-19 vaccination was classified according to the number of vaccinations (unvaccinated, 1, and ≥2 times) and vaccine type (unvaccinated, mRNA vaccinated [Pfizer-BioNTech and Moderna], viral vector vaccinated [Oxford-AstraZeneca and Johnson & Johnson/Janssen], and vaccinated with both types)36. Only vaccination status before SARS-CoV-2 infection was considered in our analysis. In South Korea, SARS-CoV-2 infection from January 2020 to July 31, 2021, was defined as the original strain, and SARS-CoV-2 infection from August 1, 2021, to December 31, 2021, was defined as the delta variant36,37,42. In Japan, diagnosis of SARS-CoV-2 infection was categorized as infection with the original strain until May 31, 2021, and with the delta variant from June 1, 2021, to December 31, 2021 (refs. 36,43). To examine the relative severity of COVID-19 in comparison with another contagious viral respiratory disease, an additional exposure, influenza infection, was defined. It refers to cases diagnosed through an RT-PCR assay or antigen test on nasal and pharyngeal swabs during the observation period. For individuals infected with both SARS-CoV-2 and influenza, this includes instances in which influenza infection developed after the SARS-CoV-2 infection.
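
The date-based strain definition can be written as a small lookup. The cut-over dates come from the text; the country codes `"KR"` and `"JP"`, the function name, and the choice of January 1, 2020, as the first eligible day (the text says only "January 2020") are assumptions made for this sketch.

```python
from datetime import date

# First day counted as the delta period in each country, per the text.
DELTA_START = {"KR": date(2021, 8, 1), "JP": date(2021, 6, 1)}
PERIOD_START = date(2020, 1, 1)   # assumption: "January 2020" read as January 1
PERIOD_END = date(2021, 12, 31)


def classify_strain(infection_date: date, country: str) -> str:
    """Return 'original' or 'delta' from the infection date and country code."""
    if not PERIOD_START <= infection_date <= PERIOD_END:
        raise ValueError("infection date outside the 2020-2021 observation period")
    return "delta" if infection_date >= DELTA_START[country] else "original"


print(classify_strain(date(2021, 7, 15), "KR"))
print(classify_strain(date(2021, 7, 15), "JP"))
```

The same calendar date can map to different strain labels in the two countries, which is one reason the cohorts are analysed separately rather than pooled.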

Outcomes

To investigate the impact of SARS-CoV-2 infection on acute respiratory complication and post-acute respiratory sequelae, respectively, our study design included two distinct ‘primary outcomes’. First, we used the incidence of various post-acute respiratory sequelae more than 30 days after SARS-CoV-2 infection as the ‘primary outcome’ for respiratory sequelae in post-acute COVID-19 conditions. Second, the ‘primary outcome’ for acute respiratory complication was the incidence of various respiratory diseases within 1 month following a diagnosis of SARS-CoV-2 (refs. 44–47). Outcomes were defined based on appropriate International Classification of Diseases, 10th Revision (ICD-10) codes for the new onset of the specific diagnosis with at least one claim within one year. Post-acute respiratory sequelae were defined as chronic respiratory failure, pulmonary hypertension, sleep apnea, COPD, emphysema, asthma, pulmonary sarcoidosis, and interstitial lung disease48. In addition, acute respiratory complication was defined as pneumocystis pneumonia, aspergillosis pneumonia, pleural empyema, lung abscess, pneumothorax, acute respiratory failure, and pulmonary embolism (Table S25)48.
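
The split between the two primary outcomes depends only on the time from the SARS-CoV-2 diagnosis to the outcome diagnosis. The sketch below assumes that "1 month" means 30 days and that day 30 itself counts as acute; both boundary choices, and the function name, are assumptions for illustration.

```python
from datetime import date

ACUTE_WINDOW_DAYS = 30  # assumption: "within 1 month" treated as days 0-30


def outcome_window(index_date: date, diagnosis_date: date) -> str:
    """Classify an outcome diagnosis relative to the SARS-CoV-2 index date."""
    days = (diagnosis_date - index_date).days
    if days < 0:
        return "pre-index"  # not a new-onset outcome after infection
    return "acute" if days <= ACUTE_WINDOW_DAYS else "post-acute"


# Hypothetical diagnoses 10 and 90 days after an infection on 2021-03-01.
print(outcome_window(date(2021, 3, 1), date(2021, 3, 11)))
print(outcome_window(date(2021, 3, 1), date(2021, 5, 30)))
```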

Covariates

Participant demographic data were sourced from the insurance database, which included age (20–39, 40–59, and ≥60 years), sex (male and female), household income percentiles (low [0–39], middle [40–79], high [80–100]), and region of residence (urban and rural)34. CCI (0, 1, and ≥2), histories of cardiovascular disease and chronic kidney disease, and previous use of medication for diabetes, hyperlipidemia, and hypertension were identified using the ICD-10 codes, combined with results of general health examinations and personal medical interview34,37,41. From the health examination, BMI (underweight [<18.5 kg/m²], normal [18.5–22.9 kg/m²], overweight [23.0–24.9 kg/m²], obese [≥25.0 kg/m²], and unknown), blood pressure (systolic blood pressure < 140 mmHg and diastolic blood pressure < 90 mmHg, systolic blood pressure ≥ 140 mmHg or diastolic blood pressure ≥ 90 mmHg, and unknown), fasting blood glucose (<100, ≥100 mg/dL, and unknown), serum total cholesterol (<200, 200–239, ≥240 mg/dL, and unknown), glomerular filtration rate (<60, 60–89, ≥90 mL/min/1.73 m², and unknown), smoking status (never, former, current smoker, and unknown), alcoholic drinks (<1, 1–2, 3–4, ≥5 days per week, and unknown), aerobic physical activity (sufficient [≥150 min/week of moderate-intensity activity or ≥75 min/week of vigorous-intensity activity or greater than an equivalent combination], insufficient, and unknown), and type of SARS-CoV-2 (original and delta) were obtained39.
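
Two of the categorical covariates above can be derived from raw examination values as shown in this sketch. The BMI and blood pressure cut-offs follow the text; treating the BMI bands as contiguous (for example, 22.95 counted as normal), the category labels for blood pressure, and the function names are assumptions.

```python
from typing import Optional


def bmi_category(bmi: Optional[float]) -> str:
    """Map BMI in kg/m^2 to the categories listed in the text."""
    if bmi is None:
        return "unknown"
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal"
    if bmi < 25.0:
        return "overweight"
    return "obese"


def blood_pressure_category(sbp: Optional[float], dbp: Optional[float]) -> str:
    """'high' if SBP >= 140 or DBP >= 90 mmHg; labels are hypothetical."""
    if sbp is None or dbp is None:
        return "unknown"
    return "high" if sbp >= 140 or dbp >= 90 else "not high"


print(bmi_category(24.2), blood_pressure_category(135, 92))
```

Keeping "unknown" as its own category mirrors the text, where missing examination values are retained rather than dropped.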

Propensity score matching

To enhance the robustness and generalizability of our primary findings and balance baseline covariates, we employed exposure-driven propensity score matching. This approach compared individuals with SARS-CoV-2 infection to those without infection as a general population49. The propensity score was calculated by using a logistic regression model, adjusted for age (20–39, 40–59, and ≥60 years), sex (male and female), region of residence (urban and rural), history of cardiovascular and chronic kidney disease, and medication use for diabetes, hyperlipidemia, and hypertension. Individuals were matched in a 1:5 ratio between the exposure group (SARS-CoV-2) and the non-exposure group. Through these procedures, we generated multi-to-one matched cohorts utilizing a ‘greedy nearest-neighbour’ algorithm, maintaining a caliper width of 0.001 standard deviations. The quality of the match was evaluated through the SMD, with an SMD <0.1 indicating minimal imbalance between the groups49. In addition, to investigate the relative severity of COVID-19 compared to other infectious viral respiratory diseases, an influenza group within the general population was utilized as another control group, directly matched to SARS-CoV-2 infections at a 1:1 ratio.
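
The matching step can be illustrated with a small greedy nearest-neighbour routine. This is a simplified sketch rather than the procedure run in the study: it assumes that propensity scores are already estimated, that the caliper is expressed in standard deviations of the raw score, and that exposed individuals are processed in input order without replacement. All names and example values are hypothetical.

```python
import statistics


def greedy_match(exposed, controls, ratio=5, caliper_sd=0.001):
    """Greedy 1:ratio nearest-neighbour matching without replacement.

    exposed, controls: dicts mapping an id to a propensity score.
    Returns a dict mapping each exposed id to its matched control ids.
    """
    all_scores = list(exposed.values()) + list(controls.values())
    caliper = caliper_sd * statistics.pstdev(all_scores)
    available = dict(controls)
    matches = {}
    for eid, score in exposed.items():
        # Rank the remaining controls by distance to this exposed score.
        nearest = sorted(available, key=lambda cid: abs(available[cid] - score))
        chosen = [cid for cid in nearest[:ratio]
                  if abs(available[cid] - score) <= caliper]
        for cid in chosen:
            del available[cid]  # each control is used at most once
        matches[eid] = chosen
    return matches


# Hypothetical scores; a wider caliper is used so the toy data produce matches.
exposed = {"e1": 0.30, "e2": 0.50}
controls = {"c1": 0.30, "c2": 0.3001, "c3": 0.50, "c4": 0.90}
print(greedy_match(exposed, controls, ratio=2, caliper_sd=0.01))
```

Sorting the full control pool for every exposed individual is acceptable for a toy example but would be too slow for millions of records, where an indexed or sorted-merge search would be needed.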

Replication cohort in Japan

We employed the same definitions of ICD-10 codes, exposure and outcome assessments, general health examination, follow-up duration, and propensity score matching for the JMDC cohort (total N = 4,909,861) as we did for the main cohort. However, due to the lack of COVID-19 vaccination data in the JMDC cohort, we utilized this cohort primarily to validate the findings from the main cohort. Because the components and structure of the replication cohort differ from those of the main cohort, more detailed information about the replication cohort is provided in the Supplementary Material.

Statistical analysis

To estimate the HRs with 95% CIs, we applied the Cox proportional hazards model. We assigned an ‘individual index date’ according to the following criteria. For the exposed group, it was the date of the first diagnosis of SARS-CoV-2. For individuals in the non-exposed group, the index date was allocated to match the index date of the corresponding exposed case. This approach was implemented to mitigate immortal time bias, ensuring an equitable comparison between groups.
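
The index-date rule can be sketched as a simple propagation from each exposed case to its matched controls. The data layout (a mapping of exposed ids to diagnosis dates and a mapping of exposed ids to matched control ids) and all identifiers are hypothetical.

```python
from datetime import date


def assign_index_dates(exposed_dx, matches):
    """Give every matched control the index date of its exposed case.

    exposed_dx: exposed id -> date of first SARS-CoV-2 diagnosis.
    matches: exposed id -> list of matched control ids.
    """
    index_dates = dict(exposed_dx)
    for exposed_id, control_ids in matches.items():
        for control_id in control_ids:
            index_dates[control_id] = exposed_dx[exposed_id]
    return index_dates


# Hypothetical matched set: one exposed case and two controls.
dates = assign_index_dates({"e1": date(2021, 2, 10)}, {"e1": ["c1", "c2"]})
print(dates["c1"] == dates["e1"])
```

Because follow-up starts on the same calendar day within each matched set, controls do not accrue time at risk before their exposed counterpart was infected.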

In the matched COVID-19 cohort, we conducted various statistical analyses, detailed in Table S26. These analyses involved respiratory disease (post-acute respiratory sequelae and acute respiratory complication) and its subtypes, number of vaccinations (without, 1 time, and ≥2 times), type of previous vaccination (mRNA, viral vector, and both)36, and strain of SARS-CoV-2 (original and delta). We further assessed the time attenuation effect of respiratory diseases after SARS-CoV-2 (<3, 3–6, and ≥6 months) to reduce reverse causation. To minimize the impact of potential confounders, the following variables were used for the adjusted model: age, sex, household income, region of residence, CCI score, obesity, blood pressure, fasting blood glucose, serum total cholesterol, glomerular filtration rate, smoking status, alcoholic drinks, aerobic physical activity, previous history of cardiovascular and chronic kidney diseases, history of medication use for diabetes mellitus, dyslipidemia, and hypertension, and strain of SARS-CoV-2 (original and delta). All statistical analyses were conducted using SAS (version 9.4; SAS Institute Inc., Cary, NC, USA)50,51. A two-sided P-value < 0.05 was considered statistically significant.
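
To make the HR estimation concrete, the sketch below fits a Cox model with a single binary exposure by Newton-Raphson, using the Breslow approximation for tied event times. It is an unadjusted toy illustration under those assumptions, not the adjusted SAS analysis used in the study; the function name and the data are hypothetical, and the routine will fail if one group never shares a risk set with the other.

```python
import math


def cox_hr_binary(times, events, exposed, iterations=25):
    """Return (HR, SE of log HR) for one 0/1 covariate (Breslow ties)."""
    beta = 0.0
    info = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for _ in range(iterations):
        grad, info = 0.0, 0.0
        for t in event_times:
            s0 = s1 = 0.0          # risk-set sums of exp(beta*x) and x*exp(beta*x)
            n_events = n_exposed_events = 0
            for ti, ei, xi in zip(times, events, exposed):
                if ti >= t:        # still at risk just before time t
                    weight = math.exp(beta * xi)
                    s0 += weight
                    s1 += xi * weight
                if ei and ti == t:
                    n_events += 1
                    n_exposed_events += xi
            mean_x = s1 / s0
            grad += n_exposed_events - n_events * mean_x
            info += n_events * (mean_x - mean_x ** 2)  # x is 0/1, so x^2 = x
        step = grad / info
        beta += step
        if abs(step) < 1e-10:
            break
    return math.exp(beta), 1 / math.sqrt(info)


# Hypothetical follow-up times; 1 = event, 0 = censored; exposure coded 0/1.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 0]
exposed = [1, 1, 0, 1, 0, 1, 0, 0]
hr, se = cox_hr_binary(times, events, exposed)
low, high = hr * math.exp(-1.96 * se), hr * math.exp(1.96 * se)
print(f"HR = {hr:.2f} (95% CI, {low:.2f}-{high:.2f})")
```

With only eight hypothetical participants the interval is very wide, which contrasts with the narrow intervals obtained from the multi-million-person cohorts.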

Sensitivity analysis

Several sensitivity analyses were conducted to assess the robustness of our primary analyses. First, to identify detection bias and validate the results of our cohort, we performed a negative control analysis by exploring the association between tympanic membrane perforation and SARS-CoV-2 infection (Table S27). Second, we conducted an exposure-driven propensity score matching analysis based on the claim records of individuals who tested positive for SARS-CoV-2 within 2 weeks following an RT-PCR test and those who did not, for a more refined analysis of the association between SARS-CoV-2 infection and respiratory symptoms (Table S15). Third, stratification analyses were further employed in the two matched cohorts, stratified by factors including sex, age group, household income, CCI, BMI status, smoking status, alcoholic drinks, region of residence, and aerobic physical activity. Fourth, to thoroughly evaluate the marginal risks and prevalence of acute and post-acute respiratory conditions after COVID-19, we utilized the average treatment effect in the overlap method, calculating overlap-weighted hazard ratios (Tables S7–S9)52,53,54.
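
The overlap-weighting idea in the fourth analysis can be sketched as follows, using the standard definition in which exposed individuals are weighted by one minus their propensity score and unexposed individuals by their propensity score. The helper names and the example data are hypothetical, and the weighted prevalence shown here is only a simple stand-in for the overlap-weighted hazard ratios reported in the supplementary tables.

```python
def overlap_weight(propensity: float, is_exposed: bool) -> float:
    """Overlap weight: 1 - e(x) for exposed, e(x) for unexposed."""
    if not 0.0 <= propensity <= 1.0:
        raise ValueError("propensity score must lie in [0, 1]")
    return 1.0 - propensity if is_exposed else propensity


def weighted_prevalence(outcomes, weights) -> float:
    """Weighted proportion of individuals with the outcome (coded 0/1)."""
    return sum(o * w for o, w in zip(outcomes, weights)) / sum(weights)


# Hypothetical exposed individuals: (propensity score, outcome indicator).
exposed_group = [(0.25, 1), (0.50, 0), (0.75, 1)]
weights = [overlap_weight(ps, True) for ps, _ in exposed_group]
outcomes = [outcome for _, outcome in exposed_group]
print(f"overlap-weighted prevalence = {weighted_prevalence(outcomes, weights):.3f}")
```

Individuals whose propensity scores are close to 0 or 1 receive little weight in their own group, so the comparison concentrates on people who could plausibly have been in either group.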

Patient and public involvement

The Korean government and JMDC anonymized patient data by excluding patient-related data such as personal identification numbers or names for confidentiality. While direct identification of individuals was rendered impossible due to the removal of names, all other pertinent data remained intact and accessible for our analyses. Research questions and outcome measures were autonomously determined without the intervention of individuals. The research design and implementation proceeded without external consultation. However, it can be extended to include contributions from other qualified public participants who are able to offer valuable insights into the research design, analysis, and interpretation. Upon request, the researchers intend to disseminate the results of this research to all research participants and relevant communities.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.