Abstract
Background
The aim of this study was to evaluate the reliability of clinician-based perceptual assessment of voice and computerized acoustic voice analysis as screening tests for vocal fold paresis or paralysis (VFP) after thyroid and parathyroid surgery.
Methods
This was a prospective study of 181 patients undergoing a thyroid or parathyroid procedure with pre- and postoperative laryngoscopic vocal fold inspection, perceptual voice assessment using the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, and acoustic voice analysis using the multi-dimensional voice program (MDVP). Patients were divided into 2 groups for comparison: those with new postoperative VFP and those without. Potential screening tools were evaluated using receiver operating characteristic (ROC) analysis.
Results
Fourteen (6.6%) patients had a new postoperative VFP. Postoperative GRBAS scores were significantly (P < 0.05) higher in patients with VFP compared to those without. However, there were no statistically significant differences in MDVP values between the groups. Postoperative GRBAS grade score (cut off > 0) had the best sensitivity, 93%, for predicting VFP, but the specificity was only 50%. Postoperative jitter (cut off > 1.60) in MDVP had a good specificity, 90%, but only 50% sensitivity. Combining all the GRBAS and MDVP variables with P < 0.05 in the ROC analysis yielded a test with 100% sensitivity and 55% specificity.
Conclusions
Physician-based perceptual voice assessment has a high sensitivity for detecting postoperative VFP, but the specificity is poor. The risk of VFP is low in patients with completely normal voice at discharge. However, routine laryngoscopy after thyroid and parathyroid surgery is still the most reliable exam for VFP screening.
Introduction
Vocal fold paresis or paralysis (VFP) caused by recurrent laryngeal nerve injury is a well-known complication following thyroid or parathyroid surgery with incidence rates ranging from 1.4% to 38% [1,2,3,4]. In most institutions, the risk of postoperative VFP is roughly 5% [1, 3,4,5,6,7,8,9,10]. Postoperatively, VFP can be asymptomatic and go undetected unless routine laryngoscopy examinations are performed [11, 12]. The postoperative assessment of vocal fold function with laryngoscopy is time-consuming, requires special equipment and may cause discomfort to the patient. Still, it is important for the patient and for the surgeon to know if this complication has occurred. There are currently no good alternatives to laryngoscopy in detecting VFP after surgery.
Postoperative VFP may manifest with audible voice changes. These changes may be observed perceptually by a person or by a voice analysis program. The perceptual voice quality assessment can be done using the grade, roughness, breathiness, asthenia, strain (GRBAS) scale [13]. For more objective assessment, acoustic voice analysis can be performed using the multi-dimensional voice program (MDVP) system, which is currently the most commonly used and cited acoustic analysis software [14].
The primary aim of this study was to evaluate the reliability of clinician-based perceptual voice assessment (GRBAS) and computerized acoustic voice analysis (MDVP) in screening for new VFP immediately after thyroid and parathyroid surgery (before discharge). The secondary aim was to study the correlations between these two methods postoperatively in patients with or without VFP.
Material and methods
This prospective study was approved by the institutional review board. Patients gave written informed consent. All consecutive patients undergoing thyroid or parathyroid surgery over a one-year study period in 2017 in a single academic hospital were considered for recruitment (n = 213). Twenty-one patients were ineligible for the study, and eleven patients were excluded after recruitment (Fig. 1). Finally, 181 patients were included in the study. The study patients underwent the voice quality assessments and vocal fold function examinations preoperatively and postoperatively 1.1 ± 0.3 days after surgery (prior to discharge from the hospital). Preoperative laryngoscopy was performed to exclude preoperative VFP. Patients were divided into 2 groups based on the postoperative laryngoscopic examination: those with and those without a new postoperative VFP.
All patients underwent perceptual evaluation of voice by otolaryngologists using the GRBAS scale, before and after the procedure. Nine of the examiners were in training and twelve were fully trained otolaryngologists. The otolaryngologists worked independently of the surgical team. Grade (G) is the overall perceived degree of dysphonia, integrating all deviant components; roughness (R) is irregular fluctuation of the fundamental frequency; breathiness (B) is turbulence due to leakage of air through insufficient glottic closure; asthenia (A) is weakness of voice; and strain (S) is perceived excess effort. Each parameter is scored on a scale of 0 to 3, where 0 is normal, 1 is slight disturbance, 2 is moderate disturbance, and 3 is severe disturbance. After the GRBAS voice rating, the otolaryngologist performed indirect laryngoscopy to evaluate vocal fold function. Fiberoptic laryngoscopy was used in 39 (22%) cases preoperatively and in 42 (23%) cases postoperatively when the visibility in indirect laryngoscopy was inadequate or suboptimal. New VFP was defined as immobility or insufficiency of the vocal fold.
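The scoring scheme described above can be sketched as a small data structure. This is an illustrative sketch only: the class name `GRBASScore`, the helper `change_from`, and the example ratings are hypothetical and are not part of the study protocol.

```python
from dataclasses import dataclass

# The five GRBAS domains, in the order given in the text.
DOMAINS = ("grade", "roughness", "breathiness", "asthenia", "strain")


@dataclass(frozen=True)
class GRBASScore:
    """One GRBAS rating; each domain is scored 0 (normal) to 3 (severe)."""
    grade: int
    roughness: int
    breathiness: int
    asthenia: int
    strain: int

    def __post_init__(self):
        # Reject anything outside the 0-3 scale described in the text.
        for name in DOMAINS:
            value = getattr(self, name)
            if value not in (0, 1, 2, 3):
                raise ValueError(f"{name} must be 0, 1, 2 or 3, got {value!r}")

    def change_from(self, baseline):
        """Per-domain change relative to a baseline (e.g. preoperative) rating."""
        return {name: getattr(self, name) - getattr(baseline, name)
                for name in DOMAINS}


# Hypothetical example ratings, not study data.
pre = GRBASScore(grade=0, roughness=0, breathiness=0, asthenia=0, strain=0)
post = GRBASScore(grade=1, roughness=0, breathiness=1, asthenia=0, strain=0)
change = post.change_from(pre)
```

Keeping the pre- and postoperative ratings as separate objects makes both quantities analysed in the study, the postoperative score and the change from baseline, available from the same record.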
Preoperative and postoperative voice recordings were performed on all patients by a trained nurse. Patients phonated a vowel, and 5 s samples were recorded with an iOS app called OperaVox (On PErson RApid VOice eXaminer, Oxford Research Wave Ltd, UK). The OperaVox app was installed on an iPad Air 2 (Apple Inc., Cupertino, CA, USA). The device’s internal microphone as a recording system is compatible with a direct digitization method [15]. The iPad was placed in a tablet holder at a distance of 30 cm from the patient’s lips. Patients gave the voice samples in a standing position unless their physical condition prevented it. OperaVox has a color bar indicator that shows the instantaneous loudness of the voice while the recording is performed. The recorded WAV files were exported to an MDVP workstation at a different location. The highest-quality 3 s of each recording were acoustically analyzed with the MDVP software (KayPentax, NJ, USA). The analysis produces acoustic parameters including F0, jitter, shimmer, shimmer dB and noise to harmonic ratio (NHR). F0 is the mean frequency of mucosal vibrations of the vocal folds. Jitter and shimmer are perturbation measurements that measure cycle-to-cycle frequency and amplitude variation, respectively, in the analyzed voice sample. NHR is a measurement of the degree of hoarseness obtained by estimating the proportion of noise in the subject’s voice [16].
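To illustrate what "cycle-to-cycle variation" means, the following minimal sketch computes a relative perturbation measure from a sequence of per-cycle values. It uses a common textbook definition (mean absolute difference between consecutive cycles divided by the mean value); the exact algorithms used by MDVP are not described here, so this should not be read as the MDVP implementation. The function name and the example values are hypothetical.

```python
def perturbation_percent(cycle_values):
    """Mean absolute cycle-to-cycle difference, as a percentage of the mean.

    Pass glottal cycle periods for a jitter-like measure, or per-cycle
    peak amplitudes for a shimmer-like measure.
    """
    if len(cycle_values) < 2:
        raise ValueError("need at least two cycles")
    # Absolute difference between each cycle and the next one.
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    if mean_value == 0:
        raise ValueError("mean cycle value must be non-zero")
    return 100.0 * mean_diff / mean_value


# Hypothetical cycle periods in milliseconds, not study data.
steady_voice = perturbation_percent([10, 10, 10, 10])
unsteady_voice = perturbation_percent([10, 11, 10, 11])
```

A perfectly regular sequence gives zero perturbation, and larger cycle-to-cycle differences give larger values, which is the property the jitter and shimmer cut-offs rely on.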
For validation of the described voice recording method, 20 randomly selected patients without VFP and 11 patients with VFP underwent a second round of postoperative voice sample recordings 2 weeks after surgery; this was done in a dedicated voice laboratory by a trained speech and language therapist. Patients gave a recording in an acoustically isolated booth. The voice sample was recorded using the iPad system and directly to the MDVP software using a condenser microphone. These recordings were then compared and studied for correlations of each of the recording parameters.
Statistical analysis
All statistical analyses were performed using SPSS Statistics 24.0 (IBM Corp, Armonk, NY). Continuous variables were expressed as mean ± standard deviation (SD). The parameters were tested for normality by creating histograms. The group differences for normally distributed continuous variables were analyzed using the t-test. The correlation analysis was done using the Pearson correlation coefficient. Values between 0.1 and 0.3 were defined as mild, from 0.3 to 0.5 as moderate, and more than 0.5 as strong correlation. Youden indexes and receiver operating characteristic (ROC) curves were generated to identify the critical values at which different variables were associated with VFP. The Youden index is an approach commonly employed to maximize both sensitivity and specificity; it is calculated by summing the sensitivity and the specificity and then subtracting 1 from the result.
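The Youden index and the correlation categories defined above can be sketched as follows. The analyses in this study were run in SPSS; this Python sketch only illustrates the definitions. The function names, the "negligible" label for values below 0.1, the handling of the exact boundary value 0.3, and the toy data are assumptions made for illustration.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_cutoff(values, has_vfp):
    """Scan candidate cut-offs (test positive if value > cut-off).

    Returns (cutoff, J, sensitivity, specificity) for the highest Youden index.
    """
    positives = sum(1 for y in has_vfp if y)
    negatives = len(has_vfp) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one case in each group")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, has_vfp) if y and v > cutoff)
        tn = sum(1 for v, y in zip(values, has_vfp) if not y and v <= cutoff)
        sens, spec = tp / positives, tn / negatives
        j = youden_index(sens, spec)
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


def correlation_strength(r):
    """Label a Pearson coefficient using the thresholds stated in the text."""
    size = abs(r)
    if size < 0.1:
        return "negligible"  # assumed label; the text does not name this range
    if size < 0.3:
        return "mild"
    if size <= 0.5:
        return "moderate"    # assumption: exactly 0.3 is counted as moderate
    return "strong"


# Toy data, not study data: scores 0-3 and whether VFP was present.
toy_scores = [0, 0, 1, 2, 2, 3]
toy_vfp = [False, False, False, True, True, True]
toy_best = best_cutoff(toy_scores, toy_vfp)
```

A test with no discriminating power has a Youden index of 0, and a perfect test has an index of 1, so the index gives a single number for comparing cut-offs that trade sensitivity against specificity.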
Results
Altogether, 181 patients (mean age 58 ± 15 years, 87% female) were included in the study and underwent pre- and postoperative examinations. The indications for surgery were goiter in 71 (39%), suspicion of malignancy in 39 (22%), malignant tumor in 6 (3%), hyperthyroidism in 25 (14%), and hyperparathyroidism in 40 (22%) patients. The type of procedure was hemithyroidectomy in 86 (48%), total thyroidectomy in 51 (28%), isthmectomy in 4 (2%), and parathyroid procedure in 40 (22%) patients, of which one was bilateral. The final pathological diagnosis was benign in 158 (87%) and malignant in 23 (13%) patients. The mean length of hospital stay was 1.4 ± 1.5 days. Postoperatively, a new VFP was detected after 14 operations (10 paresis and 4 paralysis). An inadvertent transection of the recurrent laryngeal nerve (RLN) was noted in one patient. In the other patients with VFP, no injury of the RLN was recorded during the surgery.
On perceptual analysis using the GRBAS scale, patients with VFP had significantly higher mean GRBAS scores postoperatively in all 5 domains compared to patients with no VFP (Table 1, Fig. 2). In addition, the mean change between the pre- and postoperative GRBAS scores was greater among patients with new VFP; these differences were statistically significant in all domains except strain (S). In the objective voice analysis using the recorded samples, no statistically significant differences were observed between patients with and those without VFP (Table 2, Fig. 3). However, there was a non-significant trend for more jitter in patients with VFP (P = 0.06).
GRBAS scores had mild to moderate correlation with all variables in the acoustic voice analysis except F0. The correlation was mostly moderate for G and B, and mild for R, A, and S (Table 3). Among patients without VFP, postoperative GRBAS scores had mild or moderate correlation with all voice analysis values except F0, whereas in patients with VFP, only a few postoperative values had a statistically significant correlation due to the low number of patients in this group. However, in patients with VFP, grade (G) and breathiness (B) correlated strongly with jitter. The correlations between postoperative GRBAS and voice analyses are presented separately for patients with and without VFP in Table 4.
ROC analyses were performed in an attempt to discover a diagnostic tool for the screening of patients with VFP after surgery. Potential diagnostic tools and their evaluation methodology are presented in Table 5. Postoperative GRBAS grade score (cut off > 0) had the best sensitivity, 93%, but the specificity was only 50%. While postoperative jitter (cut off > 1.60) measurement had a good specificity (90%), the sensitivity was only 50%. The best Youden index (0.55) was achieved with the change in the GRBAS breathiness score. Combining 2 or more diagnostic tools did not yield a better Youden index.
The validation of the recording technique showed strong correlation between all parameters recorded with iPad compared to those recorded directly to the MDVP software. The Pearson correlation coefficient was 0.95 for F0, 0.85 for jitter, 0.77 for shimmer, 0.84 for shimmer dB, and 0.75 for NHR.
Discussion
This study demonstrated that the clinician’s perceptual assessment of the patient’s voice after thyroid or parathyroid surgery is sensitive in detecting most postoperative VFPs. If the GRBAS grade score (a composite of R, B, A and S) was more than zero, meaning that there was any perceived disturbance in the patient’s postoperative voice, the sensitivity of this test for detecting VFP was 93%. This means that 1 of the 14 postoperative VFP complications would have been missed without routine laryngoscopic examinations. However, the specificity of this test was only 50%. Therefore, using perceptual voice assessment as a screening tool, half of the patients would still have to undergo laryngoscopic examination after surgery. The additional value of objective voice analysis using MDVP parameters was minimal considering that the computerized voice analysis is more cumbersome to perform than the clinician-based assessment.
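The arithmetic behind these statements can be sketched as follows. This is an approximation under stated assumptions: it uses the reported 14 VFP cases among 181 patients, a sensitivity of 13/14, and the rounded specificity of 50%; the exact numbers of true and false positives are not restated here, so the derived predictive values are illustrative only. The function name `screening_summary` is hypothetical.

```python
def screening_summary(n_total, n_with_vfp, sensitivity, specificity):
    """Expected screening outcomes from group sizes and test accuracy."""
    n_without_vfp = n_total - n_with_vfp
    true_pos = sensitivity * n_with_vfp       # VFP cases flagged by the test
    false_neg = n_with_vfp - true_pos         # VFP cases missed by the test
    true_neg = specificity * n_without_vfp    # non-VFP patients cleared
    false_pos = n_without_vfp - true_neg      # non-VFP patients flagged
    return {
        "missed_cases": false_neg,
        # Share of all patients who would still be sent to laryngoscopy.
        "test_positive_fraction": (true_pos + false_pos) / n_total,
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


# Assumed inputs: 181 patients, 14 with VFP, sensitivity 13/14, specificity ~0.50.
summary = screening_summary(181, 14, 13 / 14, 0.50)
```

Under these assumptions, roughly 53% of patients would test positive and proceed to laryngoscopy, in line with the statement that about half would still need the examination, and the negative predictive value would be about 99%, with one case missed.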
In addition to the present study, a few previous studies have demonstrated increased GRBAS scores in patients with VFP. Jesus and colleagues compared 17 patients with unilateral VFP with 43 controls; all GRBAS parameters were statistically significantly higher in the VFP group [17]. Furthermore, Jedra and colleagues examined 25 patients with iatrogenic VFP one to two days after the onset of speech impairment; all study patients had GRBAS grade score more than zero [18].
Iatrogenic VFP should be diagnosed early, preferably before discharge, for 2 main reasons [19]. First, the surgeon gets immediate feedback, which can help avoid recurrent laryngeal nerve injuries in the future. Second, the patient benefits directly from early diagnosis: aspiration problems may be prevented, symptomatic patients are referred to voice therapy early in the process, and surgical treatment can be considered in a timely fashion when needed [20, 21]. Even patients with asymptomatic VFP should be detected early. Initially, VFP may be asymptomatic because of compensatory movement of the contralateral vocal fold. However, symptoms may arise with aging when the compensation mechanisms become weaker. VFP that is detected at a later time may lead to unnecessary etiological examinations.
While a patient with VFP may be asymptomatic, a patient with postoperative voice disorder may not necessarily have VFP [22]. Therefore, it is challenging to create a screening test for postoperative VFP which would be both sensitive and specific. In a previous prospective study by Ortega and colleagues including 64 patients undergoing thyroid surgery with pre- and postoperative computerized acoustic voice analysis (a program based on MDVP) and subjective GRBAS evaluation, the authors suggested that patients with normal findings in these tests may not need laryngoscopy to exclude VFP [23]. One week after the procedure, the sensitivity and specificity were 100% and 61% for GRBAS and 45% and 98% for computerized acoustic voice analysis, respectively. The authors also noted that the sensitivity of computerized analysis might increase in a repeated examination 1 month after the procedure. However, the study included only 5 patients with postoperative VFP, and therefore, the results should be interpreted with caution. On the other hand, our results for GRBAS and MDVP in the early postoperative period were similar to those of Ortega’s study. Performing these tests before discharge would be beneficial since the patients do not need to come back for the examination.
Combining 2 measures (presented in Table 5), 1 with good sensitivity and 1 with good specificity, such as GRBAS Grade and GRBAS Strain, did not give any better tool to screen for VFP. However, a sum variable combining eleven independent variables achieved 100% sensitivity. Nevertheless, given that the specificity was only 55% and the calculation of the sum variable is fairly complex, this may not be a very practical tool for clinical use. Correlations between postoperative GRBAS scores and acoustic parameters differed between patients with and without VFP (Table 4). Patients with no VFP had correlation between nearly all GRBAS scores and acoustic parameters. In contrast, only 3 pairs of GRBAS scores and acoustic parameters were correlated among patients with VFP. A possible explanation for the lower correlation of perceptual and computerized voice assessment in patients with VFP may be the difficulty of performing an accurate objective analysis of a pathologic voice. Another reason could be that the number of patients with VFP was too small to show statistically significant correlation.
The low specificity of voice assessment may be explained by the high prevalence of voice disorders in the population. Nearly 8% of adults experience voice problems, including those who have no pathological findings in the larynx [24]. Furthermore, iatrogenic causes other than recurrent laryngeal nerve damage may cause voice changes after thyroid or parathyroid surgery, such as larynx irritation or trauma attributed to the endotracheal intubation. Intubation may cause a hematoma, laceration of vocal fold mucosa or muscle, and even subluxation of the arytenoid cartilage [25, 26]. This type of trauma may heal spontaneously. However, shortly after surgery it may cause voice changes which could be detected in MDVP voice analysis. Moreover, the external branch of the superior laryngeal nerve may be damaged during the surgery. This damage is linked to cricothyroid muscle motility impairment, an altered frequency of voice, modified timbre, and deterioration in voice performance (high-pitched sounds) [19]. In addition, division, intraoperative fixation or injury of prelaryngeal strap muscles (sternohyoid, sternothyroid) leading to postoperative voice impairment has been described, and edema of the neural structures innervating the muscles needed for phonation may also cause voice changes [27]. Hence, causes other than VFP may induce changes in MDVP measures after surgery. Maeda and colleagues reported statistically significant worsening in parameters of acoustic voice analysis after thyroidectomy among 110 patients with no VFP, suggesting that thyroidectomy has a distinct impact on voice quality even without recurrent laryngeal nerve injury [28]. In our study, we did not notice any significant changes in MDVP measures in patients without VFP after surgery.
Study limitations
In this study, the GRBAS scale was scored by 21 different otolaryngologists who were introduced to the use of the scoring system, but had little or no previous experience with GRBAS. In addition, the pre- and postoperative assessments were not always conducted by the same physician. These factors may cause variability in the GRBAS grading of the study patients. However, the clinician-based GRBAS rating system has been associated with high interobserver reliability in previous studies [29]. Indirect laryngoscopy was performed as the primary investigation to distinguish patients with or without VFP. Fiberoptic laryngoscopes were used in case of poor visibility in indirect laryngoscopy. In addition, we recognize a potential for bias as the same otolaryngologist performed the GRBAS assessment and the subsequent laryngoscopy. If the patient has no voice abnormality, the investigator could be tempted to skip the time-consuming fiberoptic laryngoscopy in case of suboptimal visibility in the indirect laryngoscopy. However, we think that the risk of this bias is low in our study because the otolaryngologists in our institution have performed routine vocal fold examinations pre- and postoperatively for 200 annual patients undergoing thyroid and parathyroid surgery for several years before the study. Finally, the low number of patients with a VFP event in this study may underestimate the value of the screening tests because of the possibility of type 2 statistical error.
Conclusion
Perceptual voice assessment with or without objective acoustic analysis has a high sensitivity for detecting postoperative VFP, but the specificity is poor. It is possible that using perceptual voice assessment, half of the routine laryngoscopic examinations could be avoided after thyroid and parathyroid surgery if laryngoscopy was omitted in patients with completely normal voice at discharge; the risk of postoperative VFP in these patients is low, but not zero. The utility of computerized acoustic voice analysis alone was limited. Further studies are needed to create an accurate screening test for postoperative VFP. Meanwhile, routine laryngoscopy after thyroid and parathyroid surgery is still the most accurate test for VFP screening.
References
Bergenfelz A, Jansson S, Kristoffersson A et al (2008) Complications to thyroid surgery: results as reported in a database from a multicenter audit comprising 3,660 patients. Langenbecks Arch Surg 393:667–673
Jeannon JP, Orabi AA, Bruch GA, Abdalsalam HA, Simo R (2009) Diagnosis of recurrent laryngeal nerve palsy after thyroidectomy: a systematic review. Int J Clin Pract 63:624–629
Joliat GR, Guarnero V, Demartines N, Schweizer V, Matter M (2017) Recurrent laryngeal nerve injury after thyroid and parathyroid surgery: incidence and postoperative evolution assessment. Med (Baltim) 96:e6674
Heikkinen M, Halttunen S, Terava M, Karkkainen JM, Lopponen H, Penttila E (2018) Vocal fold paresis as a surgical complication: our 10-year experience with 162 incidents. Clin Otolaryngol 44(2):179–182
Lo CY, Kwok KF, Yuen PW (2000) A prospective evaluation of recurrent laryngeal nerve paralysis during thyroidectomy. Arch Surg 135:204–207
Otto RA, Cochran CS (2002) Sensitivity and specificity of intraoperative recurrent laryngeal nerve stimulation in predicting postoperative nerve paralysis. Ann Otol Rhinol Laryngol 111:1005–1007
Acun Z, Cihan A, Ulukent SC et al (2004) A randomized prospective study of complications between general surgery residents and attending surgeons in near-total thyroidectomies. Surg Today 34:997–1001
Rosato L, Carlevato MT, De Toma G, Avenia N (2005) Recurrent laryngeal nerve damage and phonetic modifications after total thyroidectomy: surgical malpractice only or predictable sequence? World J Surg 29:780–784
Chiang FY, Wang LF, Huang YF, Lee KW, Kuo WR (2005) Recurrent laryngeal nerve palsy after thyroidectomy with routine identification of the recurrent laryngeal nerve. Surgery 137:342–347
Dhillon VK, Rettig E, Noureldine SI et al (2018) The incidence of vocal fold motion impairment after primary thyroid and parathyroid surgery for a single high-volume academic surgeon determined by pre- and immediate post-operative fiberoptic laryngoscopy. Int J Surg 56:73–78
Hanna BC, Brooker DS (2008) A preliminary study of simple voice assessment in a routine clinical setting to predict vocal cord paralysis after thyroid or parathyroid surgery. Clin Otolaryngol 33:63–66
Heikkinen M, Makinen K, Penttila E et al (2019) Incidence, risk factors, and natural outcome of vocal fold paresis in 920 thyroid operations with routine pre- and postoperative laryngoscopic evaluation. World J Surg 43:2228–2234
Iwarsson J, Bingen-Jakobsen A, Johansen DS et al (2018) Auditory-perceptual evaluation of dysphonia: a comparison between narrow and broad terminology systems. J Voice 32:428–436
Lovato A, De Colle W, Giacomelli L et al (2016) Multi-dimensional voice program (mdvp) vs praat for assessing euphonic subjects: a preliminary study on the gender-discriminating power of acoustic analysis software. J Voice 30:765.e1-765.e5
Lin E, Hornibrook J, Ormond T (2012) Evaluating iPhone recordings for acoustic voice assessment. Folia Phoniatr Logop 64:122–130
Mat Baki M, Wood G, Alston M et al (2015) Reliability of OperaVOX against Multidimensional Voice Program (MDVP). Clin Otolaryngol 40:22–28
Jesus LM, Martinez J, Hall A, Ferreira A (2015) Acoustic correlates of compensatory adjustments to the glottic and supraglottic structures in patients with unilateral vocal fold paralysis. Biomed Res Int 2015:704121
Jędra K, Sielska-Badurek E, Niemczyk K (2017) Severity of dysphonia in patients during first days after iatrogenic injury. Otolaryngol Pol 71:22–26
Dionigi G, Kim HY, Randolph GW et al (2016) Prospective validation study of Cernea classification for predicting EMG alterations of the external branch of the superior laryngeal nerve. Surg Today 46:785–791
Chen X, Wan P, Yu Y et al (2014) Types and timing of therapy for vocal fold paresis/paralysis after thyroidectomy: a systematic review and meta-analysis. J Voice 28:799–808
Kao YC, Chen SH, Wang YT, Chu PY, Tan CT, Chang WD (2017) Efficacy of voice therapy for patients with early unilateral adductor vocal fold paralysis. J Voice 31:567–575
Farrag TY, Samlan RA, Lin FR, Tufano RP (2006) The utility of evaluating true vocal fold motion before thyroid surgery. Laryngoscope 116:235–238
Ortega J, Cassinello N, Dorcaratto D, Leopaldi E (2009) Computerized acoustic voice analysis and subjective scaled evaluation of the voice can avoid the need for laryngoscopy after thyroid surgery. Surgery 145:265–271
Hur K, Zhou S, Bertelsen C, Johns MM (2018) Health disparities among adults with voice problems in the United States. Laryngoscope 128:915–920
Kambic V, Radsel Z (1978) Intubation lesions of the larynx. Br J Anaesth 50:587–590
Musholt TJ, Musholt PB, Garm J, Napiontek U, Keilmann A (2006) Changes of the speaking and singing voice after thyroid or parathyroid surgery. Surgery 140:978–979
Hong KH, Kim YK (1997) Phonatory characteristics of patients undergoing thyroidectomy without laryngeal nerve injury. Otolaryngol Head Neck Surg 117:399–404
Maeda T, Saito M, Otsuki N et al (2013) Voice quality after surgical treatment for thyroid cancer. Thyroid 23:847–853
Lu FL, Matteson S (2014) Speech tasks and interrater reliability in perceptual voice evaluation. J Voice 28:725–732
Acknowledgements
We would like to thank speech and language therapist Anita Länsivuori, biostatistician Tuomas Selander, clinical research nurse Taina Poutiainen and technical assistant Seppo Virta for their efforts and commitment to the present study.
Funding
Open access funding provided by University of Eastern Finland (UEF) including Kuopio University Hospital. Grant support: no grant support.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Heikkinen, M., Penttilä, E., Qvarnström, M. et al. Perceptual Assessment and Acoustic Voice Analysis as Screening Tests for Vocal Fold Paresis After Thyroid or Parathyroid Surgery. World J Surg 45, 765–773 (2021). https://doi.org/10.1007/s00268-020-05863-x
DOI: https://doi.org/10.1007/s00268-020-05863-x