Abstract
Background A growing number of meta-analyses (MA) have investigated the use of spinal cord stimulation (SCS) as a treatment modality for chronic pain. The quality of these MAs has not been assessed by validated appraisal tools.
Objective To examine the methodological characteristics and quality of MAs related to the use of SCS for chronic pain syndromes.
Evidence review An online literature search was conducted in Ovid MEDLINE(R), Ovid EMBASE, Ovid Cochrane Database of Systematic Reviews, and Scopus databases (January 1, 2000 through June 30, 2023) to identify MAs that investigated changes in pain intensity, opioid consumption, and/or physical function after SCS for the treatment of chronic pain. MA quality was assessed using A Measurement Tool to Assess Systematic Reviews (AMSTAR-2) critical appraisal tool.
Findings Twenty-five MAs were appraised in the final analysis. Three were considered “high” quality, three “low” quality, and 19 “critically low” quality, per the AMSTAR-2 criteria. There was no association between the publication year and AMSTAR-2 overall quality (β 0.043; 95% CI −0.008 to 0.095; p=0.097). There was an association between the impact factor and AMSTAR-2 overall quality (β 0.108; 95% CI 0.044 to 0.172; p=0.002), such that studies published in journals with higher impact factors were associated with higher overall quality. There was no association between the effect size and AMSTAR-2 overall quality (β −0.168; 95% CI −0.518 to 0.183; p=0.320).
According to our power analysis, three studies were adequately powered (≥80%) to reject the null hypothesis, while the remaining studies were underpowered (<80%).
Conclusions The study demonstrates a critically low AMSTAR-2 quality for most MAs published on the use of SCS for treating chronic pain. Future MAs should improve study quality by implementing the AMSTAR-2 checklist items.
PROSPERO registration number CRD42023431155.
- Chronic Pain
- Spinal Cord Stimulation
- Pain Management
- Analgesia
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, an indication of whether changes were made, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Introduction
Spinal cord stimulation (SCS) is a non-pharmacological interventional treatment option with an evidence basis for treating various chronic pain syndromes.1 Over the last decade, several device innovations, particularly with respect to hardware and software advancements, have transformed the field of neuromodulation. Moreover, well-designed and high-quality research has ushered in newly approved indications for SCS utilization. Beyond the longstanding US Food and Drug Administration (FDA) approvals for SCS in the treatment of chronic pain from nerve damage in the trunk or extremities, typically associated with failed back surgery syndrome, chronic radicular pain, and complex regional pain syndrome (CRPS) types I and II, the FDA recently provided two new indications for SCS: painful diabetic neuropathy and non-surgical refractory back pain.
With the expansion of SCS indications, development of the necessary evidence basis, and an increase in clinical utilization, there has been a corresponding increase in the number of meta-analyses (MAs) appraising the level and certainty of evidence for analgesic outcomes after SCS therapy. However, not all published MAs are reliable; many are subject to methodological flaws such as deficits in study inclusion, pre-registration, protocol design, clarity of reporting, and control of potential bias.2 The concern with publishing compromised and flawed MAs is that they are regarded as the highest level of evidence in guiding clinical decision-making and directing clinical practice guidelines.3 The primary advantages of an MA include calculating a pooled estimate of treatment effect, compared with effect sizes from individual studies, and increasing the generalizability of results from individual studies. In addition to synthesizing diverse findings, systematic reviews with meta-analysis not only resolve inconsistencies and settle controversies in existing research, but also offer valuable insights for guiding future research studies. To elevate the rigor of data appraisal and analysis in the SCS literature, it is critical to systematically review and benchmark the methodological and statistical quality of existing MAs.
In 2007, A Measurement Tool to Assess Systematic Reviews (AMSTAR) was developed based on the Cochrane Handbook for Systematic Reviews of Interventions, with a primary focus on improving systematic review (SR) and MA methodology to facilitate an accurate and reliable presentation of data.4–6 As advances in methodology and terminology occurred over the following decade, AMSTAR-2 was developed in 2017 to enable a more detailed assessment of SRs, including randomized and non-randomized studies of healthcare interventions.7 AMSTAR-2 is a domain-based rating system with seven critical domains and nine non-critical domains, evaluating a study’s quality based on weighted performance in each domain. After aggregating each of the 16 domains, the overall rating of the study is identified as critically low, low, moderate, or high. This validated and reliable critical appraisal tool allows for identifying areas of improvement for future SRs and MAs. Given that MAs can be improperly produced, it is imperative that authors adhere to checklists such as AMSTAR-2 and other more recently generated guidelines8–11 to produce work of the highest quality.
To date, no study has evaluated the quality of MAs on SCS therapy for chronic pain. Accordingly, we systematically identified MAs published over the last 20 years that pooled analgesic outcomes after SCS therapy for chronic pain and assessed their methodological and statistical quality using the AMSTAR-2 critical appraisal tool. We hypothesized that the quality of most MAs has remained poor and unchanged over the last 20 years.
Methods
Search strategy
The protocol for this study was registered in the International Prospective Register of Systematic Reviews (PROSPERO) database prior to beginning this study. We conducted an online literature search in the Ovid MEDLINE(R), Ovid EMBASE, Ovid Cochrane Database of Systematic Reviews, and Scopus databases from January 1, 2000 through May 30, 2023. The first randomized controlled trial with SCS was published in 2000; the investigators chose this start year because future efforts would meta-analyze randomized controlled trial data from 2000 onward. The search strategy was designed and conducted by an experienced librarian (LJP) with input from the principal investigator (RSD). Controlled vocabulary supplemented with keywords was used to identify MAs reporting on the use of SCS for chronic pain in human adults. The actual strategy listing all the search terms used, Boolean operators, and how they were combined is available in online supplemental eAppendix 1. Only MAs that pooled analgesic outcomes such as pain intensity, opioid consumption, or physical functioning were included, and the authors (DJK, RC, RSD) searched reference lists of included studies. This study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines12 and, when applicable, AMSTAR-2 reporting guidelines.5
Eligibility criteria (PICO)
Eligibility criteria for studies using the population, intervention, control, and outcome (PICO) format were defined as follows:
Population: adult patients (≥18 years of age) with chronic pain. Eligible designs included individual participant data MAs, MAs with trial sequential analysis, network MAs, and component network MAs, limited to English full-length text publications in peer-reviewed journals.
Intervention: human patients/participants who underwent SCS implantation.
Control: control cohorts may have included a placebo, sham stimulation, or an alternative treatment. In some MAs, there were no comparative cohorts, and follow-up metrics were compared with baseline metrics.
Outcomes: we assessed MAs with outcomes of pain intensity, opioid consumption, and/or physical functioning at defined follow-up periods after SCS implantation. Our primary outcome was the AMSTAR-2 overall quality for each included MA.
Studies were excluded if they focused on SCS trials, deep brain stimulation, dorsal root ganglion stimulation, peripheral nerve stimulation, or other neuromodulation modalities unrelated to dorsal column SCS. MAs were not restricted based on their control groups or chronic pain indications.
Screening and study selection
After the initial librarian search, all studies were added to an EndNote file. Two independent reviewers (DJK and RC) conducted the primary screening. Titles and abstracts were screened for secondary inclusion in a full-text review. Full-text articles were independently reviewed by two authors (DJK and RC) to determine the final eligibility. Any discrepancies were adjudicated by a third reviewer (RSD). A list of excluded studies and reasons for exclusion after full-text review are provided in online supplemental eAppendix 2.
Data extraction
Data extraction was conducted by two independent reviewers (DJK and RC). Data extraction from each MA included the following: name of the study, country, year of publication, characteristics of dorsal column SCS, follow-up time, number of included studies, number of participants, type of chronic pain indication, sources of funding, journal of publication, journal impact factor, and outcomes of interest with associated pooled effect sizes and measures of variance.
For each included MA, the quality was graded using the AMSTAR-2 criteria. For overall quality ratings, each study underwent a thorough review using the 16-item AMSTAR-2 checklist (online supplemental eAppendix 3) by two independent reviewers (DJK and RC). The checklist involves 16 domains that are scored as “yes” or “no,” with some items having the option for “partial yes” when appropriate. The “critical” domains determined by AMSTAR-2 guidelines include domain 2 (a statement of protocol registration a priori; justification of protocol deviations), domain 4 (use of a comprehensive literature search strategy), domain 7 (provision of a list of excluded studies and justification for exclusion), domain 9 (use of a satisfactory technique for assessing the risk of bias in individual studies), domain 11 (appropriate methods for statistical combination of results), domain 13 (accounting for the risk of bias in individual studies when interpreting/discussing the results), and domain 15 (adequate investigation of publication bias and discussion of its impact on the results). All other domains were considered “non-critical.” A complete list of descriptions for each AMSTAR-2 item can be found in online supplemental eAppendix 4. The assessment of each checklist item was compared between independent reviewers to reach a consensus for each item and to provide an overall quality rating (critically low, low, moderate, or high). The categorization of the overall quality rating was determined by the number of missing critical and non-critical domains in each MA: “high” indicating no critical flaws and one or fewer non-critical flaws; “moderate” indicating no critical flaws and more than one non-critical flaw; “low” indicating one critical flaw with or without non-critical flaws; and “critically low” indicating more than one critical flaw with or without non-critical flaws.
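The categorization rule described above is mechanical and can be expressed as a short function. The sketch below is purely illustrative of the AMSTAR-2 rating logic as summarized in this section; it is not code used by the study, and the function name is our own.

```python
def amstar2_rating(critical_flaws: int, noncritical_flaws: int) -> str:
    """Map counts of missing AMSTAR-2 domains to an overall quality rating.

    Follows the categorization described above: the number of critical
    flaws is decisive, with non-critical flaws distinguishing only
    "high" from "moderate".
    """
    if critical_flaws == 0:
        # No critical flaws: one or fewer non-critical flaws -> "high"
        return "high" if noncritical_flaws <= 1 else "moderate"
    if critical_flaws == 1:
        # Exactly one critical flaw -> "low", regardless of non-critical flaws
        return "low"
    # More than one critical flaw -> "critically low"
    return "critically low"
```

Note that because any two critical flaws already force a "critically low" rating, a review missing, say, domains 2 and 7 cannot recover its rating through strong performance elsewhere, which is consistent with the floor effect discussed later in this article.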
Outcomes of interest and statistical analysis
The primary outcome of this study was to determine the overall quality of currently published MAs based on the AMSTAR-2 scoring guidelines. The AMSTAR-2 checklist items that were most often neglected (scored as “no”) were identified.
Secondary outcomes included (i) the association between publication year, journal impact factor, effect size, and overall AMSTAR-2 quality rating and (ii) the overall power of the meta-analysis based on the primary outcome. Simple linear regression analyses were conducted to identify the relationships between publication year, journal impact factor, and effect size, respectively, and overall AMSTAR-2 quality. We included a power analysis13 as a secondary outcome to assess the proportion of MAs with a power greater than or equal to 80%. For this analysis, a random-effects model was assumed, and power was calculated using the expected value of the effect size for the predefined primary outcome of the meta-analysis, the average sample size of the included studies, the total number of studies, and the reported heterogeneity. We sought continuous primary outcomes for this analysis and evaluated the effect size with Cohen’s d. For MAs where a dichotomous primary outcome was reported, a log OR was calculated. For those circumstances where a primary outcome was not delineated, or when multiple primary outcomes were specified, a hierarchy based on the clinical importance of the outcome determined which outcome was used for the power analysis. The hierarchy was as follows: (i) pain intensity at 6 months (followed by other time points of this outcome); (ii) physical function at 6 months (followed by other time points of this outcome); (iii) opioid consumption at 6 months (followed by other time points of this outcome). The software used in this study was SPSS (IBM SPSS for Windows, V.21.0). An alpha of 0.05 was considered statistically significant for this study.
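To make the power calculation concrete, the following sketch approximates the power of a random-effects meta-analysis of standardized mean differences, using one common approach in which between-study heterogeneity inflates the variance of the pooled estimate. The inflation factors (1.33/1.67/2.0 for small/moderate/large heterogeneity) and the hardcoded two-sided 5% critical value are conventional assumptions for illustration, not parameters taken from this study, which used SPSS.

```python
from math import erf, sqrt

def norm_cdf(x: float) -> float:
    # Standard normal CDF expressed via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def ma_power(d: float, n_per_arm: float, k: int,
             heterogeneity: str = "moderate") -> float:
    """Approximate power of a random-effects MA of k studies.

    d          -- expected pooled effect size (Cohen's d)
    n_per_arm  -- average sample size per arm in each study (equal arms assumed)
    k          -- number of included studies
    """
    # Sampling variance of one study's Cohen's d (equal-arm approximation)
    v_d = 2.0 / n_per_arm + d ** 2 / (4.0 * n_per_arm)
    # Conventional inflation factors for between-study heterogeneity
    inflation = {"small": 1.33, "moderate": 1.67, "large": 2.0}[heterogeneity]
    v_pooled = inflation * v_d / k      # variance of the pooled estimate
    lam = d / sqrt(v_pooled)            # noncentrality parameter
    z = 1.959964                        # two-sided critical value, alpha = 0.05
    return (1.0 - norm_cdf(z - lam)) + norm_cdf(-z - lam)
```

As expected, power rises with the number of studies, the per-study sample size, and the effect size, and falls as heterogeneity grows, which is why the few adequately powered MAs in this review were those with larger effect sizes.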
Protocol deviations
Minor deviations from the protocol included collection of opioid consumption and physical functioning outcomes when applicable. We also added secondary analyses to assess associations in MA quality based on year of publication, effect size, and journal impact factor. Finally, we incorporated a power analysis of primary outcomes in MAs.
Results
Search results
The search strategy initially yielded 1657 articles prior to deduplication (figure 1). After removing duplicates, 1216 articles remained and were added to an EndNote file. Following independent screening, 91 articles underwent full-text review. Ultimately, 25 MAs were included in the final analysis.14–38 The remaining 65 studies that were excluded are listed in online supplemental eAppendix 2, along with reasons for each study’s exclusion after full-text review.
Study characteristics
We identified 25 MAs that assessed pain intensity, opioid consumption, and/or physical functioning outcomes after SCS therapy for chronic pain. Among these outcomes, 22 MAs14–16 18–28 30–32 34–38 assessed pain intensity as a primary or secondary outcome. Eleven MAs14 15 17 19 21 26 32 33 36–38 assessed functional scores as a primary or secondary outcome, and two MAs15 29 assessed opioid consumption as a primary or secondary outcome. The year of publication ranged from 2005 to 2023, coming from 19 different journals, with impact factors ranging from 1.70 to 17.50. Chronic pain indications were variable, but mainly focused on persistent spinal pain syndrome type 2 (ie, failed back surgery syndrome),14 16 17 19 21 37 38 CRPS,14 16 17 19 23 36 38 chronic back/leg pain,15 19 20 26 27 29 30 34 38 painful diabetic neuropathy,14 19 22 24 25 refractory angina pectoris,31–33 and various refractory neuropathic pain syndromes.14 19 28 35 37 Follow-up times for outcome assessment ranged from 2 days to 5 years, with many having interval follow-up times in the range of months. Type of waveform paradigms included tonic,14–17 19–24 28–38 burst,18 27 and 10 kHz22 24 26 29 stimulation. These waveform paradigms were often compared against each other or against placebo, conventional medical management, or other relevant modes of management for a specific chronic pain condition. Fifteen studies15 18 19 22 24–30 34–36 38 reported potential conflicts of interest based on study funding or author engagement with various industry sponsors. Nine studies14 16 17 20 21 23 31–33 reported no conflicts of interest. Further details are reported in tables 1 and 2.
AMSTAR-2 quality assessment
The AMSTAR-2 critical appraisal tool was used to appraise the overall quality of all 25 MAs. Three MAs15 19 20 had “high” overall quality, two of which were from the Cochrane Database of Systematic Reviews15 19 and one from Neuromodulation: Technology at the Neural Interface.20 Three MAs17 24 28 had “low” overall quality, and the remaining 19 MAs14 16 18 21–23 25–27 29–38 had “critically low” overall quality. When the AMSTAR-2 quality categories were converted to numerical values (0 for “critically low”, 1 for “low,” 2 for “moderate,” and 3 for “high”), the mean score was 0.48, indicating that the mean quality for all included MAs was “critically low”. The most common “critical” flaws identified were item 2 (“Did the report of the review contain an explicit statement that the review methods were established prior to the conduct of the review and did the report justify any significant deviations from the protocol?”), item 7 (“Did the review authors provide a list of excluded studies and justify the exclusions?”), and item 15 (“If they performed quantitative synthesis did the review authors carry out an adequate investigation of publication bias [small study bias] and discuss its likely impact on the results of the review?”). The percentage of MAs satisfying each AMSTAR-2 item is presented in figure 2, and the overall quality of each MA is listed in table 2.
Year of publication, journal impact factor, effect size, and quality
A scatter plot displays the AMSTAR-2 overall quality per the year of publication and overall quality per the journal impact factor (figure 3). There was no association between the publication year and AMSTAR-2 overall quality (β 0.043; 95% CI −0.008 to 0.095; p=0.097). There was an association between the journal impact factor and AMSTAR-2 overall quality (β 0.108; 95% CI 0.044 to 0.172; p=0.002). There was no association between the effect size and AMSTAR-2 overall quality (β −0.168; 95% CI −0.518 to 0.183; p=0.320).
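The slope estimates and CIs reported above come from simple linear regression. A minimal ordinary least-squares sketch (not the SPSS procedure used by the authors) shows how such a slope and its CI are obtained; the critical t value is supplied as an assumed input (2.069 corresponds to df = 23, i.e. n = 25 studies with two regression parameters).

```python
from math import sqrt

def slope_with_ci(x, y, t_crit=2.069):
    """OLS slope of y on x with a confidence interval -- a minimal sketch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx                                   # slope estimate
    # Residuals around the fitted line, used for the slope's standard error
    resid = [yi - (my + beta * (xi - mx)) for xi, yi in zip(x, y)]
    se = sqrt(sum(r * r for r in resid) / (n - 2) / sxx)
    return beta, (beta - t_crit * se, beta + t_crit * se)
```

A CI that excludes zero (as for impact factor here) corresponds to a statistically significant slope at the chosen alpha; a CI that straddles zero (as for publication year and effect size) does not.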
Power analysis
A power analysis was conducted on 15 MAs to determine if they were adequately powered to reject the null hypothesis.14 15 17–21 23–25 27 28 30–32 Twelve of 15 MAs (80%) were underpowered to determine their effect size.15 17–21 24 27 28 30–32 Only three MAs were adequately powered to reject the null hypothesis.14 23 25
Discussion
Evidence summary
This study identified that 76% of MAs assessing outcomes after SCS for chronic pain were considered “critically low” for overall quality per the AMSTAR-2 critical appraisal tool. Among the “critical flaws,” MAs were frequently deficient in meeting criteria for item 2 (statement of protocol registration a priori; justification of protocol deviations), item 7 (justification of each excluded study), and item 15 (investigation of publication bias and its impact). The most common flaws that downgraded the quality of MAs were related to lack of protocol registration, absence of a list of excluded studies with explanations, and failure to address publication bias. These findings are concerning because MAs, regarded as the highest level of evidence, are often used to guide clinical decisions and direct practice guidelines.7 Furthermore, given that the quality of evidence from many trials in the neuromodulation literature is already considered low,39–44 pooling low-quality trials within low-quality MAs likely further compounds poor-quality evidence and compromises accurate data representation.
The most commonly missed domain was providing a list of excluded studies with justification of exclusion (item 7). Shea et al5 emphasized the importance of transparency when reporting and justifying the exclusion of studies. Omitting this information introduces a risk that excluded studies remain unrecognized and/or unidentified, and the impact of their exclusion from the review is unknown.5 A common co-occurring theme in many articles that lacked this critical domain (item 7) was listing only general reasons for exclusion in the PRISMA flow diagram without identifying the specific excluded studies. For example, in an MA on SCS for chronic pain, the authors identified general reasons for exclusion in their PRISMA diagram, but did not provide a full detailed list of the 28 excluded studies.14 This was a recurrent theme among 16 of the 19 MAs in this study that were deficient in this critical domain.
The majority of MAs in this review did not register a protocol a priori. MAs and SRs are considered prospectively planned studies that pool data from previously published studies. Therefore, to limit publication bias, a pre-registered protocol prior to study commencement is critical. According to the AMSTAR-2 guidelines, authors should demonstrate that they worked with an a priori written protocol that had independent verification among coinvestigators.5 The studies in this review that successfully met this domain had previously registered protocols via PROSPERO (https://www.crd.york.ac.uk/prospero/). PROSPERO provides a comprehensive list of registered SRs to help avoid duplication and reduce the risk of reporting bias by enabling a comparison of the completed review with what was planned in the protocol.45 However, it is plausible that despite the registration of a protocol prior to MA commencement, there may be protocol deviations and violations that are not reported in the final published version.
Many MAs did not adequately investigate publication bias or discuss its implications in the results. Since publication bias is a common form of bias that is challenging to resolve, transparency about its presence, with an explanation of its impact, is crucial for the interpretation of published results. Small studies can impact the results of MAs in significant ways: smaller studies often have lower methodological quality and a tendency to report positive results more often, whereas large trials tend to yield smaller, non-significant effect sizes.46 Thus, the influence of small trials on pooled estimates of treatment effects should be routinely assessed in MAs.5 46
Despite continual advancements in standardization tools over the past decades, the association between publication year and quality of MAs did not achieve statistical significance. However, there was a trend suggesting a moderately strong, positive association between publication year and study quality based on AMSTAR-2 (figure 3), and future studies should explore whether this trend reaches statistical significance. In addition, despite careful design strategies, SRs with MAs can differ in quality, leading to contrasting answers to the same questions, because reporting and quality assessment of SRs with MAs are distinct but complementary processes.47 48 Investigators primarily develop their SRs with MAs using PRISMA or the Quality of Reporting Of Meta-analyses (QUOROM)49 statement as tools for complete and transparent reporting. PRISMA and QUOROM serve as checklists that ensure all relevant stages and considerations are documented during the research process, primarily focusing on enhancing the readability and replicability of the findings of SRs with MAs.4 In contrast, AMSTAR is an instrument that uses a scoring system and detailed criteria for assessing the strengths and weaknesses of an SR with MA, specifically focusing on evaluating the methodological quality of the conducted review. Therefore, the investigators suggest that future researchers in the field of SCS for chronic pain conditions use PRISMA guidelines in conjunction with AMSTAR-2 to enhance the rigor and reliability of SRs with MAs.
Finally, only three MAs had adequate power to reject the null hypothesis.14 23 25 These three MAs had larger effect sizes, implying that a smaller number of individual studies was needed to demonstrate a significant effect (>80% power). In the majority of the other MAs, there was not an adequate number of individual studies or patients included to detect the respective effect size. These findings may suggest that underpowered MAs should not have been meta-analyzed and that an SR without pooling would perhaps have been a more appropriate method of data synthesis. The higher the statistical power, the lower the risk of missing an actual effect (type 2 statistical error).50 If power is inadequate in MAs, authors should discuss this in their limitations and address the potential for type 2 statistical error.5
Strengths and limitations
To the best of our knowledge, this is the first study to comprehensively review the quality of MAs that evaluate SCS for chronic pain. Where applicable, the AMSTAR-2 guidelines were not only used to evaluate the quality of the included MAs but also served as a guide in preparing this manuscript, which is itself a PRISMA-compliant SR.
One potential limitation of this study is that there may be heterogeneity based on the type of statistical model used by different investigators, including fixed-effect, random-effects, and other types of statistical models. Our power analysis was based on the traditional random-effects model. Another potential limitation is the documented “floor effect” described by researchers studying the AMSTAR-2 critical appraisal tool for SRs and MAs since its inception in 2017. Given the flexibility in defining “critical domains,” there may be an overestimation of “critically low” confidence ratings based on which domains were determined to be “critical” versus “non-critical” when scoring. The Cochrane Handbook, PRISMA guideline, QUOROM guideline, and Meta-analysis Of Observational Studies in Epidemiology (MOOSE) guideline are the most commonly used guidelines for the development of SRs and MAs. However, the verbiage in some of these guidelines does not mandate reporting pre-established protocols or listing excluded studies with an explanation, which correspond to critical domains 2 and 7, respectively, in the AMSTAR-2 critical appraisal tool. Another potential limitation is that not all journals use the same source for reporting impact factors, which could lead to non-uniformity of comparison. Finally, only 15 of the 25 MAs could be included in the power analysis, as only these reported the complete variable data required. To prioritize data integrity and avoid the risk of bias in the power estimate, the investigators elected not to make assumptions for those studies that had missing data and acknowledge this as a limitation of the study.
Conclusions
Given advancements in SCS devices and their clinical utilization, there has been a significant increase in SCS-specific primary research and MAs. Unfortunately, the quality of published MAs has been largely poor, and AMSTAR-2 data appraisal standards have been sparingly incorporated. Our study finds that the vast majority of the 25 included MAs demonstrated “critically low” methodological quality per AMSTAR-2 criteria. The most commonly identified flaws that downgraded the quality of MAs were lack of protocol registration, absence of a list of excluded studies with explanations, and failure to address publication bias. These findings objectively highlight specific methodological deficiencies of these MAs and provide practice recommendations for conducting future MAs.
Ethics statements
Patient consent for publication
Ethics approval
Not applicable.
Acknowledgments
The authors thank Larry J Prokop MLIS from Mayo Library System, Mayo Clinic, Rochester, MN for his contribution with the literature search.
References
Footnotes
Contributors DJK: responsible for study design, protocol registration, screening, data extraction, data analysis/interpretation, drafting the manuscript, final manuscript approval. RC: responsible for screening, data extraction, drafting manuscript, final manuscript approval. NH, JK, EW: responsible for study design, data analysis/interpretation, final manuscript approval. RSD: responsible for study design, protocol registration, data analysis/interpretation, drafting the manuscript, final manuscript approval.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests RSD received investigator-initiated grant funding paid to his institution from Nevro Corp and Saol Therapeutics.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.