The cardiopulmonary exercise testing grey zone; optimising fitness stratification by application of critical difference

British Journal of Anaesthesia 2018;120(6):1187 – 1194  doi: 10.1016/j.bja.2018.02.062

Presented by: Dr Hannah Saitch


  • Cardiopulmonary exercise testing (CPET) acts as an aid to preoperative clinical decision making.
  • The American Heart Association names it a “clinical vital sign”
  • Original study – found 18% mortality in elderly patients with anaerobic threshold (AT) <11 ml O2/kg, versus 0.8% mortality with AT >11 ml O2/kg (Older et al 1993)
  • Other studies have reported an AT 9-11 ml O2/kg as a cut off for higher versus lower risk.
  • Cardiorespiratory fitness is dynamic and therefore will vary within an individual – this is the biological difference.
  • The analytical difference is the variation that occurs with repeated testing of the same sample – it reflects the accuracy of the measurement.
  • The Critical Difference (CD) combines the analytical difference and the biological difference. It is defined as the random variation around a homeostatic point, i.e. the change that must be exceeded before a true difference of clinical significance can be claimed. No previous study has looked at CD for CPET.

Design and Setting

Two Arm Study

Arm 1

  • Aim to establish CD
  • Analytical Difference (CVA) – Simulated expired and inspired gases passed through a Medgraphics Ultima metabolic cart at 25 L/min. 8 repeated trials of 10 respiratory cycles, with the middle 5 breaths averaged. Aim was to simulate the peak of a CPET test
  • Biological Difference (CVB) – 12 healthy volunteers, 3 CPET tests each. All at different times of day, at least 24 hrs apart. Wasserman protocol. Medgraphics equipment used to measure VO2 peak, oxygen uptake efficiency slope and peak oxygen pulse. Anaerobic threshold manually calculated using the V-slope method.

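$$\mathrm{CD} = k \sqrt{CV_A^2 + CV_B^2}$$

where k is a constant, CVA is the analytical difference and CVB is the biological difference.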
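
The formula above can be sketched in a few lines of Python. This is only an illustration: the summary does not give the value of k or the individual CVA and CVB values, so the numbers below are hypothetical and chosen to make the arithmetic easy to follow.

```python
import math


def critical_difference(cv_a: float, cv_b: float, k: float) -> float:
    """Return the critical difference (%) from the analytical CV (cv_a, %)
    and the biological CV (cv_b, %), scaled by the constant k."""
    return k * math.sqrt(cv_a ** 2 + cv_b ** 2)


# Hypothetical inputs, NOT taken from the paper:
# sqrt(3^2 + 4^2) = 5, and 2 * 5 = 10
cd = critical_difference(cv_a=3.0, cv_b=4.0, k=2.0)
print(f"CD = {cd:.1f}%")
```

Because the two sources of variation are combined as a root sum of squares, the larger of CVA and CVB dominates the result.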

Arm 2

213 consecutive patients from colorectal pre-assessment CPET testing in single centre.  Retrospective analysis

CPET equipment and protocol used as per arm 1

Reference metrics from American Heart Association

  • VO2-AT <11 ml O2/kg
  • VO2 peak <16 ml O2/kg/min
  • VE/VCO2-AT >36

Statistical Analysis

  • IBM SPSS used
  • Distribution normality assessed using Shapiro-Wilk W test
  • Time-of-day effects in arm 1 assessed using Bonferroni-corrected repeated-measures analysis
  • Continuous data – mean or median used
  • Categorical data – absolute values used
  • Sample size for Power 80% with P<0.05


  • Revised fitness stratification: threshold boundaries corrected by +/- CD. The area in between was named “indeterminate fitness”
  • Compared patients for current and revised models


  • Patients who had false negative and false positive results when CD applied
  • False Positive – patients originally stratified as fit who become unfit when the negative correction is applied
  • False Negative – patients originally stratified as unfit who become fit when the positive correction is applied
  • Patients in area of indeterminate fitness
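
As a rough sketch of how the false positive and false negative definitions above could be applied, the Python below shifts a single AT result by minus or plus the CD and checks whether it crosses the reference threshold. It assumes the correction is a percentage of the measured value, uses the 11 ml O2/kg AT threshold and the 19% AT critical difference reported in this summary, and the function name and patient values are hypothetical. It handles only metrics where a higher value means fitter (AT, VO2 peak); VE/VCO2 runs in the opposite direction.

```python
def reclassify(value: float, threshold: float, cd_percent: float) -> str:
    """Label a result according to whether a -/+ CD shift crosses the threshold.

    Assumes 'fit' means value >= threshold (higher is fitter).
    """
    corrected_down = value * (1 - cd_percent / 100)  # negative correction
    corrected_up = value * (1 + cd_percent / 100)    # positive correction

    if value >= threshold and corrected_down < threshold:
        return "false positive"   # originally fit, unfit after negative correction
    if value < threshold and corrected_up >= threshold:
        return "false negative"   # originally unfit, fit after positive correction
    return "unchanged"


# Hypothetical AT values (ml O2/kg), threshold 11, CD 19%
for at in (8.0, 10.0, 12.0, 15.0):
    print(at, reclassify(at, threshold=11.0, cd_percent=19.0))
```

Results close to the threshold are the ones that change category, which is consistent with the observation that many patients move when the mean value sits near the cut-off.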

Results – Arm 1

Critical Differences as follows:

  1. AT – 19%
  2. VO2 Peak – 12.5%
  3. VE/VCO2 – AT – 10.2%

Results – Arm 2

Based on application of critical differences to results:

  1. For AT there were 69 (32%) false positives and 59 (28%) false negatives
  2. For VO2 Peak there were 35 (16%) false positives and 33 (15%) false negatives
  3. For VE/VCO2 – AT there were 40 (20%) false positives and 37 (17%) false negatives

The following revised stratification model was developed

  1. AT – unfit <9.2, indeterminate fitness 9.2 – 13.6, fit ≥ 13.6
  2. VO2 Peak – unfit <14.2, indeterminate fitness 14.2 – 18.3, fit ≥ 18.3
  3. VE/VCO2 – AT – unfit ≥ 40.1, indeterminate fitness 32.7 – 40.1, fit <32.7
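
The three-band model can be expressed as a small lookup, sketched below in Python. The cut-offs are those listed above, taking the upper AT boundary as 13.6 (the top of the indeterminate range); the function and dictionary names are hypothetical, and the treatment of values exactly on a boundary follows the ≥ and < signs as written in this summary.

```python
# (lower boundary, upper boundary, higher_is_fitter)
BANDS = {
    "AT": (9.2, 13.6, True),
    "VO2 peak": (14.2, 18.3, True),
    "VE/VCO2-AT": (32.7, 40.1, False),  # lower ventilatory equivalent = fitter
}


def stratify(metric: str, value: float) -> str:
    """Return 'unfit', 'indeterminate fitness' or 'fit' for one CPET metric."""
    lower, upper, higher_is_fitter = BANDS[metric]
    if value < lower:
        return "unfit" if higher_is_fitter else "fit"
    if value >= upper:
        return "fit" if higher_is_fitter else "unfit"
    return "indeterminate fitness"


# Hypothetical patient results
print(stratify("AT", 11.0))          # between 9.2 and 13.6
print(stratify("VO2 peak", 13.0))    # below 14.2
print(stratify("VE/VCO2-AT", 30.0))  # below 32.7
```

Note that a patient sitting exactly on the original AT threshold of 11 falls in the indeterminate band under this model.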

The revised stratification model was applied to the patients' CPET tests


  • Highlight potential for incorrect fitness stratification
  • Mean values were close to the thresholds, therefore a large number of patients moved into the indeterminate groups
  • AT most incorrectly stratified, then VO2 Peak, then VE/VCO2 – AT
  • Revised stratification confirms potential clinical impact


  • Clinically relevant question, although these variations would have been present throughout all previous studies determining CPET interpretation therefore
  • Power 80%, P <0.05
  • Appropriate statistical tests
  • CD calculation – standard mathematical formula


  • Study arm 1 not comparable to patients in arm 2 (all young, healthy males in arm 1), plus small numbers in arm 1. Potential for vastly different biological variation in the arm 2 population, with average AT of 11. Post study evaluation suggests larger groups (39 in each arm) should be used for randomised controlled exercise trials.
  • Data collected on a single system, therefore only applicable to Medgraphics equipment
  • No metabolic calibrator for equipment used (although 2.2% range of results is within expected accuracy range)
  • Retrospective data from study arm 2. States that standard protocol followed but can this be guaranteed? No mention of staff being trained to trial protocol for patients in study arm 2.
  • Does not analyse surgical outcomes in light of indeterminate risk.


  • No assessment of clinical outcomes in view of indeterminate risk
  • Highlights need for consideration of multiple factors within risk stratification
  • Consider CPET results as a dynamic range – particularly in postoperative planning

Potential for clinical impact

Moderate – fitness should not be considered as a single point estimate. With an increasingly elderly and co-morbid population, use of CPET is likely to increase, so there is potential for impact. However, the study does not answer how to manage patients in the indeterminate risk group. Further research may be indicated.