Rehabilitation & Sports Medicine
Cardiopulmonary Exercise Testing

Cardiopulmonary Exercise Testing – Methodological Aspects


Cardiopulmonary exercise testing (CPET) allows for a non-invasive assessment of the integrative response of the pulmonary, cardiovascular, and skeletal muscle systems during exercise. In sports medicine, CPET therefore covers a wide spectrum, ranging from diagnosis of disease and preoperative assessment to athlete monitoring. High standards of reliability and validity are needed to ensure high-quality and diagnostically conclusive CPET data, necessitating a systematic process of quality assurance and control in the daily application of CPET. Therefore, methodological aspects such as CPET equipment principles, calibration, verification, maintenance, preparation, and plausibility checks need to be considered. As inter-technology, inter-device, and inter-unit differences in reliability and validity are reported for automated metabolic analyzers, the choice of the appropriate device should follow the purpose of use and comprehensible data on reliability and validity.

To ensure high-quality measurements, careful calibration and verification of all sensors and of the integrated overall measurement performance, as well as maintenance of all equipment, need to be performed and monitored longitudinally. Further, standardized ambient conditions with adequate circulation and exchange of room air are essential. As the choice of ergometer and protocol influences various target values in CPET, both must be appropriate for the selected diagnostic objective and standardized accordingly. Patients should receive pretest information that clearly outlines the test procedure, and the correct attachment of the CPET equipment is of utmost importance.

To detect and correct malfunctions of the metabolic analyzer and equipment, plausibility checks of the validity of the outcome measures should be performed during the resting, unloaded, loaded, and recovery test phases. A basic plausibility check should include adequate resting values and adequate increases per workload increment of minute ventilation (V˙E), oxygen consumption (V˙O2), and respiratory exchange ratio (RER), using the rules of thumb by Rühle. Before the final data interpretation is performed, e.g. determination of ventilatory thresholds, maximum oxygen consumption (V˙O2max), or V˙O2peak, a plausibility check should be performed again, and it should be determined whether the patient’s effort was maximal.

Consequently, a standard operating procedure for quality assurance and control, including an intuitive data visualization with thresholds for “pass” and “fail” as well as for outliers and trends of concern, should be specifically defined, taught, and implemented in each facility.

Key Words: CPET, Exercise Testing, Physical Fitness


Measuring respiratory ventilation and gas exchange during exercise allows for a non-invasive examination of the responses of the pulmonary, cardiovascular, and skeletal muscle systems to submaximal and maximal exercise. In sports medicine, cardiopulmonary exercise testing (CPET) is therefore commonly applied across a wide spectrum, from diagnosis of disease and preoperative assessment to athlete monitoring.
To ensure high-quality and diagnostically conclusive CPET data in cross-sectional and longitudinal settings, high standards of reliability and validity are needed, necessitating a systematic process of quality assurance and control in the daily application of CPET. This requires well-trained and skilled staff who are familiar with methodological aspects such as equipment principles, calibration, verification, and maintenance, as well as CPET preparation and plausibility checks.

This short review focuses on these methodological aspects and aims to summarize current methodological guidelines and recommendations. For more detailed aspects such as the clinical application and interpretation of CPET or comprehensive standards on CPET, we refer to previous clinical reviews (7, 22, 23) and position statements (1, 18).


Nowadays, most laboratories apply automated metabolic analyzers with either (i) breath-by-breath (BxB), (ii) conventional mixing chamber (MIX), or (iii) dynamic micro mixing chamber (DMC) technology. Although these devices allow for convenient measurements of respiratory ventilation and gas exchange more or less in real-time, inter-technology (3, 15, 16, 30), inter-device (2, 5, 12), and inter-unit differences (14, 29) in reliability and validity have been reported.

Therefore, the choice of the appropriate device should follow the purpose of use and comprehensible data on reliability and validity.

For clinical use, fast-responding BxB systems, which allow tracking of intra-breath profiles, seem adequate. However, BxB systems are prone to measurement errors related to time delay, especially at high breathing frequencies (20), and therefore require extensive noise-reduction strategies.

Consequently, at least for the determination of peak values in an athletic population, MIX and DMC systems have been suggested to be superior due to their more robust gas-exchange measurement at high ventilation rates (3, 31). For more insights on measurement principles and data computing in CPET, we refer to Ward (26).

Calibration and Verification
The reliable and valid performance of a metabolic analyzer is mainly determined by (i) the accuracy and stability of the built-in sensors (ambient, volume/flow, gas), (ii) accurate calibration of all sensors, (iii) the integration of all signals by complex algorithms and corrections, and (iv) sufficient drying of the gas sample (13). Therefore, careful calibration and verification of all sensors and the integrated overall measurement performance need to be performed and monitored.

Based on the manufacturers’ guidelines and recommendations given in this short review, a standard operating procedure including a definition of “pass” or “fail” values should be defined, taught, and implemented in each facility.

Table 1 summarizes the recommended calibration procedures and calibration frequencies for the different sensor types. Calibration procedures and algorithms are defined by the manufacturers and may therefore vary. The calibration of the gas sensors, for example, generally includes slope and offset, delay time, and response time. Before calibration, the metabolic analyzer needs an adequate warm-up time according to the manufacturer’s guidelines, usually 30-60 min.
Accurate calibration is essential because a measurement error of, e.g., 5% in minute ventilation (V˙E) translates directly into a 5% error in oxygen consumption (V˙O2), while an error of only 1% in the fraction of expired oxygen (FEO2) could change V˙O2 by 6.5%; both errors (V˙E, FEO2) could accumulate to 11.7% in V˙O2 (25).
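The different sensitivity of V˙O2 to errors in V˙E and FEO2 can be illustrated with the standard open-circuit equation including the Haldane transformation. The following is a minimal sketch only: the ventilation value and the expired fractions (FEO2 17.0%, FECO2 4.0%) are illustrative assumptions and are not taken from the cited source, so the resulting percentages will differ somewhat from the figures quoted from (25).

```python
def vo2(ve, feo2, feco2, fio2=0.2093, fico2=0.0003):
    """Oxygen uptake (same unit as ve) from expired ventilation and gas
    fractions; inspired volume is estimated via the Haldane transformation."""
    vi = ve * (1.0 - feo2 - feco2) / (1.0 - fio2 - fico2)
    return vi * fio2 - ve * feo2


def relative_error(measured, true):
    """Deviation of a measured value from the true value, in percent."""
    return 100.0 * (measured - true) / true


# Assumed exercise values: VE in l/min, gas fractions as decimals.
VE, FEO2, FECO2 = 100.0, 0.170, 0.040
true_vo2 = vo2(VE, FEO2, FECO2)

# VE read 5% too high; FEO2 read 1% (relative) too low.
err_ve = relative_error(vo2(VE * 1.05, FEO2, FECO2), true_vo2)
err_feo2 = relative_error(vo2(VE, FEO2 * 0.99, FECO2), true_vo2)
err_both = relative_error(vo2(VE * 1.05, FEO2 * 0.99, FECO2), true_vo2)

print(f"+5% VE           -> {err_ve:+.1f}% VO2")
print(f"-1% FEO2         -> {err_feo2:+.1f}% VO2")
print(f"+5% VE, -1% FEO2 -> {err_both:+.1f}% VO2")
```

With these assumed inputs, the V˙E error passes through one-to-one, whereas the FEO2 error is amplified several-fold. The exact amplification depends on how close FEO2 is to the inspired fraction.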

Gas sensors are typically calibrated via a two-point calibration, i.e. a linear correction equation is determined from ambient air (O2 20.93%, CO2 0.03%) and a reference gas (O2 16%, CO2 5%). Because ambient air serves as one of the two calibration points, even small deviations in its composition can lead to incorrect calibration, which is why an optimal supply of fresh air is imperative.
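The principle of a two-point calibration can be sketched in a few lines. This is a simplified illustration, not a manufacturer's algorithm. Real devices additionally handle delay and response times, and the raw sensor readings used here are hypothetical.

```python
def two_point_calibration(raw_ambient, raw_reference,
                          true_ambient, true_reference):
    """Return (slope, offset) of the correction true = slope * raw + offset."""
    if raw_ambient == raw_reference:
        raise ValueError("Raw readings must differ to define a line.")
    slope = (true_reference - true_ambient) / (raw_reference - raw_ambient)
    offset = true_ambient - slope * raw_ambient
    return slope, offset


def apply_calibration(raw, slope, offset):
    """Convert a raw sensor reading into a corrected gas concentration (%)."""
    return slope * raw + offset


# Hypothetical raw O2 readings (%) of an uncalibrated sensor.
RAW_O2_AMBIENT, RAW_O2_REFERENCE = 20.60, 15.80
slope, offset = two_point_calibration(
    RAW_O2_AMBIENT, RAW_O2_REFERENCE,
    true_ambient=20.93, true_reference=16.00,
)

print(f"slope = {slope:.4f}, offset = {offset:.4f}")
print(f"ambient reads   {apply_calibration(RAW_O2_AMBIENT, slope, offset):.2f}%")
print(f"reference reads {apply_calibration(RAW_O2_REFERENCE, slope, offset):.2f}%")
```

Note that the ambient point is forced to 20.93% whatever the room air actually contains. If the room air is depleted of O2, the fitted line is wrong and every subsequent reading inherits that bias.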

While single sensors can easily be verified after calibration by comparing their readings with a corresponding reference input, e.g. a reference gas or a volume calibration syringe, verifying the integrated system performance by target output measurements of V˙O2, carbon dioxide output (V˙CO2), and V˙E is more complex. For this purpose, the use of metabolic simulators and biological validations at regular intervals is recommended.

Metabolic simulators are motor-driven syringes combined with an integrated dynamic mixing bag connected to a reference gas cylinder, allowing for reliable and valid simulation of different BxB steady-state ventilation and gas exchange rates (11). Apart from being rarely available due to high acquisition and operating costs, most of these simulators fail to reproduce human-like variations in breathing pattern waveforms as well as the temperature and humidity of human exhaled air. Therefore, metabolic simulators are better suited to testing the stability and reproducibility of the metabolic analyzer’s output measurements longitudinally than to a holistic evaluation of validity.

In contrast, biological validations are relatively simple to implement. A healthy individual just needs to perform a series of standardized constant steady-state workloads (about 6 min each) or ramp tests on a weekly or at least monthly basis. While repeated constant workload tests allow for verification of intra- and inter-test stability at low to medium workloads below the anaerobic threshold, ramp tests allow for the comparison of peak values and of the slope of V˙O2 versus workload, which should be between 8.5 and 12.5 ml/min/W for cycle ergometry (6). The computed values then need to be compared within an ongoing longitudinal measurement series.
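As an illustration of the slope check in a ramp test, the following sketch fits a least-squares line to V˙O2 versus workload and compares the slope with the 8.5-12.5 ml/min/W range given above. The data points are hypothetical, and in practice only the linear part of the ramp would be included in the fit.

```python
def least_squares_slope(x, y):
    """Slope b of the ordinary least-squares line y = a + b * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need two equally long series with at least 2 points.")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("Workload values must not all be identical.")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_is_plausible(slope, lower=8.5, upper=12.5):
    """Check the VO2/workload slope (ml/min/W) against the expected range."""
    return lower <= slope <= upper


# Hypothetical ramp data from a biological control (cycle ergometer).
workload_w = [50, 100, 150, 200, 250]
vo2_ml_min = [900, 1410, 1920, 2400, 2930]

slope = least_squares_slope(workload_w, vo2_ml_min)
print(f"VO2/workload slope: {slope:.1f} ml/min/W, "
      f"plausible: {slope_is_plausible(slope)}")
```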

While there are no legal target values regarding the reliability and validity of metabolic analyzers, some recommendations from different societies and researchers exist, which could be used as thresholds for “pass” or “fail” values in the calibration and verification routine. In summary, recommended limits for the reliability and validity of V˙E, V˙O2, and V˙CO2 range between 2% and 5% (1, 9, 10). In contrast, the total variability of these metrics over multiple measurements, consisting of technical and biological variability, is given as a coefficient of variation (CV) of 6-8% (4, 19). For a further understanding of the variation between collected values and the definition of outlier criteria, the work of Westgard et al. (27, 28) is recommended; the criteria are summarized in table 2.
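How such thresholds might be turned into “pass” or “fail” decisions is sketched below. The limits of 3% for a single-sensor verification and 8% for the CV of a biological control are assumed facility-specific choices within the ranges quoted above, and all measurement values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample coefficient of variation in percent."""
    return 100.0 * stdev(values) / mean(values)


def percent_deviation(measured, reference):
    """Deviation of a measured value from its reference, in percent."""
    return 100.0 * (measured - reference) / reference


def verdict(value_percent, limit_percent):
    """Return 'pass' if the absolute value is within the limit, else 'fail'."""
    return "pass" if abs(value_percent) <= limit_percent else "fail"


# Verification of the volume sensor with a 3.00 l syringe (hypothetical reading).
syringe_dev = percent_deviation(measured=3.05, reference=3.00)
print(f"syringe deviation: {syringe_dev:+.2f}% -> {verdict(syringe_dev, 3.0)}")

# Weekly VO2 (ml/min) of a biological control at a fixed workload (hypothetical).
weekly_vo2 = [2010, 2080, 1950, 2120, 1990, 2050]
cv = coefficient_of_variation(weekly_vo2)
print(f"biological control CV: {cv:.2f}% -> {verdict(cv, 8.0)}")
```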

For the visualization of the collected calibration and verification data, a Levey-Jennings chart in combination with the Westgard criteria is suggested, as it allows straightforward detection of outliers and identification of unusual trends. Further, “pass” or “fail” values can be visualized simply by using defined thresholds. Examples of visualized thresholds and a Levey-Jennings chart are given in figures 1 and 2.
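The logic behind a Levey-Jennings chart with multirule criteria can be sketched as follows. The example implements only a commonly used subset of the Westgard rules (1_2s as a warning, 1_3s, 2_2s, and R_4s applied to consecutive values) and does not claim to match the rule set of table 2. The baseline mean and standard deviation as well as the control values are hypothetical.

```python
def z_scores(values, mean, sd):
    """Express each control value in standard deviations from the baseline mean."""
    return [(v - mean) / sd for v in values]


def westgard_flags(values, mean, sd):
    """Return (index, rule) pairs for violations of a subset of Westgard rules."""
    z = z_scores(values, mean, sd)
    flags = []
    for i, zi in enumerate(z):
        if abs(zi) > 3:
            flags.append((i, "1_3s"))          # one value beyond 3 SD
        elif abs(zi) > 2:
            flags.append((i, "1_2s warning"))  # one value beyond 2 SD
        if i > 0:
            prev = z[i - 1]
            if (zi > 2 and prev > 2) or (zi < -2 and prev < -2):
                flags.append((i, "2_2s"))      # two in a row beyond the same 2 SD limit
            if (zi > 2 and prev < -2) or (zi < -2 and prev > 2):
                flags.append((i, "R_4s"))      # consecutive values more than 4 SD apart
    return flags


# Hypothetical baseline from earlier biological controls: VO2 in ml/min.
BASELINE_MEAN, BASELINE_SD = 2000.0, 50.0
controls = [2010, 1980, 2120, 2130, 1890, 2170]

for index, rule in westgard_flags(controls, BASELINE_MEAN, BASELINE_SD):
    print(f"measurement {index}: {controls[index]} ml/min violates {rule}")
```

Plotting the z-scores over time with horizontal lines at ±1, ±2, and ±3 SD yields the Levey-Jennings chart. The flags mark the points that warrant a closer look.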

To ensure that all CPET equipment works properly, regular maintenance and evaluation as described in the manufacturer-specific recommendations are required. The most sensitive parts include the CO2 and O2 gas sensors, built-in ambient sensors, overall system tightness, sampling lines, and flow calibration syringes.

Predominantly, three different types of O2 sensors are used: (i) zirconia fuel cells, (ii) paramagnetic analyzers, and (iii) semi-disposable systems (polarographic electrodes or galvanic fuel cells) (13). The semi-disposable sensors, which are commonly used in small stationary or portable devices, require replacement every 6-12 months and can show significant drift not only towards the start and end of their life span but also during long-lasting tests (8); this drift needs to be monitored as described.
Furthermore, sampling lines made of Nafion™ tubing (a sulfonic acid-based polymer) need to be replaced about every 2 months, depending on the frequency of use; discoloration of the transparent tubing to a yellowish tone is a clear indicator for replacement. In addition, after long-lasting tests and/or short intervals between two successive tests, it is good practice to change the sampling line, which in turn requires a new gas calibration.

Also, gas cylinders should be checked for expiration date, leakage, and minimum filling level. The latter is important because most gas sensors are sensitive to pressure and flow, so that too low a cylinder pressure might distort the calibration.

CPET Preparation

Test Environment
The exercise laboratory should be large enough to accommodate all equipment while still allowing adequate access to the patient. Monitoring screens should not be visible to the patient. To allow for standardized ambient conditions, including temperature (18-22 °C), humidity (30-40%), and O2 (20.93%) and CO2 (0.03%) gas concentrations, the laboratory must be well air-conditioned with adequate circulation and exchange of room air. If air exchange is insufficient, a fan must be used to avert abnormal CO2 and O2 gas concentrations.
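A pre-test check of the ambient conditions could be automated along the following lines. The temperature and humidity ranges and the gas targets are those given above. The gas tolerance of 0.05 percentage points is an assumption made for this sketch, and the readings are hypothetical.

```python
AMBIENT_RANGES = {"temperature_c": (18.0, 22.0), "humidity_pct": (30.0, 40.0)}
GAS_TARGETS = {"o2_pct": 20.93, "co2_pct": 0.03}
GAS_TOLERANCE = 0.05  # assumed tolerance in percentage points, not from the source


def check_ambient(readings, gas_tolerance=GAS_TOLERANCE):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, (low, high) in AMBIENT_RANGES.items():
        value = readings[key]
        if not low <= value <= high:
            problems.append(f"{key} = {value} outside {low}-{high}")
    for key, target in GAS_TARGETS.items():
        value = readings[key]
        if abs(value - target) > gas_tolerance:
            problems.append(f"{key} = {value} deviates from {target}")
    return problems


# Hypothetical readings: a well-ventilated room and a poorly ventilated one.
fresh_room = {"temperature_c": 20.5, "humidity_pct": 35.0,
              "o2_pct": 20.93, "co2_pct": 0.04}
stale_room = {"temperature_c": 24.0, "humidity_pct": 38.0,
              "o2_pct": 20.90, "co2_pct": 0.12}

print("fresh room:", check_ambient(fresh_room) or "OK")
print("stale room:", check_ambient(stale_room))
```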

Ergometer and Protocol Selection
It is well known that the choice of ergometer and protocol influences various target values in CPET (24, 32). Therefore, both must be appropriate for the selected diagnostic objective and standardized accordingly.

CPET is typically performed on a cycle ergometer, as it is appropriate for a wide range of patients (low fitness, limited motor skills, joint issues), provides an accurate measurement of external workload, and allows more convenient additional monitoring (ECG, blood pressure, blood sampling). Other ergometer types such as treadmills and rowing ergometers activate more muscle mass and therefore yield a higher V˙O2peak, but require a higher level of fitness and specific movement skills. Therefore, in most clinical applications, cycle ergometry is the preferable mode of exercise, while more sport-specific ergometer types should be used in athletic settings.

In general, the protocol should include a resting, unloaded, loaded, and recovery phase. For most clinical applications, incremental or ramp-wise exercise protocols should ideally last 10 ± 2 min, with the initial workload and the increase in workload adjusted to the patient’s fitness level. The workload increment should be selected so that patients can achieve a maximal effort and a valid V˙O2peak. Too rapid increases in workload are often associated with premature test termination due to lactic acidosis and hyperventilation, and consequently with the inability to determine the second ventilatory threshold (VT2) and V˙O2peak.
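The relation between the estimated exercise capacity, the ramp rate, and the test duration can be made explicit with simple arithmetic. In the following sketch, the predicted peak workload is an assumed input (e.g. derived from an ex-ante estimate of exercise capacity), and the numbers are hypothetical.

```python
def ramp_rate_w_per_min(predicted_peak_w, initial_w=0.0, target_min=10.0):
    """Ramp rate that reaches the predicted peak workload after target_min."""
    if predicted_peak_w <= initial_w:
        raise ValueError("Predicted peak must exceed the initial workload.")
    return (predicted_peak_w - initial_w) / target_min


def expected_duration_min(predicted_peak_w, rate_w_per_min, initial_w=0.0):
    """Expected duration of the loaded phase for a given ramp rate."""
    if rate_w_per_min <= 0:
        raise ValueError("Ramp rate must be positive.")
    return (predicted_peak_w - initial_w) / rate_w_per_min


def within_target_window(duration_min, target=10.0, tolerance=2.0):
    """True if the duration lies within target +/- tolerance minutes."""
    return abs(duration_min - target) <= tolerance


# Hypothetical patient: predicted peak 220 W, loaded phase starting at 20 W.
ideal_rate = ramp_rate_w_per_min(220, initial_w=20)
print(f"ideal ramp rate: {ideal_rate:.0f} W/min")

for rate in (25, 40):  # candidate protocol settings in W/min
    duration = expected_duration_min(220, rate, initial_w=20)
    print(f"{rate} W/min -> {duration:.1f} min, "
          f"within 10 +/- 2 min: {within_target_window(duration)}")
```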

For the necessary ex-ante estimation of a patient’s exercise capacity and the selection of a suitable exercise protocol, one of several useful options is the staircase question (6, 17) shown in table 3.

Patient Preparation
Patients should receive pretest information that clearly outlines the test procedure and includes clearly agreed signals, because verbal communication is restricted during CPET (speaking leads to incorrect measurements). In addition, the correct attachment of the CPET equipment is of utmost importance.
To allow for a tight and comfortable fit of the mask (if no mouthpiece is used) during the whole test procedure, mask size and headgear should be carefully chosen and fixed. A leakage test is mandatory: the patient inhales and exhales while the mask opening is closed with a hand held in front of it.

When the patient is connected to all required equipment (including ECG, blood pressure, and pulse oximeter where needed) and positioned on the ergometer, it should be verified that the sampling line and other connection cables run freely.

Plausibility Check

To detect and correct malfunctions of the metabolic analyzer and equipment, e.g. due to a defect or drift of the gas sensors, ineffective gas sample drying, mask leakage, or clogged sample tubes, plausibility checks of the validity of the outcome measures should be performed for the resting, unloaded, loaded, and recovery phases. A basic plausibility check should include adequate resting values and adequate increases per workload increment of V˙E, V˙O2, and RER, using the rules of thumb suggested by Rühle (21), given in table 4.

Before the final data interpretation is performed, e.g. VT determination as described in previous clinical reviews (22, 23), a plausibility check as described should again be performed for each protocol phase. If necessary, it should be determined whether the patient’s effort can be considered maximal, using the criteria given in table 5. Importantly, RER is not per se a reliable indicator of maximal exertion because it varies with hyperventilation and pathological conditions, and should therefore be used with caution as a secondary marker.


This short review aims to support physicians and practitioners in ensuring high data quality in CPET by providing methodological guidelines and recommendations that reduce the risk of bias and eliminate potential sources of error. However, as CPET is a complex diagnostic tool, no claim is made to completeness. Importantly, these good practice recommendations need to be adapted to the specific conditions and requirements of each facility before they can be transferred into daily practice.

To ensure high-quality and diagnostically conclusive CPET data in cross-sectional and longitudinal settings, high standards of reliability and validity are needed, necessitating a systematic process of quality assurance and control in the daily application of CPET.

Conflict of Interest

The authors have no conflict of interest.


  2. BABINEAU C, LÉGER LUC, LONG AL, BOSQUET L. Variability of Maximum Oxygen Consumption Measurement in Various Metabolic Systems. J Strength Cond Res. 1999; 4: 318-324.
  3. BEIJST C, SCHEP G, VAN BREDA E, WIJN PFF, VAN PUL C. Accuracy and precision of CPET equipment: a comparison of breath-by-breath and mixing chamber systems. J Med Eng Technol. 2013; 37: 35-42.
  4. COOPER BG, BUTTERFIELD AK. Quality control in lung function testing. In: ERS Buyers’ Guide Respir Care Prod; 2009.
  5. CROUTER SE, ANTCZAK A, HUDAK JR, DELLAVALLE DM, HAAS JD. Accuracy and reliability of the ParvoMedics TrueOne 2400 and MedGraphics VO2000 metabolic systems. Eur J Appl Physiol. 2006; 98: 139-151.
  6. DUMITRESCU D, GREIWING A, HAGER A, HOLLMANN W, MEYER K, SCHOMAKER R. Kursbuch Spiroergometrie: Technik und Befundung verständlich gemacht. 3. vollständig überarbeitete und erweiterte Auflage. Georg Thieme Verlag: Stuttgart, New York; 2015.
  7. FRIEDMANN-BETTE B. Die Spiroergometrie in der sportmedizinischen Leistungsdiagnostik: Application of Spiroergometry in the Diagnosis of Athletic Performance. Dtsch Z Sportmed. 2011; 62: 10-15.
  8. GARCIA-TABAR I, ECLACHE JP, ARAMENDI JF, GOROSTIAGA EM. Gas analyzer’s drift leads to systematic error in maximal oxygen uptake and maximal respiratory exchange ratio determination. Front Physiol. 2015; 6: 308.
  9. GORE C, ED. Physiological Tests for Elite Athletes. Human Kinetics: Champaign; 2000.
  10. HODGES LD, BRODIE DA, BROMLEY PD. Validity and reliability of selected commercially available metabolic analyzer systems. Scand J Med Sci Sports. 2005; 15: 271-279.
  11. HUSZCZUK A, WHIPP BJ, WASSERMAN K. A respiratory gas exchange simulator for routine calibration in metabolic studies. Eur Respir J. 1990; 3: 465-468.
  12. JAKOVLJEVIC DG, NUNAN D, DONOVAN G, HODGES LD, SANDERCOCK GRH, BRODIE DA. Lack of agreement between gas exchange variables measured by two metabolic systems. J Sports Sci Med. 2008; 7: 15-22.
  13. MACFARLANE DJ. Automated metabolic gas analysis systems: a review. Sports Med. 2001; 31: 841-861.
  14. MACFARLANE DJ, WU HL. Inter-unit variability in two ParvoMedics TrueOne 2400 automated metabolic gas analysis systems. Eur J Appl Physiol. 2013; 113: 753-762.
  15. MEYER T, GEORG T, BECKER C, KINDERMANN W. Reliability of gas exchange measurements from two different spiroergometry systems. Int J Sports Med. 2001; 22: 593-597.
  17. POLLOCK M, ROA J, BENDITT J, CELLI B. Estimation of ventilatory reserve by stair climbing. A study in patients with chronic airflow obstruction. Chest. 1993; 104: 1378-1383.
  18. PRITCHARD A, BURNS P, CORREIA J, JAMIESON P, MOXON P, PURVIS J, THOMAS M, TIGHE H, SYLVESTER KP. ARTP statement on cardiopulmonary exercise testing 2021. BMJ Open Respir Res. 2021; 8: e001121.
  19. REVILL SM, MORGAN MD. Biological quality control for exercise testing. Thorax. 2000; 55: 63-66.
  20. ROECKER K, PRETTIN S, SORICHTER S. Gas exchange measurements with high temporal resolution: the breath-by-breath approach. Int J Sports Med. 2005; 26: S11-S18.
  21. KROIDL RF, SCHWARZ S, LEHNIGK B. Kursbuch Spiroergometrie. Technik und Befundung verständlich gemacht. Georg Thieme Verlag KG, Stuttgart. 2010.
  22. SCHARHAG-ROSENBERGER F, SCHOMMER K. Die Spiroergometrie in der Sportmedizin. Dtsch Z Sportmed. 2013; 64: 362-366.
  23. SCHARHAG-ROSENBERGER F. Spiroergometrie zur Ausdauerleistungsdiagnostik. Dtsch Z Sportmed. 2010; 61: 146-147.
  24. TANAKA H, FUKUMOTO S, OSAKA Y, OGAWA S, YAMAGUCHI H, MIYAMOTO H. Distinctive effects of three different modes of exercise on oxygen uptake, heart rate and blood lactate and pyruvate. Int J Sports Med. 1991; 12: 433-438.
  25. TANNER RK, GORE CJ. Physiological tests for elite athletes. 2. ed. Human Kinetics: Champaign, Ill.; 2013.
  26. WARD SA. Open-circuit respirometry: real-time, laboratory-based systems. Eur J Appl Physiol. 2018; 118: 875-898.
  27. WESTGARD JO, GROTH T, ARONSSON T, FALK H, DE VERDIER CH. Performance characteristics of rules for internal quality control: probabilities for false rejection and error detection. Clin Chem. 1977; 23: 1857-1867.
  28. WESTGARD JO, BARRY PL. Chapter 4, in: Cost-effective quality control: managing the quality and productivity of analytical processes. In: Improving quality control by use of Multirule control procedures; 1997.
  29. WINKERT K, KAMNIG R, KIRSTEN J, STEINACKER JM, TREFF G. Inter- and intra-unit reliability of the COSMED K5: Implications for multicentric and longitudinal testing. PLoS One. 2020; 15: e0241079.
  30. WINKERT K, KIRSTEN J, DREYHAUPT J, STEINACKER JM, TREFF G. The COSMED K5 in Breath-by-Breath and Mixing Chamber Mode at Low to High Intensities. Med Sci Sports Exerc. 2020; 52: 1153-1162.
  31. WINKERT K, KIRSTEN J, KAMNIG R, STEINACKER JM, TREFF G. Differences in V˙O2max Measurements Between Breath-by-Breath and Mixing-Chamber Mode in the COSMED K5. Int J Sports Physiol Perform. 2021; 16: 1335-1340.
  32. ZUNIGA JM, HOUSH TJ, CAMIC CL, BERGSTROM HC, TRAYLOR DA, SCHMIDT RJ, JOHNSON GO. Metabolic parameters for ramp versus step incremental cycle ergometer tests. Appl Physiol Nutr Metab. 2012; 37: 1110-1117.
Dr. hum.biol. Kay Winkert
University Hospital Ulm
Division of Sports and
Rehabilitation Medicine
Leimgrubenweg 14, 89075 Ulm, Germany