Validation of an Assistance System for Motion Analysis
Validation of an Assistance System for Motion Control
Background: In the rehabilitative setting, the use of sensor technologies is gaining in importance, especially the use of markerless and contactless systems. Reliably detecting motion and error patterns is an absolute prerequisite for use in exercise therapy. In this study, an annotated algorithm for the skeletal model of the sensor Kinect 1.0 was compared with movement detection by a therapist.
Methods: 18 test subjects (male: 10, female: 8; age 68.3±5.9 years) performed 3 sets of hip abduction exercises with 10 repetitions using the rope pull. The assessment of the movement by an experienced therapist was coded and then directly compared to the Kinect 1.0 detection results. The diagnostic parameters sensitivity, specificity, false-positive rate, positive predictive value, and negative predictive value were calculated.
Results: There was more than 70% agreement over all error patterns. Sensitivity was between 12.9% (hip rotated outside) and 66.6% (bent knee), and specificity was around 80%. The false-positive value ranged between 13% (wrong plane) and 22.7% (hip rotated outside). The positive predictive value for hip rotated outside was 74.3%. The negative predictive value ranged between 77.1% (upper body) and 92.7% (bent knee).
KEY WORDS: Motion Detection, Error Patterns, 3D-Depth Sensor, Rope Pull
Problem: In the rehabilitative setting, the use of sensor technologies, especially markerless and contactless systems, is gaining in importance. Reliable motion detection and the detection of error patterns during movement are basic prerequisites for this. The aim of the study was therefore to compare a newly developed algorithm for error pattern detection, annotated for the skeletal model of the Kinect 1.0, with the movement detection of an experienced training therapist.
Methods: 18 subjects (male: 10, female: 8; age 68.3±5.9 years) performed the hip abduction exercise on the rope pull in 3 sets of 10 repetitions. The movement detection of an experienced therapist was coded and directly compared with that of the system. From this, the diagnostic parameters sensitivity, specificity, false-positive value, and positive and negative predictive value were determined.
Results: Agreement across all error patterns was more than 70%. Sensitivity was between 12.9% (hip rotated outside) and 66.6% (bent knee), and specificity was around 80%. The false-positive value was between 13% (wrong plane) and 22.7% (hip rotated outside). The positive predictive value for hip rotated outside was 74.3%. The negative predictive value was between 77.1% (upper body) and 92.7% (bent knee).
Discussion: The system does not detect all error patterns, but it does identify correct movement execution as correct. The latter is important for user acceptance. For the error patterns upper body and hip rotated outside, it can be assumed that the detected errors actually occurred. With more than 70% agreement, the data show sufficiently reliable motion detection for the assistance system to be used as support in therapy. Further studies should address the use of the system in clinical practice, e.g. user acceptance among subjects or the feasibility of the feedback.
KEY WORDS: Motion Detection, Movement Patterns, 3D-Depth Sensor, Rope Pull
The use of sensor technologies for motion detection is gaining in importance in exercise therapy. Due to demographic change, the number of elderly patients is constantly increasing. Furthermore, care in exercise therapy is currently insufficient. Supervising 12 to 15 patients at once makes it impossible for the therapist to correct and eliminate movement errors (13). This means that therapists cannot meet the requirement of supervising a maximum of 10 patients in the training area (5). Therefore, sensor systems and visual feedback will need to be the therapists’ “third eye”. This is particularly relevant when many patients train at the same time and the therapist cannot detect every movement error or give corrective instructions. Patients can control and correct their movement through feedback generated by sensor technology. The use of sensors is only helpful if they identify movement similarly to an experienced therapist. 3D-depth sensors are often used because they detect motion without markers or contact. Their advantages over marker-based motion capture systems are user-friendly operation and low cost. For use in exercise therapy, marker-based systems are impracticable because attaching the markers is complex and requires skilled personnel.
In a review, Verbrugghe et al. (19) described the use of technical systems in rehabilitative exercise therapy. Kato et al. (10) carried out an investigation using a self-developed prototype consisting of a 3D-depth sensor and a feedback system. They tested healthy subjects and patients with brain injury. The 3D-depth sensor was used to detect the position of the test subjects, and a vibrator provided sensory feedback. For training of the upper limbs and of balance, target objects were displayed that had to be tracked. Patients needed more time to complete their first tasks, but the time shortened from the first to the last attempt. The balance task was also completed in a shorter time. The angle values of the patients also approximated those of the healthy subjects. The researchers considered the use of the 3D-depth sensor in the system to be useful.
Kato et al. (10), Wang et al. (20) and Fernandes-Baena et al. (7) found that the 3D-depth sensor is suitable for movement analysis in rehabilitative exercise therapy.
Wang et al. (20) and Fernandes-Baena et al. (7) evaluated the 3D-depth sensor against marker-based motion capture systems for motion detection in rehabilitative exercise therapy. Wang et al. (20) compared the motion detection accuracy of the first- and second-generation 3D-depth sensor with an optical marker-based motion capture system. To determine joint localization, the same 20 joint points were used in all three systems, and 12 exercises were carried out for the test. For the first-generation sensor, the determination of bone length showed larger offsets and standard deviations, especially in the femur. The cause may be the offset of the hip joint. The second-generation sensor was robust against interference and tracked position accurately. For the present study, the information on hip offsets is important: movements of the lower extremities are observed, so that shifts are also possible here, and Kinect 1.0 is used. Fernandes-Baena et al. (7) came to similar conclusions when comparing with an optical system. Their results show that the signals of the 3D-depth sensor and the optical marker-based system were synchronous within the movement patterns, with offsets of less than 10° for the tested knee and hip movements. Furthermore, the position of the extremities relative to the sensor plays an important role. In extreme rotation, the leg is perpendicular to the Kinect sensor, and errors occur when tracking the hip or knee. For the rehabilitation of knee patients, a sensor system was developed which uses a 3D-depth sensor to record the movement, count the repetitions, and give feedback regarding movement quality. Technical assistance systems are often used in motor rehabilitation (2, 8). Feedback is generated for the user from the resulting measurement data.
In the study by Banala et al. (2), the movements were captured using force and pressure sensors, and Hirokawa & Matsumura (8) detected the foot position electronically. In addition, it has been proven that visual feedback from assistance systems improves the movement quality (1, 12, 18, 19).
Previous studies have tested the accuracy of movement detection in various technical systems and showed that markerless optical sensors can provide support (7, 20). As far as is known, the 3D-depth sensor has so far only been validated against other optical systems.
This paper aims to compare the developed algorithm for error detection with the movement assessment of a therapist. For validation, important parameters of clinical studies were calculated. These are sensitivity, specificity, false-positive value, positive predictive value and negative predictive value. The contactless and markerless depth sensor Kinect 1.0 was used due to its user-friendly operation, which is essential in exercise therapy.
18 healthy subjects participated in the study (68.3±5.9 years) (Table 1). Before the study, all subjects were informed about the study design and purpose and then gave informed consent to their participation in writing.
The inclusion criterion for participation in the study was an age between 50 and 80 years. Exclusion criteria were neurological health restrictions and pain during the movement to be performed. The examination standards corresponded to the Declaration of Helsinki.
The test subjects were asked to perform the exercise hip abduction on a rope pull (Fig. 1), which is a therapeutically relevant exercise for strengthening the hip abductors (13). A cuff was attached above the ankle joint to transmit the resistance. The cuff was then connected to the weight block of the rope pull, and the subject stood at the side of the device. To become familiar with the test procedure, each subject completed five repetitions without weight. The subjects were instructed to perform a physiological hip abduction; no abduction angle was specified. Subsequently, subjects carried out three sets of 10 repetitions under load (10 kg), with a pause of 30 s between sets.
Meanwhile, the 3D RGB and depth sensor Kinect 1.0 (Microsoft Corporation, Redmond, WA, 98052-7329, USA) in the assistance system recorded the movements. At the same time, an experienced training therapist (more than five years of clinical practice) observed the subject’s movements and recorded the perceived error patterns on an evaluation sheet. The error patterns considered were flexed upper body (UB), wrong plane (WP), bent knee (BK), and hip rotated outside (HO) (Fig. 1) and are based on an expert survey (13).
Equipment
The sensor was placed four meters in front of the subject training on the rope pull. The PC connected to the sensor processed the skeletal data to detect error patterns. Prior to the measurements, the assistance system was trained to detect the above error patterns using skeletal data from the sensor Kinect 1.0, machine learning methods, and Incremental Dynamic Time Warping (IDTW). In this way, the assistance system learned the error patterns automatically from previously recorded example sequences of correct and faulty exercise executions. By normalizing the skeletal data, it was possible to teach the system independently of the size and stature of a person (15).
The algorithm evaluated the subject’s movements recorded by the assistance system according to the following rules: BK: measured knee angle <165°; UB: angle between shoulder center and left ankle, with pivot point hip center, <160°; WP: distance of right ankle to correct plane of motion >380 mm; HO: detection by foot position (Fig. 1). On this basis, movements outside these limits were classified by a support vector machine (SVM) as error patterns. The error patterns recorded for each frame (point in time) were then filtered to smooth the error patterns displayed to the user. For this purpose, a time window was created for each error pattern, containing the detection results of the last 3 or 5 frames for that error pattern (15). An error was only visualized if the system detected it in the majority of the frames contained in this time window (2 of 3 frames or 3 of 5 frames).
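The threshold rules and the majority filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the SVM classification and the IDTW training are replaced here by plain threshold checks, HO is omitted because its foot-position rule is not specified in the text, and all function and class names (`joint_angle`, `frame_flags`, `MajorityFilter`) are hypothetical.

```python
import math
from collections import deque


def joint_angle(a, b, c):
    """Angle in degrees at point b between the segments b->a and b->c.

    a, b, c are 3D points (x, y, z); the points are assumed to be distinct.
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cos_value = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_value))


def frame_flags(knee_angle_deg, trunk_angle_deg, plane_distance_mm):
    """Apply the limits quoted in the text to a single frame.

    Simplification: plain thresholds instead of the SVM used in the study.
    """
    return {
        "BK": knee_angle_deg < 165.0,    # bent knee
        "UB": trunk_angle_deg < 160.0,   # flexed upper body
        "WP": plane_distance_mm > 380.0, # wrong plane
    }


class MajorityFilter:
    """Report an error only if it was detected in the majority of the
    last `window` frames (2 of 3 or 3 of 5)."""

    def __init__(self, window=3):
        self.history = deque(maxlen=window)

    def update(self, detected):
        self.history.append(bool(detected))
        return sum(self.history) > self.history.maxlen // 2


# Example: one filter per error pattern, fed frame by frame.
bk_filter = MajorityFilter(window=3)
shown = [bk_filter.update(d) for d in [True, False, True, True, False, False]]
```

Until the window is filled, the filter in this sketch still requires the full majority count, so a single early detection is never displayed on its own.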
For the statistical analysis of the data, the correct classification rate was evaluated with the statistical software R (14). This rate indicates the number of correct predictions in relation to the total number of observations. The codes issued by the assistance system (e.g. 0; 1; 0; 0) were compared with the evaluation of the therapist, which was also available as a code (e.g. 0; 1; 0; 0). The sequence of digits corresponds to the order of the error patterns UB; WP; BK; HO. 0 means that no error was detected, 1 means that an error was present according to the evaluation. The difference between the assistance system and the therapist was calculated for each individual repetition of each test subject. The given numerical example then yields a difference code of 0; 0; 0; 0, i.e. system and therapist matched. Positive values (+1) in the difference codes mean that the system recognized an error that the therapist did not note. Negative values (-1) mean that the therapist noted an error that the system did not recognize. In this way, the difference codes shown in Fig. 2 were created, reflecting the occurrence and distribution of the error patterns. It was assumed that the therapist judged the movements correctly, so that the therapist’s evaluation was used as the reference for the evaluation of the assistance system.
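The coding scheme can be sketched in a few lines of Python (the study itself used R). The example codes below are invented for illustration and are not study data; `difference_code` and `agreement_rate` are hypothetical helper names.

```python
from collections import Counter

PATTERNS = ("UB", "WP", "BK", "HO")  # order of the digits in each code


def difference_code(system, therapist):
    """System code minus therapist code, per error pattern.

    +1: system reported an error the therapist did not note.
    -1: therapist noted an error the system did not report.
     0: both agree.
    """
    return tuple(s - t for s, t in zip(system, therapist))


def agreement_rate(system_codes, therapist_codes, index):
    """Share of repetitions in which both ratings match for one pattern."""
    matches = sum(
        1 for s, t in zip(system_codes, therapist_codes) if s[index] == t[index]
    )
    return matches / len(system_codes)


# Invented example: four repetitions, one code per repetition.
system_codes = [(0, 1, 0, 0), (1, 0, 0, 0), (0, 0, 1, 0), (0, 0, 0, 0)]
therapist_codes = [(0, 1, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 0, 0)]

diffs = [difference_code(s, t) for s, t in zip(system_codes, therapist_codes)]
code_counts = Counter(diffs)  # frequency of each distinct difference code
ub_agreement = agreement_rate(system_codes, therapist_codes, PATTERNS.index("UB"))
```

Counting the distinct tuples in `code_counts` corresponds to the frequency distribution of difference codes that the paper reports in Fig. 2.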
To determine the diagnostic parameters sensitivity, specificity, false-positive value, as well as the positive and negative predictive values, the cases of equal and unequal evaluation per error pattern were calculated on the basis of the code output using a four-field table (Table 2; equations 1-5).
In the present study, the parameters relating to error detection by the assistance system and the therapist are defined as follows (for the equations, see Table 5):
Sensitivity (SENS): The ability of the system to identify an error pattern as such (Equation 1)
Specificity (SPEC): The ability of the system to identify a correct movement as such (Equation 2)
False positive value (FPV): Error of the system to identify a correct movement as an error pattern (Equation 3)
Positive Predictive Value (PPV): The probability that an error pattern is actually present if it has been identified by the system (Equation 4).
Negative predictive value (NPV): The probability with which a correct movement actually exists if it has been identified by the system (Equation 5) (3, 4, 16).
SENS and PPV are therefore used as measures for predicting error patterns. SPEC and NPV, on the other hand, are measures for predicting correct motion.
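As an illustration, the five parameters can be computed from a four-field table as follows. The formulas are the standard textbook definitions and are assumed to correspond to Equations 1-5, which are not reproduced in this text; the function names and the example counts are hypothetical and are not study data.

```python
def four_field_table(system, therapist):
    """Count TP, FP, FN, TN for one error pattern.

    `system` and `therapist` are sequences of 0/1 ratings per repetition;
    the therapist rating is taken as the reference.
    """
    tp = fp = fn = tn = 0
    for s, t in zip(system, therapist):
        if s and t:
            tp += 1   # error present and detected
        elif s and not t:
            fp += 1   # correct movement flagged as error
        elif t:
            fn += 1   # error present but missed
        else:
            tn += 1   # correct movement rated as correct
    return tp, fp, fn, tn


def _ratio(numerator, denominator):
    # Undefined ratios (empty row or column) are returned as NaN.
    return numerator / denominator if denominator else float("nan")


def diagnostic_parameters(tp, fp, fn, tn):
    """Standard definitions of the five diagnostic parameters."""
    return {
        "SENS": _ratio(tp, tp + fn),
        "SPEC": _ratio(tn, tn + fp),
        "FPV": _ratio(fp, fp + tn),   # equals 1 - SPEC
        "PPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
    }


# Invented counts for a single error pattern.
params = diagnostic_parameters(tp=20, fp=10, fn=30, tn=140)
```

With these definitions SPEC and FPV always sum to 1, which matches the reported value pairs (e.g. 77.3% and 22.7% for HO).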
To visualize the prediction accuracy of the assistance system, ROC curves (Receiver Operating Characteristic curves) were generated for the individual error patterns. The area under the curve (AUC) represents the accuracy of the test statistics.
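How an ROC curve and its AUC can be obtained is sketched below in Python. The text does not state how the curves were generated, so this sketch assumes that a continuous detection score per repetition is available (for example a classifier decision value); the scores, labels, and function names are hypothetical.

```python
def roc_points(scores, labels):
    """ROC points (false-positive rate, hit rate) for all score thresholds.

    labels: 1 = error present according to the reference, 0 = no error.
    Both classes must occur at least once.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented scores and reference labels for six repetitions.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

An AUC of 0.5 corresponds to the diagonal, i.e. a random result, while values closer to 1 indicate better separation of error patterns from correct movements.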
A total of 18 subjects completed the test task consisting of three sets of ten repetitions, so that 540 cases were included in the calculation. In 253 cases, the therapist and the system matched with respect to error pattern recognition. Counting the cases yielded 47 different difference codes, as explained in the previous section, which occur at different frequencies (Fig. 2). The values +1 and -1 resulted from cases of unequal evaluations between therapist and system. In 165 cases, therapist and system matched in three error patterns and differed in one. In 122 cases, the evaluations differed for more than one error pattern.
The cases of unequal evaluations were used as base values to determine the statistical parameters SENS, SPEC and FPV, as well as PPV and NPV (Table 3).
The agreement between therapist and system was more than 70% for all error patterns (UB: 74.6%, WP: 80.7%, BK: 81.5%, HO: 81.5%). These values were calculated on the basis of the frequencies of the individual error patterns within the individual repetitions.
SENS ranged from 12.9% (HO) to 66.6% (BK). SPEC was around 80% (77% for HO to 87% for WP), and FPV assumed values between 13% (WP) and 22.7% (HO). PPV was highest for HO (74.3%) and lowest for WP (45.8%). NPV was between 77.1% (UB) and 92.7% (BK) (Table 4).
The ROC curves (Fig. 3) show the accuracy of the test statistics for the determined error patterns, which is expressed by the AUC. Values close to the diagonal correspond to a random test result, since the hit rate and the false positive rate are close together. The AUC for the individual error patterns is between 0.693 (WP) and 0.822 (HO).
Assistance systems to improve motion control must meet high standards with regard to the quality criteria to actually be beneficial in practice. The aim of the present study was to validate a markerless assistance system for motion analysis based on depth image information. Therefore, the algorithm developed for error pattern recognition was compared with the recognition skills of an experienced therapist. SENS, SPEC, FPV, as well as PPV and NPV were analyzed to be able to make statements on the accuracy of the system.
All error patterns evaluated showed an agreement between therapist and system of more than 70%. Agreement was lowest for the error pattern UB, at 74.6%, and highest for the error patterns BK and HO, at 81.5%.
This means that the algorithm's assessment matched the therapist's in more than 70% of cases, counting both detected errors and movements rated as correct. The highest SENS was 66.6% (BK) and the highest SPEC was 86.9% (WP).
The HO error pattern stood out with a low SENS of 12.9%, an average SPEC of 77.3%, and the highest FPV of 22.7% (Table 4). SENS describes the ability of the system to identify an error pattern as such. A low SENS therefore means that the system detects only a small share of the HO errors noted by the therapist. However, the PPV for HO was high, so it can be assumed that an HO detected by the system actually occurred.
In clinical tests, SENS indicates how many of the patients affected by a disease are actually identified as positive by the test. Determining SPEC is just as important. A test with a high SPEC identifies healthy individuals as healthy, i.e. it returns a negative result for them. SPEC decreases if a test produces many false positives. In addition, PPV reflects the probability that a disease is actually present when it is identified by the test. Correspondingly, the NPV is the probability that a person identified as healthy by the test is actually healthy (4, 6, 16).
The SENS found in this study shows that not every error pattern indicated by the therapist was also detected by the algorithm. For the patient, this may mean that incorrect movement sequences become automated, since neither the system nor the patient recognized the error as needing correction. As a result, muscle groups are trained that are not the primary focus of the therapy.
The analyzed SPEC shows that correct movements were mostly evaluated as correct. This plays a decisive role for motivation and user acceptance.
The FPV values, which here lie between 13% and 22.7% (Table 4), indicate that in individual cases correct movements were detected as errors. Displaying a movement error despite correct execution may lead to uncertainty in the patient and reduce acceptance of technical aids.
High SENS values mean that the system reacts more sensitively, i.e. more of the errors that occur are detected. The combination of a low SENS (12.9%) and a high PPV (74%) for the HO error pattern indicates that the system detected this error less often, but when it did, the error was actually present. This is consistent with the AUC of the ROC curve, which was 0.822 for HO.
The parameters SENS and SPEC are based on the occurrence or non-occurrence of incorrect movements, which is determined in validation studies using reference standards (6). Such a reference measure is not available in this study. In summary, therefore, the tested assistance system can be used in practice to support movement control, but cannot replace a therapist.
For all error patterns, the SPEC indicates that correct movements were largely identified as correct. For UB and HO, the probability is highest that an error pattern detected by the system actually occurred. Given the current care situation, it is advantageous to be able to use markerless and contactless systems as support.
Markerless and contactless sensors have become established for use in therapy. They permit movements without impairments and do not require any additional assistance, as for example with the marker attachment of other motion capture systems. Therefore, the 3D-depth sensor Kinect 1.0 was used in this study.
In addition to the number of repetitions, the number of sets and the angle (ROM) of the movement, the error patterns that occur also provide important information for patients and therapists. For an assistance system to provide support, the execution of the movement and the errors must be reliably detected. This is the basic prerequisite for generating feedback to the user from the recorded movement. Hopper et al. (9), Kim et al. (11), and Lösch et al. (12) have already shown positive effects of adhering to a visual target value in motion control. For example, bar graph tracking in Kim et al. (11) resulted in positive adjustments in stride length and walking speed. Visual feedback can assist the patient in movement execution by providing information, for example on amplitude and speed.
Until now, Kato et al. (10) and Verbrugghe et al. (19) have described the reliable use of technical assistance systems in training therapy. Markerless systems have become established for motion detection. They avoid the movement restrictions that can be caused by the placement of sensors. Markerless systems can increase user-friendliness because they do not require complicated installation prior to use.
Further studies compared the motion detection accuracy of the first- and second-generation markerless 3D-depth sensor against marker-based motion capture systems. It is therefore known that the skeletal model of the 3D-depth sensor, in particular of the first generation, shows inaccuracies in skeletal recognition. Wang et al. (20) found deviations of about 200 mm in the localization of the hip joints. In addition, statistically significant differences were shown between the first and second generation of the sensors in the skeletal points SPINE, ROOT (trunk), HIP, ANK (ANKLE), and FOO (FOOT). Even in the second generation, the foot and ankle joint positions are still approx. 100 mm from the base level. In addition, Fernandes-Baena et al. (7) describe that the positioning of the extremities relative to the sensor has a decisive influence on the accuracy of skeletal recognition. Xu & McGorry (21) found significantly more accurate detection of the upper extremities compared to the lower extremities for the 3D-depth sensor in both generations. For example, the accuracy level for hand and arm was within 100 mm, while for hip and legs it was over 100 mm.
Previous studies as well as the present results show that 100% error pattern recognition is currently not possible. The underlying skeletal model of the depth sensor is assumed to be the cause (7, 20, 21).
A more accurate skeletal model that reliably determines the positions of the joints in three-dimensional space can help to increase the accuracy of error detection. For this purpose, a separate model must be developed and evaluated. Furthermore, it has to be investigated to what extent further feature vectors improve the error classification. Further development of the existing algorithm, which is based on a mean value filter, will contribute to improved filtering. This will make it possible to filter out false detections.
Limitation and Conclusion
In the present study, the assistance system developed for the hip abduction exercise on the rope pull was tested for reliable error recognition. It is well known that other exercises with high therapeutic benefits also present problems due to several degrees of freedom in the execution of movement. Statements about the validity of error recognition in other exercises cannot currently be made.
The assessment of the movement by a therapist was used for validation. The therapist observed the movement for the occurrence of the error patterns upper body (UB), wrong plane (WP), bent knee (BK), and hip rotated outside (HO), which the system was trained to evaluate. In an expert survey these error patterns were reported as frequent errors. The therapist’s assessment was then used as a reference. For this purpose, an experienced therapist with more than five years of practice carried out the observation. The practical experience of the therapist can be considered sufficient and valid. For complete validation, it will be necessary in future studies to have the system tested by at least 3 therapists.
Technical assistance systems for motion control are becoming increasingly important in training therapy. In this study, error pattern recognition by an algorithm annotated to the sensor was compared with the error pattern recognition of a therapist for the first time. This resulted in an agreement of more than 70%. The developed assistance system can thus be used as an objective instrument for motion control in clinical settings. In the long term, such a system could provide patients with visual feedback on movement and document the course of therapy.
Funding & Declaration
This study was funded by the European Social Fund (ESF). Since this contribution involved a study on humans, the study plan was reviewed by the Ethics Commission of the TU Chemnitz, and the subjects agreed in writing to their participation in the study. The guidelines of the Declaration of Helsinki were observed.
Conflict of Interest
The authors have no conflict of interest.
- Validation of kinect-based telerehabilitation system with total replacement patients. Journal of Telemedicine. 2016; 22: 192-197.
- Robot assisted gait training with active leg exoskeleton (ALEX). IEEE Trans Neural Syst Rehabil Eng. 2009; 17: 2-8.
- An introduction to medical statistics. 3. ed., (Nachdr.). Oxford: Oxford University Press; 2009. Oxford medical publications.
- Kurzgefasste Statistik für die klinische Forschung: Leitfaden für die verteilungsfreie Analyse kleiner Stichproben; mit 91 Tabellen. 2., aktualisierte und bearb. Aufl. Heidelberg: Springer; 2003. Springer-Lehrbuch.
- Medizinische Trainingstherapie - Durchführung -. [18th December 2018].
- Sensitivität, Spezifität, positiver und negativer Vorhersagewert. Rehabilitation (Stuttg). 2005; 44: 44-49.
- Biomechanical Validation of Upper-Body and Lower-Body Joint Movements of Kinect Motion Capture Data for Rehabilitation Treatments. In: 2012 Fourth International Conference on Intelligent Networking and Collaborative Systems: IEEE; 2012: 656-661.
- Biofeedback gait training system for temporal and distance factors. Med Biol Eng Comput. 1989; 27: 8-13.
- The influence of visual feedback on power during leg press on elite women field hockey players. Phys Ther Sport. 2003; 4: 182-186.
- Development and evaluation of a new telerehabilitation system based on VR technology using multisensory feedback for patients with stroke. J Phys Ther Sci. 2015; 27: 3185-3190.
- Effects of Visual Feedback Distortion on Gait Adaptation: Comparison of Implicit Visual Distortion Versus Conscious Modulation on Retention of Motor Learning. IEEE Trans Biomed Eng. 2015; 62: 2244-2250.
- Visuelle Bewegungskontrolle geführter Kraftübungen bei jungen Erwachsenen und Senioren. Ger J Exerc Sport Res. 2018; 48: 428-437.
- Einsatz und Bedeutung von Seilzügen in der Medizinischen Trainingstherapie am Beispiel Hüft-Totalendoprothese – eine Expertenperspektive. B & G. 2018; 34: 20-28.
- A language and environment for statistical computing. R Foundation for Statistical Computing. [18th December 2018].
- Motion Error Classification for Assisted Physical Therapy - A Novel Approach using Incremental Dynamic Time Warping and Normalised Hierarchical Skeleton Joint Data. In: Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods: SCITEPRESS - Science and Technology Publications; 2017: 281-288.
- Deskriptive Statistik. Konstanz: UVK Verl.-Ges; 2015. UTB; 3969. [18th December 2018].
- Persistence in visual feedback control by the elderly. Exp Brain Res. 1998; 119: 467-474.
- Augmented visual, auditory, haptic, and multimodal feedback in motor learning: a review. Psychon Bull Rev. 2013; 20: 21-53.
- Motion detection supported exercise therapy in musculoskeletal disorders: A systematic review. Eur J Phys Rehabil Med. 2018; 54: 591-604.
- Evaluation of Pose Tracking Accuracy in the First and Second Generations of Microsoft Kinect. International Conference on Healthcare Informatics (ICHI). 2015: 380-389.
- The validity of the first and second generation Microsoft Kinect™ for identifying joint center locations during static postures. Appl Ergon. 2015; 49: 47-54.
Technische Universität Chemnitz
Fakultät für Human- und Sozialwissenschaften,
Institut für Angewandte
Thüringer Weg 11, 09126 Chemnitz