MBA Knowledge Base

Business • Management • Technology

Literature Review – Employee Training and Development


Human resources are considered by many to be the most important asset of an organization, yet very few employers are able to harness the full potential of their employees (Radcliffe, 2005). Human resource is a productive resource consisting of the talents and skills of human beings that contribute to the production of goods and services (Kelly, 2001). Lado and Wilson (1994) define a human resource system as a set of distinct but interrelated activities, functions, and processes that are directed at attracting, developing, and maintaining a firm’s human resources. According to Gomez-Mejia, Balkin and Cardy (2008), human resource management is the process of ensuring that the organization has the right kind of people in the right places at the right time. The objective of Human Resources is to maximize the return on investment from the organization’s human capital and minimize financial risk. It is the responsibility of human resource managers to conduct these activities in an effective, legal, fair, and consistent manner (Huselid, 1995).

Employee Training and Development

Training and development is a subsystem of an organization whose name combines two distinct yet interdependent terms: training and development. Training is often interpreted as an activity in which an expert and a learner work together to transfer information effectively from the expert to the learner (to enhance the learner’s knowledge, attitudes or skills) so that the learner can better perform a current task or job. Training activity is both focused upon, and evaluated against, the job that an individual currently holds (Learner, 1986). Development, on the other hand, is often viewed as a broad, ongoing, multi-faceted set of activities (training activities among them) intended to bring someone or an organization up to another threshold of performance. Development often includes a wide variety of methods, e.g., orientation to a role, training in a wide variety of areas, ongoing training on the job, coaching, mentoring and forms of self-development. Some view development as a life-long goal and experience. Development focuses upon the activities that the organization employing the individual, or that the individual is part of, may partake in in the future, and is almost impossible to evaluate (Nadler, 1984).

Training and development ensures that randomness is reduced and that learning or behavioral change takes place in a structured format. In human resource management, training and development is the field concerned with organizational activity aimed at bettering the performance of individuals and groups in organizational settings. It has been known by several names, including employee development, human resource development, and learning and development (Harrison, 2005).

As the generator of new knowledge, employee training and development is placed within a broader strategic context of human resources management, i.e. global organizational management, as planned staff education and development, both individual and group, with the goal of benefiting both the organization and its employees. To preserve its position and increase its competitive advantage, the organization needs to be able to create new knowledge rather than rely solely on the utilization of existing knowledge (Vemic, 2007). Thus, continuous employee training and development has a significant role in the development of individual and organizational performance. The strategic procedure of employee training and development needs to encourage creativity, ensure inventiveness and shape the entire organizational knowledge that provides the organization with uniqueness and differentiates it from the others.

The Value of Training and Development

According to Beardwell & Holden (1997), human resource management has emerged as a set of prescriptions for managing people at work. Its central claim is that by matching the size and skills of the workforce to the productive requirements of the organization, and by raising the quality of individual employee contributions to production, organizations can make significant improvements in their performance.

The environment of an organization refers to the sum total of the factors or variables that may influence the present and future survival of an organization (Armstrong, 1998). The factors may be internal or external to the organization. Cascio (1995) uses the term societal environment for the varying trends and general forces that do not relate directly to the company but could affect it indirectly at some point in time. Four groups of these forces are identified: economic, technological, legal and political, and socio-cultural and demographic. The second type of environment is the task environment, which comprises elements directly influencing the operations and strategy of the organization. These may include the labour market, trade unions, competition, and product markets comprising customers, suppliers and creditors. The task environment elements are directly linked to the company and are influenced by the societal environment.

However, variables in the task, competitive or operative environment, as it is variously called, affect organizations in a specific industry, and it is possible to control them to some extent. As such, environmental change, whether remote or task, disrupts the equilibrium that exists between the organization’s strategy and structure, necessitating adjustment to change. Pfeffer (1998) proposes that there is evidence demonstrating that effectively managed people can produce substantially enhanced economic performance. Pfeffer extracted from various studies, related literature, and personal observation and experience a set of seven dimensions that seem to characterize most if not all of the systems producing profits through people. He named them the seven practices of successful organizations: employment security; selective hiring of new personnel; self-managed teams and decentralization of decision making as the basic principles of organizational design; comparatively high compensation contingent on organizational performance; extensive training; reduced status distinctions and barriers, including dress, language, office arrangements, and wage differences across levels; and extensive sharing of financial and performance information throughout the organization.

Effect of Training and Development on Employee Productivity

McGhee (1997) stated that an organization should commit its resources to a training activity only if, in the best judgment of managers, the training can be expected to achieve some results other than modifying employee behavior. It must support some organizational goals, such as more efficient production or distribution of goods and services, reduced operating costs, improved quality or more efficient personal relations. That is, the modification of employee behavior effected through training should be aimed at supporting organizational objectives.

Effect of Training and Development on Employee Motivation

Motivation is concerned with the factors that influence people to behave in certain ways. Arnold et al. (1991) list its components as direction (what a person is trying to do), effort (how hard a person is trying) and persistence (how long a person keeps on trying). Motivating other people is about getting them to move in the direction you want them to go in order to achieve a result. Well-motivated people are those with clearly defined goals who take action that they expect will achieve those goals. Motivation at work can take place in two ways. First, people can motivate themselves by seeking, finding and carrying out work that satisfies their needs or at least leads them to expect that their goals will be achieved. Secondly, management can motivate people through such methods as pay, promotion, praise and training (Synderman 1957). The organization as a whole can provide the context within which high levels of motivation can be achieved by training employees in areas of their job performance.

Effect of Training and Development on Competitive Advantage

Competitive advantage is the essence of competitive strategy. It encompasses those capabilities, resources, relationships, and decisions that permit an organization to capitalize on opportunities in the marketplace and to avoid threats to its desired position (Lengnick-Hall 1990). Boxall and Purcell (1992) suggest that ‘human resource advantage can be traced to better people employed in organizations with better processes.’ This echoes the resource-based view of the firm, which states that ‘distinctive human resource practices help to create the unique competences that determine how firms compete’ (Capelli and Crocker-Hefter, 1996). Intellectual capital is the source of competitive advantage for organizations. The challenge is to ensure that firms have the ability to find, assimilate, compensate, and retain human capital in the shape of talented individuals who can drive a global organization that is responsive both to its customers and to ‘the burgeoning opportunities of technology’ (Armstrong, 2005).

Effect of Training and Development on Customer Relations

William Edwards Deming, one of the quality gurus, defines quality as a predictable degree of uniformity and dependability at low cost and suited to the market. He advises that an organization should focus on improving the process, because the system rather than the work is the cause of production variation (Gale 1994). Many service organizations have embraced this approach to quality assurance by checking the systems and processes used to deliver the end product to the consumer. Essentially, this covers: pre-sale activities, which encompass the advice and guidance given to a prospective client; customer communications (how well customers are informed of the products and services, whether any consultancy services are provided to help customers assess their needs, and whether a help line is available for ease of access to information on products); the speed of handling a client’s transactions and processing claims; the speed of handling customers’ calls and the number of calls abandoned or not answered; and, at the point of sale, matters a customer would want to know about, such as the opening hours of the organization and the convenience of its location (Gale 1994). This is only possible when employees are well trained and developed to ensure that these standards are sustained.

  • Armstrong, M. (1998). Human Resource Management: Strategy and Action. Boston: Irwin.
  • Betcherman, G., McMullen, K. and Davidman, K. (1998). Training for the New Economy: A Synthesis Report. Ottawa: Canadian Policy Research Network, pp. 117.
  • Cascio, W. F. (1995). Whither industrial and organizational psychology in a changing world of work? American Psychologist, 50, 928-939.
  • Harrison, R. (2005). Learning and Development. CIPD Publishing, pp. 5.
  • Huselid, M. A. (1995). The impact of human resource management practices on turnover, productivity and corporate financial performance. Academy of Management Journal, 38(3), 635-672.
  • Kelly, D. (2001). Dual Perceptions of HRD: Issues for Policy: SME’s, Other Constituencies, and the Contested Definitions of Human Resource Development.
  • Lado, A. and Wilson, M. (1994). Human resource systems and sustained competitive advantage: A competency-based perspective. Academy of Management Review, 19(4), 699-727.
  • Learner, R. (1986). Concepts and Theories of Human Development (2nd ed.). New York: Random House.
  • Nadler, L. (1984). The Handbook of Human Resource Development (Glossary). New York: John Wiley & Sons.
  • Pfeffer, J. (1998). The Human Equation: Building Profits by Putting People First. Boston: HBS Press.
  • Tessema, M. and Soeters, J. (2006). Challenges and prospects of HRM in developing countries: testing the HRM-performance link in Eritrean civil service. International Journal of Human Resource Management, 17(1), 86-105.


Training and Development

by Kenneth G. Brown

Last reviewed: 13 July 2020. Last modified: 26 October 2015. DOI: 10.1093/obo/9780199846740-0013

Training and development is the study of how structured experiences help employees gain work-related knowledge, skill, and attitudes. It is like many other topics in management in that it is inherently multidisciplinary in nature. At its core is the psychological study of learning and transfer. A variety of disciplines offer insights into this topic, including, but not limited to, industrial and organizational psychology, educational psychology, human resource development, organizational development, industrial and labor relations, strategic management, and labor economics. The focus of this bibliography is primarily psychological with an emphasis on theory and practice that examines training processes and the learning outcomes they seek to influence. Nevertheless, literature from other perspectives will be introduced on a variety of topics within this area of study.

These articles and chapters provide background for the study of training and development, particularly as studied by management scholars with backgrounds in human resource management, organizational behavior, human resource development, and industrial and organizational psychology. Kraiger 2003 examines training from three different perspectives. Aguinis and Kraiger 2009 provides a narrative review of ten years of research on training and employee development, focusing on the many benefits of providing structured learning experiences to employees. Brown and Sitzmann 2011 also reviews the literature and emphasizes research on the processes that are required to ensure that training benefits emerge. Arthur, et al. 2003 meta-analyzes the literature on training effectiveness. Russ-Eft 2002 proposes a typology of training designs. Salas, et al. 2012 offers recommendations for evidence-based training practice. Noe, et al. 2014 examines training in a broader context, relative to the roles of informal learning and knowledge transfer.

Aguinis, Herman, and Kurt Kraiger. “Benefits of Training and Development for Individuals and Teams, Organizations, and Society.” Annual Review of Psychology 60.1 (January 2009): 451–474.

DOI: 10.1146/annurev.psych.60.110707.163505

A comprehensive review of training and development literature from 1999 to 2009 with an emphasis on the benefits that training offers across multiple levels of analysis.

Arthur, Winfred A., Jr., Winston Bennett Jr., Pamela S. Edens, and Suzanne T. Bell. “Effectiveness of Training in Organizations: A Meta-analysis of Design and Evaluation Features.” Journal of Applied Psychology 88.2 (April 2003): 234–245.

DOI: 10.1037/0021-9010.88.2.234

Offers a comprehensive meta-analysis of the relationships among training design and evaluation features and various training effectiveness outcomes (reaction, learning, behavior, and results).

Brown, Kenneth G., and Traci Sitzmann. “Training and Employee Development for Improved Performance.” In APA Handbook of Industrial and Organizational Psychology. Vol. 2, Selecting and Developing Members for the Organization. Edited by Sheldon Zedeck, 469–503. Washington, DC: American Psychological Association, 2011.

DOI: 10.1037/12170-000

A comprehensive review of training and development in work organizations with an emphasis on the processes necessary for training to be effective for improving individual and team performance.

Kraiger, Kurt. “Perspectives on Training and Development.” In Handbook of Psychology. Vol. 12. Edited by Irving B. Weiner, Walter C. Borman, Daniel R. Ilgen, and Richard J. Klimoski, 171–192. Hoboken, NJ: John Wiley, 2003.

DOI: 10.1002/0471264385

Reviews training literature from three perspectives: instruction, learning, and organizational change.

Noe, Raymond A., Alena D. M. Clarke, and Howard J. Klein. “Learning in the Twenty-first-century Workplace.” Annual Review of Organizational Psychology and Organizational Behavior 1 (2014): 245–275.

DOI: 10.1146/annurev-orgpsych-031413-091321

A review that places training and development in a broader context with other learning-related interventions and practices such as informal learning and knowledge sharing. The chapter explains factors that facilitate learning in organizations.

Russ-Eft, Darlene. “A Typology of Training Design and Work Environment Factors Affecting Workplace Learning and Transfer.” Human Resource Development Review 1 (March 2002): 45–65.

DOI: 10.1177/1534484302011003

Presents a typology summarizing elements of training and work environments that foster transfer of training.

Salas, Eduardo, Scott I. Tannenbaum, Kurt Kraiger, and Kimberly A. Smith-Jentsch. “The Science of Training and Development in Organizations: What Matters in Practice.” Psychological Science in the Public Interest 13.2 (2012): 74–101.

DOI: 10.1177/1529100612436661

Reviews meta-analytic evidence and offers evidence-based recommendations for maximizing training effectiveness.

  • Open access
  • Published: 03 November 2023

Automatic literature screening using the PAJO deep-learning model for clinical practice guidelines

Yucong Lin, Jia Li, Huan Xiao, Lujie Zheng, Ying Xiao, Hong Song, Jingfan Fan, Deqiang Xiao, Danni Ai, Tianyu Fu, Feifei Wang, Han Lv & Jian Yang

BMC Medical Informatics and Decision Making, volume 23, Article number: 247 (2023)

Clinical practice guidelines (CPGs) are designed to assist doctors in clinical decision making. High-quality research articles are important for the development of good CPGs. Commonly used manual screening processes are time-consuming and labor-intensive. Artificial intelligence (AI)-based techniques have been widely used to analyze unstructured data, including texts and images. Currently, there are no effective/efficient AI-based systems for screening literature. Therefore, developing an effective method for automatic literature screening can provide significant advantages.

Using advanced AI techniques, we propose the Paper title, Abstract, and Journal (PAJO) model, which treats article screening as a classification problem. For training, articles appearing in the current CPGs are treated as positive samples. The others are treated as negative samples. Then, the features of the texts (e.g., titles and abstracts) and journal characteristics are fully utilized by the PAJO model using the pretrained bidirectional-encoder-representations-from-transformers (BERT) model. The resulting text and journal encoders, along with the attention mechanism, are integrated in the PAJO model to complete the task.

We collected 89,940 articles from PubMed to construct a dataset related to neck pain. Extensive experiments show that the PAJO model surpasses the state-of-the-art baseline by 1.91% (F1 score) and 2.25% (area under the receiver operating characteristic curve). Its prediction performance was also evaluated with respect to subject-matter experts, proving that PAJO can successfully screen high-quality articles.
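The two metrics reported above can be illustrated with a minimal sketch. It assumes binary labels (1 for a guideline-worthy article, 0 otherwise) and a model that outputs a relevance score per article; the function names and the toy data are illustrative and are not taken from the paper.

```python
def f1_score(y_true, y_pred):
    """F1: harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): three relevant and three irrelevant articles.
    y_true = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
    print(f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

The F1 score depends on the chosen decision threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.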


The PAJO model provides an effective solution for automatic literature screening. It can screen high-quality articles on neck pain and significantly improve the efficiency of CPG development. The methodology of PAJO can also be easily extended to other diseases for literature screening.

Clinical practice guidelines (CPGs) are curated collections of the best practices used to guide, optimize, and establish norms for clinical practice and are thus essential to clinicians, administrators, the public, and program managers [1]. CPGs are built using materials with quality evidence [2], which implies that clear, explicit, and unbiased information is selected. Hence, CPGs require frequent systematic reviews to ensure their curation and reduce the risk of medical malpractice.

Scholarly published articles are the key source of critical evidence that feeds CPGs, leading to a regular need for screening the most recent evidence based on research topics. However, the number of articles is growing exponentially; over 120 million papers are reported to have been published so far [3]. This large volume of papers brings enormous challenges for curation. Besides, curator selection is restricted to top field experts, making curation scheduling a tough, time-consuming task. Undesirable selection bias and human mistakes also occur. In this regard, an automated curating tool for reviewing and assessing the quality of domain-related publications could be of use to CPG creators.

We first assume that there are clearly identifiable features that delineate high-quality articles from the rest. Neural networks have made vast improvements in the identification and assessment of text features. Several have already been adapted for medical text analyses. Advanced natural language processing (NLP) methods are now being used in many fields for literature screening.

Presently, traditional classifiers, such as random forest and support vector machine (SVM) models, are effectively applied to simple medical text-processing tasks. For example, a term frequency–inverse document frequency (TF-IDF) feature extraction technique was developed with a naïve Bayes classifier that automatically screens for medical guidelines [4]. An SVM classifier was used to screen medical articles [5]. A deep learning method was also applied and proved more effective than the traditional TF-IDF feature engineering strategy [6]. An ensemble method based on classical machine learning and deep learning approaches was further adopted, which improved on the performance of the single best model on small datasets [7]. These traditional models facilitate comprehensive information mining by ranking features of the texts, leading to interpretable results.
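As a rough illustration of the TF-IDF weighting mentioned above, the following sketch computes one common variant (relative term frequency multiplied by the logarithm of the inverse document frequency). The exact variant used in the cited studies is not stated here, so the formula and the function name `tf_idf` are assumptions for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for tokenized documents.

    docs: list of token lists. Returns one {term: weight} dict per document,
    using tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Toy corpus (hypothetical), already lower-cased and tokenized.
    corpus = [
        ["neck", "pain", "trial"],
        ["neck", "pain", "review"],
        ["cardiac", "trial"],
    ]
    for doc_weights in tf_idf(corpus):
        print(doc_weights)
```

Terms that occur in every document receive a weight of zero under this variant, which is what makes the resulting feature ranking easy to interpret: only terms that discriminate between documents carry weight.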

With the development of deep-learning techniques, more complex and advanced methods are now available, improving the performance of text mining [8]. An attention-based convolutional neural network (CNN) was adopted for medical code prediction [9]; this first aggregated information from a document using a CNN, and it then applied an attention mechanism to select the most relevant segments, making accurate selections from thousands of possible codes. The CNN and long short-term memory (LSTM) models were further combined, where the CNN was used to extract word-level semantic features, and the LSTM was used to extract timing characteristics [10]. A composite index test algorithm for literature screening was proposed in [11]. Various bidirectional-encoder-representations-from-transformers (BERT) methods have been adapted for text processing problems, including “A Lite” BERT [12], “scientific” BERT [13], and “biomedical” BERT [14]. For example, Moen et al. [15] combined the prediction results of eight models, including a BERT and a bidirectional LSTM (BiLSTM), to determine an article’s relevancy. A Knowledge Language Model (K-LM) was developed for knowledge injection based on Generative Pre-trained Transformer 2 (GPT-2) and BERT, which improved the performance relative to classical machine learning methods [16]. As demonstrated by numerous experiments, BERT models do an excellent job of “understanding” text following sufficient model training. Additionally, they can be flexibly combined with ancillary network structures, depending on the task at hand. Compared with traditional deep-learning NLP models, BERT models generally achieve the best performance.
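The idea of using attention to select the most relevant segments can be sketched in a few lines. This is a generic dot-product attention pooling written in plain Python under simplifying assumptions (fixed segment vectors and a single query vector); it is not the architecture of the cited work or of the PAJO model, and the names `softmax` and `attention_pool` are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segments, query):
    """Weight segment vectors by their similarity to a query vector.

    segments: list of equal-length float vectors (e.g. CNN features per segment).
    query: float vector of the same length (e.g. a learned label embedding).
    Returns (weights, pooled), where pooled is the attention-weighted sum.
    """
    # Dot-product relevance score of each segment with respect to the query.
    scores = [sum(q * x for q, x in zip(query, seg)) for seg in segments]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * seg[i] for w, seg in zip(weights, segments))
        for i in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    # Two toy segment vectors; the query is aligned with the first one.
    weights, pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
    print(weights, pooled)
```

In a trained network the segment and query vectors are learned, so the weights come to highlight the passages that matter most for the prediction.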

This study seeks to provide the capability to quickly locate and classify high-quality medical studies. As a starting point, we focus the scope on the diagnosis and treatment of neck pain. To this end, we construct a dataset of candidate articles from PubMed. Those cited by the extant CPGs, as well as by systematic scholarly reviews, are regarded as positive samples for model training; all others are treated as negative. Using this dataset, we formulate a binary text classification problem for a BERT-based NLP model. Various attributes from the textual information found in the articles, alongside selected journal characteristics, are used for feature extraction and analysis. The resulting novel Paper title, Abstract, and JOurnal (PAJO) model, which is based on the pretrained PubMedBERT model, was applied to neck-pain medical article screening for generating CPGs. Compared with the best baseline, the Text-based Recurrent Convolutional Neural Network (TextRCNN), the PAJO model achieves an improvement of 1.91% in the F1-score and 2.25% in the area under the receiver operating characteristic curve. The contributions of our study are as follows:

We developed a novel PubMedBERT-based PAJO deep-learning neural network, which mines the textual information of articles and the characteristics of their journals to extract features for CPG article screening.

We designed a general framework for automatic medical literature screening that includes data collection, feature extraction, model-building, and performance evaluation.

Taking neck-pain as a case study, we demonstrated that the proposed PAJO model can accurately screen a greater number of high-quality neck-pain articles for curating the related CPGs, when compared to several state-of-the-art methods.

The remainder of this paper is organized as follows. The Methods section explains the PAJO methodology. The Results section presents our experimental results compared with several state-of-the-art methods, and the Discussion section presents our ablation study. Finally, the Conclusions section concludes this paper.

Dataset construction

Dataset construction is the first step in deep-learning model training. To empower our model to automatically screen high-quality, scientifically rigorous articles related to neck pain, we queried the PubMed database in Dec. 2021, which stores more than 20 million biomedical articles. PubMed’s MeSH tool is a powerful query method that allows researchers to search for various combinations of keywords and phrases expressed as Boolean relationships. Our “neck pain” query resulted in 41 entry terms, as shown in Fig. 1.

Fig. 1. MeSH search terms related to “neck pain.” The left panel displays the retrieval interface and search term of “neck pain,” and the right panel lists the entries found, which can be selected for further screening

These 41 terms were regarded as our initial set of keywords and phrases related to “neck pain.” To reduce the size of this set, we conducted a literature survey on neck pain and found the most used keywords. We verified our selection with experts from the field. We then narrowed the keyword and phrase combinations to “back pain,” “pain back,” “neck pain,” “pain neck,” and “cervical pain.” Using these five entries, we undertook our secondary search.

We used MeSH to perform a fuzzy search on the entire PubMed database, with our five keywords and phrases targeting the title, abstract, and publisher fields. We then matched the five key phrases with the collection of titles and abstracts using a fuzzy retrieval strategy: an article meets our matching rule if the two words comprising a key phrase are separated by no more than three additional words. For example, the title “Consensus practice guidelines on interventions for cervical spine joint pain from a multispecialty international working group” contains the words “cervical” and “pain,” separated by two other words, “spine joint.” This meets the “cervical pain” keyword and phrase criteria. Hence, this title is included in the final dataset.
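The matching rule can be illustrated with a small sketch. It is not the authors' retrieval code: the tokenization (lowercased alphabetic words), the requirement that the two words appear in the order given by the key phrase, and the names `tokenize`, `phrase_matches`, and `matches_any` are all assumptions made for illustration.

```python
import re

# The five key phrases listed in the text.
KEY_PHRASES = ["back pain", "pain back", "neck pain", "pain neck", "cervical pain"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (assumed tokenization)."""
    return re.findall(r"[a-z]+", text.lower())


def phrase_matches(text, phrase, max_gap=3):
    """Return True if the two words of `phrase` occur in order in `text`
    with at most `max_gap` other words between them."""
    first, second = phrase.lower().split()
    tokens = tokenize(text)
    for i, token in enumerate(tokens):
        if token != first:
            continue
        # The second word may sit 1 .. max_gap + 1 positions after the first.
        window = tokens[i + 1 : i + 2 + max_gap]
        if second in window:
            return True
    return False


def matches_any(text, phrases=KEY_PHRASES, max_gap=3):
    """Return True if any key phrase matches the text under the gap rule."""
    return any(phrase_matches(text, p, max_gap) for p in phrases)


title = ("Consensus practice guidelines on interventions for cervical spine "
         "joint pain from a multispecialty international working group")
print(matches_any(title))
```

In the example title, “cervical” and “pain” are two words apart, so the “cervical pain” phrase matches; a gap of four or more words would not.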

Using this method, 89,940 articles were retained. Data preprocessing was then performed as follows. First, the duplicate articles were removed. Second, noting that the publishing journal identification provides valuable information about the article’s quality, articles lacking the associated journal information were discarded. A total of 27,406 articles remained to comprise Set A (the complete collection). From this set, 1,005 articles were cited in the existing CPGs and systematic reviews for neck pain; thus, they were deemed the most important for our task. These articles comprised Set B (positive samples). The remainder (26,401) was regarded as Set C (negative samples). Obviously, \(\text{A}=\text{B}\cup \text{C}\) . Hence, in this study, we used Sets B and C for positive and negative model training, respectively. The specific dataset construction process is illustrated in Fig. 2.

Fig. 2. Dataset construction process. We first applied our keyword and phrase retrieval matching rule to all PubMed articles. We then performed deduplication and removed records for which the associated journal information was unavailable. The resulting Set A is the complete collection of 27,406 samples. Set B comprises the 1,005 articles that were cited in the CPGs and systematic reviews (positive samples), and Set C contains the remaining 26,401 negative samples

Handling category imbalances

Given that our ratio of positive to negative samples was \(1:28.3\approx 0.035\) , we faced an extremely imbalanced case that would hinder model training. Hence, we applied a focal loss function to train the binary classifier on the imbalanced data [ 17 ]. For each sample i , \({Y}_{i}=1\) if it is positive; otherwise, \({Y}_{i}=0\) (negative). Furthermore, \({p}_{i}=P({Y}_{i}=1)\) is the probability with which a classification model predicts the sample to be positive. Let \({\widetilde{p}}_{i}={p}_{i}\) if \({Y}_{i}=1\) , and \({\widetilde{p}}_{i}=1-{p}_{i}\) if \({Y}_{i}=0\) . For sample i , the focal loss is defined as follows:

\[{FL}_{i}=-\tilde{\alpha }{\left(1-{\widetilde{p}}_{i}\right)}^{\gamma }\text{log}\left({\widetilde{p}}_{i}\right).\]

Here, two hyperparameters were used in the focal loss. Hyperparameter \(\gamma\) is the exponent of the modulating factor, which is usually a positive integer. It reduces the contributions of easily separable samples and thereby increases the relative weight of hard-to-separate samples. Hyperparameter \({\tilde\alpha }={\upalpha }\) if \({Y}_{i}=1\) , and \({\tilde\alpha }=1-{\upalpha }\) if \({Y}_{i}=0\) , giving us a weighting factor in [0, 1], which is used to adjust the ratio between positive and negative sample losses. To find the appropriate values of \({\upgamma }\) and \({\upalpha }\) , we first set a range for each hyperparameter based on our preliminary analysis and then used grid search to find the optimal values. The best hyperparameter values were \({\upgamma }=2\) and \({\upalpha }=0.8\) . We also applied downsampling, a widely used technique for balancing sample sizes, to Set C (negative) to further balance the data for subsequent model training, validation, and testing. Because Set C has a much larger sample size than Set B, we selected a subset of Set C: we assigned each negative sample an equal sampling probability and then randomly selected 2,345 samples from Set C by sampling without replacement. The final ratio of positive to negative samples was 3:7.
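The two balancing steps can be sketched as follows. This is a minimal illustration, not the authors' training code: it uses the standard per-sample focal loss of Lin et al. [ 17 ] with the reported values of \(\gamma = 2\) and \(\alpha = 0.8\) as defaults, and the clipping constant `eps`, the random seed, and the integer stand-ins for the negative articles are assumptions.

```python
import math
import random


def focal_loss(p, y, gamma=2.0, alpha=0.8, eps=1e-12):
    """Focal loss for one sample; p is the predicted P(Y=1), y is 0 or 1."""
    p_tilde = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_tilde = alpha if y == 1 else 1.0 - alpha
    p_tilde = min(max(p_tilde, eps), 1.0)         # guard against log(0) (assumed safeguard)
    return -alpha_tilde * (1.0 - p_tilde) ** gamma * math.log(p_tilde)


def downsample(negatives, k, seed=0):
    """Draw k negatives uniformly at random, without replacement."""
    return random.Random(seed).sample(negatives, k)


# A confidently correct positive contributes far less loss than a poorly predicted one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.30, 1)

# Set C has 26,401 negatives; 2,345 are kept against the 1,005 positives of Set B.
negatives = list(range(26401))                    # placeholder identifiers
kept = downsample(negatives, 2345)
positive_share = 1005 / (1005 + len(kept))        # 1,005 : 2,345 corresponds to 3 : 7
```

With \(\gamma = 0\) the modulating factor disappears and the expression reduces to a class-weighted cross-entropy, which is a convenient sanity check.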

Feature extraction

When contemplating feature extraction, we faced the prospect of creating a deep-learning model from scratch, including selecting and testing its subcomponents. Fortunately, because our research field is closely related to biomedicine, we noted that a pretrained PubMedBERT NLP model already exists in that field [ 18 ]. Unlike other BERT-type models, which are trained on millions of articles covering a wide range of topics, PubMedBERT is pre-trained from scratch on biomedical research literature. By fine-tuning PubMedBERT on the information we collected from PubMed, we could directly obtain relevant predictions for “neck pain.”

To identify journal features, we leveraged the following pre-existing attributes:

Journal Impact Factor (IF) [ 19 ]. This feature reflects the “influence” of academic journals in terms of their average annual citations in new articles. To account for IF value fluctuations over time, we considered them annually from 2015 to 2021.

CiteScore (CS) [ 20 ]. This feature reflects an Elsevier metric launched in 2016 that conveys the annual citations per article per journal compiled from the Scopus database over the previous four years.

Scientific Journal Ranking (SJR) [ 21 ]. This feature reflects both the number and quality of citations and weighs those of prestigious journals.

Source-Normalized Impact per Paper (SNIP) [ 22 ]. This feature reflects another Elsevier measurement issued in 2012 that uses the Scopus database. It is the reference weight based on the total number of citations in a subject area. Therefore, a citation is assigned a higher value if it is cited in disciplines outside its domain. SNIP also corrects for differences in journal citation behaviors in different subject areas.

The Science Citation Index or Journal Citation Reports Divisions of the Chinese Academy of Sciences (CAS). This feature reflects 14 major disciplines, and in each, journals are ranked according to their impact factors: Zone 1 (top 5%), Zone 2 (top 5–20%), Zone 3 (top 20–50%), and Zone 4 (the remainder).

The H-Index [ 23 ]. This feature reflects the productivity and impact of a researcher or journal and is calculated based on the number of articles published by a journal and the number of times an article is cited. A journal with n articles cited at least n times each has an H-index of n .
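As a worked example of the last attribute, the H-index is the largest n for which n articles have at least n citations each. The sketch below computes it from a list of per-article citation counts; the function name `h_index` and the sample counts are illustrative and not taken from the paper.

```python
def h_index(citations):
    """Largest n such that at least n articles have >= n citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank          # the top `rank` articles all have >= rank citations
        else:
            break
    return h


# Hypothetical journal with five articles.
example = h_index([10, 8, 5, 4, 3])
```

For the hypothetical counts above, four articles have at least four citations each, but there are not five with at least five, so the H-index is 4.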

The PAJO classification model

Our PAJO model has three modules: an in-sample text feature encoder that converts title and abstract strings to embedding vectors, an attention encoder that converts the text feature vectors of single samples into weighted inter-sample representations, and a journal feature encoder that extracts the journal features listed in the Feature extraction section. PAJO’s network architecture is illustrated in Fig. 3.

Fig. 3. PAJO network architecture. Each article’s raw title and abstract are fed into the PubMedBERT text encoder for conversion to embedding vectors. The vectors are passed to an attention encoder to obtain weighted inter-sample representations. The original journal features are normalized and passed to a feed-forward layer with a rectified linear unit (ReLU) activation function. The obtained text and journal features are concatenated to obtain the overall feature representation of an article. Finally, the feature representation is passed to a feed-forward layer with a SoftMax function to predict the article’s label

The text encoder module focuses on in-sample text feature representations. For each sample, i , we use the PubMedBERT tokenizer to tokenize the title’s text as sequence \({T}_{i}\) . Similarly, the abstract is tokenized as \({S}_{i}\) . Subsequently, both \({T}_{i}\) and \({S}_{i}\) are fed to PubMedBERT for encoding, where an attention mechanism allows the embedding vector corresponding to each word to incorporate the information of all words in the text. In the last hidden layer, the embedding associated with the reserved [CLS] token is used for downstream classification tasks [ 24 ]. We define \({r}_{i}\) as the in-sample vector representation of \({T}_{i}\) and \({n}_{i}\) as the in-sample vector representation of \({S}_{i}\) , which are represented as follows:

\[{r}_{i}=\text{PubMedBERT}\left({T}_{i}\right),\quad {n}_{i}=\text{PubMedBERT}\left({S}_{i}\right).\]

We define d as the embedding size, where \({r}_{i}, {n}_{i}\in {\text{R}}^{d}.\) The word embedding and encoding layers in PubMedBERT are fine-tuned on our dataset during model training. Note that \({r}_{i}\) and \({n}_{i}\) are associated with [CLS] tokens and contain information about the entire text.

The second PAJO module is an attention encoder that focuses on inter-sample text feature representations. The title vector from each article is recoded as a weighted title vector between articles using the attention mechanism, which forces the model to learn from the subtle gaps in title and abstract representations across multiple articles. Define the batch size to be s , which is the number of samples analyzed in each training step. Then \(r\;=\;\left[r_1,\;r_2,\;\dots,\;r_s\right]^{\mathrm T}\) denotes the in-sample title vector representations of the entire batch. We denote by \({\text{W}}^{\text{Q}},{\text{W}}^{\text{K}},{\text{W}}^{\text{V}}\in {\text{R}}^{d\times d}\) the trainable query, key, and value projection matrices, respectively. We multiply r by these matrices to obtain Q, K, and V, respectively. Hence, \(\text{Q} =\text{r}{\text{W}}^{\text{Q}}\) , \(\text{K} =\text{r}{\text{W}}^{\text{K}}\) , and \(\text{V} =\text{r}{\text{W}}^{\text{V}}\) , with \(\text{Q},\text{K},\text{V}\in {\text{R}}^{s\times d}\) . We define \({\alpha }_{i}\) as the computed weight vector of Sample i , which expresses it in terms of the in-sample text vectors of the same batch. The title text feature, \({\text{R}}_{i}\) , is formulated as follows:

\[{\alpha }_{i}=\text{SoftMax}\left(\frac{{\text{Q}}_{i}{\text{K}}^{\text{T}}}{\sqrt{d}}\right),\quad {\text{R}}_{i}={\alpha }_{i}\text{V},\]

where \({\text{Q}}_{i}\) is the i-th row of Q. The inter-sample abstract text feature is obtained analogously from the abstract vectors \({n}_{i}\) .

Ideally, when applying the attention mechanism to calculate the weights of vectors between samples, every sample in the dataset should be considered. However, owing to limited computing resources, we created sample sets from each batch. Thus, all text vectors are weighted in the same batch to obtain the inter-sample representation of each vector.
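The batch-level weighting can be sketched in plain Python. This is an illustration rather than the PAJO implementation: it assumes standard scaled dot-product attention (including the division by \(\sqrt{d}\), which is an assumption here), represents matrices as lists of rows, and uses identity projection matrices and a toy batch purely for demonstration. The function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def batch_attention(r, w_q, w_k, w_v):
    """Re-encode each row of r (s x d) as a weighted mix of all rows in the same batch."""
    d = len(r[0])
    q, k, v = matmul(r, w_q), matmul(r, w_k), matmul(r, w_v)
    k_t = [list(col) for col in zip(*k)]                       # transpose of K (d x s)
    scores = matmul(q, k_t)                                    # s x s similarity scores
    alpha = [softmax([x / math.sqrt(d) for x in row]) for row in scores]
    return matmul(alpha, v), alpha                             # (s x d features, s x s weights)


# Toy batch of s = 3 title vectors with embedding size d = 2.
r = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
eye = [[1.0, 0.0], [0.0, 1.0]]                                 # identity projections (demo only)
R, alpha = batch_attention(r, eye, eye, eye)
```

Each row of `alpha` sums to one, so every output row is a convex combination of the projected vectors in the batch; this is why the representation of a sample depends on which other samples share its batch.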

The third PAJO module is the journal feature encoder, with which the categorical features based on the journal characteristics are transformed into dummy variables. Continuous features are normalized, and all categorical and continuous features are concatenated to obtain the full journal feature vector, \({j}_{i}\) for sample i . This is then submitted to a feed-forward layer for the linear combination of different journal features. Finally, the output features are fed into the ReLU activation function, which is commonly used in deep neural networks.
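A minimal sketch of this encoder is shown below. The specific choices are assumptions for illustration only: z-score normalization (the text says only that continuous features are normalized), a single continuous feature (IF) alongside the four-zone categorical feature, and hand-picked toy weights in place of trained parameters.

```python
def one_hot(value, categories):
    """Encode a categorical value as dummy variables."""
    return [1.0 if value == c else 0.0 for c in categories]


def z_scores(values):
    """Normalize a continuous feature to zero mean and unit variance (assumed scheme)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0                      # avoid division by zero for constant features
    return [(v - mean) / std for v in values]


def linear(x, weights, bias):
    """Feed-forward layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


ZONES = [1, 2, 3, 4]
# Hypothetical journals: a division zone plus an impact factor.
journals = [{"zone": 1, "if": 12.0}, {"zone": 3, "if": 2.0}, {"zone": 4, "if": 1.0}]

if_norm = z_scores([j["if"] for j in journals])
# Concatenate dummy variables and normalized continuous features into j_i.
j_vectors = [one_hot(j["zone"], ZONES) + [z] for j, z in zip(journals, if_norm)]

weights = [[0.5, -0.5, 0.0, 0.0, 1.0], [0.0, 0.0, 0.3, 0.3, -1.0]]   # toy parameters
bias = [0.0, 0.1]
J = [relu(linear(x, weights, bias)) for x in j_vectors]              # enhanced journal features
```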

For sample i , we then obtain its intersample title text feature \({R}_{i}\) , intersample abstract text feature \({ N}_{i}\) , and enhanced journal feature \({J}_{i}\) . By concatenating these, we obtain the overall feature, \({X}_{i}\) , for final classification, where \({X}_{i} = concat({R}_{i},{N}_{i},{J}_{i})\) . For a batch with s samples, \(X\;=\;\left[X_1,\;X_2,\;\dots,\;X_s\right]\) , and its predicted labels are \(\widehat{Y}\) . Thus, in the fully connected layer, we have

\[\widehat{Y}=\text{SoftMax}\left(XW\right),\]

where \(W\in {\text{R}}^{\tilde{d}}\times {\text{R}}^{\tilde{d}}\) , and \(\tilde{d}\) is twice the embedding size plus the journal feature size. Finally, we summarize the implementation of PAJO in Algorithm 1.
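The final classification step can be sketched as a concatenation followed by a linear map and a SoftMax. The sketch is illustrative only: it assumes two output classes with one weight row per class, omits a bias term, and uses tiny made-up feature vectors and weights rather than trained values.

```python
import math


def concat(*vectors):
    """Concatenate feature vectors into one overall feature vector."""
    out = []
    for v in vectors:
        out.extend(v)
    return out


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, class_weights):
    """Return class probabilities; class_weights holds one weight row per class."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in class_weights]
    return softmax(logits)


# Toy features with embedding size d = 2 and journal feature size 1.
R_i = [0.2, -0.1]      # inter-sample title feature
N_i = [0.4, 0.3]       # inter-sample abstract feature
J_i = [1.0]            # enhanced journal feature
X_i = concat(R_i, N_i, J_i)          # length 2 * d + journal feature size = 5

W = [[0.0, 0.0, 0.0, 0.0, 0.0],      # weights for the negative class (toy values)
     [1.0, 0.0, 1.0, 0.0, 0.5]]      # weights for the positive class (toy values)
probs = predict(X_i, W)
label = probs.index(max(probs))
```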

Algorithm 1: Label prediction process of PAJO classification model

Models and Metrics

We conducted a series of experiments to investigate the classification performance of our proposed PAJO model. The corresponding data and codes are available at . For comparison purposes, we considered several competitors, including CNNs, recurrent neural networks (RNNs), other attention mechanisms, and pretrained language models. The text and journal feature inputs are similar for all models.

Random Forest [ 25 ]. With this model, the title, abstract, and journal features are concatenated and fed to a classifier with 800 decision trees.

Logistic Regression with L1 Penalty (L1LR) [ 26 ]. This regression-based method performs both variable selection and classification. We used the same inputs as those given to the Random Forest model and conducted 10-fold cross-validation to determine the hyperparameters.

BiLSTM [ 27 ]. This is an RNN that performs well with text classification tasks and is used for sentiment analysis and question classification. The text encoder is a two-layer BiLSTM, whose output hidden states are concatenated with the journal feature representations, which use simple fully connected layers. The word embedding dimensions and BiLSTM hidden states were 128 and 256, respectively, and we applied a fully connected layer and a SoftMax function to make the final prediction.

BiLSTM + Attention [ 28 ]. This is the same BiLSTM with an additional attention layer at the text encoder output.

TextCNN [ 29 ]. This CNN is used for sentence classification tasks and has a kernel size of 32, which is used to extract sentence-level features. For journal feature representations, we applied the BiLSTM method and fed the concatenated representations into the classifier.

TextRCNN [ 30 ]. This combination CNN + RNN uses the BiLSTM architecture for the RNN. The BiLSTM’s output is concatenated with the text embeddings, and a global max pooling layer is applied to obtain the final text representation. The text and journal representations are concatenated and used for the final prediction.

PubMedBERT. In this method, PubMedBERT is used as text encoder and then fine-tuned during the training process. Then, the title, abstract, and journal features represented by feature encoders are directly concatenated for final classification.

Our dataset was randomly split into training (80%) and testing (20%) sets. After training each model on the same training dataset, we evaluated its prediction performance on the testing dataset. We counted the true positives (TPs), which denote the number of correctly predicted positive samples; true negatives (TNs), which denote the number of correctly predicted negative samples; false positives (FPs), which denote the number of samples incorrectly predicted as positive; and false negatives (FNs), which denote the number of samples incorrectly predicted as negative. Based on these, Precision = TP / (TP + FP), Recall = TP / (TP + FN), Specificity = TN / (FP + TN), Accuracy = (TP + TN) / (TP + FP + FN + TN), F1-score = 2 × [(Precision × Recall) / (Recall + Precision)], and area under the receiver operating characteristic curve (AUC) = \({\int }_{0}^{1}Recall \text{d}\left(1-Specificity\right)\) , i.e., the area under the curve of Recall plotted against the false-positive rate. Finally, to illustrate the computational complexity of PAJO, we compare it with other deep learning baselines using FLOPs(T), i.e., the number of floating-point operations, expressed in units of tera (T, \({10}^{12}\) ).
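The count-based metrics follow directly from the four confusion-matrix entries, as the sketch below shows. The AUC helper is an assumption about implementation rather than the authors' code: it uses the rank interpretation of the ROC AUC (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half). The counts and scores in the example are made up.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the count-based metrics from confusion-matrix entries."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "specificity": specificity,
            "accuracy": accuracy, "f1": f1}


def roc_auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Unlike Precision and Recall, the AUC does not depend on a particular decision threshold, which is why it serves as the threshold-free summary in the comparisons.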

Data exploration and illustration

Prior to applying PAJO, we conducted a text and journal feature data exploration for illustrative purposes, which is useful when envisioning the state space and expected model outcomes. To get an idea of the influence of journal features on classification performance, we compared the distributions of Sets B and C in terms of IF, CS, and SJR. The representative boxplots are shown in Fig.  4 . As can be seen, the values of IF, CS, and SJR in the positive samples were significantly higher than those in the negative samples. This finding suggests that journal features are usually positively correlated with article quality.

Fig. 4. Boxplots of Journal Impact Factor (IF) from 2020 to 2021, CiteScore (CS), and Scientific Journal Ranking (SJR) reflecting positive vs. negative samples. Positive samples have higher journal feature values than negative samples, indicating that these features are useful in distinguishing high-quality articles

Next, we focus on the text features extracted from titles and abstracts. To visualize these features, we calculated the frequency of each word from Set A and present the top 100 with the highest frequencies as a word cloud in Fig.  5 . The higher the frequency, the larger the word. The highest-frequency words are intuitively related to neck pain based on the original search criteria. This demonstrates that our dataset is suitable for screening new articles.

Fig. 5. Word cloud containing the top 100 most frequent words found in the full sample set

Experimental results

Table 1 lists the prediction performance of each model on the same test dataset. As shown, our PAJO model achieved the best overall prediction performance. PAJO surpassed the strongest baseline by 0.75% in Accuracy, 1.91% in F1-score, and 2.25% in AUC. These results demonstrate the improved predictive ability of this model. However, when evaluated based on Precision and Specificity, L1LR achieved the best results. Notably, Precision and Recall trade off against each other, meaning that they should be assessed together. Recall can be more important than Precision, as a higher recall value avoids missing important articles [ 5 ]. Thus, although PAJO does not achieve the best precision, its good recall performance indicates a higher practical applicability. As Recall suggests, false negatives deserve particular attention. To avoid them, a practical screening procedure can be used. Specifically, we can first set a lower threshold to allow more samples to be counted as positive, so that more true positive samples are covered. Then all the predicted positive samples are sorted in decreasing order of their predicted probabilities, which makes them easier for researchers to screen. Finally, focusing on the FLOPs(T), we find that our proposed PAJO indeed has a higher computational complexity than the other deep learning models. This is mainly because it includes PubMedBERT and also uses the attention mechanism. This architecture contributes to PAJO’s superior classification performance.
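The screening procedure described above can be sketched in a few lines. The threshold of 0.3, the article identifiers, and the probabilities are hypothetical; the paper does not specify a lowered threshold value.

```python
def screen(predictions, threshold=0.3):
    """Keep articles whose predicted probability reaches the threshold,
    sorted from most to least likely to be relevant."""
    kept = [(article, p) for article, p in predictions if p >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical model outputs: (article identifier, predicted probability).
predictions = [("a", 0.20), ("b", 0.45), ("c", 0.90), ("d", 0.35)]

lenient = screen(predictions, threshold=0.3)   # lowered threshold, more candidates
strict = screen(predictions, threshold=0.5)    # default threshold
```

Lowering the threshold enlarges the candidate list, trading some precision for recall, while the sort order lets reviewers start with the most probable articles.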

Ablation experiments

As discussed in the Methods section, the PAJO model consists of a text encoder, an attention encoder, and a journal feature encoder. To exploit textual information, both the title and abstract of an article are used. We conducted a series of ablation experiments to explore how each part contributes to the performance of the whole. Specifically, we considered five scenarios based on the intake of three types of features: title (T), abstract (A), and journal (J). PAJO-T intakes only title features, and PAJO-A intakes only abstract features, both into the text encoder. PAJO-TJ intakes titles into the text encoder and uses the journal feature encoder to extract journal information. Similarly, PAJO-TA intakes article titles and abstracts into the text encoder. PAJO-AJ intakes article abstracts into the text encoder and uses the journal feature encoder to extract journal information. Lastly, PAJO-FULL refers to the fully functional model. The importance of abstract features observed in these experiments is consistent with a study by Zhang et al., who found that their BERT-CAM model, which also utilizes abstract characteristics, performed better than other techniques in terms of accuracy, precision, recall, and F1 value [ 31 ]. This suggests that the use of abstract features is an important factor in the effectiveness of NLP models.

Table 2 lists the detailed results of the ablation experiments. Notably, PAJO-FULL achieved the best performance on all metrics apart from Precision. Interestingly, PAJO-A had the highest precision, 72.22%, whereas PAJO-FULL only scored 71.55%. However, as discussed, Precision and Recall are defined according to a given threshold, and neither should be interpreted as a performance measure on its own. A more integrated, threshold-independent measure is the AUC, and we found that PAJO-FULL has the highest AUC value. Based on Table 2, we also found that abstract features play a significant role in overall performance, as their removal resulted in much lower scores in all metrics. The largest drop was observed in precision, which decreased by 12.68 percentage points, from 71.55% (PAJO-FULL) to 58.87% (PAJO-TJ).

PAJO prediction performance

To further evaluate the efficacy of the proposed model in screening important articles related to neck pain, we additionally collected articles published in 2022. We regarded these articles as a new testing set, and then applied PAJO to classify them with prediction probabilities. The higher the probability, the more important the article. The prediction threshold was set to 0.5, which resulted in 60 positively classified studies. For comparison purposes, we randomly selected the same number of articles with prediction probabilities larger than 0.5 published in 2021.

All articles were evaluated from two perspectives by a trained radiologist who was blinded to the prediction results. The first perspective was the degree of relationship to the topic, neck pain, rated on a scale of 1–4. The higher the score, the stronger the relationship. The second was article quality, again rated on a scale of 1–4. This scale is based on the Grading of Recommendations Assessment, Development, and Evaluation method used by the American College of Radiology Appropriateness Criteria 1, 2. Finally, the two scores were summed, giving each article a total score on a scale of 2–8.

We examined the prediction distributions, which are illustrated as boxplots in Fig.  6 . As shown, articles in groups with higher expert scores had correspondingly higher prediction scores. For example, the median probabilities of articles falling into the score groups of seven and eight are all around 0.95; while the median probabilities of articles falling into the other groups are all below 0.9. This trend indicates good consistency between the model predictions and the ground truth. For each score group, we also tested the differences in predictions made for articles published in 2022 and 2021. The results showed no significant differences between the groups, suggesting that articles published in different years have similar patterns. The PAJO model’s ability to handle large-scale datasets and its robustness to noise suggest that it could be used in a variety of real-world applications, such as information retrieval, document classification, and recommendation systems. Future studies could examine these prospective uses and assess how well the model functions in them. Future research may further examine the PAJO model’s application in fields other than neck pain research to gauge its adaptability and versatility.

The findings of the ablation experiments suggest that the PAJO model might benefit from the incorporation of more varied features. For instance, its performance might be enhanced by adopting techniques from other models built on pre-trained BERT language models, such as BERT-CAM [ 31 ] and AFR-BERT by Ji et al. [ 32 ]. Additionally, the Bidirectional Long Short-Term Memory (BiLSTM) used by the AFR-BERT model for pre-processing data might be considered for incorporation into the PAJO model.

Fig. 6. Boxplots of article applicability + quality. The y-axis reflects the interquartile model predictions, where the red dots are outliers. The x-axis reflects the scale of expert-provided ground-truth applicability + quality, with “8” being the highest. The left panel reflects articles published in 2022, and the right panel reflects articles published in 2021

CPGs are important documents that provide healthcare best practices for clinicians, administrators, the public, and program managers. The screening of high-quality related articles plays a vital role in the development of CPGs. However, given the vast number of articles, manual curation is too time-consuming and labor-intensive. To help resolve this problem, we developed the PAJO model to assist practitioners in screening high-quality articles automatically. This model includes text, attention, and journal feature encoders. In addition to titles and abstracts, the model comprehensively considers journal characteristics (e.g., IF and SJR). Taking “neck pain” as the focus area of this study, we constructed a dataset with highly relevant articles extracted via a query from PubMed. We then conducted extensive experiments using PAJO to identify the most important articles. Our results show that the PAJO model performs better than several state-of-the-art methods on the literature screening task. To further verify the model’s efficacy, we tested its prediction performance using articles published in 2022 as the test set, with ground-truth evaluations provided by an expert. The results show strong agreement between model predictions and ground-truth results in terms of identifying the highest-quality articles, which indicates that the PAJO model can assist CPG curators with their work.

We now present some limitations of the PAJO model, which may set future directions. First, treating all articles not cited by CPGs as negative samples is a crude first approximation. A more flexible approach using positive-unlabeled learning may be applied to handle these negative samples. Second, the feature extraction method in PAJO may bias the model towards certain journals. Further investigation in this regard is needed. Third, mining additional textual characteristics from articles is another direction for future work. Fourth, the prediction performance of PAJO was evaluated by a single trained radiologist. More evaluations by other radiologists must be conducted in the future. Finally, we examined the performance of PAJO only on neck pain. In the future, the PAJO model can be extended to more diseases to verify its wide applicability.

Availability of data and materials

The datasets and codes used during the current study are available at .


Abbreviations

AUC: Area under the receiver operating characteristic curve

BERT: Bidirectional encoder representations from transformers

BiLSTM: Bidirectional long short-term memory

CNN: Convolutional neural network

CPG: Clinical practice guideline

CS: CiteScore metric

FN: False negative

FP: False positive

IF: Journal impact factor

L1LR: Logistic Regression with L1 Penalty

LSTM: Long short-term memory

NLP: Natural language processing

PAJO: Our proposed Paper title, Abstract, and JOurnal deep-learning model

ReLU: Rectified linear unit

RNN: Recurrent neural network

SJR: Scientific journal ranking

SNIP: Source-normalized impact per paper

SVM: Support vector machine

TN: True negative

TP: True positive

Chen Y, Yang K, Marušić A, Qaseem A, Meerpohl JJ, Flottorp S, et al. A reporting tool for practice guidelines in health care: the RIGHT statement. Ann Intern Med. 2017;166:128–32.

Shekelle PG. Clinical practice guidelines: what’s Next? J Am Med Assoc. 2018;320:757–8.

Fire M, Guestrin C. Over-optimization of academic publishing metrics: observing Goodhart’s Law in action. GigaScience. 2019;8(6):giz053.

Harmsen W, de Groot J, Harkema A, van Dusseldorp I, De Bruin J, Van den Brand S, et al. Artificial intelligence supports literature screening in medical guideline development: Towards up-to-date medical guidelines. Medicine. 2021.

Feng Y, Liang S, Zhang Y, Chen S, Wang Q, Huang T, et al. Automated medical literature screening using artificial intelligence: a systematic review and meta-analysis. J Am Med Inform Assoc. 2022;29:1425–32.

Dessi D, Helaoui R, Kumar V, et al. TF-IDF vs word embeddings for morbidity identification in clinical notes: an initial study. 2021.

Kumar V, Recupero DR, Riboni D, et al. Ensembling classical machine learning and deep learning approaches for morbidity identification from clinical notes. IEEE Access. 2020;9:7107–26.

Wu S, Roberts K, Datta S, Du J, Ji Z, Si Y, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc. 2020;27:457–70.

Mullenbach J, Wiegreffe S, Duke J, Sun J, Eisenstein J. Explainable prediction of medical codes from clinical text. 2018.

Prabhakar SK, Won DO. Medical text classification using hybrid deep learning models with multihead attention. Comput Intell Neurosci. 2021;2021:9425655.

Zhang Y, Liang S, Feng Y, Wang Q, Sun F, Chen S, et al. Automation of literature screening using machine learning in medical evidence synthesis: a diagnostic test accuracy systematic review protocol. Syst Rev. 2022;11:11.

Lan Z, Chen M, Goodman S, Gimpel K, Sharma P, Soricut R. ALBERT: A lite BERT for self-supervised learning of language representations. In International Conference on Learning Representations. 2020:1311–28.

Beltagy I, Lo K, Cohan A. SciBERT: A Pretrained Language Model for Scientific Text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019:3615–20.

Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36:1234–40.

Article   CAS   PubMed   Google Scholar  

Moen H, Alhuwail D, Björne J, et al. Towards Automated Screening of Literature on Artificial Intelligence in Nursing. Stud Health Technol Inform. 2022;290:637–40.

Kumar V, Recupero DR, Helaoui R, et al. K-LM: knowledge augmenting in Language Models within the Scholarly Domain. IEEE Access. 2022;10:91802–15.

Lin TY, Goyal P, Girshick R, He K, Dollar P. Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell. 2020;42:318–27.

Gu Y, Tinn R, Cheng H, et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthc. 2021;3(1):1–23.

Garfield E. The history and meaning of the journal impact factor. J Am Med Assoc. 2006;295:90–3.

Article   CAS   Google Scholar  

Van Noorden R. Impact factor gets heavyweight rival. J Cit Rep. 2016;30:20.

Google Scholar  

Falagas ME, Kouranos VD, Arencibia-Jorge R, Karageorgopoulos DE. Comparison of SCImago journal rank indicator with journal impact factor. FASEB J. 2008;22:2623–8.

Leydesdorff L, Opthof T. Scopus’s source normalized impact per paper (SNIP) versus a journal impact factor based on fractional counting of citations. J Am Soc Inf Sci. 2010;61:2365–9.

Roldan-Valadez E, Salazar-Ruiz SY, Ibarra-Contreras R, Rios C. Current concepts on bibliometrics: a brief review about impact factor, eigenfactor score, CiteScore, SCImago journal rank, source-normalised impact per paper, H-index, and alternative metrics. Ir J Med Sci. 2019;188:939–51.

Devlin J, Chang MW, Lee K, Toutanova K, Bert. Pre-training of deep bidirectional transformers for language understanding. In Annual Conference of the North American Chapter of the Association for Computational Linguistics. 2019:4171–86.

Sun Y, Li Y, Zeng Q, et al. Application research of text classification based on random forest algorithm. In 2020 3rd International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE). 2020:370–4.

Aseervatham S, Antoniadis A, Gaussier E, Burlet M, Denneulin Y. A sparse version of the ridge logistic regression for large-scale text categorization. Pattern Recognit Lett. 2011;32:101–6.

Qing L, Linhong W, Xuehai D. A novel neural network-based method for medical text classification. Future Internet. 2019;11:255.

Deng J, Cheng L, Wang Z. Attention-based BiLSTM fused CNN with gating mechanism model for chinese long text classification. Comput Speech Lang. 2021;68:101182.

Kim Y. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014:1746–51.

Lai S, Xu L, Liu K, et al. Recurrent convolutional neural networks for text classification. In The 29th AAAI Conference on Artificial Intelligence. 2015:2267–73.

Pan L, Lim WH, Gan Y. A method of Sustainable Development for three Chinese short-text datasets based on BERT-CAM. Electronics. 2023;12(7):1531.

Mingyu J, Jiawei Z, Ning W. AFR-BERT: attention-based mechanism feature relevance fusion multimodal sentiment analysis model. PLoS ONE. 2022;17(9):e0273936.

Download references


Not applicable.

This work is supported by the National Key R&D Program of China (2021ZD0113201), the National Natural Science Foundation of China (62025104, 62171297, 72371241), and the MOE Project of Key Research Institute of Humanities and Social Sciences (22JJD910001).

Author information

Yucong Lin and Jia Li contributed equally to this work.

Authors and Affiliations

School of Medical Technology, Beijing Institute of Technology, Beijing, 100081, China

Yucong Lin & Tianyu Fu

Department of Radiology, Beijing Friendship Hospital, Capital Medical University, Beijing, 100050, China

Jia Li & Han Lv

School of Statistics, Renmin University of China, Beijing, 100872, China

Huan Xiao & Feifei Wang

School of Computer Science & Technology, Beijing Institute of Technology, Beijing, 100081, China

Lujie Zheng & Hong Song

School of Automation, Beijing Institute of Technology, Beijing, 100081, China

Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China

Jingfan Fan, Deqiang Xiao, Danni Ai & Jian Yang

Center for Applied Statistics, Renmin University of China, Beijing, 100872, China

Feifei Wang



YL and HL designed the work. JL and JY made substantial contributions to the conception. FW and HL wrote the main manuscript. YL developed the model. HX, LZ, and YX collected and analyzed the data. HS, JF, DX, DA, and TF reviewed and revised the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Feifei Wang, Han Lv or Jian Yang.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit . The Creative Commons Public Domain Dedication waiver ( ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Lin, Y., Li, J., Xiao, H. et al. Automatic literature screening using the PAJO deep-learning model for clinical practice guidelines. BMC Med Inform Decis Mak 23, 247 (2023).

Received: 18 February 2023

Accepted: 06 October 2023

Published: 03 November 2023



  • Deep learning

BMC Medical Informatics and Decision Making

ISSN: 1472-6947



Rahul Mehra

In a competitive business environment, training plays an important role in keeping work competent and effective. Training supports the smooth functioning of work, which in turn enhances employees' quality of work life and contributes to organizational development. Development is a process that leads to qualitative as well as quantitative advancement in the organization, especially at the managerial level; it is less concerned with physical skills and more concerned with knowledge, values, attitudes and behaviour in addition to specific skills. Development can therefore be described as a continuous process, whereas training has specific areas and objectives. Every organization thus needs to study the role, importance and advantages of training and its positive impact on development for the growth of the organization. Quality of work life is a process in which the organization recognizes its responsibility for excellence in organizational performance as well as employee skills. Training supports these organizational aims and thereby enhances employees' quality of work life. Training and development programs help improve employees' behaviour and attitude towards the job and also lift their morale. Employee training and development programs are therefore important aspects that need to be studied. This paper reviews and analyses the literature on the importance of training and development and its relation to employees' quality of work life.



Review article

ChatGPT in orthopedics: a narrative review exploring the potential of artificial intelligence in orthopedic practice


  • 1 IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
  • 2 Residency Program in Orthopedics and Traumatology, University of Milan, Milan, Italy
  • 3 Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, United States
  • 4 Department of Orthopaedic, Trauma, and Reconstructive Surgery, RWTH University Medical Centre, Aachen, Germany
  • 5 Department of Orthopedics and Trauma Surgery, Academic Hospital of Bolzano (SABES-ASDAA), Teaching Hospital of the Paracelsus Medical University, Bolzano, Italy
  • 6 Dipartimento di Scienze Biomediche per la Salute, Università degli Studi di Milano, Milan, Italy

The field of orthopedics faces complex challenges requiring quick and intricate decisions, with patient education and compliance playing crucial roles in treatment outcomes. Technological advancements in artificial intelligence (AI) can potentially enhance orthopedic care. ChatGPT, a natural language processing technology developed by OpenAI, has shown promise in various sectors, including healthcare. ChatGPT can facilitate patient information exchange in orthopedics, provide clinical decision support, and improve patient communication and education. It can assist in differential diagnosis, suggest appropriate imaging modalities, and optimize treatment plans based on evidence-based guidelines. However, ChatGPT has limitations, such as insufficient expertise in specialized domains and a lack of contextual understanding. The application of ChatGPT in orthopedics is still evolving, with studies exploring its potential in clinical decision-making, patient education, workflow optimization, and scientific literature. The results indicate both the benefits and limitations of ChatGPT, emphasizing the need for caution, ethical considerations, and human oversight. Addressing training data quality, biases, data privacy, and accountability challenges is crucial for responsible implementation. While ChatGPT has the potential to transform orthopedic healthcare, further research and development are necessary to ensure its reliability, accuracy, and ethical use in patient care.

1. Introduction

Musculoskeletal disorders affect millions of individuals worldwide each year and orthopedic surgeons often face challenging situations requiring quick and complex decisions. Furthermore, patients' education and compliance in orthopedics are essential in improving treatment outcomes and active participation in recovery ( 1 ).

Over the years, technological advancements have significantly influenced the practice of orthopedics, with the integration of artificial intelligence (AI) systems showing great potential in improving patient care and outcomes. AI is developing rapidly in the healthcare sector, driven by improvements in computing power, the growth of health data, and the ability to access large sets of exploitable data ( 2 ). There are numerous stages of patient management where AI could play a useful role, ranging from diagnosis to therapy. Among the various AI-based systems, ChatGPT, a natural language processing (NLP) technology developed by OpenAI (San Francisco, CA), was launched in November 2022.

ChatGPT is a natural language processing (NLP) model based on the transformer architecture and trained on a vast corpus of textual data, enabling it to generate human-like responses to user questions interactively. Its ability to understand and generate contextually relevant and coherent responses has led to its exploration and application in various sectors, including healthcare. In orthopedics, this AI-based tool can contribute to the complex decision-making process by facilitating information exchange with patients and providing accessible and accurate information to both healthcare professionals and patients. Trends in AI research following the launch of ChatGPT have recently been analyzed to identify key developments and future directions. Alessandri-Bonetti et al. conducted a bibliometric analysis of the literature published in the first 7 months after the introduction of ChatGPT, up to July 1st, 2023, collecting 724 articles ( 3 ). They observed a significant increase in publications exploring ChatGPT use across various medical disciplines, suggesting its growing relevance in the healthcare sector. Interestingly, a decrease in studies focused on ethical considerations was also noted, indicating a shift in research focus.

Among all areas of medicine, orthopedics deserves particular attention. Orthopedic conditions encompass a wide range of pathologies, including fractures, joint disorders, spinal deformities, and sports injuries. ChatGPT has the potential to serve as a clinical decision-support tool by providing clinicians with relevant information based on patient symptoms, medical history, and radiological findings. It can help with differential diagnosis and suggest diagnostic tests or appropriate imaging modalities for further evaluation. Therapeutic recommendations in orthopedics are often based on evidence-based guidelines and clinical experience. AI technologies can support this process by assisting clinicians in synthesizing a vast amount of medical literature and providing updated therapeutic recommendations based on the specific characteristics of the patient and their condition. This can contribute to optimizing treatment plans, promoting adherence to evidence-based practices, and reducing variability in clinical decisions. Furthermore, ChatGPT could play a fundamental role in patient communication and education. Orthopedic conditions can be complex, and patients often have numerous questions and concerns about their diagnosis, treatment options, and expected outcomes. ChatGPT can provide patients with understandable information, addressing their questions and alleviating their anxieties. Patients might also improve their knowledge and preparedness before consulting the surgeon, which could both make them better prepared and save the physician time. This can lead to improved patient satisfaction, interaction, and adherence to treatment plans. While this perspective could open up new opportunities for patients, it would likely be dangerous to envision ChatGPT as a substitute for physical examination by a medical professional or for specialist consultation.

Finally, ChatGPT can be a valuable tool for literature review and research in orthopedics, which is continuously evolving, with new studies and publications being released regularly. Keeping up with the latest evidence can be a challenge for clinicians and researchers. ChatGPT can assist in conducting literature searches, summarizing research articles, and identifying key findings, thus facilitating evidence-based practice and promoting knowledge translation. The significant impact of Artificial Intelligence in writing or assisting researchers has led several international scientific journals to require the declaration of whether AI software was used in writing an article. Indeed, despite the numerous potential advantages, it is essential to ensure scientific integrity and ethics in AI-assisted research and writing. Simultaneously, transparency regarding the use of AI in documents is a mandatory step towards genuine scientific responsibility.

Safeguarding scientific integrity, among other aspects, is essential in approaching this pivotal shift in the medical and orthopedic world. The aim of this review is to provide a comprehensive overview of the use of ChatGPT in orthopedics, highlighting the pros and cons of each application. By synthesizing the available evidence, we hope to shed light on the strengths, limitations, and future implications of ChatGPT in enhancing patient care, clinical decision-making, and workflow optimization. The findings of this review will inform healthcare professionals, researchers, and policymakers about the current state of knowledge in this field and provide guidance for future research and implementation of ChatGPT in orthopedic practice.

2. Materials and methods

Studies were searched in the PubMed database using the keywords “ChatGPT” OR “language natural processing” AND “Orthopaedics”. The last search was conducted on July 1st, 2023. Only studies describing the application of ChatGPT in orthopedics were included in the review, covering orthopedic settings such as clinical practice, patient education, decision support, and remote monitoring. Studies not relevant to orthopedics, non-English articles, and studies with inadequate information on the use of ChatGPT were excluded.
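The search string above mixes OR and AND without parentheses, so how the terms group depends on the search engine's evaluation order. The following minimal Python sketch shows one way to make the grouping explicit. The function name `build_query` and the chosen grouping (synonyms joined by OR, groups joined by AND) are assumptions for illustration only; the source does not state how the terms were grouped or whether any tool was used to assemble the query.

```python
def build_query(topic_terms, field_terms):
    """Join each list of synonyms with OR inside parentheses, then AND the groups.

    Hypothetical helper: the grouping is an assumption, not the authors' stated method.
    """
    def group(terms):
        quoted = [f'"{term}"' for term in terms]
        return "(" + " OR ".join(quoted) + ")"

    return " AND ".join([group(topic_terms), group(field_terms)])


# Keywords copied verbatim from the methods text.
query = build_query(["ChatGPT", "language natural processing"], ["Orthopaedics"])
print(query)
```

Writing the parentheses out removes any dependence on the engine's default operator order and makes the search easier to reproduce.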

Two independent reviewers (R.G. and A.L.) performed the study selection, data extraction, and quality assessment. Any discrepancies were resolved through consensus or consultation with a third reviewer (M.A.B.). The extracted data included study characteristics, study design, and key findings.

3. Results

A diverse range of studies on the use of ChatGPT in orthopedics was observed. The results are presented in a narrative synthesis, organized according to the different domains of orthopedic practice in which ChatGPT has been utilized. The main topics in which ChatGPT was tested were clinical decision-making, patient education, and workflow optimization. Table 1 provides key study characteristics.

Table 1. Key study characteristics.

4. Discussion

4.1. Exploring the diverse applications of ChatGPT in orthopedics: from diagnosis to treatment planning

The use of ChatGPT in the field of orthopedics has started to be explored in the scientific literature, with numerous articles discussing its potential. As reported in the work of Poduval et al. ( 4 ), it is now essential to understand and embrace robotics and AI, along with traditional clinical skills, in modern orthopedic practice. Since AI has the potential to be a positive and disruptive force in orthopedic surgery, orthopedic surgeons must accept and explore its possibilities. Indeed, advantageous prospects can be found in improving diagnostic accuracy, optimizing surgical planning, providing effective intraoperative assistance, and personalizing treatments. At the same time, the potential disruptive force of this technology must be monitored in areas such as data security, the need for continuous medical supervision, and the maintenance of medical ethics and integrity.

According to Cheng et al. ( 5 ), the main roles of ChatGPT can be found in scientific research, disease diagnosis, treatment options, preoperative planning, intraoperative support, and postoperative rehabilitation. The potential of AI in orthopedic surgery is further discussed by Hernigou ( 6 ), who argues that the characteristic of AI best suited to this field is its ability to analyze large amounts of data and generate useful information. With this capability, AI can not only assist in diagnosis, preoperative planning, or intraoperative guidance but also provide clinical decision support based on predictive analysis and personalized treatment plans.

Karnuta et al. ( 7 ) even compare the transformative potential of AI technology to historical advancements such as the introduction of metallic instruments and the Industrial Revolution. These authors also anticipate a revolution in orthopedic practice in areas such as personalized patient care, image analysis, and surgical decision-making. To overcome the current limitations of ChatGPT in synthesizing complex orthopedic knowledge and answering intricate questions, they suggest that specialized training and exposure to orthopedic texts and manuscripts could enable AI systems to achieve higher performance levels and even pass orthopedic examinations. Although clinical applications are still lacking and the technology still appears weak in complex real-life scenarios, the available reports suggest that ChatGPT or future AI models will substantially change orthopedic practice.

4.2. Empowering patients with AI: assessing the role of ChatGPT in providing reliable health information

In today's world, where patients have access to a vast amount of data (often neither accurate nor up to date), this AI-based tool could play a crucial role in patient information. Dubin et al. ( 8 ) compared the appropriateness and reliability of ChatGPT and Google web search as resources for patients seeking health information online, using frequently asked questions (FAQs) related to total knee arthroplasty (TKA) and total hip arthroplasty (THA) obtained from both sources. Only 25% of the questions were similar between the Google web search and the ChatGPT search across all search terms, with 13/20 Google results coming from commercial sites and 15/20 ChatGPT results from government sites. Of the numerical questions, 11/20 had different responses. ChatGPT provided heterogeneous questions and responses. In conclusion, it is not yet a reliable source of information for patients. More research is needed to determine its accuracy and reliability. Until then, patients should consult a healthcare professional about medical questions or concerns.
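The fractions reported above are easier to compare as percentages. The short Python sketch below does only that arithmetic. It assumes the denominator of 20 given in the text for each count; the dictionary labels and the helper name `percentage` are illustrative and do not come from the cited study.

```python
def percentage(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL_QUESTIONS = 20  # denominator reported in the text for each count

# Counts copied from the paragraph above; labels are illustrative.
reported = {
    "Google results from commercial sites": 13,
    "ChatGPT results from government sites": 15,
    "numerical questions with different responses": 11,
}

shares = {label: percentage(n, TOTAL_QUESTIONS) for label, n in reported.items()}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

On these figures, more than half of the numerical questions received different answers from the two sources, which is consistent with the caution expressed in the text.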

4.3. Challenging ChatGPT

ChatGPT has been put to the test in various fields of medicine, and some have even attempted to “challenge” the AI-based ChatGPT model in the field of orthopedics, comparing it to human knowledge. The interesting work by Cuthbert ( 9 ) aimed to evaluate whether ChatGPT could pass Section 1 of the Fellowship of the Royal College of Surgeons (FRCS) examination in Traumatology and Orthopedic Surgery. The results demonstrated that ChatGPT achieved only 35.8%, significantly lower than the passing rate of FRCS and the average score obtained by human candidates at all levels of training. The main shortcomings of ChatGPT were identified in its inability to exercise higher-order judgment and the multilogical thinking required to pass the examination. These limitations should be recognized and publicized to ensure that clinicians are aware of them. The results of this study also underline the importance of critically assessing the reliability and limitations of artificial intelligence systems in the context of real-life complex scenarios. While ChatGPT has shown promise in generating contextually relevant text, its performance in a highly specialized and technical domain like orthopedic surgery has been insufficient. This suggests that AI models like ChatGPT may not necessarily possess the necessary expertise and clinical reasoning skills required for complex medical decisions. Additionally, the study revealed that ChatGPT failed to recognize its own limitations, providing incorrect explanations for questions it answered incorrectly. This represents a significant and dangerous limitation of this tool. Clinicians and educators should be cautious about relying solely on artificial intelligence systems for assessments or decisions without understanding their limitations and ensuring adequate human oversight. Adapting the training data and refining the model with specialized orthopedic knowledge could enhance its performance in this domain. 
Furthermore, efforts should be made to address the lack of contextual understanding exhibited by ChatGPT, as this is a crucial aspect of clinical decision-making.

A recent study evaluated ChatGPT's performance on the Italian Residency Admission National Exam to assess its level of medical knowledge compared with that of graduate medical doctors in Italy ( 10 ). In June 2023, ChatGPT3 was employed to undertake this exam, which consists of a computer-based multiple-choice test comprising 140 questions, taken annually by all Italian medical graduates. The exam evaluates basic medical science knowledge and its application. ChatGPT's performance was compared with that of 15,869 medical graduates, revealing that ChatGPT answered 122 out of 140 questions correctly. The score ranked in the 98.8th percentile among the 15,869 medical graduates. Among the 18 incorrect answers, 10 related to direct questions about basic medical science knowledge, while 8 concerned applied clinical knowledge and reasoning through case presentations. Errors were either logical (2 incorrect answers, in which ChatGPT reasoned correctly but selected the wrong multiple-choice option) or informational (16 incorrect answers, in which both the answer and the reasoning were incorrect). Interestingly, all explanations for correct answers were deemed “appropriate.” Compared with national statistics on the minimum score required to access each specialty, ChatGPT's score would have qualified a candidate for any specialty. Thus, ChatGPT displayed competence in basic medical science knowledge and applied clinical knowledge. Further research should evaluate ChatGPT's impact and reliability in clinical practice.

4.4. ChatGPT in scientific literature: opportunities, challenges, and the imperative of ethical standards

There is another important aspect where ChatGPT is gaining traction, namely the field of scientific literature. The potential uses can vary widely, ranging from grammar correction and proofreading to planning the highlights of scientific articles. According to Bi et al. ( 11 ), the ability of ChatGPT to generate manuscript drafts and its potential to streamline the writing process should be acknowledged. However, concerns are raised about the accuracy of the generated content and the need for quality control and fact-checking. Ollivier et al. ( 12 ) discuss the problem of plagiarism and false content in scientific literature. We agree with the points raised by the authors, emphasizing the importance of maintaining high ethical standards and accuracy in scientific research. While large language models like ChatGPT have the potential to assist in text generation and information synthesis, it is essential to critically evaluate their results for scientific validity. The authors propose measures such as data sharing, improved training and education, and the development of technologies and tools to detect plagiarism and misconduct. The need to verify and corroborate the information generated by AI models, as well as the importance of ethical standards, transparency, and reliability in scientific research and publication, remain pressing. The role of human evaluation and critical thinking is still indispensable for the effective and responsible use of AI-generated content.

4.5. Cautions and recommendations

Given the aforementioned points, we feel obligated to provide cautions and recommendations for the interpretation of data derived from ChatGPT, as already shared by many authors of the aforementioned studies. New horizons and challenges such as data privacy, security, validation, and ethical considerations arise when ensuring responsible implementation of AI in orthopedic surgery. The responsible use of this tool must be based on an awareness of its limitations and biases. Foremost among them is the dangerous concept of AI hallucination ( 6 ). This phenomenon involves the possibility of generating incorrect responses but still providing confident and plausible-sounding explanations. The authors cite the following example: when asked to generate a report on an event after its last update, the chatbot falsely discusses the announcement but later admits that it has no information about this communication because it lacks temporal data availability. AI can rely on machine learning algorithms trained on extensive datasets to assess source credibility through reputation analysis and source consistency, aiming to identify potential patterns of misinformation or the spread of false information. Another tool at its disposal is spoken natural language analysis, which, through semantic and syntactic examination, can help recognize inaccurate or misleading information. Despite having these resources, AI hallucinations can be extremely perilous when critical analysis of obtained information is not conducted. Therefore, careful scrutiny is needed to avoid the inadvertent distribution of misleading or inaccurate medical knowledge.

Another aspect that deserves caution is the potential risk of bias in ChatGPT's responses. The generated answers could be influenced by the training data, which may reflect biases or trends in the original texts that are not necessarily accurate or up-to-date. This could manifest as formulating biased or unrepresentative recommendations or diagnoses in orthopedics. Therefore, we emphasize the importance of conducting a critical assessment of the responses and considering possible measures to mitigate any bias. Another challenge may lie in the demand for Structured Content Generation by ChatGPT. In the field of orthopedics, this could translate into the creation of orthopedic medical reports, which require strict formatting and organization of information. In this case, we also recommend a careful manual review of the generated documents to ensure the proper structuring of data.

Other peculiar elements that deserve attention are described in the article by Karnuta et al. ( 7 ), such as the “garbage in, garbage out” principle, emphasizing the importance of ensuring high-quality and unbiased data as input for AI systems to avoid perpetuating biases and misinformation. The same article also discusses the responsibility and obligation to ensure robust safety mechanisms and clear roles for stakeholders in the event of system malfunctions and harm to patients ( 7 ). These models may not possess in-depth domain-specific knowledge and may lack the ability to apply higher-order judgment and reasoning, especially in complex medical contexts ( 13 ). Aspects such as transparency, responsibility, and thorough evaluation of AI systems need to be sought and improved to ensure the reliability and quality of the generated results ( 13 ). Lastly, there are other elements worth mentioning, such as data privacy, quality control, biases in training data, and the challenge of authors' attribution ( 5 , 14 ). Therefore, careful regulation and ethical use of tools like ChatGPT in orthopedics and medicine seem necessary.

Finally, we recommend caution in managing multiple tasks simultaneously. Although ChatGPT can handle a wide range of tasks, this could pose limitations. In the field of orthopedics, a physician may have to address multiple questions simultaneously in a single interaction with ChatGPT. This may necessitate greater care in formulating questions and interpreting responses to ensure no confusion and thus provide accurate answers.

With these considerations, physicians should actively shape the trajectory of AI, providing feedback to regulatory bodies and developers, promoting dialogue, and ensuring a thorough examination of the implications of AI implementation in clinical practice ( 7 ).

4.6. Strengths and limitations

To the best of our knowledge, this is the first review of ChatGPT in the orthopedic field. This paper provides a comprehensive overview of the use of ChatGPT in orthopedics, covering various aspects such as clinical decision-making, patient education, workflow optimization, and scientific literature. The present study presents both the potential benefits and limitations of using ChatGPT, highlighting the need for caution, ethical considerations, and human oversight.

This study also presents several limitations. First of all, it is a narrative review. So, although the review mentions the use of independent reviewers and quality assessment tools, it does not follow a standard systematic review methodology, such as a predefined protocol or PRISMA guidelines. Secondly, there is a lack of critical appraisal. This narrative review does not provide an evaluation of the quality or risk of bias of individual studies. A critical appraisal of the included studies would allow readers to assess the strength of the evidence presented. In conclusion, while this narrative review provides a comprehensive overview of the potential applications of ChatGPT in orthopedics and highlights the need for caution and ethical considerations, its limitations as a non-systematic review and lack of critical appraisal of included studies should be considered when interpreting the findings. 

5. Conclusions

The integration of AI technologies, including ChatGPT, holds tremendous promise for transforming orthopedic healthcare. Although the potential applications of ChatGPT in orthopedics are promising, several challenges and considerations need to be addressed. The reliability and accuracy of the responses generated by ChatGPT depend on the quality of the training data and algorithms used. It is essential to ensure that the language model is trained on diverse and high-quality orthopedic data to minimize the risk of bias and incorrect recommendations. Furthermore, ethical and legal aspects of AI use in healthcare, such as data privacy, security, and accountability, must be carefully addressed to ensure patient confidentiality and trust.

Addressing the challenges and considerations associated with its use is crucial to ensure the reliability, accuracy, and ethical implementation of this technology. Ongoing research and development in this field will pave the way for the integration of ChatGPT and other artificial intelligence systems in orthopedics, benefiting both patients and healthcare providers.

Author contributions

RG: Conceptualization, Methodology, Visualization, Writing – original draft, Writing – review & editing. MA: Conceptualization, Investigation, Writing – original draft. AL: Conceptualization, Investigation, Writing – review & editing. FM: Conceptualization, Writing – review & editing. NR: Investigation, Methodology, Writing – review & editing. LM: Methodology, Supervision, Writing – review & editing. GP: Supervision, Writing – review & editing.

The author(s) declare financial support was received for the research, authorship, and/or publication of this article.

This research was funded by the Italian Ministry of Health—Ricerca Corrente. The Italian Ministry of Health has also paid the APCs (Article Processing Charges).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

1. Loftus TJ, Tighe PJ, Filiberto AC, Efron PA, Brakenridge SC, Mohr AM, et al. Artificial intelligence and surgical decision-making. JAMA Surg. (2020) 155:148–58. doi: 10.1001/jamasurg.2019.4917

2. Kaul V, Enslin S, Gross SA. History of artificial intelligence in medicine. Gastrointest Endosc. (2020) 92:807–12. doi: 10.1016/j.gie.2020.06.040

3. Alessandri-Bonetti M, Liu HY, Giorgino R, Nguyen VT, Egro FM. The first months of life of ChatGPT and its impact in healthcare: a bibliometric analysis of the current literature. Ann Biomed Eng. (2023). doi: 10.1007/s10439-023-03325-8

4. Poduval M, Ghose A, Manchanda S, Bagaria V, Sinha A. Artificial intelligence and machine learning: a new disruptive force in orthopaedics. Indian J Orthop. (2020) 54:109–22. doi: 10.1007/s43465-019-00023-3

5. Cheng K, Li Z, Li C, Xie R, Guo Q, He Y, et al. The potential of GPT-4 as an AI-powered virtual assistant for surgeons specialized in joint arthroplasty. Ann Biomed Eng. (2023) 51:1366–70. doi: 10.1007/s10439-023-03207-z

6. Hernigou P, Scarlat MM. Two minutes of orthopaedics with ChatGPT: it is just the beginning; it’s going to be hot, hot, hot!. Int Orthop. (2023) 47:1887–93. doi: 10.1007/s00264-023-05887-7

7. Karnuta JM. CORR Insights®: can artificial intelligence pass the American board of orthopaedic surgery examination? Orthopaedic residents versus ChatGPT. Clin Orthop Relat Res. (2023) 481:1631–3. doi: 10.1097/CORR.0000000000002741

8. Dubin JA, Bains SS, Chen Z, Hameed D, Nace J, Mont MA, et al. Using a google web search analysis to assess the utility of ChatGPT in total joint arthroplasty. J Arthroplasty. (2023) 38:1195–202. doi: 10.1016/j.arth.2023.04.007

9. Cuthbert R, Simpson AI. Artificial intelligence in orthopaedics: can chat generative pre-trained transformer (ChatGPT) pass section 1 of the fellowship of the royal college of surgeons (trauma & orthopaedics) examination? Postgrad Med J. (2023):qgad053. doi: 10.1093/postmj/qgad053

10. Alessandri Bonetti M, Giorgino R, Gallo Afflitto G, De Lorenzi F, Egro FM. How does ChatGPT perform on the Italian residency admission national exam compared to 15,869 medical graduates? Ann Biomed Eng. (2023). doi: 10.1007/s10439-023-03318-7

11. Bi AS. What’s important: the next academic-ChatGPT AI? J Bone Joint Surg Am. (2023). doi: 10.2106/JBJS.23.00269

12. Ollivier M, Pareek A, Dahmen J, Kayaalp ME, Winkler PW, Hirschmann MT, et al. A deeper dive into ChatGPT: history, use and future perspectives for orthopaedic research. Knee Surg Sports Traumatol Arthrosc. (2023) 31:1190–2. doi: 10.1007/s00167-023-07372-5

13. Kunze KN, Jang SJ, Fullerton MA, Vigdorchik JM, Haddad FS. What’s all the chatter about? Bone Joint J. (2023) 105-B:587–9. doi: 10.1302/0301-620X.105B6.BJJ-2023-0156

14. Parsa A, Ebrahimzadeh MH. ChatGPT in medicine; a disruptive innovation or just one step forward? Arch Bone Jt Surg. (2023) 11:225–6. doi: 10.22038/abjs.2023.22042

Keywords: orthopedics, artificial intelligence, AI, ChatGPT, patient education, clinical decision-making

Citation: Giorgino R, Alessandri-Bonetti M, Luca A, Migliorini F, Rossi N, Peretti GM and Mangiavini L (2023) ChatGPT in orthopedics: a narrative review exploring the potential of artificial intelligence in orthopedic practice. Front. Surg. 10:1284015. doi: 10.3389/fsurg.2023.1284015

Received: 27 August 2023; Accepted: 16 October 2023; Published: 1 November 2023.

© 2023 Giorgino, Alessandri-Bonetti, Luca, Migliorini, Rossi, Peretti and Mangiavini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Riccardo Giorgino

This article is part of the Research Topic

Artificial Intelligence-Based Multimodal Prediction Modeling in Orthopedic Surgery

  • Open access
  • Published: 01 November 2023

Deep learning-enabled natural language processing to identify directional pharmacokinetic drug–drug interactions

  • Joel Zirkle 1 ,
  • Xiaomei Han 1 ,
  • Rebecca Racz 1 ,
  • Mohammadreza Samieegohar 1 ,
  • Anik Chaturbedi 1 ,
  • John Mann 1 ,
  • Shilpa Chakravartula 1 &
  • Zhihua Li 1  

BMC Bioinformatics volume 24, Article number: 413 (2023)

During drug development, it is essential to gather information about the change of clinical exposure of a drug (object) due to the pharmacokinetic (PK) drug-drug interactions (DDIs) with another drug (precipitant). While many natural language processing (NLP) methods for DDI have been published, most were designed to evaluate if (and what kind of) DDI relationships exist in the text, without identifying the direction of DDI (object vs. precipitant drug). Here we present a method for the automatic identification of the directionality of a PK DDI from literature or drug labels.

We reannotated the Text Analysis Conference (TAC) DDI track 2019 corpus for identifying the direction of a PK DDI and evaluated the performance of a fine-tuned BioBERT model on this task by following the training and validation steps prespecified by TAC.

This initial attempt showed the model achieved an F-score of 0.82 in identifying sentences as containing PK DDI and an F-score of 0.97 in identifying object versus precipitant drugs in those sentences.

Discussion and conclusion

Despite a growing list of NLP methods for DDI extraction, most of them use a common set of corpora to perform general purpose tasks (e.g., classifying a sentence into one of several fixed DDI categories). There is a lack of coordination between the drug development and biomedical informatics method development community to develop corpora and methods to perform specific tasks (e.g., extract clinical exposure changes due to PK DDI). We hope that our effort can encourage such a coordination so that more “fit for purpose” NLP methods could be developed and used to facilitate the drug development process.

Background and significance

Over the past decade, there has been a surge of interest in developing natural language processing (NLP) methods to automatically extract and process information from biomedical literature (including regulatory drug labels). One such NLP application under active research is the automatic identification of drug-drug interactions (DDIs) [ 1 ]. This is driven by the high prevalence of potential DDIs that may lead to significant adverse events in clinical settings, and the rapid expansion of biomedical documents containing established DDI information in natural language format [ 2 ]. Recent advances in machine learning techniques, especially deep learning/neural networks, have made it possible to extract DDIs from biomedical documents automatically [ 2 ].

One clear example demonstrating the need for automatic methods for NLP of DDI information is the identification of the change in clinical exposures of an object drug due to other precipitant drugs (Fig.  1 ). This kind of pharmacokinetic (PK) DDI information is not only important in a clinical setting when prescribing medications [ 3 ], but also critical during drug development: for example, in evaluating a drug’s potential to cause QT prolongation or proarrhythmic adverse events, clinical and nonclinical studies are required by international regulatory guidelines [ 4 ] to cover the so-called high clinical exposure scenario (defined as the expected exposure when the drug is used in the presence of intrinsic or extrinsic factors, such as impaired renal function, PK DDI etc.). Given a specific drug of interest (the object drug), gathering information from existing biomedical literature and regulatory labels about all other drugs (precipitant drugs) that could change the object drug’s clinical exposure through DDI is an important step towards establishing its high clinical exposure.

Fig. 1. An example pair of sentences about pharmacokinetic (PK) drug-drug interaction (DDI) involving verapamil. For the left sentence, verapamil is the precipitant. For the right sentence, verapamil is the object. Our method (the BioBERT_directionalDDI model) can automatically distinguish the two sentences and label the precipitant vs object drugs.

There have been several initiatives that aimed at encouraging and evaluating NLP techniques to extract DDIs from biochemical literature and regulatory drug labels, for example the DDIExtraction Shared Tasks in 2011 [ 5 ] and 2013 [ 6 ], and the Text Analysis Conference (TAC) DDI tracks 2018 [ 7 ] and 2019 [ 8 ]. Various NLP methods, including traditional machine learning methods based on syntactic and lexical features, and deep learning methods based on neural networks, have been evaluated under these initiatives with varying degrees of success. However, it is difficult to apply these existing methods to the problem of automatic extraction of clinical exposure changes for object drugs due to DDI with precipitant drugs. For example, given the task of “identify all DDIs where clinical exposure of verapamil is changed by another drug from natural language text”, most published methods can only finish the first step of sentence classification: screen all sentences in literature or product labels and identify those that describe DDI relations involving verapamil. Because verapamil is both an inhibitor of cytochrome P450 enzymes and P-glycoprotein [ 9 ], and a substrate of CYP3A4 [ 10 ], there will be a large pool of sentences identified from the first step where verapamil can be either the object or precipitant drug. Consequently, in the second step most of these sentences need to be filtered out, leaving only a small subset of DDI sentences with the “correct” direction: those that describe verapamil as an object drug whose clinical exposure can be altered by other (precipitant) drugs (Fig.  1 ). This second step belongs to the typical NLP task of Named Entity Recognition (NER).

To the best of our knowledge the only time the task of identifying the directionality of a PK DDI was addressed was in tasks 3 and 4 of the TAC 2019 DDI track. Of the four teams that submitted methods, only one team attempted task 4 [ 8 ]. However, it does not appear that these methods were made publicly available. As such, currently there does not appear to be any published NLP method to automatically identify the direction of a PK DDI from natural language text.

Here we report the development of a complete solution to finish both steps through NLP. Our method is based on the state-of-the-art pre-trained neural network language model BERT (Bidirectional Encoder Representations from Transformers) [ 11 ]. We manually annotated a corpus to label object versus precipitant drugs, and then fine-tuned a previously published BERT model that was pre-trained on biomedical literature (BioBERT, see [ 12 ]). We have named the resulting model BioBERT_directionalDDI, and it is designed to finish the two steps sequentially: first identify a sentence that involves PK DDI, and then label the object drug versus precipitant drug in that sentence. Of note, the first step of our procedure classifies sentences into one of the relation categories without identifying which entities in the sentence have such a relation. In comparison, relation extraction (RE) tasks in the literature usually identify relation categories associated with entities in sentences, with the entities pre-identified and anonymized [ 2 , 12 , 13 , 14 ]. Our sentence classification task (the first step of our procedure) is therefore similar to RE tasks in that a relation category is identified, but identifying which entities are involved in this relation is not part of the task. The second step of our procedure completes this NER task.
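The two sequential steps described above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the authors' implementation: `classify_sentence` and `tag_entities` are hypothetical stand-ins for the fine-tuned sentence classifier and entity tagger, the label strings are assumed, and the toy keyword rules and example sentences are invented so that the code runs without a model.

```python
def find_object_sentences(sentences, drug, classify_sentence, tag_entities):
    """Return sentences in which `drug` is tagged as the object of a PK DDI."""
    hits = []
    for sentence in sentences:
        # Step 1: keep only sentences classified as PK DDI.
        if classify_sentence(sentence) != "PK DDI":
            continue
        # Step 2: keep the sentence only if the drug of interest is the object.
        roles = {text.lower(): role for text, role in tag_entities(sentence)}
        if roles.get(drug.lower()) == "Object":
            hits.append(sentence)
    return hits


# Toy stand-ins (assumptions for illustration, not real model behaviour).
def toy_classifier(sentence):
    return "PK DDI" if "exposure" in sentence.lower() else "other or no DDI"


def toy_tagger(sentence):
    # Assumes the invented pattern "<precipitant> increases exposure of <object>."
    words = sentence.rstrip(".").split()
    return [(words[0], "Precipitant"), (words[-1], "Object")]


# Invented example sentences; DrugX and DrugY are placeholders.
sentences = [
    "DrugX increases exposure of verapamil.",
    "Verapamil increases exposure of DrugY.",
    "Verapamil tablets are film-coated.",
]
object_hits = find_object_sentences(sentences, "verapamil", toy_classifier, toy_tagger)
print(object_hits)
```

Only the first sentence survives both steps: the second mentions verapamil as the precipitant, and the third is not a PK DDI sentence at all.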

Our model has enabled the efficient evaluation of high clinical exposures for some reference drugs during the development of international guidelines for cardiac safety [ 4 ], and is expected to play an important role in drug development activities where gathering information about specific drugs’ clinical exposure changes due to DDI with other precipitant drugs is necessary.

The TAC 2019 DDI track [ 8 ] provided 4 training datasets: (1) 22 FDA labels fully annotated and used for TAC 2018 training, (2) an additional 180 FDA labels reannotated according to the TAC 2018 guideline, (3) 57 FDA labels used for TAC 2018 testing, and (4) an additional 66 FDA labels with only the Drug Interactions and Clinical Pharmacology sections annotated. The labels were provided as Structured Product Labeling (SPL) documents in XML format, where sections and sentences were annotated according to prespecified guidelines ( ). The combined set of training data has 21,593 sentences, each annotated as one of 4 categories: no DDI, PK DDI, PD (pharmacodynamic) DDI, or unspecified DDI. For the purpose of our model, the no DDI, unspecified DDI, and PD DDI categories were combined into a single category of “other or no DDI”. These sentences, labeled with one of two categories (“PK DDI” vs. “other or no DDI”), were used as training data for the first step (PK DDI sentence classification). In addition to the sentence-level annotations, each of these sentences also has entity-level annotations. The original XML files annotated entities of Precipitant, Trigger, and SpecificInteractions. For our model, we need Precipitant and Object entities annotated. Of note, the original XML files used a definition of Precipitant that is different from ordinary DDI definitions: any drug X involved in a DDI with the labeled drug (the drug that the SPL document describes) was annotated as Precipitant, even if the labeled drug actually affects drug X’s PK or PD (i.e. drug X is actually the Object drug). The third task of TAC 2019 DDI was the normalization of sentences involving PK DDI to National Cancer Institute (NCI) Thesaurus codes. Hence each PK sentence contains an NCI code label from which the correct object and precipitant drugs can be identified. We have reannotated the entities in each sentence so that the correct definition of object and precipitant is used, without having to refer to NCI codes. The resulting dataset is marked following Inside-Outside-Beginning2 (IOB2) format to indicate the boundaries of object and precipitant drugs in each sentence and used as the training data for the second step (identifying precipitant and object drugs).
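As a brief illustration of the IOB2 scheme mentioned above, the sketch below tags a tokenized sentence given entity spans. In IOB2, the first token of every entity receives a `B-` tag, subsequent tokens of the same entity receive `I-`, and all other tokens receive `O`. The tag names `Precipitant` and `Object`, the span representation, and the example sentence are assumptions made for this example; the exact labels used in the reannotated corpus are not given in the text.

```python
def to_iob2(tokens, entities):
    """Tag tokens in IOB2 format.

    `entities` is a list of (start, end, label) token spans, with `end` exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in entities:
        tags[start] = f"B-{label}"      # first token of every entity gets B-
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the entity get I-
    return tags


# Invented example sentence; "drug X" is a placeholder two-token entity.
tokens = ["Coadministration", "of", "drug", "X", "increased", "verapamil", "exposure", "."]
entities = [(2, 4, "Precipitant"), (5, 6, "Object")]
tags = to_iob2(tokens, entities)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```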

Separately, the TAC 2019 DDI track provided 1 dataset containing 81 FDA labels as testing/validation data. Following the steps above, 10,592 sentences were extracted and reannotated from the XML files and used as independent validation to check the performance of our model for both steps. A diagram of the training and validation procedure can be found in Fig.  2 .

Fig. 2. Training and validation procedure. 325 and 81 FDA labels prespecified by TAC DDI 2019 [ 8 ] were used for model training and validation, respectively. These labels were provided as Structured Product Labeling (SPL) documents as XML files. Sentences were extracted from the XML files and re-annotated to fit the purpose of the two steps of our model (DDI relation extraction to identify PK DDI sentences, and precipitant/object entity recognition in those sentences). This training/validation procedure was applied twice, each for one step of the model.

Transformer-based large language model

BERT is a recently proposed pre-trained language representation model with a transformer-based large language model architecture that has demonstrated state-of-the-art results on a series of NLP tasks [ 11 ]. Building on top of BERT, Lee et al. developed BioBERT, a BERT model retrained on large scale biomedical corpora [ 12 ]. We used BioBERT-Large v1.1, which was developed by pre-training BERT-large architecture (24 layers of neural networks, 340 million parameters) on PubMed abstracts (4.5 billion words, letter case preserved) for 1 million steps, with a custom 30,000 word vocabulary ( ). The pre-trained BioBERT weights in the format of TensorFlow version 1 ( ) were downloaded from the above GitHub repository. To convert TensorFlow version 1 weights to version 2, a tf1–tf2 convert script from was used. These converted weights were loaded into an in-house developed TensorFlow version 2 implementation of BERT, modified from . The preloaded model was then trained (fine-tuned) to finish the two steps of the task: relation extraction (RE) to identify PK DDI sentences and named entity recognition (NER) to identify precipitant and object drugs in each PK DDI sentence. This trained neural network is referred to as BioBERT_directionalDDI, and its performance was subsequently evaluated using validation data.

For the first step of the task, the BioBERT_directionalDDI model was fine-tuned on the training data containing sentences in two categories (PK DDI and other or no DDI; see Datasets section above) for 2 epochs with max_seq_len 128. For the second step of the task, the model was fine-tuned on the training data where precipitant and object drugs are labeled as named entities (see Datasets section above) for 50 epochs with max_seq_len 128. Generally, we used the same hyperparameters as given in the BioBERT GitHub repository. The only difference is that we found 2 epochs to be sufficient for the first step (instead of the 3 epochs originally used in the BioBERT repository). For both steps, multiple independent models were trained from different random seeds to ensure that the reported model performance was not an outlier. The model performance was found to be stable, so the results from a single model are presented.

In addition to using traditional classification performance metrics such as precision, recall, and F-score to evaluate model performance, we also performed a systematic error analysis by manually going through each wrongly predicted sentence (for step 1) or precipitant/object entity (for step 2) in an attempt to understand why the model made each mistake. Although there were no pre-defined error categories, we noticed that most mistakes can be attributed to one of a few causes, and we have listed a few example mistakes for each error category to facilitate discussion (see Discussion section).

Using the model to scan all FDA prescription drug labels

The set of all human prescription drug labels was downloaded from the NIH website ( ) on 3/15/2023 in XML format and then processed to extract all sentences. The majority of the text is drawn from the list and paragraph nodes in the XML files; text occurring in tables is not included, and text contained inside an image was likewise not extracted. Finally, the extracted sentences were cleaned in post-processing, for example by removing special characters such as bullet points, concatenating the items of a list into a single sentence, and removing hyperlinked references.
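The extraction rules above (take text from paragraph and list nodes, skip tables, merge list items) can be sketched with Python's standard XML parser. This is a minimal sketch, not the authors' processing code: the toy document, its element names (`paragraph`, `list`, `item`, `table`), and the helper names are assumptions for illustration, and real label files would need additional handling.

```python
import xml.etree.ElementTree as ET

# Toy document invented for this example.
TOY_XML = """<section>
  <paragraph>Drug A is   a substrate of enzyme E.</paragraph>
  <list>
    <item>Avoid strong inhibitors.</item>
    <item>Monitor exposure.</item>
  </list>
  <table><tr><td><paragraph>Table text is skipped.</paragraph></td></tr></table>
</section>"""


def local_name(tag):
    # Drop an XML namespace prefix such as "{namespace}paragraph" if present.
    return tag.rsplit("}", 1)[-1]


def clean(elem):
    # Collect all nested text and normalize runs of whitespace.
    return " ".join("".join(elem.itertext()).split())


def extract_text(elem, out):
    name = local_name(elem.tag)
    if name == "table":
        return                              # text inside tables is not extracted
    if name == "paragraph":
        out.append(clean(elem))
    elif name == "list":
        # Concatenate the items of a list into a single chunk of text.
        items = [clean(child) for child in elem if local_name(child.tag) == "item"]
        out.append(" ".join(items))
    else:
        for child in elem:
            extract_text(child, out)


chunks = []
extract_text(ET.fromstring(TOY_XML), chunks)
print(chunks)
```

Sentence splitting and the removal of bullet characters or hyperlinked references would follow as separate clean-up passes on `chunks`.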

After processing, we extracted all sentences containing one of the 28 drug names of interest (see Results) and created a data set of sentences for each drug. We then ran our model on each drug’s data set and found all sentences that contain PK DDI information, as well as all sentences where that drug appears as the object in the PK DDI. Lastly, custom scripts were used to delete redundant sentences and to identify those sentences where quantitative information was mentioned as the consequence of the PK DDI (e.g., the Cmax of a drug of interest was increased by X% when co-administered with drug Y).
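The custom scripts are not described in detail, so the following is only a hedged sketch of what the filtering, de-duplication, and quantitative-information steps might look like. The regular expression (a percentage or an n-fold change), the normalization used to detect redundant sentences, the function names, and the example sentences are all assumptions made for illustration.

```python
import re

# Assumed pattern for "quantitative information": a percentage or an n-fold change.
QUANT = re.compile(r"\b\d+(?:\.\d+)?\s*(?:%|-?\s?fold\b)", re.IGNORECASE)


def mentions(sentence, drug):
    """True if the drug name occurs as a whole word (case-insensitive)."""
    return re.search(rf"\b{re.escape(drug)}\b", sentence, re.IGNORECASE) is not None


def dedupe(sentences):
    """Drop sentences that differ only in letter case or whitespace, keeping order."""
    seen, unique = set(), []
    for s in sentences:
        key = " ".join(s.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique


def has_quantity(sentence):
    return QUANT.search(sentence) is not None


# Invented example sentences; drug W, Y and Z are placeholders.
sentences = [
    "Drug Y increased the Cmax of verapamil by 40%.",
    "drug Y increased the  Cmax of verapamil by 40%.",
    "Verapamil exposure may increase when given with drug Z.",
    "Drug W raised quinidine exposure 2-fold.",
]
for_drug = dedupe([s for s in sentences if mentions(s, "verapamil")])
quantitative = [s for s in for_drug if has_quantity(s)]
print(for_drug)
print(quantitative)
```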

Model development using pre-specified training and validation datasets

We followed the pre-specified data split for training and validation from TAC 2019 DDI track (see Methods). Three hundred and twenty-five annotated FDA drug labels were used for model training, and 81 labels were set aside for model validation. In total there are 21,593 and 10,592 sentences for training and validation, respectively (Fig.  2 ). As the BioBERT_directionalDDI model contains two sequential sub-models for the two steps (relation extraction RE followed by named entity recognition NER), the performance evaluation (using the 10,592 sentences in the validation dataset) also has two sequential steps: first evaluate the accuracy of classifying all sentences into PK DDI and other or no DDI categories, then evaluate the accuracy of classifying object and precipitant drug entities in the PK DDI sentences. We report the precision, recall and F-score for both steps.

Model performance of the first step (identifying PK-DDI sentences)

For the sentence classification task, our BioBERT_directionalDDI model resulted in a precision of 82.7%, a recall of 80.6% and an F-score of 81.6% (Table 1). This suggests that, for all sentences that actually carry PK DDI information, about 81% will be correctly classified by the model while the remaining 19% will be mistakenly classified as other or no DDI (meaning either no DDI information or DDI of other types such as pharmacodynamics).

Model performance of the second step (identifying object vs precipitant drugs in PK-DDI sentences)

For the second step (identifying object vs precipitant drugs in PK DDI sentences) our BioBERT_directionalDDI model resulted in a precision of 100% for both object and precipitant entities (there were no false positives). The recall was 93.7% for object entities and 94.6% for precipitant entities. The F-score was 96.7% for object entities and 97.2% for precipitant entities (Table 2). Therefore about 94% of all entities (object and precipitant combined) are correctly identified by the model. Such high precision and recall suggest that, given a PK DDI sentence, it is very likely that this model will correctly identify the object and precipitant drugs.
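As a quick consistency check, the reported F-scores can be recomputed from the reported precision and recall values. The sketch below assumes that the F-score is the usual F1 measure (the harmonic mean of precision and recall); the function and dictionary names are illustrative only.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs as reported in the text for the validation dataset
reported = {
    "PK DDI sentence": (0.827, 0.806),
    "object entity": (1.000, 0.937),
    "precipitant entity": (1.000, 0.946),
}

# F-scores in percent, rounded to one decimal place
f_scores = {name: round(100 * f_score(p, r), 1) for name, (p, r) in reported.items()}
print(f_scores)
```

Under that assumption the recomputed values (81.6%, 96.7% and 97.2%) agree with the figures quoted for both steps.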

Model application to identify clinical exposure changes due to DDI

Next, we applied the model to a specific use case: identifying DDI-mediated clinical exposure changes of some reference drugs that were proposed to support the development of new cardiac safety regulatory guidelines [ 15 ]. The results for each of the 28 reference drugs after scanning all FDA labels for prescription drugs are shown in Table 3. The number of sentences mentioning the reference drugs ranges from around 150 (Bepridil) to over 30,000 (Quinidine). After applying the two-step approach with the model, most of the reference drugs have anywhere from a few to over a hundred unique sentences identified where the drug appears as the object in a PK DDI. These sentences form the knowledge base that was used to provide evidence and facilitate discussion for the high clinical exposure scenario of the drug.

Background of project initiation

In this paper we reported the development of a transformer-based large language model to automatically identify precipitant and object drugs involved in a PK DDI relation. This project was started during the development of international cardiac safety regulatory guidelines where the change of clinical exposure of a drug (object) due to DDI with another drug (precipitant) needs to be considered to assess the “high clinical exposure” of the object drug. We were surprised by the lack of automatic solutions (either commercial or open source) to this important task, and decided to develop the current model (BioBERT_directionalDDI) by manually annotating a corpus and then fine tuning the state-of-the-art language model BERT [ 11 ].

A comprehensive and properly annotated corpus to identify precipitant and object drugs

To identify the clinical exposure change due to PK DDI from a sentence there are naturally two steps: first to identify those sentences that carry DDI information in the PK category, then to identify the precipitant and object drugs in those sentences. Almost all published NLP methods were designed to finish the first step only. The lack of existing methods to tackle the second step of identifying the directionality of the PK DDIs could be due to the lack of a large and properly annotated corpus for this task. It’s worthwhile to acknowledge that creating such a corpus is not a simple task as it may require dealing with sentences where the PK DDI is bi-directional or is ambiguously worded and the annotator will have to deal with these cases in a consistent manner. To the best of our knowledge there are only two corpora with the proper annotations of object and precipitant in the context of PK DDIs: the PK DDI corpus from Boyce et al. [ 16 ] and TAC 2019 DDI corpus (after translating the associated NCI codes). However, the Boyce corpus was based on only 64 product labels, and only 1 to 2 selected sections from each label were extracted and annotated. In contrast, the TAC 2019 DDI corpus we re-annotated was from 406 product labels (training and validation combined), and for most of these labels the entire documents were annotated. Probably because of the small amount of data available for training, even though their corpus contains the annotations of object and precipitant for PK DDIs, Boyce’s methods were only built to detect PK DDIs and their “modality” but not identify the objects or precipitants [ 16 ]. Another well-known DDI corpus from Herrero-Zazo et al. [ 1 ] identifies DDIs of the PK category (through the type “mechanism”) and annotates the entities involved in this PK DDI. However, the entities are labeled in the sequence they appear in the sentence, not for their functionality in the DDI (i.e. not as precipitant or object). 
We decided to re-annotate the TAC 2019 DDI corpus with the entities of precipitant and object readily identified (without recourse to NCI codes) for ease of use in our method. This corpus was then used in our training and validation process.

Fine-tuning existing BERT-based language models achieved reasonable performance

At the beginning of our project we searched for available methods that can identify PK DDI sentences and the associated precipitants/objects. The only published method that can potentially complete both steps is from the Human Language Technology Research Institute (HLTRI) at the University of Texas at Dallas (UTD), a participating team in TAC 2019 [ 17 ]. However, their method predicts NCI codes, which need to be further translated to precipitant/object relationships. Moreover, to the best of our knowledge, the method is not open source, making it hard to reapply it to our corpus to evaluate or compare performance. In the absence of state-of-the-art or reference solutions, we fine-tuned the pretrained model BioBERT-Large v1.1 [ 12 ] on our annotated training datasets directly, without trying to modify the model structure to further improve the performance. We used traditional classification performance metrics, namely precision, recall, and F-score, to assess the accuracy of the model. Based on the validation datasets prespecified by the TAC 2019 DDI track (and newly annotated by us, see above and Methods), our model has an F-score of 0.82 in identifying PK DDI sentences (first step) and an F-score of 0.97 in identifying object vs precipitant drugs (second step). Of note, the last layer of our neural network is a softmax layer that produces the probability of the input sample being in each of the categories. For example, after the first step, each sentence is assigned a probability X (0 < X < 1) of being in the “PK-DDI” category and 1-X of being in the “other or no DDI” category. Since X is a continuous variable, in theory one could use Receiver Operating Characteristic (ROC) curves to illustrate the performance over the whole range of possible classification thresholds (which is the range of X) and pick a threshold for maximum performance. We used a simpler “maximum argument” approach that essentially fixes the classification threshold of X at 0.5, as this approach is widely used in the machine learning literature adopting neural networks for classification [ 2 , 11 , 12 ].
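To illustrate why the “maximum argument” rule corresponds to a threshold of 0.5 in the two-category case, the following sketch applies a softmax to a pair of output scores and compares the two decision rules. It is not the authors' code: the function names and the example scores are hypothetical, and the scores stand in for whatever the final network layer produces.

```python
import math

LABELS = ("PK-DDI", "other or no DDI")

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_argmax(logits):
    """Pick the category with the largest softmax probability."""
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]

def classify_threshold(logits, threshold=0.5):
    """Pick "PK-DDI" when its probability X reaches the threshold."""
    x = softmax(logits)[0]
    return LABELS[0] if x >= threshold else LABELS[1]

# Hypothetical score pairs for two sentences
for scores in ([2.0, -1.0], [-0.3, 0.4]):
    assert classify_argmax(scores) == classify_threshold(scores)
```

With only two categories the probabilities are X and 1-X, so the larger one is always the one at or above 0.5. Choosing a different threshold, for example from an ROC curve, would trade precision against recall.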

Error analysis

For the first step, a detailed investigation into the false negatives revealed several reasons for missing some of the PK DDI sentences.

Sometimes the sentence itself does not contain enough information to be classified as PK DDI (Table 4 A). For example, the sentence “Griseofulvin decreases the activity of warfarin-type anticoagulants so that patients receiving these drugs concomitantly may require dosage adjustment of the anticoagulant during and after griseofulvin therapy” was manually annotated as (and hence has a true label of) PK DDI in the validation dataset. Although it is generally accepted that griseofulvin decreases warfarin activities through PK mechanisms such as inducing metabolizing enzymes and interfering with absorption [ 18 ], such information is not contained in the sentence above that was presented to the model. This explains why the model misclassified it as other or no DDI.

Another reason is unique to some documents in the validation dataset: each document is the label of a specific FDA-approved drug (which is referred to as “label drug” hereafter), and in some sections of some old labels the name of the label drug is omitted from a sentence (Table 4 A). For example, the sentence “Elimination can be accelerated by the following procedures: 1) Administer cholestyramine 8 g orally 3 times daily for 11 days” does convey the DDI information between cholestyramine and some other drug. The other drug is leflunomide (Arava), which is the label drug and hence is omitted from the sentence. Consequently, the model did not classify it as a PK DDI sentence. This kind of sentence is a unique feature of old drug labels and is unlikely to be encountered when examining more recent drug labels or literature in scientific journals.

We also performed a similar error analysis for false positives (Table 4B). Some sentences were mistakenly classified as PK DDI because they contain information about an interaction between a drug and a non-drug factor (e.g. body weight or smoking). This can be seen from the sentence “Smoking: Following oral rivastigmine administration (up to 12 mg/day) with nicotine use, population pharmacokinetic analysis showed increased oral clearance of rivastigmine by 23% (n = 75 smokers and 549 nonsmokers)”. In addition, there are also some sentences that do not carry enough information to be classified as PK DDI or other or no DDI by themselves, such as “Intervention: Dose reductions and increased frequency of glucose monitoring may be required when BASAGLAR is co-administered with these drugs”. Overall, we calculated the specificity of the model on the sentence classification step and found that it was extremely high, about 0.99. This indicates that the fraction of other or no DDI sentences that are wrongly classified as PK DDI is small.
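The link between specificity and the fraction of wrongly flagged sentences can be made explicit with a small sketch. The counts below are hypothetical, chosen only to give a specificity of 0.99; they are not the confusion matrix from the validation dataset.

```python
def specificity(true_negatives, false_positives):
    """Fraction of actual negatives (other or no DDI) classified as negative."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical counts of "other or no DDI" sentences (illustration only)
tn = 9900  # correctly classified as other or no DDI
fp = 100   # wrongly classified as PK DDI

spec = specificity(tn, fp)
false_positive_rate = 1 - spec  # fraction of negatives wrongly flagged as PK DDI
print(round(spec, 2), round(false_positive_rate, 2))
```

A specificity of about 0.99 therefore corresponds to roughly 1 in 100 other or no DDI sentences being flagged as PK DDI.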

Error analysis of the second step (Table 5 ) suggests that some object/precipitant classifications were wrong because the corresponding drug names appear in the sentence in a complex way. For example, in the sentence: “In patients taking ARAVA, exposure of drugs metabolized by CYP1A2 (e.g., alosetron, duloxetine, theophylline, tizanidine) may be reduced”, the model correctly identified that ARAVA is the precipitant drug while alosetron, duloxetine, theophylline, and tizanidine are the object drugs. However, the original sentence also labeled “drugs metabolized by CYP1A2” as a general term to cover object drugs, which the model missed. Notice that this example shows that the model can handle situations where there are multiple entities of the same class; in this case there are multiple object drugs. There are also other object/precipitant drugs that were misclassified without obvious reason (Table 5 ). But overall, the high precision and recall (both > 0.9) indicate that these wrongly classified directional DDI entities are relatively rare.

Potential model application use cases

As mentioned earlier, this model was developed to facilitate the gathering of high clinical exposure information for reference drugs during the discussion of cardiac safety regulatory guidelines [ 4 ]. In addition, our model could be used in a specific drug development program when the drug of interest has relevant information in other drug labels or scientific literature. For example, a comprehensive scan of all drug labels and/or literature to gather information about DDI-associated clinical exposure increases of a drug of interest could help in selecting a target clinical exposure for this drug in a first-in-human QT assessment to fulfill the International Council for Harmonisation (ICH) E14 Q & A 5.1 requirement [ 4 ]. Text mining using the model could also be used for post-marketing pharmacovigilance surveillance of specific drugs [ 19 ].


A few limitations of our method should be noted. First, there is potentially useful PK information contained in tables and figures in drug labels that our method currently cannot use. Extraction of information in these forms can be challenging, although there has been some recent work in the area [ 20 ]. Another limitation is that our method analyzes each sentence individually, whereas contextual knowledge from surrounding sentences can sometimes be useful in determining whether a sentence contains a PK DDI and its directionality. Lastly, once annotated and trained, our corpus and model are fixed in time and may need to be updated, for instance if changes are made to how drug interaction information is recorded.

Potential next steps

As stated above, some classification errors are attributed to a lack of information contained in the sentence. Resolving these may require new generations of AI methods that query external sources during the classification steps. For example, in the case of sentences from drug labels that allude to the label drug without explicitly naming it, we could pull the label drug name from other parts of the drug label or from a database such as RxNorm [ 21 ]. Other classification errors, where the relevant information is already contained in the sentence, may be resolved by improving the existing BERT-based pipelines, for example by supplementing the pre-training materials (which are mostly biomedical literature) with FDA drug labels or by adjusting the number of layers.

Even though general DDI corpora may exist, these usually can only be used to develop methods for general purpose DDI extraction (e.g., classifying a sentence into one of several DDI categories). Hence it is important that once users have defined a more specific task (e.g., identifying clinical exposure changes of object drugs due to PK DDI with precipitant drugs), they provide a specific corpus that can support the development of NLP methods to perform the task. Here we hope our model provides a temporary solution to the task of automatic identification of directional DDI from biomedical literature and drug product labels. More importantly, we hope our initial attempt can encourage the biomedical informatics method development community to engage the drug development community more to develop “fit for practical purpose” methods, and the drug development community to annotate and release high quality corpora for specific tasks they are facing in the drug development process.

Availability of data and materials

All scripts and datasets used can be found at .

Herrero-Zazo M, et al. The DDI corpus: an annotated corpus with pharmacological substances and drug–drug interactions. J Biomed Inform. 2013;46(5):914–20.

Zhu Y, et al. Extracting drug-drug interactions from texts with BioBERT and multiple entity-aware attentions. J Biomed Inform. 2020;106:103451.

Hefner G, et al. Prevalence and sort of pharmacokinetic drug–drug interactions in hospitalized psychiatric patients. J Neural Transm (Vienna). 2020;127(8):1185–98.

International Council for Harmonisation. ICH E14/S7B Clinical and Nonclinical Evaluation of QT/QTc Interval Prolongation and Proarrhythmic Potential: Questions and Answers. 2022 [cited 3/1/2023]; Available from: .

Segura-Bedmar I, Martinez P, Sanchez-Cisneros D. The 1st DDIExtraction-2011 challenge task: extraction of drug–drug interactions from biomedical texts. 2011;2011:1–9.

Segura-Bedmar I, Martinez P, Herrero-Zazo M. Lessons learnt from the DDIExtraction-2013 shared task. J Biomed Inform. 2014;51:152–64.

Demner-Fushman D, Fung KW, Do P, Boyce RD, Goodwin TR. Overview of the TAC 2018 drug–drug interaction extraction from drug labels track. In: Text analysis conference 2018. 2018.

Goodwin TR, Demner-Fushman D, Fung KW, Do P. Overview of the TAC 2019 Track on drug–drug interaction extraction from drug labels. In: Text analysis conference 2019. 2019.

U.S. FDA. Drug Development and Drug Interactions | Table of Substrates, Inhibitors and Inducers. [cited 3/1/2023]; Available from: .

Tracy TS, et al. Cytochrome P450 isoforms involved in metabolism of the enantiomers of verapamil and norverapamil. Br J Clin Pharmacol. 1999;47(5):545–52.

Devlin J, et al. Bert: pre-training of deep bidirectional transformers for language understanding. 2018. arXiv preprint arXiv:1810.04805

Lee J, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2019;36(4):1234–40.

Soares LB, et al. Matching the blanks: distributional similarity for relation learning. In: 57th Annual meeting of the association for computational linguistics (Acl 2019). 2019;2895–2905.

Weber L, et al. PEDL: extracting protein-protein associations using deep language models and distant supervision. Bioinformatics. 2020;36:490–8.

Li Z, Garnett C, Strauss DG. Quantitative systems pharmacology models for a new international cardiac safety regulatory paradigm: an overview of the comprehensive in vitro proarrhythmia assay in silico modeling approach. CPT Pharmacomet Syst Pharmacol. 2019;8(6):371–9.

Boyce R, Gardener G, Harkema H. Using natural language processing to extract drug–drug interaction information from package inserts. In: BioNLP: proceedings of the 2012 workshop on biomedical natural language processing. Montréal, Canada; 2012.

Maldonado R, Weinzierl M, Harabagiu S. The University of Texas at Dallas HLTRI at TAC 2019. In: The text analysis conference (TAC) drug–drug interaction track. 2019.

Weser JK, Sellers E. Drug interactions with coumarin anticoagulants. 2. N Engl J Med. 1971;285(10):547–58.

Zhang PY, et al. Translational biomedical informatics and pharmacometrics approaches in the drug interactions research. CPT Pharmacomet Syst Pharmacol. 2018;7(2):90–102.

Milosevic N, et al. A framework for information extraction from tables in biomedical literature. Int J Doc Anal Recogn. 2019;22(1):55–78.

Nelson SJ, et al. Normalized names for clinical drugs: RxNorm at 6 years. J Am Med Inform Assoc. 2011;18(4):441–8.


This project was supported by the Research Participation Program at CDER, administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the US Department of Energy and the FDA. This study used the computational resources of the High Performance Computing clusters at the Food and Drug Administration, Center for Devices and Radiological Health.


This article reflects the views of the authors and should not be construed to represent the FDA’s views or policies.

Not applicable.

Author information

Authors and affiliations.

Division of Applied Regulatory Science, Office of Clinical Pharmacology, Office of Translational Sciences, Center for Drug Evaluation and Research, Food and Drug Administration, WO Bldg 64 Rm 2078, 10903 New Hampshire Ave, Silver Spring, MD, 20993, USA

Joel Zirkle, Xiaomei Han, Rebecca Racz, Mohammadreza Samieegohar, Anik Chaturbedi, John Mann, Shilpa Chakravartula & Zhihua Li


ZL designed the study and performed the research. JZ and RR performed the research and conducted the analysis. XH, RR, MS, AC, JM, and SC conducted analysis. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhihua Li .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit . The Creative Commons Public Domain Dedication waiver ( ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Zirkle, J., Han, X., Racz, R. et al. Deep learning-enabled natural language processing to identify directional pharmacokinetic drug–drug interactions. BMC Bioinformatics 24 , 413 (2023).

Received : 20 March 2023

Accepted : 04 October 2023

Published : 01 November 2023

  • Pharmacokinetic
  • Drug-drug interactions
  • Natural language processing
  • Directionality
  • Transformer language model

BMC Bioinformatics

ISSN: 1471-2105

  • Open access
  • Published: 19 February 2023

Control strategies used in lower limb exoskeletons for gait rehabilitation after brain injury: a systematic review and analysis of clinical effectiveness

  • Jesús de Miguel-Fernández 1,2,
  • Joan Lobo-Prat 3,
  • Erik Prinsen 4,
  • Josep M. Font-Llagunes 1,2 &
  • Laura Marchal-Crespo 5,6,7

Journal of NeuroEngineering and Rehabilitation volume 20, Article number: 23 (2023)

In the past decade, there has been substantial progress in the development of robotic controllers that specify how lower-limb exoskeletons should interact with brain-injured patients. However, it is still an open question which exoskeleton control strategies can more effectively stimulate motor function recovery. In this review, we aim to complement previous literature surveys on the topic of exoskeleton control for gait rehabilitation by: (1) providing an updated structured framework of current control strategies, (2) analyzing the methodology of clinical validations used in the robotic interventions, and (3) reporting the potential relation between control strategies and clinical outcomes.

Four databases were searched using database-specific search terms from January 2000 to September 2020. We identified 1648 articles, of which 159 were included and evaluated in full-text. We included studies that clinically evaluated the effectiveness of the exoskeleton on impaired participants, and which clearly explained or referenced the implemented control strategy.

(1) We found that assistive control (100% of exoskeletons) that followed rule-based algorithms (72%) based on ground reaction force thresholds (63%) in conjunction with trajectory-tracking control (97%) were the most implemented control strategies. Only 14% of the exoskeletons implemented adaptive control strategies. (2) Regarding the clinical validations used in the robotic interventions, we found high variability in the experimental protocols and outcome metrics selected. (3) With a high grade of evidence and a moderate number of participants (N = 19), assistive control strategies that implemented a combination of trajectory-tracking and compliant control showed the highest clinical effectiveness for acute stroke. However, they also required the longest training time. With a high grade of evidence and a low number of participants (N = 8), assistive control strategies that followed a threshold-based algorithm with EMG as the gait detection metric and control signal provided the highest improvements with the lowest training intensities for subacute stroke. Finally, with a high grade of evidence and a moderate number of participants (N = 19), assistive control strategies that implemented adaptive oscillator algorithms together with trajectory-tracking control resulted in the highest improvements with reduced training intensities for individuals with chronic stroke.


Despite the efforts to develop novel and more effective controllers for exoskeleton-based gait neurorehabilitation, the current level of evidence on the effectiveness of the different control strategies on clinical outcomes is still low. There is a clear lack of standardization in the experimental protocols, leading to high levels of heterogeneity. Standardized comparisons among control strategies that analyze the relation between control parameters and biomechanical metrics will fill this gap and better guide future technical developments. It is still an open question whether controllers that provide an on-line adaptation of the control parameters based on key biomechanical descriptors associated with the patient's specific pathology outperform current control strategies.

Brain injury is a broad concept covering damage to the brain due to events inside the body, i.e., non-traumatic brain injuries, or external forces, i.e., traumatic brain injuries (TBIs). Non-traumatic brain injuries include stroke and cerebral palsy. Brain injuries are one of the major causes of death and disability worldwide [ 1 ]. More than 13.7 million new cases of stroke occur globally each year [ 2 ], and stroke is the third leading cause of disability worldwide [ 3 ]. The prevalence of cerebral palsy is estimated at roughly 2 to 3 per 1000 newborns worldwide [ 4 , 5 ]. Traumatic brain injury is another leading cause of disability around the globe, with 69 million survivors every year [ 6 ].

Difficulty in standing and walking is one of the major consequences of brain injuries. For instance, over 63% of stroke survivors suffer from half-mild to severe motor and cognitive disabilities [ 7 ], and 30–36% are unable to walk without assistive aids [ 8 , 9 ]. This results in loss of independent mobility and limits community participation and social integration, which causes secondary health conditions [ 10 ]. Individuals with brain injuries can exhibit common motor impairments, like paralysis, spasticity, or abnormal muscle synergies, leading to compensatory movements and gait asymmetries [ 11 , 12 , 13 , 14 , 15 ]. This pathological gait hinders a skilful, comfortable, safe, and metabolically efficient ambulation [ 16 ].

The recovery process after a brain injury takes months to years, and neurological impairments can be permanent [ 17 ]. There is strong evidence that early, intensive, and repetitive task- and goal-oriented training, which is progressively adapted to the patient's level of impairment and rehabilitation stage, can improve functional ambulatory outcomes [ 11 , 18 , 19 , 20 , 21 , 22 , 23 ]. However, due to limited resources and the heterogeneity of impairment, it is challenging for physiotherapists to provide the required intensity and dose of training while extracting quantitative information to maximize functional walking ability for a specific patient.

Robotics can play a promising role in gait rehabilitation for individuals with brain injuries. Robots allow the performance of a wide range of tasks (e.g., walking, sitting up/down, or walking on a slope) with high intensity. Some robotic controllers might also promote patients’ active participation and engagement during the training process, e.g., by varying the level of the assistive force [ 24 , 25 ]. High repeatability and intensity of training, together with patients’ engagement, have been listed as crucial factors to induce neural plasticity and motor learning [ 26 , 27 , 28 ]. Importantly, clinical evidence suggests that combining robotic and conventional rehabilitation training positively impacts the ability to walk independently, walking speed, and walking capacity, although there is still no solid evidence about the superiority of robotic rehabilitation over conventional therapy [ 29 , 30 , 31 , 32 , 33 ].

Lower-limb exoskeletons promote task-oriented repetitive movements, muscle strengthening, and movement coordination, which have been shown to positively impact energy efficiency, gait speed, and balance control [ 34 , 35 ]. Exoskeletons, compared to other robotic solutions, e.g., patient-guided suspension systems and end-effector devices, allow for full control of the leg joint angles and torques, and are the preferred robotic solutions for training brain-injured patients who suffer from severe motor disabilities [ 36 ]. Therefore, we consider exoskeleton technology a wide and rich enough topic from which to draw conclusions on the clinical effectiveness of the control strategies in the broad group of brain-injured patients [ 37 , 38 , 39 ].

Interest in lower-limb exoskeletons for gait rehabilitation has increased exponentially in recent years, which is reflected in the considerable number of reviews published within the last decade [ 38 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 ]. However, the majority of these reviews focus on hardware, while only a few of them analyze the control strategies implemented on lower limb exoskeletons and their effects on walking function in individuals with brain injuries [ 38 , 41 , 42 , 54 , 55 , 56 , 57 , 58 , 59 , 60 ]. Yet the control strategy, like ergonomics and robot actuation, might play a key role in the effectiveness of the robotic treatment [ 61 ]. As in every biological system, control rules are essential to modulate every action according to internal and external factors [ 62 ].

We found a few literature surveys that focused on control strategies for lower-extremity exoskeletons: Baud et al. and Li et al. categorised the control strategies and actuation systems implemented on lower-limb exoskeletons [ 41 , 42 ]; Chen et al. presented a review on wearable hip exoskeletons for gait rehabilitation and human performance augmentation that addressed actuation system technologies and control strategies [ 57 ]; Zhang et al. presented a review on lower-limb exoskeletons offering details about actuation systems, high-level control, and human–robot synchronization tools [ 38 ]; Tucker et al. [ 55 ] reviewed several control strategies, gait pattern recognition, and biofeedback approaches for lower extremity robotic prosthetics and orthotics. Finally, a recent systematic review on wearable ankle rehabilitation robots for post-stroke rehabilitation focused on actuation technologies, gait event detection, control strategies, and the clinical effects of the robotic intervention [ 59 ].

In this systematic review, we aim at complementing previous literature surveys by providing an updated structured framework of current control strategies, analyzing the methodology of clinical validations used in the robotic interventions, and reporting the potential relation between the employed control strategies and clinical outcomes. In this literature survey we seek to answer the following three research questions: (1) Which control strategies have been used on powered lower limb exoskeletons for individuals with brain injuries?, (2) What are the experimental protocols and outcome metrics used in the clinical validation of robotic interventions?, and (3) What is the current clinical evidence on the effectiveness of the different control strategies?

Search strategy

To answer the first research question—i.e., which control strategies have been used on powered lower limb exoskeletons for individuals with brain injuries?—we conducted a literature search on the 17th of September 2020, including English-language studies published from January 2000 to September 2020 in four databases: Web of Science, Scopus, PubMed, and IEEE Xplore. The search included the following keywords: (“brain injury” OR “cerebral” OR “palsy” OR “stroke” OR “hemipare*” OR “hemiplegi*” OR “CVA” OR “cerebrovascular accident” OR “cerebral infarct” OR “cerebral hemorrhage” OR “ABI” OR “acquired brain injury” OR “motor learning” OR “neuroplasticity” OR “neural plasticity” OR “neuroplastic”) AND ((“lower” AND (“limb*” OR “extremit*”)) OR “walk*” OR “ambulat*” OR “gait”) AND (“power*” OR “active” OR “robot*” OR “wearable”) AND ( “assistive” OR “exo*” OR “exosuit” OR “exo-suit” OR “brace*” OR “ortho*”) AND “control*”.

The search query led to 1648 studies (991 after removing duplicates). After a title and abstract screening, the number of studies was reduced to 255. Then, a full-text screening process was carried out with the following criteria: studies should (1) involve active orthoses/exoskeletons for lower-limb training, (2) provide technical details about the control strategy used, (3) validate the device on individuals with a brain injury, and (4) report biomechanical or clinical outcome metrics that allow for a comparison among different control strategies. The last condition was associated with the analysis of the clinical methodology followed in robotic interventions. After the full-text screening, a total of 159 publications were included in this review (see Fig.  1 ), with a total of 43 different lower limb exoskeletons. The resulting studies will be used to answer the first two research questions outlined in this review. See Additional file 1 for a detailed list of the studies included.

Fig. 1

PRISMA flowchart for identification and screening of eligible studies for the current review. The total number of studies is not equal to the sum of the studies divided per level of impairment and/or acuity as, in some studies, participants were pooled together independently of their acuity or GMFCS levels

Clinical comparison

To answer the third research question—i.e., what is the current clinical evidence on the effectiveness of the different control strategies?—we conducted a stricter screening of the 159 publications focusing on the studies that performed an assessment before and after the robotic intervention; the studies that focused only on assessments during the robotic intervention while wearing the robotic device or only immediately after a single training session were not included (see Fig.  1 ).

To perform an unbiased clinical comparison between different exoskeleton controllers, we subdivided the individuals with stroke and CP into different subgroups, based on their impairment level and/or acuity before the robotic intervention. For the stroke group, we used three levels of acuity: acute (≤ 2 weeks from stroke onset), subacute (≤ 6 months from stroke onset), and chronic (> 6 months from stroke onset). In the case of CP, we followed the four levels of the Gross Motor Function Classification System (GMFCS) [ 63 ].
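The stroke acuity grouping described above can be expressed as a small classification rule. The following is a minimal sketch, not the procedure used in the review: the function name is hypothetical, and approximating "6 months" as 183 days is an assumption made only for illustration.

```python
def stroke_acuity(days_since_onset: float) -> str:
    """Map time since stroke onset to the three acuity levels named in the text.

    Assumption: 2 weeks = 14 days and 6 months is approximated as 183 days.
    """
    if days_since_onset < 0:
        raise ValueError("days_since_onset must be non-negative")
    if days_since_onset <= 14:
        return "acute"
    if days_since_onset <= 183:
        return "subacute"
    return "chronic"
```

For example, a participant enrolled 90 days after onset would fall in the subacute group under these assumed day counts.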

Applying a final screening process, we only compared controllers tested with participants who shared similar levels of impairment before the robotic treatment, i.e., similar scores in Functional Ambulation Category (FAC) and in the metrics mentioned in “ Outcomes of interest for the clinical comparison ” section. This resulted in the exclusion of six studies on individuals with acute [ 64 ], subacute [ 64 , 65 , 66 , 67 , 68 , 69 ], and chronic [ 69 ] stroke.

This final screening process led to 73 studies, of which 57 studies included stroke survivors (78.08% of the studies) and 16 included children/adults with CP (21.92% of the studies). From the 57 studies that analyzed the benefits of robotic exoskeleton lower-limb training on stroke survivors, five studies included participants with acute stroke, 13 studies with subacute stroke, and 42 studies with chronic stroke. From the 16 studies with children/adults with CP, five studies included participants with GMFCS I, 15 studies with GMFCS II, 12 studies with GMFCS III, and four studies with GMFCS IV. Note that the total number of studies is not equal to the sum of the studies divided per level of impairment and/or acuity as, in some studies, participants were pooled together independently of their acuity and GMFCS levels. See Additional file 2 for a detailed list of the studies included in the clinical analysis.

To further analyze, compare, and discuss the effectiveness of different control strategies, we also took into consideration: (1) the grade of evidence based on the type of intervention—e.g., (randomized) clinical trials or observational studies, (2) the training intensity of the robotic treatment—i.e., the product of the session duration, number of sessions, and frequency of the training, and (3) the number of participants who trained with each type of control.
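The training intensity defined in item (2) is a plain product of three quantities. A minimal sketch follows; the function name and the choice of units (minutes per session, sessions per week) are assumptions, since the text does not fix them.

```python
def training_intensity(session_minutes: float,
                       n_sessions: int,
                       sessions_per_week: float) -> float:
    """Product of session duration, number of sessions, and training frequency.

    Units are an assumption for illustration; only the product form comes
    from the text.
    """
    return session_minutes * n_sessions * sessions_per_week
```

Because the result depends on the units chosen, values are only comparable between studies when the same units are used for all of them.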

Following the guidelines presented in [ 70 , 71 ], we considered that a study had a high level of evidence (level I study) when it was a Randomized Clinical Trial (RCT). When the study was a Clinical Trial (CT), we considered that its level of evidence was moderate (level II study). Finally, the level of evidence of observational studies was considered low (level III study). The grade of evidence of the clinical effects of the robotic treatment was considered strong when there was a preponderance of level I and/or level II studies that supported the result—this must include at least one level I study. The grade of evidence was considered moderate when there was a preponderance of level II and/or level III studies that supported the result—this must include at least one level II study. Finally, the evidence was classified as weak when only level III studies supported the result.
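The grading rules above can be sketched as a short function. This is an illustrative reading, not the authors' procedure: "preponderance" is interpreted here as a strict majority of the supporting studies, which is an assumption, and all names are hypothetical.

```python
# Study design -> level of evidence, as described in the text.
LEVEL_BY_DESIGN = {"RCT": 1, "CT": 2, "observational": 3}


def grade_of_evidence(designs):
    """Grade a result from the designs of the studies that support it.

    Assumption: "preponderance" means more than half of the supporting studies.
    Returns "strong", "moderate", "weak", or "unclassified".
    """
    levels = [LEVEL_BY_DESIGN[d] for d in designs]
    n = len(levels)
    if n == 0:
        return "unclassified"
    n1, n2, n3 = levels.count(1), levels.count(2), levels.count(3)
    if n1 >= 1 and (n1 + n2) > n / 2:
        return "strong"
    if n2 >= 1 and (n2 + n3) > n / 2:
        return "moderate"
    if n3 == n:
        return "weak"
    return "unclassified"
```

Combinations the text does not cover, such as one RCT outnumbered by observational studies with no CT, are returned as "unclassified" rather than guessed.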

Outcomes of interest for the clinical comparison

The selection of the outcome measures of interest was based on those recommended by surveys and studies that evaluated stroke and CP rehabilitation [ 72 , 73 , 74 , 75 , 76 , 77 , 78 ]. To evaluate the effectiveness of the control methods on stroke survivors, we selected the following scales: Berg Balance Scale (BBS), 10 m Walk Test (10MWT), 6 min Walk Test (6MWT), Timed-Up and Go (TUG), Fugl–Meyer Assessment of Lower Extremity (FMA-LE), and Functional Independence Measure (FIM)—this last one only for acute stroke. To evaluate the effectiveness of the control strategies on individuals with CP, we selected the following scales: Gross Motor Function Measure (GMFM)-66/88 dimensions D and E, 10MWT, and 6MWT.

Control strategies taxonomy

To analyze the state of the art of control strategies for lower limb exoskeletons in rehabilitation, we propose a hierarchical classification of control methods based on an adapted version of the categorization presented in [ 55 ]. The hierarchy establishes three different levels: High-level control, Mid-level control, and Low-level control (see Fig.  2 ).

Fig. 2

General control system diagram. The signals from the human–exoskeleton system—e.g., human–robot interaction forces, limbs’ kinematics, and/or recorded human muscle or brain activity—are processed sequentially by three different blocks, each corresponding to High-, Mid-, and Low-level control, to generate the actuation command. High-Level Control: the Control Aim defines the role of the exoskeleton in the overall performance of the human–exoskeleton system, i.e., enhance or hinder task completion. The Human–Robot Synchronization block generates an estimation of the actual state and is used by the Mid-level control, together with the control aim, to provide reference values—e.g., desired position or force—to the Low-level control. The Low-level Control then transforms that reference into actual assistive/resistive force/motion and sends the actuation command to the exoskeleton hardware

High-level controllers are defined as control strategies that identify the human’s volitional intent and select the appropriate exoskeleton response behaviour. The exoskeleton Mid-level control reacts to the current state of the user and defines the reference position or force that the robot should follow based on the control aim and the state estimated by the human–robot synchronization algorithm (both embedded in the High-level control) and the sensors measurements. Finally, the Low-level control tries to achieve the desired state determined by the Mid-level controller by applying feedforward or feedback control. In this systematic review, we have focused on High- and Mid-level controllers since they are highly related to exoskeleton use, while Low-level controllers are directly linked to the hardware and can be applied in other types of robots [ 41 ].
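The three-level hierarchy can be illustrated as a chain of functions, one per level. The sketch below is a toy example under stated assumptions: the sensor fields, the force threshold, the target angles, the stiffness, and the torque constant are all hypothetical values chosen for illustration, not parameters from any reviewed device.

```python
from dataclasses import dataclass


@dataclass
class Sensors:
    vertical_grf: float  # vertical ground reaction force, N
    knee_angle: float    # measured knee angle, rad


def high_level(sensors, aim="assist", grf_threshold=50.0):
    """Return the control aim and an estimate of the gait state."""
    state = "stance" if sensors.vertical_grf > grf_threshold else "swing"
    return aim, state


def mid_level(aim, state, sensors, stiffness=20.0):
    """Combine aim, state, and sensors into a reference torque (N*m)."""
    target_angle = 0.1 if state == "stance" else 1.0  # rad, illustrative only
    sign = 1.0 if aim == "assist" else -1.0           # challenge mode opposes the motion
    return sign * stiffness * (target_angle - sensors.knee_angle)


def low_level(torque_ref, torque_constant=0.5):
    """Feedforward map from reference torque to a motor current command (A)."""
    return torque_ref / torque_constant


def control_step(sensors, aim="assist"):
    """Run one pass through the High-, Mid-, and Low-level blocks."""
    aim, state = high_level(sensors, aim)
    return low_level(mid_level(aim, state, sensors))
```

Keeping the levels as separate functions mirrors the point made in the text: the High- and Mid-level logic is specific to exoskeleton use, while the Low-level map could be swapped for any hardware-specific controller.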

High-level control

A High-level control system provides a command that modifies the state of the actuation system according to the control aim [ 79 , 80 , 81 ] (see Fig.  3 A). The Control Aim varies the purpose of the exoskeleton based on the desired treatment approach, e.g., to assist or challenge the patients.

Fig. 3

Taxonomy of High- and Mid-level controllers. A Control Aim: AI In Assistive Control, the exoskeleton provides support to enhance the movement performance during training. AII Conversely, in Challenge-Based Control mode, the exoskeleton provides actions that hinder the human performance. AIII Adaptive Control adjusts the system parameters based on the human–robot performance to provide adjusted assistance or resistance. B Human–Robot Synchronization: BI Threshold-Based Algorithms trigger the transition between states when the detection metric fulfils a pre-defined threshold. BII In Stochastic Algorithms, the transition between states for the same set of initial conditions and algorithm parameters might be different due to the inherent randomness of the models used. BIII Adaptive Oscillators use the periodic motion of the patient to extract its phase either to generate a control signal or to determine the actual state of the patient, e.g., the phase of the gait. C We categorize the Mid-level control strategies used in lower-limb exoskeletons for gait rehabilitation into three families: Trajectory-Tracking Control generates reference assistive or resistive torque/position profiles based on parameterized or pre-recorded position/torque trajectories; Neuromuscular Control uses recorded biosignals (e.g., brain/muscle signals) to generate the control signal for the Low-level control; and Compliant Controllers regulate the impedance or admittance of the exoskeleton by modifying the dynamic relation between movement and force or force and velocity, respectively

Assistive High-level controllers facilitate functional training by supporting the patients' movements to complete the task—e.g., sit-to-stand [ 82 ], achieve stability during the loading response of the gait [ 83 ], or plantarflexion assistance in late stance [ 84 ]. Assistance can be provided while patients are fully guided by the exoskeleton and remain passive during the training—i.e., haptic demonstration [ 81 ], or while patients actively execute the task while they are guided/corrected by the robot—i.e., haptic assistance [ 81 ]. It is thought that guiding movements while patients remain passive may improve gait performance [ 85 , 86 , 87 ], especially in those suffering from severe impairment [ 55 , 88 ]. Additionally, mobilizing the affected limbs while patients remain passive allows for stretching the muscles and might reduce spasticity [ 89 ], provides somatosensory stimulation that facilitates restoring normative patterns of motor output [ 87 ], and importantly, provides an environment for safe, high intensity, and motivating locomotion training.

In contrast, Challenge-based High-level controllers aim at, e.g., strengthening the muscles by opposing task completion—e.g., resistive methods [ 90 ], enhancing error detection—e.g., error augmentation methods [ 91 ], and increasing movement variability—e.g., perturbation methods [ 92 ]. These challenge-based control strategies might lead to improvements in physical performance, movement control, walking speed, and functional independence, especially in people in the late stages of rehabilitation or with mild impairment [ 93 , 94 , 95 , 96 ].

Adaptive control strategies aim to modify the control parameters based on the patient's specific needs [ 97 ]. In general, the control parameters of the exoskeletons have to be tuned to properly adapt to each specific patient's walking capabilities, as they are not generalized enough to capture the heterogeneity of gait disorders [ 98 , 99 ]. It has been found that when setting up the exoskeleton, tuning the control parameters, together with donning, requires the highest amount of time [ 98 , 100 , 101 ]. Tuning is a laborious process, as therapists must manually modify the parameters relying only on subjective feedback from the patients and visual assessments of the gait pattern [ 99 , 102 ]. A potential solution to guide the physiotherapists through the tuning process might be to provide an initial set of parameters that has been automatically tuned off-line based on the users’ baseline performance [ 99 , 103 ]. However, automatic off-line or manual tuning might lead to a suboptimal set of parameters, which does not take advantage of the full potential of the exoskeleton to improve the rehabilitation effect [ 98 ]. Therefore, strategies that automatically adapt the control parameters of the exoskeleton in real-time, e.g., based on the patient's performance, could increase the positive effect of the exoskeleton while enhancing its usability by reducing the time needed to tune the control parameters.
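One way to picture real-time adaptation of a control parameter is an update rule that raises the assistance when the tracking error is large and lets it decay otherwise. The sketch below is an assumption made for illustration: the update law, the forgetting and learning factors, and the gain limit are hypothetical and are not taken from the reviewed studies.

```python
def adapt_gain(gain, tracking_error, forgetting=0.9, learning=5.0, max_gain=50.0):
    """Update an assistance gain once per step (hypothetical update law).

    The gain decays by `forgetting` each step and grows with the magnitude of
    the tracking error, so assistance fades when the patient tracks well.
    """
    new_gain = forgetting * gain + learning * abs(tracking_error)
    return min(max(new_gain, 0.0), max_gain)


def run_adaptation(initial_gain, errors):
    """Apply the update over a sequence of per-step tracking errors."""
    gain = initial_gain
    history = []
    for error in errors:
        gain = adapt_gain(gain, error)
        history.append(gain)
    return history
```

With zero error on every step the gain shrinks geometrically, which is the behaviour one would want from a controller meant to reduce support as performance improves.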

Synchronization to the user’s motion is a key factor to effectively benefit from the exoskeleton therapy, e.g., reducing adaptation time and metabolic rate [ 104 ]. Most of the Mid-level control strategies need an estimation of the current action performed by the user to properly assist or resist her/his motion, i.e., to synchronize the human and the robot. The Human–Robot Synchronization sub-level within the High-level control estimates the state of the patient by using deterministic or stochastic methods based on recorded kinematic, kinetic, and/or bioelectric data—e.g., joint kinematics [ 105 ], ground reaction forces [ 106 ], human–robot interaction forces [ 107 ], muscular activity [ 108 ], and brain activity [ 109 ] (see Fig.  3 B).

Threshold-based algorithms differentiate between states—e.g., gait phases [ 110 ], falling [ 111 ], and start-stop walking [ 112 ]—following a state-machine structure that allows the transition between states depending on logical rules.
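A threshold-based state machine of this kind can be sketched in a few lines. The example assumes two foot-pressure readings (heel and toe), three phases, and a single force threshold; these choices, and the phase names, are hypothetical simplifications rather than a reconstruction of any cited algorithm.

```python
def next_phase(phase, heel_force, toe_force, threshold=20.0):
    """Advance a three-state gait machine using thresholded foot forces."""
    heel_on = heel_force > threshold
    toe_on = toe_force > threshold
    if phase == "swing" and heel_on:
        return "early_stance"   # heel contact ends swing
    if phase == "early_stance" and toe_on and not heel_on:
        return "late_stance"    # heel has lifted, forefoot still loaded
    if phase == "late_stance" and not toe_on and not heel_on:
        return "swing"          # foot fully unloaded
    return phase                # no rule fired: stay in the current state


def track_phases(samples, start="swing"):
    """Run the state machine over a list of (heel_force, toe_force) samples."""
    phase = start
    phases = []
    for heel_force, toe_force in samples:
        phase = next_phase(phase, heel_force, toe_force)
        phases.append(phase)
    return phases
```

Transitions are only allowed in a fixed order, so a brief spurious reading cannot jump the machine from swing directly to late stance.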

Stochastic algorithms, on the other hand, infer the state through statistical models, e.g., using Linear Discriminant Analysis (LDA) [ 109 ], Hidden Markov Models [ 113 ], Principal Component Analysis [ 114 ], K-Nearest Neighbours [ 115 ], or Neural Networks [ 116 ]. This family of human–robot synchronization methods is particularly useful for planning the gait pattern of the exoskeleton based on vision-based environment classification due to the high performance of stochastic algorithms to classify environments using images [ 117 ].
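As an illustration of a data-driven state estimator, the sketch below implements K-Nearest Neighbours, one of the methods listed above, using only the Python standard library. The two-dimensional feature vectors and their labels are made-up placeholders; a real system would use features and training data recorded from patients.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled samples.

    training_set: list of (feature_tuple, label) pairs.
    """
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (normalized shank angular velocity, normalized vertical force).
TRAINING_SET = [
    ((0.10, 0.90), "stance"), ((0.20, 0.80), "stance"), ((0.15, 0.85), "stance"),
    ((0.90, 0.10), "swing"), ((0.80, 0.05), "swing"), ((0.85, 0.00), "swing"),
]
```

Unlike the threshold rules, the decision boundary here is learned from data, which is why the quality and diversity of the training set matter so much for heterogeneous patient populations.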

Bio-inspired models are emerging as an alternative to threshold-based and stochastic algorithms. For example, adaptive oscillators are non-linear models that synchronize with a teaching signal—e.g., the thigh angle in the sagittal plane [ 118 ]—in phase, frequency and amplitude, mimicking bio-inspired behaviours [ 119 ]. The estimated output from the adaptive oscillator—e.g., phase of the input signal—is used to estimate the phase of the gait or to generate reference joint trajectories to assist or resist the human motion [ 118 , 120 , 121 ]. The main disadvantage of adaptive oscillators, however, is that they require precise parameter tuning to quickly synchronize with the human periodic motion [ 122 ].
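The synchronization behaviour of an adaptive oscillator can be sketched with a single adaptive-frequency phase oscillator. The update equations below follow a form commonly used in the adaptive-oscillator literature, but the specific equations, gains, initial values, and the zero-mean sinusoidal teaching signal are assumptions for illustration, not the implementation of any cited device.

```python
import math


def track_with_adaptive_oscillator(signal, dt, omega0, nu=5.0, eta=2.0):
    """Estimate phase, frequency, and amplitude of a periodic, zero-mean signal.

    phi: phase (rad), omega: frequency (rad/s), alpha: amplitude estimate.
    nu and eta are hypothetical learning gains.
    """
    phi, omega, alpha = 0.0, omega0, 0.1
    for theta in signal:
        error = theta - alpha * math.sin(phi)     # teaching signal minus estimate
        d_phi = omega + nu * error * math.cos(phi)
        d_omega = nu * error * math.cos(phi)
        d_alpha = eta * error * math.sin(phi)
        phi += dt * d_phi                          # explicit Euler integration
        omega += dt * d_omega
        alpha += dt * d_alpha
    return phi % (2.0 * math.pi), omega, alpha


def make_teaching_signal(amplitude, frequency_hz, duration_s, dt):
    """Synthetic stand-in for, e.g., a thigh angle: a pure sine wave."""
    n = int(round(duration_s / dt))
    return [amplitude * math.sin(2.0 * math.pi * frequency_hz * i * dt) for i in range(n)]
```

In this sketch the oscillator starts with a frequency guess below the true cadence and is pulled toward it by the error term; how quickly it locks depends on the gains, which reflects the tuning sensitivity noted in the text.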

Nevertheless, all human–robot synchronization methods require parameter tuning to properly adapt to each specific patient's gait, as they are not generalizable enough to cope with patient-to-patient variability [ 98 ]. This process is laborious, as therapists must manually tune the parameters off-line relying only on feedback from the patients and subjective visual assessments [ 99 , 102 ]. Automatic adaptation [ 123 ] based on the patient's intention and/or gait parameters, such as gait speed [ 124 , 125 , 126 ], might facilitate the usability of these methods.

Mid-level control

Mid-level control employs sensor measurements, the control aim, and the state inferred by the human–robot synchronization to generate reference control commands used by the Low-level control to apply the actuation command (see Fig.  2 ). Three different families of Mid-level control strategies can be distinguished depending on the control inputs/outputs and controllers employed (see Fig.  3 C).

Trajectory-tracking control generates predefined position or force trajectories as reference commands to provide assistance/resistance. These trajectories are usually determined based on pre-recordings of unimpaired individuals (e.g., hip and knee flexion-extension, and ankle plantarflexion-dorsiflexion torques [ 127 ]), information from the non-paretic limb (e.g., hip and knee flexion-extension angles [ 105 , 128 ]), or pre-recorded trajectories during therapist-guided assistance (e.g., foot trajectory [ 129 ] or knee flexion-extension [ 130 ]).
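In its simplest form, trajectory-tracking control looks up a reference value from a stored trajectory according to the current gait phase. The sketch below assumes a trajectory sampled evenly over one gait cycle and a phase expressed in the range 0 to 1; both conventions and the function name are illustrative assumptions.

```python
def reference_angle(trajectory, gait_phase):
    """Linearly interpolate a cyclic reference trajectory at a given gait phase.

    trajectory: joint angles sampled evenly over one gait cycle.
    gait_phase: fraction of the cycle, wrapped into [0, 1).
    """
    n = len(trajectory)
    position = (gait_phase % 1.0) * n
    index = int(position) % n          # sample at or just before the phase
    fraction = position - int(position)
    following = (index + 1) % n        # wrap around at the end of the cycle
    return trajectory[index] + fraction * (trajectory[following] - trajectory[index])
```

The gait phase itself would come from the human–robot synchronization block, which is why errors in phase estimation translate directly into mistimed references.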

Neuromuscular control strategies use biosignal recordings as control signals to decode the actions of the patient and send reference values to the Low-level control [ 131 ]. Common approaches, like myoelectric [ 132 , 133 , 134 ] and Brain-Computer Interface (BCI) [ 135 , 136 ] control, use muscular—electromyography (EMG)—and brain—electroencephalography (EEG)—signals, respectively, to handle the control objective.
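A common way to turn an EMG recording into a control signal is to rectify it, smooth it into an envelope, and scale the result into a torque reference. The following sketch assumes a moving-average envelope and a proportional mapping normalized by a maximum voluntary contraction (MVC) value; the window length, torque limit, and function names are hypothetical.

```python
def emg_envelope(samples, window=50):
    """Rectify raw EMG and smooth it with a causal moving average."""
    rectified = [abs(x) for x in samples]
    envelope = []
    running_sum = 0.0
    for i, value in enumerate(rectified):
        running_sum += value
        if i >= window:
            running_sum -= rectified[i - window]   # drop the sample leaving the window
        envelope.append(running_sum / min(i + 1, window))
    return envelope


def proportional_torque(envelope_value, mvc, max_torque=20.0):
    """Map a normalized activation level to a reference torque (N*m)."""
    activation = min(max(envelope_value / mvc, 0.0), 1.0)
    return activation * max_torque
```

The clamp on the activation keeps the reference bounded even when the envelope exceeds the calibration value, which can happen with sensor shift or abnormal activation patterns.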

Lastly, compliant controllers [ 137 , 138 ] regulate the impedance [ 113 , 139 ] or admittance [ 140 , 141 ] levels of the exoskeleton by modifying the dynamic relation between movement and force or force and velocity, respectively, using virtual dynamics of springs, dampers, or masses. The combination of trajectory-tracking control [ 142 ] or neuromuscular control [ 143 ] with compliant control usually provides a more flexible behavior to the exoskeleton during rehabilitation—e.g., by allowing more movement variability around the desired trajectory, compared to conventional rigid Low-level controllers such as proportional-derivative (PD) controllers [ 144 , 145 ].
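The two compliant schemes can be contrasted with a one-joint sketch: impedance control maps a motion error to a torque through a virtual spring and damper, whereas admittance control maps a measured force to a motion through a virtual mass and damper. The parameter values and function names below are illustrative assumptions.

```python
def impedance_torque(angle_ref, angle, velocity_ref, velocity, stiffness, damping):
    """Virtual spring-damper: motion error in, torque out."""
    return stiffness * (angle_ref - angle) + damping * (velocity_ref - velocity)


def admittance_step(velocity, force, mass, damping, dt):
    """Virtual mass-damper: force in, velocity out (one explicit Euler step)."""
    acceleration = (force - damping * velocity) / mass
    return velocity + acceleration * dt


def admittance_response(force, mass, damping, dt, steps):
    """Velocity reached after applying a constant force for a number of steps."""
    velocity = 0.0
    for _ in range(steps):
        velocity = admittance_step(velocity, force, mass, damping, dt)
    return velocity
```

Lowering the virtual stiffness in the impedance law lets the joint deviate more from the reference for the same torque, which is how these controllers permit the movement variability mentioned above.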

Neuroscience evidence behind current control developments

Neuroscience evidence seems to indicate that the aim of the control strategy of an exoskeleton for individuals with brain injuries should be to stimulate physical/cognitive engagement and motor learning rather than enforce repetitive movements with low variability [ 21 , 146 , 147 , 148 ]. For this reason, control strategies for individuals with brain injuries should guarantee the patient’s active physical and cognitive engagement by providing tailored and compliant assistance or resistance. In particular, in individuals with moderate/mild brain injuries, excessive assistance may have a negative influence on motor learning, as the dynamics of the task to be learned differ from those of the trained task [ 149 ]. To promote the patient's active participation, the device should engage the users wearing the exoskeleton to, e.g., actively initiate each step, coordinate their joints, or control their balance. This can be achieved by, e.g., adjusting the level of assistance or resistance based on real-time biomechanical measurements during locomotion. Thus, non-compliant generic controllers [ 104 ] that do not adapt their assistance/resistance might not be the most effective ones for gait rehabilitation of individuals with brain injuries who preserve partial or full volitional control [ 14 , 147 , 150 ]. Robotic training using controllers that modulate the assistance based on the patient's performance or that allow for more compliant human–robot interaction might be more effective at stimulating motor learning than those that enforce generic “normative” movements independently of the patients' capabilities [ 151 ].

Implementation of control strategies

In this section we provide an overview of the High and Mid-level control strategies implemented in the studies included in this review from a technological point of view, without focusing on clinical aspects (see Fig.  4 A). Exoskeletons used with individuals with stroke and cerebral palsy are highlighted as these two were the most predominant pathologies in the reviewed studies (see Fig.  4 B, C).

Fig. 4

Overview of exoskeletons based on their High- and Mid-level control strategies. Each color represents the different families inside the High- and Mid-level controllers and the symbols point out the categories inside these families. A  Percentage of exoskeletons that implemented the different families and categories of High- and Mid-level controllers for all the pathologies included in this review. Note that the same exoskeleton could incorporate different controllers, and therefore, the summation of percentages can be higher than 100%. B , C  Circular plots illustrate the High- and Mid-level control strategies of exoskeletons tested on individuals with stroke ( B ) and cerebral palsy ( C ). Each circular sector represents a different exoskeleton and every ring represents different levels of the control hierarchy. The outer ring is the control aim, the middle ring is the human–robot synchronization, and the inner ring is the Mid-level control. If a symbol lies in the middle of a subdivision within a sector, it implies that the characteristic related to that symbol applies to both subdivisions

High-level control: control aim

All of the exoskeletons validated on stroke survivors and children/adults with cerebral palsy implemented assistive strategies. On the other hand, only 10.5% of the exoskeletons for stroke rehabilitation and 20.0% for cerebral palsy validated challenge-based control strategies, e.g., using resistive forces [ 152 , 153 , 154 ], perturbing forces [ 92 ], or haptic error augmentation [ 155 ]. Note that the same exoskeleton could incorporate different controllers, and therefore, the summation of percentages can be higher than 100%.

Notably, only 14% of the exoskeletons used adaptive assistive control strategies and 2% used adaptive resistive control strategies. The parameters of the exoskeleton were automatically adapted in real-time based on real-time measurements of the patient’s biomechanics, e.g., the ankle angle tracking error [ 154 , 156 ], hip and knee kinematics [ 98 ], gait speed [ 157 , 158 ] or vertical ground reaction force [ 159 ]. The rest of the devices automatically or manually tuned the magnitude of the assistance off-line based on the patient’s motor function, previously assessed by therapists [ 98 , 99 , 160 ].

The lack of studies that adapted the assistance or resistance based on direct gait biomechanical descriptors of the brain-injured population might be due to the small number of reviewed studies that analyzed the effect of the control parameters on the patients' gait kinematics and kinetics [ 103 , 156 , 161 , 162 , 163 , 164 ]. Moreover, the majority of these few studies only focused on analyzing the effect of the timing and magnitude of the assistive torque or position trajectories on ankle power [ 103 ], walking speed, step length, joint kinematics [ 161 , 163 , 164 ], metabolic cost, or muscular activity [ 162 ]. Only one study explored the effect of varying the parameters of an impedance model on the ankle position in the sagittal plane [ 156 ]. Yet, biomechanical metrics—e.g., step length [ 165 ], hip hiking [ 166 ], and trailing-limb angle during the stance phase [ 167 ]—might more directly reflect the patients' rehabilitation progress. Thus, control strategies based on these descriptors might increase the rehabilitation effect of the exoskeleton in comparison to non-adaptive strategies.

High-level control: human–robot synchronization

Threshold-based approaches were the most implemented human–robot synchronization algorithms on lower-limb exoskeletons for individuals with brain injuries in general (72.1% of the exoskeletons), and stroke survivors (73.6% of the exoskeletons) and cerebral palsy participants (80.0% of the exoskeletons) in particular.

Adaptive oscillators were tested with individuals with stroke in four different exoskeletons (10.5% of the exoskeletons) using sagittal lower-limb segment angles, joint angles, or robot–human interaction forces as synchronization signals [ 161 , 168 , 169 , 170 ].

A small number of devices (25.6%) did not implement any type of event detection algorithm for human–robot synchronization, probably because they did not strictly need it [ 171 , 172 , 173 , 174 , 175 , 176 , 177 ]. Most of them were grounded exoskeletons that either enforced joint angle reference trajectories during gait—based on the unimpaired joint movement—using assistive control strategies [ 171 , 172 ], or employed an assistive controller around the desired trajectory [ 173 , 174 , 175 , 176 , 177 ].

Only one exoskeleton in this review implemented stochastic methods to distinguish between different locomotion modes, i.e., stop, normal walk, acceleration, and deceleration [ 109 ]. The authors used linear discriminant analysis (LDA) with EEG signals to differentiate between the frequencies of the brain activity associated with each mode.

We consider that two main reasons may have led to the lack of implementation of stochastic methods: (1) having a stochastic model that is flexible and able to capture the variance of the population (i.e., does not underfit) requires training data that captures the heterogeneity of individuals with brain injuries, which might be difficult to obtain [ 178 ]; and (2) the difficulty of getting robust stochastic models hinders their application in commercial exoskeletons, as regulatory bodies impose strict safety standards to validate such devices for clinical use [ 179 ].

Exoskeletons and prostheses share similar challenges in terms of human–robot synchronization, but in the case of prosthetic devices, stochastic methods tend to be applied more often than threshold-based approaches [ 180 , 181 ]. This might be explained by the homogeneity in the gait of amputees compared to the heterogeneity observed in individuals with brain injuries [ 182 , 183 , 184 ]. Nevertheless, as in the case of lower-limb exoskeletons, there is a lack of use of stochastic methods in commercially available prostheses [ 185 ].

We have not found any exoskeleton in the framework of this review that implements algorithms that automatically adapt the threshold values or model parameters related to gait event identification algorithms. Gait state detection methods with the ability to adapt to diverse walking conditions, e.g., different cadences [ 186 ], are still pending to be implemented and validated on exoskeletons for individuals with brain injuries.

The most common metric used to detect gait events was the vertical ground reaction force (62.8% for all the pathologies, 60.5% for stroke and 50.0% for CP), probably due to its simplicity in the theoretical and practical implementation [ 187 ]. Ground reaction forces are directly related to the physics of foot-ground interaction. The normal or vertical force component is the one that allows the phases of foot contact and lift to be identified. Force-sensing resistors, placed at particular foot locations—e.g., heel, toe, and first and/or fifth metatarsals—were generally used to measure this metric [ 92 , 107 , 157 , 158 , 159 , 162 , 164 , 177 , 188 , 189 , 190 , 191 , 192 , 193 , 194 , 195 , 196 , 197 , 198 , 199 , 200 , 201 , 202 , 203 ]. Alternatively, instrumented treadmills were employed to measure anterior-posterior ground reaction forces to determine the timing of the ankle plantarflexion assistance [ 133 , 204 ]. However, the suitability of this metric to treat individuals with brain injuries is questionable due to their irregular center of pressure trajectory along a walking cycle. The lack of uniformity might come from equinovarus deformity [ 205 ], excessive hip external rotation [ 16 , 206 ], or reduced proprioception [ 207 , 208 ]. Thus, it might be challenging to develop robust gait event detection algorithms that use ground reaction forces for this specific population.

Human–robot interaction forces have only been implemented on two exoskeletons (4.6%). In the first exoskeleton, the human–robot interaction forces were employed to feed a threshold-based algorithm to detect the swing phase [ 107 ], while in the second exoskeleton they were used as the teaching signal of a pool of adaptive oscillators [ 161 ]. Only a few devices used human–robot interaction forces as control inputs [ 103 , 107 , 109 , 190 , 209 ], which might explain the scarce use of this metric in exoskeletons for individuals with brain injuries. The mechanical adaptations needed on the exoskeleton’s structure to install a force/torque sensor might also explain why the measurements of human–robot interaction forces as control inputs are not commonly used.

Only a few reviewed studies incorporated biosignals as metrics in their human–robot synchronization algorithms (4.6% of the exoskeletons for all the pathologies). For example, EEG was used by only one exoskeleton [ 109 ] to detect different locomotion modes, i.e., stop, normal walk, acceleration, and deceleration. Problems related to EEG analysis, such as feature extraction and artifact removal [ 58 , 210 , 211 ], might make the implementation of reliable control strategies a challenge. Furthermore, EEG-based synchronization might require high levels of attention from the patient, which might result in mental fatigue [ 212 ], and thus, might limit the training duration. Nevertheless, brain activity might be especially useful for individuals who suffer from a severe neurological condition, such as paraplegia [ 213 , 214 ].

In people who preserve their voluntary muscle control over the affected limbs, muscular activity might be a more suitable metric compared to brain activity. Yet, only two devices [ 108 , 159 ] validated muscular activity as an event detection metric in individuals with brain injuries. These devices employed muscular activity (EMG) from the trunk, hip, and knee flexor/extensor muscles to trigger the control action. There are several limitations associated with the use of muscular activity to detect gait events. First, surface electromyography (sEMG) signals suffer from non-robustness due to patient-to-patient variability and sensor-placement dependency [ 38 , 59 ]. Moreover, muscular activity might not be reliable in individuals who have abnormal muscle activation patterns, such as stroke and CP survivors [ 58 , 215 ].

We consider that joint or body segment kinematics (used in 41.8% of the exoskeletons for all the pathologies, 36.8% for stroke, and 40.0% for CP) might be more reliable metrics than the aforementioned metrics in previous paragraphs for the human–robot synchronization algorithms when detecting events with brain-injured people [ 216 ], as they show higher homogeneity among individuals with hemiplegic gait [ 217 , 218 ]. In particular, the shank absolute angle and angular velocity in the sagittal plane have been shown to be especially robust metrics to detect gait events in individuals with hemiplegic gait [ 219 ].

Mid-level control

Trajectory-tracking control is the most used Mid-level control strategy in lower limb exoskeletons for rehabilitation (97.7%). The most common approach is to enforce predefined reference position or torque trajectories defined based on data of unimpaired joints [ 69 , 192 , 220 ]. Trajectory-tracking control was combined with compliant control (28.9% of the exoskeletons for stroke and 20.0% of the exoskeletons for CP) in assistive controllers based on potential [ 14 , 107 , 176 , 190 , 221 ] or velocity fields [ 142 ]. In these examples, the assistive action of the exoskeleton varied based on the joint kinematic errors.

Only four devices (13.9% of the exoskeletons) that used myoelectric control were validated on individuals with brain injuries [ 66 , 133 , 159 , 204 ]. Myoelectric control is one of the least often employed Mid-level control strategies in post-stroke (10.5% of the exoskeletons) and cerebral palsy (10.0%) rehabilitation, according to the results of this review. The aforementioned issues with muscle activity recording and analysis (see “ High-level control: human–robot synchronization ” for a detailed discussion) might be behind the low adoption of this Mid-level control technique. Nonetheless, myoelectric control has a high applicability for people who preserve volitional control of the muscles, such as users of robotic prosthetic devices [ 222 ].

None of the reviewed studies incorporated BCI control with individuals with brain injuries. Problems related to the extraction of relevant information from, e.g., EEG recordings (see “High-level control: human–robot synchronization” for a detailed discussion) might also explain the lack of usage of this Mid-level control technique in exoskeletons for individuals with brain injuries. EMG is a viable alternative or adjunct to EEG for detecting movement intention or generating control signals, and its practical benefits over EEG, e.g., shorter set-up time, more compact hardware, and shorter donning/doffing times, might explain why myoelectric control has been used more often than BCI control [ 214 ]. Few studies, aside from the ones included in this review, evaluated the feasibility of using EEG signals for BCI control of exoskeletons for individuals with brain injuries [ 223 , 224 ].

Clinical validation

This section provides an overview of the most important characteristics of the clinical validation of the robotic interventions, i.e., participants’ demographics, protocol design, and outcome measures. The results summarized in this section only incorporate participants who tested the exoskeletons and not participants in the control group. See Additional file 1 for a more detailed description of the studies included in the clinical validation.

Participants’ demographics

Stroke was the main pathology of the participants recruited for the studies included in this review (74% of the studies) (see Fig.  5 A). The majority of the participants with stroke were in the chronic phase (55.41% of participants with stroke), followed by subacute (33.83%) and acute (10.76%) phases. Cerebral Palsy was included in only 20% of the studies, while the representation of other brain injuries, like traumatic brain injury (1.2%) or other acquired brain injury (1.88%), was scarce. It is especially remarkable that despite the high incidence of traumatic brain injury, only two studies focused on this specific population [ 225 , 226 ].

Fig. 5 Overview of the participants’ demographics and experimental protocol characteristics. A Percentage distribution of the pathologies of the participants included in the reviewed studies. B Percentage distribution of main control conditions in the studies. C Histograms of number of participants (top left), number of sessions (top right), session frequency (times per week; bottom left), and session duration (bottom right) across the selected studies

Experimental protocol

High variability was found in the number of participants ( \({14.87 \pm 13.53}\) ), number of sessions ( \({11.77 \pm 12.20}\) ), session frequency (times per week; \({3.09 \pm 1.68}\) ), and session duration ( \({50.57 \pm 34.06}\) min) (see Fig.  5 C). Previous reviews that analyzed the protocol of robotic treatments reported similar high variability [ 40 , 46 ]. Some studies did not provide complete information about the experimental protocol, e.g., they did not mention the number (15.09%), duration (33.33%), or frequency (31.44%) of the training sessions.
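To make the reported variability easier to compare across variables with different units, the spread can be expressed relative to the mean. The short sketch below computes the coefficient of variation from the values quoted above, assuming that the \(\pm\) notation denotes mean and standard deviation; the dictionary keys and function name are chosen here for illustration.

```python
# Mean and standard deviation as reported in the paragraph above
# (assumption: the "±" values are standard deviations).
protocol_stats = {
    "participants": (14.87, 13.53),
    "sessions": (11.77, 12.20),
    "sessions_per_week": (3.09, 1.68),
    "session_duration_min": (50.57, 34.06),
}


def coefficient_of_variation(mean, sd):
    """Standard deviation divided by the mean (dimensionless)."""
    return sd / mean


cv = {
    name: round(coefficient_of_variation(mean, sd), 2)
    for name, (mean, sd) in protocol_stats.items()
}
print(cv)
```

Under this assumption, the standard deviation is more than half the mean for every protocol variable, and close to or above the mean for the number of participants and sessions.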

Free walking without the exoskeleton was the most common condition against which the robotic treatment was compared (39.62%) (see Fig. 5 B). Other studies compared the robotic treatment with conventional gait therapy (22.01%), or with the effect of using the device unpowered (10.69%) or in zero torque mode (6.92%).

The average level of evidence of the studies included in this review was low. The majority of the studies were observational (66.04%), while only 10.06% and 22.64% were CTs and RCTs, respectively. Only 12.58% of the studies did a follow-up evaluation after the robotic intervention, on average four months after the last intervention.

Outcomes of interest

Ambulation scales were the main metrics used to classify the initial functional level of participants across the studies. The participants’ baseline was determined using metrics that analyzed their level of impairment and motor function (GMFM, 19.50%; FMA, 13.21%; Brunnstrom Stage (BS), 7.55%), mobility (TUG, 10.06%; FAC, 29.56%; BBS, 14.47%), spasticity (modified Ashworth scale (MAS), 15.09%), and functional capacity and activities of daily living (walking speed, 56.6%; 10MWT, 20.75%; 6MWT, 16.98%; FIM, 8.81%; Barthel Index (BI), 11.32%) (see Fig. 6 A).

Fig. 6 Overview of the main baseline and outcome metrics. A Percentage distribution of the metrics that were used in at least 5% of the studies to determine the initial functioning level of participants. B Percentage distribution of the outcome measures of the robotic interventions that were used in at least 5% of the studies grouped by categories, i.e., ambulation scales/tests, spatio-temporal measurements, joint kinematics, muscular activity (EMG), dynamics, and energy expenditure. Functional Ambulation Category (FAC), 10 m Walk Test (10MWT), 6 min Walk Test (6MWT), Gross Motor Function Measure (GMFM), Modified Ashworth Scale (MAS), Berg Balance Scale (BBS), Fugl–Meyer Assessment (FMA), Barthel Index (BI), Timed-Up and Go (TUG), Functional Independence Measure (FIM) and Brunnstrom Stage (BS)

A critical limitation we encountered when comparing robotic treatments was the low homogeneity across studies in the selected outcome measures after the treatment, as no metric was used in more than 50% of the studies (see Fig.  6 B). Ambulation scales together with spatio-temporal parameters were similarly used to determine the effect of the robotic treatment (62.89% of the studies). Within these families of metrics, gait speed was the most used metric in the reviewed studies (37.74%), followed by cadence (25.16%) and step length (23.27%). Joint kinematics was also often used to quantify the effect of the robotic intervention (44.65%). Hip (22.64%), knee (27.67%), and ankle (18.87%) ranges of motion (RoM) in the sagittal plane were the most often selected kinematic metrics.

Finally, fewer studies analyzed muscular activity through sEMG (20.75%) than used the aforementioned families of metrics. The main analyzed muscles were the ankle dorsiflexor (tibialis anterior, 10.69%) and plantarflexor (gastrocnemius, 8.81%; and soleus, 6.92%) muscles, and the knee extensor (rectus femoris, 9.43%; and vastus lateralis, 5.03%) and flexor (semitendinosus, 6.92%) muscles. Less frequently employed metrics include those related to gait dynamics, i.e., joint torques and ground reaction forces (18.23%, where the most used was ankle torque in 5.03% of the studies), energy expenditure (10.69%, where the most used was oxygen consumption in 5.66% of the studies), and neural activity, i.e., brain activation and cortex excitability (6.29%).

Clinical comparison of the control strategies

This section quantifies the relation between the control strategies and the clinical metrics presented in “ Outcomes of interest for the clinical comparison ” section to compare among strategies.

A total of 12 control strategies were evaluated in this section in terms of training intensity (min/week) and percentage of improvement of the outcome metrics selected. We considered that the most efficient control strategy for the metric analyzed would be the one that results in the highest improvement with the lowest training intensity. We also evaluated the grade of evidence—i.e., high, mid and low—and the number of participants and studies.
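The efficiency criterion described above involves two quantities at once, so one way to read it is as a dominance check: a strategy is preferable if it improves the outcome at least as much with no more training. The sketch below illustrates that reading. The function names, the handling of metrics where lower scores are better (e.g., TUG), and the example numbers are assumptions for illustration and are not taken from the reviewed studies or from the review's own analysis code.

```python
def percent_improvement(baseline, post, higher_is_better=True):
    """Percentage change from baseline, signed so that positive means better."""
    change = (post - baseline) / abs(baseline) * 100.0
    return change if higher_is_better else -change


def dominates(a, b):
    """True if strategy `a` is no worse than `b` on both criteria and strictly
    better on at least one. Each strategy is a tuple of
    (improvement in percent, training intensity in min/week)."""
    improvement_a, intensity_a = a
    improvement_b, intensity_b = b
    no_worse = improvement_a >= improvement_b and intensity_a <= intensity_b
    strictly_better = improvement_a > improvement_b or intensity_a < intensity_b
    return no_worse and strictly_better


def non_dominated(strategies):
    """Names of the strategies that no other strategy dominates."""
    return sorted(
        name
        for name, values in strategies.items()
        if not any(
            dominates(other_values, values)
            for other_name, other_values in strategies.items()
            if other_name != name
        )
    )


# Hypothetical numbers, for illustration only.
example = {
    "strategy_1": (40.0, 300.0),
    "strategy_2": (25.0, 600.0),
    "strategy_3": (55.0, 900.0),
}
print(non_dominated(example))
```

In this hypothetical example, `strategy_2` is dominated by `strategy_1`, while `strategy_1` and `strategy_3` cannot be ranked against each other without also weighing improvement against training time, which mirrors the difficulty reported in the comparisons that follow.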

Based on the analyzed studies, we could only extract moderate conclusions from the studies that included post-stroke participants. The studies that involved patients with other brain injuries such as CP or traumatic brain injury did not allow for a comparison of the control strategies implemented, due to the lack of studies with exoskeletons using different control strategies. See “ Limitations and future steps ” section for more details.

As a general introductory comment to the results, almost all the control strategies evaluated provided a positive effect on the selected outcomes of interest for participants with stroke (see Fig. 7 ). The only exception was assistive control with a threshold-based approach using EMG as detection metric and control signal, which had a negative impact on the TUG test in chronic participants [ 227 , 228 ]. See Additional file 3 for a detailed table of the control strategies implemented in the reviewed studies and the results obtained in the main outcomes of interest for individuals with stroke.

Fig. 7 Clinical comparison of the control strategies per outcome metric and acuity level of stroke. Relation between the training intensity and percentage of improvement for A acute, B–D subacute and E–I chronic stroke for the selected outcome metrics. The shape of each symbol corresponds to each of the control strategies, the color is related to the grade of evidence and the intensity of the color is associated to the number of participants of the studies included. The error bars indicate the range of values for the training intensity (horizontal lines) and range of percentage of improvement (vertical lines). The control strategies included are combinations of (i) Control aim: Assistive (A), Challenge-Based (CB); (ii) Human–Robot Synchronization: Threshold-Based (TB) and Adaptive Oscillator (AO), with metrics Ground Reaction Forces (GRF), Electromyography (EMG), and Joint Kinematics (K); (iii) Mid-Level Control: Trajectory Tracking (TT), Compliant (C), Myoelectric (M). Other acronyms: Not Available (N/A), 10 m Walk Test (10MWT), 6 min Walk Test (6MWT), Berg Balance Scale (BBS), Fugl–Meyer Assessment of Lower Extremity (FMA-LE), Functional Independence Measure (FIM), and Timed-Up and Go (TUG)

Acute stroke

From the originally listed outcome metrics of interest, FIM was the only metric that allowed a comparison of the effectiveness of different control strategies in acute stroke rehabilitation [ 134 , 229 , 230 , 231 , 232 ]. The participants included in the considered studies ( \({35.80 \pm 22.07}\) participants) presented an average initial FIM score of \({2.50 \pm 1.29}\) and an average training intensity of the robotic intervention of \(840\;[360, 1620]\) min/week.

Assistive control strategies that implemented a combination of trajectory-tracking and compliant Mid-level control showed an improvement after training of 272.73% in FIM [ 229 ] with a strong grade of evidence (see Fig.  7 A). Conversely, assistive strategies that included a threshold-based algorithm based on EMG recordings as detection metric and control signal showed a lower improvement after training of \(58.33\;[0.00, 150.00]\%\) in FIM with moderate grade of evidence [ 134 , 230 , 231 , 232 ].

However, this comparison is based on partial information, as the authors of [ 229 ] did not report the frequency of the sessions. The higher improvement in FIM observed with the compliant assistive control strategies could also be explained by the longer training duration ( \(\approx\) 600 min) compared to the duration of training with neuromuscular assistive strategies ( \(\approx\) 240 min). Thus, no control strategy is clearly better than the others at improving the patients' functional status (based on the FIM assessment) in acute stroke.

Subacute stroke

The metrics analyzed in studies with people in the subacute phase after stroke focused on: motor function (FMA-LE) [ 233 , 234 , 235 , 236 , 237 ], gait endurance (6MWT) [ 69 , 234 , 236 , 238 , 239 ], and general mobility (TUG) [ 234 , 236 , 238 ]. The initial scores of the outcomes of interest that allowed for comparison between different control strategies were on average: FMA-LE = \({18.87 \pm 3.75}\) , 6MWT = \({114.45 \pm 40.77}\) m and TUG = \({29.42 \pm 10.2}\) s. The number of participants and the training intensity were on average \({26.31 \pm 17.83}\) and \({3103.63 \pm 3059.54}\) min/week, respectively.

The results presented in Fig. 7 B–D indicate that neuromuscular assistive control strategies outperformed trajectory tracking and compliant control strategies on the outcomes of interest. In particular, neuromuscular assistive control strategies that incorporated a threshold-based algorithm using EMG as the detection metric and control signal provided the highest improvements in all outcome measures with a high level of evidence [ 233 , 234 , 235 ]. Importantly, this type of control showed similar or higher improvements with lower training intensity and higher grade of evidence (high level) in 6MWT (69.59%; see Fig. 7 B), FMA-LE (12.66%; see Fig. 7 C) and TUG (50.74%; see Fig. 7 D), compared to the other control strategies implemented in other studies. However, the average number of participants (8 participants) was smaller than in other studies (25.5 participants), which reduces the impact of this result.

Only one study applied two different control strategies within the same robotic treatment [ 236 ]. In particular, the authors combined myoelectric (EMG-based) assistive control with trajectory-tracking Mid-level control that used a threshold-based synchronization algorithm with ground reaction forces as input data. When compared with other control strategies from other studies, the combination of the two control strategies in Watanabe et al. [ 236 ] reached similar improvements with lower training intensity and higher grade of evidence in 6MWT (60.39%; see Fig. 7 B), FMA-LE (8.42%; see Fig. 7 C), and TUG (39.57%; see Fig. 7 D).

In the high-evidence study from Watanabe et al. [ 236 ], the results of myoelectric control were boosted when combined with a control strategy that did not use muscle activation. As we mentioned in “ High-level control: human–robot synchronization ” section, myoelectric control suffers from several technical limitations when employed in individuals with abnormal muscle activation patterns. Thus, it is possible that alternative detection metrics (e.g., based on lower-limb kinematics) and Mid-level control strategies (e.g., trajectory-tracking with compliant control) might produce higher improvements with shorter training time in subacute stroke participants [ 240 , 241 , 242 , 243 , 244 ].

Aligned with the aforementioned comments, assistance provided by trajectory tracking and compliant control showed the highest improvement for the FMA-LE (21.95%; see Fig.  7 C) for a high number of participants (38 participants). Nevertheless, the training intensity was higher and the grade of evidence was lower than the other studies.

Chronic stroke

Studies on people in the chronic phase after stroke were the only ones that used all the metrics described in “ Outcomes of interest for the clinical comparison ” section. The mean baseline values (i.e., baseline condition) were: 6MWT =  \({197.03 \pm 58.53}\)  m [ 160 , 168 , 171 , 190 , 245 , 246 , 247 , 248 , 249 , 250 , 251 , 252 , 253 , 254 ], 10MWT =  \({0.42 \pm 0.23}\) m/s [ 160 , 168 , 171 , 227 , 228 , 245 , 248 , 255 , 256 , 257 , 258 ], BBS =  \({44.00 \pm 6.94}\) [ 160 , 168 , 171 , 190 , 227 , 228 , 245 , 246 , 250 , 252 , 253 , 254 , 259 , 260 , 261 ], TUG =  \({31.85 \pm 20.00}\) s [ 160 , 190 , 245 , 248 , 249 , 252 , 258 , 259 , 260 ], and FMA-LE =  \({37.26 \pm 53.80}\) [ 160 , 168 , 170 , 171 , 190 , 247 , 249 , 250 , 252 , 257 ]. The average number of participants per study and the mean training intensity were \({16.95 \pm 12.80}\) and \({3414.31 \pm 3518.10}\) min/week, respectively.

Assistive control combined with adaptive oscillators that use lower-limb kinematic information to synchronize the robot with the patient's motion, and with trajectory tracking as Mid-level control, achieved the best results in general (see Fig. 7 E–H). Robotic treatments using this strategy showed higher or similar improvements after the treatment (46.00% in 6MWT, see Fig. 7 G; 34.00% in 10MWT, see Fig. 7 E; \(25.19\;[21.40, 28.99]\%\) in FMA-LE, see Fig. 7 H; and 11.30% in BBS, see Fig. 7 F) with lower or similar training intensity ( \(2025\;[1620, 2430]\) min/week) and higher grade of evidence, compared to other control strategies implemented in other studies [ 168 , 170 ]. Yet, despite being high-evidence studies with a moderate number of participants (19 participants), there were only two studies that supported the efficacy of this control strategy.

Similar improvements in the 10MWT and BBS (28.82 [9.76, 53.85]%; see Fig.  7 E and 12.39 [11.82, 12.96]%; see Fig.  7 F, respectively) were observed when assistive controllers with a threshold-based approach using EMG as detection metric and control signal were employed [ 227 , 228 , 255 ]. However, the grade of evidence and the number of participants were lower in comparison to the aforementioned studies that implemented controllers that were not neuromuscular-based. Furthermore, this type of controller was the only one that had a negative effect on the TUG score ( \(-3.61\;[-10.27, 3.06]\%\) ; see Fig.  7 I) [ 227 , 228 ].

Control strategies that implemented assistance in combination with trajectory tracking and compliant control showed the highest improvement in TUG ( \(20.43\;[12.5, 41.30]\%\) ; see Fig. 7 I) and FMA-LE ( \(27.76\;[11.30, 60.00]\%\) ; see Fig. 7 H), with a strong grade of evidence [ 247 , 248 , 249 , 250 , 257 , 258 , 259 , 260 ]. Nevertheless, these studies also involved the highest training intensity. Therefore, the superior improvement might be related not only to the control strategy employed, but also to the higher training intensity ([1080, 10500] min/week) in comparison with the studies that used different control strategies and that also evaluated these metrics ([480, 4500] min/week).

Only one study evaluated resistive control strategies in people in the chronic phase of stroke [ 253 ]. The authors reported an improvement in 6MWT (5.00%; see Fig.  7 G) and BBS (7.14%; see Fig.  7 F), which are similar to the ones reported for assistive control. Based on this, we advocate that more studies implementing resistive control strategies need to be carried out to provide stronger evidence on their clinical effectiveness.

The main contribution of this systematic review is that it provides a classification of the control strategies implemented on lower-limb exoskeletons, analyzes the experimental methodology used in the robotic interventions, and compares the clinical effectiveness of the control strategies when used, together with the exoskeleton, as a gait rehabilitation tool for individuals with stroke. In the following subsections, we answer the three research questions posed in this review.

Which control strategies have been used on powered lower limb exoskeletons for individuals with brain injuries?

Regarding the implementation of High-level controllers, we found that assistive control strategies are the most widely implemented on lower-limb exoskeletons for individuals with brain injuries. Despite the potential of adaptive control (see “ Neuroscience evidence behind current control developments ”), most of the controllers included in this review did not adapt the control parameters based on meaningful biomechanical metrics, such as hip hiking or circumduction. Thus, it is an open question whether adaptive controllers would potentially outperform current solutions. Comprehensive studies analyzing the effect of the exoskeleton control parameters on clinically meaningful biomechanical metrics might allow the development of adaptive control rules that directly tackle the main gait abnormalities of individuals with brain injuries [ 97 , 262 , 263 ].

As for human–robot synchronization, we found that threshold-based techniques, which rely on ground reaction force as detection metric, are extensively used. Only a few devices used adaptive oscillators to synchronize the motion of the exoskeleton with that of the patient. Yet, adaptive oscillators seem to have a high potential for this specific population. Interestingly, only one device included in this systematic review implemented stochastic methods for human–robot synchronization, despite their popularity in research and potential application in identification and classification of states and actions of the human–robot system. In recent years, novel approaches have been proposed that estimate biological joint torques using musculoskeletal modelling to control the action of the exoskeleton in a state-independent manner, i.e., with no need to detect gait events or different walking conditions, e.g., stair ascent and descent [ 143 , 264 , 265 ]. However, these control strategies still need further investigation to evaluate their potential clinical effectiveness on individuals with brain injury.

For the Mid-level control, position trajectory-tracking control was the most commonly used strategy, which was combined in some cases with compliant control to dynamically relate joint angles to forces or torques. We consider that this approach might be the most appropriate for devices that provide partial assistance, as it promotes a dynamic synergy between the patient and the device. Only a few devices implemented myoelectric control, while none of them employed BCI to control lower-limb exoskeletons in this population. We attribute this shortage to the difficulty of developing generalized control laws that use EMG or EEG as control signals for individuals with brain injuries.

What are the experimental protocols and outcome metrics used for the clinical validation of robotic interventions?

We found a wide heterogeneity in the experimental protocols and the selection of the outcomes of interest to evaluate the robotic interventions. Walking speed was the preferred metric to evaluate the patients' initial impairment level and the effectiveness of the robotic treatment. Almost all studies included in this review focused on testing the exoskeletons on participants with stroke and CP. Other types of brain injury represented a low portion of the reviewed studies.

Regarding the outcome metrics used, we consider that the field should not continue focusing solely on performance-based outcome metrics such as walking speed over a certain distance. Standard clinical metrics have been useful in the past years to quantitatively evaluate the progress of impaired individuals. However, they might fail to show the specific biomechanical effects of the treatments. For example, participants could have a better score on the 10MWT after training, i.e., walk faster, but they might use more compensatory strategies, e.g., hip hiking or circumduction, or a stronger involvement of the non-impaired leg. This is not necessarily negative, but the 10MWT alone does not allow linking a potential effect of the robotic training to the underlying (biomechanical) mechanism through which the improvement is achieved.

Thus, we consider that using biomechanical metrics, which are more directly related to the impairment itself, could complement standard clinical outcomes by providing a more detailed perspective of the effect of the robotic treatment. Moreover, improvements in these biomechanical metrics might also result in better scores in the standard clinical tests as they are related. Finally, biomechanical outcomes, e.g., step length or temporal symmetry ratio, can be used independently of the level of impairment. In fact, some clinical tests, e.g., the 6MWT, are difficult for participants with high levels of impairment to complete. In this literature survey, we have found that the number of studies that used outcomes directly related to gait disorders is increasing. However, those studies included a wide range of biomechanical metrics that did not allow a comparison among studies. For instance, step length, which is one of the quantifiable markers of impaired walking performance and motor recovery, was used in only 23% of the studies. Future research could focus on identifying a core set of biomechanical outcome metrics to be used in prospective trials.
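As an example of how simple such a biomechanical outcome can be to compute, the sketch below expresses a symmetry ratio as the paretic-side value of a gait parameter divided by the non-paretic-side value. This is only one convention, assumed here for illustration; definitions of symmetry ratios vary in the literature (for instance, some place the larger value in the numerator), and the function name and example step lengths are hypothetical.

```python
def symmetry_ratio(paretic, non_paretic):
    """Paretic-side value divided by non-paretic-side value.

    A value of 1.0 indicates perfect symmetry under this convention.
    """
    if non_paretic == 0:
        raise ValueError("non-paretic value must be non-zero")
    return paretic / non_paretic


# Hypothetical step lengths in metres, for illustration only.
ratio = symmetry_ratio(paretic=0.30, non_paretic=0.40)
print(round(ratio, 2))
```

Because the ratio is dimensionless, the same function applies to spatial parameters such as step length and to temporal parameters such as swing or stance time.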

As complementary results to the biomechanical outcomes, we think that reporting information about the set-up timings and the level of participation of the patient through the treatment might provide relevant information to compare control strategies. None of the studies included in this review reported set-up times of the control parameters of the exoskeletons. Yet, the time to tune the control parameters might be a relevant point to consider when comparing control strategies. This is important because during practice therapists have to tailor the device for multiple patients in a reduced time frame. Fortunately, recent publications of exoskeletons tested on individuals with brain injuries have started to include this metric [ 100 , 101 ]. Furthermore, we did not find any study that reported metrics that might be directly associated with the level of participation of the patient through the robotic treatment, e.g., direct measurements of muscle activation or indirect estimation through the reporting of control parameters. For example, reporting the value of the adapted control parameters along the training might provide an estimate of the patients’ walking ability [ 266 , 267 ].

The variety in the experimental protocols and the reported performance metrics are the main factors that hinder a systematic comparison of the controllers' effectiveness. We think that the same outcomes and experimental protocol have to be used for studies in which the participants have the same pathology and level of impairment, so that it is possible to compare among control strategies. Furthermore, studies with exoskeletons that target the same joints should use the same experimental methods to allow for hardware-independent comparisons among control strategies.

What is the current clinical evidence on the effectiveness of the different control strategies?

For acute stroke, we were unable to identify a control strategy that is clearly superior: assistive control strategies that implemented a combination of trajectory-tracking and compliant control showed the highest clinical effectiveness, with high grade of evidence and a moderate number of participants (19 participants), but they also required the longest training time. For subacute stroke, assistive control strategies that followed a threshold-based algorithm with EMG as detection metric and control signal provided the highest improvements in the outcome measures of interest with the lowest training intensities, although with a low number of participants (8 participants). For chronic stroke, adaptive oscillators that used lower limb kinematic information to assist the motion of the user together with trajectory-tracking as Mid-level control showed the highest improvements with reduced training intensities, with high evidence (all RCT) and a moderate mean number of participants (19 participants). Finally, we were not able to determine the efficacy of adaptive control strategies, as none of the studies that implemented these strategies fulfilled the inclusion criteria for the clinical comparison.

Note that these conclusions should be treated with the consideration that the number of studies for the clinical comparison of the control strategies was low. A total of 73 studies were included in the comparison (see Fig.  1 ), but our conclusions are based on only 57 studies. Nevertheless, the majority of these studies are of high quality (see Additional file 3 for detailed information about the quality of the studies for each family of control strategies). Thus, there is a trade-off between quality and quantity. On the one hand, we consider that results are fairly consistent across the studies and come from high-level evidence studies with large sample sizes, i.e., RCTs or CTs, which minimizes the risk of bias. On the other hand, we think that the results would be stronger if treatments included a larger aggregate pool of participants and/or with wider inclusion criteria, which would allow the generalizability of the outcomes of the studies to a broader population.

Limitations and future steps

Although the number of studies that evaluated the effectiveness of robotic-assisted gait rehabilitation has increased exponentially in the last decade, we still found critical limitations in the clinical comparison of the effectiveness of different control strategies. Only a few studies compared control strategies on the same participants and using the same exoskeleton, hindering the possibility to extract clear conclusions regarding the clinical effectiveness of each control strategy for gait rehabilitation. In addition, spontaneous recovery [ 69 ] and compensation strategies probably contributed to increased scores on the outcome metrics, making it challenging to purely evaluate the effect of the different control strategies on functional recovery among different studies.

Another relevant limitation is that our comparison was limited to individuals with stroke. We were not able to evaluate control strategies of studies that involved patients with CP or traumatic brain injury, due to the lack of studies with exoskeletons using different control strategies and the heterogeneity of the level of the impairment of the participants. For the case of CP, in the studies that reported the main outcomes of interest, participants were pooled together, independently of their GMFCS level [ 268 , 269 , 270 , 271 ]. Only in a few studies that used the Lokomat [ 272 , 273 , 274 , 275 , 276 ] and CPWalker [ 277 ], authors analyzed the outcomes of interest selected in “ Outcomes of interest for the clinical comparison ” section and differentiated between the GMFCS levels. However, those studies implemented the same family of control strategies, namely assistive control strategies without human–robot synchronization algorithms that combined trajectory-tracking and compliant control, and thus, no comparison between controllers was possible.

While the level of evidence of the studies included in the clinical comparisons is high, the number of studies for each family of control strategies is still low. The reduced number of studies might be a consequence of the regulatory framework for medical devices, which limits the opportunity of validating the technology at early stages of development (see Fig.  8 for the geographical location of the studies included in the clinical analysis). With current tight regulations, testing devices at a low Technology Readiness Level (TRL) is subject to the same requirements as those devices that are ready to be certified [ 111 , 179 , 278 ]. Furthermore, there is a lack of an ethical and regulatory framework that enables researchers to involve end-users in the co-creation and validation of early-stage prototypes to quickly make technology accessible to the users, while guaranteeing the well-being of patients and therapists.

Fig. 8 Location of studies included in the clinical comparison. A Number of the studies per location for participants with stroke. B Number of the studies per location for participants with cerebral palsy

This paper presents one of the first reviews that focus on how the different control algorithms used in lower limb exoskeletons affect the effectiveness of gait rehabilitation after brain injury. This literature survey is a first step towards determining the most effective control algorithms for each pathology and level of impairment. The main findings from this review are: (1) Assistive controllers that followed threshold-based algorithms relying on ground reaction force thresholds in conjunction with trajectory-tracking control were the most frequently implemented control strategies. Few devices implemented adaptive control strategies that modulated the control parameters based on the patients’ performance. (2) In line with other reviews on the clinical practice of robotic interventions, we found high variability in the experimental protocols and selected outcome metrics. (3) Assistive control strategies that combined trajectory-tracking and compliant control showed the highest clinical effectiveness for acute stroke. Assistive control strategies that followed a threshold-based algorithm with EMG as detection metric and control signal provided the highest improvements in the outcome measures of interest for subacute stroke. Assistive control strategies that followed threshold-based or adaptive oscillator algorithms together with trajectory-tracking control resulted in the highest improvements for individuals with chronic stroke. For the other brain injuries included in this review (i.e., cerebral palsy and traumatic brain injury), the lack of standardization in the clinical studies made it impossible to analyze the effect of the control strategies on the clinical outcomes of interest.

Although remarkable efforts have been made to develop novel, sophisticated motor-learning-driven controllers to enhance gait rehabilitation, the majority of the reviewed studies only provided a general overview of the effect of the robotic controller on individuals with brain injuries. Future research should move towards structured and standardized studies that aim to find the relation between control strategies and a core set of clinical outcome measures, while controlling for the effects of participants’ initial impairment level and training intensity. Current limitations might be overcome if clinicians, researchers, industry, and regulatory bodies work together to solve this urgent societal and scientific problem.

Availability of data and materials

All data generated or analysed during this study are included in this published article and its Additional files.

Cieza A, Causey K, Kamenov K, Hanson SW, Chatterji S, Vos T. Global estimates of the need for rehabilitation based on the global burden of disease study 2019: a systematic analysis for the global burden of disease study 2019. Lancet. 2020;396(10267):2006–17.

Johnson CO, Nguyen M, Roth GA, Nichols E, Alam T, Abate D, Abd-Allah F, Abdelalim A, Abraha HN, Abu-Rmeileh NM, et al. Global, regional, and national burden of stroke, 1990–2016: a systematic analysis for the global burden of disease study 2016. Lancet Neurol. 2019;18(5):439–58.

Johnson W, Onuma O, Owolabi M, Sachdev S. Stroke: a global response is needed. Bull World Health Organ. 2016;94(9):634.

McGuire DO, Tian LH, Yeargin-Allsopp M, Dowling NF, Christensen DL. Prevalence of cerebral palsy, intellectual disability, hearing loss, and blindness, national health interview survey, 2009–2016. Disabil Health J. 2019;12(3):443–51.

Sellier E, Platt MJ, Andersen GL, Krägeloh-Mann I, De La Cruz J, Cans C, Surveillance of Cerebral Palsy Network, Van Bakel M, Arnaud C, Delobel M, et al. Decreasing prevalence in cerebral palsy: a multi-site European population-based study, 1980 to 2003. Dev Med Child Neurol. 2016;58(1):85–92.

Dewan MC, Rattani A, Gupta S, Baticulon RE, Hung Y-C, Punchak M, Agrawal A, Adeleye AO, Shrime MG, Rubiano AM, et al. Estimating the global incidence of traumatic brain injury. J Neurosurg. 2018;130(4):1080–97.

Crichton SL, Bray BD, McKevitt C, Rudd AG, Wolfe CD. Patient outcomes up to 15 years after stroke: survival, disability, quality of life, cognition and mental health. J Neurol Neurosurg Psychiatry. 2016;87(10):1091–8.

Kelly-Hayes M, Beiser A, Kase CS, Scaramucci A, D’Agostino RB, Wolf PA. The influence of gender and age on disability following ischemic stroke: the Framingham study. J Stroke Cerebrovasc Dis. 2003;12(3):119–26.

Jørgensen HS, Nakayama H, Raaschou HO, Olsen TS. Recovery of walking function in stroke patients: the Copenhagen stroke study. Arch Phys Med Rehabil. 1995;76(1):27–32.

Winstein CJ, Stein J, Arena R, Bates B, Cherney LR, Cramer SC, Deruyter F, Eng JJ, Fisher B, Harvey RL, et al. Guidelines for adult stroke rehabilitation and recovery: a guideline for healthcare professionals from the American heart association/American stroke association. Stroke. 2016;47(6):98–169.

Teasell R, Viana R. Evidence-based benefit of rehabilitation after stroke. In: Textbook of neural repair and rehabilitation. Cambridge: Cambridge University Press; 2014. p. 601–14.

Roelker SA, Bowden MG, Kautz SA, Neptune RR. Paretic propulsion as a measure of walking performance and functional motor recovery post-stroke: a review. Gait Posture. 2018.

Lin P-Y, Yang Y-R, Cheng S-J, Wang R-Y. The relation between ankle impairments and gait velocity and symmetry in people with stroke. Arch Phys Med Rehabil. 2006;87(4):562–8.

Murray SA, Ha KH, Hartigan C, Goldfarb M. An assistive control approach for a lower-limb exoskeleton to facilitate recovery of walking following stroke. IEEE Trans Neural Syst Rehabil Eng. 2014;23(3):441–9.

Wiszomirska I, Błażkiewicz M, Kaczmarczyk K, Brzuszkiewicz-Kuźmicka G, Wit A. Effect of drop foot on spatiotemporal, kinematic, and kinetic parameters during gait. Appl Bionics Biomech. 2017.

Woolley SM. Characteristics of gait in hemiplegia. Top Stroke Rehabil. 2001;7(4):1–18.

Bernhardt J, Hayward KS, Kwakkel G, Ward NS, Wolf SL, Borschmann K, Krakauer JW, Boyd LA, Carmichael ST, Corbett D, et al. Agreed definitions and a shared vision for new standards in stroke recovery research: the stroke recovery and rehabilitation roundtable taskforce. Int J Stroke. 2017;12(5):444–50.

Hebert D, Lindsay MP, McIntyre A, Kirton A, Rumney PG, Bagg S, Bayley M, Dowlatshahi D, Dukelow S, Garnhum M, et al. Canadian stroke best practice recommendations: stroke rehabilitation practice guidelines, update 2015. Int J Stroke. 2016;11(4):459–84.

Teasell R, Hussein N. Chapter 4. Motor rehabilitation: lower extremity and mobility. In: Stroke rehabilitation clinician handbook; 2016.

Teasell R, Hussein N. Chapter 2. Brain reorganization, recovery, and organized care. In: Stroke rehabilitation clinician handbook; 2016.

Schröder J, Truijen S, Van Criekinge T, Saeys W. Feasibility and effectiveness of repetitive gait training early after stroke: a systematic review and meta-analysis. J Rehabil Med. 2019;51(2):78–88.

Kwah L, Kwakkel G, Veerbeek J. Prediction of motor recovery and outcomes after stroke. In: Stroke rehabilitation. Amsterdam: Elsevier; 2018. p. 23–47.

Langhorne P, Bernhardt J, Kwakkel G. Stroke rehabilitation. Lancet. 2011;377(9778):1693–702.

Koenig A, Omlin X, Bergmann J, Zimmerli L, Bolliger M, Müller F, Riener R. Controlling patient participation during robot-assisted gait training. J Neuroeng Rehabil. 2011;8(1):1–12.

Kim B, Deshpande AD. An upper-body rehabilitation exoskeleton harmony with an anatomical shoulder mechanism: design, modeling, control, and performance evaluation. Int J Robot Res. 2017;36(4):414–35.

Fisher BE, Sullivan KJ. Activity-dependent factors affecting poststroke functional outcomes. Top Stroke Rehabil. 2001;8(3):31–44.

Krakauer JW. Motor learning: its relevance to stroke recovery and neurorehabilitation. Curr Opin Neurol. 2006;19(1):84–90.

Winter DA. The biomechanics and motor control of human gait. Waterloo: University of Waterloo Press; 1987.

Mehrholz J, Thomas S, Kugler J, Pohl M, Elsner B. Electromechanical-assisted training for walking after stroke. Cochrane Database Syst Rev. 2020.

Goffredo M, Iacovelli C, Russo E, Pournajaf S, Di Blasi C, Galafate D, Pellicciari L, Agosti M, Filoni S, Aprile I, et al. Stroke gait rehabilitation: a comparison of end-effector, overground exoskeleton, and conventional gait training. Appl Sci. 2019;9(13):2627.

Tedla JS, Dixit S, Gular K, Abohashrh M. Robotic-assisted gait training effect on function and gait speed in subacute and chronic stroke population: a systematic review and meta-analysis of randomized controlled trials. Eur Neurol. 2019;81:1–9.

Moucheboeuf G, Griffier R, Gasq D, Glize B, Bouyer L, Dehail P, Cassoudesalle H. Effects of robotic gait training after stroke: a meta-analysis. Ann Phys Rehabil Med. 2020;63(6):518–34.

Sczesny-Kaiser M, Trost R, Aach M, Schildhauer TA, Schwenkreis P, Tegenthoff M. A randomized and controlled crossover study investigating the improvement of walking and posture functions in chronic stroke patients using HAL exoskeleton-the HALESTRO study (HAL-exoskeleton stroke study). Front Neurosci. 2019;13:259.

Roth EJ, Merbitz C, Mroczek K, Dugan SA, Suh WW. Hemiplegic gait: relationships between walking speed and other temporal parameters. Am J Phys Med Rehabil. 1997;76(2):128–33.

Trushkova N, Cochran O, Ermolina N, Zelano G. Is training with a focus on motor learning effective in improving body coordination in chronic post stroke patients? J Neurol Sci. 2021;429:118583.

Marchal-Crespo L, Riener R. Robot-assisted gait training. Amsterdam: Elsevier; 2018. p. 227–40.

Marks D, Schweinfurther R, Dewor A, Huster T, Paredes LP, Zutter D, Möller JC. The Andago for overground gait training in patients with gait disorders after stroke-results from a usability study. Physiother Res Rep. 2019;2:1–8.

Zhang X, Yue Z, Wang J. Robotics in lower-limb rehabilitation after stroke. Behav Neurol. 2017.

Chrif F, Nef T, Lungarella M, Dravid R, Hunt KJ. Control design for a lower-limb paediatric therapy device using linear motor technology. Biomed Signal Process Control. 2017;38:119–27.

Rodríguez-Fernández A, Lobo-Prat J, Font-Llagunes JM. Systematic review on wearable lower-limb exoskeletons for gait training in neuromuscular impairments. J Neuroeng Rehabil. 2021;18(1):1–21.

Baud R, Manzoori A, Ijspeert AJ, Bouri M. Review of control strategies for lower-limb exoskeletons to assist gait. J Neuroeng Rehabil. 2021.

Li W-Z, Cao G-Z, Zhu A-B. Review on control strategies for lower limb rehabilitation exoskeletons. IEEE Access. 2021;9:123040–60.

Young AJ, Ferris DP. State of the art and future directions for lower limb robotic exoskeletons. IEEE Trans Neural Syst Rehabil Eng. 2016;25(2):171–82.

Meng W, Liu Q, Zhou Z, Ai Q, Sheng B, Xie SS. Recent development of mechanisms and control strategies for robot-assisted lower limb rehabilitation. Mechatronics. 2015;31:132–45.

Shi D, Zhang W, Zhang W, Ding X. A review on lower limb rehabilitation exoskeleton robots. Chin J Mech Eng. 2019;32(1):1–11.

Contreras-Vidal JL, Bhagat NA, Brantley J, Cruz-Garza JG, He Y, Manley Q, Nakagome S, Nathan K, Tan SH, Zhu F, et al. Powered exoskeletons for bipedal locomotion after spinal cord injury. J Neural Eng. 2016;13(3):031001.

Huo W, Mohammed S, Moreno JC, Amirat Y. Lower limb wearable robots for assistance and rehabilitation: a state of the art. IEEE Syst J. 2014;10(3):1068–81.

Esquenazi A, Talaty M. Robotics for lower limb rehabilitation. Phys Med Rehabil Clin. 2019;30(2):385–97.

Chen B, Ma H, Qin L-Y, Gao F, Chan K-M, Law S-W, Qin L, Liao W-H. Recent developments and challenges of lower extremity exoskeletons. J Orthop Transl. 2016;5:26–37.

del Carmen Sanchez-Villamañan M, Gonzalez-Vargas J, Torricelli D, Moreno JC, Pons JL. Compliant lower limb exoskeletons: a comprehensive review on mechanical design principles. J Neuroeng Rehabil. 2019;16(1):55.

Morone G, Paolucci S, Cherubini A, De Angelis D, Venturiero V, Coiro P, Iosa M. Robot-assisted gait training for stroke patients: current state of the art and perspectives of robotics. Neuropsychiatr Dis Treat. 2017;13:1303.

Weber LM, Stein J. The use of robots in stroke rehabilitation: a narrative review. NeuroRehabilitation. 2018;43(1):99–110.

Louie DR, Eng JJ. Powered robotic exoskeletons in post-stroke rehabilitation of gait: a scoping review. J Neuroeng Rehabil. 2016;13(1):53.

Marchal-Crespo L, Reinkensmeyer DJ. Review of control strategies for robotic movement training after neurologic injury. J Neuroeng Rehabil. 2009;6(1):20.

Tucker MR, Olivier J, Pagel A, Bleuler H, Bouri M, Lambercy O, del R Millán J, Riener R, Vallery H, Gassert R. Control strategies for active lower extremity prosthetics and orthotics: a review. J Neuroeng Rehabil. 2015;12(1):1.

Yan T, Cempini M, Oddo CM, Vitiello N. Review of assistive strategies in powered lower-limb orthoses and exoskeletons. Robot Auton Syst. 2015;64:120–36.

Chen B, Zi B, Qin L, Pan Q. State-of-the-art research in robotic hip exoskeletons: a general review. J Orthop Transl. 2019;20:4–13.

Li M, Xu G, Xie J, Chen C. A review: motor rehabilitation after stroke with control based on human intent. Proc Inst Mech Eng H J Eng Med. 2018;232(4):344–60.

Shi B, Chen X, Yue Z, Yin S, Weng Q, Zhang X, Wang J, Wen W. Wearable ankle robots in post-stroke rehabilitation of gait: a systematic review. Front Neurorobot. 2019;13:63.

Hobbs B, Artemiadis P. A review of robot-assisted lower-limb stroke therapy: unexplored paths and future directions in gait rehabilitation. Front Neurorobot. 2020;14:19.

Xiloyannis M, Alicea R, Georgarakis A-M, Haufe FL, Wolf P, Masia L, Riener R. Soft robotic suits: state of the art, core technologies, and open challenges. IEEE Trans Robot. 2021;38(3):1343–62.

Madhav MS, Cowan NJ. The synergy between neuroscience and control theory: the nervous system as inspiration for hard control challenges. Annu Rev Control Robot Auton Syst. 2020;3:243–67.

Palisano RJ, Rosenbaum P, Bartlett D, Livingston MH. Content validity of the expanded and revised gross motor function classification system. Dev Med Child Neurol. 2008;50(10):744–50.

Nilsson A, Vreede KS, Häglund V, Kawamoto H, Sankai Y, Borg J. Gait training early after stroke with a new exoskeleton-the hybrid assistive limb: a study of safety and feasibility. J Neuroeng Rehabil. 2014;11(1):1–11.

Van Nunen MPM, Gerrits KHL, Konijnenbelt M, Janssen TWJ, De Haan A. Recovery of walking ability using a robotic device in subacute stroke patients: a randomized controlled study. Disabil Rehabil Assist Technol. 2015;10(2):141–8.

Wall A, Borg J, Vreede K, Palmcrantz S. A randomized controlled study incorporating an electromechanical gait machine, the hybrid assistive limb, in gait training of patients with severe limitations in walking in the subacute phase after stroke. PLoS ONE. 2020;15(2):0229707.

Leon D, Cortes M, Elder J, Kumru H, Laxe S, Edwards DJ, Tormos JM, Bernabeu M, Pascual-Leone A. TDCS does not enhance the effects of robot-assisted gait training in patients with subacute stroke. Restor Neurol Neurosci. 2017;35(4):377–84.

Husemann B, Müller F, Krewer C, Heller S, Koenig E. Effects of locomotion training with assistance of a robot-driven gait orthosis in hemiparetic patients after stroke: a randomized controlled pilot study. Stroke. 2007;38(2):349–54.

Molteni F, Gasperini G, Gaffuri M, Colombo M, Giovanzana C, Lorenzon C, Farina N, Cannaviello G, Scarano S, Proserpio D, Liberali D, Guanziroli E. Wearable robotic exoskeleton for overground gait training in sub-acute and chronic hemiparetic stroke patients: preliminary results. Eur J Phys Rehabil Med. 2017;53(5):676–84.

Haynes RB, Sackett DL, Richardson WS, Rosenberg W, Langley GR. Evidence-based medicine: how to practice & teach EBM. Can Med Assoc J. 1997;157(6):788.

Guyatt GH, Rennie D. Users’ guides to the medical literature. JAMA. 1993;270(17):2096–7.

Sullivan JE, Crowner BE, Kluding PM, Nichols D, Rose DK, Yoshida R, Pinto Zipp G. Outcome measures for individuals with stroke: process and recommendations from the American physical therapy association neurology section task force. Phys Ther. 2013;93(10):1383–96.

Bushnell C, Bettger JP, Cockroft KM, Cramer SC, Edelen MO, Hanley D, Katzan IL, Mattke S, Nilsen DM, Piquado T, et al. Chronic stroke outcome measures for motor function intervention trials: expert panel recommendations. Circ Cardiovasc Qual Outcomes. 2015;8(6–suppl–3):163–9.

Oeffinger D, Bagley A, Rogers S, Gorton G, Kryscio R, Abel M, Damiano D, Barnes D, Tylkowski C. Outcome tools used for ambulatory children with cerebral palsy: responsiveness and minimum clinically important differences. Dev Med Child Neurol. 2008;50(12):918–25.

Debuse D, Brace H. Outcome measures of activity for children with cerebral palsy: a systematic review. Pediatr Phys Ther. 2011;23(3):221–31.

Knox V, Vuoskoski P, Mandy A. Use of outcome measures in children with severe cerebral palsy: a survey of UK physiotherapists. Physiother Res Int. 2019;24(4):1786.

Ferre-Fernández M, Murcia-González MA, Espinosa MDB, Ríos-Díaz J. Measures of motor and functional skills for children with cerebral palsy: a systematic review. Pediatr Phys Ther. 2020;32(1):12–25.

Vargus-Adams JN. Outcome assessment and function in cerebral palsy. Phys Med Rehabil Clin N Am. 2019;31(1):131–41.

Proietti T, Crocher V, Roby-Brami A, Jarrasse N. Upper-limb robotic exoskeletons for neurorehabilitation: a review on control strategies. IEEE Rev Biomed Eng. 2016;9:4–14.

Basteris A, Nijenhuis SM, Stienen AH, Buurke JH, Prange GB, Amirabdollahian F. Training modalities in robot-mediated upper limb rehabilitation in stroke: a framework for classification based on a systematic review. J Neuroeng Rehabil. 2014;11(1):1–15.

Basalp E, Wolf P, Marchal-Crespo L. Haptic training: which types facilitate (re)learning of which motor task and for whom? Answers by a review. IEEE Trans Haptics. 2021;14(4):722–39.

Shepherd MK, Rouse EJ. Design and validation of a torque-controllable knee exoskeleton for sit-to-stand assistance. IEEE/ASME Trans Mechatron. 2017;22(4):1695–704.

Lerner ZF, Damiano DL, Bulea TC. A robotic exoskeleton to treat crouch gait from cerebral palsy: initial kinematic and neuromuscular evaluation. In: 2016 38th annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE; 2016. p. 2214–7.

Thalman CM, Hertzell T, Lee H. Toward a soft robotic ankle-foot orthosis (sr-afo) exosuit for human locomotion: preliminary results in late stance plantarflexion assistance. In: 2020 3rd IEEE international conference on soft robotics (RoboSoft). IEEE; 2020. p. 801–7.

Rossini PM, Dal Forno G. Integrated technology for evaluation of brain function and neural plasticity. Phys Med Rehabil Clin. 2004;15(1):263–306.

Crespo LM, Reinkensmeyer DJ. Effect of robotic guidance on motor learning of a timing task. In: 2008 2nd IEEE RAS & EMBS international conference on biomedical robotics and biomechatronics. IEEE; 2008. p. 199–204.

Harkema SJ. Neural plasticity after human spinal cord injury: application of locomotor training to the rehabilitation of walking. Neuroscientist. 2001;7(5):455–68.

Hesse S, Kuhlmann H, Wilk J, Tomelleri C, Kirker SG. A new electromechanical trainer for sensorimotor rehabilitation of paralysed fingers: a case series in chronic and acute stroke patients. J Neuroeng Rehabil. 2008;5(1):1–6.

Reinkensmeyer DJ, Kahn LE, Averbuch M, McKenna-Cole A, Schmit BD, Rymer WZ. Understanding and treating arm movement impairment after chronic brain injury: progress with the arm guide. J Rehabil Res Dev. 2014;37(6):653–62.

Conner BC, Luque J, Lerner ZF. Adaptive ankle resistance from a wearable robotic device to improve muscle recruitment in cerebral palsy. Ann Biomed Eng. 2020;48:1–13.

Wei Y, Patton J, Bajaj P, Scheidt R. A real-time haptic/graphic demonstration of how error augmentation can enhance learning. In: Proceedings of the 2005 IEEE international conference on robotics and automation. IEEE; 2005. p. 4406–11.

Blanchette AK, Noël M, Richards CL, Nadeau S, Bouyer LJ. Modifications in ankle dorsiflexor activation by applying a torque perturbation during walking in persons post-stroke: a case series. J Neuroeng Rehabil. 2014;11(1):1–11.

Veldema J, Jansen P. Resistance training in stroke rehabilitation: systematic review and meta-analysis. Clin Rehabil. 2020;34(9):1173–97.

Ouellette MM, LeBrasseur NK, Bean JF, Phillips E, Stein J, Frontera WR, Fielding RA. High-intensity resistance training improves muscle strength, self-reported function, and disability in long-term stroke survivors. Stroke. 2004;35(6):1404–9.

Lamberti N, Straudi S, Malagoni AM, Argirò M, Felisatti M, Nardini E, Zambon C, Basaglia N, Manfredini F. Effects of low-intensity endurance and resistance training on mobility in chronic stroke survivors: a pilot randomized controlled study. Eur J Phys Rehabil Med. 2016;53(2):228–39.

Li Y, Lamontagne A, et al. The effects of error-augmentation versus error-reduction paradigms in robotic therapy to enhance upper extremity performance and recovery post-stroke: a systematic review. J Neuroeng Rehabil. 2018;15(1):1–25.

Fricke SS, Smits HJ, Bayón C, Buurke JH, van der Kooij H, van Asseldonk EH. Effects of selectively assisting impaired subtasks of walking in chronic stroke survivors. J Neuroeng Rehabil. 2020;17(1):1–13.

Fricke SS, Bayón C, Van Der Kooij H, Van Asseldonk EHF. Automatic versus manual tuning of robot-assisted gait training in people with neurological disorders. J Neuroeng Rehabil. 2020;17(1):1–15.

Gandolla M, Guanziroli E, D’Angelo A, Cannaviello G, Molteni F, Pedrocchi A. Automatic setting procedure for exoskeleton-assisted overground gait: proof of concept on stroke population. Front Neurorobot. 2018;12(MAR):1–11.

Orekhov G, Fang Y, Cuddeback CF, Lerner ZF. Usability and performance validation of an ultra-lightweight and versatile untethered robotic ankle exoskeleton. J Neuroeng Rehabil. 2021;18(1):1–16.

de Miguel-Fernández J, Pescatore C, Mesa-Garrido A, Rikhof C, Prinsen E, Font-Llagunes JM, Lobo-Prat J. Immediate biomechanical effects of providing adaptive assistance with an ankle exoskeleton in individuals after stroke. IEEE Robot Autom Lett. 2022;7(3):7574–80.

Atashzar SF, Shahbazi M, Patel RV. Haptics-enabled interactive neurorehabilitation mechatronics: classification, functionality, challenges and ongoing research. Mechatronics. 2019;57:1–19.

Siviy C, Bae J, Baker L, Porciuncula F, Baker T, Ellis TD, Awad LN, Walsh CJ. Offline assistance optimization of a soft exosuit for augmenting ankle power of stroke survivors during walking. IEEE Robot Autom Lett. 2020;5(2):828–35.

Poggensee KL, Collins SH. How adaptation, training, and customization contribute to benefits from exoskeleton assistance. bioRxiv. 2021.

Hassan M, Kadone H, Ueno T, Hada Y, Sankai Y, Suzuki K. Feasibility of synergy-based exoskeleton robot control in hemiplegia. IEEE Trans Neural Syst Rehabil Eng. 2018;26(6):1233–42.

Zhu H, Nesler C, Divekar N, Peddinti V, Gregg R. Design principles for compact, backdrivable actuation in partial-assist powered knee orthoses. IEEE/ASME Trans Mechatron. 2021;26(6):3104–15.

Puyuelo-Quintana G, Cano-de-la-Cuerda R, Plaza-Flores A, Garces-Castellote E, Sanz-Merodio D, Goni-Arana A, Marin-Ojea J, Garcia-Armada E. A new lower limb portable exoskeleton for gait assistance in neurological patients: a proof of concept study. J Neuroeng Rehabil. 2020;17(1):1–16.

Kawamoto H, Taal S, Niniss H, Hayashi T, Kamibayashi K, Eguchi K, Sankai Y. Voluntary motion support control of robot suit HAL triggered by bioelectrical signal for hemiplegia. In: Conference proceedings: 2010 annual international conference of the IEEE engineering in medicine and biology society. 2010. p. 462–6.

Gui K, Liu H, Zhang D. Toward multimodal human–robot interaction to enhance active participation of users in gait rehabilitation. IEEE Trans Neural Syst Rehabil Eng. 2017;25(11):2054–66.

Grimmer M, Schmidt K, Duarte JE, Neuner L, Koginov G, Riener R. Stance and swing detection based on the angular velocity of lower limb segments during walking. Front Neurorobot. 2019;13:57.

He Y, Eguren D, Luu TP, Contreras-Vidal JL. Risk management and regulations for lower limb medical exoskeletons: a review. Med Devices. 2017;10:89.

Alaoui OM, Expert F, Morel G, Jarrassé N. Using generic upper-body movement strategies in a free walking setting to detect gait initiation intention in a lower-limb exoskeleton. IEEE Trans Med Robot Bionics. 2020;2(2):236–47.

Chen G, Qi P, Guo Z, Yu H. Gait-event-based synchronization method for gait rehabilitation robots via a bioinspired adaptive oscillator. IEEE Trans Biomed Eng. 2016;64(6):1345–56.

Miyake T, Kobayashi Y, Fujie MG, Sugano S. Timing of intermittent torque control with wire-driven gait training robot lifting toe trajectory for trip avoidance. In: 2017 international conference on rehabilitation robotics (ICORR). IEEE; 2017. p. 320–5.

Nomura S, Takahashi Y, Sahashi K, Murai S, Kawai M, Taniai Y, Naniwa T. Power assist control based on human motion estimation using motion sensors for powered exoskeleton without binding legs. Appl Sci. 2019;9(1):164.

Gurriet T, Tucker M, Duburcq A, Boeris G, Ames AD. Towards variable assistance for lower body exoskeletons. IEEE Robot Autom Lett. 2019;5(1):266–73.

Laschowski B, McNally W, Wong A, McPhee J. Environment classification for robotic leg prostheses and exoskeletons using deep convolutional neural networks. bioRxiv. 2021.

Aguirre-Ollinger G, Narayan A, Yu H. Phase-synchronized assistive torque control for the correction of kinematic anomalies in the gait cycle. IEEE Trans Neural Syst Rehabil Eng. 2019;27(11):2305–14.

Marder E, Bucher D. Central pattern generators and the control of rhythmic movements. Curr Biol. 2001;11(23):986–96.

Aguirre-Ollinger G. Exoskeleton control for lower-extremity assistance based on adaptive frequency oscillators: adaptation of muscle activation and movement frequency. Proc Inst Mech Eng H J Eng Med. 2015;229(1):52–68.

De La Fuente J, Subramanian SC, Sugar TG, Redkar S. A robust phase oscillator design for wearable robotic systems. Robot Auton Syst. 2020;128:103514.

Ronsse R, Lenzi T, Vitiello N, Koopman B, Van Asseldonk E, De Rossi SMM, Van Den Kieboom J, Van Der Kooij H, Carrozza MC, Ijspeert AJ. Oscillator-based assistance of cyclical movements: model-based and model-free approaches. Med Biol Eng Comput. 2011;49(10):1173.

Huang C, Li Y, Yao X. A survey of automatic parameter tuning methods for metaheuristics. IEEE Trans Evol Comput. 2019;24(2):201–16.

Schicketmueller A, Rose G, Hofmann M. Feasibility of a sensor-based gait event detection algorithm for triggering functional electrical stimulation during robot-assisted gait training. Sensors. 2019;19(21):4804.

Seel T, Landgraf L, Schauer T. Online gait phase detection with automatic adaption to gait velocity changes using accelerometers and gyroscopes. Biomed Tech. 2014;59:795–8.

Muller P, Seel T, Schauer T. Experimental evaluation of a novel inertial sensor based realtime gait phase detection algorithm. In: Proceedings of the technically assisted rehabilitation conference. 2015.

Franks PW, Bryan GM, Martin RM, Reyes R, Collins SH. Comparing optimized exoskeleton assistance of the hip, knee, and ankle in single and multi-joint configurations. bioRxiv. 2021.

Lora-Millan JS, Sanchez-Cuesta FJ, Romero JP, Moreno JC, Rocon E. A unilateral robotic knee exoskeleton to assess the role of natural gait assistance in hemiparetic patients. J Neuroeng Rehabil. 2021;19(1):109.

Emken JL, Harkema SJ, Beres-Jones JA, Ferreira CK, Reinkensmeyer DJ. Feasibility of manual teach-and-replay and continuous impedance shaping for robotic locomotor training following spinal cord injury. IEEE Trans Biomed Eng. 2007;55(1):322–34.

Manchola MDS, Mayag LJA, Munera M, García CAC. Impedance-based backdrivability recovery of a lower-limb exoskeleton for knee rehabilitation. In: 2019 IEEE 4th Colombian conference on automatic control (CCAC). IEEE; 2019. p. 1–6.

Gordleeva SY, Lobov SA, Grigorev NA, Savosenkov AO, Shamshin MO, Lukoyanov MV, Khoruzhko MA, Kazantsev VB. Real-time EEG–EMG human–machine interface-based control system for a lower-limb exoskeleton. IEEE Access. 2020;8:84070–81.

Gordon KE, Ferris DP. Learning to walk with a robotic ankle exoskeleton. J Biomech. 2007;40(12):2636–44.

McCain EM, Dick TJM, Giest TN, Nuckols RW, Lewek MD, Saul KR, Sawicki GS. Mechanics and energetics of post-stroke walking aided by a powered ankle exoskeleton with speed-adaptive myoelectric control. J Neuroeng Rehabil. 2019;16:1–12.

Tan CK, Kadone H, Watanabe H, Marushima A, Yamazaki M, Sankai Y, Suzuki K. Lateral symmetry of synergies in lower limb muscles of acute post-stroke patients after robotic intervention. Front Neurosci. 2018;12:276.

Benabid AL, Costecalde T, Eliseyev A, Charvet G, Verney A, Karakas S, Foerster M, Lambert A, Morinière B, Abroug N, et al. An exoskeleton controlled by an epidural wireless brain-machine interface in a tetraplegic patient: a proof-of-concept demonstration. Lancet Neurol. 2019;18(12):1112–22.

Xu R, Jiang N, Mrachacz-Kersting N, Lin C, Prieto GA, Moreno JC, Pons JL, Dremstrup K, Farina D. A closed-loop brain–computer interface triggering an active ankle-foot orthosis for inducing cortical neural plasticity. IEEE Trans Biomed Eng. 2014;61(7):2092–101.

Calanca A, Muradore R, Fiorini P. A review of algorithms for compliant control of stiff and fixed-compliance robots. IEEE/ASME Trans Mechatron. 2015;21(2):613–24.

Schumacher M, Wojtusch J, Beckerle P, von Stryk O. An introductory review of active compliant control. Robot Auton Syst. 2019;119:185–200.

Nagarajan U, Aguirre-Ollinger G, Goswami A. Integral admittance shaping: a unified framework for active exoskeleton control. Robot Auton Syst. 2016;75:310–24.

Liang W. Mechanical design and control strategy for hip joint power assisting. J Healthc Eng. 2018.

Aguirre-Ollinger G, Colgate JE, Peshkin MA, Goswami A. Active-impedance control of a lower-limb assistive exoskeleton. In: 2007 IEEE 10th international conference on rehabilitation robotics. IEEE; 2007. p. 188–95.

Martinez A, Lawson B, Goldfarb M. A velocity-based flow field control approach for reshaping movement of stroke-impaired individuals with a lower-limb exoskeleton. In: Conference proceedings: 2018 annual international conference of the IEEE engineering in medicine and biology society. 2018. p. 2797–800.

Lotti N, Xiloyannis M, Durandau G, Galofaro E, Sanguineti V, Masia L, Sartori M. Adaptive model-based myoelectric control for a soft wearable arm exosuit: a new generation of wearable robot control. IEEE Robot Autom Mag. 2020;27(1):43–53.

Adams RJ, Hannaford B. Stable haptic interaction with virtual environments. IEEE Trans Robot Autom. 1999;15(3):465–74.

Hogan N. Impedance control: an approach to manipulation: Part I—theory. 1985.

Cramer SC, Sur M, Dobkin BH, O’brien C, Sanger TD, Trojanowski JQ, Rumsey JM, Hicks R, Cameron J, Chen D, et al. Harnessing neuroplasticity for clinical applications. Brain. 2011;134(6):1591–609.

Escalona MJ, Bourbonnais D, Goyette M, Duclos C, Gagnon DH. Wearable exoskeleton control modes selected during overground walking affect muscle synergies in adults with a chronic incomplete spinal cord injury. Spinal Cord Ser Cases. 2020;6(1):1–9.

Oyake K, Suzuki M, Otaka Y, Tanaka S. Motivational strategies for stroke rehabilitation: a descriptive cross-sectional study. Front Neurol. 2020;11:553.

Schmidt RA, Young DE, Swinnen S, Shapiro DC. Summary knowledge of results for skill acquisition: support for the guidance hypothesis. J Exp Psychol Learn Mem Cognit. 1989;15(2):352.

Lv G, Zhu H, Gregg RD. On the design and control of highly backdrivable lower-limb exoskeletons: a discussion of past and ongoing work. IEEE Control Syst Mag. 2018;38(6):88–113. .

Lotze M, Braun C, Birbaumer N, Anders S, Cohen LG. Motor learning elicited by voluntary drive. Brain. 2003;126(4):866–72.

Conner BC, Luque J, Lerner ZF. Adaptive ankle resistance from a wearable robotic device to improve muscle recruitment in cerebral palsy. Ann Biomed Eng. 2020;48(4):1309–21. .

Yen S-C, Schmit BD, Wu M. Using swing resistance and assistance to improve gait symmetry in individuals post-stroke. Hum Mov Sci. 2015;42:212–24. .

Asin-Prieto G, Martinez-Exposito A, Barroso FO, Urendes EJ, Gonzalez-Vargas J, Alnajjar FS, Gonzalez-Alted C, Shimoda S, Pons JL, Moreno JC. Haptic adaptive feedback to promote motor learning with a robotic ankle exoskeleton integrated with a video game. Front Bioeng Biotechnol. 2020. .

Kao PC, Srivastava S, Higginson JS, Agrawal SK, Scholz JP. Short-term performance-based error-augmentation versus error-reduction robotic gait training for individuals with chronic stroke: a pilot study. Phys Med Rehabil Int. 2015;2(9):1066.

PubMed   PubMed Central   Google Scholar  

Koopman B, Van Asseldonk EHF, Van Der Kooij H. Selective control of gait subtasks in robotic gait training: foot clearance support in stroke survivors with a powered exoskeleton. J Neuroeng Rehabil. 2013;10(1):1–21. .

Blaya JA, Herr H. Adaptive control of a variable-impedance ankle-foot orthosis to assist drop-foot gait. IEEE Trans Neural Syst Rehabil Eng. 2004;12(1):24–31. .

Arnez-Paniagua V, Rifaï H, Amirat Y, Ghedira M, Gracies JM, Mohammed S. Adaptive control of an actuated ankle foot orthosis for paretic patients. Control Eng Pract. 2019;90:207–20. .

Cecilia Villa-Parra A, Lima J, Delisle-Rodriguez D, Vargas-Valencia L, Frizera-Neto A, Bastos T. Assessment of an assistive control approach applied in an active knee orthosis plus walker for post-stroke gait rehabilitation. Sensors. 2020;20(9):2452. .

Yeung LF, Ockenfeld C, Pang MK, Wai HW, Soo OY, Li SW, Tong KY. Randomized controlled trial of robot-assisted gait training with dorsiflexion assistance on chronic stroke patients wearing ankle-foot-orthosis. J Neuroeng Rehabil. 2018;15(1):1–12. .

Mizukami N, Takeuchi S, Tetsuya M, Tsukahara A, Yoshida K, Matsushima A, Maruyama Y, Tako K, Hashimoto M. Effect of the synchronization-based control of a wearable robot having a non-exoskeletal structure on the hemiplegic gait of stroke patients. IEEE Trans Neural Syst Rehabil Eng. 2018;26(5):1011–6. .

Orekhov G, Fang Y, Luque J, Lerner ZF. Ankle exoskeleton assistance can improve over-ground walking economy in individuals with cerebral palsy. IEEE Trans Neural Syst Rehabil Eng. 2020;28(2):461–7.

Sulzer JS, Roiz RA, Peshkin MA, Patton JL. A highly backdrivable, lightweight knee actuator for investigating gait in stroke. IEEE Trans Robot. 2009;25(3):539–48. .

Lerner ZF, Damiano DL, Bulea TC. A lower-extremity exoskeleton improves knee extension in children with crouch gait from cerebral palsy. Sci Transl Med. 2017;9(404):9145. .

Allen JL, Kautz SA, Neptune RR. Step length asymmetry is representative of compensatory mechanisms used in post-stroke hemiparetic walking. Gait Posture. 2011;33(4):538–43.

Kerrigan DC, Frates EP, Rogan S, Riley PO. Hip hiking and circumduction: quantitative definitions. Am J Phys Med Rehabil. 2000;79(3):247–52.

Lewek MD, Sawicki GS. Trailing limb angle is a surrogate for propulsive limb forces during walking post-stroke. Clin Biomech. 2019;67:115–8.

Buesing C, Fisch G, O’Donnell M, Shahidi I, Thomas L, Mummidisetty CK, Williams KJ, Takahashi H, Rymer WZ, Jayaraman A. Effects of a wearable exoskeleton stride management assist system (SMA®) on spatiotemporal gait characteristics in individuals after stroke: a randomized controlled trial. J Neuroeng Rehabil. 2015;12(1):1–14. .

Kawamoto H, Kadone H, Sakurai T, Sankai Y. Modification of hemiplegic compensatory gait pattern by symmetry-based motion controller of HAL. In: 2015 37th annual international conference of the IEEE engineering in medicine and biology society (EMBC). 2015. p. 4803–7.

Lee H-J, Lee S-H, Seo K, Lee M, Chang WH, Choi B-O, Ryu G-H, Kim Y-H. Training for walking efficiency with a wearable hip-assist robot in patients with stroke a pilot randomized controlled trial. Stroke. 2019;50(12):3545–52. .

Seo HG, Lee WH, Lee SH, Yi Y, Kim KD, Oh B-M. Robotic-assisted gait training combined with transcranial direct current stimulation in chronic stroke patients: a pilot double-blind, randomized controlled trial. Restor Neurol Neurosci. 2017;35(5):527–36. .

Jung C, Jung S, Chun MH, Lee JM, Park S, Kim S-J. Development of gait rehabilitation system capable of assisting pelvic movement of normal walking. Acta Med Okayama. 2018;72(4):407–17.

Duschau-Wicke A, Von Zitzewitz J, Caprez A, Lunenburger L, Riener R. Path control: a method for patient-cooperative robot-aided gait rehabilitation. IEEE Trans Neural Syst Rehabil Eng. 2009;18(1):38–48.

Hidayah R, Bishop L, Jin X, Chamarthy S, Stein J, Agrawal SK. Gait adaptation using a cable-driven active leg exoskeleton (C-ALEX) with post-stroke participants. IEEE Trans Neural Syst Rehabil Eng. 2020;28(9):1984–93. .

Bayón C, Lerma S, Ramírez O, Serrano JI, Del Castillo MD, Raya R, Belda-Lois JM, Martínez I, Rocon E. Locomotor training through a novel robotic platform for gait rehabilitation in pediatric population: short report. J Neuroeng Rehabil. 2016;13(1):1–6. .

Banala SK, Kim SH, Agrawal SK, Scholz JP. Robot assisted gait training with active leg exoskeleton (ALEX). In: Proceedings of the 2nd biennial IEEE/RAS-EMBS international conference on biomedical robotics and biomechatronics, BioRob 2008. 2008. p. 653–8. .

Wei D, Li Z, Wei Q, Su H, Song B, He W, Li J. Human-in-the-loop control strategy of unilateral exoskeleton robots for gait rehabilitation. IEEE Trans Cogn Dev Syst. 2019;13(1):57–66.

Kaku A, Parnandi A, Venkatesan A, Pandit N, Schambra H, Fernandez-Granda C. Towards data-driven stroke rehabilitation via wearable sensors and deep learning. arXiv preprint. 2020. arXiv:2004.08297 .

Rupal BS, Rafique S, Singla A, Singla E, Isaksson M, Virk GS. Lower-limb exoskeletons: research trends and regulatory guidelines in medical and non-medical applications. Int J Adv Robot Syst. 2017;14(6):1729881417743554.

Vu HTT, Dong D, Cao H-L, Verstraten T, Lefeber D, Vanderborght B, Geeroms J. A review of gait phase detection algorithms for lower limb prostheses. Sensors. 2020;20(14):3972.

Bhakta K, Camargo J, Donovan L, Herrin K, Young A. Machine learning model comparisons of user independent & dependent intent recognition systems for powered prostheses. IEEE Robot Autom Lett. 2020;5(4):5393–400.

Tura A, Raggi M, Rocchi L, Cutti AG, Chiari L. Gait symmetry and regularity in transfemoral amputees assessed by trunk accelerations. J Neuroeng Rehabil. 2010;7(1):1–10.

Highsmith MJ, Schulz BW, Hart-Hughes S, Latlief GA, Phillips SL. Differences in the spatiotemporal parameters of transtibial and transfemoral amputee gait. JPO. 2010;22(1):26–30.

Vanicek N, Strike S, McNaughton L, Polman R. Gait patterns in transtibial amputee fallers vs. non-fallers: biomechanical differences during level walking. Gait Posture. 2009;29(3):415–20.

Fluit R, Prinsen EC, Wang S, van der Kooij H. A comparison of control strategies in commercial and research knee prostheses. IEEE Trans Biomed Eng. 2019;67(1):277–90.

Tan X, Zhang B, Liu G, Zhao X, Zhao Y. Cadence-insensitive soft exoskeleton design with adaptive gait state detection and iterative force control. IEEE Trans Autom Sci Eng. 2021;19(3):2108–21.

Park JS, Lee CM, Koo S-M, Kim CH. Gait phase detection using force sensing resistors. IEEE Sens J. 2020;20(12):6516–23.

Kawamoto H, Hayashi T, Sakurai T, Eguchi K, Sankai Y. Development of single leg version of hal for hemiplegia. In: 2009 annual international conference of the IEEE engineering in medicine and biology society. IEEE; 2009. p. 5038–43.

Calanca A, Piazza S, Fiorini P. A motor learning oriented, compliant and mobile gait orthosis. Appl Bionics Biomech. 2012;9(1):15–27. .

Bortole M, Venkatakrishnan A, Zhu F, Moreno JC, Francisco GE, Pons JL, Contreras-Vidal JL. The H2 robotic exoskeleton for gait rehabilitation after stroke: Early findings from a clinical study Wearable robotics in clinical testing. J Neuroeng Rehabil. 2015;12(1):1–14. .

Kim SJ, Na Y, Lee DY, Chang H, Kim J. Pneumatic AFO powered by a miniature custom compressor for drop foot correction. IEEE Trans Neural Syst Rehabil Eng. 2020;28(8):1781–9. .

Nakagawa K, Tomoi M, Higashi K, Utsumi S, Kawano R, Tanaka E, Kurisu K, Yuge L. Short-term effect of a close-fitting type of walking assistive device on spinal cord reciprocal inhibition. J Clin Neurosci. 2020;77:142–7. .

Martínez A, Durrough C, Goldfarb M. A single-joint implementation of flow control: knee joint walking assistance for individuals with mobility impairment. IEEE Trans Neural Syst Rehabil Eng. 2020;28(4):934–42. .

Strausser KA. Development of a human machine interface for a wearable exoskeleton for users with spinal cord injury. Ph.D. thesis, UC Berkeley; 2011.

Ward J, Sugar T, Boehler A, Standeven J, Engsberg JR. Stroke survivors’ gait adaptations to a powered ankle-foot orthosis. Adv Robot. 2011;25(15):1879–901. .

Kwon J, Park J-H, Ku S, Jeong Y, Paik N-J, Park Y-L. A soft wearable robotic ankle-foot-orthosis for post-stroke patients. IEEE Robot Autom Lett. 2019;4(3):2547–52. .

Yeung L-F, Ockenfeld C, Pang M-K, Wai H-W, Soo O-Y, Li S-W, Tong K-Y. Design of an exoskeleton ankle robot for robot-assisted gait training of stroke patients. In: 2017 international conference on rehabilitation robotics (ICORR). IEEE; 2017. p. 211–5.

Kim JY, Hwang SJ, Kim YH. Development of an active ankle-foot orthosis for hemiplegic patients. In: i-CREATe 2007—proceedings of the 1st international convention on rehabilitation engineering and assistive technology in conjunction with 1st Tan Tock Seng hospital neurorehabilitation meeting. 2007. p. 110–3. .

Forrester LW, Roy A, Hafer-Macko C, Krebs HI, Macko RF. Task-specific ankle robotics gait training after stroke: a randomized pilot study. J Neuroeng Rehabil. 2016;13(1):51. .

Li Y, Hashimoto M. PVC gel soft actuator-based wearable assist wear for hip joint support during walking. Smart Mater Struct. 2017;26(12):125003. .

Swift TA, Strausser KA, Zoss AB, Kazerooni H. Control and experimental results for post stroke gait rehabilitation with a prototype mobile medical exoskeleton. In: Dynamic systems and control conference, vol. 44175. 2010. p. 405–11.

Patane F, Rossi S, Del Sette F, Taborri J, Cappa P. WAKE-up exoskeleton to assist children with cerebral palsy: design and preliminary evaluation in level walking. IEEE Trans Neural Syst Rehabil Eng. 2017;25(7):906–16. .

Graf ES, Bauer CM, Power V, de Eyto A, Bottenberg E, Poliero T, Sposito M, Scherly D, Henke R, Pauli C, et al. Basic functionality of a prototype wearable assistive soft exoskeleton for people with gait impairments: a case study. In: Proceedings of the 11th pervasive technologies related to assistive environments conference. 2018. p. 202–7.

Takahashi KZ, Lewek MD, Sawicki GS. A neuromechanics-based powered ankle exoskeleton to assist walking post-stroke: a feasibility study. J Neuroeng Rehabil. 2015;12(1):1–13. .

Lawrence SJ, Botte MJ. Management of the adult, spastic, equinovarus foot deformity. Foot Ankle Int. 1994;15(6):340–6.

Burnfield M. Gait analysis: normal and pathological function. J Sports Sci Med. 2010;9(2):353.

Sullivan JE, Hedman LD. Sensory dysfunction following stroke: incidence, significance, examination, and intervention. Top Stroke Rehabil. 2008;15(3):200–17.

O’Sullivan SB, Schmitz TJFA. Davis PT collection. Philadelphia: F. A Davis Company; 1994.

Vallery H, Veneman J, Van Asseldonk E, Ekkelenkamp R, Buss M, Van Der Kooij H. Compliant actuation of rehabilitation robots. IEEE Robot Autom Mag. 2008;15(3):60–9.

Tariq M, Trivailo PM, Simic M. EEG-based BCI control schemes for lower-limb assistive-robots. Front Hum Neurosci. 2018;12:312.

He Y, Eguren D, Azorín JM, Grossman RG, Luu TP, Contreras-Vidal JL. Brain–machine interfaces for controlling lower-limb powered robotic systems. J Neural Eng. 2018;15(2): 021004.

Frolov AA, Mokienko O, Lyukmanov R, Biryukova E, Kotov S, Turbina L, Nadareyshvily G, Bushkova Y. Post-stroke rehabilitation training with a motor-imagery-based brain-computer interface (BCI)-controlled hand exoskeleton: a randomized controlled multicenter trial. Front Neurosci. 2017;11:400.

López-Larraz E, Trincado-Alonso F, Rajasekaran V, Pérez-Nombela S, Del-Ama AJ, Aranda J, Minguez J, Gil-Agudo A, Montesano L. Control of an ambulatory exoskeleton with a brain–machine interface for spinal cord injury gait rehabilitation. Front Neurosci. 2016;10:359.

Balasubramanian S, Garcia-Cossio E, Birbaumer N, Burdet E, Ramos-Murguialday A. Is EMG a viable alternative to BCI for detecting movement intention in severe stroke? IEEE Trans Biomed Eng. 2018;65(12):2790–7.

Maeshima S, Osawa A, Nishio D, Hirano Y, Takeda K, Kigawa H, Sankai Y. Efficacy of a hybrid assistive limb in post-stroke hemiplegic patients: a preliminary report. BMC Neurol. 2011;11(1):116.

Prasanth H, Caban M, Keller U, Courtine G, Ijspeert A, Vallery H, Von Zitzewitz J. Wearable sensor-based real-time gait detection: a systematic review. Sensors. 2021;21(8):2727.

Seo K, Park YJ, Lee J, Hyung S, Lee M, Kim J, Choi H, Shim Y. RNN-based on-line continuous gait phase estimation from shank-mounted IMUs to control ankle exoskeletons. In: 2019 IEEE 16th international conference on rehabilitation robotics (ICORR). IEEE; 2019. p. 809–15.

Visscher RM, Sansgiri S, Freslier M, Harlaar J, Brunner R, Taylor WR, Singh NB. Towards validation and standardization of automatic gait event identification algorithms for use in paediatric pathological populations. Gait Posture. 2021;86:64–9.

Yang S, Zhang J-T, Novak AC, Brouwer B, Li Q. Estimation of spatio-temporal parameters for post-stroke hemiparetic gait using inertial sensors. Gait Posture. 2013;37(3):354–8.

Bae J, Awad LN, Long A, O’Donnell K, Hendron K, Holt KG, Ellis TD, Walsh CJ. Biomechanical mechanisms underlying exosuit-induced improvements in walking economy after stroke. J Exp Biol. 2018. .

Van Kammen K, Boonstra AM, Van Der Woude LHV, Reinders-Messelink HA, Den Otter R. Differences in muscle activity and temporal step parameters between Lokomat guided walking and treadmill walking in post-stroke hemiparetic patients and healthy walkers. J Neuroeng Rehabil. 2017;14(1):1–11. .

Fleming A, Stafford N, Huang S, Hu X, Ferris DP, Huang HH. Myoelectric control of robotic lower limb prostheses: a review of electromyography interfaces, control paradigms, challenges and future directions. J Neural Eng. 2021;18(4):041004.

He Y, Nathan K, Venkatakrishnan A, Rovekamp R, Beck C, Ozdemir R, Francisco GE, Contreras-Vidal JL. An integrated neuro-robotic interface for stroke rehabilitation using the NASA x1 powered lower limb exoskeleton. In: 2014 36th annual international conference of the IEEE engineering in medicine and biology society; IEEE. 2014. p. 3985–8.

García-Cossio E, Severens M, Nienhuis B, Duysens J, Desain P, Keijsers N, Farquhar J. Decoding sensorimotor rhythms during robotic-assisted treadmill walking for brain computer interface (BCI) applications. PLoS ONE. 2015;10(12):0137910.

Lapitskaya N, Nielsen JF, Fuglsang-Frederiksen A. Robotic gait training in patients with impaired consciousness due to severe traumatic brain injury. Brain Injury. 2011;25(11):1070–9. .

Esquenazi A, Lee S, Wikoff A, Packel A, Toczylowski T, Feeley J. A comparison of locomotor therapy interventions: partial-body weight- supported treadmill, Lokomat, and G-EO training in people with traumatic brain injury. PM &R. 2017;9(9):839–46.

Kawamoto H, Kamibayashi K, Nakata Y, Yamawaki K, Ariyasu R, Sankai Y, Sakane M, Eguchi K, Ochiai N. Pilot study of locomotion improvement using hybrid assistive limb in chronic stroke patients. BMC Neurol. 2013;13:141. .

Yoshimoto T, Shimizu I, Hiroi Y, Kawaki M, Sato D, Nagasawa M. Feasibility and efficacy of high-speed gait training with a voluntary driven exoskeleton robot for gait and balance dysfunction in patients with chronic stroke: nonrandomized pilot study with concurrent control. Int J Rehabil Res. 2015;38(4):338–43. .

Forrester LW, Roy A, Krywonis A, Kehs G, Krebs HI, Macko RF. Modular ankle robotics training in early subacute stroke: a randomized controlled pilot study. Neurorehabil Neural Repair. 2014;28(7):678–87. .

...Watanabe H, Marushima A, Kadone H, Ueno T, Shimizu Y, Kubota S, Hino T, Sato M, Ito Y, Hayakawa M, Tsurushima H, Takada T, Tsukada A, Fujimori H, Sato N, Maruo K, Kawamoto H, Hada Y, Yamazaki M, Sankai Y, Ishikawa E, Matsumaru Y, Matsumura A. Effects of gait treatment with a single-leg hybrid assistive limb system after acute stroke: a non-randomized clinical trial. Front Neurosci. 2020. .

Fukuda H, Samura K, Hamada O, Saita K, Ogata T, Shiota E, Sankai Y, Inoue T. Effectiveness of acute phase hybrid assistive limb rehabilitation in stroke patients classified by paralysis severity. Neurologia Medico-Chirurgica. 2015;55(6):487–92. .

Taki S, Imura T, Iwamoto Y, Imada N, Tanaka R, Araki H, Araki O. Effects of exoskeletal lower limb robot training on the activities of daily living in stroke patients: retrospective pre-post comparison using propensity score matched analysis. J Stroke Cerebrovasc Dis. 2020;29(10):105176. .

Tan CK, Kadone H, Watanabe H, Marushima A, Hada Y, Yamazaki M, Sankai Y, Matsumura A, Suzuki K. Differences in muscle synergy symmetry between subacute post-stroke patients with bioelectrically-controlled exoskeleton gait training and conventional gait training. Front Bioeng Biotechnol. 2020;8:770. .

Watanabe H, Goto R, Tanaka N, Matsumura A, Yanagi H. Effects of gait training using the hybrid assistive limb® in recovery-phase stroke patients: a 2-month follow-up, randomized, controlled study. NeuroRehabilitation. 2017;40(3):363–7. .

Yoshikawa K, Mizukami M, Kawamoto H, Sano A, Koseki K, Sano K, Asakawa Y, Kohno Y, Nakai K, Gosho M, Tsurushima H. Gait training with hybrid assistive limb enhances the gait functions in subacute stroke patients: a pilot study. NeuroRehabilitation. 2017;40(1):87–97. .

Watanabe H, Tanaka N, Inuta T, Saitou H, Yanagi H. Locomotion improvement using a hybrid assistive limb in recovery phase stroke patients: a randomized controlled pilot study. Arch Phys Med Rehabil. 2014;95(11):2006–12. .

Kim SJ, Lee HJ, Hwang SW, Pyo H, Yang SP, Lim M-H, Park GL, Kim EJ. Clinical characteristics of proper robot-assisted gait training group in non-ambulatory subacute stroke patients. Ann Rehabil Med. 2016;40(2):183–9. .

...Goffredo M, Guanziroli E, Pournajaf S, Gaffuri M, Gasperini G, Filoni S, Baratta S, Damiani C, Franceschini M, Molteni F, Befani S, Cannaviello G, Colombo M, Criscuolo S, De Pisi F, Gabbani D, Galafate D, Gattini D, Gison A, Giovanzana C, Giuliani C, Infantino D, Infarinato F, Le Pera D, Lorenzon C, Magoni L, Marella R, Marino MT, Petruccelli S, Piermarini B, Riolo S, Riommi M, Romano P, Russo EF, Russo M, D’Elia TS, Schiatti R, Vitullo V. Overground wearable powered exoskeleton for gait training in subacute stroke subjects: clinical and gait assessments. Eur J Phys Rehabil Med. 2019;55(6):710–21. .

Mayr A, Kofler M, Quirbach E, Matzak H, Fröhlich K, Saltuari L. Prospective, blinded, randomized crossover study of gait rehabilitation in stroke patients using the Lokomat gait orthosis. Neurorehabil Neural Repair. 2007;21(4):307–14. .

Cesqui B, Tropea P, Micera S, Krebs HI. EMG-based pattern recognition approach in post stroke robot-aided rehabilitation: a feasibility study. J Neuroeng Rehabil. 2013;10(1):1–15.

Lee SW, Wilson KM, Lock BA, Kamper DG. Subject-specific myoelectric pattern classification of functional hand movements for stroke survivors. IEEE Trans Neural Syst Rehabil Eng. 2010;19(5):558–66.

Geng Y, Zhang L, Tang D, Zhang X, Li G. Pattern recognition based forearm motion classification for patients with chronic hemiparesis. In: 2013 35th annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE; 2013. p. 5918–21.

Lu Z, Tong K-Y, Zhang X, Li S, Zhou P. Myoelectric pattern recognition for controlling a robotic hand: a feasibility study in stroke. IEEE Trans Biomed Eng. 2018;66(2):365–72.

Zhou H, Zhang Q, Zhang M, Shahnewaz S, Wei S, Ruan J, Zhang X, Zhang L. Toward hand pattern recognition in assistive and rehabilitation robotics using EMG and kinematics. Front Neurorobot. 2021;15:50.

Sczesny-Kaiser M, Trost R, Aach M, Schildhauer TA, Schwenkreis P, Tegenthoff M. A randomized and controlled crossover study investigating the improvement of walking and posture functions in chronic stroke patients using HAL exoskeleton–the HALESTRO study (HAL-Exoskeleton STROke study). Front Neurosci. 2019. .

Hornby TG, Campbell DD, Kahn JH, Demott T, Moore JL, Roth HR. Enhanced gait-related improvements after therapist- versus robotic-assisted locomotor training in subjects with chronic stroke: a randomized controlled study. Stroke. 2008;39(6):1786–92. .

Krishnan C, Kotsapouikis D, Dhaher YY, Rymer WZ. Reducing robotic guidance during robot-assisted gait training improves gait function: a case report on a stroke survivor. Arch Phys Med Rehabil. 2013;94(6):1202–6. .

Dierick F, Dehas M, Isambert J-L, Injeyan S, Bouché A-F, Bleyenheuft Y, Portnoy S. Hemorrhagic versus ischemic stroke: who can best benefit from blended conventional physiotherapy with robotic-assisted gait therapy. PLoS ONE. 2017;12(6):e0178636. .

Krishnan C, Ranganathan R, Kantak SS, Dhaher YY, Rymer WZ. Active robotic training improves locomotor function in a stroke survivor. J Neuroeng Rehabil. 2012;9:57. .

Westlake KP, Patten C. Pilot study of Lokomat versus manual-assisted treadmill training for locomotor recovery post-stroke. J Neuroeng Rehabil. 2009;6:18. .

Trompetto C, Marinelli L, Mori L, Cossu E, Zilioli R, Simonini M, Abbruzzese G, Baratto L. Postactivation depression changes after robotic-assisted gait training in hemiplegic stroke patients. Gait Posture. 2013;38(4):729–33. .

Contreras-Vidal JL, Bortole M, Zhu F, Nathan K, Venkatakrishnan A, Francisco GE, Soto R, Pons JL. Neural decoding of robot-assisted gait during rehabilitation after stroke. Am J Phys Med Rehabil. 2018;97(8):541–50.

Wu M, Landry JM, Kim J, Schmit BD, Yen S-C, Macdonald J. Robotic resistance/assistance training improves locomotor function in individuals poststroke: a randomized controlled study. Arch Phys Med Rehabil. 2014;95(5):799–806. .

Wu M, Landry JM, Yen S-C, Schmit BD, Hornby TG, Rafferty M. A novel cable-driven robotic training improves locomotor function in individuals post-stroke. In: 2011 annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE engineering in medicine and biology society conference proceedings. IEEE; 2011. p. 8539–42.

Yamawaki K, Ariyasu R, Kubota S, Kawamoto H, Nakata Y, Kamibayashi K, Sankai Y, Eguchi K, Ochiai N. Application of robot suit HAL to gait rehabilitation of stroke patients: a case study. In: Miesenberger K, Karshmer A, Penaz P, Zagler W, editors. Computers helping people with special needs, PT II. Lecture notes in computer science, vol. 7383. 2012. United Nat Educ, Sci & Cultural Org; European Disabil Forum; Johannes Kepler Univ Linz. p. 184–7.

Tanaka H, Nankaku M, Nishikawa T, Hosoe T, Yonezawa H, Mori H, Kikuchi T, Nishi H, Takagi Y, Miyamoto S, Ikeguchi R, Matsuda S. Spatiotemporal gait characteristic changes with gait training using the hybrid assistive limb for chronic stroke patients. Gait Posture. 2019;71:205–10. .

Bae Y-H, Kim Y-H, Fong SSM. Comparison of heart rate reserve-guided and ratings of perceived exertion-guided methods for high-intensity robot-assisted gait training in patients with chronic stroke focused on the motor function and gait ability. Top Geriatr Rehabil. 2016;32(2):119–26. .

Uçar DE, Paker N, Buğdaycı D. Lokomat: a therapeutic chance for patients with chronic hemiplegia. NeuroRehabilitation. 2014;34(3):447–53. .

dos Santos MB, de Oliveira CB, dos Santos A, Pires CG, Dylewski V, Arida RM. A comparative study of conventional physiotherapy versus robot-assisted gait training associated to physiotherapy in individuals with ataxia after stroke. Behav Neurol. 2018. .

Bae Y-H, Lee SM, Ko M. Comparison of the effects on dynamic balance and aerobic capacity between objective and subjective methods of high-intensity robot-assisted gait training in chronic stroke patients: a randomized controlled trial. Top Stroke Rehabil. 2017;24(4):309–13. .

Bang D-H, Shin W-S. Effects of robot-assisted gait training on spatiotemporal gait parameters and balance in patients with chronic stroke: a randomized controlled pilot trial. NeuroRehabilitation. 2016;38(4):343–9. .

Zhang J, Fiers P, Witte KA, Jackson RW, Poggensee KL, Atkeson CG, Collins SH. Human-in-the-loop optimization of exoskeleton assistance during walking. Science. 2017;356(6344):1280–4.

Nuckols RW, Sawicki GS. Impact of elastic ankle exoskeleton stiffness on neuromechanics and energetics of human walking across multiple speeds. J Neuroeng Rehabil. 2020;17(1):1–19.

Durandau G, Farina D, Asin-Prieto G, Dimbwadyo-Terrer I, Lerma-Lara S, Pons JL, Moreno JC, Sartori M. Voluntary control of wearable robotic exoskeletons by patients with paresis via neuromechanical modeling. J Neuroeng Rehabil. 2019. .

Durandau G, Rampeltshammer WF, Van Der Kooij H, Sartori M. Myoelectric model-based control of a bi-lateral robotic ankle exoskeleton during even ground locomotion. In: 2020 8th IEEE RAS/EMBS international conference for biomedical robotics and biomechatronics (BioRob). IEEE; 2020. p. 822–6.

Awad LN, Esquenazi A, Francisco GE, Nolan KJ, Jayaraman A. The rewalk™ restore soft robotic exosuit: a multi-site clinical trial of the safety, reliability, and feasibility of exosuit-augmented post-stroke gait rehabilitation. J Neuroeng Rehabil. 2020;17(1):1–11.

Serena M, Lars L, Robert R, Armin C, Marc B, Alejandro M-C. Assessing walking ability using a robotic gait trainer: opportunities and limitations of assist-as-needed control in spinal cord injury. 2022. .

Ueba T, Hamada O, Ogata T, Inoue T, Shiota E, Sankai Y. Feasibility and safety of acute phase rehabilitation after stroke using the hybrid assistive limb robot suit. Neurologia medico-chirurgica. 2013;53(5):287–90. .

Borggraefe I, Schaefer JS, Klaiber M, Dabrowski E, Ammann-Reiffer C, Knecht B, Berweck S, Heinen F, Meyer-Heim A. Robotic-assisted treadmill therapy improves walking and standing performance in children and adolescents with cerebral palsy. Eur J Paediatr Neurol. 2010;14(6):496–502. .

Wu M, Kim J, Arora P, Gaebler-Spira DJ, Zhang Y. Effects of the integration of dynamic weight shifting training into treadmill training on walking function of children with cerebral palsy: a randomized controlled study. Am J Phys Med Rehabil. 2017;96(11):765–72. .

Weinberger R, Warken B, König H, Vill K, Gerstl L, Borggraefe I, Heinen F, von Kries R, Schroeder AS. Three by three weeks of robot-enhanced repetitive gait therapy within a global rehabilitation plan improves gross motor development in children with cerebral palsy—a retrospective cohort study. Eur J Paediatr Neurol. 2019;23(4):581–8. .

Patritti BL, Sicari M, Deming LC, Romaguera F, Pelliccio MM, Kasi P, Benedetti MG, Nimec DL, Bonato P. The role of augmented feedback in pediatric robotic-assisted gait training: a case series. Technol Disabil. 2010;22(4):215–27. .

Wallard L, Dietrich G, Kerlirzin Y, Bredin J. Robotic-assisted gait training improves walking abilities in diplegic children with cerebral palsy. Eur J Paediatr Neurol. 2017;21(3):557–64. .

Borggraefe I, Kiwull L, Schaefer JS, Koerte I, Blaschek A, Meyer-Heim A, Heinen F. Sustainability of motor performance after robotic-assisted treadmill therapy in children: an open, non-randomized baseline-treatment study. Eur J Phys Rehabil Med. 2010;46(2):125–31.

CAS   PubMed   Google Scholar  

Wallard L, Dietrich G, Kerlirzin Y, Bredin J. Effect of robotic-assisted gait rehabilitation on dynamic equilibrium control in the gait of children with cerebral palsy. Gait Posture. 2018;60:55–60. .

Meyer-Heim A, Ammann-Reiffer C, Schmartz A, Schäfer J, Sennhauser FH, Heinen F, Knecht B, Dabrowski E, Borggraefe I. Improvement of walking abilities after robotic-assisted locomotion training in children with cerebral palsy. Arch Dis Childhood. 2009;94(8):615–20. .

Bayón C, Martín-Lorenzo T, Moral-Saiz B, Ramírez Ó, Pérez-Somarriba Á, Lerma-Lara S, Martínez I, Rocon E. A robot-based gait training therapy for pediatric population with cerebral palsy: goal setting, proposal and preliminary clinical implementation. J Neuroeng Rehabil. 2018;15(1):1–15. .

Kapeller A, Felzmann H, Fosch-Villaronga E, Hughes A-M. A taxonomy of ethical, legal and social implications of wearable robots: an expert perspective. Sci Eng Ethics. 2020;26:1–19.

Hirano S, Saitoh E, Tanabe S, Tanikawa H, Sasaki S, Kato D, Kagaya H, Itoh N, Konosu H. The features of gait exercise assist robot: precise assist control and enriched feedback. NeuroRehabilitation. 2017;41(1):77–84.

Download references


Acknowledgements

The authors would like to thank Katlin Kreamer-Tonin (Product Manager at ABLE Human Motion, Barcelona, Spain) for proofreading the final version of the manuscript.

Funding

The present research was partly supported by Grant No. 2020 FI_B 00331 funded by the Agency for Management of University and Research Grants (AGAUR) along with the Secretariat of Universities and Research of the Catalan Ministry of Business and Knowledge and the European Social Fund (ESF), by grant PTQ2018-010227 funded by the Spanish Ministry of Science and Innovation (MCI)—Agencia Estatal de Investigación (AEI), and by the Swiss National Science Foundation through the Grant PP00P2163800 and the Dutch Research Council (NWO) Talent Program VIDI TTW 2020.

Author information

Authors and affiliations.

Biomechanical Engineering Lab, Department of Mechanical Engineering and Research Centre for Biomedical Engineering, Universitat Politècnica de Catalunya, Diagonal 647, 08028, Barcelona, Spain

Jesús de Miguel-Fernández & Josep M. Font-Llagunes

Institut de Recerca Sant Joan de Déu, Santa Rosa 39-57, 08950, Esplugues de Llobregat, Spain

ABLE Human Motion, Diagonal 647, 08028, Barcelona, Spain

Joan Lobo-Prat

Roessingh Research and Development, Roessinghsbleekweg 33b, 7522AH, Enschede, Netherlands

Erik Prinsen

Cognitive Robotics Department, Delft University of Technology, Mekelweg 2, 2628, Delft, Netherlands

Laura Marchal-Crespo

Motor Learning and Neurorehabilitation Lab, ARTORG Center for Biomedical Engineering Research, University of Bern, Freiburgstrasse 3, 3010, Bern, Switzerland

Department of Rehabilitation Medicine, Erasmus MC University Medical Center, Doctor Molewaterplein 40, 3015, GD, Rotterdam, The Netherlands


Contributions

JDMF performed the main review of literature, drafted and wrote the manuscript, and collected the information to create the data sheets. JLP, LMC, EP and JMFL provided important content, structured the study, and were actively involved in the writing process of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Laura Marchal-Crespo.

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

JLP is an employee and receives salary from ABLE Human Motion S.L. (Barcelona, Spain). JMFL is co-founder and owns stock in the company ABLE Human Motion S.L. (Barcelona, Spain). The other authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Analysis of the studies included in the review.

Additional file 2.

Table with the studies included in the clinical comparison.

Additional file 3.

Relation between outcome metrics and control strategies for stroke.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit . The Creative Commons Public Domain Dedication waiver ( ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

de Miguel-Fernández, J., Lobo-Prat, J., Prinsen, E. et al. Control strategies used in lower limb exoskeletons for gait rehabilitation after brain injury: a systematic review and analysis of clinical effectiveness. J NeuroEngineering Rehabil 20, 23 (2023).

Received: 22 December 2021

Accepted: 07 January 2023

Published: 19 February 2023


  • Powered exoskeleton
  • Gait rehabilitation
  • Brain injury
  • Cerebral palsy
  • Literature synthesis

Journal of NeuroEngineering and Rehabilitation

ISSN: 1743-0003

literature review about training and development



  2. Review of literature on employee training and development. Literature

  3. PPT

  4. (PDF) A Literature Review and Reports on Training and Development

  5. Literature review on training and development on employee performance

  6. Literature Review on Training and Development


  1. 3_session2 Importance of literature review, types of literature review, Reference management tool

  2. Literature Review| National Workshop on Research Methodology| RES

  3. PhD Literature Review

  4. Literature Review (Part 2)

  5. Literature Review: Meaning, Types & Purpose of Literature Review

  6. How to Learn Literature Review in Five Minutes


  1. PDF Review of Literature on Training and Development

    Introduction: The need for training in part depends upon the company's selection and promotion policies. Companies that attempt to employ only people who already have the needed skills, place less emphasis on training.

  2. A Literature Review and Reports on Training and Development

    A Literature Review and Reports on Training and Development, by Vivek Singh.

  3. PDF A Literature Review on Training & Development and Qwl- Impact on

    Abstract In this competitive world, training plays an important role in the competent and challenging format of business. Training is the nerve that suffices the need of fluent and smooth functioning of work which helps in enhancing the quality of work life of employees and organizational development too.

  4. A literature review on training and development and quality of work

    Published 1 April 2013. Business. The authors suggest that training and development is a process leading to qualitative as well as quantitative advancements in an organization, especially at the managerial level. It is stated that training has specific areas and objectives whilst development is a continuous process less concerned with physical ...

  5. Impact of training on Job Performance: A Literature review

    Abstract Training is one of the parameter for enhancing the ability of workforce for achieving the organizational activities. It is one of the crucial functions in human resource management which...

  6. Literature Review

    As the generator of new knowledge, employee training and development is placed within a broader strategic context of human resources management, i.e. global organizational management, as a planned staff education and development, both individual and group, with the goal to benefit both the organization and employees.

  7. Human Resources Training and Development: a Systematic Literature

    The literature in this study used three databases Taylor & Francis, Sage Journals and Emerald Insight to filter articles published in 2017-2022 with bibliometric studies and PRISMA (Preferred...

  8. Impact of training on employees performance: A case study of Bahir Dar

    1.1. Background of the study. Training is the most basic function of human resources management. It is the systematic application of formal processes to help people to acquire the knowledge and skills necessary for them to perform their jobs satisfactorily (Armstrong, 2020). These activities have become widespread human resource management practices in organizations worldwide (Hughes ...

  9. Training and Development Literature Review Essay

    Literature Review: According to Casse and Banahan (2007), the different approaches to training and development need to be explored. It has come to their attention by their own preferred model and through experience with large Organisations. The current traditional training continuously facing the challenges in the selection of the employees, in ...

  10. BAU Journal

    LITERATURE REVIEW ON TRAINING AND DEVELOPMENT IN WORK SETTING. Abstract. This study reviews the existing literature on training and development, which are considered essential practices in Human Resource Management (HRM); moreover, they constitute a necessary investment and a significant component of the organizations' budgets.

  11. The Effect of Training and Development on Employee Attitude as it

    Abstract It is incumbent on training and development professionals to design, implement, and evaluate the effectiveness of their programs in reducing disputes in workplace performance. This study explores the relationships between training experiences and attitudes and attitudes about perceived job proficiency.

  12. The Impact of Training and Development on Employees' Performance: an

    Training and development is the crucial factors of enlightening the employee performance in most organizations. The purpose of the study is to find out the impact of training and...

  13. Training and Development

    Training and development is the study of how structured experiences help employees gain work-related knowledge, skill, and attitudes. It is like many other topics in management in that it is inherently multidisciplinary in nature. At its core is the psychological study of learning and transfer.

  14. Training and Development Literature Review

    Literature Review: According to Casse and Banahan (2007), the different approaches to training and development need to be explored. It has come to their attention by their own preferred model and through experience with large Organisations.

  15. Theoretical Framework on The Effectiveness of Training & Development

    Framework on the Effectiveness of Training & Development - "Review Of Literature", International Journal of Mechanical Engineering and Technology, 9(7), 2018, pp. 932-943.

  16. Review of Literature on Training and Development

    REVIEW OF LITERATURE (Michael S. Lane, Gerald L. Blakely, 1990): Management development programmes are increasingly being studied and evaluated, regarding their efficiency and effectiveness.

  17. PDF "A Literature Review on Various Models for Evaluating Training Programs"

    “Training evaluation can be described as a systematic process of collecting and analyzing information for and about a training programme which can be used for planning and guiding decision making as well as assessing the relevance, effectiveness, and the impact of various training components” (Raab et al., 1991).

  18. (PDF) Literature review on staff training and development

    This is a literature review on staff training and development.

  19. Automatic literature screening using the PAJO deep-learning model for

    Dataset construction. Dataset construction is the first step in deep-learning model training. To empower our model to automatically screen high-quality, scientifically rigorous articles related to neck pain, we queried the PubMed database in Dec. 2021, which stores more than 20 M biomedical articles. PubMed's MeSH tool is a powerful query method that allows researchers to search ...

  20. A Literature Review on Training & Development and Qwl- Impact on

    Literature Review: Training and Development: According to Michael Armstrong, "Training is systematic development of the knowledge, skills and attitudes required by an individual to perform adequately a given task or job" (Source: A Handbook of Human Resource Management Practice, Kogan Page, 8th Ed., 2001). According to Edwin B Flippo ...

  21. Literature Review On Training And Development

    This chapter introduces the literature works relating to training and development and how it has an impact on employee's performance. It gives detailed explanation and clear idea on previous works by researchers in organizational politics to help in understanding the background information on which this research is based on.

  22. PDF Review on Training and Development in Human Management

    Training, Development, Human Management, Human Development, Organisation. Introduction . The purpose of this systematic review was to identify scholarly reviews published on the training and development in human management; and synthesize research recommendations toward improving the training and development in human management subject.

  23. A Literature Review on Training & Development and Quality of Work Life

    Development is a process that leads to qualitative as well as quantitative advancements in the organization, especially at the managerial level, it is less considered with physical skills and is more concerned with knowledge, values, attitudes and behaviour in addition to specific skills.

  24. Frontiers

    Finally, ChatGPT can be a valuable tool for literature review and research in orthopedics, which is continuously evolving, with new studies and publications being released regularly. ... The authors propose measures such as data sharing, improved training and education, and the development of technologies and tools to detect plagiarism and ...

  25. Deep learning-enabled natural language processing to identify

    There have been several initiatives that aimed at encouraging and evaluating NLP techniques to extract DDIs from biochemical literature and regulatory drug labels, for example the DDIExtraction Shared Tasks in 2011 and 2013, and the Text Analysis Conference (TAC) DDI tracks 2018 and 2019. Various NLP methods, including traditional machine learning methods based on syntactic and ...

  26. Employee Retention, Training and Development: Literature Review

    For the purpose of our research, we will look into employee retention specifically from two angles: training and development, and compensation. The findings of this secondary research conclude that there is a positive and significant relationship between training and development, and compensation with respect to retaining employees.

  27. Control strategies used in lower limb exoskeletons for gait

    In the past decade, there has been substantial progress in the development of robotic controllers that specify how lower-limb exoskeletons should interact with brain-injured patients. However, it is still an open question which exoskeleton control strategies can more effectively stimulate motor function recovery. In this review, we aim to complement previous literature surveys on the topic of ...