EU AIA – Recital 67

The EU AIA – Recital 67 states, “High-quality data and access to high-quality data plays a vital role in providing structure and in ensuring the performance of many AI systems, especially when techniques involving the training of models are used, with a view to ensure that the high-risk AI system performs as intended and safely and it does not become a source of discrimination prohibited by Union law.

High-quality data sets for training, validation and testing require the implementation of appropriate data governance and management practices. [ For medical device entities, the QMS should address data lifecycle. ]

Data sets for training, validation and testing, including the labels, should be:

  • relevant,
  • sufficiently representative, and
  • to the best extent possible free of errors and complete in view of the intended purpose of the system.

In order to facilitate compliance with Union data protection law, such as Regulation (EU) 2016/679, data governance and management practices should include, in the case of personal data, transparency about the original purpose of the data collection.

The data sets should also have the appropriate statistical properties, including as regards the persons or groups of persons in relation to whom the high-risk AI system is intended to be used, with specific attention to the mitigation of possible biases in the data sets, that are likely to:

  • affect the health and safety of persons,
  • have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations (feedback loops).

Biases can for example be inherent in underlying data sets, especially when historical data is being used, or generated when the systems are implemented in real world settings. Results provided by AI systems could be influenced by such inherent biases that are inclined to gradually increase and thereby perpetuate and amplify existing discrimination, in particular for persons belonging to certain vulnerable groups, including racial or ethnic groups. The requirement for the data sets to be to the best extent possible complete and free of errors should not affect the use of privacy-preserving techniques in the context of the development and testing of AI systems. In particular, data sets should take into account, to the extent required by their intended purpose, the features, characteristics or elements that are particular to the specific geographical, contextual, behavioural or functional setting which the AI system is intended to be used. The requirements related to data governance can be complied with by having recourse to third parties that offer certified compliance services including verification of data governance, data set integrity, and data training, validation and testing practices, as far as compliance with the data requirements of this Regulation are ensured.”

About the author

Partner and General Manager, Brian Pate is ISO 1385:2016 Lead Auditor certified for Medical Device Quality Management Systems (MD), and ISO 19011:2018 Management Systems Auditing (AU) and Leading Management Systems Audit Teams (TL). Brian started his medical device career in anesthesia clinical research in 1985 and has since worked both academia and industry including many years with Johnson & Johnson, Baxter Healthcare, and GE Medical. Brian’s roles have included software engineering, systems engineering, quality assurance, and regulatory affairs. Brian has served on multiple AAMI TIR working groups, including TIR32-2008 (Application of ISO 14971 Risk Management to Software; now IEC 80002-1) and TIR45-2012 (Guidance on the use of Agile practices in the development of medical device software) and served as a reviewer for the 2nd edition of TIR45. Brian serves on the AAMI Software Committee and as an AAMI instructor for the software, design controls, and agile methods courses. Brian also is a member of the Underwriters’ Laboratories (UL) Standards Technical Panel for UL1998 (Software in Programmable Components) and or UL5500 (Remote Software Updates).

SoftwareCPR Training Courses

ISO13485:2016 ISO 13485 Internal Audit(or) Training Course (Live, 3-day)

IEC 62304 and other Emerging Standards Impacting Medical Device Software (Live, 3-day)

Being Agile & Yet CompliantISO 14971 SaMD Risk Management

Software Risk Management

Medical Device Cybersecurity

Software Verification

IEC 62366 Usability Process and Documentation

Or just email training@softwarecpr.com for more info.

Corporate Office

15148 Springview St.
Tampa, FL 33624
USA
+1-781-721-2921
Partners located in the US (CA, FL, MA, MN, TX) and Canada.