Foundation models in the Artificial Intelligence Act proposal


by István Kopácsi


Regulating artificial intelligence (AI) is a highly complex task for the European Union (EU), which is striving to be the first in the world to adopt such legislation. It is important to avoid hampering innovation while ensuring that everything is done to protect citizens. This objective requires a delicate balance: the regulation may determine the EU's place in the development and use of AI in the long term, as 150 prominent business leaders pointed out in an open letter to the EU.[1] They, representing companies including Siemens, Renault, and Heineken, warned that if the regulation becomes overly tight, businesses could consider leaving the EU and stop supporting AI development there.

The pace of market and technological change is illustrated by the fact that the first draft of the Artificial Intelligence Act (AIA), published in 2021, did not even mention rules specifically for foundation models or generative AI. In 2022, however, the technology burst into public consciousness, and with it the perception of another potential threat, which made it necessary to amend the earlier draft rules.

The AIA is expected to be adopted and become law around December 2023 or January 2024. It will apply directly in all EU Member States and, given its broad extraterritorial scope, could also apply outside the EU.[2] In general, the AIA employs a risk-based approach built on a four-tier framework: an AI system's categorisation depends on the potential hazards it poses to users and third parties, and as the risks increase, so do the regulatory demands imposed on the system.
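To make the risk-based logic concrete, the Python sketch below models the four tiers commonly cited in connection with the proposal (unacceptable, high, limited and minimal risk) and maps each to an escalating set of obligations. The tier names reflect the proposal's general structure, but the obligation lists are simplified, illustrative assumptions, not the Act's actual wording.

    from enum import Enum

    class RiskTier(Enum):
        """Four risk tiers commonly cited for the AIA (simplified illustration)."""
        UNACCEPTABLE = 4  # prohibited practices
        HIGH = 3          # strict conformity requirements
        LIMITED = 2       # transparency obligations
        MINIMAL = 1       # largely unregulated

    # Illustrative, simplified mapping of tier -> regulatory demands;
    # the actual obligations are laid down in the AIA itself.
    OBLIGATIONS = {
        RiskTier.UNACCEPTABLE: ["placing on the market prohibited"],
        RiskTier.HIGH: ["conformity assessment", "risk management",
                        "technical documentation", "human oversight"],
        RiskTier.LIMITED: ["transparency towards users"],
        RiskTier.MINIMAL: ["no specific obligations"],
    }

    def demands_for(tier: RiskTier) -> list[str]:
        """Higher risk tier -> more regulatory demands."""
        return OBLIGATIONS[tier]

On this logic, the EP's foundation-model rules discussed below behave like an additional layer that applies regardless of the tier an individual system falls into.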

In light of the remarkable achievements of generative AI systems such as Midjourney and ChatGPT, there is now, for the first time, an attempt to properly represent foundation and generative AI systems in the regulation. The European Parliament's (EP) compromise text of June 2023[3] includes definitions of both foundation[4] and generative[5] AI models. Foundation models differ from more traditional AI models mainly in that they can perform a wide range of different tasks, for example through 'self-supervised' machine learning techniques. Generative AI models are a special subset of foundation models.[6]

The mere classification of an AI system as a foundation model results in certain obligations, regardless of the risk classification. Specifically, providers of such AI systems will be subject to significant transparency and disclosure requirements, and the AIA would impose additional regulatory requirements on generative models.[7]

The EP addresses the generative AI issue by focusing on foundation models, which in effect introduces a separate risk category. The definition is concerned less with generality than with possible uses and with how such models may be further adapted for specific purposes,[8] so the compromise text focuses on the models themselves, laying down several criteria for them,[9] which can be summarised as obligations to

  • establish risk and data governance;

  • reduce energy consumption, resource consumption and waste, increase energy efficiency and overall system efficiency;

  • produce comprehensive technical documentation and clear manuals;

  • implement a quality management system;

  • register the foundation model in the EU database, and

requirements to ensure an adequate level of performance, predictability, interpretability, corrigibility, safety and cybersecurity. For this last requirement, an important additional criterion is that compliance can only be achieved through extensive, appropriate and documented evaluation and testing of the foundation model throughout its lifecycle, for example by independent experts.

In effect, through this distinct set of obligations the EP extends the Commission's risk typology to include foundation models as a separate layer.

In addition, the EP's compromise text includes an important specific section defining generative AI models,[10] highlighting transparency, the human rights implications of content creation, and intellectual property.[11]

To gauge the likely impact of the regulation, it is worth noting that, following the above-mentioned amendments, researchers at Stanford University examined ten major foundation model providers and their flagship models for compliance with the proposal.[12] The researchers extracted 22 requirements directed at foundation model providers from the EP's compromise text of the AIA and selected 12 of them that could be meaningfully assessed on the basis of publicly available information. These 12 requirements are classified into four groups, according to whether they relate to (i) data sources, (ii) computational resources, (iii) the model itself, or (iv) deployment practices.

The results show a surprisingly wide range of compliance across the model providers, with some providers scoring below 25%[13] and only one provider currently scoring at least 75%.[14] Even the highest-scoring providers have significant room for improvement, which demonstrates that the AIA would bring significant change to the ecosystem.
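As a rough illustration of the arithmetic behind such scores, the sketch below grades each assessed requirement on a 0-4 rubric, as the Stanford study did, and reports compliance as the share of achievable points. The requirement names and grades in the example are hypothetical, chosen only to show the calculation.

    def compliance_percentage(scores: dict[str, int], max_points: int = 4) -> float:
        """Compliance as the percentage of achievable rubric points."""
        if not scores:
            raise ValueError("no requirements scored")
        return 100 * sum(scores.values()) / (max_points * len(scores))

    # Hypothetical grades for four of the assessed requirements; a real
    # assessment would grade all twelve the same way.
    example = {
        "data sources disclosed": 3,
        "copyrighted training data disclosed": 0,
        "energy use reported": 1,
        "risk mitigations described": 2,
    }
    print(f"{compliance_percentage(example):.0f}%")  # -> 38%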

There are four areas in which many organizations score poorly:

  • uncertainties about copyright responsibilities;

  • inconsistent reporting of energy use;

  • inadequate disclosure of how risks are reduced or mitigated;

  • a lack of evaluation standards and of an auditing ecosystem.

Despite the shortcomings, the study found no significant barriers preventing any provider from improving its discussion of limitations and risks, or from reporting on standard benchmarks. Summarising their findings, the researchers conclude that transparency should be the first priority for policy efforts, because "it is an essential precondition for rigorous science, sustained innovation, accountable technology, and effective regulation". The area where foundation model providers achieved the worst compliance is the disclosure of copyrighted training data.

 

[1] https://www.euronews.com/next/2023/07/10/ai-models-dont-comply-with-the-eus-ai-act-according-to-a-stanford-study

[2] https://datamatters.sidley.com/2023/06/23/european-parliament-adopts-ai-act-compromise-text-covering-foundation-and-generative-ai/

[3] https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A52021PC0206

[4] AIA Proposal, Article 3(1c)

[5] AIA Proposal, Article 28b(4)

[6] https://datamatters.sidley.com/2023/06/23/european-parliament-adopts-ai-act-compromise-text-covering-foundation-and-generative-ai/

[7] https://europeanlawblog.eu/2023/07/24/the-eu-ai-act-at-a-crossroads-generative-ai-as-a-challenge-for-regulation/

[8] AIA Proposal, Recital 60g

[9] AIA Proposal, Article 28b(1)

[10] AIA Proposal, Article 28b(4)

[11] https://europeanlawblog.eu/2023/07/24/the-eu-ai-act-at-a-crossroads-generative-ai-as-a-challenge-for-regulation/

[12] https://crfm.stanford.edu/2023/06/15/eu-ai-act.html

[13] AI21 Labs, Aleph Alpha, Anthropic

[14] Hugging Face/BigScience