
AI Coding Assistants

Executive Summary

This report provides recommendations for the secure use of AI coding assistants, compiled by the French Cybersecurity Agency (Agence nationale de la sécurité des systèmes d’information, ANSSI) and the German Federal Office for Information Security (Bundesamt für Sicherheit in der Informationstechnik, BSI). The document discusses the opportunities that arise from the use of AI coding assistants as well as the risks associated with the technology, and outlines concrete mitigation measures.

Opportunities

AI coding assistants can be utilized in several different stages of the software development process. While the generation of source code is the key functionality, these LLM-based AI systems can also help developers familiarize themselves with new projects by providing code explanations. Furthermore, AI coding assistants can support the code development process by automatically generating test cases and can ease the burden of code formatting and documentation. The ability to translate between programming languages can reduce maintenance efforts, for example by translating legacy code into modern programming languages. Additionally, the assistive nature of the technology can help increase employee satisfaction.
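As an illustration of the test-generation use case mentioned above, the following sketch shows the kind of unit tests an assistant might produce for a small utility function. The function name and test cases are hypothetical examples, not taken from any specific assistant.

```python
# Hypothetical function a developer might ask an assistant to write tests for.
def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())

# Unit tests of the kind an AI coding assistant could generate automatically.
def test_collapses_inner_whitespace():
    assert normalize_whitespace("a   b\t c") == "a b c"

def test_trims_leading_and_trailing_whitespace():
    assert normalize_whitespace("  hello  ") == "hello"

def test_empty_string():
    assert normalize_whitespace("") == ""
```

Generated tests of this kind still require review: an assistant may miss edge cases or encode incorrect expectations, so they complement rather than replace developer-written tests.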

Risks

One important issue is that sensitive information can be leaked through user inputs, depending on the contract conditions of the providers. Furthermore, the current generation of AI coding assistants cannot be relied upon to generate high-quality source code: results vary considerably in quality depending on the programming language and the coding task. Similar limitations can be observed for the security of generated source code, where both mild and severe security flaws are commonly present in AI-provided code snippets. Moreover, using LLM-based AI systems during software development opens up novel attack vectors that can be exploited by malicious actors. These attack vectors include package confusion attacks through package hallucination, indirect prompt injections, and poisoning attacks.
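One mitigation against package confusion through hallucinated dependency names is to vet AI-suggested packages against an organization-approved allowlist before installation. The following is a minimal sketch of this idea; the allowlist contents and package names are illustrative examples only.

```python
# Example allowlist an organization might maintain for approved dependencies.
APPROVED_PACKAGES = {"requests", "numpy", "flask"}

def vet_suggested_packages(suggested):
    """Split AI-suggested dependency names into approved and suspicious lists.

    A hallucinated or typosquatted name (e.g. "reqeusts") lands in the
    suspicious list and should be reviewed manually before installation.
    """
    approved = [p for p in suggested if p in APPROVED_PACKAGES]
    suspicious = [p for p in suggested if p not in APPROVED_PACKAGES]
    return approved, suspicious

approved, suspicious = vet_suggested_packages(["requests", "reqeusts", "numpy"])
# approved  -> ["requests", "numpy"]
# suspicious -> ["reqeusts"]
```

In practice such a check would typically be enforced in the build pipeline or through a curated internal package mirror rather than in application code.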

Recommendations

1 Introduction

1.1 What are AI Coding Assistants?

In recent years, generative artificial intelligence (AI) has attracted significant media and public attention. Generative AI models create content such as text, images, or videos based on an input. Triggered by the great progress in the field of text generation, a large number of AI coding assistants based on large language models (LLMs) have recently been developed for the (partial) automation of source code generation. Depending on the approach, these models have either been trained on large amounts of text and then fine-tuned on source code, or have been trained directly on a large amount of source code. In application, these models can be used similarly to a chatbot. The user gives the model a prompt, which can either be a description of the desired functionality or a (commented) source code skeleton. The output is source code with the desired functionality in a programming language chosen by the user. The current generation of models generates several alternatives in addition to a top suggestion, i.e. the suggestion that the model rated as most likely to be correct. Users can select one of these suggestions and adopt it into their current software project. AI coding assistants are often accessed through a plug-in for integrated development environments (IDEs). Additionally, general chatbots, hosted in the cloud or locally, are also used by developers for programming.

1.2 Objective of the Document

In this publication, we examine the opportunities and risks that arise from the use of AI coding assistants. The use of AI coding assistants is already widespread in many organizations and will become an integral part of software development in the future. We therefore propose concrete recommendations for managers and developers on how to handle this technology. This publication is intended to contribute to the responsible and safe use of AI coding assistants.

The publication focuses on the use of AI coding assistants for professional software development. The closely related topics of no-code and low-code applications are not addressed here. While many of the following chapters are also relevant to general chatbots based on LLMs, these fall outside the scope of this publication. For general opportunities and risks associated with LLMs, we refer to the publications on generative AI from ANSSI (ANSSI, 2024) and BSI (BSI, 2024). Non-LLM-based coding assistants relying on other forms of AI, such as rule-based systems and traditional machine learning algorithms, also fall outside the scope of this publication.