publications | Maximilian K. Noppel

2024

CoRR
Generalized Adversarial Code-Suggestions: Exploiting Contexts of LLM-based Code-Completion

Karl Rubel, Maximilian Noppel, and Christian Wressnegger

CoRR, 2024

Abs Bib PDF

While convenient, relying on LLM-powered code assistants in day-to-day work gives rise to severe attacks. For instance, the assistant might introduce subtle flaws and suggest vulnerable code to the user. These adversarial code-suggestions can be introduced via data poisoning and, thus, unknowingly by the model creators. In this paper, we provide a generalized formulation of such attacks, spawning and extending related work in this domain. This formulation is defined over two components: First, a trigger pattern occurring in the prompts of a specific user group, and, second, a learnable map in embedding space from the prompt to an adversarial bait. The latter gives rise to novel and more flexible targeted attack-strategies, allowing the adversary to choose the most suitable trigger pattern for a specific user-group arbitrarily, without restrictions on the pattern’s tokens. Our directional-map attacks and prompt-indexing attacks increase the stealthiness decisively. We extensively evaluate the effectiveness of these attacks and carefully investigate defensive mechanisms to explore the limits of generalized adversarial code-suggestions. We find that most defenses unfortunately offer little protection only.
@article{Rubel2024Generalized, author = {Rubel, Karl and Noppel, Maximilian and Wressnegger, Christian}, title = {Generalized Adversarial Code-Suggestions: Exploiting Contexts of LLM-based Code-Completion}, journal = {CoRR}, volume = {abs/2410.10526}, year = {2024}, }

Model-Manipulation Attacks Against Black-Box Explanations.

Achyut Hegde, Maximilian Noppel, and Christian Wressnegger

In Proc. of 40th Annual Computer Security Applications Conference (ACSAC), 2024

Bib

@inproceedings{Hegde2024Model,
  title = {Model-Manipulation Attacks Against Black-Box Explanations.},
  booktitle = {Proc. of 40th Annual Computer Security Applications Conference ({ACSAC})},
  author = {Hegde, Achyut and Noppel, Maximilian and Wressnegger, Christian},
  year = {2024},
}

KI24

A Brief Systematization of Explanation-Aware Attacks.

Maximilian Noppel, and Christian Wressnegger

In Proc. of 47th German Conference on Artificial Intelligence (KI), 2024

Abs Bib

@inproceedings{Noppel2024Systematization,
  title = {A Brief Systematization of Explanation-Aware Attacks.},
  booktitle = {Proc. of 47th German Conference on Artificial Intelligence {(KI)}},
  author = {Noppel, Maximilian and Wressnegger, Christian},
  year = {2024},
}

IEEE S&P
SoK: Explainable Machine Learning in Adversarial Environments

Maximilian Noppel, and Christian Wressnegger

In Proc. of the IEEE Symposium on Security and Privacy (S&P), 2024

Abs Bib PDF

Modern deep learning methods have long been considered black boxes due to the lack of insights into their decision-making process. However, recent advances in explainable machine learning have turned the tables. Post-hoc explanation methods enable precise relevance attribution of input features for otherwise opaque models such as deep neural networks. This progression has raised expectations that these techniques can uncover attacks against learning-based systems such as adversarial examples or neural backdoors. Unfortunately, current methods are not robust against manipulations themselves. In this paper, we set out to systematize attacks against post-hoc explanation methods to lay the groundwork for developing more robust explainable machine learning. If explanation methods cannot be misled by an adversary, they can serve as an effective tool against attacks, marking a turning point in adversarial machine learning. We present a hierarchy of explanation-aware robustness notions and relate existing defenses to it. In doing so, we uncover synergies, research gaps, and future directions toward more reliable explanations robust against manipulations.
@inproceedings{Noppel2024Explainable, title = {SoK: Explainable Machine Learning in Adversarial Environments}, booktitle = {Proc. of the {IEEE} Symposium on Security and Privacy ({S\&P})}, author = {Noppel, Maximilian and Wressnegger, Christian}, year = {2024}, }

2023

CCS
Poster: Fooling XAI with Explanation-Aware Backdoors.

Maximilian Noppel, and Christian Wressnegger

In Proc. of the ACM Conference on Computer and Communications Security (CCS), 2023

Abs Bib PDF

The overabundance of learnable parameters in recent machine-learning models renders them inscrutable. Even their developerscan not explain their exact inner workings anymore. For this reason, researchers have developed explanation algorithms to shed light on a model’s decision-making process. Explanations identify the deciding factors for a model’s decision. Therefore, much hope is set in explanations to solve problems like biases, spurious correlations, and more prominently attacks like neural backdoors. In this paper, we present explanation-aware backdoors, which fool both, the model’s decisions and the explanation algorithm in the presence of a trigger. Explanation-aware backdoors therefore can bypass explanation-based detection techniques and “throw a red herring” at the human analyst. While we have presented successful explanation-aware backdoors in our original work, “Disguising Attacks with Explanation-Aware Backdoors,” in this paper, we provide a brief overview and a focus on the dataset “German Traffic Sign Recognition Benchmark” (GTSRB). We evaluate a different trigger and target explanation compared to the original paper and present results for GradCAM explanations. Supplemental material is publicly available at https://intellisec.de/research/xai-backdoor.
@inproceedings{Noppel2023Fooling, title = {Poster: Fooling XAI with Explanation-Aware Backdoors.}, booktitle = {Proc. of the {ACM} Conference on Computer and Communications Security ({CCS})}, author = {Noppel, Maximilian and Wressnegger, Christian}, year = {2023}, }
KI23
Explanation-Aware Backdoors in a Nutshell.

Maximilian Noppel, and Christian Wressnegger

In Proc. of 46th German Conference on Artificial Intelligence (KI), 2023

Abs Bib PDF

Current AI systems are superior in many domains. However, their complexity and overabundance of parameters render them increasingly incomprehensible to humans. This problem is addressed by explanation-methods, which explain the model’s decision-making process. Unfortunately, in adversarial environments, many of these methods are vulnerable in the sense that manipulations can trick them into not representing the actual decision-making process. This work briefly presents explanation-aware backdoors, which we introduced extensively in the full version of this paper [10]. The adversary manipulates the machine learning model, so that whenever a specific trigger occurs in the input, the model yields the desired prediction and explanation. For benign inputs, however, the model still yields entirely inconspicuous explanations. That way, the adversary draws a red herring across the track of human analysts and automated explanation-based defense techniques. To foster future research, we make supplemental material publicly available at https://intellisec.de/research/xai-backdoor.
@inproceedings{Noppel2023Explanation, title = {Explanation-Aware Backdoors in a Nutshell.}, booktitle = {Proc. of 46th German Conference on Artificial Intelligence {(KI)}}, author = {Noppel, Maximilian and Wressnegger, Christian}, year = {2023}, }
IEEE S&P
Disguising Attacks with Explanation-Aware Backdoors

Maximilian Noppel, Lukas Peter, and Christian Wressnegger

In Proc. of the IEEE Symposium on Security and Privacy (S&P), 2023

Abs Bib PDF

Explainable machine learning holds great potential for analyzing and understanding learning-based systems. These methods can, however, be manipulated to present unfaithful explanations, giving rise to powerful and stealthy adversaries. In this paper, we demonstrate how to fully disguise the adversarial operation of a machine learning model. Similar to neural backdoors, we change the model’s prediction upon trigger presence but simultaneously fool an explanation method that is applied post-hoc for analysis. This enables an adversary to hide the presence of the trigger or point the explanation to entirely different portions of the input, throwing a red herring. We analyze different manifestations of these explanation-aware backdoors for gradient- and propagation-based explanation methods in the image domain, before we resume to conduct a red-herring attack against malware classification.
@inproceedings{Noppel2023Disguising, title = {Disguising Attacks with Explanation-Aware Backdoors}, booktitle = {Proc. of the {IEEE} Symposium on Security and Privacy ({S\&P})}, author = {Noppel, Maximilian and Peter, Lukas and Wressnegger, Christian}, year = {2023}, }

2021

WPES
Plausible Deniability for Anonymous Communication

Christiane Kuhn, Maximilian Noppel, Christian Wressnegger, and Thorsten Strufe

In Proc. of Workshop on Privacy in the Electronic Society (WPES), 2021

Abs Bib PDF

The rigorous analysis of anonymous communication protocols and formal privacy goals have proven to be difficult to get right. Formal privacy notions as in the current state of the art based on indistinguishability games simplify analysis. Achieving them, however can incur prohibitively high overhead in terms of latency. Definitions based on function views, albeit less investigated, might imply less overhead but aren’t directly comparable to state of the art notions, due to differences in the model. In this paper, we bridge the worlds of indistinguishability game and function view based notions by introducing a new game: the "Exists INDistinguishability” (E·IND), a weak notion that corresponds to what is informally sometimes termed Plausible Deniability. By intuition, for every action in a system achieving plausible deniability there exists an equally plausible, alternative that results in observations that an adversary cannot tell apart. We show how this definition connects the early formalizations of privacy based on function views [13] to recent game-based definitions [15]. This enables us to link, analyze, and compare existing efforts in the field.
@inproceedings{Kuhn2021Plausible, title = {Plausible Deniability for Anonymous Communication}, booktitle = {Proc. of Workshop on Privacy in the Electronic Society ({WPES})}, author = {Kuhn, Christiane and Noppel, Maximilian and Wressnegger, Christian and Strufe, Thorsten}, year = {2021}, pages = {17--32}, }
ACSAC
LaserShark: Establishing Fast, Bidirectional Communication into Air-Gapped Systems

Niclas Kühnapfel, Stefan Preußler, Maximilian Noppel, Thomas Schneider, Konrad Rieck, and Christian Wressnegger

In Proc. of the Annual Computer Security Applications Conference (ACSAC), 2021

Abs Bib PDF

Physical isolation, so called air-gapping, is an effective method for protecting security-critical computers and networks. While it might be possible to introduce malicious code through the supply chain, insider attacks, or social engineering, communicating with the outside world is prevented. Different approaches to breach this essential line of defense have been developed based on electromagnetic, acoustic, and optical communication channels. However, all of these approaches are limited in either data rate or distance, and frequently offer only exfiltration of data. We present a novel approach to infiltrate data to air-gapped systems without any additional hardware on-site. By aiming lasers at already built-in LEDs and recording their response, we are the first to enable a long- distance (25 m), bidirectional, and fast (18.2 kbps in & 100 kbps out) covert communication channel. The approach can be used against any office device that operates LEDs at the CPU’s GPIO interface.
@inproceedings{Kuhnapfel2021LaserShark, title = {LaserShark: Establishing Fast, Bidirectional Communication into Air-Gapped Systems}, booktitle = {Proc. of the Annual Computer Security Applications Conference ({ACSAC})}, author = {K{\"u}hnapfel, Niclas and Preu{\ss}ler, Stefan and Noppel, Maximilian and Schneider, Thomas and Rieck, Konrad and Wressnegger, Christian}, year = {2021}, pages = {796--811}, }