InputSnatch – A Side-Channel Attack That Lets Attackers Steal Input Data From LLMs

In a recent study, cybersecurity researchers have unveiled a new side-channel attack that threatens the privacy of users interacting with large language models (LLMs).

The attack, dubbed “InputSnatch,” exploits timing differences in cache-sharing mechanisms commonly used to optimize LLM inference.


Exploiting the Cache for Data Theft

Researchers found that both prefix caching and semantic caching, techniques used by many major LLM providers, can inadvertently leak information about what users type. By measuring response times, attackers can potentially reconstruct private user queries with alarming accuracy.
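To make the timing signal concrete, here is a minimal sketch (not the researchers' tooling) of how an attacker might compare a candidate prefix's latency against a cold baseline to infer a shared-cache hit. The endpoint `API_URL`, the probe prompts, and the 0.8 threshold are all illustrative assumptions:

```python
import statistics
import time

import requests  # third-party HTTP client

API_URL = "https://llm.example.com/v1/completions"  # hypothetical endpoint

def measure_latency(prompt: str, trials: int = 5) -> float:
    """Return the median round-trip time for a prompt, in seconds."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        requests.post(API_URL, json={"prompt": prompt}, timeout=30)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# A candidate that responds much faster than a cold baseline has likely
# hit a cache entry populated by another user's earlier query.
baseline = measure_latency("zq8 xkv unlikely cold prompt")
candidate = measure_latency("I have been diagnosed with")
if candidate < 0.8 * baseline:  # illustrative threshold, not from the paper
    print("candidate prefix likely matches a cached query")
```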

The lead researcher said, “Our work exposes the security vulnerabilities that accompany performance optimizations, highlighting the critical need to prioritize privacy and security alongside improvements to LLM inference.”

“We propose a novel timing-based side-channel attack to execute input theft in LLM inference. The cache-based attack faces the challenge of constructing candidate inputs within a large search space to hit and steal cached user queries. To address this challenge, we propose two primary components.”

“The input constructor employs machine learning and LLM-based methods to learn correlations between words, along with optimized search mechanisms for generalized input construction.”
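As a rough illustration of how such a constructor might operate, the sketch below greedily extends a recovered prefix one word at a time, treating the timing measurement as a hit/miss oracle. Both helper functions are hypothetical stand-ins, not the paper's implementation:

```python
def next_word_candidates(prefix: str) -> list[str]:
    """Hypothetical stand-in for an LLM that proposes likely next words."""
    return ["diabetes", "hypertension", "asthma"]  # placeholder vocabulary

def looks_like_cache_hit(prompt: str) -> bool:
    """Hypothetical timing oracle, e.g. built on measure_latency above."""
    raise NotImplementedError

def reconstruct_query(seed: str, max_words: int = 20) -> str:
    """Greedily extend `seed` while extensions keep hitting the cache."""
    prefix = seed
    for _ in range(max_words):
        for word in next_word_candidates(prefix):
            extended = f"{prefix} {word}"
            if looks_like_cache_hit(extended):
                prefix = extended  # keep the extension and keep growing
                break
        else:
            break  # no candidate extends the cached prefix; stop
    return prefix
```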

The attack framework demonstrated impressive results across various scenarios:

  • 87.13% accuracy in determining cache hit prefix lengths
  • 62% success rate in extracting exact disease inputs in medical question-answering systems
  • Up to 100% semantic extraction success rates for legal consultation services

These findings raise significant concerns about the privacy of user interactions with LLM-powered applications in sensitive domains such as healthcare, finance, and legal services.

The research team emphasizes the need for LLM service providers and developers to reassess their caching strategies. They suggest implementing robust privacy-preserving techniques to mitigate the risks associated with timing-based side-channel attacks.
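One generic, deliberately simple example of such a technique is flooring response latency so that cache hits and misses become indistinguishable to a remote observer. The sketch below assumes an async request handler and is not drawn from the paper; per-user cache partitioning would be a complementary defense:

```python
import asyncio
import time

MIN_LATENCY_S = 0.25  # illustrative latency floor; tune per deployment

async def answer_with_padding(handle_request, request):
    """Serve a request, but never respond faster than MIN_LATENCY_S.

    Padding fast (cached) responses up to a fixed floor removes the
    hit/miss timing signal, at the cost of some latency on cache hits.
    """
    start = time.perf_counter()
    result = await handle_request(request)
    elapsed = time.perf_counter() - start
    await asyncio.sleep(max(0.0, MIN_LATENCY_S - elapsed))
    return result
```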

This study urges the AI community to address the delicate balance between performance optimization and user privacy, as LLMs continue to play an increasingly crucial role in various industries.

