Brewing Cybersecurity Insights

Category: Risk Management

The Digital Shadow

Shadow and Ghost Data in cloud computing.

It is a pleasure to present an article in collaboration with Fabrizio Saviano.

Fabrizio is a dynamic cybersecurity leader with extensive experience as a Chief Information Security Officer (CISO) for top companies. He also served as an Intrusion Squad Officer at Polizia Postale, bringing a wealth of knowledge in cyber defense and security strategy. Fabrizio is the author of three influential books, including Cybercognitivismo and Come non essere spiati su internet, which explore the nuances of digital privacy and cybersecurity. His work combines practical expertise with a passion for educating others on navigating the digital world safely.

So without further ado…

Shadow Data and Ghost Data in the Era of Cloud Computing

In the era of cloud computing, data security has become a major concern for both individuals and organizations. Beyond the well-known concept of Shadow IT, two lesser-known but equally dangerous phenomena are emerging: Shadow Data and Ghost Data. These represent a new frontier in cybersecurity, bringing unique challenges and significant risks that need to be addressed with care and awareness.

Shadow IT: The Hidden Precursor

Before delving into Shadow Data and Ghost Data, it is important to understand the context in which they emerge. Shadow IT refers to the unauthorized use of cloud services such as WhatsApp, Gmail, WeTransfer, or Dropbox within an organization. These tools can be useful but create security, compliance, and cost control issues when used without IT department supervision.

Shadow Data: The Hidden Threat in the Cloud

Shadow Data is an extension of the concept of Shadow IT. It involves content that is improperly uploaded, saved, and shared on cloud storage platforms like Microsoft OneDrive, Google Drive, or Amazon Web Services. Their elusive nature makes it difficult for corporate IT security teams to monitor and protect this data. Risks associated with Shadow Data include insecure sharing, indexing of sharing URLs by search engines, and exposure of sensitive data.

One of the most evident dangers is vulnerability to online searches. Often, URLs used to share data can be discovered through hacking techniques like Google Dorks, making information potentially accessible to anyone. Additionally, incidents like those involving Amazon’s S3 storage have shown that even the most reliable cloud services can be vulnerable.

Ghost Data: The Phantom of Digital Past

Ghost Data represents an even more insidious risk. These are data that users believe they have deleted from cloud services but actually persist in providers’ storage systems. This phenomenon underscores a fundamental truth: data deletion in the cloud is not always permanent. The origins of Ghost Data can vary from incomplete file deletion to device disposal without proper data erasure, to loss or theft of inadequately protected devices.

The Extent of the Problem: Alarming Data

Recent research has revealed worrying data about the impact of Shadow Data and Ghost Data. It is estimated that 60% of security problems in cloud accounts stem from unprotected sensitive data. Furthermore, about 30% of analyzed cloud data stores contain Ghost Data, with 58% of this data including sensitive or highly sensitive information. These numbers highlight the urgency of addressing the issue of Shadow and Ghost Data seriously and proactively.To mitigate the risks associated with Shadow Data and Ghost Data, a multi-layered approach is essential.

First and foremost, user education and awareness are crucial. Users must be trained on the risks of improper data sharing and correct privacy practices in cloud services. It is also important to promote the use of strong passwords and develop a culture of cybersecurity within the organization.

Monitoring and Control are equally crucial. Companies should implement software for identifying and analyzing Shadow and Ghost Data, establish clear policies for their management, and conduct periodic reviews of data present in cloud systems and company devices.

Proactive protection includes using encryption tools for sensitive data and implementing secure backup systems. Additionally, solutions for secure and permanent data deletion are essential to ensure that deleted data cannot be recovered in the future.

Shadow Data and Ghost Data represent a growing challenge in the cybersecurity landscape. With the continuous evolution of cloud technologies and increasing reliance on these services, it is crucial that individuals and organizations remain vigilant and proactive in managing their digital data. The cybersecurity of the future will not only be a matter of advanced technology but also awareness and responsible behavior. Only through continuous and conscious commitment can we hope to navigate safely through the increasingly deep and complex waters of the digital world.

Unveiling the Risk Landscape of LLMs

A Comprehensive approach proposal

Risk Landscape of LLM
Created with Bing Image Creator

Greetings, readers! Welcome back to our exploration of LLM (Large Language Models) security risks. In my previous posts (here and here), I discussed the significance of understanding these risks. That’s why I am excited to share my participation in the creation of the OWASP Top 10 Risk for Large Language Model Applications 😊.

In this article, we will delve into the challenges involved in defining an approach to create the Top 10 LLM security risk list and propose a holistic approach to address them.

The Challenges in Defining a Top 10 LLM Security Risk List

As we embark on this endeavor, we encounter several challenges that need to be overcome:

  1. Evolving Landscape: LLMs are rapidly evolving, with new models (including Open ones with no restrictions) and attack techniques emerging. Keeping the evaluation comprehensive to address emerging risks is challenging but necessary.
  2. Complexity and Interdependencies: LLMs involve various components, including training data, algorithms, infrastructure, and user interactions. Understanding their interdependencies and how risks propagate across them requires careful analysis. Some components are already covered by other Top 10s but they might be so relevant that we might want to include them
  3. Lack of Standardization: Inconsistencies in terminology and definitions related to LLM security risks can lead to inconsistencies in risk assessment and mitigation. Establishing standardized language and frameworks is vital and luckily OWASP will help a lot in this. A couple of examples below:
    • I had a discussion about Intellectual Property Theft. I wrongly assumed that we were speaking only the theft of the LLM model itself, but if we think about it there are other king of IP theft, e.g., the weights are intellectual property, or if some users provide IP to the LLM, the LLM will learn from that and might provide the IP to the next users. As I said I didn’t consider those as for me those were privacy risks… but these are also ML risks
    • We had discussions on how we should call the “hallucination” risk (e.g., is this term humanizing LLMs? Shuldn’t something as “Confabulation” be better? Maybe, but hallucination is already LLM Jargon).
  4. Multidimensional Risks: LLM risks encompass technical, ethical, legal, and societal aspects. Incorporating these perspectives and achieving a holistic understanding is essential.
  5. Risk Prioritization: Determining the significance of each risk and prioritizing them within the Top 10 list is complex. Professional judgment and a thorough assessment are needed.
  6. Balance of Granularity: Striking the right balance between granularity and practicality is crucial. The Top 10 list should be concise, understandable, and actionable, while capturing the breadth and depth of LLM security risks.

Addressing the Challenges with TARA

“Necessity makes the method” used to say one of my old bosses, and to tackle these challenges, I propose adopting a TARA (Threat Analysis and Risk Assessment) method, which involves identifying potential threats, analyzing their likelihood and impact, and evaluating associated risks.

First Step: Threat Modeling

We start conducting a comprehensive threat modelling exercise, defining threat categories specific to LLMs and documenting potential threats within each category.

Below you will find my proposal of threat list, it is not supposed to be 100% correct, just to give an idea on how it would look like. To do so I used OWASP v0.1, Adam AI centered Top 10 some of the Cybersec risks and ML risks from this super insightful article.

Category Threats Sub-Threat 
LLM-specific Prompt Injection Direct Prompt Injection 
Second Order Injection 
Cross-content injections 
Machine Learning Training-Time Attacks Training Data Poisoning 
Byzantine attacks 
Decision-Time Attacks Inference 
Evasion Attacks  ???
Oracle Attacks Extraction 
Inversion 
Membership Inference 
Model Theft Model Theft
Surrogate Model
Statistical Attack Vectors Bias  Drift 
Model Hijacking Attacks Backdoors 
Trojanized models 
User specific Overreliance on LLM-generated ContentHallucination
Bias
Inexplicability
Operational  ???Inadequate AI Alignment
Application /  
Infrastructure 
Insecure development Inadequate Sandboxing 
Improper Error Handling
Insecure deploymentUnauthorized Code Execution 
SSRF Vulnerabilities
Insufficient Access Controls
Personal Data /  
Intellectual Property 
 ???Data Leakage
IP Theft
A proposal of LLM Threats

To be more accurate, this exercise leans more towards threat identification rather than threat modelling.

Please note that I’m not sure where all the sub-threats should be. For instance an ML threat might be the root cause of the existence of some User specific or Personal Data/IP threats…

The following TARA Steps

The next steps would be:

  1. Risk Evaluation: Estimate the likelihood and impact of each identified threat, considering various perspectives and dimensions. Combine these factors to calculate the overall risk level associated with each threat.
  2. Risk Prioritization: Prioritize risks based on their significance and impact, using professional judgment and a holistic perspective to choose the Top 10.
  3. Mitigation Strategies: Define appropriate mitigation and prevention strategies to address the identified risks effectively.

Those phases are all straightforward, the only difficult part could be understanding the impact. What angle do we need to consider? For an organization of course many of those threats could result in data breaches, financial losses, reputational damage, legal implications, etc. What if we consider a non-enterprise end-user? And the LLM owner? E.g., the latter would be the only one that wants to avoid model theft…

Conclusion

LLMs are at the forefront of technological advancement, and understanding their risks is paramount for secure adoption. By adopting a comprehensive approach like TARA, we can identify, assess, and mitigate these risks more effectively.

Collaboration, standardization, and a multidisciplinary perspective are key to success in this endeavor. Let’s work together to create a safer LLM landscape and pave the way for responsible and secure deployment.

Join me for future articles as we explore LLM security risks and discuss practical mitigation strategies.

OWASP vs. Cybersec.Café’s LLM Top Security Risks

A Follow-Up Comparative Analysis

LLM Top Security Risks
Created with Bing Image Creator

Following our previous exploration of Large Language Models’ (LLMs) security risks, I am now presenting a comparative analysis of the risks highlighted by Cybersec.Café and those identified by OWASP (Open Web Application Security Project). OWASP is a renowned authority in web application security and has recently published a preliminary list of LLM security risk.

LLM Top Security Risks Comparative Analysis

1. Jailbreaking

This corresponds to several risks in OWASP’s list: LLM03:2023 – Inadequate Sandboxing, LLM04:2023 – Unauthorized Code Execution, LLM05:2023 – SSRF Vulnerabilities, LLM08:2023 – Insufficient Access Controls, and LLM09:2023 – Improper Error Handling.

In my perspective, Jailbreaking refers to the process of gaining unauthorized access to and control over an LLM’s underlying systems or processes, while OWASP risks might pertain more to the system or application underpinning the LLM rather than the LLM itself. While jailbreaking could serve as an entry point for exploiting these OWASP risks, the mitigation strategies may not be fully effective in all cases.

By articulating these risks separately, OWASP’s approach might help define individual mitigation actions.

2. (Direct) Prompt injection, 3. Second-order injections

These risks directly align with OWASP’s LLM01:2023 – Prompt Injections, although OWASP’s category encompasses all forms of prompt injections.

4. Data Poisoning

This directly aligns with OWASP’s LLM10:2023 – Training Data Poisoning.

5. Misinformation

This risk somewhat corresponds to OWASP’s LLM06:2023 – Overreliance on LLM-generated Content, especially in scenarios where overreliance results in misinformation. However, OWASP’s category includes other potential issues, such as bias, making it more comprehensive.

6. Malicious content generation

This risk intersects with OWASP’s LLM07:2023 – Inadequate AI Alignment. The link might seem tenuous, but the principle remains that an LLM’s use case should not be creating malicious content.

7. Weaponization, 8. LLM-delivered attacks

These risks overlap with OWASP’s LLM04:2023 – Unauthorized Code Execution and LLM07:2023 – Inadequate AI Alignment. These risks underscore the potential for LLMs to be exploited for malicious purposes, be it coding malware or delivering attacks.

9. Abuse of vertical LLM APIs

This risk relates to OWASP’s LLM07:2023 – Inadequate AI Alignment and LLM08:2023 – Insufficient Access Controls. Poor AI alignment could potentially lead to misuse of the LLM, and similarly, poor access control could result in unauthorized actions.

10. Privacy and Data Leakage

This risk directly corresponds to OWASP’s LLM02:2023 – Data Leakage.

Conclusion

In creating this top 10 and comparing it with OWASP’s list, I observed that the key differences lie in the granularity and standardization of terminology.

The field of LLM security is still relatively nascent, and there is a noticeable need for standardization of terms. This comparison has shed light on this fact.

I hope that OWASP’s risk list will bring the critical security considerations for LLMs into sharper focus, laying a solid foundation for further discussions and the development of security measures in this rapidly evolving technology sphere.

The Top 10 Large Language Models Security Risks

Understanding the Top 10 Security Risks Associated with Large Language Models (LLMs)

Top 10 Large Language Models Security Risks
Image by Bing Image Creator

Introduction

Large Language Models (LLMs) have revolutionized the field of artificial intelligence and natural language processing, but with great power comes great responsibility. As LLMs become increasingly prevalent, it’s essential to understand the potential security risks they pose.

In light of OWASP’s recent announcement of the OWASP Top 10 Risk for Large Language Model Applications, this article aims to explore my perspective on the top 10 security risks associated with LLMs. I am eager to compare and contrast these risks with the ones OWASP will publish.

Cybersec.Cafè Top 10 Large Language Models Security Risks

  1. Jailbreaking: Bypassing the security measures of an LLM to gain unauthorized control and exploit it for malicious purposes.
  2. Prompt injection: Crafting prompts to influence the model’s output, which can lead to biased, offensive, or harmful text generation.
  3. Second-order injections: Advanced prompt injection techniques, where the prompt itself is generated by an LLM, making it harder to detect and prevent attacks. Note: I’m not considering cross-content injections (a type of prompt injection where the prompt is generated in one context and then used to generate text in another context – this can be used to generate text that is relevant to the first context but harmful in the second context) as I consider still as in between of both risk 2 and 3
  4. Data poisoning: Injecting malicious data into the training dataset, resulting in biased or harmful outputs. Rigorous validation and monitoring are crucial to mitigate this risk. This is actually a Machine Learning (ML) risk that extend to LLMs being that their training is ML based.
  5. Misinformation: Unintentional contribution to the spread of misinformation or support for creating misinformation campaigns.
  6. Malicious content generation: Misusing LLMs to generate persuasive or believable text for phishing or social engineering attacks.
  7. Weaponization: Misusing LLMs to support coding malware or potentially even for malware detection evasion (still a theoretical threat) by generating malware code that evades traditional endpoint detection and response scanners. For example, an LLM could be used to generate malware code that is not detected by traditional Endpoint Detection and Response scanners as the code is generated by an LLM that provides it via API.
  8. LLM-delivered attacks: Using LLMs to deceive users and obtain sensitive information or launch cyber attacks. For example, an LLM could be used to ask a user for sensitive information such as their passwords or credit card number.
  9. Abuse of vertical LLM APIs: Exploiting LLMs for purposes outside their intended use cases, potentially undermining the intended business model.
  10. Privacy: LLMs are trained on massive datasets that contain also personal information, raising privacy concerns if the models generate text like the confidential data it was trained from. This happens for instance with Inference Attacks or Model Inversion Attacks these attacks attempt to infer or recreate information about the training data from the outputs of an ML model.

Some other thoughts

Conclusion

While the risks associated with LLMs may seem challenging, we don’t know yet if they are insurmountable. As of today, we still lack comprehensive solutions to mitigate most of these risks compared to other security domains like applications and mobile devices. Additionally, due to the “black box” nature of LLMs, understanding their inner workings presents challenges in determining the appropriate security measures to adopt. Furthermore, regulatory frameworks surrounding LLM use are still evolving, as discussed in my geopolitical analysis of the ChatGPT block in Italy.

LLM security contains a multitude of unknown unknowns, and it necessitates further research and mitigation strategies to effectively safeguard against these risks. Awareness serves as the critical first step towards achieving effective cybersecurity if it will be ever possible to reach it.

Recommended Readings

To delve deeper into the topic, I recommend reading the following insightful resources:

  • https://www.wired.com/story/chatgpt-jailbreak-generative-ai-hacking/
  • https://themathcompany.com/blog/data-poisoning-and-its-impact-on-the-ai-ecosystem
  • https://spectrum.ieee.org/ai-cybersecurity-data-poisoning
  • https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
  • https://ambcrypto.com/heres-how-to-jailbreak-chatgpt-with-the-top-4-methods-5/
  • https://www.techopedia.com/what-is-jailbreaking-in-ai-models-like-chatgpt
  • https://www.theregister.com/2023/04/26/simon_willison_prompt_injection/
  • https://blogs.itemis.com/en/model-attacks-exploits-and-vulnerabilities
  • https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/
  • https://hiddenlayer.com/research/the-dark-side-of-large-language-models/
  • https://hiddenlayer.com/research/the-dark-side-of-large-language-models-2/
  • https://embracethered.com/blog/posts/2023/ai-injections-direct-and-indirect-prompt-injection-basics/
  • https://embracethered.com/blog/posts/2023/ai-injections-threats-context-matters/
  • https://www.mufeedvh.com/llm-security/

Relying on Security-by-Luck

The Interplay of Risk, Investment, and… Luck in Cybersecurity

Security-by-Luck
Photo by Djalma Paiva Armelin from Pexels

Last weekend, I came across a LinkedIn post illustrating how numerous companies were breached despite having SOC2, ISO 27001, and PCI-DSS certifications. This observation prompted me to reflect.

Initially, my thought was that there isn’t a direct correlation. The data set is rather small and doesn’t account for all the certified companies that have avoided breaches. Furthermore, certification is a form of assurance that some level of security is in place, signaling to potential attackers that there is valuable data worth protecting.

In the cybersecurity realm, we frequently emphasize robust defense mechanisms, proactive risk assessments, and constant vigilance. Today, however, I want to navigate less charted territory: “security-by-luck”.

What do you mean with Security-by-Luck?

My definition of “Security-by-luck” would be the situation where a company, despite having weak or inadequate security measures, remains unbreached due to factors outside its control, such as the attackers’ choices, capabilities, or sheer chance.

To clarify, I’m not endorsing this as a strategic approach – that would be reckless. Rather, I aim to highlight a crucial facet of cybersecurity – the constant interplay of risk, investment, and a dose of luck.

In a previous article, I discussed on the challenge of defining ‘how much security is enough’. No matter how much an organization invests in security, the threat of an attack persists. Conversely, not all lightly-defended organizations will suffer breaches, too lightly defended (even if those that are inadequately defended become low-hanging fruit for cybercriminals). However, over-investment in security isn’t the solution either, as organizations have other business objectives to meet. So, the question arises, where do we draw the line?

I’m not suggesting that companies should stop investing in cybersecurity and merely hope for the best. Instead, I want to stress the importance of making calculated risks.

To illustrate this, consider four hypothetical companies, each investing differently in cybersecurity…

The contenders:

  • Company A: Does the bare minimum for security (e.g., has an antivirus installed)
  • Company B: Complies with statutory requirements and uses common sense
  • Company C: Adheres to a cybersecurity standard and has obtained certification (like SOC 2, ISO 27001, PCI-DSS, HITRUST, etc.)
  • Company D: Follows all major best practices and has adopted bleeding-edge security solutions

Each of these companies, regardless of their investment level, can either be breached or remain secure. Here’s how:

Vulnerabilities-based Attacks:

  • A vulnerability in their system gets exploited – Company A gets breached.
  • Company B, which patches vulnerabilities quarterly, gets breached when an attacker exploits a flaw within the time window before it gets patched.
  • Even Company C, which patches vulnerabilities monthly, gets hacked, as the attackers were quicker on their feet.
  • Company D has no known unpatched vulnerabilities (a near impossibility in real life, but let’s go with it). However, there’s a zero-day vulnerability that they aren’t aware of (I know this is the definition of zero day). An attacker discovers and exploits it – Company D gets breached.

Let’s assume, for a moment, that all these companies understand this risk and decide to have all vulnerabilities patched (again a near impossibility) and are lucky there aren’t any unexploited zero-day vulnerabilities. You might think they’re safe. But what if an attacker targets their people instead?

People-based Attacks:

  • An attacker successfully executes a phishing attack on Company A, leading to a breach.
  • Despite having good email security and having conducted a phishing simulation last year, Company B falls prey to a successful social engineering attack.
  • Company C suffers a sophisticated MFA fatigue attack and gets breached.
  • In Company D, an attacker bribes an employee to gain access to the system (including credentials and MFA, as seen in the Lapsus$ attacks last year).

Even if the organization decide to invest in a solid cyber culture and luckily their employees are equipped with strong ethics to resist such attempts, are the potential threats truly over?

Unfortunately, no, the threats aren’t over. They are susceptible to…

Supply Chain Attacks:

The attack surface extends to vendors, giving birth to a new cycle of vulnerabilities and people-based attacks. Hence, even Company D could harbor cybersecurity points of failure within their supply chain.

Luck is Not a Strategy

In essence, cybersecurity isn’t merely about investment levels; it’s also about the complex interplay of factors that contribute to a company’s overall risk profile. Even the most secure organization cannot completely rule out the possibility of a breach. Given the dynamic nature of the landscape, absolute security is a virtual impossibility, making a small element of ‘luck’ an undeniable part of the equation.

Regrettably, many companies have relied solely on this ‘luck’ factor for so long that they’ve now become easy targets.

‘Security-by-Luck’ should not be a strategy in itself, but understanding its role in the broader cybersecurity framework is essential. The goal should always be to optimize investment, maintain a robust defense mechanism, foster employee awareness, and devise sound strategies to mitigate potential risks, including supply chain risks. This involves striking a balance, understanding that no solution offers 100% protection, and ensuring readiness to respond effectively (by having incident response plans and exercises conducted) if or when a breach occurs by conducting regular incident response plans and exercises.

Conclusion

In conclusion, while we can’t depend entirely on luck, or as the Cybersecurity community usually call it, the residual-risk, acknowledging its existence, could make us more attuned to the realities of the ever-evolving cybersecurity landscape. The presence of residual risk is an undeniable part of cybersecurity, and acknowledging without relying on it might encourage a more realistic approach towards cybersecurity strategy and implementation.

Is too much Security a Big Cyber Risk?

Finding the Right Balance

Photo by Pixabay from Pexels

In the ever-evolving world of cybersecurity, finding the right balance between protection and flexibility is crucial for organizations. While it might seem counterintuitive, having too much security can be just as risky as having too little. Overly restrictive measures can slow down or even block business operations, pushing employees to bypass protocols and increasing risk.

In a previous article, we discussed how Zero Trust can help organizations achieve both security and flexibility. In this article, we’ll explore the risks of too much security and provide guidance on finding the perfect balance to safeguard your organization without stifling innovation. But why is finding this balance so important? Let’s delve deeper into the consequences of not having the righ security balance and how it can negatively impact your organization.

  1. Understanding the risks of too little security:
  • High agility but increased cyber risk
  • The impact of security incidents can be severe
  • Lack of preparedness and response plans
  1. The dangers of too much security:
  • Business operations are slowed or blocked
  • Employees may bypass security protocols, leading to shadow IT
  • Costs and resources may be wasted on unnecessary security measures
  1. Finding the right balance:
  • Conduct a thorough risk assessment to identify threats and vulnerabilities
  • Prioritize security measures based on the organization’s unique needs and risk profile
  • Implement a layered approach to security, focusing on prevention, detection, and response
  • Continuously monitor and evaluate the effectiveness of security measures
  1. Fostering a security-aware culture:
  • Encourage a culture of security awareness and accountability throughout the organization
  • Provide regular training and education for employees on security best practices
  • Establish clear policies and guidelines for secure behavior
  1. Embracing flexibility and adaptability:
  • Stay informed of the latest cybersecurity trends and threats
  • Regularly reassess and adjust security measures as needed
  • Adopt a proactive approach to security, anticipating potential risks before they materialize

Conclusion: Striking the right balance between too little and too much security is a delicate task, but it’s essential for organizations looking to protect themselves from cyber threats while maintaining business agility. By understanding the risks associated with both extremes and implementing a well-rounded cybersecurity strategy, businesses can reduce their risk exposure and thrive in today’s complex digital landscape.

© 2024 CyberSec.Cafe