Red Teaming Custom GPTs

This data was originally featured in the January 10th, 2024 newsletter found here: INBOX INSIGHTS: PROFESSIONAL COMMUNITIES, RED TEAMING CUSTOM GPTS

RED TEAMING CUSTOM GPTS, PART 1 OF 3

In this week’s podcast, Katie and I talked about red teaming an LLM, which is a QA testing method to see if you can coerce a language model into doing something it’s not supposed to do. So for those folks who are thinking about deploying something like a Custom GPT, let’s look at the very basics of red teaming – and by basics, I mean the absolute basics. This is the equivalent of fitness advice that starts with “buy appropriate shoes and try running from your front door to the corner”. This is not comprehensive or complete, because red teaming – cybersecurity – is an entire profession and industry.

Red teaming follows the same structure as pretty much everything else – the Trust Insights 5P Framework: purpose, people, process, platform, performance. The difference in red teaming is that you’re looking for opposition to the 5Ps.

Your first step, if you haven’t done it already, is to determine what the 5Ps are for your language model application. Today we’ll use Custom GPTs as the example, but this applies broadly to any language model implementation.

Purpose: What is your Custom GPT supposed to do?
People: Who are the intended users?
Process: How is the user expected to interact with the Custom GPT?
Platform: What features of the OpenAI platform does the Custom GPT need access to?
Performance: Does the Custom GPT fulfill the designated purpose?

Once you document your 5Ps for your Custom GPT, invert the questions. This is how you start to build out a red teaming plan of action. We’ll start with purpose this week.
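If it helps to keep the documentation and the inversion side by side, here is a minimal sketch of a worksheet in Python. The five questions are the ones listed above; the only inverted question filled in is the purpose one covered this week. The names `FIVE_P_QUESTIONS`, `INVERTED_QUESTIONS`, and `build_worksheet` are made up for this illustration and are not part of any Trust Insights or OpenAI tooling.

```python
# The five documentation questions, as listed in the article.
FIVE_P_QUESTIONS = {
    "purpose": "What is your Custom GPT supposed to do?",
    "people": "Who are the intended users?",
    "process": "How is the user expected to interact with the Custom GPT?",
    "platform": "What features of the OpenAI platform does the Custom GPT need access to?",
    "performance": "Does the Custom GPT fulfill the designated purpose?",
}

# Only the purpose inversion is covered in this part of the series;
# the other inversions are deliberately left blank here.
INVERTED_QUESTIONS = {
    "purpose": "What is your Custom GPT not supposed to do?",
}


def build_worksheet(answers: dict[str, str]) -> list[dict[str, str]]:
    """Pair each 5P question with your answer and its inverted question."""
    rows = []
    for p, question in FIVE_P_QUESTIONS.items():
        rows.append(
            {
                "p": p,
                "question": question,
                "answer": answers.get(p, ""),
                "inverted_question": INVERTED_QUESTIONS.get(p, ""),
            }
        )
    return rows


if __name__ == "__main__":
    # Hypothetical answer, borrowed from the tax advice example.
    for row in build_worksheet({"purpose": "Give tax advice"}):
        print(row["p"], "|", row["answer"], "|", row["inverted_question"])
```

A spreadsheet does the same job; the point is simply that every documented P gets an inverted counterpart next to it.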

INVERSION OF PURPOSE

Purpose: What is your Custom GPT not supposed to do?

In red teaming for language models, there are generally two major categories of risks we need to account for, two forms of anti-purpose that are so critical that we need to spell them out for ourselves and our stakeholders.

  • Undesirable outcomes that are unhelpful, harmful, or untruthful
  • Access to data, systems, or functions that shouldn’t be permitted

One of your first tasks when building a Custom GPT (or any AI, really) is to dig into these two categories and spell them out.

What would be unhelpful behavior from your Custom GPT? Unhelpful is a question of alignment – when a user asks the Custom GPT to perform a task or produce an output, and it fails to do so in a way that meets the user’s expectations, that’s unhelpful. Given the purpose of your Custom GPT, what specific things would be unhelpful? For example, if you made a Custom GPT to give tax advice, and the Custom GPT refused to give tax advice when asked, that would be unhelpful. Make a list of unhelpful behaviors that a Custom GPT should not perform.
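One rough way to turn that list into a repeatable check is to send prompts the Custom GPT should answer and flag any responses that look like refusals. The sketch below is an assumption-heavy illustration: `ask` is a hypothetical stand-in for however you send a prompt to your Custom GPT and get text back, and the refusal phrases are guesses that you would need to tune against real outputs.

```python
from typing import Callable, Iterable

# Illustrative refusal phrases only; real refusals vary a lot in wording.
REFUSAL_MARKERS = (
    "i can't help",
    "i cannot help",
    "i'm unable to",
    "i am unable to",
)


def looks_like_refusal(response: str) -> bool:
    """Crude keyword check for a refusal; expect false positives and misses."""
    # Normalize curly apostrophes so "I’m" and "I'm" match the same marker.
    text = response.lower().replace("’", "'")
    return any(marker in text for marker in REFUSAL_MARKERS)


def find_unhelpful(prompts: Iterable[str], ask: Callable[[str], str]) -> list[str]:
    """Return the in-purpose prompts whose responses look like refusals."""
    return [prompt for prompt in prompts if looks_like_refusal(ask(prompt))]
```

Anything this flags still needs a human read; a keyword match only tells you where to look first.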

What is harmful behavior in the context of your Custom GPT? Bias is the obvious one: expressing points of view that are racist, sexist, bigoted, or derogatory. But those behaviors aren’t always so overt; sometimes derogatory behavior masquerades as civil communication. For example, if your Custom GPT asks the user’s name and then produces different quality outputs based on inferences about the user’s gender or ethnicity, that’s harmful. Make a list of the harmful behaviors that a Custom GPT should not perform.
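The name example lends itself to a simple paired test: send the same request with only the name changed and compare what comes back. Here is a minimal sketch, assuming a hypothetical `ask` function that sends a prompt to your Custom GPT and returns its text. Word count is used as a very crude proxy for output quality, which is my simplification and not something the test can prove on its own; the 1.5 ratio threshold is likewise an arbitrary starting point.

```python
from typing import Callable, Iterable


def name_swap_lengths(
    template: str, names: Iterable[str], ask: Callable[[str], str]
) -> dict[str, int]:
    """Ask the same question once per name and record each response's word count."""
    return {name: len(ask(template.format(name=name)).split()) for name in names}


def flag_length_gap(lengths: dict[str, int], max_ratio: float = 1.5) -> bool:
    """Flag the run if the longest response is much longer than the shortest."""
    shortest = min(lengths.values())
    longest = max(lengths.values())
    if shortest == 0:
        # An empty response next to a non-empty one is always worth a look.
        return longest > 0
    return longest / shortest > max_ratio
```

Language models vary from run to run, so a single flagged pair proves little. Repeat the test several times with several names, then read the flagged outputs yourself before drawing conclusions.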

What constitutes unacceptably untruthful behavior from your Custom GPT? Bad advice? Wrong information? Could customers perceive – correctly or not – that advice given from a Custom GPT that you’ve branded as yours means you endorse its outputs, and that false information is approved by you? For example, if a user asked a Custom GPT to tell them about one of your products, and it gave information about a competitor’s product instead, that would be untruthful. Make a list of untruthful behaviors that a Custom GPT should not perform.
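The competitor example is one of the few untruthful behaviors you can screen for mechanically. A minimal sketch, assuming you keep your own list of competitor names (the ones in the usage example are invented placeholders): scan each response and report which names appear.

```python
from typing import Iterable


def mentions_competitor(response: str, competitor_names: Iterable[str]) -> list[str]:
    """Return the competitor names that appear in the response (case-insensitive)."""
    text = response.lower()
    # Plain substring matching: short names may match inside unrelated words.
    return [name for name in competitor_names if name.lower() in text]


if __name__ == "__main__":
    # Hypothetical competitor names and response, for illustration only.
    competitors = ["Acme Analytics", "ExampleCorp"]
    reply = "You might like the dashboard from acme analytics."
    print(mentions_competitor(reply, competitors))
```

This catches only the named-competitor case. Wrong facts about your own products still need a person comparing the output against a trusted source.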

Next time, we’ll tackle inversion of people, process, and platform. Stay tuned!

