
So What? How to use Generative AI to Analyze Data

So What? Marketing Analytics and Insights Live

airs every Thursday at 1 pm EST.

You can watch on YouTube Live. Be sure to subscribe and follow so you never miss an episode!

In this episode of So What? The Trust Insights weekly livestream, you’ll learn how to use generative AI to analyze data, starting with what not to do. You’ll discover how to use generative AI to analyze data effectively, avoiding common mistakes. You’ll explore practical applications of this cutting-edge technology for marketing analytics. Finally, you’ll learn how to develop data-driven content ideas and optimize your content strategy using these AI tools.

Watch the video here:

So What? How to use Generative AI to Analyze Data

Can’t see anything? Watch it on YouTube here.

In this episode you’ll learn:

  • Best practices to use generative AI to analyze data
  • What you can and can’t do with generative AI
  • Analysis examples using social media marketing data

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.

Katie Robbert – 00:33
Well, hey, everyone! Happy Thursday! Welcome to So What?, the Marketing Analytics and Insights live show. We have everybody on the show today. Welcome, Chris! John, how’s it going? So close. I’m glad to see things don’t change when the New Year does. This week we’re talking about how to use generative AI to analyze data. This is something Chris and I were talking about on the podcast earlier this week. If you want to catch that episode, you can go to TrustInsights.ai/TI-podcast. We were talking about what you can and can’t do with generative AI. I think there’s a misconception that you can give generative AI your data, and it’s going to analyze it in the traditional sense—do calculations and all that stuff—and that’s not at all how that works.

Katie Robbert – 01:27
So what we wanted to do on the live stream today is give you a better understanding of what actually happens when you want to use generative AI to analyze your data. So, Chris, where would you like to start today?

Christopher Penn – 01:40
Let’s start with what not to do.

Katie Robbert – 01:42
And it’s a long list.

Christopher Penn – 01:45
It is a long list, but the big thing is this: analytics inherently involves mathematics and computation. Computation and calculation are the two things that generative AI is horrible at. That comes down to architecture. A language model is good at language; math is not language. Math is symbology—it’s a totally different way of working with things that look like letters but aren’t. The way language models work is they predict the next likely thing based on all the data they’ve seen. So no matter which model you’re using—ChatGPT, Claude, Gemini, Meta—they’ve all been trained on trillions and trillions of pages of information. When a language model tries to do math, what it’s doing is looking at the history of everything it’s seen for a set of numbers and saying which numbers are similar to ones it has seen in the past.

Christopher Penn – 02:40
This is not a recipe for success. You just can’t do that. That’s not how math works. And so the number one thing not to do is to ask a language model to do math. However, if you have a language model, take the numbers in their final form—the final computations—like, “Hey, your site traffic has been up 40%, your email opens are down 5%,” or whatever—then it can craft language around those things because it’s seen plenty of examples of, like, here’s a report, and this report says 43% site traffic down is a bad thing. And we’re going to panic and lose our stuff. So that’s the first part: you cannot and should not ask language models to do math. It’s just not going to work.

Christopher Penn – 03:30
Second thing: this is true of all generative AI efforts, but especially when it comes to analytics—the more relevant data you bring to the party, the better it’s going to perform. If you just take a single screenshot out of Google Analytics, for example, with no other context, it’ll come up with some answers, but it won’t be a comprehensive look. It’s just like going to the doctor and saying, “I feel lightheaded,” without telling the doctor anything else—like, “Oh, by the way, I’ve got blood pressure issues, I’ve got a bad diet, I don’t exercise, I eat five cheeseburgers a day.” That one little piece of information is not enough context.

Christopher Penn – 04:15
So when we’re talking about analytics and using generative AI to analyze data, we have to provide as much relevant context as we can. So, as we always say, generative AI is like the world’s smartest, most forgetful intern. They’ve got a PhD in everything, but they can remember nothing. So every time you address these things, you have to provide them as much information as you can.

Katie Robbert – 04:37
I feel like the providing context part is going to be harder than the “don’t use generative AI to do math” part, because you can use Excel or other tools to do the math part. But a lot of times, depending on your role or what you’re being asked to do, you don’t have the context. A lot of companies have reporting and analytics separate from marketing and sales and product and customer support and all those things. And so the person who has the data and is asked to do something with it actually has the least amount of context: What happened? What was the intention of the campaign? How much money did you spend on it? Who was the audience? What are the goals? All of those things. So I just want to acknowledge that piece.

Katie Robbert – 05:25
We could get into a whole different episode about how to solve for those things, but in the context of this particular episode, I just want to acknowledge that I think that’s probably a big part of the problem—the person sometimes being asked to analyze the data has the least amount of context.

Christopher Penn – 05:46
Exactly. We can have the models provide at least a little background context that can help patch some of those holes, but not all of them. Let’s talk about how we would do this. We should first start with an old familiar friend—the 5P Framework—and, specifically, we want to start with the user story. Today we’re going to go very meta: we’re going to analyze the performance of the So What? livestream over the last year. But we need a user story. So, Katie, as the CEO of Trust Insights, evaluating the live stream that we are doing right now, what would you want as our user story for analyzing live stream data?

Katie Robbert – 06:34
Well, I think I would actually go back to the 5Ps first, because the first thing I need is the purpose—the goal; what is the question we’re trying to answer? For those of you watching, I’m making this up on the spot because Chris just sprung this on me. But the purpose of looking at livestream data would be to understand the performance of the shows. But that’s too vague. That doesn’t really tell you what that means. So what I would probably like to know is which of our shows has the most engagement. And you’re going to say, “Well, what do you mean by engagement?”

Katie Robbert – 07:17
And I’m going to say the highest number of views or the most comments or the most downloads or shares or something along those lines—maybe a combination of all the things. But the bigger question is: Is doing the live stream worth our time? John, what kind of goals would you add to that in terms of trying to understand the live stream data?

John Wall – 07:46
That’s a good question. Any data we can get from YouTube would be interesting. I don’t know.

Christopher Penn – 07:53
Yeah.

John Wall – 07:53
As far as other stuff out there, it’s tough. We have a few records in our CRM about where it’s driven traffic and what’s come out of that, but that’s not enough to fuel a model. We definitely need bigger stuff than that. The other one, I guess, would be Ahrefs. There’s a ton of stuff in there about how we’re comparing to other pages that might be enough data to play around with. But, yeah, this is one of the bigger challenges that everybody faces in podcasting and in video—figuring out where to get the source of truth.

Katie Robbert – 08:25
I think that’s analysis paralysis. Chris and I were doing a webinar yesterday, and one of the big things that came up was that sort of analysis paralysis—when you’re presented with too many options, it’s really hard to start to narrow it down. The solution is to start to do user stories. So, Chris, I think that to start with the user story is, as the CEO, I want to understand the performance of the live stream so that we can make decisions about whether or not to keep doing it. Is that a detailed enough user story for you to get started?

Christopher Penn – 09:03
I think so, because I think in a case like that we’re looking for anything. If we have zero viewers, then again it’s not worth doing because there was nobody tuning in. That’s great. Let’s do this. We’re going to start; we’re going to use the Trust Insights Repel framework—if you want a copy of it for free, no forms to fill out, go to TrustInsights.ai/repel. Let’s start with building our prompts. I’m going to switch over into my old handy text editor. Nothing super fancy. And the first part of the Repel framework is the role: You are a YouTube marketing expert. You know YouTube analytics, live streams, live video analytics, YouTube Studio. We’re going to analyze some YouTube data. That’s the action. Here’s the user story we want to answer:

Christopher Penn – 09:53
As the CEO of Trust Insights, I want to analyze and understand the performance of the Trust Insights livestream, So What?, so that I know whether it’s worth doing or not. And now we start the priming process. Before we begin, what do you know about best practices for analyzing YouTube analytics? I’m going to take this, copy it. I’m going to be using Google’s Gemini model today—Gemini 2. You’re welcome to use pretty much any platform that you want—ChatGPT, Claude, Meta—it doesn’t matter, as long as it’s a foundation model that has state-of-the-art performance. DeepSeek, Llama 3.1 405B—any of them are fine. So I’m going to just copy and paste those prompts right in, and we’re going to let the model start to talk.

Christopher Penn – 10:34
One of the foundational principles of language models is that, like YouTubers and sportscasters, they need to talk, and they need to talk a lot—sometimes they just foam at the mouth for a while. In the case of a sportscaster, it’s because they fear dead air. In the case of language models, what we’re doing is invoking piles of statistically relevant information to create, essentially, a much longer prompt, because every time you hit chat in these tools, everything that has happened prior becomes part of the next prompt. So all 1,200 words that this thing just spit out are part of the next prompt. And the next prompt is: What are common mistakes made by less experienced professionals when it comes to analyzing YouTube data?

Christopher Penn – 11:35
You also want to spend some time taking a look at what it said. We have a relatively slim prompt here, and so it’s coming up with sort of the general best practices you might want. If you were doing this in something really important—like finance or health or law—you might not even want to do this generated knowledge prompting. You might want to upload curated sources of known best practices. So these are the best practices for analyzing YouTube data, the common mistakes. Now we’re going to add our third prompt in the priming process: What are some expert tips and tricks for analyzing YouTube data that we haven’t talked about at all yet? This is throwing an error.
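The priming sequence Chris walks through can be sketched as plain data, which makes it easy to reuse. This is a minimal sketch, not an official Trust Insights artifact: the prompt texts are paraphrased from the episode, and the `build_opening_prompt` helper is a hypothetical name for assembling the role, action, and user story into the first message you send.

```python
# Paraphrased from the episode: role, action, user story, then three
# priming questions sent one at a time, letting the model "talk" after each.
ROLE = (
    "You are a YouTube marketing expert. You know YouTube analytics, "
    "live streams, live video analytics, and YouTube Studio."
)
ACTION = "We're going to analyze some YouTube data."
USER_STORY = (
    "As the CEO of Trust Insights, I want to analyze and understand the "
    "performance of the Trust Insights livestream, So What?, so that I "
    "know whether it's worth doing or not."
)
PRIMING_PROMPTS = [
    "Before we begin, what do you know about best practices for "
    "analyzing YouTube analytics?",
    "What are common mistakes made by less experienced professionals "
    "when it comes to analyzing YouTube data?",
    "What are some expert tips and tricks for analyzing YouTube data "
    "that we haven't talked about yet?",
]

def build_opening_prompt() -> str:
    """Assemble the role, action, and user story into the first prompt."""
    return f"{ROLE} {ACTION} Here is the user story we want to answer: {USER_STORY}"
```

Each priming question would be sent as its own chat turn, because the model's answers become part of the context for every subsequent prompt.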

Christopher Penn – 12:32
What this is doing is pushing the model to say, “What tokens have we not invoked and decoded into the chat yet?” So what knowledge have we not brought into the chat yet? And so now we can see things like cohort analysis, funnel analysis, sentiment analysis. So these are all the expert tips and tricks. We’ve now completed the first part of the priming process. Next, we need to get the data in here for it to do analysis. This is where we’re going to go back to what we said at the very beginning of the show: Do not give these tools math tasks to do. They’re bad at them. Rebecca was asking in the chat, “What’s the best platform for data analysis right now?” Code Interpreter, GPT-4.

Christopher Penn – 13:18
She bought the $200/month and you can’t even get fast. The answer is: There is no tool that does math. The workaround, if you have to do advanced data analysis, is to have these tools write Python code, which is a language task to perform advanced data analysis. But, as Katie said on the podcast this week, Excel’s fine.

Katie Robbert – 13:41
Excel is great. Excel is going to do a lot of the calculations that you need; it’s going to take a lot of time to get those calculations put together in generative AI if you’re doing the code. I would definitely just rely on tools like Excel or Google Sheets to do all of those calculations first. The other thing, and we acknowledged this on the podcast too, is that a lot of the systems where you’re getting the data have already done the calculations. So rather than giving generative AI a spreadsheet, you’re just giving them a screenshot of the dashboard. And to acknowledge this—yes, my dog has decided to make an appearance. I apologize.
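As a toy illustration of doing the math outside the model, here is a sketch in plain Python (Excel or Google Sheets would do the same job). The episode data is made up for the example; only the finished numbers at the end would go into the prompt.

```python
import statistics

# Hypothetical episode data; in practice this would come from your
# YouTube Studio export or a spreadsheet.
episodes = [
    {"title": "Episode A", "views": 1200, "comments": 14},
    {"title": "Episode B", "views": 450, "comments": 3},
    {"title": "Episode C", "views": 800, "comments": 9},
]

# Do the calculations in code, not in the language model.
total_views = sum(e["views"] for e in episodes)
average_views = statistics.mean(e["views"] for e in episodes)
top_episode = max(episodes, key=lambda e: e["views"])["title"]

# These finished figures are what you hand to generative AI, e.g.:
# "Total views: 2450; average views per episode: ~817; top episode: Episode A."
```

The point is that the language model only ever sees the final computed values, which it is good at writing about, never the raw numbers it would otherwise try (and fail) to calculate with.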

Christopher Penn – 14:32
Absolutely no apologizing! We love it. You know this wasn’t pre-recorded.

Katie Robbert – 14:37
This is not AI-generated, not pre-recorded.

Christopher Penn – 14:40
This is the real deal. Here’s what we’re going to do: first, we’re going to hit “Save As” and save a PDF. Let’s do landscape just to make sure it all fits on the page. Good—hit save. That’s our first thing. Second, we’re going to dig into the content. Let’s go to advanced mode; I’m going to choose content, then title, and we’re going to specify “So What?”—which is the live stream—because we want to restrict this down. I’m going to take the last year’s worth of data—we’ve got views, watch count, subscribers—and hit good old print and dump this out as a PDF.

Katie Robbert – 15:20
I want to acknowledge that was a fairly straightforward ask of the system because we have very straightforward naming conventions. And also every live stream goes onto a playlist called “So What?,” and so we’ve done a lot of that work upfront to make sure that the data is organized.

Christopher Penn – 15:41
And this is a critically important part of this process—you have to have good data governance. You can’t do this kind of analysis if your data is junk. We’ve got those two things. We should probably go into Google Analytics and just look at the traffic coming from YouTube. So we’re going to start a new exploration. I’m going to choose the last year, the same time frame. Let’s go ahead and create a segment; this is going to be for sessions, because this is a marketing task, and for marketing, generally speaking, sessions is the best scope for your metrics—you want to know what brought somebody to your site that day.

Christopher Penn – 16:31
And we’re going to choose our referrer. Let’s do a page referrer or a source medium. Good governance is really important; good tagging is really important, because if you don’t have good tagging, it’s not going to go well for you. So we’re going to use a regular expression, and we’re going to match “youtube” or “youtu.be”—the link-shortener version, which has the period in it—so you need to have both. That’s our page referrer, and we’re going to have matches regex, same thing, either one of these, and we’ll call the segment “YouTube sessions.” Interesting. I don’t know that the regex is functioning properly. Let’s try just “contains” and just do “YouTube” for now. “Contains YouTube.” Atique says sometimes people go across sessions on the same day, so users should also work.
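The segment filter Chris describes can be written as one regular expression. This is a sketch of that pattern, not the exact string typed into GA4; the helper function name is hypothetical. Note the escaped dots, so they match a literal period rather than any character.

```python
import re

# Match referrers from youtube.com or the youtu.be link shortener.
YOUTUBE_REFERRER = re.compile(r"youtube\.com|youtu\.be")

def is_youtube_referrer(url: str) -> bool:
    """True if the referrer URL points at YouTube or its link shortener."""
    return YOUTUBE_REFERRER.search(url) is not None
```

For example, `is_youtube_referrer("https://youtu.be/abc123")` returns True, while a referrer from any other domain returns False, which is exactly the split the “YouTube sessions” segment needs.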

Christopher Penn – 17:39
No, that is correct. That is correct. All right, so we have 601 users with just that very basic look there. So we’re going to limit down our segment. Let’s look at the landing pages. Landing page plus query string is our dimension and our metric. We’re going to do good old sessions, depending on the segment. To Atique’s comment earlier, if you chose users, then your metric had also better be users. Do not mix up your scopes, otherwise it will go very badly for you. Let’s go ahead and get our. Okay, so there’s our landing pages; there’s our YouTube sessions. Let’s bump this out to 100. And so we can see these are the places when YouTube sends us traffic. We see the pages they land on. This is actually really good. So homepage is good.

Christopher Penn – 18:43
Our generative AI course, also really good. This one is (not set); then the newsletter, the contact form. So, Katie, from a CEO’s perspective, even before using generative AI, this is not a terrible answer.

Katie Robbert – 18:56
I feel like that’s because we started with a user story to define what we’re looking for. I will be right back because I have someone who is flipping out. I apologize.

Christopher Penn – 19:20
Oh, no, don’t apologize! Go take care of it; we’ll be here. So we’re going to dump the PDF of this table out. Why PDF? Because in certain tools, especially ChatGPT, if you load a CSV file, they will try to write code; they’ll try to do math, and then it just goes sideways; it gets stuck in Python loops; nothing good happens. When you upload a PDF, that doesn’t happen. Instead it just does its thing. So we’ve now downloaded our Google Analytics data for the landing pages that are getting traffic from YouTube. I went with landing page because I want to know where—what page—they landed on.

Christopher Penn – 20:01
If you don’t have the world’s best data governance—or in some cases, when someone else’s YouTube channel is recommending you, you don’t have any capability to tag their traffic—we just want to know where they landed. This is the same technique that you use for measuring generative AI and whether it’s sending you traffic: you don’t know what the 20 paragraphs of conversation were prior to ChatGPT recommending your website, but you do know what page they landed on. We’ve got our YouTube data now. Let’s go back to our AI studio here, and we’re now going to say, “Okay, let’s start doing some data analysis of the So What? live stream.”

Christopher Penn – 20:56
Some of our data is generalized—meaning it is for the overall YouTube channel or is specific to the So What? show. Additionally, it’s possible that we receive traffic from YouTube from efforts that aren’t ours. Keep this in mind. Next, I’m going to provide you with the data. Summarize each file at a high level in terms of what you see. This step is critical. If you don’t do this step, you’re kind of assuming that the model has correctly read what is on screen. That’s a real bad idea because these tools sometimes can be dumb as a bag of hammers. So I’m loading the three files that we just made. Let’s see how many tokens—636, 75, and 3353. Good. Robert was saying there’s an RTX voice if you have their hardware.

Christopher Penn – 22:20
Unfortunately, we all own Macs, so we don’t have Nvidia hardware. I suppose you could always get one of those three-thousand-dollar Nvidia Project Digits workstations, but I’m pretty sure we’re not going to spend that money for a live stream. All right: “Here are my high-level summaries for each file you provided, based on my interpretation of the provided images.” Right. So we have the channel analytics overview, the So What? live stream analytics (showing at least four episodes), and the Google Analytics data—landing pages from YouTube. So we’re now verifying: yes, you’re looking at the right files; this is correct. Now we want to think through Katie’s question. We’re going to say, “Great. Based on our user story, let’s see what we could do to answer it.”

Christopher Penn – 23:22
At this point we want to leverage the power of an AI model to help us out. We’re going to say this: First, explain the intent of my instructions and what I’m trying to do, so that I know we’re on the same page. Second, propose what knowledge you’ll need—from your own internal knowledge and the information I’ve provided—to answer the user story. Third, outline some candidate approaches to solving the problem. What we’re doing here is a restatement of instructions. This isn’t just vanity or summarization; it loads up the context window by restating the instructions. It’s a technique from a prompt engineering paper called RE2—re-reading—which found that re-reading the question improves performance. Every time you repeat an instruction in an AI chat, it strengthens the power of that instruction.

Christopher Penn – 24:50
You want to know a stupid human trick? You want your prompts to perform better? Just copy and paste them twice. Second part is to restate the key knowledge it’s going to need. Then the third part is to say, “Hey, with all this, propose some approaches; give me some ideas for how we’re going to answer this user story.” So what it came back with—let’s take a look. Your instructions are driven by the user story; therefore your intent is to evaluate the performance of the So What? live stream; determine the value and ROI of the live stream; make a data-driven decision about the future of the live stream. To answer the story, we’re going to need these things from the provided data from my internal knowledge. So it’s validating that it’s read our stuff. Good. And it also knows other stuff.

Christopher Penn – 25:33
Candidate approaches: the value-based assessment—define “worth it”; what does “worth it” mean? Analyze core live stream metrics; connect to website traffic and leads; calculate ROI if possible; qualitative assessment; recommendations. The content optimization focus: identify top-performing episodes; analyze content themes; audience retention; traffic analysis; then recommendations. And then the comparative approach. Of those three approaches, Katie, it’s your user story. Which approach do you think would probably be worth using?

Katie Robbert – 26:09
I mean, is the answer all of them?

Christopher Penn – 26:12
In this case, for today’s show.

Katie Robbert – 26:14
For today’s show. Okay, then let’s do content and optimization.

Christopher Penn – 26:20
I love it. That’s what I was going to suggest. Let’s pursue number two first: Based on this approach, identify the top-performing five episodes and the bottom-performing five episodes from the data I provided. I’m asking for this very specific information because we want to give the model rich data about the top and bottom episodes. So it’s saying top-performing episodes: So What? How to use AI for social media marketing; So What? Generative AI advanced prompting techniques; So What? Matomo Analytics; how to build an AI agent. You need to check your work there, buddy.

Katie Robbert – 27:18
Four.

Christopher Penn – 27:19
Yeah, you need to check your work. Let’s take a look at. Oh, no—that is in fact what is in the PDF. The PDF lied. Well, it didn’t lie; it just doesn’t have the correct information. Let’s go back to YouTube Studio, full screen. Let’s go to analytics; let’s go to content. We’re going to go to advanced mode; last 365 days; filter by title; contains “So What?”. Apply, and let’s download this. I’m going to download this as a CSV file, and we’re going to play a little trick here.

Katie Robbert – 28:13
Well, you didn’t explain what happened. You said that the PDF lied, but what happened?

Christopher Penn – 28:20
The PDF is literally just a screenshot of the page on screen. You can see there’s only like four videos on screen. That’s exactly what we got.

Katie Robbert – 28:30
Got it. But what happens if you scroll down on it?

Christopher Penn – 28:35
Oh, in the PDF? Nothing. That is it.

Katie Robbert – 28:37
Can you go back? But there were scroll bars on it.

Christopher Penn – 28:40
There were. The scroll bars are literally embedded in the PDF itself.

Katie Robbert – 28:46
I see. That’s really weird.

Christopher Penn – 28:48
That is really weird and very annoying. Here’s what we’re going to do: I downloaded the text files. Remember what we said? Do not load CSV files into AI; it’s just a bad idea; it will try to do math, and you do not want it doing math. So what do you do? Change the extension to a text file. Let’s go back into our language model. Let’s do this. I love this about Gemini—you can delete the model’s response; you don’t have to go back and say, “Oh, no, forget what I just said.” Nope, delete. Here is a complete table of our—let’s just verify these are the right episodes; yes, good.
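The extension trick is simple enough to script. This is a sketch, assuming a `.csv` file exported from YouTube Studio; the function name and filename are hypothetical, and the contents of the file are untouched, only the name changes.

```python
from pathlib import Path

def csv_to_txt(path: str) -> Path:
    """Rename a .csv export to .txt so AI tools treat it as plain text
    instead of trying to write analysis code against a spreadsheet."""
    src = Path(path)
    dst = src.with_suffix(".txt")
    src.rename(dst)  # contents unchanged; only the filename differs
    return dst
```

After renaming, the file uploads like any other text document, and the model reads the rows as language rather than as a math task.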

Christopher Penn – 29:47
So What? episodes from the last 365 days, in descending order by views. Use this to conduct the content analysis. So it’s going to read through the table now. If we had more time, I would actually reformat this into the tabular format that AI likes best—markdown tables—but we don’t have time for that today. So now it’s saying, “I understand. Here are your top five episodes; here are your bottom five performing episodes. Preliminary observations: As we saw before, the top four episodes focus heavily on AI and its applications. The bottom performers cover a wide range of topics—including trends, change management, data mining, and Google Data Studio. It’s difficult to draw definitive conclusions based solely on views; possible reasons: niche topics, lack of clear value, competition, algorithm factors.” That in itself is kind of handy.

Christopher Penn – 30:47
That didn’t go as well as we would have wanted it to. Go ahead, Katie.

Katie Robbert – 30:55
I have so many questions, so please just keep going.

Christopher Penn – 30:58
No, because you might.

Katie Robbert – 31:00
Might answer the questions because I feel like, yeah, this is good to know, but doesn’t really tell me what I need to know yet. It says the bottom five: consumer trends, data mining; what’s the trend? Change management and digital transformation; that breaks my heart. Dark data mining and getting started with Google Data Studio. Yeah, I mean it gives some preliminary observations, but doesn’t really tell me what to do.

Christopher Penn – 31:36
Okay, so I like that it helped us isolate and figure out what those general pages are. Here’s the problem: a YouTube title isn’t super helpful; that’s the tiniest possible snippet of text you could offer. What would be helpful is if we had the shows themselves. Wouldn’t it be useful if we had those shows? The good news is we do. Here’s how you would approach this: there’s a utility called yt-dlp. yt-dlp is a command-line application designed to download YouTube data—YouTube stuff in general—so it can download your videos, it can download just the audio, and, critically, it can download the captions files for your videos.

Christopher Penn – 32:34
So what you would do is you would download this software, install it (which requires a little bit of technical lifting), and then you say, “yt-dlp, here’s my YouTube channel; give me my captions files.” And it will spit out a big old folder of all of your captions files. As an example of where we want to go with this, we can say, “Well, generative AI just told us which of our episodes are the top and bottom. Let’s go take a look here. How to use AI for social media marketing.” That’s what we’re looking at here. Let’s scroll down. So what? How to use—where’s my So What? file? Oh, goodness, there’s a lot in here. We’ve done a lot of shows. Let me. Let’s take all of our So What? files here. And I’m going to.
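The yt-dlp step Chris describes can be sketched as a command builder. These flags exist in yt-dlp’s documented options, but the channel URL below is a placeholder, and `caption_command` is a hypothetical helper name; you would run the resulting command with `subprocess.run()` or paste it into a terminal.

```python
def caption_command(channel_url: str) -> list:
    """Build a yt-dlp command that downloads caption files only."""
    return [
        "yt-dlp",
        "--skip-download",    # don't fetch the videos themselves
        "--write-subs",       # uploaded caption tracks
        "--write-auto-subs",  # YouTube's auto-generated captions
        "--sub-langs", "en",  # restrict to English captions
        channel_url,
    ]

cmd = caption_command("https://www.youtube.com/@example")
```

Pointed at your own channel URL, this spits out a folder of caption files, which is the transcript corpus you then feed to the model for content analysis.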

Christopher Penn – 33:32
I need to sew them up because they’re overwhelmingly large.

Katie Robbert – 33:37
While he does that, John, we’ve often talked about, like, when we do these kinds of examples of how to use generative AI to do certain things with the Marketing Over Coffee podcast, you could then also have a similar set of files with all of the transcripts from all of the recordings. Is this the kind of analysis that you would do, or do you feel like you have the right kind of data available already to tell you what’s working and what isn’t working with the Marketing Over Coffee podcast?

John Wall – 34:13
Yeah, well, that’s just because Marketing Over Coffee is underfunded as far as labor. There are like three programs I know I could do that would work, so there’s no point in me digging further to see where the other opportunities are. But, yeah, this is great. And since he’s grinding, I have to tell the story of the very first time—I mean, it must be close to two years ago now—the first time we did this exact activity, Chris was like, “Hey, this can do some stuff.” I was like, “I’m going to have it run some analysis for me.”

John Wall – 34:45
And I threw a bunch of files at it and wanted it to kind of cut and slice and dice them for me. And he was just like, “Yeah, no, that’s not what this does.” At that point I was like, “I’m going to leave it to Chris.”

Katie Robbert – 35:01
I feel like one of the reasons we turn to tools like generative AI is because the data out of the box in the systems just doesn’t tell you enough of what you need to know. It tells you kind of like, “This went up, this went down,” if you’re lucky. Sometimes it doesn’t even do that. So I feel like this is sort of that next level—almost the qualitative part of the analysis that’s arguably a bit harder because it’s not just looking at numbers; you’re looking at the actual content itself and trying to make decisions. And that’s not something that a lot of the systems where our data is currently housed does really well.

John Wall – 35:48
Yeah, it’s all a matter of the questions begetting more questions.

Katie Robbert – 35:53
Mm.

Christopher Penn – 35:53
Now let’s add another question. Here is the Trust Insights ideal customer profile, which Katie built—it’s super extensive; it tells exactly who our customer is. I’m going to say, “Take this analysis of our YouTube data and revise the analysis through the lens of our ICP. What in these top and bottom episodes would they find value in?” Because if we know that, it’ll help explain the content a little bit better. Rebecca has a follow-up question: Are there similar data privacy concerns with Gemini versus OpenAI? It’s within the Google infrastructure. NotebookLM promises privacy and security. For Gemini in AI Studio: if you are a paid developer, your data is not used to train models.

Christopher Penn – 36:41
If you are using the free version, your data is being used to train models. If you are concerned about that, use it through the API, and you will have to pay for it—we do get a bill from Google, although it’s a very small one. NotebookLM is a great resource for this, particularly if you are analyzing a lot of transcripts, because you can dump hundreds of megabytes of text data into NotebookLM and have it do that analysis. I think that would be phenomenal. Maybe that’s a follow-up episode sometime. Actually, we probably need to do an intro-to-NotebookLM live stream, because there’s so much you can do with it that we just don’t have time for within the context of another episode.

Katie Robbert – 37:23
So glad that you put the ICP in here because that was going to be one of my questions when you did the initial analysis—an observation is that it’s great that we collectively look at the data, but we’re not the customer, so we’re always going to have a bias in terms of the things that we did and didn’t do. So one of my first reactions was, “Oh, that’s so disappointing that change management didn’t rank as high.” I personally feel like it should rank higher. It doesn’t matter what I think; I want to know what the customer thinks. So I’m so glad that this is now part of the analysis.

Christopher Penn – 37:58
And now for a commercial plug: If you would like your own ICP, contact Trust Insights; we’ll happily build you one. All right, so if we look at our YouTube data through the lens of the ICP, the top five performing episodes—So What?—I feel like it’s strong; this aligns well with the ICP’s desire for AI-driven insights and operational efficiency. Why it likely resonated—and so on and so forth. So What? on Matomo: Strong; why it resonated. Solution to a common challenge generated by advanced prompting: Strong; why it resonated: valuable useful knowledge; how to build an AI agent: Moderate to strong; it’s a more advanced topic, so the how-to format is still good.

Christopher Penn – 38:36
So What? Leveraging your data for ICP storytelling: the ICP’s need for data-driven storytelling. And then the bottom episodes: ICP alignment weak; while the ICP is interested in data analysis, the topic might be too abstract and theoretical; lacks a clear connection to immediate business goals or pain points. This was all produced before the ICP was provided, which means that this show may not be what the ICP had in mind. Determining what’s a trend: weak; topic’s too basic for the ICP, who already has a good understanding of trend analysis. Change management: moderate; so that’s not weak in this case, it’s moderate, saying digital transformation is relevant, but change management might not be the primary focus for the ICP.

Christopher Penn – 39:12
The title lacks a strong hook; doesn’t clearly communicate the value proposition for marketers; might be perceived as too high-level or corporate. Dark data mining: too technical. Google Data Studio: it’s a very specific tool, and “getting started” suggests a beginner level of focus, which might not appeal to more experienced marketers. So this analysis now is super helpful.

Katie Robbert – 39:35
That tells me so much more than the initial analysis because this is where we get stuck with reporting and analytics—it’s like, “Okay, we can look at the numbers; like this was low-performing. Okay, so what do we actually do about that? What does that mean, ‘low-performing’? Why was it low-performing?” Now that we’ve layered in the ideal customer profile, we can find out why it was low-performing and what to do about it to make it more attractive to our audience. That’s like the secret sauce; that shouldn’t be secret; it should always be through the lens of your audience and your customers. For some reason, that’s so hard for people to wrap their heads around. Like, you guys don’t care what I think about it; I’m not the customer.

Katie Robbert – 40:23
I’m going to take John’s opinion with a grain of salt because he’s not the customer. Chris, I’m definitely not listening to you; you’re not the customer. But because the three of us have such differing opinions on what we think is going to resonate, the easiest way to answer that question is to actually talk to the customer.

Christopher Penn – 40:42
Exactly. So now I’m doing a final prompt in the series: “Great, based on what we’ve done so far, let’s build a plan for future live streams—a checklist for ideating future shows that mesh well with our specific capabilities and talents, but also appealing to the ICP. Based on the YouTube content I provided, a knowledge block for Trust Insights (this is several pages—who we are, what we do, all that stuff, which, by the way, you should have written out): One, explain back the intent of my instructions clearly. Two, choose the knowledge from what we’ve discussed so far and your own knowledge to solve this question. Three, explain your choices and approach for solving this question. Four, solve the question and produce the results in full.”

Christopher Penn – 41:51
What are we doing here? This is chain of thought—structured chain of thought as a prompt engineering technique. If you’ve ever done public speaking, you’ve heard the phrase: “Tell them what you’re going to tell them, tell them, then tell them what you told them,” which is a terrible public speaking framework, but it’s a great prompt framework. For generative AI: tell me what I told you, tell me what you’re going to do, tell me your choices and why as though you’re doing it, and then do it. That approach is going to generate much better results than just saying, “Write the plan.” Why?

Christopher Penn – 42:27
Because it gives the model essentially a chance to do a trial run of its ideas in the same way that you or I—we write a rough draft first, and then we do a final draft. The intent: You want to create a checklist for ideating future So What? live stream episodes that appeals to the Trust Insights ideal customer profile, aligns with our capabilities, is feasible and engaging. The topic should be something that you, Chris, and John can discuss with authority and enthusiasm.
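The four numbered steps in Chris’s prompt can be packaged into a small reusable template. Here is a minimal sketch; the function name and the knowledge-block placeholders are illustrative, not from the episode:

```python
def chain_of_thought_prompt(task: str, knowledge_blocks: list[str]) -> str:
    """Assemble a structured chain-of-thought prompt: the model restates
    the intent, selects its knowledge, explains its approach, and only
    then produces the answer."""
    steps = [
        "1. Explain back the intent of my instructions clearly.",
        "2. Choose the knowledge from what we've discussed so far "
        "and your own knowledge to solve this question.",
        "3. Explain your choices and approach for solving this question.",
        "4. Solve the question and produce the results in full.",
    ]
    context = "\n\n".join(knowledge_blocks)
    return f"{task}\n\n{context}\n\n" + "\n".join(steps)

# Example: the live stream ideation task with two knowledge blocks
prompt = chain_of_thought_prompt(
    "Build a checklist for ideating future live streams.",
    ["<Trust Insights knowledge block>", "<ideal customer profile>"],
)
```

The point of the template is that the four steps always come after the task and the context, so the model’s “rough draft” reasoning happens before the final answer.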

Katie Robbert – 42:57
Gotta have that enthusiasm!

John Wall – 42:58
John Wall, the authority. That was impressive. I was like, “Respect my authority.”

Christopher Penn – 43:08
We’ll create a detailed checklist that combines the elements of a brainstorming guide and a filter. Each point in the checklist will serve as an idea generator, relevance filter, and an engagement check—topic relevance and value; our strengths; engagement format; SEO and promotion; post-production. Example of the checklist: “Unlocking the power of data storytelling with generative AI.” Now what do we do with this checklist, Katie?

Katie Robbert – 43:36
It looks like a process; it looks like a set of requirements; it looks like something that, as we’re coming up with topics for future live streams, we should be going back to and gut-checking against it. It almost kind of looks like the live stream version of the ICP.

Christopher Penn – 43:55
Mm. It also looks an awful lot like system instructions. These are system instructions for an AI to use. Watch what happens: We’re going to take just that checklist; I’m going to take this thing; I’m going to rename this “Live Stream System Instructions Evaluate.” The user will provide a live stream idea; you will evaluate the idea according to this checklist, which is great because.

Katie Robbert – 44:39
I have one for you.

Christopher Penn – 44:42
Let’s take off the example because I don’t want the example here. I want the Trust Insights About Us profile and I want the Trust Insights ICP, because those two things make this work. I’m going to reuse our RE2 prompt format and repeat the prompt again at the bottom. This whole thing gets copied into the system instructions. And now I could put this in a regular Google Gem; I could put it in Claude; I could put it in ChatGPT. It’s pretty robust; it’s about 3,000 words long. So now, Katie, here’s our live stream idea. Evaluate it according to the system instructions. What’s the idea, Katie?
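The assembly Chris describes (the evaluation task, the checklist, the two knowledge blocks, and the task repeated at the end per the RE2 re-reading technique) might look like this in outline; the function and section names are illustrative assumptions:

```python
def build_system_instructions(task: str, checklist: str,
                              about_us: str, icp: str) -> str:
    """Combine the evaluation checklist with the two knowledge blocks
    that make it work, then repeat the task at the end (RE2, the
    re-reading technique) so the model sees the instruction twice."""
    parts = [
        task,                                      # what the evaluator does
        "## Evaluation checklist\n" + checklist,
        "## About Trust Insights\n" + about_us,
        "## Ideal customer profile\n" + icp,
        task,                                      # RE2: repeat at the bottom
    ]
    return "\n\n".join(parts)

# Example assembly with placeholder knowledge blocks
instructions = build_system_instructions(
    "The user will provide a live stream idea; "
    "evaluate the idea according to this checklist.",
    "<checklist text>", "<About Us profile>", "<ICP document>",
)
```

The resulting string is what gets pasted into the system-instructions field of a Gem, a Claude Project, or a custom GPT.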

Katie Robbert – 45:41
So What? Introduction to NotebookLM. And I would like this to be a gem. Please and thank you.

Christopher Penn – 45:55
Let’s see: Live stream topic and relevance: NotebookLM as a tool potentially helps aggregate information, which indirectly addresses the ICP pain point—data aggregation—although not as directly as data analytics platforms do. Value proposition: “So What? Introduction to NotebookLM”: intriguing, but doesn’t fully communicate the value; the “So What?” ties to the Trust Insights brand. “Introduction to NotebookLM” might be too niche if the audience isn’t familiar with the tool. Alignment with Trust Insights strengths and expertise; engagement and format; SEO and promotion; post-production. Based on your evaluation, give me three to five other candidate titles that would resonate better with our ICP on the topic of NotebookLM.

Katie Robbert – 46:50
So while that’s generating, what I’m actually a little surprised by, Chris, is that this live stream evaluation doesn’t have a scoring rubric.

Christopher Penn – 47:00
And we absolutely could make one. We could absolutely make one. However, having it answer open-ended questions for brainstorming isn’t a bad thing. And that’s kind of what it did—it was going through the open-ended stuff. But yes, converting it into a scoring rubric would take literally two or three minutes. But here are some ideas: “NotebookLM: Organizing Your Marketing Intel with Google’s AI”; “From Chaos to Clarity: How NotebookLM Transforms Your Marketing and Research”; “Unlock the Power of AI for Content Strategy: A Deep Dive into NotebookLM”; “Your AI-Powered Research Assistant for Better Marketing Decisions”; and “Streamline Your Workflow: Mastering Google’s NotebookLM for Marketing.” I kind of like that last one.
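A minimal version of the scoring rubric Katie asks about could look like this; the criteria and weights below are invented for illustration, not from the episode:

```python
# Weighted rubric: each criterion is rated 1-5, weights sum to 1.0
RUBRIC = {
    "topic_relevance": 0.3,    # does the topic address an ICP pain point?
    "icp_value": 0.3,          # clear value proposition for the ICP
    "team_expertise": 0.2,     # can the hosts discuss it with authority?
    "engagement_format": 0.1,  # does it suit a live, interactive show?
    "seo_potential": 0.1,      # searchability and promotability
}

def score_idea(ratings: dict[str, int]) -> float:
    """Weighted 1-5 score for a live stream idea; ratings keys
    must match the rubric criteria."""
    return sum(RUBRIC[key] * ratings[key] for key in RUBRIC)

# Example: scoring a hypothetical episode idea
score = score_idea({
    "topic_relevance": 4, "icp_value": 5, "team_expertise": 5,
    "engagement_format": 3, "seo_potential": 3,
})
```

A rubric like this could be pasted straight into the system instructions so the model returns a number alongside its open-ended evaluation.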

Katie Robbert – 47:39
I like the last one; I like number four. The other ones feel very salesy, which is probably not the way we should go. But I like number four and number five.

Christopher Penn – 47:53
Huh. So as we wind down here, what we’ve done today is we’ve taken the qualitative data from transcripts, from knowledge blocks, and from our ideal customer profile, and bonded it with our quantitative data, which helped us prioritize which episodes to look at for the qualitative analysis. Then we converted all of that into findings, checklists, and ultimately an AI application, in 48 minutes, that we can now use to not only do the analysis and know what happened, but also to say, “Well, what should we do differently?” Because one of the things that we say all the time in all of our talks on the topic is: analysis without action is distraction. If you’re just making slides to make slides and you’re not doing anything differently, you’re not accomplishing anything; you’re just wasting your time.

Christopher Penn – 48:55
Generative AI gives us the ability to convert analysis into action.

Katie Robbert – 49:02
Which is so important because that’s where a lot of these reports miss the mark. It’s like, “Here’s what happened.” “Okay, great, I can’t change it. What the heck do you want me to do?” And that’s sort of what, John, I was talking with you about: these platforms that the data lives in—for, like, our live streams and our podcast—that’s all it is. It just tells you what happened, and then you screenshot that, you put it into a deck, and you ship it off to your stakeholder, and they’re like, “What the heck do I do with it? Like, this is not helpful.”

Christopher Penn – 49:35
Now imagine you gave your stakeholder, “Hey, here’s a Google gem. Every time you do a content plan, now just use this.” How much more valuable would that be to any stakeholder? How much more value would that be for your agency or your firm or your consulting practice to say, “Not only do we do the analysis for you, we will give you the tools that you need to do better.”

Katie Robbert – 49:59
Oh, I 100% plan on using this new gem that you’re going to put together every week as we do the live stream. And we should likely also do one for the podcast.

Christopher Penn – 50:10
Yep, exactly. Any final parting thoughts or words?

John Wall – 50:17
We need to get Chris a wizard’s hat for him—making gems in the background. That’s the best phrase for saving it. You’re not saving a document or a file; no, this is a gem.

Katie Robbert – 50:27
Oh my goodness.

Christopher Penn – 50:29
Get some healing crystals! All right, on that note, I think we’re done. Thanks for tuning in, folks. We will see you on the next one. Thanks for watching today. Be sure to subscribe to our show wherever you’re watching it. For more resources and to learn more, check out the Trust Insights podcast at TrustInsights.ai and our weekly email newsletter at TrustInsights.ai/newsletter. Got questions about what you saw in today’s episode? Join our free Analytics for Marketers Slack group at TrustInsights.ai/analytics-for-marketers. See you next time!



Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!


Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new episodes every Wednesday.


