The Testing Show: ChatGPT for Testers

March 15, 2023

Panelists

Matthew Heusser
Michael Larsen
Beth Marshall
Chris Kenst
Ted Ariaga

In the past several months, ChatGPT has been getting a lot of attention as the cool new tool that everyone is talking about. It can generate song lyrics on the fly. It can create code to solve problems that actually compiles. Still, do we really understand what it is and what it can or can't do? To that end, Matthew Heusser and Michael Larsen welcome Ted Ariaga, Chris Kenst, and Beth Marshall to talk about ChatGPT, especially as it relates to the world of software testing, and explore areas where it is ready for prime time and where it is not, or at least, not yet.


References:

  • ChatGPT: Optimizing Language Models for Dialogue:
    https://openai.com/blog/chatgpt/
  • ELIZA: a very basic Rogerian psychotherapist chatbot:
    https://web.njit.edu/~ronkowit/eliza.html
  • Prisma Labs: Lensa AI:
    https://prisma-ai.com/lensa
  • Faker:
    https://fakerjs.dev/guide/
  • Fake Name Generator:
    https://www.fakenamegenerator.com/

Transcript:

Michael Larsen (INTRO):

Hello, and welcome to The Testing Show.

Episode 132.

ChatGPT for Testers.

This show was recorded Thursday, January 19, 2023.

ChatGPT has been getting a lot of attention. It can generate song lyrics on the fly. It can create and compile code to solve problems. It can answer questions on, and even pass, state bar exams. Still, do we really understand what it is and what it can or can't do? Matthew Heusser and Michael Larsen welcome Ted Ariaga, Chris Kenst, and Beth Marshall to talk about ChatGPT, specifically in the world of software testing. What will its impact be, or do we even know?

And with that, on with the show.

Matthew Heusser (00:00):
So welcome back to The Testing Show, folks. And if you are anywhere active on LinkedIn, Twitter, or the socials, you’ve probably heard something about ChatGPT and it’s time we got a little bit more serious about it. So I gathered some folks that I thought have an interesting perspective. Beth Marshall is a tester from the UK that I mostly know by reputation, but she’s been doing a lot of posting on LinkedIn about her work with ChatGPT and testing. Welcome Beth.

Beth Marshall (00:32):
Lovely to be here. Thanks for having me, Matt.

Matthew Heusser (00:34):
Chris Kenst I’ve known for a long time, who’s the sitting president of the Association for Software Testing. Welcome to the show, Chris.

Chris Kenst (00:42):
Thanks Matt. Thanks for having me.

Matthew Heusser (00:44):
And Ted Ariaga is a tester from the United Kingdom who's been doing a lot of blogging about his work with testing and Cypress. Welcome to the show, Ted.

Ted Ariaga (00:57):
Thank you. It’s a pleasure to be here.

Matthew Heusser (00:59):
And of course as always we have Michael Larsen, our show producer and co-host.

Michael Larsen (01:03):
Hello and glad to be here.

Matthew Heusser (01:05):
So let’s talk about ChatGPT, which first of all, if you’re wondering what the heck it is, it’s an attempt toward a generalized AI wrapped around a chat bot, but I’m sure Beth could do a better job of describing it. Beth, what’s ChatGPT?

Beth Marshall (01:21):
Oh, let's start on an easy slash hard one. ChatGPT is one of a plethora of generative AI tools by the OpenAI company, which in itself has an interesting history, if anyone wants to Google that or look it up on ChatGPT. Alongside its API, it involves a UI that you can type anything into, and it should come back with a kind of human, discursive response. So you can have a chat with your computer, and the aim is for that to give you the best response that it can with the data that is fed into it. And people are going a bit nuts for this tool.

Matthew Heusser (02:08):
So Chris, what does that mean for me as a tester? It's generalized AI. So what is this? Is this like ELIZA, where I type words and it pretends to be a human?

Chris Kenst (02:21):
That's a good question. What is it for testers? It's a tool. So it's yet another tool, but it's an interesting tool. So as Beth said, it's a set of APIs, and in the case of ChatGPT, it's actually like a chatbot that you can ask questions to. I think the most obvious use case for testers is it's like a search engine, but it's a search engine that's really specific and will return things to you in a very descriptive manner. So if you were to ask it something like, "Can you give me a set of zip codes in… name a city?" it would probably return those to you. If you asked it to generate a Selenium script that would log into a webpage, it would give it to you. And so it ends up being a way to ask questions, to search for information in a conversational manner, instead of returning a bunch of links, which is what we're used to for something like Google or name-a-search-engine. This is actually going to return a description and more than likely an answer to you that is a very plausible answer. Not necessarily a correct answer, not necessarily a truthful answer, but it's going to return an answer to you that is usable. If you ask it to go to a test site and create a script, it'll probably get that right. It'll probably give you something that's very simple. If you ask it to give you a set of zip codes for a particular city, it might just make it up.
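As a rough illustration of what Chris describes, here is a minimal Python sketch of the kind of Selenium login script ChatGPT tends to hand back. The URL, element IDs, and credentials are hypothetical placeholders rather than anything from the show, and a generated script like this may well need fixing before it runs against a real page.

```python
# A minimal sketch of the sort of Selenium login script ChatGPT tends to return.
# The URL, element IDs, and credentials below are hypothetical placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/login")  # hypothetical login page

    # Fill in the login form and submit it.
    driver.find_element(By.ID, "username").send_keys("test_user")
    driver.find_element(By.ID, "password").send_keys("not-a-real-password")
    driver.find_element(By.ID, "login-button").click()

    # Crude check that we landed somewhere that looks logged in.
    assert "dashboard" in driver.current_url.lower()
finally:
    driver.quit()
```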

Matthew Heusser (04:05):
Well, let's be specific. What we can say is this is an AI that appears to have trawled some portion of the internet. It also has predictive speech. So when we go to Google, "What is the shortest…", it starts to fill it out for us: person, or car, or whatever. If I go to Google right now and I type in, "What is the shortest…" it will complete my sentence. Shortest day of the year, verse in the Bible, book in the Bible, shortest day in Michigan, shortest word in the world, shortest Harry Potter movie. This is taking that, not to generalize the question, but to generalize the answer, based on a huge amount of data that is on the internet.

Chris Kenst (04:50):
And then return it to you in a conversational manner.

Matthew Heusser (04:53):
Right.

Chris Kenst (04:53):
Which is a bit unusual. I mean, if you've dealt with chatbots in the past online, you're used to that, but you're not used to having a search engine return… If you asked Google who has the best testing site, you would expect a list of testing sites; you would not expect a response of "this is the best testing site." And it gives you a response, and that is unique and scary and awesome at the same time.

Matthew Heusser (05:22):
Right. And it can infer meaning; if you were to say give me the top 10 testing sites, it's gonna give you a list of 10. But what it's doing, my assessment is, it's very similar to your college sophomore or college freshman. It'll do the Google search for you, essentially, and then put some stuff together that is the most common, most popular, most right. So if you ask it a coding question about Selenium, it'll present the stuff that is the aggregate of all the different Selenium answers it could find on the web. It is likely that it lacks the understanding of the difference between Selenium RC versus WebDriver versus WebDriver 2 versus WebDriver 3. It is likely that the code block it gives you, which is mostly cut and pasted from the aggregate of the best advice it can find on the web in that area, very likely doesn't actually work, or you can't figure out how to install it. Am I wrong?

Chris Kenst (06:20):
I think that’s right.

Michael Larsen (06:21):
Here's something I found interesting. I think it's neat that OpenAI has basically already put up a listing of limitations, and they spell out what the limitations are. From the very first entry, here… I thought this was interesting: ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as, number one, during RL training, there's currently no source of truth. Hold that thought for a second. Two, training the model to be more cautious causes it to decline questions that it can answer correctly. And three, supervised training misleads the model because the ideal answer depends on what the model knows rather than what the human demonstrator knows. Now that intrigues me, because I'll take this in a little bit of a different direction. There's a big boost right now with AI tools and a lot of people are having a lot of fun with it.

(07:19):
But some interesting comments have come up from it too. If you are at all familiar with TikTok and have seen the manga AI or the AI avatar type things, people are putting in pictures of themselves, or they're having their faces be scanned, and they're being converted into pictures. Some of them looking strikingly similar to the individual, some of them looking not at all similar. It's always amusing to me to see what the tools are capable of doing. Somebody had said this is going to completely disrupt art, and people who make artwork are not gonna be working because of this. And it was funny because for fun I said I'm gonna generate some of this art using my band's pictures. And the fascinating thing is it can do a really good job of interpreting what a face is, but boy does it do horrendous things when it comes to trying to render a guitar <laugh>. It doesn't know what to do with it. Most of the pictures that we got back were utterly unusable because you couldn't even figure out what the person was holding. So I mention that in context of this whole ChatGPT question. Again, just like when people say this is a tool that is gonna be so smart and it's going to replace everybody that does a particular job, they were saying that about this replacing artists, and I honestly don't see that happening at this given point in time. There are a lot of limitations.

Matthew Heusser (08:52):
The implication with this stuff is it's gonna take jobs away. I think that's maybe the second half of this conversation. For now, though, I want to get a read on what it is and how useful it is. So earlier, Chris Kenst said if you ask it for a test script, it'd probably come up with the test script. And I said, but it's not gonna work. I think by test script he meant something different. I think he meant something like test login, test search, test add to cart, test checkout, test checkout when you don't have any items in your cart, for an e-commerce site, that kind of thing. Is that what you meant, Chris?

Chris Kenst (09:29):
That’s correct. That’s something that is easily demonstrated probably already on the web. You could probably search for it. If you asked ChatGPT for it, it would find a reasonable match and give it to you.

Matthew Heusser (09:41):
Yeah, and that’s what I’m getting at. That level of skill. Any articulate person with a day’s training in software testing should be able to come up with a list of test ideas that is better in less than 10 minutes. Am I wrong?

Chris Kenst (09:59):
I think that’s right.

Matthew Heusser (10:01):
So then what’s the practical utility of ChatGPT?

Ted Ariaga (10:05):
I think in terms of practicality, it's maybe not as practical as you'd want for the example you gave. It's not really intuitive enough at this point to provide the information that you need in the context that you need it. So that is one of the many things with AI. I'm not really an expert, but what you hear with these things is that they are learning models. I don't know if anyone saw the meme, but there was an image circulating on the net a while back that showed how many nodes, or how big the knowledge base, the AI used to actually learn, and then they were projecting, based on the current usage, how much it would have learned by the time the next update comes. That is an obscene amount of learning that it's done.

(10:57):
So let's not forget this is just the beta test, and going forward the AI would actually be a lot more practical. Because like I said, if anyone used it or was able to interact with it, you could tell what the bugs were. But in terms of practicality, right now it's basically just the basic stuff, where it lets you create test cases. It's useful for documentation, but that's probably as far as it goes at the moment. But if you're talking about actual code that performs certain functionalities, now it's maybe not as strong, and like Matt said, it just aggregates probably the most frequently used data and presents that to you and says, yeah, this is the code, it will do this particular thing. But in terms of practicality, what it does now is save you a lot of time. So if you've ever had to write code and you have to perform a Google search or scroll through Stack Overflow, you know how difficult the process is where you have to find the exact code to solve the exact problem that you're looking for.

(12:00):
This sort of goes some way to limiting that. Because with Google search results you have a bunch of links that you can possibly try, and you don't always get the best information that you need immediately, unless maybe you're an expert searching with keywords and all of that stuff. With this, it almost aggregates the best information and gives it to you in one place and says, try this option first. Now it may or may not be accurate now, but it shows you what the potential is going forward and how practical it could be, because it could definitely replace Google as the number one search location for all of these queries. Yeah, in terms of practicality, not much yet with regards to automated testing, but going forward you can definitely see the potential.

Beth Marshall (12:51):
I think those are brilliant observations that I definitely agree with, Ted. I think when it comes to the point about the kind of data points that you currently have in this model, bear in mind that GPT-3, which is the current model, is largely trained on a data set that only goes up to 2021. And as we know, there is a lot of movement in technology. So I think when GPT-4, which is the next one to come along, comes along… I think there are around 175 billion data points currently, and there are gonna be potentially 1 trillion in GPT-4. So there's gonna be a huge leap around the corner, and I think that could change how valuable this stuff is. I agree with Ted and everyone on the panel, really, that currently, in terms of practical, actionable, what can this thing do?

(13:47):
What can this thing help me with as a tester? I think it's wise to keep it small; don't expect the tool to do your job for you. I don't think it's gonna replace testers anytime soon. There are concerns about the environmental impact, aside from any kind of legality or issues that you might have with sharing your image with tools like Lensa, and with ChatGPT. But some of the things that I've been using it for are kind of a second opinion, in the same way that you'd use Stack Overflow, but in a more conversational way. A tester's job isn't just automation, is it? So a tester's job might be: can you design some training for someone that's onboarding, design an onboarding plan? Okay, I've got some ideas, but ChatGPT, how would you design this onboarding plan, and how would you design it for this type of person?

(14:39):
And then get that initial kind of second pair of eyes on something, to see if you are on the right track. Something that I was able to use it for last week was to generate a query using Postman, using the low-code workflow feature called Flows that I'm just obsessed with, to create a workflow that queries the OpenAI API, asks it a question, and then takes its response in the form of, "Generate me some test data for anything." So this could be, "Generate me the 10 best flowers that are sold in Holland," or as bespoke as you like. Get that from ChatGPT and then turn that around and use it in your API workflow as test data. There's already a Faker library that can do that for you, but I think ChatGPT might help you make slightly more bespoke options. There are some uses, and I do think it will start to form habits with people. It's certainly starting to form a habit with me, where I'm using it in the space where I used to Google, in the space where I used to look on Stack Overflow, but it's not coming for any of our jobs anytime soon, I don't think.
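As a rough analogue of the Postman Flows workflow Beth describes, here is a minimal Python sketch that asks the OpenAI API for bespoke test data and reuses the reply downstream. The model name and prompt wording are assumptions for illustration only; Beth's actual setup lived in Postman rather than code.

```python
# Rough Python analogue of the Postman Flows idea Beth describes: ask the
# OpenAI API for bespoke test data, then reuse the reply as test input.
# The model name and prompt are illustrative assumptions.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",  # assumed model; substitute whatever you have access to
        "messages": [
            {
                "role": "user",
                "content": "Generate a plain list of the 10 best-selling flowers in Holland, one per line.",
            }
        ],
    },
    timeout=30,
)
response.raise_for_status()

# Split the free-text answer into lines we can feed into an API test as data.
answer = response.json()["choices"][0]["message"]["content"]
flowers = [line.strip("- ").strip() for line in answer.splitlines() if line.strip()]
print(flowers)
```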

Matthew Heusser (16:03):
Yeah, well, let's put both those together a little bit. So you have an ability to use it like you might a college freshman or sophomore, like an intern, where you might say, "Hey, you don't really know how to do software testing or even coding really, but I'm having this interesting problem. Can you Google some stuff up and figure out some things for me? Can you get on Stack Overflow and figure something out for me?" And you might hire an intern to do that. It might take them an hour and save you 20 minutes, but it might be generally useful as a first pass. Now we've got a piece of software we can ask to do that, that can look at the various Stack Overflow answers and give you the one it thinks is the best. That's not nothing. Another thing it can do is it can be trained. You can tell it what's wrong with a piece of code and it can take a second trial. And it appears to have a compiler in it.

(17:00):
So if there is an answer for how to do something on the web and it's only in Ruby, you can ask how to do it in Python. It'll find that answer and translate it for you in a way that will compile. That's not nothing. It could be wrong. Like, I asked it how to do a Selenium thing in Pascal, and even then it came up with the code and said, I can't find any libraries in Pascal, but if you write one, it'll work. Which is true. Michael and I worked with a company where one of the employees had written the Perl library for Selenium at one point. So that actually creates general utility, I think. Am I wrong in any of that assessment so far?

Ted Ariaga (17:48):
No, you are not. I completely agree with a lot of what you said, right? And that was sort of the initial point that I was trying to make, that as a learning resource, it's actually top. So if you had to maybe train an intern, or you just wanted to learn, I can see the benefit in having an AI that could actually help you learn, and learn a lot quicker. So yeah, it's got a lot of benefits, and like Matt says, it's got a compiler as well. I actually did try some scenarios. A perfect use case for this would be if you had a certain test automation framework written in maybe Java, and you say, okay, I need the equivalent of this framework in maybe Cypress or Playwright or what have you. It does that. Now, like I said, in the current version it might not work exactly, but think of the potential of that, right?

(18:44):
So if the model improves and it gets better, you could completely migrate, or potentially migrate, legacy systems written in maybe Java or C++ to modern-day languages like JavaScript or Ruby or Perl, anything that you consider to be modern. The potential is endless. And I can understand, I think earlier someone mentioned the concerns of, "Oh! It's going to take away our jobs… this, that, and the other," and yeah, I definitely do see the concern. I don't know if this is the right time for this, but I think Beth also raised some concerns as well with the whole signup process for OpenAI, where she mentioned in her blog that she was a bit skeptical about signing up with her own personal details, email and all that. So she had to use a generic email application to sign up, and it's funny because I had the exact same thoughts, so I'm actually glad to know that I'm not the only one that's paranoid about these things. A lot of people are skeptical when it comes to AI and all of this. It's new ground, and yeah, we should definitely proceed with caution.

Michael Larsen (19:56):
I find it interesting too: when you look at whatever kind of query you make, for example using Postman or anything else, if you're looking for something that's fairly generic or that's got a broad base of information, I could see this being something that can save you some time. Something to get you up and running fairly quickly if you wanna do some quick and dirty testing and you want to get some data ideas, or you wanna look for something. But if you're gonna be asking it anything that's got nuance to it, or that requires any clarification, or that might not be necessarily well understood… not that we in testing ever have that problem. I think the bigger and more interesting factor here is that there's a lot of nuance that can show up in a lot of these things. I think that these tools do work very well when you are looking for something fairly generic, or you have something set that you wanna work with, and maybe you might discover a couple of things that are lesser known but still fairly mainstream with what you're trying to produce and work with.

(20:58):
It's just like going to Stack Overflow when you want to go and say, I'm trying to figure this thing out, and it feels like you're the first person to ever ask that particular question. You're gonna stumble a bit trying to find it, and that's where I think you're either gonna get nonsensical information from something like ChatGPT, or it won't guide you where you want to go. I hope I'm making sense with that.

Matthew Heusser (21:19):
Thanks, Michael. So Chris has his hand raised to talk about a different application for ChatGPT, which is test data and the oracle problem. We should elaborate on that for the audience.

Chris Kenst (21:33):
Well, I think Michael was just talking about this: what happens in his example where you go to Stack Overflow and you're the first person to ask the question? So the oracle problem: an oracle is a mechanism, a principle, which helps us recognize a problem. Ultimately what that means is, if you ever think of an expected result, your expected result comes from an oracle. So if you want to know if a test passes or fails, you need to have something that helps us understand why that test should pass or should fail. And that is the oracle. In this case, I like to go with zip codes. So zip codes in each country are pretty well defined, either by law or by a set of protocols that they have in place. So they're pretty well defined, you don't get a lot of variation, and it's relatively easy to tell, and I say relatively easy to tell, when a postal code or a zip code is correct in format or not.
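As a minimal sketch of the kind of format oracle Chris is describing, here is what a simple US zip code check might look like; the pattern and examples below are illustrative, and a format check like this only says whether a value looks like a ZIP, not whether the code actually exists.

```python
# A minimal sketch of a "format oracle" for US zip codes: it only recognizes
# whether a value *looks* like a ZIP or ZIP+4, nothing more.
import re

ZIP_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")

def looks_like_us_zip(value: str) -> bool:
    return bool(ZIP_PATTERN.match(value))

assert looks_like_us_zip("49503")        # plain five-digit ZIP
assert looks_like_us_zip("49503-1234")   # ZIP+4
assert not looks_like_us_zip("ABC 123")  # wrong format entirely
```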

(22:32):
But when you want to get the answer to a question, and as Michael was saying before, we run into ambiguity all the time around when somebody wants something and they just wanna know if something passes the test. So how do you actually understand that? That's the oracle problem. There are kind of two ways to view this. Michael was saying, with ChatGPT, when you ask it something that is really specific, or as Beth said, if you ask it a bespoke question and it doesn't know the answer to it, it doesn't have that oracle, that reference, to go out and find it. The other thing, too, is that when we're talking about machine learning and AI, which, you know, is what GPT is, it's a general AI, when we're even thinking about testing AI systems and machine learning systems, we also don't have a lot of oracles for understanding how to test them.

(23:26):
So there's this dual problem. There's the "it's nice to have a chat interface to interact with an underlying AI/ML model, like ChatGPT is," and we can use it to answer these questions, and whether it's correct, or if we even mention privacy, whether it's private, whether it's fair, there are all these kinds of questions that come about from just using it. And then, if we go one step further and we talk about testing those models, we also don't have oracles. It becomes this very difficult and intractable problem. And I probably took us a little bit away, in what I just said, Matt, from what you wanted to talk about, which was test data. So do you wanna pause and go back and talk about test data, or are we kind of on an interesting track?

Matthew Heusser (24:16):
Well, I think test data is an application of the oracle problem, because you can say, I don't know, give me 10 addresses in the United States, 10 valid and 10 invalid, for my real estate software, where you're going to then cut and paste in those 10 addresses for your address checker/converter tool that will convert R-O-A-D to RD. And I've worked with some of those. How do you know if the addresses it picks are actually right? That would be the test data problem.

Chris Kenst (24:48):
And I think Beth had some really good insights on this, because that's what she was doing. Instead of using a generic library like Faker, she was using ChatGPT, or the underlying GPT models, to request test data. Yeah, there's that idea of, well, is it correct? Maybe you don't care, but maybe you do. And is that something that you rely on ChatGPT, or GPT in general, to generate for you or to give you a response to…

Matthew Heusser (25:17):
Your test data. I want it to be right, because I'm using it to test my application.

Chris Kenst (25:20):
Well, Matt, it depends on what our goal is with the test, doesn't it? So if we are just plugging in data as a part of, like, a checkout process, maybe there are only little bits and pieces of the data that we care about being correct, because perhaps we're not testing whether or not something passes based on correct data. But there will be times when we do. For example, in an e-commerce system, for the most part I don't care about the data that happens in a checkout process. There will be a few things, maybe: the zip code or billing information has to be correct, or the credit card information has to be correct. But for the most part, I don't care, unless I'm testing that particular instance where I'm trying to get something to fail. So it depends on what our goal is.

Matthew Heusser (26:06):
Is that right, Beth? Because if I didn't care if the data was right, I'd just use a random number generator.

Beth Marshall (26:11):
I think Chris is right. I think it depends on what type of data you want, because how do we know that the current libraries are correct? I think there may be instances where you do want information that you don't know, and you do want a collaborative, up-to-date view of what a particular data set is, that you might not be able to necessarily guess or come up with yourself. There's caution in all these things, and I think we are gonna see a lot of people fall into the trap of becoming over-reliant on ChatGPT to do the thinking for them, and then have egg on their face later.

Matthew Heusser (26:48):
So let's try to… I think I got a real good example. You are trying to populate a database for a test you wanna run. You don't want to do just one, two, or three examples. The database is limited, so the names have to be different. You want a wide variety of popular names in case there's something, a special character or something, that gets cut off. You say, instead of using a random number generator, instead of using a random text generator, instead of having the names be A, AA, AAB, you want names. So you go to ChatGPT and say, "Gimme the hundred most popular first names. Gimme the hundred most popular last names." You probably could even use ChatGPT: "Hey, can you give me all the combinatorics on this?" and get 10,000 names out of it, which you could then cut and paste into some little importer, and now you got a big old database. In that case you don't really care about those names, you just want to have them and be able to refer back to them. You might run a SELECT statement that sums all those numbers. You might import that data into Excel or something and sum all those numbers and make sure they match, because what you're really testing is the import. Is that the kind of thing you were saying, Chris?
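As a minimal sketch of the import check Matt describes, here is one way the combinatoric names and the comparison might look; the name lists are tiny placeholders standing in for whatever ChatGPT returned, an in-memory SQLite table stands in for the real database, and a row count stands in for the sum check he mentions.

```python
# A minimal sketch of the import check Matt describes: cross two name lists
# (which in his example would come from ChatGPT; here they are tiny placeholder
# lists), load the combinations into a table, and verify the count on the
# database side matches what we generated.
import itertools
import sqlite3

first_names = ["Olivia", "Noah", "Amelia", "Liam"]   # stand-in for "100 most popular first names"
last_names = ["Smith", "Jones", "Garcia", "Nguyen"]  # stand-in for "100 most popular last names"

combinations = list(itertools.product(first_names, last_names))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", combinations)

# The oracle for the import itself: did every generated row make it in?
(row_count,) = conn.execute("SELECT COUNT(*) FROM customers").fetchone()
assert row_count == len(combinations) == len(first_names) * len(last_names)
```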

Chris Kenst (28:02):
Yeah, that's right. Names are interesting. How we use that data later on, and what we are testing, absolutely matters. I think in an earlier example Beth said you could use something like Faker. Faker is a very popular library in a lot of programming languages. You don't know if that data's any good, but for the most part you probably don't care. So, as you were saying, if it's a name field, you're like, just gimme a first name and a last name, because that's what I need to move on. But if your goal is to then go and later use that name across the application, and it is a name from a country that has different markings on the names, and emphasis, and that sort of thing, you can then find problems with that data. So, as we typically do in software testing, we go, "What is the point of the test?" And if the point of the test is to focus on: are names getting cut off? Are names represented correctly? Those kinds of things. That is a very particular kind of test, and maybe that's not where you want to use a third-party library like ChatGPT or Faker. Maybe you want to control that data. It depends on what your goal is for it. Does that make sense?

Matthew Heusser (29:16):
Yeah.

Michael Larsen (29:16):
I mean, this has been something that's been around for quite a while. There's a tool that I've been using for years called Fake Name Generator, which does something very similar. What I like about Fake Name Generator, based on exactly what you said: do you have a country, or do you have something very specific you're looking for? You can then tailor that data as needs be, if you want specifically male names or female names, or if you want addresses that are specific to a particular region. Now granted, again, this is done because somebody has taken the time to cull a database for that, and they've salted the data so you're not getting actual addresses, but they're plausible addresses based on real places. My guess is ChatGPT is probably doing something similar, but whereas Fake Name Generator's been around for almost 20 years and the data makes sense when you use it across various platforms (I know cuz I've used it for localization testing), I'd be very curious to see if ChatGPT or something like it would do as well. Who knows?
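For the locale-specific flavor of data Michael describes, a library like Faker can also be pointed at a locale. The show's reference is the JavaScript Faker (fakerjs.dev); the sketch below uses the Python port of the same idea as an assumed stand-in.

```python
# A minimal sketch of locale-aware fake data along the lines Michael and Chris
# describe. The show references the JavaScript Faker (fakerjs.dev); this uses
# the Python port, which ships locale-specific providers.
from faker import Faker

for locale in ("en_US", "ja_JP", "de_DE"):
    fake = Faker(locale)
    # Plausible (not real) names and addresses in each locale's conventions,
    # handy for the kind of localization testing Michael mentions.
    print(locale, "|", fake.name(), "|", fake.address().replace("\n", ", "))
```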

Matthew Heusser (30:26):
I think of the five of us, maybe Beth has done the most with generating test data. Do you have any commentary to add there, Beth?

Beth Marshall (30:37):
I don't think I would consider myself an expert in any way, but as all testers do, I use whatever tool I see in front of me. I like to investigate it and experiment with it and add it to my toolkit. People end up using ChatGPT, and they think they're gonna use it for one thing, and they use it for quite another. When I used to work for a small company that made security cameras, one of the things they said was, people buy security cameras because they think they're gonna be burgled and it's gonna protect them. And what they actually use them for is taking cute videos of their pets. That is something that I'm seeing ChatGPT being kind of used, or abused, for a lot. Asking it to generate test data, I mean, that's a worthy use case, but <laugh> I see a lot of people using it to generate lyrics or to do fun things. I'm not saying that's valid or invalid, but I'd love to know what proportion of the queries going into ChatGPT are more humorous than they are applicable.

Matthew Heusser (31:41):
And I think that's a good… we've got a couple things to pivot onto, but I think that's a good point. There are a couple of different kinds of people talking about ChatGPT. There's one group that says it's amazing, it's fantastic, I'm using it all the time, it's great. And then when you really push hard, they're playing with it and writing song lyrics. I keep hearing chatter that we're using it to generate code, but I can't get any real examples of it. And I think we're just early in the cycle. I think we'll know more in six months what the practical utility is. The people that are on it now are, in Gartner hype cycle terms, the sort of early adopters, the visionaries. So the real value… it's something to explore now, but I don't think any of us have used it enough to make definitive claims. And speaking of definitive claims, there is the whole conversation about "it's gonna make my job go away," which I think we should probably have later, once we've all explored it a little bit more. I think we got too much material for this week anyway. There's a lot of "we don't knows," so my advice when we're in those situations is to be someone at the forefront instead of someone scared at the back. I think I'm gonna do final thoughts and let Michael go first.

Michael Larsen (32:57):
All right! Well, again, my final thoughts on this are, like any other tool, any time that something comes out and gets some hype and people get excited about it, there are going to be wildly divergent claims as to what it can or can't do. Fortunately, having been around this industry for 30 years, we've seen lots of wild claims fall by the wayside and some very modest ones turn out to be very long-standing environments that we use today. That's why I mentioned Fake Name Generator. It's something that I've leveraged for 20 years and I couldn't dream of doing my job without it. But again, early days yet; I'm interested in seeing what it can do. Again, in some of the more fun aspects that I've played around with, if you're doing pop culture references or things like that, it's pretty smart, because there's a lot of data that it can plumb for that. But if you're doing very unique development work that is either cutting edge or hasn't really been proposed before, it doesn't know where to start, and it might lead you around in circles. The old "trust but verify, take it with a grain of salt." It seems intriguing, and I'd be curious to see what happens with it going forward.

Matthew Heusser (34:08):
Thanks, Michael. Ted?

(34:09):
[Michael: Chris Kenst had to drop from the call at this point, so he was not present for final thoughts.]

Ted Ariaga (34:10):
For me personally, final thoughts: like Michael said, in its current state, it's trust but verify, right? So for stuff that's basically generic, you can sort of rely on ChatGPT, but like you said as well, for the more complex, niche stuff, you don't really want to put all your eggs in that basket.

Matthew Heusser (34:31):
Beth, you wanna lead us out with your last thoughts?

Beth Marshall (34:35):
Yes. So my final thoughts on ChatGPT and the like, as it currently stands, are, just as everyone said, really do treat it with the same kind of critical thinking and analytic mindset that you go about treating everything else with. If you are a tester, it's in your nature to query things, and absolutely bring that same level of rigor, probably more so, to what you are seeing from ChatGPT. Keep it small. Experiment and see what this can do, if it's worth adding to your kind of mental testing toolkit. And if so, how? There are obviously huge advances taking place in this field for every single industry. It's certainly worth knowing about. But also be aware of how contentious this is. This is a tool which really has the ability to be a conversation stopper or an argument creator with a hell of a lot of people. If you thought the term "manual tester" was contentious, then wow, throw in ChatGPT and get ready for some arguments. So yeah, definitely be aware of the contentiousness of the tool, treat it with care, but be aware of what it can do.

Matthew Heusser (35:53):
Well, let's start figuring out what it can do, and we want to have that adventure with you. I think, if the Qualitest folks will let us, we'll talk about that next time. Thanks for being on the show today, everybody. I really appreciate it.

Michael Larsen (36:06):
Thanks for having us.

Beth Marshall (36:07):
Thanks so much.

Ted Ariaga (36:08):
Thanks for having us.

Michael Larsen (OUTRO):
That concludes this episode of The Testing Show. We also want to encourage you, our listeners, to give us a rating and a review on Apple Podcasts or Google Podcasts, and we are also available on Spotify. Those ratings and reviews, as well as word of mouth and sharing, help raise the visibility of the show and let more people find us. Also, we want to invite you to come join us on The Testing Show Slack channel, as a way to communicate about the show. Talk to us about what you like and what you'd like to hear, and also help us shape future shows. Please email us at thetestingshow (at) qualitestgroup (dot) com and we will send you an invite to join the group. The Testing Show is produced and edited by Michael Larsen, moderated by Matt Heusser, with frequent contributions from our many featured guests who bring the topics and expertise to make the show happen. Additionally, if you have questions you'd like to see addressed on The Testing Show, or if you would like to be a guest on the podcast, please email us at thetestingshow (at) qualitestgroup (dot) com.
