source link: https://www.techspot.com/news/99702-chatgpt-gets-more-than-half-programming-questions-wrong.html
ChatGPT gets more than half the programming questions wrong in recent study

But ChatGPT's confidence and politeness convince some people it's right

By Rob Thubron

Facepalm: Generative AIs often get things wrong – even their makers don't hide this fact – which is why relying on them to write code is risky. To test ChatGPT's abilities in this area, researchers posed it a large set of software programming questions, and it answered more than half of them incorrectly. Even so, it still managed to fool a significant number of people.

A study from Purdue University (via The Reg) posed 517 Stack Overflow questions to ChatGPT and had a dozen volunteer participants evaluate the results. The answers were assessed not only on whether they were correct, but also on their consistency, comprehensiveness, and conciseness. The team also analyzed the linguistic style and sentiment of the responses.

It wasn't a good showing for ChatGPT. OpenAI's tool answered just 48% of the questions correctly, while 77% were described as "verbose."

What's especially interesting is that ChatGPT's comprehensiveness and well-articulated language style meant that almost 40% of its answers were still preferred by the participants. Unfortunately for the generative AI, 77% of those preferred answers were wrong.


"During our study, we observed that only when the error in the ChatGPT answer is obvious, users can identify the error," states the paper, written by researchers Samia Kabir, David Udo-Imeh, Bonan Kou, and assistant professor Tianyi Zhang. "However, when the error is not readily verifiable or requires external IDE or documentation, users often fail to identify the incorrectness or underestimate the degree of error in the answer."

Even when ChatGPT's answer was obviously wrong, two of the 12 participants still preferred it because of the AI's pleasant, confident, and positive tone. Its comprehensiveness and textbook style of writing also helped make factually incorrect answers appear correct in some people's eyes.

"Many answers are incorrect due to ChatGPT's incapability to understand the underlying context of the question being asked," the paper explains.

Generative AI makers include warnings on their products' pages that the answers they give may be wrong. Even Google has warned its employees about the dangers of chatbots, including its own Bard, telling them to avoid directly using code generated by these services. When asked why, the company said that Bard can make undesired code suggestions but still helps programmers, and that it aims to be transparent about the limitations of its technology. Apple, Amazon, and Samsung, meanwhile, are among the firms that have banned ChatGPT entirely.
