Friday, September 20, 2024

Study shows ChatGPT fails to solve new coding problems, can’t replace human programmers yet

A recent study reveals significant variation in ChatGPT’s coding success rates, highlighting both its potential and limitations. While the AI excels at old coding problems, it struggles with new problems, indicating that it cannot replace human programmers just yet.


Since the unveiling of ChatGPT in 2022, there has been much debate about whether AI chatbots can replace humans in certain jobs. While some tech experts believe AI chatbots will take over human tasks like coding, others argue that the technology can never be as smart as humans and will only help them do their jobs better. Some tech giants have even predicted that, years from now, there will be no need for human programmers at all because AI will handle coding. But is that really the case? A recent study suggests not necessarily.


The research, published in the June issue of IEEE Transactions on Software Engineering, compared code produced by ChatGPT to code written by human programmers, focusing on functionality, complexity and security. The study found that ChatGPT’s success rate in producing functional code varied widely. Depending on the difficulty of the task, the programming language used and other factors, the AI’s success ranged from 0.66 percent to 89 percent. This wide range indicates that while ChatGPT can sometimes match or even surpass human programmers, it also has significant limitations.

Yutian Tang, a lecturer at the University of Glasgow involved in the study, said AI-based code generation can boost productivity and automate some software development tasks. However, it is important to understand both the strengths and weaknesses of these AI models. Tang emphasized the need for a comprehensive analysis to identify potential issues and improve AI-generated code techniques.

To understand these limitations in more depth, the research team tested GPT-3.5’s ability to solve 728 coding problems from the LeetCode platform in five programming languages: C, C++, Java, JavaScript, and Python. The study showed that ChatGPT was quite efficient at solving pre-2021 coding problems on LeetCode, achieving a success rate of around 89 percent for easy problems, 71 percent for medium problems, and 40 percent for hard problems.

However, the AI’s performance dropped significantly when tackling coding problems introduced after 2021. For example, ChatGPT’s success rate for easy problems fell from 89 percent to 52 percent. For hard problems, the success rate fell from 40 percent to a mere 0.66 percent. This suggests that ChatGPT struggles with new coding problems, possibly because its training data does not include these recent challenges.
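The comparison the study draws here amounts to a grouped pass-rate calculation: for each problem, record whether the generated code passed, then aggregate by difficulty and by whether the problem predates the model’s training-data cutoff. As a rough illustration only (the sample records, field names, and `success_rates` helper below are hypothetical, not the paper’s actual evaluation code), the tallying might look like this:

```python
from collections import defaultdict

# Hypothetical per-problem outcomes: each record notes whether the
# AI-generated solution passed all test cases for one problem.
# These entries are illustrative, not data from the study.
results = [
    {"difficulty": "easy",   "year": 2019, "passed": True},
    {"difficulty": "easy",   "year": 2022, "passed": False},
    {"difficulty": "hard",   "year": 2020, "passed": False},
    {"difficulty": "medium", "year": 2022, "passed": True},
]

def success_rates(records, cutoff=2021):
    """Group pass/fail outcomes by (difficulty, era) and return the
    percentage of problems solved in each group, where "era" marks
    whether the problem predates the assumed training cutoff."""
    tally = defaultdict(lambda: [0, 0])  # (difficulty, era) -> [passed, total]
    for r in records:
        era = "pre" if r["year"] < cutoff else "post"
        tally[(r["difficulty"], era)][0] += r["passed"]
        tally[(r["difficulty"], era)][1] += 1
    return {key: 100.0 * passed / total
            for key, (passed, total) in tally.items()}

rates = success_rates(results)
```

With real data, a table of such rates per difficulty and era is exactly the kind of summary that surfaces the pre/post-2021 drop the study reports.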

Tang proposed a reasonable hypothesis for ChatGPT’s varying performance. He suggested that AI performs better with pre-2021 algorithmic problems because these problems are more likely to be included in its training dataset. As coding evolves, ChatGPT is not exposed to new problems and solutions, lacking the critical thinking skills of human programmers. This limitation means that while ChatGPT can effectively address problems it has already encountered, it struggles with new, unfamiliar issues.

The study’s findings suggest that AI models like ChatGPT hold promise for increasing productivity and automating some coding tasks, but they are not yet a substitute for human programmers. AI’s inability to solve new coding problems highlights the need for continued development and training to keep up with the constantly evolving field of software engineering.
