In a test of creative thinking, artificial intelligence surpasses typical human performance.

A new study has found that artificial intelligence, notably ChatGPT4, can outperform the average human at generating ideas in a classic test of creativity. Although the AI chatbots performed consistently well, they did not surpass the most creative human participants. Humans instead showed a broader range of creative ability, possibly attributable to differences in executive functions and cognitive processes.

Traditionally, creativity has been regarded as an exclusively human attribute, driven by intricate cognitive processes like imagination, insight, and the ability to forge connections between seemingly unrelated concepts. However, as AI technology advances, it has become increasingly evident that machines possess the capability to produce creative results that can compete with, and sometimes even outperform, human accomplishments.

Study author Simone Grassini, an associate professor at the University of Bergen, said, “I believe we are presently situated in a unique historical juncture where our perception of machines and machine intelligence might undergo a profound transformation. As a scientist, I firmly believe that there is ample room for research into how people perceive machines and which human abilities machines are currently capable of emulating.”

“A few decades ago, it would have been challenging to envision machines exhibiting capabilities like creative behavior, and the field is evolving so rapidly that it’s hard to foresee what will transpire in the coming one or two years.”

The researchers conducted their investigation using a conventional creativity assessment called the Alternate Uses Task (AUT). In this assessment, both human participants and AI chatbots were tasked with generating inventive and unique applications for everyday objects, such as a rope, box, pencil, and candle.

Human participants were given 30 seconds to produce as many creative ideas as they could. In contrast, the chatbots were instructed to generate a set number of ideas (e.g., 3 ideas) and were limited to 1-3 words per response. Each chatbot completed the assessment 11 times.

The study encompassed three AI chatbots: ChatGPT3, ChatGPT4, and Copy.Ai, in addition to a cohort of 256 human participants. These participants, all native English speakers, were recruited from the online platform Prolific and had an average age of 30.4 years, ranging from 19 to 40 years.

The responses from both human participants and AI chatbots underwent analysis using two primary methods:

  1. Semantic Distance Scores: This automated technique gauged the originality of responses by measuring how distinct they were from common or expected applications of the given objects.
  2. Subjective Ratings of Creativity: Six human evaluators, unaware of whether the responses were generated by AI or humans, were tasked with assessing the creativity of the ideas on a 5-point scale.
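The two scoring methods can be illustrated with a short sketch. It is not the study's actual pipeline: the article does not say how semantic distance was computed, so cosine distance between word embeddings is used here as one common approach, and the three-dimensional vectors, the function names, and the sample ratings are all invented for illustration.

```python
import math

def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical toy embeddings (assumption): a real system would load
# vectors from a trained language model instead of hard-coding them.
TOY_EMBEDDINGS = {
    "rope": [0.9, 0.1, 0.0],
    "tie": [0.8, 0.2, 0.1],      # a common, expected use of a rope
    "jewelry": [0.1, 0.3, 0.9],  # a less typical use
}

def semantic_distance(obj, use):
    """Score how far a proposed use lies from the object in embedding space."""
    return cosine_distance(TOY_EMBEDDINGS[obj], TOY_EMBEDDINGS[use])

def mean_rating(ratings):
    """Average the 1-5 creativity scores given by the human raters."""
    return sum(ratings) / len(ratings)

if __name__ == "__main__":
    for use in ("tie", "jewelry"):
        print(use, round(semantic_distance("rope", use), 3))
    # Six made-up ratings, one per rater, for a single response.
    print(mean_rating([4, 3, 5, 4, 4, 3]))
```

In this toy setup the less typical use ("jewelry") lands farther from "rope" than the expected use ("tie"), which is the property an automated originality score relies on. The averaged rating stands in for the separate, subjective judgment of the six raters.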

The findings indicated that AI chatbots, particularly ChatGPT3 and ChatGPT4, consistently achieved higher semantic distance scores than humans. This implies that their responses were more unconventional and less typical than those of the human participants. Human raters also judged AI chatbots, especially ChatGPT4, to be more creative on average than human participants.

“Based on our results, AI chatbots like ChatGPT are showing remarkable proficiency in producing creative responses when engaged in conventional creative thinking tasks, commonly employed in psychological research,” Grassini remarked.

Nonetheless, it’s important to recognize that while AI chatbots performed impressively, they did not consistently outperform the most creative human participants. In some instances, exceptionally creative individuals among the human participants demonstrated the ability to compete with AI chatbots in generating novel and imaginative responses.

“The typical machine outperforms the average human in the Alternate Uses Task. However, the best-performing human participants still surpassed all the models we tested,” Grassini told PsyPost.

This achievement by AI systems is indeed noteworthy. Yet, it is crucial to refrain from overestimating its implications in the real world. The fact that a machine can excel in a very specific creative task does not necessarily imply that it will perform adeptly in intricate jobs requiring creativity. The extent to which these machine ‘skills’ are transferable to real-world applications remains an area for further exploration.

“I tend to believe that in the future, AI such as chatbots will complement human creativity rather than replace it in creative roles,” Grassini remarked. “We ought to consider a future where humans and AI machines coexist without automatically assuming that machines will render us obsolete or take away all our jobs.”

“Nonetheless, it is crucial to emphasize that the influence of AI on the job market is substantial and likely to expand in the coming years. The way our society adapts to the integration of AI into human employment is an essential contemporary concern. I anticipate that governments and stakeholders will develop guidelines and regulations concerning the deployment of machines to replace or assist human labor.”

Among the chatbots assessed, ChatGPT4 emerged as the most creative when subjective assessments were taken into account.

“One noteworthy discovery was that ChatGPT4, the latest model we tested, did not outperform the other AI models when evaluated using an algorithm to gauge semantic distance,” Grassini explained. “Nevertheless, ChatGPT4 generally outperformed the other models when assessed by human raters in terms of the level of creativity evident in the responses.”

This suggests that ChatGPT4’s outputs did not differ significantly from those of the other models on the ‘objective’ measure, the semantic distance between the object and its proposed creative use. However, the human raters perceived ChatGPT4’s responses as more ‘captivating’ or ‘subjectively more creative’.

It’s important to acknowledge that, like any scientific study, this research has certain limitations. “We only examined one facet of creative behavior,” Grassini told PsyPost. “Our findings may not be universally applicable to creativity as a multifaceted phenomenon.”

The researchers also acknowledged that comparing creativity at the process level between humans and chatbots presents challenges, as chatbots essentially function as “black boxes” with concealed internal processes.

“The machine may not ‘demonstrate’ creativity in the conventional sense, as it may have learned the best response to that specific task from the training data,” Grassini explained. “The task may have gauged the chatbots’ memory more than their capacity to devise creative uses for objects. Due to the architecture of these models, this remains indeterminate.”
