According to a recent study, GPT-4, OpenAI's leading language model, has shown a substantial drop in output quality, particularly in its problem-solving abilities. For instance, its accuracy at identifying prime numbers fell sharply from 97.6% in March to just 2.4% by June.
While some tasks showed less severe declines, all exhibited measurable differences over time. The decline raises concerns about the reliability of large language models (LLMs) like GPT-4, which are widely used across industries, from customer service to education.
The study urges regular performance assessments and adjustments for these AI models to prevent degradation over time. OpenAI has yet to respond to the findings.
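The kind of regular assessment the study calls for amounts to re-running a fixed benchmark against each model snapshot and comparing accuracy. A minimal sketch of scoring a primality benchmark is below; the question set and the two answer lists are hypothetical stand-ins for a model's responses at two points in time, not data from the study (17077 is one number the researchers reportedly tested).

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check used as ground truth."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def score(questions: list[int], answers: list[bool]) -> float:
    """Fraction of yes/no answers that match ground truth."""
    correct = sum(1 for n, a in zip(questions, answers) if a == is_prime(n))
    return correct / len(questions)

# Hypothetical answers from the same model queried months apart.
questions = [17077, 15, 97, 1, 561]
march_answers = [True, False, True, False, False]    # all five correct
june_answers = [False, False, False, False, False]   # primes now missed

print(score(questions, march_answers))  # 1.0
print(score(questions, june_answers))   # 0.6
```

Tracking this single number per snapshot is enough to surface the kind of drift the study reports; a production harness would also pin the prompt template and sampling settings so that only the model varies between runs.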