I have had my fair share of amusing glitches while working with LLMs. Counting the r's in "strawberry" still brightens up my dull days (I may need better hobbies, or a more exciting social life!), but it often left me wondering why a model sometimes behaves in such an illogical manner.
After digging deeper, here are the likely reasons for these occasional errors. Feel free to correct me or add your own thoughts.
1. Token-by-Token Generation
LLMs generate one token at a time. The first token is generated from the input, and each subsequent token uses the original input plus the previously generated tokens as context. Since the probability of the first token being wrong is never zero, an early mistake can cascade into a whole sequence of errors. While researchers are working on self-correction, most studies so far suggest that LLMs cannot reliably correct their own mistakes.
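Here is a minimal sketch of that idea. The "model" is just a made-up table of next-token probabilities (everything in it is invented for illustration, not taken from any real model), but it shows the mechanism: each sampled token becomes context for the next one, so one unlucky early pick steers the rest of the output.

```python
import random

# Toy next-token distributions keyed by the most recent token.
# All tokens and probabilities here are made up purely for illustration.
NEXT_TOKEN_PROBS = {
    "<start>": {"The": 0.6, "A": 0.4},
    "The":     {"capital": 0.7, "city": 0.3},
    "A":       {"capital": 0.2, "rumour": 0.8},
    "capital": {"is": 1.0},
    "city":    {"is": 1.0},
    "rumour":  {"says": 1.0},
    "is":      {"Paris.": 0.9, "Lyon.": 0.1},
    "says":    {"otherwise.": 1.0},
}

def sample_next(token: str) -> str:
    """Sample the next token from the toy distribution for `token`."""
    choices = NEXT_TOKEN_PROBS[token]
    return random.choices(list(choices), weights=list(choices.values()))[0]

def generate(max_tokens: int = 6, seed: int = 0) -> str:
    """Generate one token at a time; each pick becomes context for the next."""
    random.seed(seed)
    tokens = ["<start>"]
    for _ in range(max_tokens):
        nxt = sample_next(tokens[-1])
        tokens.append(nxt)
        if nxt.endswith("."):
            break
    return " ".join(tokens[1:])

# An unlucky first sample ("A" instead of "The") sends the whole
# continuation down a different, possibly nonsensical path.
for s in range(3):
    print(generate(seed=s))
```

Nothing downstream goes back and revisits that first choice, which is the core of the error-propagation problem.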
2. Pattern Matching
LLMs are trained on vast amounts of preexisting data and generate responses based on the patterns they have learned. This is what makes their responses fluent and contextually relevant. But it also means that even seemingly logical answers are produced through pattern matching rather than actual reasoning, which can lead to errors.
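A crude way to see this is a tiny bigram "language model": it learns only which word tends to follow which in a made-up corpus (my own toy data, nothing real), yet it can still string together grammatical-looking sentences. Real LLMs are vastly more sophisticated, but the same point holds: fluency comes from learned statistics, not from checking whether a statement is true.

```python
from collections import defaultdict
import random

# Tiny invented corpus; a real LLM trains on trillions of tokens.
CORPUS = (
    "the moon orbits the earth . "
    "the earth orbits the sun . "
    "the sun is a star . "
    "the moon is a satellite ."
).split()

# Learn bigram statistics: which word has been seen after which.
follows = defaultdict(list)
for a, b in zip(CORPUS, CORPUS[1:]):
    follows[a].append(b)

def generate(start: str = "the", length: int = 8, seed: int = 0) -> str:
    """Continue text by repeatedly picking a word seen after the current one."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

# The output mirrors learned word patterns, so it reads fluently --
# but nothing in the process checks whether the resulting claim is true.
print(generate(seed=1))
print(generate(seed=2))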
3. Built-in Hallucination
Hallucination is a by-product of pattern matching. Models are trained to fill in gaps when data is missing and to predict text based on patterns learned during training. This works well in many cases, but it can also produce confident, factually incorrect answers that are not grounded in real data and are instead shaped to fit the expected pattern.
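The confident tone falls out of the maths: the final softmax always turns the model's raw scores into a probability distribution that sums to one, so probability mass has to land somewhere even when the underlying knowledge is missing. The sketch below uses a fictional book title and made-up logits (both are my assumptions, not real model output) to show how a plausible-sounding name can end up with a high probability simply because names usually follow that kind of prompt.

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates for the prompt
# "The author of the 2031 novel 'Glass Rivers' is ..."
# The novel is invented, so there is no correct answer to retrieve,
# yet the model still has to put probability on *something*.
candidates = ["John", "Maria", "unknown", "<refuse>"]
logits = [4.1, 3.2, 0.5, 0.2]   # made-up scores skewed toward plausible names

for token, p in zip(candidates, softmax(logits)):
    print(f"{token:<9} {p:.2%}")

# A name comes out with ~70% probability purely because names usually
# follow "the author of ... is" in training data. The confidence reflects
# pattern fit, not factual grounding.
```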
4. Optimisation for Language Generation
LLMs are optimized for fluent language generation, not for symbolic or numeric manipulation. This means they can falter at precise tasks (arithmetic, logic, counting letters) such as my personal favourite: counting the number of r's in strawberry.
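Part of the reason is that models operate on subword tokens rather than characters. The split below is a hypothetical BPE-style example (real tokenizers vary), but it illustrates the gap: ordinary code counts letters trivially, while the model only ever sees opaque token IDs and has to recall from training patterns how many r's each chunk happens to contain.

```python
# Hypothetical BPE-style split; real tokenizers differ, but they generally
# break "strawberry" into a few subword chunks rather than single letters.
TOKENS = ["str", "aw", "berry"]
WORD = "".join(TOKENS)

# Counting characters directly is trivial for ordinary code ...
print(WORD.count("r"))            # 3

# ... but a model working on token IDs never "sees" the letters.
# From its point of view the input is just a sequence of opaque IDs
# (these IDs are made up for illustration):
FAKE_VOCAB = {"str": 496, "aw": 675, "berry": 19772}
print([FAKE_VOCAB[t] for t in TOKENS])   # [496, 675, 19772]

# To answer "how many r's?", the model must recall, from patterns seen in
# training, how many r's each subword contains -- it has no character-level
# counting routine to run.
```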
Understanding these limitations can help us work around them and get better results through better prompting. I look forward to your thoughts.
