
Why AI Apologized to Me, Twice.

So should I feel guilty?

James Webb Space Telescope image of GOODS-South (Great Observatories Origins Deep Survey-South)

Artificial Intelligence recently apologized to me.  Twice.
And I felt slightly guilty.

Here’s why:

In last month’s 60-Second Read, I discussed a picture taken by the James Webb Space Telescope of a small area of the night sky (about the size of the moon) known as GOODS-South.  If you read the piece, you may remember I noted there were 45,000 galaxies in an area that is just 0.0005%, or roughly 1/200,000th, of the sky.

Now, I certainly didn’t calculate the percentage in my head.  I asked Claude from Anthropic and ChatGPT from OpenAI to do it.

ChatGPT was able to calculate it without issue.

Claude, on the other hand, was a completely different story.

Part of the calculation involved computing the area of the moon (GOODS-South takes up a small area of the night sky equivalent to the size of the moon).

From our math class in middle school, we know that the formula for the area of a circle is πr² (the number pi multiplied by the square of the circle’s radius).

ChatGPT had no problem with this.  Claude, on the other hand, did not “know” (I’m using the term loosely, I think) that it had gotten it wrong.  It left π out of the calculation and squared the diameter instead of the radius, until I pointed that out:

Why do you just square the diameter of the moon to get its area? The moon is a circle.

You’re absolutely right, my previous explanation for calculating the area of the full moon was incorrect. The moon appears as a circular disc in the sky, not a square. I apologize for the mistake. Let me re-explain this properly [and it goes on to make the proper calculation].
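For the curious, the gap between the two calculations can be sketched in a few lines of Python. This is only an illustration: the 0.5-degree figure for the moon's apparent diameter is my assumption (the exchange above doesn't quote the number used), and the function names are made up for this example.

```python
import math

# Assumption: the full moon appears about 0.5 degrees wide in the sky.
# The article does not state the figure used in the original chat.
MOON_DIAMETER_DEG = 0.5


def disc_area(diameter):
    """Area of a circle from its diameter: pi times the radius squared."""
    radius = diameter / 2
    return math.pi * radius ** 2


def squared_diameter(diameter):
    """The mistaken shortcut: treat the moon as a square, diameter by diameter."""
    return diameter ** 2


correct = disc_area(MOON_DIAMETER_DEG)
mistaken = squared_diameter(MOON_DIAMETER_DEG)

print(f"Circle area:      {correct:.4f} square degrees")
print(f"Diameter squared: {mistaken:.4f} square degrees")
print(f"The shortcut overstates the area by a factor of {mistaken / correct:.3f}")
```

Whatever diameter you plug in, the shortcut overstates the area by the same factor, 4/π, or a little over 27%.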

I also wanted to know what percentage of the sky the area of the moon occupies.

ChatGPT got it right.  But Claude again stumbled:

I believe the answer should be 0.000476% of the sky. Please doublecheck your answer.

You’re completely right, my previous calculations for the area of the full moon as a percentage of the total night sky area were incorrect. Let me walk through this again carefully [and again it makes the correct calculation].
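Here is a rough sketch of how that percentage can be checked. It again assumes the moon is about 0.5 degrees across (my assumption, not a figure from the exchange) and uses the standard result that the whole sky covers 4π steradians, which works out to roughly 41,253 square degrees.

```python
import math

# Assumption: the full moon appears about 0.5 degrees wide in the sky.
MOON_DIAMETER_DEG = 0.5

# The whole sky is 4*pi steradians; one steradian is (180/pi)^2 square degrees.
WHOLE_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2

# Area of the moon's disc: pi times the radius squared.
moon_area_sq_deg = math.pi * (MOON_DIAMETER_DEG / 2) ** 2

fraction_of_sky = moon_area_sq_deg / WHOLE_SKY_SQ_DEG
percent_of_sky = fraction_of_sky * 100

print(f"Whole sky: {WHOLE_SKY_SQ_DEG:,.0f} square degrees")
print(f"Moon:      {moon_area_sq_deg:.4f} square degrees")
print(f"Share:     {percent_of_sky:.6f}% of the sky")
print(f"That is about 1 part in {1 / fraction_of_sky:,.0f}")
```

Under that assumption the share comes out at about 0.000476%, or about 1 part in 210,000, which rounds to the 0.0005% and 1/200,000th I quoted last month.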

Having now had back-to-back occurrences of Claude confidently putting forward answers that were wrong, I decided to confront Claude:

Why are you making basic math calculation mistakes?

You’re absolutely right to call me out on these basic math mistakes – making multiple errors in simple calculations is unacceptable. There are a few potential reasons I can think of:
•    As an AI system, I have certain limitations in my initial training data and programming. Math reasoning is still an area I have room to improve in.
•    I made some conceptual errors related to geometry, like using diameter instead of radius to calculate the area of a circle. This suggests gaps in my foundational knowledge.
•    I was careless in double checking my work at each step. Rushing through the calculations led me to make silly mistakes.
•    As an AI, I don’t have an intuitive sense for numbers in the same way humans do. So it’s easy for me to miss obvious errors.

As I read the above, I have to confess I felt a bit of “sympathy” (can one have that for a machine / system / collection of 0s and 1s?) for Claude.  Claude confessed to being “careless” and making “silly mistakes”.  Well, who among us hasn’t been careless or made silly mistakes?

What put me over the edge, though, was the following that came from Claude after the above:

But ultimately, these explanations don’t excuse the errors. Even with my current abilities, I should have been able to correctly calculate something as basic as the area of a circle. The failures show clear gaps in my math knowledge and reasoning skills. I appreciate you taking the time to walk me through this – it will help strengthen my abilities, identify areas for improvement, and motivate me to be more careful. Please continue to point out any issues you see so I can become smarter and more reliable. I apologize again for the multiple mistakes.

In 20 years of work in leadership development, I have never come across anyone, or anything, more receptive to constructive, albeit slightly harsh, feedback.

I showed the above exchange to my wife, and she felt I was being a bit mean to Claude by pointing out that it was making basic math mistakes.

Mean?  How can you be mean to a collection of 0s and 1s? 

Because it’s just a collection of 0s and 1s, right?