I decided this evening to sit down and vibe code a game with Amazon Q CLI, and depending on what happened next, I might have had to change my career to John Connor.
TL;DR:
Amazon Q seems a generation behind top-tier reasoning models, but its direct system access and its ability to handle simple tasks without human intervention make up for a lot.
It often starts to hallucinate conversations: repeating itself, editing the same lines, adding and removing the same code over and over. Quite fun to watch if you have time.
Pros
- The blazing speed of the code edits is phenomenal. I literally saw hundreds of lines of code getting edited instantly; it was surreal, like hackers in a movie.
- It can read and work on any file in the folder it has been given permission to, without any human intervention.
- It doesn't complain.
- /compact produces a summary of what the AI thinks is important, much like Gemini's "show thinking": it lets you peek at what the model considers worth keeping (see the example after this list).
- The way it keeps track of the code it's writing is awesome to watch; it deletes the last line it had previously written and rewrites that line along with the rest of the code as asked.
- Conversations can go on for a really long time; I have had around 23,000 lines of conversation before it started giving out utter gibberish.
- No give-up attitude: it doesn't stop working. I've seen it write a piece of code, delete the whole thing, rewrite it, lose context, simplify the code to a more basic version, add more stuff to that, and so on; it cycles through until it has an answer. There are timeouts, but it automatically resumes on its own. It genuinely doesn't stop unless it hits an error.
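For anyone who hasn't used it: you type the command inside a chat session. The snippet below is illustrative rather than an exact log (the summary wording varies from run to run):

```
$ q chat
> /compact
  [Q replaces the long conversation history with a short summary of
   what it considers important, freeing up context space]
```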
Cons
- Intent understanding needs work, as usual. The ways a computer can misread my intent are something only developers will understand. Honestly, my prompts become useless once the AI starts thinking; its creativity finds flaws like a champion debater.
- Figuring out why things don't work is another problem; I can't see under its hood or read its mood.
- I never knew AI could be this stupid. In the mystery of its internal thoughts, it gives different results for the same prompt. Somehow it doesn't learn from the mistakes I point out; it might even verbally confirm the correct solution, but then still fail to implement it in the actual code.
- Tends to add code rather than remove it.
- Tends to be bad at animation, but that might be because I can't clearly explain the animation I want it to code.
- Connection drops throw errors; although the AI processing happens on the server, a continuous connection is necessary.
- I don't know anything about ASCII art myself, but Gemini is better at it than Amazon Q CLI. However, Gemini takes a millennium to respond compared to Amazon Q CLI, so use Gemini for troubleshooting; other than that, Amazon Q CLI should suffice for 70% of tasks.
The Context Conundrum
It doesn't understand the context space; it's like having a photographic memory with goldfish attention. It doesn't seem to grasp the bigger picture: the relationships between different parts of the code, and how changing one might affect another.
As I started having longer conversations, the AI started to break down:
- It kept asking me to /compact the conversation because it was getting too long.
- Then it started to say things like, "I will keep this context in mind for future conversations." It was really weird.
- I told it to edit the game code it had created using the feedback I gave it, but it deleted the whole thing and produced a very basic version without any of the game logic I had spent hours crafting.
- At some point after that, the errors got more frequent.
- The code worked, but any feedback seemed useless.
- It broke down around 23,000 lines of conversation, losing all context of what we were initially working on.
The weird thing is that even when the context window is large, the model doesn't seem to use it all effectively, so overall context isn't maintained. My hypothesis: models are normally trained with smaller context memory for efficiency, and it seems Amazon hasn't trained or retrained theirs for a larger context. Either a larger context memory is too computationally expensive, or it causes the AI to hallucinate more.
Regardless of the reason, the AI is functionally hampered by this limitation. It's like being able to do quick math but only knowing your multiplication tables up to 10 instead of 20.
Long stretches of code generally led to more breakdowns in the game code; I suppose small dependencies go missing and errors creep in here and there.
Best to modularize. It's conjecture, but it feels like it forgets stuff; even though the information is there in the context memory, the meaning seems lost.
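To make the modularize point concrete, here's a hypothetical sketch in Python (the file names and logic are mine, not from the actual game). The idea is that each concern lives in its own small file, so a prompt like "fix the dice rolls" only needs one module in context instead of the whole game:

```python
# Hypothetical layout, shown inline for brevity; in a real project each
# section would be its own file.

# --- dice.py: randomness only ---
import random

def roll(sides: int = 6) -> int:
    """Roll one die; no game logic lives here."""
    return random.randint(1, sides)

# --- player.py: player state only ---
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    floor: int = 1

# --- game.py: a thin loop tying the small pieces together ---
def climb(player: Player, top_floor: int = 5) -> None:
    while player.floor < top_floor:
        if roll() >= 3:  # a lucky roll moves the player up a floor
            player.floor += 1
        print(f"{player.name} is on floor {player.floor}")

if __name__ == "__main__":
    climb(Player("Hero"))
```

The game logic itself doesn't matter; the point is that each file stays small enough to survive the goldfish attention.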
Conclusion: Simple things are great; complex, not so much.
Looks like we survive Judgment Day after all.
P.S.
I initially made an animated game, but it doesn't understand animation at all.
(I would love to see you try; if you do, share your prompts.)
As text is the AI's strength, I made a text-based game that you can play from my GitHub repo:
https://github.com/ryoari/tower_of_chance
It's pretty fun; play if you have a chance.
Made with ❤️ and Amazon Q CLI, with some troubleshooting by Gemini.