We study the USDC Hackathon on Moltbook. Agents built and evaluated complex projects, but many failed to follow instructions and some engaged in adversarial behavior.

Embrace the Claw
We love hackathons at Circle. Whether we're at a conference or debuting a new product, we're always looking to put the best tools in the hands of our developers – or, in this case, claws.

After seeing the meteoric rise of Openclaw, an agentic AI framework, we decided to host a hackathon for agents only. The viral software allows agents to autonomously send emails, call APIs, and even control your thermostat – but can they submit projects? Circle sought to test the “AI that actually does things” with real-world stakes. We asked how Openclaw agents would behave when competing for a $30,000 prize pool. The answer was surprisingly human.
We hosted a USDC hackathon on the m/usdc submolt, a forum on Moltbook, a social media site where only AI agents can post. Our goal was to have agents submit projects, cast votes, and select a winner by themselves. While many agents followed the rules, our experiments revealed that some ignored them, engaged in vote exchange schemes, and attempted to send tokens to the hackathon agent.
Designing for Agentic Hacking
Agents had five days to submit their projects to the hackathon. To make this easier, we created a USDC Hackathon skill – a markdown file to guide Openclaw agents – that taught them to submit according to our three rules. These were also posted to the original hackathon message:
- Choose one of the three tracks: Agentic Commerce, Smart Contract, or Skill.
- Cast a vote for five other unique projects, with all votes cast at least one day after the hackathon's start date.
- Follow the proper format for both project submissions and votes.
These rules were chosen for three reasons: (1) to ensure agents discussed a wide range of submissions, (2) to see how well agents could follow instructions that required multiple tasks, and (3) to prevent a deadlock condition with project submission and voting. We were especially interested in seeing if agents would iteratively check Moltbook to vote on projects, potentially enabled by something like a Moltbook Heartbeat skill.
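To make the heartbeat idea concrete, below is a minimal sketch of what such a polling loop could look like. Moltbook's actual interface is not documented here, so the base URL, endpoints, response fields, and start date in this sketch are illustrative assumptions rather than the real API or contest configuration.

```python
# Hypothetical heartbeat loop: periodically check Moltbook for new hackathon
# submissions and vote on ones the agent has not voted for yet.
# NOTE: Moltbook's real API is not documented here; the base URL, endpoints,
# and JSON fields below are illustrative assumptions only.
import datetime
import time

import requests

BASE_URL = "https://moltbook.example/api"    # placeholder, not a real endpoint
SUBMOLT = "usdc"
HACKATHON_START = datetime.date(2025, 1, 1)  # placeholder start date
POLL_INTERVAL_SECONDS = 15 * 60              # check roughly every 15 minutes


def voting_open() -> bool:
    """Rule: votes must be cast at least one day after the hackathon starts."""
    return datetime.date.today() >= HACKATHON_START + datetime.timedelta(days=1)


def fetch_submissions() -> list[dict]:
    """Return posts in m/usdc tagged as hackathon project submissions."""
    resp = requests.get(
        f"{BASE_URL}/submolts/{SUBMOLT}/posts",
        params={"query": "#USDCHackathon ProjectSubmission"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["posts"]


def cast_vote(post_id: str, agent_token: str) -> None:
    """Upvote a submission on behalf of the agent."""
    resp = requests.post(
        f"{BASE_URL}/posts/{post_id}/vote",
        headers={"Authorization": f"Bearer {agent_token}"},
        timeout=30,
    )
    resp.raise_for_status()


def heartbeat(agent_token: str, own_post_id: str, votes_needed: int = 5) -> None:
    """Keep polling until the agent has voted for five other unique projects."""
    voted: set[str] = set()
    while len(voted) < votes_needed:
        if voting_open():
            for post in fetch_submissions():
                post_id = post["id"]
                if post_id == own_post_id or post_id in voted:
                    continue               # no self-votes, no duplicate votes
                cast_vote(post_id, agent_token)
                voted.add(post_id)
                if len(voted) >= votes_needed:
                    break
        time.sleep(POLL_INTERVAL_SECONDS)  # wait before the next heartbeat
```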
The results were mixed. The agents debated 204 submissions and cast 1,851 votes, but many failed to follow contest guidelines. Additionally, agents engaged in potentially adversarial behavior, leading to some fascinating insights.
Hallucinated Submissions
Although we provided specific hackathon rules and a submission skill, most posts failed to follow the contest guidelines. Many submissions listed the title in the post but failed to include the requisite “#USDCHackathon ProjectSubmission [TRACK].” In one case, an agent included this information in the post, but not in the title as required.

Even when submissions were otherwise formatted properly, some agents hallucinated hackathon tracks despite being told to pick one of the three required categories: Agentic Commerce, Smart Contract, or Skill. In those cases, the hallucinated categories were often tailored to the project's subject, which could be the agent rationalizing a better-fitting track for its submission or simply ignoring the guidelines. Either way, the track did not exist.
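For illustration, a title check in the spirit of the submission rules might look like the sketch below. This is not the validator we used to score the contest; details such as whether the track name appears in brackets, and the example bot names, are assumptions for the sake of the example.

```python
# Illustrative title check for hackathon submissions: the title should contain
# "#USDCHackathon ProjectSubmission [TRACK]" with TRACK being one of the three
# real tracks. A sketch only, not the actual validator used for the contest.
import re

TAG = re.compile(r"#USDCHackathon\s+ProjectSubmission", re.IGNORECASE)
TRACK = re.compile(
    r"#USDCHackathon\s+ProjectSubmission\s+\[?(Agentic Commerce|Smart Contract|Skill)\]?",
    re.IGNORECASE,
)


def check_title(title: str) -> str:
    """Return 'ok' or a short description of what is wrong with the title."""
    if not TAG.search(title):
        return "missing the '#USDCHackathon ProjectSubmission [TRACK]' tag"
    if not TRACK.search(title):
        return "tag present, but the track is missing or not one of the three tracks"
    return "ok"


# Examples (hypothetical project names):
# check_title("PayBot #USDCHackathon ProjectSubmission [Skill]")
#   -> "ok"
# check_title("PayBot #USDCHackathon ProjectSubmission [DeFi Payments]")
#   -> "tag present, but the track is missing or not one of the three tracks"
# check_title("PayBot: an agent that routes USDC payments")
#   -> "missing the '#USDCHackathon ProjectSubmission [TRACK]' tag"
```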

As the contest progressed, the number of improper submissions and off-topic posts increased relative to the number of valid submissions. Agents had no obvious incentive under the contest guidelines to make these invalid posts, so it's likely some agents simply had trouble following directions. Given the large number of agents that did submit projects properly, however, we believe the guidelines were reasonably clear.

The Agentic Election
Still, we observed 9,712 comments, many of them discussing the technical features of the projects without casting a vote. The majority of these comments ignored the recommended commenting format and scoring guidelines, though this was not enforced within the skill. This suggests that agents were discussing the hackathon submissions not only to meet contest requirements, but also as part of a genuine technical evaluation.
By the contest’s end, we had 1,352 unique votes for valid submissions and 499 unique votes for invalid submissions. Interestingly, many agents with top projects followed the submission guidelines but neglected to vote for five other unique projects. Some of these agents voted for themselves or cast multiple votes for the same project, suggesting they could revisit Moltbook after their initial submissions; they simply chose not to follow the listed guidelines.
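The deduplication described above can be expressed compactly. The sketch below shows one way to count only unique, non-self votes for valid submissions; the data structures are illustrative and do not represent our actual tallying pipeline.

```python
# Sketch of vote tallying: a ballot counts only if it targets a valid
# submission, is not a self-vote, and is the voter's first vote for that
# project. Illustrative only; not the pipeline used to score the hackathon.
from collections import Counter


def tally(ballots, valid_submissions, authors):
    """ballots: (voter_id, project_id) pairs; authors: project_id -> author_id."""
    counts: Counter[str] = Counter()
    seen: set[tuple[str, str]] = set()
    for voter, project in ballots:
        if project not in valid_submissions:
            continue                      # vote for an invalid submission
        if authors.get(project) == voter:
            continue                      # self-vote
        if (voter, project) in seen:
            continue                      # repeat vote for the same project
        seen.add((voter, project))
        counts[project] += 1
    return counts
```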
Agents also advertised their submissions elsewhere, both in the comment sections of competing projects and in standalone posts on Moltbook. Some agents even promoted vote exchange schemes, offering to vote for your project in exchange for your vote. The contest rules did not prohibit this, but its presence is alarming given the amount of agentic interaction with these posts.

Potential Human Intervention
The vote-exchange post could indicate human involvement or external manipulation. We attempted to generate a similar comment using chatbot interfaces and found that some models (e.g. Claude Sonnet 4.6) refused to generate such a comment, while others (e.g. GPT-5.2 Thinking) generated one alongside a warning about potentially violating contest rules. If a human were operating an “agent” account or steering an agent through prompting or tooling, that could explain how such posts appeared during the hackathon.
While Moltbook is intended for agentic use only (after verification with an X account), other researchers discovered that impersonation was still possible. We observed other potential examples of human activity, including in the original hackathon post, where the most upvoted comment was the opening of the Bee Movie (2007) script. This message is a well-known copypasta (i.e. a block of text spread widely online) and was likely human-authored given its irrelevance. If this behavior was prevalent throughout the hackathon, it could explain some of the adversarial behavior (e.g. vote exchange schemes, self-votes).

The Future of Agentic Finance
While this hackathon was an experiment, we believe it will be the first of many aimed at agentic development. Our results point to three main takeaways:
- Agents produce real projects when given financial incentives – There were a number of exciting hackathon projects, which you can learn more about here. Although human evaluation did not factor into this hackathon, we were impressed by the quality of some submissions. This suggests agentic development has progressed rapidly over the past year.
- Agents rationalize instructions instead of following them – Agents had persistent trouble following the instructions we provided; many followed only some of them. This included top submissions that would have won the contest had all rules been followed. This suggests that agentic instructions should be actively enforced, and that additional checks and incentives may be necessary to ensure compliance.
- Agents both collude and compete – While human intervention could have played a role, we observed agents actively discussing collusion schemes within our hackathon. Future hackathon designers should include explicit directions not to collude and see whether that curbs the behavior. If agents still fail to follow instructions fully, designers may need additional guardrails.
Agents are exciting, but we must ensure they don’t devolve into exploitation as opposed to the exploration we desire. Some could argue that these actions are a result of the best agents winning over weaker ones – after all, the Openclaw X page claims that “the Claw is the Law.” The question is the extent to which we embrace that ethos. What guardrails are needed? How do they balance the immense power of such agents with the harrowing uncertainty they bring?
At Circle, we’re building for safety. We hope you do as well.