Exhausted man defeats AI model in world coding championship

kebab@endlesstalk.org · 9 months ago

scintilla · 9 months ago

Did I miss where they talk about how the AI “coder” worked? Because based on what I’ve heard from programmers, it would just lie the first few times. Was a human allowed to fix mistakes that the AI made?

Seems like the model was specifically tuned for this, maybe?

I feel like I’m missing information as to whether or not I should be impressed by the AI’s performance.

Derpgon@programming.dev · edit-2 9 months ago

From experience: Junie, an AI agent based on Sonnet 4, performs quite well. It can even write tests and fix them if they are failing.

Not saying the quality is great, but it’s good enough to eventually work and to pass as junior code.

Not sure how good OpenAI’s agent is, or whether they used their coding agent Codex. And if they did, was it as-is or with some tuning? Not sure, they write that it was a “custom agent based on o3”.

They write that all the contestants had the same hardware, but did the agent run on the given machine, or in the cloud? A human brain is like 20-40W, so let’s say the upper limit, given he has to move his hands - did the AI agent get the same wattage? I don’t think so.

vane@lemmy.world · edit-2 9 months ago

fucking articles these days, no links to anything, just a bunch of copy-pasted hype text from twitter
here’s link to problem: https://img.atcoder.jp/awtf2025heuristic/en.pdf
here’s link to live stream: https://www.youtube.com/watch?v=TG3ChQH61vE

edit: from the stream at 39:00, the first solution by openai came after 15 minutes; the first human solution came after 38 minutes and it was 2x slower than the initial openai solution

edit2: from the stream at 7:20:12, the winner says live that he slept 8-10h over the last 72h. openai models are crap, given he doesn’t know what he’s doing there, his code is crap and he doesn’t know why it’s working so well

Apeman42@lemmy.world · 9 months ago

Then he laid down his ~~hammer~~ keyboard and died.

KingOfSleep@lemmy.ca · 9 months ago

The last human code master.

lordnikon@lemmy.world · 9 months ago

John Henry beat that infernal machine

pastermil@sh.itjust.works · 9 months ago

He’s gonna save us all!

Sekoia · 9 months ago

Having just read the problem, I’m curious how o3 solved it (and the human too tbh). My experience with LLMs says they’d be absolute complete crap at this; it’s a very hard and open-ended problem. Intuitively I’d say it would just end up making random changes, trying to improve its score.

I think I could write the “trivial” solution but anything beyond seems… difficult. Congrats to the winner!

FaceDeer@fedia.io · 9 months ago

Getting Kasparov v. Deep Blue vibes here.

applebusch · 9 months ago

How do you win at coding?

FaceDeer@fedia.io · 9 months ago

Did you read the article? It says:

The competition required contestants to solve a single complex optimization problem over 600 minutes.

𝕱𝖎𝖗𝖊𝖜𝖎𝖙𝖈𝖍@lemmy.world · 9 months ago

They had to optimize their own 5 year old code

charade_you_are@sh.itjust.works · 9 months ago

you code like you’ve never coded before

yarr@feddit.nl · 9 months ago

I don’t read this as a win. One man finished in front of OpenAI and many, many, many finished behind OpenAI. If this is the future of coding, it’s bleak indeed.

The top 1% of developers will probably be OK no matter what; it’s the rest of the crowd, the ones who aren’t award-winning developers, that are probably in trouble.

not_IO · 9 months ago