• 0 Posts
  • 89 Comments
Joined 10 months ago
Cake day: May 16th, 2025

  • I decided to take a look at the bitcoin white paper.

    Usually, the introduction of a technical paper is fluff and people quickly move on to the technical parts. However, the casual claims made in the first paragraph of this paper have aged extremely poorly, to say the least. In a better world, Bitcoin would have remained an obscure academic toy, and this introduction would have remained fluff.

    While the system works well enough for most transactions, it still suffers from the inherent weaknesses of the trust based model.

    What weaknesses are there in the trust based model? Let’s find out!

    Completely non-reversible transactions are not really possible, since financial institutions cannot avoid mediating disputes. The cost of mediation increases transaction costs, limiting the minimum practical transaction size and cutting off the possibility for small casual transactions, and there is a broader cost in the loss of ability to make non-reversible payments for non-reversible services. With the possibility of reversal, the need for trust spreads. Merchants must be wary of their customers, hassling them for more information than they would otherwise need.

    It seems like this guy really loves non-reversible transactions! But as we’ve seen with the history of crypto, non-reversible transactions sound really good until you fall victim to a crypto scam and there is no way to appeal to the bank to reverse the charges. Reversibility actually increases trust because you no longer need to be absolutely certain that you’re dealing with an honest person.

    A certain percentage of fraud is accepted as unavoidable.

    Almost like that is a problem of human nature. And it’s not like cryptocurrency has a spotless record when dealing with fraud! The problem with fraud lies not with the third party (the bank), but with the second party (the merchant or customer you’re dealing with).

    The introduction is not long, and most of the paper concerns the technical details of the construction of Bitcoin. By itself, there really is no way to complain about a pile of definitions. But there are still dumb comments that have aged poorly in retrospect.

    A block header with no transactions would be about 80 bytes. If we suppose blocks are generated every 10 minutes, 80 bytes * 6 * 24 * 365 = 4.2MB per year. With computer systems typically selling with 2GB of RAM as of 2008, and Moore’s Law predicting current growth of 1.2GB per year, storage should not be a problem even if the block headers must be kept in memory.

    But why would you want a block header with no transactions? If you wanted to, I don’t know, replace the world’s financial system, you would need to handle millions of transactions every 10 minutes. How big would the blocks be then? And remember that many copies of the same blockchain would need to be stored (certainly, every miner would need to store a copy). How many thousands or millions of times would that multiply things?
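    To put a rough number on it, here is a back-of-envelope sketch. The 80-byte header and the 10-minute block interval come from the paper; the one million transactions per block and the 250 bytes per transaction are my own placeholder assumptions, not figures from the paper, so treat the result as an order of magnitude at best.

```python
# Back-of-envelope blockchain growth per year.
# From the paper: 80-byte headers, one block every 10 minutes.
# ASSUMPTIONS (not from the paper): transactions per block and bytes per transaction.

HEADER_BYTES = 80
BLOCKS_PER_YEAR = 6 * 24 * 365  # six blocks per hour


def headers_per_year_bytes() -> int:
    """Storage for headers alone, as computed in the paper."""
    return HEADER_BYTES * BLOCKS_PER_YEAR


def full_blocks_per_year_bytes(tx_per_block: int, bytes_per_tx: int) -> int:
    """Storage for header plus transactions, for one copy of the chain."""
    block_bytes = HEADER_BYTES + tx_per_block * bytes_per_tx
    return block_bytes * BLOCKS_PER_YEAR


if __name__ == "__main__":
    print(headers_per_year_bytes() / 1e6, "MB per year (headers only)")
    # Hypothetical load: 1,000,000 transactions per block at 250 bytes each.
    print(full_blocks_per_year_bytes(1_000_000, 250) / 1e12, "TB per year (one copy)")
```

    Under those assumed numbers, a single copy of the chain grows by roughly 13 TB per year instead of 4.2 MB, and that is before multiplying by the number of copies.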

    Businesses that receive frequent payments will probably still want to run their own nodes for more independent security and quicker verification.

    Turns out it was a bold assumption to think that businesses would just run their own bitcoin miners.

    The proof of security (Section 11) is extremely sketchy by modern standards. (They’re assuming that all attackers would follow a certain format to attack and not try something different. I get it, proper proofs of security in cryptography are very subtle and difficult.) There is also a page of fluff making random calculations with the Poisson distribution. In any case, the security of Bitcoin requires that the collective computational power of the defenders exceeds the power of any attacker (so the defenders can make new blocks faster).
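    For reference, the Poisson calculation in Section 11 boils down to something like the sketch below. This is my own Python port of the short C snippet in the paper, so the function name and the early return for q >= p are my choices. It only covers the one attack the paper considers: an attacker with a share q of the hash power trying to catch up from z blocks behind.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Chance that an attacker with hash-power share q ever catches up
    from z blocks behind, under the white paper's attack model only."""
    p = 1.0 - q  # share of hash power held by honest nodes
    if q >= p:
        # The paper's formula only applies when honest nodes hold the majority;
        # otherwise the attacker is assumed to catch up eventually.
        return 1.0
    lam = z * (q / p)  # expected attacker progress while honest nodes mine z blocks
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


if __name__ == "__main__":
    for z in range(6):
        print(z, attacker_success_probability(0.1, z))
```

    Note that the only input that matters is the ratio of attacker to defender hash power, which is exactly why the defenders have to keep outspending everyone else.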

    Bitcoin is very strange as a cryptographic system in that the defender must have more resources than any possible attacker. In most cryptographic systems, the system should be secure even if the attacker has vastly more resources than the defender. Your phone’s cryptography should be secure even if some government agency dedicated their supercomputers to try and break it. This means that Bitcoin must waste tons of energy, since that is required to maintain security. Any more energy dumped into it will only increase security and not make the actual transactions faster, which makes Bitcoin horrendously inefficient.

    As a purely academic idea in cryptography, it is an interesting curiosity, but the arguments for why it’s useful are sketchy. There are other such curiosities that are much more interesting, like homomorphic encryption or secure multiparty computation. It would be a nice line on a CV, but not “incredible”.

    The true significance of Bitcoin was the terrible libertarian economic argument for it, and the chain of events that would transform it into nothing more than a speculative fashion trend. It has nothing to do with the technical details of Bitcoin. The technical and economic arguments for Bitcoin turned out to be so weak that nowadays, the only real support for Bitcoin is that maybe you can sell it for a higher price to a greater fool.


  • This somehow makes things even funnier. If he had any understanding of modern math, he would know that representing a set of things as points in some geometric space is one of the most common techniques in math. (A basic example: a pair of numbers can be represented by a point in 2D space.) Also, a manifold is an extremely broad geometric concept: knowing that two things are manifolds does not mean that they are the same or even remotely similar, without checking the details. There are tons of things you can model as a manifold if you try hard enough.

    From what I see, Scoot read a paper modeling LLM inference with manifolds and thought “wow, cool!” Then he fished for neuroscience papers until he found one that modeled neurons using manifolds. Both of the papers have blah blah blah something something manifolds so there must be a deep connection!

    (Maybe there is a deep connection! But the burden of proof is on him, and he needs to do a little more work than noticing that both papers use the word manifold.)


  • Kolmogorov complexity:

    So we should see some proper definitions and basic results on the Kolmogorov complexity, like in modern papers, right? We should at least see a Kt or a pKt thrown in there, right?

    Understanding IS compression — extracting structure from data. Optimal compression is uncomputable. Understanding is therefore always provisional, always improvable, never verifiably complete. This kills “stochastic parrot” from a second independent direction: if LLMs were memorizing rather than understanding, they could not generalize to inputs not in their training data. But they do. Generalization to novel input IS compression — extracting structure, not regurgitating sequences.

    Fuck!


  • Nonsensical analogies are always improved by adding a chart with colorful boxes and arrows going between them. Of course, the burden of proof is on you, dear reader, to explain why the analogy doesn’t make sense, not on the author to provide more justification than waving his hands really really hard.

    Many of these analogies are as bad as, I don’t know, “Denmark and North Korea are the same because they both have governments” or something. Humans and LLMs both produce sequences of words, where the next word depends in some way on the previous words, so they are basically the same (and you can call this “predicting” the next word as a rhetorical flourish). Yeah, what a revolutionary concept, knowing that both humans and LLMs follow the laws of time and causality. And as we know, evolution “optimizes” for reproduction, and that’s why there are only bacteria around (they can reproduce every 20 minutes). He has to be careful, these types of dumbass “optimization” interpretations of evolution that arose in the late 1800s led to horrible ideas about race science … wait a minute …

    He isn’t even trying with the yellow and orange boxes. What the fuck do “high-D toroidal attractor manifolds” and “6D helical manifolds” have to do with anything? Why are they there? And he really thinks he can get away with nobody closely reading his charts, with the “(???, nothing)” business. Maybe I should throw in that box in my publications and see how that goes.

    I feel like his arguments rely on the Barnum effect. He makes statements like “humans and LLMs predict the next word” and “evolution optimizes for reproduction” that are so vague that they can be assigned whatever meaning he wants. Because of this, you can’t easily dispel them (he just comes up with some different interpretation), and he can use them as carte blanche to justify whatever he wants.




  • For all the talk about these people being “highly agentic”, it is deeply ironic how all the shit they do has no meaning and purpose. I hear all this sound and fury about making millions off of ChatGPT wrappers, meeting senators in high school bathrooms, and sperm races (?), and I wonder what the point is. Silicon Valley hagiographies used to at least have a veneer that all of this was meaningful. Are we supposed to emulate anyone just because they happen to temporarily have a few million dollars?

    Even though the material conditions of working in science are not good, I’d still rather do science than whatever the hell they’re doing. I would be sick at the prospect of being a “highly agentic” person in a “new and possibly permanent overclass”, where my only sense of direction is a vague voice in my head telling me that I should be optimizing my life in various random ways, and my only motivation is the belief that I have to win harder and score more points on the leaderboard. (In any case, I believe this “overclass” is a lot more fragile than the author seems to think.)




  • my current favorite trick for reducing “cognitive debt” (h/t @simonw ) is to ask the LLM to write two versions of the plan:

    1. The version for it (highly technical and detailed)
    2. The version for me (an entertaining essay designed to build my intuition)

    I don’t know about them, but I would be offended if I were planning something with a collaborator, and they decided to give me a dumbed down, entertaining, children’s storybook version of their plan while keeping all the technical details to themselves.

    Also, this is absolutely not what “cognitive debt” means. I’ve heard technical debt refers to bad design decisions in software where one does something cheap and easy now but has to constantly deal with the maintenance headaches afterwards. But the very concept of working through technical details? That’s what we call “thinking”. These people want to avoid the burden of thinking.


  • This is why CCC being able to compile real C code at all is noteworthy. But it also explains why the output quality is far from what GCC produces. Building a compiler that parses C correctly is one thing. Building one that produces fast and efficient machine code is a completely different challenge.

    Every single one of these failures is waved away because supposedly it’s impressive that the AI can do this at all. Do they not realize the obvious problem with this argument? The AI has been trained on all the source code that Anthropic could get their grubby hands on! This includes GCC and clang and everything remotely resembling a C compiler! If I took every C compiler in existence, shoved them in a blender, and spent $20k on electricity blending them until the resulting slurry passed my test cases, should I be surprised or impressed that I got a shitty C compiler? If an actual person wrote this code, they would be justifiably mocked (or they’re a student trying to learn by doing, and LLMs do not learn by doing). But AI gets a free pass because it’s impressive that the slop can come in larger quantities now, I guess. These Models Will Improve. These Issues Will Get Fixed.



  • I thought I was sticking my neck out when I said that OpenAI was faking their claims in math, such as with the whole International Math Olympiad gold medal incident. Even many of my peers in my field are starting to become receptive to all of these rumors about how AI is supposedly getting good at math. Sometimes I wonder if I’m going crazy and sticking my head in the sand.

    All I can really do is to remember that AI developers act in bad faith (and scientists are actually bad at dealing with bad faith tactics like flooding the zone with bullshit). If the boy has cried wolf 10 times already, pardon me if I just ignore him entirely when he does it for the 11th time.

    I would not underestimate how much OpenAI and friends would go out of their way to cheat on math benchmarks. In the techbro sphere, math is placed on a pedestal to the point where Math = Intelligence.


  • It took a full eleven paragraphs before the article even mentions AI. Before that, it was a bunch of stuff about how Wikipedia is conservative and Gen Z and Gen Alpha have no attention span. If the author has to bury the real point and attempt to force this particular rhetorical framing, I think the haters are winning. Well done everyone.

    my comments about this turd of an article

    These three controversies from Wikipedia’s past reveal how genuine conversations can achieve—after disagreements and controversy—compromise and evolution of Wikipedia’s features and formats. Reflexive vetoes of new experiments, as the Simple Summaries spat highlighted last summer, is not genuine conversation.

    Supplementing Wikipedia’s Encyclopedia Britannica–style format with a small component that contains AI summaries is not a simple problem with a cut-and-dried answer, though neither were VisualEditor or Media Viewer.

    Surely, AI summaries are exactly the same as stuff like VisualEditor and Media Viewer, which were tools that helped contributors improve articles. Please ignore my rhetorical sleight of hand. They’re exactly the same! Okay, I did mention AI hallucinations in one sentence, but let’s move on from that real quick.

    A still deeper crisis haunts the online encyclopedia: the sustainability of unpaid labor. Wikipedia was built by volunteers who found meaning in collective knowledge creation. That model worked brilliantly when a generation of internet enthusiasts had time, energy, and idealism to spare. But the volunteer base is aging. A 2010 study found the average Wikipedia contributor was in their mid-twenties; today, many of those same editors are now in their forties or fifties.

    Yeah, because Wikipedia editors are permanently static. Back in 2001, Jimmy Wales handpicked a bunch of teenagers to have the sacred title of Wikipedia Editor, and they are the only ones who will ever be allowed to edit Wikipedia. Oh wait, it doesn’t work like that. Older people retire and move on, and new people join all the time.

    Meanwhile, the tech industry has discovered how to extract billions in value from their work. AI companies train their large language models on Wikipedia’s corpus. The Wikimedia Foundation recently noted it remains one of the highest-quality datasets in the world for AI development. Research confirms that when developers try to omit Wikipedia from training data, their models produce answers that are less accurate, less diverse, and less verifiable.

    Now that we have all these golden eggs, who needs the goose anymore? Actually, it is Inevitable that the goose must be killed. It is progress. It is the advancement of technology. We just have to accept it.

    The irony is stark. AI systems deliver answers derived from Wikipedia without sending users back to the source. Google’s AI Overviews, ChatGPT, and countless other tools have learned from Wikipedia’s volunteer-created content—then present that knowledge in ways that break the virtuous cycle Wikipedia depends on. Fewer readers visit the encyclopedia directly. Fewer visitors become editors. Fewer users donate. The pipeline that sustained Wikipedia for a quarter century is breaking down.

    So AI is a parasite that takes from Wikipedia, contributes nothing in return, and in fact actively chokes it out? And you think the solution is for Wikipedia to just surrender and implement AI features? Do you keep forgetting what point you’re trying to make?

    Meanwhile, AI systems should credit Wikipedia when drawing on its content, maintaining the transparency that builds public trust. Companies profiting from Wikipedia’s corpus should pay for access through legitimate channels like Wikimedia Enterprise, rather than scraping servers or relying on data dumps that strain infrastructure without contributing to maintenance.

    Yeah, what a wonderful suggestion. The AI companies just never realized all this time that they could use legitimate channels and give back to the sources they use. It’s not like they are choosing to do this because they have no ethics and want the number to go up no matter the costs to themselves or to others.

    Wikipedia has survived edit wars, vandalism campaigns, and countless predictions of its demise. It has patiently outlived the skeptics who dismissed it as unreliable. It has proven that strangers can collaborate to build something remarkable.

    Wikipedia has survived countless predictions of its demise, but I’m sure this prediction of its demise is going to pan out. After all, AI is more important than electricity, probably.




  • “California is, I believe, the only state to give health insurance to people who come into the country illegally,” Kauffman said nervously. “I think we probably should not be providing that.”

    “So you’d rather everyone just be sick, and get everyone else sick?” another reporter asked.

    “That’s not what I’m saying,” said Kauffman.

    “Isn’t that effectively what happens?” the reporter countered. “They don’t have access to health care and they just have to get sick, right?”

    Kauffman contemplated that one for a moment. “Then they have to just get sick,” he said. “I mean, it’s unfortunate, but I think that it’s sort of impossible to have both liberal immigration laws and generous government benefits.”

    Do I need to comment on this one?


  • I don’t even think many AI developers realize that we’re in a hype bubble. From what I see, they genuinely believe that the Models Will Improve and that These Issues Will Get Fixed. (I see a lot of faculty in my department who still have these beliefs.)

    What these people do see, however, are a lot of haters who just cannot accept this wonderful new technology for some reason. AI is so magical that they don’t need to listen to the criticisms; surely they’re trivial by comparison to magic, and whatever they are, These Issues Will Get Fixed. But lately they have realized that with the constant embarrassing AI failures (AI surely doesn’t have horrible ethics as well), there are a lot of haters who will swarm the announcement of any AI project now. The haters also tend to be people who actually know stuff and check things (tech journalists are incentivized to not do that), but it doesn’t matter because they’re just random internet commenters, not big news outlets.

    My theory is that now they add a ton of caveats and disclaimers to their announcements in a vain attempt to reduce the backlash. Also if you criticize them, it’s actually your fault that it doesn’t work. It’s Still Early Days. These Issues Will Get Fixed.



  • I wonder what actual experts in compilers think of this. There were some similar claims about vibe coding a browser from scratch that turned out to be a little overheated: https://pivot-to-ai.com/2026/01/27/cursor-lies-about-vibe-coding-a-web-browser-with-ai/

    I do not believe that this demonstrates anything other than that they kept making the AI brute force random shit until it happened to pass all the test cases. The only innovation was that they spent even more money than before. Also, it certainly doesn’t help that GCC is open source, and they have almost certainly trained the model on the GCC source code (which the model can regurgitate poorly into Rust). Hell, even their blog post talks about how half their shit doesn’t work and just calls GCC instead!

    It lacks the 16-bit x86 compiler that is necessary to boot Linux out of real mode. For this, it calls out to GCC (the x86_32 and x86_64 compilers are its own).

    It does not have its own assembler and linker; these are the very last bits that Claude started automating and are still somewhat buggy. The demo video was produced with a GCC assembler and linker.

    I wonder why this blog post was brazen enough to talk about these problems. Perhaps by throwing in a little humility, they can make the hype pill that much easier to swallow.

    Sidenote: Rust seems to be the language of choice for a lot of these vibe coded “projects”, perhaps because they don’t want people immediately accusing them of plagiarism. But Rust syntax still reasonably follows languages like C. In most cases, blindly translating C code into Rust kinda works. Now, Rust does have the borrow checker which requires a lot of thinking to deal with, but I think this is not actually a disadvantage for the AI. Borrow checking is enforced by the compiler, so if you screw up in that department, your code won’t even compile. This is great for an AI that is just brute forcing random shit until it “works”.