Is AI Progress Accelerating?

  • OpenAI’s new o3 system – trained on the ARC-AGI-1 Public Training set – has scored a breakthrough 75.7% on the Semi-Private Evaluation set at our stated public leaderboard $10k compute limit. A high-compute (172x) o3 configuration scored 87.5%.
  • This is a surprising and important step-function increase in AI capabilities, showing novel task adaptation ability never seen before in the GPT-family models. For context, ARC-AGI-1 took 4 years to go from 0% with GPT-3 in 2020 to 5% in 2024 with GPT-4o. All intuition about AI capabilities will need to get updated for o3.
  • Source.

Another 52 Things List

  • AI produces fewer greenhouse gas emissions than humans! Humans emit 27g of CO2 in the time it takes to write three hundred words. ChatGPT, however, performs the same task in 4.4 seconds and produces only 2.2g of CO2. 
  • On average, spouses in the United States have genetic similarity equivalent to that between 4th and 5th cousins.
  • People know whether or not they want to buy a house in just 27 minutes, but it takes 88 minutes to decide on a couch.
  • Full list here.

Peak Obesity?

  • The press has been infatuated with the latest CDC mean estimate of obesity prevalence in the US ticking down (here and here).
  • What they miss is this line from the CDC – “Changes in the prevalence of obesity and severe obesity between the two most recent survey cycles, 2017–March 2020 and August 2021–August 2023, were not significant.”
  • In contrast, severe obesity rates continue to climb and this result is significant.
  • Source.

52 Things I Learned, 2024 Edition

  • The list returns.
  • People whose surnames start with U, V, W, X, Y or Z tend to get grades 0.6% lower than people with A-to-E surnames. Modern learning management systems sort papers alphabetically before they’re marked, so those at the bottom are always seen last, by tired, grumpy markers. A few teachers flip the default setting and mark Z to A, and their results are reversed.
  • Ozempic seems to be changing the secondhand clothes market, creating a surge in plus-size women’s apparel sales. Size 3XL listings have doubled over the last two years.

On the Cover of Science

  • Evo is a genomic foundation model that enables prediction and generation tasks from the molecular to genome scale. Using an architecture based on advances in deep signal processing, Evo scales to 7 billion parameters with a context length of 131 kilobases at single-nucleotide resolution. Evo captures two fundamental aspects of biology—the multimodality of the central dogma and the multiscale nature of evolution.
  • Source.