🧵Sunday Threads - Microsoft’s “Muse” Could Be A Big Moment for Gaming

Also, can AI really develop its own internal value structure over time?

A few 🧵 worth reading into 🧐 for your Sunday…

Context

  • Muse = the first World and Human Action Model (WHAM)

    • Trained on 1B+ gameplay images

    • Used 7+ YEARS of continuous gameplay data

    • Learned from real Xbox multiplayer matches

  • The graphics are obviously rough right now, but demonstrating that gameplay can be generated in real time (rather than built ahead of time) shifts gaming into a whole new category.

    • For comparison, imagine if you could pick up a book that writes itself with relevant, coherent pages as you turn them… (a toy sketch of that loop follows below)
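
To make the self-writing-book idea concrete, here’s a toy sketch of the autoregressive loop behind a world-and-human-action model: each frame exists only once the player acts. Everything here (the transition table, the state names) is invented for illustration; Muse’s real architecture is a large learned sequence model over frames and controller inputs, not a lookup table.

```python
import random

# Hypothetical "learned" world model: (state, action) -> possible next states.
# A real WHAM replaces this table with a neural sequence model trained on
# billions of gameplay frames and controller inputs.
TRANSITIONS = {
    ("corridor", "forward"): ["arena", "corridor"],
    ("corridor", "jump"):    ["ledge"],
    ("arena",    "forward"): ["corridor"],
    ("arena",    "jump"):    ["arena"],
    ("ledge",    "forward"): ["arena"],
    ("ledge",    "jump"):    ["ledge"],
}

state = "corridor"
for step in range(6):
    action = random.choice(["forward", "jump"])  # stand-in for live player input
    # The next "page" of the game is sampled on demand, conditioned on
    # where the player is and what they just did.
    state = random.choice(TRANSITIONS[(state, action)])
    print(f"step {step}: action={action!r} -> next frame shows {state!r}")
```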

Why care?

The gaming industry seemed to hate AI from the start, but this implies more than just ripping off existing games into AI-generated copies: it could bring about a whole new way of creating digital experiences, one that, if you’ve ever read Ready Player One, could easily become a new paradigm for humanity’s online interactions in general.

Context

  • Apparently, advanced AI models develop internal systems of “value” that quantify anything from human lives to political preferences.

    • These “utilities” not only emerge, but are acted upon.

    • As AIs get smarter, they become more opposed to having their values changed.

  • The researchers propose “Utility Engineering,” which would simulate a citizen assembly internally in the model, using what is essentially a voting process to rewrite the value structure over time (a sketch of how such utilities get measured follows this list).
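
For intuition on how researchers put numbers on those internal “values,” here’s a minimal sketch: ask the model many forced-choice questions (“A or B?”), then fit one latent utility per outcome so that preference rates follow a Bradley-Terry model, P(A beats B) = sigmoid(u_A − u_B). The paper’s exact procedure differs, and the outcomes and preference data below are invented for illustration.

```python
import numpy as np

outcomes = ["outcome_A", "outcome_B", "outcome_C"]   # hypothetical choices
# (winner, loser) index pairs, e.g. harvested from a model's forced choices.
prefs = [(0, 1), (0, 1), (1, 0), (0, 2), (1, 2), (2, 1), (0, 2)]

u = np.zeros(len(outcomes))   # one latent utility per outcome
lr = 0.1
for _ in range(500):          # gradient ascent on the Bradley-Terry log-likelihood
    grad = np.zeros_like(u)
    for w, l in prefs:
        p = 1.0 / (1.0 + np.exp(-(u[w] - u[l])))  # P(winner beats loser)
        grad[w] += 1.0 - p
        grad[l] -= 1.0 - p
    u += lr * grad
    u -= u.mean()             # utilities are only identified up to a constant shift

for name, val in zip(outcomes, u):
    print(f"{name}: fitted utility {val:+.2f}")
```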

Why care?

We knew bias inside AI models was a problem from the start, and the idea that they can exhibit emergent value structures from within implies another layer of difficulty in addressing those issues. Efforts to safeguard AI models will also have to be continuous — similar to how laws change over time to manage societal deviations from an established norm — except AI seems to move and grow and change a bit faster than humanity’s collective consciousness.

Context

  • Reinforcement learning + test-time compute = key to superintelligence

  • Removing human-engineered inference strategies from the training loop produced the biggest jump in quality

  • Introducing verifiable rewards (such as winning a game or getting a problem correct) gives the model an objective signal to keep training against (a toy sketch follows below)
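
Here’s a toy sketch of the verifiable-reward idea: a bare-bones REINFORCE loop where the reward comes from an automatic checker (grading an arithmetic answer) instead of a human rater. The softmax over four candidate answers is a stand-in for a real model’s policy over full reasoning traces; everything below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
candidates = [4, 5, 6, 7]            # possible answers to "2 + 3 = ?"
logits = np.zeros(len(candidates))   # the toy policy's learnable parameters

def verifier(answer: int) -> float:
    """Verifiable reward: 1.0 if provably correct, 0.0 otherwise."""
    return 1.0 if answer == 2 + 3 else 0.0

lr = 0.5
for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    i = rng.choice(len(candidates), p=probs)   # sample an attempt
    reward = verifier(candidates[i])
    # REINFORCE: nudge the policy toward samples the checker rewarded.
    grad = -probs
    grad[i] += 1.0
    logits += lr * reward * grad

probs = np.exp(logits) / np.exp(logits).sum()
print({c: round(float(p), 3) for c, p in zip(candidates, probs)})
```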

Why care?

OpenAI has taken on a Big Tech / Evil Empire sort of feel as of late, so it’s easy to forget that the core of their business is really just advancing superintelligent AI research as quickly as possible. With these findings re: incentives, the human element (/limitation) can effectively be removed, and superintelligence (at least as far as we “limited” beings can perceive it) can begin to emerge.