Abstract
This talk introduces Self-Proving models, a new class of models that formally prove the correctness of their outputs via an Interactive Proof system. After reviewing some related literature, I will formally define Self-Proving models and their per-input (worst-case) guarantees. I will then present algorithms for learning these models and explain how the complexity of the proof system affects the complexity of the learning algorithms. Finally, I will show experiments in which Self-Proving models are trained to compute the Greatest Common Divisor of two integers and to prove the correctness of their results to a simple verifier. No prior knowledge of autoregressive models or Interactive Proofs is assumed. This is joint work with Noga Amit, Shafi Goldwasser, and Guy Rothblum.
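To make the idea of a "simple verifier" for the Greatest Common Divisor concrete, here is a minimal sketch in Python. It assumes the proof is a pair of Bézout coefficients; the abstract does not specify the certificate, so this choice, along with the names `extended_gcd` and `verify_gcd`, is illustrative only. The sketch is also non-interactive, whereas the talk concerns Interactive Proofs more generally, and the classical extended Euclidean algorithm stands in for a trained model as the honest prover.

```python
def extended_gcd(a, b):
    """Stand-in honest prover (not a learned model): returns (g, u, v)
    with u*a + v*b == g == gcd(a, b) for positive integers a and b."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def verify_gcd(a, b, g, u, v):
    """Hypothetical verifier: accepts only if g is positive, divides both
    inputs, and equals the integer combination u*a + v*b.

    If g divides a and b, then g divides gcd(a, b); if g = u*a + v*b,
    then gcd(a, b) divides g. Together these force g == gcd(a, b).
    """
    return g > 0 and a % g == 0 and b % g == 0 and u * a + v * b == g


g, u, v = extended_gcd(48, 18)
print(verify_gcd(48, 18, g, u, v))   # an honest answer and proof are accepted
print(verify_gcd(48, 18, 3, 0, 0))   # a wrong answer is rejected
```

The verifier performs only a few multiplications and divisions, so checking an answer is cheaper than computing it from scratch, and no choice of coefficients can make it accept a wrong value of `g`. That soundness property holds for every input, which mirrors the per-input guarantee mentioned above.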
Date
Apr 1, 2025 12:00 PM
Event
University of Warwick; University of Oxford; Cambridge University; Google DeepMind; École Polytechnique Fédérale de Lausanne (EPFL); Institut de Recherche en Informatique Fondamentale (IRIF); Zuse Institute Berlin (ZIB); Massachusetts Institute of Technology (MIT); Harvard University; Yale University; the Alignment, Trust, Watermarking, and Copyright Issues in LLMs workshop at the Simons Institute for the Theory of Computing; and the CS Theory Seminar at the University of California, Berkeley
- University of Warwick: January 14th, 2025.
- University of Oxford: January 13th, 2025.
- Cambridge University: January 9th, 2025.
- Google DeepMind: January 8th, 2025.
- École Polytechnique Fédérale de Lausanne (EPFL): December 17th, 2024.
- Institut de Recherche en Informatique Fondamentale (IRIF): December 10th, 2024.
- Zuse Institute Berlin (ZIB): December 4th, 2024.
- Massachusetts Institute of Technology (MIT): November 20th, 2024.
- Harvard University: November 18th, 2024.
- Yale University: November 14th, 2024.
- Alignment, Trust, Watermarking, and Copyright Issues in LLMs workshop at the Simons Institute for the Theory of Computing: October 15th, 2024.
- CS Theory Seminar at the University of California, Berkeley: September 11th, 2024.