ScenicNL: Generating Probabilistic Scenario Programs from Natural Language

Karim Elmaaroufi, Devan Shanker, Ana Cismaru, Marcell Vazquez-Chanlatte, Alberto L. Sangiovanni-Vincentelli, Matei Zaharia, and Sanjit A. Seshia. ScenicNL: Generating Probabilistic Scenario Programs from Natural Language. In Conference on Language Models (COLM), 2024.

Abstract

For cyber-physical systems (CPS), including robotics and autonomous vehicles, mass deployment has been hindered by fatal errors that occur when operating in rare events. To replicate rare events such as vehicle crashes, many companies have created logging systems and employed crash reconstruction experts to meticulously recreate these valuable events in simulation. However, in these methods, "what if" questions are not easily formulated and answered. We present ScenarioNL, an AI System for creating scenario programs from natural language. Specifically, we generate these programs from police crash reports. Reports normally contain uncertainty about the exact details of the incidents which we represent through a Probabilistic Programming Language (PPL), Scenic. By using Scenic, we can clearly and concisely represent uncertainty and variation over CPS behaviors, properties, and interactions. We demonstrate how commonplace prompting techniques with the best Large Language Models (LLM) are incapable of reasoning about probabilistic scenario programs and generating code for low-resource languages such as Scenic. Our system is comprised of several LLMs chained together with several kinds of prompting strategies, a compiler, and a simulator. We evaluate our system on publicly available autonomous vehicle crash reports in California from the last five years and share insights into how we generate code that is both semantically meaningful and syntactically correct.

BibTeX

@inproceedings{elmaaroufi-colm24,
  author       = {Karim Elmaaroufi and
                  Devan Shanker and
                  Ana Cismaru and
                  Marcell Vazquez{-}Chanlatte and
                  Alberto L. Sangiovanni{-}Vincentelli and
                  Matei Zaharia and
                  Sanjit A. Seshia},
  title        = {{ScenicNL}: Generating Probabilistic Scenario Programs from Natural Language},  
  booktitle = {Conference on Language Models (COLM)},
  OPTpages        = {331--351},
  year         = {2024},
  abstract  = {For cyber-physical systems (CPS), including robotics and autonomous vehicles, mass deployment has been hindered by fatal errors that occur
when operating in rare events. To replicate rare events such as vehicle crashes, many companies have created logging systems and employed
crash reconstruction experts to meticulously recreate these valuable events in simulation. However, in these methods, ``what if'' questions are not
easily formulated and answered. We present ScenarioNL, an AI System for creating scenario programs from natural language. Specifically, we
generate these programs from police crash reports. Reports normally contain uncertainty about the exact details of the incidents which we represent
through a Probabilistic Programming Language (PPL), Scenic. By using Scenic, we can clearly and concisely represent uncertainty and variation
over CPS behaviors, properties, and interactions. We demonstrate how commonplace prompting techniques with the best Large Language Models
(LLM) are incapable of reasoning about probabilistic scenario programs and generating code for low-resource languages such as Scenic. Our
system is comprised of several LLMs chained together with several kinds of prompting strategies, a compiler, and a simulator. We evaluate our system
on publicly available autonomous vehicle crash reports in California from the last five years and share insights into how we generate code that is both
semantically meaningful and syntactically correct.},
}
