Program synthesis is extremely challenging. Traditional approaches based on constraint solving have scaled only to small programs. In this project, we will explore new directions that use deep learning and deep reinforcement learning, combined with techniques such as formal reasoning and probabilistic programming, to make program synthesis broader and more scalable.
We will consider program synthesis from a broad spectrum of specification methods, including natural language descriptions, input-output examples, programming by demonstration, formal goal descriptions, novelty-based metrics, and other reward functions.
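To make the input-output-example setting concrete, the following is a minimal sketch of enumerative synthesis over a tiny, hypothetical DSL of unary integer functions; the primitives (`inc`, `dec`, `double`, `square`) and the search depth are illustrative assumptions, not part of the project description.

```python
from itertools import product

# Illustrative DSL: four unary integer primitives (an assumption for
# this sketch, not a DSL proposed in the project description).
DSL = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def synthesize(examples, max_depth=3):
    """Return the first composition of DSL primitives (applied left to
    right) consistent with every (input, output) example, searching
    shortest programs first."""
    for depth in range(1, max_depth + 1):
        for names in product(DSL, repeat=depth):
            def run(x, names=names):
                for name in names:
                    x = DSL[name](x)
                return x
            if all(run(i) == o for i, o in examples):
                return list(names)
    return None  # no program within the depth bound fits the examples

# Two examples of f(x) = 2x + 1 suffice to pin down a program here.
print(synthesize([(1, 3), (4, 9)]))
```

Even this toy search illustrates why constraint-based and enumerative methods struggle to scale: the search space grows exponentially with program length, which is precisely where learned guidance is expected to help.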
We will consider a broad spectrum of application domains, including end-user programming, automatic synthesis of security protocols and secure systems such as smart contracts, automatic synthesis of data analytics queries and pipelines, automated assistance for program analysis and verification, and automatic synthesis of distributed systems code.
In this process, beyond enabling real-world applications of program synthesis, we hope to contribute to addressing core challenges in deep learning, including generalization, search, abstraction, and representation. We believe that fully solving the program synthesis problem is equivalent to achieving artificial general intelligence (AGI).
We aim to explore new approaches for automated data exploration and insight extraction that leverage limited guidance and feedback from human analysts. To achieve this, we will explore and combine techniques including deep learning, reinforcement learning, program synthesis, meta-learning, probabilistic programming, and interpretable machine learning. Given a dataset, we will explore how to automate the different stages of the data science pipeline: data wrangling, data cleaning, feature engineering and extraction, model building and architecture search, model criticism and revision, and presentation and interpretation of results.
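As a small illustration of automating the early pipeline stages (cleaning and feature preparation), the sketch below median-imputes missing values and standardizes columns of a toy table; the data and the choice of median imputation are assumptions for illustration only, not methods fixed by the project.

```python
import numpy as np

def clean_and_scale(X):
    """Median-impute missing values per column, then z-score each
    column. A minimal stand-in for the data cleaning and feature
    scaling stages of an automated pipeline."""
    X = np.asarray(X, dtype=float).copy()
    for j in range(X.shape[1]):
        med = np.nanmedian(X[:, j])          # median ignoring NaNs
        X[np.isnan(X[:, j]), j] = med        # fill missing entries
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd[sd == 0] = 1.0                        # guard constant columns
    return (X - mu) / sd

# Toy table with one missing value in each column.
table = np.array([[1.0, 10.0],
                  [np.nan, 12.0],
                  [3.0, np.nan]])
print(clean_and_scale(table))
```

The automation challenge this project targets is choosing such transformations (imputation strategy, scaling, feature construction) per dataset, rather than hard-coding them as done here.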
We will explore diverse application domains, including computer security (e.g., attack and anomaly detection and diagnosis), system monitoring and diagnosis, and trend analysis. Our long-term vision is to build real-world systems that automatically explore, analyze, and learn from data in order to glean insights and facilitate decision making, while leveraging limited human guidance and feedback.
Generalization is a key challenge for deep learning systems. How do we know that a deep learning system, such as a neural program, a robot, or a self-driving car, will behave safely in a new environment and remain secure against attacks such as adversarial perturbation? How do we specify security properties for deep learning systems? How do we test and verify those properties? Is it possible to provide provable guarantees?
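To make the adversarial perturbation threat concrete, here is a hedged sketch of a fast-gradient-sign-style attack on a toy linear classifier; the weight vector, input, and epsilon are illustrative assumptions, and real attacks target deep networks rather than this hand-built model.

```python
import numpy as np

# Toy linear model: classification score is w @ x (an assumption
# for this sketch, standing in for a trained network).
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.2, 0.1, -0.3])   # a "clean" input

# The loss for the predicted class is -score, so its gradient with
# respect to x is -w. One FGSM-style step perturbs each coordinate
# by eps in the direction that increases the loss.
eps = 0.1
x_adv = x + eps * np.sign(-w)

# The perturbation is tiny per coordinate, yet the score drops by
# eps * ||w||_1, the worst case for an L-infinity-bounded attack.
print(w @ x, w @ x_adv)
```

The verification questions above ask, in effect, whether one can prove that no such `x_adv` within an epsilon-ball changes the model's decision, a property that is easy to check for this linear toy but hard for deep networks.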
Security will be one of the biggest challenges in deploying artificial intelligence. Traditional program verification techniques such as symbolic reasoning are effective mainly for logical programs. To reason about the safety and security of artificial intelligence systems, we need to design and develop new techniques and approaches. We plan to take a multi-pronged approach, pursuing a deeper understanding of attacks, defenses, and methods for reasoning about the security of artificial intelligence systems.
We will explore new techniques and approaches, including differential privacy, to enable privacy-preserving machine learning and data analytics in the real world. We aim to design and develop a general framework that automatically analyzes and rewrites data analytics queries to ensure that their results are differentially private. We also plan to explore different approaches to differentially private deep learning. Our goal is both to provide practical, real-world solutions for privacy-preserving machine learning and data analytics and to deepen the theoretical understanding of this area.
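As a minimal sketch of the kind of rewriting such a framework would perform, the Laplace mechanism below makes a counting query differentially private; the dataset, predicate, and epsilon are illustrative assumptions.

```python
import numpy as np

def laplace_count(data, predicate, epsilon):
    """Differentially private count. A counting query has L1
    sensitivity 1 (adding or removing one row changes the count by
    at most 1), so Laplace noise with scale 1/epsilon suffices for
    epsilon-differential privacy."""
    true_count = sum(1 for row in data if predicate(row))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical dataset: how many individuals are 40 or older?
ages = [23, 35, 41, 29, 52, 61]
print(laplace_count(ages, lambda a: a >= 40, epsilon=0.5))
```

A query-rewriting framework of the kind described above would, in effect, replace the raw `COUNT` with this noisy version automatically, choosing the noise scale from the query's sensitivity and the privacy budget.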