
Cancelled: CIS Seminar: “The Value Alignment Problem in Artificial Intelligence”

March 18 at 1:30 PM - 2:30 PM

Abstract:

Much of our success in artificial intelligence stems from the adoption of a simple paradigm: specify an objective or goal, and then use optimization algorithms to identify a behavior (or predictor) that optimally achieves this goal. This has been true since the early days of AI (e.g., search algorithms such as A* that aim to find the optimal path to a goal state), and this paradigm is common to AI, statistics, control theory, operations research, and economics. Loosely speaking, the field has evaluated the intelligence of an AI system by how efficiently and effectively it optimizes for its objective. This talk will provide an overview of my thesis work, which proposes and explores the consequences of a simple, but consequential, shift in perspective: we should measure the intelligence of an AI system by its ability to optimize for our objectives.


In an ideal world, these measurements would be the same: all we have to do is write down the correct objective! This is easier said than done. Misalignment between the behavior a system designer actually wants and the behavior incentivized by the reward or loss functions they specify is routine, commonly observed in a wide variety of practical applications, and fundamental, a consequence of limited human cognitive capacity. This talk will build up a formal model of this value alignment problem as a cooperative human-robot interaction: an assistance game of partial information between a human principal and an autonomous agent. It will begin with a discussion of a simple instantiation of this game in which the human designer takes a single action, writing down a proxy objective, and the robot attempts to optimize for the true objective by treating the observed proxy as evidence about the intended goal. Next, I will generalize this model to introduce Cooperative Inverse Reinforcement Learning, a general and formal model of this assistance game, and discuss the design of efficient algorithms to solve it. The talk will conclude with a discussion of directions for further research, including applications to content recommendation and home robotics, the development of reliable and robust design environments for AI objectives, and the theoretical study of AI regulation by society as a value alignment problem with multiple human principals.
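The proxy-as-evidence idea can be sketched as a toy Bayesian inference. Everything below is an illustrative assumption rather than the talk's actual formulation: the candidate objectives, their feature weights, and the Boltzmann-rational model of the designer are all made up for the example.

```python
import math

# Hypothetical candidate "true objectives", each a weight vector over
# two abstract features (values chosen only for illustration).
candidate_objectives = {
    "speed":  (1.0, 0.0),
    "safety": (0.0, 1.0),
    "both":   (0.7, 0.7),
}

# The designer's proxy must be one of the same named objectives.
proxies = list(candidate_objectives.keys())

def proxy_score(true_w, proxy_name):
    """How well a proxy's weights approximate the true weights
    (negative squared error, a stand-in for the designer's judgment)."""
    proxy_w = candidate_objectives[proxy_name]
    return -sum((t - p) ** 2 for t, p in zip(true_w, proxy_w))

def likelihood(proxy_name, true_name, beta=5.0):
    """Boltzmann-rational designer: proxies that better fit the true
    objective are exponentially more likely to be written down."""
    true_w = candidate_objectives[true_name]
    weights = {p: math.exp(beta * proxy_score(true_w, p)) for p in proxies}
    return weights[proxy_name] / sum(weights.values())

def posterior(observed_proxy):
    """Robot's belief over the true objective after observing the
    proxy, under a uniform prior."""
    unnorm = {t: likelihood(observed_proxy, t) for t in candidate_objectives}
    z = sum(unnorm.values())
    return {t: v / z for t, v in unnorm.items()}

belief = posterior("speed")
```

The point of the sketch is that the observed proxy shifts, but does not collapse, the robot's belief: seeing "speed" makes that objective most probable while leaving residual probability on nearby objectives such as "both", so the robot retains uncertainty about what the designer actually intended.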

Dylan Hadfield-Menell

Electrical Engineering and Computer Science, University of California, Berkeley

Dylan is a final-year Ph.D. student at UC Berkeley, advised by Anca Dragan, Pieter Abbeel, and Stuart Russell. His research focuses on the value alignment problem in artificial intelligence. His goal is to design algorithms that learn about and pursue the intended goals of their users, their designers, and society in general. His recent work has focused on algorithms for human-robot interaction with unknown preferences and on reliability engineering for learning systems.

Details

Date:
March 18
Time:
1:30 PM - 2:30 PM
Website:
https://www.cis.upenn.edu/events/

Organizer

Computer and Information Science
Phone:
215-898-8560
Email:
cis-info@cis.upenn.edu
Website:
http://www.cis.upenn.edu

Venue

Wu and Chen Auditorium (Room 101), Levine Hall
3330 Walnut Street
Philadelphia, PA 19104 United States
Website:
https://www.facilities.upenn.edu/maps/locations/levine-hall-melvin-and-claire-weiss-tech-house