ASSET Seminar: “Towards Pluralistic Alignment: Foundations for Learning from Diverse Human Preferences”
Abstract: Large models pre-trained on internet-scale data are often not ready for safe deployment out of the box. They are heavily fine-tuned and aligned using large quantities of human preference data, usually […]