

Proceedings: AAAI Special Track (Safe, Robust and Responsible AI Track)
Conference: 38th AAAI Conference on Artificial Intelligence (AAAI-24), 36th Conference on Innovative Applications of Artificial Intelligence (IAAI-24), 14th Symposium on Educational Advances in Artificial Intelligence (EAAI-24)
Organization: Association for the Advancement of Artificial Intelligence (AAAI)
Conference dates: 20-27 February 2024
Conference location: Vancouver, Canada

Enumerating Safe Regions in Deep Neural Networks with Provable Probabilistic Guarantees. Luca Marzari; Davide Corsi; Enrico Marchesini; Alessandro Farinelli; Ferdinando Cicalese. 2024
Divide-and-Aggregate Learning for Evaluating Performance on Unlabeled Data. Shuyu Miao; Jian Liu; Lin Zheng; Hong Jin. 2024
SentinelLMs: Encrypted Input Adaptation and Fine-Tuning of Language Models for Private and Secure Inference. Abhijit Mishra; Mingda Li; Soham Deo. 2024
Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis. Rohan Mitta; Hosein Hasanbeig; Jun Wang; Daniel Kroening; Yiannis Kantaros; Alessandro Abate. 2024
Feature Unlearning for Pre-trained GANs and VAEs. Saemi Moon; Seunghyuk Cho; Dongwoo Kim. 2024
Reward Certification for Policy Smoothed Reinforcement Learning. Ronghui Mu; Leandro Soriano Marcolino; Yanghao Zhang; Tianle Zhang; Xiaowei Huang; Wenjie Ruan. 2024
EncryIP: A Practical Encryption-Based Framework for Model Intellectual Property Protection. Xin Mu; Yu Wang; Zhengan Huang; Junzuo Lai; Yehong Zhang; Hui Wang; Yue Yu. 2024
Neural Closure Certificates. Alireza Nadali; Vishnu Murali; Ashutosh Trivedi; Majid Zamani. 2024
SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models. Manish Nagireddy; Lamogha Chiazor; Moninder Singh; Ioana Baldini. 2024
MaxEnt Loss: Constrained Maximum Entropy for Calibration under Out-of-Distribution Shift. Dexter Neo; Stefan Winkler; Tsuhan Chen. 2024
ORES: Open-vocabulary Responsible Visual Synthesis. Minheng Ni; Chenfei Wu; Xiaodong Wang; Shengming Yin; Lijuan Wang; Zicheng Liu; Nan Duan. 2024
Q-SENN: Quantized Self-Explaining Neural Networks. Thomas Norrenbrock; Marco Rudolph; Bodo Rosenhahn. 2024
Understanding Likelihood of Normalizing Flow and Image Complexity through the Lens of Out-of-Distribution Detection. Genki Osada; Tsubasa Takahashi; Takashi Nishide. 2024
Adversarial Initialization with Universal Adversarial Perturbation: A New Approach to Fast Adversarial Training. Chao Pan; Qing Li; Xin Yao. 2024
A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs. Mateo Perez; Fabio Somenzi; Ashutosh Trivedi. 2024
Robust Stochastic Graph Generator for Counterfactual Explanations. Mario Alfonso Prado-Romero; Bardh Prenkaj; Giovanni Stilo. 2024
Visual Adversarial Examples Jailbreak Aligned Large Language Models. Xiangyu Qi; Kaixuan Huang; Ashwinee Panda; Peter Henderson; Mengdi Wang; Prateek Mittal. 2024
Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance. Omer Reingold; Judy Hanwen Shen; Aditi Talati. 2024
I-CEE: Tailoring Explanations of Image Classification Models to User Expertise. Yao Rong; Peizhu Qian; Vaibhav Unhelkar; Enkelejda Kasneci. 2024
A Simple and Practical Method for Reducing the Disparate Impact of Differential Privacy. Lucas Rosenblatt; Julia Stoyanovich; Christopher Musco. 2024