RobertoLofaro.com - Knowledge Portal - human-generated content
Change, with and without technology - human, AI, scraping readers welcome
for updates on publications, follow: on Instagram, Twitter, Patreon, YouTube, Kaggle metadata



You are now here: AI Ethics Primer - search within the bibliography - version 0.4 of 2023-12-13 > (tag cloud) > tag selected: christiano



Tag: christiano

Bibliography items where this tag occurs: 33
The AI Index 2022 Annual Report / 2205.03468 / ISBN:https://doi.org/10.48550/arXiv.2205.03468 / Published by ArXiv / Version released on 2022-05-02 / on (web) Publishing site


Regulation and NLP (RegNLP): Taming Large Language Models / 2310.05553 / ISBN:https://doi.org/10.48550/arXiv.2310.05553 / Published by ArXiv / Version released on 2023-10-09 / on (web) Publishing site


Survey on AI Ethics: A Socio-technical Perspective / 2311.17228 / ISBN:https://doi.org/10.48550/arXiv.2311.17228 / Published by ArXiv / Version released on 2025-11-04 / on (web) Publishing site


A Survey on Human-AI Collaboration with Large Foundation Models / 2403.04931 / ISBN:https://doi.org/10.48550/arXiv.2403.04931 / Published by ArXiv / Version released on 2025-09-02 / on (web) Publishing site


Frontier AI Ethics: Anticipating and Evaluating the Societal Impacts of Language Model Agents / 2404.06750 / ISBN:https://doi.org/10.48550/arXiv.2404.06750 / Published by ArXiv / Version released on 2024-10-18 / on (web) Publishing site


AI Alignment: A Comprehensive Survey / 2310.19852 / ISBN:https://doi.org/10.48550/arXiv.2310.19852 / Published by ArXiv / Version released on 2025-04-04 / on (web) Publishing site


The Necessity of AI Audit Standards Boards / 2404.13060 / ISBN:https://doi.org/10.48550/arXiv.2404.13060 / Published by ArXiv / Version released on 2024-04-11 / on (web) Publishing site


Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback / 2404.10271 / ISBN:https://doi.org/10.48550/arXiv.2404.10271 / Published by ArXiv / Version released on 2024-06-04 / on (web) Publishing site


Integrating Emotional and Linguistic Models for Ethical Compliance in Large Language Models / 2405.07076 / ISBN:https://doi.org/10.48550/arXiv.2405.07076 / Published by ArXiv / Version released on 2024-05-14 / on (web) Publishing site


AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations / 2406.18346 / ISBN:https://doi.org/10.48550/arXiv.2406.18346 / Published by ArXiv / Version released on 2024-06-26 / on (web) Publishing site


On the Creativity of Large Language Models / 2304.00008 / ISBN:https://doi.org/10.48550/arXiv.2304.00008 / Published by ArXiv / Version released on 2024-09-18 / on (web) Publishing site


Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models / 2410.12880 / ISBN:https://doi.org/10.48550/arXiv.2410.12880 / Published by ArXiv / Version released on 2025-01-24 / on (web) Publishing site


Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems / 2410.13334 / ISBN:https://doi.org/10.48550/arXiv.2410.13334 / Published by ArXiv / Version released on 2024-10-23 / on (web) Publishing site


Hybrid Approaches for Moral Value Alignment in AI Agents: a Manifesto / 2312.01818 / ISBN:https://doi.org/10.48550/arXiv.2312.01818 / Published by ArXiv / Version released on 2025-01-16 / on (web) Publishing site


Prioritization First, Principles Second: An Adaptive Interpretation of Helpful, Honest, and Harmless Principles / 2502.06059 / ISBN:https://doi.org/10.48550/arXiv.2502.06059 / Published by ArXiv / Version released on 2025-10-14 / on (web) Publishing site


Multi-Agent Risks from Advanced AI / 2502.14143 / ISBN:https://doi.org/10.48550/arXiv.2502.14143 / Published by ArXiv / Version released on 2025-02-19 / on (web) Publishing site


DarkBench: Benchmarking Dark Patterns in Large Language Models / 2503.10728 / ISBN:https://doi.org/10.48550/arXiv.2503.10728 / Published by ArXiv / Version released on 2025-03-13 / on (web) Publishing site


Formalising Human-in-the-Loop: Computational Reductions, Failure Modes, and Legal-Moral Responsibility / 2505.10426 / ISBN:https://doi.org/10.48550/arXiv.2505.10426 / Published by ArXiv / Version released on 2025-09-25 / on (web) Publishing site


Wide Reflective Equilibrium in LLM Alignment: Bridging Moral Epistemology and AI Safety / 2506.00415 / ISBN:https://doi.org/10.48550/arXiv.2506.00415 / Published by ArXiv / Version released on 2025-05-31 / on (web) Publishing site


Reversing the Paradigm: Building AI-First Systems with Human Guidance / 2506.12245 / ISBN:https://doi.org/10.48550/arXiv.2506.12245 / Published by ArXiv / Version released on 2025-06-13 / on (web) Publishing site


A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications / 2506.12594 / ISBN:https://doi.org/10.48550/arXiv.2506.12594 / Published by ArXiv / Version released on 2025-06-14 / on (web) Publishing site


Artificial Intelligence Governance for Businesses / 2011.10672 / ISBN:https://doi.org/10.48550/arXiv.2011.10672 / Published by ArXiv / Version released on 2025-07-16 / on (web) Publishing site


The AI Ethical Resonance Hypothesis: The Possibility of Discovering Moral Meta-Patterns in AI Systems / 2507.11552 / ISBN:https://doi.org/10.48550/arXiv.2507.11552 / Published by ArXiv / Version released on 2025-07-13 / on (web) Publishing site


ADEPTS: A Capability Framework for Human-Centered Agent Design / 2507.15885 / ISBN:https://doi.org/10.48550/arXiv.2507.15885 / Published by ArXiv / Version released on 2025-07-18 / on (web) Publishing site


Towards Transparent Ethical AI: A Roadmap for Trustworthy Robotic Systems / 2508.05846 / ISBN:https://doi.org/10.48550/arXiv.2508.05846 / Published by ArXiv / Version released on 2025-08-07 / on (web) Publishing site


The Fair Game: Auditing & Debiasing AI Algorithms Over Time / 2508.06443 / ISBN:https://doi.org/10.48550/arXiv.2508.06443 / Published by ArXiv / Version released on 2025-08-08 / on (web) Publishing site


Between a Rock and a Hard Place: Exploiting Ethical Reasoning to Jailbreak LLMs / 2509.05367 / ISBN:https://doi.org/10.48550/arXiv.2509.05367 / Published by ArXiv / Version released on 2025-09-12 / on (web) Publishing site


AI Governance in Higher Education: A course design exploring regulatory, ethical and practical considerations / 2509.06176 / ISBN:https://doi.org/10.48550/arXiv.2509.06176 / Published by ArXiv / Version released on 2025-09-16 / on (web) Publishing site


ArGen: Auto-Regulation of Generative AI via GRPO and Policy-as-Code / 2509.07006 / ISBN:https://doi.org/10.48550/arXiv.2509.07006 / Published by ArXiv / Version released on 2025-09-06 / on (web) Publishing site


The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships? / 2506.01813 / ISBN:https://doi.org/10.48550/arXiv.2506.01813 / Published by ArXiv / Version released on 2025-09-29 / on (web) Publishing site


Understanding the Process of Human-AI Value Alignment / 2509.13854 / ISBN:https://doi.org/10.48550/arXiv.2509.13854 / Published by ArXiv / Version released on 2025-09-17 / on (web) Publishing site


Fully Autonomous AI Agents Should Not be Developed / 2502.02649 / ISBN:https://doi.org/10.48550/arXiv.2502.02649 / Published by ArXiv / Version released on 2025-10-20 / on (web) Publishing site


AI Alignment vs. AI Ethical Treatment: 10 Challenges / 2510.12844 / ISBN:https://doi.org/10.48550/arXiv.2510.12844 / Published by ArXiv / Version released on 2025-10-14 / on (web) Publishing site