Tag: gemma
Bibliography items where it occurs: 33
- Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing / 2304.02017 / ISBN:https://doi.org/10.48550/arXiv.2304.02017 / Published by ArXiv / Version released on 2024-08-03 / on (web) Publishing site
- Learning Human-like Representations to Enable Learning Human Values / 2312.14106 / ISBN:https://doi.org/10.48550/arXiv.2312.14106 / Published by ArXiv / Version released on 2024-11-08 / on (web) Publishing site
- The Narrow Depth and Breadth of Corporate Responsible AI Research / 2405.12193 / ISBN:https://doi.org/10.48550/arXiv.2405.12193 / Published by ArXiv / Version released on 2026-01-28 / on (web) Publishing site
- The Future of Child Development in the AI Era. Cross-Disciplinary Perspectives Between AI and Child Development Experts / 2405.19275 / ISBN:https://doi.org/10.48550/arXiv.2405.19275 / Published by ArXiv / Version released on 2024-05-29 / on (web) Publishing site
- How Ethical Should AI Be? How AI Alignment Shapes the Risk Preferences of LLMs / 2406.01168 / ISBN:https://doi.org/10.48550/arXiv.2406.01168 / Published by ArXiv / Version released on 2024-08-01 / on (web) Publishing site
- MoralBench: Moral Evaluation of LLMs / 2406.04428 / Published by ArXiv / Version released on 2025-07-04 / on (web) Publishing site
- The Ethics of Interaction: Mitigating Security Threats in LLMs / 2401.12273 / ISBN:https://doi.org/10.48550/arXiv.2401.12273 / Published by ArXiv / Version released on 2024-07-10 / on (web) Publishing site
- Open Artificial Knowledge / 2407.14371 / ISBN:https://doi.org/10.48550/arXiv.2407.14371 / Published by ArXiv / Version released on 2024-07-19 / on (web) Publishing site
- VersusDebias: Universal Zero-Shot Debiasing for Text-to-Image Models via SLM-Based Prompt Engineering and Generative Adversary / 2407.19524 / ISBN:https://doi.org/10.48550/arXiv.2407.19524 / Published by ArXiv / Version released on 2024-08-16 / on (web) Publishing site
- Improving governance outcomes through AI documentation: Bridging theory and practice / 2409.08960 / ISBN:https://doi.org/10.48550/arXiv.2409.08960 / Published by ArXiv / Version released on 2024-12-09 / on (web) Publishing site
- ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs / 2409.09586 / ISBN:https://doi.org/10.48550/arXiv.2409.09586 / Published by ArXiv / Version released on 2025-11-04 / on (web) Publishing site
- The doctor will polygraph you now: ethical concerns with AI for fact-checking patients / 2408.07896 / ISBN:https://doi.org/10.48550/arXiv.2408.07896 / Published by ArXiv / Version released on 2024-11-11 / on (web) Publishing site
- Large-scale moral machine experiment on large language models / 2411.06790 / ISBN:https://doi.org/10.48550/arXiv.2411.06790 / Published by ArXiv / Version released on 2024-12-30 / on (web) Publishing site
- Bias in Decision-Making for AI's Ethical Dilemmas: A Comparative Study of ChatGPT and Claude / 2501.10484 / ISBN:https://doi.org/10.48550/arXiv.2501.10484 / Published by ArXiv / Version released on 2025-10-30 / on (web) Publishing site
- FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing / 2502.03826 / ISBN:https://doi.org/10.48550/arXiv.2502.03826 / Published by ArXiv / Version released on 2025-08-15 / on (web) Publishing site
- Safety at Scale: A Comprehensive Survey of Large Model and Agent Safety / 2502.05206 / ISBN:https://doi.org/10.48550/arXiv.2502.05206 / Published by ArXiv / Version released on 2025-08-02 / on (web) Publishing site
- On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective / 2502.14296 / ISBN:https://doi.org/10.48550/arXiv.2502.14296 / Published by ArXiv / Version released on 2025-09-30 / on (web) Publishing site
- Fair Foundation Models for Medical Image Analysis: Challenges and Perspectives / 2502.16841 / ISBN:https://doi.org/10.48550/arXiv.2502.16841 / Published by ArXiv / Version released on 2026-01-14 / on (web) Publishing site
- Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models / 2502.18505 / ISBN:https://doi.org/10.48550/arXiv.2502.18505 / Published by ArXiv / Version released on 2025-02-21 / on (web) Publishing site
- MinorBench: A hand-built benchmark for content-based risks for children / 2503.10242 / ISBN:https://doi.org/10.48550/arXiv.2503.10242 / Published by ArXiv / Version released on 2025-03-13 / on (web) Publishing site
- Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs / 2505.02009 / ISBN:https://doi.org/10.48550/arXiv.2505.02009 / Published by ArXiv / Version released on 2025-08-12 / on (web) Publishing site
- Analysing Safety Risks in LLMs Fine-Tuned with Pseudo-Malicious Cyber Security Data / 2505.09974 / ISBN:https://doi.org/10.48550/arXiv.2505.09974 / Published by ArXiv / Version released on 2025-05-15 / on (web) Publishing site
- Are Language Models Consequentialist or Deontological Moral Reasoners? / 2505.21479 / ISBN:https://doi.org/10.48550/arXiv.2505.21479 / Published by ArXiv / Version released on 2025-10-12 / on (web) Publishing site
- Mechanistic Interpretability Needs Philosophy / 2506.18852 / ISBN:https://doi.org/10.48550/arXiv.2506.18852 / Published by ArXiv / Version released on 2025-06-23 / on (web) Publishing site
- Towards the Digital Me: A vision of authentic Conversational Agents powered by personal Human Digital Twins / 2506.23826 / ISBN:https://doi.org/10.48550/arXiv.2506.23826 / Published by ArXiv / Version released on 2025-06-30 / on (web) Publishing site
- Development of management systems using artificial intelligence systems and machine learning methods for boards of directors (preprint, unofficial translation) / 2508.03769 / ISBN:https://doi.org/10.48550/arXiv.2508.03769 / Published by ArXiv / Version released on 2025-08-05 / on (web) Publishing site
- Towards Assessing Medical Ethics from Knowledge to Practice / 2508.05132 / ISBN:https://doi.org/10.48550/arXiv.2508.05132 / Published by ArXiv / Version released on 2025-08-07 / on (web) Publishing site
- Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants / 2508.12754 / ISBN:https://doi.org/10.48550/arXiv.2508.12754 / Published by ArXiv / Version released on 2025-08-18 / on (web) Publishing site
- The Scales of Justitia: A Comprehensive Survey on Safety Evaluation of LLMs / 2506.11094 / ISBN:https://doi.org/10.48550/arXiv.2506.11094 / Published by ArXiv / Version released on 2025-10-30 / on (web) Publishing site
- Enabling Ethical AI: A case study in using Ontological Context for Justified Agentic AI Decisions / 2512.04822 / ISBN:https://doi.org/10.48550/arXiv.2512.04822 / Published by ArXiv / Version released on 2025-12-04 / on (web) Publishing site
- Mind the Gap! Pathways Towards Unifying AI Safety and Ethics Research / 2512.10058 / ISBN:https://doi.org/10.48550/arXiv.2512.10058 / Published by ArXiv / Version released on 2025-12-10 / on (web) Publishing site
- Reliable and Responsible Foundation Models: A Comprehensive Survey / 2602.08145 / ISBN:https://doi.org/10.48550/arXiv.2602.08145 / Published by ArXiv / Version released on 2026-02-04 / on (web) Publishing site
- From experimentation to engagement: on the paradox of participatory AI and power in contexts of forced displacement and humanitarian crises / 2604.06219 / ISBN:https://doi.org/10.48550/arXiv.2604.06219 / Version released on 2026-03-23 / on (web) Publishing site