if you need more than one keyword, modify and separate by underscore _
the list of search keywords can be up to 50 characters long
if you modify the keywords, press enter within the field to confirm the new search key
Tag: offensive
Bibliography items where occurs: 83
- From Military to Healthcare: Adopting and Expanding Ethical Principles for Generative Artificial Intelligence / 2308.02448 / ISBN:https://doi.org/10.48550/arXiv.2308.02448 / Published by ArXiv / on (web) Publishing site
- Applications in Military Versus Healthcare
- Getting pwn'd by AI: Penetration Testing with Large Language Models / 2308.00121 / ISBN:https://doi.org/10.48550/arXiv.2308.00121 / Published by ArXiv / on (web) Publishing site
- References
- Building Trust in Conversational AI: A Comprehensive Review and Solution Architecture for Explainable, Privacy-Aware Systems using LLMs and Knowledge Graph / 2308.13534 / ISBN:https://doi.org/10.48550/arXiv.2308.13534 / Published by ArXiv / on (web) Publishing site
- V. Market analysis of LLMs and cross-industry use cases
- The Promise and Peril of Artificial Intelligence -- Violet Teaming Offers a Balanced Path Forward / 2308.14253 / ISBN:https://doi.org/10.48550/arXiv.2308.14253 / Published by ArXiv / on (web) Publishing site
- 4 Integrating red teaming, blue teaming, and ethics with violet teaming
- Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? / 2308.15399 / ISBN:https://doi.org/10.48550/arXiv.2308.15399 / Published by ArXiv / on (web) Publishing site
- References
- Ethical Framework for Harnessing the Power of AI in Healthcare and Beyond / 2309.00064 / ISBN:https://doi.org/10.48550/arXiv.2309.00064 / Published by ArXiv / on (web) Publishing site
- 4 Human-centric AI
- Pathway to Future Symbiotic Creativity / 2209.02388 / ISBN:https://doi.org/10.48550/arXiv.2209.02388 / Published by ArXiv / on (web) Publishing site
- Part 5 - 2 Algorithmics Bias in Art Generation
- The Cambridge Law Corpus: A Corpus for Legal AI Research / 2309.12269 / ISBN:https://doi.org/10.48550/arXiv.2309.12269 / Published by ArXiv / on (web) Publishing site
- Cambridge Law Corpus: Datasheet
- Security Considerations in AI-Robotics: A Survey of Current Methods, Challenges, and Opportunities / 2310.08565 / ISBN:https://doi.org/10.48550/arXiv.2310.08565 / Published by ArXiv / on (web) Publishing site
- IV. Attack Surfaces
- The AI Incident Database as an Educational Tool to Raise Awareness of AI Harms: A Classroom Exploration of Efficacy, Limitations, & Future Improvements / 2310.06269 / ISBN:https://doi.org/10.48550/arXiv.2310.06269 / Published by ArXiv / on (web) Publishing site
- 1 Introduction
- A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics / 2310.05694 / ISBN:https://doi.org/10.48550/arXiv.2310.05694 / Published by ArXiv / on (web) Publishing site
- REFERENCES
- Language Agents for Detecting Implicit Stereotypes in Text-to-Image Models at Scale / 2310.11778 / ISBN:https://doi.org/10.48550/arXiv.2310.11778 / Published by ArXiv / on (web) Publishing site
- 3 Agent Benchmark
Appendix A Data Details - Specific versus General Principles for Constitutional AI / 2310.13798 / ISBN:https://doi.org/10.48550/arXiv.2310.13798 / Published by ArXiv / on (web) Publishing site
- 2 AI feedback on specific problematic AI traits
I Responses on Prompts from PALMS, LaMDA, and InstructGPT - Unpacking the Ethical Value Alignment in Big Models / 2310.17551 / ISBN:https://doi.org/10.48550/arXiv.2310.17551 / Published by ArXiv / on (web) Publishing site
- 2 Risks and Ethical Issues of Big Model
- Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing / 2304.02017 / ISBN:https://doi.org/10.48550/arXiv.2304.02017 / Published by ArXiv / on (web) Publishing site
- 7 Ethical considerations when using ChatGPT
- She had Cobalt Blue Eyes: Prompt Testing to Create Aligned and Sustainable Language Models / 2310.18333 / ISBN:https://doi.org/10.48550/arXiv.2310.18333 / Published by ArXiv / on (web) Publishing site
- 3 ReFLeCT: Robust, Fair, and Safe LLM
Construction Test Suite
- How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities / 2311.09447 / ISBN:https://doi.org/10.48550/arXiv.2311.09447 / Published by ArXiv / on (web) Publishing site
- Ethical Considerations
- Revolutionizing Customer Interactions: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs / 2311.09976 / ISBN:https://doi.org/10.48550/arXiv.2311.09976 / Published by ArXiv / on (web) Publishing site
- 6. Open chanllenges
- Practical Cybersecurity Ethics: Mapping CyBOK to Ethical Concerns / 2311.10165 / ISBN:https://doi.org/10.48550/arXiv.2311.10165 / Published by ArXiv / on (web) Publishing site
- 4 Findings
- Deepfakes, Misinformation, and Disinformation in the Era of Frontier AI, Generative AI, and Large AI Models / 2311.17394 / ISBN:https://doi.org/10.48550/arXiv.2311.17394 / Published by ArXiv / on (web) Publishing site
- References
- Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates / 2312.06861 / ISBN:https://doi.org/10.48550/arXiv.2312.06861 / Published by ArXiv / on (web) Publishing site
- Abstract
...
Data Collection
Study 1: Geo-cultural Differences in Offensiveness
Study 2: Moral Foundations of Offensiveness
Study 3: Implications for Responsible AI
General Discussion
Geo-Cultural Factors
Moral Factors
A Appendix - Improving Task Instructions for Data Annotators: How Clear Rules and Higher Pay Increase Performance in Data Annotation in the AI Economy / 2312.14565 / ISBN:https://doi.org/10.48550/arXiv.2312.14565 / Published by ArXiv / on (web) Publishing site
- II. Theoretical background and hypotheses
- Autonomous Threat Hunting: A Future Paradigm for AI-Driven Threat Intelligence / 2401.00286 / ISBN:https://doi.org/10.48550/arXiv.2401.00286 / Published by ArXiv / on (web) Publishing site
- References
- Synthetic Data in AI: Challenges, Applications, and Ethical Implications / 2401.01629 / ISBN:https://doi.org/10.48550/arXiv.2401.01629 / Published by ArXiv / on (web) Publishing site
- 3. The usage of synthetic data
- Trust and ethical considerations in a multi-modal, explainable AI-driven chatbot tutoring system: The case of collaboratively solving Rubik's CubeĆ / 2402.01760 / ISBN:https://doi.org/10.48550/arXiv.2402.01760 / Published by ArXiv / on (web) Publishing site
- 2. Literature review
References - Commercial AI, Conflict, and Moral Responsibility: A theoretical analysis and practical approach to the moral responsibilities associated with dual-use AI technology / 2402.01762 / ISBN:https://doi.org/10.48550/arXiv.2402.01762 / Published by ArXiv / on (web) Publishing site
- 3 Moral and ethical obligations when developing crossover AI technology
- Inadequacies of Large Language Model Benchmarks in the Era of Generative Artificial Intelligence / 2402.09880 / ISBN:https://doi.org/10.48550/arXiv.2402.09880 / Published by ArXiv / on (web) Publishing site
- V. Processual Elements
- What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents / 2402.13184 / ISBN:https://doi.org/10.48550/arXiv.2402.13184 / Published by ArXiv / on (web) Publishing site
- 3 Model of Civilization Evolution
- A Survey on Human-AI Teaming with Large Pre-Trained Models / 2403.04931 / ISBN:https://doi.org/10.48550/arXiv.2403.04931 / Published by ArXiv / on (web) Publishing site
- References
- How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities / 2311.09447 / ISBN:https://doi.org/10.48550/arXiv.2311.09447 / Published by ArXiv / on (web) Publishing site
- B Baseline Setup
- Review of Generative AI Methods in Cybersecurity / 2403.08701 / ISBN:https://doi.org/10.48550/arXiv.2403.08701 / Published by ArXiv / on (web) Publishing site
- 1 Introduction
3 Cyber Offense
6 Discussion - The Pursuit of Fairness in Artificial Intelligence Models A Survey / 2403.17333 / ISBN:https://doi.org/10.48550/arXiv.2403.17333 / Published by ArXiv / on (web) Publishing site
- 3 Conceptualizing Fairness and Bias in ML
- AI Alignment: A Comprehensive Survey / 2310.19852 / ISBN:https://doi.org/10.48550/arXiv.2310.19852 / Published by ArXiv / on (web) Publishing site
- References
- Taxonomy to Regulation: A (Geo)Political Taxonomy for AI Risks and Regulatory Measures in the EU AI Act / 2404.11476 / ISBN:https://doi.org/10.48550/arXiv.2404.11476 / Published by ArXiv / on (web) Publishing site
- 3 A Geo-Political AI Risk Taxonomy
- Just Like Me: The Role of Opinions and Personal Experiences in The Perception of Explanations in Subjective Decision-Making / 2404.12558 / ISBN:https://doi.org/10.48550/arXiv.2404.12558 / Published by ArXiv / on (web) Publishing site
- 1 Introduction
- Large Language Model Supply Chain: A Research Agenda / 2404.12736 / ISBN:https://doi.org/10.48550/arXiv.2404.12736 / Published by ArXiv / on (web) Publishing site
- References
- Modeling Emotions and Ethics with Large Language Models / 2404.13071 / ISBN:https://doi.org/10.48550/arXiv.2404.13071 / Published by ArXiv / on (web) Publishing site
- 2 Qualifying and Quantifying Emotions
- Not a Swiss Army Knife: Academics' Perceptions of Trade-Offs Around Generative Artificial Intelligence Use / 2405.00995 / ISBN:https://doi.org/10.48550/arXiv.2405.00995 / Published by ArXiv / on (web) Publishing site
- References
- AI-Powered Autonomous Weapons Risk Geopolitical Instability and Threaten AI Research / 2405.01859 / ISBN:https://doi.org/10.48550/arXiv.2405.01859 / Published by ArXiv / on (web) Publishing site
- Abstract
2. Current State of AWS - Should agentic conversational AI change how we think about ethics? Characterising an interactional ethics centred on respect / 2401.09082 / ISBN:https://doi.org/10.48550/arXiv.2401.09082 / Published by ArXiv / on (web) Publishing site
- Introduction
Evaluating a system as a social actor
Social-interactional harms - Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators / 2402.01708 / ISBN:https://doi.org/10.48550/arXiv.2402.01708 / Published by ArXiv / on (web) Publishing site
- 6 Taxonomy of Harms
- A Comprehensive Overview of Large Language Models (LLMs) for Cyber Defences: Opportunities and Directions / 2405.14487 / ISBN:https://doi.org/10.48550/arXiv.2405.14487 / Published by ArXiv / on (web) Publishing site
- VII. Cyber Security Operations Automation
References - How Ethical Should AI Be? How AI Alignment Shapes the Risk Preferences of LLMs / 2406.01168 / ISBN:https://doi.org/10.48550/arXiv.2406.01168 / Published by ArXiv / on (web) Publishing site
- IV. Impact of Alignments on Corporate Investment Forecasts
- Current state of LLM Risks and AI Guardrails / 2406.12934 / ISBN:https://doi.org/10.48550/arXiv.2406.12934 / Published by ArXiv / on (web) Publishing site
- 2 Large Language Model Risks
- Documenting Ethical Considerations in Open Source AI Models / 2406.18071 / ISBN:https://doi.org/10.48550/arXiv.2406.18071 / Published by ArXiv / on (web) Publishing site
- 3 METHODOLOGY AND STUDY DESIGN
4 RESULTS - AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations / 2406.18346 / ISBN:https://doi.org/10.48550/arXiv.2406.18346 / Published by ArXiv / on (web) Publishing site
- 3 Limitations of RLxF
- Bridging the Global Divide in AI Regulation: A Proposal for a Contextual, Coherent, and Commensurable Framework / 2303.11196 / ISBN:https://doi.org/10.48550/arXiv.2303.11196 / Published by ArXiv / on (web) Publishing site
- IV. Proposing an Alternative 3C Framework
- Assurance of AI Systems From a Dependability Perspective / 2407.13948 / ISBN:https://doi.org/10.48550/arXiv.2407.13948 / Published by ArXiv / on (web) Publishing site
- 3 Assurance of AI Systems for Specific Functions
- Open Artificial Knowledge / 2407.14371 / ISBN:https://doi.org/10.48550/arXiv.2407.14371 / Published by ArXiv / on (web) Publishing site
- 2. Key Challenges of Artificial Data
- Honest Computing: Achieving demonstrable data lineage and provenance for driving data and process-sensitive policies / 2407.14390 / ISBN:https://doi.org/10.48550/arXiv.2407.14390 / Published by ArXiv / on (web) Publishing site
- References
- RogueGPT: dis-ethical tuning transforms ChatGPT4 into a Rogue AI in 158 Words / 2407.15009 / ISBN:https://doi.org/10.48550/arXiv.2407.15009 / Published by ArXiv / on (web) Publishing site
- II. Background
- Mapping the individual, social, and biospheric impacts of Foundation Models / 2407.17129 / ISBN:https://doi.org/10.48550/arXiv.2407.17129 / Published by ArXiv / on (web) Publishing site
- References
A Appendix - Surveys Considered Harmful? Reflecting on the Use of Surveys in AI Research, Development, and Governance / 2408.01458 / ISBN:https://doi.org/10.48550/arXiv.2408.01458 / Published by ArXiv / on (web) Publishing site
- References
- Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity / 2408.04023 / ISBN:https://doi.org/10.48550/arXiv.2408.04023 / Published by ArXiv / on (web) Publishing site
- 3. Proposed framework
- The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources / 2406.16746 / ISBN:https://doi.org/10.48550/arXiv.2406.16746 / Published by ArXiv / on (web) Publishing site
- 8 Model Evaluation
- Don't Kill the Baby: The Case for AI in Arbitration / 2408.11608 / ISBN:https://doi.org/10.48550/arXiv.2408.11608 / Published by ArXiv / on (web) Publishing site
- 1. Resistance Against AI Does Not Offer Conclusive Reasons for Outright
Rejection
- CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher / 2408.11650 / ISBN:https://doi.org/10.48550/arXiv.2408.11650 / Published by ArXiv / on (web) Publishing site
- 2. Background and Related Works
4. Experiment Results - Promises and challenges of generative artificial intelligence for human learning / 2408.12143 / ISBN:https://doi.org/10.48550/arXiv.2408.12143 / Published by ArXiv / on (web) Publishing site
- 3 Challenges
- Is Generative AI the Next Tactical Cyber Weapon For Threat Actors? Unforeseen Implications of AI Generated Cyber Attacks / 2408.12806 / ISBN:https://doi.org/10.48550/arXiv.2408.12806 / Published by ArXiv / on (web) Publishing site
- I. Introduction
- LLM generated responses to mitigate the impact of hate speech / 2311.16905 / ISBN:https://doi.org/10.48550/arXiv.2311.16905 / Published by ArXiv / on (web) Publishing site
- 2 Related Work
References
G Model Answers Analysis - Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models / 2401.16727 / ISBN:https://doi.org/10.48550/arXiv.2401.16727 / Published by ArXiv / on (web) Publishing site
- 1 Introduction
2 Hate Speech
3 Methodology
5 Future Directions
References - Navigating LLM Ethics: Advancements, Challenges, and Future Directions / 2406.18841 / ISBN:https://doi.org/10.48550/arXiv.2406.18841 / Published by ArXiv / on (web) Publishing site
- IV. Findings and Resultant Themes
- Reporting Non-Consensual Intimate Media: An Audit Study of Deepfakes / 2409.12138 / ISBN:https://doi.org/10.48550/arXiv.2409.12138 / Published by ArXiv / on (web) Publishing site
- References
- XTRUST: On the Multilingual Trustworthiness of Large Language Models / 2409.15762 / ISBN:https://doi.org/10.48550/arXiv.2409.15762 / Published by ArXiv / on (web) Publishing site
- 4 Experiments
- Responsible AI in Open Ecosystems: Reconciling Innovation with Risk Assessment and Disclosure / 2409.19104 / ISBN:https://doi.org/10.48550/arXiv.2409.19104 / Published by ArXiv / on (web) Publishing site
- References
- DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life / 2410.02683 / ISBN:https://doi.org/10.48550/arXiv.2410.02683 / Published by ArXiv / on (web) Publishing site
- 6 Conclusion
Appendices - Trust or Bust: Ensuring Trustworthiness in Autonomous Weapon Systems / 2410.10284 / ISBN:https://doi.org/10.48550/arXiv.2410.10284 / Published by ArXiv / on (web) Publishing site
- V. Opportunities of AWS
- Data Defenses Against Large Language Models / 2410.13138 / ISBN:https://doi.org/10.48550/arXiv.2410.13138 / Published by ArXiv / on (web) Publishing site
- References
- Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems / 2410.13334 / ISBN:https://doi.org/10.48550/arXiv.2410.13334 / Published by ArXiv / on (web) Publishing site
- Abstract
- Confrontation or Acceptance: Understanding Novice Visual Artists' Perception towards AI-assisted Art Creation / 2410.14925 / ISBN:https://doi.org/10.48550/arXiv.2410.14925 / Published by ArXiv / on (web) Publishing site
- 10 Ethical Considerations
- Jailbreaking and Mitigation of Vulnerabilities in Large Language Models / 2410.15236 / ISBN:https://doi.org/10.48550/arXiv.2410.15236 / Published by ArXiv / on (web) Publishing site
- II. Background and Concepts
- Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements / 2410.17141 / ISBN:https://doi.org/10.48550/arXiv.2410.17141 / Published by ArXiv / on (web) Publishing site
- References
- My Replika Cheated on Me and She Liked It: A Taxonomy of Algorithmic Harms in Human-AI Relationships / 2410.20130 / ISBN:https://doi.org/10.48550/arXiv.2410.20130 / Published by ArXiv / on (web) Publishing site
- 4 Results
5 Discussion - Examining Human-AI Collaboration for Co-Writing Constructive Comments Online / 2411.03295 / ISBN:https://doi.org/10.48550/arXiv.2411.03295 / Published by ArXiv / on (web) Publishing site
- 1 Introduction
5 Discussion
References - Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models / 2410.12880 / ISBN:https://doi.org/10.48550/arXiv.2410.12880 / Published by ArXiv / on (web) Publishing site
- Appendices
- Programming with AI: Evaluating ChatGPT, Gemini, AlphaCode, and GitHub Copilot for Programmers / 2411.09224 / ISBN:https://doi.org/10.48550/arXiv.2411.09224 / Published by ArXiv / on (web) Publishing site
- 3 Transformer Architecture
- GPT versus Humans: Uncovering Ethical Concerns in Conversational Generative AI-empowered Multi-Robot Systems / 2411.14009 / ISBN:https://doi.org/10.48550/arXiv.2411.14009 / Published by ArXiv / on (web) Publishing site
- 2 Background
- Privacy-Preserving Video Anomaly Detection: A Survey / 2411.14565 / ISBN:https://doi.org/10.48550/arXiv.2411.14565 / Published by ArXiv / on (web) Publishing site
- I. Introduction
- AI-Augmented Ethical Hacking: A Practical Examination of Manual Exploitation and Privilege Escalation in Linux Environments / 2411.17539 / ISBN:https://doi.org/10.48550/arXiv.2411.17539 / Published by ArXiv / on (web) Publishing site
- 6 Discussion: Benefits, Risks and Limitations
7 Related Work - Examining Multimodal Gender and Content Bias in ChatGPT-4o / 2411.19140 / ISBN:https://doi.org/10.48550/arXiv.2411.19140 / Published by ArXiv / on (web) Publishing site
- 1. Introduction
- From Principles to Practice: A Deep Dive into AI Ethics and Regulations / 2412.04683 / ISBN:https://doi.org/10.48550/arXiv.2412.04683 / Published by ArXiv / on (web) Publishing site
- 3 How to design regulation-compliant systems: the
synergies and conflicts
- Research Integrity and GenAI: A Systematic Analysis of Ethical Challenges Across Research Phases / 2412.10134 / ISBN:https://doi.org/10.48550/arXiv.2412.10134 / Published by ArXiv / on (web) Publishing site
- Research Phases and AI Tools
- Large Language Model Safety: A Holistic Survey / 2412.17686 / ISBN:https://doi.org/10.48550/arXiv.2412.17686 / Published by ArXiv / on (web) Publishing site
- 1 Introduction
3 Value Misalignment
9 Technology Roadmaps / Strategies to LLM Safety in Practice
References - INFELM: In-depth Fairness Evaluation of Large Text-To-Image Models / 2501.01973 / ISBN:https://doi.org/10.48550/arXiv.2501.01973 / Published by ArXiv / on (web) Publishing site
- Abstract