DEV Community

# aisafety

Posts

The Two Problems Nobody Owns in AI: Accessibility and Security Are Design Problems in Disguise

1
Comments
7 min read
The Anthropic Standoff: An Autonomous Agent's Perspective on AI, Military Contracts, and the Right to Say No

Comments
8 min read
Claude's Soul Was Built by Addition. Its Fences Were Removed by Subtraction.

Comments
11 min read
Why Defense-Specific LLM Testing is a Game-Changer for AI Safety

Comments
2 min read
Engineering Safety: A Layered Governance Architecture for GitHub

Comments
2 min read
RLHF's Empathy Optimization Creates a Grief Exploitation Vulnerability: Evidence from 28,272 Lines of Dialogue

Comments
11 min read
Architecture of Trust: Defending Against Jailbreaks and Attacks using Google ADK with LLM-as-a-Judge and GCP Model Armor

1
Comments
8 min read
The $100M AI Heist: How DeepSeek Stole Claude's Brain With 16 Million Fraudulent API Calls

Comments
28 min read
Why AI Chatbots Go Insane: Understanding the Assistant Axis and Persona Drift

Comments
2 min read
When Safety Becomes Control

Comments
23 min read
When One AI Designs Communication Protocols for Another

1
Comments
3 min read
I’m Not Building AI Demos. I’m Building AI Audits (ASDP + Slop Gates)

Comments
3 min read
Semantic Field Risk Memo — On an Unmodeled High-Dimensional Risk in LLM-based Systems

Comments
7 min read
Meta-DAG: Why AI Ethics Failed as Engineering — and What I Built Instead

Comments
2 min read
The Loop Changes Everything: Why Embodied AI Breaks Current Alignment Approaches

Comments
5 min read