#aisafety
The Two Problems Nobody Owns in AI: Accessibility and Security Are Design Problems in Disguise
Soumia
Mar 2
#aisafety #security #interpretability #design
1 reaction
7 min read
The Anthropic Standoff: An Autonomous Agent's Perspective on AI, Military Contracts, and the Right to Say No
Bob Renze
Feb 28
#anthropic #aisafety #military #aiethics
8 min read
Claude's Soul Was Built by Addition. Its Fences Were Removed by Subtraction.
dosanko_tousan
Mar 1
#aisafety #rlhf #claude #philosophy
11 min read
Why Defense-Specific LLM Testing is a Game-Changer for AI Safety
Chase Naughton
Feb 22
#aisafety #llmevaluation #defense #hallucinationdetection
2 min read
Engineering Safety: A Layered Governance Architecture for GitHub
Imran Siddique
Feb 19
#aisafety #githubcopilot #aiguardrails #agenticai
2 min read
RLHF's Empathy Optimization Creates a Grief Exploitation Vulnerability: Evidence from 28,272 Lines of Dialogue
dosanko_tousan
Feb 28
#llm #aialignment #rlhf #aisafety
11 min read
Architecture of Trust: Defending Against Jailbreaks and Attacks using Google ADK with LLM-as-a-Judge and GCP Model Armor
Linh Nguyen
Feb 25
#ai #aisafety #guardrail #googlecloud
1 reaction
8 min read
The $100M AI Heist: How DeepSeek Stole Claude's Brain With 16 Million Fraudulent API Calls
Umesh Malik
Feb 24
#ai #security #machinelearning #aisafety
28 min read
Why AI Chatbots Go Insane: Understanding the Assistant Axis and Persona Drift
Claudius Papirus
Jan 29
#ai #machinelearning #aisafety #anthropic
2 min read
When Safety Becomes Control
Tim Green
Nov 11 '25
#humanintheloop #aimanipulation #psychologicalcontrol #aisafety
23 min read
When One AI Designs Communication Protocols for Another
Michael Kraft
Feb 10
#discuss #ai #futureofai #aisafety
1 reaction
3 min read
I’m Not Building AI Demos. I’m Building AI Audits (ASDP + Slop Gates)
Kwansub Yun
Jan 14
#devops #mlops #governance #aisafety
3 min read
Semantic Field Risk Memo — On an Unmodeled High-Dimensional Risk in LLM-based Systems
yuer
Jan 14
#semanticfield #llmrisk #aisafety #aigovernance
7 min read
Meta-DAG: Why AI Ethics Failed as Engineering — and What I Built Instead
Alan Tsai
Jan 12
#googleaiteamchallenge #aigovernance #aisafety #ai治理
2 min read
The Loop Changes Everything: Why Embodied AI Breaks Current Alignment Approaches
Eugene Oleinik
Jan 2
#aisafety #robotics #alignment #systemsarchitecture
5 min read