Computerphile
Constraining AI Agents

2025 • E49    4 Dec 2025    21 min
As AI systems become more capable, rule-based safeguards, hard-coded restrictions, and simple alignment strategies start to break down. Buck Shlegeris talks about some tactics we might use, as detailed in a recent paper.
