AI Lab Watch
The current state of RSPs
This is a reference post.
Nov 4
•
Zach Stein-Perlman
October 2024
What AI companies should do
Some rough ideas
Oct 21
•
Zach Stein-Perlman
Anthropic rewrote its RSP
Some reactions
Oct 16
•
Zach Stein-Perlman
September 2024
Model evals for dangerous capabilities
Testing an LM system for dangerous capabilities is crucial for assessing its risks
Sep 23
•
Zach Stein-Perlman
July 2024
Safety consultations for AI lab employees
Many people who are concerned about AI x-risk work at AI labs, in the hope of doing directly useful work, boosting a relatively responsible lab, or…
Jul 27
•
Zach Stein-Perlman
New page: Integrity
And new-ish page: Policy advocacy
Jul 10
•
Zach Stein-Perlman
June 2024
Anthropic's Certificate of Incorporation
New details on the Long-Term Benefit Trust, but most questions remain
Jun 12
•
Zach Stein-Perlman
May 2024
AI companies' commitments
New page
May 29
•
Zach Stein-Perlman
Maybe Anthropic's Long-Term Benefit Trust is powerless
Anthropic should share the details
May 27
•
Zach Stein-Perlman
AI companies aren't really using external evaluators
But they should
May 24
•
Zach Stein-Perlman
New voluntary commitments (AI Seoul Summit)
Basically the companies commit to make responsible scaling policies. Part of me says this is amazing, the best possible commitment short of all…
May 21
•
Zach Stein-Perlman
DeepMind’s “Frontier Safety Framework” is weak and unambitious
FSF blogpost. Full document (just 6 pages; you should read it). Compare to Anthropic’s RSP, OpenAI’s RSP (“Preparedness Framework”), and METR’s Key…
May 18
•
Zach Stein-Perlman