AI Lab Watch
The current state of RSPs
This is a reference post.
Nov 4, 2024
•
Zach Stein-Perlman
October 2024
What AI companies should do
Some rough ideas
Oct 21, 2024
•
Zach Stein-Perlman
Anthropic rewrote its RSP
Some reactions
Oct 16, 2024
•
Zach Stein-Perlman
September 2024
Model evals for dangerous capabilities
Testing an LM system for dangerous capabilities is crucial for assessing its risks
Sep 23, 2024
•
Zach Stein-Perlman
July 2024
Safety consultations for AI lab employees
Many people who are concerned about AI x-risk work at AI labs, in the hope of doing directly useful work, boosting a relatively responsible lab, or…
Jul 27, 2024
•
Zach Stein-Perlman
New page: Integrity
And new-ish page: Policy advocacy
Jul 10, 2024
•
Zach Stein-Perlman
June 2024
Anthropic's Certificate of Incorporation
New details on the Long-Term Benefit Trust, but most questions remain
Jun 12, 2024
•
Zach Stein-Perlman
May 2024
AI companies' commitments
New page
May 29, 2024
•
Zach Stein-Perlman
Maybe Anthropic's Long-Term Benefit Trust is powerless
Anthropic should share the details
May 27, 2024
•
Zach Stein-Perlman
AI companies aren't really using external evaluators
But they should
May 24, 2024
•
Zach Stein-Perlman
New voluntary commitments (AI Seoul Summit)
Basically the companies commit to make responsible scaling policies. Part of me says this is amazing, the best possible commitment short of all…
May 21, 2024
•
Zach Stein-Perlman
DeepMind’s “Frontier Safety Framework” is weak and unambitious
FSF blogpost. Full document (just 6 pages; you should read it). Compare to Anthropic’s RSP, OpenAI’s RSP (“Preparedness Framework”), and METR’s Key…
May 18, 2024
•
Zach Stein-Perlman