A Month Has Passed
What's Next?
I have spent the last month doing 70+ hours per week of AI safety work, whether that be learning or working on projects. I have a much better model of the landscape, of what the types of technical work are, and even some hints as to what it might be like to work in some of these areas. I published writeups on the two larger projects I have been working on1, which right now feels like a big release of tension but also leaves me feeling much emptier - my brain is no longer constantly thinking through these projects. But now I have a big question of what to do next! I don't have a plan for the next project I want to work on.
If my current belief is that the highest-impact thing for me to do is to take the technical understanding I have built2 and apply it to the collective action problem of how to slow down the speed of AI progress, then it's probably time to start at least vaguely considering how to move in that direction3 (and even what the options are in this area). I am extremely practiced at skilling up in a technical field, but this seems much more nebulous (which is probably why I haven't started yet - I don't want to do the thing that feels much less tractable/intuitive to me). There's also still lots more understanding to be had on the technical side, especially as I continue to try to understand the core issues of the various problem stacks4, some of which I certainly haven't considered at all. I am somewhat worried about losing steam as I5 make this transition away from the extremely fun, easy-to-see-progress problem of learning the various technical problems related to AI safety.
Not that there isn't still clear work to be had. I still have a real backlog of papers I want to read, and I will certainly want to write a blog retrospective on the experience of posting my first substantive projects to LessWrong6. But at some point I am going to have to start making decisions: which parts of the AI safety ecosystem do I want to go deeper into and actually specialize in? This is scary, especially since I have changed my mind at least once per week on what I think the best specialization is.
I feel like most of my posts have had a takeaway. I guess this one is me claiming that decisions are hard, that deciding when/how to decide is hard, and that I feel like I should start thinking about decisions. This does make me understand a little bit more why so many people end up at Anthropic - it's super legible, it has tons of job openings, it's easy to feel like you are working for the good guy7, and it pays well. If I were down to work there, I wouldn't have to take this next hard step of figuring out what the best decision actually is and how to fucking get there8.
Claude really thinks this blog post is weak9; I'm publishing it anyway. Maybe that's a mistake? It's hard to tell what the right quality bar should be - at least it's short. Please feel free to let me know if you didn't like this one.
On both LessWrong and the EA Forum, but so far LessWrong seems significantly more active than the EA Forum, which I find very interesting given that I generally associate the EA Forum with more social power-seeking and LessWrong with more nerdiness - I would have thought the social power seekers would engage more with social technology (forums).
And will certainly continue to build on
both from a skills perspective but also a clearer theory of change perspective
I started with mech interp, then thinking about model red-teaming, most recently security
slowly, maybe
A forum for rationalists and AI safety people; it feeds directly into the "AI Alignment Forum", which is more exclusively focused on AI safety.
I basically don’t think that’s true at this point
Not to mention, whatever I decide “the best decision” ends up being will probably be both less status-ful and pay less than Anthropic.
This is what Claude said. I certainly agree with a good number of its takes, but not enough to write more about it.
