This project explores the nature and future of AI agency and personhood, and their impact on our human sense of what it means to be a person.
This project has four strands:
Models of agency
By drawing on work about familiar kinds of non-human agents (eg corporations or nations), we may learn useful lessons about the range of possible machine agents, and the respects in which they resemble and differ from human agents.
Tools or persons?
Will AIs always be ‘tools’, or will some be ‘persons’, as philosophers say, with interests and rights of their own? This is a long-term issue, but it may have shorter-term implications (eg for safety strategies). Mere tools might be easier to control, but machines developed as moral agents might do a better job of looking after our interests, as well as their own.
AI and human personhood
Even if AIs remain tools, what will their capabilities do to our own sense of personhood? Issues here range from short-term questions about the future of work, to longer-term concerns about the importance of agency to human wellbeing.
Goal-stability in self-improving AI
Could an AI advanced enough to alter its own software be reliably predicted not to change its own basic goals? This is an important long-term safety issue, linked to the Value Alignment project (ie any solution to the value alignment problem needs to be reliably stable in this way). It is also connected with philosophical issues about the ability of deliberating agents to predict their own actions as they deliberate.