3. MEET THE PLAYERS | Value Diversity
Last Chapter: OVERVIEW | What to Expect From This Game
In the last chapter, we suggested that civilization is an inherited game shaped by those before us, and that we must choose among possible games to pass on to our future selves and generations. In this chapter, we explore the question of what constitutes good play. It is a more philosophical chapter than the remainder of the book, but we hope it allows you to appreciate why we suggest voluntary cooperation as a strategy to play this game.
Improving One’s Game: Striving for Coherence But Built to Be Conflicted
What does it mean to play our civilization game well? A scroll through Twitter shows answers differ widely. No wonder, according to a social intuitionist model for human values. Our actions are mostly based on intuitions, such as a sudden revulsion at an action. Only when prompted to explain them do we invent rationalizations for them. When it comes to reasoned judgment about right and wrong, we resemble “a lawyer trying to build a case rather than a judge searching for the truth.”1
Our intuitions are shaped by factors outside of our control. Evolution built us to care about things because that caring structure had survival benefits. So here we are, creatures that care about things and that have ethical reactions to each other. We are also creatures that try to abstractly theorize about those reactions. Our intuitions as to what is better and worse often conflict, and so do our theories. How do we go forward from here?
Perhaps the tendency to create abstract rationalizations about our intuitions isn’t all that bad. We are endowed with a caring structure but also with the ability to reflect on it. At least from an individual problem-solving perspective, this ability can help us create an overall narrative aligning our various wants now with those of our future selves. We react to situations and others as infants, before we can reason about our reactions. But as we wonder about the reasons for these reactions, gain more experience, and live through times of conflict and growth, many of our intuitions will come up for revision.
We could think of ethics as our internal negotiation between which intuitions we want to hold onto as values and which we want to dismiss as biases. We revise what turns out to be futile or incompatible with the rest. In this process, nothing is sacred; our intuitions and even our beliefs change with multiple consistent equilibria among them. Nevertheless, with continuous adjustment of our default caring structure, the emerging set may help us move more towards who we want to be.
We may never fully “solve” our internal conflicts, but that may be just as well. Marvin Minsky suggested that, while we think of humans as entities with single agency, our minds are built to be conflicted. According to his multiple self view, our minds consist of many internal agents, each having simpler preferences. Our adaptive intelligence arises from these agents keeping each other in check.
Robin Hanson observes that “if your mood changes every month, and if you die in any month where your mood turns to suicide, then to live 83 years you need to have one thousand months in a row where your mood doesn’t turn to suicide.” Thanks to our internal division, even if parts of our mind went suicidal at times, the others are there to keep them in check.2 So the next time we beat ourselves up when part of us wants this, part of us wants that, we may take solace that having some internal conflict, rather than perfect alignment, may be more a feature than a bug.3
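The arithmetic behind this observation can be made concrete with a toy calculation. The sketch below is not Hanson's or Minsky's model. The monthly risk values, and the assumption that a month is fatal only if every one of several independent internal agents turns at once, are purely illustrative.

```python
MONTHS = 1000  # roughly 83 years, the figure used in the quote

def survival_probability(monthly_risk: float, months: int = MONTHS) -> float:
    """Chance of getting through every month when each month is fatal with probability monthly_risk."""
    return (1.0 - monthly_risk) ** months

def divided_mind_risk(agent_risk: float, agents: int) -> float:
    """Hypothetical assumption: a month is fatal only if all independent agents turn at the same time."""
    return agent_risk ** agents

# Illustrative numbers only: each agent "turns" in 3% of months.
single = survival_probability(0.03)
divided = survival_probability(divided_mind_risk(0.03, agents=3))
print(f"single agent: {single:.2e}, three agents in check: {divided:.3f}")
```

Under these assumed numbers, a mind with a single mood almost never survives a thousand months, while a mind whose parts must all agree before acting usually does. The exact figures matter less than the shape of the comparison.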
Coherence and Conflict
While some of our reflection occurs rather opaquely to us, explicit models can sometimes help us gain insight and make better choices. One such model is John Rawls’ Reflective Equilibrium.4
It has three steps:
Intuitions: Explore your intuitions for a number of situations.
Abstraction: Abstract provisional rules of thumb that account for those intuitions.
Reflection: When these abstractions don’t reliably recommend actions you find intuitive or when you encounter objections to them, revise either your intuitions or abstractions until you achieve a new equilibrium.
Let’s see this process at work using the Trolley Dilemma to illustrate our striving for coherence, and superlongevity to illuminate our internal conflict.
Striving for Coherence: The Trolley Dilemma
Imagine you see a trolley heading in your direction. The driver is slumped over the controls, unconscious. In its path on the tracks in front are five people chatting, oblivious, soon to be mown down. They are all going to die, but you can save them. You’re standing by the switch and if you simply pull a lever, you can divert the out-of-control trolley onto a different set of tracks where only one person is standing.
There is no time to warn anyone. Do you act and save five lives at the cost of only one? If your initial instincts are that the right answer is yes, you may want to think again.5 Imagine that, instead of pulling a lever you have to push a bystander in front of the trolley to stop it from hitting the five people. Would you push? Using the Reflective Equilibrium, you may reason as follows:
Intuitions: I feel like redirecting the trolley with the switch, even if it kills one and saves five, is okay but pushing the man into the threat would be wrong.
Abstraction: I can explain my willingness to switch with a utilitarian value theory. According to act utilitarianism, I ought to choose the action that maximizes a utility measure, such as happiness. All else equal, more lives saved means more utility. My aversion to pushing the man is well encapsulated with a deontological rule, such as the Kantian categorical imperative, which prohibits actions that fail to respect the individual’s autonomy. The pushed person would be used as a clear means to stop the threat, which fails to respect his autonomy and is thus impermissible.
Reflection: So far, so simple. But as I dive into the vast literature on the Trolley Dilemma, the following inner Socratic dialogue ensues:
Pro Utilitarianism: Some neuroscientists object that my application of the Kantian rule in the push case is actually just a post-rationalization of a dubious intuition. The physical act of pushing the man may activate an emotive cue against up-front personal harm which evolved in small hunter-gatherer groups of the past and is not triggered by seeing a switch. Experiments show that an aversion to pushing was linked to brain areas associated with emotional processing while a willingness to switch was linked to brain areas associated with rational decision-making. Emotionally laden cues should be morally irrelevant if I could rationally do more good otherwise. Ergo: I should revise my intuition against pushing the man in favor of utilitarian answers to both cases.6
Pro Deontology: Hold on! Perhaps those premises are shakier than I think. Some psychologists suggest that emotions have potential hidden value for rational decision-making.7 Somatic markers—emotional reactions with a somatic component based upon previous similar experiences—can serve as action heuristics that prescreen choice options so I might act in time to save the five people without taking the time to have this internal dialogue. My aversion against pushing may be an intuitive application of such a heuristic that encapsulates valuable information. For instance, having the reliable expectation that in my society we don’t push people into threats could be a good heuristic to move me closer to the trusting society I want to live in.
Pro Deontology as Rule Utilitarianism: Not so fast! If I praise deontology for its value as a heuristic rather than for its intrinsic rightness, am I actually a rule-utilitarian who merely uses the maxim as a short-cut for maximizing overall utility? For instance, Scott Alexander suggests that “when Kant says not to act on maxims that would be self-defeating if universalized, what he means is ‘don’t do things that undermine the possibility to offer positive-sum bargains.’”8 Instead of calculating the utility for every action, perhaps I follow Kantian heuristics as a rule to maximize total long-term utility?
Pro Deontology as Cooperative Strategy: Not necessarily! That I can sometimes be modeled as a utility-maximizing agent in iterated game play doesn’t automatically make me an aggregative utilitarian. It is tempting to jump from the homo economicus of game theory to anthropomorphizing society as an agent whose utility can be maximized. But this would be a fallacy. Extrapolating from individuals to an overall mystical body, composed of our preferences and having a utility of its own, ignores the separateness of persons. When I object to pushing people into threats, it is not necessarily because it serves the greater good long-term but because not upsetting the expectation of reciprocal altruism gives me a society in which I can better achieve my own goals. What does this mean for my Trolley answers?
… The Socratic dialogue continues …
Whether or not you find this particular dialogue convincing, it is one example of what it could look like to seek a more coherent caring structure.
Built to be Conflicted: Superlongevity
Having trained using the notorious Trolley Dilemma, you are ready for a more radical thought experiment: superlongevity. Those wanting a very long life with extreme personal growth may have to re-evaluate what it means to be alive. Your reflection on the desire for growth versus identity continuity might go like this:
Intuitions: If faced with the possibility of a very long life, I would be excited to grow into an incomprehensibly larger cognition than what I am now. I care about my future cognition pursuing a great variety of goals, creating greater adaptive complexity, problem-solving ability, and intelligence. But I would only want to pursue growth if it is actually me who is growing. What I care about now has changed a lot since my childhood. If I compare my far-future self to my current self, it is hard to imagine it will care about the same things. Becoming something incomprehensibly great is also incomprehensibly different. Is this still me?
Abstraction: I need a theory of what it means to be me. The most obvious one is "similarity identity": maintaining similarity of aspects at the core of my current identity.
Reflection: The theory of similarity identity clearly conflicts with my desire to grow. Factors at the core of my current identity may change. If I want to hold onto my desire for growth, I need to revisit what it means to be myself.
Against Similarity Identity: Choosing similarity as a defining factor of identity may be a trap. That choice developed unchallenged by new technological possibilities. My caring to stay alive is the result of an evolutionary process that offered far fewer choices and philosophical challenges. To my ancestors, what staying alive meant was pretty unambiguous. It was continuing my corporeal body’s future timeline. While there were many choices as to what to do to stay alive, there was an unambiguous interpretation of what it meant to be alive. I am now in a position to rethink my drive to stay alive, so it is better co-adapted to my other desires, such as personal growth.
Toward Credit Theory of Identity: Can I come up with a new theory of identity that feels intuitive? How about a “credit theory of identity”: It is unclear to what extent Ancient Greek civilization died and I am part of a new civilization or to what extent I am still part of the Ancient Greek civilization. Civilizations are mushy and without clear boundaries. Similarly, once the intimacy of knowledge between brains can be close to the intimacy of knowledge within a brain, personal identities may be more fluid. I can say the Greek civilization is still alive in me, in that I think back on it fondly and credit it for having been a crucial aspect of what I grew up to become. Projecting forward, I can similarly imagine an incomprehensibly greater future version of myself who looks back on me fondly as the beginning that grew into it. In line with my new theory of identity, I seek to transform my desire for survival into a desire to grow into something… Evolution created me with a drive for survival, for the obvious selective reasons. “Similarity” and “credit” are two alternative interpretive choices that I can then make. Though one is more obvious, neither is a truer interpretation of survival or longevity than the other. It is up to me to shape my interpretation. I choose the credit theory, to shape my overall caring structure to be more internally aligned.
Pro Conflict: Wait, am I sure that, if technologically possible, I really want to intervene in my mind and “repair” parts that conflict with others? I should probably remember Minsky’s warning; when some parts of my mind remove conflicting parts, I risk destroying the richness from my mind’s natural conflict.
Conflict as Coherence: Wait, have I just eliminated such potentially valuable internal conflict myself by reaching coherence on the fact that I want to remain conflicted?
… The Socratic dialogue continues …
Again, whether or not you agree with the details, this is one way of updating a caring structure.
Playing With Others: Epistemic Humility and Open Minds
We just saw that, as individuals, we often have little insight into our caring structure. Even if we manage to make progress towards a more coherent whole, we should expect residual internal conflict. Currently, 7 billion other players are already playing the game of civilization. They all start with different ethical intuitions and abstractions translating into different caring structures. With imperfect insight and conflict regarding our own caring structures, epistemic humility is advised when addressing theirs.
Currently, other humans are still similar enough that we can sometimes model and shape them. Many of our socially rich interactions rely heavily on this cognitive similarity: We react to people and can use our reactions to them to model their reactions to us. By predicting how others will judge us, we learn to judge ourselves. Adam Smith calls this model the impartial spectator, which continually asks: “if I were in your shoes, seeing me doing what I'm doing, how would I react to me?”9
According to Vernon Smith, this impartial spectator is a good shortcut to explain rich human cooperative behaviors that are anomalies with respect to the game theory of that particular interaction.10 We don’t always cheat, even if we could, and we sometimes punish cheating to no personal benefit. For instance, experiments by Dan Ariely suggest our dishonesty budget when no one is looking is determined by how much dishonesty we can allow ourselves while still maintaining a basically honest self-image.11 That’s because rather than wanting to act so as to gain praise and avoid blame, our impartial spectator makes us want to act in a praiseworthy manner, even if no one sees it.
Knowing about each other’s impartial spectator, and how it is shaped by others' reactions, generates a rich social account of values: We not only react to the culture around us, but influence everyone else's values by providing an inspirational example, and by praising and supporting projects and people we admire. All of us, by leading what we think of as good lives, help to form the overall evolution of values that will outlive us.12
This is only possible as long as we are similar enough. Just as our ancestors would be shocked by our levels of tolerance, our descendants’ worlds will seem very strange indeed to us and even their fellows.13 According to Robin Hanson, rates of social change have sped up with increased growth, competition, and technological change, so we should also expect accelerating value drift over time.14 With people living longer lives, our descendants may increasingly live with more fellow descendants that are very different to them. As the menu of environments to explore, experiences to have, and biological changes to make expands, our ethical intuitions and abstractions may increasingly diverge. The more diverse our civilization, the less we may be able to meaningfully model others and their reactions to us.
Why Value Diversity is Here to Stay
Is so much epistemic humility really necessary for interpersonal approaches to values? Earlier, we suggested that individuals can personally strive toward more coherence across their ethical intuitions and abstractions. Surely there must be a few core intuitions or abstractions that we can all agree to? Let’s see why we shouldn’t count on it.
Abstractions: The Moral Philosophy Trenches
The long history of rigorous ethical debate invites skepticism about converging on one value theory anytime soon. Kant’s Categorical Imperative and act utilitarianism will serve as placeholders for a complex philosophical theory landscape.
Kantians’ disagreement on how to interpret the Categorical Imperative is well illustrated in the case of lying. Kant recommends that, to determine if an action is permissible, we formulate it into a maxim and then check whether the maxim treats humanity as an end in itself and could apply equally to every person without being self-defeating. Kant thinks lying is prohibited because it robs us of the autonomy to rationally decide and because, if everyone lied, no one would believe anyone. In short, the maxim “telling a lie” both fails to treat humans as ends in themselves and fails to be universalizable.
The Categorical Imperative as a method to generate rules is often praised for better handling rule-ambiguity than other rule-based value theories.15 Nevertheless, while some Kantians side with Kant that the maxim to be universalized is “telling a lie”, others disagree. They suggest more fine-grained maxims such as “lying in situation x” to allow lying to save an innocent friend from murder. Different people have different intuitions as to which situational aspects are morally relevant for rule construction.
Similar disagreements hold across utilitarians. They tend to agree that actions ought to maximize aggregate utility but disagree on how, when, and for whom to calculate ‘utilities’. Do non-human animals count? What about future, as yet unborn, generations? How much do they count?16 Are we trying to maximize pleasure? Or just minimize suffering? Should we cast a wider net instead to consider happiness, preference-satisfaction, virtue, or other definitions of the good life?17
Depending on one’s answers, one may soon run into Robert Nozick’s utility monsters; people “who get enormously greater sums of utility from any sacrifice of others than these others lose.”18 Alternatively, Derek Parfit’s mere addition paradox awaits, in which a large population with low positive utility may be better than fewer happy people as long as the final utility comes out higher.19 Each specification of utilitarian theories comes with different costs that different people with different intuitions will trade off differently. These cherry-picked disagreements ignore disagreements across Kantians and Utilitarians. They don't even begin to address other value theories.
If we can’t agree on the same value theories, can we at least agree on a few general principles to govern our collective lives that we endorse for different reasons? This may only postpone the problem, such that we end up disagreeing on theories for handling disagreement about value theories. For instance, John Rawls proposes there are some principles we should all consent to from behind a Veil of Ignorance. Such a veil would prevent humans “from knowing their own particular moral beliefs or the position they will occupy in society”.20 But getting global consensus on which factors one can take behind the Veil of Ignorance is tricky. Do you know when you live? Which species you are? Strip too much of your caring structure away and you can’t say what you want. Strip away too little, and you get a highly individualized answer. Different people have different intuitions as to what can and cannot be ignored.
With so much room for interpretation within theories, we should rely even less on picking just one theory to guide our civilizational game. The problem is not that there is no reasonable choice, but that there is more than one. A preference for one theory, or even an interpretation, depends on the same idiosyncratic factors that led to our differences in the first place.
Intuitions: Evolved to Value Differently
One may hope that even if we cannot agree on the abstraction level, our intuitions should track the same underlying foundations. After all, humans share an evolutionary context and a similar social environment. For instance, Scott Alexander notes that in a society we all need to “decide how to act, what to do, what behaviors to incentivize, what behaviors to punish, what signals to send.” So even if our intuitions “crystallize” differently on the surface, can we reverse engineer them to a common human morality?21
To see how tricky it is to distill a meaningful common core from different individual intuitions, let’s revisit the Trolley dilemma. Suppose you revise your intuition against pushing the man into the threat because you now think such evolutionary-influenced intuitions should be irrelevant if you can otherwise benefit more people. Then suppose you learn that you prefer your family over creatures removed in space and time because this was evolutionarily adaptive. You decide to stop caring for your family more than strangers.
Can you convince others to follow you and flatten their caring structure? They may think caring structures evolved to guide action, and it is unhelpful to care about entities we cannot help. For those who continue with a caring structure after reflecting on it, that may be part of the valuing core. As long as you hold onto some evolutionary intuitions as values, you may want to think twice before judging those who make different choices.
If you learn that human love for nature evolved as a mere heuristic for finding food and water, are you prepared to cut down all forests to replace them with utility-maximizing objects?22 Different creatures come to different decisions about what they regard as bias and what, upon reflection, they hold onto as value. Ultimately, even “utilitarian” intuitions may be debunked as shortcuts for having other people help us when needed, i.e. for reciprocal altruism.
Can we at least agree on some generic strategies such as reciprocal altruism?23 Marco Del Giudice suggests that from a comfortable environment, understanding the advantages of pro-social trusting strategies is easy because we expect to play many rounds of games with each other. In a much harsher environment, a more short-term survival strategy might be more adaptive because there may not be time to benefit from cooperation in future games.24 Should you convince those for whom cooperating gets them killed to switch strategies? How?
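The link between how many rounds we expect to play and whether cooperation pays can be sketched with a standard repeated prisoner's dilemma. This is a textbook illustration, not Del Giudice's model. The payoff numbers, the name `delta` for the chance that another round follows, and the assumption that one defection ends cooperation for good are all assumptions made for the example.

```python
# Illustrative prisoner's dilemma payoffs (assumed, not from the text).
TEMPTATION = 5.0   # payoff for defecting against a cooperator
REWARD = 3.0       # payoff per round when both cooperate
PUNISHMENT = 1.0   # payoff per round when both defect

def value_of_cooperating(delta: float) -> float:
    """Expected total payoff from cooperating forever; delta in [0, 1) is the chance of another round."""
    return REWARD / (1.0 - delta)

def value_of_defecting(delta: float) -> float:
    """Expected total payoff from defecting once, then facing mutual defection in every later round."""
    return TEMPTATION + delta * PUNISHMENT / (1.0 - delta)

def cooperation_pays(delta: float) -> bool:
    """True when sustained cooperation is worth at least as much as a one-time defection."""
    return value_of_cooperating(delta) >= value_of_defecting(delta)

for delta in (0.2, 0.9):
    print(f"chance of another round {delta}: cooperation pays = {cooperation_pays(delta)}")
```

With these assumed payoffs, a player in a comfortable environment who expects many future rounds does better by cooperating, while a player who expects the game to end soon does better by defecting. Neither is making a mistake by their own lights.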
There may be nowhere outside of humans’ individual idiosyncratic circumstances from which we could recommend one universal strategy. Even if there were, it would be very difficult to reliably communicate this to others who are the product of their own environments. Without much hope of reaching interpersonal value agreement, either on the level of abstractions or on the level of underlying intuitions, value diversity is here to stay.
Improving the Playing Field: Voluntary Cooperation
As we get more diverse, the Silver Rule may provide a good practical heuristic for our interactions for a while. If the Golden Rule is “Do unto others as you would have them do unto you,” the Silver Rule is “Do not do unto others what you would not have them do unto you.” Currently, our models of what we want done to us may sometimes still work to figure out what others would like to have done unto them. In an increasingly diverse world, the Silver Rule is more appropriate as an epistemically humble heuristic. Figuring out how to avoid harming others is difficult enough without also trying to act on their behalf.
Heuristics can be useful but if we want a robust civilizational architecture across the next few rounds of play, we need more reliable frameworks for engagement with different cognitive architectures. Even if we don’t have to worry about meeting alien minds tomorrow, we are actively creating mind-architectures very different from us.25 Robin Hanson explores a potential future economy in which humans create human-brain emulations. These have reduced inclinations for art, sex, and parenting, can change speeds by changing hardware, and create temporary copies of themselves.26
This scenario at least assumes humans as a source, but we are also making remarkable progress via software and hardware in AIs that function nothing like the human brain. While current “neural-network” architectures at best crudely mirror some of our brain’s functionality, it is naive to suppose they will mirror human caring structures for long.27
Future minds may not exhibit much of what we call “values” at all, but could be better characterized as “goal-seeking” entities. Nevertheless, as long as they have goals and act as though they make choices, they will have revealed preferences. With less and less comprehension of other players' values, those revealed preferences may be all we have when designing systems for different players to reach their goals. But they may also be all we need to set up a playing field that allows for good games, as judged by each player.
With diverse values that increasingly drift apart, we can’t rely on aligning players on one grand strategy. Instead, to set civilization up well over the next rounds of play, we must build a playing field that can handle fundamental value differences.
Imagine a game of civilization played by Alice and Bob. A world in which different players have different goals can be described in terms of preferences among future states. The center dot is the current state of the world that players Alice and Bob are in. The axes are the world states, organized by Alice's preferences vertically and by Bob's preferences horizontally. Bob prefers the green worlds to the current world.
Positive Sum & Negative Sum
If we could extrapolate utilities from Alice's preferences and Bob’s preferences, we could say their interactions can lead to outcomes that have greater overall utility or smaller overall utility. Meaningfully comparing utilities across players will become more problematic the more diverse their futures get. But for now, let’s assume everything to the upper right of the red line is a “positive sum” outcome, and everything to the lower left is a “negative sum” outcome.
There is a problem with simply seeking positive sum outcomes. If an outcome would leave Bob worse off than he currently is, he would fight any attempt to get there. Likewise, Alice would fight the positive sum outcomes she likes less than the status quo. But if Alice and Bob would each be at least as well off as they currently are, both have good reason to cooperate. Together, they can move to Pareto-preferred worlds. Situation B is Pareto-preferred to situation A if at least one player prefers B to A, and no one prefers A to B. Those worlds can be reached by voluntary cooperation. For human players, we could say these interactions are “freely” consented to; for non-human players, they are simply based on their “internal logic”.
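The difference between a positive sum outcome and a Pareto-preferred one can be sketched in a few lines. The sketch assumes, only for illustration, that each player's preferences can be written as a number per world and that those numbers can be added across players. The world names and values are hypothetical.

```python
from typing import Dict

World = Dict[str, float]  # player name -> how much that player values this world (assumed numeric)

def is_positive_sum(new: World, current: World) -> bool:
    """True if the summed value across players is higher in the new world than in the current one."""
    return sum(new.values()) > sum(current.values())

def is_pareto_preferred(new: World, current: World) -> bool:
    """True if no player is worse off in the new world and at least one player is better off."""
    no_one_worse = all(new[player] >= current[player] for player in current)
    someone_better = any(new[player] > current[player] for player in current)
    return no_one_worse and someone_better

status_quo = {"Alice": 0.0, "Bob": 0.0}
lopsided = {"Alice": 5.0, "Bob": -2.0}   # higher total, but Bob loses
mutual = {"Alice": 3.0, "Bob": 1.0}      # both gain

for name, world in (("lopsided", lopsided), ("mutual", mutual)):
    print(name, is_positive_sum(world, status_quo), is_pareto_preferred(world, status_quo))
```

The lopsided world is positive sum yet not Pareto-preferred, so Bob has reason to resist it. The mutual world passes both checks, which is why it can be reached by voluntary cooperation.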
Cooperation Across Humans
Human similarities also come with the tendency to compare oneself to others, including strong fairness intuitions and envy reactions. If Alice’s gain is perceived as too unfair, only she would be invested in bringing about that future, even if, all else equal, Bob would have consented to the deal. The all-too-human tendency to compare ourselves to others may lead Bob to reject a Pareto-preferred deal. It narrows the scope of what the world’s human players can achieve by voluntary cooperation.28
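One way to see how envy can block a Pareto-preferred deal is a simple inequity aversion model in the style of Fehr and Schmidt. This model is brought in here as an illustration and is not taken from the text. The `envy` and `guilt` weights and the gains in the example are assumed values.

```python
def felt_utility(own_gain: float, other_gain: float,
                 envy: float = 0.5, guilt: float = 0.25) -> float:
    """Own gain, reduced by a penalty for being behind (envy) or ahead (guilt) of the other player."""
    behind = max(other_gain - own_gain, 0.0)
    ahead = max(own_gain - other_gain, 0.0)
    return own_gain - envy * behind - guilt * ahead

def accepts_deal(own_gain: float, other_gain: float,
                 envy: float = 0.5, guilt: float = 0.25) -> bool:
    """A player consents only if the deal feels at least as good as the status quo (felt utility 0)."""
    return felt_utility(own_gain, other_gain, envy, guilt) >= 0.0

# Bob gains 1 in both deals, so each is Pareto-preferred to the status quo in material terms.
print(accepts_deal(own_gain=1.0, other_gain=2.0))             # Alice gains a little more than Bob
print(accepts_deal(own_gain=1.0, other_gain=10.0))            # Alice gains far more than Bob
print(accepts_deal(own_gain=1.0, other_gain=10.0, envy=0.0))  # same deal for a player with no envy
```

In this sketch, a Bob with some envy accepts the roughly even deal and rejects the very uneven one, although both leave him materially better off. A player without the comparison term accepts both.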
Cooperation Across Intelligences
Traditionally, the definition of an agent with utility assumes a comparability that future intelligent systems don't necessarily have going forward. Without meaningful metrics on which to compare utility across very different mind architectures, the diagonal red line, indicating positive and negative sum, disappears.
As long as players have goals and act as though they make choices, they will have revealed preferences. Those revealed preferences may be all we have when designing systems for players to reach their goals.29 Upholding voluntary cooperation could remain a stable common goal for both Alice and Bob across many rounds of future games, regardless of their intelligence. It’s all they need to unlock Pareto-preferred worlds that are better for each by their standards. The rest of this book is about how to set civilization up for this path of intelligent voluntary cooperation.
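Pareto-preferred worlds can be identified from revealed preferences alone, without any utility numbers that are comparable across players. The sketch below assumes, hypothetically, that each player's choices reveal a strict ranking of a handful of named worlds. The world names and rankings are invented for illustration.

```python
from typing import Dict, List, Set

# Each player's revealed ranking, from most preferred to least preferred (hypothetical data).
rankings: Dict[str, List[str]] = {
    "Alice": ["w3", "w1", "now", "w2"],
    "Bob": ["w1", "w3", "w2", "now"],
}

def prefers(ranking: List[str], a: str, b: str) -> bool:
    """True if this ranking places world a strictly above world b."""
    return ranking.index(a) < ranking.index(b)

def pareto_preferred_worlds(current: str, rankings: Dict[str, List[str]]) -> Set[str]:
    """Worlds that at least one player prefers to the current one and no player ranks below it."""
    candidates = set(next(iter(rankings.values()))) - {current}
    result = set()
    for world in candidates:
        no_one_objects = not any(prefers(r, current, world) for r in rankings.values())
        someone_gains = any(prefers(r, world, current) for r in rankings.values())
        if no_one_objects and someone_gains:
            result.add(world)
    return result

print(sorted(pareto_preferred_worlds("now", rankings)))
```

Nothing in this check adds or compares Alice's and Bob's preferences with each other. Each ranking is only compared with itself, which is why the approach still works when the diagonal line separating positive and negative sum is no longer meaningful.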
Values across players differ. In the future, they may drift further apart. Voluntarism enables independent pursuit of goals, so it should be attractive from a variety of perspectives. In addition to establishing peaceful co-existence, we also want to amplify cooperation for mutual benefit. Combining voluntarism at the base with an aspiration for increased cooperation results in voluntary cooperation. If we learn how to cooperate not just with humans but also with other intelligences, we can steepen our Paretotropian ascent.
Next up: SKIM THE MANUAL | Voluntarism
Jonathan Haidt, “The Emotional Dog and Its Rational Tail: A Social Intuitionist Approach to Moral Judgment.,” Psychological Review 108, no. 4 (2001): pp. 814-834, https://doi.org/10.1037/0033-295x.108.4.814, 814.
Robin Hanson, “World Government Risks Collective Suicide,” Overcoming Bias, accessed March 14, 2022, https://www.overcomingbias.com/2018/11/world-government-risks-collective-suicide.html. The article asks you to imagine a fictional world in which you died in any one month that you had a suicidal thought. In such a world, living eighty-three years would require a thousand months in a row in which your mood does not turn to suicide.
Marvin Lee Minsky and Juliana Lee, The Society of Mind, 1st ed. (New York: Simon & Schuster Paperbacks, 1988).
John Rawls, A Theory of Justice (Cambridge, Massachusetts: The Belknap Press of Harvard University Press, 1999).
Judith Jarvis Thomson, “Killing, Letting Die, and the Trolley Problem,” Monist 59, no. 2 (1976): pp. 204-217, https://doi.org/10.5840/monist197659224. Judith Thomson popularized this ingenious philosophical conundrum in the 1976 article, elaborating on the work of English philosopher Philippa Foot in “The Problem of Abortion and the Doctrine of the Double Effect.”
Joshua D. Greene et al., “The Neural Bases of Cognitive Conflict and Control in Moral Judgment,” Neuron 44, no. 2 (2004): pp. 389-400, https://doi.org/10.1016/j.neuron.2004.09.027. In this study, participants worked through Trolley Problems while in fMRI brain scanners, which showed an aversion to pushing being associated with amygdala activation in the brain.
Antonio R. Damasio, Looking for Spinoza: Joy, Sorrow and the Feeling Brain (London: Vintage, 2004).
Scott Alexander, “You Kant Dismiss Universalizability,” Slate Star Codex, July 22, 2020, https://slatestarcodex.com/2014/05/16/you-kant-dismiss-universalizability/. Scott Alexander suggests that when Kant says not to act on maxims that would be self-defeating if universalized, he essentially means ‘do not do things that undermine the possibility to offer positive sum bargains.’
Adam Smith and Edwin George West, The Theory of Moral Sentiments (New Rochelle (New York): Arlington House, 1969), 19.
Vernon Smith and James Otteson, “Will the Real Adam Smith Please Stand Up?” recorded 2015, on EconTalk, accessed 2022, https://www.econtalk.org/vernon-smith-and-james-otteson-on-adam-smith/
Dan Ariely, The (Honest) Truth About Dishonesty (New York: Harper Collins, 2012). Daniel Ariely presents behavioral probes of honesty levels in his book by tempting individuals to be dishonest for personal gain when they are observed versus when they think they are unobserved. In both cases, subjects allow themselves a dishonesty budget, which is only slightly larger when they think they are unobserved.
Russell D. Roberts, How Adam Smith Can Change Your Life: An Unexpected Guide to Human Nature and Happiness (New York: Portfolio/Penguin, 2015). Russ Roberts illustrates how this translates into our everyday lives. This social account of value resembles Aristotle’s Virtue Ethics, which focuses on the individual’s development of a stable moral character guided by such virtues as prudence, courage, and temperance.
Ted Chiang, “Catching Crumbs from the Table,” Nature 405, no. 6786 (2000): p. 517, https://doi.org/10.1038/35014679. As a fictional illustration of how much drift is possible with technology, Chiang’s story describes humans developing an embryonic gene therapy that forks their descendants into humans and metahumans. The two groups have little in common apart from technologies passed down from metahumans to humans.
Robin Hanson, “On Value Drift,” Overcoming Bias, February 21, 2018, https://www.overcomingbias.com/2018/02/on-value-drift.html.
Isaac Asimov, I, Robot (Gnome Press, 1950). For an example of conflict in rule-based theories, Isaac Asimov develops four laws of robotics as a story device: 1. Protection of humans, 2. Non-malfeasance to humans, 3. Obedience to human command, and 4. Self-preservation. To avoid conflict when rules recommend different actions, Asimov ranked the rules, so rule 1 trumps rule 2, rule 2 trumps rule 3, and so forth. The rules’ broad formulation still leads to conflicts, such as in “The Evitable Conflict,” where AIs seek to follow the First Law by taking control of humanity, an action that would be impermissible under the Categorical Imperative because it fails to respect human autonomy.
Nick Bostrom, “Astronomical Waste,” 2003, Nickbostrom.com, https://nickbostrom.com/astronomical/waste.html. Future humans will care about their utility but current humans may not care to the same extent. Nick Bostrom suggests that, given a few assumptions about technological progress, energy use, and human-brain emulations, every second we delay colonization of our local supercluster loses about 10^29 potential human lives. One may disagree about the accuracy of this estimate, but the vastness of our potential future makes it difficult to trade off the utility of current humans against that of future humans.
Nick Bostrom, “Infinite Ethics,” Analysis and Metaphysics, Vol 10 (2011): pp. 9-59. Amanda Askell, “Pareto Principles in Infinite Ethics,” a dissertation for the Doctor of Philosophy, NYU (2018), https://askell.io/files/Askell-PhD-Thesis.pdf. Bostrom and Askell eloquently show how an “infinite number of sad and happy people” might pose problems for aggregative theories (and other moral theories). Anders Sandberg and David Manheim, “What is the Upper Limit of Value?” Philpapers.org (2021), https://philpapers.org/archive/MANWIT-6.pdf. Anders Sandberg and David Manheim argue for the alternative view that “the morally relevant universe is finite”.
Robert Nozick, Anarchy, State, and Utopia (Basic Books, 1974).
Derek Parfit, Reasons and Persons (Oxford University Press, 1984).
Iason Gabriel, “Artificial Intelligence, Values, and Alignment,” Minds and Machines, Vol 30 (2020): pp. 411-437. Gabriel explores the Veil of Ignorance, Human Rights, and Democracy as possible strategies for reaching “global consensus” on handling reasonable value pluralism. The very fact that there is more than one choice for handling value pluralism, each with room for interpretation, means that preferences for when to use which interpretation can diverge. For instance, the fact that many of us participate in non-democratic systems in our everyday lives suggests that there can be reasonable disagreement as to when democratic systems are and aren’t appropriate. We slide from reasonable pluralism of values into reasonable pluralism of theories for handling reasonable value pluralism.
Scott Alexander, “Value Differences as Differently Crystallized Metaphysical Heuristics,” Slate Star Codex, July 22, 2020, https://slatestarcodex.com/2018/07/24/value-differences-as-differently-crystallized-metaphysical-heuristics/.
Alexander, “Value Differences as Differently Crystallized Metaphysical Heuristics.”
Greene et al., “Neural Bases of Cognitive Conflict,” 389.
Marco Del Giudice, Evolutionary Psychopathology: A Unified Approach (New York, NY: Oxford University Press, 2018).
Robin Hanson, “How Far to Grabby Aliens?” OvercomingBias.com, December 2020, https://www.overcomingbias.com/2020/12/how-far-aggressive-aliens.html. Hanson explains why we may expect to meet grabby aliens in roughly half a billion years. Eliezer Yudkowsky, “Three Worlds Collide,” LessWrong, January 30, 2009, https://www.lesswrong.com/posts/HawFh7RvDM4RyoJ2d/three-worlds-collide-0-8. Yudkowsky explores how such an encounter may further shake our understanding of value. The aliens we encounter in his story, while otherwise seemingly benign, eat their own babies because this was evolutionarily adaptive for them. Our moral outrage and high ground last only until we encounter another alien civilization to whom human-evolved customs are morally atrocious.
Robin Hanson, The Age of Em (Oxford University Press, 2018).
Robin Hanson, “On Value Drift,” Overcoming Bias, February 21, 2018, https://www.overcomingbias.com/2018/02/on-value-drift.html. Robin Hanson notes that we may not have to worry much anytime soon about additional value drift induced by non-em artificial intelligences, because they will take on social roles near humans and roles we once occupied. But then again, Hanson already assumes great value drift across humans.
In the Ultimatum Game, two players must divide up a pie. The first player proposes a split, and the second player can either accept the share left to them or reject the offer, in which case neither player gets anything. If player one takes almost all of the pie and leaves very little to the second player, accepting still makes both players better off than rejecting, so the system as a whole would still move to a Pareto-preferred outcome. But people often reject the offer at that point, saying the distribution is too unfair.
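The payoff logic of the Ultimatum Game can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the pie of 100 units, and the 30 percent fairness threshold of the responder are assumptions made for the example, not figures taken from any study.

```python
def ultimatum_payoffs(pie, offer, accepted):
    """Return (proposer, responder) payoffs for one round.

    `offer` is the amount the proposer leaves to the responder.
    If the responder rejects, neither player gets anything.
    """
    if not 0 <= offer <= pie:
        raise ValueError("offer must lie between 0 and the size of the pie")
    if accepted:
        return (pie - offer, offer)
    return (0, 0)


def is_pareto_preferred(outcome, baseline):
    """True if no player is worse off than in `baseline` and at least one is better off."""
    no_one_worse = all(a >= b for a, b in zip(outcome, baseline))
    someone_better = any(a > b for a, b in zip(outcome, baseline))
    return no_one_worse and someone_better


def fairness_minded_responder(pie, offer, min_percent=30):
    """Hypothetical responder who accepts only offers of at least `min_percent` of the pie."""
    return offer * 100 >= min_percent * pie


# Assumed example values: a pie of 100 units and a lowball offer of 1 unit.
pie = 100
lowball = 1

accept_outcome = ultimatum_payoffs(pie, lowball, accepted=True)
reject_outcome = ultimatum_payoffs(pie, lowball, accepted=False)

# Accepting the lowball offer leaves both players better off than rejecting it...
lowball_is_pareto_preferred = is_pareto_preferred(accept_outcome, reject_outcome)

# ...yet a responder with a fairness threshold turns it down anyway.
responder_accepts = fairness_minded_responder(pie, lowball)
```

With these assumed numbers, accepting the lowball offer is Pareto-preferred to rejecting it, while the fairness-minded responder still rejects, which is the tension the note describes.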
Envy in such a world will be less coherent. Nevertheless, the green cone in the middle is still the sweet spot for what Alice and Bob can achieve by making deals. This is because Bob can still reason about what seems to be Alice's preference ranking and hold out for a better deal.