"exploitation is a basic property of human society"
May 31, 2019 10:42 AM   Subscribe

One of the great workhorses in game theory is the prisoner’s dilemma [Wikipedia]. But an as-yet-unanswered question is how exploitation evolved in society. Today, we get an answer thanks to the work of Yuma Fujimoto and Kunihiko Kaneko at the University of Tokyo in Japan [arXiv]. They use the iterated prisoner’s dilemma to show how one player can exploit the other to get a better payoff. They also show why the exploited player goes along with the exploitation to create a stable strategy. [MIT Technology Review]
posted by ragtag (32 comments total) 17 users marked this as a favorite
 
" The interesting part of this result is that pursuing individual reward logically leads both of the prisoners to betray when they would get a better individual reward if they both kept silent."

Ok, so is it just me, or is this entire thing extremely bad logic? And on whose part: the researchers', the writers', or the prisoners'? How is it ever always "purely rational" to be self-interested as well... Someone fucked up.
posted by OnefortheLast at 10:47 AM on May 31, 2019


The problem is the prisoners. They lack complete knowledge of what the other is doing, so they cannot choose the logically best approach in concert.

Namely, the first prisoner cannot know with certainty that the second prisoner is following the best plan (i.e., remaining silent). So then the first prisoner looks at the possible disadvantage of trying to follow the best plan if the second prisoner is not doing so. Because that leads to very bad consequences for the first prisoner, he must then consider risk minimization by employing a secondary plan of blabbing in hopes of getting preferential treatment (e.g., a shorter prison sentence). Thus, the dilemma.

Interestingly, the mob has figured out how to deal with the prisoner’s dilemma. They have created a culture where, if you squeal, even if it leads to the better outcome for you in the short term, everyone knows you’re gonna get whacked once you’re set free. So the incentive for either party to attempt risk minimization by talking is taken off of the table as an option (and a huge disincentive is introduced). Thus, their prisoner’s dilemma is solved: both prisoners know that the only option is to remain silent.
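
To make that concrete: with the standard textbook payoffs (my own numbers, not anything from the article or the paper), you can check the logic in a few lines of Python, including the mob's re-weighting:

    # Payoffs as years in prison -- lower is better. Standard textbook
    # values, assumed for illustration; key is (my_move, their_move).
    standard = {
        ("silent", "silent"): 1,   # both stay quiet: short sentences
        ("silent", "talk"):  10,   # I stay quiet, they rat: I take the fall
        ("talk",   "silent"): 0,   # I rat, they stay quiet: I walk
        ("talk",   "talk"):   5,   # both rat: medium sentences
    }

    def best_response(payoffs, their_move):
        """The move that minimizes my sentence, given the other's move."""
        return min(["silent", "talk"], key=lambda my: payoffs[(my, their_move)])

    # Whatever the other prisoner does, talking is better for me...
    assert best_response(standard, "silent") == "talk"
    assert best_response(standard, "talk") == "talk"

    # ...until the mob adds a made-up 100-year "whacking" penalty to
    # every outcome where I talk. Now silence dominates instead:
    mob = {(me, them): years + (100 if me == "talk" else 0)
           for (me, them), years in standard.items()}
    assert best_response(mob, "silent") == "silent"
    assert best_response(mob, "talk") == "silent"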
posted by darkstar at 10:59 AM on May 31, 2019 [7 favorites]


Well, isn't the whole point of strategy to be thinking however many steps ahead? These prisoners don't exist in a single isolated moment in time. (And they probably have street sense, but sure, let's assume they don't for argument's sake. It's also not necessarily solved. People still can and do rat; it really all just depends on where you sit in the hierarchy.)
I'm just saying... game theory is a theory, and only as good/sound as its weakest point. I'm not overly familiar with game theory and only skimmed the article, but I can see something is off with this. Strictly from a logic/reasoning/rational/strategy point of view... there shouldn't be dilemmas. Sure, you could logic a set of parameters, but... do the parameters logic? Equations need to work forwards, backwards, and together with the others.
Just mho.
posted by OnefortheLast at 11:24 AM on May 31, 2019


Interesting result. I'm still trying to wrap my head around exactly the sort of human outcome this specific result would map to.

When I was playing around with iterated spatial prisoner's dilemma games, and started mapping out all the related game theory games, it struck me that the payoff space of prisoner's dilemma is a 1x1 square in the middle of a universe of less interesting but more common games. Most of the time, the payoffs are such that it's completely clear whether you should cooperate or defect. Prisoner's dilemma games have gotten a lot of attention not because they're common, but because they're mathematically interesting.
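
(A quick way to see how small that square is: sample random payoffs and count how many satisfy the standard prisoner's dilemma conditions, T > R > P > S and 2R > T + S. The conditions are the textbook definition; the uniform sampling scheme is just my own arbitrary choice:)

    import random

    def is_prisoners_dilemma(T, R, P, S):
        # Temptation > reward > punishment > sucker's payoff, and mutual
        # cooperation beats alternating exploitation on average.
        return T > R > P > S and 2 * R > T + S

    random.seed(0)
    trials = 100_000
    hits = sum(is_prisoners_dilemma(*(random.random() for _ in range(4)))
               for _ in range(trials))
    print(f"{hits / trials:.1%} of random payoff draws are dilemmas")
    # Only 1 of the 24 orderings of four random numbers gives
    # T > R > P > S (about 4%), and the 2R > T + S condition trims
    # that further -- most games aren't dilemmas at all.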

Also, as Elinor Ostrom's work made clear, a lot of game theory dilemmas disappear in practice with the simple addition of communication between the players. No communication is one of the key assumptions of the game, but how often is it true in a species like ours, whose standout evolutionary trait is the volume and richness of our communication?
posted by clawsoon at 11:28 AM on May 31, 2019 [5 favorites]


And, of course, no Prisoner's Dilemma thread would be complete without the study which got actual prisoners to play Prisoner's Dilemma.
posted by clawsoon at 11:30 AM on May 31, 2019 [4 favorites]


yes, but what about the Prisoner's Trolley Problemma?
posted by the man of twists and turns at 11:34 AM on May 31, 2019 [9 favorites]


This article didn't really provide any information for a layperson. I assume it is math, and I assume that there was some kind of iterative computer simulation involved, but I really have no idea. It would be nice if it were possible to flesh out what is meant by "manipulating the other player," etc., and how this would work.

Also, it uses this kind of language:
Indeed, the so-called iterated prisoner’s dilemma shows how cooperative behavior must have evolved for social creatures. That solved what was once a significant problem for evolutionary biologists.

which seems a little overblown.
posted by Pembquist at 11:38 AM on May 31, 2019 [2 favorites]


(The conclusion that I came to with my own ISPD experiments, which I will probably never get around to formally writing up anywhere so I might as well write here: In a world where the payoffs favour cooperators, defectors will survive longest in corners of the world that are hard to get to. In a world where the payoffs favour defectors, cooperators will survive longest in those same corners. The result made me think of monasteries surviving on mountaintops in the darkest of the Dark Ages, and pirates surviving in hidden harbours when governments got better at catching the lawless. The result I got probably didn't have anything to do with either of those things, since it's just a computer simulation of red and blue dots, but it was fun to think about.)
posted by clawsoon at 11:45 AM on May 31, 2019 [2 favorites]


yes, but what about the Prisoner's Trolley Problemma?

"INTRODUCING THE PRISONER'S TROLLEY PROBLEMMA
...
What do you do???"


I would wake up all upset because this is clearly another one of Those kinds of nightmares.
posted by otherchaz at 12:54 PM on May 31, 2019


Well isn't the whole point of strategy to be thinking however many steps ahead? These prisoners don't exist in a time second of isolation.

These models get used by researchers in often counter-intuitive ways. The goal here is not, and never was, to model the actual behavior of prisoners; the only point of making them even notionally prisoners is that it makes the incentives very simple to describe -- would you prefer to serve a year in prison, or not to spend any time in prison?

Obviously prisoners sometimes rat each other out and sometimes don't; nobody cares. The goal of collective-action problem games more generally is to lay out the incentives to defect in some simple way and then to take candidate mechanisms by which people could get out of the dilemma, encode them into the structure of the game, and see how effective they are and what second-order problems they in turn spawn. Or, alternately, to take some existing collective-action problem that doesn't seem to have been even significantly ameliorated and describe some mechanisms that might do so by encoding them into the structure of the game.

If people particularly cared about prisoners, the goal wouldn't be "Here's how prisoners behave! Chessmate!" but rather "This is how mafia organizations support their members by threatening to murder them," or "Here are some traits that separate effective mafia organizations from ones that can't prevent their members from ratting on each other," or "Here are some secondary effects of using mafia organizations to prevent ratting."
posted by GCU Sweet and Full of Grace at 1:00 PM on May 31, 2019 [7 favorites]


Like so much of science and society, someone makes up some funny rules and pits people/animals/etc. against one another to see what happens.
Maybe the problem is inequality, or the police state (virtual, as in game theory, or actual, as in real life)?
Why do we find these horrible situations so infinitely satisfying to talk about?
How about experiments and ideas to avoid these situations ever coming to pass???
posted by danjo at 1:05 PM on May 31, 2019 [2 favorites]


GCU Sweet and Full of Grace: The general goal of collective-action problem games more generally is to lay out the incentives to defect in some simple way and then to take candidate mechanisms by which people could get out of the dilemma, encode them into the structure of the game, and see how effective they are and what second-order problems they in turn spawn. Or, alternately, to take some existing collective-action problem that doesn't seem to have been even significantly ameliorated and describe some mechanisms that might do so by encoding them into the structure of the game.

As I understand it, studying these games got really interesting for researchers when they were trying to figure out why the USA and USSR hadn't blown the world up with nukes yet, and how they might be prevented from doing so in the future.
posted by clawsoon at 1:18 PM on May 31, 2019 [4 favorites]


Keep in mind that the iterated game does introduce communication by way of past actions, and that these games are just simple models.

Communication is all well and good, but the Prisoner's Dilemma holds up when you remember that lying is communication, too. In fact, the Prisoner's Dilemma is the same whether you say they "can't communicate," or "they've agreed to cooperate, but don't really trust one another."

The Mafia haven't solved the Prisoner's Dilemma so much as re-weighted the squares so that it's no longer a Prisoner's Dilemma.
posted by explosion at 1:39 PM on May 31, 2019


The way I would describe the social relevance is that these games set up a fairly simplistic social situation that can be modeled. The models yield theories about 'optimal outcomes' or 'stable strategies', which might in turn help explain similar social behaviors in the real world (social phenomena like altruism, etc). If we know that there is some survival benefit to a particular strategy, then human decision making will tend to align well with that strategy, because our great^n uncles and aunts that didn't choose it failed to produce offspring.
posted by simra at 2:03 PM on May 31, 2019


Lying does indeed make the whole thing more interesting.
posted by clawsoon at 2:19 PM on May 31, 2019


Interesting result. I'm still trying to wrap my head around exactly the sort of human outcome this specific result would map to.


The study is suggesting that one party will willingly participate in and help stabilize a dynamic that is clearly exploitative of them, because although they know they are being exploited, they are still benefiting in some way.

I can think of a few examples that might illustrate the theory being described in the article.

1. The first is worker exploitation. Workers realize they are being exploited for their labor and treated unfairly by management, but are disinclined to upset this dynamic, because they perceive themselves to be the beneficiaries of the system (by having employment at all). So, they willingly participate in stabilizing a system that exploits them because they feel that they are better off than in the alternative (the risk of having no job, or going a long time without medical coverage, etc.).

2. Another is a marital relationship where, for example, a spouse is exploited by being given less power and freedom, but in which they actively help promote and stabilize the relationship, because of the belief that they are still benefiting from it (by having stable support, affection, etc.), compared to the alternative.

3. A third possible example is taxation. The rich are often able to get less affluent people to vote for tax cuts that overwhelmingly benefit the wealthy, because the less affluent are also offered a small reduction in their taxes. So, even though the dynamic is established to take advantage of the less affluent, they willingly enter into and stabilize the dynamic because they also benefit, even if it’s by a much smaller amount.

The study described in the article suggests that this sort of stabilized inequality is a common thread in human society, which seems reasonable.
posted by darkstar at 3:11 PM on May 31, 2019 [7 favorites]


The Mafia haven't solved the Prisoner's Dilemma so much as re-weighted the squares so that it's no longer a Prisoner's Dilemma.


Yeah, that’s kind of what I was getting at. Basically, the mob has Kobayashi Maru’d the Prisoner’s Dilemma by reprogramming its basic parameters so it’s no longer the same game.

Or, to mix cinematic metaphors, they’ve re-written the rules so that the only winning move is not to play.
posted by darkstar at 3:14 PM on May 31, 2019 [1 favorite]


darkstar: I can think of a few examples that might illustrate the theory being described in the article.

A couple of those examples had crossed my mind, but I couldn't quite (and still can't quite) picture a human situation consisting of a series of iterated cooperate-or-defect moves, with no communication, that takes two players from equality to stable inequality. In all of your examples, the "players" start from a position of inequality, and the honest communication of power by one over the other is an important part of the dynamic that maintains the inequality.
posted by clawsoon at 3:26 PM on May 31, 2019 [1 favorite]


I have employed the desperate expedient of reading the paper. Here's how much I understood of it.

The players are defined by how likely they are to cooperate if the other player cooperated in the previous round, and how likely they are to cooperate if the other player defected in the previous round. A (1, 0) player is tit-for-tat: If the other player cooperated last time, we always cooperate this time; if the other player defected last time, we always defect this time. A (1, 1) player always cooperates; a (0, 0) player always defects. A (.9, .9) player cooperates 90% of the time, no matter what the other player does.

In the games they played, they start one player (.9, .1): Cooperate 90% of the time if the other player cooperated last time, but only cooperate 10% of the time if the other player defected last time. Close to tit-for-tat.

The players learn thusly: If the strategy works, do more of that strategy. So if a (.9, .1) strategy results in a round where both players cooperate and win, update the strategy to (.91, .1) - a little more likely than last time to cooperate if the other player cooperated in the last round. If the (.9, .1) strategy results in a round where I cooperate but the other player defects and I lose, update the strategy downward to (.89, .1).

What they did was start one player at (.9, .1) - close to tit-for-tat - and the other player at (.9, .1), then (.9, .25), then (.9, .65), then (.9, .7) - i.e. they make the second player successively initially more naive, more willing at the start of the game to cooperate even if the first player defected in the previous round.
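
(Here's a minimal sketch of that dynamic as I've just described it, for anyone who wants to play along at home. The payoff values, learning rate, baseline, and starting memory are my assumptions; the paper's actual update rule is a proper reinforcement-learning scheme, so treat this as a cartoon:)

    import random

    # Standard PD payoffs, assumed for illustration; C = cooperate, D = defect.
    PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
    LEARNING_RATE = 0.01   # the +/- .01 step described above
    BASELINE = 2.5         # rewards above this reinforce; below, discourage

    def play(p1, p2, rounds=10_000):
        """p1 and p2 are [prob_coop_after_C, prob_coop_after_D] lists,
        mutated in place as the players learn round by round."""
        last1, last2 = "C", "C"  # assume both remember a cooperative start
        for _ in range(rounds):
            i1 = 0 if last2 == "C" else 1  # which probability applies now
            i2 = 0 if last1 == "C" else 1
            m1 = "C" if random.random() < p1[i1] else "D"
            m2 = "C" if random.random() < p2[i2] else "D"
            for p, i, me, r in ((p1, i1, m1, PAYOFF[(m1, m2)]),
                                (p2, i2, m2, PAYOFF[(m2, m1)])):
                # Nudge the probability just used toward the action taken
                # if it paid off, and away from it if it didn't.
                direction = 1 if me == "C" else -1
                sign = 1 if r > BASELINE else -1
                p[i] = min(1.0, max(0.0, p[i] + LEARNING_RATE * direction * sign))
            last1, last2 = m1, m2

    random.seed(1)
    for start2 in ([.9, .1], [.9, .25], [.9, .65], [.9, .7]):
        p1, p2 = [.9, .1], list(start2)
        play(p1, p2)
        print(f"p2 started at {start2}: p1 ended at "
              f"{[round(x, 2) for x in p1]}, p2 at {[round(x, 2) for x in p2]}")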

The result was that in the first case and the third case, the players learned to cooperate all the time. In the second case, the second player quickly got stuck always responding to cooperation with cooperation, so when the first player started defecting more often, the second player wasn't able to adjust. (A bit more on that in the next paragraph.) In the fourth case, the situation ended up reversed. (A bit more on that two paragraphs down.)

Why did the second player get stuck cooperating in the second case? They explain the problem in the paper: If you randomly get close to 100% responding to cooperation with cooperation, you're hardly ever going to try responding to cooperation with defection, so your model will almost never get adjusted in that direction. If you're at (.99, whatever), you'll only respond to cooperation with defection one turn every hundred or so, so you have very few opportunities to get to (.98, whatever). You will be entirely vulnerable to bait-and-switch: If the other player cooperates in one round, you will cooperate in the next round; next round they defect and win.
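
(How few opportunities? A quick check, with an arbitrary round count:)

    import random

    random.seed(0)
    p = 0.99  # probability of answering cooperation with cooperation
    # Out of 10,000 rounds following an opponent's cooperation, how often
    # do we even try defection -- the only move that can nudge p back down?
    tries = sum(random.random() >= p for _ in range(10_000))
    print(tries, "exploratory defections out of 10,000 chances")  # ~100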

Why did the fourth situation, starting with the most naive second player, end up with that initially naive second player exploiting the first player? To me it looks like the outsize impact of randomness when you're dealing with small numbers, the same sort of thing that causes alleles to randomly become fixed in small populations. All of the drama in each game happened in the first 10-20 rounds of play, with the players making probabilistic moves. I suspect that multiple games with the same starting point would end up at different places as a result; sometimes cooperation, sometimes the naive player being exploited, sometimes the naive player turning the tables.

The final interesting dynamic: If one player gets stuck responding to cooperation with cooperation all the time, and the other player plays bait-and-switch, the first player will start cooperating more even after the other player defects. It looks like cooperation after defection rose to 50-60% in the cases like this that they show. My interpretation: If you find it impossible not to give the other person a gift if they gave you a gift last time, but they're taking turns randomly giving you a gift or punching you after you give them a gift, you'll start responding to their punches with a random mix of punches and gifts instead of always punching them back.

I may be taking the interpretation too far - glad to be corrected on this - but it looks like tit-for-tat isn't a stable strategy if the two players can only remember a single previous round and learn by experience. They'll either end up cooperating all the time, or one player will get stuck always responding to cooperation with cooperation and the other player will take advantage of the first with bait-and-switch.
posted by clawsoon at 5:09 PM on May 31, 2019 [2 favorites]


*tit-for-tat isn't a stable strategy -> near tit-for-tat isn't a stable strategy.

If I'm thinking this through correctly, the highlighted results from the paper are mostly the result of a model where one player can get stuck always responding to cooperation in the last round with cooperation this round. This makes them exploitable.
posted by clawsoon at 5:28 PM on May 31, 2019


My thinking is slowing down, so someone please double-check this: It seems like the strategy that the exploited player settles on is the closest approximation that this system can get to tit-for-two-tats. On average, the exploited players in this version of the game are responding to two-ish defections from the other player with a single defection of their own.
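
(For reference, tit-for-two-tats proper needs two rounds of memory, which these players don't have. A sketch of the standard definition, my own phrasing:)

    def tit_for_two_tats(opponent_history):
        """Defect only if the opponent defected in BOTH of the last two
        rounds; otherwise cooperate. A one-round-memory player can only
        approximate this probabilistically."""
        if len(opponent_history) >= 2 and opponent_history[-2:] == ["D", "D"]:
            return "D"
        return "C"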
posted by clawsoon at 5:43 PM on May 31, 2019


Would I be correct to interpret a player in this version of the game as equivalent-ish to a neural net consisting of 2 unconnected neurons in 1 layer?
posted by clawsoon at 5:49 PM on May 31, 2019


incentives very simple -- would you prefer to serve a year in prison, or not to spend any time in prison?

My point is that incentive is actually never a given and never simple. Ask a forensic psychologist: true incentive, or motive, is the single most difficult thing to determine.

Using this model as an example: people do in fact choose to serve time over not serving time when there is an incentive to do so. This model assumes the "prisoner" hasn't done their own independent incentive evaluation beyond the "dilemma" presented to them after the fact. That's not how anything works.
posted by OnefortheLast at 12:49 AM on June 1, 2019


I'd argue:
There is no logic or pure reason in unnatural consequences
And
That incentive does not necessarily even apply to punishment

It is faulty logic to present a punishment or consequence as an incentive or reward. That equation doesn't calculate backwards.
posted by OnefortheLast at 1:11 AM on June 1, 2019


I'd argue that you're missing the point of a mathematical model.

I'd also gently suggest that maybe you don't meet the prerequisites for this thread. The reason it sounds wrong to you is because you have a profound misunderstanding of what this is about, that we are not going to be able to fix in the confines of this thread. I'm sure no-one here, including you, wants to see what the game theory equivalent of an anti-vaxxer looks like.
posted by Merus at 2:50 AM on June 1, 2019 [5 favorites]


I think you're missing the point of theory.

I find it really interesting that these types of discussions are not only allowed but encouraged in my country's education system, but that there are other places in the world where questioning anything brings the mob down on you, uttering threats and insults and calling your "prerequisites" into question. Though it is nice to see the parts of game theory that work play out well.
posted by OnefortheLast at 9:26 AM on June 1, 2019 [1 favorite]


Mod note: Hey, OnefortheLast, you're working with some nonstandard definitions that aren't really what the post is about -- which is fine, but now that you've offered your thoughts, please leave it at that. It's ok if people want to discuss the results presented in the post using terms as they're used in the field; the whole thread doesn't need to start from first principles. If you'd like to make a thread about foundational challenges to these game theory/economics assumptions/terminology that'd be a better place to discuss those.
posted by LobsterMitten (staff) at 11:23 AM on June 1, 2019 [1 favorite]


I think that criticizing the framing offered by the authors:
“This study provides a new perspective on the origin of exploitation in society,” say Fujimoto and Kaneko.
...is a perfectly valid thing to do. This is an interesting game theoretic result, but the players in this version can't develop strategies even as complicated as your average bacteria. "This says something about human society" can be legitimately countered with, "No it doesn't, and here's why."

Having read the paper, my guess is that the results are side effects of the limitations they placed on the players. If you can remember exactly one previous interaction with another person, and you are locked into a relationship with them that you can't escape, then these results might apply to you.
posted by clawsoon at 3:24 PM on June 1, 2019 [1 favorite]


I was going to post a very similar complaint. The actual origin of exploitation in society, and whether exploitation is a "basic property," are historical and empirical questions.

I tend to like game theory exercises, but getting the numbers to work in this case is like coming up with a good speed estimate for your spherical horse model; it's interesting, but you ruin the whole exercise if you start claiming it gives you insights about biomechanics.

University press offices are a broken institution.
posted by mark k at 10:56 PM on June 1, 2019 [2 favorites]


I was going to blame university press offices, too, but the pull quote is from the researchers themselves.

Perhaps there's a game theoretic explanation for why they'd do this...
posted by clawsoon at 4:43 AM on June 2, 2019


I'm still digging through the math on this—either I'm a dummy* or it's not simple**—but this paper (from 2012, describing the state of the art prior to the new result) is the most helpful one I've read so far. (I've even managed to run the bot they describe, ZDGTFT-2, in a tournament of my own, and I can confirm that it is surprisingly strong even against bots you'd think are more sophisticated than it. And its strategy is a single line of code.)

*I am
**it's not
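
(For the curious: that one line is a memory-one lookup -- cooperate with a probability keyed on the last round's pair of moves. Something like the sketch below, using the cooperation probabilities I've seen quoted for ZDGTFT-2 under the standard (T, R, P, S) = (5, 3, 1, 0) payoffs; the transcription is mine, so check it against the paper:)

    import random

    # ZDGTFT-2's cooperation probabilities, keyed on (my_last, their_last).
    ZDGTFT2 = {("C", "C"): 1.0, ("C", "D"): 1/8, ("D", "C"): 1.0, ("D", "D"): 1/4}

    def move(my_last, their_last):
        # The strategy really is one line: cooperate with the probability
        # the table assigns to the previous round's outcome.
        return "C" if random.random() < ZDGTFT2[(my_last, their_last)] else "D"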

posted by ragtag at 7:07 PM on June 2, 2019 [1 favorite]


I realized that I forgot to post this great article about the whole kit and caboodle. Section 19 (on zero-determinant strategies) helps put the grandiose claims made by the article (which folks complained about) in perspective, though there's a lot of groundwork to understand before you can follow along.
posted by ragtag at 10:56 AM on June 19, 2019




This thread has been archived and is closed to new comments