Iocaine Powder
June 1, 2019 3:53 PM
Buried in the source code from a 1999 Rock Paper Scissors tournament is a comment section starting with, "They were both poisoned," that goes on to describe one of the cleverest strategies ever designed for a computer game. As with its inspiration from The Princess Bride, the strategy considers that its opponent might know its strategy, and that its opponent might know that it knows that the opponent knows its strategy, and... you get the idea. Tweaks on the original idea remain the core of the best Rock, Paper, Scissors algorithms.
Here's Dan Egnor's own description of the algorithm.
posted by ragtag at 4:20 PM on June 1, 2019 [8 favorites]
Thanks for the link, ragtag. Looks like it adds a couple of things to the description he gave in the source code comments.
posted by clawsoon at 4:22 PM on June 1, 2019 [1 favorite]
What impresses me most about Iocaine Powder is that it's an algorithm that basically contains a theory of mind (within the context of the game it plays, anyway). Not just a theory of mind - a theory of theory of theory of mind...
posted by clawsoon at 4:31 PM on June 1, 2019 [1 favorite]
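For anyone who wants the "I know that you know" layers spelled out, here is a minimal sketch of the second-guessing idea. It is not Egnor's code: it assumes some predictor has already guessed the opponent's next move, and the names `beats` and `meta_candidates` are made up for illustration.

```python
# BEATS[x] is the move that defeats x: paper beats rock, and so on.
BEATS = {"R": "P", "P": "S", "S": "R"}


def beats(move):
    """Return the move that defeats `move`."""
    return BEATS[move]


def meta_candidates(predicted_opponent_move):
    """Build three levels of second-guessing on top of one prediction.

    level 0: trust the prediction and play what beats it.
    level 1: assume the opponent foresaw level 0 and will beat it; beat that.
    level 2: assume the opponent foresaw level 1 as well; beat that.
    A further level would repeat level 0, because the three moves form a cycle.
    """
    level0 = beats(predicted_opponent_move)
    level1 = beats(beats(level0))
    level2 = beats(beats(level1))
    return [level0, level1, level2]


print(meta_candidates("R"))
```

Because the moves cycle, the regress stops after three levels. What remains is choosing among the candidates, for example by tracking how each one would have scored in earlier rounds.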
I'm not sure when else I'm going to get to tell this story, so:
One chilly night in the fall of 2004, I chanced to find myself upon the streets of Toronto. I represented one half of the august top-rated debate team from a small upstate NY liberal arts school (*cough* named after a man shot by Aaron Burr), recovering from a day in which we had arrived at a debate tournament and discovered, 11 minutes before the commencement of Round 1, that we would be competing using a format heretofore unknown to us. A lot of hemming and hawing ensued, before we ultimately turned to face the task awaiting us, and swaggered to the podium to stare down the fate bearing down on us like a hawk on a rabbit. The night was not kind to us, and by 11PM, we had succeeded in drowning our sorrows in a heretofore unimagined selection of Labatt's higher-gravity options available at the corner pub.
Which is when we were wholly taken aback to encounter none other than Syracuse University's internationally ranked (!!) team of rock-paper-scissors players, fresh off their top-32 finish in that year's invitation-only (!!!) tournament. I admit I was skeptical as to the provenance of their accolades, but a quick glance at the large medallions hanging off two team members' necks was enough to assuage any doubts; these were our fellow Upstate New Yorkers, and they had come to win RoShamBo rounds and chew bubblegum, and they were all out of bubblegum.
Naturally, we did what any self-respecting half-in-the-bag Americans would do upon meeting so-called champions of the rock-paper-scissors arena on the streets of a foreign nation, and challenged them to a duel on the spot. We proceeded to go a spectacular 0-24 against our fellow countrymen, before realizing that maybe there's more to the metagame than we had previously considered. It's only now, 15 years later, upon reading this description of the algorithm, that I have come to understand the instrument of our demise.
posted by Mayor West at 5:03 PM on June 1, 2019 [70 favorites]
Oh heck, I camped with Dan Egnor at Burning Man a few years back, and have known him for longer than that, but I didn’t realize he wrote Iocaine Powder! (We were at Burning Man to make a Burning Man Puzzle Hunt, which I feel is not a wholly different sort of nerdery)
posted by aubilenon at 6:02 PM on June 1, 2019 [3 favorites]
It is a well-known fact that a truly random strategy cannot possibly lose consistently. Hence, it is useful to switch to a random strategy once the algorithm detects that it is losing.
This feels like it's going to end up being one of those ideas that you encounter one place and then start seeing applications for everywhere.
posted by nebulawindphone at 6:05 PM on June 1, 2019 [11 favorites]
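The "switch to random once you are losing" idea from the quoted passage can be sketched roughly as follows. This is an illustration under assumptions, not the tournament code: the class name `FallbackBot`, the `loss_limit` threshold and the scoring rule (wins minus losses) are all hypothetical.

```python
import random

MOVES = ["R", "P", "S"]
BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that defeats x


class FallbackBot:
    """Wrap a strategy and fall back to uniform random play when it is losing."""

    def __init__(self, strategy, loss_limit=10, rng=None):
        self.strategy = strategy      # callable: history -> move
        self.loss_limit = loss_limit  # hypothetical threshold, tune to taste
        self.rng = rng or random.Random()
        self.score = 0                # wins minus losses so far
        self.history = []             # list of (my_move, their_move)

    def next_move(self):
        # Far enough behind: stop being predictable.
        if self.score <= -self.loss_limit:
            return self.rng.choice(MOVES)
        return self.strategy(self.history)

    def record(self, my_move, their_move):
        self.history.append((my_move, their_move))
        if my_move == BEATS[their_move]:
            self.score += 1
        elif their_move == BEATS[my_move]:
            self.score -= 1
```

Note that in this sketch the bot returns to its wrapped strategy if random play happens to lift the score back above the threshold.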
But what if your opponent decides to get involved in a land war in Asia?
posted by dannyboybell at 6:15 PM on June 1, 2019
I've never been able to get used to any of these other weird names for "jan ken po".
posted by tobascodagama at 7:35 PM on June 1, 2019
can i use this information to start a cult? pls advice
posted by Foci for Analysis at 8:27 PM on June 1, 2019
Incidentally, here are the rules for a rock-paper-scissors variant that actually adds asymmetric valuation to the decision-making process, to provide strategic space within the game. It’s pretty good!
posted by DoctorFedora at 8:31 PM on June 1, 2019 [1 favorite]
That's probably not the intended link. I suspect this is the intended link: you put down three markers representing rock, paper and scissors in the middle of the table. When you win with a particular throw, you move that marker closer to you - if it's close to your opponent, move it to the middle; if it's in the middle, move it close to you; if it's already close to you, you win.
posted by Merus at 1:59 AM on June 2, 2019 [5 favorites]
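The marker rules described above are small enough to write down as code. This is a sketch of one reading of them: each marker's position is stored as -1 (close to player B), 0 (middle) or +1 (close to player A). The function name `play_round` and that encoding are assumptions made for illustration.

```python
BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that defeats x


def play_round(markers, move_a, move_b):
    """Apply one throw to the marker positions (updated in place).

    Returns "A" or "B" if that player wins the whole game, else None.
    """
    if move_a == move_b:
        return None  # tie: nothing moves
    if move_a == BEATS[move_b]:
        winner, throw, step = "A", move_a, +1
    else:
        winner, throw, step = "B", move_b, -1
    if markers[throw] == step:
        return winner        # marker was already on the winner's side
    markers[throw] += step   # opponent's side -> middle -> winner's side
    return None


markers = {"R": 0, "P": 0, "S": 0}
print(play_round(markers, "P", "R"), markers)
print(play_round(markers, "P", "R"), markers)
```

The strategic space comes from the markers being unequal: a throw whose marker is already on your side wins the game outright, so your opponent has to weigh it differently from the other two.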
nebulawindphone: This feels like it's going to end up being one of those ideas that you encounter one place and then start seeing applications for everywhere.
I was contemplating the algorithm while browning the meat for stew one day, and realized that stirring is a crude generator of randomization which gives good-enough results compared to flipping over each piece of stew meat individually.
As I was stirring in the other stew ingredients, the idea was reinforced.
posted by clawsoon at 4:52 AM on June 2, 2019 [5 favorites]
Incidentally, here are the rules for a rock-paper-scissors variant that actually adds asymmetric valuation to the decision-making process, to provide strategic space within the game. It’s pretty good!
Wait, you linked to a gif of a retro Mortal Kombat-style game. Which is also super cool, but either I'm missing the connection or you pasted the wrong URL.
posted by nebulawindphone at 5:35 AM on June 2, 2019
I was just talking to someone a few days ago about how coders love to give things nerdy names and/or bury fun easter eggs in their code. This makes me smile.
posted by not_on_display at 6:17 AM on June 2, 2019
It's kind of shocking how often randomness plus an adequately fast fitness function beats carefully engineered algorithms.
Many years ago I remember beating my head for days against a tricky problem involving drawing the largest possible text label within highly irregular polygons in a way that "looked right". I had settled on something complicated involving a walk to find the most distant vertex in a Voronoi diagram, but it still settled on local optima.
Then a senior researcher came and took a look, said "Why don't you just try random placement 1 million times, and repeat with a larger text size each time you find a fit until you can't anymore. CPU is cheap".
It worked beautifully.
posted by xthlc at 6:33 AM on June 2, 2019 [9 favorites]
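The senior researcher's suggestion can be sketched in a few lines. This is a guess at the shape of the approach, not the code from the story: `largest_fit`, its parameters and the `fits` callback are hypothetical, and the demo region is a unit circle rather than an irregular polygon so that the fit test stays short.

```python
import random


def largest_fit(fits, bounds, aspect=4.0, start=0.01, growth=1.1,
                tries=10_000, seed=0):
    """Random placement search: grow the label until no random spot fits.

    fits(x, y, width, height) must return True when a box centred on (x, y)
    lies inside the region. Returns the last (x, y, width, height) that fit,
    or None if even the starting size never fit.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    best = None
    height = start
    while True:
        width = height * aspect
        found = None
        for _ in range(tries):
            x = rng.uniform(xmin, xmax)
            y = rng.uniform(ymin, ymax)
            if fits(x, y, width, height):
                found = (x, y, width, height)
                break
        if found is None:
            return best       # nothing fit at this size: keep the previous one
        best = found
        height *= growth      # found a fit: try a larger label


def fits_unit_circle(x, y, width, height):
    """Demo region: a box fits if all four corners are inside the unit circle."""
    corners = [(x + dx * width / 2, y + dy * height / 2)
               for dx in (-1, 1) for dy in (-1, 1)]
    return all(cx * cx + cy * cy <= 1.0 for cx, cy in corners)


print(largest_fit(fits_unit_circle, (-1, -1, 1, 1), tries=2_000))
```

The fitness function here is just `fits`, and the speed of that check is what makes a brute-force loop like this affordable.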
As I was stirring in the other stew ingredients, the idea was reinforced.
To clarify: In stew, you generally want relatively even distribution of ingredients in the final result. You don't want a ladle full of stew to come out all meat, or all beans, or all cashew nuts. (Cashew nuts, yeah, that's the way I stew. Apple slices, too.) It would be theoretically possible to devise a method which distributes the ingredients with perfect uniformity so that every ladle is guaranteed to have the perfect ingredient mix, but random-ish stirring produces good-enough results with much less effort.
BTW, if you want to try out your own Rock, Paper, Scissors ideas, take a look at rpscontest.com. I tried a couple myself.
posted by clawsoon at 6:48 AM on June 2, 2019 [2 favorites]
This feels like it's going to end up being one of those ideas that you encounter one place and then start seeing applications for everywhere.
I read a theory somewhere (thought it was on the Blue but my search fu is failing me) that “introduces randomness” was the whole value of magic to human society, because it avoided collective action problems.
Like, if you had some really fertile land by the side of a river or on the flanks of a volcano, everyone would want to farm there, and come the inevitable flood or eruption, the whole society would be wiped out. But how do you convince a few people to farm elsewhere for the long-term common good, when it incurs a pretty hefty short-term disadvantage?
You read the cracks in a scapula bone, or throw sticks, or trace the lines on a goat’s liver, and then back that up with the authority of unseen powers beyond human comprehension, is how.
Magic: for when randomness is a better strategy than cognitive blindspots!
posted by chappell, ambrose at 8:21 AM on June 2, 2019 [5 favorites]
chappell, ambrose: You read the cracks in a scapula bone, or throw sticks, or trace the lines on a goat’s liver, and then back that up with the authority of unseen powers beyond human comprehension, is how. Magic: for when randomness is a better strategy than cognitive blindspots!
Interesting idea! I've always been vaguely bothered by the reason that reading the omens would be such a persistent and common thing, and that sounds like as good an explanation as any. (And maybe depending on random omens for when to start a battle gives some element of surprise that can't be theorized by the opponent?)
A more pedestrian example is the use of a random pivot in some versions of the quicksort algorithm.
posted by clawsoon at 9:06 AM on June 2, 2019
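For reference, a random-pivot quicksort looks roughly like this. It is a short list-building sketch rather than the usual in-place version, and the function name is arbitrary.

```python
import random


def quicksort(items, rng=random):
    """Sort a list using a pivot chosen uniformly at random."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller, rng) + equal + quicksort(larger, rng)


print(quicksort([3, 1, 2, 3, 0]))
```

The point of the random pivot is the same as in the RPS case: no fixed input can reliably trigger the worst case, because an adversary cannot predict which pivot will be picked.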
"Why don't you just try random placement 1 million times, and repeat with a larger text size each time you find a fit until you can't anymore?
It wouldn't be Biblical
posted by thelonius at 9:08 AM on June 2, 2019
It is clear that there is some feature of the human mind that favors magical thinking. I was going to say "some flaw," but it's such a big flaw that it seems like it must have some huge benefit. The idea that it inserts some randomness into human decision-making is ... kind of brilliant. It prevents all rational agents from coming to the same decision, and it doesn't matter what counterfactual B.S. justification they choose. I don't understand enough about genetics to convince myself it would actually be selected for, but at my "evolution lite" level of knowledge it is plausible.
Also the pointer to the source code was great.
posted by Gilgamesh's Chauffeur at 9:20 AM on June 2, 2019 [3 favorites]
Gilgamesh's Chauffeur: It prevents all rational agents from coming to the same decision
Does this mean that sectors of the economy that are subject to boom-bust cycles would be better off if the participants read omens instead of responding rationally to price signals? The price goes up for milk, so a million independent farmers start dairy herds, and in two years the price for milk crashes and tens of millions of dairy cows are slaughtered, and then the cycle begins again... maybe all those farmers should've looked at a curdled milk omen instead?
The comments on the next bot down in the source code (Phasenbott) point out that:
I've checked the effects of this "meta-ing", and after the first couple steps it's not worthwhile: If one of the base strategies doesn't match the opponent's play, then Iocaine's strategy becomes so subtle as to be effectively random.
posted by clawsoon at 9:30 AM on June 2, 2019 [3 favorites]
Gilgamesh's Chauffeur: I don't understand enough about genetics to convince myself it would actually be selected for, but at my "evolution lite" level of knowledge it is plausible.
You might be interested in the evolution of bet hedging in biology:
Because bet hedging is designed to produce genetically diverse offspring randomly in order to survive catastrophe, it is difficult to develop treatments for bacterial infections, as bet hedging may ensure the survival of its species within its host, heedless to the antibiotic.
...e.g. the way that tuberculosis uses noise-driven transitions to maximize survival.
posted by clawsoon at 9:53 AM on June 2, 2019 [3 favorites]
For anyone interested in where I picked this idea up, it was this tweet:
@StefanFSchubert
Some superstitious divination rituals may have spread because they functioned as adaptive randomization devices in contexts where people otherwise would have used decision procedures worse than chance. From Henrich, "The Secret of Our Success".
And for those wanting to read more, the screenshots are from The Secret of Our Success, by Joseph Henrich.
[very interesting screenshots about hunting Caribou follow]
posted by chappell, ambrose at 10:12 AM on June 2, 2019 [4 favorites]
FORBIDDEN CHESS PIECES:
The Prophet, who is aware of the hands which move the pieces.
The Crows, placed on the board after the final turn.
posted by acb at 11:13 AM on June 2, 2019 [3 favorites]
wow, yeah, I definitely botched the link. Thanks for finding the right link for me, Merus!
posted by DoctorFedora at 10:31 PM on June 2, 2019
I was just talking to someone a few days ago about how coders love to give things nerdy names and/or bury fun easter eggs in their code.
Which is great until you don't have the context. I once spent a good long while pulling out what little hair I have left trying to figure out why in the name of God a class that managed authorization and user permissions was called "MrDelmonte".
Turns out it's a reference to an ad campaign from the 80s where the Man from Del Monte would sample fruit and approve it. "The Man from Del Monte, he says yes", you see. It never aired where I grew up.
My first order of business when I eventually rewrite the thing is to change the name of the class. :P
posted by Mr. Bad Example at 2:31 AM on June 3, 2019 [2 favorites]
It's easy to make fun of AuthenticationAndPermissionManager-style class names, but I'll take them over the cutesy shit any day.
posted by tobascodagama at 6:48 AM on June 3, 2019 [1 favorite]
You read the cracks in a scapula bone, or throw sticks, or trace the lines on a goat’s liver, and then back that up with the authority of unseen powers beyond human comprehension, is how. Magic: for when randomness is a better strategy than cognitive blindspots!
So because of my IPD post and this RPS post, I've been playing with a bunch of IPD bots lately, and I've noticed that stochastic bots have a tendency to eat non-stochastic bots alive. If you don't include some randomness yourself, it seems that you become predictable (and therefore easy to manipulate), or else sensitive to random events (thereby wasting points trying to find a deeper meaning to your opponent's actions that simply doesn't exist).
I'm becoming more and more convinced that magical thinking is an inoculant to a harsh world, and that we ignore it at our own peril.
posted by ragtag at 6:08 PM on June 3, 2019 [1 favorite]
I guess I should condition that by saying the exact proportion of randomness is crucial. A bot that plays purely by chance is awful. But slot randomness into an overall strategy, and...
posted by ragtag at 6:32 PM on June 3, 2019 [1 favorite]
On the theme of randomness: Another place it's obvious is in the flight of many insects. Unlike Prisoner's Dilemma, though, there's obviously no payoff to cooperating with the bird who's about to eat you. Minimizing your predictability maximizes your chance of escaping, and that's the only metric that matters.
posted by clawsoon at 10:01 AM on June 4, 2019
> BTW, if you want to try out your own Rock, Paper, Scissors ideas, take a look at rpscontest.com. I tried a couple myself.
What we need here is a Metafilter RPS programming throwdown. Clawsoon's bots are not doing bad at all--one is ranked 59th right now, and there is some pretty heavy competition there, with 2968 total bots. It looks like some pretty clever people have put some real thought and work into some of them.
After a couple of mild cracks at making a winning bot, I realized there was no realistic way I was going to beat the bayesian-markov-chain-neural-net type things without a bunch of actual work and thinking, so I decided the better part of valor might be to shoot for the bottom end of the food chain.
Right now I have the bot with the lowest winning percentage of all 2968 currently active, and 3 of the bottom 5 spots (by winning percentage--the leaderboard system works by a little different weighting and none of my bots have been on board long enough to fully reap their badness in those rankings).
All my bots here. (Note that some are mods of others' work, others are original.)
So all this is a lead-up to my real message here: I challenge every Mefite: Do worse if you can.
Word of warning: It ain't easy being bad.
(Though probably easier than being good, given the competition up there.)
posted by flug at 11:51 PM on June 6, 2019 [1 favorite]
> while browning the meat for stew one day, and realized that stirring is a crude generator of randomization which gives good-enough results
Yes, you may know that one of the basic results in chaos theory and other related fields that study phenomena like this is that any reasonable mixing motion of the type humans are likely to use when mixing up batter, salads, cocktails, or whatever, is actually one of the best simulations of a random process that we can produce.
If you have a reasonably good stirring motion and repeat it a sufficient number of times, you're actually guaranteed to have a thoroughly mixed result.
And . . . the "random number generators" used in most programming languages use a similar scheme to generate their random numbers--which are actually not completely random, since they are predictable and reproducible if you know the "seed" they started with. They're often referred to as "pseudo-random".
Generating something that is "more random" than this type of numerical mix-master approach is actually a really, really hard problem.
posted by flug at 12:14 AM on June 7, 2019 [1 favorite]
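To make the "seed" point concrete, here is a small sketch. The first half uses Python's standard `random` module. The second half is a linear congruential generator, one classic "multiply, add, wrap around" mixing scheme, shown only as an illustration: it is not what Python's `random` uses internally (that is the Mersenne Twister), and the function name `lcg` is made up.

```python
import random

# Two generators started from the same seed produce the same "random" moves.
a = random.Random(42)
b = random.Random(42)
seq_a = [a.choice("RPS") for _ in range(10)]
seq_b = [b.choice("RPS") for _ in range(10)]
print(seq_a == seq_b)


def lcg(seed, count, mult=1664525, inc=1013904223, modulus=2**32):
    """Yield `count` values from a linear congruential generator.

    Each step multiplies the state, adds a constant and wraps around.
    The constants are a widely published textbook choice.
    """
    state = seed
    for _ in range(count):
        state = (mult * state + inc) % modulus
        yield state


print(list(lcg(1, 3)) == list(lcg(1, 3)))
```

This reproducibility is useful for debugging a bot, but it also means an opponent who somehow learned your seed and generator could predict every move.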
It ain't easy being bad.
Wait. Isn't the optimally-bad RPS bot the same as the optimally-good RPS bot, except that its move is rotated? (That is, if it predicts the opponent is going to play rock, it'll play scissors instead of paper?)
posted by ragtag at 2:56 AM on June 7, 2019 [1 favorite]
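The rotation in question is a one-liner once the cycle is written down. A sketch, with the function name `throw_the_game` invented for illustration: take the move a good bot would have played and rotate it one more step around the cycle.

```python
BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the move that defeats x


def throw_the_game(good_move):
    """Turn a winning choice into a losing one.

    If the good bot predicted `p`, it plays BEATS[p]. Rotating one more step
    gives BEATS[BEATS[p]], which is the move that loses to `p`.
    """
    return BEATS[good_move]


# Predicted rock: the good bot plays paper, the rotated bot plays scissors.
print(throw_the_game(BEATS["R"]))
```

This only makes the rotated bot lose where the original would have won, so it inherits whatever prediction errors the original had.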
> Wait. Isn't the optimally-bad RPS bot the same as the optimally-good RPS bot, except that it's move is rotated? (That is, if it predicts the opponent is going to play rock, it'll play scissors instead of paper?)
Well, yes and no. In particular, several people including me have tried to just simply reverse the very best RPS bots in order to make the very worst, and there is a surprising amount of variability in the results. My worst bot, based on that exact philosophy, is currently almost 2% worse than the next worst bot, which is a rather amazing amount of improvement--far more than I'd have thought possible.
And it's many percent worse than the others that have taken this same approach, including one or two I put together.
By the same token - a bot I wrote myself (Giveaway 2) is super simple with practically no strategy at all to it and it is losing at about a 90% clip.
That same type of approach is yielding maybe 60% wins--vs 90% losses.
So there is quite a disparity between the win/loss side.
A lot of it is in the psychological or meta-strategy aspects, I suppose. Most bots are trying to win and you have to be sure to lose against those. But if you really want to be the worst, you also want to lose against the ones that are trying to lose.
An interesting conundrum.
posted by flug at 9:05 AM on June 7, 2019 [2 favorites]
I love the way this discussion is going.
BTW, I contacted the rpscontest.com maintainer (Byron Knoll), and he said he'd be willing to lend the code to someone who wanted to create an Iterated Prisoner's Dilemma version of the site. ragtag pointed me to the Axelrod project, which appears to be a best-in-class IPD contest runner, so... if there are some Mefites who are interested in creating something like this and who have some time...
posted by clawsoon at 10:18 AM on June 7, 2019 [1 favorite]
FWIW, my conclusion about the leaders on rpscontest.com is that there are three groups: History-matching bots; bots (like Iocaine Powder) which are able to beat history-matching bots; and bots (like my better ones) that are able to beat everything except history-matching bots.
The best bots - all the variations on Iocaine Powder - effectively play randomly against each other, which makes the top 50 or so bots effectively randomly ordered.
posted by clawsoon at 11:08 AM on June 7, 2019 [1 favorite]
Yes, I noticed those same types of patterns in the "losing" bots as well. You have to somehow deal with the bots that want to win (several types), then the ones that are just random or whatever (like "always R" or "always rotate R-P-S"), the ones that are just accidentally bad for whatever reason--and then the various ones that are actually trying to be bad.
So my first strategy was to run head-to-head contests against the known worst bots, and try to out-badden them.
But then I realized: The bots that are actually trying to be bad are like 1% of the total, at best. So you're better off making sure you're as bad as possible against the other 99% and basically just ignoring the 1% that are your direct competitors.
Of course, at the top of the heap, things are just a little different.
Also, if you had a different competition or scoring structure, it would be different. Like if you had a head-to-head tournament structure, like the NFL playoffs, where winners keep getting paired off against winners all the way up, that would require a really different kind of design to win.
Or, if bots of similar strength were paired against each other far more often--that would be a different thing yet.
I think (but am not 100% sure) that on rpscontest.com opponents are just chosen randomly from across the whole list.
posted by flug at 9:38 PM on June 7, 2019 [1 favorite]
> The bots that are actually trying to be bad are like 1% of the total, at best. So you're better off making sure you're as bad as possible against the other 99% and basically just ignoring the 1% that are your direct competitors.
By the way, this suggests an interesting strategy for bots that want to be at the top of the heap as well:
Maybe your time is better spent trying to improve your performance against, say, the worst 40% of bots, or even the worst 80%--rather than trying to get 0.01% better than the top 1% of bots.
If you could improve your performance against the worst 40% of bots by, say, 10% that would be a real boost in your overall win rate. Whereas a marginal improvement in your performance against the best 5% of bots might not make a noticeable difference at all.
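A rough back-of-the-envelope version of that arithmetic, as a Python sketch. It assumes opponents are drawn uniformly at random from the whole list (which may or may not be how rpscontest.com pairs bots) and reads "10%" as 10 percentage points; `overall_gain` is a made-up helper name.

```python
def overall_gain(share, gain):
    """Change in overall win rate from improving by `gain` against
    a `share` of all opponents, assuming uniform random pairing."""
    return share * gain


# +10 points against the weakest 40% of bots: about +4 points overall.
weak_gain = overall_gain(0.40, 0.10)

# +1 point against the strongest 5% of bots: about +0.05 points overall.
strong_gain = overall_gain(0.05, 0.01)

print(round(weak_gain, 4), round(strong_gain, 4))
```

Under those assumptions, the gain against the weak group is roughly 80 times larger than the gain against the strong group.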
posted by flug at 9:44 PM on June 7, 2019 [1 favorite]
> an interesting strategy for bots that want to be at the top of the heap
Related to this, instead of trying to beat the bots at the top of the heap, the more effective strategy might be to just revert to random responses when you meet a really strong opponent--that way you should win about 50% of matches on average, no matter how strong the opponent is.
If you could beat the weakest 80% of bots 100% of the time, and the strongest 20% of bots half the time, you would have a win rate of 90%. And that would be pretty impressive.
(It's also not totally realistic, as a lot of bots have enough randomness built into them that it's just not possible to beat them 100% of the time.)
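One way the "revert to random" fallback could look, as a minimal Python sketch. Everything here is an assumption for illustration: the function name `choose_move`, the `min_rounds` and `threshold` parameters, and the idea of using the running score (wins minus losses) as the signal that the opponent is too strong.

```python
import random

MOVES = ["R", "P", "S"]


def choose_move(predicted_move, my_score, rounds_played,
                min_rounds=50, threshold=-0.05, rng=random):
    """Play the predictor's move unless we are clearly losing;
    in that case, play uniformly at random.

    my_score is wins minus losses so far in this match.
    """
    if rounds_played >= min_rounds and my_score / rounds_played < threshold:
        # The opponent seems to be out-predicting us, so stop being predictable.
        return rng.choice(MOVES)
    return predicted_move
```

For example, `choose_move("P", 10, 100)` keeps the predictor's move, while `choose_move("P", -20, 100)` returns a random move. The cutoff values are arbitrary and would need tuning against real opponents.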
On a related tangent, part of the fun and the strategy of human games like RPS and, for example, poker, is trying to get a read on your opponent(s) and adjusting your strategy appropriately. For example, you're going to play very differently if you think your opponent is a complete schlub, vs if you think they are one of the top-ranked players in the world.
So it could become quite interesting if you had a competition structure where some information about your opponent, and also perhaps yourself, was available. Just for example, if you knew both your and your opponent's current ranking. Or perhaps just your opponent's ranking, or perhaps just yours. Or maybe you know the results of your/their last 10 rounds. Or perhaps you have access to some of that type of information but it is somehow noisy or not quite 100% reliable.
Or, just for example, if you were running a head-to-head tournament type situation, if the bots knew what their current standing in the tournament was, before each match, they could use that info to develop strategy.
Another interesting idea would be to have a negotiating stage where bots could agree to exchange information with each other, perhaps over a few rounds of increasingly reliable information. So you could try to infer things about the other bot both from the information you gather about them and also when they decide to stop sharing.
Lastly, even the setup at RPScontest.com has some of this built in. With matches 1000 rounds in length, you can spend quite a bit of that testing your opponent and adjusting your strategy. It would be a lot different if matches were just 10 rounds, or 100.
Or, if they were 10,000 or 50,000.
You can find out quite a bit about your opponent over the course of hundreds of rounds--but even more over thousands.
posted by flug at 10:16 PM on June 7, 2019 [1 favorite]
This thread has been archived and is closed to new comments
posted by Greg_Ace at 4:03 PM on June 1, 2019 [3 favorites]