In the models of Young (1993, Econometrica 61, 57–84; 1993, J. Econ. Theory 59, 145–168), boundedly rational individuals are recurrently matched to play a game, and they play myopic best replies to the recent history of play. It could therefore be advantageous to play instead a myopic best reply to the myopic best reply, something boundedly rational players might conceivably also do. We investigate this possibility in the context of Young's (1993, J. Econ. Theory 59, 145–168) bargaining model. It turns out that “cleverness” in this respect does indeed confer an advantage in some cases. However, if all individuals are equally informed about past play, in a statistical sense, then the Nash bargaining solution remains the unique long-run outcome as the mutation rate goes to zero. Journal of Economic Literature Classification Numbers: C70, C78.