used to measure "the overall strength" of an internal rule of an agent (Holland, 1995, p. 65). The underlying idea is that not all stimulus-response rules are equally effective in achieving agents' goals or in guaranteeing their survival. For example, if we think in terms of an agent's survival, an "IF I see the cliff, THEN go straight" rule is less fit than one whose effector says, "THEN turn 180 degrees."
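To make the notion of rule strength concrete, the following minimal Python sketch (not drawn from the source; names such as Rule and respond are illustrative assumptions) represents stimulus-response rules as condition-action pairs carrying a strength value, with the cliff example above encoded as two competing rules.

```python
# Minimal sketch of a condition-action rule with a strength value, in the
# spirit of Holland's classifier systems. The names and values here are
# illustrative assumptions, not the paper's implementation.
from dataclasses import dataclass

@dataclass
class Rule:
    condition: str   # the detector, e.g. "see_cliff"
    action: str      # the effector, e.g. "turn_180"
    strength: float  # credit accumulated from past payoffs

    def matches(self, stimulus: str) -> bool:
        return self.condition == stimulus

# Two competing responses to the same stimulus; the fitter rule carries
# the higher strength and is therefore the one selected.
rules = [
    Rule("see_cliff", "go_straight", strength=0.1),
    Rule("see_cliff", "turn_180", strength=0.9),
]

def respond(stimulus: str) -> str:
    candidates = [r for r in rules if r.matches(stimulus)]
    return max(candidates, key=lambda r: r.strength).action

print(respond("see_cliff"))  # -> "turn_180"
```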
In order to understand which rules are fit, an adaptive agent assigns credit to each rule based on the payoff received after the rule has been executed. However, if there is no direct reward after the execution of a rule, the agent faces a dilemma regarding credit assignment. If my rival increases defense spending, should I also increase mine in the eventuality of war? If the war never occurs, how do I measure the payoff of rearming vis-à-vis not rearming, or rearming just a little? This is what is called a credit assignment problem. An agent has to be able to assign credit to rules that (1) have no direct payoff, and (2) are part of a chain of actions that is not yet concluded.
In the first case, the problem can be addressed through a mechanism that in mathematics is described by bucket brigade algorithms. This mechanism serves to strengthen the credit of rules that belong to a chain of actions that ends with a good payoff (Holland, 1995, p. 56). For instance, the rule that says "preemptively arm" would be rewarded with a high payoff only if it is part of the chain of actions that has led to the overall survival of the agent. In the second case, since the chain of actions is not yet concluded, an agent has to be able to think about and simulate the future. As Holland suggests, it can do so by building internal models and discovering new rules.
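Before turning to rule discovery, the credit-flow idea behind the bucket brigade can be illustrated with a brief sketch. This is an assumption-laden simplification, not the paper's model: each rule in an executed chain pays a fraction of its strength to its predecessor, the final rule collects the environmental payoff (e.g. survival), and over repeated episodes credit propagates backward to early rules such as "preemptively arm."

```python
# Hedged sketch of a bucket-brigade-style credit update. The rule names,
# bid fraction, and payoff values are illustrative assumptions.
def bucket_brigade(strengths, chain, payoff, bid=0.1):
    """strengths: dict rule -> strength; chain: ordered list of rules fired."""
    for i, rule in enumerate(chain):
        payment = bid * strengths[rule]
        strengths[rule] -= payment              # the rule "bids" part of its strength
        if i > 0:
            strengths[chain[i - 1]] += payment  # predecessor is paid for setting the stage
    strengths[chain[-1]] += payoff              # last rule collects the external payoff
    return strengths

strengths = {"preemptively_arm": 1.0, "deter_rival": 1.0, "survive_crisis": 1.0}
for _ in range(50):  # repeated episodes let credit flow back along the chain
    strengths = bucket_brigade(
        strengths, ["preemptively_arm", "deter_rival", "survive_crisis"], payoff=1.0
    )
print(strengths)  # early rules now share in the final reward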
Rule discovery creates new internal rules through a process that mimics evolution. This mechanism is well represented by Holland's genetic algorithms, which combine and cross over rules of behavior to create plausible and fit new rules (Holland, 1995, p. 57). As in chromosomal crossover, genetic algorithms select pairs of high-credit rules and then cross them over, with elements of randomness, to create an offspring rule. In turn, the child rule is evaluated and, if considered strong, crossed over with another rule. Over time, this procedure fosters the selection of new rules of behavior and strategies with high potential.
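A short sketch can make the crossover-and-mutation step concrete. The bit-string encoding of rules and the parameter values below are assumptions introduced for illustration only, loosely following the genetic-algorithm logic described above.

```python
# Hedged sketch of rule discovery via crossover and mutation, in the spirit
# of Holland's genetic algorithms; encodings and rates are assumptions.
import random

def crossover(parent_a: str, parent_b: str) -> str:
    """Combine two high-credit rules (encoded as bit strings) at a random point."""
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(rule: str, rate: float = 0.05) -> str:
    """Flip bits with small probability to inject the 'elements of randomness'."""
    return "".join(
        ("1" if bit == "0" else "0") if random.random() < rate else bit
        for bit in rule
    )

# Two high-credit parent rules produce an offspring rule; if the child proves
# strong when evaluated, it re-enters the pool and may be crossed over again.
parent_a = "1101001010"   # e.g. encodes "IF rival arms THEN arm"
parent_b = "1100110011"   # e.g. encodes "IF rival arms THEN negotiate"
child = mutate(crossover(parent_a, parent_b))
print(child)
```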
Explaining in detail how to program an adaptive agent is beyond the scope of this paper. The reader will find in the bibliography useful sources that exhaustively explain how to program an ABM with properties of credit assignment and rule discovery. For the purposes of this study, what is of interest are the underlying rules of the process of adaptation, which, if contextualized to IR, can provide novel insights and new understandings. In particular, the concept of adaptation is fruitful for rethinking agents' behavior in international politics.
Regarding behavior, one common assumption of positivist IR theory is that humans, and therefore states, are to some extent rational; if not fully, at least in bounded terms. Rationality is in this sense the higher-order rule of struc-