PyACTUp version 1.1.2

Introduction

PyACTUp is a lightweight Python implementation of a subset of the ACT-R [1] cognitive architecture’s Declarative Memory, suitable for incorporating into other Python models and applications. It is inspired by the ACT-UP [2] cognitive modeling toolbox.

Typically PyACTUp is used by creating an experimental framework, or connecting to an existing experiment, in the Python programming language, using one or more PyACTUp Memory objects. The framework or experiment asks these Memory objects to add chunks to themselves, describing things learned, and retrieves these chunks or values derived from them at later times. A chunk, a learned item, contains one or more slots or attributes, describing what is learned. Retrievals are driven by matching on the values of these attributes. Each Memory object also has a notion of time, a non-negative, real number that is advanced as the Memory object is used. Time in PyACTUp is a dimensionless quantity whose interpretation depends upon the model or system in which PyACTUp is being used. Note that the likelihood of retrievals does not depend upon the actual scale of time used, only on the ratios of the various values. There are also several parameters controlling these retrievals that can be configured in a Memory object, and detailed information can be extracted from it describing the process it uses in making these retrievals. The frameworks or experiments may be strictly algorithmic, may interact with human subjects, or may be embedded in web sites.
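
A minimal sketch of this usage pattern, using only operations documented in the API Reference below, looks something like the following; the slot names are purely illustrative.

import pyactup

m = pyactup.Memory()              # a declarative memory with default parameters
m.learn(color="red", size=4)      # add (or reinforce) a chunk; time advances by 1
m.learn(color="blue", size=4)
chunk = m.retrieve(color="red")   # the best matching chunk, or None
if chunk is not None:
    print(chunk["size"])          # slot values are read with subscript notation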

PyACTUp is a library, or module, of Python code, useful for creating Python programs; it is not a stand alone application. Some knowledge of Python programming is essential for using it.

PyACTUp is an ongoing project, and implements only a subset of ACT-R’s Declarative Memory. As it evolves it is likely that more of the ACT-R architecture will be incorporated into it. Such future additions may also change some of the APIs exposed today, and some work may be required to upgrade projects using the current version of PyACTUp to a later version.

[1] J. R. Anderson, D. Bothell, M. D. Byrne, S. Douglass, C. Lebiere, & Y. Qin (2004). An integrated theory of the mind. Psychological Review, 111(4), 1036-1060.

[2] D. Reitter and C. Lebiere (2010). Accountable Modeling in ACT-UP, a Scalable, Rapid-Prototyping ACT-R Implementation. In Proceedings of the 10th International Conference on Cognitive Modeling (ICCM).

Installing PyACTUp

PyACTUp requires Python version 3.7 or later. Recent versions of Mac OS X and recent Linux distributions are likely to have a suitable version of Python pre-installed, but it may need to be invoked as python3 instead of just python, since the latter often runs a 2.x version of Python. Use of a virtual environment, which is recommended, often obviates the need for the python3/python distinction. If it is not already installed, Python for Windows, Mac OS X, Linux, or other Unices can be downloaded, free of charge, from python.org.

PyACTUp also works in recent versions of PyPy, an alternative implementation to the usual CPython. PyPy uses a just-in-time (JIT) compiler, which is a good match for PyACTUp, and PyACTUp models often run five times faster in PyPy compared to CPython.

Note that PyACTUp is simply a Python module, a library, that is run as part of a larger Python program. To build and run models using PyACTUp you do need to do some Python programming. If you’re new to Python, a good place to start learning it is The Python Tutorial. To write and run a Python program you need to create and edit Python source files, and then run them. If you are comfortable using the command line, you can simply create and edit the files in your favorite text editor, and run them from the command line. Many folks, though, are happier using a graphical Integrated Development Environment (IDE). Many Python IDEs are available. One is IDLE, which comes packaged with Python itself, so if you installed Python you should have it available.

Normally, assuming you are connected to the internet, to install PyACTUp you should simply have to type at the command line

pip install pyactup

Depending upon how Python and your machine are configured you may have to modify the above command in one or more ways:

  • you may need to ensure your virtual environment is activated

  • you may need to use an alternative scheme your Python IDE supports

  • you may need to call it pip3 instead of simply pip

  • you may need to precede the call to pip by sudo

  • you may need to use some combination of the above

If you are unable to install PyACTUp as above, you can instead download a tarball. The tarball will have a filename something like pyactup-1.1.2.tar.gz. Assuming this file is at /some/directory/pyactup-1.1.2.tar.gz install it by typing at the command line

pip install /some/directory/pyactup-1.1.2.tar.gz

Alternatively you can untar the tarball with

tar -xf /some/directory/pyactup-1.1.2.tar.gz

and then change to the resulting directory and type

python setup.py install

Mailing List

There is a mailing list for those interested in PyACTUp and its development.

Background

Activation

A fundamental part of retrieving a chunk from a Memory object is computing the activation of that chunk, a real number describing how likely it is to be recalled, based on how frequently and recently it has been added to the Memory, and how well it matches the specifications of what is to be retrieved.

The activation, A_{i} of chunk i is a sum of three components,

A_{i} = B_{i} + \epsilon_{i} + P_{i}

the base level activation, the activation noise, and the partial matching correction.

Base level activation

The base level activation, B_{i}, describes the frequency and recency of the chunk i, and depends upon the decay parameter of Memory, d. In the normal case, when the Memory’s optimized_learning parameter is False, the base level activation is computed using the amount of time that has elapsed since each of the past appearances of i, which in the following are denoted as the various t_{ij}.

B_{i} = \ln(\sum_{j} t_{ij}^{-d})

If the Memory’s optimized_learning parameter is True an approximation is used instead, often less taxing of computational resources. It is particularly useful if the same chunks are expected to be seen many times, and assumes that repeated experiences of the various chunks are distributed roughly evenly over time. Instead of using the times of all the past occurrences of i, it uses L, the amount of time since the first appearance of i, and n, a count of the number of times i has appeared.

B_{i} = \ln(\frac{n}{1 - d}) - d \ln(L)

Note that setting the decay parameter to None disables the computation of base level activation. That is, the base level component of the total activation is zero in this case.
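
To make these two formulas concrete, here is a small illustrative computation of B_{i} (not PyACTUp’s own code) for a chunk that has been reinforced at several past times, with and without the optimized-learning approximation.

import math

d = 0.5                          # the decay parameter
now = 10.0                       # the current time
occurrences = [1.0, 4.0, 9.0]    # times at which the chunk was learned or reinforced

# Normal case: sum over the elapsed times t_ij since each occurrence.
ages = [now - t for t in occurrences]
base_level = math.log(sum(age ** -d for age in ages))

# Optimized learning: use only n, the number of occurrences, and L, the
# time elapsed since the first occurrence.
n = len(occurrences)
L = now - occurrences[0]
approximation = math.log(n / (1 - d)) - d * math.log(L)

print(base_level, approximation)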

Activation noise

The activation noise, \epsilon_{i}, implements the stochasticity of retrievals from Memory. It is sampled from a logistic distribution centered on zero. A Memory object has a scale parameter, noise, for this distribution. It is normally resampled each time the activation is computed.

For some esoteric purposes when a chunk’s activation is computed repeatedly at the same time it may be desired to have all these same-time activations of a chunk use the same sample of activation noise. While this is rarely needed, when it is the fixed_noise context manager can be used.

Note that setting the noise parameter to zero results in supplying no noise to the activation. This does not quite make operation of PyACTUp deterministic, since retrievals of chunks with the same activations are resolved randomly.
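
For illustration only, a zero-centered logistic sample with scale s can be drawn by inverse transform sampling; this sketch shows the distribution involved, not PyACTUp’s internal implementation.

import math
import random

def logistic_noise(scale):
    # Inverse-CDF sample from a logistic distribution centered on zero.
    u = random.uniform(1e-12, 1 - 1e-12)
    return scale * math.log(u / (1 - u))

epsilon = logistic_noise(0.25)   # 0.25 is the default value of the noise parameter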

Partial Matching

If the Memory’s mismatch parameter is None, the partial matching correction, P_{i}, is zero. Setting the parameter to None is equivalent to setting it to ∞, ensuring that only chunks exactly matching the retrieval specification are considered. Otherwise P_{i} depends upon the similarities of the attributes of the chunk to those being sought in the retrieval and upon the value of the mismatch parameter. When considering chunks in partial retrievals or blending operations, attributes for which no similarity function has been defined are treated as exact matches; chunks not matching these attributes are not included in the partial retrieval or blending operation.

PyACTUp normally uses a “natural” representation of similarities, where two values being completely similar, identical, has a value of one; and being completely dissimilar has a value of zero; with various other degrees of similarity being positive, real numbers less than one. Traditionally ACT-R instead uses a range of similarities with the most dissimilar being a negative number, usually -1, and completely similar being zero. If preferred, PyACTUp can be configured to use these ACT-R-style similarities by calling the function use_actr_similarity with an argument of True, resulting in the computations below being appropriately offset.

The function set_similarity_function defines how the similarity of values is computed for a particular attribute.

If the mismatch parameter has real value \mu and the similarity of slot k of i to the desired value of that slot in the retrieval is S_{ik}, the partial matching correction is

P_{i} = \mu \sum_{k} (S_{ik} - 1)

The value of \mu is normally positive, so P_{i} is normally negative, and increasing dissimilarities reduce the total activation, scaled by the value of \mu.
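
As a small numeric illustration of this formula (not PyACTUp’s own code), using the “natural” similarity representation:

mismatch = 1.0                                # the mismatch parameter, mu
similarities = {"size": 0.8, "color": 1.0}    # illustrative S_ik values for the matched slots

# P_i = mu * sum_k (S_ik - 1): identical values contribute nothing,
# while dissimilar ones lower the activation.
partial_matching_correction = mismatch * sum(s - 1 for s in similarities.values())
print(partial_matching_correction)            # -0.2 for these values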

Blending

Besides retrieving an existing chunk, it is possible to retrieve an attribute value not present in any instance, a weighted average, or blend, of the corresponding attribute values present in a set of existing chunks meeting some criteria. Currently only real valued attributes can be blended.

A parameter, the temperature, or \tau, is used in constructing the blended value. In PyACTUp the value of this parameter is by default the noise parameter used for activation noise, multiplied by \sqrt{2}. However it can be set independently of the noise, if preferred.

If m is the set of chunks matching the criteria, and, for i \in m, the activation of chunk i is a_{i}, we define a weight, w_{i}, for the contribution i makes to the blended value

w_{i} = e^{a_{i} / \tau}

If s_{i} is the value of the slot or attribute of chunk i to be blended over, the blended value, BV, is then

BV =\, \sum_{i \in m}{\, \frac{w_{i}}{\sum_{j \in m}{w_{j}}} \; s_{i}}
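
The following small computation illustrates these formulas; it is not PyACTUp’s internal code, and the activations and slot values are made up for the example.

import math

temperature = 0.25 * math.sqrt(2)        # the default: the noise parameter times sqrt(2)

# (activation, slot value) pairs for the chunks matching the criteria
chunks = [(0.30, 1.0), (-0.10, 3.0), (0.05, 2.0)]

weights = [math.exp(a / temperature) for a, _ in chunks]
total = sum(weights)
blended_value = sum((w / total) * s for w, (_, s) in zip(weights, chunks))
print(blended_value)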

API Reference

class pyactup.Memory(noise=0.25, decay=0.5, temperature=None, threshold=-10.0, mismatch=None, learning_time_increment=1, retrieval_time_increment=0, optimized_learning=False)

A cognitive entity containing a collection of learned things, its chunks. A Memory object also contains a current time, which can be queried as the time property.

The number of distinct chunks a Memory contains can be determined with Python’s usual len() function.

A Memory has several parameters controlling its behavior: noise, decay, temperature, threshold, mismatch, learning_time_increment, retrieval_time_increment and optimized_learning. All can be queried, and most set, as properties on the Memory object. When creating a Memory object their initial values can be supplied as parameters.

A Memory object can be serialized with pickle, allowing Memory objects to be saved to and restored from persistent storage.

If, when creating a Memory object, any of noise, decay or mismatch are negative, or if temperature is less than 0.01, a ValueError is raised.

learn(advance=None, **kwargs)

Adds, or reinforces, a chunk in this Memory with the attributes specified by kwargs. The attributes, or slots, of a chunk are described using Python keyword arguments. The attribute names must conform to the usual Python variable name syntax, and may be neither Python keywords nor the names of optional arguments to learn(), retrieve() or blend(): partial, rehearse or advance. Their values must be Hashable.

Returns True if a new chunk has been created, and False if instead an already existing chunk has been re-experienced and thus reinforced.

After learning the relevant chunk, advance() is called with an argument of advance. If advance has not been supplied it defaults to the current value of learning_time_increment; unless this has been changed by the programmer this default value is 1. If this advance is zero then time must be advanced by the programmer with advance() following any calls to learn before calling retrieve() or blend(). Otherwise the chunk learned at this time would have infinite activation.

Raises a TypeError if an attempt is made to learn an attribute value that is not Hashable.

>>> m = Memory()
>>> m.learn(color="red", size=4)
True
>>> m.learn(color="blue", size=4)
True
>>> m.learn(color="red", size=4)
False
>>> m.retrieve(color="red")
<Chunk 0000 {'color': 'red', 'size': 4}>
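
If, as described above, learn() is called with a zero advance, time must be advanced explicitly before retrieving or blending; a minimal sketch:

m = Memory()
m.learn(color="red", size=4, advance=0)   # learn without advancing time
m.advance()                               # advance time, here by the default amount, 1
chunk = m.retrieve(color="red")           # now safe to retrieve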

retrieve(partial=False, rehearse=False, advance=None, **kwargs)

Returns the chunk matching the kwargs that has the highest activation greater than this Memory’s threshold. If there is no such matching chunk returns None. Normally only retrieves chunks exactly matching the kwargs; if partial is True it also retrieves those only approximately matching, using similarity (see set_similarity_function()) and mismatch to determine closeness of match.

Before performing the retrieval advance() is called with the value of advance as its argument. If advance is not supplied the current value of retrieval_time_increment is used; unless changed by the programmer this default value is zero. The advance of time does not occur if an error is raised when attempting to perform the retrieval.

If rehearse is supplied and true it also reinforces this chunk at the current time. No chunk is reinforced if retrieve returns None.

The returned chunk is a dictionary-like object, and its attributes can be extracted with Python’s usual subscript notation.

>>> m = Memory()
>>> m.learn(widget="thromdibulator", color="red", size=2)
True
>>> m.learn(widget="snackleizer", color="blue", size=1)
True
>>> m.retrieve(color="blue")["widget"]
'snackleizer'
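
A sketch of a partial retrieval, combining a similarity function (see set_similarity_function()) with the mismatch parameter; the particular similarity function and values here are only illustrative.

import pyactup

def size_similarity(x, y):
    # sizes close together are nearly identical, far apart nearly dissimilar
    return 1 - abs(x - y) / 10

pyactup.set_similarity_function(size_similarity, "size")

m = pyactup.Memory(mismatch=1.0)
m.learn(widget="snackleizer", size=1)
m.learn(widget="thromdibulator", size=9)
chunk = m.retrieve(partial=True, size=3)  # probably the size 1 chunk, though retrieval is stochastic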

blend(outcome_attribute, advance=None, **kwargs)

Returns a blended value for the given attribute of those chunks matching kwargs and containing outcome_attribute. Returns None if there are no matching chunks that contain outcome_attribute. If any matching chunk has a value of outcome_attribute that is not a real number a TypeError is raised.

Before performing the blending operation advance() is called with the value of advance as its argument. If advance is not supplied the current value of retrieval_time_increment is used; unless changed by the programmer this default value is zero. The advance of time does not occur if an error is raised when attempting to perform the blending operation.

>>> m = Memory()
>>> m.learn(color="red", size=2)
True
>>> m.learn(color="blue", size=30)
True
>>> m.learn(color="red", size=1)
True
>>> m.blend("size", color="red")
1.1548387620911693

best_blend(outcome_attribute, iterable, select_attribute=None, advance=None, minimize=False)

Returns two values (as a 2-tuple) describing the extreme blended value of outcome_attribute over the values provided by iterable. The extreme value is normally the maximum, but can be made the minimum by setting minimize to True. The values returned by iterable should be dictionary-like objects that can be passed as the kwargs argument to blend(). The first return value is the kwargs value producing the best blended value, and the second is that blended value. If there is a tie, with two or more kwargs values all producing the same, best blended value, then one of them is chosen randomly. If none of the values from iterable result in blended values of outcome_attribute then both return values are None.

This operation is particularly useful for building Instance Based Learning models.

For the common case where iterable iterates over only the values of a single slot the select_attribute parameter may be used to simplify the iteration. If select_attribute is supplied and is not None then iterable should produce values of that slot instead of dictionary-like objects. Similarly the first return value will be the slot value rather than a dictionary-like object. The end of the example below demonstrates this.

Before performing the blending operations advance() is called, only once, with the value of advance as its argument. If advance is not supplied the current value of retrieval_time_increment is used; unless changed by the programmer this default value is zero. The advance of time does not occur if an error is raised when attempting to perform a blending operation.

>>> m = Memory()
>>> m.learn(color="red", utility=1)
True
>>> m.learn(color="blue", utility=2)
True
>>> m.learn(color="red", utility=1.8)
True
>>> m.learn(color="blue", utility=0.9)
True
>>> m.best_blend("utility", ({"color": c} for c in ("red", "blue")))
({'color': 'blue'}, 1.4868)
>>> m.learn(color="blue", utility=-1)
True
>>> m.best_blend("utility", ("red", "blue"), "color")
('red', 1.2127)

reset(preserve_prepopulated=False, optimized_learning=None)

Deletes the Memory’s chunks and resets its time to zero. If preserve_prepopulated is false it deletes all chunks; if it is true it deletes all chunk references later than time zero, completely deleting those chunks that were created at time greater than zero. If optimized_learning is not None it sets the Memory’s optimized_learning parameter; otherwise it leaves it unchanged. This Memory’s noise, decay, temperature, threshold and mismatch parameters are left unchanged.

time

This Memory’s current time. Time in PyACTUp is a dimensionless quantity, the interpretation of which is at the discretion of the modeler.

advance(amount=1)

Adds the given amount to this Memory’s time, and returns the new, current time. Raises a ValueError if amount is negative, or not a real number.

noise

The amount of noise to add during chunk activation computation. This is typically a positive, floating point, number between about 0.1 and 1.5. It defaults to 0.25. If zero, no noise is added during activation computation. If an explicit temperature is not set, the value of noise is also used to compute a default temperature for blending computations. Attempting to set noise to a negative number raises a ValueError.

decay

Controls the rate at which the activations of chunks previously added to memory decay with the passage of time. Time in this sense is dimensionless. The decay is typically between about 0.1 and 2.0. The default value is 0.5. If zero, memory does not decay. If set to None no base level activation is computed or used; note that this is significantly different from setting it to zero, which causes base level activation to still be computed and used, but with no decay. Attempting to set it to a negative number raises a ValueError. It must be less than 1 if this memory’s optimized_learning parameter is set.

temperature

The temperature parameter used for blending values. If None, the default, the square root of 2 times the value of noise will be used. If the temperature is negative or too close to zero, which can also happen implicitly if it is None and the noise is too low, a ValueError is raised.

mismatch

The mismatch penalty applied to partially matching values when computing activations. If None no partial matching is done. Otherwise any defined similarity functions (see set_similarity_function()) are called as necessary, and the resulting values are multiplied by the mismatch penalty and subtracted from the activation.

Attributes for which no similarity function has been defined are always compared exactly, and chunks not matching on these attributes are not included at all in the corresponding partial retrievals or blending operations.

Attempting to set this parameter to a value other than None or a real number raises a ValueError.

threshold

The minimum activation value required for a retrieval. If None there is no minimum activation required. The default value is -10. Attempting to set the threshold to a value that is neither None nor a real number raises a ValueError.

While the likelihoods of retrieval are normally scale free, depending not upon the magnitudes of the times involved but only upon their ratios, the threshold is sensitive to the actual magnitude of time. Suitable care should be exercised when adjusting it.

learning_time_increment

The default amount of time to advance() by after performing a learn operation. By default this is 1. Attempting to set this to a negative value raises a ValueError.

retrieval_time_increment

The default amount of time to advance() by before performing a retrieval or blending operation. By default this is zero. Attempting to set this to a negative value raises a ValueError.

activation_history

A MutableSequence, typically a list, into which details of the computations underlying PyACTUp operation are appended. If None, the default, no such details are collected. In addition to activation computations, the resulting retrieval probabilities are also collected for blending operations. The details collected are presented as dictionaries. The references entries in these dictionaries are sequences of the times at which the corresponding chunks were learned if optimized_learning is off, and otherwise are counts of the number of times they have been learned.

If PyACTUp is being used in a loop, the details collected will likely become voluminous. It is usually best to clear them frequently, such as on each iteration.

Attempting to set activation_history to anything but None or a MutableSequence raises a ValueError.

>>> m = Memory()
>>> m.learn(color="red", size=3)
True
>>> m.learn(color="red", size=5)
True
>>> m.activation_history = []
>>> m.blend("size", color="red")
4.027391084562462
>>> pprint(m.activation_history, sort_dicts=False)
[{'name': '0000',
  'creation_time': 0,
  'attributes': (('color', 'red'), ('size', 3)),
  'references': (0,),
  'base_activation': -0.3465735902799726,
  'activation_noise': 0.4750912862904178,
  'activation': 0.12851769601044521,
  'retrieval_probability': 0.48630445771876907},
 {'name': '0001',
  'creation_time': 1,
  'attributes': (('color', 'red'), ('size', 5)),
  'references': (1,),
  'base_activation': 0.0,
  'activation_noise': 0.14789096368864968,
  'activation': 0.14789096368864968,
  'retrieval_probability': 0.5136955422812309}]

optimized_learning

A boolean indicating whether or not this Memory is configured to use optimized learning. Cannot be set directly, but can be changed when calling reset().

Warning

Care should be taken when using optimized learning, as operations such as retrieve that depend upon activation will no longer raise an exception if they are called when advance has not been called after learn, possibly producing biologically implausible results.

current_time

A context manager used to allow reverting to the current time after advancing it and simulating retrievals or similar operations in the future.

Warning

It is rarely appropriate to use current_time. When it is used, care should be taken to avoid creating biologically implausible models. Also, learning within a current_time context will typically lead to tears as having chunks created or reinforced in the future results in failures of attempts to retrieve them.

>>> m = Memory(temperature=1, noise=0)
>>> m.learn(size=1)
True
>>> m.advance(10)
11
>>> m.learn(size=10)
True
>>> m.blend("size")
7.983916860341838
>>> with m.current_time as t:
...     m.advance(10_000)
...     m.blend("size")
...     (t, m.time)
...
10012
5.501236696240907
(12, 10012)
>>> m.time
12

fixed_noise

A context manager used to force multiple activations of a given chunk at the same time to use the same activation noise.

Warning

Use of fixed_noise is rarely appropriate, and easily leads to biologically implausible results. It is provided only for esoteric purposes. When its use is required it should be wrapped around the smallest fragment of code practical.

>>> m = Memory()
>>> m.learn(color="red")
True
>>> m.activation_history = []
>>> m.retrieve()
<Chunk 0000 {'color': 'red'}>
>>> m.retrieve()
<Chunk 0000 {'color': 'red'}>
>>> pprint(m.activation_history, sort_dicts=False)
[{'name': '0000',
  'creation_time': 0,
  'attributes': (('color', 'red'),),
  'references': (0,),
  'base_activation': 0.0,
  'activation_noise': 0.07779212346913301,
  'activation': 0.07779212346913301},
 {'name': '0000',
  'creation_time': 0,
  'attributes': (('color', 'red'),),
  'references': (0,),
  'base_activation': 0.0,
  'activation_noise': -0.015345110792246082,
  'activation': -0.015345110792246082}]
>>> m.activation_history = []
>>> with m.fixed_noise:
...     m.retrieve()
...     m.retrieve()
...
<Chunk 0000 {'color': 'red'}>
<Chunk 0000 {'color': 'red'}>
>>> pprint(m.activation_history, sort_dicts=False)
[{'name': '0000',
  'creation_time': 0,
  'attributes': (('color', 'red'),),
  'references': (0,),
  'base_activation': 0.0,
  'activation_noise': 0.8614281690342627,
  'activation': 0.8614281690342627},
 {'name': '0000',
  'creation_time': 0,
  'attributes': (('color', 'red'),),
  'references': (0,),
  'base_activation': 0.0,
  'activation_noise': 0.8614281690342627,
  'activation': 0.8614281690342627}]

forget(when, **kwargs)

Undoes the operation of a previous call to learn().

Warning

Normally this method should not be used. It does not correspond to a biologically plausible process, and is only provided for esoteric purposes.

The kwargs should be those supplied for the learn() operation to be undone, and when should be the time that was current when that operation was performed. Returns True if it successfully undoes such an operation, and False otherwise.
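
A minimal sketch of undoing a single learn() operation; the attribute values are only illustrative.

m = Memory()
m.learn(color="red", size=4)      # learned at time 0; time then advances to 1
m.forget(0, color="red", size=4)  # undo it: when=0 and the same attributes; returns True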

pyactup.set_similarity_function(function, *slots)

Assigns a similarity function to be used when comparing attribute values with the given names. The function should take two arguments, and return a real number between 0 and 1, inclusive. The function should be commutative; that is, if called with the same arguments in the reverse order, it should return the same value. It should also be stateless, always returning the same values if passed the same arguments. No error is raised if either of these constraints is violated, but the results will, in most cases, be meaningless if they are.

If True is supplied as the function a default similarity function is used that returns one if its two arguments are == and zero otherwise.

>>> def f(x, y):
...     if y < x:
...         return f(y, x)
...     return 1 - (y - x) / y
>>> set_similarity_function(f, "length", "width")

pyactup.use_actr_similarity(value=None)

Whether to use “natural” similarity values, or traditional ACT-R ones. PyACTUp normally uses a “natural” representation of similarities, where two values being completely similar, identical, has a value of one; and being completely dissimilar has a value of zero; with various other degrees of similarity being positive, real numbers less than one. Traditionally ACT-R instead uses a range of similarities with the most dissimilar being a negative number, usually -1, and completely similar being zero.

If the argument is True or False it turns the traditional ACT-R behavior on or off, respectively, and returns it. With no argument it returns the current value.

Examples

Rock, paper, scissors

This is an example of using PyACTUp to model the Rock, Paper, Scissors game. Both players are modeled, and attempt to choose their moves based on their expectations of the move that will be made by their opponent. The two players differ in how much of the prior history they consider in creating their expectations.

# Rock, paper, scissors example using pyactup

import pyactup
import random

DEFAULT_ROUNDS = 100
MOVES = ["paper", "rock", "scissors"]
N_MOVES = len(MOVES)

m = pyactup.Memory(noise=0.1)

def defeat_expectation(**kwargs):
    # Generate an expectation matching the supplied conditions and play the move that defeats it.
    # If no expectation can be generated, choose a move randomly.
    expectation = (m.retrieve(**kwargs) or {}).get("move")
    if expectation:
        return MOVES[(MOVES.index(expectation) - 1) % N_MOVES]
    else:
        return random.choice(MOVES)

def safe_element(list, i):
    try:
        return list[i]
    except IndexError:
        return None

def main(rounds=DEFAULT_ROUNDS):
    # Plays multiple rounds of r/p/s of a lag 1 player (player1) versus a
    # lag 2 player (player2).
    plays1 = []
    plays2 = []
    score = 0
    for r in range(rounds):
        move1 = defeat_expectation(player="player2",
                                   ultimate=safe_element(plays2, -1))
        move2 = defeat_expectation(player="player1",
                                   ultimate=safe_element(plays1, -1),
                                   penultimate=safe_element(plays1, -2))
        winner = (MOVES.index(move2) - MOVES.index(move1) + N_MOVES) % N_MOVES
        score += -1 if winner == 2 else winner
        print("Round {:3d}\tPlayer 1: {:8s}\tPlayer 2: {:8s}\tWinner: {}\tScore: {:4d}".format(
            r, move1, move2, winner, score))
        m.learn(player="player1",
                ultimate=safe_element(plays1, -1),
                penultimate=safe_element(plays1, -2),
                move=move1)
        m.learn(player="player2", ultimate=safe_element(plays2, -1), move=move2)
        plays1.append(move1)
        plays2.append(move2)
        m.advance()


if __name__ == '__main__':
    main()

Here’s the result of running it once. Because the model is stochastic, if you run it yourself the results will be different.

$ python rps.py
Round   0   Player 1: rock          Player 2: scissors      Winner: 1       Score:    1
Round   1   Player 1: rock          Player 2: scissors      Winner: 1       Score:    2
Round   2   Player 1: rock          Player 2: rock          Winner: 0       Score:    2
Round   3   Player 1: scissors      Player 2: paper         Winner: 1       Score:    3
Round   4   Player 1: rock          Player 2: scissors      Winner: 1       Score:    4
Round   5   Player 1: paper         Player 2: paper         Winner: 0       Score:    4
Round   6   Player 1: rock          Player 2: scissors      Winner: 1       Score:    5
Round   7   Player 1: scissors      Player 2: paper         Winner: 1       Score:    6
Round   8   Player 1: rock          Player 2: paper         Winner: 2       Score:    5
Round   9   Player 1: rock          Player 2: scissors      Winner: 1       Score:    6
Round  10   Player 1: scissors      Player 2: rock          Winner: 2       Score:    5
Round  11   Player 1: scissors      Player 2: paper         Winner: 1       Score:    6
Round  12   Player 1: rock          Player 2: scissors      Winner: 1       Score:    7
Round  13   Player 1: paper         Player 2: paper         Winner: 0       Score:    7
Round  14   Player 1: rock          Player 2: paper         Winner: 2       Score:    6
Round  15   Player 1: scissors      Player 2: rock          Winner: 2       Score:    5
Round  16   Player 1: scissors      Player 2: paper         Winner: 1       Score:    6
Round  17   Player 1: rock          Player 2: paper         Winner: 2       Score:    5
Round  18   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    5
Round  19   Player 1: scissors      Player 2: rock          Winner: 2       Score:    4
Round  20   Player 1: scissors      Player 2: paper         Winner: 1       Score:    5
Round  21   Player 1: rock          Player 2: paper         Winner: 2       Score:    4
Round  22   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    4
Round  23   Player 1: paper         Player 2: rock          Winner: 1       Score:    5
Round  24   Player 1: scissors      Player 2: rock          Winner: 2       Score:    4
Round  25   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    4
Round  26   Player 1: paper         Player 2: paper         Winner: 0       Score:    4
Round  27   Player 1: scissors      Player 2: rock          Winner: 2       Score:    3
Round  28   Player 1: scissors      Player 2: rock          Winner: 2       Score:    2
Round  29   Player 1: paper         Player 2: paper         Winner: 0       Score:    2
Round  30   Player 1: rock          Player 2: rock          Winner: 0       Score:    2
Round  31   Player 1: scissors      Player 2: rock          Winner: 2       Score:    1
Round  32   Player 1: paper         Player 2: rock          Winner: 1       Score:    2
Round  33   Player 1: paper         Player 2: rock          Winner: 1       Score:    3
Round  34   Player 1: paper         Player 2: rock          Winner: 1       Score:    4
Round  35   Player 1: paper         Player 2: scissors      Winner: 2       Score:    3
Round  36   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    3
Round  37   Player 1: rock          Player 2: rock          Winner: 0       Score:    3
Round  38   Player 1: paper         Player 2: rock          Winner: 1       Score:    4
Round  39   Player 1: paper         Player 2: paper         Winner: 0       Score:    4
Round  40   Player 1: rock          Player 2: scissors      Winner: 1       Score:    5
Round  41   Player 1: paper         Player 2: rock          Winner: 1       Score:    6
Round  42   Player 1: paper         Player 2: scissors      Winner: 2       Score:    5
Round  43   Player 1: paper         Player 2: scissors      Winner: 2       Score:    4
Round  44   Player 1: rock          Player 2: scissors      Winner: 1       Score:    5
Round  45   Player 1: rock          Player 2: scissors      Winner: 1       Score:    6
Round  46   Player 1: rock          Player 2: rock          Winner: 0       Score:    6
Round  47   Player 1: paper         Player 2: paper         Winner: 0       Score:    6
Round  48   Player 1: rock          Player 2: scissors      Winner: 1       Score:    7
Round  49   Player 1: paper         Player 2: rock          Winner: 1       Score:    8
Round  50   Player 1: scissors      Player 2: paper         Winner: 1       Score:    9
Round  51   Player 1: rock          Player 2: rock          Winner: 0       Score:    9
Round  52   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    9
Round  53   Player 1: paper         Player 2: rock          Winner: 1       Score:   10
Round  54   Player 1: scissors      Player 2: rock          Winner: 2       Score:    9
Round  55   Player 1: paper         Player 2: paper         Winner: 0       Score:    9
Round  56   Player 1: rock          Player 2: rock          Winner: 0       Score:    9
Round  57   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    9
Round  58   Player 1: paper         Player 2: scissors      Winner: 2       Score:    8
Round  59   Player 1: rock          Player 2: rock          Winner: 0       Score:    8
Round  60   Player 1: scissors      Player 2: rock          Winner: 2       Score:    7
Round  61   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    7
Round  62   Player 1: paper         Player 2: paper         Winner: 0       Score:    7
Round  63   Player 1: rock          Player 2: paper         Winner: 2       Score:    6
Round  64   Player 1: scissors      Player 2: rock          Winner: 2       Score:    5
Round  65   Player 1: paper         Player 2: rock          Winner: 1       Score:    6
Round  66   Player 1: paper         Player 2: paper         Winner: 0       Score:    6
Round  67   Player 1: paper         Player 2: scissors      Winner: 2       Score:    5
Round  68   Player 1: paper         Player 2: scissors      Winner: 2       Score:    4
Round  69   Player 1: rock          Player 2: scissors      Winner: 1       Score:    5
Round  70   Player 1: rock          Player 2: rock          Winner: 0       Score:    5
Round  71   Player 1: scissors      Player 2: paper         Winner: 1       Score:    6
Round  72   Player 1: rock          Player 2: scissors      Winner: 1       Score:    7
Round  73   Player 1: rock          Player 2: rock          Winner: 0       Score:    7
Round  74   Player 1: scissors      Player 2: rock          Winner: 2       Score:    6
Round  75   Player 1: scissors      Player 2: scissors      Winner: 0       Score:    6
Round  76   Player 1: paper         Player 2: scissors      Winner: 2       Score:    5
Round  77   Player 1: paper         Player 2: paper         Winner: 0       Score:    5
Round  78   Player 1: rock          Player 2: scissors      Winner: 1       Score:    6
Round  79   Player 1: paper         Player 2: rock          Winner: 1       Score:    7
Round  80   Player 1: scissors      Player 2: paper         Winner: 1       Score:    8
Round  81   Player 1: rock          Player 2: paper         Winner: 2       Score:    7
Round  82   Player 1: rock          Player 2: rock          Winner: 0       Score:    7
Round  83   Player 1: scissors      Player 2: rock          Winner: 2       Score:    6
Round  84   Player 1: paper         Player 2: rock          Winner: 1       Score:    7
Round  85   Player 1: paper         Player 2: paper         Winner: 0       Score:    7
Round  86   Player 1: rock          Player 2: scissors      Winner: 1       Score:    8
Round  87   Player 1: paper         Player 2: rock          Winner: 1       Score:    9
Round  88   Player 1: scissors      Player 2: rock          Winner: 2       Score:    8
Round  89   Player 1: paper         Player 2: paper         Winner: 0       Score:    8
Round  90   Player 1: rock          Player 2: scissors      Winner: 1       Score:    9
Round  91   Player 1: paper         Player 2: scissors      Winner: 2       Score:    8
Round  92   Player 1: paper         Player 2: rock          Winner: 1       Score:    9
Round  93   Player 1: scissors      Player 2: paper         Winner: 1       Score:   10
Round  94   Player 1: rock          Player 2: scissors      Winner: 1       Score:   11
Round  95   Player 1: rock          Player 2: paper         Winner: 2       Score:   10
Round  96   Player 1: rock          Player 2: rock          Winner: 0       Score:   10
Round  97   Player 1: scissors      Player 2: paper         Winner: 1       Score:   11
Round  98   Player 1: rock          Player 2: scissors      Winner: 1       Score:   12
Round  99   Player 1: paper         Player 2: paper         Winner: 0       Score:   12

Safe, risky

This is an example of using PyACTUp to create an Instance Based Learning (IBL) [3] model of a binary choice task, exhibiting risk aversion. A choice is made between two options, one safe and the other risky. The safe choice always pays out one unit. The risky choice is random, paying out three units one third of the time and zero units the rest of the time. In this example the choice is made by each virtual participant over the course of 60 rounds, learning from the experience of previous rounds. The results are collected over 10,000 independent participants, and the number of risky choices at each round, averaged over all participants, is plotted.

This code uses two other Python packages, matplotlib and tqdm. Neither is actually used by the model proper, and the code can be rearranged to dispense with them, if preferred. Matplotlib is used to draw a graph of the results, and tqdm to display a progress indicator, as this example takes on the order of twenty seconds to run in CPython.

import pyactup
import random

import matplotlib.pyplot as plt

from tqdm import tqdm

PARTICIPANTS = 10_000
ROUNDS = 60

risky_chosen = [0] * ROUNDS
m = pyactup.Memory()
for p in tqdm(range(PARTICIPANTS)):
    m.reset()
    # prepopulate some instances to ensure initial exploration
    for c, o in (("safe", 1), ("risky", 0), ("risky", 2)):
        m.learn(choice=c, outcome=o, advance=0)
    m.advance()
    for r in range(ROUNDS):
        choice, bv = m.best_blend("outcome", ("safe", "risky"), "choice")
        if choice == "risky":
            payoff = 3 if random.random() < 1/3 else 0
            risky_chosen[r] += 1
        else:
            payoff = 1
        m.learn(choice=choice, outcome=payoff)

plt.plot(range(ROUNDS), [ v / PARTICIPANTS for v in risky_chosen])
plt.ylim([0, 1])
plt.ylabel("fraction choosing risky")
plt.xlabel("round")
plt.title(f"Safe (1 always) versus risky (3 × ⅓, 0 × ⅔)\nσ={m.noise}, d={m.decay}")
plt.show()

The result of running this is a graph (safe_risky_graph.png) plotting, for each round, the average fraction of participants choosing the risky option.

[3] Cleotilde Gonzalez, Javier F. Lerch, and Christian Lebiere (2003). Instance-based learning in dynamic decision making. Cognitive Science, 27, 591-635. DOI: 10.1016/S0364-0213(03)00031-4.