As noted in my previous essay (Declarative and Imperative Models – Some Precursor Thoughts), as Ken and I work on our model of exploitation in human systems, I’ve been confronted with questions relating to how best to develop and present both this particular model and also computer models more generally. How can we present such models in a rigorous and reproducible form? And as a research tool, what legitimate conclusions can we draw from the behaviours of a computer model?
To develop my appreciation for the relative value and limitations of computer models (and perhaps other types of models) I want to first define and draw some conceptual boundaries around different model types – ones that are neither too narrow nor too broad. Once I’ve completed the definition stage, I can then more rigorously compare computer models with other types of models, like mathematical models, and formulate some thoughts on the appropriate places, if any, for computer models within research circles.
To draw these boundaries, and develop subsequent conclusions, I believe I need to answer a number of questions that might seem disconnected at first, but which I think will all feed into the larger matter of how computer and other programmatic models can be used most appropriately and effectively. These questions include:
- Are computer programs importantly different from mathematical descriptions? For example, are computer programs inherently less expressive, or differently expressive, than mathematical descriptions?
- Is it ever possible to transform programs into mathematical descriptions? If yes, is it always possible?
- More broadly, what is the relationship between computer programs and mathematical descriptions?
- Are computer programs, by themselves, models? What is the relationship between a computer program and a programmatic model?
- Similarly, are mathematical descriptions, by themselves, models? What is the relationship between a mathematical description and a mathematical model?
- Are programmatic models, as a class, more limited in their modelling abilities than mathematical models, as a class?
- What is computation? What does the activity of computation encompass?
- What is the relationship between computational models and non-computational (e.g. physical to scale) models? Is there really a sharp distinction?
- Are non-computational models useful? What are their properties, relative to other models?
- Is there something special about computation? Are there some phenomena that can only be modelled by computational models?
- Conversely, can computational models model all phenomena? Or are there some phenomena that can only be modelled by non-computational (e.g. scale) models?
Unfortunately, I’m not well versed in all of the necessary disciplines that must be brought to bear in order to answer these questions. Fortunately, however, it appears that other researchers have already done most, if not all, of the work required to answer them. In this respect, my goal in this essay is more to informally gather together some existing evidence and opinions on these fronts, rather than to make novel contributions in this area.
In the process, I will most likely rehash many issues and conclusions already considered by analytic philosophers, mathematicians, computer scientists and researchers in other disciplines, as well as unintentionally leap over or ignore some relevant debates and theories on these subjects. As I say, my intention here is not to present anything controversial, debatable, novel or even particularly subtle. Rather, my goal for this essay is twofold. First, I hope that at the end of this essay, I will have increased my own understanding of the relative role and relevance of computer models and imperative models (as defined in my previous essay) within the larger pantheon of modelling and research activities. Second, I hope that I can add to the clarity of discourse by highlighting and making explicit some modelling issues which are at times difficult to talk about in modelling circles.
I’ll start this essay by reviewing some well known ideas and facts about computing. Then, continuing on from my thoughts in my previous essay (which introduced some differences between directives and descriptions, and speculated on how this could lead to two different types of models, declarative and imperative) I’ll consider how these computational facts tie into questions about the nature of declarative and imperative models. All of this will require a bit of a diversion into some history of computer science, and then some consideration of philosophy of modelling topics, after which I can curve back around to consider how best computer programs might be used to model phenomena of interest.
A Reflection on the Historical Roots of Computing
The term ‘computer’ originally referred to a human who computes. What was the job description of one of these human computers? Specifically, to carry out calculations by following the instructions provided to the computer by their employer. These calculations encompassed the operations of arithmetic – addition, subtraction, etc. For example, suppose someone wanted to know the result of multiplying 123769 by 32974, adding 7 and then dividing by 9. They could hire a computer to generate the correct answer to this problem by appropriately carrying out the requisite calculations.
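The human computer’s task above can be rendered directly as a short program (Python here, purely for illustration); the instructions the employer hands over are, in effect, the program:

```python
# The calculation a hired human computer might be asked to perform:
# multiply 123769 by 32974, add 7, then divide by 9.
product = 123769 * 32974
total = product + 7

# The division does not come out even, so we report a quotient
# and a remainder, as a careful human computer would.
quotient, remainder = divmod(total, 9)
print(quotient, remainder)
```

Any computer, human or mechanical, following these instructions exactly should produce the same result, which is precisely the reproducibility the employers of human computers relied upon.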
The tasks assigned to human computers could be laborious and complex. Human computers were, for example, often hired by astronomers to carry out the calculations required to come to conclusions about the hypothesized behaviours of astronomical bodies. Presumably these human computers were provided with a set of instructions and asked to carry out these instructions exactly, to the letter, in order to correctly generate the required results, which should always be the same, no matter which computer was doing the calculating.
Clearly, following a set of instructions is not restricted to mathematical endeavours, so the fact that instructions were involved did not, in and of itself, mean that the activity was computational. Consider, in comparison, a recipe, which is a set of instructions (which is to say a list of directives) that, when carried out, reliably transform a collection of raw materials into a new food – a cake, or a vegetable stew. Terms like computation, calculation, algorithm and function are used to describe similar activities in the mathematical realm, but it would seem strange, at least on the surface, to go in the other direction and describe cake baking in these same terms, with the baker ‘computing’ the results of the cake generation ‘function’ by carrying out ‘calculations’ – mixing, sifting, etc. – using an appropriate cake generating ‘algorithm’.
This odd terminological mismatch only serves to emphasize the fact that it is possible, at least on the surface of things, to carry out a series of instructions without doing computing. This statement is perhaps muddied a bit by claims that the entire universe and all of its behaviours and objects are based on, or somehow generated by, computation (i.e. via a series of mathematical calculations). In that case, maybe in a very abstract sense, mixing cake ingredients, underneath all of the chemical and physical activities taking place, could also be computation. But since these claims have not yet been proven, I will say that, at least on a day-to-day level, cake baking is not computation. As an aside, if you are a proponent of the everything-is-computation theory, then you may frequently need to mentally tack on the phrase “unless everything, at bottom, is computation” to some of the subsequent statements I make in this essay. I will generally proceed without providing this caveat, but feel free to do so.
Thus computing is just one type of activity, baking being another, that involves carrying out a set of specific instructions in order to obtain a specific result. It seems fair to call this set of instructions more generally, a ‘program’, and thus we have ‘computer programs’ that instruct a computer to carry out certain computing activities (calculations), and perhaps ‘baking programs’ which we generally call baking recipes, that instruct the baker to carry out certain baking activities.
There are also many activities, or actions, that do not obviously involve an actor following a set of instructions. A rock rolling down a hill, or a tree waving in the breeze are acting – if a tree or a rock can even be said to be actors – but they are not following a set of instructions. A fox hunting a rabbit is an actor acting, but if the fox is following a set of instructions – a program – then this program is not obviously coming from another agent, unless we perhaps count the fox’s own brain, or the universe itself, as that other agent. If we do wish to take this tack, we might express it by saying that the fox is its own instructor, and it is programming itself.
The term ‘algorithm’, which is based on the name of the Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, is more typically restricted to instructions to be carried out in the mathematical or computing context, although again this restriction is perhaps more conventional than necessary. In a mathematical context, an algorithm is typically a series of abstracted steps that, when implemented in a particular fashion by a computer (human or otherwise), generate the output of a function, relative to a particular input. Apart from calculating the output of functions, an algorithm may also describe, in an abstract way, the instructions for any other activities a computer might be tasked to carry out – sorting an array, for example, and then providing as an output the sorted array. Since modern digital computers carry out all instructions by reducing them to a limited set of very basic calculations involving 1s and 0s, anything done by a digital computer is, in the end, computation by definition, regardless of how far buried under layers of abstraction these calculations may be.
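To make the sorting example concrete, here is one standard sorting algorithm (insertion sort – my choice of illustration, not one named above) written out as explicit steps. Nothing about the steps themselves requires a digital computer; a patient human with pencil and paper could follow them equally well:

```python
def insertion_sort(items):
    """Sort a list by repeatedly inserting each element into its
    correct position among the already-sorted elements to its left."""
    result = list(items)  # work on a copy; leave the input untouched
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one position to the right.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result
```

The abstract algorithm is the sequence of comparisons and shifts; this particular function is just one implementation of it.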
Since the theory that all human thought is computation is still a hypothesis, it has yet to be proven that, when a human follows the same algorithm to sort a series of numbers, all of the human activities involved can be translated, at some low level, into a series of calculations. Perhaps the sorting of numbers is a poor example here since, given that the objects being acted upon are numbers, this activity already seems like a mathematical operation, and thus a computational one, even at a more superficial level. However, the general point remains – not everything is a mathematical object, and not every human activity necessarily involves computation, as far as we know at this time.
Returning to Directives and Descriptions
But what does any of this have to do with modelling?
In my previous essay I suggested there was a distinction between a description of a state of affairs and a set of directives, but I also said this distinction didn’t always seem extremely distinct, after all. Both descriptions and directives are, at base, sets of statements that describe, or at least represent, something else. In the first case, we’re describing some state of affairs, actual or potential. In the second, we’re describing or representing actions, combined with the connotation (suggestion, wish or command) that the person receiving these instructions should do the actions. It seems relatively straightforward, however, to transform this second case into a series of statements that just describe an agent doing an action, without the connotation of command. For instance:
Measure the flour into the bowl.
Crack the eggs into the bowl.
Mix the ingredients in the bowl.
can be transformed into:
An agent measures the flour into the bowl.
An agent cracks the eggs into the bowl.
An agent mixes the ingredients in the bowl.
It’s harder to go as smoothly in the other direction, and change statements that describe states of affairs into directives, although somewhat possible:
The flower is red. -> Flower, be red.
The flower bloomed yesterday. -> Flower, bloom yesterday.
It will rain tomorrow. -> Sky over Ottawa, rain tomorrow.
Still, in both cases, whether they are stated as imperatives, or as descriptions of states of affairs, we are left with a set of static statements.
From Statements to Models
Are these static sets of statements in and of themselves models?
My working definition of a model, which comes from Giere, but which is not universally accepted in philosophy of modelling circles (to which I must apologize for what will no doubt be a lack of nuance in some of my subsequent discussion relative to this substantive debate), is that a model is an independent structure with useful (physical) similarities to the target system. These similarities between the model and the target, when identified, allow an agent using the model to carry out reasoning, in the broadest sense of the term, on the model, and then transfer these findings over to the target system via the identified isomorphisms, through an application of analogical reasoning.
Under this definition, whether or not something can be a model of something else, and the ways in which it can act as a model, are more objective than subjective, assuming that similarity between objects is objective. It must be admitted, however, that saying that an object can, objectively, function as a model of another object, is not saying much, since arguably any object has at least something in common with any other object: I exist as an object and so does the sun, so the sun can model me, and I can model the sun.
Of course, the fact that, in the most general sense, anything can be in some respect a model of anything else, does not mean that any object can model any aspect of any other object. Since the chemical properties of the sun are very different from my chemical properties, it would be very hard, if not impossible, to use the sun to model my body chemistry. Similarly, even if not in principle impossible, it would be hard to use six rocks to model my house in a useful way – for example, to determine whether or not my sofa will fit in the corner of my living room.
With this in mind, in the case of sets of statements, what can they model? Or how would they work as models?
Symbolic representations are generally not very similar to the objects they represent – the words ‘red flower’ have very few physical similarities to an actual red flower – and if there are similarities, this similarity is purely accidental. This property of symbolic representations is, in fact, generally viewed as a major point in favour of symbolic representations: they can represent objects without needing to be physically similar to them, which means that there is great flexibility in how they can be used.
There may nonetheless be strategies for introducing connections and similarities between sets of statements and the target system. In the case of dynamic target systems, for example, actions may be introduced into the picture when an agent processes the directives and acts based on what they’ve processed, or when a logician applies deductive reasoning to the set of descriptive statements. This will be discussed again further along in the essay.
Whether or not the statements themselves can be considered a good model of a particular system, if the appropriate information processor – which is by its nature an actor, capable of action – were to read a set of statements, it seems likely that this information processor would potentially be able to generate appropriate models from the statements. For example, if I read a set of statements describing in detail the state of a house, I could use these to construct a physical model of that house, and move model furniture around the model rooms. Moreover, if it’s true that we can create models inside our heads – mental models – then perhaps I don’t even need to create the physical model of the house – I can just create an internal mental structure that has the required similarities.
I would suggest that this exercise – turning a set of statements into a full blown model – is possible because there are additional connections between the target system and the internal contents of the information processor, itself, that make the relevant connections between the set of statements and the target system possible. This echoes a point Fodor makes about the language of thought, which he argues must have a particular relationship to the physical world in order to function as he says it does.
If people are wedded to the idea that sets of statements in and of themselves should be considered as models, rather than simply as components of a model that is then completed with the addition of another component (i.e. an information processor) that has the requisite similarities to the system of interest, I don’t entirely feel the need to stand in their way. In this case, however, I would at least like to distinguish between those models that involve sets of statements and models that do not (e.g. a map, or a scale model of a house, or a mechanical model of the solar system), and which, instead, have more direct physical similarities to the target system of interest.
The main reason for this preference on my part, is that, as noted above, it seems to me that models that are sets of statements must, in order to act as models, be necessarily interpreted by an agent (human or otherwise) with the appropriate interpretive abilities, whereas other types of models have similarities that exist within the materials and structure of the model itself – and this seems to me to be a non-trivial difference. As an illustration of this, consider that if all speakers of English were to die out (and all information about the English language destroyed, so that it could not be resurrected in some future agent), then a set of English statements about the solar system on their own could not act as a model of the solar system in the same way that a physically constructed model built out of gears and light bulbs could.
Although it might be less important in the grand scheme of things, I would also make a distinction between sets of statements in a more general sense (e.g. sets of statements in plain language), and sets of statements that adhere to a more formal structure – in particular, ones using logical operators or other mathematical formalisms. Formally structured statements can allow for the rigorous or even mechanical application of deductive and other types of reasoning (inductive, abductive, etc.), and through this application the generation of new statements, which, at least in the case of deductive logical operations, will share known truth properties relative to the original statements. And these new statements themselves can effectively become a part of a model. It’s possible that all common language statements can be transformed into equivalent statements within this more formal structure, in which case this distinction may be less useful, but in the absence of proof of this, the distinction seems relevant.
Deductive and Imperative Equivalence
Let’s grant, as a first pass, that sets of statements are either models in and of themselves, or at least key components in models that, in combination with statement interpreters, can contribute to understanding something relevant about a target system. Where does this leave our distinction between imperative and declarative models? If they are both the same type of model in so far as they are both sets of statements, in what ways might they still be different? More specifically, what can we say about the expressive equivalence between deductive and imperative descriptions? We’ve already seen that transforming a directive into a descriptive statement is at least possible. But how generalizable is this process? And how equivalent would the resulting models be?
Relevantly, on the computation front, we know, thanks to research carried out in the first half of the 20th century by Church, Turing and others, that any machine that can, for example, implement the lambda calculus can compute any computable function. Put another way, a machine that can implement the operations of the lambda calculus can by definition carry out any sequence of calculations, however else they are described, because the lambda calculus can represent all calculations. A Turing machine is one such machine capable of carrying out the operations of the lambda calculus. So if an imperative model is computational (which is itself perhaps not always so cut-and-dried), then this theory at least suggests that there will always be a bridge between that model and a mathematical (or more generally, deductive) model.
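A small taste of the lambda calculus’s expressiveness: even numbers and addition can be encoded purely as function application. The Church-numeral encoding below is a standard construction (written here in Python for convenience, not a claim about how any particular machine works); the number n is represented as “apply a function n times”:

```python
# Church numerals: the number n is encoded as a function that
# applies f to x exactly n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    """Decode a Church numeral by counting applications of +1."""
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)
three = add(one)(two)
print(to_int(three))  # decodes to 3
```

Everything here is built from function abstraction and application alone, which is the sense in which the lambda calculus can stand in for all calculation.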
Even if computation cannot be entirely reduced to calculating functions (and it is beyond my current qualifications to weigh in on this), computing the value for a function given certain inputs is at least a major focus of computation. That said, here it is worth noting that models more broadly are not, by definition, functions. For example, the scale model of a house is not structured, directly, in such a way as to take inputs and produce outputs based on these inputs. I’m not saying that it might not be somehow possible to characterise this scale model in those terms, but on the face of it, the fit is not great. So models don’t have to be functions with clear inputs and outputs. In the other direction, can a function be a (part of) a model? This at least would seem obviously to be the case. For example, we can use a function to model the position of a baseball being hit by a baseball bat.
So we have concluded that computational systems can be models. But are computational systems the be-all and end-all of models? Can computational systems, in principle, model everything?
Before trying to answer this question, it’s important to have an accurate picture of what does and does not count as a computational system. It’s easy for this picture to become distorted due to the powerful presence of the currently canonical computational system – the digital computer. Specifically, given the digital computer’s ubiquity and thus familiarity, it’s easy to assume, in the one direction, that everything with any degree of sophisticated behaviour must, somehow, be a digital computer and, in the other direction, that if something is clearly not a digital computer, then it is not a computer at all.
Within this context, it might first be worth pointing out, even if this seems very obvious to most, that not all machines are computers in the commonly understood sense of the word (is my blender a computer?). So the fact that a particular model is, itself, a machine does not by virtue of that fact make it a computational model. In the other direction, not all computers are digital, or discrete. In addition to human computers (which may or may not be discrete computers, since that is still to be determined), analogue computing devices can also carry out mathematically relevant activities, generating the appropriate outputs to functions, when supplied with appropriate inputs, even though they receive non-discrete input and generate non-discrete output. Ongoing research is increasingly showing that analogue computers can also be universal computers, in the same way that Turing equivalent digital systems are universal computers.
Are these analogue computers symbolic? This probably rests on the exact definition of what a symbol is, which I will not even attempt to address here (there is an entire field, semiotics, devoted to such questions). But at the very least they are not discrete machines, in the traditional sense. This becomes readily and viscerally apparent when viewing videos of the operations and construction techniques of the original analogue (mechanical) computers (for example those used to calculate the trajectories of missiles in the first half of the twentieth century).
So there are devices that are not computers, and models that are not computers, and computers that are not discrete. Where does this all leave us?
Let’s just go for broke and jump to the heart of the matter by asking the following question: is everything computable?
Put this way, it seems like a strange question, and even one that is difficult to interpret sensibly. Is a chocolate cake computable? On the surface of things, it would seem not, since we have agreed, earlier, that, barring theories that show that the universe is somehow, foundationally, entirely generated by computation, following a recipe to bake a cake is not computing something. So the answer to the question “Is everything computable” would seem, trivially and obviously, to be no.
Rephrasing things, it might make more sense to ask: can we answer every question via the act of computing? Can we, in principle, represent every question, or problem, as a computer program or mathematical description, and then compute the solution?
I am hardly qualified to answer this question, myself. However, as I understand it, we do know that there are at least some functions that are non-computable (for example, the halting function and other functions encompassed by Rice’s theorem, which comes from the discipline of computability theory). Not being a mathematician, it’s difficult for me to fully comment on the implications of this, particularly with respect to modelling, but the existence of non-computable functions does at least suggest to me that not everything is computable. Whether this result is only trivially the case, and irrelevant in most contexts, or has more substantive and practical implications, I am unqualified to say. I admit that, on the surface of it, this seems potentially at odds with some currently popular theories that the universe itself is composed of, or perhaps generated at its root by, computations, but, again, since I’m not a mathematician or a physicist, I really can’t say for sure that these two results are at odds. In any case it suggests, although certainly does not prove, that perhaps computational models cannot model everything.
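The flavour of the halting problem can be conveyed with a small sketch (my own illustration, not a construction from the computability literature). A bounded halting check is easy to write; what Turing proved is that the unbounded version, one that always answers correctly with no step limit, cannot exist:

```python
def halts_within(program, max_steps):
    """Run a program (represented as a generator function that yields
    once per step) for at most max_steps steps. Returns True if it
    finishes within the budget, False otherwise. Note: False does NOT
    mean the program never halts -- only that it hasn't halted yet.
    A function answering the unbounded question for all programs is
    provably impossible."""
    gen = program()
    for _ in range(max_steps):
        try:
            next(gen)
        except StopIteration:
            return True  # the program ran to completion
    return False  # step budget exhausted; no verdict on eventual halting

def halting_program():
    for i in range(10):
        yield  # ten steps, then stop

def looping_program():
    while True:
        yield  # never stops
```

The asymmetry is the point: we can always confirm halting by waiting long enough, but no finite amount of waiting confirms non-halting, and no cleverer general procedure exists either.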
Mathematics and the ‘Real World’
With all of this under our belt, we can return to an earlier question, presented in a somewhat different form: When we are carrying out mathematical activities, like calculations, which types of objects are eligible to be the raw materials and components of these activities? For example, does addition occur in the real world, acting on real world objects or only on objects that exist in the world of numbers? When I put some eggs into a basket with some other eggs, am I adding them? When I pour one glass of water into a second glass containing water, am I adding the water together?
In a sense, I don’t really care how you answer this question, but what I would request is this: choose an answer, and then proceed accordingly and consistently based on this answer when you discuss the relationship between computational models and target systems. If you say no, you can only add numbers themselves together, and then use these results to model what happens when you put the new eggs into the basket, that’s fine. In this case, I would say that there are useful similarities between adding numbers and combining eggs, and I can use the one to better understand the other via analogical reasoning. If you say yes, you can add without numbers, that’s fine, too. In this case, the act of combining eggs and the mathematical operation of addition of numbers are two activities of essentially the same over-arching type, despite their superficial differences. In this second case, by extension, I might say that adding eggs together is indeed just calculation, which seems a bit strange, as already noted, or I might say that mathematical addition and egg combination are two activities, one computational and one not, that fall under the category of a more general type of activity (merging, perhaps?).
By extension, this example also shows that, whether or not non-computational models as a category are essential in some absolute sense, we can certainly still do useful work with them. In the absence of eggs and even in the absence of arithmetic, I can still model what would happen were I to combine the eggs together, for instance by pairing each egg with an orange and then combining the oranges together and examining the result – possibly a less risky operation, even if I do have the eggs readily available. Similar systems were historically used to keep track of cattle by making marks on wood. Without counting or adding, the user of such a system could return to their cattle pen after some time had passed, match the cattle to the wood marks, and come to some useful conclusions based on this (e.g. now some of my cattle are gone, perhaps due to an unscheduled visit from my neighbour in the night).
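The tally-stick model described above can itself be sketched in code (my own illustration, and of course using a programming language to depict a deliberately non-computational practice is a little ironic). The herder detects loss without ever counting: each returning animal cancels one mark, and any marks left uncancelled signal missing cattle:

```python
def uncancelled_marks(marks, cattle):
    """Pair each returning animal with one mark and return the marks
    left over. No counting or arithmetic on numbers is performed --
    only the act of cancelling one mark per animal, as with notches
    cut into wood."""
    remaining = list(marks)
    for _ in cattle:
        if remaining:
            remaining.pop()  # cancel one mark for this animal
    return remaining

marks = ["|", "|", "|", "|", "|", "|", "|"]  # one notch per animal, made earlier
cattle_tonight = ["cow", "cow", "cow", "cow", "cow"]  # only five came home
leftover = uncancelled_marks(marks, cattle_tonight)
print(len(leftover))  # uncancelled marks correspond to missing animals
```

The model works entirely through one-to-one pairing between marks and animals, which is exactly the useful similarity the herder exploits.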
It should also be noted, before moving on to other questions, that even models that are not composed of sets of statements, such as mechanical models of the solar system, can still themselves give rise to sets of statements, distinct from the model, but possibly used in conjunction with it. Specifically, once any type of model has been constructed, if I can perceive it and study it, I can then generate statements about the model and I can carry out deduction and other reasoning strategies on these statements, which will give me new information about the model that I can then carry over to the target system. Alternatively, I can simply manipulate the model more directly, to learn something about it, and transfer this discovery over to my target system. We may, in the end, have many layers of models that all ultimately lead back to the target system in question.
Static and Dynamic Models
However you wish to slice things, it seems clear that the description of a thing, or the model of a thing, is not that thing, and, similarly that the instructions for creating something are not the creation of that something. The properties of the target system are clearly different from the properties of the set of statements about that system. This is even more so the case for sets of statements than for physical model components like the marks on a piece of wood.
But, to return again to the question posed earlier, are instructions and descriptions in and of themselves entirely equal in their properties relative to each other? If yes – if the properties of imperative descriptions are essentially equivalent to declarative descriptions for all intents and purposes – then we are no worse (or possibly better) off, in terms of our model functionality, if we switch from one type of model to the other. And if the properties of imperative descriptions are a subset of the properties of declarative descriptions, the switch from the one to the other may provide additional benefits.
In my previous essay I noted that, although instruction sets are static, they have more of a dynamic sensibility, relative to descriptions. Thinking about this in terms of matching model properties to target system properties, can I have a static model of a dynamic process? If you are in the camp that thinks sets of statements are, in and of themselves, models, then the answer is presumably yes. You simply describe the activities of the system and this description sufficiently captures the active elements of the system.
In the case of implemented computer models, the dynamic nature of the model is more directly and necessarily baked in, however. I give the computer the program, and a dynamic model is created when the computer runs the program. Indeed, I would argue that the full model only exists while this is occurring. The computer can then report on what is happening, moment by moment, to the model as it is being run. (I, the programmer, am limited in my direct ability to either perceive the model or to manipulate it).
As part of my ongoing efforts to avoid artificial specificity in the portrayal of model types, I would note that the program here does not necessarily need to be run by a computing device. For example, for an agent based model, we might imagine, alternatively, creating the model by having people carry out the program instructions, with some of the people acting as agents and following the instructions for agents at each time step, and with other people carrying out the instructions required to generate the simulated environment. Some individual-based modellers have in fact used this method to generate results. In such cases, as with the situation where the computer is running the program, the model only exists for so long as the people are behaving in the scripted manners. Here the model of the process is in fact a process itself, and ceases to exist once the people cease carrying out their individual processes and go home.
Within this context, we can imagine asking the people running the program to freeze at a critical moment, and then investigating the current state of affairs in that moment. We could then extract a statement about this state of affairs, and transfer it over to our system of interest.
Similarly, we could imagine printing out a detailed description of every action taken by a computer during the running of a program:
Variable 1 did the following in this moment.
…
Variable N did the following in this moment.
If we wanted to, we could further carry out various types of logical reasoning over these statements to come up with some useful conclusions. And if need be we could repeat the process with a variety of starting conditions.
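As a minimal sketch of this idea (the variables and update rules here are invented purely for illustration), a program can emit just such a declarative trace as it runs, with one statement per variable per moment:

```python
# Run a tiny two-variable process and record, at each step, a declarative
# statement describing what each variable did in that moment.
def run_with_trace(steps):
    x, y = 0, 1
    trace = []
    for t in range(steps):
        x = x + y   # imperative updates...
        y = y * 2
        # ...converted into declarative statements about this moment
        trace.append(f"At step {t}, x became {x}.")
        trace.append(f"At step {t}, y became {y}.")
    return x, y, trace

x, y, trace = run_with_trace(3)
for statement in trace:
    print(statement)
```

The resulting list of statements could then be reasoned over after the fact, exactly as described above, without re-running the process.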
Transforming Imperative Models into Declarative Models
The parameter question is relevant because when an implemented and running computer program is provided with different inputs, we may get different results. Consider, for example, the following set of statements.
x is an integer.
y = 4 + 3
z = x + 2 + y
q = 5 if z is even and -1 if z is odd
We can also write this imperatively:
select any integer value and assign it to x.
add 4 + 3 and assign it to y.
add y, 2 and x together and assign the result to z
if there is no remainder when z is divided by 2, assign 5 to q. Else assign -1 to q.
In this example, we get different results depending on the value of x.
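The imperative instructions above can be written out directly as a short program (Python is used here purely for illustration):

```python
def compute_q(x: int) -> int:
    """Carry out the imperative instructions for a given integer x."""
    y = 4 + 3                    # add 4 + 3 and assign it to y
    z = x + 2 + y                # add y, 2 and x and assign the result to z
    q = 5 if z % 2 == 0 else -1  # 5 if z is even, else -1
    return q

print(compute_q(1))   # x odd  -> z = 1 + 2 + 7 = 10 (even) -> 5
print(compute_q(2))   # x even -> z = 2 + 2 + 7 = 11 (odd)  -> -1
```

As with the written instructions, the output depends entirely on which concrete value is supplied for x.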
Interestingly, at least in the case of the collection of mathematical expressions, we can deduce that q in fact depends on the value of x alone, and that the value of q can be calculated based on whether or not x itself is even or odd.
More specifically, we can use the deductive process of simplification in the case of the mathematical description to demonstrate this:
z = x + 2 + 4 + 3
z = x + 9
q = 5 if x + 9 is even and -1 if x + 9 is odd
q = 5 if x is odd and -1 if x is even
Thus our original description is mathematically equivalent to the much simpler description: q = 5 if x is odd and -1 if x is even.
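We can also spot-check this deduced equivalence – not prove it, since the proof is the deductive simplification itself – by evaluating both descriptions over a range of integers:

```python
def q_original(x: int) -> int:
    # The full original description: y = 4 + 3, z = x + 2 + y
    y = 4 + 3
    z = x + 2 + y
    return 5 if z % 2 == 0 else -1

def q_simplified(x: int) -> int:
    # The deduced simplification: q = 5 if x is odd, -1 if x is even
    return 5 if x % 2 == 1 else -1

# The two descriptions agree on every integer we try (a spot check,
# not a proof).
assert all(q_original(x) == q_simplified(x) for x in range(-1000, 1000))
print("descriptions agree on all tested integers")
```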
But what about the imperative description – can we simplify this? Certainly to some extent. For example, if we use a replace command, we can go through the code, carry out some actions and, where appropriate, replace elements of the old instructions with new instructions:
select any integer value and assign it to x.
add 4 + 3 and assign it to y ->
-> add 7 and assign it to y
-> assign 7 to y
add y, 2 and x together and assign the result to z ->
-> add 7, 2 and x together and assign the result to z
-> add 9 to x and assign the result to z
if there is no remainder when z is divided by 2, assign 5 to q. Else assign -1 to q.
->if there is no remainder when (add 9 to x) is divided by 2, assign 5 to q. Else assign -1 to q.
This is the type of activity that a compiler could carry out.
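A toy version of this compiler-style simplification can be sketched as follows; the tuple representation and the `fold` function are invented here purely for illustration. Note that, as in the worked example above, the folding stops once a variable is involved – it cannot make the final parity deduction on its own:

```python
# A toy constant-folding pass of the kind a compiler performs: expressions
# are nested tuples ('add', a, b), and any addition of two plain numbers
# is replaced by its value. Symbolic names like 'x' are left untouched.
def fold(expr):
    if not isinstance(expr, tuple):
        return expr                      # a number or a variable name
    op, a, b = expr
    a, b = fold(a), fold(b)              # simplify the sub-expressions first
    if op == 'add' and isinstance(a, int) and isinstance(b, int):
        return a + b                     # carry out the action now
    return (op, a, b)                    # otherwise keep the instruction

# z = x + (2 + (4 + 3)), grouped so that the constants sit together
z = ('add', 'x', ('add', 2, ('add', 4, 3)))
print(fold(z))   # -> ('add', 'x', 9), i.e. "add 9 to x"
```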
But what about the last step? In this case we need to carry out some sort of deductive reasoning activity, and we also need to pull in additional mathematical statements – for example, that any even number, when added to an odd number, yields an odd number. Could we do this with a program written in an imperative language? Could we do it for a program written in a declarative language? We will return to this point later.
The main takeaway from all of this, at this time, is that, from where I currently stand, it at least appears possible, and even quite likely, that program instructions and mathematical expressions are, in theory, equally expressive in terms of what they can represent. However, they are perhaps not equally amenable to having deductive reasoning applied to them. If both of these statements are true, then it should always be possible to convert programmatic models to declarative models, and then also to do more with these declarative models once they have been created.
Does this then spell doom for programmatic models? Perhaps. But I'm not yet entirely confident that this necessarily means that we can always come to the same conclusions using programmatic and mathematical models. Let's continue along this vein, and consider the type of mathematical, or more generally declarative, models into which programmatic models can be turned.
Programmatic Models and Good Models?
Let’s say for the time being that it is essentially possible to turn any computer program – and by extension, any programmatic model – into a series of mathematical expressions that can then act as a mathematical model. Even given this in principle possibility, it does not necessarily solve all of our modelling problems to simply create computer models and then turn them into mathematical models, because it seems likely that the resulting mathematical model would not at all be considered a ‘good’ mathematical model, even if it did still happen to capture something useful about the system of interest. Thus we would simply move from a ‘bad’ computer model, to a ‘bad’ mathematical model.
What would be wrong with such a model? For starters, it would most likely have a very large number of variables. As well, it would most likely contain statements that were not particularly general, but instead which encoded quite specific facts about the system. Broadly speaking, it would not in any sense of the word be considered an elegant model, nor one which could be easily used, by a person, to generate novel and useful statements via a process of deductive reasoning.
Even the application of computer-aided deduction would not likely help the matter, in terms of deducing usefully relevant new facts, although this might be too pessimistic of a perspective. A fair amount of effort has gone into, and is still going into, using computers to assist or outright generate mathematical proofs. The fact that this is an ongoing area of research suggests to me, however, that carrying out this sort of activity on digital computers is not trivial.
In contrast to a mathematical model derived somehow from a programmatic model, what would an elegant mathematical model look like? I'm not a mathematician, so it's difficult for me to say, and I further understand that the term is usually reserved for proofs, not models. From my vantage point, though, I suspect it would be one with few variables, representing conceptually abstract properties of the system of interest, combined in relatively few expressions, that could, via a process of deductive reasoning, generate broadly generalizable conclusions about the variables in the model and their relationships with each other, and by extension about the system represented by the model.
What do I mean by broadly generalizable conclusions? Certainly, discovering one possible set of values that made the model consistent would not be sufficiently generalizable – i.e. when variable x = 10 and variable y = 3, variable z = 14. Rather, we would want to know something much more general about the relationship between variables in the model – e.g. when x > 10, regardless of the value of y, z must always be positive – which in turn should correspond with something generally interesting about the target system of interest – e.g. when arolgas reach 10 years of age their fur will necessarily always be fuzzy, regardless of their bazoli levels.
A Cellular Automaton Example
I think it’s easier to appreciate the whole situation if we use as an example a cellular automata model. This is a particularly pertinent example because cellular automata frequently crop up in debates between computer and mathematical modellers – see for example the debate over Wolfram’s publications and related claims about cellular automata. So let’s suppose we have a particular 2D cellular automata setup, with a certain definition of a neighbourhood, and a certain set of rules that determine when cells live and die. Now let’s compare the situation between an programmatic, imperative model and a mathematical declarative model of this cellular automata system, both of which can certainly be created.
First let’s consider the programmatic, imperative model:
We can start by setting up the environment – we define x and y as variables of an integer data type, which will be used to identify each cell in a grid (in fact, these would most likely be the indices into a two-dimensional array data object). The state of each cell – living or dead – would also be represented by a variable for each cell (effectively, the data array would hold the variable values). The initial probability of any cell being alive is also represented as a variable – a float – which will be a parameter of the model. We set the program to throw an error if this parameter is not between 0 and 1. We also program in the rules that are to be applied to each cell, identified by its x and y values, based on the states of neighbouring cells.
We then run the model by setting a size for the grid, setting the probability parameter, randomly setting cells within the grid as alive or dead by generating a random number and comparing it with the parameter, and then looping through each step of the automaton, stopping after 10000 timesteps. At the end of the timesteps we measure what percentage of cells are alive and what percentage of cells are dead.
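As a concrete sketch of the imperative model just described – assuming, purely for illustration, Conway-style life/death rules and a wrapping Moore neighbourhood, since the text deliberately leaves the specific rule set open – the program might look like this:

```python
import random

def run_ca(width, height, p, steps, seed=0):
    """A minimal 2D cellular automaton. The Conway-style rules and the
    wrapping Moore neighbourhood are illustrative assumptions, not the
    specific CA discussed in the text."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    rng = random.Random(seed)
    # Randomly set cells alive with probability p (a model parameter)
    grid = [[rng.random() < p for _ in range(width)] for _ in range(height)]
    for _ in range(steps):
        new = [[False] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                # Count living neighbours, wrapping at the grid edges
                n = sum(grid[(y + dy) % height][(x + dx) % width]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                        if (dx, dy) != (0, 0))
                new[y][x] = n == 3 or (grid[y][x] and n == 2)
        grid = new
    alive = sum(cell for row in grid for cell in row)
    return alive / (width * height)   # fraction of cells alive at the end

print(run_ca(width=20, height=20, p=0.5, steps=100))
```

Note that every argument – grid size, p, number of timesteps – must be given a concrete value before the program will produce any output at all, which is precisely the limitation discussed below.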
In the case of the mathematical (declarative) model, we also describe the grid using two variables, x and y, that we state are members of the integers. We can easily describe, using mathematical expressions, the neighbourhood and the rules for cells to become living or dead (there are many examples of this in the literature – for one example see Grey, 2003), as well as a variable representing a relevant type of mathematical object (e.g. a matrix) that describes the state of each cell. Further, we can define a variable p to represent the probability of a cell being alive or dead, and perhaps a variable t to represent time. We further specify that p is a real number between 0 and 1. In the case of this mathematical model, rather than running the model, we carry out deductive reasoning upon it, deducing new facts by generating novel proofs from the starting premises represented by the model.
To me, a major difference in these two cases is that to run the imperative computer model, at least as described here, I must select specific, concrete values for my variables – I must pick a value for p, I must pick a size for my grid, etc. I’m not allowed to leave variables more generally defined (e.g. this variable can take any integer values). If I do, I’ll fail to get an output from my program. With the mathematical model, on the other hand, I can permanently keep the description at a more abstract level, and the results of my deductive reasoning can also remain general. The ability of the imperative model to definitively produce general conclusions is more limited in this specific way.
To see what I mean by this, let's suppose, on the imperative model side, that we pick a size for the grid and a selection of values for p, and run the model 10 000 times for each value of p. The output of this will be, at the end of 10 000 timesteps, a grid with each cell in a certain state. We can then create a description of some aspect of that grid that is of interest to us – for example, we might be interested in the percentage of cells that are alive.
Let us suppose that, after carrying out all of the runs of the model, we notice that the percentage of cells alive at the end of the run is always less than the percentage of cells alive at the beginning of the run (note that this would not really be the case for many CA rule sets – I’ve completely made up this result as an example). Noticing this, we might start to suspect that this was always the case – that it represents some sort of more general rule for CAs.
From an inductive reasoning point of view, this might indeed be a reasonable conclusion, but we are, in this case, vulnerable to all of the issues inductive reasoning brings with it. If I have only ever seen black swans, I might conclude that all swans are black, that all birds are black, or even that all animals are black – say, if I have grown up in a bubble on the moon and have never seen any other type of animal. In the CA example, I have a particular grid size, a particular set of p values and a certain number of timesteps. How much generalizing can I do?
Models, Experiments and More
We seem to be going from bad to worse, from the perspective of a computer modeller. Although I have provided no conclusive proofs in this essay, what I've discussed so far would at least suggest that computer models have the same expressive power as mathematical models, and thus can always be turned into mathematical models, but, at the same time, will almost inevitably be turned into bad mathematical models, from which useful generalizable knowledge cannot reliably or easily be produced.
At this point I can’t help but compare imperative modelling activities to experimental science activities, which also seek to derive general rules through the application of inductive reasoning to data collected during the experiment. Like the programmatic modeller, the experimentalist must also fix the parameters of the experiment, before carrying out the experiment. And the experiment itself will be carried out imperatively, as a series of instructions.
And, in fact, we see that this issue is not restricted to programmatic computer models or experiments, but applies to anything programmatic, which is to say anything encompassing a set of instructions and a control flow. Thinking back to our cake baking program, while we might imagine the cake baking program to make non-specific statements – pour some type of flour into a bowl, add some type of chopped fruit, or some kind of candy, into the bowl – in order to actually generate my cake output, I will need to select a certain type of flour (e.g. rice flour) and a certain flavouring (e.g. chocolate chips). I can't create a 'general cake' and pop it in an oven.
Likely for this reason, experimentalists critique programmatic models from a different direction than mathematicians do. Specifically, they note that at least experiments are being carried out on the target system itself, which means that any issues relating to matches or mismatches between the model and the target, or mistakes made when transferring conclusions from the model to the target system, can be avoided.
This experimentalist critique is not unique to programmatic models. Animal models, for example, which are used as a substitute for carrying out experiments on humans, fall victim to this same critique. If this drug behaves this way in a mouse, is there really any guarantee that it will behave the same way in a human? From this perspective, we could suggest that the fewer similarities a model has, relative to a target, the worse it is, from an inductive reasoning point of view. Better to use a monkey model than a mouse model, and better to use a mouse model than a fish model. And better to use a fish model than a computer model.
The Worst of all Modelling Worlds?
Thus, programmatic imperative models would seem to inhabit the worst of all modelling worlds, coming from either an experimentalist or mathematical modelling perspective. This sentiment was well articulated by a biologist on my thesis committee, who told me in all seriousness that computer models should only ever be used in research as a last resort. Which may or may not have been meant to imply that, in practice, computer models should never be used at all.
It is true that, on the reading provided above, programmatic imperative models, which include most, if not all, computer models, would seem to exemplify, and perhaps serve to vividly highlight, both all of the issues that experimental scientists have with models more generally (they are opaque, they are oversimplified, they are not a good fit for the target system, and, due to poor description of the model, it's hard to evaluate to what extent they are or are not a good fit for the target system, etc.) and, at the same time, all of the issues mathematicians have with experimental research (the non-mathematical world is imperfect, experimental science can never come to any truly general conclusions about a system of interest, experimental science is forced into using bad types of reasoning, the whole endeavour is all too messy and inelegant, etc.). Effectively, programmatic imperative models embody simultaneously all the methodology elephants that inhabit the room whenever experimentalists and modellers come together to try to understand some system of interest.
To be damningly thorough, there is still another critique, not yet mentioned, that is, while not strictly speaking unique to programmatic models, generally presumed to apply to all programmatic models by default. This critique is that models that are not the simplest possible model of a phenomenon are always inferior to the simplest possible model of the phenomenon. But what is the simplest possible model? How will we know it when we see it?
My take on this is that ‘not simple models’ are, to flip the description around, unnecessarily complicated models, or, more concretely, models that include details that have no bearing on the behaviours or properties of the model with respect to how it relates to the system of interest. For example, imagine creating a model of the ground floor of a house to determine how the furniture can be re-arranged, and instead of just using paper and rectangles, meticulously recreating all of the furniture to be re-arranged in miniature, right down to replicating the fabric patterns on each piece. Creating the model at this level of detail invites a number of issues. First it results in unnecessary effort, when a simpler, more easily created model could do just as well. Second, by extension, the addition of these unnecessary details could somehow cause elements of the model to be misused, or inaccurate – perhaps adding the fabric on to the furniture makes the furniture too big, or not fit into corners. Perhaps we become distracted by the appearance of the furniture, instead of just worrying about where the furniture will fit. Perhaps we have mis-replicated the appearance of the furniture, and are now drawing unwarranted conclusions about what the living room will look like under certain arrangements. Sticking to a simple paper model, or even better, a simple mathematical one, would avoid all of these issues.
Appropriate Moments for Imperative Models
Given imperative models' admitted vulnerabilities to the many thorny issues raised above, should computer modellers simply throw in the towel and retrain, en masse, as mathematicians? Before taking this drastic step, let us make one last-ditch effort, which we will kick off by assuming that 'last resort scenario' is not simply a euphemism for 'a technique never to be used'. What are the scenarios in which programmatic imperative models might shine, or, if not shine, at least usefully and acceptably contribute to the research of a particular area?
We could start by looking for areas in which experimental research and mathematical modelling research themselves run into difficulty.
On the experimentalist’s side, experimental methodologies regularly run into issues related to ethics, required time and resources, and ecological validity. For example, in the case of my own research, studying the dynamics of exploitative behaviour, if I were to try to research this in an experimental way, I would most certainly run into ethical issues related to paying some people to exploit others. And then, even if I could mitigate the ethical issues associated with this, I would also potentially need to house large numbers of people for an extended period of time in a highly controlled environment, and preferably monitor and analyze all of their behaviours during this time. Even then, to obtain the controlled experimental conditions necessary to draw solid conclusions using inductive reasoning on the data available, I would need to abstract away from, and operationalize, the concept of exploitation in such a manner that I might not realistically be able to extend my experimental findings to real world situations. So studying this area entirely experimentally, sans models, is not a particularly viable option.
What about modelling the situation mathematically?
Let us suppose for the sake of argument that we have an extremely accomplished and competent mathematician who is fully capable of creating any conceivably relevant mathematical (or more generally, declarative) model, but who is also willing, if necessary (but only if absolutely necessary), to create an imperative model. And let us also suppose for a moment, even if this has not been definitively proven in this essay, that it is always, in principle, possible to create an elegant, maximally simple, descriptive mathematical model of a phenomenon (even if it might require the extension of existing mathematics to do so, and even if humans might need the assistance of computers to find solutions to the equations in this model, or to deduce relevant new statements from it).
Under these two conditions, would this mathematician ever turn to imperative models? Would this hypothetical set-up, if true, effectively put an end to the use of such models? And indeed, to all other types of model (e.g. physical simulations of certain scenarios, or physical models of other types), and even to experimental research? Given our two starting conditions above, under what circumstances, if any, would that mathematician of unparalleled skill turn (perhaps with great chagrin) to the creation of an imperative computational model? Or even an experiment?
Of course, as I am not a mathematician, it’s very difficult for me to answer this question. Even with my limited knowledge, I have a profound appreciation for mathematics as a powerfully expressive formal system, which is also continually extending itself and its descriptive abilities, sometimes through the activities of pure mathematics research, but sometimes specifically in order to be sufficiently expressive to create new types of mathematical models of relevant physical phenomena (calculus and even to some extent the field of imaginary numbers spring to mind as examples). Nonetheless, one question might still arise in the face of all of this expressive power: what elements, properties and relationships of the system of interest do I need to initially include in my declarative model, in order to then capture or generate knowledge of relevant system behaviours through deductive reasoning?
This is particularly likely to be an issue in systems that are complicated or complex – ones that, for example, display emergent behaviour as a result of many objects interacting via a potentially small number of simple rules. Such systems, despite their simple rules, are capable of generating extremely complex and difficult to predict behaviours or structures. And in such systems, it isn't always clear which properties and behaviours will be importantly relevant to the emergent behaviour. Possibly many of them will be. Difficult to predict side effects and unintended consequences are also common in these systems. Because of this, it's very hard to know, just by reasoning about the system from existing known facts, available at a given level of detail, which elements of the original system we can safely leave out of the model while still gaining an awareness of relevant behaviours of the target system.
For the same reason, to switch away from computational models for a moment, getting rid of animal models of humans entirely and replacing them with mathematical models might also prove difficult. Perhaps not impossible in principle, but nonetheless a real challenge, at least in the short term, while our knowledge of human and animal biology remains relatively incomplete.
In general, it might only be possible to eliminate the extraneous variables in a model, and thus create the most simple and elegant model of a target system possible, once a greater understanding of these variables’ contribution, or lack thereof, to the behaviour of the system of interest becomes available. And this greater understanding of the system might be very difficult to gain, if not impossible, simply by carrying out deductive reasoning on already known facts about the system.
If we start our investigation, instead, by including many variables, we can investigate, without assumptions, the significance of these variables, as well as the relevance of, and emergent relationships between, these variables. As a result, we may have more success at accurately capturing relevant behaviours of the target system. If we learn, in the process, that many of the original variables included in the model appear not to be relevant, perhaps because they do not interact, then we can safely eliminate them in subsequent models of the system.
In this respect, computer models, declarative and imperative alike, share something in common with non-computational models. Non-computational models, simply by their physical natures, include, ‘for free’, many properties and relationships. Although computer models do not get any of these physical properties for free in a direct sense, it is relatively easy when setting up a computer model to include detailed programmatic representations of physical properties at an individual object level. At the same time, it is possible to record every behaviour and property of each simulated object throughout the duration of a model run to better understand which of these behaviours and properties are having an influence on the over-all behaviour of the model, and by extension which are likely to be having an influence on the behaviour of the target system.
Such models will, admittedly, be messier, more complicated, less elegant, more difficult to work with and possibly more opaque than the in-principle-possible-to-create most simple and elegant mathematical model of the phenomenon. Perhaps they will initially contain many variables that, ultimately, are irrelevant to the behaviour of the target system. But the relevance of any given variable is a conclusion of such models, not an assumption. This provides one possible answer to the question posed above, about circumstances in which the mathematician might turn to imperative models and other strategies. Ultimately, the mathematician might turn to experiments, non-computational models and programmatic models if they felt the need to enrich or validate their starting model premises and assumptions about which system variables were relevant, as well as the assumed nature of the relationships between these variables.
If all of this fails to be convincing with respect to the question of whether or not programmatic models should ever be used, I would finally also ask, on a more pragmatic note: if under some circumstances the in-principle best possible (which is to say – simplest, most generalizable, most abstract, most powerful, most elegant) model of a particular target system is not the one that is used in order to arrive at correct and useful conclusions about that system, how problematic is this, really, assuming that it is indeed the case that the required conclusions can still be validly generated and stated in appropriate terms (i.e. an inductive conclusion is not stated as a deductive conclusion) and the analogical reasoning used to transfer the conclusion from the model to the target is made explicit?
To use a trivial example, in the case of the paper model of our living room, if the best possible model of this situation is, by definition, a mathematical one, but we don’t have the ability to make the required mathematical model ourselves, due to our sadly limited mathematical capabilities, does this mean that we need to find a mathematician for hire who is willing to make the mathematical model of our living room, and then operate it for us in order to determine if the sofa will fit into the corner next to the love seat? Probably not. The paper model, though sub-optimal, will work just fine in this case to provide the answer we need. On the other hand, if we need to prove, definitively and inarguably, that the maximum number of sofas we could fit into the living room is exactly 10, then we might in fact need to hire the mathematician. There is a time and a place for everything.
The critiques that experimentalists and mathematicians make about computer models are very valid, and should be taken into account. Programmatic modelling practices can certainly be improved, and programmatic models, broadly speaking, should be made less opaque, less idiosyncratic, more interpretable, more validatable, more verifiable, more replicable and, yes, more mathematical. As well, the conclusions that can be validly drawn from this type of model should not be overstated – they do not produce general deductive conclusions. Moreover, the results of such models should be confirmed, and extended, by both experimental and mathematical methods, and primarily used as preliminary inputs to these other types of research activities.
Taking these concerns into account, imperative models can be a valuable tool in our efforts to extend our understanding of systems and phenomena. To dismiss the entire class of models out of hand as being valueless would seem to be a deductive conclusion based on poor premises.