__Cooperating sequential processes__

0. Introduction

1. On the Nature of Sequential Processes.

2. Loosely Connected Processes.

2.1 . A Simple Example.

2.2. The Generalized Mutual Exclusion Problem.

2.3. A Linguistic Interlude.

3. The Mutual Exclusion Problem Revisited.

3.1. The Need for a More Realistic Solution.

3.2. The Synchronizing Primitives.

3.3. The Synchronizing Primitives Applied to the Mutual Exclusion Problem.

4.1. Typical Uses of the General Semaphore.

4.2. The Superfluity of the General Semaphore.

4.3. The Bounded Buffer.

5. Cooperation via Status Variables.

5.1. An Example of a Priority Rule.

5.2. An Example of Conversations

5.2.1. Improvements of the Previous Program.

5.2.2. Proving the Correctness.

6. The Problem of the Deadly Embrace.

The main purpose of this preface is to explain the specification "Preliminary Version", appearing on the title page of these lecture notes. They have been prepared under considerable tine pressure, circumstances under which I was unable to have my use of the English language corrected by a native, circumstances under which I was unable first to try out different methods of presentation. As they stand, I hope that they will serve their two primary purposes: to give my students a guide as to what I am telling and to give my Friends and Relations an idea of what I am doing.

The future fate of this manuscript, that may prove to be a monograph in statu nascendi, will greatly depend on their reactions to it. I am greatly indebted, in advance, to any reader who is so kind as to take the trouble to give his comments, either in the form of suggestions how the presentation or the material itself could be improved, or in the form of an appreciation. From the latter comments I will try to get an idea whether it is worth-while to pursue this effort any further and to prepare a publication fit for and agreeable to a wider public.

Already at this stage I should like to express my gratitude to many: to my collaborators C.Bron (in particular for his scrutinous screening of the typed version), to A.N.Habermann, F.J.A.Hendriks, C.Ligtmans and P.A. Voorhoeve for many stimulating and clarifying discussions on the subject itself, to the Department of Mathematics of the Technological University, Eindhoven, for the opportunity to spend my time on the problems dealt with and to lecture on their solutions and also —trivial as it may seem, this is nevertheless vital:— for putting at my private disposal a type writer with a character set in complete accordance with my personal wishes.

E.W.DijkstraDepartment of Mathematics

Technological University

P.O. Box 513

EINDHOVEN

The Netherlands

These lectures are intended for all those that expect that in their future activities they will become seriously involved in the problems that arise in either the design or the more advanced applications of digital information processing equipment; they are further intended for all those that are just interested.

The applications I have in mind are those in which the activity of a computer must include the proper reacting to a possibly great variety of messages that can be sent to it at unpredictable moments, a situation which occurs in process control, traffic control, stock control, banking applica- tions, automization of information flow in large organizations, centralized computer service and, finally, all information systems in which a number of computers are coupled to each other.

The desire to apply computers in the ways sketched above has often a strong economic motivation, but in these lectures the not unimportant ques- tion of efficiency will not be stressed too much. We shall occupy ourselves much more with the logical problems which arise, for example, when speed ratios are unknown, communication possibilities restricted etc. We intend to do so in order to create a clearer insight into tie origin of the difficulties we shall meet and into the nature of our solutions. To decide whether under given circumstances the application of our techniques is economically attractive or not falls outside the scope of these lectures.

I regret that I cannot offer a fully worked out theory, complete with Greek letter formulae, so to speak. The only thing I can do under the present circumstances is to offer a variety of problems, together with solutions. And in discussing these, we can only hope to bring as much system into it as we possibly can, to find which concepts are relevant, as we go along. May everyone that follows me along this road enjoy the fascination of these intriguing problems as much as I do!

__1. On the Nature of Sequential Processes.__

Our problem field proper is the cooperation between two or more sequential processes. Before we can enter this field, however, we have to know quite clearly what we call "a sequential process". To this preliminary question the present section is devoted.

I should like to start my elucidation with the comparison of two machines to do the same example job, the one a non-sequential machine, the other a sequential one.

Let us assume that of each of four quantities, named "`a`[1]", "`a`[2]", "`a`[3]" and "`a`[4]" respectively, the value is given. Our machine has to process these values in such a way that, as its reaction, it "tells" us, which of the four quantities has the largest value. E.g. in the case:

`a`[1] = 7, `a`[2] = 12, `a`[3] = 2, `a`[4] = 9

the answer to be produced is "`a`[2]" (or only "2". giving the index value pointing to the maximum element).

Note that the desired answer would become incompletely defined if the set of values were —in order— "7, 12, 2, 12". for then there is no unique largest element and the answer "`a`[2]" would have been as good (or as bad) as "`a`[4]". This is remedied by the further assumption, that of the four values given, no two are equal.

Remark 1. If the required answer would have been the maximum value occuring among the given ones, then the last restriction would have been superfluous, for then the answer corresponding to the value set "7, 12, 2, 12" would have been "12".

Remark 2. Our restriction "Of the four values no two are equal" is still somewhat loosely formulated, far what do we mean by "equal"? In the processes to be constructed pairs of values will be compared with one another and what is really meant is, that every two values will be sufficiently different, so that the comparator will unambiguously decide, which of the two is the largest one. In other words, the difference between any two must be large compared with "the resolving power" of our comparators.

We shall first construct our non-sequential machine. When we assume our given values to be represented by currents, we can imagine a comparator consisting of a two-way switch, the position of which is schematically controlled by the currents in the coils of electromagnets as in Fig.1 and Fig.2.

Fig. 1.
x < y |
Fig. 2.
y < x |

When current `y` is larger than current `x`, the left electromagnet pulls harder than the right one and the switch switches to the left (Fig.l) and the input `A` is connected to output `B`; if current `x` is the larger one, we shall get the situation (Fig.2) where the input `A` is connected to output `C`.

In our diagrams we shall omit the coils and shall represent such a comparator by a small box

Fig. 3.

Now we can construct our machine as indicated in Fig.3. At the output side we have drawn four indicator lamps, one of which will light up to indicate the answer.

In Fig.4 we indicate the position of the switches when the value set "7, 12, 2, 9" is applied to it. In the boxes the positions of the switches are indicated, wires not connected to the input are drawn blotted.

Fig.4.

We draw the reader's attention to the fact that now only the positions of the three switches that connect output 2 to the input, matter; the reader is invited to convince himself that the position of the other three switches is indeed immaterial.

It is also good to give a moment attention to see what happens in time when our machine of Fig.3 is fed with four "value currents". Obviously it cannot be expected to give the correct answer before the four value currents are going through the coils. But one cannot even expect it to indicate the correct answer as soon as the currents are applied, for the switches must get into their correct position and this may take some time. In other words: as soon as the currents are applied (simultaneously or the one after the other) we must wait a period of time —characteristic for the machine— and after that the correct answer will be shown at the output side. What happens in this waiting time is immaterial, provided that it is long enough for all the switches to find their final position. They may start switching simultaneously, the exact order in which they attain their final position is immaterial and, therefore, we shall not pay any attention to it any more.

From the logical point of view the switching time can be regarded as a marker on the time axis: before it the input data have to be supplied, after it the answer is available.

In the use of our machine the progress of time is only reflected in the obvious "before - after" relation, which tells us, that we cannot expect an answer before the question has been properly put. This sequence relation is so obvious (and fundamental) that it cannot be regarded as a characteristic property of our machine. And our machine is therefore called a "non-sequential machine" to distinguish it from the kind of equipment —or processes that can be performed by it— to be described now.

Up till now we have interpreted the diagram of Fig.3 as the (schematic) picture of a machine to be built in space. But we can interpret this same diagram in a very different manner if we place ourselves in the mind of the electron entering at the top input and wondering where to go. First it finds itself faced with the question whether "`a`[l] < `a`[2]" holds. Having found the answer to this question, it can proceed. Depending on the previous answer it will enter one of the two boxes "`a`[1] < `a`[3]" or "`a`[2] < `a`[3]", i.e. it will only know what to investigate next, after the first question has been answered. Having found the answer to the question selected from the second line, it will know which question to ask from the third line and having found this last answer it will now know which bulb should start to glow. Instead of regarding the diagram of Fig.3 as that of a machine, the parts of which are spread out in space, we have regarded it as rules of behaviour, to be followed in time.

With respect to our earlier interpretation two differences are highly significant. In the first interpretation all six comparators started working simultaneously, although finally only three switch positions matter. In the second interpretation only three comparisons are actually evaluated —the wondering electron asks itself three questions— but the price of this gain is that they have to be performed the one after the other, as the outcome of the previous one decides what to ask next. In the second interpretation three questions have to be asked in *sequence*, the one after the other. The existence of such an order relation is the distinctive feature of the second interpretation which in contrast to the first one is therefore called "a sequential process". We should like to make two remarks.

Remark 3. In actual fact, the three comparisons will each take a finite amount of time (switching time", "decision time" or, to use the jargon, "execution time") and as a result the total time taken will at least be equal to the sum of these three execution times. We stress once more, that for many investigations these executions can be regarded as ordered markers on a scaleless time axis and that it is only the relative ordering that matters from this (logical) point of view.

Remark 4. As a small side line we note that the two interpretations (call them "simultaneous comparisons" and "sequential comparisons") are only extremes. There is a way of, again, only performing three comparisons, in which two of them can be done independently from one another, i.e. simultaneously; the third one, however, can only be done, after the other two have been completed. It can be represented with the aid of a box in which two questions are put and which, as a result, has four possible exits, as in Fig.5.

Fig.5.

The total time taken will be at least the sum of the comparison execution times. The process is of the first kind in the sense that the first two comparisons can be performed simultaneously, it is of sequential nature as the third comparison can only be selected from the second line when the first two have both been completed.

We return to our purely sequential interpretation. Knowing that the diagram is meant for purely sequential interpretation we can take advantage of this circumstance make the description of the "rules of behaviour" more compact. The idea is, that the two questions on the second line —only one of which will be actually asked— are highly similar: the questions on the same line only differ in the subscript value of the left operand of the comparison. And we may ask ourselves: "Can we map the questions on the same line of Fig.3 on a single question ?"

This can be done, but it implies that the part that varies along a line —i.e. the subscript value in the left operand— must be regarded as a parameter, the task of which is to determine which of the questions mapped on each other is meant, when its turn to be executed has come. Obviously the value of this parameter must be defined by the past history of the process.

Such parameters, in which past history can be condensed for future use are called *variables*. To indicate that a new value has to be assigned to it we use the so-called assignment operator ":=" (read: "becomes"), a kind of directed equality sign which defines the value of the left hand side in terns of the value of the right hand side.

We hope that the previous paragraph is sufficient for the reader to recognize also in the diagram of Fig.6 a set of "rules of behaviour". Our variable is called *i*; if the reader wonders, why the first question, which is invariably "`a`[l] < `a`[2] ?" is not written that way, he is kindly requested to have some patience.

Fig.6

When we have followed the rules of Fig.6 as intended from top till bottom, the final value of `i` will identify the maximum value, viz. `a`[`i`].

The transition from the scheme of Fig.3 to the one of Fig.6 is a drastic change, for the last "rules of behaviour" can only be interpreted sequentially. And this is due to the introduction of the variable `i`: having only `a`[1], `a`[2], `a`[3] and `a`[4] available as values to be compared, the question "`a`[i] < `a`[2] ?" is meaningless, unless it is known for which value of `i` this comparison has to be made.

Remark 5. It is somewhat unhappy that the jargon of the trade calls the thing denoted by `i` a variable, because in normal mathematics, the concept of a variable is a completely timeless concept. Time has nothing to do with the `x` in the relation

sin(2 * `x`) = 2 * sin(`x`) * cos(`x`) ;

if such a variable ever denotes a value, it denotes "any value".

Each time, however, that a variable in a sequential process is used —such as `i` in "`a`[`i`]"— it denotes a very specific value, viz. the last value assigned to it, and nothing else! As long as no new value is assigned to a variable, it denotes a constant value!

I am, however, only too hesitant to coin new terms: firstly it would make this monograph unintendedly pretentious, secondly I feel that the (fashionable!) coining of new terms often adds as much to the confusion in one way as it removes in the other. I shall therefore stick to the term "variable".

Remark 6. One may well ask, what we are actually doing, when we introduce a variable without specifying, for instance, a domain for it, i.e. a set of values which is guaranteed to comprise all its future actual values. We shall not pursue this any further here.