# Why PGA?

Writing a convincing rationale for the work on program algebra in hindsight is a problematic form of backward engineering. Still it needs to be done in order to collect the various motivational remarks in a growing collection of papers to a comprehensible set of arguments.

## Technical similarity and dissimilarity with ACP-style process algebra

In ACP one uses a parameter A to denote a finite collection of so-called atomic actions. Processes are then denoted by means of expressions made from these atomic actions and a number of operators. Processes as such are elements of process algebras, i.e. algebras providing interpretations for the atomic actions and the various operators. Process algebras can differ in mainly two ways: more operators allow more complicated (and expressive) algebras, and for a fixed number of operators the algebras may be more or less abstract, the most abstract process algebras identifying most processes.

In PGA-style program algebra one also uses a collection A as a parameter. In this case the elements of this parameter set are called atomic instructions. Atomic instructions differ from atomic actions because an environment is presupossed that serves as the context where an instruction is executed. Upon the termination of the execution the environment will return to the running program a boolean value, the so-called reply value. This form of interaction is simpler but less symmetric than the communication found in process algebra. The theory of component interaction obtained in program algebra is by consequence far more involved and less elegant than in process algebra. But it is significantly closer to the intuitions of sequential programming.

Like in ACP-style process algebra no assumption is made about the nature of the atomic components (i.e. the atomic instructions). This already marks a difference with all theories of computer programing that presuppose the existence of some collection fo data-types. Some key underlying assumptions are:

• the concept of a computer program does not and should not depend on any notion of abstact or concrete datatype except the two boolean values.

• a program is an element of a program algebra. In particular in PGA-stype program algebra, a program will be an element of the instruction sequence equivalence based model of the equations of PGA. Expressions in PGA (and in many other notations) are program descriptions rather than programs. In simplified language one often uses 'program' instead of 'program description'. That is ok for daily use but should be avoided in conceptual issues are being dicsussed.

• The essential property of a program is connected to it being a sequence of commands. (The phrase instruction is not so helpful here as that means no more or less than something which has been placed in a sequence.) A command is a request that an environment will perform a task and report about its activity by returning a boolean reply. For the program as such neither te structure or nature of this environment nor the purpose or specification of the task is of any importance. In particular the concept of a program is (and should be) entirely independent of the concept of a so-called digital computer.

• The meaning or semantics of a program is not the same as the program itself. The semantics of a program ois a so-called polarized process also called a thread. It is found via behavioral abstraction (also called process extraction) from a program. This is the key difference with process algebra where a distinction between a process and the semantics of a process is not made.

PGA resembles ACP by providing an algebraic strcuture on top of an unconstrained set of atoms (in PGA atomic instructions), but to a much lesser degree one can expand the instruction sequence equivalence algebra with additional operators to find more expressive program algebras. Instead richer program description notations are found by using quite different syntax and projecting descriptions into PGA in order do define its semantics.

## Projection semantics

The use of projection semantics allows one to simulate the pragmatics of program description notation design. The meaning of a program description is simply found by determining the program it describes. To find that program one needs a transformation (here called projection function) from the given description notation PGLX (programming language X) to PGLA the program description notation that goes with PGA. (It might have been better to write PgDNA (or simply PDNA) instead of PGLA in order to mark the difference between programs and their descriptions more clearly throughout the work.)

The assumption underlying projection semantics is that ONLY IF a projection for a prospective program description notation is known (give), one can confirm that it is a program description notation indeed. If the projection is changed so is the program description notation. The 'classical' degree of freedom implicit in statements like: 'this is executed differently by the new PGLX compiler' does not really exist! New compilers (when changing semantics) make new program (description) notations, because a program notation may now be mapped to another program.

## What is a program? Consider PGA!

One reason for starting the work on PGA has been dissatisfaction with the vast quantity of books on programming (ususlly in Java) that pay no attention to the proper definition of thenotion of a computer program. I don't thrust any definition like: 'programs tell computers what to do' (because usually they reside silently on a disk and don't tell anything at all to anyone).

Then, after looking at a number of classical papers once more, it turned out that (in my view) the theoretical literature pays no real attention to the question of what constitutes a program either. Many papers consider this question settled in principle by a reference to the Turing Machine. I do not believe that at all. The TM is about machines and is not a platform for a first class treatment of programs. Many papers also deal with programs via examples, invariably using quite specific abstract data types like counters, floating point numbers or stacks as a point of departure. Again I fail to see that any of these expositions provides more than an example of a concept that remains hidden underneath. Then many papers start with a version of the lambda-calculus as a definition of a program. Again this is unclear to me because the whole observation (or assumption) that a Java program is a program because in fact it corresponds to an expression in an appropriately typed lambda-calculus strikes me as being both historically and intuitively invalid, irrespectively of the fact that this view of the matter can help to solve subtle semantical issues.

The classical work on program schemes comes closest to the PGA approach but that work has not been updated to deal with more recent developments. Moreover, it has been introduced as a way to make logical investigations about an important concept known in practice. PGA starts from the more drastic assumption that this important concept (programs and programming) is not really known in practice and still is in need of proper definitions.

## Where to go with PGA?

Clearly this question is primarily a worry for its designers alone! But given the fact that other people pay for our work some open information about future directions is definitely in place. At this stage no prediction can be made as to whether or not PGA will end up as a historical footnote (if at all) or more. Work on PGA cannot be based on any assumption regarding this matter. A slow but steady development of core theory about PGA is the current course of development.

1. Computer Virusses.

Our current project is to deal in comprehensive detail with the foundations of computer virus reseach as laid down in 1984 by Fred Cohen. Its seems that formalizing his work in the setting of program algebra is quite feasible (if not really 'natural') but at the same time confusing: the undecidability (of being a computer virus) that Cohen observed seems not to exist after this formalization. This is quite remarkable: because Cohen's proof suggests a diagonal argumant resembling the liar paradox which is absent in the PGA formalization (what we call a 'bypass' of Cohen's impossibility result.) His result can be recosntructed as well, but then we need a different proof.

This kind of work can lead to applications for which PGA is meant: to explain afterwards why things worked out the way they did. In the case of Cohen's result there has been a remarkable absence of secondary work on the subject in spite of a very large number of references to it. (An interesting example of a retrospective explanation is with respect to the role of goto's. Rather than using the structured programming dogma that goto's are problematic but fortunately redundant, PGA theory proves them more expressive than while loops. This may be viewed as a 'explanation' of the facts that goto's survive in modern programming languages, C# being a nice and recent example.)

2. Virtual Machines.

In PGA we have given up the position that (our) research is a front runner of industrial development. Conversely the leading role of industry is taken as a fact and the actual course of technology is is considered to constitute empirical data still in need of proper explanation. A second big area that we intend to investigate from the program algebra perspective is that of virtual machines. Technically virtual machines are programs. Probably this is not helpful when looking for clear definitions. Starting from the bare elements of the subject the very notion of a stored program requires further explanation. What exactly is the difference between a stored program computer and a non-stored program computer. By never defining non-stored program computers in a satisfactory way books and papers applauding the famous stored program model fail to provide the basis for a proper understanding of this issue. The same may be said for virtual machines. Only on the basis of a very clear qanalysis of the interaction between programs and machines one can hope to point down in satisfactory detail what the status of virtual machines might be.

3. Execution architectures and stored programs on Maurer Computers.

We have spotted Maurer's 1967 JACM paper on 'A theory of computer instructions' as the best possible basis for a theory of machines (Maurer calls them computers) that can used in combination with PGA-style program algebra. Maurer's model is particularly suitable to analyze the multitude of conceivable machine models between what we call the split program machine model (in computer architecture loosely indicated as the ISA architecture), and the stored program running on a modern out-of-order pipelined multi-processor chip. Here we are looking for quantitative results concerning the fraction of conceivable functionalities that can actually be programmed and stored on a given Maurer computer.