Workshop on Modular Knowledge (Tetrapod)

Oxford, July 13, 2018

at the Federated Logic Conference 2018

affiliated with the Third International Conference on Formal Structures for Computation and Deduction

#### Description

Mathematics, logics, and computer science support a rich ecosystem of formal knowledge. This involves many interrelated human activities such as modeling phenomena and formulating conjectures, proofs, and computations, and organizing, interconnecting, visualizing, and applying this knowledge. To handle the ever increasing body of knowledge, practitioners employ a rapidly expanding set of representation languages and computer-based tools centered around the four fundamental paradigms of formal deduction, computation, datasets, and informal narration.

Modularity has been recognized in all FLoC-related communities as a critical method for designing scalable representation languages and building large corpora of knowledge. It is also extremely valuable for comparing and exchanging knowledge across communities, corpora, and tools - a challenge that is both pressing and difficult.

Expanding on the Tetrapod workshop at the conference on intelligent computer mathematics (CICM) 2016, this workshop brings together researchers from a diverse set of research areas in order to create a universal understanding of the challenges and solutions regarding highly structured knowledge bases.

Of particular interest are

- foundational principles such as theory graphs and colimits
- interchange languages and module systems
- languages and tools for representing, reasoning, computing, managing, and documenting modular knowledge bases

#### Format

- There will be 6 invited speakers, each of which will be asked to present a specific topic.
- Each speaker will give a 15-minute presentation on that topic that is followed by a 30-minute discussion session.
- There will not be a call for papers or other contributions. However, there will be a call for participation that will include the invited speakers and their topics.

#### Invited speakers and topics

Speaker | Modularity in… | Slides |
---|---|---|

Catherine Dubois | Proof Checking | slides |

Georges Gonthier | Large Proofs | slides |

Till Mossakowski | Ontologies | |

Natarajan Shankar | Proof Assistants | slides |

Doug Smith | Software Synthesis | slides |

Nicolas M. Thiery | Mathematical Computation | slides |

#### Organizers

- Jacques Carette, McMaster University (carette@mcmaster.ca)
- Dennis Müller, FAU Erlangen-Nürnberg (d.mueller@kwarc.info)
- Florian Rabe, Jacobs University Bremen (florian.rabe@fau.de)

### Abstracts

#### Modularity in Proof Checking (Catherine Dubois)

The essence of (automated) proof checking is to check proofs for correctness. A proof checker takes a proposition and a presumed proof and merely verifies the proof, checking it for correctness. It is much simpler to verify a proof than finding it. Proof checking can be done in a batch way or interactively, as part of an interactive theorem prover. For example it is triggered in Coq by the keyword that ends a proof script (e.g. Qed or Defined). Moving proof checking outside of theorem provers is explored and implemented for example within the Dedukti system. Proof checking may also include some proof reconstruction.

We may wonder why we need proof checking. First publications in mathematics or computer science may contain errors. Second automated theorem provers as complex software are buggy and thus verifying their answers and proofs is very important. Proof assistants are also complex software and having some independent proof checkers - off the shelves - may also help to reduce the trusted base instead of proving the code of proof assistants (which is an impossible task). Proof checkers are simpler software than interactive or automated proof tools. Their code can be reasonably checked by crossed-reviews for example.

Type theory reduces the problem of proof checking to the problem of type-checking in a programming language. Thus a proof is a lambda-term, it is a correct proof of a formula if its type is exactly the formula to be proven, thanks to the proofs-as-programs analogy. In our presentation, we’ll focus on that definition/dimension of proof checking.

Modular systems in computer science are divided into components or modules with well-defined interfaces and dependencies as small as possible. Modularity requires also mechanisms to compose/compile/assemble the components together to obtain an executable software. There is no doubt that proof checkers are modular software according to the previous definition. Let us go further and apply the feature of components/modules to proofs. We can distinguish internal modularity and external modularity. Internal modularity provides mechanisms to structure large proofs (e.g. modules, inheritance) while external modularity provides mechanisms to interface proofs having different origins and build ‘composite proofs’. I will illustrate external modularity with Dedukti and a proof of Erathostenes sieves made of a Coq component and a HOL component verified by the Dedukti proof checker.

#### Modularity for specification, ontologies, and model-driven engineering (Till Mossakowski) (CANCELED)

Within the fields of software specification, ontologies, model-driven engineering, and others, different notions of modularity have been studied. It turns out that a semantics of these artefacts can be both given in terms of logical theories, as well as in terms of model classes.

Our central hypothesis is that a good support for modularity should one the hand support both the model-theoretic and theory-based point of view. On the other hand, it should be applicable to a wide variety of (logical) languages.

The Distributed Ontology, Modeling and Specification Language (DOL), standardised by the OMG, aims at providing a unified metalanguage for meeting these criteria.

#### What is a Module? (Natarajan Shankar)

A module is a functional subsystem with an interface through which it interacts with other subsystems. A module could be a body of knowledge or a theory (e.g., vector spaces), a biological subsystem (e.g., the nervous system), an organizational unit (e.g., accounting), a physical component (e.g., steering wheel), or a software component (e.g., compiler). In computing, modules appear in specification languages like Z and PVS; modeling languages like B, TLA+, and SAL; and programming languages like Ada and ML. Module mechanisms serve several different purposes: packaging and reuse of related functionality, abstraction and encapsulation of state and internal representations, separate compilation, and composition. We discuss some of the principles underpinning the design of module systems and accompanying composition and reasoning principles.

#### Modular Knowledge in Software Synthesis (Doug Smith)

Software synthesis tools support the translation of requirements into acceptable software de- signs. The requirements may be expressed logically, with a deductive design process, or the requirements may come in the form of datasets, with an inductive design process. This session focuses on modularity in software synthesis, and in particular on modular knowledge about re- quirements, software design, and the structure of the design process. We briefly outline some of the key forms of modularization that arise in these aspects of software synthesis.

1 - Requirements

For applications where mathematical correctness is important, requirements and high-level designs can be structured using formal logical specifications as finite presentations of theo- ries, which are composed using specification morphisms and colimits. For machine learning applications, the requirements come in form of structured data sets.

2 - Software Design Knowledge

Representations of software design knowledge allow developers to guide an automated synthe- sis system so that it can effectively translate requirement-level specifications into acceptable high-level designs. Transformations (replace specification or code patterns by other code patterns) and inference rules that relate logical goals with program structure (e.g. deduc- tive synthesis rules) provide some of the basic design knowledge modules for any synthesis system. Larger grain modules can capture reusable higher-level design knowledge: algorithm design theories, datatype specifications and refinements, formalized design patterns, and architectural patterns. These are used with deductive/refinement techniques to generate ini- tial designs from requirement-level specifications. Machine learning applications also exploit larger-grain patterns, such as linear regression models and neural networks. These provide a computation pattern (with parameters to-be-learned and chosen hyperparameters) that can be instantiated using methods for fitting models to the given data. More ad-hoc forms of de- sign knowledge include sketches, program templates, and schemata. These are used with (1) deductive/refinement techniques to generate instantiations from logical specifications, or (2) inductive generalization techniques that search a space of instances given concrete examples.

3 - Derivations

A software synthesis process that generates interesting software usually involves the compo- sition of many kinds of design knowledge and so it is relevant to consider the structuring of that composition, called a derivation. Derivations can be a sequence (or tree) of specifications where each is derived from its predecessor by applying a module of design knowledge. Typ- ically, the early knowledge applications in a derivation introduce the overall algorithmic or architectural structure and then many more knowledge applications are applied to improve performance and to fit the design to the target computational substrate.

#### Modularity in Mathematical Computation (Nicolas Thiery)

Over the last decades, a huge amount of computational software was developed for pure mathematics, in particular to support research and education. As for any complex ecosystem of software components, the ability to compose them has a multiplier effect on the expressive power, flexibility, and range of applications.

The purpose of this session is to share experience on the many barriers to composability in computational mathematics, and how they are being tackled in various communities. Of particular interest will be the exploitation of knowledge to leverage some of the barriers. feedback from neighbor fields (proofs, data, knowledge) will be most welcome.