Complexity and Real Computation Laboratory

Digital Computers naturally process discrete data, such as bits or integers or strings or graphs. From bits to advanced data structures, from a first semi-conducting transistor to billions in wafer-scale integration, from individual Boolean connectives to entire CPU circuits, from kB to TB memories, from 10² to 10⁹ instructions per second, from assembly code to high-level programming languages:
the success story of digital computing arguably is due to (1) hierarchical layers of abstraction and (2) the ultimate reliability of each layer for the next one to build on ― for processing discrete data.

Continuous data on the other hand commonly arises in Physics, Engineering and Science: natura non facit saltus. Such data mathematically corresponds to real numbers, smooth functions, bounded operators, or compact subsets of some abstract metric space.

The rise of digital (over analog) computers led a stagnation in the realm of continuous (=non-discretized) data processing: 35 years after introduction and hardware standardization of IEEE 754 floating point numbers, mainstream Numerics is arguably still dominated by this forcible discretization of continuous data ― in spite of violating associative and distributive laws, breaking symmetries, introducing and propagating rounding errors, in addition to an involved (and incomplete) axiomatization including NaNs and denormalized numbers.

Figure 1: Floating Point Number Line, from https://courses.engr.illinois.edu/cs357/fa2019/assets/images/figs/floatingpoints.png. See also https://www.jasss.org/9/4/4.html#2.1

Deviations between mathematical structures and their hardware counterparts are common also in the discrete realm, such as the “integer” wraparound 255+1=0 occurring in bytes that led to the Nuclear Gandhi programming bug.

Similarly, deviations between exact and approximate continuous data underlie infamous failures such as the Ariane 501 flight V88 or the Sleipner-A oil platform.

Nowadays high-level programming languages (such as Java or Python) provide a user data type (called for example BigInt or mpz_t) that fully agrees with mathematical integers, simulated in software using a variable number of hardware bytes. This additional layer of abstraction provides the reliability for advanced discrete data types (such as weighted or labelled graphs) to build on, as mentioned above.

We develop Computer Science for continuous data, to catch up with the discrete case: from foundations via practical implementation to commercial applications.

In fact some object-oriented software libraries, such as iRRAM or Core III or realLib or Ariadne or Aern, have long been providing general (=including all transcendental) real numbers as exact encapsulated user data type. Technically they employ finite but variable precision approximations: much like BigInt, but with the added challenge of choosing said precision automatically and adaptively sufficient for the user program to appear as indistinguishable from exact reals. This requires a new (namely partial) semantics for real comparison: formalizing the folklore to “avoid” testing for equality, in terms of Kleene's ternary logic.

Sewon Park in his PhD Thesis has extended that semantics to composite expressions, and further to command sequences aka programs, whose correctness can then be symbolically verified using an extension of Floyd-Hoare Logic;

Complexity and Real Computation Laboratory

Computer Science for Continuous Data: Algorithmic Foundations of Numerics

Motivation

Mission

References