Academic Year 25-26


Automatic synthesis of reactive systems with registers

Promotor: Emmanuel Filiot (efiliot@ulb.be)

Objective: The goal of synthesis is to automatically generate a program from a specification of the properties the program must fulfil, rather than of how it must fulfil them. This makes it possible to obtain programs which are correct *by construction*. In this master thesis, the goal is to implement a synthesis algorithm for programs which can be represented as register automata, i.e. automata equipped with registers which can store input data and compare incoming data to the stored values. Such automata can model reactive systems, i.e. non-terminating systems in interaction with the environment in which they are executed. The thesis will consider specifications expressed both as automata and as logical formulas.
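To make the model concrete, here is a minimal sketch (in Python, with illustrative state and class names of our own choosing) of a one-register automaton over an infinite data alphabet. It accepts exactly the data words in which the first datum occurs a second time, using only the two operations available to register automata: storing a datum in the register and testing an incoming datum for equality with the stored one.

```python
class RegisterAutomaton:
    """A 1-register automaton over an infinite data alphabet.

    States: init -> wait -> accept. On the first letter the automaton
    stores the datum; afterwards it compares each datum with the register.
    """

    def __init__(self):
        self.state = "init"
        self.register = None

    def step(self, d):
        if self.state == "init":
            self.register = d              # store the first datum
            self.state = "wait"
        elif self.state == "wait" and d == self.register:
            self.state = "accept"          # an equality test fired

    def accepts(self, word):
        self.state, self.register = "init", None
        for d in word:
            self.step(d)
        return self.state == "accept"
```

For instance, the word 3 1 4 3 is accepted (the first datum 3 reappears), while 1 2 3 is rejected; note that the automaton works for any data domain with equality, integers and strings alike.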

The work will be based on research papers such as

Léo Exibard, Emmanuel Filiot, Ayrat Khalimov: A Generic Solution to Register-Bounded Synthesis with an Application to Discrete Orders. ICALP 2022: 122:1-122:19

Please contact the promotor for more details.


Learning algorithms for register automata

Promotor: Emmanuel Filiot (efiliot@ulb.be)

Objective: Learning algorithms for regular languages have been investigated for a long time. The goal of this thesis is to lift some of those algorithms to regular languages of *data words*. Data words are words over an infinite alphabet of data (integers, reals, etc.). Regular languages of data words are for instance defined as languages recognizable by register automata, i.e. automata with registers which can store input data and compare incoming data to the stored values. The first goal of the thesis will be to understand a standard algorithm in passive learning of regular languages, called RPNI, and to work on its extension to data word languages.
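As a first illustration of passive learning, the following self-contained sketch implements the classical RPNI scheme for plain regular languages over a finite alphabet (so without registers; extending this to data words is precisely the subject of the thesis): build the prefix tree acceptor of the positive samples, then greedily merge states, folding the result back into a deterministic automaton and keeping a merge only if no negative sample becomes accepted. All function and variable names are ours, not taken from any specific library.

```python
def rpni(pos, neg):
    """Learn a DFA from positive samples `pos` and negative samples `neg`
    (each sample is a string), following the RPNI state-merging scheme."""
    # 1. Build the prefix tree acceptor (PTA): one state per prefix of pos.
    trans, accept, order, n = {}, set(), [0], 1
    for w in sorted(pos, key=lambda w: (len(w), w)):
        q = 0
        for a in w:
            if (q, a) not in trans:
                trans[(q, a)] = n
                order.append(n)
                n += 1
            q = trans[(q, a)]
        accept.add(q)

    def find(part, q):                     # representative of q's class
        while part[q] != q:
            q = part[q]
        return q

    def fold(part, q1, q2):
        # Merge the classes of q1 and q2, then keep merging target classes
        # until the quotient automaton is deterministic again.
        part, stack = dict(part), [(q1, q2)]
        while stack:
            x, y = stack.pop()
            rx, ry = find(part, x), find(part, y)
            if rx == ry:
                continue
            part[ry] = rx
            out = {}
            for (s, sym), t in trans.items():
                if find(part, s) == rx:
                    rt = find(part, t)
                    if sym in out and out[sym] != rt:
                        stack.append((out[sym], rt))
                    else:
                        out[sym] = rt
        return part

    def consistent(part):
        # A merge is kept only if the quotient still rejects every negative.
        acc = {find(part, q) for q in accept}
        qt = {(find(part, s), a): find(part, t) for (s, a), t in trans.items()}
        for w in neg:
            q = find(part, 0)
            for a in w:
                q = qt.get((q, a))
                if q is None:
                    break
            if q is not None and q in acc:
                return False
        return True

    # 2. Try to merge each PTA state with an already-promoted ("red") state.
    part, red = {q: q for q in range(n)}, [0]
    for q in order[1:]:
        if find(part, q) != q:
            continue                       # already merged away
        for r in red:
            cand = fold(part, r, q)
            if consistent(cand):
                part = cand
                break
        else:
            red.append(q)

    acc = {find(part, q) for q in accept}
    qt = {(find(part, s), a): find(part, t) for (s, a), t in trans.items()}
    return qt, acc, find(part, 0)

def member(dfa, w):
    qt, acc, q = dfa
    for a in w:
        q = qt.get((q, a))
        if q is None:
            return False
    return q in acc
```

For example, rpni(["", "aa", "aaaa"], ["a", "aaa"]) recovers the two-state DFA for "even number of a's", which then correctly classifies words longer than any sample.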

Please contact the promotor for more details.


Implementation of a forward algorithm for reactive synthesis

Promotor: Emmanuel Filiot (efiliot@ulb.be)

Objective: The goal of synthesis is to automatically generate a program from a specification of the properties the program must fulfil, rather than of how it must fulfil them. This makes it possible to obtain programs which are correct *by construction*. In this master thesis, the goal is to implement a synthesis algorithm for programs which can be represented as Mealy machines, within the tool Acacia-Bonzai. Many of the sub-procedures necessary to implement this algorithm are already available in Acacia-Bonzai. As a preparatory step, the goal is to understand the theory behind this algorithm. Participation in the synthesis competition SYNTCOMP will be considered if the implementation is efficient.

Reference paper: Emmanuel Filiot, Naiyong Jin, Jean-François Raskin: Antichains and compositional algorithms for LTL synthesis. Formal Methods Syst. Des. 39(3): 261-296 (2011)

Reference on Acacia-Bonzai: https://arxiv.org/pdf/2204.06079.pdf

Please contact the promotor for more details.


Reactive synthesis from LTL specification and examples

Promotor: Jean-François Raskin (jraskin@ulb.ac.be)

Objective: The goal of this project is to synthesize Mealy machines that enforce LTL specifications against all possible behaviours of the environment, including hostile ones. In the variant studied here, the user provides the high-level LTL specification φ of the system to design, together with a set E of examples of executions that the solution must produce. The synthesis algorithm works in two phases. First, it generalizes the decisions taken along the examples in E using tailored extensions of automata learning algorithms, while preserving the realizability of φ. Second, the algorithm turns the (usually) incomplete Mealy machine obtained in the learning phase into a complete Mealy machine that realizes φ. The examples are thus used to guide the synthesis procedure. The goal of the master thesis is to extend the existing tool with a notion of symbolic example, where examples are given as Boolean formulas. The synthesis tool is then expected to synthesize symbolic Mealy machines instead of plain Mealy machines.

Reference: https://arxiv.org/abs/2301.10485

Please contact the promotor for more details.


Mixing formal methods and deep learning

Promotor: Jean-François Raskin (jraskin@ulb.ac.be)

Objective: In previous works (see below), we have studied how to efficiently combine formal methods, Monte Carlo Tree Search (MCTS), and deep learning in order to produce high-quality receding-horizon policies in large Markov decision processes (MDPs). In particular, we use model-checking techniques to guide the MCTS algorithm in order to generate offline samples of high-quality decisions on a representative set of states of the MDP. Those samples can then be used to train a neural network that imitates the policy used to generate them. This neural network can either be used as a guide in a lower-latency online MCTS search, or be used as a full-fledged policy when minimal latency is required. We use statistical model checking to detect when additional samples are needed, and to focus those additional samples on configurations where the learnt neural network policy differs from the (computationally expensive) offline policy. We illustrate the method on MDPs that model the Frozen Lake and Pac-Man environments, two popular benchmarks for evaluating reinforcement-learning algorithms.
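For readers unfamiliar with MCTS, the following is a minimal, generic UCT sketch (the names, the interface, and the toy environment in the usage example are ours; the actual tool combines this search loop with model-checking advice and learned guidance, which are not shown here).

```python
import math
import random

def uct_search(root, actions, step, rollout_len, iters=2000, c=1.4, seed=0):
    """Minimal UCT sketch for a deterministic environment.
    `step(state, a)` returns (next_state, reward); states must be hashable."""
    rng = random.Random(seed)
    N, Q = {}, {}                      # visit counts / total returns per (state, action)

    def rollout(s, depth):
        # Default policy: play uniformly at random to estimate a return.
        ret = 0.0
        for _ in range(depth):
            s, r = step(s, rng.choice(actions))
            ret += r
        return ret

    def simulate(s, depth):
        if depth == 0:
            return 0.0
        untried = [a for a in actions if (s, a) not in N]
        if untried:                    # expansion: try an unexplored action
            a = rng.choice(untried)
            s2, r = step(s, a)
            ret = r + rollout(s2, depth - 1)
        else:                          # selection: UCB1 over explored actions
            total = sum(N[(s, b)] for b in actions)
            a = max(actions, key=lambda b: Q[(s, b)] / N[(s, b)]
                    + c * math.sqrt(math.log(total) / N[(s, b)]))
            s2, r = step(s, a)
            ret = r + simulate(s2, depth - 1)
        N[(s, a)] = N.get((s, a), 0) + 1   # backpropagation
        Q[(s, a)] = Q.get((s, a), 0.0) + ret
        return ret

    for _ in range(iters):
        simulate(root, rollout_len)
    return max(actions, key=lambda a: N.get((root, a), 0))
```

On a toy chain environment where moving right yields reward 1 and moving left yields 0, the search correctly concentrates its visits on the rewarding action; symbolic advice, in the referenced work, biases exactly the selection and rollout phases of this loop.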

The master thesis will consist in applying those new techniques to a challenging new case study.

References:

Please contact the promotor for more details.


Using random forests to guide MCTS for solving large MDPs

Supervisors: Prof. Gianluca Bontempi and Jean-François Raskin

In previous research efforts (refer to the references below), we have explored the effective integration of formal methods, Monte Carlo Tree Search (MCTS), and deep learning to craft high-quality receding horizon policies for handling large Markov Decision Processes (MDPs). Specifically, we leverage model-checking techniques to steer the MCTS algorithm, enabling the generation of offline samples that embody high-quality decisions across a representative subset of the MDP’s states. These generated samples subsequently serve as the foundation for training a neural network that emulates the policy used in their creation.

This neural network assumes a dual role: it can function as a guiding agent for a lower-latency MCTS online search, or it can operate independently as a comprehensive policy when minimizing latency is a critical requirement. We employ statistical model checking as a means to identify situations where additional samples are necessary and to focus these supplementary samples on configurations where the learned neural network policy diverges from the computationally-intensive offline policy.

To exemplify the applicability of our approach, we conduct experiments on MDPs that simulate environments like Frozen Lake and Pac-Man—two well-established benchmarks for assessing reinforcement-learning algorithms.

The subject of this master thesis is to evaluate the replacement of neural networks by random forests of decision trees. In particular, we would like to compare their accuracy in predicting safe actions in the applications mentioned above. We also aim to compare the number of samples that random forests need, in comparison with neural networks, to generalise the decisions taken by the MCTS algorithm augmented with model-checking algorithms.

Elements of bibliography:

https://doi.org/10.1007/978-3-030-85172-9_13
https://drops.dagstuhl.de/opus/volltexte/2020/12852/
https://www.dropbox.com/s/b5id4ils4o442hb/StormMCTS.pdf?dl=0

Please contact the promotor for more details.


Generating data for training a neural network to solve instances of Robbins’ problem

Supervisor: Prof. Jean-François Raskin

In the field of probability theory, the problem of optimal stopping, known as Robbins’ problem in honor of Herbert Robbins, is a variation of the secretary problem. It falls under the broader category of problems related to minimizing the expected rank when full information is available.

Consider a sequence of independent and identically distributed random variables, denoted as X1, X2, …, Xn, with each variable uniformly distributed in the interval [0, 1]. Our objective is to sequentially observe these variables and select exactly one of them to stop the process. The key question is: What stopping rule should be employed to minimize the expected rank of the chosen observation, and what is the corresponding value of this rule?
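To get a feel for the problem, here is a small Monte Carlo sketch that estimates the expected rank achieved by a simple memoryless threshold rule. The increasing thresholds used below are purely illustrative and far from optimal; finding good stopping rules is exactly what makes the problem hard.

```python
import random

def expected_rank(n, thresholds, trials=20000, seed=0):
    """Estimate the expected rank obtained by the rule: stop at the first
    step i with X_i <= thresholds[i] (stopping at step n is forced).
    Rank 1 means the smallest of the n observations was chosen."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        stop = n - 1                       # forced stop at the last step
        for i, x in enumerate(xs):
            if x <= thresholds[i]:
                stop = i
                break
        total += sum(1 for x in xs if x <= xs[stop])   # rank of the choice
    return total / trials

n = 10
# Hypothetical increasing thresholds; any memoryless rule can be plugged in.
rank = expected_rank(n, [(i + 1) / n for i in range(n)])
```

Comparing against the rule "always take the first observation", whose expected rank is (n+1)/2, already shows the benefit of thresholds; the theoretical interest of Robbins’ problem lies in the gap between such memoryless rules and the unknown optimal (history-dependent) rule.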

It is worth noting that a general solution to this problem of minimizing the expected rank with full information remains elusive.

Within the scope of this thesis, the primary focus will be dedicated to the following objectives:

1) Investigating methodologies for generating data to be used in supervised learning, in order to train a neural network to solve Robbins’ problem.

2) Delving into the determination of an optimal network architecture that is apt for learning effective strategies to tackle the problem at hand.

3) Investigating methods to extend these approaches for handling larger values of n, and assessing the network’s performance in relation to the known theoretical results on this problem.

Elements of bibliography:

https://web.archive.org/web/20060830123936id_/http://www.math.ucla.edu/~tom/papers/exprank.pdf

This subject will be realised in collaboration with Prof. Thomas Bruss (from the mathematics department at ULB), who is a central contributor to the theoretical study of this problem.

Please contact the promotor for more details.


Using model-checking and Monte Carlo tree search algorithms to search for near-optimal strategies in Robbins’ problem

Supervisor: Prof. Jean-François Raskin

In the field of probability theory, the problem of optimal stopping, known as Robbins’ problem in honor of Herbert Robbins, is a variation of the secretary problem. It falls under the broader category of problems related to minimizing the expected rank when full information is available.

Consider a sequence of independent and identically distributed random variables, denoted as X1, X2, …, Xn, with each variable uniformly distributed in the interval [0, 1]. Our objective is to sequentially observe these variables and select exactly one of them to stop the process. The key question is: What stopping rule should be employed to minimize the expected rank of the chosen observation, and what is the corresponding value of this rule?

It is worth noting that a general solution to this problem of minimizing the expected rank with full information remains elusive.

The focus of this master’s thesis is to explore and apply precise computational methods alongside sampling techniques to determine optimal strategies for instances of the Robbins’ problem where the exact optimal value is currently unknown. This research will encompass the utilization of model-checking algorithms and the implementation of the Monte Carlo tree search algorithm. These computational tools will be harnessed to compute optimal or nearly optimal strategies for this intriguing problem. Consequently, this work will largely revolve around the adaptation and implementation of advanced search algorithms, which have not yet been fully leveraged in the context of this challenging problem.

Elements of bibliography:

https://web.archive.org/web/20060830123936id_/http://www.math.ucla.edu/~tom/papers/exprank.pdf

This subject will be realised in close collaboration with Prof. Thomas Bruss (from the mathematics department at ULB), who is a central contributor to the theoretical study of this problem.

Please contact the promotor for more details.


Neural networks and formal verification

In the following paper:

https://proceedings.mlr.press/v288/kresse25a.html

The authors study the verification of learning-based systems based on Logic Gate Networks (LGNs), a neural architecture that replaces arithmetic operations with Boolean logic gates, resulting in sparse, netlist-like models that are well suited to symbolic reasoning. They present a SAT-based encoding to verify global robustness and fairness properties of LGNs and evaluate the approach on five benchmark datasets, including a newly introduced 5-class variant. The results show that LGNs enable effective formal verification while preserving competitive predictive performance.
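As a toy illustration of the kind of property being checked, the sketch below verifies by brute force that a tiny hand-written gate network (a 3-input majority circuit, our own example) is robust to single-bit input flips on given samples. The paper replaces such enumeration by a SAT encoding, and its exact robustness and fairness definitions differ from this simplification.

```python
def majority(x):
    """A toy logic gate network: 3 Boolean inputs, AND/OR gates only."""
    a, b, c = x
    return (a & b) | (b & c) | (a & c)

def robustness_counterexample(net, samples):
    # Simplified robustness property: on each given sample, flipping any
    # single input bit must leave the network's output unchanged.
    for x in samples:
        y = net(x)
        for i in range(len(x)):
            flipped = list(x)
            flipped[i] ^= 1
            if net(tuple(flipped)) != y:
                return x, i            # witness: sample and flipped position
    return None
```

For instance, the unanimous inputs (1,1,1) and (0,0,0) are robust for the majority circuit, whereas (1,1,0) is not, since flipping its first bit changes the majority; a SAT solver finds such witnesses without enumerating inputs.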

The goal of the master thesis is to get acquainted with the challenges that neural networks pose for formal verification, and to apply the methodology described in the paper above to other case studies.

Please contact the promotor for more details.


AlphaGo Algorithms Augmented with Formal Methods

The game of Go was long considered a major challenge for artificial intelligence. In the paper

Mastering the game of Go with deep neural networks and tree search
https://www.nature.com/articles/nature16961

the authors introduce an algorithmic framework that combines deep learning with search. The approach relies on two neural components: a policy network, which proposes promising moves, and a value network, which estimates the expected outcome of a given position. These networks are trained in two stages. First, they incorporate prior human knowledge through supervised learning from expert games. Second, their performance is improved through reinforcement learning based on self-play. The trained networks are then integrated into a Monte Carlo Tree Search (MCTS) algorithm, where the policy network guides exploration and the value network replaces costly rollout simulations.

A complementary line of work is presented in the paper

https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2020.40

This work studies the online computation of strategies for Markov decision processes with the goal of optimizing the expected average reward. The strategy is computed using a receding-horizon approach combined with MCTS. The authors augment MCTS with symbolic advice, which biases the selection and simulation phases of the search while preserving the classical theoretical guarantees of MCTS. Symbolic advice is implemented using SAT and QBF solvers. The approach is illustrated on the game Pac-Man and shown to outperform both standard MCTS and human players.

The goal of this master’s thesis is to study and understand the methods presented in these two papers, and to explore how deep learning and formal methods can be combined to design efficient artificial agents. The main objective will be to apply this combined methodology to other games, such as Awalé (Oware), Connect Four, or other suitable games that fit the framework.

This topic can be taken by several students, as it naturally allows specialization by focusing on different games or methodological aspects.

Please contact the promotor for more details.


Neural network techniques to improve formal verification efficacy

In the following paper:

https://research.birmingham.ac.uk/en/publications/let-a-neural-network-be-your-invariant/

The authors of this paper address the verification of both safety and liveness properties of reactive systems, which together form a complete notion of functional correctness. They extend previous neural model checking approaches, which focused only on liveness through ranking functions, by introducing a unified neural certificate architecture that also represents inductive invariants for safety. Their method alternates between learning and formal checking to construct provably sound certificates for linear temporal logic specifications, and leverages constraint solvers to train the neural models more efficiently than gradient-based methods. Experimental results show significant performance improvements over state-of-the-art model checkers on safety, liveness, and combined verification tasks, while supporting richer specifications than earlier neural approaches.
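The safety half of the approach rests on the standard notion of an inductive invariant. On an explicit finite-state system, the three proof obligations can be checked directly, as in the small sketch below (the toy system and all names are ours; the paper discharges the same obligations with constraint solvers over neural certificates rather than by enumeration).

```python
def is_inductive_invariant(states, init, trans, inv, bad):
    """Check the three obligations that make `inv` an inductive proof of
    safety: initiation, consecution (closure under transitions), and
    exclusion of the bad states."""
    initiation  = all(inv(s) for s in init)
    consecution = all(inv(t) for s in states if inv(s) for t in trans(s))
    safety      = not any(bad(s) for s in states if inv(s))
    return initiation and consecution and safety

# Toy system: a counter modulo 8 that starts at 0 and always adds 2;
# the bad state is 3, and "the counter stays even" is an inductive invariant.
states = range(8)
trans = lambda s: [(s + 2) % 8]
bad = lambda s: s == 3
```

Note that a candidate such as "the counter stays below 6" fails: it holds initially but is not closed under transitions, which is exactly the kind of failure the learning loop in the paper uses to refine its neural certificate.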

The goal of the master thesis is to get acquainted with the techniques presented in that paper and to understand how these techniques can be applied to a new case study. A comparison with more classical BDD-based techniques will also be considered.

Please contact the promotor for more details.