Abstract
This work describes a new version of a previously published Python package — gym-saturation: a collection of OpenAI Gym environments for guiding saturation-style provers based on the given clause algorithm with reinforcement learning. We contribute usage examples with two different provers: Vampire and iProver. We also have decoupled the proof state representation from reinforcement learning per se and provided examples of using a known ast2vec Python code embedding model as a first-order logic representation. In addition, we demonstrate how environment wrappers can transform a prover into a problem similar to a multi-armed bandit. We applied two reinforcement learning algorithms (Thompson sampling and Proximal policy optimisation) implemented in Ray RLlib to show the ease of experimentation with the new release of our package.
This work has been supported by the French government, through the 3IA Côte d’Azur Investment in the Future project managed by the National Research Agency (ANR) with the reference numbers ANR-19-P3IA-0002.
You have full access to this open access chapter, Download conference paper PDF
Keywords
1 Introduction
This work describes a new version (0.10.0, released 2023.04.25) of a previously published [28] Python package — gym-saturationFootnote 1: a collection of OpenAI Gym [6] environments for guiding saturation-style provers (using the given clause algorithm) with reinforcement learning (RL) algorithms. The new version partly implements the ideas of our project proposal [29]. The main changes from the previous release (0.2.9, on 2022.02.26) are:
-
guiding two popular provers instead of a single experimental one (Sect. 3)
-
pluggable first-order logic formulae embeddings support (Sect. 4)
-
examples of experiments with different RL algorithms (Sect. 5)
-
following the updated Gymnasium [35] API instead of the outdated OpenAI Gym
gym-saturation works with Python 3.8+. One can install it by pip install gym-saturation or conda install -c conda-forge gym-saturation. Then, provided Vampire and/or iProver binaries are on PATH, one can use it as any other Gymnasium environment:
2 Related Work
Guiding provers with RL is a hot topic. Recent projects in this domain include TRAIL (Trial Reasoner for AI that Learns) [2], FLoP (Finding Longer Proofs) [37], and lazyCoP [26]. We will now compare the new gym-saturation features with these three projects.
Usually, one guides either a new prover created for that purpose (lazyCoP; FLoP builds on fCoP [14], an OCaml rewrite of older leanCoP [19]) or an experimental patched version of an existing one (TRAIL relies on a modified E [27]). Contrary to that, gym-saturation works with unmodified stable versions of Vampire [15] and iProver [10].
In addition, known RL-guiding projects are prover-dependent: FLoP could, in principle, work with both fCoP and leanCoP but reported only fCoP experiments. TRAIL claims to be reasoner-agnostic, but to our best knowledge, no one has tried it with anything but a patched E version it uses by default. [26] mentions an anonymous reviewer’s suggestion to create a standalone tool for other existing systems, but we are not aware of further development in this direction. Quite the contrary, we have tested gym-saturation compatibility with two different provers (Vampire and iProver).
Deep learning models expect their input to be real-valued tensors and not, for example, character strings in the TPTP [32] language. Thus, one always uses a representation (or embeddings) — a function mapping a (parsed) logic formula to a real vector. In lazyCoP and FLoP parts of embedding functions belong to the underlying provers, making it harder to vary and experiment with (e.g., one needs Rust or OCaml programming skills to do it). gym-saturation leaves the choice of representation open and supports any mapping from TPTP-formatted string to real vectors. The version described in this work also provides a couple of default options.
3 Architecture and Implementation Details
3.1 Architecture
gym-saturation is compatible with Gymnasium [35], a maintained fork of now-outdated OpenAI Gym standard of RL-environments, and passes all required environment checks. As a result of our migration to Gymnasium, its maintainers featured gym-saturation in a curated list of third-party environmentsFootnote 2.
Previously, gym-saturation guided an experimental pure Python prover [28] which happened to be too slow and abandoned in favour of existing highly efficient provers: Vampire and iProver.
Although the gym-saturation user communicates with both iProver and Vampire in the same manner, under the hood, they use different protocols. For Vampire, we relied on the so-called manual (interactive) clause selection mode implemented several years ago for an unrelated task [11]. In this mode, Vampire interrupts the saturation loop and listens to standard input for a number of a given clause instead of applying heuristics. Independent of this mode, Vampire writes (or not, depending on the option show_all) newly inferred clauses to its standard output. Using Python package pexpect, we attach to Vampire’s standard input and output, pass the action chosen by the agent to the former and read observations from the latter. In manual clause selection mode, Vampire works like a server awaiting a request with an action to which it replies (exactly what an environment typically does).
iProver recently added support of being guided by external agents. An agent has to be a TCP server satisfying a particular API specification. So, iProver behaves as a client which sends a request with observations to some server and awaits a reply containing an action. To make it work with gym-saturation, we implemented a relay server. It accepts a long-running TCP connection from a running iProver thread and stores its requests to a thread-safe queue, and pops a response to it from another such queue filled by gym-saturation thread. See Fig. 1 for a communication scheme.
3.2 Implementation Details
Clause Class. A clause is a Python data class having the following keys and respective values:
-
literals — a string of clause literals in the TPTP format, e.g. ‘member(X0,bb) | member(X0,b)’
-
label — a string label of a clause, e.g. ‘21’. Some provers (e.g. Vampire) use integer numbers for labelling clauses, but others (e.g. iProver) use an alphanumeric mixture (e.g. ‘c_54’)
-
role — a string description of a clause role in a proof (hypothesis, negated conjecture, axiom, et cetera)
-
inference_rule — a string name of an inference rule used to produce the clause. It includes not only resolution and superposition but also values like ‘axiom’ and ‘input’ (for theorem assumptions)
-
inference_parents — a tuple of clause labels if needed by the inference rule (‘axiom’ doesn’t need any, ‘factoring’ expects only one, ‘resolution’ — two, et cetera)
-
birth_step — an integer step number when the clause appeared in the proof state. Axioms, assumptions, and the negated conjecture have birth step zero.
All these fields except the birth_step (computed by the environment itself) are already available as separate entities (and not parts of TPTP-formatted strings) in iProver and Vampire output.
Environment Class
Observation is a Python dictionary with several keys:
-
real_obs is a tuple of all clauses (processed and unprocessed). It can be transformed to tensor representation by so-called observation wrappersFootnote 3. The gym-saturation provides several such wrappers for cases of external embeddings service or hand-coded feature extraction function
-
action_mask is a numpy [13] array of the size max_clauses (a parameter which one can set during the environment object instantiation) having a value 1.0 at index i if and only if a clause with a zero-based order number i currently exists and can be a given clause (e.g. not eliminated as redundant). All other values of action_mask are zeros. This array simplifies tensor operations on observation representations.
Limiting the total number of clauses in a proof state is a proxy of both random-access memory (each clause needs storage space) and time (a prover has to process each clause encountered) limits typical for the CASC [33] competition. One can add a standard Gymnasium time-limit wrapper to limit the number of steps in an episode. Setting wall-clock time and RAM limits is not typical for RL research.
Action is a zero-based order number of a clause from real_obs. If a respective action_mask is zero, an environment throws an exception during the execution of the step method.
Reward is 1.0 after a step if we found the refutation at this step and 0.0 otherwise. One can change this behaviour by either Gymnasium reward wrappers or by collecting trajectories in a local buffer and postprocessing them before feeding the trainer.
Episode is terminated when an empty clause $false appears in the proof state or if there are no more available actions.
Episode is truncated when there are more than max_clauses clauses in the proof state. Since the state is an (extendable) tuple, we don’t raise an exception when a prover generates a few more clauses.
Info dictionary is always empty at every step by default.
Render modes of the environment include two standard ones (‘human’ and ‘ansi’), the first one printing and the second one returning the same TPTP formatted string.
Multi-task Environment. The latest gym-saturation follows a Meta-World benchmark [36] style and defines set_task method with one argument — a TPTP problem full path. If one resets an environment without explicitly setting a task in advance, the environment defaults to a simple group theory problem (any idempotent element equals the identity). Having a default task helps us keep compatibility with algorithms not aware of multi-task RL. One can inherit from gym-saturation environment classes to set a random problem at every reset or implement any other desirable behaviour.
4 Representation Subsystem
4.1 Existing First-Order Formulae Representations and Related Projects
As mentioned in Sect. 2, to apply any deep reinforcement learning algorithm, one needs a representation of the environment state in a tensor form first. There are many known feature engineering procedures. It can be as simple as clause age and weight [25], or information extracted from a clause syntax tree [18] or an inference lineage of a clause [30]. Representing logic formulae as such is an active research domain: for example, in [23], the authors proposed more than a dozen different embedding techniques based on formulae syntax. In communities other than automated deduction, researchers also study first-order formulae representation: for example, in [5], the authors use semantics representation rather than syntax. One can also notice that first-order logic (FOL) is nothing more than a formal language, so abstract syntax trees of FOL are not, in principle, that different from those of programming language statements. And of course, encoding models for programming languages (like code2vec [4] for Java) exist, as well as commercially available solutions as GPT-3 [7] generic code embeddings and comparable free models like LLaMA [34].
To make the first step in this direction, we took advantage of existing pre-trained embedding models for programming languages and tried to apply them to a seemingly disconnected domain of automated provers.
4.2 ast2vec and Our Contributions to It
In [20], the authors proposed a particular neural network architecture they called Recursive Tree Grammar Autoencoders (RTG-AE), which encodes abstract syntax trees produced by a programming language parser into real vectors. Being interested in education applications, they also published the pre-trained model for Python [21]. To make use of it for our purpose, we furnished several technical improvements to their code (our contribution is freely availableFootnote 4):
-
a TorchServe [24] handler for HTTP POST requests for embeddings
-
request caching with the Memcached server [9]
-
Docker container to start the whole subsystem easily on any operating system
To integrate the ast2vec server with gym-saturation environments, we added Gymnasium observation wrappers, one of them mapping a clause in the TPTP language to a boolean-valued statement in Python (in particular, by replacing logic operation symbols, e.g. = in TPTP becomes == in Python). See Fig. 2 for a communication diagram. In principle, since a clause doesn’t contain any quantifiers explicitly, one can rewrite it as a boolean-valued expression in many programming languages for which pre-trained embeddings might exist.
4.3 Latency Considerations
Looking at Fig. 2, one might wonder how efficient is such an architecture. The average response time observed in our experiments was 2 ms (with a 150 ms maximum). A typical natural language processing model which embeds whole texts has a latency from 40 ms to more than 600 ms [17] (depending on the model complexity and the length of a text to embed) when run on CPU, so there is no reason to believe that ast2vec is too slow. When evaluating a prover, one usually fixes the time limit: for example, 60 s is the default value for Vampire. Being written in C++ and with a cornucopia of optimisation tweaks, Vampire can generate around a million clauses during this relatively short timeframe. Thus, to be on par with Vampire, a representation service must have latency around 60 \(\upmu \)s (orders of magnitude faster than we have). There can be several ways to lower the latency:
-
inference in batches (one should train the embedding model to do it; ast2vec doesn’t do it out of the box). The improvement may vary
-
use GPU. NVIDIA reports around 20x improvement vs CPU [16]. However, throwing more GPUs won’t be as efficient without batch inference from the previous point
-
request an embedding for a binary object of an already parsed clause instead of a TPTP string. It means not repeating parsing already done by a prover, which might lower the latency substantially. To do this, one will have to patch an underlying prover to return binary objects instead of TPTP strings
-
use RPC (remote procedure call) instead of REST protocol. TorchServe relies on REST and parcels in JSON format, and in gRPC [12], they prefer the binary protobuf format. One rarely expects sub-millisecond latency from REST, although for RPC, 150 \(\upmu \)s is not unusual. This point doesn’t make much sense without the previous one
5 Usage Examples
We provide examples of experiments easily possible with gym-saturation as a supplementary code to this paperFootnote 5. We don’t consider these experiments as being of any scientific significance per se, serving merely as illustrations and basic usage examples. Tweaking the RL algorithms’ meta-parameters and deep neural network architectures is out of the scope of the present system description.
We coded these experiments in the Ray framework, which includes an RLlib — a library of popular RL algorithms. The Ray is compatible with Tensorflow [1] and PyTorch [22] deep learning frameworks, so it doesn’t limit a potential gym-saturation user by one.
In the experiments, we try to solve SET001-1 from the TPTP with max_clauses=20 (having no more than twenty clauses in the proof state) for guiding Vampire and max_clauses=15 for iProver. This difference is because even a random agent communicating to iProver manages to always solve SET001-1 by generating no more than twenty clauses. We wanted training to start, but keep the examples as simple as possible, so we chose to harden the constraints instead of moving on to a more complicated problem.
In one experiment, we organise clauses in two priority queues (by age and weight) and use an action wrapper to map from a queue number (0 or 1) to the clause number. That means we don’t implant these queues inside provers but follow a Gymnasium idiomatic way to extend environments. Of course, Vampire and iProver have these particular queues as part of their implementation, but our illustration shows one could use any other priorities instead. It transforms our environment into a semblance of a 2-armed bandit, and we use Thompson sampling [3] to train. This experiment reflects ideas similar to those described in [31].
In another experiment, we use ast2vec server for getting clause embeddings and train a Proximal Policy Optimisation (PPO) algorithm as implemented in the Ray RLlib. The default policy network there is a fully connected one, and we used \(256\times 20\) tensors as its input (256 is an embedding size in ast2vec, and 20 is the maximal number of clauses we embed). So, the policy chooses a given clause given the embeddings of all clauses seen up to the current step (including those already chosen or judged to be redundant/subsumed). Such an approach is more similar to [37].
We provide Fig. 3 as a typical training process chart.
6 Conclusion and Future Work
We contributed a new version of gym-saturation, which continued to be free and open-source software, easy to install and use while promising assistance in setting up experiments for RL research in the automated provers domain. In the new version, we enabled anyone interested to conduct experiments with RL algorithms independently of an underlying prover implementation. We also added the possibility of varying representations as external plug-ins for further experimentation. We hope that researchers having such an instrument can focus on more advanced questions, namely how to generate and prioritise training problems to better transfer search patterns learned on simpler theorems to harder ones.
Our experience with adding Vampire and iProver support to gym-saturation shows that working tightly with corresponding prover developers is not mandatory, although it might help immensely. Implementing the prover guidance through the standard I/O (as in Vampire) seems to be relatively easy, and we hope more provers will add similar functionality in future to be more ML-friendly. Such provers could then profit from using any other external guidance (see [8] for a different system using the same iProver technical features as we did).
We identify a discerning and computationally efficient representation service as a bottleneck for our approach and envision an upcoming project of creating a universal first-order logic embedding model usable not only by saturation-style provers but also tableaux-based ones, SMT-solvers, semantic reasoners, and beyond.
References
Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). https://www.tensorflow.org/. Software available from tensorflow.org
Abdelaziz, I., et al.: Learning to guide a saturation-based theorem prover. IEEE Trans. Pattern Anal. Mach. Intell. 45(1), 738–751 (2023). https://doi.org/10.1109/TPAMI.2022.3140382
Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: Dasgupta, S., McAllester, D. (eds.) Proceedings of the 30th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 28, pp. 127–135. PMLR, Atlanta, Georgia, USA (17–19 Jun 2013). https://proceedings.mlr.press/v28/agrawal13.html
Alon, U., Zilberstein, M., Levy, O., Yahav, E.: Code2Vec: learning distributed representations of code. Proceed. ACM Programm. Lang. 3(POPL), 1–29 (2019). https://doi.org/10.1145/3290353
Ballout, A., da Costa Pereira, C., Tettamanzi, A.G.B.: Learning to classify logical formulas based on their semantic similarity. In: Aydoğan, R., Criado, N., Lang, J., Sanchez-Anguix, V., Serramia, M. (eds.) PRIMA 2022: Principles and Practice of Multi-Agent Systems, pp. 364–380. PRIMA 2022. LNCS, vol. 13753. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-21203-1_22
Brockman, G., et al.: OpenAI Gym. arXiv (2016).https://doi.org/10.48550/arXiv.1606.01540
Brown, T.B., et al.: Language models are few-shot learners. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. NIPS2020, Curran Associates Inc., Red Hook, NY, USA (2020). https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
Chvalovský, K., Korovin, K., Piepenbrock, J., Urban, J.: Guiding an instantiation prover with graph neural networks. In: Piskac, R., Voronkov, A. (eds.) Proceedings of 24th International Conference on Logic for Programming, Artificial Intelligence and Reasoning. EPiC Series in Computing, vol. 94, pp. 112–123. EasyChair (2023). https://doi.org/10.29007/tp23. https://easychair.org/publications/paper/5z94
Danga Interactive Inc: Memcached (2023). https://github.com/memcached/memcached
Duarte, A., Korovin, K.: Implementing superposition in iProver (system description). In: Peltier, N., Sofronie-Stokkermans, V. (eds.) IJCAR 2020. LNCS (LNAI), vol. 12167, pp. 388–397. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51054-1_24
Gleiss, B., Kovács, L., Schnedlitz, L.: Interactive visualization of saturation attempts in vampire. In: Ahrendt, W., Tapia Tarifa, S.L. (eds.) IFM 2019. LNCS, vol. 11918, pp. 504–513. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34968-4_28
gRPC authors: gRPC - An RPC library and framework (2023). https://github.com/grpc/grpc
Harris, C.R., et al.: Array programming with NumPy. Nature 585(7825), 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2
Kaliszyk, C., Urban, J., Vyskočil, J.: Certified connection tableaux proofs for HOL light and TPTP. In: Proceedings of the 2015 Conference on Certified Programs and Proofs, pp. 59–66. CPP 2015, Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2676724.2693176
Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 1–35. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39799-8_1
Mukherjee, P., Weill, E., Taneja, R., Onofrio, D., Ko, Y.J., Sharma, S.: Real-time natural language understanding with BERT using TensorRT (2019). https://developer.nvidia.com/blog/nlu-with-tensorrt-bert/
Nguyen, V., Srihari, N., Chadha, P., Chen, C., Lee, J., Rodge, J.: Optimizing T5 and GPT-2 for real-time inference with NVIDIA TensorRT (2021). https://developer.nvidia.com/blog/optimizing-t5-and-gpt-2-for-real-time-inference-with-tensorrt/
Olsák, M., Kaliszyk, C., Urban, J.: Property invariant embedding for automated reasoning. In: Giacomo, G.D. et al. (eds.) ECAI 2020–24th European Conference on Artificial Intelligence. Frontiers in Artificial Intelligence and Applications, vol. 325, pp. 1395–1402. IOS Press (2020). https://doi.org/10.3233/FAIA200244
Otten, J., Bibel, W.: leanCoP: lean connection-based theorem proving. J. Symb. Comput. 36(1), 139–161 (2003). https://doi.org/10.1016/S0747-7171(03)00037-3. First Order Theorem Proving
Paaßen, B., Koprinska, I., Yacef, K.: Recursive tree grammar autoencoders. Mach. Learn. 111, 3393–3423 (2022). https://doi.org/10.1007/s10994-022-06223-7
Paassen, B., McBroom, J., Jeffries, B., Koprinska, I., Yacef, K.: Mapping python programs to vectors using recursive neural encodings. J. Educ. Data Min. 13(3), 1–35 (2021). https://doi.org/10.5281/zenodo.5634224. https://jedm.educationaldatamining.org/index.php/JEDM/article/view/499
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc. (2019). https://proceedings.neurips.cc/paper_files/paper/2019/file/bdbca288fee7f92f2bfa9f7012727740-Paper.pdf
PurgaŁ, S., Parsert, J., Kaliszyk, C.: A study of continuous vector representations for theorem proving. J. Logic Comput. 31(8), 2057–2083 (2021). https://doi.org/10.1093/logcom/exab006
PyTorch serve contributors: TorchServe (2023). https://github.com/pytorch/serve
Rawson, M., Reger, G.: Old Or heavy? Decaying gracefully with age/weight shapes. In: Fontaine, P. (ed.) CADE 2019. LNCS (LNAI), vol. 11716, pp. 462–476. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29436-6_27
Rawson, M., Reger, G.: lazyCoP: lazy paramodulation meets neurally guided search. In: Das, A., Negri, S. (eds.) TABLEAUX 2021. LNCS (LNAI), vol. 12842, pp. 187–199. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86059-2_11
Schulz, S., Cruanes, S., Vukmirović, P.: Faster, higher, stronger: E 2.3. In: Fontaine, P. (ed.) CADE 2019. LNCS (LNAI), vol. 11716, pp. 495–507. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29436-6_29
Shminke, B.: Gym-saturation: an OpenAI Gym environment for saturation provers. J. Open Source Softw. 7(71), 3849 (2022). https://doi.org/10.21105/joss.03849
Shminke, B.: Project proposal: a modular reinforcement learning based automated theorem prover. arXiv (2022). https://doi.org/10.48550/ARXIV.2209.02562
Suda, M.: Improving ENIGMA-style clause selection while learning from history. In: Platzer, A., Sutcliffe, G. (eds.) CADE 2021. LNCS (LNAI), vol. 12699, pp. 543–561. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-79876-5_31
Suda, M.: Vampire getting noisy: will random bits help conquer chaos? (System description). In: Blanchette, J., Kovács, L., Pattinson, D. (eds.) Automated Reasoning. IJCAR 2022. LNCS, vol. 13385, pp. 659–667. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-10769-6_38
Sutcliffe, G.: The TPTP problem library and associated infrastructure - from CNF to TH0, TPTP v6.4.0. J. Autom. Reason. 59(4), 483–502 (2017). https://doi.org/10.1007/s10817-017-9407-7
Sutcliffe, G.: The 10th IJCAR automated theorem proving system competition - CASC-J10. AI Commun. 34(2), 163–177 (2021). https://doi.org/10.3233/AIC-201566
Touvron, H., et al.: LLaMA: open and efficient foundation language models. arXiv (2023). https://doi.org/10.48550/arXiv.2302.13971
Towers, M., et al.: Gymnasium (2023). https://doi.org/10.5281/zenodo.8127026
Yu, T., et al.: Meta-world: a benchmark and evaluation for multi-task and meta reinforcement learning. In: Kaelbling, L.P., Kragic, D., Sugiura, K. (eds.) Proceedings of the Conference on Robot Learning. Proceedings of Machine Learning Research, vol. 100, pp. 1094–1100. PMLR (30 Oct-01 Nov 2020). https://proceedings.mlr.press/v100/yu20a.html
Zombori, Z., Csiszárik, A., Michalewski, H., Kaliszyk, C., Urban, J.: Towards finding longer proofs. In: Das, A., Negri, S. (eds.) TABLEAUX 2021. LNCS (LNAI), vol. 12842, pp. 167–186. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86059-2_10
Acknowledgements
We would like to thank Konstantin Korovin for the productive discussion and for adding the external agents’ communication feature to iProver, without which this work won’t be possible. We also thank anonymous reviewers for their meticulous suggestions on improving the present paper.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Shminke, B. (2023). gym-saturation: Gymnasium Environments for Saturation Provers (System description). In: Ramanayake, R., Urban, J. (eds) Automated Reasoning with Analytic Tableaux and Related Methods. TABLEAUX 2023. Lecture Notes in Computer Science(), vol 14278. Springer, Cham. https://doi.org/10.1007/978-3-031-43513-3_11
Download citation
DOI: https://doi.org/10.1007/978-3-031-43513-3_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43512-6
Online ISBN: 978-3-031-43513-3
eBook Packages: Computer ScienceComputer Science (R0)