\n\n## Overview\n\n
\n\n
Disclaimer: This page is specific to the search of BB(5), which has been completed as of July 2nd 2024.
\n\nWith the Busy Beaver Challenge we want to decide the halting problem of all 5-state Turing machines (from all-0 tape). That way we will learn BB(5), the 5th busy beaver value. See
Story.\n\nIn order to achieve this goal we need to analyse the behavior of every single 5-state Turing machine. We quickly run into a problem: there are roughly 16 trillion 5-state Turing machines (16,679,880,978,201 to be exact).\n\nThankfully most of this space is not _useful_ to us and only a fraction needs to be studied in order to find BB(5). For instance, there are 24 ways to permute the states (aside from the start state) of a machine and 2 ways to permute the move directions, none of which changes the machine's behavior, hence only one of these 48 machines needs to be studied.\n\nHence, we aim at _sparsely_ enumerating the space of 5-state Turing machines: that is, enumerating as few machines as possible while still covering every machine that must be studied in order to find BB(5).\n\n\n\n### Phase 1, phase 2\n\nThe method that we present to sparsely enumerate the space of 5-state Turing machines and analyse their behavior is fundamentally inspired by [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html) with some notable differences that we will outline. The first difference is that our method is divided into two successive and independent phases:\n\n1. **Phase 1: seed database.** Sparsely enumerate the space of 5-state Turing machines and mark as **undecided** any machine that exceeded the set [time or space limits](#time-and-space-limits). Undecided machines are put in the [seed database](#seed-database) which _seeds_ the Busy Beaver Challenge.\n\n2. **Phase 2: deciders.** Write independent [deciders](#deciders), i.e. programs that will decide the behavior of families of machines in the seed database.\n\n\n**Phase 1** was completed in December 2021:\n\n- it enumerated 125,479,953 Turing machines in 30 hours (splitting the task among several computers in parallel). See these [metrics](#metrics) for more.\n- it marked **88,664,064** machines as undecided and they are stored in the [seed database](#seed-database). We refer to undecided 5-state machines by their index in the seed database (e.g. Machine #7,410,754).\n\nAlthough **Phase 1** of the project was completed, it needs to be reproduced independently in order to confirm its results and increase trust. 
See Contribute.\n\n**Phase 2** started in January 2022 and you are invited to write your own deciders for the remaining (or yet-unknown) families and to reproduce or verify existing ones! See Contribute.\n\n#### Currently applied deciders\n\nCurrently applied deciders are [listed on the forum](https://discuss.bbchallenge.org/t/currently-applied-deciders/32/3) and you are invited to Contribute.\n\n\n\n### Why two phases?\n\nPhase 2's deciders could be integrated into the enumeration algorithm of phase 1 in order to mark far fewer than 88,664,064 machines as undecided to begin with. This is the approach taken by [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html).\n\nHowever, we strongly advocate for our model, where responsibility is split into two independent phases, because:\n\n1. Splitting responsibilities yields shorter and easier to verify/test/debug code for both phases. In particular, it is very important to establish trust in the seed database of undecided machines, hence the simpler the code that generates it, the better.\n\n2. Some deciders require a lot more resources than others and might only be relevant to a very small and targeted subset of machines. Hence we don't want to execute them on all enumerated machines, which would considerably slow down the enumeration process.\n\nOur approach provides modularity and hopefully facilitates reproducibility, peer reviewing, and trust in the final result.\n\n\n\n## Seed database\n\nThe Busy Beaver Challenge is based on a [downloadable](#download) seed database of 88,664,064 undecided 5-state machines which was constructed during [phase 1](#phase-1-phase-2) of the project, completed in December 2021. You are more than invited to reproduce this result, see Contribute.\n\nThe code to construct the seed database is available at [https://github.com/bbchallenge/bbchallenge-seed](https://github.com/bbchallenge/bbchallenge-seed). 
This code is open source and was built with readability and concision in mind: it \"only\" consists of 675 lines of Go and 105 lines of C and is unit tested. See our reproducibility and verifiability statement.\n\nThis is to be compared to the unpublished ≈8000 lines of C reported by [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html#Enumeration) or the ≈6000 uncommented lines of Pascal of [https://skelet.ludost.net/bb/nreg.html](https://skelet.ludost.net/bb/nreg.html) and justifies our clear separation between [phase 1 and phase 2](#phase-1-phase-2) in this project.\n\nThe main aim of the Busy Beaver Challenge is to decide every machine in the seed database.\n\n\n\n### Construction\n\nThe algorithm that we implement to sparsely enumerate the space of 5-state Turing machines is a variation of [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html#Enumeration) but the core idea is the same.\n\nThe algorithm recursively constructs the tree of 5-state Turing machines starting from the following common ancestor:\n\n\n
\n\n| | 0 | 1 |\n| --- | --- | --- |\n| A | 1RB | --- |\n| B | --- | --- |\n| C | --- | --- |\n| D | --- | --- |\n| E | --- | --- |\n\n
\n
\n\nEach machine (i.e. node of the tree) is simulated until either:\n\n1. [time or space limits](#time-and-space-limits) are met\n2. the machine exceeds BB(4) = 107 time steps while having visited only 4 states out of 5\n3. an undefined transition is met\n\nIn **case 1.** the machine is marked as **undecided** and is inserted in the seed database. Note that introducing the idea of [a space limit](#bbspace) is novel compared to [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html#Enumeration). We conjecture that BB_SPACE(5) = 12,289.\n\nIn **case 2.** the machine is marked as **non-halting**, see Story for more details on BB(4).\n\nIn **case 3.** there are naïvely 2 × 2 × 5 = 20 choices (symbol to write, direction to move, next state) for the undefined transition that was encountered. This number of choices is reduced by imposing an order on the set of states, which avoids visiting machines that are the same up to renaming of the states (_isomorphic machines_). Further pruning methods are implemented to discard redundant machines. The algorithm is then applied recursively to each machine equipped with its new transition.\n\nComplete pseudo-code and details of the construction are available [on the forum](https://discuss.bbchallenge.org/c/seed-database/6).\n\nThanks to (a) using a [space limit](#bbspace), (b) using low level code for the simulation algorithm and (c) using 2021's computers we do not need to burden the algorithm's code with simulation speed-ups as in [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html#Acceleration).\n\n\n\n#### Time and space limits\n\nDuring the enumeration algorithm we need a criterion to stop simulating machines that have been running for too long and mark them as **undecided**. 
We use the conjectured value of BB(5) = 47,176,870 steps as a cut-off time limit [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html) [[Aaronson, 2020]](https://www.scottaaronson.com/papers/bb.pdf).\n\nWe also introduce the idea of a space limit. The busy beaver value is traditionally concerned with time only, but we can ask an analogous question about **space**: \"what is the maximum number of memory cells that a 5-state machine can visit before halting?\".\n\n\n\n#### BB_SPACE\n\nWe define BB_SPACE:\n\n\n
\nBB_SPACE(n) = maximum number of memory cells visited by a halting Turing machine with n states starting from the all-0 memory tape\n
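To make this definition concrete, here is a minimal sketch that simulates a machine from the all-0 tape and reports how many steps it ran and how many cells it visited, stopping early when a time or space limit is exceeded. It is an illustration only, not the project's enumeration code: the function name `simulate`, its return format, and the choice to count the halting (undefined) transition as one step are assumptions made here. The machine is given in the string form that the API section of this page shows (for instance `1RB1LC_...`), where `---` is an undefined transition.

```python
def simulate(machine_code, time_limit=47_176_870, space_limit=12_289):
    """Run a machine from the all-0 tape.

    Returns (status, steps, visited_cells). Assumption of this sketch:
    reaching an undefined transition counts as one (halting) step.
    """
    rows = machine_code.split("_")  # one row per state: A, B, C, ...
    tape = {}                       # sparse tape, missing cells hold 0
    pos, state = 0, 0               # head position, current state (A = 0)
    lo = hi = 0                     # leftmost / rightmost visited cell
    steps = 0
    while True:
        symbol = tape.get(pos, 0)
        transition = rows[state][3 * symbol:3 * symbol + 3]
        steps += 1
        if transition == "---":     # undefined transition: the machine halts
            return "halt", steps, hi - lo + 1
        tape[pos] = int(transition[0])
        pos += 1 if transition[1] == "R" else -1
        state = ord(transition[2]) - ord("A")
        lo, hi = min(lo, pos), max(hi, pos)
        # The head moves one cell at a time, so visited cells form [lo, hi].
        if hi - lo + 1 > space_limit:
            return "undecided (space)", steps, hi - lo + 1
        if steps >= time_limit:
            return "undecided (time)", steps, hi - lo + 1


# A 2-state example, chosen only because it is easy to check by hand.
print(simulate("1RB1LB_1LA---"))  # → ('halt', 6, 4)
```

With the default limits, a machine is reported as undecided as soon as it has visited more than 12,289 cells or has run 47,176,870 steps without halting, which mirrors the two cut-offs discussed in this section.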
\n\nNote that BB_SPACE **is not** Rado's function, which is the maximum number of 1s on the final tape of an n-state halting Turing machine started from the all-0 tape [[Rado, 1962]](https://cs.famaf.unc.edu.ar/~hoffmann/cc18/Rado-On-non-computable.pdf).\n\nWe conjecture:\n\n\nBB_SPACE(5) = 12,289\n
\n\nThis is the number of memory cells visited by the 5-state busy beaver time champion.\n\nIt turns out that BB_SPACE(5) is a much more practical cut-off to use in the enumeration algorithm than BB(5), as many more machines will visit more than 12,289 memory cells before they exceed 47,176,870 time steps.\n\nNote that if our conjecture is false, i.e. if BB_SPACE(5) > 12,289, the true BB_SPACE winner is necessarily in the seed database and should hopefully be discovered through the effort of deciding the database. The same holds if BB(5) > 47,176,870.\n\n\n\n#### Metrics\n\nThe enumeration algorithm was run in December 2021 and here are some metrics about the enumerated space of 5-state Turing machines:\n\n| | # machines | # machines | # machines |\n| ---------------------------------------- | ----------- | ---------- | ---------- |\n| total halting (undefined transition met) | | 34,104,723 | |\n| total non-halting (using BB(4)) | | 2,711,166 | |\n| total pruned | | 944,579 | |\n| total undecided (time limit) | | | 14,322,029 |\n| total undecided (space limit) | | | 74,342,035 |\n| total undecided | | 88,664,064 | |\n| total enumerated | 126,424,532 | | |\n\n\n\n### Download\n\n\n\n#### Mirrors\n\nThe seed database of 88,664,064 undecided 5-state machines is available for download at:\n\n- http://docs.bbchallenge.org/all_5_states_undecided_machines_with_global_header.zip\n- [ipfs://QmcgucgLRjAQAjU41w6HR7GJbcte3F14gv9oXcf8uZ8aFM](ipfs://QmcgucgLRjAQAjU41w6HR7GJbcte3F14gv9oXcf8uZ8aFM)\n\nThe zipped database is 243M and approx 2G unzipped. Each machine is encoded on 30 bytes and the first 30 bytes consist of a reserved header, see [format](#format).\n\nDatabase shasum:\n\n1. zipped: `2576b647185063db2aa3dc2f5622908e99f3cd40`\n2. unzipped: `e57063afefd900fa629cfefb40731fd083d90b5e`\n\nYou are welcome to host the database on your own mirror (as long as the shasum is preserved), see\nContribute.\n\n\n\n### Format\n\nThe database is a binary file where each machine is described on 30 bytes. 
It starts with a 30-byte reserved **header** which currently contains the following information (first 13 bytes):\n\n1. `14,322,029`: number of undecided machines that exceeded the 47M steps time limit. 4-byte big endian integer\n2. `74,342,035`: number of undecided machines that exceeded the 12k cells space limit. 4-byte big endian integer\n3. `88,664,064`: total number of undecided machines. 4-byte big endian integer\n4. `1`: a boolean indicating that the machines were lexicographically sorted. The first 14,322,029 undecided machines (47M time limit exceeded) were lexicographically sorted independently of the next 74,342,035 undecided machines (12k space limit exceeded). 1-byte boolean\n\nThen, each machine is encoded on 30 bytes. First come the `14,322,029` machines that exceeded the time limit and then the `74,342,035` machines that exceeded the space limit, see [time and space limits](#time-and-space-limits). These two sets of machines are both lexicographically sorted.\n\nThe 30-byte encoding of a 5-state Turing machine is better understood with an example, for instance with machine [#7,103,458](https://bbchallenge.org/7103458) of the database:\n\n\n
\n\n| | 0 | 1 |\n| --- | --- | --- |\n| A | 1RB | 0LD |\n| B | 0LC | 1LE |\n| C | 1LD | 1LC |\n| D | 0RA | --- |\n| E | 1RB | 1RE |\n\n
\n
\n\nThe machine is encoded using the following 30-byte array, with R=0 and L=1:\n\n```\n[1,R,2, 0,L,4,\n 0,L,3, 1,L,5,\n 1,L,4, 1,L,3,\n 0,R,1, 0,0,0,\n 1,R,2, 1,R,5]\n```\n\nNote that states are indexed starting at A=1 as the state value 0 is used to encode undefined transitions. The write and direction bytes of undefined transitions are set to 0 as well.\n\n#### Use the database\n\nHere are some routines that you can use to extract machines from the database:\n\n_Python_\n\n```python\ndef get_header(machine_db_path):\n    # The header is the first 30 bytes of the file.\n    with open(machine_db_path, \"rb\") as f:\n        return f.read(30)\n\ndef get_machine_i(machine_db_path, i, db_has_header=True):\n    # Machines are 0-indexed; skip one 30-byte slot if a header is present.\n    with open(machine_db_path, \"rb\") as f:\n        c = 1 if db_has_header else 0\n        f.seek(30*(i+c))\n        return f.read(30)\n```\n\nMore Python utils at [https://github.com/bbchallenge/bbchallenge-py/](https://github.com/bbchallenge/bbchallenge-py/)\n\n_Go_\n\n```go\nimport \"errors\"\n\n// TM is the 30-byte encoding of one machine.\ntype TM [30]byte\n\n// GetMachineI returns the i-th machine (0-indexed) of the database.\nfunc GetMachineI(db []byte, i int, hasHeader bool) (tm TM, err error) {\n    offset := 0\n    if hasHeader {\n        offset = 1 // skip the 30-byte header\n    }\n\n    // The slice below requires 30*(i+offset+1) <= len(db).\n    if i < 0 || i+offset >= len(db)/30 {\n        return tm, errors.New(\"invalid db index\")\n    }\n\n    copy(tm[:], db[30*(i+offset):30*(i+offset+1)])\n    return tm, nil\n}\n```\n\nMore Go utils at [https://github.com/bbchallenge/bbchallenge-go/](https://github.com/bbchallenge/bbchallenge-go/)\n\n\n\n### API\n\nYou can also query the database through the following API:\n\n```\nGET https://api.bbchallenge.org/machine/\n```\n\nFor instance, [https://api.bbchallenge.org/machine/12345678](https://api.bbchallenge.org/machine/12345678) will return:\n\n```json\n{\n\t\"machine_code\": \"1RB1LC_1RC1RC_1LB1RD_1LA1LE_---0RA\",\n\t\"machine_id\": 12345678,\n\t\"status\": \"decided\"\n}\n```\n\n- The field \"machine_code\" is the string representation of the machine.\n\n- The field \"machine_id\" is the ID that you queried.\n\n- The field \"status\" keeps track of whether the machine has been decided by one of the [deciders](#deciders) that have been applied 
to the database. The goal is for all machines to become \"decided\".\n\nYou can additionally use the API to query which decider, if any, has decided a particular machine:\n\n```\nGET https://api.bbchallenge.org/machine//decider\n```\n\nContinuing the previous example, [https://api.bbchallenge.org/machine/12345678/decider](https://api.bbchallenge.org/machine/12345678/decider) will return:\n\n```json\n{\n\t\"decider_file\": \"cyclers-run-11c0ef00e9c2-time-1000-space-500-minIndex-0-maxIndex-14322029\"\n}\n```\n\nTherefore, this machine was decided by the decider for [Cyclers](link).\n\n\n\n## Deciders\n\n\n\n### Definition\n\nA decider is a program that outputs `true` if it is able to tell whether a given machine halts or not. Deciders are applied to the machines of the [seed database](#seed-database) in order to reduce the number of undecided machines from 88,664,064 to 0.\n\nWe expect that almost all machines of the seed database do not halt, hence deciders are primarily focused on deciding that machines do not halt.\n\nTo be trusted, a decider should be accompanied by a proof of correctness which certifies that the machines that it recognises do not halt. The decider's code should also be tested on a significant number of example and counterexample machines, see Contribute.\n\nDeciders are closely related to the zoology of 5-state machines as we aim to decide each family of the zoo. 
For instance:\n\n- 11,229,238 _Cyclers_, such as Machine #123, were decided by the [decider for cyclers](https://discuss.bbchallenge.org/t/decider-cyclers/33).\n\n- 73,860,604 _Translated Cyclers_, such as Machine #59,090,563, were decided by the [decider for translated cyclers](https://discuss.bbchallenge.org/t/decider-translated-cyclers/34).\n\nDeciders are not _necessarily_ directly connected to a family of the zoology; a good example is [the decider for Backward Reasoning](https://discuss.bbchallenge.org/t/decider-backward-reasoning/35), a notion developed in [[Marxen and Buntrock, 1990]](http://turbotm.de/~heiner/BB/mabu90.html#Nontermination).\n\nWriting, testing and proving deciders is a collaborative task, see [the decider section of the forum](https://discuss.bbchallenge.org/c/deciders/5), and you are invited to Contribute.\n\nTrusted deciders are applied to the seed database.\n\n\n\n### Undecided machines index file\n\n[Download the index file](https://github.com/bbchallenge/bbchallenge-undecided-index/) containing all the indices in the [seed database](#seed-database) of the currently undecided machines (i.e. machines that remain undecided after all trusted [deciders](#deciders) are applied).\n\nMachines' indices are stored in order as 4-byte big endian integers. 
Here are some routines to extract these indices from an index file:\n\n_Python_\n\n```python\nimport os\n\ndef get_indices_from_index_file(index_file_path):\n    index_file_size = os.path.getsize(index_file_path)\n\n    machines_indices = []\n    with open(index_file_path, \"rb\") as f:\n        # Each index is one 4-byte big endian integer.\n        for i in range(index_file_size//4):\n            chunk = f.read(4)\n            machines_indices.append(int.from_bytes(chunk, byteorder=\"big\"))\n\n    return machines_indices\n```\n\n_Go_\n\n```go\nimport (\n    \"encoding/binary\"\n    \"os\"\n)\n\nfunc GetIndicesFromIndexFile(indexFilePath string) (\n    machinesIndices []uint32, err error) {\n\n    var rawIndex []byte\n    rawIndex, err = os.ReadFile(indexFilePath)\n\n    if err != nil {\n        return machinesIndices, err\n    }\n\n    // i is a byte offset: advance by 4 bytes until the file is consumed.\n    for i := 0; i+4 <= len(rawIndex); i += 4 {\n        machinesIndices = append(\n            machinesIndices, binary.BigEndian.Uint32(rawIndex[i:i+4]))\n    }\n\n    return machinesIndices, err\n}\n\n```\n\n\n\n## Reproducibility and verifiability statement\n\nAny result coming from the Busy Beaver Challenge will be fundamentally based on the numerous programs involved in the project such as the [seed database](#seed-database)'s generating code or the [deciders](#deciders).\n\nBecause programs can contain bugs (and often do), such computer-based results tend to struggle to gain trust in the scientific community, where the gold standard is mathematical proof in a peer-reviewed publication.\n\nBecause we aim to achieve this standard, the following principles are at the core of the Busy Beaver Challenge. Any program involved in the project must be:\n\n1. Open source\n2. Open to collaboration\n3. Modular, concise and clear\n4. Documented and unit tested\n5. Reproducible with clear build/run instructions\n6. 
Eventually accompanied by a proof of correctness\n\nFor deciders, we detail the reproducibility rules in this [validation process](https://discuss.bbchallenge.org/t/debate-vote-deciders-validation-process/85).\n\nWe would encourage the use of automatic proving tools such as [Lean](https://leanprover.github.io/) or [Coq](https://coq.inria.fr/) although it would be an extremely demanding endeavour.\n\nYou are invited to Contribute to making the Busy Beaver Challenge more reproducible and verifiable.\n\n\n\n
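One small, concrete reproducibility check is to compare a downloaded copy of the seed database against the shasums listed in the Download section before running anything on it. The sketch below is a minimal example of that check. It assumes the published shasums are SHA-1 digests (they are 40 hexadecimal characters long), and the helper names `file_sha1` and `matches_expected` are hypothetical, introduced here for illustration only.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-1 digest of a file, read chunk by chunk."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Reading in chunks avoids loading a multi-gigabyte file into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's SHA-1 digest equals the expected hex string."""
    return file_sha1(path) == expected_hex.strip().lower()
```

For example, assuming the zip was saved locally under the file name used by the first mirror, `matches_expected("all_5_states_undecided_machines_with_global_header.zip", "2576b647185063db2aa3dc2f5622908e99f3cd40")` should return `True` for an intact download.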