Skip to content

nevakrien/Turing-compiler

Repository files navigation

Turing-compiler

an optimizing compiler to a binary turing machine.

the syntax is fairly simplistic and we are stealing the perl lsp. this is still in early devlopment so any bugs caught would be very apreshated.

currently for x64_linux and a small native Arm (currently bugged). however nothing is inherently linux based in the compiler itself. the build system just assumes bash instead of powershell. it should be easy to fix if needed. if there is demand I would open a windows brench

The arm branch has slightly diffrent performance charcturistics and is tested less thoroughly. cross compilation testing shows that it does fail in some cases and thus needs to be fixed. currently we are using a crude method which should usually work (although i am not sure about the performance implications)

usage

syntax exmples can be found in code_tests/tasks and in tests/code_samples we use perl highlighting and a relativly free form of syntax.

as of now there are 3 major tools:

  1. tape_tool: used to mainpulate .tape files
  2. run_turing: runs a turing machine described in .t on a .tape file and writes the output to anoter .tape file
  3. tmc0/tmc1/treemc/tmc2: would compile a .t file into a binary that acts like run_turing

this languge is really not ment to be written by hand for long periods of time. the main intent is very big autogenerated machines. in code_tests/code_gen.py you can see some examples of this.

currently using export PATH=$(pwd)/bin:$PATH does not work with tmc because the path to io.o is derived programatically. this can be fixed in the future by directly basking io.o's binary into compiler.o as a string. or by using some sort of install system

dependencies

runtime

we depend on nasm and ld and libc. nothing more

build

runtime+ gnumake gcc and bash.

tests

build+ python (no packages)

general plan

the idea is to wrap all of the syscalls into io.o and have that be a stable abi on all platforms. then we generate intel asembly files that would need to be assembeled with nasm (I debated using opcodes but having it this way lets users read the assembly) linking is done with ld

this means there is no gcc dependency!!! or any other compiler for that matter.

we also want to have code comments on the generated assembly for clarity. the idea is to make it as human readble as possible. names from the source file can be found in the assembly output.

benchmarking

usage

make bench

would run the benchmarks.

for selecting what you run go to code_tests/time.py and define the timers dictionay. for the tmc0 test we had

timers={'run_turing_no_stop':time_run_turing_no_stop,'run_turing':time_run_turing,'tmc0':time_tmc0}

rational and statistics

the core measurment is a simple null hypothesis test. is A faster then B in most cases? this lets us measure if a preformance improvment is meaningful or not. without assuming a distribution for the noise. it also means we dont need to depend on numpy.

interpeters VS tmc0

even the O0 code is consistently faster than the interpeters. this is very promicing since there is a lot wrong with tmc0 in terms of preformance. it is made to be as simple as possible not to be fast

from the tests I ran on the interpeter it apears that not keeping track of the counter has a small yet semi consistent advantage. the advantage could very well be caused by the need to parse another argument or something else that is as silly. for most purposes it would probably not matter. the reason I am not using it is mainly because it simplifies some of the logic for the compiler.

from playing around with code vs tape sizes its become pretty clear that the IO part can mess with the benchmark. because of this it is recommanded to use smaller tapes for measuring.

length does seem to matter a lot. I am thinking this is related to branch prediction, or the instruction cache. essentially the compiled code has a major advantage because it is stored in Li and not Ld. the order of the unrolled loops seems to not be relevent.

tmc1 VS tmc0

tmc1 is slightly faster than tmc0. on the hand made code the diffrence is not statistically significant.

from what I seen the code_proning only really matters for when state order is paticularly bad. having it reordered so that the first states apear first has a significant speed advantage. removing dead code has basically no effect...

On the other hand some of the improvments Of jump eliminations can have a positive effect in some cases. The aligment of the extending functions is also super critical,

tmc2

tmc2 is faster than tmc1 and tmc0 on the handmade code. the diffrence is barely there but its measurble and statistically significant (on the first working test...)

tmc2 is also faster on the larger pool and it makes significantly shorter code EVEN THO it inlines all of the comperisons.

treemc

this compiler was used as an intermidate step for devloping tmc2. its faster than tmc0 but slower than tmc1. It does do unique things compared to tmc1 and it can be faster in some cases. just not in most cases

notes for devs

the make test/bench system deletes the compiled files. since everything compiles under a second its not an issue. if this is not desirble test.py and other python tests would work by themselves.

in code_tests all the generated code testing is handeled. tests is where unit/integration tests can be found.

it is recommended to add a small calling script into test.py for every test added.

calling convention

if you notice our register use does not care to restore regiters at all. this is because we are using _start and not main. so there is actually no one to return to and thus no point preserving registers.

also note that rsp is not treated as a special case in the C++ classes... this means that we allow using it as a temporery register. probably not the most ideal setup but it does make sense if for some reason we ever want to move to a no stack setup.

Coding Style

this is mostly a C project but as a learning exprince I am adding C++ for O2 we will see how it goes.

Arm notes

linking the arm tool chain properly is extremly complicated... we are kind of forced into using gcc as the linker with main instead of start. now this may be a result of virtualization but i am suspecting that its deeper

TODO

compiler

  1. add a way to benchmark old+new version of the compiler to asses changes

general

  1. unify the cli tools

parser

  1. add better error handeling with a dedicated struct.

BUGS

maybe_inline causes UB... need to look into it.

our Exit nodes cause ub when freed after they have moved away from the original state...

About

an optimizing compiler to a binary turing machine

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published