This is the main project repository for the paper Symbolically Synthesized Neural Networks:
Abstract: Neural networks adapt very well to distributed and continuous representations, but struggle to generalize from small amounts of data. Symbolic systems commonly achieve data efficient generalization by exploiting modularity to benefit from local and discrete features of a representation. These features allow symbolic programs to be improved one module at a time and to experience combinatorial growth in the values they can successfully process. However, it is difficult to design a component that can be used to form symbolic abstractions and which is adequately overparametrized to learn arbitrary high-dimensional transformations. I present Graph-based Symbolically Synthesized Neural Networks (G-SSNNs), a class of neural modules that operate on representations modified with synthesized symbolic programs to include a fixed set of local and discrete features. I demonstrate that the choice of injected features within a G-SSNN module modulates the data efficiency and generalization of baseline neural models, creating predictable patterns of both heightened and curtailed generalization. By training G-SSNNs, we also derive information about desirable semantics of symbolic programs without manual engineering. This information is compact and amenable to abstraction, but can also be flexibly recontextualized for other high-dimensional settings. In future work, I will investigate data efficient generalization and the transferability of learned symbolic representations in more complex G-SSNN designs based on more complex classes of symbolic programs. Experimental code and data are available in this repository.
This repository binds together the data generation capabilities offered by `raven-gen`, the library learning capabilities of `stitch-core`, the distributional program search and file management capabilities of `antireduce`, and the relational structures defined by `antireduce-graphs` to support training and evaluation of G-SSNNs.
The `graph/` directory exists after step (1) below has been performed. It contains the subdirectories used by the OCaml binaries and the `SymbolicallySynthesizedNetworks` class to manage the state of a population of G-SSNNs: `graph/discards` tracks discarded members of the population, `graph/representations` tracks the current population, `graph/dsls` tracks iterations of the current Domain-Specific Library (DSL), and `graph/visualizations` tracks visualizations of the current population. The directory also contains the dataset that was synthesized with the functions of the `raven.py` module and used in official experiments. The file `graph/128_split_correct.pkl` captures the split of a random subset of the data that was used for training and validation in official experiments.
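As a quick sanity check after unzipping, the layout described above can be verified with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name `missing_graph_subdirs` is hypothetical, and it checks only the four subdirectories named in the text.

```python
from pathlib import Path

# Subdirectory names taken from the description of graph/ above.
EXPECTED_SUBDIRS = ("discards", "representations", "dsls", "visualizations")


def missing_graph_subdirs(root="graph"):
    """Return the expected subdirectories that are absent under `root`.

    Hypothetical helper for illustration; it is not defined in the repository.
    """
    root = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_graph_subdirs()
    if missing:
        print("Missing under graph/:", ", ".join(missing))
    else:
        print("graph/ layout looks complete")
```

An empty result means all four state directories are present; it says nothing about whether the dataset or the split file exist.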
The `program_synthesis/` directory contains OCaml executables, which can be compiled by running `make` from the root of the repository or by running `dune build` from within `program_synthesis/`, provided that steps (2-4) have been performed. After running either command, the appropriate `*.exe` binaries (this naming is platform independent) should exist within `program_synthesis/_build/default`. The `utilities.py` module wraps calls to these binaries in ordinary Python functions.
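The general pattern of wrapping a compiled binary in a Python function might look like the sketch below. It is an illustration under stated assumptions, not the code in `utilities.py`: the function name `run_binary` and its parameters are hypothetical, and the real wrappers may pass arguments and read results differently.

```python
import subprocess
from pathlib import Path

# Build output directory named in the text above.
BUILD_DIR = Path("program_synthesis/_build/default")


def run_binary(exe, args=(), build_dir=BUILD_DIR, timeout=None):
    """Run a compiled executable and return its standard output as text.

    `exe` is a path relative to `build_dir`. This is a hypothetical helper;
    the actual wrappers in utilities.py may differ.
    """
    path = Path(build_dir) / exe
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; has the project been built?")
    result = subprocess.run(
        [str(path), *map(str, args)],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as text
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,
    )
    return result.stdout
```

Checking for the file first gives a clearer error when the build step has been skipped than the error raised by `subprocess` alone.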
The `results/` directory functions as an archive of current experimental results. Files at the root level of `results/` correspond to baseline runs. There are currently three runs, only one of which is directly relevant to analysis of the performance of G-SSNNs. The baseline from the paper is contained in the files named `128_const_color_1_iteration_baseline_*`. The baseline plot from the paper is also included as `128_const_color_1_iteration_baseline_plot.pdf`; the file was renamed since official submission but is otherwise unaltered. The other files at the root of the `results/` directory contain logs of baseline runs that took more than one iteration; the other `128_const_color_*_iteration_baseline_summary.json` files contain data on the average performance of a population of 10 baseline models across 10 and 15 iterations.
The subdirectory `results/ssn_0_0` archives the results of the run of the experimental setting reported in the paper. It includes the contents of `graph/{discards,dsls,representations,visualizations}` at the time the run concluded, as well as a complete `log.json` containing the experimental data and a `discard_hist.json` showing which files in `results/ssn_0_0/discards` were removed after which iterations. The plots used in the paper to depict results in the experimental setting are present in this subdirectory as well.
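The schema of the archived JSON files is not described here, so the sketch below makes no assumption about their contents and only reports their top-level structure. The helper name `describe_json` is hypothetical and is not part of the repository.

```python
import json
from pathlib import Path


def describe_json(path):
    """Load a JSON file and report its top-level type, length, and keys.

    Hypothetical inspection helper; it assumes nothing about the schema.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    info = {"type": type(data).__name__}
    if isinstance(data, (dict, list)):
        info["length"] = len(data)
    if isinstance(data, dict):
        info["keys"] = sorted(data)  # JSON object keys are always strings
    return info


if __name__ == "__main__":
    # File names taken from the description of results/ssn_0_0 above.
    for name in ("log.json", "discard_hist.json"):
        path = Path("results/ssn_0_0") / name
        if path.is_file():
            print(name, describe_json(path))
```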
The `utilities.py` module contains Python wrapper functions around calls to the executables produced by OCaml compilation. The three functions currently in use are `explore`, `stitch_compress`, and `incorporate_stitch`. The first performs distributional program search in the style of the `antireduce` and `antireduce-graphs` libraries; the second performs library learning by delegating to the compression functionality of the `stitch_core` package; and the third updates the artifacts in `graph/dsls` and `graph/representations` that are affected by the creation of primitives and rewriting of programs performed via `stitch_core`.
The `raven.py` module contains functions for generating data with `raven-gen`, loading this data as a subclass of `torch.utils.data.dataset.Dataset`, producing random two-way splits of this data, and extracting different ground truth annotations.
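The idea of a random two-way split can be illustrated on plain indices. This is a generic sketch, not the module's `random_split` function, which takes additional arguments; the name `two_way_split` and its parameters are hypothetical.

```python
import random


def two_way_split(n_items, n_first, seed=None):
    """Randomly partition the indices 0..n_items-1 into two disjoint lists.

    The first list has `n_first` indices and the second has the remainder.
    Hypothetical helper for illustration only.
    """
    if not 0 <= n_first <= n_items:
        raise ValueError("n_first must lie between 0 and n_items")
    rng = random.Random(seed)  # a private generator keeps global state untouched
    indices = list(range(n_items))
    rng.shuffle(indices)
    return indices[:n_first], indices[n_first:]


if __name__ == "__main__":
    first, second = two_way_split(20, 5, seed=0)
    print(len(first), len(second))
```

Passing a fixed `seed` makes the partition reproducible, which serves the same purpose as saving a split to disk and reloading it later.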
Contains classes representing baseline and experimental models together with shared training routines. The class `GraphStructured` provides the implementation of the embedding function described in the G-SSNNs section of the paper.
Contains the class `SymbolicallySynthesizedNetworks`, which evolves a population of G-SSNNs by iteratively performing program synthesis, training a G-SSNN for each program in the generated population, then discarding the bottom half of programs according to training accuracy.
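The synthesize, train, and discard loop described above can be sketched in a few lines. This is a toy illustration, not the implementation of `SymbolicallySynthesizedNetworks`: the function `evolve` and its callback parameters are hypothetical, and it assumes that surviving programs compete alongside newly synthesized ones, which the description does not state.

```python
def evolve(population, synthesize, train_and_score, iterations):
    """Evolve a population by repeated synthesis, scoring, and truncation.

    population      -- list of programs (any objects)
    synthesize      -- callable mapping the current population to new programs
    train_and_score -- callable mapping a program to its training accuracy
    iterations      -- number of synthesis/selection rounds

    All names are hypothetical; survivors competing with new programs is an
    assumption of this sketch.
    """
    for _ in range(iterations):
        candidates = population + synthesize(population)
        # Train one model per candidate program and record its accuracy.
        scored = [(train_and_score(program), program) for program in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Discard the bottom half, rounding the number kept upwards.
        keep = (len(scored) + 1) // 2
        population = [program for _, program in scored[:keep]]
    return population


if __name__ == "__main__":
    # Toy stand-ins: programs are integers and accuracy is the integer itself.
    result = evolve([1, 2], lambda ps: [p + 10 for p in ps], lambda p: p, 2)
    print(result)
```

In the toy run, each round doubles the candidate pool and then halves it, so the population size stays constant across iterations.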
The `experiments` module provides high-level routines for running the experiments and baselines reported in the paper. The `run_*` functions of this module also archive their results within the `results/` directory. The separate `plot_*` functions must be used to generate the accompanying plots.
- Unzip `graph.zip` to recreate the intended `graph/` directory structure.
- Install OPAM and create a switch with `opam switch create 4.14.0`.
- Clone the `antireduce` and `antireduce-graphs` repositories. From within the root of each, run `opam install .` with the previously created switch.
- Import the remaining configuration to the switch with the command `opam switch import opam-switch.freeze`. If this does not succeed, attempt to install the packages listed under `roots` in the file `opam-switch.freeze`.
- Create a Python virtual environment and install the following packages; for the precise versions used in official experiments, consult `requirements.txt`:
  - `raven-gen`
  - `matplotlib`
  - `numpy`
  - `tqdm`
  - `dgl`
  - `torch`
  - `torchvision`
  - `vit-pytorch`
  - `einops`
  - `pygraphviz`
  - `stitch_core`
- Generate data in an interpreter with the following commands:
>>> from raven import *
>>> generate_data(20000, "graph/dataset", save_pickle=True)
>>> train, eval = random_split("graph/dataset", constant_color(True), constant_color(False), 128, batch_size=8, n_eval_batches=150, include_incorrect=False)
>>> save_split("graph/128_split_correct.pkl")
- Run the baseline and generate baseline plots with the following commands:
>>> from experiments import *
>>> train, eval = load_split("graph/128_split_correct.pkl", "graph/dataset", constant_color(True), constant_color(False), 128, batch_size=8, n_eval_batches=150, include_incorrect=False)
>>> run_baseline("128_const_color_1_iteration_baseline", 50, train, eval, CONSTANT_COLOR_TARGET_DIM, training_iterations=1)
>>> plot_single_iter_baseline_results("128_const_color_1_iteration_baseline", 50, 6)
- Run the experimental models with the following commands:
>>> from experiments import *
>>> train, eval = load_split("graph/128_split_correct.pkl", "graph/dataset", constant_color(True), constant_color(False), 128, batch_size=8, n_eval_batches=150, include_incorrect=False)
>>> run_experiment("ssn_0", 1, 5, train, eval, CONSTANT_COLOR_TARGET_DIM)
>>> plot_experimental_results("ssn_0_0", 6)
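Before running the commands above, it can help to confirm that the Python packages from the setup steps are importable in the active virtual environment. The sketch below is not part of the repository. The import names are assumptions: where a distribution name contains a hyphen (`raven-gen`, `vit-pytorch`), the import name is assumed to use an underscore instead.

```python
import importlib.util

# Assumed import names for the packages listed in the setup steps.
IMPORT_NAMES = [
    "raven_gen", "matplotlib", "numpy", "tqdm", "dgl", "torch",
    "torchvision", "vit_pytorch", "einops", "pygraphviz", "stitch_core",
]


def missing_modules(names=IMPORT_NAMES):
    """Return the top-level module names that cannot be found.

    Uses importlib.util.find_spec, which locates a module without importing it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All listed packages are importable")
```

This checks only that each module can be located, not that its version matches `requirements.txt`.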