Create thinking_fast_and_slow.md

dmarx · Mar 21, 2024 · 274df77
1 parent 5106b32
commit 274df77
Showing 1 changed file with 21 additions and 0 deletions.
diff --git a/thinking_fast_and_slow.md b/thinking_fast_and_slow.md
@@ -0,0 +1,21 @@
+# thinking fast and slow
+
+labels: experimental
+
+weights = W
+
+decompose W into W = W1 + W2, where W1 and W2 both have the same shape as W
+
+set a (alpha) to be a mixing rate, which starts at zero.
+
+Learn W as W = W1 + a * W2, increasing `a` throughout training in proportion to the learning rate (lr). This matches the decomposition W = W1 + W2 exactly when `a` = 1.
+
+W2 are the "slow" weights and are learned conventionally
+
+W1 is learned through a hyperlora parameterization; these are our "fast" weights.
+
+let W1 = VZ, where V is a learnable vector and Z is a fixed, randomly initialized orthonormal matrix (i.e. a random projection). Since V is a vector, VZ presumably has to be reshaped to the shape of W.
+
+the "slow" weights are essentially a residual.
+
+we could "stack" residuals if we wanted higher-order granularity
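
A minimal NumPy sketch of the parameterization described in this note, under assumptions the note does not state: V is a length-k vector, Z has shape (k, m*n) with orthonormal rows, and VZ is reshaped to the (m, n) shape of W. The names `make_projection`, `effective_weights` and `mixing_rate`, and the choice to set `a` to the normalized cumulative learning rate, are hypothetical illustrations rather than the author's implementation.

```python
import numpy as np


def make_projection(k, m, n, seed=0):
    """Fixed random matrix Z of shape (k, m*n) with orthonormal rows.

    Assumption: "orthonormal" is read as Z @ Z.T == I_k, which needs k <= m*n.
    """
    rng = np.random.default_rng(seed)
    # Reduced QR of an (m*n, k) Gaussian matrix yields orthonormal columns;
    # transposing gives k orthonormal rows.
    q, _ = np.linalg.qr(rng.standard_normal((m * n, k)))
    return q.T


def effective_weights(v, Z, W2, a):
    """Return W = W1 + a * W2 with the "fast" weights W1 = reshape(v @ Z)."""
    W1 = (v @ Z).reshape(W2.shape)  # fast weights: only v is learnable
    return W1 + a * W2              # slow weights W2 enter scaled by a


def mixing_rate(lrs, a_max=1.0):
    """One possible schedule: a grows by an amount proportional to each step's lr.

    Returns len(lrs) + 1 values, starting at 0 and ending at a_max.
    """
    lrs = np.asarray(lrs, dtype=float)
    cumulative = np.concatenate([[0.0], np.cumsum(lrs)])
    return a_max * cumulative / lrs.sum()
```

With `a = 0` the effective weights reduce to W1 alone, so early training only moves the k entries of `v`; as `a` grows, W2 contributes more of W. In an actual training loop `v` and `W2` would be the trainable parameters and `Z` would stay frozen.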