Create thinking_fast_and_slow.md
dmarx authored Mar 21, 2024
1 parent 5106b32 commit 274df77
Showing 1 changed file with 21 additions and 0 deletions.
21 changes: 21 additions & 0 deletions thinking_fast_and_slow.md
@@ -0,0 +1,21 @@
# thinking fast and slow

labels: experimental

weights = W

decompose W as W = W1 + W2, s.t. W1 and W2 each have the same shape as W

set a (alpha) to be a mixing rate, which starts at zero.

Learn W as W = W1 + a * W2, increasing `a` throughout training in proportion to lr
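
A minimal sketch of the mixing step, assuming `a` grows at each step by an amount proportional to the current learning rate and is capped at 1. The names `alpha_schedule`, `effective_weight`, and `scale` are hypothetical, and the cap is an assumption this note does not state.

```python
def alpha_schedule(lrs, scale):
    """Return the mixing rate used at each step.

    Assumption: alpha starts at 0 and, after every step, grows by
    scale * lr for that step, capped at 1.0 (the cap is not in the note).
    """
    alpha = 0.0
    alphas = []
    for lr in lrs:
        alphas.append(alpha)
        alpha = min(1.0, alpha + scale * lr)
    return alphas


def effective_weight(W1, W2, alpha):
    """Combine fast weights W1 and slow weights W2 as W = W1 + alpha * W2.

    W1 and W2 are same-shaped nested lists (rows of floats).
    """
    return [
        [w1 + alpha * w2 for w1, w2 in zip(row1, row2)]
        for row1, row2 in zip(W1, W2)
    ]


# Usage: constant lr of 0.1 for four steps, hypothetical scale of 2.0.
alphas = alpha_schedule([0.1, 0.1, 0.1, 0.1], scale=2.0)
W_start = effective_weight([[1.0, 2.0]], [[10.0, 20.0]], alphas[0])
```

At `a = 0` the effective weights are exactly W1, so early training is driven by the fast weights alone, and W2 is phased in as `a` grows.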

W2 are the "slow" weights and are learned conventionally

W1 will be parameterized as a hyperlora and learned in that form; these are our "fast" weights.

let W1 = VZ where V is a learnable vector and Z is a fixed, randomly initialized orthonormal matrix (i.e. random projections)
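
One way to read W1 = VZ, sketched with NumPy under several assumptions that this note does not specify: V is a length-`k` vector, Z has shape `(k, m*n)` with orthonormal rows (taken from a QR decomposition of a Gaussian matrix), and the product is reshaped to the `(m, n)` shape of W. The helper names `make_projection` and `fast_weights` are hypothetical.

```python
import numpy as np


def make_projection(k, m, n, seed=0):
    """Build a fixed random Z of shape (k, m*n) with orthonormal rows.

    Assumption: requires k <= m*n. Z is never trained.
    """
    rng = np.random.default_rng(seed)
    # Reduced QR of an (m*n, k) Gaussian matrix gives k orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((m * n, k)))
    return q.T


def fast_weights(v, Z, m, n):
    """Map the learnable vector v (length k) to W1 of shape (m, n)."""
    return (v @ Z).reshape(m, n)


# Usage: a 4x3 weight matrix driven by only 5 learnable parameters.
m, n, k = 4, 3, 5
Z = make_projection(k, m, n)
v = np.arange(1.0, k + 1.0)  # stand-in values for the learnable vector
W1 = fast_weights(v, Z, m, n)
```

Because the rows of Z are orthonormal, the Frobenius norm of W1 equals the norm of `v`, and only `k` numbers are trained for the fast weights.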

the "slow" weights are essentially a residual.

we could "stack" residuals if we wanted higher-order granularity
