Curated dataset of Satoshi Nakamoto responding to users
This project curates part of the dataset collected by NakamotoInstitute for nakamotoinstitute.org, specifically the posts.json and emails.json files.
The data is cleaned up, merged and re-assembled into Q&As answered by Satoshi Nakamoto. The Q&A data is used by the API powering satoshi-ama to answer user questions.
In addition, the final JSON includes several questions generated by GPT-4; all of them are answered with literal quotes from Satoshi Nakamoto.
I aimed to keep the original dataset intact, so all modifications are recorded in inputs/overrides.json, and even those don't change the original text; they only rearrange it. The data still needs more cleanup, but I won't be doing that in the short term.
You can view the dataset as JSON here and visually here: https://flesler.github.io/satoshi-data/
npm install
npm run start
npm run serve
The output is a JSON file containing an array of objects, sorted chronologically. Many have a type, which is one of:
- gpt-4: Those generated with GPT-4
- favorite: The ones I personally liked the most
- ignore: The ones I personally didn't find useful for my particular needs
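As a rough illustration of how a consumer might use the type field, here is a minimal sketch that counts entries per type and drops the ones marked ignore. Only the type field and its three values come from this README; the question and answer field names, the sample entries, and the helper names countByType and withoutIgnored are assumptions made for the example, so adjust them to match the actual JSON.

```javascript
// Hypothetical sample shaped like the output described above. Only `type`
// and its values come from the README; `question` and `answer` are assumed
// field names used purely for illustration.
const sample = [
  { question: "Q1", answer: "A1", type: "favorite" },
  { question: "Q2", answer: "A2", type: "ignore" },
  { question: "Q3", answer: "A3", type: "gpt-4" },
  { question: "Q4", answer: "A4" }, // many, but not all, entries have a type
];

// Count entries per type; entries without a type are grouped under "untyped".
function countByType(entries) {
  const counts = {};
  for (const entry of entries) {
    const key = entry.type ?? "untyped";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Drop entries marked "ignore" and keep the rest in their original order,
// so the chronological sorting of the file is preserved.
function withoutIgnored(entries) {
  return entries.filter((entry) => entry.type !== "ignore");
}

console.log(countByType(sample));
console.log(withoutIgnored(sample).map((entry) => entry.question));
```

With the real file, the sample array would be replaced by the parsed JSON output, for example via JSON.parse on the file contents.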