---
title: "Computational Text Analysis"
author: "Marion Lieutaud"
date: "`r Sys.Date()`"
site: bookdown::bookdown_site
output: bookdown::bs4_book
documentclass: book
bibliography: [book.bib, packages.bib, CTA.bib]
url: https://cjbarrie.github.io/CTA-ED/
cover-image: coverb.png
description: |
  Online book for ten-week course in "Computational Text Analysis" (PGSP11584)
link-citations: yes
github-repo: cjbarrie/CTA-ED
---
# "Computational Text Analysis" (PGSP11584) {.unnumbered}
![cover](coverb.png){.cover width="250"} This is the dedicated webpage for the course "Computational Text Analysis" [(PGSP11584)](http://www.drps.ed.ac.uk/23-24/dpt/cxpgsp11584.htm) at the University of Edinburgh, taught by Christopher Barrie. See the Course Overview and Introduction tabs for an overview of the course and an introduction to R.
We will be using this online book throughout the course. Each week has a set of essential and recommended readings. The essential readings must be read in full before that week's lecture and seminar. You will also find online exercises and examples written in R. This is a "live" book and will be amended and updated during the course itself.
## Structure
The course alternates between weeks of substantive and technical instruction.
| Week | Focus | Coding assignment(s) | Class activity |
|------------|------------|-----------------------------|-------------------|
| 1 | Retrieving and analyzing text information | [Introductory exercises](https://cjbarrie.github.io/CTA-ED/introduction-to-r.html) + [RTC Workshop by Ugur Ozdemir](https://research-training-centre.sps.ed.ac.uk/micro-methods/) | Seminar discussion |
| 2 | Tokenization and word frequencies | [Demo](https://cjbarrie.github.io/CTA-ED/week-2-demo.html) | Seminar discussion |
| 3 | Dictionary-based techniques | [Demo](https://cjbarrie.github.io/CTA-ED/week-3-demo.html) + [Exercise 2](https://cjbarrie.github.io/CTA-ED/exercise-2-dictionary-based-methods.html) | Flash talk + [Exercise 1](https://cjbarrie.github.io/CTA-ED/exercise-1-word-frequency-analysis.html) group work |
| 4 | Natural language, complexity, and similarity | [Demo](https://cjbarrie.github.io/CTA-ED/week-4-demo.html) | Coding demo of Exercise 2 + Seminar discussion |
| 5 | Scaling techniques | [Demo](https://cjbarrie.github.io/CTA-ED/week-5-demo.html) + [Exercise 4](https://cjbarrie.github.io/CTA-ED/exercise-4-scaling-techniques.html) | Flash talk + [Exercise 3](https://cjbarrie.github.io/CTA-ED/exercise-3-comparison-and-complexity.html) group work |
| 6 | Unsupervised learning (topic models) | [Demo](https://cjbarrie.github.io/CTA-ED/week-6-demo.html) | Coding demo of Exercise 4 + Seminar discussion |
| 7 | Unsupervised learning (word embedding) | [Demo](https://cjbarrie.github.io/CTA-ED/week-7-demo.html) + [Exercise 6](https://cjbarrie.github.io/CTA-ED/exercise-6-unsupervised-learning-word-embedding.html) | Flash talk + [Exercise 5](https://cjbarrie.github.io/CTA-ED/exercise-5-unsupervised-learning-topic-models.html) group work |
| 8 | Sampling text information | [Demo](https://cjbarrie.github.io/CTA-ED/week-8-demo.html) | Coding demo of Exercise 6 + Seminar discussion |
| 9 | Supervised learning | Demo + Exercise 8 | Flash talk + [Exercise 7](https://cjbarrie.github.io/CTA-ED/exercise-7-sampling-text-information.html) group work |
| 10 | Validation | Demo + Exercise 9 | ~~Coding demo of Exercise 8~~ + Seminar discussion |
## Acknowledgments {.unnumbered}
When compiling this course, I benefited from syllabus materials shared online by Bradley Boehmke, Margaret Roberts, Alexandra Siegel, and Arthur Spirling. Thanks also to Justin Grimmer, Margaret Roberts, and Brandon Stewart for providing early view access to their forthcoming [*Text as Data*](https://press.princeton.edu/books/hardcover/9780691207544/text-as-data) book.