Exercise 1: Word frequency analysis

Introduction

In this tutorial, you will learn how to summarise, aggregate, and analyze text in R:

  • How to tokenize and filter text
  • How to clean and preprocess text
  • How to visualize results with ggplot
  • How to perform automated gender assignment from name data (and think about the biases these methods may encode)
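
As a rough preview of the first two skills, here is a minimal sketch of tokenizing, filtering, and counting words using base R only. The example sentence and the short stop word list are made up for illustration; they are not taken from the Fringe data, and the tutorial itself may rely on dedicated text-analysis packages instead.

```r
# Illustrative sentence (invented for this sketch, not from the Fringe data)
text <- "The show is a comedy and the comedy is a hit"

# Tokenize: lower-case the text, then split on anything that is not
# a letter or an apostrophe
tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
tokens <- tokens[tokens != ""]

# Filter: drop a tiny hand-picked stop word list (an assumption for illustration)
stop_words <- c("the", "is", "a", "and")
kept <- tokens[!tokens %in% stop_words]

# Count word frequencies and sort from most to least frequent
word_freq <- sort(table(kept), decreasing = TRUE)
print(word_freq)
```

A real analysis would normally use a fuller stop word list and a tokenizer that handles punctuation and non-English characters more carefully.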

Setup

To practice these skills, we will use a dataset that I have already collected from the Edinburgh Fringe Festival website.

You can also collect these data yourself. To do so, you must first obtain an API key; instructions are available on the Edinburgh Fringe API page.

Load data and packages

Before proceeding, we'll load the packages we need for this tutorial.