nytcrossword

An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era.


24 Years of NYTimes Crossword answers

September 2, 2017

View the notebook here

Description

Exploratory data analysis of 24 years of New York Times Crossword answers. I use data visualization and computational linguistics concepts to find trends in the Shortz-era puzzles (1994-2017).

Questions include:

  • What are the most common answers?
  • Are words getting longer? Shorter?
  • How does puzzle letter density vary by day?
  • What words have emerged in the crossword only in the past few years?
  • How lexically diverse are the puzzles?
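
As a rough illustration of the first and last questions, here is a minimal sketch in Python (the notebook itself is written in R). The answer list is made-up sample data, not the original dataset, and the function names are hypothetical. It counts the most common answers and computes a type-token ratio, which is one simple measure of lexical diversity; the notebook may measure diversity differently.

```python
from collections import Counter

# Made-up sample answers for illustration only; NOT the original dataset.
SAMPLE_ANSWERS = ["ERA", "AREA", "ERA", "OREO", "ERA", "AREA", "ALOE"]


def most_common_answers(answers, n=3):
    """Return the n most frequent answers as (answer, count) pairs."""
    return Counter(answers).most_common(n)


def type_token_ratio(answers):
    """Unique answers divided by total answers (0.0 for an empty list)."""
    if not answers:
        return 0.0
    return len(set(answers)) / len(answers)
```

Calling `most_common_answers(SAMPLE_ANSWERS, 2)` ranks the sample answers by frequency, and `type_token_ratio(SAMPLE_ANSWERS)` returns a value between 0 and 1, where higher means fewer repeated answers.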

Dependencies

  • tidyverse for everything
  • plyr for data wrangling
  • here for OS-agnostic file paths
  • tidytext for text analysis methods
  • stringr for string-manipulation operations
  • viridis for a simple, colorblind-friendly palette

Data Sources

The original dataset for this project was scraped from XWordInfo.com. At their request, however, I have taken down my scraper code and removed the dataset from this repository. Read the notebook for more details.

Main metrics

Overview
Name With Owner: jtanwk/nytcrossword
Primary Language: HTML
Program Language: R (Language Count: 2)
License: MIT License
Owner Activity
Created At: 2017-09-03 06:51:53
Pushed At: 2019-02-20 05:33:29
Last Commit At: 2019-02-19 23:33:20
Release Count: 0
User Engagement
Stargazers Count: 122
Watchers Count: 4
Fork Count: 8
Commits Count: 31
Issues Count: 1
Issue Open Count: 0
Pull Requests Count: 0
Pull Requests Open Count: 0
Pull Requests Close Count: 0