nytcrossword

An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era.


24 Years of NYTimes Crossword answers

September 2, 2017

View the notebook here

Description

Exploratory data analysis of 24 years of New York Times Crossword answers. I use data visualization and computational linguistics concepts to discover trends in the Shortz-era puzzles (1994 - present).

Questions include:

  • What are the most common answers?
  • Are words getting longer? Shorter?
  • How does puzzle letter density vary by day?
  • What words have emerged in the crossword only in the past few years?
  • How lexically diverse are the puzzles?
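Several of these questions reduce to short grouped summaries. The sketch below is only an illustration, not the notebook's actual code: the dataset is not distributed with the repository, so the toy `answers` table, its column names (`year`, `weekday`, `answer`), and the 2015 cutoff for "recent" answers are all assumptions. Lexical diversity is approximated here with a simple type-token ratio, which may differ from the measure used in the notebook.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; the column names are assumptions, not the notebook's schema.
answers <- tibble(
  year    = c(1994, 1994, 1994, 2017, 2017, 2017),
  weekday = c("Mon", "Mon", "Sat", "Mon", "Sat", "Sat"),
  answer  = c("ERA", "OREO", "AREA", "ERA", "EMOJI", "ERA")
)

# Most common answers overall, most frequent first
top_answers <- answers %>%
  count(answer, sort = TRUE)

# Mean answer length per year (are words getting longer or shorter?)
length_by_year <- answers %>%
  mutate(len = str_length(answer)) %>%
  group_by(year) %>%
  summarise(mean_len = mean(len), .groups = "drop")

# Lexical diversity per year as a type-token ratio:
# distinct answers divided by total answers
diversity_by_year <- answers %>%
  group_by(year) %>%
  summarise(ttr = n_distinct(answer) / n(), .groups = "drop")

# Answers whose first appearance falls in recent years (cutoff is an assumption)
new_answers <- answers %>%
  group_by(answer) %>%
  summarise(first_year = min(year), .groups = "drop") %>%
  filter(first_year >= 2015)
```

A type-token ratio is sensitive to sample size, so comparing years with very different numbers of puzzles would likely need a normalized variant.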

Dependencies

  • tidyverse for everything
  • plyr for data wrangling
  • here for OS-agnostic file paths
  • tidytext for text analysis methods
  • stringr for string-manipulation operations
  • viridis for a simple, colorblind-friendly palette
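As a convenience, the packages listed above can be checked before running the notebook. This is a minimal sketch that assumes all six are installed from CRAN under the names shown; it only reports what is missing and does not install or load anything itself.

```r
# Packages named in the dependency list above
pkgs <- c("tidyverse", "plyr", "here", "tidytext", "stringr", "viridis")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message(
    "Missing packages. Install with: install.packages(c(",
    paste0('"', missing_pkgs, '"', collapse = ", "),
    "))"
  )
}
```

When attaching the packages, loading plyr before the tidyverse avoids plyr masking dplyr functions of the same name, such as `summarise()` and `mutate()`.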

Data Sources

The original dataset for this project was scraped from XWordInfo.com. At their request, however, I have taken down my scraper code and removed the dataset from this repository. Read the notebook for more details.

Key Metrics

Overview
Name and owner: jtanwk/nytcrossword
Primary language: HTML
Languages: R (2 languages in total)
License: MIT License
Owner activity
Created: 2017-09-03 06:51:53
Last pushed: 2019-02-20 05:33:29
Last commit: 2019-02-19 23:33:20
Releases: 0
User engagement
Stars: 122
Watchers: 4
Forks: 8
Commits: 31
Issues: 1
Open issues: 0
Pull requests: 0
Open pull requests: 0
Closed pull requests: 0