nytcrossword

An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era.


24 Years of NYTimes Crossword answers

September 2, 2017

View the notebook here

Description

Exploratory data analysis of 24 years of New York Times crossword answers. I use data visualization and concepts from computational linguistics to discover trends in the Shortz-era puzzles (1994 to 2017).

Questions include:

  • What are the most common answers?
  • Are words getting longer? Shorter?
  • How does puzzle letter density vary by day?
  • What words have emerged in the crossword only in the past few years?
  • How lexically diverse are the puzzles?
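
The notebook answers these questions in R, and its exact methods are not reproduced here. As a rough illustration only, the sketch below shows one way such measures could be computed in Python. The sample records, the function names, and the choice of type-token ratio as the diversity measure are all assumptions made for this example; none of them come from the project.

```python
from collections import Counter

# Hypothetical (year, answer) records for illustration only; not real puzzle data.
SAMPLE = [
    (1994, "AREA"), (1994, "ERA"), (1994, "OREO"), (1994, "AREA"),
    (2017, "ERA"), (2017, "EMOJI"), (2017, "OREO"), (2017, "SELFIE"),
]


def most_common_answers(records, n=3):
    """Return the n most frequent answers with their counts."""
    counts = Counter(answer for _, answer in records)
    return counts.most_common(n)


def mean_answer_length(records):
    """Return the average answer length for each year."""
    lengths_by_year = {}
    for year, answer in records:
        lengths_by_year.setdefault(year, []).append(len(answer))
    return {year: sum(ls) / len(ls) for year, ls in lengths_by_year.items()}


def type_token_ratio(records):
    """Distinct answers divided by total answers (assumed diversity measure)."""
    answers = [answer for _, answer in records]
    return len(set(answers)) / len(answers) if answers else 0.0


def new_answers(records, since):
    """Answers that appear in or after `since` but never before it."""
    earlier = {answer for year, answer in records if year < since}
    later = {answer for year, answer in records if year >= since}
    return sorted(later - earlier)


if __name__ == "__main__":
    print(most_common_answers(SAMPLE))
    print(mean_answer_length(SAMPLE))
    print(type_token_ratio(SAMPLE))
    print(new_answers(SAMPLE, since=2017))
```

Type-token ratio is sensitive to sample size, so comparing it across periods with very different answer counts would need some form of normalization.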

Dependencies

  • tidyverse for everything
  • plyr for data wrangling
  • here for OS-agnostic file paths
  • tidytext for text analysis methods
  • stringr for string-manipulation operations
  • viridis for a simple, colorblind-friendly palette

Data Sources

The original dataset for this project was scraped from XWordInfo.com. At their request, however, I have taken down my scraper code and removed the dataset from this repository. Read the notebook for more details.

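Key Metrics

Overview
Name and owner: jtanwk/nytcrossword
Primary language: HTML
Languages: R (language count: 2)
License: MIT License
Owner activity
Created: 2017-09-03 06:51:53
Last pushed: 2019-02-20 05:33:29
Last commit: 2019-02-19 23:33:20
Releases: 0
User engagement
Stars: 122
Watchers: 4
Forks: 8
Commits: 31
Issues: 1
Open issues: 0
Pull requests: 0
Open pull requests: 0
Closed pull requests: 0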