multi-class-text-classification-cnn-rnn

Classify Kaggle San Francisco Crime descriptions into 39 classes. The model is built with a CNN, an RNN (GRU and LSTM), and word embeddings on TensorFlow.


Project: Classify Kaggle San Francisco Crime Description

Highlights:

  • This is a multi-class text classification (sentence classification) problem.
  • The goal of this project is to classify each Kaggle San Francisco Crime description into one of 39 classes.
  • The model is built with a CNN, an RNN (LSTM and GRU), and word embeddings on TensorFlow.
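
Before text can be fed to a word-embedding layer, each description usually has to become a fixed-length list of integer word ids. The sketch below shows one common way to do that, assuming whitespace tokenization and zero padding. The helpers `build_vocab` and `encode` and the `<PAD>` token are hypothetical names for illustration and are not taken from this repository, which may preprocess text differently.

```python
def build_vocab(sentences, pad_token="<PAD>"):
    """Assign an integer id to every distinct lowercased word; id 0 is padding."""
    vocab = {pad_token: 0}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentences, vocab, pad_token="<PAD>"):
    """Turn sentences into equal-length id lists, right-padded with the pad id."""
    tokenized = [sentence.lower().split() for sentence in sentences]
    max_len = max(len(tokens) for tokens in tokenized)
    pad_id = vocab[pad_token]
    return [
        [vocab[word] for word in tokens] + [pad_id] * (max_len - len(tokens))
        for tokens in tokenized
    ]


sentences = ["GRAND THEFT FROM LOCKED AUTO", "ATTEMPTED ROBBERY"]
vocab = build_vocab(sentences)
encoded = encode(sentences, vocab)
```

Each row of `encoded` has the same length, so the rows can be stacked into a batch and looked up in an embedding matrix.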

Data: Kaggle San Francisco Crime

  • Input: Descript

  • Output: Category

  • Examples:

    Descript                                   | Category
    -------------------------------------------|--------------
    GRAND THEFT FROM LOCKED AUTO               | LARCENY/THEFT
    POSSESSION OF NARCOTICS PARAPHERNALIA      | DRUG/NARCOTIC
    AIDED CASE, MENTAL DISTURBED               | NON-CRIMINAL
    AGGRAVATED ASSAULT WITH BODILY FORCE       | ASSAULT
    ATTEMPTED ROBBERY ON THE STREET WITH A GUN | ROBBERY
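
As a minimal sketch of how the Descript and Category columns might be turned into inputs and targets, the snippet below parses CSV text with the standard library, keeps those two columns, and one-hot encodes the categories. The function names `load_pairs` and `one_hot_labels` are hypothetical and do not come from this repository; the actual `train.py` may load and encode the data differently.

```python
import csv
import io


def load_pairs(csv_text):
    """Return (description, category) pairs from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Descript"], row["Category"]) for row in reader]


def one_hot_labels(categories):
    """Map each category to a one-hot vector over the sorted unique labels."""
    labels = sorted(set(categories))
    index = {label: i for i, label in enumerate(labels)}
    vectors = []
    for category in categories:
        vec = [0] * len(labels)
        vec[index[category]] = 1
        vectors.append(vec)
    return labels, vectors


# Quoting lets a description contain a comma, as in the AIDED CASE example.
SAMPLE = '''Descript,Category
GRAND THEFT FROM LOCKED AUTO,LARCENY/THEFT
"AIDED CASE, MENTAL DISTURBED",NON-CRIMINAL
AGGRAVATED ASSAULT WITH BODILY FORCE,ASSAULT
'''

pairs = load_pairs(SAMPLE)
labels, targets = one_hot_labels([category for _, category in pairs])
```

With the full dataset, `labels` would hold the 39 categories and every target vector would have 39 entries.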

Train:

  • Command: python3 train.py train_data.file train_parameters.json
  • Example: python3 train.py ./data/train.csv.zip ./training_config.json
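
The train command takes two positional arguments: the data file and a JSON parameter file. A rough sketch of how a script could read them is shown below. The function names `parse_params` and `read_train_args` are hypothetical, and no particular parameter keys are assumed, since the contents of `training_config.json` are not described here.

```python
import json


def parse_params(json_text):
    """Parse the parameter file contents and require a JSON object."""
    params = json.loads(json_text)
    if not isinstance(params, dict):
        raise ValueError("training parameters must be a JSON object")
    return params


def read_train_args(argv):
    """Split argv into the data path and the parsed training parameters."""
    if len(argv) != 3:
        raise SystemExit(
            "usage: python3 train.py train_data.file train_parameters.json"
        )
    data_path, params_path = argv[1], argv[2]
    with open(params_path, encoding="utf-8") as handle:
        return data_path, parse_params(handle.read())
```

A script would typically call `read_train_args(sys.argv)` once at startup and pass the resulting dictionary to the model-building code.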

Predict:

  • Command: python3 predict.py ./trained_results_dir/ new_data.csv
  • Example: python3 predict.py ./trained_results_1478563595/ ./data/small_samples.csv
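
At prediction time, a classifier of this kind typically produces one score per class for each description, and the predicted Category is the class with the highest score. The sketch below illustrates that final step only, under the assumption that the scores arrive as plain lists in the same order as the label list. `decode_predictions` is a hypothetical helper, not a function from `predict.py`.

```python
def decode_predictions(scores, labels):
    """Pick the label with the highest score for each row of class scores."""
    decoded = []
    for row in scores:
        if len(row) != len(labels):
            raise ValueError("each score row needs one entry per label")
        best = max(range(len(row)), key=row.__getitem__)
        decoded.append(labels[best])
    return decoded


labels = ["ASSAULT", "LARCENY/THEFT", "ROBBERY"]
scores = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]
predicted = decode_predictions(scores, labels)
```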


Main metrics

Overview
Name With Owner: jiegzhan/multi-class-text-classification-cnn-rnn
Primary Language: Python
Program language: Python (Language Count: 1)
Platform
License: Apache License 2.0
Owner activity
Created At: 2016-10-28 16:55:06
Pushed At: 2018-03-23 17:46:57
Last Commit At: 2018-03-23 10:46:56
Release Count: 0
User engagement
Stargazers Count: 599
Watchers Count: 52
Fork Count: 263
Commits Count: 79
Has Issues Enabled
Issues Count: 38
Issue Open Count: 30
Pull Requests Count: 3
Pull Requests Open Count: 1
Pull Requests Close Count: 0
Project settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private