
An open-source neural hierarchical multi-label text classification toolkit.


NeuralClassifier: An Open-source Neural Hierarchical Multi-label Text Classification Toolkit


NeuralClassifier is designed for quick implementation of neural models for hierarchical multi-label classification, a challenging task that is common in real-world scenarios. A salient feature is that NeuralClassifier provides a variety of text encoders, including FastText, TextCNN, TextRNN, RCNN, VDCNN, DPCNN, DRNN, AttentiveConvNet, and Transformer encoder. It also supports other text classification scenarios, including binary-class and multi-class classification. It is built on PyTorch. Experiments show that models built with the toolkit achieve performance comparable to results reported in the literature.

Supported tasks

  • Binary-class text classification
  • Multi-class text classification
  • Multi-label text classification
  • Hierarchical (multi-label) text classification (HMC)

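Supported text encoders

FastText, TextCNN, TextRNN, RCNN, VDCNN, DPCNN, DRNN, AttentiveConvNet, Transformer encoder, and others.

Requirements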


  • Python 3
  • PyTorch 0.4+
  • NumPy 1.14.3+

System Architecture

[Figure: NeuralClassifier architecture]



Usage

Training

    python train.py conf/train.json

For detailed configuration options and explanations, see Configuration.

Training information is written to standard output and to the file set by log.logger_file.


Evaluation

    python eval.py conf/train.json

  • If eval.is_flat = false, hierarchical evaluation results are reported.
  • eval.model_dir specifies the model to evaluate.
  • data.test_json_files specifies the input text file(s) to evaluate.

Evaluation results are written to eval.dir.
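The dotted names used here (eval.is_flat, eval.model_dir, data.test_json_files) read naturally as paths into a nested JSON configuration. The following is a minimal sketch under that assumption. The helper `get_by_path` and every value in the excerpt are hypothetical placeholders, not part of the toolkit; only the key names come from the text above.

```python
import json

# Hypothetical excerpt of a config such as conf/train.json.
# Only the key names come from the README; every value is a placeholder.
CONFIG_TEXT = """
{
  "data": {"test_json_files": ["data/my_test.json"]},
  "eval": {"is_flat": false, "model_dir": "checkpoints/my_model"}
}
"""


def get_by_path(config, dotted_key):
    """Resolve a dotted name such as 'eval.is_flat' in a nested dict."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if a level is missing
    return node


config = json.loads(CONFIG_TEXT)
for key in ("eval.is_flat", "eval.model_dir", "data.test_json_files"):
    print(key, "=", get_by_path(config, key))
```

Checking the three keys this way before launching an evaluation run can catch a misspelled or missing entry early.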


Prediction

    python predict.py conf/train.json data/predict.json

  • predict.json should be in JSON format, and each instance needs a dummy label such as "其他" ("other") or any other label in the label map.
  • eval.model_dir specifies the model used for prediction.
  • eval.top_k is the number of labels to output.
  • eval.threshold is the probability threshold.

Predictions are written to predict.txt.
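To illustrate how eval.top_k and eval.threshold might interact, here is a small sketch of one plausible selection rule: rank labels by probability, keep at most top_k of them, and drop any below the threshold. This is an assumption for illustration, not the toolkit's confirmed logic. The function `select_labels` and the probabilities are hypothetical.

```python
def select_labels(probabilities, top_k, threshold):
    """Return up to top_k (label, probability) pairs with probability >= threshold.

    Assumed rule: rank by probability, truncate to top_k, then apply the threshold.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [(label, prob) for label, prob in ranked[:top_k] if prob >= threshold]


# Hypothetical per-label probabilities for one document.
probs = {
    "Computer--MachineLearning": 0.91,
    "Neuro--ComputationalNeuro": 0.62,
    "其他": 0.08,
}

print(select_labels(probs, top_k=2, threshold=0.5))
print(select_labels(probs, top_k=3, threshold=0.7))
```

Under this rule, a larger top_k never adds labels that fall below the threshold, and a lower threshold never returns more than top_k labels.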

Input Data Format

JSON example:

    {
        "doc_label": ["Computer--MachineLearning--DeepLearning", "Neuro--ComputationalNeuro"],
        "doc_token": ["I", "love", "deep", "learning"],
        "doc_keyword": ["deep learning"],
        "doc_topic": ["AI", "Machine learning"]
    }

"doc_keyword" and "doc_topic" are optional.

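The format above can be produced and sanity-checked with a few lines of Python. This is a minimal sketch: the helpers `make_instance`, `validate_instance`, and `label_path` are hypothetical, the treatment of "doc_label" and "doc_token" as required is inferred from the note that the other two fields are optional, and splitting labels on "--" is inferred from the example label.

```python
import json

REQUIRED_FIELDS = ("doc_label", "doc_token")      # assumed required
OPTIONAL_FIELDS = ("doc_keyword", "doc_topic")    # stated as optional


def make_instance(labels, tokens, keywords=None, topics=None):
    """Build one instance dict, omitting optional fields that are not given."""
    instance = {"doc_label": list(labels), "doc_token": list(tokens)}
    if keywords is not None:
        instance["doc_keyword"] = list(keywords)
    if topics is not None:
        instance["doc_topic"] = list(topics)
    return instance


def validate_instance(instance):
    """Raise ValueError if a required field is missing or a field is not a list."""
    for field in REQUIRED_FIELDS:
        if field not in instance:
            raise ValueError(f"missing required field: {field}")
    for field in REQUIRED_FIELDS + OPTIONAL_FIELDS:
        if field in instance and not isinstance(instance[field], list):
            raise ValueError(f"field must be a list: {field}")


def label_path(label, separator="--"):
    """Split a hierarchical label into its levels, root first."""
    return label.split(separator)


instance = make_instance(
    labels=["Computer--MachineLearning--DeepLearning", "Neuro--ComputationalNeuro"],
    tokens=["I", "love", "deep", "learning"],
    keywords=["deep learning"],
    topics=["AI", "Machine learning"],
)
validate_instance(instance)
print(json.dumps(instance, ensure_ascii=False))
print(label_path(instance["doc_label"][0]))
```

Passing `ensure_ascii=False` keeps non-ASCII labels such as "其他" readable in the serialized output.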

0. Dataset

1. Comparison with the state of the art

2. Different text encoders

3. Hierarchical vs Flat


Our toolkit draws on several publicly available codebases.


  • 2019-04-29, initial version


Name with owner: Tencent/NeuralNLP-NeuralClassifier
Primary language: Python
Programming languages: Python (language count: 1)
Platform: Linux, Mac, Windows
Release count: 0
Created at: 2019-07-04 08:16:11
Pushed at: 2023-09-06 06:35:08
Last commit at: 2023-09-06 14:35:08
Stargazers count: 1.8k
Watchers count: 66
Fork count: 400
Commits count: 36
Issues count: 105
Open issues count: 4
Pull requests count: 12
Open pull requests count: 0
Closed pull requests count: 6