ranking

Learning to Rank in TensorFlow

TensorFlow Ranking

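TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the
TensorFlow platform. It contains the following components:

*   Commonly used loss functions, including pointwise, pairwise, and listwise
    losses.
*   Commonly used ranking metrics such as Mean Reciprocal Rank (MRR) and
    Normalized Discounted Cumulative Gain (NDCG).
*   Multi-item (also known as groupwise) scoring functions.
*   LambdaLoss implementation for direct ranking metric optimization.
*   Unbiased Learning-to-Rank from biased feedback data.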

We envision that this library will provide a convenient open platform for
hosting and advancing state-of-the-art ranking models based on deep learning
techniques, and thus facilitate both academic research and industrial
applications.
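
To make the learning-to-rank setting concrete, the following is a minimal sketch of NDCG, a widely used ranking metric, in plain Python. It is an illustration only and not TF-Ranking's implementation; the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and the sketch assumes the common exponential gain `2^rel - 1` with a `log2` position discount.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the top-k items, taken in ranked order."""
    return sum(
        (2.0 ** rel - 1.0) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG divided by the DCG of the ideal (descending-label) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Graded relevance labels of documents, in the order a model ranked them.
ranked_labels = [3, 2, 3, 0, 1, 2]
print(f"NDCG@5 = {ndcg_at_k(ranked_labels, 5):.4f}")
```

A perfectly ordered list scores 1.0, and a list with no relevant documents is defined here to score 0.0.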

Tutorial Slides

TF-Ranking was presented at two premier Information Retrieval conferences,
SIGIR 2019 and ICTIR 2019. The slides are available
here.

Demos

We provide a demo, with no installation required, to help you get started with
TF-Ranking. The demo, "Using sparse features and embeddings in TF-Ranking",
runs in a Colaboratory (Google Colab) notebook, an interactive Python
environment. It shows how to:

*   Use sparse/embedding features
*   Process data in TFRecord format
*   Integrate TensorBoard in a Colab notebook, for the Estimator API

Also see Running Scripts for executable scripts.

Linux Installation

Stable Builds

To install the latest version from
PyPI, run the following:

# Installing with the `--upgrade` flag ensures you'll get the latest version.
pip install --user --upgrade tensorflow_ranking

To force a Python 3-specific install, replace pip with pip3 in the above
command. For additional installation help, guidance on installing
prerequisites, and (optionally) setting up virtual environments, see the
TensorFlow installation guide.

Note: TensorFlow is now included as a dependency of the TensorFlow Ranking
package (in setup.py). If you wish to use a different version of TensorFlow
(e.g., tensorflow-gpu), you may need to uninstall the existing version and
then install your desired version:

$ pip uninstall tensorflow
$ pip install tensorflow-gpu
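
After swapping TensorFlow builds, it can help to confirm which distributions are actually installed. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_versions` is hypothetical, and the distribution names are the ones used in the pip commands above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names taken from the pip commands in this README.
PACKAGES = ("tensorflow", "tensorflow-gpu", "tensorflow_ranking")


def installed_versions(names: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

The check reads package metadata only, so it does not import TensorFlow and runs quickly.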

Installing from Source

  1. To build TensorFlow Ranking locally, you will need to install:

    • Bazel, an open
      source build tool.

      $ sudo apt-get update && sudo apt-get install bazel
      
    • Pip, a Python package manager.

      $ sudo apt-get install python-pip
      
    • VirtualEnv, a tool
      to create isolated Python environments.

      $ pip install --user virtualenv
      
  2. Clone the TensorFlow Ranking repository.

    $ git clone https://github.com/tensorflow/ranking.git
    
  3. Build the TensorFlow Ranking wheel file and store it in the
    /tmp/ranking_pip folder.

    $ cd ranking  # The folder which was cloned in Step 2.
    $ bazel build //tensorflow_ranking/tools/pip_package:build_pip_package
    $ bazel-bin/tensorflow_ranking/tools/pip_package/build_pip_package /tmp/ranking_pip
    
  4. Install the wheel package using pip. Test it in a virtualenv to avoid
    clashes with any system dependencies.

    $ ~/.local/bin/virtualenv -p python3 /tmp/tfr
    $ source /tmp/tfr/bin/activate
    (tfr) $ pip install /tmp/ranking_pip/tensorflow_ranking*.whl
    

    In some cases, you may want to install a specific version of TensorFlow,
    e.g., tensorflow-gpu or tensorflow==2.0.0. To do so, run either

    (tfr) $ pip uninstall tensorflow
    (tfr) $ pip install tensorflow==2.0.0
    

    or

    (tfr) $ pip uninstall tensorflow
    (tfr) $ pip install tensorflow-gpu
    
  5. Run all TensorFlow Ranking tests.

    (tfr) $ bazel test //tensorflow_ranking/...
    
  6. Invoke the TensorFlow Ranking package in Python (within the virtualenv).

    (tfr) $ python -c "import tensorflow_ranking"
    

Running Scripts

For ease of experimentation, we also provide a TFRecord example and a LIBSVM
example in the form of executable scripts. These are particularly useful for
hyperparameter tuning, where the hyperparameters are supplied as flags to the
script.
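
As an illustration of flag-driven tuning, the following sketch builds one command line per `--num_train_steps` value for the LIBSVM script. The binary path and flag names are the ones used in the LIBSVM example in this README; the helper `sweep_commands`, the `/tmp/libsvm_sweep` root, and the per-run directory layout are assumptions made for the example. The sketch only prints the commands and does not run them.

```python
import shlex

# Path of the built LIBSVM example binary, as used in this README.
BINARY = "./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary"

# Flags held fixed across the sweep, taken from the LIBSVM example.
FIXED_FLAGS = {
    "train_path": "tensorflow_ranking/examples/data/train.txt",
    "vali_path": "tensorflow_ranking/examples/data/vali.txt",
    "test_path": "tensorflow_ranking/examples/data/test.txt",
    "num_features": 136,
}


def sweep_commands(step_counts, output_root="/tmp/libsvm_sweep"):
    """Return one shell command string per --num_train_steps value."""
    commands = []
    for steps in step_counts:
        flags = dict(FIXED_FLAGS)
        flags["num_train_steps"] = steps
        # Hypothetical layout: a separate output directory for each run.
        flags["output_dir"] = f"{output_root}/steps_{steps}"
        argv = [BINARY] + [f"--{name}={value}" for name, value in flags.items()]
        commands.append(shlex.join(argv))  # quotes arguments where needed
    return commands


for command in sweep_commands([100, 500, 1000]):
    print(command)
```

Writing each run to its own output directory keeps the results of different settings from overwriting one another.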

TFRecord Example

  1. Set up the data and directory.

    MODEL_DIR=/tmp/tf_record_model && \
    TRAIN=tensorflow_ranking/examples/data/train_elwc.tfrecord && \
    EVAL=tensorflow_ranking/examples/data/eval_elwc.tfrecord && \
    VOCAB=tensorflow_ranking/examples/data/vocab.txt
    
  2. Build and run.

    rm -rf $MODEL_DIR && \
    bazel build -c opt \
    tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary && \
    ./bazel-bin/tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary \
    --train_path=$TRAIN \
    --eval_path=$EVAL \
    --vocab_path=$VOCAB \
    --model_dir=$MODEL_DIR \
    --data_format=example_list_with_context
    

LIBSVM Example

  1. Set up the data and directory.

    OUTPUT_DIR=/tmp/libsvm && \
    TRAIN=tensorflow_ranking/examples/data/train.txt && \
    VALI=tensorflow_ranking/examples/data/vali.txt && \
    TEST=tensorflow_ranking/examples/data/test.txt
    
  2. Build and run.

    rm -rf $OUTPUT_DIR && \
    bazel build -c opt \
    tensorflow_ranking/examples/tf_ranking_libsvm_py_binary && \
    ./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary \
    --train_path=$TRAIN \
    --vali_path=$VALI \
    --test_path=$TEST \
    --output_dir=$OUTPUT_DIR \
    --num_features=136 \
    --num_train_steps=100
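
For readers unfamiliar with the input format, the following is a minimal sketch of parsing LIBSVM-style ranking data in plain Python. It assumes the common layout `<label> qid:<id> <index>:<value> ...` with 1-based feature indices; it is not the parser used by the example script, and the names `parse_libsvm_line` and `group_by_query` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_libsvm_line(line: str, num_features: int) -> Tuple[str, float, List[float]]:
    """Parse one line into (query id, relevance label, dense feature vector)."""
    tokens = line.split("#", 1)[0].split()  # drop an optional trailing comment
    label = float(tokens[0])
    key, qid = tokens[1].split(":", 1)
    if key != "qid":
        raise ValueError(f"expected 'qid:<id>' as second token, got {tokens[1]!r}")
    features = [0.0] * num_features  # indices absent from the line stay 0.0
    for token in tokens[2:]:
        index_text, value_text = token.split(":", 1)
        index = int(index_text)
        if not 1 <= index <= num_features:
            raise ValueError(f"feature index {index} outside 1..{num_features}")
        features[index - 1] = float(value_text)  # convert 1-based to 0-based
    return qid, label, features


def group_by_query(lines: Iterable[str], num_features: int) -> Dict[str, list]:
    """Group (label, features) pairs by query id, preserving input order."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():
            qid, label, features = parse_libsvm_line(line, num_features)
            groups[qid].append((label, features))
    return dict(groups)


sample = [
    "2 qid:1 1:0.5 3:0.25",
    "0 qid:1 2:1.0",
    "1 qid:2 1:0.1 2:0.2 3:0.3",
]
for qid, docs in group_by_query(sample, num_features=3).items():
    print(qid, docs)
```

Grouping by query id matters because ranking losses and metrics are computed over the list of documents belonging to one query, not over individual rows.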
    

TensorBoard

Training results such as loss and metrics can be visualized using
TensorBoard.

  1. (Optional) If you are working on a remote server, set up port forwarding
    with this command.

    $ ssh <remote-server> -L 8888:127.0.0.1:8888
    
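  2. Install TensorBoard and invoke it with the following commands. TensorBoard
    serves on port 6006 by default, so pass --port=8888 to match the port
    forwarded in step 1.

    (tfr) $ pip install tensorboard
    (tfr) $ tensorboard --logdir $OUTPUT_DIR --port=8888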
    

Jupyter Notebook

An example Jupyter notebook is available in
third_party/tensorflow_ranking/examples/handling_sparse_features.ipynb.

  1. To run this notebook, first follow the steps in Linux Installation to set
    up a virtualenv environment with the tensorflow_ranking package installed.

  2. Install Jupyter within the virtualenv.

    (tfr) $ pip install jupyter
    
  3. Start a Jupyter notebook instance on the remote server.

    (tfr) $ jupyter notebook third_party/tensorflow_ranking/examples/handling_sparse_features.ipynb \
            --NotebookApp.allow_origin='https://colab.research.google.com' \
            --port=8888
    
  4. (Optional) If you are working on a remote server, set up port forwarding
    with this command.

    $ ssh <remote-server> -L 8888:127.0.0.1:8888
    
  5. Run the notebook.

    • Open Jupyter on your local machine at http://localhost:8888/ and browse
      to the notebook.

    • Alternatively, open the notebook in the browser with Colaboratory via
      colab.research.google.com. Choose the local runtime and connect to port
      8888.
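
If the page at http://localhost:8888/ does not load, a quick way to tell whether the forwarded port is reachable at all is to try a TCP connection. The following is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and port 8888 is the one used in the commands above.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8888, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


print("port 8888 reachable:", is_port_open())
```

A result of False suggests that either the notebook server is not running or the ssh port forwarding is not active.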

References

  • Rama Kumar Pasumarthi, Sebastian Bruch, Xuanhui Wang, Cheng Li, Michael
    Bendersky, Marc Najork, Jan Pfeifer, Nadav Golbandi, Rohan Anil, Stephan
    Wolf. TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank.
    KDD 2019.

  • Qingyao Ai, Xuanhui Wang, Sebastian Bruch, Nadav Golbandi, Michael
    Bendersky, Marc Najork. Learning Groupwise Scoring Functions Using Deep
    Neural Networks.
    ICTIR 2019.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. Learning
    to Rank with Selection Bias in Personal Search.
    SIGIR 2016.

  • Xuanhui Wang, Cheng Li, Nadav Golbandi, Michael Bendersky, Marc Najork. The
    LambdaLoss Framework for Ranking Metric Optimization.
    CIKM 2018.

Citation

If you use TensorFlow Ranking in your research and would like to cite it, we
suggest you use the following citation:

@inproceedings{TensorflowRankingKDD2019,
   author = {Rama Kumar Pasumarthi and Sebastian Bruch and Xuanhui Wang and Cheng Li and Michael Bendersky and Marc Najork and Jan Pfeifer and Nadav Golbandi and Rohan Anil and Stephan Wolf},
   title = {TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank},
   booktitle = {Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
   year = {2019},
   pages = {2970--2978},
   location = {Anchorage, AK}
}
