implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets

This project provides fast Python implementations of several popular recommendation algorithms for
implicit feedback datasets, including the ALS and BPR models.

All models have multi-threaded training routines, using Cython and OpenMP to fit the models in
parallel among all available CPU cores. In addition, the ALS and BPR models both have custom CUDA
kernels, enabling fitting on compatible GPUs. Approximate nearest neighbours libraries such as Annoy, NMSLIB
and Faiss can also be used by Implicit to speed up making recommendations.
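
To give a sense of what the approximate nearest neighbours libraries speed up: for a factor model such as ALS, recommending items amounts to ranking every item by the inner product of its factor vector with the user's factor vector. The sketch below does that search exactly with NumPy. It does not call Implicit itself; the function name top_k_items and the random factors are hypothetical and only illustrate the brute-force step that an ANN index would approximate.

```python
import numpy as np

# Hypothetical learned factors: 1000 items and one user, 50 factors each.
rng = np.random.default_rng(0)
item_factors = rng.normal(size=(1000, 50))
user_vector = rng.normal(size=50)


def top_k_items(user_vector, item_factors, k=10):
    """Exact top-k items by inner product (illustrative helper, not part of Implicit)."""
    scores = item_factors @ user_vector
    # argpartition selects the k largest scores without sorting all of them,
    top = np.argpartition(scores, -k)[-k:]
    # then only those k are sorted, highest score first.
    top = top[np.argsort(scores[top])[::-1]]
    return top, scores[top]


ids, scores = top_k_items(user_vector, item_factors, k=10)
print(list(zip(ids.tolist(), scores.round(3).tolist())))
```

The cost of this exact search grows linearly with the number of items, which is the step an approximate index trades a little accuracy to shorten.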

To install:

pip install implicit

Basic usage:

import implicit

# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=50)

# train the model on a sparse matrix of item/user/confidence weights
model.fit(item_user_data)

# recommend items for a user
user_items = item_user_data.T.tocsr()
recommendations = model.recommend(userid, user_items)

# find related items
related = model.similar_items(itemid)

The examples folder has a program showing how to use this to compute similar artists on the
last.fm dataset.

For more information see the documentation.
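
The basic usage above assumes item_user_data already exists. A minimal sketch of one way to build it with SciPy follows, assuming the raw data is a set of (user, item, count) triples with integer indices. The arrays here are made-up sample values, and using raw counts as the confidence weights is an assumption rather than a recommendation.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical interaction data: user index, item index, and how often the
# user interacted with the item (used here as the confidence weight).
users = np.array([0, 0, 1, 2, 2, 2])
items = np.array([0, 1, 1, 0, 2, 3])
counts = np.array([5, 1, 3, 2, 8, 1], dtype=np.float64)

n_items = int(items.max()) + 1
n_users = int(users.max()) + 1

# Rows are items and columns are users, matching the item/user layout that
# the usage example passes to fit. Duplicate (item, user) pairs are summed
# when the COO matrix is converted to CSR.
item_user_data = coo_matrix(
    (counts, (items, users)), shape=(n_items, n_users)
).tocsr()

# The user/item transpose used when asking for recommendations.
user_items = item_user_data.T.tocsr()

print(item_user_data.shape, user_items.shape)
```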

Articles about Implicit

These blog posts describe the algorithms that power this library:

There are also several other blog posts about using Implicit to build recommendation systems:

Requirements

This library requires SciPy version 0.16 or later. Running on OSX requires an OpenMP compiler,
which can be installed with Homebrew: brew install gcc. Running on Windows requires Python
3.5+.

GPU support requires at least version 8 of the NVIDIA CUDA Toolkit. The build will use the nvcc compiler
that is found on the path, but this can be overridden by setting the CUDAHOME environment variable
to point to your CUDA installation. Note that the GPU extensions are not included in the version
on conda-forge.
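
Before building with GPU support it can help to check which nvcc would be picked up. The sketch below mirrors the lookup order described above (CUDAHOME first, then the path). It is not the project's build code: the function find_nvcc and the assumption that the compiler lives at CUDAHOME/bin/nvcc are illustrative only.

```python
import os
import shutil


def find_nvcc(env=None):
    """Return a path to nvcc, preferring CUDAHOME over PATH, or None.

    Illustrative helper only; assumes the compiler sits in <CUDAHOME>/bin/nvcc
    (on Windows the file would be nvcc.exe, which this sketch does not handle).
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDAHOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate):
            return candidate
    # Fall back to searching the directories listed in PATH.
    return shutil.which("nvcc", path=env.get("PATH", ""))


print(find_nvcc() or "nvcc not found: install the CUDA Toolkit or set CUDAHOME")
```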

This library has been tested with Python 2.7, 3.5, 3.6 and 3.7 on Ubuntu and OSX, and tested with
Python 3.5 and 3.6 on Windows.

Benchmarks

Simple benchmarks comparing the ALS fitting time versus Spark and QMF can be found here.

Optimal Configuration

I'd recommend configuring SciPy to use Intel's MKL matrix libraries. One easy way of doing this is by installing the Anaconda Python distribution.

For systems using OpenBLAS, I highly recommend setting 'export OPENBLAS_NUM_THREADS=1'. This
disables its internal multithreading ability, which leads to substantial speedups for this
package. Likewise, for Intel MKL, 'export MKL_NUM_THREADS=1' should be set.
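
When exporting the variables in the shell is not convenient (for example in a notebook), the same settings can be applied from Python. This is a sketch that assumes the BLAS library reads these variables when it is first loaded, so they are set before NumPy, SciPy or implicit are imported.

```python
import os

# Set these before importing numpy, scipy or implicit: the BLAS libraries
# are assumed to read the thread settings when they are first loaded.
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

# import implicit  # import only after the variables above are in place

print(os.environ["OPENBLAS_NUM_THREADS"], os.environ["MKL_NUM_THREADS"])
```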

Released under the MIT License

Key metrics

Overview
Name and owner: benfred/implicit
Primary language: Python
Languages: Python (6 languages in total)
License: MIT License

Owner activity
Created: 2016-04-17 03:45:23
Last pushed: 2024-07-11 17:58:17
Releases: 21
Latest release: v0.7.2 (released 2023-09-29 13:43:02)
First release: v0.3.7 (released 2018-09-21 22:31:29)

User engagement
Stars: 3.7k
Watchers: 76
Forks: 621
Commits: 435
Issues: 499
Open issues: 91
Pull requests: 195
Open pull requests: 9
Closed pull requests: 28