tract

Tiny, no-nonsense, self-contained, TensorFlow and ONNX inference

rustc >= 1.39.0
MIT/Apache 2

Snips' tiny TensorFlow and ONNX inference engine.

This project used to be called tfdeploy, or Tensorflow-deploy-rust.

What?

tract is a TensorFlow- and ONNX-compatible inference library. It loads a
TensorFlow or ONNX frozen model from the regular protobuf format, and flows
data through it.

Quick start

Real-time streaming support

This is semi-experimental support for real-time applications like voice
processing. In many real-time voice applications, processing must happen "as you
go". One cannot wait for the end of the incoming audio signal to start
decoding.

While Kaldi has built its inference engine around this streaming constraint,
our approach to the same issue is a bit different. tract's graph analyser and
optimiser reason on "streamed" tensors in order to generate an equivalent
stateful "pulsing" network that propagates small time slices ("pulses") of
data. As a result, optimisation efforts on the pulsing and "finite" tensor
modes benefit each other.
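The idea of an equivalent stateful pulsing operator can be illustrated with a toy 1-D convolution. The sketch below is not tract's implementation or API: `conv_full`, `PulsedConv` and `push_pulse` are hypothetical names, and it uses only the Rust standard library. It shows a "finite" convolution over a whole signal, and a stateful version that consumes the signal pulse by pulse while keeping just enough history to produce the same output.

```rust
/// "Finite" mode: valid 1-D convolution (no padding) over a whole signal.
fn conv_full(signal: &[f32], kernel: &[f32]) -> Vec<f32> {
    assert!(!kernel.is_empty(), "kernel must not be empty");
    if signal.len() < kernel.len() {
        return Vec::new();
    }
    signal
        .windows(kernel.len())
        .map(|w| w.iter().zip(kernel).map(|(x, k)| x * k).sum::<f32>())
        .collect()
}

/// "Pulsing" mode (hypothetical sketch): same operator, fed one pulse at a time.
struct PulsedConv {
    kernel: Vec<f32>,
    /// Up to `kernel.len() - 1` samples carried over from earlier pulses.
    history: Vec<f32>,
}

impl PulsedConv {
    fn new(kernel: Vec<f32>) -> Self {
        assert!(!kernel.is_empty(), "kernel must not be empty");
        PulsedConv { kernel, history: Vec::new() }
    }

    /// Consume one pulse and return the outputs that can already be computed.
    fn push_pulse(&mut self, pulse: &[f32]) -> Vec<f32> {
        // Prepend the samples remembered from previous pulses.
        let mut buffer = std::mem::take(&mut self.history);
        buffer.extend_from_slice(pulse);
        let out = conv_full(&buffer, &self.kernel);
        // Keep only what the next pulse needs to complete its first windows.
        let keep = (self.kernel.len() - 1).min(buffer.len());
        self.history = buffer[buffer.len() - keep..].to_vec();
        out
    }
}

fn main() {
    let signal: Vec<f32> = (0..10).map(|i| i as f32).collect();
    let kernel = vec![0.5, 0.25, 0.25];

    let expected = conv_full(&signal, &kernel);

    let mut pulsed = PulsedConv::new(kernel.clone());
    let mut streamed = Vec::new();
    for pulse in signal.chunks(4) {
        streamed.extend(pulsed.push_pulse(pulse));
    }

    // The pulsed operator yields the same values as the finite one.
    assert_eq!(streamed, expected);
    println!("{:?}", streamed);
}
```

The only state carried between pulses is a short history buffer, which is what makes this kind of operator a natural fit for streaming.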

Obviously, this conversion only makes sense for a subset of operators, so not
all networks can be converted to a pulse network: for instance, an aggregation
(like a SoftMax) on the time dimension can only be given a value when the
signal has been processed up to the end.
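A small standard-library sketch makes the SoftMax case concrete. The `softmax` helper and the sample values are assumptions for illustration only, not tract code. The value assigned to the first time step keeps changing as more frames arrive, so no final output can be emitted before the end of the signal.

```rust
/// Numerically stable softmax over a slice (here: over the time dimension).
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|x| (x - max).exp()).collect();
    let total: f32 = exps.iter().sum();
    exps.iter().map(|e| e / total).collect()
}

fn main() {
    // Hypothetical per-frame scores arriving over time.
    let signal = [1.0_f32, 2.0, 0.5, 3.0];
    for end in 1..=signal.len() {
        // Softmax of everything seen so far: the first entry is not stable,
        // because the normalising sum depends on frames not yet received.
        let first = softmax(&signal[..end])[0];
        println!("after {} frame(s), softmax[0] = {:.4}", end, first);
    }
}
```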

Status and compatibility

ONNX

As of today (October 2019), tract successfully passes about 85% of the ONNX
backend tests. All "real life" integration tests in the ONNX test suite pass:
bvlc_alexnet, densenet121, inception_v1, inception_v2, resnet50, shufflenet,
squeezenet, vgg19, zfnet512.

The following operators are implemented and tested.

Abs, Acos, Acosh, Add, And, ArgMax, ArgMin, Asin, Asinh, Atan, Atanh, AveragePool, BatchNormalization, Cast, CategoryMapper, Ceil, Clip, Compress, Concat, Constant, ConstantLike, ConstantOfShape, Conv, Cos, Cosh, DequantizeLinear, Div, Dropout, Elu, Equal, Erf, Exp, Expand, EyeLike, Flatten, Floor, GRU, Gather, Gemm, GlobalAveragePool, GlobalLpPool, GlobalMaxPool, Greater, HardSigmoid, Hardmax, Identity, IsNaN, LRN, LSTM, LeakyRelu, Less, Log, LogSoftmax, MatMul, Max, MaxPool, Mean, Min, Mul, Neg, Not, Or, PRelu, Pad, ParametricSoftplus, Pow, QuantizeLinear, RNN, Reciprocal, ReduceL1, ReduceL2, ReduceLogSum, ReduceLogSumExp, ReduceMax, ReduceMean, ReduceMin, ReduceProd, ReduceSum, ReduceSumSquare, Relu, Reshape, Rsqrt, ScaledTanh, Scan, Selu, Shape, Shrink, Sigmoid, Sign, Sin, Sinh, Size, Slice, Softmax, Softplus, Softsign, Split, Sqrt, Squeeze, Sub, Sum, Tan, Tanh, ThresholdedRelu, Tile, Transpose, Unsqueeze, Where, Xor

We test these operators against ONNX 1.4.1 (operator set 9) and ONNX 1.5.0
(operator set 10).

TensorFlow

Although tract is very far from supporting arbitrary models, it can run
Google Inception v3 and Snips wake word models. Missing operators are easy
to add. The lack of an easy-to-reuse test suite and the wide diversity of
operators in TensorFlow make full support difficult to target.

The following operators are implemented and tested:

Abs, Add, AddN, AddV2, Assign, AvgPool, BatchToSpaceND, BiasAdd, BlockLSTM, Cast, Ceil, ConcatV2, Const, Conv2D, DepthwiseConv2dNative, Div, Enter, Equal, Exit, ExpandDims, FakeQuantWithMinMaxVars, Fill, FloorMod, FusedBatchNorm, GatherNd, GatherV2, Greater, GreaterEqual, Identity, Less, LessEqual, Log, LogicalAnd, LogicalOr, LoopCond, MatMul, Max, MaxPool, Maximum, Mean, Merge, Min, Minimum, Mul, Neg, NoOp, Pack, Pad, Placeholder, Pow, Prod, RandomUniform, RandomUniformInt, Range, RealDiv, Relu, Relu6, Reshape, Rsqrt, Shape, Sigmoid, Slice, Softmax, SpaceToBatchND, Squeeze, StridedSlice, Sub, Sum, Switch, Tanh, Tile, Transpose, VariableV2

TensorFlow-Lite

TensorFlow-Lite is a TensorFlow subproject that also focuses on inference on
smaller devices. It uses a precompiler to transform a TensorFlow network to
its own format. It only supports a subset of operators from TensorFlow though,
and is only optimised for devices with Arm Neon support.

tract supports a wider subset of TensorFlow operators, and has been optimised
for previous-generation CPUs (ARM VFP), also targeting devices in the
Raspberry Pi Zero family.

Examples of supported networks

These models, among others, are used to track tract performance evolution as
part of the Continuous Integration jobs. See .travis/README.md and
.travis/bundle-entrypoint.sh for more information.

Keyword spotting on Arm Cortex-M Microcontrollers

https://github.com/ARM-software/ML-KWS-for-MCU

ARM demonstrated the capabilities of the Cortex-M family by providing
tutorials and pre-trained models for keyword spotting. While the exercise
is ultimately meant for micro-controllers, tract can run the intermediate
TensorFlow models.

For instance, on a Raspberry Pi Zero, the "CNN M" model runs in about 70
micro-seconds, and 11 micro-seconds on a Raspberry Pi 3.

Snips wake word models

https://arxiv.org/abs/1811.07684

Snips uses tract to run its wake word detectors. While earlier models were
class-based and did not require any special treatment, tract's pulsing
capabilities made it possible to run WaveNet models efficiently enough for a
Raspberry Pi Zero.

Inception v3

| Device            | Family       | TensorFlow-lite | tract |
|-------------------|--------------|-----------------|-------|
| Raspberry Pi Zero | Armv6 VFP    | 113s            | 39s   |
| Raspberry Pi 2    | Armv7 NEON   | 25s             | 7s    |
| Raspberry Pi 3    | aarch32 NEON | 5s              | 5s    |

Notes:

  • while the Raspberry Pi 3 is an Armv8 device, this bench runs
    on Raspbian, an armv6 operating system, which cripples the performance
    of both benches
  • other benches on the internet show better performance results
    for TensorFlow (not -Lite) on the Pi 3.
    They use all four cores of the device. Both TensorFlow-Lite and tract
    have been made to run on a single core here.

Roadmap

One important guiding cross-concern: this library must cross-compile as
easily as practical to small-ish devices (think $20 boards).

License

Note: files in the tensorflow/protos directory are copied from the
TensorFlow project and are not covered by the following licence statement.

Note: files in the onnx/protos directory are copied from the
ONNX project and are not covered by the following licence statement.

Apache 2.0/MIT

All original work is licensed under either of the Apache License,
Version 2.0, or the MIT license.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall
be dual licensed as above, without any additional terms or conditions.
