Recognizers-Text

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, and date/time expressed in multiple languages (full support for CN, EN, FR, ES, PT, DE; partial support for NL, JA, KO). Contributions are greatly welcome! Packages are available at https://www.nuget.org/profiles/Recognizers.Text and https://www.npmjs.com/~recognizers.text

Microsoft Recognizers Text Overview

Microsoft.Recognizers.Text provides robust recognition and resolution of entities such as numbers, units, and date/time expressions in multiple languages. It fully supports Chinese, English, French, Spanish, Portuguese, German, Italian, Turkish, and Hindi, and partially supports Dutch, Japanese, Korean, and Swedish. More are on the way.
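
To make "recognition and resolution" concrete, here is a toy sketch in Python. It is not the library's implementation or API: the `recognize_small_numbers` function, the `NUMBER_WORDS` table, and the result field names are all hypothetical, chosen only to illustrate finding an entity's span in text (recognition) and mapping it to a normalized value (resolution).

```python
import re

# Hypothetical lookup table covering only a handful of English number words.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# Match a run of ASCII digits or one of the known number words,
# anchored on word boundaries so "someone" does not match "one".
_PATTERN = re.compile(
    r"\b([0-9]+|" + "|".join(NUMBER_WORDS) + r")\b",
    re.IGNORECASE,
)


def recognize_small_numbers(text):
    """Return one dict per number found: its text span and resolved value."""
    results = []
    for match in _PATTERN.finditer(text):
        token = match.group(1)
        # Resolution step: turn the surface form into a normalized value.
        value = int(token) if token.isdigit() else NUMBER_WORDS[token.lower()]
        results.append({
            "text": token,
            "start": match.start(1),
            "end": match.end(1) - 1,  # inclusive end index (illustrative choice)
            "type_name": "number",
            "resolution": {"value": str(value)},
        })
    return results


if __name__ == "__main__":
    for item in recognize_small_numbers("I have two apples and 15 pears"):
        print(item["text"], "->", item["resolution"]["value"])
```

A real recognizer has to handle compound numbers, ordinals, units, and per-language rules, which is where the packages described here come in.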

Utilizing the Project

Microsoft.Recognizers.Text powers pre-built entities in both LUIS (Language Understanding Intelligent Service) and Microsoft Bot Framework, as well as base entity types in the Text Analytics Cognitive Service. It is also available as standalone packages (for the base classes and the different entity recognizers).

The Microsoft.Recognizers.Text packages currently target four platforms:

Contributions are greatly welcome, both for fixes and extensions in the currently supported languages and for expansion to new ones, especially Dutch, Japanese, Korean, Hindi, and others. More info below.

Help

If you have any questions, please go ahead and open an issue, even if it's not an actual bug. Issues are an acceptable discussion forum as well.

Contributing

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Good starting points for contribution are:

  • the list of open issues (especially those marked as help wanted);
  • the JSON spec cases temporarily marked as NotSupported (Specs); and
  • translating JSON test spec cases that work in English but don't yet exist in a target language.
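
As a starting aid for the spec-related bullets above, the sketch below filters spec cases that carry a `NotSupported` marker. The layout is an assumption made for illustration: it supposes each spec file is a JSON array of objects with an `Input` string and an optional `NotSupported` string of comma-separated platform names. Check the actual spec files in the repository before relying on these field names. The `find_not_supported` helper and the sample data are hypothetical.

```python
import json


def find_not_supported(spec_json, platform=None):
    """Return the inputs of spec cases flagged NotSupported.

    spec_json: text of a spec file, assumed to be a JSON array of objects.
    platform:  optional platform name; when given, only cases whose
               NotSupported list mentions it are returned.
    """
    flagged = []
    for case in json.loads(spec_json):
        marker = case.get("NotSupported")
        if not marker:
            continue  # field absent or empty: nothing is flagged for this case
        platforms = [p.strip().lower() for p in marker.split(",")]
        if platform is None or platform.lower() in platforms:
            flagged.append(case.get("Input", ""))
    return flagged


# Made-up sample data in the assumed layout, not taken from the real specs.
SAMPLE = """
[
  {"Input": "two hundred", "Results": []},
  {"Input": "a dozen", "NotSupported": "javascript, python", "Results": []},
  {"Input": "half a million", "NotSupported": "java", "Results": []}
]
"""

if __name__ == "__main__":
    print(find_not_supported(SAMPLE, "python"))
```

Pointing a helper like this at a spec file's contents gives a quick list of candidate cases to pick up for a given platform.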

The links below describe the project structure and provide both an overview and tips on how to contribute (although some steps may have become a little out-of-date). Thank you!

Supported Entities across Cultures

The table below summarizes the currently supported entities. Support for English is usually more complete than others. The primary platform is .NET (shown in table) and support should propagate to the others.

| Entity Type | EN | ZH-CN | NL | FR | DE | IT | JA | KO | PT | ES |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Number (cardinal) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Ordinal | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | SO | ✓ | ✓ |
| Percentage | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | SO | ✓ | ✓ |
| Number Range | ✓ | ✓ | PA | :x: | :x: | PA | :x: | :x: | :x: | PA |
| Unit - Age | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | SO | ✓ | ✓ |
| Unit - Currency | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | SP | ✓ | ✓ |
| Unit - Dimensions | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | :x: | SP | ✓ | ✓ |
| Unit - Temperature | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | :x: | SP | ✓ | ✓ |
| Choice - Boolean | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | SO | ✓ | ✓ |
| Seq. - E-mail | G | G* | G | G | G | G | G* | G* | G | G |
| Seq. - GUID | G | G | G | G | G | G | G | G | G | G |
| Seq. - Social | G | G | G | G | G | G | G | G | G | G |
| Seq. - IP Address | G | G | G | G | G | G | G | G | G | G |
| Seq. - Phone Number | G | G | G | G | G | G | G | G | G | G |
| Seq. - URL | G | G* | G | G | G | G | G* | G* | G | G |
| DateTime (+subtypes) | ✓ | ✓ | PA | ✓ | ✓ | ✓ | SP | SP | ✓ | ✓ |

| Entity Type | SV | BG | TR | HI | AR |
|:---:|:---:|:---:|:---:|:---:|:---:|
| Number (cardinal) | ✓ | :x: | ✓ | ✓ | :x: |
| Ordinal | ✓ | :x: | ✓ | ✓ | :x: |
| Percentage | ✓ | :x: | ✓ | ✓ | :x: |
| Number Range | :x: | :x: | ✓ | :x: | :x: |
| Unit - Age | :x: | :x: | ✓ | ✓ | :x: |
| Unit - Currency | :x: | :x: | ✓ | ✓ | :x: |
| Unit - Dimensions | :x: | :x: | ✓ | ✓ | :x: |
| Unit - Temperature | :x: | :x: | ✓ | ✓ | :x: |
| Choice - Boolean | ✓ | ✓ | ✓ | ✓ | ✓ |
| Seq. - E-mail | G | G | G | G | G |
| Seq. - GUID | G | G | G | G | G |
| Seq. - Social | G | G | G | G | G |
| Seq. - IP Address | G | G | G | G | G |
| Seq. - Phone Number | :x: | :x: | :x: | :x: | :x: |
| Seq. - URL | G | G | G | G* | G* |
| DateTime (+subtypes) | :x: | :x: | ✓ | ✓ | :x: |

  • G: Generic entity, not language-specific (* Unicode TLDs not supported);
  • PA: Partial support;
  • SO: Specs-only;
  • SP: Partial specs;
  • SI: Very initial specs.

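Key Metrics

Overview
Name and owner: microsoft/Recognizers-Text
Primary programming language: C#
Programming languages: C# (10 languages in total)
License: MIT License

Owner activity
Created: 2017-04-17 19:45:47
Last pushed: 2025-02-19 04:46:37
Last commit: 2025-02-12 10:38:46
Releases: 64
Latest release: dotnet-v1.8.12
First release: dotnet-v1.0.1 (released 2018-03-13 11:18:33)

User engagement
Stars: 1.7k
Watchers: 64
Forks: 434
Commits: 2.1k
Issues: 913
Open issues: 167
Pull requests: 1969
Open pull requests: 46
Closed pull requests: 220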