licenseclassifier

A License Classifier

Introduction

The license classifier is a library and set of tools that analyze text to
determine what type of license it contains. It searches for license texts in a
file and compares them to an archive of known licenses. These files could be,
e.g., LICENSE files containing one or more licenses, or source code files with
the license text in a comment.

A "confidence level" is associated with each result, indicating how close the
match was. A confidence level of 1.0 indicates an exact match, while a
confidence level of 0.0 indicates that no known license matched the text.
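
As a rough illustration of how a caller might act on confidence levels, the Go sketch below keeps only results at or above a chosen threshold. The `Match` type, the `aboveThreshold` helper, the sample values, and the 0.8 cutoff are all assumptions made for this example; they are not the library's own types or defaults.

```go
package main

import "fmt"

// Match is a hypothetical result record used only for this sketch;
// it is not the type returned by the library.
type Match struct {
	Name       string
	Confidence float64 // 1.0 = exact match, 0.0 = no license matched
}

// aboveThreshold returns the matches whose confidence is at least threshold.
func aboveThreshold(matches []Match, threshold float64) []Match {
	var kept []Match
	for _, m := range matches {
		if m.Confidence >= threshold {
			kept = append(kept, m)
		}
	}
	return kept
}

func main() {
	// Illustrative values only, not real classifier output.
	results := []Match{
		{Name: "MIT", Confidence: 1.0},
		{Name: "BSD-3-Clause", Confidence: 0.62},
		{Name: "Apache-2.0", Confidence: 0.0},
	}
	for _, m := range aboveThreshold(results, 0.8) {
		fmt.Printf("%s (confidence: %.2f)\n", m.Name, m.Confidence)
	}
}
```

A stricter threshold reduces false positives at the cost of missing licenses whose text has been lightly edited; the right value depends on how the results are used.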

Adding a new license

Adding a new license is straightforward:

  1. Create a file in licenses/.

    • The filename should be the name of the license or its abbreviation. If
      the license is an Open Source license, use the appropriate identifier
      specified at https://spdx.org/licenses/.
    • If the license is the "header" version of the license, append the suffix
      ".header" to it. See licenses/README.md for more details.
  2. Add the license name to the list in license_type.go.

  3. Regenerate the licenses.db file by running the license serializer:

    $ license_serializer -output licenseclassifier/licenses
    
  4. Create and run appropriate tests to verify that the license is indeed
    present.
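
The naming rule in step 1 can be sketched as a small Go helper. `licenseFilePath` is a hypothetical function written for this example, not part of the repository; it only mirrors the convention described above and leaves out any file extension the repository may expect (see licenses/README.md for the authoritative details).

```go
package main

import (
	"fmt"
	"path"
)

// licenseFilePath builds the path of a new license file under licenses/.
// Assumption: id is the SPDX identifier (or the license name or
// abbreviation), and header marks the "header" version of the license.
func licenseFilePath(id string, header bool) string {
	name := id
	if header {
		name += ".header"
	}
	return path.Join("licenses", name)
}

func main() {
	fmt.Println(licenseFilePath("MIT", false))
	fmt.Println(licenseFilePath("Apache-2.0", true))
}
```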

Tools

Identify license

identify_license is a command-line tool that identifies the license(s)
within a file.

$ identify_license LICENSE
LICENSE: GPL-2.0 (confidence: 1, offset: 0, extent: 14794)
LICENSE: LGPL-2.1 (confidence: 1, offset: 18366, extent: 23829)
LICENSE: MIT (confidence: 1, offset: 17255, extent: 1059)
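
If the output needs to be consumed by another program, lines of this shape can be parsed with a regular expression. The Go sketch below is one possible approach, assuming every line follows exactly the format shown above and that license names contain no spaces; the `Result` type and `parseLine` function are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Result is a hypothetical record for one line of identify_license output.
type Result struct {
	File       string
	License    string
	Confidence float64
	Offset     int
	Extent     int
}

// linePattern matches lines shaped like the sample output, e.g.
// "LICENSE: MIT (confidence: 1, offset: 17255, extent: 1059)".
var linePattern = regexp.MustCompile(
	`^(.+): (\S+) \(confidence: ([0-9.]+), offset: (\d+), extent: (\d+)\)$`)

// parseLine converts one output line into a Result, or returns an error
// if the line does not have the expected shape.
func parseLine(line string) (Result, error) {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return Result{}, fmt.Errorf("unrecognized line: %q", line)
	}
	conf, err := strconv.ParseFloat(m[3], 64)
	if err != nil {
		return Result{}, err
	}
	offset, err := strconv.Atoi(m[4])
	if err != nil {
		return Result{}, err
	}
	extent, err := strconv.Atoi(m[5])
	if err != nil {
		return Result{}, err
	}
	return Result{File: m[1], License: m[2], Confidence: conf, Offset: offset, Extent: extent}, nil
}

func main() {
	r, err := parseLine("LICENSE: MIT (confidence: 1, offset: 17255, extent: 1059)")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("file=%s license=%s confidence=%v offset=%d extent=%d\n",
		r.File, r.License, r.Confidence, r.Offset, r.Extent)
}
```

Because the output format is not a documented interface here, a parser like this should be treated as fragile and re-checked whenever the tool is updated.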

License serializer

The license_serializer tool regenerates the licenses.db archive. The archive
contains preprocessed license texts for quicker comparisons against unknown
texts.

$ license_serializer -output licenseclassifier/licenses

This is not an official Google product (experimental or otherwise); it is just
code that happens to be owned by Google.

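Key metrics

Overview

Name and owner: google/licenseclassifier
Primary language: Go
Languages: Go (language count: 1)
License: Apache License 2.0

Owner activity

Created: 2017-04-10 03:45:47
Last pushed: 2025-02-13 17:59:39
Releases: 11
Latest release: v2.0.0
First release: v2.0.0-alpha.1

User engagement

Stars: 325
Watchers: 12
Forks: 79
Commits: 246
Issues: 20
Open issues: 12
Pull requests: 34
Open pull requests: 6
Closed pull requests: 8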