franc

Natural language detection

Detect the language of text.

What’s so cool about franc?

  1. franc can support more languages(†) than any other
    library
  2. franc is packaged with support for 82, 187, or 406
    languages
  3. franc has a CLI

† - Based on the UDHR, the most translated document in the world.

What’s not so cool about franc?

franc supports many languages, which means it’s easily confused on small
samples.
Make sure to pass it big documents to get reliable results.
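One way to guard against unreliable results on short input is to look at how far the best guess is ahead of the runner-up. The sketch below is not part of franc: `pickConfident` and its `minMargin` parameter are hypothetical names, and the input is assumed to be a list of `[code, score]` pairs, sorted best first, in the shape that `franc.all` (documented under Use) yields.

```javascript
// Hypothetical helper, not part of franc: accept the top guess only when it
// beats the runner-up by at least `minMargin`; otherwise report 'und'.
function pickConfident(guesses, minMargin) {
  if (guesses.length === 0) return 'und'
  if (guesses.length === 1) return guesses[0][0]
  var margin = guesses[0][1] - guesses[1][1]
  return margin >= minMargin ? guesses[0][0] : 'und'
}

// Scores copied from the franc.all example in this document.
var guesses = [
  ['por', 1],
  ['src', 0.8797557538750587],
  ['glg', 0.8708313762329732]
]

console.log(pickConfident(guesses, 0.1)) // => 'por'
console.log(pickConfident(guesses, 0.2)) // => 'und'
```

A suitable margin depends on the texts involved, so treat the threshold as something to tune rather than a recommended value.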

Install

npm:

npm install franc

This installs the franc package, with support for 187 languages
(languages which have 1 million or more speakers).
franc-min (82 languages, 8 million or more speakers) and franc-all (all
406 possible languages) are also available.
Finally, install franc-cli to get the CLI.

Browser builds for franc-min, franc, and franc-all are
available on GitHub Releases.

Use

var franc = require('franc')

franc('Alle menslike wesens word vry') // => 'afr'
franc('এটি একটি ভাষা একক IBM স্ক্রিপ্ট') // => 'ben'
franc('Alle menneske er fødde til fridom') // => 'nno'
franc('') // => 'und'
franc('the') // => 'und'

/* You can change what’s too short (default: 10): */
franc('the', {minLength: 3}) // => 'sco'

.all

console.log(franc.all('O Brasil caiu 26 posições'))

Yields:

[ [ 'por', 1 ],
  [ 'src', 0.8797557538750587 ],
  [ 'glg', 0.8708313762329732 ],
  [ 'snn', 0.8633161108501644 ],
  [ 'bos', 0.8172851103804604 ],
  ... 116 more items ]

only

console.log(franc.all('O Brasil caiu 26 posições', {only: ['por', 'spa']}))

Yields:

[ [ 'por', 1 ], [ 'spa', 0.799906059182715 ] ]

ignore

console.log(franc.all('O Brasil caiu 26 posições', {ignore: ['src', 'glg']}))

Yields:

[ [ 'por', 1 ],
  [ 'snn', 0.8633161108501644 ],
  [ 'bos', 0.8172851103804604 ],
  [ 'hrv', 0.8107092531705026 ],
  [ 'lav', 0.810239549084077 ],
  ... 114 more items ]
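
As a rough mental model of `only` and `ignore`, the two options can be pictured as an allow list and a deny list over the language codes. The sketch below applies them to an already computed list of guesses. It is an illustration only, not franc's implementation: `filterGuesses` is a hypothetical name, franc applies these options during detection, and the scores it reports may differ from a simple after-the-fact filter like this one.

```javascript
// Hypothetical helper, not franc's implementation: keep a guess when it is
// in `only` (if `only` is given) and not in `ignore`.
function filterGuesses(guesses, options) {
  options = options || {}
  var only = options.only || []
  var ignore = options.ignore || []
  return guesses.filter(function (guess) {
    var code = guess[0]
    if (only.length > 0 && only.indexOf(code) === -1) return false
    return ignore.indexOf(code) === -1
  })
}

// Scores copied from the franc.all example in this document.
var guesses = [
  ['por', 1],
  ['src', 0.8797557538750587],
  ['glg', 0.8708313762329732],
  ['snn', 0.8633161108501644]
]

console.log(filterGuesses(guesses, {ignore: ['src', 'glg']}))
// => [ [ 'por', 1 ], [ 'snn', 0.8633161108501644 ] ]
console.log(filterGuesses(guesses, {only: ['por', 'glg']}))
// => [ [ 'por', 1 ], [ 'glg', 0.8708313762329732 ] ]
```

Note that this sketch does not rescale scores when the best guess is filtered out.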

CLI

Install:

npm install franc-cli --global

Use:

CLI to detect the language of text

Usage: franc [options] <string>

Options:

  -h, --help                    output usage information
  -v, --version                 output version number
  -m, --min-length <number>     minimum length to accept
  -o, --only <string>           allow languages
  -i, --ignore <string>         disallow languages
  -a, --all                     display all guesses

Usage:

# output language
$ franc "Alle menslike wesens word vry"
# afr

# output language from stdin (expects utf8)
$ echo "এটি একটি ভাষা একক IBM স্ক্রিপ্ট" | franc
# ben

# ignore certain languages
$ franc --ignore por,glg "O Brasil caiu 26 posições"
# src

# output language from stdin with only
$ echo "Alle mennesker er født frie og" | franc --only nob,dan
# nob

Supported languages

  Package      Languages   Speakers
  franc-min    82          8M or more
  franc        187         1M or more
  franc-all    406         -

Language code

Note that franc returns ISO 639-3 codes (three-letter codes),
not ISO 639-1 or ISO 639-2 codes.
See also GH-10 and GH-30.

To get more info about the languages represented by ISO 639-3, use
iso-639-3.
There is also an index available to map ISO 639-3 to ISO 639-1 codes,
iso-639-3/to-1.json, but note that not all 639-3 codes can
be represented in 639-1.
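
To illustrate the mapping and its gaps, here is a minimal sketch that converts a few ISO 639-3 codes to ISO 639-1. The table is a small hand-written excerpt for illustration, not the contents of iso-639-3/to-1.json, and `toIso6391` is a hypothetical helper name. In practice the lookup table would come from that index instead.

```javascript
// Hand-written excerpt for illustration only; the full index lives in
// iso-639-3/to-1.json.
var to1 = {
  afr: 'af',
  ben: 'bn',
  dan: 'da',
  nno: 'nn',
  nob: 'nb',
  por: 'pt',
  spa: 'es'
}

// Hypothetical helper: return the two-letter code, or null when the
// three-letter code has no entry in the table.
function toIso6391(code) {
  return Object.prototype.hasOwnProperty.call(to1, code) ? to1[code] : null
}

console.log(toIso6391('por')) // => 'pt'
console.log(toIso6391('nno')) // => 'nn'
console.log(toIso6391('und')) // => null
```

Callers need to handle the null case, since codes such as 'und' (undetermined) have no two-letter equivalent.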

Ports

Franc has been ported to several other programming languages.

The works franc is derived from have themselves also been ported to other
languages.

Derivation

Franc is a derivative work from guess-language (Python, LGPL),
guesslanguage (C++, LGPL), and Language::Guess
(Perl, GPL).
Their creators granted me the rights to distribute franc under the MIT license:
respectively, Kent S. Johnson, Jacob R. Rideout, and
Maciej Ceglowski.

License

MIT © Titus Wormer

Overview

Name With Owner: wooorm/franc
Primary Language: JavaScript
Program language: JavaScript (Language Count: 1)
License: MIT License
Release Count: 59
Last Release Name: 6.2.0
First Release Name: 0.0.1
Created At: 2014-07-19 14:50:53
Pushed At: 2024-03-27 12:12:56
Stargazers Count: 4k
Watchers Count: 44
Fork Count: 174
Commits Count: 392
Issues Count: 99
Issue Open Count: 4
Pull Requests Count: 11
Pull Requests Open Count: 0
Pull Requests Close Count: 6