Sloc Cloc and Code (scc)
A tool similar to cloc, sloccount and tokei, for counting the lines of code, blank lines, comment lines, and physical lines of source code in many programming languages.
The goal is to be the fastest code counter possible, while also performing COCOMO calculations like sloccount and estimating code complexity in a way similar to cyclomatic complexity calculators. In short, one tool to rule them all.
It also has a very short name which is easy to type: scc.
If you don't like sloc cloc and code, feel free to use the name Succinct Code Counter.
Dual-licensed under MIT or the UNLICENSE.
Install
Go Get
If you are comfortable using Go and have version 1.10 or later installed:
$ go get -u github.com/boyter/scc/
Snap
A snap install exists thanks to Ricardo.
$ sudo snap install scc
Homebrew
Or if you have Homebrew installed:
$ brew install scc
Manual
Binaries for Windows, GNU/Linux and macOS for both i386 and x86_64 machines are available from the releases page.
Other
If you would like to assist with getting scc added into apt/chocolatey/etc., please submit a PR or at least raise an issue with instructions.
Background
Read all about how it came to be along with performance benchmarks,
- https://boyter.org/posts/sloc-cloc-code/
- https://boyter.org/posts/why-count-lines-of-code/
- https://boyter.org/posts/sloc-cloc-code-revisited/
- https://boyter.org/posts/sloc-cloc-code-performance/
- https://boyter.org/posts/sloc-cloc-code-performance-update/
Some reviews of scc
- https://nickmchardy.com/2018/10/counting-lines-of-code-in-koi-cms.html
- https://www.feliciano.tech/blog/determine-source-code-size-and-complexity-with-scc/
- https://metaredux.com/posts/2019/12/13/counting-lines.html
A talk given at the first GopherCon AU about scc
(press S to see speaker notes)
For performance see the Performance section
Other similar projects,
- cloc the original sloc counter
- gocloc a sloc counter in Go inspired by tokei
- loc rust implementation similar to tokei but often faster
- loccount Go implementation written and maintained by ESR
- polyglot ATS sloc counter
- sloccount written as a faster cloc
- tokei fast, accurate and written in rust
Interesting reading about other code counting projects tokei, loc, polyglot and loccount
- https://www.reddit.com/r/rust/comments/59bm3t/a_fast_cloc_replacement_in_rust/
- https://www.reddit.com/r/rust/comments/82k9iy/loc_count_lines_of_code_quickly/
- http://blog.vmchale.com/article/polyglot-comparisons
- http://esr.ibiblio.org/?p=8270
Further reading about the performance of processing files on disk
Using scc to process 40 TB of files from Github/Bitbucket/Gitlab
Pitch
Why use scc?
- It is very fast and gets faster the more CPU you throw at it
- Accurate
- Works very well across multiple platforms without slowdown (Windows, Linux, macOS)
- Large language support
- Can ignore duplicate files
- Has complexity estimations
- You need to tell the difference between Coq and Verilog in the same directory
- Supports cloc yaml output, so it is potentially a drop-in replacement for some users
- Can identify or ignore minified files
- Able to identify many #! files
- Can ignore large files by lines or bytes
Why not use scc?
- You don't like Go for some reason
- It cannot count D source with different nested multi-line comments correctly https://github.com/boyter/scc/issues/27
Usage
Command line usage of scc is designed to be as simple as possible.
Full details can be found in scc --help or scc -h.
Sloc, Cloc and Code. Count lines of code in a directory with complexity estimation.
Version 2.12.0
Ben Boyter <ben@boyter.org> + Contributors
Usage:
scc [flags]
Flags:
--avg-wage int average wage value used for basic COCOMO calculation (default 56286)
--binary disable binary file detection
--by-file display output for every file
--ci enable CI output settings where stdout is ASCII
--count-as string count extension as language [e.g. jsp:htm,chead:"C Header" maps extension jsp to html and chead to C Header]
--debug enable debug output
--exclude-dir strings directories to exclude (default [.git,.hg,.svn])
--file-gc-count int number of files to parse before turning the GC on (default 10000)
-f, --format string set output format [tabular, wide, json, csv, cloc-yaml, html, html-table] (default "tabular")
--gen identify generated files
--generated-markers strings string markers in head of generated files (default [do not edit])
-h, --help help for scc
-i, --include-ext strings limit to file extensions [comma separated list: e.g. go,java,js]
-l, --languages print supported languages and extensions
--large-byte-count int number of bytes a file can contain before being removed from output (default 1000000)
--large-line-count int number of lines a file can contain before being removed from output (default 40000)
--min identify minified files
-z, --min-gen identify minified or generated files
--min-gen-line-length int number of bytes per average line for file to be considered minified or generated (default 255)
--no-cocomo remove COCOMO calculation output
-c, --no-complexity skip calculation of code complexity
-d, --no-duplicates remove duplicate files from stats and output
--no-gen ignore generated files in output (implies --gen)
--no-gitignore disables .gitignore file logic
--no-ignore disables .ignore file logic
--no-large ignore files over certain byte and line size set by max-line-count and max-byte-count
--no-min ignore minified files in output (implies --min)
--no-min-gen ignore minified or generated files in output (implies --min-gen)
-M, --not-match stringArray ignore files and directories matching regular expression
-o, --output string output filename (default stdout)
-s, --sort string column to sort by [files, name, lines, blanks, code, comments, complexity] (default "files")
-t, --trace enable trace output (not recommended when processing multiple files)
-v, --verbose verbose output
--version version for scc
-w, --wide wider output with additional statistics (implies --complexity)
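The --min-gen-line-length and --generated-markers flags above describe two heuristics: average bytes per line, and marker strings in the head of a file. The Go sketch below shows how checks of that kind could work. It is not scc's code: the function names, the headBytes parameter and the case-insensitive marker match are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// LooksMinified reports whether the average number of bytes per line
// exceeds threshold (hypothetical helper, modelled on the description
// of --min-gen-line-length).
func LooksMinified(content []byte, threshold int) bool {
	if len(content) == 0 {
		return false
	}
	lines := bytes.Count(content, []byte("\n"))
	if content[len(content)-1] != '\n' {
		lines++ // last line has no trailing newline
	}
	return len(content)/lines > threshold
}

// LooksGenerated reports whether any marker occurs in the first
// headBytes bytes of the file. Matching case-insensitively and the
// headBytes cut-off are assumptions of this sketch.
func LooksGenerated(content []byte, markers []string, headBytes int) bool {
	head := content
	if len(head) > headBytes {
		head = head[:headBytes]
	}
	head = bytes.ToLower(head)
	for _, m := range markers {
		if bytes.Contains(head, bytes.ToLower([]byte(m))) {
			return true
		}
	}
	return false
}

func main() {
	src := []byte("// Code generated by a tool. DO NOT EDIT.\npackage x\n")
	fmt.Println(LooksGenerated(src, []string{"do not edit"}, 1000))
	fmt.Println(LooksMinified(src, 255))
}
```

With scc itself, the equivalent behaviour is selected with the --min, --gen and -z flags listed above.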
Output should look something like the below for the redis project
$ scc .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
C                          258    153080    17005     26121   109954      27671
C Header                   200     28794     3252      5877    19665       1557
TCL                        101     17802     1879       981    14942       1439
Shell                       36      1109      133       252      724        118
Lua                         20       525       68        70      387         65
Autoconf                    18     10821     1026      1326     8469        951
Makefile                    10      1082      220       103      759         51
Ruby                        10       778       78        71      629        115
Markdown                     9      1935      527         0     1408          0
gitignore                    9       120       16         0      104          0
HTML                         5      9658     2928        12     6718          0
C++                          4       286       48        14      224         31
License                      4       100       20         0       80          0
YAML                         4       266       20         3      243          0
CSS                          2       107       16         0       91          0
Python                       2       219       39        18      162         68
Batch                        1        28        2         0       26          3
C++ Header                   1         9        1         3        5          0
Extensible Styleshe…         1        10        0         0       10          0
Plain Text                   1        23        7         0       16          0
Smarty Template              1        44        1         0       43          5
m4                           1       562      116        53      393          0
───────────────────────────────────────────────────────────────────────────────
Total                      698    227358    27402     34904   165052      32074
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $5,755,686
Estimated Schedule Effort 29.835114 months
Estimated People Required 22.851995
───────────────────────────────────────────────────────────────────────────────
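The three estimate lines at the bottom come from a COCOMO calculation over the Code total. A minimal Go sketch of such a calculation follows. The coefficients (3.2, 1.05, 2.5, 0.38) and the overhead multiplier of 1.8 are assumptions: they were picked because they land close to the figures above when fed 165052 lines and the default --avg-wage of 56286, and are not constants confirmed by this document. The names Estimate and EstimateCOCOMO are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// Estimate holds the three figures printed under the totals row.
type Estimate struct {
	Cost   float64 // estimated cost to develop, in wage currency units
	Months float64 // estimated schedule effort
	People float64 // estimated average staff
}

// EstimateCOCOMO applies a COCOMO-style model to a count of code lines.
// All constants below are assumptions of this sketch.
func EstimateCOCOMO(codeLines int, avgWage float64) Estimate {
	const (
		a        = 3.2  // effort coefficient (assumed)
		b        = 1.05 // effort exponent (assumed)
		c        = 2.5  // schedule coefficient (assumed)
		d        = 0.38 // schedule exponent (assumed)
		overhead = 1.8  // cost multiplier on top of wages (assumed)
	)
	kloc := float64(codeLines) / 1000
	effort := a * math.Pow(kloc, b)   // person-months
	months := c * math.Pow(effort, d) // calendar months
	return Estimate{
		Cost:   effort * (avgWage / 12) * overhead,
		Months: months,
		People: effort / months,
	}
}

func main() {
	e := EstimateCOCOMO(165052, 56286)
	fmt.Printf("cost $%.0f, schedule %.2f months, people %.2f\n",
		e.Cost, e.Months, e.People)
}
```

Small differences from the cost figure above are expected, since rounding inside the real tool is not modelled here. Pass --no-cocomo to drop these lines from the output entirely.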
Note that you don't have to specify the directory you want to run against. Running scc will assume you want to run against the current directory.
You can also run against multiple files or directories, for example scc directory1 directory2 file1 file2, with the results aggregated in the output.
Interesting Use Cases
Used inside Intel Nemu Hypervisor to track code changes between revisions https://github.com/intel/nemu/blob/topic/virt-x86/tools/cloc-change.sh#L9
Appears to also be used inside both http://codescoop.com/ and https://pinpoint.com/
Features
scc uses a small state machine to determine what state the code is in when it reaches a newline \n. As such it is aware of and able to count:
- Single Line Comments
- Multi Line Comments
- Strings
- Multi Line Strings
- Blank lines
Because of this it is able to accurately determine if a comment is in a string or is actually a comment.
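To illustrate the idea, here is a rough Go sketch of a line-counting state machine. It is not scc's actual implementation: it only understands C-like // and /* */ comments and double-quoted strings, and the names Counts, CountLines and the state constants are hypothetical. Counting blank lines inside a block comment as comment lines is a choice made for this sketch.

```go
package main

import "fmt"

// Counts holds per-category line totals.
type Counts struct {
	Code, Comment, Blank int
}

type state int

const (
	inCode         state = iota // outside any string or comment
	inString                    // inside a "double-quoted" string
	inLineComment               // after //, until end of line
	inBlockComment              // between /* and */
)

// CountLines classifies every line of C-like source as code, comment or blank.
func CountLines(src string) Counts {
	var c Counts
	st := inCode
	sawCode, sawComment := false, false

	// endLine records the line that just finished and carries any
	// still-open string or block comment over to the next line.
	endLine := func() {
		switch {
		case sawCode:
			c.Code++
		case sawComment:
			c.Comment++
		default:
			c.Blank++
		}
		sawCode = st == inString
		sawComment = st == inBlockComment
	}

	for i := 0; i < len(src); i++ {
		ch := src[i]
		if ch == '\n' {
			if st == inLineComment {
				st = inCode
			}
			endLine()
			continue
		}
		next := byte(0)
		if i+1 < len(src) {
			next = src[i+1]
		}
		switch st {
		case inCode:
			switch {
			case ch == '/' && next == '/':
				st, sawComment = inLineComment, true
				i++
			case ch == '/' && next == '*':
				st, sawComment = inBlockComment, true
				i++
			case ch == '"':
				st, sawCode = inString, true
			case ch != ' ' && ch != '\t' && ch != '\r':
				sawCode = true
			}
		case inString:
			if ch == '\\' && next != 0 && next != '\n' {
				i++ // skip the escaped character, e.g. \"
			} else if ch == '"' {
				st = inCode
			}
		case inBlockComment:
			if ch == '*' && next == '/' {
				st = inCode
				i++
			}
		}
	}
	if n := len(src); n > 0 && src[n-1] != '\n' {
		endLine() // final line without a trailing newline
	}
	return c
}

func main() {
	src := "int main() {\n" +
		"    // a real comment\n" +
		"    char *s = \"// not a comment\";\n" +
		"\n" +
		"    /* block\n" +
		"       comment */\n" +
		"    return 0; /* trailing */\n" +
		"}\n"
	fmt.Printf("%+v\n", CountLines(src)) // {Code:4 Comment:3 Blank:1}
}
```

The third line of the sample is counted as code because the // sits inside a string, which is the distinction described above.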
It also attempts to count the complexity of code. This is done by checking for branching operations in the code. For example, each of the following `for if switch while else