kbd-audio

Tools for capturing and analysing keyboard input paired with microphone capture ⌨️

This is a collection of command-line and GUI tools for capturing and analyzing audio data.

Keytap

The most interesting tool is called keytap - it can guess which keyboard keys were pressed solely by analyzing the audio captured from the computer's microphone.

Check this blog post for more details:

Keytap: description and some random thoughts

Video: short demo of Keytap in action
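
To give a rough idea of how recorded training data could be used to guess a key, here is a minimal sketch that compares an observed keypress waveform against one stored waveform per key and picks the closest match. It is an illustration under stated assumptions, not keytap's actual implementation: the helper names `similarity` and `bestMatch` are hypothetical, the waveforms are assumed to be already aligned and of equal length, and the similarity measure (normalized correlation at zero lag, i.e. cosine similarity) is simply one common choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalized correlation at zero lag (cosine similarity) of two
// equal-length waveforms. The result lies in [-1, 1].
// Hypothetical helper, not taken from the kbd-audio sources.
double similarity(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size() && !a.empty());
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += double(a[i]) * b[i];
        na  += double(a[i]) * a[i];
        nb  += double(b[i]) * b[i];
    }
    if (na == 0.0 || nb == 0.0) return 0.0;  // a silent waveform matches nothing
    return dot / std::sqrt(na * nb);
}

// Index of the training template that is most similar to the observed keypress.
std::size_t bestMatch(const std::vector<std::vector<float>>& templates,
                      const std::vector<float>& observed) {
    assert(!templates.empty());
    std::size_t best = 0;
    double bestScore = -2.0;  // lower than any possible similarity
    for (std::size_t k = 0; k < templates.size(); ++k) {
        const double s = similarity(templates[k], observed);
        if (s > bestScore) { bestScore = s; best = k; }
    }
    return best;
}
```

In such a scheme each template would correspond to one key, so the returned index maps back to the key whose training sound best resembles the new one. A real detector would also need to locate keypresses in the audio stream and tolerate small time shifts, which this sketch leaves out.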

Keytap2

The keytap2 tool is another interesting tool for recovering text from audio. It does not require training data - instead it uses statistical information about the frequencies of the letters and n-grams in the English language. The tool is still in development, but you can see a short demonstration here:

Video: Keytap2 - recovering text from typing sound (7:50)
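
The idea of using n-gram statistics can be illustrated with a small sketch: count letter pairs (bigrams) in a reference text, then score candidate strings by how probable their bigrams are. This is not keytap2's code. The `BigramModel`, `train` and `logScore` names are hypothetical, the add-one smoothing and the 27-symbol alphabet (a-z plus space) are assumptions made for the example, and the real tool may use longer n-grams and a different scoring scheme.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

// Bigram counts gathered from a reference text (hypothetical structure).
struct BigramModel {
    std::map<std::string, int> counts;
    int total = 0;
};

// Count every pair of adjacent characters in the corpus.
BigramModel train(const std::string& corpus) {
    BigramModel m;
    for (std::size_t i = 0; i + 1 < corpus.size(); ++i) {
        ++m.counts[corpus.substr(i, 2)];
        ++m.total;
    }
    return m;
}

// Sum of log-probabilities of the bigrams in `text`, using add-one smoothing
// over an assumed alphabet of 27 symbols (27 * 27 possible bigrams).
// Higher (less negative) scores mean the text looks more like the corpus.
double logScore(const BigramModel& m, const std::string& text) {
    const double vocab = 27.0 * 27.0;
    double score = 0.0;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        const auto it = m.counts.find(text.substr(i, 2));
        const int c = (it == m.counts.end()) ? 0 : it->second;
        score += std::log((c + 1.0) / (m.total + vocab));
    }
    return score;
}
```

Given several candidate decodings of the same length, the one with the highest score is the most language-like. For example, a model trained on ordinary English text scores "the" above "xqz". Scores of strings with different lengths are not directly comparable, because every extra bigram adds another negative term.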

CTF: can you guess the text being typed?

Build instructions

Dependencies:

  • SDL2 (libsdl) - used to capture audio and to open GUI windows

    [Ubuntu]
    $ sudo apt install libsdl2-dev
    
    [Mac OS with brew]
    $ brew install sdl2
    
  • FFTW3 (fftw, optional) - some of the helper tools perform Fourier transforms

Linux and Mac OS

git clone https://github.com/ggerganov/kbd-audio
cd kbd-audio
git submodule update --init
mkdir build && cd build
cmake ..
make

Windows

(todo, PRs welcome)

Tools

Short summary of the available tools. If the status of the tool is not stable, expect problems and non-optimal results.

| Name | Type | Status |
| --- | --- | --- |
| record | text | stable |
| record-full | text | stable |
| play | text | stable |
| play-full | text | stable |
| view-gui | gui | stable |
| view-full-gui | gui | stable |
| keytap | text | stable |
| keytap-gui | gui | stable |
| keytap2 | text | development |
| keytap2-gui | gui | development |
| - | extra | - |
| guess_qp | text | experiment |
| guess_qp2 | text | experiment |
| key_detector | text | experiment |
| scale | text | experiment |
| subreak | text | experiment |
| key_average_gui | gui | experiment |

Tool details

  • record-full

    Record audio to a raw binary file on disk

    ./record-full output.kbd [-cN]
    

  • play-full

    Play back a recording captured via the record-full tool

    ./play-full input.kbd [-pN]
    

  • record

    Record audio only while typing. Useful for collecting training data for keytap

    ./record output.kbd [-cN] [-CN]
    

  • play

    Play back a recording created via the record tool

    ./play input.kbd [-pN]
    

  • keytap

    Detect pressed keys via microphone audio capture in real-time. Uses training data captured via the record tool.

    ./keytap input0.kbd [input1.kbd] [input2.kbd] ... [-cN] [-CN] [-pF] [-tF]
    

  • keytap-gui

    Detect pressed keys via microphone audio capture in real-time. Uses training data captured via the record tool. GUI version.

    ./keytap-gui input0.kbd [input1.kbd] [input2.kbd] ... [-cN] [-CN]
    

    Live demo (WebAssembly threads required)



  • keytap2-gui (work in progress)

    Detect pressed keys via microphone audio capture. Uses statistical information (n-gram frequencies) about the language. No training data is required. The 'recording.kbd' input file has to be generated via the record-full tool and contains the audio data that will be analyzed. The 'n-gram.txt' file has to contain n-gram probabilities for the corresponding language.

    ./keytap2-gui recording.kbd n-gram.txt
    



  • view-full-gui

    Visualize waveforms recorded with the record-full tool. Can also play back the audio data.

    ./view-full-gui input.kbd [-pN]
    



  • view-gui

    Visualize training data recorded with the record tool. Can also play back the audio data.

    ./view-gui input.kbd [-pN]
    



Feedback

Any feedback about the performance of the tools is highly appreciated. Please drop a comment here.

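Key metrics

Overview
Name and owner: ggerganov/kbd-audio
Primary language: C++
Languages: C++ (6 languages in total)
Platform:
License: MIT License
Owner activity
Created: 2018-08-27 17:19:02
Pushed: 2023-01-15 07:48:08
Last commit: 2023-01-15 09:47:45
Releases: 2
Latest release name: keytap3 (released on )
First release name: keytap3-alpha (released on )
User engagement
Stars: 8.8k
Watchers: 137
Forks: 605
Commits: 216
Issues enabled: ?
Issues: 36
Open issues: 12
Pull requests: 9
Open pull requests: 0
Closed pull requests: 8
Project settings
Wiki enabled: ?
Archived: ?
Is fork: ?
Locked: ?
Is mirror: ?
Is private: ?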