compile-time-regular-expressions

A compile-time, (almost) PCRE-compatible regular expression matcher.


Compile time regular expressions v2


Fast compile-time regular expressions with support for matching, searching, and capturing at compile time or at runtime.

You can use the single-header version from the single-header directory. This header can be regenerated with make single-header.

More info at compile-time.re

What can this library do?

ctre::match<"REGEX">(subject); // C++20
"REGEX"_ctre.match(subject); // C++17 + N3599 extension
  • Matching
  • Searching
  • Capturing content (named captures are supported too)
  • Back-Reference (\g{N} syntax, and \1...\9 syntax too)
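
The difference between matching and searching can be illustrated without the library. The following is a hand-written sketch, not CTRE's implementation: it hard-codes the pattern [0-9]+ and assumes C++17. The names is_digit, match_digits, and search_digits are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: true for an ASCII decimal digit.
constexpr bool is_digit(char c) noexcept {
    return c >= '0' && c <= '9';
}

// "Match" semantics for the hard-coded pattern [0-9]+:
// the whole subject must consist of one or more digits.
constexpr bool match_digits(std::string_view s) noexcept {
    if (s.empty()) return false;
    for (char c : s) {
        if (!is_digit(c)) return false;
    }
    return true;
}

// "Search" semantics for the same pattern: return the leftmost,
// longest run of digits found anywhere in the subject.
// An empty view means that no digit was found.
constexpr std::string_view search_digits(std::string_view s) noexcept {
    std::size_t begin = 0;
    while (begin < s.size() && !is_digit(s[begin])) ++begin;
    std::size_t end = begin;
    while (end < s.size() && is_digit(s[end])) ++end;
    return s.substr(begin, end - begin);
}
```

Because both functions are constexpr, they can be evaluated in a static_assert as well as at runtime, which mirrors the compile-time or runtime use described above.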

The library implements most of the PCRE syntax. The following features are not supported:

  • atomic groups
  • boundaries other than ^$
  • callouts
  • character properties
  • comments
  • conditional patterns
  • control characters (\cX)
  • horizontal / vertical character classes (\h\H\v\V)
  • match point reset (\K)
  • named characters
  • octal numbers
  • options / modes
  • subroutines
  • unicode grapheme cluster (\X)

More documentation on pcre.org.

What can be a subject (input)?

  • std::string-like object (std::string_view, or your own string type if it provides begin/end functions returning forward iterators)
  • pair of forward iterators
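
The two kinds of subject listed above suggest an overload pair: one taking an iterator pair, and one taking any object with begin/end that forwards to the first. The sketch below shows only that interface shape and assumes C++17. count_digits is a hypothetical function used for illustration and is not part of CTRE.

```cpp
#include <cstddef>
#include <iterator>
#include <string_view>

// Core overload: works on any pair of forward iterators over char.
template <typename Iterator>
constexpr std::size_t count_digits(Iterator first, Iterator last) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first >= '0' && *first <= '9') ++n;
    }
    return n;
}

// Convenience overload: any string-like object exposing begin()/end()
// is forwarded to the iterator-pair overload.
template <typename Range>
constexpr std::size_t count_digits(const Range& subject) {
    return count_digits(std::begin(subject), std::end(subject));
}
```

With this shape, a std::string, a std::string_view, or a user-defined string type goes through the second overload, while callers that hold only iterators use the first one directly.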

Supported compilers

  • clang 6.0+ (template UDL, C++17 syntax)
  • xcode clang 10.0+ (template UDL, C++17 syntax)
  • gcc 7.4+ (template UDL, C++17 syntax)
  • gcc 9.0+ (C++17 & C++20 cNTTP syntax)
  • MSVC 15.8.8+ (C++17 syntax only)

Template UDL syntax

The compiler must support the N3599 extension, which is available as a GNU extension in gcc (not in GCC 9.1+) and clang.

constexpr auto match(std::string_view sv) noexcept {
	using namespace ctre::literals;
	return "h.*"_ctre.match(sv);
}

If you need the N3599 extension in GCC 9.1+, you can't use -pedantic mode, and you must define the macro CTRE_ENABLE_LITERALS.

C++17 syntax

You can provide the pattern as a constexpr ctll::fixed_string variable.

static constexpr auto pattern = ctll::fixed_string{ "h.*" };

constexpr auto match(std::string_view sv) noexcept {
	return ctre::match<pattern>(sv);
}

(This is tested with MSVC 15.8.8.)
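
To show why the pattern has to live in a constexpr variable under C++17, here is a simplified stand-in for the idea behind ctll::fixed_string. This is not the library's actual definition. The type fixed_string below, its members, and the function pattern_length are all assumptions made for this illustration.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical, simplified stand-in for the idea of ctll::fixed_string:
// a literal type that copies the characters of a string literal, so the
// pattern can be stored in a constexpr variable.
template <std::size_t N>
struct fixed_string {
    char data[N + 1] = {};

    constexpr fixed_string(const char (&text)[N + 1]) noexcept {
        for (std::size_t i = 0; i < N; ++i) data[i] = text[i];
    }

    constexpr std::size_t size() const noexcept { return N; }
    constexpr std::string_view view() const noexcept { return {data, N}; }
};

// Deduction guide: the terminating '\0' is not counted in the length.
template <std::size_t N>
fixed_string(const char (&)[N]) -> fixed_string<N - 1>;

// C++17 accepts a reference to a constexpr variable with static storage
// as a template argument, so the pattern becomes part of the type system.
template <const auto& Pattern>
constexpr std::size_t pattern_length() noexcept {
    return Pattern.size();
}

static constexpr auto pattern = fixed_string{"h.*"};
```

A string literal cannot be passed directly as a template argument in C++17, which is why the pattern is first stored in a named constexpr variable and then passed by reference.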

C++20 syntax

Currently, the only compiler that supports the cNTTP syntax ctre::match<PATTERN>(subject) is GCC 9+.

constexpr auto match(std::string_view sv) noexcept {
	return ctre::match<"h.*">(sv);
}

Examples

Extracting a number from input

std::optional<std::string_view> extract_number(std::string_view s) noexcept {
    if (auto m = ctre::match<"[a-z]+([0-9]+)">(s)) {
        return m.get<1>().to_view();
    } else {
        return std::nullopt;
    }
}


Extracting values from a date

struct date { std::string_view year; std::string_view month; std::string_view day; };

std::optional<date> extract_date(std::string_view s) noexcept {
    using namespace ctre::literals;
    if (auto [whole, year, month, day] = ctre::match<"(\\d{4})/(\\d{1,2})/(\\d{1,2})">(s); whole) {
        return date{year, month, day};
    } else {
        return std::nullopt;
    }
}

//static_assert(extract_date("2018/08/27"sv).has_value());
//static_assert((*extract_date("2018/08/27"sv)).year == "2018"sv);
//static_assert((*extract_date("2018/08/27"sv)).month == "08"sv);
//static_assert((*extract_date("2018/08/27"sv)).day == "27"sv);


Lexer

enum class type {
    unknown, identifier, number
};

struct lex_item {
    type t;
    std::string_view c;
};

std::optional<lex_item> lexer(std::string_view v) noexcept {
    // Alternation: exactly one of the two capture groups participates in a match.
    if (auto [m,id,num] = ctre::match<"([a-z]+)|([0-9]+)">(v); m) {
        if (id) {
            return lex_item{type::identifier, id};
        } else if (num) {
            return lex_item{type::number, num};
        }
    }
    return std::nullopt;
}


Range over input

This support is preliminary, and the API will probably change.

auto input = "123,456,768"sv;

for (auto match: ctre::range<"([0-9]+),?">(input)) {
	std::cout << std::string_view{match.get<0>()} << "\n";
}
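
What iterating over successive matches means can be sketched by hand for the pattern ([0-9]+),? used above. The code below is not how ctre::range is implemented. It hard-codes that one pattern, assumes C++17 and ordinary leftmost, greedy regex semantics, and uses the hypothetical names number_match, next_number, and count_numbers. Under those semantics the whole match includes the optional trailing comma, while the first capture holds only the digits.

```cpp
#include <cstddef>
#include <string_view>

// One result for the hard-coded pattern ([0-9]+),?
struct number_match {
    std::string_view whole;   // like capture 0: digits plus optional ','
    std::string_view digits;  // like capture 1: digits only
    std::size_t next;         // offset at which the next search resumes
    constexpr explicit operator bool() const noexcept { return !digits.empty(); }
};

constexpr bool is_digit(char c) noexcept { return c >= '0' && c <= '9'; }

// Hypothetical helper: find the next match at or after offset `from`.
constexpr number_match next_number(std::string_view input, std::size_t from) noexcept {
    std::size_t begin = from;
    while (begin < input.size() && !is_digit(input[begin])) ++begin;
    std::size_t end = begin;
    while (end < input.size() && is_digit(input[end])) ++end;
    std::size_t stop = end;
    // Consume the optional ',' only when at least one digit was matched.
    if (end > begin && stop < input.size() && input[stop] == ',') ++stop;
    return {input.substr(begin, stop - begin), input.substr(begin, end - begin), stop};
}

// Count matches by repeatedly resuming right after the previous match.
constexpr std::size_t count_numbers(std::string_view input) noexcept {
    std::size_t count = 0;
    std::size_t offset = 0;
    while (true) {
        const number_match m = next_number(input, offset);
        if (!m) break;
        ++count;
        offset = m.next;
    }
    return count;
}
```

Each step resumes where the previous match ended, so matches never overlap and the loop stops at the first failed search.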

Overview

Name With Owner: hanickadot/compile-time-regular-expressions
Primary Language: C++
Program language: Makefile (Language Count: 5)
License: Apache License 2.0
Release Count: 42
Last Release Name: v3.8.1 (Posted on 2023-10-14 12:53:46)
First Release Name: 2017 (Posted on 2018-09-25 07:39:05)
Created At: 2016-06-25 23:17:18
Pushed At: 2024-04-25 15:38:48
Last Commit At: 2020-07-22 07:44:20
Stargazers Count: 3.2k
Watchers Count: 67
Fork Count: 175
Commits Count: 738
Issues Count: 215
Issue Open Count: 69
Pull Requests Count: 54
Pull Requests Open Count: 26
Pull Requests Close Count: 10