EmailReplyParser

PHP library for parsing plain text email content.

EmailReplyParser is a PHP library for parsing plain text email content,
based on GitHub's email_reply_parser
library written in Ruby.

Installation

The recommended way to install EmailReplyParser is through
Composer:

composer require willdurand/email-reply-parser

Usage

Instantiate an EmailParser object and parse your email:

<?php

use EmailReplyParser\Parser\EmailParser;

$email = (new EmailParser())->parse($emailContent);

You get an Email object that contains a set of Fragment objects. The Email
class exposes two methods:

  • getFragments(): returns all fragments;
  • getVisibleText(): returns a string which represents the content considered
    as "visible".

A Fragment represents one part of the full email content and has the
following API:

<?php

$fragment = current($email->getFragments());

$fragment->getContent();

$fragment->isSignature();

$fragment->isQuoted();

$fragment->isHidden();

$fragment->isEmpty();
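As a minimal sketch of how these accessors might be combined, the helper below collects the content of every fragment that is neither hidden nor empty. The function name collectVisibleContents is hypothetical and not part of the library; it only assumes objects exposing the getContent(), isHidden(), and isEmpty() methods listed above.

```php
<?php

/**
 * Hypothetical helper (not part of EmailReplyParser): collects the content
 * of every fragment that is neither hidden nor empty.
 *
 * @param iterable $fragments objects exposing getContent(), isHidden(), isEmpty()
 *
 * @return string[]
 */
function collectVisibleContents(iterable $fragments): array
{
    $contents = [];

    foreach ($fragments as $fragment) {
        // Skip fragments flagged as hidden, as well as empty ones.
        if ($fragment->isHidden() || $fragment->isEmpty()) {
            continue;
        }

        $contents[] = $fragment->getContent();
    }

    return $contents;
}

// Usage, assuming $email was produced by the parser as shown earlier:
// $contents = collectVisibleContents($email->getFragments());
```

If only the joined visible text is needed, getVisibleText() on the Email object is likely the simpler choice; a loop like this is mainly useful when each fragment has to be inspected separately.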

Alternatively, you can use the static methods of the EmailReplyParser class to
either parse an email or get its visible content in a single line of code:

$email = \EmailReplyParser\EmailReplyParser::read($emailContent);

$visibleText = \EmailReplyParser\EmailReplyParser::parseReply($emailContent);

Known Issues

Quoted Headers

Quoted headers aren't picked up if there's an extra line break:

On <date>, <author> wrote:

> blah

They're also not picked up if the email client breaks the header across
multiple lines. Gmail, for example, wraps any line longer than 80 characters.

On <date>, <author>
wrote:
> blah

The "On ... wrote:" header shown above can be stripped with the following regex:

$fragment_without_date_author = preg_replace(
    '/\nOn(.*?)wrote:(.*?)$/si',
    '',
    $fragment->getContent()
);

Note, though, that this searches for "On" and "wrote", so it won't work
with other languages.
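To make the behaviour of this regex concrete, here is a small self-contained sketch that applies it to two made-up strings standing in for $fragment->getContent(): one with a header wrapped onto two lines, and one with a French header. The sample texts and variable names are assumptions for illustration only.

```php
<?php

// Made-up fragment content: a reply followed by a quoted header that the
// mail client wrapped onto two lines.
$content = "Thanks, that works for me.\n"
    . "\n"
    . "On Mon, Mar 14, 2011 at 6:16 PM, Bob <reply@reply.github.com>\n"
    . "wrote:\n"
    . "> blah";

// Same pattern as above. With the "s" modifier the dot also matches newlines,
// so everything from "\nOn" to the end of the string is removed, including
// the quoted text that follows the header.
$pattern = '/\nOn(.*?)wrote:(.*?)$/si';

// rtrim() drops the blank line left behind in front of the removed header.
$cleaned = rtrim(preg_replace($pattern, '', $content));

// A French header contains neither "On" at the start of a line nor "wrote:",
// so the pattern does not match and the text comes back unchanged.
$french = "Merci !\n\nLe 14 mars 2011, Bob a écrit :\n> blah";
$untouched = preg_replace($pattern, '', $french);
```

Because the "i" modifier makes the match case-insensitive, any line that begins with "on" and is later followed by "wrote:" would also be stripped, so the result is worth checking on real mail before relying on it.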

Possible solution: Remove "reply@reply.github.com" lines...

Weird Signatures

Lines starting with - or _ sometimes mark the beginning of
signatures:

Hello

--
Rick

Not everyone follows this convention:

Hello

Mr Rick Olson
Galactic President Superstar Mc Awesomeville
GitHub

**********************DISCLAIMER***********************************
* Note: blah blah blah                                            *
**********************DISCLAIMER***********************************
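The delimiter convention described above can be sketched as a simple line-based heuristic. This is an illustrative assumption, not the library's implementation: the function findSignatureDelimiter is hypothetical and merely reports the first line that starts with "-" or "_".

```php
<?php

/**
 * Hypothetical heuristic (not the library's implementation): returns the
 * zero-based index of the first line starting with "-" or "_", or null if
 * no such line exists.
 */
function findSignatureDelimiter(string $text): ?int
{
    // \R matches any newline sequence (\n, \r\n, or \r).
    $lines = preg_split('/\R/', $text);

    foreach ($lines as $index => $line) {
        if (preg_match('/^[-_]/', $line) === 1) {
            return $index;
        }
    }

    return null;
}

// The conventional signature from the first example is found...
$conventional = "Hello\n\n--\nRick";
$first = findSignatureDelimiter($conventional);

// ...while a signature without a delimiter line is not.
$unconventional = "Hello\n\nMr Rick Olson\nGalactic President Superstar Mc Awesomeville\nGitHub";
$second = findSignatureDelimiter($unconventional);
```

A check this simple would also flag ordinary list items that begin with "-", which is one reason such lines only sometimes indicate a signature.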

Strange Quoting

Apparently, prefixing lines with > isn't universal either:

Hello

--
Rick

________________________________________
From: Bob [reply@reply.github.com]
Sent: Monday, March 14, 2011 6:16 PM
To: Rick
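One way to approach this style of quoting is to look for the separator line followed by a "From:" line. The sketch below does that for the example above; findQuotedHeaderBlock is a hypothetical name, the thresholds are arbitrary, and the check is English-only, so treat it as an assumption-laden illustration rather than something the library provides.

```php
<?php

/**
 * Hypothetical heuristic (not part of EmailReplyParser): returns the
 * zero-based index of a line made only of underscores that is immediately
 * followed by a "From:" line, or null if no such pair exists.
 */
function findQuotedHeaderBlock(string $text): ?int
{
    // \R matches any newline sequence (\n, \r\n, or \r).
    $lines = preg_split('/\R/', $text);
    $last = count($lines) - 1;

    for ($i = 0; $i < $last; $i++) {
        // At least five underscores (an arbitrary threshold), nothing else.
        $isSeparator = preg_match('/^_{5,}\s*$/', $lines[$i]) === 1;
        $nextIsFrom = preg_match('/^From:\s/', $lines[$i + 1]) === 1;

        if ($isSeparator && $nextIsFrom) {
            return $i;
        }
    }

    return null;
}

$text = "Hello\n\n--\nRick\n\n"
    . "________________________________________\n"
    . "From: Bob [reply@reply.github.com]\n"
    . "Sent: Monday, March 14, 2011 6:16 PM\n"
    . "To: Rick";

// Index of the underscore line; everything from there on is quoted material.
$start = findQuotedHeaderBlock($text);
```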

Unit Tests

Set up the test suite using Composer:

$ composer install

Run it using PHPUnit:

$ phpunit

Contributing

See the CONTRIBUTING file.

Credits

License

EmailReplyParser is released under the MIT License. See the bundled LICENSE
file for details.

Main metrics

Overview

  • Name With Owner: willdurand/EmailReplyParser
  • Primary Language: PHP
  • Programming Languages: PHP (Language Count: 1)
  • License: MIT License

Owner Activity

  • Created At: 2011-11-16 07:27:29
  • Pushed At: 2022-09-20 11:05:55
  • Last Commit At: 2022-01-30 21:56:36
  • Release Count: 25
  • Last Release Name: 2.10.0
  • First Release Name: 0.0.1

User Engagement

  • Stargazers Count: 642
  • Watchers Count: 28
  • Fork Count: 81
  • Commits Count: 151
  • Issues Count: 27
  • Issue Open Count: 9
  • Pull Requests Count: 44
  • Pull Requests Open Count: 2
  • Pull Requests Close Count: 20