forceutf8

PHP class Encoding, featuring the popular \ForceUTF8\Encoding::toUTF8() function (formerly known as forceUTF8()), which fixes mixed-encoded strings.

  • Owner: neitanod/forceutf8
  • Platform: Linux, Mac, Windows

Description

If you apply the PHP function utf8_encode() to a string that is already UTF8, it returns a garbled UTF8 string, because utf8_encode() assumes its input is ISO 8859-1 and encodes every byte again.
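
To make the garbling concrete, here is a minimal sketch of what happens at the byte level. The helper latin1_to_utf8() is hypothetical (it is not part of PHP or of this library); it is written in plain PHP so that it runs without any extension, and it treats each input byte as one ISO 8859-1 character, which is the assumption utf8_encode() makes.

```php
<?php
// Hypothetical helper for illustration only: encodes each byte as if it were
// a single ISO 8859-1 character (code points U+0000 to U+00FF).
function latin1_to_utf8(string $bytes): string
{
    $out = '';
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $code = ord($bytes[$i]);
        if ($code < 0x80) {
            // ASCII is identical in Latin1 and UTF-8.
            $out .= $bytes[$i];
        } else {
            // U+0080..U+00FF become a two-byte UTF-8 sequence.
            $out .= chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
        }
    }
    return $out;
}

$utf8   = "\xC3\xA9";              // "é" already encoded as UTF-8
$double = latin1_to_utf8($utf8);   // both bytes are encoded a second time

echo bin2hex($utf8), "\n";    // c3a9
echo bin2hex($double), "\n";  // c383c2a9, which displays as "Ã©"
```

The two bytes of "é" are each mistaken for a Latin1 character, so the result is four bytes long and shows up as the familiar "Ã©".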

This class addresses this issue and provides a handy static function called \ForceUTF8\Encoding::toUTF8().

You don't need to know what the encoding of your strings is. It can be Latin1 (ISO 8859-1), Windows-1252 or UTF8, or the string can have a mix of them. \ForceUTF8\Encoding::toUTF8() will convert everything to UTF8.

Sometimes you have to deal with services that are unreliable in terms of encoding, possibly mixing UTF8 and Latin1 in the same string.
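
One way to handle such a mixed string is to scan it byte by byte: bytes that already form a well-formed UTF8 sequence are copied, and any other high byte is treated as Latin1 and encoded. The sketch below illustrates that idea only. mixed_to_utf8() is a hypothetical function, not this library's implementation, and it deliberately ignores the Windows-1252 characters in the 0x80 to 0x9F range as well as overlong and surrogate checks.

```php
<?php
// Hypothetical sketch, not the library's code: converts a string that may mix
// UTF-8 and Latin1 bytes into UTF-8.
function mixed_to_utf8(string $bytes): string
{
    $out = '';
    $len = strlen($bytes);
    $i = 0;
    while ($i < $len) {
        $b = ord($bytes[$i]);
        if ($b < 0x80) {                 // plain ASCII, copy as-is
            $out .= $bytes[$i];
            $i++;
            continue;
        }
        // Number of continuation bytes a UTF-8 lead byte announces.
        if ($b >= 0xC2 && $b <= 0xDF) {
            $need = 1;
        } elseif ($b >= 0xE0 && $b <= 0xEF) {
            $need = 2;
        } elseif ($b >= 0xF0 && $b <= 0xF4) {
            $need = 3;
        } else {
            $need = 0;                   // not a valid lead byte
        }
        // The sequence is accepted only if every continuation byte is 10xxxxxx.
        $valid = $need > 0 && $i + $need < $len;
        for ($k = 1; $valid && $k <= $need; $k++) {
            $valid = (ord($bytes[$i + $k]) & 0xC0) === 0x80;
        }
        if ($valid) {
            $out .= substr($bytes, $i, $need + 1);   // already UTF-8
            $i += $need + 1;
        } else {
            // Treat the byte as a Latin1 character and encode it.
            $out .= chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
            $i++;
        }
    }
    return $out;
}

$mixed = "caf\xC3\xA9 caf\xE9";   // first "é" is UTF-8, second is Latin1
echo bin2hex(mixed_to_utf8($mixed)), "\n";   // 636166c3a920636166c3a9
```

The approach is a heuristic: a Latin1 text that really contains the two characters "Ã©" is indistinguishable from a UTF8 "é" and will be read as the latter.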

Update:

I've included another function, \ForceUTF8\Encoding::fixUTF8(), which will fix the double (or multiple) encoded UTF8 string that looks garbled.
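
The idea behind repairing a multiply encoded string can be sketched as peeling off one layer of encoding at a time for as long as the result is still valid UTF8. The functions utf8_to_latin1() and undo_multiple_encoding() below are hypothetical names chosen for this illustration; this is not how \ForceUTF8\Encoding::fixUTF8() is implemented, and the sketch only covers characters up to U+00FF.

```php
<?php
// Hypothetical helper: decodes UTF-8 into Latin1 bytes. Returns null when the
// input contains anything other than ASCII and code points U+0080..U+00FF.
function utf8_to_latin1(string $s): ?string
{
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($s[$i]);
        if ($b < 0x80) {
            $out .= $s[$i];
        } elseif (($b === 0xC2 || $b === 0xC3) && $i + 1 < $len
                  && (ord($s[$i + 1]) & 0xC0) === 0x80) {
            // Two-byte sequence for U+0080..U+00FF collapses to one byte.
            $out .= chr((($b & 0x03) << 6) | (ord($s[$i + 1]) & 0x3F));
            $i++;
        } else {
            return null;
        }
    }
    return $out;
}

// Hypothetical sketch: removes extra encoding layers, at most $maxRounds times.
function undo_multiple_encoding(string $s, int $maxRounds = 5): string
{
    for ($round = 0; $round < $maxRounds; $round++) {
        $decoded = utf8_to_latin1($s);
        // Stop when nothing can be peeled off, or when peeling one more layer
        // would leave bytes that are no longer valid UTF-8.
        if ($decoded === null || $decoded === $s || preg_match('//u', $decoded) !== 1) {
            break;
        }
        $s = $decoded;
    }
    return $s;
}

$once  = "F\xC3\xA9d\xC3\xA9ration";                   // correct UTF-8
$twice = "F\xC3\x83\xC2\xA9d\xC3\x83\xC2\xA9ration";   // encoded twice
echo undo_multiple_encoding($twice) === $once ? "repaired\n" : "unchanged\n";
```

A string that is already correct is returned unchanged, because decoding it once more would produce bytes that fail the UTF8 validity check.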

Usage:

use \ForceUTF8\Encoding;

$utf8_string = Encoding::toUTF8($utf8_or_latin1_or_mixed_string);

$latin1_string = Encoding::toLatin1($utf8_or_latin1_or_mixed_string);

also:

$utf8_string = Encoding::fixUTF8($garbled_utf8_string);

Examples:

use \ForceUTF8\Encoding;

echo Encoding::fixUTF8("FÃ©dération Camerounaise de Football\n");
echo Encoding::fixUTF8("FÃ©dÃ©ration Camerounaise de Football\n");
echo Encoding::fixUTF8("FÃÂ©dÃÂ©ration Camerounaise de Football\n");
echo Encoding::fixUTF8("FÃÂÂÂÂ©dÃÂÂÂÂ©ration Camerounaise de Football\n");

will output:

Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football
Fédération Camerounaise de Football

Options:

By default, Encoding::fixUTF8 will use the Encoding::WITHOUT_ICONV flag, signalling that iconv should not be used to fix garbled UTF8 strings.

This class also provides options for iconv processing: Encoding::ICONV_TRANSLIT and Encoding::ICONV_IGNORE enable the corresponding flags when the iconv extension is used. The behavior of these flags is described in the PHP iconv documentation.

Examples:

use \ForceUTF8\Encoding;

$str = "Fédération Camerounaise—de—Football\n"; // Uses U+2014 which is invalid ISO8859-1 but exists in Win1252
echo Encoding::fixUTF8($str); // Will break U+2014
echo Encoding::fixUTF8($str, Encoding::ICONV_IGNORE); // Will preserve U+2014
echo Encoding::fixUTF8($str, Encoding::ICONV_TRANSLIT); // Will preserve U+2014

will output:

Fédération Camerounaise?de?Football
Fédération Camerounaise—de—Football
Fédération Camerounaise—de—Football

while:

use \ForceUTF8\Encoding;

$str = "čęėįšųūž"; // Uses several characters not present in ISO8859-1 / Win1252
echo Encoding::fixUTF8($str); // Will break invalid characters
echo Encoding::fixUTF8($str, Encoding::ICONV_IGNORE); // Will remove invalid characters, keep those present in Win1252
echo Encoding::fixUTF8($str, Encoding::ICONV_TRANSLIT); // Will transliterate invalid characters, keep those present in Win1252

will output:

????????
šž
ceeišuuž

Install via composer:

Edit your composer.json file to include the following:

{
    "require": {
        "neitanod/forceutf8": "~2.0"
    }
}

Tips:

You can tip me with Bitcoin if you want. :)

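Key metrics

Overview
  • Name and owner: neitanod/forceutf8
  • Primary language: PHP
  • Languages: PHP (1 language)
  • Platform: Linux, Mac, Windows

Owner activity
  • Created: 2013-01-24 21:45:39
  • Last push: 2023-06-19 18:08:07
  • Last commit: 2019-12-10 11:09:14
  • Releases: 7
  • Latest release: v2.0.4
  • First release: v1.4

User engagement
  • Stars: 1.6k
  • Watchers: 92
  • Forks: 367
  • Commits: 73
  • Issues: 74
  • Open issues: 12
  • Pull requests: 13
  • Open pull requests: 6
  • Closed pull requests: 14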