PHP: preg_replace behaves differently on different servers

 3 years ago
source link: https://www.codesd.com/item/php-preg-replace-behaves-differently-on-different-servers.html

I'm working with a PHP function that takes a string, converts all of its spaces to underscores, converts all of its accented characters to non-accented characters, and removes non-word characters. In other words, it creates slugs.

This function works fine on my local machine, which is running MAMP. I've tried it with PHP 5.2.17 and 5.3.6, both without problems. However, on another server, which is running PHP 5.2.10, the function behaves differently.

For example, on my local machine, if I pass in the string "this_is_a_test", the function returns the same string, since there are no spaces, accented characters, or non-word characters. However, if I pass the same string in on the other server, the function returns "t_t_t".

I compared PHP 5.3.6's php.ini file on my local machine with the php.ini file on the other server and didn't see any differences that should cause something like this. Any ideas? By the way, it's actually a CakePHP project that I'm working with, but I narrowed the code down to this test case, which is pure PHP:

<?php

    $map = array(
        '/ä|æ|ǽ/' => 'ae',
        '/ö|œ/' => 'oe',
        '/ü/' => 'ue',
        '/Ä/' => 'Ae',
        '/Ü/' => 'Ue',
        '/Ö/' => 'Oe',
        '/À|Á|Â|Ã|Ä|Å|Ǻ|Ā|Ă|Ą|Ǎ/' => 'A',
        '/à|á|â|ã|å|ǻ|ā|ă|ą|ǎ|ª/' => 'a',
        '/Ç|Ć|Ĉ|Ċ|Č/' => 'C',
        '/ç|ć|ĉ|ċ|č/' => 'c',
        '/Ð|Ď|Đ/' => 'D',
        '/ð|ď|đ/' => 'd',
        '/È|É|Ê|Ë|Ē|Ĕ|Ė|Ę|Ě/' => 'E',
        '/è|é|ê|ë|ē|ĕ|ė|ę|ě/' => 'e',
        '/Ĝ|Ğ|Ġ|Ģ/' => 'G',
        '/ĝ|ğ|ġ|ģ/' => 'g',
        '/Ĥ|Ħ/' => 'H',
        '/ĥ|ħ/' => 'h',
        '/Ì|Í|Î|Ï|Ĩ|Ī|Ĭ|Ǐ|Į|İ/' => 'I',
        '/ì|í|î|ï|ĩ|ī|ĭ|ǐ|į|ı/' => 'i',
        '/Ĵ/' => 'J',
        '/ĵ/' => 'j',
        '/Ķ/' => 'K',
        '/ķ/' => 'k',
        '/Ĺ|Ļ|Ľ|Ŀ|Ł/' => 'L',
        '/ĺ|ļ|ľ|ŀ|ł/' => 'l',
        '/Ñ|Ń|Ņ|Ň/' => 'N',
        '/ñ|ń|ņ|ň|ŉ/' => 'n',
        '/Ò|Ó|Ô|Õ|Ō|Ŏ|Ǒ|Ő|Ơ|Ø|Ǿ/' => 'O',
        '/ò|ó|ô|õ|ō|ŏ|ǒ|ő|ơ|ø|ǿ|º/' => 'o',
        '/Ŕ|Ŗ|Ř/' => 'R',
        '/ŕ|ŗ|ř/' => 'r',
        '/Ś|Ŝ|Ş|Š/' => 'S',
        '/ś|ŝ|ş|š|ſ/' => 's',
        '/Ţ|Ť|Ŧ/' => 'T',
        '/ţ|ť|ŧ/' => 't',
        '/Ù|Ú|Û|Ũ|Ū|Ŭ|Ů|Ű|Ų|Ư|Ǔ|Ǖ|Ǘ|Ǚ|Ǜ/' => 'U',
        '/ù|ú|û|ũ|ū|ŭ|ů|ű|ų|ư|ǔ|ǖ|ǘ|ǚ|ǜ/' => 'u',
        '/Ý|Ÿ|Ŷ/' => 'Y',
        '/ý|ÿ|ŷ/' => 'y',
        '/Ŵ/' => 'W',
        '/ŵ/' => 'w',
        '/Ź|Ż|Ž/' => 'Z',
        '/ź|ż|ž/' => 'z',
        '/Æ|Ǽ/' => 'AE',
        '/ß/' => 'ss',
        '/Ĳ/' => 'IJ',
        '/ĳ/' => 'ij',
        '/Œ/' => 'OE',
        '/ƒ/' => 'f',
        '/[^\s\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Nd}]/mu' => ' ',
        '/\s+/' => '_',
        '/^[_]+|[_]+$/' => ''
    );

    echo preg_replace(array_keys($map), array_values($map), 'this_is_a_test');

?>


Take a look at your charset to make sure it's the same on both platforms.

It looks like you're converting UTF-8 to a single-byte character format in your slugs. If one of the systems treats the text as ISO-8859-1, each multi-byte UTF-8 character in your pattern keys is read as several separate single-byte characters instead of one.
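
To illustrate the byte-versus-character point, here is a minimal sketch. It assumes the string holds UTF-8 data (the bytes are written out explicitly so the file's own encoding does not matter) and that PCRE was built with UTF-8 support; it is not taken from the question's code.

```php
<?php
// "ä" encoded as UTF-8 is two bytes: 0xC3 0xA4.
$char = "\xC3\xA4";

// strlen() counts bytes, not characters.
$byteLength = strlen($char);

// Without the /u modifier, "." matches a single byte,
// so a two-byte character does not match /^.$/.
$byteMode = preg_match('/^.$/', $char);

// With /u, pattern and subject are treated as UTF-8,
// so "." matches the whole character.
$utf8Mode = preg_match('/^.$/u', $char);

echo $byteLength, "\n"; // 2
echo $byteMode, "\n";   // 0
echo $utf8Mode, "\n";   // 1
```

Most of the patterns in the question's map have no /u modifier, so they are matched byte by byte, which is where a charset or encoding mismatch between the two machines could show up.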

The setting to check is default_charset in php.ini.

On the command line, run the following on both machines and compare the default_charset value in the output:

php -i
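
If shell access is not available on one of the servers, a small script can report the same kind of information. This is only a sketch: the function name pcre_environment_report is made up for illustration, and the PCRE checks go slightly beyond the charset setting mentioned above, on the assumption that differences in the PCRE build could also matter.

```php
<?php
// Hypothetical helper for comparing two servers; the name is an assumption.
function pcre_environment_report()
{
    return array(
        'php_version'     => PHP_VERSION,
        'pcre_version'    => PCRE_VERSION,
        'default_charset' => ini_get('default_charset'),
        // preg_match() returns false if the pattern cannot be compiled,
        // for example when PCRE lacks UTF-8 or Unicode property support.
        // The @ operator hides the warning so only the result is reported.
        'utf8_mode'       => @preg_match('/^.$/u', "\xC3\xA4") === 1,
        'unicode_props'   => @preg_match('/^\p{Ll}$/u', 'a') === 1,
    );
}

// Print one "key: value" line per setting.
foreach (pcre_environment_report() as $key => $value) {
    echo $key, ': ', var_export($value, true), "\n";
}
```

Run it on both machines and compare the output line by line; any line that differs is a candidate explanation for the different preg_replace results.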

