PHP: replace symbols with all possible variants

I have an array of symbols that I want to replace, but I need to generate all the possibilities:

$lt = array(
    'a' => 'ą',
    'e' => 'ę',
    'i' => 'į',
);

For example, if I have this string:

tazeki

there can be a large number of results:

tązeki
tazęki
tązęki
tazekį
tązekį
tazękį
tązękį

My question is: what formula should I use to cover all the variants?

I'm not sure whether you can do this with keys and values, but it definitely works with two arrays.

$find = array('ą','ę','į');
$replace = array('a', 'e', 'i');
$string = 'tązekį';
echo str_replace($find, $replace, $string);

I'm not sure I understand your question, but here is my answer :-)

$word = 'tazeki';
$word_arr = array();
$word_arr[] = $word;

//Loop through the $lt array (defined in the question): $key is the character
//to search for, $letter is the character to replace it with
foreach($lt as $key=>$letter) {

    //Loop through each char in the $word-string
    for( $i = 0; $i <= strlen($word)-1; $i++ ) {
        $char = substr( $word, $i, 1 ); 

        //If the current character in the word equals $key from the $lt array,
        //add a word to $word_arr in which that character is replaced with
        //$letter from the $lt array
        if ($char === $key) {
            $word_arr[] = str_replace($char, $letter, $word);
        }

    } 

}

var_dump($word_arr);

Here is an implementation in PHP:

<?php
/**
 * String variant generator
 */
class stringVariantGenerator
{
    /**
     * Contains assoc of char => array of all its variations
     * @var array
     */
    protected $_mapping = array();

    /**
     * Class constructor
     * 
     * @param array $mapping Assoc array of char => array of all its variation
     */
    public function __construct(array $mapping = array())
    {
        $this->_mapping = $mapping;
    }

    /**
     * Generate all variations
     * 
     * @param string $string String to generate variations from 
     * 
     * @return array Assoc containing variations
     */
    public function generate($string) 
    {
        return array_unique($this->parseString($string));
    }

    /**
     * Parse a string and returns variations
     * 
     * @param string $string String to parse
     * @param int $position Current position analyzed in the string
     * @param array $result Assoc containing all variations
     * 
     * @return array Assoc containing variations
     */
    protected function parseString($string, $position = 0, array &$result = array()) 
    {
        if ($position <= strlen($string) - 1)
        {
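            if (isset($this->_mapping[$string[$position]]))
            {
                foreach ($this->_mapping[$string[$position]] as $translatedChar)
                {
                    // substr_replace() keeps multi-byte replacements such as 'ą' intact
                    $variant = substr_replace($string, $translatedChar, $position, 1);
                    $this->parseString($variant, $position + strlen($translatedChar), $result);
                }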
            }
            else
            {
                $this->parseString($string, $position + 1, $result);
            }
        }
        else
        {
            $result[] = $string;
        }

        return $result;
    }
}

// This is where you define what are the possible variations for each char
$mapping = array(
    'e' => array('#', '_'),
    'p' => array('*'),
);

$word = 'Apple love!';
$generator = new stringVariantGenerator($mapping);
print_r($generator->generate($word));

This returns:

Array
(
    [0] => A**l# lov#!
    [1] => A**l# lov_!
    [2] => A**l_ lov#!
    [3] => A**l_ lov_!
)

In your case, if you want the letter itself to count as a valid translation value, just add it to the array.

$lt = array(
    'a' => array('a', 'ą'),
    'e' => array('e', 'ę'),
    'i' => array('i', 'į'),
);
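
As a rough illustration of that mapping format, here is a minimal standalone sketch (not the class above) that expands a word one character at a time. The function name expandVariants is hypothetical, and the sketch assumes UTF-8 input so that preg_split() with the /u modifier can split the word into characters.

```php
<?php
// Hypothetical helper: expands every character that has alternatives in $mapping.
function expandVariants(string $word, array $mapping): array
{
    $variants = [''];
    // Split into UTF-8 characters (assumes valid UTF-8 input)
    foreach (preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        // Characters without a mapping only have themselves as an option
        $options = $mapping[$char] ?? [$char];
        $next = [];
        foreach ($variants as $prefix) {
            foreach ($options as $option) {
                $next[] = $prefix . $option;
            }
        }
        $variants = $next;
    }
    return $variants;
}

$lt = [
    'a' => ['a', 'ą'],
    'e' => ['e', 'ę'],
    'i' => ['i', 'į'],
];

$variants = expandVariants('tazeki', $lt);
echo count($variants), "\n";          // 8: the original word plus 7 variants
echo implode("\n", $variants), "\n";
```

Because each letter lists itself as an option, the unchanged word is part of the result; drop the first element if you only want the modified variants.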

I assume your array has a known number of elements, and I assume that number is 3. If your $lt array has additional elements, you will need additional loops.

$lt = array(
   'a' => array('a', 'x'),
   'e' => array('e', 'x'),
   'i' => array('i', 'x')
);
$str = 'tazeki';
foreach ($lt['a'] as $a)
    foreach ($lt['e'] as $b)
        foreach ($lt['i'] as $c) {
            $newstr = str_replace(array_keys($lt), array($a, $b, $c), $str);
            echo "$newstr<br />\n";
        }

If the number of elements in $lt is unknown or variable, this is not a good solution.
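
For a variable number of elements, one possible workaround is to build the nested loops as data instead of code. The sketch below is only an assumption about how that could look: cartesianChoices is a hypothetical helper name, and strtr() is swapped in for str_replace() so that a replacement is never replaced again by a later pair.

```php
<?php
// Hypothetical helper: one combined choice per key of $lt, for any number of keys.
function cartesianChoices(array $lt): array
{
    $choices = [[]];
    foreach ($lt as $key => $options) {
        $next = [];
        foreach ($choices as $choice) {
            foreach ($options as $option) {
                // Extend each partial choice with one option for the current key
                $next[] = $choice + [$key => $option];
            }
        }
        $choices = $next;
    }
    return $choices;
}

$lt = [
    'a' => ['a', 'x'],
    'e' => ['e', 'x'],
    'i' => ['i', 'x'],
];
$str = 'tazeki';

$results = [];
foreach (cartesianChoices($lt) as $choice) {
    // strtr() with an array replaces each key once and never rescans its own output
    $results[] = strtr($str, $choice);
}
echo implode("\n", $results), "\n";
```

Like the nested loops, this replaces every occurrence of a letter with the same choice, so a word with a repeated letter does not get per-position variants.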

Here is a solution written specifically for your task. You can pass in any word and any replacement array, and it should work.

<?php

function getCombinations($word, $charsReplace)
{
    $charsToSplit = array_keys($charsReplace);

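    // escape the characters so that regex metacharacters in the keys are matched literally
    $quoted = array_map(function ($char) { return preg_quote($char, '/'); }, $charsToSplit);
    $pattern = '/('.implode('|', $quoted).')/';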

    // split whole word into parts by replacing symbols
    $parts = preg_split($pattern, $word, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $replaceParts = array();
    $placeholder = '';

    // create a string with placeholders (%s) for sprintf and an array of the symbols to replace
    foreach ($parts as $wordPart) {
        if (isset($charsReplace[$wordPart])) {
            $replaceParts[] = $wordPart;
            $placeholder .= '%s';
        } else {
            $placeholder .= $wordPart;
        }
    }

    $paramsCnt = count($replaceParts);
    $combinations = array();
    $combinationsCnt = pow(2, $paramsCnt);

    // iterate all combinations (with help of binary codes)
    for ($i = 0; $i < $combinationsCnt; $i++) {
        $mask = sprintf('%0'.$paramsCnt.'b', $i);
        $sprintfParams = array($placeholder);
        foreach ($replaceParts as $index => $char) {
            $sprintfParams[] = $mask[$index] == 1 ? $charsReplace[$char] : $char;
        }
        // fill current combination into placeholder and collect it in array
        $combinations[] = call_user_func_array('sprintf', $sprintfParams);
    }

    return $combinations;
}


$lt = array(
    'a' => 'ą',
    'e' => 'ę',
    'i' => 'į',
);
$word = 'stazeki';

$combinations = getCombinations($word, $lt);

print_r($combinations);

// Output:
// Array
// (
//     [0] => stazeki
//     [1] => stazekį
//     [2] => stazęki
//     [3] => stazękį
//     [4] => stązeki
//     [5] => stązekį
//     [6] => stązęki
//     [7] => stązękį
// )
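
The "binary codes" mentioned in the loop comment are just the numbers 0 to 2^n - 1 written as zero-padded binary strings, one digit per replaceable letter. A small sketch of that step on its own, assuming three replaceable letters as in the example:

```php
<?php
$paramsCnt = 3;                       // three replaceable letters: a, e, i
$masks = [];
for ($i = 0; $i < 2 ** $paramsCnt; $i++) {
    // '%03b' formats $i in binary, left-padded with zeros to 3 digits
    $masks[] = sprintf('%0' . $paramsCnt . 'b', $i);
}
echo implode(' ', $masks), "\n";      // 000 001 010 011 100 101 110 111
```

A 1 at position k of the mask means "use the replacement for the k-th letter", a 0 means "keep the original letter".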

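This should work for you; it is simple and easy. The full code is at the end of this answer.

What does this code do?

1. Data section

In the data section I simply define the string and the replacements for the single characters as an associative array (search character as key, replacement as value).

2. getReplacements() function

This function gets all combinations of the characters that have to be replaced, in this format: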

key   = index in the string
value = character

So in this code example, the array looks like this:

Array (
    [0] => Array (
        [1] => a
    )    
    [1] => Array (
        [3] => e
    )   
    [2] => Array (
        [3] => e
        [1] => a
    )    
    [3] => Array (
        [5] => i
    )    
    [4] => Array (
        [5] => i
        [1] => a
    )    
    [5] => Array (
        [5] => i
        [3] => e
    )    
    [6] => Array (
        [5] => i
        [3] => e
        [1] => a
    )

)

As you can see, this array contains all combinations of the characters that have to be replaced, in the following format:

[0] => Array (
     //^^^^^ The entire sub array is the combination which holds the single characters which will be replaced
    [1] => a
   //^     ^ A single character of the full combination which will be replaced
   //| The index of the character in the string (this is so that it also works if a character occurs multiple times in your string)
   // e.g. 1 ->  t *a* z e k i
   //            ^  ^  ^ ^ ^ ^
   //            |  |  | | | |
   //            0 *1* 2 3 4 5
)    

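So how does it get all the combinations?

Quite simple: I loop through every character that I want to replace with a foreach loop, and then I loop through every combination I already have and combine it with the character that is the current value of the foreach loop.

For this to work, you have to start with an empty array. Here is a simple example to show what I mean: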

Characters which have to be replaced (Empty array is '[]'): [1, 2, 3]

                               //new combinations for the next iteration
                               |
Character loop for NAN*:

    Combinations:
                  - []         |  -> []

Character loop for 1:

    Combinations:
                  - []    + 1  |  -> [1]    

Character loop for 2:

    Combinations:
                  - []    + 2  |  -> [2]
                  - [1]   + 2  |  -> [1,2]    

Character loop for 3:

    Combinations:
                  - []    + 3  |  -> [3]
                  - [1]   + 3  |  -> [1,3]
                  - [2]   + 3  |  -> [2,3]         
                  - [1,2] + 3  |  -> [1,2,3]    
                               //^ All combinations here

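* NAN: not a number

As you can see, there are always (2^n)-1 combinations, where n is the number of characters to replace. This method also leaves an empty array in the combinations array, so before I return the array I use array_filter() to remove all empty arrays and array_values() to reindex the whole array.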
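
The walkthrough above can be reproduced with a short sketch on the plain values [1, 2, 3]. It is only an illustration of the counting argument; it uses array_merge() on value lists rather than the index-keyed arrays of the real function.

```php
<?php
$elements = [1, 2, 3];
$combinations = [[]];                 // start with the empty combination
foreach ($elements as $element) {
    // foreach by value iterates over a copy, so appending inside the loop is safe
    foreach ($combinations as $combination) {
        $combinations[] = array_merge($combination, [$element]);
    }
}
// Drop the empty combination and reindex
$combinations = array_values(array_filter($combinations));

echo count($combinations), "\n";      // 7, i.e. 2^3 - 1
foreach ($combinations as $c) {
    echo '[', implode(',', $c), "]\n";
}
```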

3. Replacement part

To get all characters from the string that will be used to build the combinations, I use this line:

array_intersect(str_split($str), array_keys($replace))

This uses array_intersect() to get all matches between the string, split into an array with str_split(), and the keys of the replace array, obtained with array_keys().

In this code, the array that you pass to the getReplacements() function looks like this:

Array
(
    [1] => a
   //^     ^ The single character which is in the string and also in the replace array
   //| Index in the string from the character
    [3] => e
    [5] => i
)
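
A quick way to convince yourself of this: array_intersect() preserves the keys of its first argument, so the string indexes survive. The word 'banana' below is a made-up second example to show what happens with a repeated letter; note that str_split() works on bytes, so this assumes the searched letters are single-byte characters.

```php
<?php
$str = 'tazeki';
$replace = ['a' => 'ą', 'e' => 'ę', 'i' => 'į'];

// str_split() yields [0 => 't', 1 => 'a', ...]; array_intersect() keeps those keys
$positions = array_intersect(str_split($str), array_keys($replace));
print_r($positions);                  // keys 1, 3, 5 with values a, e, i

// A repeated letter shows up once per position
$repeated = array_intersect(str_split('banana'), array_keys($replace));
print_r($repeated);                   // keys 1, 3, 5, all with value a
```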

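4. Replace all combinations

Finally, you only have to apply every combination to the source string using the replace array. For this I loop through each combination and replace every character of the string that is in the combination with the matching character from the replace array.

This can be done with this one line: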

$tmp = substr_replace($tmp, $replace[$v], $k, 1);
     //^^^^^^^^^^^^^^       ^^^^^^^^^^^^  ^^  ^ Length of the replacement
     //|                    |             | Index from the string, where it should replace
     //|                    | Get the replaced character to replace it
     //| Replaces every single character one by one in the string 

For more information about substr_replace(), see the manual: http://php.net/manual/en/function.substr-replace.php

After this line, you just add the replaced string to the result array and reset the working string to the source string again.
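
One detail worth spelling out: substr_replace() works on byte offsets, and in UTF-8 a character like 'ą' takes 2 bytes. The combinations above list the highest index first, so the replacements run from right to left and the remaining indexes stay valid. A small sketch of the difference, assuming the source file is saved as UTF-8:

```php
<?php
$str = 'tazeki';   // assumed to be UTF-8 encoded

// Right to left: index 3 is replaced first, so index 1 is still correct afterwards
$tmp = substr_replace($str, 'ę', 3, 1);
$tmp = substr_replace($tmp, 'ą', 1, 1);
echo $tmp, "\n";   // tązęki

// Left to right: 'ą' is 2 bytes long, so the old index 3 now points at 'z'
$bad = substr_replace($str, 'ą', 1, 1);
$bad = substr_replace($bad, 'ę', 3, 1);
echo $bad, "\n";   // tąęeki
```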


Code:

<?php

    //data
    $str = "tazeki"; 

    $replace = array(
        'a' => 'ą',
        'e' => 'ę',
        'i' => 'į',
    );


    function getReplacements($array) {

        //initialize array with one empty combination
        $results = [[]];

        //get all combinations
        foreach ($array as $k => $element) {
            foreach ($results as $combination)
                $results[] = [$k => $element] + $combination;
        }

        //return filtered array
        return array_values(array_filter($results));

    }

    //get all combinations to replace
    $combinations = getReplacements(array_intersect(str_split($str), array_keys($replace)));

    //replace all combinations
    $result = [];
    foreach($combinations as $word) {
        $tmp = $str;
        foreach($word as $k => $v)
            $tmp = substr_replace($tmp, $replace[$v], $k, 1);
        $result[] = $tmp;
    }

    //print data
    print_r($result);

?>

Output:

Array
(
    [0] => tązeki
    [1] => tazęki
    [2] => tązęki
    [3] => tazekį
    [4] => tązekį
    [5] => tazękį
    [6] => tązękį
)

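Well, although @Rizier123 and others have already provided good answers with clear explanations, I'd like to leave my contribution too. This time, favoring short source code over readability... ;-)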

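$lt   = array('a' => 'ą', 'e' => 'ę', 'i' => 'į');
$word = 'tazeki';

// Give each replaceable position its own bit: $r[bit] = [replacement, index]
for ($i = 0, $u = 0, $r = []; $i < strlen($word); $i++)
    isset($lt[$word[$i]]) && $r[1 << $u++] = [$lt[$word[$i]], $i];

// Every non-zero bitmask $i selects one combination of replacements
for ($i = 1, $res = []; $i < (1 << count($r)); $i++) {
    // Go from the highest bit (rightmost position) down, so multi-byte
    // replacements do not shift the offsets that are still to be used
    for ($w = $word, $u = 1 << (count($r) - 1); $u > 0; $u >>= 1)
        ($i & $u) && $w = substr_replace($w, $r[$u][0], $r[$u][1], 1);
    $res[] = $w;
}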

print_r($res);

Output:

Array
(
    [0] => tązeki
    [1] => tazęki
    [2] => tązęki
    [3] => tazekį
    [4] => tązekį
    [5] => tazękį
    [6] => tązękį
)