作者:杰_Jb_131 | 来源:互联网 | 2020-09-12 00:09
搜索联想功能是各大搜索引擎具备的基础功能,如下图所示,这个功能简化了用户的输入行为,并且能够给用户推荐热门的搜索词,下面我们来讲一下如何用php实现搜索联想的功能。
搜索联想功能是各大搜索引擎具备的基础功能,如下图所示,这个功能简化了用户的输入行为,并且能够给用户推荐热门的搜索词,下面我们来讲一下如何用php实现搜索联想的功能。
当查询的时候只需要从根开始按字符沿着树进行深度遍历,就可以把已该词为前缀的其他查询词查找出来。
代码实现
用于实现搜索联想功能的核心方法有两个:
1、将查询词的数据集构建成Trie树
2、查找以某个查询词为前缀的所有查询词
第一步:构建Trie树
注意由于一个字符串有中文有英文,所以对每个字符串使用以下代码进行了分割,将字符串转化成了一个字符的数组
$charArray = preg_split('/(?然后对每个字符串执行addWordToTrieTree方法,这个方法将一个词加入到Trie树中,这里用到了递归的方法
public function buildTrieTree($strList) {
$tree = [];
foreach($strList as $str) {
$charArray = preg_split('/(?addWordToTrieTree($charArray, $tree);
}
return $tree;
}
public function addWordToTrieTree($charArray, $tree) {
if (count($charArray) == 0) {
return [];
}
$char = $charArray[0];
$leftStr = array_slice($charArray, 1);
$tree[$char] = $this->addWordToTrieTree($leftStr, $tree[$char]);
return $tree;
}
第二步:查询前缀词
查询前缀词即给定一个字符串,查询树中所有以该串为前缀的字符串,也就是联想的过程。
首先调用findSubTree方法,从Trie中找到该前缀所在的子树,然后调用traverseTree方法,遍历这颗子树,把所有的字符串都提取出来,这里也是用了递归的方法。
public function queryPrefix($prefix) {
$charArray = preg_split('/(?findSubTree($charArray, $this->tree);
$words = $this->traverseTree($subTree);
foreach($words as &$word) {
$word = $prefix . $word;
}
return $words;
}
public function findSubTree($charArray, $tree) {
foreach($charArray as $char) {
if (in_array($char, array_keys($tree))) {
$tree = $tree[$char];
} else {
return [];
}
}
return $tree;
}
public function traverseTree($tree) {
$words = [];
foreach ($tree as $node => $subTree) {
if (empty($subTree)) {
$words[] = $node;
return $words;
}
$chars = $this->traverseTree($subTree);
foreach($chars as $char) {
$words[] = $node . $char;
}
}
return $words;
}
代码与测试结果
完整代码:
tree = $this->buildTrieTree($strList);
}
public function queryPrefix($prefix) {
$charArray = preg_split('/(?findSubTree($charArray, $this->tree);
$words = $this->traverseTree($subTree);
foreach($words as &$word) {
$word = $prefix . $word;
}
return $words;
}
public function findSubTree($charArray, $tree) {
foreach($charArray as $char) {
if (in_array($char, array_keys($tree))) {
$tree = $tree[$char];
} else {
return [];
}
}
return $tree;
}
public function traverseTree($tree) {
$words = [];
foreach ($tree as $node => $subTree) {
if (empty($subTree)) {
$words[] = $node;
return $words;
}
$chars = $this->traverseTree($subTree);
foreach($chars as $char) {
$words[] = $node . $char;
}
}
return $words;
}
/**
* 将字符串的数组构建成Trie树
*
* @param [array] $strList
* @return void
*/
public function buildTrieTree($strList) {
$tree = [];
foreach($strList as $str) {
$charArray = preg_split('/(?addWordToTrieTree($charArray, $tree);
}
return $tree;
}
/**
* 把一个词加入到Trie树中
*
* @param [type] $charArray
* @param [type] $tree
* @return void
*/
public function addWordToTrieTree($charArray, $tree) {
if (count($charArray) == 0) {
return [];
}
$char = $charArray[0];
$leftStr = array_slice($charArray, 1);
$tree[$char] = $this->addWordToTrieTree($leftStr, $tree[$char]);
return $tree;
}
public function getTree() {
return $this->tree;
}
}
$strList = ['春风十里','春天在哪里','一百万个可能','一千年以后','后来','后来的我们','春天里','后会无期'];
$trieTree = new TrieTree($strList);
print_r($trieTree->getTree());
$prefix = '春';
$queryRes = $trieTree->queryPrefix($prefix);
print_r($queryRes);
将’春风十里’,‘春天在哪里’,‘一百万个可能’,‘一千年以后’,‘后来’,‘后来的我们’,‘春天里’,'后会无期’这些歌名作为数据集,构建一个Trie树并进行测试。
可以看到输出以下结果
Trie树:
Array
(
[春] => Array
(
[风] => Array
(
[十] => Array
(
[里] => Array
(
)
)
)
[天] => Array
(
[在] => Array
(
[哪] => Array
(
[里] => Array
(
)
)
)
[里] => Array
(
)
)
)
[一] => Array
(
[百] => Array
(
[万] => Array
(
[个] => Array
(
[可] => Array
(
[能] => Array
(
)
)
)
)
)
[千] => Array
(
[年] => Array
(
[以] => Array
(
[后] => Array
(
)
)
)
)
)
[后] => Array
(
[来] => Array
(
[的] => Array
(
[我] => Array
(
[们] => Array
(
)
)
)
)
[会] => Array
(
[无] => Array
(
[期] => Array
(
)
)
)
)
)
查询以“春为前缀的字符串”
Array
(
[0] => 春风十里
[1] => 春天在哪里
[2] => 春天里
)
以上就是PHP实现搜索联想功能(基于字典树算法)的详细内容,更多请关注 第一PHP社区 其它相关文章!