早些年就玩过内存特征码搜索,
学习过Sunday,必须他理解最简单
然后写了个X86 X64通用特征码匹配
发帖这边
https://www.chinapyg.com/thread-98948-1-1.html
然后又把kmp学了一下,分享这边
https://www.chinapyg.com/thread-135745-1-1.html
现在闲着把BM的思想也理解了一下,
参考了一些先人的写法逻辑,终于感觉能写出来了
把搜索的那几种算法都玩一次,
暴力就不说了
kmp:我的理解就是最简单的匹配失败后,尝试不断向后推进,记录这个尝试结果,用于后续
sunday : 利用hash表开放式寻址,直接定位char的定长256,记录子串字符出现最后一个位置来推进
Boyer-Moore :这个大多数文章都只有思路,坏字符应该就说的Sunday方式,而好字符应该就是kmp这类思路,从子串推进,然后结合2者
我就把这2者结合来写了,
主要推进Sunday也就是坏字符的思路,
加入kmp之类的好字符思路兜底,
因为只是兜底所以没有去跟坏字符比长度取最大来推进,
毕竟对比再去最大,这花销,好多时候达到的推进估计补不回来,
所以我就直接推进kmp改进后的表了,不去对比取最大,节约点开销
听说玩这些的估计不会有易语言玩家,
所以我特别改了个英文版来发
源码如下:
intptr_t BM_find(const char
*
t, size_t lt, const char
*
s, size_t ls)
{
/
/
建立跳转表
size_t
*
Tpkmp
=
new size_t[ls];
intptr_t k
=
0
, coiled
=
1
;
/
/
计算子串特征_[
next
表]
for
(size_t i
=
1
; i < ls; i
+
+
)
{
if
(s[i]
=
=
s[k]) {
+
+
k; }
else
{ k
=
(s[i]
=
=
s[
0
]) ?
1
:
0
; }
Tpkmp[i
-
1
]
=
i
-
k;
}
/
/
优化表
k
=
-
1
;
for
(size_t i
=
0
; i < ls
-
1
; i
+
+
)
{
if
(s[i]
=
=
s[ls
-
1
]) { k
=
i; }
if
(s[i]
=
=
s[i
+
1
] && coiled)
{
Tpkmp[i
+
1
]
=
Tpkmp[i] >
+
+
coiled ? Tpkmp[i] : coiled;
}
else
{
coiled
=
0
;
}
}
Tpkmp[
0
]
=
ls
-
k
-
1
;
for
(size_t i
=
1
; i < ls; i
+
+
)
{
if
(Tpkmp[ls
-
i] < Tpkmp[
0
]) { Tpkmp[ls
-
i]
=
Tpkmp[
0
]; }
}
/
/
开始制作哈希表
char Thash[
256
];
for
(size_t i
=
0
; i <
256
; i
+
+
) { Thash[i]
=
ls; }
for
(size_t i
=
0
; i < ls; i
+
+
) { Thash[(size_t)s[i] &
0xff
]
=
ls
-
1
-
i; }
/
/
制表到此完成
ls
-
=
1
;
lt
-
=
ls;
intptr_t out
=
-
1
;
/
/
开始遍历搜索
for
(size_t i
=
0
; i < lt;)
{
/
/
坏字符
if
(Thash[(size_t)
*
(t
+
i
+
ls) &
0xff
]) { i
+
=
Thash[(size_t)
*
(t
+
i
+
ls) &
0xff
]; goto fail; }
/
/
好字符
for
(size_t n
=
0
; n < ls; n
+
+
)
{
if
(
*
(t
+
i
+
n) !
=
*
(s
+
n)) { i
+
=
Tpkmp[n]; goto fail; }
}
/
/
找到_结束循环
out
=
i;
break
;
fail:;
}
/
/
收尾工作
delete[] Tpkmp;
return
out;
}
intptr_t BM_find(const char
*
t, size_t lt, const char
*
s, size_t ls)
{
/
/
建立跳转表
size_t
*
Tpkmp
=
new size_t[ls];
intptr_t k
=
0
, coiled
=
1
;
/
/
计算子串特征_[
next
表]
for
(size_t i
=
1
; i < ls; i
+
+
)
{
if
(s[i]
=
=
s[k]) {
+
+
k; }
else
{ k
=
(s[i]
=
=
s[
0
]) ?
1
:
0
; }
Tpkmp[i
-
1
]
=
i
-
k;
}
/
/
优化表
k
=
-
1
;
for
(size_t i
=
0
; i < ls
-
1
; i
+
+
)
{
if
(s[i]
=
=
s[ls
-
1
]) { k
=
i; }
if
(s[i]
=
=
s[i
+
1
] && coiled)
{
Tpkmp[i
+
1
]
=
Tpkmp[i] >
+
+
coiled ? Tpkmp[i] : coiled;
}
else
{
coiled
=
0
;
}
[注意]传递专业知识、拓宽行业人脉——看雪讲师团队等你加入!