当前位置: 开发笔记 > 编程语言 > 正文

查找字符串是否是迭代子字符串？-Findingifastringisaniterativesubstring?

作者：刘德华 | 来源：互联网 | 2023-10-13 08:45

IhaveastringS.HowcanIfindifthestringfollowsSnT.我有一个字符串S.我怎么能找到字符串是否遵循SnT。Examp

I have a string S. How can I find if the string follows S = nT.

我有一个字符串S.我怎么能找到字符串是否遵循S = nT。

Examples:
Function should return true if
1) S = "abab"
2) S = "abcdabcd"
3) S = "abcabcabc"
4) S = "zzxzzxzzx"

示例:如果1)S =“abab”2)S =“abcdabcd”3)S =“abcabcabc”4)S =“zzxzzxzzx”,则函数应返回true

But if S="abcb" returns false.

但是如果S =“abcb”返回false。

I though maybe we can repeatedly call KMP on substrings of S and then decide.

我虽然也许我们可以在S的子串上反复调用KMP然后决定。

eg: for "abab": call on KMP on "a". it returns 2(two instances). now 2*len("a")!=len(s)
call on KMP on "ab". it returns 2. now 2*len("ab")==len(s) so return true

例如:对于“abab”:在“a”上拨打KMP。它返回2(两个实例)。现在2 * len(“a”)!= len(s)在“ab”上拨打KMP。它返回2.现在2 * len(“ab”)== len(s)所以返回true

Can you suggest any better algorithms?

你能建议更好的算法吗?

8 个解决方案

#1

I can think of a heuristic, only call KMP on a sub string if Len(original string)/Len of(sub string) is a positive integer.

如果Len(原始字符串)/ Len(子字符串)是一个正整数,我可以想到一个启发式,只在子字符串上调用KMP。

Also the maximum length of the sub string has to be less than N/2.

此外,子串的最大长度必须小于N / 2。

EDIT

Using these Heuristics Iwrote the follwing python code because my C is rusty at the moment

使用这些启发式方法我写了下面的python代码,因为我的C现在生锈了

oldstr='ABCDABCD'    

for i in xrange(0,len(oldstr)/2):
       newslice=oldstr[0:i+1]
         if newslice*(len(oldstr)/len(newslice)) == oldstr:
             print 'pattern found', newslice
             break

#2

You actually only need to care about testing substring lengths that are equal to the full string length divided by a prime number. The reason is: If S contains n copies of T, and n is not prime, then n = ab, and so S actually also contains a copies of bT (where "bT" means "T repeated b times"). This is an extension of anijhaw's answer.

实际上,您只需要关心测试子串长度等于完整字符串长度除以素数。原因是:如果S包含n个拷贝的T,并且n不是素数,则n = ab,因此S实际上还包含bT的副本(其中“bT”表示“T重复b次”)。这是anijhaw答案的延伸。

int primes[] = { 2, 3, 5, 7, 11, 13, 17 };  /* There are one or two more... ;) */
int nPrimes = sizeof primes / sizeof primes[0];

/* Passing in the string length instead of assuming ASCIIZ strings means we
 * don't have to modify the string in-place or allocate memory for new copies
 * to handle recursion. */
int is_iterative(char *s, int len) {
    int i, j;
    for (i = 0; i

 
Notice that when recursing to find even shorter repeated substrings, we don't need to check the entire string again, just the first larger repeat -- since we've already established that the remaining large repeats are, well, repeats of the first one. :) 
请注意,当递归查找更短的重复子串时,我们不需要再次检查整个字符串,只需要检查第一个较大的重复 - 因为我们已经确定剩余的大重复是,重复第一个重复。 :)

                        
                           
							  
							    #3
							    
							    
							      
1  
I don't see that the KMP algorithm helps in this case. It is not a matter of determining where to begin the next match. It seems that one way to reduce the search time is to start with the longest possibility (half the length) and work downward. The only lengths that neeed to be tested are lengths that evenly divide into the total length. Here is an example in Ruby. I should add that I realize the question was tagged as C, but this was just a simple way to show the algorithm I was thinking about (and allowed me to test that it worked). 
在这种情况下,我没有看到KMP算法有帮助。这不是确定从哪里开始下一场比赛的问题。似乎减少搜索时间的一种方法是从最长的可能性(长度的一半)开始并向下工作。需要测试的唯一长度是均匀分成总长度的长度。这是Ruby中的一个例子。我应该补充一点,我意识到问题被标记为C,但这只是一种简单的方式来显示我正在考虑的算法(并允许我测试它是否有效)。 
class String
def IsPattern( )
    len = self.length
    testlen = len / 2
    # the fastest is to start with two entries and work down
    while ( testlen > 0 )
        # if this is not an even divisor then it can't fit the pattern
        if ( len % testlen == 0 )
            # evenly divides, so it may match
            if ( self == self[0..testlen-1] * ( len / testlen ))
                return true
            end

        end
        testlen = testlen - 1
    end
    # must not have matched
    false
end
end

if __FILE__ == $0

   ARGV.each do |str|
       puts "%s, %s" % [str, str.IsPattern ? "true" : "false" ]
   end

end



[C:\test]ruby patterntest.rb a aa abab abcdabcd abcabcabc zzxzzxzzx abcd
a, false
aa, true
abab, true
abcdabcd, true
abcabcabc, true
zzxzzxzzx, true
abcd, false

							     
							                          
                           
							  
							    #4
							    
							    
							      
0  
I suppose you could try the following algorithm: 
我想你可以尝试以下算法: 
Lets L to be a possible substring length which generates the original word. For L from 1 to strlen(s)/2 check if the first character acquires in all L*i positions for i from 1 to strlen(s)/L. If it does then it could be a possible solution and you should check it with memcmp, if not try the next L. Of course you can skip some L values which are not dividing strlen(s). 
让L成为可能产生原始单词的子串长度。对于L从1到strlen(s)/ 2,检查第一个字符是否在i的所有L * i位置从1到strlen(s)/ L获得。如果它确实那么它可能是一个可能的解决方案你应该用memcmp检查它,如果没有尝试下一个L.当然你可以跳过一些没有划分strlen的L值。
							     
							                          
                           
							  
							    #5
							    
							    
							      
0  
Try this: 
    char s[] = "abcabcabcabc";
int nStringLength = strlen (s);
int nMaxCheckLength = nStringLength / 2;
int nThisOffset;
int nNumberOfSubStrings;
char cMustMatch;
char cCompare;
BOOL bThisSubStringLengthRepeats;
// Check all sub string lengths up to half the total length
for (int nSubStringLength = 1;  nSubStringLength <= nMaxCheckLength;  nSubStringLength++)
{
    // How many substrings will there be?
    nNumberOfSubStrings = nStringLength / nSubStringLength;

    // Only check substrings that fit exactly
    if (nSubStringLength * nNumberOfSubStrings == nStringLength)
    {
        // Assume it's going to be ok
        bThisSubStringLengthRepeats = TRUE;

        // check each character in substring
        for (nThisOffset = 0;  nThisOffset 

							     

							  
                        
                           
							  
							    #6
							    
							    
							      
0  
This is Java code but you should get the idea: 
这是Java代码,但你应该明白这个想法: 
        String str = "ababcababc";
    int repPos = 0;
    int repLen = 0;
    for( int i = 0; i 
 
This will return the length of the shortest repeating chunk or the length of the string if there's no repetition. 
如果没有重复,这将返回最短重复块的长度或字符串的长度。
							     

							  
                        
                           
							  
							    #7
							    
							    
							      
0  
You can build the suffix array of the string, sort it.
 Now look for series of ever doubling suffixes and when you've reached one that's the size of the entire string (S) the first in the series will give you T. 
您可以构建字符串的后缀数组,对其进行排序。现在寻找一系列加倍的后缀,当你达到整个字符串(S)的大小时,系列中的第一个会给你T. 
For example: 
abcd <-- T
abcdabcd <-- S
bcd
bcdabcd
cd
cdabcd
d
dabcd

x
xzzx
xzzxzzx
zx
zxzzx
zxzzxzzx
zzx <-- T
zzxzzx
zzxzzxzzx <-- S

a
apa
apapa
apapapa
pa <-- T
papa
papapa <-- Another T, not detected by this algo
papapapa <-- S

							     
							                          
                           
							  
							    #8
							    
							    
							      
0  
The brute force approach would be to pick all possible substrings and see if they can form the entire string.  
蛮力方法是挑选所有可能的子串,看看它们是否可以形成整个弦。 
We can do one better using the observation that for a substring to be a valid candidate len(str) % len(substr) == 0. This can be deduced from the problem statement. 
我们可以使用观察结果更好地做一个子字符串作为有效候选len(str)%len(substr)== 0.这可以从问题陈述中推断出来。 
Here is the full code: 
这是完整的代码: 
bool isRational(const string &str){
    int len = str.length();
    const auto &factors = getFactors(len); // this would include 1 but exclude len
    // sort(factors.begin(), factors.end()); To get out of the loop faster. Why? See https://stackoverflow.com/a/4698155/1043773
    for(auto iter = factors.rbegin(); iter != factors.rend(); ++iter){
        auto factor = *iter;
        bool result = true;
        for(int i = 0; i 
 
Notice that there is a faster variation in terms of time complexity which uses KMP. 
请注意,使用KMP的时间复杂度存在更快的变化。 
The above algorithm is O(N * factorCount(N)) But the good thing about this algorithm is it can bail out much faster than the KMP algorithm. Also the number of factors do not grow much.  
上面的算法是O(N * factorCount(N))但是这个算法的好处是它可以比KMP算法更快地拯救。此外,因素的数量也不会增长太多。 
Here is the graph of [i, factorCount(i)] for i <= 10^6  
这是i <= 10 ^ 6的[i,factorCount(i)]的图表 
 
Here is how the algorithm performs as against the KMP algorithm. Red graph is O(N * factorCount(N)) and Blue is O(N) KMP  
以下是算法如何执行KMP算法。红色图是O(N * factorCount(N)),蓝色是O(N)KMP 
The KMP code is picked from here 
从这里挑选KMP代码




    
        
                        string
                        substring
                        go
                        function
                        ci
                        instance
                        算法
                        uri
                        int
                    
    



    
        写下你的评论吧 !
        
            
                吐个槽吧,看都看了
            
            
                
                                        会员登录 | 用户注册
                                    
                
            
        

        
    

    
        推荐阅读
        
            
                                
                    
                        bit
                        寻找满足特定条件的整数N的最大和(a+b)
                    

                    
                                                
                        本文探讨了如何在给定整数N的情况下，找到两个不同的整数a和b，使得它们的和最大，并且满足特定的数学条件。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-26 19:26:18
                    

                

                
                                
                    
                        go
                        Python 的 10 个开发技巧！太实用了
                    

                    
                                                
                        1.如何在运行状态查看源代码？查看函数的源代码，我们通常会使用IDE来完成。比如在PyCharm中，你可以Ctrl+鼠标点击进入函数的源代码。那如果没有IDE呢？当我们想使用一个函 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 18:36:54
                    

                

                                
                    
                    
                
                
                                
                    
                        foreach
                        Akka BackoffSupervisor的深入解析与实践
                    

                    
                                                
                        本文详细介绍了Akka中的BackoffSupervisor机制，探讨其在处理持久化失败和Actor重启时的应用。通过具体示例，展示了如何配置和使用BackoffSupervisor以实现更细粒度的异常处理。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 15:04:09
                    

                

                
                                
                    
                        python
                        深入理解Python的os和sys模块
                    

                    
                                                
                        本文详细解析了Python中的os和sys模块，介绍了它们的功能、常用方法及其在实际编程中的应用。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-26 22:04:19
                    

                

                
                                
                    
                        python
                        Python学习笔记：使用pydoc工具查询文档
                    

                    
                                                
                        本文介绍了在Windows环境下使用pydoc工具的方法，并详细解释了如何通过命令行和浏览器查看Python内置函数的文档。此外，还提供了关于raw_input和open函数的具体用法和功能说明。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-26 17:05:56
                    

                

                
                                
                    
                        char
                        Python 实现字符串双拆分并转换为矩阵
                    

                    
                                                
                        本文介绍如何使用 Python 将一个字符串按照指定的行和元素分隔符进行两次拆分，最终将字符串转换为矩阵形式。通过两种不同的方法实现这一功能：一种是使用循环与 split() 方法，另一种是利用列表推导式。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-28 12:15:45
                    

                

                
                                
                    
                        go
                        技术分享：从动态网站提取站点密钥的解决方案
                    

                    
                                                
                            
                        
                                                
                        本文探讨了如何从动态网站中提取站点密钥，特别是针对验证码（reCAPTCHA）的处理方法。通过结合Selenium和requests库，提供了详细的代码示例和优化建议。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-28 04:11:47
                    

                

                
                                
                    
                        const
                        深入解析Spring Cloud Ribbon负载均衡机制
                    

                    
                                                
                            
                        
                                                
                        本文详细介绍了Spring Cloud中的Ribbon组件如何实现服务调用的负载均衡。通过分析其工作原理、源码结构及配置方式，帮助读者理解Ribbon在分布式系统中的重要作用。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 16:01:25
                    

                

                
                                
                    
                        replace
                        Java 序列化接口详解
                    

                    
                                                
                        本文深入探讨了 Java 中的 Serializable 接口，解释了其实现机制、用途及注意事项，帮助开发者更好地理解和使用序列化功能。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 15:06:12
                    

                

                
                                
                    
                        go
                        深入理解C++中的KMP算法：高效字符串匹配的利器
                    

                    
                                                
                            
                        
                                                
                        本文详细介绍C++中实现KMP算法的方法，探讨其在字符串匹配问题上的优势。通过对比暴力匹配（BF）算法，展示KMP算法如何利用前缀表优化匹配过程，显著提升效率。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 14:45:30
                    

                

                
                                
                    
                        replace
                        Java面试题解析
                    

                    
                                                
                        本文详细介绍了Java编程语言中的核心概念和常见面试问题，包括集合类、数据结构、线程处理、Java虚拟机（JVM）、HTTP协议以及Git操作等方面的内容。通过深入分析每个主题，帮助读者更好地理解Java的关键特性和最佳实践。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 13:55:14
                    

                

                
                                
                    
                        const
                        Unity 客户端框架设计：UI管理系统的构建
                    

                    
                                                
                        本文详细介绍了如何构建一个高效的UI管理系统，集中处理UI页面的打开、关闭、层级管理和页面跳转等问题。通过UIManager统一管理外部切换逻辑，实现功能逻辑分散化和代码复用，支持多人协作开发。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 10:28:40
                    

                

                
                                
                    
                        const
                        ImmutableX Poised to Pioneer Web3 Gaming Revolution
                    

                    
                                                
                            
                        
                                                
                        ImmutableX is set to spearhead the evolution of Web3 gaming, with its innovative technologies and strategic partnerships driving significant advancements in the industry. ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 08:55:17
                    

                

                
                                
                    
                        buffer
                        解析JSON格式文本并处理数据
                    

                    
                                                
                        本文介绍如何使用阿里云的fastjson库解析包含时间戳、IP地址和参数等信息的JSON格式文本，并进行数据处理和保存。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-26 16:06:09
                    

                

                
                                
                    
                        go
                        移动 UI 设计基础：打造简洁高效的用户界面
                    

                    
                                                
                        本章将深入探讨移动 UI 设计的核心原则，帮助开发者构建简洁、高效且用户友好的界面。通过学习设计规则和用户体验优化技巧，您将能够创建出既美观又实用的移动应用。 ...
                        [详细]
                    
                    

                    
                        蜡笔小新   2024-12-27 08:43:40