Need to find the missing values from a string based on smoothing logic

Question

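I am given a string containing numbers and "_" (missing value) symbols, and I have to replace the "_" symbols as shown in the examples below: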

Input1: "_,_,_,24"
Output1: 6,6,6,6
Input2: "40,_,_,_,60"
Output2: 20,20,20,20,20
Input4: "_,_,30,_,_,_,50,_,_"
Output4: 10,10,12,12,12,12,4,4,4

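I tried using basic for loops and if/else with two pivot points, but all of these approaches break down when the input string changes. I am finding it hard to design a general solution. I am not sure whether there is a specific Python library that can do this. Any suggestions, including pseudocode, are welcome.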

Answer 1

Try this code:

output = '10,,35,,,67,400'
output = output.split(',')
new_output = []

for i in output:
    if i != '':
        new_output.append(i)

Answer 2

It can certainly be improved, but this does the job:

string = "_,_,30,_,_,_,50,_,_"
output = string.split(',')

pos = 0
next_value = 0
last_pos = 0
last_value = 0

while pos < len(output):
    if output[pos] != '_' or (pos + 1 == len(output)):
        if output[pos] != '_':
            next_value = int(output[pos])
        else:
            next_value = 0
        new_value = (next_value + last_value) / (pos - last_pos + 1)
        for i in range(last_pos, pos + 1):
            output[i] = new_value
        last_value = new_value
        last_pos = pos
    pos += 1

print(output)

This produces a list of floats: [10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]
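The question shows the expected result as a comma-separated string of integers. A minimal sketch of converting the float list back into that form, assuming every value is a whole number as in this example (the names `smoothed` and `result` are illustrative only):

```python
# Float list as produced by the loop above (copied here so the snippet runs on its own).
smoothed = [10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]

# int() truncates toward zero, so this is only lossless when the values are whole numbers.
result = ",".join(str(int(value)) for value in smoothed)
print(result)
```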

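Additional information:

You have to walk through the array to find the non-missing values.
Once you find one, add it to the last non-missing value found (0 if there is none) and set every cell between the two milestones (both included) to that sum divided by the number of cells.
When you reach the end of the array, don't forget to do the same thing. The current value becomes 0; you add it to the previous one and share again.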

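If we take the following string: _,_,30,_,_,_,50,_,_

First we find 30. We share it between the start and the current position (3 cells, so 30 / 3 = 10).

We get: 10,10,10,_,_,_,50,_,_

Then we find 50. The previous value is 10, so we share 60 between the positions of the 10 and the 50 (i.e. 5 cells).

We get: 10,10,12,12,12,12,12,_,_

We have reached the end of the array.

0 + 12 = 12 -> we share it between the position of the last 12 and the current position (i.e. 3 cells).

We get: 10,10,12,12,12,12,4,4,4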
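The steps above can be sketched as a reusable function. This is only one possible arrangement, not necessarily how the answer's author would structure it: the names `smooth_missing` and `trace` are hypothetical, and integer division (`//`) is an assumption made so that the results match the integer outputs in the question.

```python
def smooth_missing(text, trace=False):
    """Spread each known value over the neighbouring '_' cells (hypothetical helper)."""
    cells = text.split(",")
    last_pos = 0      # position of the previous milestone
    last_value = 0    # value left at that milestone (0 before the first number)

    for pos, cell in enumerate(cells):
        is_number = cell != "_"
        is_last = pos == len(cells) - 1
        if not (is_number or is_last):
            continue  # keep scanning until a number or the end of the list

        current = int(cell) if is_number else 0
        # Share the combined amount over every cell from the previous milestone to here.
        # Integer division is an assumption; the original loop uses "/" and yields floats.
        share = (current + last_value) // (pos - last_pos + 1)
        for i in range(last_pos, pos + 1):
            cells[i] = share
        last_pos, last_value = pos, share

        if trace:
            # Show the intermediate state after each sharing step.
            print(",".join(str(c) for c in cells))

    return cells


print(smooth_missing("_,_,30,_,_,_,50,_,_", trace=True))
```

With `trace=True`, the three printed lines correspond to the three intermediate states in the walkthrough, followed by the final list.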

Answer 3

def curve_smoothing(string):
    S = string.split(',')  # split the input into a list of tokens
    index = 0              # index of the current milestone

    while index < len(S) - 1:
        if S[index] == '_':  # the string starts with '_'
            for i in range(index, len(S)):
                if S[i] != '_':  # first number found: spread it over the leading cells
                    S[index:i + 1] = [int(S[i]) // (i - index + 1) for x in range(index, i + 1)]
                    index = i
                    break
            else:  # the string contains only '_': fill it with zeros
                S[index:len(S)] = [0 for x in range(index, len(S))]

        else:  # the current element is a number
            if S[index + 1] != '_':  # the next element is also a number: advance by one
                index = index + 1
            for i in range(index + 1, len(S)):  # look for the next number after a run of '_'
                if S[i] != '_':
                    S[index:i + 1] = [(int(S[index]) + int(S[i])) // (i - index + 1) for x in range(index, i + 1)]
                    index = i
                    break
            else:  # no further number: spread the current value over the trailing cells
                S[index:len(S)] = [int(S[index]) // (len(S) - index) for x in range(index, len(S))]
    return S

S = "_,_,_,_,50"
smoothed_values = curve_smoothing(S)
print(smoothed_values)

Answer 4

Code

def getStr(st):
    s = st.split(',')
    k = {}
    counter = 0
    lstcounter = 0
    for i in range(len(s)):
        if s[i].isdigit():
            if lstcounter != i:
                k[counter] = (lstcounter,i)
                counter = counter + 1
                lstcounter = i

        if lstcounter < len(s):
            if lstcounter != len(s)-1:
                k[counter] = (lstcounter,len(s)-1)
    
    return k

def getCal(s,d):
    lst = []
    for i in range(len(d)):
        firstIndex = d[i][0]
        secondIndex = d[i][1]
        first_ele = str(s[d[i][0]])
        second_ele = str(s[d[i][1]])

        if first_ele.isdigit() and second_ele.isdigit():
            for j in range(firstIndex, secondIndex + 1):
                s[j] = ((int(first_ele) + int(second_ele)) // (secondIndex - firstIndex + 1))

        elif second_ele.isdigit():
            for j in range(firstIndex, secondIndex + 1):
                s[j] = (( int(second_ele)) // (secondIndex - firstIndex + 1))
        elif first_ele.isdigit():
            for j in range(firstIndex, secondIndex + 1):
                s[j] = (( int(first_ele)) // (secondIndex - firstIndex + 1))
    return s



def getSmootString(string):
    
    indexes = getStr(string)  # map of segment number -> (start index, end index)
    lst = getCal(string.split(','), indexes)
    
    return lst


a =  "_,_,30,_,_,_,50,_,_"
b = "40,_,_,_,60"
c = "_,_,_,24"
d = "80,_,_,_,_"
e = "10_,_,30,_,_,_,50,_,_,20"


print(getSmootString(a))
print(getSmootString(b))
print(getSmootString(c))
print(getSmootString(d))
print(getSmootString(e))

Output

[10, 10, 12, 12, 12, 12, 4, 4, 4]

[20, 20, 20, 20, 20]

[6, 6, 6, 6]

[16, 16, 16, 16, 16]

[10, 10, 12, 12, 12, 12, 8, 8, 8, 8]
