PySpark: extracting a specific value to a variable
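I have the following script.
I am a bit confused about this particular part:
datex = datetime.datetime.strptime(df1.start_time,'%Y-%m-%d %H:%M:%S')
I don't know how to extract the actual value from the start_time field and store it in the datex variable.
Can anyone help me?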
while iters < 10:
    time_to_add = iters * 900
    time_to_checkx = time_to_check + datetime.timedelta(seconds=time_to_add)
    iters = iters + 1
    session = 0
    for row in df1.rdd.collect():
        datex = datetime.datetime.strptime(df1.start_time,'%Y-%m-%d %H:%M:%S')
        print(datex)
        filterx = df1.filter(datex < time_to_checkx)
        session = session + filterx.count()
        print('current session value' + str(session))
    print(session)
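Take a look at this; I have converted your for loop. It would help if you could give me more information about the iters variable, or explain how you want it to work: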
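import datetime
import pyspark.sql.functions as F

# Spark datetime pattern: yyyy is the calendar year, HH is the hour of day (0-23)
spark_date_format = "yyyy-MM-dd HH:mm:ss"
session = 0
# time_to_check and time_to_add come from the while loop in the question
time_to_checkx = time_to_check + datetime.timedelta(seconds=time_to_add)
# Parse the string column into a timestamp so it can be compared with a Python datetime
df1 = df1.withColumn('start_time', F.to_timestamp(F.col('start_time'), spark_date_format))
# Spark evaluates the comparison for every row, so no collect() loop is needed
filterx = df1.filter(df1.start_time < time_to_checkx)
session = session + filterx.count()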
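If you do want the actual value in a Python variable, note that the strptime call in the question receives df1.start_time, which is a Column object and not a string. The string itself lives on each collected row, as row.start_time or row['start_time']. Below is a minimal sketch of that idea in plain Python. The function name sessions_before_cutoffs, the namedtuple Row stand-in, and the sample timestamps are hypothetical and only there to make the example runnable without a cluster. It also assumes that iters starts at 0 and that start_time is stored as a string.

```python
import datetime
from collections import namedtuple

FMT = '%Y-%m-%d %H:%M:%S'

def sessions_before_cutoffs(rows, time_to_check, n_windows=10, step_seconds=900):
    """For each cutoff, count the rows whose start_time is earlier than it."""
    # Extract the actual values: each row holds a plain string in row.start_time
    start_times = [datetime.datetime.strptime(row.start_time, FMT) for row in rows]
    counts = []
    for iters in range(n_windows):
        # Same cutoff arithmetic as the while loop in the question
        time_to_checkx = time_to_check + datetime.timedelta(seconds=iters * step_seconds)
        counts.append(sum(1 for datex in start_times if datex < time_to_checkx))
    return counts

# Hypothetical stand-in for df1.select('start_time').collect();
# pyspark Row objects support the same row.start_time attribute access.
Row = namedtuple('Row', ['start_time'])
rows = [
    Row('2020-01-01 00:05:00'),
    Row('2020-01-01 00:20:00'),
    Row('2020-01-01 01:00:00'),
]
counts = sessions_before_cutoffs(rows, datetime.datetime(2020, 1, 1), n_windows=4)
print(counts)  # [0, 1, 2, 2]
```

With a real DataFrame you would pass df1.select('start_time').collect() as rows. This pulls every value onto the driver, so it is only reasonable for small data. For anything larger, the filter and count approach above keeps the work inside Spark. If start_time is already a timestamp column, the collected values arrive as datetime objects and the strptime step is unnecessary.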