How to compare file sizes for two different dates in Linux using Python 3
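I'm new to Python, and I back up my server data every day.
I use a shell script to check my backup dates, but as the number of web hosts grows, the shell script needs more and more changes.
So I want to check my backup files with Python instead.
My environment:
OS: Ubuntu 16.04
Python version: 3.4.3
My directory and file structure looks like this: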
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_20171105_htdocs.tar.gz
/mnt/disk2/JP/TFP-1/Config/crontab_backup_20171105.txt
/mnt/disk2/JP/TFP-1/Config/mysql_config_backup_20170724.tar.gz
/mnt/disk2/JP/SPT_1/Web/2017/11/SPT_20171105_htdocs.tar.gz
/mnt/disk2/JP/SPT_1/Config/nginx_config_backup_20171030.tar.gz
/mnt/disk2/CN/LHD-1/Web/2017/11/LHD_20171105_htdocs.tar.gz
/mnt/disk2/CN/LHD-1/Config/crontab_backup_20171105.txt
/mnt/disk2/CN/LHD-1/Config/mysql_config_backup_20170724.tar.gz
/mnt/disk2/CN/TTY_1/Web/2017/11/TTY_20171105_htdocs.tar.gz
/mnt/disk2/CN/TTY_1/Config/nginx_config_backup_20171030.tar.gz
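Because my backup file names contain the date, my shell script subtracts yesterday's file size from today's file size.
A result of 0 means the backup file has not changed; any other result triggers an alert email to me.
(Small differences in file size don't matter, though. I only need to pay attention to files whose size differs by more than 1 GB, which is why I don't use md5 or filecmp for the comparison.)
Now I want to write a Python program that does the same thing, but I'm stuck on calculating the file sizes for two different dates.
Here is my code: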
## Import Module
import sys
import os
import re
from datetime import datetime, timedelta

# Global Variables
jpWebList = ["/mnt/disk2/JP/TFP-1/Web", "/mnt/disk2/JP/SPT_1/Web"]
jpConfigList = ["/mnt/disk2/JP/TFP-1/Config", "/mnt/disk2/JP/SPT_1/Config"]

## Function Program
#-- Get file name's time and calculate yesterday.
def findYtdFile(filePath):
    YtdData = ""
    fsize = 0
    now = datetime.now()
    aDay = timedelta(days=-1)
    yDay = now + aDay
    yDay = yDay.strftime("%Y%m%d")  # format yDay as YYYYMMDD, e.g. 20170820
    # print(yDay)  # Check yDay's value.
    # print(filePath)
    if re.search(yDay, filePath) is not None:
        # print(filePath)
        YtdData = filePath
        # print(YtdData)  # Check what kinds of file we got.
        fsize = os.path.getsize(YtdData)
        print(YtdData, "--file size is", fsize)
    return fsize

#-- Get file name's time and calculate the day before yesterday
def findDbyFile(filePath):
    DbyData = ""
    fsize = 0
    now = datetime.now()
    aDay = timedelta(days=-2)
    byDay = now + aDay
    byDay = byDay.strftime("%Y%m%d")  # format byDay as YYYYMMDD, e.g. 20170820
    # print(byDay)  # Check byDay's value.
    if re.search(byDay, filePath) is not None:
        DbyData = filePath
        fsize = os.path.getsize(DbyData)
        print(DbyData, "--file size is", fsize)
    return fsize

#--Main, Get tar.gz and txt file list.
for tmpList in jpWebList:
    for root, dirs, files in os.walk(tmpList):  # recursive to get directories and files list.
        for file in files:
            if file.endswith((".tar.gz", ".txt")):
                filePath = os.path.join(root, file)
                ytdFileSize = findYtdFile(filePath)
                dbyFileSize = findDbyFile(filePath)
                a = ytdFileSize - dbyFileSize
                print(a)
And the terminal shows:
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_backend_20171106_htdocs.tar.gz --file size is 76021633
76021633
0
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_backend_20171105_htdocs.tar.gz --file size is 76012434
-76012434
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_Test_backend_20171106_htdocs.tar.gz --file size is 62391961
62391961
0
0
0
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_Test_front_20171105_htdocs.tar.gz --file size is 82379384
-82379384
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_Test_front_20171106_htdocs.tar.gz --file size is 82379384
82379384
0
0
0
0
0
/mnt/disk2/JP/TFP-1/Web/2017/11/TFP_Test_backend_20171105_htdocs.tar.gz --file size is 62389231
-62389231
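The expected result would be something like "TFP_Test_front_20171106_htdocs.tar.gz(82379384)" minus "TFP_Test_backend_20171105_htdocs.tar.gz(62389231)", which equals 19990153.
I have tried glob, re.findall, and os.listdir, but it still doesn't work properly.
Is there something I haven't noticed? Or something I can refer to?
Thanks for your help!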
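One possible reading of the output above: each path contains only one date stamp, so for any given file at most one of findYtdFile and findDbyFile returns a non-zero size, and the subtraction never sees both days at once. Files dated with neither day give 0. The sketch below reproduces that pattern with two sizes copied from the terminal output; the `sizes` dict and the `size_if_dated` and `per_file_differences` helpers are hypothetical stand-ins (for os.path.getsize and the main loop) so that it runs without the backup tree.

```python
# Hypothetical stand-in for the files on disk: name -> size in bytes
# (names and sizes copied from the terminal output above).
sizes = {
    "TFP_backend_20171106_htdocs.tar.gz": 76021633,
    "TFP_backend_20171105_htdocs.tar.gz": 76012434,
}


def size_if_dated(name, stamp):
    """Return the size if `stamp` occurs in `name`, else 0 (like the two helpers)."""
    return sizes[name] if stamp in name else 0


def per_file_differences(ytd, dby):
    """Mirror the main loop: both lookups use the SAME file name each time."""
    return [size_if_dated(n, ytd) - size_if_dated(n, dby) for n in sorted(sizes)]


diffs = per_file_differences("20171106", "20171105")
print(diffs)  # [-76012434, 76021633]
```

To get a real difference, the two sizes have to come from two different files that belong to the same backup series.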
I can't see anything wrong with your code, but I've changed it and tried to simplify it:
import sys, os
from datetime import datetime, timedelta

jpWebList = ["/mnt/disk2/JP/TFP-1/Web", "/mnt/disk2/JP/SPT_1/Web"]
jpConfigList = ["/mnt/disk2/JP/TFP-1/Config", "/mnt/disk2/JP/SPT_1/Config"]

def get_date(d):
    # Return the date d days from now as YYYYMMDD (d=-1 is yesterday).
    aDay = timedelta(days=d)
    byDay = datetime.now() + aDay
    return byDay.strftime("%Y%m%d")

#main
today = get_date(0)
yesterday = get_date(-1)
for tmpList in jpWebList:
    for root, dirs, files in os.walk(tmpList):
        todays_files = [file for file in files if today in file and file.endswith((".tar.gz", ".txt"))]
        yesterdays_files = [file for file in files if yesterday in file and file.endswith((".tar.gz", ".txt"))]
        for todays_file in todays_files:
            # Swap the date stamp to get the name of the matching file from yesterday.
            yesterdays_file = todays_file.replace(today, yesterday)
            if yesterdays_file in yesterdays_files:
                todays_path = os.path.join(root, todays_file)
                yesterdays_path = os.path.join(root, yesterdays_file)
                size_difference = os.path.getsize(todays_path) - os.path.getsize(yesterdays_path)
                print(size_difference)
Without your folders and files I can't fully check it, but I tried it with 2 files and it worked fine. Let me know if it doesn't work.
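As a possible extension, here is a minimal sketch that also applies the 1 GB rule from the question and pairs files across sub-directories, so a backup stored under .../2017/10/ can still be matched with one under .../2017/11/ at a month boundary. The names `date_stamp`, `collect_sizes`, `large_changes`, `THRESHOLD` and `SUFFIXES` are made up for illustration. Treating "1 GB" as 1024**3 bytes and keying each file by its name with the date stamp blanked out are assumptions too, not something the original scripts do.

```python
import os
from datetime import datetime, timedelta

THRESHOLD = 1024 ** 3  # assumption: "1 GB" is taken to mean 1 GiB, in bytes
SUFFIXES = (".tar.gz", ".txt")


def date_stamp(days_offset, now=None):
    """Return the date `days_offset` days from `now` formatted as YYYYMMDD."""
    now = now or datetime.now()
    return (now + timedelta(days=days_offset)).strftime("%Y%m%d")


def collect_sizes(top, stamp):
    """Map each file name (date stamp blanked out) to its size, for files under `top`.

    If two sub-directories hold files with the same name, the last one seen wins.
    """
    found = {}
    for root, _dirs, files in os.walk(top):
        for name in files:
            if stamp in name and name.endswith(SUFFIXES):
                key = name.replace(stamp, "{date}")
                found[key] = os.path.getsize(os.path.join(root, name))
    return found


def large_changes(top, new_stamp, old_stamp, threshold=THRESHOLD):
    """Return (key, difference) pairs whose absolute size change exceeds `threshold`."""
    new_sizes = collect_sizes(top, new_stamp)
    old_sizes = collect_sizes(top, old_stamp)
    changes = []
    for key in sorted(new_sizes):
        if key in old_sizes:
            diff = new_sizes[key] - old_sizes[key]
            if abs(diff) > threshold:
                changes.append((key, diff))
    return changes


if __name__ == "__main__":
    for top in ["/mnt/disk2/JP/TFP-1/Web", "/mnt/disk2/JP/SPT_1/Web"]:
        for key, diff in large_changes(top, date_stamp(0), date_stamp(-1)):
            print(top, key, diff)
```

Files that exist for only one of the two dates are skipped here; whether a missing backup should raise its own alert is a separate decision.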