注意:本脚本不计算软链接的真实文件大小
一、脚本内容 {#一、脚本内容}
#!/bin/sh
set -eu
tabs -8
检查输入参数,如果没有指定目录,则默认为根目录
=======================
dir=${1:-/}
使用find命令统计文件大小,排除指定目录
=====================
find "$dir" ( -path "/proc/" -o -path "/sys/" -o -path "/boot/" -o -path "/run/" -o -path "/dev/\*" ) -prune -o -type f -exec du -b -- {} + \| awk -vOFS='\\t' '
BEGIN {split("KB MB GB TB PB", u); u\[0\] = "B"}
{
++hist\[$1 ? length($1) - 1 : -1\]
total += $1
}
END {
max = -2
for (i in hist)
max = (i \> max ? i : max)
print "From", "To", "Count\n"
for (i = -1; i <= max; ++i)
{
if (i in hist)
{
if (i == -1)
print "0B", "0B", hist[i]
else
print 10 ** (i % 3) u[int(i / 3)],
10 ** ((i + 1) % 3) u[int((i + 1) / 3)],
hist[i]
}
}
base = 1024
unit = "B"
if (total >= base) {
total /= base
unit = "KB"
}
if (total >= base) {
total /=base
unit = "MB"
}
if (total >= base) {
total /=base
unit = "GB"
}
if (total >= base) {
total /=base
unit = "TB"
}
printf "\nTotal: %.1f %s in %d files\n", total, unit, NR
`}'`
二、使用方法 {#二、使用方法}
[root@dameng linuxscript]# ./file_size_distribution.sh /home/wwwroot/wordpress/
From To Count
0B 0B 27
1B 10B 8
10B 100B 296
100B 1KB 3309
1KB 10KB 6629
10KB 100KB 4444
100KB 1MB 1225
1MB 10MB 54
10MB 100MB 1
100MB 1GB 3
`Total: 1.8 GB in 15996 files
[root@dameng linuxscript]# find /home/wwwroot/wordpress/ -type f -size +10M
/home/wwwroot/wordpress/wp-content/plugins/yaya-comment-ip/ip-data/qqwry.dat
/home/wwwroot/wordpress/wp-content/backups-dup-lite/WordPress_Backup_202411281540_743e9f6d5a6adcd51101_20241128074043_archive.zip
/home/wwwroot/wordpress/wp-content/backups-dup-lite/WordPress_Backup_202411281540_2d35e6e0f53082cd2170_20241204031044_archive.zip
/home/wwwroot/wordpress/wp-content/backups-dup-lite/WordPress_First_Backup_849864da946d7fcc5140_20241125093438_archive.zip`
效果图如下
如图,脚本分析结果大于10M的文件共有1+3=4个,与find命令查找到的文件数一致。