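Shell programming: common environment checks before installation and deployment
Script tasks
Check that every host is reachable
Check that the installed operating system version matches the requirement
Check that the CPU meets the specification
Check that the memory meets the specification
Check that the data disks meet the specification
Check that the operating system partition is large enough
Check that the clocks of the cluster hosts agree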
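0. Preparing the configuration file and initializing script variables
Write config.ini to hold the host configuration. Only the [hosts] and [root_password] sections are shown here; the later checks also read the [os_version], [cpu_cores], [memory_kb], [data_disk_number], [root_partition_size] and [time_sync_diff] sections.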
[hosts]
10.0.1.10 node01
10.0.1.20 node02
10.0.1.30 node03
[root_password]
000000
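Write env-check.sh to read the IP addresses and the password
#!/bin/bash
#
# Abort early if the configuration file is missing
if [ ! -e ./config.ini ]; then
    echo "config.ini does not exist, please check"
    exit 1
fi

# Print the [hosts] section up to the next section header, then drop the
# headers, blank lines and comment lines and keep only the IP column
HOSTS_IP=$(sed -n '/\[hosts\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#' | awk '{print $1}')
# Same extraction for the root password
ROOT_PASS=$(sed -n '/\[root_password\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')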
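The same sed and grep pipeline is repeated for every setting, so it may help to see it in isolation. The sketch below wraps it in a helper and runs it against a throwaway file. The name read_section and the demo file are assumptions made for illustration; they are not part of the original script.

```shell
# read_section is a hypothetical helper, not part of the original script.
# It prints the non-empty, non-comment lines of one INI section.
# $1 = section name, $2 = path to the INI file
read_section() {
    sed -n "/^\[$1\]/,/^\[.*\]/p" "$2" | grep -v '^\[.*\]' | grep -v '^$' | grep -v '^#'
}

# Demo against a temporary file that mirrors the config.ini layout
demo_ini=$(mktemp)
printf '%s\n' '[hosts]' '10.0.1.10 node01' '# a commented-out host' \
    '10.0.1.20 node02' '' '[root_password]' '000000' > "$demo_ini"

# Host IPs are the first column of the [hosts] section
read_section hosts "$demo_ini" | awk '{print $1}'
# The last section runs to the end of the file
read_section root_password "$demo_ini"

rm -f "$demo_ini"
```

The sed range starts at the requested header and stops at the next line that looks like a header, which is why the header lines themselves have to be filtered out afterwards.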
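1. Check that every host is reachable
function check_host_online
{
    echo "++++++++++++++++ Checking host connectivity ++++++++++++++++"
    for host in $HOSTS_IP; do
        # -w 3 sets a 3-second deadline; ping's own output is discarded
        ping -w 3 "$host" &> /dev/null
        if [ $? -eq 0 ]; then
            echo "Host $host: connectivity check passed"
        else
            echo "Host $host: unreachable"
            ping_faild_hosts="$ping_faild_hosts $host"
        fi
    done
    if [[ "$ping_faild_hosts" == "" ]]; then
        echo "1. ping connectivity check: all hosts passed"
    else
        echo "1. ping connectivity check: some hosts failed:$ping_faild_hosts"
        exit 1
    fi
}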
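Loop over the host list: a for loop iterates over every host IP address held in the HOSTS_IP variable.
Run ping: ping is run against each host. -w 3 sets a 3-second deadline for the whole ping run, and &> /dev/null redirects both standard output and standard error to /dev/null, so ping's output is not shown.
Check ping's return value: $? holds the exit status of the previous command, here ping. A value of 0 (-eq 0) means ping succeeded, so the host passes the connectivity check.
Print the result: a reachable host is reported as passed; an unreachable host is reported as unreachable and its IP is appended to the ping_faild_hosts variable, which records the failed hosts.
Summarize: after the loop, an if statement tests whether ping_faild_hosts is empty. If it is, every host passed and an all-passed message is printed. Otherwise the script prints the contents of $ping_faild_hosts and leaves with exit 1; status code 1 signals that the run failed.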
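The loop-and-collect pattern described above can be tried without any network access by swapping the probe command. This is a minimal sketch; collect_failed, ping_probe and fake_probe are hypothetical names introduced here, and the -c 1 option is an addition that makes ping stop after the first reply instead of waiting for the deadline.

```shell
# collect_failed is a hypothetical helper: it runs a probe command against
# each host and prints the hosts for which the probe returned non-zero.
# $1 = probe command or function name, remaining arguments = hosts
collect_failed() {
    probe=$1
    shift
    failed=""
    for host in "$@"; do
        if ! "$probe" "$host" > /dev/null 2>&1; then
            failed="$failed $host"
        fi
    done
    # Strip the leading space before printing
    echo "${failed# }"
}

# Probe for real use: one echo request, 3-second deadline
ping_probe() {
    ping -c 1 -w 3 "$1"
}

# Stand-in probe for a dry run: only 10.0.1.20 "fails"
fake_probe() {
    [ "$1" != "10.0.1.20" ]
}

collect_failed fake_probe 10.0.1.10 10.0.1.20 10.0.1.30
```

In a real run the call would be collect_failed ping_probe $HOSTS_IP, and an empty result would mean that every host answered.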
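2. Check that the installed operating system version matches the requirement
First, write a function that checks whether the host accepts an SSH login
function verify_password
{
    if [ $# -lt 2 ]; then
        echo "Usage: verify_password IP root_password"
        exit 1
    fi
    # Run a harmless command over SSH; a non-zero status means the login failed
    sshpass -p "$2" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$1" "df -h" &> /dev/null
    if [ $? -ne 0 ]; then
        echo "SSH login to host $1 failed, check and retry"
        return 255
    else
        return 0
    fi
}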
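A password given on the command line with -p may be visible to other users in the process list while the command runs. sshpass can instead read it from the SSHPASS environment variable when called with -e. The sketch below shows that variant. The wrapper name remote_run and the SSHPASS_BIN override are assumptions; the override exists only so the wrapper can be dry-run with a stub instead of a real SSH server.

```shell
# remote_run is a hypothetical wrapper, not part of the original script.
# SSHPASS_BIN is also an assumption: it lets a stub stand in for sshpass.
# $1 = host, $2 = password, remaining arguments = remote command
remote_run() {
    rr_host=$1
    rr_password=$2
    shift 2
    # -e tells sshpass to take the password from the SSHPASS environment variable
    SSHPASS="$rr_password" "${SSHPASS_BIN:-sshpass}" -e \
        ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 "root@$rr_host" "$@"
}

# Dry run: the stub prints what sshpass would receive instead of connecting
fake_sshpass() {
    echo "SSHPASS set: ${SSHPASS:+yes}; args: $*"
}

SSHPASS_BIN=fake_sshpass
remote_run 10.0.1.10 000000 "df -h"
unset SSHPASS_BIN
```

With SSHPASS_BIN unset the wrapper calls the real sshpass, so verify_password could be reduced to a single remote_run call followed by a test of $?.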
Next, write the operating system version check. Each host runs:
grep $OS_VERSION /etc/redhat-release
OS_VERSION=$(sed -n '/\[os_version\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')

function check_os_version
{
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # grep exits non-zero when the release file lacks the expected version
            sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "grep $OS_VERSION /etc/redhat-release" &> /dev/null
            if [ $? -ne 0 ]; then
                echo "Host $host: OS version differs from the target, check failed"
                os_failed_hosts="$os_failed_hosts $host"
            else
                echo "Host $host: OS version matches the target, check passed"
            fi
        else
            # Login failed, so the host cannot be checked
            os_failed_hosts="$os_failed_hosts $host"
        fi
    done
    if [[ "$os_failed_hosts" == "" ]]; then
        echo "2. OS version check: all hosts passed"
    else
        echo "2. OS version check: some hosts failed: $os_failed_hosts"
        exit 1
    fi
}
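Loop over the host list: a for loop iterates over every host IP address held in the HOSTS_IP variable.
Verify the password: verify_password is called for each host with the host IP and the root password $ROOT_PASS, to confirm that the SSH login works.
Check the SSH login: if verify_password returns 0 (the login succeeded), the check continues; otherwise the host is added to os_failed_hosts and no version check is run on it.
Connect and check the OS version: sshpass supplies the root password to ssh, which runs grep on the host to test whether /etc/redhat-release contains the version string $OS_VERSION. The -o StrictHostKeyChecking=no option disables strict host key checking; automation scripts often use it to avoid the manual confirmation prompt when connecting.
Record the result: if the version check fails (grep returns a non-zero status), a failure message is printed for the host and its IP is appended to os_failed_hosts; if it passes, a success message is printed.
Summarize: after the loop, an if statement tests whether os_failed_hosts is empty. If it is, every host passed and an all-passed message is printed. Otherwise the script prints the contents of $os_failed_hosts and leaves with exit 1; status code 1 signals that the run failed.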
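One detail of the grep test is worth a closer look: by default grep treats the version as a regular expression, so the dot in a value such as 7.9 matches any character. The sketch below uses a fixed-string match instead. The helper name release_matches and the sample release strings are assumptions made for the demo.

```shell
# release_matches is a hypothetical helper: it succeeds when the release file
# contains the wanted version as a literal string.
# $1 = version string, $2 = path to the release file
release_matches() {
    # -F: fixed string instead of regex, -q: no output, --: end of options
    grep -qF -- "$1" "$2"
}

demo_release=$(mktemp)
echo "CentOS Linux release 7.9.2009 (Core)" > "$demo_release"

release_matches "7.9" "$demo_release" && echo "7.9 matches"
release_matches "7.8" "$demo_release" || echo "7.8 does not match"

rm -f "$demo_release"
```

On a remote host the same flags would go into the command string passed to ssh, with /etc/redhat-release as the file.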
3. Check that the CPU meets the specification
The number of logical cores is read on each host with:
cat /proc/cpuinfo |grep ^processor |sort |uniq |wc -l
CPU_CORES=$(sed -n '/\[cpu_cores\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
function check_cpu_cores
{
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # Count the distinct "processor" entries, one per logical core
            DST_CPU_CORES=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "cat /proc/cpuinfo | grep ^processor | sort | uniq | wc -l" 2> /dev/null)
            if [ "$DST_CPU_CORES" -lt "$CPU_CORES" ]; then
                echo "Host $host: logical CPU core count check failed"
                cpu_failed_hosts="$cpu_failed_hosts $host"
            else
                echo "Host $host: logical CPU core count check passed"
            fi
        else
            cpu_failed_hosts="$cpu_failed_hosts $host"
        fi
    done
    if [[ "$cpu_failed_hosts" == "" ]]; then
        echo "3. CPU check: all hosts passed"
    else
        echo "3. CPU check: some hosts failed: $cpu_failed_hosts"
        exit 1
    fi
}
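The counting and the comparison can be exercised locally against a file laid out like /proc/cpuinfo. This is a sketch under assumptions: count_processors and meets_minimum are hypothetical helpers, and the sample file is made up. The numeric guard covers the case where the remote command returns nothing, which would otherwise make the integer test print an error.

```shell
# count_processors is a hypothetical helper: it counts the "processor"
# entries in a file laid out like /proc/cpuinfo.
count_processors() {
    grep -c '^processor' "$1"
}

# meets_minimum succeeds when $1 (actual) is an integer >= $2 (required)
meets_minimum() {
    case $1 in
        ''|*[!0-9]*) return 2 ;;   # empty or non-numeric input
    esac
    [ "$1" -ge "$2" ]
}

demo_cpuinfo=$(mktemp)
printf 'processor\t: 0\nmodel name\t: demo\nprocessor\t: 1\nmodel name\t: demo\n' > "$demo_cpuinfo"

cores=$(count_processors "$demo_cpuinfo")
if meets_minimum "$cores" 2; then
    echo "found $cores logical cores, requirement of 2 met"
fi

rm -f "$demo_cpuinfo"
```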
4. Check that the memory meets the specification
The total memory in kB is read on each host with:
cat /proc/meminfo |grep MemTotal| awk '{print $2}'
TOTAL_MEMORY=$(sed -n '/\[memory_kb\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
function check_memory
{
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # MemTotal is reported in kB; \$2 is escaped so that the remote awk sees $2
            DST_TOTAL_MEMORY=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "cat /proc/meminfo | grep MemTotal | awk '{print \$2}'" 2> /dev/null)
            if [ "$DST_TOTAL_MEMORY" -lt "$TOTAL_MEMORY" ]; then
                echo "Host $host: memory size check failed"
                men_failed_hosts="$men_failed_hosts $host"
            else
                echo "Host $host: memory size check passed"
            fi
        else
            men_failed_hosts="$men_failed_hosts $host"
        fi
    done
    if [[ "$men_failed_hosts" == "" ]]; then
        echo "4. Memory check: all hosts passed"
    else
        echo "4. Memory check: some hosts failed: $men_failed_hosts"
    fi
}
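Because the threshold in [memory_kb] is compared with MemTotal directly, the units have to agree. The sketch below extracts MemTotal from a sample file and converts a size in GiB to kB. The helper names mem_total_kb and gib_to_kb and the sample values are assumptions made for illustration.

```shell
# mem_total_kb is a hypothetical helper: it prints the MemTotal value (in kB)
# from a file laid out like /proc/meminfo.
mem_total_kb() {
    awk '$1 == "MemTotal:" {print $2}' "$1"
}

# gib_to_kb converts a whole number of GiB into kB
gib_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

demo_meminfo=$(mktemp)
printf 'MemTotal:       16266112 kB\nMemFree:         8123456 kB\n' > "$demo_meminfo"

actual=$(mem_total_kb "$demo_meminfo")
required=$(gib_to_kb 8)
if [ "$actual" -ge "$required" ]; then
    echo "memory check passed: $actual kB >= $required kB"
fi

rm -f "$demo_meminfo"
```

MemTotal is the usable memory as seen by the kernel, which is somewhat less than the installed RAM. In the sample, a machine reporting 16266112 kB would fail a threshold of exactly 16 GiB (16777216 kB), so the configured value probably needs some margin.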
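5. Check that the data disks meet the specification
NUM_OF_DISK=$(sed -n '/\[data_disk_number\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
function check_disk_number
{
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # Device paths of all SCSI disks on the remote host (last lsscsi column)
            ALL_DISK_SYMBOLS=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "lsscsi | awk '\$2~/disk/{print \$NF}'" 2> /dev/null)
            # Mounted filesystems of the remote host, not of the local machine
            MOUNTED_FS=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "df -h" 2> /dev/null)
            # A disk with no mounted filesystem counts as a data disk
            DATA_DISK_NUMBER=0
            for d in $ALL_DISK_SYMBOLS; do
                echo "$MOUNTED_FS" | grep "$d" &> /dev/null
                if [ $? -ne 0 ]; then
                    DATA_DISK_NUMBER=$((DATA_DISK_NUMBER + 1))
                fi
            done
            if [ "$DATA_DISK_NUMBER" -ge "$NUM_OF_DISK" ]; then
                echo "Host $host: data disk count check passed"
            else
                echo "Host $host: data disk count check failed"
                disk_failed_hosts="$disk_failed_hosts $host"
            fi
        else
            disk_failed_hosts="$disk_failed_hosts $host"
        fi
    done
    if [[ "$disk_failed_hosts" == "" ]]; then
        echo "5. Data disk check: all hosts passed"
    else
        echo "5. Data disk check: some hosts failed: $disk_failed_hosts"
    fi
}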
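The idea behind this check is that a disk with no mounted filesystem is free to be used as a data disk. The counting step can be tried on its own with sample text. This is a sketch: count_data_disks is a hypothetical helper, and the device names and df lines are made up. Disks that are used only through LVM show up in df under /dev/mapper, so this simple prefix match would count them as free.

```shell
# count_data_disks is a hypothetical helper.
# $1 = whitespace-separated device paths of all disks (for example from lsscsi)
# $2 = text listing mounted filesystems (for example the output of df)
# It prints how many disks have no mounted filesystem on them.
count_data_disks() {
    count=0
    for disk in $1; do
        # /dev/sda is in use if it or one of its partitions starts a df line
        if ! printf '%s\n' "$2" | grep -q "^$disk"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

all_disks="/dev/sda /dev/sdb /dev/sdc"
mounted="/dev/sda1  1014M  150M  865M  15% /boot
/dev/sdb1   100G   10G   90G  10% /data"

# Only /dev/sdc has nothing mounted
count_data_disks "$all_disks" "$mounted"
```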
6. Check that the operating system partition is large enough
PARTITION_SIZE=$(sed -n '/\[root_partition_size\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
function check_root_partition_size
{
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # -BG reports every size in GiB, so the remaining digits are comparable;
            # with df -h a 1.5T partition would be reduced to 15
            ROOT_PARTITION_SIZE=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "df -P -BG | awk '\$6==\"/\"{print \$2}' | sed 's/[^0-9]//g'" 2> /dev/null)
            if [ "$ROOT_PARTITION_SIZE" -ge "$PARTITION_SIZE" ]; then
                echo "Host $host: partition size check passed"
            else
                echo "Host $host: partition size check failed"
                part_failed_hosts="$part_failed_hosts $host"
            fi
        else
            part_failed_hosts="$part_failed_hosts $host"
        fi
    done
    if [[ "$part_failed_hosts" == "" ]]; then
        echo "6. Partition size check: all hosts passed"
    else
        echo "6. Partition size check: some hosts failed: $part_failed_hosts"
    fi
}
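Human-readable df output (df -h) picks a unit suffix per filesystem, so stripping the non-digit characters is only safe when every value uses the same unit. One option, assuming GNU df on the target hosts, is to request a fixed unit with -BG. The sketch below parses sample text in that layout; root_size_gib is a hypothetical helper and the numbers are made up.

```shell
# root_size_gib is a hypothetical helper: it reads text in the layout of
# "df -P -BG" on standard input and prints the size of the filesystem
# mounted on / as a bare number of GiB.
root_size_gib() {
    awk '$6 == "/" {print $2}' | sed 's/[^0-9]//g'
}

# Sample text in the df -P -BG layout; the values are invented for the demo
sample_df='Filesystem     1073741824-blocks  Used Available Capacity Mounted on
/dev/mapper/centos-root         1536G   20G     1516G       2% /
/dev/sda1                          1G    1G        1G      15% /boot'

size=$(printf '%s\n' "$sample_df" | root_size_gib)
echo "root partition: ${size} GiB"
```

With df -h the same filesystem would be shown as 1.5T, and removing the non-digits would leave 15, which is why a fixed unit is easier to compare against the value in [root_partition_size].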
7. Check that the clocks of the cluster hosts agree
TIME_VALUE=$(sed -n '/\[time_sync_diff\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
function check_time_sync
{
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # Epoch seconds on both sides, so the subtraction yields seconds
            LOCAL_TIME=$(date +%s)
            DST_HOST_TIME=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" 'date +%s' 2> /dev/null)
            # sed drops the minus sign, leaving the absolute difference
            TIME_DIFF=$(expr "$LOCAL_TIME" - "$DST_HOST_TIME" | sed 's/[^0-9]//g')
            if [ "$TIME_DIFF" -lt "$TIME_VALUE" ]; then
                echo "Host $host: time sync check passed"
            else
                echo "Host $host: time sync check failed, clock offset is $TIME_DIFF seconds"
                time_failed_hosts="$time_failed_hosts $host"
            fi
        else
            time_failed_hosts="$time_failed_hosts $host"
        fi
    done
    if [[ "$time_failed_hosts" == "" ]]; then
        echo "7. Time check: all hosts passed"
    else
        echo "7. Time check: some hosts failed: $time_failed_hosts"
    fi
}
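A clock offset is only meaningful in seconds if both timestamps are in epoch seconds (date +%s). Timestamps formatted as YYYYmmddHHMMSS are not a linear scale, so their difference jumps at every minute and hour boundary. The sketch below contrasts the two; abs_diff is a hypothetical helper.

```shell
# abs_diff is a hypothetical helper: absolute difference of two integers,
# intended for timestamps in epoch seconds (date +%s).
abs_diff() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Two instants one second apart, expressed as epoch seconds
echo "epoch seconds:     $(abs_diff 1704110400 1704110399)"
# The same kind of gap written as 12:00:00 and 11:59:59 in YYYYmmddHHMMSS form
echo "formatted strings: $(abs_diff 20240101120000 20240101115959)"
```

The first line reports 1 and the second reports 4041, although both pairs are one second apart.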
The complete script
#!/bin/bash
#
# Abort early if the configuration file is missing
if [ ! -e ./config.ini ]; then
    echo "config.ini does not exist, please check"
    exit 1
fi

# Read each setting from its own section of config.ini
HOSTS_IP=$(sed -n '/\[hosts\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#' | awk '{print $1}')
ROOT_PASS=$(sed -n '/\[root_password\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
OS_VERSION=$(sed -n '/\[os_version\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
CPU_CORES=$(sed -n '/\[cpu_cores\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
TOTAL_MEMORY=$(sed -n '/\[memory_kb\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
NUM_OF_DISK=$(sed -n '/\[data_disk_number\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
PARTITION_SIZE=$(sed -n '/\[root_partition_size\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')
TIME_VALUE=$(sed -n '/\[time_sync_diff\]/,/\[.*\]/p' config.ini | grep -v "\[.*\]" | grep -v '^$' | grep -v '^#')

if [[ "$HOSTS_IP" == "" ]]; then
    echo "No host IP address is configured, please check config.ini"
    exit 1
else
    echo "the cluster includes the following hosts:"
    for host in $HOSTS_IP; do
        echo "$host"
    done
    read -p "Please confirm, input yes/no: " choice
    # Any answer other than yes ends the run
    if [[ "$choice" != "yes" && "$choice" != "YES" && "$choice" != "Y" && "$choice" != "y" ]]; then
        exit
    fi
fi

function format_print
{
    if [ $# -lt 1 ]; then
        echo "Usage: format_print 'args1 args2...'"
        exit 1
    fi
    # Print each argument on its own line
    for str in "$@"; do
        echo "$str"
    done
}

function verify_password
{
    if [ $# -lt 2 ]; then
        echo "Usage: verify_password IP root_password"
        exit 1
    fi
    # Run a harmless command over SSH; a non-zero status means the login failed
    sshpass -p "$2" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$1" "df -h" &> /dev/null
    if [ $? -ne 0 ]; then
        echo "SSH login to host $1 failed, check and retry"
        return 255
    else
        return 0
    fi
}

function check_host_online
{
    echo "++++++++++++++++ 1. Checking host connectivity ++++++++++++++++"
    for host in $HOSTS_IP; do
        ping -w 3 "$host" &> /dev/null
        if [ $? -eq 0 ]; then
            echo "Host $host: connectivity check passed"
        else
            echo "Host $host: unreachable"
            ping_faild_hosts="$ping_faild_hosts $host"
        fi
    done
    if [[ "$ping_faild_hosts" == "" ]]; then
        echo "1. ping connectivity check: all hosts passed"
    else
        echo "1. ping connectivity check: some hosts failed:"
        format_print $ping_faild_hosts
    fi
}

function check_os_version
{
    echo "++++++++++++++++ 2. Checking the operating system version ++++++++++++++++"
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "grep $OS_VERSION /etc/redhat-release" &> /dev/null
            if [ $? -ne 0 ]; then
                echo "Host $host: OS version differs from the target, check failed"
                os_failed_hosts="$os_failed_hosts $host"
            else
                echo "Host $host: OS version matches the target, check passed"
            fi
        else
            os_failed_hosts="$os_failed_hosts $host"
        fi
    done
    if [[ "$os_failed_hosts" == "" ]]; then
        echo "2. OS version check: all hosts passed"
    else
        echo "2. OS version check: some hosts failed:"
        format_print $os_failed_hosts
    fi
}

function check_cpu_cores
{
    echo "++++++++++++++++ 3. Checking the CPU specification ++++++++++++++++"
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # Count the distinct "processor" entries, one per logical core
            DST_CPU_CORES=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "cat /proc/cpuinfo | grep ^processor | sort | uniq | wc -l" 2> /dev/null)
            if [ "$DST_CPU_CORES" -lt "$CPU_CORES" ]; then
                echo "Host $host: logical CPU core count check failed"
                cpu_failed_hosts="$cpu_failed_hosts $host"
            else
                echo "Host $host: logical CPU core count check passed"
            fi
        else
            cpu_failed_hosts="$cpu_failed_hosts $host"
        fi
    done
    if [[ "$cpu_failed_hosts" == "" ]]; then
        echo "3. CPU check: all hosts passed"
    else
        echo "3. CPU check: some hosts failed:"
        format_print $cpu_failed_hosts
    fi
}

function check_memory
{
    echo "++++++++++++++++ 4. Checking the memory specification ++++++++++++++++"
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # MemTotal is reported in kB
            DST_TOTAL_MEMORY=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "cat /proc/meminfo | grep MemTotal | awk '{print \$2}'" 2> /dev/null)
            if [ "$DST_TOTAL_MEMORY" -lt "$TOTAL_MEMORY" ]; then
                echo "Host $host: memory size check failed"
                men_failed_hosts="$men_failed_hosts $host"
            else
                echo "Host $host: memory size check passed"
            fi
        else
            men_failed_hosts="$men_failed_hosts $host"
        fi
    done
    if [[ "$men_failed_hosts" == "" ]]; then
        echo "4. Memory check: all hosts passed"
    else
        echo "4. Memory check: some hosts failed:"
        format_print $men_failed_hosts
    fi
}
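
function check_disk_number
{
    echo "++++++++++++++++ 5. Checking the data disks ++++++++++++++++"
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # Device paths of all SCSI disks on the remote host (last lsscsi column)
            ALL_DISK_SYMBOLS=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "lsscsi | awk '\$2~/disk/{print \$NF}'" 2> /dev/null)
            # Mounted filesystems of the remote host, not of the local machine
            MOUNTED_FS=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "df -h" 2> /dev/null)
            # A disk with no mounted filesystem counts as a data disk
            DATA_DISK_NUMBER=0
            for d in $ALL_DISK_SYMBOLS; do
                echo "$MOUNTED_FS" | grep "$d" &> /dev/null
                if [ $? -ne 0 ]; then
                    DATA_DISK_NUMBER=$((DATA_DISK_NUMBER + 1))
                fi
            done
            if [ "$DATA_DISK_NUMBER" -ge "$NUM_OF_DISK" ]; then
                echo "Host $host: data disk count check passed"
            else
                echo "Host $host: data disk count check failed"
                disk_failed_hosts="$disk_failed_hosts $host"
            fi
        else
            disk_failed_hosts="$disk_failed_hosts $host"
        fi
    done
    if [[ "$disk_failed_hosts" == "" ]]; then
        echo "5. Data disk check: all hosts passed"
    else
        echo "5. Data disk check: some hosts failed:"
        format_print $disk_failed_hosts
    fi
}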

function check_root_partition_size
{
    echo "++++++++++++++++ 6. Checking the operating system partition size ++++++++++++++++"
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # -BG reports every size in GiB, so the remaining digits are comparable
            ROOT_PARTITION_SIZE=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" "df -P -BG | awk '\$6==\"/\"{print \$2}' | sed 's/[^0-9]//g'" 2> /dev/null)
            if [ "$ROOT_PARTITION_SIZE" -ge "$PARTITION_SIZE" ]; then
                echo "Host $host: partition size check passed"
            else
                echo "Host $host: partition size check failed"
                part_failed_hosts="$part_failed_hosts $host"
            fi
        else
            part_failed_hosts="$part_failed_hosts $host"
        fi
    done
    if [[ "$part_failed_hosts" == "" ]]; then
        echo "6. Partition size check: all hosts passed"
    else
        echo "6. Partition size check: some hosts failed:"
        format_print $part_failed_hosts
    fi
}

function check_time_sync
{
    echo "++++++++++++++++ 7. Checking cluster time consistency ++++++++++++++++"
    for host in $HOSTS_IP; do
        verify_password "$host" "$ROOT_PASS"
        if [ $? -eq 0 ]; then
            # Epoch seconds on both sides, so the subtraction yields seconds
            LOCAL_TIME=$(date +%s)
            DST_HOST_TIME=$(sshpass -p "$ROOT_PASS" ssh -o StrictHostKeyChecking=no -o ConnectTimeout=2 root@"$host" 'date +%s' 2> /dev/null)
            # sed drops the minus sign, leaving the absolute difference
            TIME_DIFF=$(expr "$LOCAL_TIME" - "$DST_HOST_TIME" | sed 's/[^0-9]//g')
            if [ "$TIME_DIFF" -lt "$TIME_VALUE" ]; then
                echo "Host $host: time sync check passed"
            else
                echo "Host $host: time sync check failed, clock offset is $TIME_DIFF seconds"
                time_failed_hosts="$time_failed_hosts $host"
            fi
        else
            time_failed_hosts="$time_failed_hosts $host"
        fi
    done
    if [[ "$time_failed_hosts" == "" ]]; then
        echo "7. Time check: all hosts passed"
    else
        echo "7. Time check: some hosts failed:"
        format_print $time_failed_hosts
    fi
}

# Run the checks in order
check_host_online
check_os_version
check_cpu_cores
check_memory
check_disk_number
check_root_partition_size
check_time_sync
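When the checks are called one after another like this, the exit status of the script does not say whether any of them failed. One possible extension, sketched below, is a small driver that counts failing checks. It assumes each check function returns a non-zero status when its failed-host list is non-empty, which the functions above do not do as written; run_checks, check_ok and check_bad are hypothetical names.

```shell
# run_checks is a hypothetical driver: it calls each named check function,
# counts the ones that return a non-zero status, and reports the total.
run_checks() {
    failures=0
    for check in "$@"; do
        if ! "$check"; then
            failures=$((failures + 1))
        fi
    done
    echo "$failures check(s) failed"
    # The driver itself succeeds only when nothing failed
    [ "$failures" -eq 0 ]
}

# Stand-ins for real checks, used to try the driver without any hosts
check_ok()  { return 0; }
check_bad() { return 1; }

run_checks check_ok check_bad check_ok || echo "environment not ready"
```

A deployment script could then call the driver with the seven check names and stop when it returns non-zero.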