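Getting to know Ceph
Ceph features
Getting straight to the point: Ceph currently provides three kinds of storage: object storage (RADOSGW), block storage (RBD), and the CephFS file system. Each is described below:
1. Object storage, i.e. key-value storage in the usual sense. Its interface consists of simple GET, PUT, and DEL operations plus other extensions. Typical examples are Swift, S3, and Gluster.
2. Block storage. This interface usually takes the form of a QEMU driver or a kernel module, and must implement either the Linux block device interface or the block driver interface provided by QEMU. Examples include Sheepdog, AWS EBS, QingCloud's cloud disks, Alibaba Cloud's Pangu system, and Ceph's RBD (RBD is Ceph's block storage interface). Among common storage types, DAS and SAN also provide block storage.
3. File storage, which generally means support for the POSIX interface. It belongs to the same category as traditional file systems such as Ext4; the difference is that distributed storage adds parallelism. Ceph's CephFS is an example (CephFS is Ceph's file storage interface). File-like storage interfaces that are not POSIX-compliant, such as GlusterFS and HDFS, are sometimes placed in this category as well. NFS and NAS also count as file storage.
Ceph components
A brief overview of Ceph's component architecture: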

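Monitor: monitors the health of the whole cluster. The information comes from the daemons that maintain cluster membership and covers the state of each node and the cluster configuration. The maps kept by the Ceph monitors include the monitor map, OSD map, PG map, MDS map, and CRUSH map, collectively called the cluster map. Ceph monitors do not store any user data. Each map is described below:
Monitor map: holds end-to-end information about the monitor nodes, including the Ceph cluster ID and each monitor's hostname, IP address, and port. It also stores the current epoch and the time of the last change. View it with "ceph mon dump".
OSD map: holds general information such as the cluster ID, the epoch at which the OSD map was created and last modified, and pool information (pool name, pool ID, type, replica count, and PGP count). It also records the number of OSDs and their state, weight, last clean interval, and host information. View it with "ceph osd dump".
PG map: holds the current PG version, timestamp, latest OSD map epoch, and the full and near-full ratios. It also records, for each PG, its ID, object count, state, the state of its OSDs, and deep-scrub details. View it with "ceph pg dump".
CRUSH map: holds the cluster's storage devices, the failure-domain hierarchy, and the rules that define failure domains when data is stored. View it with "ceph osd crush dump".
MDS map: holds the version of the current MDS map, when the map was created and last modified, the data and metadata pool IDs, the number of MDS daemons in the cluster, and their state. View it with "ceph fs dump" (older releases used "ceph mds dump").
OSD: a Ceph OSD consists of a physical disk drive, the Linux file system on top of it, and the Ceph OSD service. Ceph OSDs store data as objects on the physical disks of each node in the cluster; most of the actual work of storing data is done by the OSD daemon process. When building Ceph OSDs, SSDs and the XFS file system are recommended for formatting the partitions. Btrfs performs well but is not yet recommended for production; for now it is best to wait and see.
MDS: the Ceph metadata server. Ceph block devices (RBD) and the RADOS gateway do not need an MDS; the MDS serves only CephFS.
RADOS: Reliable Autonomic Distributed Object Store. RADOS is the foundation of the Ceph storage cluster. In Ceph all data is stored as objects, and whatever the data type, the RADOS object store is responsible for keeping those objects. The RADOS layer ensures that data always stays consistent.
librados: the library that gives applications access to RADOS. It also provides the native interface on which block storage, object storage, and the file system are built.
RBD: the RADOS block device. It is thin-provisioned and resizable, and spreads its data across multiple OSDs.
RADOSGW: the gateway interface that provides the object storage service. It uses librgw and librados to let applications connect to Ceph object storage, and offers S3- and Swift-compatible RESTful APIs.
CephFS: the Ceph file system, a POSIX-compatible file system that wraps the native interface on top of librados.
A brief word on CRUSH (Controlled Replication Under Scalable Hashing): it is the algorithm that decides how data is distributed across the cluster, and Ceph's high performance and high availability rest on it. Instead of looking up every client request in a metadata table, CRUSH computes where in the system data should be written or read. CRUSH is infrastructure-aware: it understands the relationships between the parts of the infrastructure. It also keeps multiple replicas of the data, so the data remains available even when several components in one failure domain fail. The CRUSH algorithm is what allows Ceph to manage and heal itself.
Compared with traditional distributed storage, RADOS has these advantages:
1. After a file is mapped to objects, the cluster map and CRUSH are used to compute, rather than look up in a table, where the file's data lives on the storage devices. This improves on the traditional file-to-block mapping and block map management.
2. RADOS takes full advantage of the intelligence of the OSDs and delegates part of the work to them, maximizing scalability.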
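Ceph data read/write flow
Pool: a storage pool, i.e. a partition of the cluster's storage. The size of a pool depends on the underlying storage space.
PG (placement group): a pool can contain multiple PGs. Both pools and PGs are abstract logical concepts; the number of PGs a pool should have can be calculated with a formula.
OSD (Object Storage Daemon, object storage device): each disk is one OSD, and a host provides one or more OSDs.
After the Ceph cluster is deployed, a pool must be created before data can be written to Ceph. Before a file is saved, a consistent hash is computed for it; the result determines the PG in which the file is stored. The file therefore always belongs to one PG of one pool, and is stored on OSDs by way of that PG. A data object is written to the primary OSD first and then synchronized to the replica OSDs for high availability.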
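The PG formula mentioned above is not spelled out in the text. A commonly cited rule of thumb is (number of OSDs x 100) / replica count, rounded up to a power of two, and that is what the sketch below assumes. The helper name pg_count is hypothetical, not a Ceph command, and the result is a target for all pools together rather than for each pool.

```shell
#!/bin/sh
# Rule-of-thumb PG count (an assumption, not taken from the article):
# (OSDs * 100) / replicas, rounded up to the next power of two.
# pg_count is a hypothetical helper name, not part of Ceph.
pg_count() {
    osds=$1
    replicas=$2
    target=$(( osds * 100 / replicas ))
    pg=1
    # Double until we reach or pass the target.
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

# The cluster built later in this article has 9 OSDs; assuming 3 replicas:
pg_count 9 3
```

For 9 OSDs and 3 replicas the target is 300, which rounds up to 512. Recent Ceph releases can also adjust PG counts automatically, so treat the number as a starting point.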

Note: the process of storing a file:
Step 1: compute the file-to-object mapping. Suppose file is the file the client wants to read or write. The object id is oid = ino + ono, where ino is the inode number (INO), the serial number of the file's metadata and the file's unique id, and ono is the object number (ONO), the index of one of the objects the file is split into. By default files are split into 4 MB objects.
Step 2: use a hash to find the PG in the pool that the object belongs to. A consistent hash maps Object -> PG: hash(oid) & mask -> pgid
Step 3: use CRUSH to map the PG to its OSDs. PG -> OSD: [CRUSH(pgid) -> (osd1, osd2, osd3)]
Step 4: the primary OSD of the PG writes the object to disk
Step 5: the primary OSD synchronizes the data to the replica OSDs and waits for their acknowledgements
Step 6: the primary OSD reports completion of the write to the client
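Steps 1 and 2 can be sketched in a few lines of shell. This is only an illustration under stated assumptions: cksum (a CRC) stands in for Ceph's real hash function, pg_num is assumed to be a power of two so that mask = pg_num - 1, and make_oid and oid_to_pg are hypothetical helper names. The PG numbers it prints will therefore not match what a real cluster reports.

```shell
#!/bin/sh
# Sketch of steps 1-2: file -> object id -> placement group index.
# Assumptions: cksum replaces Ceph's actual hash, and pg_num is a power of
# two so that mask = pg_num - 1. Both helpers are hypothetical names.

# Step 1: build an object id from the inode number and the object number.
make_oid() {
    printf '%s.%08x\n' "$1" "$2"
}

# Step 2: hash(oid) & mask -> index of the PG inside the pool.
oid_to_pg() {
    oid=$1
    pg_num=$2
    hash=$(printf '%s' "$oid" | cksum | cut -d ' ' -f 1)
    echo $(( hash & (pg_num - 1) ))
}

oid=$(make_oid 10000000001 0)
echo "object $oid -> pg $(oid_to_pg "$oid" 32)"
```

The point of the design is that any client can repeat this computation and reach the same PG without consulting a lookup table; step 3 (CRUSH) then turns the PG into a list of OSDs in the same table-free way.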
Ceph cluster installation
Host preparation
Hostname  Public network   Cluster network   Components         Disks
deploy    192.168.0.140    192.168.10.140    ceph-deploy,mon    40G
node1     192.168.0.141    192.168.10.141    mon,osd,mgr        40G,20G*3
node2     192.168.0.142    192.168.10.142    mon,osd,mgr,mds    40G,20G*3
node3     192.168.0.143    192.168.10.143    rgw,osd            40G,20G*3
Add network interfaces (all nodes)


Add disks (ceph-node1, ceph-node2, ceph-node3)
Add three 20G disks to each machine
System settings (all nodes)
sudo vim /etc/ssh/sshd_config
#PermitRootLogin prohibit-password # default: root password login is disabled
PermitRootLogin yes # change to allow root login
PasswordAuthentication yes # enable password authentication; this is actually the default
UseDNS no # speeds up SSH connections
sudo su - root
~# passwd # set the root password
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
systemctl restart sshd # restart the SSH service, then test remote SSH login as root
Network settings:
# 1. Configure the network interfaces
# Rename the interfaces
sudo vim /etc/default/grub
GRUB_CMDLINE_LINUX=""
# change to:
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"
# Regenerate the boot configuration
sudo grub-mkconfig -o /boot/grub/grub.cfg
# Disable IPv6
root@devops:~# cat > /etc/sysctl.conf <<EOF
# Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
EOF
root@devops:~# sysctl -p
# Configure both NICs on deploy
root@deploy:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.140/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.10.140/24]
# Configure both NICs on node1
root@node1:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.141/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.10.141/24]
# Configure both NICs on node2
root@node2:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.142/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.10.142/24]
# Configure both NICs on node3
root@node3:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.143/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.10.143/24]
# Disable the SSH host key confirmation prompt
sudo sed -i '/ask/{s/#//;s/ask/no/}' /etc/ssh/ssh_config
sudo netplan apply # restart the network service
reboot # reboot the machine
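The four netplan files above differ only in the last octet of the two addresses, so they can be produced from a single template. The sketch below assumes exactly that layout; render_netplan is a hypothetical helper name, and it only prints the YAML so it can be reviewed before being written to /etc/netplan/01-netcfg.yaml.

```shell
#!/bin/sh
# Print the netplan config for one host. render_netplan is a hypothetical
# helper; its only input is the last octet shared by the public and cluster
# addresses (140 to 143 in this article).
render_netplan() {
    octet=$1
    cat <<EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.0.${octet}/24]
      gateway4: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
    eth1:
      dhcp4: no
      dhcp6: no
      addresses: [192.168.10.${octet}/24]
EOF
}

# Preview the file for node1.
render_netplan 141
```

Keeping one template avoids copy-and-paste slips between hosts; the output can be piped through sudo tee into place once it looks right.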
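Configure package sources (all nodes)
sudo mv /etc/apt/{sources.list,sources.list.old}
# sudo does not apply to a shell redirection, so the file is written through tee
sudo tee /etc/apt/sources.list > /dev/null <<EOF
# Source mirrors are commented out by default to speed up apt update; uncomment them if needed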
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
EOF
# Add the Ceph repository
wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" | sudo tee -a /etc/apt/sources.list
sudo apt update # refresh the package index
sudo apt upgrade
Install packages (all nodes)
apt install iproute2 ntpdate tcpdump telnet traceroute nfs-kernel-server nfs-common lrzsz tree openssl libssl-dev libpcre3 libpcre3-dev zlib1g-dev gcc openssh-server iotop unzip zip -y
apt install net-tools vim wget git build-essential -y
Set the hostname (all nodes)
hostnamectl set-hostname "<hostname>"
Disable the firewall (all nodes)
ufw disable
Tune kernel parameters (all nodes)
# Append rather than overwrite, so the IPv6 settings written earlier are kept
cat >> /etc/sysctl.conf <<EOF
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
# Controls the default maximum size of a message queue
kernel.msgmnb = 65536
# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
# TCP kernel parameter
net.ipv4.tcp_mem = 786432 1048576 1572864
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
# socket buffer
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 20480
net.core.optmem_max = 81920
# TCP conn
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 15
# tcp conn reuse
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_tw_reuse = 0
# tcp_tw_recycle was removed in Linux 4.12, so it is left commented out on Ubuntu 18.04
#net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_max_tw_buckets = 20000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syncookies = 1
# keepalive conn
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.ip_local_port_range = 10001 65000
# swap
vm.overcommit_memory = 0
vm.swappiness = 10
#net.ipv4.conf.eth1.rp_filter = 0
#net.ipv4.conf.lo.arp_ignore = 1
#net.ipv4.conf.lo.arp_announce = 2
#net.ipv4.conf.all.arp_ignore = 1
#net.ipv4.conf.all.arp_announce = 2
EOF
Clock synchronization (all nodes)
dpkg-reconfigure tzdata # set the system time zone to Shanghai
sudo apt install chrony -y
sudo vim /etc/chrony/chrony.conf
Change the servers to the Alibaba Cloud NTP servers:
server ntp.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp1.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp2.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp3.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp4.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp5.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp6.aliyun.com minpoll 4 maxpoll 10 iburst
server ntp7.aliyun.com minpoll 4 maxpoll 10 iburst
Restart the chrony service:
sudo systemctl restart chrony
sudo systemctl status chrony
sudo systemctl enable chrony
Check whether the time sources are active:
sudo chronyc activity
# Check the clock synchronization status
sudo timedatectl status
Write the time to the hardware clock:
sudo hwclock -w
apt install ntpdate
Synchronize the time every 5 minutes
echo "*/5 * * * * /usr/sbin/ntpdate time1.aliyun.com > /dev/null 2>&1 && hwclock -w" >> /var/spool/cron/crontabs/root
Edit the hosts file (required on all nodes):
cat >> /etc/hosts<< EOF
# ceph host
192.168.0.140   ceph-deploy
192.168.0.141   ceph-node1
192.168.0.142   ceph-node2
192.168.0.143   ceph-node3
# ceph host
EOF
Install Python 2 (all nodes)
Initializing Ceph requires Python 2.7
sudo apt install python2.7 -y
sudo ln -sv /usr/bin/python2.7 /usr/bin/python2
Create the ceph user (all nodes)
groupadd -r -g 2022 ceph && useradd -r -m -s /bin/bash -u 2022 -g 2022 ceph && echo ceph:123456 | chpasswd
echo "ceph ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
Set up SSH key trust between the nodes
su - ceph
ssh-keygen -t rsa
ssh-copy-id ceph@192.168.0.140
ssh-copy-id ceph@192.168.0.141
ssh-copy-id ceph@192.168.0.142
ssh-copy-id ceph@192.168.0.143
Install the ceph-deploy management tool (ceph-deploy node)
Deploy as the ceph user
sudo apt-cache madison ceph-deploy # list the available ceph-deploy versions

sudo apt-get install ceph-deploy


Initialize the mon node:
Initialize the mon node from the admin node
At first only ceph-mon1 (ceph-node1) is initialized; ceph-mon2 (deploy) and ceph-mon3 (ceph-node2) are added manually once the cluster is deployed
ceph@deploy:~$ mkdir ceph-cluster
ceph@deploy:~$ cd ceph-cluster/
ceph@deploy:~/ceph-cluster$ ceph-deploy new --cluster-network 192.168.10.0/24 --public-network 192.168.0.0/24 ceph-node1


ceph@deploy:~/ceph-cluster$ cat ceph.conf

Install the mon package
sudo apt install ceph-mon -y # on deploy, ceph-node1, and ceph-node2
Initialize the storage nodes
ceph@ceph-deploy:~/ceph-cluster$ sudo ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node1 ceph-node2 ceph-node3 # you will be asked for each node's password along the way


Configure the mon node, then generate and distribute the keys
Install the mon component
root@ceph-deploy:~# apt install ceph-mon

root@ceph-node1:~# apt install ceph-mon

root@ceph-node2:~# apt install ceph-mon

Initialize the mon
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy mon create-initial


Verify the mon node

Distribute the admin keyring to the nodes
root@ceph-deploy:~# apt install ceph-common
root@ceph-node1:~# apt install ceph-common
root@ceph-node2:~# apt install ceph-common
root@ceph-node3:~# apt install ceph-common
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy admin ceph-deploy ceph-node1 ceph-node2 ceph-node3

Verify the keyring




Grant the ceph user access to the keyring on each node
root@ceph-node1:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
root@ceph-node2:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
root@ceph-node3:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
root@ceph-deploy:~# setfacl -m u:ceph:rw /etc/ceph/ceph.client.admin.keyring
Deploy the ceph-mgr nodes
Install the ceph-mgr package on ceph-node1 and ceph-node2
root@ceph-node1:~# apt install -y ceph-mgr

root@ceph-node2:~# apt install -y ceph-mgr

Initialize the ceph-mgr node from ceph-deploy
Initialize only ceph-node1
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy mgr create ceph-node1

Verify the ceph-mgr node
On ceph-node1

Manage Ceph cluster information from ceph-deploy

Disable insecure-mode communication
ceph@ceph-deploy:~$ sudo ceph config set mon auth_allow_insecure_global_id_reclaim false

Versions of the Ceph cluster components

OSD setup
Prepare the OSD nodes
Before wiping the disks, install the basic Ceph runtime environment on the node machines from the deploy node.
Run on ceph-deploy
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node1

ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node2

ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy install --release pacific ceph-node3

List the disks on the Ceph nodes
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy disk list ceph-node1

ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy disk list ceph-node2

ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy disk list ceph-node3

Use ceph-deploy disk zap to wipe the Ceph data disks on each node:
# Run on ceph-deploy
sudo ceph-deploy disk zap ceph-node1 /dev/sdb
sudo ceph-deploy disk zap ceph-node1 /dev/sdc
sudo ceph-deploy disk zap ceph-node1 /dev/sdd
sudo ceph-deploy disk zap ceph-node2 /dev/sdb
sudo ceph-deploy disk zap ceph-node2 /dev/sdc
sudo ceph-deploy disk zap ceph-node2 /dev/sdd
sudo ceph-deploy disk zap ceph-node3 /dev/sdb
sudo ceph-deploy disk zap ceph-node3 /dev/sdc
sudo ceph-deploy disk zap ceph-node3 /dev/sdd

Add the OSDs:
# Run on ceph-deploy
sudo ceph-deploy osd create ceph-node1 --data /dev/sdb
sudo ceph-deploy osd create ceph-node1 --data /dev/sdc
sudo ceph-deploy osd create ceph-node1 --data /dev/sdd
sudo ceph-deploy osd create ceph-node2 --data /dev/sdb
sudo ceph-deploy osd create ceph-node2 --data /dev/sdc
sudo ceph-deploy osd create ceph-node2 --data /dev/sdd
sudo ceph-deploy osd create ceph-node3 --data /dev/sdb
sudo ceph-deploy osd create ceph-node3 --data /dev/sdc
sudo ceph-deploy osd create ceph-node3 --data /dev/sdd
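The eighteen zap and create commands above follow one pattern, so they can also be generated with a loop. The sketch below is one way to do that, not the article's own method: DRY_RUN, run, and prepare_osds are hypothetical names, and DRY_RUN defaults to 1 so the commands are only printed until they have been reviewed. Unlike the listing above, it zaps and creates each disk in turn rather than zapping all disks first.

```shell
#!/bin/sh
# Print (or run) the zap + create commands for every node/disk pair.
# DRY_RUN is a hypothetical switch for this sketch: 1 (the default) only
# prints the commands, anything else executes them.
DRY_RUN=${DRY_RUN:-1}
NODES="ceph-node1 ceph-node2 ceph-node3"
DISKS="/dev/sdb /dev/sdc /dev/sdd"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_osds() {
    for node in $NODES; do
        for disk in $DISKS; do
            run sudo ceph-deploy disk zap "$node" "$disk"
            run sudo ceph-deploy osd create "$node" --data "$disk"
        done
    done
}

prepare_osds
```

To execute for real, run it from the ceph-cluster directory on the deploy node with DRY_RUN=0. Zapping destroys the data on the disk, which is why printing is the default here.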
Verify the OSDs

Enable the OSD services at boot
They are enabled at boot by default. Once the nodes have been added, you can reboot a node server to test whether its OSDs start automatically.
Test on ceph-node1; the steps are the same on the other nodes
ps -ef|grep osd

root@ceph-node1:~# systemctl enable ceph-osd@0 ceph-osd@1 ceph-osd@2

root@ceph-node2:~# systemctl enable ceph-osd@3 ceph-osd@4 ceph-osd@5
root@ceph-node3:~# systemctl enable ceph-osd@6 ceph-osd@7 ceph-osd@8
Extend the Ceph cluster for high availability
Add ceph-mon nodes
ceph-mon natively supports self-election to achieve high availability; the number of mon nodes is usually odd.
# Add the mon nodes from the ceph-deploy node
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy mon add ceph-deploy
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy mon add ceph-node2
Verify the ceph-mon result

ceph-mon status
ceph@ceph-deploy:~$ ceph quorum_status --format json-pretty

Add mgr nodes
Install ceph-mgr on ceph-node2
root@ceph-node2:~# apt install -y ceph-mgr

Add ceph-node2 from ceph-deploy
ceph@ceph-deploy:/home/ceph/ceph-cluster$ sudo ceph-deploy mgr create ceph-node2
ceph@ceph-deploy:~/ceph-cluster$ ceph -s

Test uploading and downloading data:
Create a pool:
ceph osd pool create mypool 32 32 # 32 PGs and 32 PGPs
ceph pg ls-by-pool mypool | awk '{print $1,$2,$15}' # check the PG and PGP combinations
ceph osd pool ls # or: rados lspools; lists all pools
Upload a file:
rados put msg1 /home/ceph/ceph-cluster/cephtestfile --pool=mypool
# uploads the file to mypool with the object id msg1
List the objects:
rados ls --pool=mypool
Object information:
ceph osd map mypool msg1 # ceph osd map shows exactly where a data object is placed in the pool
Download the file:
rados get msg1 --pool=mypool /tmp/my.txt
Delete the file:
rados rm msg1 --pool=mypool
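The upload, download, and delete commands above can be combined into a single round-trip check that also compares the downloaded copy with the original. This is a sketch under assumptions: roundtrip is a hypothetical helper name, and RADOS is a hypothetical override variable that defaults to the real rados CLI and exists only so the function can be tried without a cluster.

```shell
#!/bin/sh
# Round-trip check: upload a file, download it again, compare, clean up.
# RADOS is a hypothetical override for this sketch; by default the real
# rados command is used with the same subcommands as above.
RADOS=${RADOS:-rados}

roundtrip() {
    pool=$1
    obj=$2
    src=$3
    dst=$(mktemp) || return 1
    # Upload; give up early if the pool is not writable.
    $RADOS put "$obj" "$src" --pool="$pool" || { rm -f "$dst"; return 1; }
    status=1
    # Download into a temp file and compare byte for byte.
    if $RADOS get "$obj" "$dst" --pool="$pool" && cmp -s "$src" "$dst"; then
        echo "OK: $obj matches $src"
        status=0
    else
        echo "FAILED: $obj could not be read back intact"
    fi
    # Remove the test object and the temp file in either case.
    $RADOS rm "$obj" --pool="$pool"
    rm -f "$dst"
    return "$status"
}
```

With the pool from this section, a call would look like: roundtrip mypool msg1 /home/ceph/ceph-cluster/cephtestfile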