GlusterFS

1. Definitions

  • Brick: the basic unit of storage, represented by an export directory on a server in the trusted storage pool
  • Volume: a logical collection of bricks
  • Cluster: a group of linked computers
  • Distributed File System: a file system that allows multiple clients to access data concurrently over a network
  • GFID: every file or directory in GlusterFS has a 128-bit numeric identifier called the GFID
  • glusterd: the management daemon that must run on every server in the trusted storage pool
  • Namespace: an abstract container or environment created to hold a collection of unique identifiers
  • Quorum: sets the maximum number of host nodes that may fail in the trusted storage pool
  • Quota: allows disk-space usage limits to be set per directory or per volume
  • POSIX: the Portable Operating System Interface, a family of related API standards defined by the IEEE
  • Vol File: the configuration file used by glusterfs processes
  • Distributed: distributed volume
  • Replicated: replicated volume
  • Distributed Replicated: distributed replicated volume
  • Geo-Replication: provides a continuous, asynchronous, incremental replication service between sites over LAN, WAN, or the Internet
  • Metadata: data that describes other data; GlusterFS has no dedicated metadata store
  • Extended Attributes: a file system feature for attaching metadata to files and directories
  • FUSE: Filesystem in Userspace, a loadable kernel module for Unix-like operating systems that lets non-privileged users create their own file systems without editing kernel code; the file system code runs in user space

2. Preparation

Machines: 172.30.120.54 172.30.121.54 172.30.122.54

https://docs.gluster.org/en/latest/Install-Guide/Overview/

Squid proxy setup, so that each server can access the Internet.

Install NTP and synchronize the clocks:

ansible temp -m shell -a "ntpdate 172.30.120.4"
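The one-shot ntpdate above assumes the package is already installed. A minimal install sketch, assuming the temp host group used throughout this document and the stock CentOS ntp packages:

ansible temp -m yum -a "name=ntpdate state=present"
# or run ntpd for continuous synchronization instead of one-shot ntpdate:
ansible temp -m yum -a "name=ntp state=present"
ansible temp -m service -a "name=ntpd state=started enabled=yes"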

1. The VM needs two disks: one for the base OS and one that we will use as a Gluster “brick”.

2. Two NICs using the VirtIO driver.

3. CentOS server installation

1. Yum repos

yum install centos-release-gluster

Run on the Ansible control host (the .4 machine):

ansible web -m yum -a "name=centos-release-gluster state=present"

2. Set the hostnames

ansible web -m hostname -a "name={{host_name}}"

Note: the inventory hosts file must define the hostname variable:

[web]
## alpha.example.org
## beta.example.org
172.30.120.54 hostname=glusterNode01
172.30.121.54 hostname=glusterNode02
172.30.122.54 hostname=glusterNode03

vi /etc/hosts

172.30.120.24 glusterNode24
172.30.120.25 glusterNode25
172.30.120.26 glusterNode26

cat addHosts.sh

#!/bin/bash
## remove any old 172.30.120.* entries, then append the gluster node entries
sed -i '/^172\.30\.120/d' /etc/hosts
sed -i '$a 172.30.120.24 glusterNode24' /etc/hosts
sed -i '$a 172.30.120.25 glusterNode25' /etc/hosts
sed -i '$a 172.30.120.26 glusterNode26' /etc/hosts

ansible temp -m script -a "addHosts.sh"

3. Format and mount sdb

ansible temp -m shell -a "mkfs.xfs -i size=512 /dev/sdb"
If it complains about an existing GPT label, force it with:
ansible temp -m shell -a "mkfs.xfs -i size=512 /dev/sdb -f"
ansible temp -m shell -a "mkdir -p /bricks/brick1"
ansible-console temp
# in the console session, append the fstab entry on every host:
echo "/dev/sdb /bricks/brick1 xfs defaults 1 2" >>/etc/fstab
ansible web -m shell -a "mount -a && mount"
ansible temp -m shell -a "df -lh"
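Alternatively, the fstab entry can be added without an interactive console session; a sketch using Ansible's stock lineinfile module:

ansible temp -m lineinfile -a "path=/etc/fstab line='/dev/sdb /bricks/brick1 xfs defaults 1 2'"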

4. Install glusterfs-server

ansible temp -m yum -a "name=glusterfs-server state=present"

5. Firewall settings

glusterd listens on 24007, but each added brick opens another port; check with:

gluster volume status
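A minimal firewalld sketch for the base glusterd ports (24007-24008); the per-brick ports are opened later, when the failed mount exposes them:

ansible web -m shell -a "firewall-cmd --add-port=24007-24008/tcp --permanent"
ansible web -m shell -a "firewall-cmd --reload"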

6.Configure the trusted pool

ansible web -m shell -a "gluster peer probe glusterNode21"
ansible web -m shell -a "gluster peer probe glusterNode22"
ansible web -m shell -a "gluster peer probe glusterNode23"
ansible temp -m shell -a "gluster peer probe glusterNode24"
ansible temp -m shell -a "gluster peer probe glusterNode25"
ansible temp -m shell -a "gluster peer probe glusterNode26"
If the nodes sit on different subnets, static routes may be needed between them, for example:

172.16.16.0/24 via 172.20.35.254
172.16.32.0/24 dev ens224

Check the status:

gluster peer status

7.Set up a GlusterFS volume

ansible temp -m shell -a "mkdir -p /bricks/brick1/gv1 /bricks/brick2/gv1"   # directories must match the brick paths in the create command below

Run on glusterNode01:

 gluster volume create gv1 replica 3 arbiter 1 glusterNode24:/bricks/brick1/gv1 glusterNode25:/bricks/brick1/gv1 glusterNode26:/bricks/brick1/gv1 glusterNode24:/bricks/brick2/gv1 glusterNode25:/bricks/brick2/gv1 glusterNode26:/bricks/brick2/gv1
gluster volume start gv1
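The create command above builds two replica subvolumes; within each set of three bricks, the third acts as the arbiter. A quick verification sketch using the volume just created:

gluster volume info gv1
gluster volume status gv1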

gluster volume create myvol1 replica 2 server{1..4}:/data/glusterfs/myvol1/brick1/brick
This creates the volume myvol1 from the given brick directories on server1 through server4.

replica 2 prompts a warning:

gluster volume create gv1 replica 2 glusterNode24:/bricks/brick1/gv1 glusterNode25:/bricks/brick1/gv1 glusterNode26:/bricks/brick1/gv1 glusterNode24:/bricks/brick2/gv1 glusterNode25:/bricks/brick2/gv1 glusterNode26:/bricks/brick2/gv1
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue?

If the volume fails to start, check the log files under /var/log/glusterfs.

Solution: use an arbiter volume.

Mounting

mkdir /glusterdir
mount -t glusterfs glusterNode25:/gv1 /glusterdir

The mount failed with:

E [fuse-bridge.c:5211:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)

Cause: a new brick port (49152) was opened and must be added to the firewall:

ansible web -m shell -a "firewall-cmd --add-port=49152/tcp --permanent"
ansible web -m shell -a "firewall-cmd --reload"
ansible web -m shell -a "firewall-cmd --list-all"

After that, remounting succeeds:

mount -t glusterfs glusterNode25:/gv1 /glusterdir

Verify:

cd /glusterdir
df  -h ./

Related commands:

# gluster volume create <NEW-VOLNAME>[stripe <COUNT> | replica <COUNT>] [transport [tcp | rdma | tcp,rdma]] <NEW-BRICK1> <NEW-BRICK2> <NEW-BRICK3> <NEW-BRICK4>...
# gluster volume start <VOLNAME>
# gluster volume stop <VOLNAME>
# gluster volume delete <VOLNAME>  ## note: a volume must be stopped before it can be deleted
gluster volume list
gluster volume status [all]   /* show the status of volumes in the cluster */

4. CentOS client installation

1. Add the fuse module to the Linux kernel

# modprobe fuse

2. Verify the fuse module is loaded

# dmesg | grep -i fuse

fuse init (API version 7.13)

3. Install the required packages

$ sudo yum -y install openssh-server wget fuse fuse-libs libibverbs

Note: the openib package could not be found.

4. Server-side firewall settings: open 24007 and 24008, plus the brick ports starting at 49152.

With 5 bricks, ports 49152 through 49156 need to be opened.
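firewall-cmd accepts port ranges, so the five brick ports can be opened in one call; a sketch mirroring the ansible usage elsewhere in this document:

ansible web -m shell -a "firewall-cmd --add-port=49152-49156/tcp --permanent"
ansible web -m shell -a "firewall-cmd --reload"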

5. Download and install the client

==Note: the download page no longer exists as of 2019-10-15 (it only offers CentOS 8 packages). On CentOS 7, install via yum instead.==

The yum repo must be installed first, otherwise the packages cannot be found:

yum install centos-release-gluster

Then install:

yum install -y glusterfs glusterfs-fuse glusterfs-rdma

6.Mounting Volumes

1. Manual mount:

mount -t glusterfs HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR
e.g.: mount -t glusterfs glusterNode24:/gv1 /glusterdir

Supported mount options:

  • backupvolfile-server=server-name
  • volfile-max-fetch-attempts=number-of-attempts
  • log-level=loglevel
  • log-file=logfile
  • transport=transport-type
  • direct-io-mode=[enable|disable]
  • use-readdirp=[yes|no]

# mount -t glusterfs -o backupvolfile-server=volfile_server2,use-readdirp=no,volfile-max-fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log server1:/test-volume /mnt/glusterfs
mount -t glusterfs -o log-level=WARNING,log-file=/var/log/gluster.log glusterNode24:/gv1 /glusterdir

If the backupvolfile-server option is supplied when mounting the FUSE client and the first volfile server fails, the server given in backupvolfile-server is used as the volfile server to mount the client.

2. Automatic mount

The client can be configured to mount the volume automatically at system startup.

To mount a volume, edit the /etc/fstab file and add the following line:

HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR glusterfs defaults,_netdev 0 0

For example:

glusterNode24:/gv1 /glusterdir glusterfs defaults,_netdev 0 0

mount -a
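The fstab entry can also carry the mount options from the manual-mount section; a sketch combining it with backupvolfile-server (glusterNode25 as the backup server is an assumption for illustration):

glusterNode24:/gv1 /glusterdir glusterfs defaults,_netdev,backupvolfile-server=glusterNode25 0 0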

5. Monitoring

gluster volume top <VOLNAME> readdir [brick <BRICK>] [list-cnt <COUNT>]
gluster volume top <VOLNAME> opendir [brick <BRICK>] [list-cnt <COUNT>]
gluster volume top <VOLNAME> write [brick <BRICK>] [list-cnt <COUNT>]
gluster volume top <VOLNAME> open [brick <BRICK>] [list-cnt <COUNT>]
gluster volume top <VOLNAME> read-perf [bs 256 count 1] [brick <BRICK>] [list-cnt <COUNT>]
gluster volume info all

The statedump files are created on the brick servers in the /tmp directory, or in the directory set via the server.statedump-path volume option. The naming convention of the dump file is <brick-path>.<brick-pid>.dump.
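A statedump is triggered from the CLI; a sketch using the volume name from earlier sections:

gluster volume statedump gv1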

 gluster volume status gv0
 gluster volume status all
 gluster volume status gv0 detail
 gluster volume status gv0 mem
 gluster volume status gv0 inode

6. Drawbacks

Listing a directory requires querying every node and aggregating the directory entries and their attributes. The hash algorithm is of no help here, so query performance is much worse than with a centralized metadata service.

Once the cluster grows and file counts reach the millions, the two typical metadata operations, listing (ls) and deleting (rm) directories, become very slow; creating or deleting one million empty files can take 15 minutes. How can this be mitigated?

Recommendations:

Organize the directory tree sensibly: avoid deep hierarchies and keep the file count in any single directory moderate.

Increase server memory and raise the GlusterFS directory cache parameters (see the sketch after this list).

For the network, use 10GbE or InfiniBand.
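A sketch of the cache tuning referred to above, using stock GlusterFS volume options; the values are illustrative and should be sized to the workload and available memory:

gluster volume set gv1 performance.cache-size 1GB        # io-cache read cache size
gluster volume set gv1 performance.md-cache-timeout 600  # metadata cache lifetime in seconds (600 is the maximum)
gluster volume set gv1 network.inode-lru-limit 200000    # server-side inode cache limit per brick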

Both in theory and in practice, GlusterFS is currently best suited to large-file workloads; for small files, and especially massive numbers of small files, storage efficiency and access performance are poor. The LOSF (lots of small files) problem is a recognized hard problem in both industry and academia, and GlusterFS, as a general-purpose distributed file system, applies no special small-file optimizations, so the weak performance is understandable.

7. Administration

7.1 Expanding

Brick count to add: a multiple of the replica count (replica × n).

Steps:

1. Probe the new server

gluster peer probe serverName

2. Add the brick(s)

# gluster volume add-brick <VOLNAME> <NEW-BRICK>

For example:

# gluster volume add-brick gv1 glusterNode21:/gv1
Add Brick successful
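For the replica 3 volume gv1 built earlier, bricks must be added three at a time; a sketch where /bricks/brick3 is a hypothetical new brick directory on each node:

gluster volume add-brick gv1 glusterNode24:/bricks/brick3/gv1 glusterNode25:/bricks/brick3/gv1 glusterNode26:/bricks/brick3/gv1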

3. Check the volume status

# gluster volume info <VOLNAME>

4. Rebalance the volume (see 7.2)

7.2 Rebalancing

1. Fix the layout

# gluster volume rebalance <VOLNAME> fix-layout start

For example:

# gluster volume rebalance gv1 fix-layout start

2. Fix the layout and migrate the data

Start the rebalance operation on any one of the servers using the following command:

# gluster volume rebalance gv1 start

For example:

# gluster volume rebalance test-volume start

3. Check the rebalance status

gluster volume rebalance <VOLNAME> status

7.3 Shrinking

When shrinking distributed replicated and distributed dispersed volumes, you need to remove a number of bricks that is a multiple of the replica or stripe count.

1.Remove the brick using the following command:

# gluster volume remove-brick <VOLNAME> <BRICKNAME> start

For example, to remove server2:/exp2:

# gluster volume remove-brick test-volume server2:/exp2 start
gluster volume remove-brick gv1 glusterNode24:/bricks/brick2/gv1 glusterNode25:/bricks/brick2/gv1 glusterNode26:/bricks/brick2/gv1 start

2.View the status of the remove brick operation using the following command:

# gluster volume remove-brick <VOLNAME> <BRICKNAME> status

For example:

# gluster volume remove-brick gv1 glusterNode24:/bricks/brick2/gv1 status

3.Once the status displays "completed", commit the remove-brick operation:

# gluster volume remove-brick <VOLNAME> <BRICKNAME> commit
gluster volume remove-brick gv1 glusterNode24:/bricks/brick2/gv1 commit

4.Check the volume information using the following command:

# gluster volume info

7.4 Healing

gluster volume heal <VOLNAME>
gluster volume heal <VOLNAME> full
gluster volume heal <VOLNAME> info
gluster volume heal <VOLNAME> info healed
gluster volume heal <VOLNAME> info failed

gluster volume heal <VOLNAME> info split-brain

7.5 Non Uniform File Allocation

gluster volume set <VOLNAME> cluster.nufa on

Keeps new files on the local brick where possible.

7.6 BitRot Detection

Detects corrupted files, including damage introduced by writing to the brick directories directly rather than through FUSE or NFS mounts.

gluster volume bitrot <VOLNAME> enable
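Once enabled, the scrubber can be tuned and inspected; a sketch using the volume name from earlier sections:

gluster volume bitrot gv1 scrub-frequency daily   # hourly|daily|weekly|biweekly|monthly
gluster volume bitrot gv1 scrub status            # also accepts: pause | resume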
