For a recent project I implemented Prometheus service discovery for Swarm, to make collecting metrics easier. The code is on GitHub, under retrieval/discovery/swarm. It is based on Prometheus 0.17.0 and Swarm v1.1.0.
System requirements
Swarm itself does not expose a metrics endpoint for Prometheus, so a cAdvisor instance must run on every Swarm master and node. The plugin assumes that cAdvisor serves its /metrics endpoint on port 8070 by default (configurable).
Configuration
- job_name: service-swarm
  swarm_sd_configs:
  - masters:
    - 'http://swarm.example.com:8080'
    refresh_interval: 1s
    metrics_port: '8060'
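- refresh_interval specifies how often the plugin refreshes its targets, that is, how often it polls the Swarm master for node information;
- metrics_port specifies the port of cAdvisor's /metrics endpoint;
- the label-related configuration works the same way as for Kubernetes.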
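As a rough illustration of how these options might be held and defaulted in Go, here is a minimal sketch. The struct name, the normalize method and the 30s fallback interval are assumptions made for this example, not the plugin's actual code; only the 8070 default port comes from the text above.

```go
package main

import (
	"errors"
	"net/url"
	"time"
)

// defaultRefreshInterval is an assumed fallback, not taken from the plugin.
const defaultRefreshInterval = 30 * time.Second

// SwarmSDConfig is a hypothetical in-memory form of swarm_sd_configs.
type SwarmSDConfig struct {
	Masters         []string      // base URLs of the Swarm masters
	RefreshInterval time.Duration // how often to poll a master
	MetricsPort     string        // port of cAdvisor's /metrics endpoint
}

// normalize validates the master URLs and fills in defaults.
func (c *SwarmSDConfig) normalize() error {
	if len(c.Masters) == 0 {
		return errors.New("swarm SD config must list at least one master")
	}
	for _, m := range c.Masters {
		u, err := url.Parse(m)
		if err != nil {
			return err
		}
		if u.Scheme == "" || u.Host == "" {
			return errors.New("invalid master URL: " + m)
		}
	}
	if c.RefreshInterval <= 0 {
		c.RefreshInterval = defaultRefreshInterval
	}
	if c.MetricsPort == "" {
		c.MetricsPort = "8070" // default stated in the system requirements
	}
	return nil
}
```

Validating the URLs up front means a typo in the master list is reported when the configuration is loaded rather than on the first poll.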
How it works
The Prometheus service discovery plugin interface
// prometheus/retrieval/targetmanager.go
// A TargetProvider provides information about target groups. It maintains a set
// of sources from which TargetGroups can originate. Whenever a target provider
// detects a potential change, it sends the TargetGroup through its provided channel.
//
// The TargetProvider does not have to guarantee that an actual change happened.
// It does guarantee that it sends the new TargetGroup whenever a change happens.
//
// Sources() is guaranteed to be called exactly once before each call to Run().
// On a call to Run() implementing types must send a valid target group for each of
// the sources they declared in the last call to Sources().
type TargetProvider interface {
	// Sources returns the source identifiers the provider is currently aware of.
	Sources() []string
	// Run hands a channel to the target provider through which it can send
	// updated target groups. The channel must be closed by the target provider
	// if no more updates will be sent.
	// On receiving from done Run must return.
	Run(up chan<- config.TargetGroup, done <-chan struct{})
}
- Sources() []string returns the identifiers the provider currently knows about. The target manager uses each returned string as the ID of a target group.
- Run(up chan<- config.TargetGroup, done <-chan struct{}) starts the target provider. The provider sends the latest target group information into the up channel to notify the target manager.
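To make the contract concrete, the following is a minimal sketch of a provider that announces one fixed target group and then waits for shutdown. TargetGroup here is a simplified local stand-in for config.TargetGroup (plain maps instead of model.LabelSet), and staticProvider and newStaticProvider are hypothetical names used only for illustration.

```go
package main

// TargetGroup is a simplified stand-in for config.TargetGroup.
type TargetGroup struct {
	Targets []map[string]string
	Labels  map[string]string
	Source  string
}

// TargetProvider mirrors the quoted interface, using the local TargetGroup.
type TargetProvider interface {
	Sources() []string
	Run(up chan<- TargetGroup, done <-chan struct{})
}

// staticProvider (hypothetical) announces one fixed group and then idles.
type staticProvider struct {
	group TargetGroup
}

// Sources reports the single source this provider knows about.
func (p *staticProvider) Sources() []string {
	return []string{p.group.Source}
}

// Run sends the group once, then blocks until done is closed.
func (p *staticProvider) Run(up chan<- TargetGroup, done <-chan struct{}) {
	defer close(up) // the provider must close the channel when it stops
	select {
	case up <- p.group:
	case <-done:
		return
	}
	<-done // nothing further to report; wait for shutdown
}

// newStaticProvider builds a provider with one target per address.
func newStaticProvider(source string, addrs ...string) *staticProvider {
	tg := TargetGroup{Source: source, Labels: map[string]string{"job": "demo"}}
	for _, a := range addrs {
		// __address__ is the label Prometheus reads the scrape address from.
		tg.Targets = append(tg.Targets, map[string]string{"__address__": a})
	}
	return &staticProvider{group: tg}
}
```

A caller would start Run in a goroutine, read groups from up, and close done to stop the provider; up is closed once Run returns.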
// prometheus/config/config.go
// TargetGroup is a set of targets with a common label set.
type TargetGroup struct {
	// Targets is a list of targets identified by a label set. Each target is
	// uniquely identifiable in the group by its address label.
	Targets []model.LabelSet
	// Labels is a set of labels that is common across all targets in the group.
	Labels model.LabelSet
	// Source is an identifier that describes a group of targets.
	Source string
}
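- Targets is parsed by Prometheus to find out how to reach each target's metrics; its labels are also attached to the metric records;
- Labels holds ordinary labels, which are attached to the metric records;
- Source is equivalent to an ID.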
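The three fields can be seen together in a small sketch that turns a list of Swarm nodes into one target group whose addresses point at cAdvisor. The Node type, the label names node_name and role, and the source string swarm-nodes are assumptions for this example; TargetGroup is again a simplified stand-in for config.TargetGroup.

```go
package main

import "net"

// TargetGroup is a simplified stand-in for config.TargetGroup.
type TargetGroup struct {
	Targets []map[string]string
	Labels  map[string]string
	Source  string
}

// Node is a hypothetical view of a Swarm node: its name and Docker address.
type Node struct {
	Name string
	Addr string // for example "10.199.128.49:2375"
}

// nodeTargetGroup keeps each node's host but swaps the Docker port for the
// cAdvisor metrics port, since that is where /metrics is served.
func nodeTargetGroup(nodes []Node, metricsPort string) (TargetGroup, error) {
	tg := TargetGroup{
		Source: "swarm-nodes",                     // assumed group ID
		Labels: map[string]string{"role": "node"}, // shared by all targets
	}
	for _, n := range nodes {
		host, _, err := net.SplitHostPort(n.Addr)
		if err != nil {
			return TargetGroup{}, err
		}
		tg.Targets = append(tg.Targets, map[string]string{
			"__address__": net.JoinHostPort(host, metricsPort),
			"node_name":   n.Name,
		})
	}
	return tg, nil
}
```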
How Swarm service discovery works
Nodes
1. Periodically call the Swarm master's REST API /info to fetch the latest cluster information.
// Response body returned by the Swarm master's /info API
{
"ID": "",
"Containers": 16,
"ContainersRunning": 10,
"ContainersPaused": 0,
"ContainersStopped": 6,
"Images": 30,
"Driver": "",
"DriverStatus": null,
"SystemStatus": [
[
"Role",
"primary"
],
[
"Strategy",
"spread"
],
[
"Filters",
"health, port, dependency, affinity, constraint"
],
[
"Nodes",
"2"
],
[
" hh-yun-k8s-128049.vclound.com",
"10.199.128.49:2375"
],
[
" └ Status",
"Healthy"
],
[
" └ Containers",
"8"
],
[
" └ Reserved CPUs",
"12 / 25"
],
[
" └ Reserved Memory",
"3.75 GiB / 132 GiB"
],
[
" └ Labels",
"executiondriver=native-0.2, kernelversion=3.10.0-229.4.2.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper"
],
[
" └ Error",
"(none)"
],
[
" └ UpdatedAt",
"2016-04-05T09:22:57Z"
],
[
" hh-yun-k8s-128050.vclound.com",
"10.199.128.50:2375"
],
[
" └ Status",
"Healthy"
],
[
" └ Containers",
"8"
],
[
" └ Reserved CPUs",
"12 / 25"
],
[
" └ Reserved Memory",
"3 GiB / 132 GiB"
],
[
" └ Labels",
"executiondriver=native-0.2, kernelversion=3.10.0-229.4.2.el7.x86_64, operatingsystem=CentOS Linux 7 (Core), storagedriver=devicemapper"
],
[
" └ Error",
"(none)"
],
[
" └ UpdatedAt",
"2016-04-05T09:23:29Z"
]
],
"Plugins": {
"Volume": null,
"Network": null,
"Authorization": null
},
"MemoryLimit": true,
"SwapLimit": true,
"CpuCfsPeriod": true,
"CpuCfsQuota": true,
"CPUShares": true,
"CPUSet": true,
"IPv4Forwarding": true,
"BridgeNfIptables": true,
"BridgeNfIp6tables": true,
"Debug": false,
"NFd": 0,
"OomKillDisable": true,
"NGoroutines": 0,
"SystemTime": "2016-04-05T17:23:50.830465718+08:00",
"ExecutionDriver": "",
"LoggingDriver": "",
"NEventsListener": 0,
"KernelVersion": "3.10.0-229.4.2.el7.x86_64",
"OperatingSystem": "linux",
"OSType": "",
"Architecture": "amd64",
"IndexServerAddress": "",
"RegistryConfig": null,
"NCPU": 50,
"MemTotal": 283536760012,
"DockerRootDir": "",
"HttpProxy": "",
"HttpsProxy": "",
"NoProxy": "",
"Name": "hh-yun-k8s-128050.vclound.com",
"Labels": null,
"ExperimentalBuild": false,
"ServerVersion": "",
"ClusterStore": "",
"ClusterAdvertise": ""
}
2. Parse the text and extract the node information. Wrap that information in a target group and notify the target manager through the channel.
func (d *Discovery) Run(up chan<- config.TargetGroup, done <-chan struct{}) {
	defer close(up)
	if tg := d.masterTargetGroup(); tg != nil {
		select {
		case <-done:
			return
		case up <- *tg: // announce the Swarm masters to the target manager
		}
	}
	retryInterval := time.Duration(d.Conf.RefreshInterval)
	update := make(chan []*Node, 10)
	go d.watchNodes(update, done, retryInterval) // keep fetching node info from the master and push it into the update channel
	for {
		select {
		case <-done:
			return
		case nodes := <-update: // handle each update as it arrives
			d.updateNodes(nodes)      // refresh the cached node info
			tg := d.nodeTargetGroup() // build a target group from the latest node info
			up <- *tg                 // notify the target manager
		}
	}
}
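Masters
Swarm masters are static and are supplied through the Prometheus configuration file. When the provider starts, it sends the master information straight to the target manager (see the Run method above).
When querying a master for node information, however, a rotation mechanism is used to find a master that is currently working.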
func (c *swarmClient) getNodeInfo() (*Info, error) {
	c.masterMu.Lock()
	defer c.masterMu.Unlock()
	for _, master := range c.masters {
		urlStr := fmt.Sprintf("%s/info", master.String())
		req, err := http.NewRequest("GET", urlStr, nil)
		if err != nil {
			return nil, err
		}
		var resp *http.Response
		if c.do != nil {
			// code for testing
			resp, err = c.do(req)
		} else {
			resp, err = c.client.Do(req)
		}
		if err == nil {
			return c.processNodeInfo(resp)
		}
		c.rotateMaster() // rotate master
	}
	return nil, errors.New("No available master.")
}
func (c *swarmClient) rotateMaster() {
	if len(c.masters) > 1 {
		c.masters = append(c.masters[1:], c.masters[0]) // move on to the next master
	}
}
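Some thoughts
- The JSON returned by the Swarm master's /info is quite crude. It simply takes the output of the docker -H X.X.X.X:2375 info command and turns it into key-value pairs, which is why characters such as └ show up. This makes the text awkward to parse;
- Besides the REST API, one could also consider etcd's watch mechanism, or Swarm's own discovery mechanism, docker/docker/pkg/discovery;
- Getting node information by parsing text is a very primitive approach, brute-force and unreliable. I chose it mainly to avoid pulling in extra third-party dependencies, which should make it easier to upgrade Prometheus later (it is, after all, still at version 0.X.0).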