runsisi's

technical notes

rbd map failed to add secret

2019-01-17 runsisi #debug #rbd

Symptom

rbd

Analysis

The Ceph deployment on site runs the Jewel release. The error is raised at the following location:

src/krbd.cc#L158

static int build_map_buf(CephContext *cct, const char *pool, const char *image,
                         const char *snap, const char *options, string *pbuf)
{
  ...

  if (keyring.get_secret(cct->_conf->name, secret)) {
    string secret_str;
    secret.encode_base64(secret_str);

    r = set_kernel_secret(secret_str.c_str(), key_name.c_str());
    if (r >= 0) {
      if (r == 0)
        cerr << "rbd: warning: secret has length 0" << std::endl;
      oss << ",key=" << key_name;
    } else if (r == -ENODEV || r == -ENOSYS) {
      // running against older kernel; fall back to secret= in options
      oss << ",secret=" << secret_str;
    } else {
      cerr << "rbd: failed to add secret '" << key_name << "' to kernel"
           << std::endl;
      return r;
    }
  } else if (is_kernel_secret(key_name.c_str())) {
    oss << ",key=" << key_name;
  }

  ...
}
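To make the three branches above explicit, here is a minimal sketch that classifies the return value of set_kernel_secret the same way build_map_buf does. The enum and the function name classify_secret_result are hypothetical names introduced for illustration; they are not part of Ceph.

```c
#include <errno.h>

/* Hypothetical names for illustration only; not part of Ceph. */
enum secret_action {
  USE_KEY_OPTION,     /* r >= 0: append ",key=<name>" to the map options */
  USE_SECRET_OPTION,  /* -ENODEV / -ENOSYS: fall back to ",secret=<base64>" */
  FAIL_MAP            /* any other error: report "failed to add secret" */
};

/* Mirrors the if / else-if / else chain in build_map_buf. */
enum secret_action classify_secret_result(int r)
{
  if (r >= 0)
    return USE_KEY_OPTION;
  if (r == -ENODEV || r == -ENOSYS)
    return USE_SECRET_OPTION;
  return FAIL_MAP;
}
```

Under this reading, classify_secret_result(-EDQUOT) yields FAIL_MAP: a quota error is not treated as "older kernel", so rbd map aborts instead of falling back to the secret= option.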

src/common/secret.c#L74

int set_kernel_secret(const char *secret, const char *key_name)
{
  ...

  serial = add_key("ceph", key_name, payload, sizeof(payload), KEY_SPEC_PROCESS_KEYRING);
  if (serial == -1) {
    ret = -errno;
  }

  return ret;
}

Clearly, set_kernel_secret returned the error EDQUOT because the add_key call it makes failed. add_key is a function declared in keyutils.h:

~]$ rpm -qf /usr/include/keyutils.h
keyutils-libs-devel-1.5.8-3.el7.x86_64

Its implementation is very simple; it invokes the system call directly:

key_serial_t __weak add_key(const char *type,
			    const char *description,
			    const void *payload,
			    size_t plen,
			    key_serial_t ringid)
{
	return syscall(__NR_add_key,
		       type, description, payload, plen, ringid);
}

The call chain of the add_key system call is roughly as follows:

security/keys/keyctl.c

SYSCALL_DEFINE5(add_key, const char __user *, _type,

security/keys/key.c

key_create_or_update

key_alloc
    no_quota:
	spin_unlock(&user->lock);
	key_user_put(user);
	key = ERR_PTR(-EDQUOT);
	goto error;

Of these, only key_alloc returns EDQUOT. Let's look at why key_alloc jumps to the no_quota label:

unsigned int key_quota_root_maxkeys = 1000000;	/* root's key count quota */
unsigned int key_quota_root_maxbytes = 25000000; /* root's key space quota */

/* check that the user's quota permits allocation of another key and
 * its description */
if (!(flags & KEY_ALLOC_NOT_IN_QUOTA)) {
	unsigned maxkeys = uid_eq(uid, GLOBAL_ROOT_UID) ?
		key_quota_root_maxkeys : key_quota_maxkeys;
	unsigned maxbytes = uid_eq(uid, GLOBAL_ROOT_UID) ?
		key_quota_root_maxbytes : key_quota_maxbytes;

	spin_lock(&user->lock);
	if (!(flags & KEY_ALLOC_QUOTA_OVERRUN)) {
		if (user->qnkeys + 1 >= maxkeys ||
		    user->qnbytes + quotalen >= maxbytes ||
		    user->qnbytes + quotalen < user->qnbytes)
			goto no_quota;
	}

	user->qnkeys++;
	user->qnbytes += quotalen;
	spin_unlock(&user->lock);
}
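The condition is easier to reason about when restated as a standalone predicate. The following is a minimal userspace sketch of the same test, not kernel code; the function name key_quota_exceeded is hypothetical, and all counters are taken as unsigned int for simplicity, which is an assumption rather than the kernel's exact types.

```c
#include <stdbool.h>

/*
 * Userspace restatement of the quota test in key_alloc() shown above.
 *   qnkeys, qnbytes   : keys and bytes already charged to the user
 *   quotalen          : bytes the new key would be charged
 *   maxkeys, maxbytes : the applicable limits
 * Returns true when key_alloc() would jump to no_quota.
 */
bool key_quota_exceeded(unsigned int qnkeys, unsigned int qnbytes,
                        unsigned int quotalen,
                        unsigned int maxkeys, unsigned int maxbytes)
{
  return qnkeys + 1 >= maxkeys ||
         qnbytes + quotalen >= maxbytes ||
         qnbytes + quotalen < qnbytes;  /* unsigned wrap-around */
}
```

Because the comparison is >=, a user with a limit of 200 keys is refused as soon as 199 keys are already charged: key_quota_exceeded(199, 1000, 50, 200, 20000) is true, while the same call with limits of 1000000 keys and 25000000 bytes is false.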

Clearly, hitting the quota limit produces the error described above. We can check the quotas configured on the system:

~]# sysctl -a | grep keys
kernel.keys.gc_delay = 300
kernel.keys.maxbytes = 20000
kernel.keys.maxkeys = 200
kernel.keys.persistent_keyring_expiry = 259200
kernel.keys.root_maxbytes = 20000
kernel.keys.root_maxkeys = 200
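The sysctl output shows only the limits. How much of the quota each user has consumed can be read from /proc/key-users, which the original investigation did not use. The sketch below parses one line of that file, assuming the layout described in the kernel's keys documentation: `<UID>: <usage> <nkeys>/<nikeys> <qnkeys>/<maxkeys> <qnbytes>/<maxbytes>`. The struct and function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical container for one line of /proc/key-users. */
struct key_user_usage {
  unsigned int uid;
  unsigned int usage;     /* kernel structure refcount */
  unsigned int nkeys;     /* total number of keys */
  unsigned int nikeys;    /* number of instantiated keys */
  unsigned int qnkeys;    /* keys charged against the quota */
  unsigned int maxkeys;   /* key count limit */
  unsigned int qnbytes;   /* bytes charged against the quota */
  unsigned int maxbytes;  /* byte limit */
};

/* Returns 0 on success, -1 if the line does not match the assumed layout. */
int parse_key_users_line(const char *line, struct key_user_usage *out)
{
  int n = sscanf(line, "%u: %u %u/%u %u/%u %u/%u",
                 &out->uid, &out->usage, &out->nkeys, &out->nikeys,
                 &out->qnkeys, &out->maxkeys, &out->qnbytes, &out->maxbytes);
  return n == 8 ? 0 : -1;
}
```

Feed it each line read with fgets from /proc/key-users. Under the check in key_alloc shown earlier, a line for uid 0 whose qnkeys is within one of maxkeys means the next add_key issued by root will fail with EDQUOT.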

Check the kernel version of the affected system:

~]# uname -r
3.10.0-229.el7.x86_64

Consider the kernel patch and commit linked below:

[PATCH 1/5] KEYS: Increase root_maxkeys and root_maxbytes sizes http://lkml.iu.edu/hypermail/linux/kernel/1409.0/00904.html

keys: make the keyring quotas controllable through /proc/sys https://github.com/torvalds/linux/commit/0b77f5bfb45c13e1e5142374f9d6ca75292252a4

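From these we can tell that the CentOS 7.1 kernel (3.10.0-229) does not include the patch linked above, and its default quota values are too small, which caused the problem.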

Solution

Temporarily raise the kernel.keys.root_maxkeys and kernel.keys.root_maxbytes quotas with the sysctl command:

~]# sysctl kernel.keys.root_maxbytes=25000000
kernel.keys.root_maxbytes = 25000000
~]# sysctl kernel.keys.root_maxkeys=1000000
kernel.keys.root_maxkeys = 1000000

Also edit the system configuration file /etc/sysctl.conf and add the following two lines so that the quotas remain in effect after a reboot:

~]# vi /etc/sysctl.conf
...
kernel.keys.root_maxbytes = 25000000
kernel.keys.root_maxkeys = 1000000