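Cluster Node Management

The cluster system currently supports the following node operations:

  • info: display the status of cluster nodes

  • del: delete a compute node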

[root@sonmi ~]# sonmictl node -h
Manage compute nodes

Usage:
  sonmictl node [command]

Available Commands:
  del         Delete a compute node
  info        Display the status of cluster nodes

Flags:
  -h, --help   help for node

Use "sonmictl node [command] --help" for more information about a command.

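Viewing cluster node status

Run sonmictl node info to view the status of all nodes in the cluster, as shown below:

[Figure node-1: output of sonmictl node info]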

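The columns have the following meanings:

  • NODENAME: node name
  • CPU ALLOC/TOTAL: CPU cores allocated to users / total CPU cores on the node
  • MEMORY FREE/TOTAL: free memory / total memory on the node
  • CPU TEMPERATURE: CPU temperature of the node
  • CPU LOAD: CPU load of the node
  • STATE: state of the node, corresponding to the node state in Slurm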
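
The column layout above can be consumed from a script. The following is a minimal sketch, assuming the table format shown in this document's sample output; node_states is a hypothetical helper name, not part of sonmictl. Because the CPU TEMPERATURE cell itself contains a "|" character, the sketch reads the state from the last column rather than from a fixed field index.

```shell
#!/bin/sh
# Sketch: print "<node> <state>" for each row of a `sonmictl node info`
# table read from standard input. node_states is a hypothetical helper.
node_states() {
  awk -F'|' '
    /^\|/ {                       # header and data rows start with "|"
      name = $2                   # first column: NODENAME
      state = $(NF - 1)           # last column: STATE
      gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", state)
      if (name != "NODENAME") print name, state   # skip the header row
    }'
}

# Usage on the management node (not executed here):
#   sonmictl node info | node_states
```

Border lines beginning with "+" are ignored, so only one line per node is printed.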

Deleting a node

Run sonmictl node del <COMPUTE-NODE> to remove the node COMPUTE-NODE from the cluster. For example, the following command removes the node compute-0-0 from the cluster:

sonmictl node del compute-0-0

[Figure: node-2]
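
After a deletion, it can be useful to confirm that the node no longer appears in the status table. The sketch below assumes the table format printed by sonmictl node info in this document; node_listed is a hypothetical helper name, and it matches the NODENAME column exactly rather than by substring.

```shell
#!/bin/sh
# Sketch: node_listed NAME succeeds (exit 0) if NAME appears in the
# NODENAME column of a `sonmictl node info` table read from stdin.
# node_listed is a hypothetical helper, not a sonmictl subcommand.
node_listed() {
  awk -F'|' -v want="$1" '
    /^\|/ {
      name = $2
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim the padded cell
      if (name == want) found = 1
    }
    END { exit !found }'
}

# Usage on the management node (not executed here):
#   sonmictl node del compute-0-0
#   if sonmictl node info | node_listed compute-0-0; then
#     echo "compute-0-0 is still listed" >&2
#   fi
```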

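Re-registering a node

To add a removed node back to the cluster, log in to that node and run the following command as root:

sonmid-register

In the example below, the node compute-0-0 is removed from the cluster; after logging in to compute-0-0 over SSH, sonmid-register is run to register it back into the cluster.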

[root@sonmi ~]# sonmictl node info 
+-------------+-----------------+-------------------+-----------------+----------+-------+
| NODENAME    | CPU ALLOC/TOTAL | MEMORY FREE/TOTAL | CPU TEMPERATURE | CPU LOAD | STATE |
+-------------+-----------------+-------------------+-----------------+----------+-------+
|    sonmi    |      0 / 8      |   2.2GB / 3.6GB   |  34.0°C|35.0°C  |   0.00   |  IDLE |
| compute-0-0 |      0 / 8      |   571MB / 1.7GB   |  34.0°C|35.0°C  |   2.00   |  IDLE |
| compute-0-1 |      0 / 8      |   700MB / 1.7GB   |  34.0°C|35.0°C  |   2.00   |  IDLE |
+-------------+-----------------+-------------------+-----------------+----------+-------+
[root@sonmi ~]# sonmictl node del compute-0-0
Delete compute node: compute-0-0
[root@sonmi ~]# sonmictl node info 
+-------------+-----------------+-------------------+-----------------+----------+-------+
| NODENAME    | CPU ALLOC/TOTAL | MEMORY FREE/TOTAL | CPU TEMPERATURE | CPU LOAD | STATE |
+-------------+-----------------+-------------------+-----------------+----------+-------+
|    sonmi    |      0 / 8      |   2.2GB / 3.6GB   |  33.0°C|35.0°C  |   0.00   |  IDLE |
| compute-0-1 |      0 / 8      |   700MB / 1.7GB   |  33.0°C|35.0°C  |   2.00   |  IDLE |
+-------------+-----------------+-------------------+-----------------+----------+-------+
[root@sonmi ~]# ssh 10.1.1.2
╔╗╔╗╔╗──╔╗──────────────╔╗────╔═══╗─────────╔╗─╔╦═══╦═══╗
║║║║║║──║║─────────────╔╝╚╗───║╔═╗║─────────║║─║║╔═╗║╔═╗║
║║║║║╠══╣║╔══╦══╦╗╔╦══╗╚╗╔╬══╗║╚══╦══╦═╗╔╗╔╦╣╚═╝║╚═╝║║─╚╝
║╚╝╚╝║║═╣║║╔═╣╔╗║╚╝║║═╣─║║║╔╗║╚══╗║╔╗║╔╗╣╚╝╠╣╔═╗║╔══╣║─╔╗
╚╗╔╗╔╣║═╣╚╣╚═╣╚╝║║║║║═╣─║╚╣╚╝║║╚═╝║╚╝║║║║║║║║║─║║║──║╚═╝║
─╚╝╚╝╚══╩═╩══╩══╩╩╩╩══╝─╚═╩══╝╚═══╩══╩╝╚╩╩╩╩╩╝─╚╩╝──╚═══╝
Activate the web console with: systemctl enable --now cockpit.socket

Last login: Tue Sep 26 14:37:09 2023 from 10.1.1.1
[root@compute-0-0 ~]# sonmid-register 
[root@compute-0-0 ~]# exit
logout
Connection to 10.1.1.2 closed.
[root@sonmi ~]# sonmictl node info 
+-------------+-----------------+-------------------+-----------------+----------+-------+
| NODENAME    | CPU ALLOC/TOTAL | MEMORY FREE/TOTAL | CPU TEMPERATURE | CPU LOAD | STATE |
+-------------+-----------------+-------------------+-----------------+----------+-------+
|    sonmi    |      0 / 8      |   2.2GB / 3.6GB   |  34.0°C|36.0°C  |   0.00   |  IDLE |
| compute-0-0 |      0 / 8      |   573MB / 1.7GB   |  34.0°C|36.0°C  |   2.00   |  IDLE |
| compute-0-1 |      0 / 8      |   700MB / 1.7GB   |  34.0°C|36.0°C  |   2.00   |  IDLE |
+-------------+-----------------+-------------------+-----------------+----------+-------+
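
The session above can be condensed into a small script. This is only a sketch under stated assumptions: that root can log in to the compute node over SSH without a password prompt, and that sonmid-register behaves the same when invoked as a remote SSH command as it does in an interactive login (the source only shows the interactive form). The function names run and reregister_node and the DRY_RUN variable are hypothetical.

```shell
#!/bin/sh
# Sketch: remove a compute node and register it again, mirroring the
# interactive session above. Set DRY_RUN=1 to print the commands
# instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# reregister_node NODE ADDRESS
#   NODE    name as shown in the NODENAME column, e.g. compute-0-0
#   ADDRESS address used to reach the node over SSH, e.g. 10.1.1.2
reregister_node() {
  node=$1
  addr=$2
  run sonmictl node del "$node" || return 1
  # Assumption: sonmid-register works as a non-interactive remote command.
  run ssh "root@$addr" sonmid-register || return 1
  run sonmictl node info
}

# Usage (dry run first, then for real):
#   DRY_RUN=1; reregister_node compute-0-0 10.1.1.2
```

Running with DRY_RUN=1 first makes it possible to review the exact commands before they touch the cluster.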
