
易陆发现论坛


OpenStack GPU configuration

Posted on 2022-6-10 22:02:28

OpenStack configuration
1. Configure nova-scheduler (controller node). Edit /etc/nova/nova.conf:
[DEFAULT]
scheduler_default_filters = RetryFilter, AvailabilityZoneFilter, RamFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
scheduler_available_filters = nova.scheduler.filters.all_filters
The filter that matters for GPU passthrough is PciPassthroughFilter. On releases from Ocata onward, these two options are named enabled_filters and available_filters and belong in the [filter_scheduler] section.
Restart the nova-scheduler service:
[root@controller ~]# systemctl restart openstack-nova-scheduler.service
[root@controller ~]# systemctl status openstack-nova-scheduler.service
● openstack-nova-scheduler.service - OpenStack Nova Scheduler Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-scheduler.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2022-06-10 21:50:56 CST; 22s ago
 Main PID: 105509 (nova-scheduler)
    Tasks: 9 (limit: 100963)
   Memory: 276.0M
   CGroup: /system.slice/openstack-nova-scheduler.service
           ├─105509 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105528 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105529 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105530 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105531 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105532 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105533 /usr/bin/python3 /usr/bin/nova-scheduler
           ├─105534 /usr/bin/python3 /usr/bin/nova-scheduler
           └─105535 /usr/bin/python3 /usr/bin/nova-scheduler

Jun 10 21:50:52 controller systemd[1]: Starting OpenStack Nova Scheduler Server...
Jun 10 21:50:56 controller systemd[1]: Started OpenStack Nova Scheduler Server.
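Before restarting the scheduler, it can help to confirm that the filter list really contains PciPassthroughFilter. The following is a minimal sketch, not part of the original post: the helper name missing_filters is hypothetical, and it only inspects the comma-separated option value as a string rather than reading nova.conf itself.

```python
def missing_filters(filters_value, required=("PciPassthroughFilter",)):
    """Return the required filter names that are absent from the option value.

    filters_value is the raw comma-separated string assigned to the
    scheduler filter option in nova.conf.
    """
    enabled = {name.strip() for name in filters_value.split(",") if name.strip()}
    return [name for name in required if name not in enabled]


# Value copied from the scheduler configuration in this post.
value = (
    "RetryFilter, AvailabilityZoneFilter, RamFilter, ComputeFilter, "
    "ComputeCapabilitiesFilter, ImagePropertiesFilter, "
    "ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, "
    "PciPassthroughFilter"
)
print(missing_filters(value))
```

An empty list means every required filter is enabled; any name in the returned list still has to be added to the option.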

2. Configure nova-api (controller node). Edit /etc/nova/nova.conf:
[pci]
alias = { "name": "nvidia1080", "product_id": "1b06", "vendor_id": "10de", "device_type": "type-PCI" }
alias = { "name": "nvidiaGF119", "product_id": "104a", "vendor_id": "10de", "device_type": "type-PCI" }

Restart the nova-api service:
[root@controller ~]# systemctl restart openstack-nova-api.service
3. Configure nova-compute (compute node). Edit /etc/nova/nova.conf:
[pci]
passthrough_whitelist = { "vendor_id": "10de", "product_id": "104a" }
alias = { "name": "nvidiaGF119", "product_id": "104a", "vendor_id": "10de", "device_type": "type-PCI" }
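A mistyped vendor_id or device_type in an alias is easy to miss by eye. The sketch below checks one alias JSON string against the whitelist entry. It is an illustration only: the function name check_alias is hypothetical, it assumes a single GPU model per compute node (so alias and whitelist IDs should agree), and it accepts only the device_type values type-PCI, type-PF and type-VF.

```python
import json
import re

HEX_ID = re.compile(r"[0-9a-fA-F]{4}")
DEVICE_TYPES = {"type-PCI", "type-PF", "type-VF"}


def check_alias(raw_alias, whitelist):
    """Return a list of problems found in one alias JSON string."""
    alias = json.loads(raw_alias)
    problems = []
    for key in ("name", "vendor_id", "product_id"):
        if key not in alias:
            problems.append(f"missing key: {key}")
    for key in ("vendor_id", "product_id"):
        value = alias.get(key)
        if value is None:
            continue
        if not HEX_ID.fullmatch(str(value)):
            problems.append(f"{key} is not a 4-digit hex ID: {value}")
        elif str(value).lower() != whitelist[key].lower():
            problems.append(f"{key} {value} does not match whitelist {whitelist[key]}")
    # device_type is optional; type-PCI is the default when it is omitted.
    device_type = alias.get("device_type", "type-PCI")
    if device_type not in DEVICE_TYPES:
        problems.append(f"unknown device_type: {device_type}")
    return problems


whitelist = {"vendor_id": "10de", "product_id": "104a"}
good = '{ "name": "nvidiaGF119", "product_id": "104a", "vendor_id": "10de", "device_type": "type-PCI" }'
broken = '{ "name": "nvidiaGF119", "product_id": "104a", "vendor_id": "104a", "device_type": "type_PCI" }'

print(check_alias(good, whitelist))
for problem in check_alias(broken, whitelist):
    print(problem)
```

The broken example swaps the vendor ID for the product ID and writes type_PCI with an underscore; both are reported, while the good example yields an empty list.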

Restart the nova-compute service and follow its log:
[root@compute01 ~]# systemctl restart openstack-nova-compute.service
[root@compute01 ~]# tail -f /var/log/nova/nova-compute.log
2022-06-10 22:10:51.891 12258 INFO oslo.privsep.daemon [req-e600b0bc-1cc4-4e85-a406-4d0c094560ee - - - - -] Spawned new privsep daemon via rootwrap
2022-06-10 22:10:51.796 12299 INFO oslo.privsep.daemon [-] privsep daemon starting
2022-06-10 22:10:51.800 12299 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
2022-06-10 22:10:51.804 12299 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_NET_ADMIN/CAP_NET_ADMIN/none
2022-06-10 22:10:51.804 12299 INFO oslo.privsep.daemon [-] privsep daemon running as pid 12299
2022-06-10 22:10:52.437 12258 INFO os_vif [req-e600b0bc-1cc4-4e85-a406-4d0c094560ee - - - - -] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:bf:2a:4e,bridge_name='qbr24719437-3e',has_traffic_filtering=True,id=24719437-3ee6-469b-af02-c1fcea041be2,network=Network(b83e2ffc-eaad-455f-b299-18e09d58be32),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap24719437-3e')
2022-06-10 22:10:52.459 12258 INFO os_vif [req-e600b0bc-1cc4-4e85-a406-4d0c094560ee - - - - -] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:fe:4c:d1,bridge_name='qbr58f2e526-38',has_traffic_filtering=True,id=58f2e526-386b-43da-9818-208b6a34b6e8,network=Network(5eb067d8-cd9b-4eec-ac0b-b5982752e75d),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap58f2e526-38')
2022-06-10 22:10:52.478 12258 INFO os_vif [req-e600b0bc-1cc4-4e85-a406-4d0c094560ee - - - - -] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:bd:8b:42,bridge_name='qbr24c6e701-e5',has_traffic_filtering=True,id=24c6e701-e5b4-4277-9895-cc67a4097280,network=Network(5eb067d8-cd9b-4eec-ac0b-b5982752e75d),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap24c6e701-e5')
2022-06-10 22:10:52.481 12258 INFO nova.compute.manager [req-e600b0bc-1cc4-4e85-a406-4d0c094560ee - - - - -] Looking for unclaimed instances stuck in BUILDING status for nodes managed by this host
2022-06-10 22:10:54.740 12258 INFO nova.virt.libvirt.host [req-e600b0bc-1cc4-4e85-a406-4d0c094560ee - - - - -] kernel doesn't support AMD SEV

III. Verification
1. Create a flavor and set its PCI passthrough property:
openstack flavor create --public --ram 2048 --disk 20 --vcpus 2 m1.large
openstack flavor set m1.large --property pci_passthrough:alias='nvidia1080:2'
Here nvidia1080 is the name defined in the alias, and 2 is the number of GPUs requested.
2. Create an instance:
openstack server create --flavor m1.large --image cirros-0.3.5-x86_64-uec --wait test-pci
3. Inside the cirros instance, check the GPU information:
$ lspci -k
...
00:05.0 Class 0300: 10de:1b06
00:06.0 Class 0300: 10de:1b06
...
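The lspci check in step 3 can be automated by counting the lines that carry the expected vendor and product IDs and comparing the result with the count requested in the flavor. This is a rough sketch that assumes the short output format shown above (as printed inside cirros); a full pciutils lspci prints a different layout that this pattern would not match. The function name count_devices is hypothetical.

```python
import re

# Matches lines such as "00:05.0 Class 0300: 10de:1b06".
LSPCI_LINE = re.compile(
    r"^(\S+)\s+Class\s+([0-9a-fA-F]{4}):\s+([0-9a-fA-F]{4}):([0-9a-fA-F]{4})$"
)


def count_devices(lspci_text, vendor_id, product_id):
    """Count PCI devices in lspci output with the given vendor and product ID."""
    count = 0
    for line in lspci_text.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if not match:
            continue  # skip "..." and anything in another format
        if (match.group(3).lower(), match.group(4).lower()) == (
            vendor_id.lower(),
            product_id.lower(),
        ):
            count += 1
    return count


sample = """...
00:05.0 Class 0300: 10de:1b06
00:06.0 Class 0300: 10de:1b06
..."""

print(count_devices(sample, "10de", "1b06"))
```

For the sample output the count is 2, which matches the 2 requested through pci_passthrough:alias='nvidia1080:2'.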
IV. Issues with NVIDIA GPUs
The NVIDIA driver detects whether it is running inside a virtual machine and fails if it is, so the hypervisor ID has to be hidden from the driver. The Pike release of OpenStack introduced the img_hide_hypervisor_id=true property for Glance images, so the hypervisor ID can be hidden by running the following command on the image:
$ openstack image set IMG-UUID --property img_hide_hypervisor_id=true
Instances created from this image will have the hypervisor ID hidden.
For releases earlier than Pike, see the article "Consumer-grade GPUs in an OpenStack system (NVIDIA GPUs)".
Whether the hypervisor ID is hidden can be checked inside the instance with the following command:
$ cpuid | grep hypervisor_id
   hypervisor_id = "KVMKVMKVM   "
   hypervisor_id = "KVMKVMKVM   "
The output above means the ID is not hidden; the output below means it is hidden:
$ cpuid | grep hypervisor_id
   hypervisor_id = "  @  @    "
   hypervisor_id = "  @  @    "
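The same check can be scripted against captured cpuid output. The sketch below is an assumption-laden illustration: it treats the ID as hidden when at least one hypervisor_id line is present and none of them contains the KVM signature string KVMKVMKVM, which is how the two outputs above differ. The function names are hypothetical.

```python
import re


def hypervisor_ids(cpuid_text):
    """Extract every quoted hypervisor_id value from cpuid output."""
    return re.findall(r'hypervisor_id\s*=\s*"([^"]*)"', cpuid_text)


def is_hypervisor_hidden(cpuid_text, signature="KVMKVMKVM"):
    """Return True if hypervisor_id lines exist and none shows the signature."""
    ids = hypervisor_ids(cpuid_text)
    return bool(ids) and all(signature not in value for value in ids)


visible = '''   hypervisor_id = "KVMKVMKVM   "
   hypervisor_id = "KVMKVMKVM   "'''
hidden = '''   hypervisor_id = "  @  @    "
   hypervisor_id = "  @  @    "'''

print(is_hypervisor_hidden(visible))
print(is_hypervisor_hidden(hidden))
```

Returning False when no hypervisor_id line is found at all is a deliberate choice: missing output more likely means cpuid is not installed than that the ID is hidden.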
Original poster | Posted on 2022-6-10 22:13:07
[root@controller ~]# openstack flavor create --public --ram 2048 --disk 20 --vcpus 2 m1.large
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| description                | None                                 |
| disk                       | 20                                   |
| id                         | a56773dd-2ab1-453b-ab94-95c559334567 |
| name                       | m1.large                             |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 2048                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+
[root@controller ~]# openstack flavor set m1.large --property pci_passthrough:alias='nvidia1080:2'
Original poster | Posted on 2022-6-10 22:17:35
[root@controller ~]# openstack flavor set m1.large --property pci_passthrough:alias='nvidiaGF119:1'

The alias name used here must match the name defined in nova.conf; otherwise an error is raised.
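A mismatch between the flavor property and the configured alias names can be caught before the flavor is used. The sketch below parses a pci_passthrough:alias value and lists any names that are not configured. It is not Nova's own code: the function names are hypothetical, and the comma-separated form for requesting several aliases at once is an assumption about the accepted syntax.

```python
def parse_alias_spec(spec):
    """Parse 'name:count[,name:count...]' into a list of (name, count) pairs.

    Raises ValueError if a count is missing or not an integer.
    """
    requests = []
    for item in spec.split(","):
        name, _, count = item.strip().partition(":")
        requests.append((name.strip(), int(count)))
    return requests


def unknown_aliases(spec, configured_names):
    """Return alias names requested in spec that are not in configured_names."""
    return [name for name, _ in parse_alias_spec(spec) if name not in configured_names]


# Names taken from the alias entries in nova.conf on the compute node.
configured = {"nvidiaGF119"}

print(unknown_aliases("nvidia1080:2", configured))
print(unknown_aliases("nvidiaGF119:1", configured))
```

With only nvidiaGF119 configured, the first request reports nvidia1080 as unknown and the second returns an empty list.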