ascend-device-plugin

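1 Environment requirements

Item                  Requirement
dos2unix              Installed.
Atlas driver version  1.73.5.0.B050 or later.
Go version            1.14.3 or later.
gcc version           7.3.0 or later.
Kubernetes version    1.13.0 or later.
Docker                Installed; images can be pulled from an image registry, or an image for the target operating system is already available locally.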
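The minimum versions in the table above can be checked from a shell before building. The following is a minimal sketch, not part of the repository: `version_ge` and `check_min_version` are hypothetical helper names, and the comparison assumes GNU `sort -V` is available.

```shell
#!/bin/sh
# Hypothetical helpers (not from the repository) for checking minimum versions.

# version_ge A B: succeeds when version A >= version B.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min_version LABEL FOUND MINIMUM: prints OK or FAIL, returns 0 or 1.
check_min_version() {
    if version_ge "$2" "$3"; then
        echo "OK   $1 $2 (>= $3)"
    else
        echo "FAIL $1 $2 (< $3)"
        return 1
    fi
}

# Example calls, using the minimums from the table above:
#   check_min_version Go  "$(go version | awk '{print $3}' | sed 's/^go//')" 1.14.3
#   check_min_version gcc "$(gcc -dumpfullversion)" 7.3.0
```

The example calls are left as comments because they depend on the tools being installed; `gcc -dumpfullversion` is only understood by GCC 7 and later, so an older compiler will report an error instead of a version.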

2 Building

  1. Run the following command to install the latest version of pkg-config.

    apt-get install -y pkg-config
  2. Run the following commands to set the environment variables.

    export GO111MODULE=on
    
    export GOPROXY=https://gocenter.io
    
    export GONOSUMDB=\*

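    Choose the GOPROXY address that suits your environment; you can verify it with the go mod download command. If x509 errors occur, configure the required certificates.

  3. Go to the ascend-device-plugin directory and run the following command to edit the YAML file.

    vi ascendplugin.yaml
   apiVersion: apps/v1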
   kind: DaemonSet
   metadata:
     name: ascend-device-plugin-daemonset
     namespace: kube-system
   spec:
     selector:
       matchLabels:
         name: ascend-device-plugin-ds
     updateStrategy:
       type: RollingUpdate
     template:
       metadata:
         annotations:
           scheduler.alpha.kubernetes.io/critical-pod: ""
         labels:
           name: ascend-device-plugin-ds
       spec:
         tolerations:
           - key: CriticalAddonsOnly
             operator: Exists
           - key: npu.huawei.com/NPU  # Resource name; set according to the chip type.
             operator: Exists
             effect: NoSchedule
           - key: "ascendplugin"
             operator: "Equal"
             value: "v2"
             effect: NoSchedule
         priorityClassName: "system-node-critical"
         nodeSelector:
           accelerator: huawei-Ascend910  # Label value; set according to the chip type.
         containers:
         - image: ascend-k8sdeviceplugin:v0.0.1  # Image name and version.
           name: device-plugin-01
           command: [ "/bin/bash", "-c", "--"]
           args: [ "./build/build_in_docker.sh;ascendplugin  --useAscendDocker=${USE_ASCEND_DOCKER}" ] # For Ascend 310, add --mode=ascend310.
           securityContext:
             privileged: true
           imagePullPolicy: Never
           volumeMounts:
             - name: device-plugin
               mountPath: /var/lib/kubelet/device-plugins
             - name: hiai-driver
               mountPath: /usr/local/Ascend/driver  # Driver installation directory; change to match your environment.
             - name: log-path
               mountPath: /var/log/devicePlugin
         volumes:
           - name: device-plugin
             hostPath:
               path: /var/lib/kubelet/device-plugins
           - name: hiai-driver
             hostPath:
               path: /usr/local/Ascend/driver  # Driver installation directory; change to match your environment.
           - name: log-path
             hostPath:
               path: /var/log/devicePlugin
To use affinity scheduling together with Volcano, use the following instead:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pods-device-plugin
subjects:
  - kind: ServiceAccount
    name: default
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ascend-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: ascend-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: ascend-device-plugin-ds
    spec:
      tolerations:
        - key: CriticalAddonsOnly
          operator: Exists
        - key: npu.huawei.com/NPU
          operator: Exists
          effect: NoSchedule
        - key: "ascendplugin"
          operator: "Equal"
          value: "v2"
          effect: NoSchedule
      priorityClassName: "system-node-critical"
      nodeSelector:
        accelerator: huawei-Ascend910
      containers:
      - image: ascend-k8sdeviceplugin:v0.0.1
        name: device-plugin-01
        command: [ "/bin/bash", "-c", "--"]
        args: [ "./build/build_in_docker.sh;ascendplugin  --useAscendDocker=${USE_ASCEND_DOCKER} --volcanoType=true" ]
        securityContext:
          privileged: true
        imagePullPolicy: Never
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
          - name: hiai-driver
            mountPath: /usr/local/Ascend/driver
          - name: log-path
            mountPath: /var/log/devicePlugin
        env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: hiai-driver
          hostPath:
            path: /usr/local/Ascend/driver
        - name: log-path
          hostPath:
            path: /var/log/devicePlugin
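The two manifests above differ mainly in the arguments passed to ascendplugin. The sketch below composes that argument string from the chip type and the Volcano setting. `plugin_args` is a hypothetical helper, not part of the repository; it uses only the flags that appear in this README, and whether --mode=ascend310 and --volcanoType=true may be combined is an assumption.

```shell
#!/bin/sh
# plugin_args CHIP USE_ASCEND_DOCKER [VOLCANO]
# Hypothetical helper: prints the ascendplugin command line for the DaemonSet args.
#   CHIP              310 or 910
#   USE_ASCEND_DOCKER true or false
#   VOLCANO           true to add --volcanoType=true (default: false)
plugin_args() {
    args="ascendplugin"
    if [ "$1" = "310" ]; then
        # The README states that Ascend 310 needs --mode=ascend310.
        args="$args --mode=ascend310"
    fi
    args="$args --useAscendDocker=$2"
    if [ "${3:-false}" = "true" ]; then
        # Flag taken from the Volcano manifest above.
        args="$args --volcanoType=true"
    fi
    echo "$args"
}
```

For example, `plugin_args 910 true true` prints the argument string used in the Volcano manifest, apart from the build script prefix.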
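  4. Run the following command to edit the Dockerfile, changing the base image to the image name and version you looked up.

    vi /home/test/ascend-device-plugin/build/Dockerfile

# Choose a base image that includes the Go toolchain; list the available images with the docker images command.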

FROM golang:1.13.11-buster as build

# Whether to use Ascend Docker: true to use it, false to use native Docker instead.
ENV USE_ASCEND_DOCKER true

ENV GOPATH /usr/app/

ENV GO111MODULE off

ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
# Directory containing libdrvdsmi_host.so; it differs between Ascend 310 and Ascend 910.
ENV LD_LIBRARY_PATH  /usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/common

RUN mkdir -p /usr/app/src/ascend-device-plugin

COPY . /usr/app/src/Ascend-device-plugin

WORKDIR /usr/app/src/Ascend-device-plugin
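The Dockerfile comment notes that the directory holding libdrvdsmi_host.so differs between Ascend 310 and Ascend 910. The sketch below prints the LD_LIBRARY_PATH value for each chip type. `ascend_ld_library_path` is a hypothetical helper name; the paths are taken from the Dockerfile above and from the simplified script later in this README, not verified against a driver installation.

```shell
#!/bin/sh
# ascend_ld_library_path CHIP [INSTALL_PATH]
# Hypothetical helper: prints the LD_LIBRARY_PATH value for the Dockerfile.
#   CHIP          310 or 910
#   INSTALL_PATH  driver installation prefix (default: /usr/local/Ascend)
ascend_ld_library_path() {
    install_path=${2:-/usr/local/Ascend}
    case "$1" in
        910) echo "${install_path}/driver/lib64/driver:${install_path}/driver/lib64/common" ;;
        310) echo "${install_path}/driver/lib64:${install_path}/driver/lib64/common" ;;
        *)   echo "unknown chip type: $1" >&2
             return 1 ;;
    esac
}
```

The printed value can be pasted into the `ENV LD_LIBRARY_PATH` line of the Dockerfile.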
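  5. Go to the directory containing ascend_device_plugin.pc, run the following command, check that the paths below are correct, and change them if necessary.

    • Ascend 310 directory: ascend-device-plugin/src/plugin/config/config_310
    • Ascend 910 directory: ascend-device-plugin/src/plugin/config/config_910
    vi ascend_device_plugin.pc
    #Package Information for pkg-config
    # Driver installation directory; change to match your environment.
    prefix=/usr/local/Ascend
    # Location of the dsmi shared library; change to match your environment.
    libdriver=${prefix}/driver/lib64
    # Directory containing the dsmi driver header dsmi_common_interface.h.
    includedir=${prefix}/driver/kernel/inc/driver/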
    Name: ascend_docker_plugin
    Description: Ascend device plugin
    Version: 0.0.1
    Libs: -L${libdriver}/    -ldrvdsmi_host
    Cflags: -I${includedir}
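One way to check the paths in the .pc file is to expand its variables and test whether the library is actually there. The sketch below does this with sed only. `pc_var` is a hypothetical helper, not part of the repository, and it expands only `${prefix}` references, which is all the file above uses.

```shell
#!/bin/sh
# pc_var FILE NAME
# Hypothetical helper: prints variable NAME from a pkg-config file,
# with any ${prefix} reference replaced by the value of prefix=.
pc_var() {
    pc_prefix=$(sed -n 's/^prefix=//p' "$1")
    sed -n "s/^$2=//p" "$1" | sed "s#[$]{prefix}#${pc_prefix}#g"
}

# Example: check that the dsmi library is where the .pc file says it is.
#   libdir=$(pc_var ascend_device_plugin.pc libdriver)
#   [ -e "${libdir}/libdrvdsmi_host.so" ] || echo "libdrvdsmi_host.so not found in ${libdir}"
```

If pkg-config itself is preferred, `pkg-config --cflags --libs ascend_device_plugin` with PKG_CONFIG_PATH pointing at the config directory should print the resolved flags.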

    To change the name of the plugin image, edit “docker_images_name” in build_common.sh under the build directory.

  6. Go to the “/ascend-device-plugin/build” directory and run the following command to check that CONFIGDIR is correct.

    vi build_in_docker.sh
    #!/bin/bash
    set -x
    CUR_DIR=$(dirname $(readlink -f $0))
    TOP_DIR=$(realpath ${CUR_DIR}/..)
    CONFIGDIR=${TOP_DIR}/src/plugin/config/config_910 # Defaults to config_910; change to config_310 for Ascend 310.
    
    OUTPUT_NAME="ascendplugin"
    export PKG_CONFIG_PATH=${CONFIGDIR}:$PKG_CONFIG_PATH
    function main() {
        rm -rf ${TOP_DIR}/output/*
        rm -rf ~/.cache/go-build
        rm -rf /tmp/gobuildplguin
        mkdir -p /tmp/gobuildplguin
        chmod 750 /tmp/gobuildplguin
        cd ${TOP_DIR}/src/plugin/cmd/ascendplugin
        go build -ldflags "-X main.BuildName=${OUTPUT_NAME} \
                -X main.BuildVersion=${build_version} \
                -buildid none     \
                -s   \
                -tmpdir /tmp/gobuildplguin" \
                -o ${OUTPUT_NAME}       \
                -trimpath
    
        ls ${OUTPUT_NAME}
        if [ $? -ne 0 ]; then
            echo "fail to find ascendplugin"
            exit 1
        fi
        cp ${TOP_DIR}/src/plugin/cmd/ascendplugin/${OUTPUT_NAME}              /usr/local/bin/
    }
    main

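    Simplified procedure

    To simplify steps 3 to 6, the following shell script applies the changes in one pass. Create it in ./build/ and run it.

      #!/bin/bash
      ASCNED_TYPE=910 # Chip type: 310 or 910.
      ASCNED_INSTALL_PATH=/usr/local/Ascend  # Driver installation path; change to match your environment.
      USE_ASCEND_DOCKER=false  # Whether to use Ascend Docker.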
      
      
      CUR_DIR=$(dirname $(readlink -f $0))
      TOP_DIR=$(realpath ${CUR_DIR}/..)
      LD_LIBRARY_PATH_PARA1=${ASCNED_INSTALL_PATH}/driver/lib64/driver
      LD_LIBRARY_PATH_PARA2=${ASCNED_INSTALL_PATH}/driver/lib64
      TYPE=Ascend910
      PKG_PATH=${TOP_DIR}/src/plugin/config/config_910
      PKG_PATH_STRING=\$\{TOP_DIR\}/src/plugin/config/config_910
      LIBDRIVER="driver/lib64/driver"
      if [ ${ASCNED_TYPE} == "310"  ]; then
        TYPE=Ascend310
        LD_LIBRARY_PATH_PARA1=${ASCNED_INSTALL_PATH}/driver/lib64
        PKG_PATH=${TOP_DIR}/src/plugin/config/config_310
        PKG_PATH_STRING=\$\{TOP_DIR\}/src/plugin/config/config_310
        LIBDRIVER="/driver/lib64"
        sed -i "s#ascendplugin  --useAscendDocker=\${USE_ASCEND_DOCKER}#ascendplugin --mode=ascend310 --useAscendDocker=${USE_ASCEND_DOCKER}#g" ${TOP_DIR}/ascendplugin.yaml
      fi
      sed -i "s/Ascend[0-9]\{3\}/${TYPE}/g" ${TOP_DIR}/ascendplugin.yaml
      sed -i "s#ath: /usr/local/Ascend/driver#ath: ${ASCNED_INSTALL_PATH}/driver#g" ${TOP_DIR}/ascendplugin.yaml
      sed -i "/^ENV LD_LIBRARY_PATH /c ENV LD_LIBRARY_PATH ${LD_LIBRARY_PATH_PARA1}:${LD_LIBRARY_PATH_PARA2}/common" ${TOP_DIR}/Dockerfile
      sed -i "/^ENV USE_ASCEND_DOCKER /c ENV USE_ASCEND_DOCKER ${USE_ASCEND_DOCKER}" ${TOP_DIR}/Dockerfile
      sed -i "/^libdriver=/c libdriver=$\{prefix\}/${LIBDRIVER}" ${PKG_PATH}/ascend_device_plugin.pc
      sed -i "/^prefix=/c prefix=${ASCNED_INSTALL_PATH}" ${PKG_PATH}/ascend_device_plugin.pc
      sed -i "/^CONFIGDIR=/c CONFIGDIR=${PKG_PATH_STRING}" ${CUR_DIR}/build_in_docker.sh
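Most of the edits in the script above rely on the sed `c` (change) command, which replaces every line matched by the address with new text. The sketch below isolates that technique so it can be tried on a scratch file. `replace_line` is a hypothetical helper name, and the one-line `c text` form together with `-i` assumes GNU sed.

```shell
#!/bin/sh
# replace_line FILE PREFIX NEWLINE
# Hypothetical helper: replaces every line that starts with PREFIX by NEWLINE, in place.
# Uses the GNU sed one-line form of the `c` command, as the script above does.
# PREFIX must not contain "/" because it is used inside a /regex/ address.
replace_line() {
    sed -i "/^$2/c $3" "$1"
}

# Example (same effect as the prefix= edit in the script above):
#   replace_line ascend_device_plugin.pc 'prefix=' 'prefix=/opt/Ascend'
```

Lines that do not match the prefix are left untouched, so the helper can be run repeatedly with different values.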
  7. Run the following commands, choosing the script that matches your chip, to build the binary and the image.

    Use build_910.sh for Ascend 910 and build_310.sh for Ascend 310.

    cd /home/test/ascend-device-plugin/build/
    chmod +x build_910.sh
    ./build_910.sh dockerimages
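  8. Run the following command to view the generated packages.

    ll /home/test/ascend-device-plugin/output

    Package names differ between x86 and ARM. The following example is from an ARM environment:
    • Ascend-K8sDevicePlugin-xxx-arm64-Docker.tar.gz: the Kubernetes device plugin image.
    • Ascend-K8sDevicePlugin-xxx-arm64-Linux.tar.gz: the Kubernetes device plugin binary installation package.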

    drwxr-xr-x 2 root root     4096 Jun  8 18:42 ./
    drwxr-xr-x 9 root root     4096 Jun  8 17:12 ../
    -rw-r--r-- 1 root root 29584705 Jun  9 10:37 Ascend-K8sDevicePlugin-xxx-arm64-Docker.tar.gz
    -rwxr-xr-x 1 root root  6721073 Jun  9 16:20 Ascend-K8sDevicePlugin-xxx-arm64-Linux.tar.gz
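Because the version and architecture parts of the package names vary, a glob is a convenient way to confirm that both outputs exist. The following is a sketch only: `check_packages` is a hypothetical helper, and the file name pattern is inferred from the ARM example above.

```shell
#!/bin/sh
# check_packages DIR
# Hypothetical helper: succeeds when DIR holds both the Docker image package
# and the Linux binary package produced by the build.
check_packages() {
    for kind in Docker Linux; do
        # The version ("xxx") and architecture parts vary, so match them with a glob.
        found=$(ls "$1"/Ascend-K8sDevicePlugin-*-"${kind}".tar.gz 2>/dev/null | head -n 1)
        if [ -z "$found" ]; then
            echo "missing ${kind} package in $1" >&2
            return 1
        fi
        echo "found ${found##*/}"
    done
}

# Example:
#   check_packages /home/test/ascend-device-plugin/output
```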

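3 Creating a DaemonSet from the ascend-device-plugin image

Procedure

The following steps use the tar.gz files generated on the ARM platform as an example.

  1. Go to the directory containing the generated Docker package and run the following commands to load the Docker image.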

    cd /home/test/ascend-device-plugin/output
    
    docker load <Ascend-K8sDevicePlugin-xxx-arm64-Docker.tar.gz
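  2. Run the following command to label the nodes that have Ascend 910 (or Ascend 310) chips.

    kubectl label nodes  localhost.localdomain accelerator=huawei-Ascend910

    localhost.localdomain is the name of a node with Ascend 910 (or Ascend 310) chips; list the node names with the kubectl get node command.

  3. Run the following commands to deploy the DaemonSet.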

    cd /home/test/ascend-device-plugin
    
    kubectl apply -f  ascendplugin.yaml
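  4. Run the following command to view the device information of the nodes.

    kubectl describe node

    The deployment succeeded if the output lists the expected resource name with the correct number of chips, as shown below.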

    Capacity:
      cpu:                   128
      ephemeral-storage:     3842380928Ki
      npu.huawei.com/NPU:  8
      hugepages-2Mi:         0
      memory:                263865068Ki
      pods:                  110
    Allocatable:
      cpu:                   128
      ephemeral-storage:     3541138257382
      npu.huawei.com/NPU:  8
      hugepages-2Mi:         0
      memory:                263762668Ki
      pods:                  110
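The check above can be scripted by extracting the chip count from the Allocatable section. The sketch below reads `kubectl describe node` output on standard input. `allocatable_count` is a hypothetical helper, and it assumes the output keeps the "name: value" layout shown above.

```shell
#!/bin/sh
# allocatable_count [RESOURCE]
# Hypothetical helper: reads `kubectl describe node` output on stdin and prints
# the value of RESOURCE (default: npu.huawei.com/NPU) from the Allocatable section.
allocatable_count() {
    awk -v key="${1:-npu.huawei.com/NPU}:" '
        /^ *Allocatable:/      { in_alloc = 1; next }   # start of the section
        in_alloc && $1 == key  { print $2; exit }       # first matching resource line
    '
}

# Example:
#   kubectl describe node localhost.localdomain | allocatable_count
```

Nothing is printed when the resource is absent, which usually means the plugin has not registered with the kubelet on that node.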

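4 Creating a task container with Ascend chips from YAML

4.1 Write the task YAML, specifying in resources the chip type and number of chips the task container needs

    vi ascend.yaml
    apiVersion: v1  # API version; must be one of those listed by kubectl api-versions.
    kind: Pod  # Kind of resource to create.
    metadata:
      name: rest502  # Resource name; must be unique within a namespace.
    spec:
      containers:
      - name: rest502  # Container name.
        image: ubuntu_arm64_resnet50:18.04  # Image used by the container.
        imagePullPolicy: Never
        resources:
          limits:  # Resource limits.
            npu.huawei.com/NPU: 2  # Chip type and count; for Ascend 310, change the resource name to huawei.com/Ascend310.
        volumeMounts:
          - name: joblog
            mountPath: /home/log/  # Log path inside the container; change as the task requires.
          - name: model
            mountPath: /home/app/model  # Model path inside the container; change as the task requires.
      volumes:
        - name: joblog
          hostPath:
            path: /home/test/docker_log  # Host path mounted for logs; change as the task requires.
        - name: model
          hostPath:
            path: /home/test/docker_model/  # Host path mounted for models; change as the task requires.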

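4.2 Run the following command to start the Pod

    kubectl apply -f ascend.yaml

4.3 Run the following commands in turn to enter the pod and view the allocated devices

    kubectl exec -it <pod-name> -- bash

    <pod-name> is the resource name from the YAML (rest502 in the example above).

    ls /dev/

    In output similar to the following, davinci3 and davinci4 are the chips allocated to the pod.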

    core davinci3 davinci4 davinci_manager devmm_svm fd full hisi_hdc   mqueue null ptmx
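To compare the allocation with the requested limit, the numbered davinci devices can be filtered out of the listing. The sketch below is illustrative: `allocated_davinci` is a hypothetical helper, and it assumes allocated chips always appear as davinci followed by a number, as in the output above.

```shell
#!/bin/sh
# allocated_davinci
# Hypothetical helper: reads a device listing on stdin (for example `ls /dev/`)
# and prints the numbered davinci devices, one per line.
# Entries such as davinci_manager are not chips and are skipped.
allocated_davinci() {
    tr -s ' \t' '\n' | grep -E '^davinci[0-9]+$'
}

# Example, run from outside the pod:
#   kubectl exec <pod-name> -- ls /dev/ | allocated_davinci | wc -l
```

The count printed by the example should equal the limit requested in the task YAML.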

License: Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
