klever

Problems it solves:

Model management and distribution
Model parsing and conversion
Deployment and management of online model services

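Components

ormb: a tool for packaging, unpacking, uploading, and downloading models
model-registry: the model repository and the API management layer for model services. Note that model-registry has a bug on file upload where the file is not saved; a small code change fixes it
modeljob-operator: the ModelJob controller, which manages model parsing and model conversion jobs
klever-web: the front-end component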

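Dependencies

Istio: an open-source service mesh. Model services expose their addresses through Istio, which also provides content-based and percentage-based traffic splitting
Harbor: the underlying model storage component, which stores model configuration and model files in layers
Seldon Core: the open-source controller for the SeldonDeployment CRD; model services are managed through SeldonDeployment CRs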

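Using ORMB

The model definition (config) currently uses the mediaType application/vnd.caicloud.model.config.v1alpha1+json

Model files are hard to store in layers, so by design ormb compresses and archives them with the mediaType application/tar+gzip before uploading them to the image registry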
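To make the two media types concrete, here is a minimal sketch of packing a model directory into a single tar+gzip blob next to a JSON config blob. It only illustrates the idea; it is not ormb's actual implementation, and the helper names archive_model_dir and describe_layers are made up for this example.

```python
import io
import json
import tarfile

# Media types quoted in the text above.
CONFIG_MEDIA_TYPE = "application/vnd.caicloud.model.config.v1alpha1+json"
CONTENT_MEDIA_TYPE = "application/tar+gzip"


def archive_model_dir(model_dir):
    """Pack a model directory into one gzip-compressed tar archive (bytes)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        # arcname="." keeps member paths relative to the model directory.
        tar.add(model_dir, arcname=".")
    return buffer.getvalue()


def describe_layers(model_dir, config):
    """Return (mediaType, size) pairs for the config blob and the file blob."""
    config_blob = json.dumps(config).encode("utf-8")
    content_blob = archive_model_dir(model_dir)
    return [
        (CONFIG_MEDIA_TYPE, len(config_blob)),
        (CONTENT_MEDIA_TYPE, len(content_blob)),
    ]
```

The whole directory ends up as one opaque layer, which is why a small change to a model file means re-uploading the full archive.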

Note: add --plain-http to ormb pull or push to use plain HTTP instead of HTTPS

ormb login generates config.json in the current directory and stores the token in it

$ ormb login  --insecure 192.168.194.129:30022 -u admin -p Ormb123456

ormb save (saves the files in the model directory to the local file system cache)

$ ormb save <model directory> 192.168.194.129:30022/lgy/fashion-model:v1

ormb push (pushes the model saved in the cache to the remote registry)

$ ormb push 192.168.194.129:30022/lgy/fashion-model:v1

ormb tag

ormb tag 192.168.194.129:30022/lgy/fashion-model:v1 192.168.194.129:30022/lgy/fashion-model:v2

ormb pull

ormb pull --plain-http 192.168.194.129:30022/lgy/fashion-model:v1

ormb export

ormb export 192.168.194.129:30022/lgy/fashion-model:v1
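The commands above can be chained from a script. The sketch below only builds the argument lists for the save, push, pull, export round trip using the subcommands and the --plain-http flag shown in this section; ormb_round_trip is a hypothetical helper name, and nothing is executed.

```python
import shlex


def ormb_round_trip(model_dir, ref, plain_http=False):
    """Return the ormb commands for save, push, pull, export as argv lists."""
    extra = ["--plain-http"] if plain_http else []
    return [
        ["ormb", "save", model_dir, ref],
        ["ormb", "push", *extra, ref],
        ["ormb", "pull", *extra, ref],
        ["ormb", "export", ref],
    ]


commands = ormb_round_trip(
    "./model", "192.168.194.129:30022/lgy/fashion-model:v1", plain_http=True
)
for argv in commands:
    # Print the shell form; pass argv to subprocess.run to actually execute it.
    print(shlex.join(argv))
```

Keeping the commands as lists avoids shell quoting problems if the model directory contains spaces.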

Using ormb-storage-initializer

This command combines ormb login, pull, and export. It requires the environment variables ORMB_USERNAME (Harbor account) and ORMB_PASSWORD (Harbor password)

ormb-storage-initializer pull-and-export 192.168.194.129:30022/lgy/fashion-model:v1 ./model

The pulled model files are saved under ./model with the following structure:

[root@localhost model]# tree
.
├── 1
│   ├── saved_model.pb
│   └── variables
│       ├── variables.data-00000-of-00001
│       └── variables.index
└── ormbfile.yaml
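A quick sanity check of the exported directory can catch a failed pull before the model server starts. This is a minimal sketch that assumes the layout in the tree above (numbered version folders holding saved_model.pb, plus ormbfile.yaml at the top); find_savedmodel_versions is a hypothetical helper.

```python
from pathlib import Path


def find_savedmodel_versions(export_dir):
    """Return the version sub-directories that contain a saved_model.pb."""
    root = Path(export_dir)
    if not (root / "ormbfile.yaml").is_file():
        raise FileNotFoundError("ormbfile.yaml is missing from " + str(root))
    versions = []
    for child in sorted(root.iterdir()):
        # A version folder is any directory holding a saved_model.pb file.
        if child.is_dir() and (child / "saved_model.pb").is_file():
            versions.append(child.name)
    return versions
```

For the tree shown above, calling find_savedmodel_versions("./model") would list the single version folder named 1.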

modeljob-operator

Model parsing:

apiVersion: kleveross.io/v1alpha1
kind: ModelJob
metadata:
  name: modeljob-savedmodel-extract
  namespace: default
  labels:
    modeljob/extract: "true"
spec:
  # Add fields here
  model: "192.168.194.129:30022/lgy/fashion-model:v1"
  extraction:
    format: "SavedModel"

Model conversion:

apiVersion: kleveross.io/v1alpha1
kind: ModelJob
metadata:
  name: modeljob-caffe-convert
  namespace: default
  labels:
    modeljob/convert: "true"
spec:
  # Add fields here
  model: "192.168.194.129:30022/lgy/caff-model:v1"
  desiredTag: "192.168.194.129:30022/lgy/caff-model:v2"
  conversion:
    mmdnn:
      from: "CaffeModel"
      to: "NetDef"
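The two manifests differ only in their label and in the extraction or conversion part of the spec, so they are easy to generate. The sketch below builds the same structure as a Python dict; model_job is a hypothetical helper, and the field names are copied from the YAML above rather than from the CRD schema.

```python
import json


def model_job(name, model, *, extraction_format=None, convert=None, desired_tag=None):
    """Build a ModelJob manifest as a dict, mirroring the YAML examples above."""
    kind = "extract" if extraction_format else "convert"
    spec = {"model": model}
    if extraction_format:
        spec["extraction"] = {"format": extraction_format}
    else:
        source, target = convert  # e.g. ("CaffeModel", "NetDef")
        spec["desiredTag"] = desired_tag
        spec["conversion"] = {"mmdnn": {"from": source, "to": target}}
    return {
        "apiVersion": "kleveross.io/v1alpha1",
        "kind": "ModelJob",
        "metadata": {
            "name": name,
            "namespace": "default",
            "labels": {"modeljob/" + kind: "true"},
        },
        "spec": spec,
    }


job = model_job(
    "modeljob-savedmodel-extract",
    "192.168.194.129:30022/lgy/fashion-model:v1",
    extraction_format="SavedModel",
)
print(json.dumps(job, indent=2))
```

kubectl accepts JSON manifests as well as YAML, so the printed output can be saved to a file and applied with kubectl apply -f.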

MLflow

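Module	Function
Tracking	An API for recording experiment parameters and comparing results, with a visual UI
Projects	A standard directory format, including a descriptor file, for packaging machine learning code into reusable, reproducible, shareable projects
Models	General-purpose model file management and deployment, supporting multiple frameworks and deployment platforms
Model Registry	A collaboration hub for managing the full model lifecycle, including versioning, moving models between environments, and so on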

Installation

pip install mlflow
pip install conda

Tracking

import os
from random import random, randint

from mlflow import log_metric, log_param, log_artifacts

if __name__ == "__main__":
    print("Running mlflow_tracking.py")

    log_param("param1", randint(0, 100))

    log_metric("foo", random())
    log_metric("foo", random() + 1)
    log_metric("foo", random() + 2)

    if not os.path.exists("outputs"):
        os.makedirs("outputs")
    with open("outputs/test.txt", "w") as f:
        f.write("hello world!")

    log_artifacts("outputs")

python mlflow_tracking.py

Tracking results are recorded in the current directory, in a generated mlruns directory
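The mlruns directory can be inspected without the UI. The sketch below assumes the local file store layout, where each run folder has a metrics sub-folder with one text file per metric and each line holds a timestamp, a value, and a step. That layout is an assumption about MLflow's file store and may differ between versions; read_metric_history is a hypothetical helper.

```python
from pathlib import Path


def read_metric_history(run_dir, key):
    """Parse one metric file into a list of (timestamp, value, step) tuples."""
    history = []
    metric_file = Path(run_dir) / "metrics" / key
    for line in metric_file.read_text().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        timestamp, value = int(parts[0]), float(parts[1])
        # Assumed format: the step is the optional third field.
        step = int(parts[2]) if len(parts) > 2 else 0
        history.append((timestamp, value, step))
    return history
```

For the script above, the file for the metric foo would hold three lines, one per log_metric call.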

Result

mlflow ui

Open the UI in a browser (by default mlflow ui listens on http://127.0.0.1:5000)

Projects

Packages the code together with its execution environment

Create an MLproject file and a conda.yaml

conda.yaml

name: tutorial
channels:
  - conda-forge
dependencies:
  - python=3.6
  - pip
  - pip:
    - scikit-learn==0.23.2
    - mlflow>=1.0
    - pandas

MLproject

name: tutorial

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"

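This records the environment information the model needs, and running the command below reproduces the model result. To run without conda, make sure the required dependencies are already installed in the runtime environment and add --no-conda to the command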

mlflow run sklearn_elasticnet_wine -P alpha=0.5 -P l1_ratio=0.1
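The MLproject command passes alpha and l1_ratio to train.py as positional arguments. A minimal sketch of how train.py might read them is shown below; parse_hyperparameters is a hypothetical helper, and the fallback values simply repeat the defaults from the MLproject file.

```python
def parse_hyperparameters(argv):
    """Read alpha and l1_ratio from positional arguments.

    argv is the argument list without the script name, e.g. sys.argv[1:].
    Missing values fall back to the defaults declared in MLproject.
    """
    alpha = float(argv[0]) if len(argv) > 0 else 0.5
    l1_ratio = float(argv[1]) if len(argv) > 1 else 0.1
    return alpha, l1_ratio
```

Inside train.py this would be called as parse_hyperparameters(sys.argv[1:]) before fitting the model.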


Models

$ python sklearn_logistic_regression/train.py
Score: 0.6666666666666666
Model saved in run 96f95c78fe7d4de88199a89f87a89762

Start a web server that serves a model saved with MLflow (as an algorithm service)

$ mlflow models serve -m runs:/c19192871687493b940100db7c461fd3/model
2021/09/09 15:14:44 INFO mlflow.models.cli: Selected backend for flavor 'python_function'
2021/09/09 15:14:44 INFO mlflow.pyfunc.backend: === Running command 'source /root/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-258677fee9248770821ae816e559134654b19176 1>&2 && gunicorn --timeout=60 -b 127.0.0.1:5000 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app'
[2021-09-09 15:14:44 +0800] [90727] [INFO] Starting gunicorn 20.1.0
[2021-09-09 15:14:44 +0800] [90727] [INFO] Listening at: http://127.0.0.1:5000 (90727)
[2021-09-09 15:14:44 +0800] [90727] [INFO] Using worker: sync
[2021-09-09 15:14:44 +0800] [90757] [INFO] Booting worker with pid: 90757

Request:

curl -d '{"columns":["x"], "data":[[1], [-1]]}' -H 'Content-Type: application/json; format=pandas-split' -X POST 127.0.0.1:5000/invocations
[1, 0]
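The same request can be built from Python with the standard library only. This sketch mirrors the curl call above (pandas-split payload, same Content-Type header); build_invocation_request is a hypothetical helper, and newer MLflow releases may expect a different payload format.

```python
import json
import urllib.request


def build_invocation_request(columns, rows, url="http://127.0.0.1:5000/invocations"):
    """Build a POST request carrying a pandas-split style JSON payload."""
    payload = json.dumps({"columns": columns, "data": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json; format=pandas-split"},
        method="POST",
    )


request = build_invocation_request(["x"], [[1], [-1]])
# Sending it needs the model server from the previous step to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes the payload easy to inspect or log before it goes out.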

model registry

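A centralized model store with APIs and a UI for managing the full model lifecycle. It provides model lineage, model versioning, and stage transitions

Here MinIO is the storage backend for model data, and SQLite stores the model metadata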

export AWS_ACCESS_KEY_ID=admin      
export AWS_SECRET_ACCESS_KEY=liguoyu3564      
export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000     
mlflow server \
--host 0.0.0.0 -p 5002 \
--default-artifact-root s3://mlflow \
--backend-store-uri sqlite:///mlflow.db

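A registered model version can be moved between stages. The default stage is None; Staging means the model is being prepared; Production means it is live in the production environment; Archived means it has been archived, that is, retired

The serve command changes accordingly: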

mlflow models serve -m "models:/newmodel/Production" -p 12346 -h 0.0.0.0 --no-conda
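The models:/ URI in the command above is just the registered model name followed by a stage. A minimal sketch of building it with a check against the four stage names mentioned in this section is shown below; registry_model_uri is a hypothetical helper.

```python
# Stage names as listed in the text above.
REGISTRY_STAGES = ("None", "Staging", "Production", "Archived")


def registry_model_uri(name, stage):
    """Return the models:/<name>/<stage> URI, rejecting unknown stage names."""
    if stage not in REGISTRY_STAGES:
        raise ValueError(f"unknown stage {stage!r}; expected one of {REGISTRY_STAGES}")
    return f"models:/{name}/{stage}"
```

Validating the stage up front catches typos such as "production" before the serve command fails to resolve the model.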

Hub

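Problem it solves: data management and preprocessing

Hub stores a dataset as a single NumPy-like array, at sizes up to the petabyte scale, in the cloud, so the data can be accessed and used seamlessly from any machine.

Hub makes data of any type stored in the cloud, including images, audio, and video, usable as fast as if it were stored locally. It integrates with PyTorch and TensorFlow.

Installation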

$ pip3 install hub

Creating a dataset

(base) [root@localhost hub]# tree animals/ -C   # existing dataset directory
animals/
├── cats
│   ├── image_1.jpg
│   └── image_2.jpg
└── dogs
    ├── image_3.jpg
    └── image_4.jpg

Creating the Hub dataset

Manual creation

import hub
from PIL import Image
import numpy as np
import os
# Create an empty dataset
ds = hub.empty('./animals_hub') # Creates the dataset
# Walk the source folder to collect the files to upload
dataset_folder = './animals'

class_names = os.listdir(dataset_folder)

files_list = []
for dirpath, dirnames, filenames in os.walk(dataset_folder):
    for filename in filenames:
        files_list.append(os.path.join(dirpath, filename))
        
# Create the tensors and metadata
with ds:
    ds.create_tensor('images', htype = 'image', sample_compression = 'jpeg')
    ds.create_tensor('labels', htype = 'class_label', class_names = class_names)
    ds.info.update(description = 'My first Hub dataset')
    ds.images.info.update(camera_type = 'SLR')
# Populate the dataset
with ds:
    # Iterate through the files and append to hub dataset
    for file in files_list:
        label_text = os.path.basename(os.path.dirname(file))
        label_num = class_names.index(label_text)
        
        ds.images.append(hub.read(file))  # Append to images tensor using hub.read
        ds.labels.append(np.uint32(label_num)) # Append to labels tensor
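The labelling step in the script above depends only on the folder layout, so it can be checked without Hub installed. The sketch below reproduces that step with the standard library; collect_labeled_files is a hypothetical helper, and it sorts the class names so the label numbers stay stable between runs, which plain os.listdir does not guarantee.

```python
import os


def collect_labeled_files(dataset_folder):
    """Return (class_names, samples) where samples are (path, label) pairs."""
    # Each sub-folder of the dataset folder is treated as one class.
    class_names = sorted(
        entry for entry in os.listdir(dataset_folder)
        if os.path.isdir(os.path.join(dataset_folder, entry))
    )
    samples = []
    for dirpath, _dirnames, filenames in os.walk(dataset_folder):
        label_text = os.path.basename(dirpath)
        if label_text not in class_names:
            continue  # skip files that are not inside a class folder
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            samples.append((path, class_names.index(label_text)))
    return class_names, sorted(samples)
```

For the animals folder above this gives cats the label 0 and dogs the label 1.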

Automatic creation

import hub

src = "./animals"
dest = './animals_hub_auto'  # destination dataset; local storage is used here

ds = hub.ingest(src, dest)

Data compression

ds.create_tensor('images', htype = 'image', sample_compression = 'jpeg')

Data access

import hub

# Local Filepath
ds = hub.load('./my_dataset_path')

# S3
ds = hub.load('s3://my_dataset_bucket', creds={...})

## Activeloop Storage - See Step 6
# Public Dataset hosted by Activeloop
ds = hub.load('hub://activeloop/public_dataset_name')

# Dataset in another workspace on Activeloop Platform
ds = hub.load('hub://workspace_name/dataset_name')

### NO HIERARCHY ###
ds.images # is equivalent to
ds['images']

ds.labels # is equivalent to
ds['labels']

### WITH HIERARCHY - COMING SOON ###
ds.localization.boxes # is equivalent to
ds['localization/boxes']

ds.localization.labels # is equivalent to
ds['localization/labels']

# Indexing
W = ds.images[0].numpy() # Fetch an image and return a NumPy array
X = ds.labels[0].numpy(aslist=True) # Fetch a label and store it as a 
                                    # list of NumPy arrays

# Slicing
Y = ds.images[0:100].numpy() # Fetch 100 images and return a NumPy array
                             # The method above produces an exception if 
                             # the images are not all the same size

Z = ds.labels[0:100].numpy(aslist=True) # Fetch 100 labels and store 
                                         # them as a list of NumPy arrays

DVC

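Module	Function
Data Versioning	Version management for large data files, datasets, and machine learning models. The data is stored separately, like Git for data.
Data Access	Access to DVC-managed data artifacts from outside the project, for example downloading a specific version of a model file and deploying it.
Data Pipelines	Pipelines that define how models and other data artifacts are produced, similar to a traditional Makefile.
Metrics, parameters, and plots	Information recorded at each stage of the pipeline
Experiments	A visual tool for browsing and comparing experiments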

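DVC works together with Git to version data, models, and code.
Easy to install: pip3 install dvc
Easy to use: dvc push, dvc pull, and so on
Fast: dvc add generates a small new file. For example, dvc add data.sql generates data.sql.dvc (a few KB). Git tracks and uploads data.sql.dvc, and DVC uses that .dvc file to pull the matching data file. To get a specific version of the data, model, and code, run git checkout <version> and then dvc pull.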
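The reason the .dvc file stays tiny is that it only holds a content hash and a few attributes, while the data itself lives in the cache. The sketch below illustrates that idea with an MD5 digest; it is not DVC's exact algorithm, the cache path layout is an assumption that varies between DVC versions, and describe_tracked_file is a hypothetical helper.

```python
import hashlib
import os


def describe_tracked_file(path, chunk_size=1 << 20):
    """Return a small pointer record (hash, size, name) for a data file."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    md5 = digest.hexdigest()
    return {
        "md5": md5,
        "size": os.path.getsize(path),
        "path": os.path.basename(path),
        # Assumed cache layout: first two hex digits used as a directory name.
        "cache_path": os.path.join(".dvc", "cache", md5[:2], md5[2:]),
    }
```

Because the record changes whenever the file content changes, committing it to Git is enough to pin an exact version of the data.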

Installation (Python 3.6+)

pip3 install dvc

Usage

DVC currently supports the following seven remote types:

local - local directory
s3 - Amazon S3
gs - Google Cloud Storage
azure - Azure Blob Storage
ssh - SSH
hdfs - Hadoop Distributed File System
http - HTTP and HTTPS

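Versioning data and models

git init  # initialize Git

dvc init  # initialize DVC

dvc init creates a .dvc directory containing .dvc/.gitignore, .dvc/cache/, and .dvc/config. The most important is .dvc/cache/, where DVC keeps the cached copies of tracked files; these are what eventually gets pushed to remote storage.

dvc add data/data.xml  # put data.xml under DVC version control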

git add data/data.xml.dvc data/.gitignore
git commit -m "Add raw data"

Storage and sharing

DVC supports many remote storage types, including Amazon S3, SSH, Google Drive, Azure Blob Storage, and HDFS

dvc remote add -d storage s3://mybucket/dvcstore
dvc remote add -d localstorage /tmp/dvc-storage
git add .dvc/config
git commit -m "Configure remote storage"
dvc push  # upload the data to the default remote configured above
git push -u origin master  # push the reference (.dvc) files to Git

S3

$ dvc remote add -d s3 s3://dvc
$ dvc remote modify s3 access_key_id admin
$ dvc remote modify s3 secret_access_key liguoyu3564
$ dvc remote modify s3 endpointurl http://127.0.0.1:9000
$ cat .dvc/config 
[core]
    remote = s3
['remote "storage"']
    url = s3://mybucket/dvcstore
['remote "localstorage"']
    url = /tmp/dvc-storage
['remote "s3"']
    url = s3://dvc
    access_key_id = admin
    secret_access_key = liguoyu3564
    endpointurl = http://127.0.0.1:9000
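To check which remote dvc push will use, the config file can be read programmatically. The sketch below parses an excerpt of the file above with Python's configparser; DVC itself may use a different parser, the quoted section names are handled by hand, and default_remote is a hypothetical helper.

```python
import configparser

# Excerpt of the .dvc/config shown above (credentials left out).
EXAMPLE_CONFIG = """\
[core]
    remote = s3
['remote "storage"']
    url = s3://mybucket/dvcstore
['remote "s3"']
    url = s3://dvc
    endpointurl = http://127.0.0.1:9000
"""


def default_remote(config_text):
    """Return (name, settings) for the remote named under [core]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    name = parser["core"]["remote"]
    # Section headers look like ['remote "s3"'], quotes included.
    section = parser["'remote \"" + name + "\"'"]
    return name, dict(section)


name, settings = default_remote(EXAMPLE_CONFIG)
print(name, settings["url"])
```

Each dvc remote add -d call overwrites the remote entry under [core], which is why the last one added (s3) is the default here.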

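Pulling data

dvc pull

Modifying data

A typical flow after changing data/data.xml is to run dvc add data/data.xml again, commit the updated data/data.xml.dvc to Git, and run dvc push.

Switching versions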

git checkout HEAD~1 data/data.xml.dvc
dvc checkout

Listing files

(base) [root@localhost data]# dvc list http://192.168.194.129:3000/liguoyu/test data
.gitignore                                                                    
data.xml
data.xml.dvc

Downloading files

This command downloads only the data file itself; no .dvc tracking file is created

(base) [root@localhost test]# dvc get http://192.168.194.129:3000/liguoyu/test data/data.xml -o datatest/data.xml
(base) [root@localhost test]# tree datatest/                                                                                                                                      
datatest/
└── data.xml

0 directories, 1 file

Importing files

(base) [root@localhost test]# dvc import http://192.168.194.129:3000/liguoyu/test data/data.xml -o datatest/data.xml
Importing 'data/data.xml (http://192.168.194.129:3000/liguoyu/test)' -> 'datatest/data.xml'
                                                                                                                                                                                  
To track the changes with git, run:

    git add datatest/.gitignore datatest/data.xml.dvc
(base) [root@localhost test]# tree datatest/ -C
datatest/
├── data.xml
└── data.xml.dvc

0 directories, 2 files
(base) [root@localhost test]# ls -al datatest/
total 37012
drwxr-xr-x. 2 root root       60 Sep  8 11:43 .
drwxr-xr-x. 6 root root       76 Sep  8 11:43 ..
-rw-r--r--. 1 root root 37891863 Sep  8 11:43 data.xml
-rw-r--r--. 1 root root      272 Sep  8 11:43 data.xml.dvc
-rw-r--r--. 1 root root       10 Sep  8 11:43 .gitignore

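API usage

This lets an application read the data content directly at run time

import dvc.api

# Open the DVC-tracked file straight from the remote repository
with dvc.api.open(
    'data/data.xml',
    repo='http://192.168.194.129:3000/liguoyu/test'
) as fd:
    content = fd.read()  # fd behaves like a regular file object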

Preparing the data: setting up the training and validation sets

$ wget https://code.dvc.org/get-started/code.zip
$ unzip code.zip
$ rm -f code.zip
$ tree
.
├── params.yaml
└── src
    ├── evaluate.py
    ├── featurization.py
    ├── prepare.py
    ├── requirements.txt
    └── train.py  
$ pip3 install -r src/requirements.txt
$ dvc run -n prepare \
          -p prepare.seed,prepare.split \
          -d src/prepare.py -d data/data.xml \
          -o data/prepared \
          python src/prepare.py data/data.xml

DAG: the output of one stage is declared as a dependency of another stage

$ dvc run -n featurize \
          -p featurize.max_features,featurize.ngrams \
          -d src/featurization.py -d data/prepared \
          -o data/features \
          python src/featurization.py data/prepared data/features
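The execution order follows from matching each stage's -d dependencies against other stages' -o outputs. The sketch below derives that order for the two stages defined so far using Python's graphlib (Python 3.9+); it illustrates the idea and is not how DVC is implemented, and PIPELINE and stage_order are names made up for this example.

```python
from graphlib import TopologicalSorter

# Dependencies (-d) and outputs (-o) copied from the two dvc run commands above.
PIPELINE = {
    "prepare": {
        "deps": ["src/prepare.py", "data/data.xml"],
        "outs": ["data/prepared"],
    },
    "featurize": {
        "deps": ["src/featurization.py", "data/prepared"],
        "outs": ["data/features"],
    },
}


def stage_order(stages):
    """Order stages so every stage runs after the stages producing its inputs."""
    producer = {out: name for name, stage in stages.items() for out in stage["outs"]}
    graph = {
        # Map each stage to the set of stages whose outputs it depends on.
        name: {producer[dep] for dep in stage["deps"] if dep in producer}
        for name, stage in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(stage_order(PIPELINE))
```

featurize depends on data/prepared, which prepare produces, so prepare has to run first.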

Collecting metrics

$ dvc run -n evaluate \
          -d src/evaluate.py -d model.pkl -d data/features \
          -M scores.json \
          --plots-no-cache prc.json \
          --plots-no-cache roc.json \
          python src/evaluate.py model.pkl \
                 data/features scores.json prc.json roc.json

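Conclusions:

Deeply integrated with Git-like systems, with very similar commands, so the cost of learning it looks low
Rich support for storing large data files, from cloud storage to private big-data storage, including Google Drive, Amazon S3, Azure Blob Storage, Google Cloud Storage, Aliyun OSS, SSH, HDFS, and HTTP
Supports many programming languages and deep learning frameworks, such as Python, R, Julia, Scala Spark, custom binary, Notebooks, flatfiles/TensorFlow, PyTorch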
