1. Register
Registration URL: https://huggingface.co/
After registering, your profile page lists the models and datasets you have pushed. For now there are none.

(Screenshot: models and datasets)
2. Generate a token for transfers to and from the Hugging Face Hub
URL: https://huggingface.co/settings/tokens
Click New token. There are two modes: read and write. Write is usually the one to choose, with one token per machine.

(Screenshot: token)
3. Add your SSH key to Hugging Face
Print the public key: cat ~/.ssh/id_rsa.pub
URL: https://huggingface.co/settings/keys

(Screenshot: adding the SSH key)
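If you are not sure which public key files exist on the machine, the lookup in step 3 can be sketched in Python. This is a minimal sketch using only the standard library; the helper name find_public_keys is hypothetical, and it assumes keys live in ~/.ssh with a .pub suffix.

```python
from pathlib import Path


def find_public_keys(ssh_dir=None):
    """Return the *.pub files in ssh_dir (default: ~/.ssh), sorted by name."""
    directory = Path(ssh_dir) if ssh_dir is not None else Path.home() / ".ssh"
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob("*.pub") if p.is_file())


for key_path in find_public_keys():
    # The printed text is what gets pasted into https://huggingface.co/settings/keys
    print(key_path.name, "->", key_path.read_text().strip())
```

Only public keys (the .pub files) are read here; private key files are never opened.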
4. Install git-lfs for transferring large files
URL: https://git-lfs.com/
Install git-lfs on macOS:
brew install git-lfs
git lfs install
Install git-lfs on Ubuntu:
sudo apt install git-lfs
git lfs install
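A quick way to confirm the installation from Python is to check that both executables are on PATH. This is a minimal sketch; the helper name missing_tools is hypothetical, and it assumes the package installs an executable named git-lfs, which is what the git lfs subcommand dispatches to.

```python
import shutil


def missing_tools(tools=("git", "git-lfs")):
    """Return the names from tools that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("git and git-lfs are available")
```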
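5. Log in with the Hugging Face client
In a shell, run:
huggingface-cli login
Enter the token generated in step 2. After that, code running on this machine no longer needs the token to be supplied.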

(Screenshot: login succeeded)
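To see whether a login has already happened on a machine, one can look for the stored token. The sketch below rests on assumptions: recent huggingface_hub releases store the token in a file named token under HF_HOME (default ~/.cache/huggingface) and also honor the HF_TOKEN environment variable; older releases may use a different location. The helper names are hypothetical, and the token value itself is never printed.

```python
import os
from pathlib import Path


def cached_token_path(env=None):
    """Return the assumed token file path: $HF_HOME/token."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(hf_home).expanduser() / "token"


def has_cached_token(env=None):
    """True if HF_TOKEN is set or a non-empty token file exists."""
    env = os.environ if env is None else env
    if env.get("HF_TOKEN"):
        return True
    path = cached_token_path(env)
    return path.is_file() and path.read_text().strip() != ""


print("token available:", has_cached_token())
```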
6. Create a directory for the local hub
mkdir ~/Documents/huggingface_local_hub
cd ~/Documents/huggingface_local_hub
git init
7. Push the local hub to the remote Hub
- output_dir is the local directory where the model is saved
- hub_model_id is the remote namespace and model name
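from transformers import Trainer, TrainingArguments

# model, tokenizer, data_collator, train_set and test_set are assumed to be
# defined earlier; they are not shown in this post.
training_args = TrainingArguments(
    output_dir="~/Documents/huggingface_local_hub/llm/task_qa_distilbert",  # local directory
    hub_model_id="smile367/task_qa_distilbert",  # remote namespace/model name
    evaluation_strategy="epoch",  # named eval_strategy in newer transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    weight_decay=0.01,
    push_to_hub=True,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=test_set,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
# Call trainer.train() first if the model has not been fine-tuned yet.
trainer.push_to_hub()  # upload the model to the Hub under hub_model_id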
8. Check that the push succeeded
The model now appears on the remote Hub.

(Screenshot: push succeeded)
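The page to open for this check follows from hub_model_id, which has the form namespace/model_name. A minimal sketch of splitting the id and building the model page URL is shown below; the helper name model_page_url is hypothetical and it only builds a string, it does not contact the Hub.

```python
def model_page_url(hub_model_id):
    """Split 'namespace/model_name' and return (namespace, name, url)."""
    namespace, sep, name = hub_model_id.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"expected 'namespace/model_name', got {hub_model_id!r}")
    return namespace, name, f"https://huggingface.co/{namespace}/{name}"


print(model_page_url("smile367/task_qa_distilbert")[2])
```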
9. Use the remote model
Pass the remote model name through the model argument, as in pipeline("question-answering", model="smile367/task_qa_distilbert").
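from datasets import load_dataset
from transformers import pipeline

# Take the first example of the SQuAD validation split as a single record.
test_data = load_dataset("squad", split="validation[:1]")[0]
print("question: {}".format(test_data["question"]))
print("context: {}".format(test_data["context"]))
print("answers: {}".format(test_data["answers"]))
# Download the fine-tuned model from the Hub and run it on the example.
question_answerer = pipeline("question-answering", model="smile367/task_qa_distilbert")
result = question_answerer(question=test_data["question"], context=test_data["context"])
print("result: {}".format(result))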
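For a single question, the question-answering pipeline returns a dict with the keys score, start, end and answer, where start and end are character offsets into the context. The sketch below shows one way to read that result; the helper name describe_answer is hypothetical, and the input is a hand-written stand-in, not real model output.

```python
def describe_answer(result, context):
    """Summarize one QA result and check its character span against context."""
    span = context[result["start"]:result["end"]]
    return {
        "answer": result["answer"],
        "score": round(result["score"], 3),
        "span_matches": span == result["answer"],
    }


# Hand-written stand-in for a pipeline result (not real model output).
context = "Hugging Face Hub hosts models and datasets."
fake_result = {"score": 0.9, "start": 0, "end": 16, "answer": "Hugging Face Hub"}
print(describe_answer(fake_result, context))
```

With the real pipeline, the same helper could be called as describe_answer(result, test_data["context"]).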