Building an Android App with MLC-LLM

Download the source code

# Clone only the docs_typo_mlc_chat branch
git clone -b docs_typo_mlc_chat --single-branch https://github.com/mlc-ai/mlc-llm.git
# Enter the mlc-llm project
cd mlc-llm
# Fetch the submodule code
git submodule update --init --recursive
# Enter the MLCChat directory
cd ./android/MLCChat
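
A skipped or interrupted submodule step tends to surface much later as a confusing build error, so it can help to check it right away. The sketch below is only an illustration: `submodule_ready` is a hypothetical helper name, and it assumes that an empty 3rdparty/tvm directory means the submodules were not fetched.

```shell
# submodule_ready DIR: succeed only if DIR exists and contains at least one entry.
# An empty directory is treated as "submodule not fetched" (an assumption).
submodule_ready() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}
```

From the MLCChat directory, `submodule_ready ../../3rdparty/tvm || echo "run git submodule update --init --recursive"` would print the reminder only when the directory is missing or empty.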

Edit environment variables

vim ~/.bashrc  # view or edit the environment variables

export ANDROID_NDK=/home/lenovo/Android/Sdk/ndk/26.1.10909125
export ANDROID_HOME=/home/lenovo/Android/Sdk
export PATH=$PATH:/home/lenovo/Android/Sdk/cmake/3.10.2.4988404/bin
export PATH=$PATH:/home/lenovo/Android/Sdk/platform-tools
export TVM_NDK_CC=$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android23-clang

export JAVA_HOME=/home/lenovo/.jdks/corretto-18.0.2
export PATH=$PATH:$JAVA_HOME/bin

export MLC_LLM_SOURCE_DIR=/sda/xj/t/mlc-llm
export TVM_SOURCE_DIR=/sda/xj/t/mlc-llm/3rdparty/tvm
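
Because every path above is specific to one machine, a quick check that each variable is set and points at something that exists can save time. This is a minimal sketch rather than part of MLC-LLM: `check_env` is a hypothetical helper, and the list of variables is simply the one used in this guide.

```shell
# check_env: report variables from this guide that are unset or whose
# value is not an existing path. Returns non-zero if any check fails.
check_env() {
    rc=0
    for name in ANDROID_NDK ANDROID_HOME TVM_NDK_CC JAVA_HOME MLC_LLM_SOURCE_DIR TVM_SOURCE_DIR; do
        # Indirect lookup: read the value of the variable whose name is in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            rc=1
        elif [ ! -e "$value" ]; then
            echo "not found: $name=$value"
            rc=1
        fi
    done
    return $rc
}
```

Running `check_env` after `source ~/.bashrc` prints one line per problem and nothing when all six paths exist.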

Note: the JDK version must match the one used in Android Studio.

export MLC_LLM_SOURCE_DIR=/sda/xj/mlc-llm-llama3/mlc-llm
export TVM_SOURCE_DIR=/sda/xj/mlc-llm-llama3/mlc-llm/3rdparty/tvm

Convert model weights

Download the MiniCPM-2B-dpo-bf16-llama-format model

Download openbmb/MiniCPM-2B-dpo-bf16-llama-format from the Hugging Face website and place it in the dist/models directory.

convert_weight: convert the weights

# Enter the Android MLCChat root directory of mlc-llm
# (D:\ is a Windows path; on Linux use the matching path, e.g. $MLC_LLM_SOURCE_DIR/android/MLCChat)
cd D:\mlc-llm\android\MLCChat
# Convert the MiniCPM-2B-dpo-bf16-llama-format model
mlc_llm convert_weight ./dist/models/MiniCPM-2B-dpo-bf16-llama-format/ --quantization q4f16_1 \
    -o dist/bundle/MiniCPM-2B-dpo-bf16-llama-format-q4f16_1

Converting the Llama 8B model

mlc_llm convert_weight ./dist/models/Llama-3-8B-Instruct-llama-format/ --quantization q3f16_1 -o dist/bundle/Llama-3-8B-Instruct-llama-format-q3f16_1
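
Both conversions above name the output directory the same way: dist/bundle/ followed by the model directory name and the quantization. The sketch below makes that convention explicit so the model name and the -o path cannot drift apart. `bundle_dir` and `convert_cmd` are hypothetical helpers for illustration; they only print text and do not call mlc_llm.

```shell
# bundle_dir MODEL_DIR QUANT: print dist/bundle/<model name>-<QUANT>.
bundle_dir() {
    # basename drops the directory part and any trailing slash
    printf 'dist/bundle/%s-%s\n' "$(basename "$1")" "$2"
}

# convert_cmd MODEL_DIR QUANT: print (not run) the matching convert_weight command.
convert_cmd() {
    printf 'mlc_llm convert_weight %s --quantization %s -o %s\n' "$1" "$2" "$(bundle_dir "$1" "$2")"
}
```

For example, `convert_cmd ./dist/models/Llama-3-8B-Instruct-llama-format/ q3f16_1` prints the same command as the line above, which can be reviewed before it is run.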

Generate the MLC chat config

mlc_llm gen_config ./dist/models/MiniCPM-2B-dpo-bf16-llama-format/ --quantization q4f16_1 \
    --conv-template redpajama_chat -o dist/bundle/MiniCPM-2B-dpo-bf16-llama-format-q4f16_1/

After the command succeeds, four new files appear in dist/bundle/MiniCPM-2B-dpo-bf16-llama-format-q4f16_1: mlc-chat-config.json, tokenizer.json, tokenizer.model, and tokenizer_config.json.
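
The four files named above can be checked mechanically before moving on to packaging. This is a small sketch under the assumption that exactly those four names are expected; `missing_config_files` is a hypothetical helper, not an mlc_llm command.

```shell
# missing_config_files DIR: print each expected file that is absent from DIR.
# Returns non-zero when at least one file is missing.
missing_config_files() {
    rc=0
    for f in mlc-chat-config.json tokenizer.json tokenizer.model tokenizer_config.json; do
        if [ ! -f "$1/$f" ]; then
            echo "$f"
            rc=1
        fi
    done
    return $rc
}
```

Calling `missing_config_files dist/bundle/MiniCPM-2B-dpo-bf16-llama-format-q4f16_1` prints nothing when gen_config produced all four files.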

mlc_llm gen_config ./dist/models/llama_3.1_0.5_4-30/ --quantization q4f16_1 --conv-template redpajama_chat -o dist/bundle/llama_3.1_0.5_4-30-q4f16_1/

mlc_llm gen_config ./dist/models/llama3_pruned/ --quantization q0f16 \
    --conv-template redpajama_chat -o dist/bundle/llama3_pruned/

mlc_llm convert_weight ./dist/models/llama3_pruned/ --quantization q0f16 \
    -o dist/bundle/llama3_pruned-format-q4f16

mlc_llm gen_config ./dist/models/llama3_pruned/ --quantization q0f16 \
    --conv-template redpajama_chat -o dist/bundle/MiniCPM-2B-dpo-bf16-llama-format-q4f16_1/

Build the Android dependency libraries and JAR

Copy the converted MiniCPM-2B-dpo-bf16-llama-format-q4f16_1 model into the mlc_llm\model_weights\hf\mlc-ai directory (the model_weights directory has to be created by hand). If the model is not found there, it is downloaded from https://huggingface.co/mlc-ai, which is not recommended. The model download configuration is edited in MLCChat/mlc-package-config.json.

mlc_llm package

This generates the following files under the /dist/lib/mlc4j directory: libtvm4j_runtime_packed.so and tvm4j_core.jar.
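
To confirm that packaging produced both artifacts, one option is to search for them by name. The sketch below deliberately searches the whole tree instead of assuming an exact subdirectory, since the text above only states that the files end up under dist/lib/mlc4j. `find_artifact` is a hypothetical helper.

```shell
# find_artifact ROOT NAME: print the first regular file called NAME under ROOT.
# Returns non-zero when nothing is found.
find_artifact() {
    found=$(find "$1" -type f -name "$2" 2>/dev/null | head -n 1)
    [ -n "$found" ] && echo "$found"
}
```

For instance, `find_artifact dist/lib/mlc4j tvm4j_core.jar || echo "tvm4j_core.jar not built"` reports a missing JAR before Android Studio does.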

Build the APK

Open Android Studio (AS) and click Build → Generate Signed Bundle / APK.

If the Gradle distribution is accidentally deleted while starting AS, downloading it again can be very slow. The Tencent mirror in China can be used instead:

https://mirrors.cloud.tencent.com/gradle/gradle-8.5-bin.zip
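
One way to make Gradle use the mirror is to change the distributionUrl entry of the Gradle wrapper properties file. The sketch below assumes the project uses the standard wrapper layout (gradle/wrapper/gradle-wrapper.properties) and that Gradle 8.5 is the version the project expects; `use_gradle_mirror` is a hypothetical helper.

```shell
# use_gradle_mirror FILE URL: point the distributionUrl entry in FILE at URL.
# FILE is expected to be gradle/wrapper/gradle-wrapper.properties.
use_gradle_mirror() {
    # Write to a temporary file first so a failed sed leaves FILE untouched
    sed "s|^distributionUrl=.*|distributionUrl=$2|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

From the MLCChat directory this would be called as `use_gradle_mirror gradle/wrapper/gradle-wrapper.properties https://mirrors.cloud.tencent.com/gradle/gradle-8.5-bin.zip`; all other lines of the file are left unchanged.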

Copy the model to the phone

cd mlc-llm/android/MLCChat
python bundle_weight.py --apk-path app/release/app-release.apk

Here, release refers to the signed release build of the app, which has to be configured and built in AS first. This is done in step 6.

mlc_llm convert_weight ./dist/models/MiniCPM-2B-dpo-bf16-llama-format/ --quantization q4f16_1 \
    -o dist/bundle/MiniCPM-2B-dpo-bf16-llama-format-q4f16_1

python bundle_weight.py --apk-path app/debug/app-debug.apk
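
The release and debug invocations differ only in the build variant that appears twice in the APK path. A tiny sketch of that pattern follows; `apk_path` is a hypothetical helper, and the app/<variant>/app-<variant>.apk layout is taken from the two commands in this section, not from any wider convention.

```shell
# apk_path VARIANT: print the APK path this guide passes to bundle_weight.py.
apk_path() {
    case "$1" in
        debug|release) echo "app/$1/app-$1.apk" ;;
        *) echo "unknown variant: $1" >&2; return 1 ;;
    esac
}
```

With it, `python bundle_weight.py --apk-path "$(apk_path debug)"` is equivalent to the debug command above.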

Miscellaneous

python -m pip install -U mlc-llm-nightly-cu121.whl mlc-ai-nightly-cu121.whl

mlc_llm convert_weight ./dist/models/llama3_pruned/ --quantization q0f16 -o dist/bundle/llama3-pruned-format-q0f16

mlc_llm gen_config ./dist/models/llama3_pruned/ --quantization q0f16 --conv-template redpajama_chat -o dist/bundle/llama3-pruned-format-q0f16/

mlc_llm convert_weight ./dist/models/llama3_pruned/ --quantization q4f16_1 -o dist/bundle/llama3-pruned-format-q4f16_1

mlc_llm gen_config ./dist/models/llama3_pruned/ --quantization q4f16_1 --conv-template redpajama_chat -o dist/bundle/llama3-pruned-format-q4f16_1/

/home/xj/sda/xj/mlc-llm-llama3/mlc-llm/android/MLCChat

Working paths

conda activate mlc-chat-cpm3

project path: /sda/xj/mlc-llm-llama3/mlc-llm/android/MLCChat

Setting the environment:
Note: run these in the /sda/xj/mlc-llm-llama3/mlc-llm directory
export ANDROID_NDK=/home/lenovo/Android/Sdk/ndk/26.1.10909125
export ANDROID_HOME=/home/lenovo/Android/Sdk
export PATH=$PATH:/home/lenovo/Android/Sdk/cmake/3.10.2.4988404/bin
export PATH=$PATH:/home/lenovo/Android/Sdk/platform-tools
export TVM_NDK_CC=$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android23-clang
export JAVA_HOME=/home/lenovo/.jdks/corretto-18.0.2
export PATH=$PATH:$JAVA_HOME/bin
export MLC_LLM_SOURCE_DIR=/sda/xj/mlc-llm-llama3/mlc-llm
export TVM_SOURCE_DIR=/sda/xj/mlc-llm-llama3/mlc-llm/3rdparty/tvm
Note: run mlc_llm package in /sda/xj/mlc-llm-llama3/mlc-llm/android/MLCChat
Note: the chat config generation command is also run in /sda/xj/mlc-llm-llama3/mlc-llm/android/MLCChat
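
Since the relative dist/ paths only resolve correctly from the directory named in these notes, a guard that refuses to continue from anywhere else can prevent output landing in the wrong place. This is a sketch only; `in_dir` is a hypothetical helper.

```shell
# in_dir EXPECTED: succeed only when the current directory is EXPECTED.
# Both sides are resolved with `pwd -P` so symlinks do not cause false mismatches.
in_dir() {
    expected=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    [ "$(pwd -P)" = "$expected" ]
}
```

A typical use would be `in_dir "$MLC_LLM_SOURCE_DIR/android/MLCChat" && mlc_llm package`, which skips packaging when started from a different directory.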

192.168.1.129

export ANDROID_NDK=/home/xj/Android/Sdk/ndk/26.1.10909125
export ANDROID_HOME=/home/xj/Android/Sdk
export PATH=$PATH:/home/xj/Android/Sdk/cmake/3.10.2.4988404/bin
export PATH=$PATH:/home/xj/Android/Sdk/platform-tools
export TVM_NDK_CC=$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android23-clang

export JAVA_HOME=/home/xj/.jdks/corretto-18.0.2
export PATH=$PATH:$JAVA_HOME/bin

export MLC_LLM_SOURCE_DIR=/home/xj/mlc-llm
export TVM_SOURCE_DIR=/home/xj/mlc-llm/3rdparty/tvm

source $HOME/.cargo/env

# Convert the model weights
mlc_llm convert_weight ./dist/models/Qwen1.5-1.8B-Chat/ --quantization q4f16_1 -o dist/models/qwen1.5-1.8b-q4f16_1

# Generate the chat config
mlc_llm gen_config ./dist/models/Qwen1.5-1.8B-Chat/ --quantization q4f16_1 --conv-template redpajama_chat -o dist/models/qwen1.5-1.8b-q4f16_1/



mlc_llm gen_config ./dist/models/Qwen1.5-1.8B-Chat \
    --model-type qwen2 \
    --quantization q4f16_1 \
    --conv-template chatml \
    --context-window-size 2048 \
    --max-batch-size 1 \
    -o dist/models/qwen1.5-1.8b-q4f16_1

Automated build and packaging

Go to the Android project directory to run the build:
cd /sda/xj/mlc-llm-llama3/mlc-llm/android/MLCChat

Build the project with the Gradle Wrapper:
./gradlew build

Package the Debug APK:
./gradlew assembleDebug

Package the Release APK:
./gradlew assembleRelease

Clean the project:
./gradlew clean
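
For repeated builds, the clean and package steps can be chained per variant. The sketch below only prints the commands so they can be inspected first; `build_steps` is a hypothetical helper, and the choice to always clean before assembling is an assumption, not something the steps above require.

```shell
# build_steps VARIANT: print the Gradle commands for a clean build of VARIANT.
build_steps() {
    case "$1" in
        debug)   task=assembleDebug ;;
        release) task=assembleRelease ;;
        *) echo "unknown variant: $1" >&2; return 1 ;;
    esac
    echo "./gradlew clean"
    echo "./gradlew $task"
}
```

Running `build_steps release | sh` from the MLCChat directory would execute the two commands in order.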