ChatGLM-6B Fine-Tuning Code Walkthrough (Complete Edition)

Clone the project from GitHub

In [1]:

# First, git clone the Med-ChatGLM project
!git clone https://github.com/SCIR-HI/Med-ChatGLM.git
Cloning into 'Med-ChatGLM'...
remote: Enumerating objects: 57, done.
remote: Counting objects: 100% (57/57), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 57 (delta 20), reused 32 (delta 9), pack-reused 0
Unpacking objects: 100% (57/57), 809.49 KiB | 112.00 KiB/s, done.

In [2]:

%cd Med-ChatGLM
/home/mw/project/Med-ChatGLM

In [3]:

!ls
chat_dataset.py		  LICENSE	       requirements.txt		wandb
configuration_chatglm.py  model		       run_clm.py
data			  modeling_chatglm.py  scripts
infer.py		  README.md	       tokenization_chatglm.py

Install the project dependencies

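Because of network restrictions, the last line of requirements.txt, git+https://github.com/huggingface/peft.git, must be deleted; otherwise the dependency installation fails. Also change the protobuf line in that file to protobuf==3.18.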
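The two changes to requirements.txt can also be scripted instead of made by hand. The following is a minimal sketch, assuming the file has one line beginning with git+ that points at huggingface/peft and one line beginning with protobuf. The helper names patch_requirements and patch_requirements_file are hypothetical and are not part of the Med-ChatGLM project.

```python
from pathlib import Path


def patch_requirements(text: str) -> str:
    """Return requirements text without the git-based peft line and with protobuf pinned to 3.18."""
    patched = []
    for line in text.splitlines():
        stripped = line.strip()
        # Drop the dependency that is installed straight from GitHub.
        if stripped.startswith("git+") and "huggingface/peft" in stripped:
            continue
        # Replace whatever protobuf constraint is present with the pinned version.
        if stripped.lower().startswith("protobuf"):
            patched.append("protobuf==3.18")
            continue
        patched.append(line)
    return "\n".join(patched) + "\n"


def patch_requirements_file(path: str = "requirements.txt") -> None:
    """Rewrite the file in place (hypothetical helper; the default path is an assumption)."""
    target = Path(path)
    target.write_text(patch_requirements(target.read_text(encoding="utf-8")), encoding="utf-8")
```

Calling patch_requirements_file() from inside the Med-ChatGLM directory before running pip install would apply both changes in one step.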

In [4]:

!pip install -r requirements.txt -i https://mirrors.cloud.tencent.com/pypi/simple
Looking in indexes: https://mirrors.cloud.tencent.com/pypi/simple
Collecting bitsandbytes==0.37.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/ec/18/75dbd7529844c8600944df123160216323982d39d24a30e9f6806279f935/bitsandbytes-0.37.1-py3-none-any.whl (76.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 76.3/76.3 MB 2.2 MB/s eta 0:00:0000:01:00:01
Collecting accelerate==0.17.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/0a/ca/96a50d122bd07d06c66e20e6bd275b5c8829602398f4e141f8755a25e31e/accelerate-0.17.1-py3-none-any.whl (212 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 212.8/212.8 kB 14.8 kB/s eta 0:00:00a 0:00:01
Collecting protobuf==3.18
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/c8/fe/f09a3b764c10a0a79af67daab0fb28ffb940912ba699ea5b1ca44463565c/protobuf-3.18.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 126.2 kB/s eta 0:00:00:0100:01
Collecting transformers==4.27.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/6d/9b/2f536f9e73390209e0b27b74691355dac494b7ec8154f3012fdc6debbae7/transformers-4.27.1-py3-none-any.whl (6.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.7/6.7 MB 1.9 MB/s eta 0:00:0000:0100:010m
Collecting icetk
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/bf/8a/731927e0901273815b779e6ce0e081a95ecf78835ff80be30830505ae06c/icetk-0.0.7-py3-none-any.whl (16 kB)
Collecting cpm_kernels==1.0.11
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/af/84/1831ce6ffa87b8fd4d9673c3595d0fc4e6631c0691eb43f406d3bf89b951/cpm_kernels-1.0.11-py3-none-any.whl (416 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 416.6/416.6 kB 2.7 MB/s eta 0:00:00a 0:00:01
Collecting torch>=1.13.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/e5/9a/ce0fe125f226ffce8deba6a18bd8d0b9f589aa236780a83a6d70b5525f56/torch-2.0.1-cp39-cp39-manylinux1_x86_64.whl (619.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 619.9/619.9 MB 717.0 kB/s eta 0:00:0000:0100:02
Collecting evaluate
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/0f/33/969f10bc16e294747a964662f0067c226b08664e963c12db21beb0fd5df3/evaluate-0.4.0-py3-none-any.whl (81 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.4/81.4 kB 1.2 MB/s eta 0:00:00a 0:00:01
Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.9/site-packages (from -r requirements.txt (line 12)) (1.1.2)
Collecting datasets==2.10.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/fe/17/5825fdf034ff1a315becdbb9b6fe5a2bd9d8e724464535f18809593bf9c2/datasets-2.10.1-py3-none-any.whl (469 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 469.0/469.0 kB 273.1 kB/s eta 0:00:0000:0100:01
Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.9/site-packages (from accelerate==0.17.1->-r requirements.txt (line 3)) (21.3)
Requirement already satisfied: psutil in /opt/conda/lib/python3.9/site-packages (from accelerate==0.17.1->-r requirements.txt (line 3)) (5.9.1)
Requirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.9/site-packages (from accelerate==0.17.1->-r requirements.txt (line 3)) (1.22.4)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.9/site-packages (from accelerate==0.17.1->-r requirements.txt (line 3)) (5.4.1)
Collecting filelock
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/00/45/ec3407adf6f6b5bf867a4462b2b0af27597a26bd3cd6e2534cb6ab029938/filelock-3.12.2-py3-none-any.whl (10 kB)
Collecting regex!=2019.12.17
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/c0/f4/278e305e02245937579a7952b8a3205116b4d2480a3c03fa11e599b773d6/regex-2023.8.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (771 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 771.4/771.4 kB 520.1 kB/s eta 0:00:0000:0100:01
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/d6/27/07a337087dd507170a1b20fed3bbf8da81401185a7130a6e74e440c52040/tokenizers-0.13.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 10.2 MB/s eta 0:00:0000:0100:01
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/7f/c4/adcbe9a696c135578cabcbdd7331332daad4d49b7c43688bc2d36b3a47d2/huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 268.8/268.8 kB 674.9 kB/s eta 0:00:0000:0100:01
Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.9/site-packages (from transformers==4.27.1->-r requirements.txt (line 7)) (4.64.0)
Requirement already satisfied: requests in /opt/conda/lib/python3.9/site-packages (from transformers==4.27.1->-r requirements.txt (line 7)) (2.28.1)
Requirement already satisfied: fsspec[http]>=2021.11.1 in /opt/conda/lib/python3.9/site-packages (from datasets==2.10.1->-r requirements.txt (line 15)) (2022.7.1)
Collecting responses<0.19
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/79/f3/2b3a6dc5986303b3dd1bbbcf482022acb2583c428cd23f0b6d37b1a1a519/responses-0.18.0-py3-none-any.whl (38 kB)
Collecting xxhash
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/45/63/40da996350689cf29db7f8819aafa74c9d36feca4f0e4393d220c619a1dc/xxhash-3.3.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (193 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 193.8/193.8 kB 5.1 MB/s eta 0:00:00a 0:00:01
Requirement already satisfied: dill<0.3.7,>=0.3.0 in /opt/conda/lib/python3.9/site-packages (from datasets==2.10.1->-r requirements.txt (line 15)) (0.3.5.1)
Collecting multiprocess
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/c6/c9/820b5ab056f4ada76fbe05bd481a948f287957d6cbfd59e2dd2618b408c1/multiprocess-0.70.15-py39-none-any.whl (133 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.3/133.3 kB 4.5 MB/s eta 0:00:00
Collecting pyarrow>=6.0.0
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/54/a2/5976df95323c4ca2b7baba31cb7a2a61a17461706043239d38a8e9dc281e/pyarrow-12.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (39.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.0/39.0 MB 2.7 MB/s eta 0:00:0000:0100:01m
Collecting aiohttp
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/5b/8d/821fcb268cfc056964a75da3823896b17eabaa4968a2414121bc93b0c501/aiohttp-3.8.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 460.5 kB/s eta 0:00:00:0100:01
Requirement already satisfied: pandas in /opt/conda/lib/python3.9/site-packages (from datasets==2.10.1->-r requirements.txt (line 15)) (1.4.3)
Collecting sentencepiece
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/6b/22/4157918b2112d47014fb1e79b0dd6d5a141b8d1b049bae695d405150ebaf/sentencepiece-0.1.99-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 187.9 kB/s eta 0:00:00:0100:010m
Requirement already satisfied: torchvision in /opt/conda/lib/python3.9/site-packages (from icetk->-r requirements.txt (line 8)) (0.13.1+cu116)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.1->-r requirements.txt (line 10)) (4.3.0)
Collecting nvidia-cuda-runtime-cu11==11.7.99
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/36/92/89cf558b514125d2ebd8344dd2f0533404b416486ff681d5434a5832a019/nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 849.3/849.3 kB 307.6 kB/s eta 0:00:0000:0100:01
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.1->-r requirements.txt (line 10)) (3.1.2)
Collecting nvidia-cusolver-cu11==11.4.0.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/3e/77/66149e3153b19312fb782ea367f3f950123b93916a45538b573fe373570a/nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 102.6/102.6 MB 2.8 MB/s eta 0:00:0000:0100:01
Collecting nvidia-nvtx-cu11==11.7.91
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/23/d5/09493ff0e64fd77523afbbb075108f27a13790479efe86b9ffb4587671b5/nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.6/98.6 kB 33.5 kB/s eta 0:00:00a 0:00:01
Collecting triton==2.0.0
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/77/ac/28b74ec1177c730d0da8803eaff5e5025bd532bcf07cadb0fcf661abed97/triton-2.0.0-1-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.3/63.3 MB 8.3 MB/s eta 0:00:00:00:0100:01
Collecting nvidia-cuda-cupti-cu11==11.7.101
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/e6/9d/dd0cdcd800e642e3c82ee3b5987c751afd4f3fb9cc2752517f42c3bc6e49/nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 9.2 MB/s eta 0:00:0000:0100:01m
Requirement already satisfied: networkx in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.1->-r requirements.txt (line 10)) (2.8.6)
Collecting nvidia-cusparse-cu11==11.7.4.91
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/ea/6f/6d032cc1bb7db88a989ddce3f4968419a7edeafda362847f42f614b1f845/nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 173.2/173.2 MB 1.7 MB/s eta 0:00:0000:0100:01
Collecting nvidia-cudnn-cu11==8.5.0.96
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/dc/30/66d4347d6e864334da5bb1c7571305e501dcb11b9155971421bb7bb5315f/nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 557.1/557.1 MB 1.5 MB/s eta 0:00:0000:0100:02
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/ef/25/922c5996aada6611b79b53985af7999fc629aee1d5d001b6a22431e18fec/nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.0/21.0 MB 2.8 MB/s eta 0:00:00:00:0100:01
Collecting nvidia-nccl-cu11==2.14.3
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/55/92/914cdb650b6a5d1478f83148597a25e90ea37d739bd563c5096b0e8a5f43/nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 177.1/177.1 MB 1.8 MB/s eta 0:00:0000:0100:01
Collecting nvidia-curand-cu11==10.2.10.91
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/8f/11/af78d54b2420e64a4dd19e704f5bb69dcb5a6a3138b4465d6a48cdf59a21/nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.6/54.6 MB 1.0 MB/s eta 0:00:0000:01m00:01
Collecting nvidia-cufft-cu11==10.9.0.58
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/74/79/b912a77e38e41f15a0581a59f5c3548d1ddfdda3225936fb67c342719e7a/nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.4/168.4 MB 5.0 MB/s eta 0:00:0000:0100:01
Requirement already satisfied: sympy in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.1->-r requirements.txt (line 10)) (1.10.1)
Collecting nvidia-cublas-cu11==11.10.3.66
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/ce/41/fdeb62b5437996e841d83d7d2714ca75b886547ee8017ee2fe6ea409d983/nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 2.0 MB/s eta 0:00:0000:0100:02
Requirement already satisfied: wheel in /opt/conda/lib/python3.9/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.13.1->-r requirements.txt (line 10)) (0.37.1)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.9/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.13.1->-r requirements.txt (line 10)) (65.2.0)
Collecting cmake
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/2e/51/3a4672a819b4532a378bfefad8f886cfe71057556e0d4eefb64523fd370a/cmake-3.27.2-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.1/26.1 MB 4.0 MB/s eta 0:00:00:00:0100:01
Collecting lit
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/bf/fa/0b75c53253ebf3ab566be702a9da16f5783862d8c1ae404c907a8830f283/lit-16.0.6.tar.gz (153 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 153.7/153.7 kB 422.2 kB/s eta 0:00:00a 0:00:01
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: scipy>=1.3.2 in /opt/conda/lib/python3.9/site-packages (from scikit-learn->-r requirements.txt (line 12)) (1.9.0)
Requirement already satisfied: joblib>=1.0.0 in /opt/conda/lib/python3.9/site-packages (from scikit-learn->-r requirements.txt (line 12)) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.9/site-packages (from scikit-learn->-r requirements.txt (line 12)) (3.1.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/lib/python3.9/site-packages (from packaging>=20.0->accelerate==0.17.1->-r requirements.txt (line 3)) (3.0.9)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.9/site-packages (from requests->transformers==4.27.1->-r requirements.txt (line 7)) (2022.6.15)
Requirement already satisfied: charset-normalizer<3,>=2 in /opt/conda/lib/python3.9/site-packages (from requests->transformers==4.27.1->-r requirements.txt (line 7)) (2.1.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/conda/lib/python3.9/site-packages (from requests->transformers==4.27.1->-r requirements.txt (line 7)) (1.26.11)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.9/site-packages (from requests->transformers==4.27.1->-r requirements.txt (line 7)) (3.3)
Collecting frozenlist>=1.1.1
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/b5/03/7dec2e257bd173b5ca1f74477863b97d322149f6f0284d7decead8c5ceeb/frozenlist-1.4.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (228 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 228.0/228.0 kB 532.3 kB/s eta 0:00:00a 0:00:01
Collecting multidict<7.0,>=4.5
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/69/48/2750fd3ace4d778b4e1f7110db3ad637906de3496abc9c450ce726b97337/multidict-6.0.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 114.2/114.2 kB 363.7 kB/s eta 0:00:00a 0:00:01
Collecting aiosignal>=1.1.2
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/76/ac/a7305707cb852b7e16ff80eaf5692309bde30e2b1100a1fcacdc8f731d97/aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Requirement already satisfied: attrs>=17.3.0 in /opt/conda/lib/python3.9/site-packages (from aiohttp->datasets==2.10.1->-r requirements.txt (line 15)) (22.1.0)
Collecting yarl<2.0,>=1.0
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/11/3d/785761e64dc90fda6feb9bd0459dc55ebe282a7d4564642a4a8ee277e0c0/yarl-1.9.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (269 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 269.4/269.4 kB 3.4 MB/s eta 0:00:0000:0100:01
Collecting async-timeout<5.0,>=4.0.0a3
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/a7/fa/e01228c2938de91d47b307831c62ab9e4001e747789d0b05baf779a6488c/async_timeout-4.0.3-py3-none-any.whl (5.7 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.9/site-packages (from jinja2->torch>=1.13.1->-r requirements.txt (line 10)) (2.1.1)
Collecting multiprocess
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/6a/f4/fbeb03ef7abdda54db4a6a75c971b88ab73d724ff09e3275cc1e99f1c946/multiprocess-0.70.14-py39-none-any.whl (132 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 132.9/132.9 kB 3.5 MB/s eta 0:00:00a 0:00:01
Collecting dill<0.3.7,>=0.3.0
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/be/e3/a84bf2e561beed15813080d693b4b27573262433fced9c1d1fea59e60553/dill-0.3.6-py3-none-any.whl (110 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 110.5/110.5 kB 3.3 MB/s eta 0:00:00
Requirement already satisfied: python-dateutil>=2.8.1 in /opt/conda/lib/python3.9/site-packages (from pandas->datasets==2.10.1->-r requirements.txt (line 15)) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.9/site-packages (from pandas->datasets==2.10.1->-r requirements.txt (line 15)) (2022.2.1)
Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.9/site-packages (from sympy->torch>=1.13.1->-r requirements.txt (line 10)) (1.2.1)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /opt/conda/lib/python3.9/site-packages (from torchvision->icetk->-r requirements.txt (line 8)) (9.2.0)
Collecting torchvision
  Downloading https://mirrors.cloud.tencent.com/pypi/packages/41/9e/8809e45a084680394e8d219fcf8a2c0eed2dddf1ec0a7968f4052826a6e9/torchvision-0.15.2-cp39-cp39-manylinux1_x86_64.whl (6.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.0/6.0 MB 4.2 MB/s eta 0:00:0000:0100:010m
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.9/site-packages (from python-dateutil>=2.8.1->pandas->datasets==2.10.1->-r requirements.txt (line 15)) (1.16.0)
WARNING: The candidate selected for download or install is a yanked version: 'protobuf' candidate (version 3.18.0 at https://mirrors.cloud.tencent.com/pypi/packages/c8/fe/f09a3b764c10a0a79af67daab0fb28ffb940912ba699ea5b1ca44463565c/protobuf-3.18.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=42c04e66ec5a38ad2171639dc9860c2f9594668f709ea3a4a192acf7346853a7 (from https://mirrors.cloud.tencent.com/pypi/simple/protobuf/))
Reason for being yanked: This version claims to support Python 2 but does not
Building wheels for collected packages: lit
  Building wheel for lit (pyproject.toml) ... done
  Created wheel for lit: filename=lit-16.0.6-py3-none-any.whl size=93583 sha256=6e27613af9f5623d1ac01d1782ea3bfb20ac97d96442c892f97363de91ac0b70
  Stored in directory: /home/mw/.cache/pip/wheels/bc/3b/78/d7bdc80444196ff9edfbbcad27d8a3164842c0fc456be2010e
Successfully built lit
Installing collected packages: tokenizers, sentencepiece, protobuf, lit, cpm_kernels, cmake, bitsandbytes, xxhash, regex, pyarrow, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, multidict, frozenlist, filelock, dill, async-timeout, yarl, responses, nvidia-cusolver-cu11, nvidia-cudnn-cu11, multiprocess, huggingface-hub, aiosignal, transformers, aiohttp, datasets, evaluate, triton, torch, torchvision, icetk, accelerate
  Attempting uninstall: protobuf
    Found existing installation: protobuf 3.20.1
    Uninstalling protobuf-3.20.1:
      Successfully uninstalled protobuf-3.20.1
  Attempting uninstall: dill
    Found existing installation: dill 0.3.5.1
    Uninstalling dill-0.3.5.1:
      Successfully uninstalled dill-0.3.5.1
  Attempting uninstall: torch
    Found existing installation: torch 1.12.1+cu116
    Uninstalling torch-1.12.1+cu116:
      Successfully uninstalled torch-1.12.1+cu116
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.13.1+cu116
    Uninstalling torchvision-0.13.1+cu116:
      Successfully uninstalled torchvision-0.13.1+cu116
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 0.12.1+cu116 requires torch==1.12.1, but you have torch 2.0.1 which is incompatible.
Successfully installed accelerate-0.17.1 aiohttp-3.8.5 aiosignal-1.3.1 async-timeout-4.0.3 bitsandbytes-0.37.1 cmake-3.27.2 cpm_kernels-1.0.11 datasets-2.10.1 dill-0.3.6 evaluate-0.4.0 filelock-3.12.2 frozenlist-1.4.0 huggingface-hub-0.16.4 icetk-0.0.7 lit-16.0.6 multidict-6.0.4 multiprocess-0.70.14 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 protobuf-3.18.0 pyarrow-12.0.1 regex-2023.8.8 responses-0.18.0 sentencepiece-0.1.99 tokenizers-0.13.3 torch-2.0.1 torchvision-0.15.2 transformers-4.27.1 triton-2.0.0 xxhash-3.3.0 yarl-1.9.2

Install the peft library

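Go to https://github.com/huggingface/peft.git, download the source archive, unzip it, change into the extracted directory, and install the library with pip install. Note that the command pip install peft, as run below, fetches the published peft wheel from the package index; to install from the extracted source instead, run pip install . inside that directory.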

In [5]:

cd /home/mw/project/peft-main
/home/mw/project/peft-main

In [6]:

!pip install peft
Collecting peft
  Downloading peft-0.4.0-py3-none-any.whl (72 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.9/72.9 kB 84.8 kB/s eta 0:00:00a 0:00:01
Requirement already satisfied: torch>=1.13.0 in /opt/conda/lib/python3.9/site-packages (from peft) (2.0.1)
Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.9/site-packages (from peft) (21.3)
Requirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.9/site-packages (from peft) (1.22.4)
Collecting safetensors
  Downloading safetensors-0.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 24.6 kB/s eta 0:00:0000:0100:02m
Requirement already satisfied: psutil in /opt/conda/lib/python3.9/site-packages (from peft) (5.9.1)
Requirement already satisfied: transformers in /opt/conda/lib/python3.9/site-packages (from peft) (4.27.1)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.9/site-packages (from peft) (5.4.1)
Requirement already satisfied: accelerate in /opt/conda/lib/python3.9/site-packages (from peft) (0.17.1)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/lib/python3.9/site-packages (from packaging>=20.0->peft) (3.0.9)
Requirement already satisfied: networkx in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (2.8.6)
Requirement already satisfied: filelock in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (3.12.2)
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (3.1.2)
Requirement already satisfied: sympy in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (1.10.1)
Requirement already satisfied: nvidia-cuda-nvrtc-cu11==11.7.99 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.7.99)
Requirement already satisfied: nvidia-cuda-cupti-cu11==11.7.101 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.7.101)
Requirement already satisfied: nvidia-cufft-cu11==10.9.0.58 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (10.9.0.58)
Requirement already satisfied: nvidia-nccl-cu11==2.14.3 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (2.14.3)
Requirement already satisfied: nvidia-cusparse-cu11==11.7.4.91 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.7.4.91)
Requirement already satisfied: nvidia-nvtx-cu11==11.7.91 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.7.91)
Requirement already satisfied: triton==2.0.0 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (2.0.0)
Requirement already satisfied: nvidia-cusolver-cu11==11.4.0.1 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.4.0.1)
Requirement already satisfied: nvidia-curand-cu11==10.2.10.91 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (10.2.10.91)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (4.3.0)
Requirement already satisfied: nvidia-cublas-cu11==11.10.3.66 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.10.3.66)
Requirement already satisfied: nvidia-cuda-runtime-cu11==11.7.99 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (11.7.99)
Requirement already satisfied: nvidia-cudnn-cu11==8.5.0.96 in /opt/conda/lib/python3.9/site-packages (from torch>=1.13.0->peft) (8.5.0.96)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.9/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.13.0->peft) (65.2.0)
Requirement already satisfied: wheel in /opt/conda/lib/python3.9/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch>=1.13.0->peft) (0.37.1)
Requirement already satisfied: cmake in /opt/conda/lib/python3.9/site-packages (from triton==2.0.0->torch>=1.13.0->peft) (3.27.2)
Requirement already satisfied: lit in /opt/conda/lib/python3.9/site-packages (from triton==2.0.0->torch>=1.13.0->peft) (16.0.6)
Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /opt/conda/lib/python3.9/site-packages (from transformers->peft) (0.13.3)
Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.9/site-packages (from transformers->peft) (4.64.0)
Requirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.9/site-packages (from transformers->peft) (2023.8.8)
Requirement already satisfied: huggingface-hub<1.0,>=0.11.0 in /opt/conda/lib/python3.9/site-packages (from transformers->peft) (0.16.4)
Requirement already satisfied: requests in /opt/conda/lib/python3.9/site-packages (from transformers->peft) (2.28.1)
Requirement already satisfied: fsspec in /opt/conda/lib/python3.9/site-packages (from huggingface-hub<1.0,>=0.11.0->transformers->peft) (2022.7.1)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.9/site-packages (from jinja2->torch>=1.13.0->peft) (2.1.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.9/site-packages (from requests->transformers->peft) (2022.6.15)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.9/site-packages (from requests->transformers->peft) (3.3)
Requirement already satisfied: charset-normalizer<3,>=2 in /opt/conda/lib/python3.9/site-packages (from requests->transformers->peft) (2.1.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/conda/lib/python3.9/site-packages (from requests->transformers->peft) (1.26.11)
Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.9/site-packages (from sympy->torch>=1.13.0->peft) (1.2.1)
Installing collected packages: safetensors, peft
Successfully installed peft-0.4.0 safetensors-0.3.2

In [1]:

# Enter the medical LLM project folder
%cd /home/mw/project/Med-ChatGLM
/home/mw/project/Med-ChatGLM

Note

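The model files for this project are hosted on Google Drive and Baidu Netdisk, so the pretrained model has been placed in a dataset instead. This means the project's infer.py must be modified to read from the mounted data path. Search the datasets for 医疗模型大数据集 (medical model large dataset), mount it, copy its path, and replace the path passed to from_pretrained in infer.py. Make sure the model files are actually present under that path.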

Inspect the model file directory

In [9]:

!ls /home/mw/input/model7596/chatglm-6b-med/chatglm-6b-med
config.json			  rng_state.pth
configuration_chatglm.py	  scheduler.pt
generation_config.json		  special_tokens_map.json
ice_text.model			  tokenization_chatglm.py
pytorch_model-00001-of-00002.bin  tokenizer_config.json
pytorch_model-00002-of-00002.bin  trainer_state.json
pytorch_model.bin.index.json	  training_args.bin
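
Before pointing from_pretrained at the mounted path, it can help to confirm that the files listed above are really there. The following is a minimal sketch, assuming the checkpoint uses the sharded layout shown in the listing, where pytorch_model.bin.index.json holds a weight_map from parameter names to shard file names. The function name missing_model_files and the choice of required files are assumptions made for illustration.

```python
import json
from pathlib import Path


def missing_model_files(model_dir: str) -> list:
    """Return the sorted names of expected files that are absent from model_dir."""
    root = Path(model_dir)
    # Files assumed to be needed for loading, taken from the directory listing.
    required = {"config.json", "tokenizer_config.json", "ice_text.model"}
    index_path = root / "pytorch_model.bin.index.json"
    if index_path.is_file():
        # The index maps each parameter name to the shard file that stores it.
        index = json.loads(index_path.read_text(encoding="utf-8"))
        required.update(index.get("weight_map", {}).values())
    else:
        required.add(index_path.name)
    return sorted(name for name in required if not (root / name).is_file())
```

An empty list from missing_model_files("/home/mw/input/model7596/chatglm-6b-med/chatglm-6b-med") would suggest the mount is complete; any names returned are files to look for before loading the model.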

Write the inference code

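Calling input() for user input hangs in a notebook environment, so instead of running infer.py directly we can load the model files the same way infer.py does, as shown below, and pass in our own questions through a function.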

In [3]:

!python infer.py
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████████████| 2/2 [00:28<00:00, 14.40s/it]
请输入您的问题:(输入q以退出)^C
Traceback (most recent call last):
  File "/home/mw/project/Med-ChatGLM/infer.py", line 9, in <module>
    a = input("请输入您的问题:(输入q以退出)")
KeyboardInterrupt

In [2]:

import torch
from transformers import AutoTokenizer, AutoModel
from modeling_chatglm import ChatGLMForConditionalGeneration

# Path of the mounted dataset that holds the fine-tuned model files
MODEL_DIR = "/home/mw/input/model7596/chatglm-6b-med/chatglm-6b-med"

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, trust_remote_code=True)
# Load the weights in half precision and move the model to the GPU
model = ChatGLMForConditionalGeneration.from_pretrained(MODEL_DIR).half().cuda()


def medica_answer(text):
    # Wrap the question in the "问题:...\n答案:" (question/answer) prompt format
    prompt = "问题:" + text.strip() + "\n答案:"
    response, history = model.chat(tokenizer, prompt, max_length=256, history=[])
    return response
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
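
The prompt format used by medica_answer can be separated from the model call, which makes it easy to check the exact string sent to the model and to run several questions in one go. The following is a minimal sketch; build_prompt and answer_all are hypothetical helper names, and answer_all accepts any callable that maps a question to an answer, so it does not depend on the model being loaded.

```python
def build_prompt(question: str) -> str:
    """Build the question/answer prompt string in the format medica_answer uses."""
    return "问题:" + question.strip() + "\n答案:"


def answer_all(questions, answer_fn):
    """Apply answer_fn to each question and return (question, answer) pairs in order."""
    return [(question, answer_fn(question)) for question in questions]
```

With the model loaded, answer_all(["我头疼怎么办"], medica_answer) would return a one-element list pairing that question with the model's reply.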

Trying out the model

In [3]:

# English: "What should I do about my headache?"
medica_answer("我头疼怎么办")
/opt/conda/lib/python3.9/site-packages/transformers/generation/utils.py:1201: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
  warnings.warn(

Out[3]:

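'对于头痛的治疗,建议采取多种治疗方法,如口服药物、物理治疗等,同时要注意避免过度疲劳和压力,保持好的生活习惯和饮食习惯。'
(English: "For headaches, a combination of treatments is recommended, such as oral medication and physical therapy. Also take care to avoid excessive fatigue and stress, and keep good lifestyle and eating habits.")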

In [4]:

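# English: "Xiao Li has recently developed tachycardia with mild chest pain. A physical exam found a prolonged P-R interval with flattened T waves and ST-segment abnormalities."
medica_answer("小李最近出现了心动过速的症状,伴有轻度胸痛。体检发现P-R间期延长,伴有T波低平和ST段异常")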

Out[4]:

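'小李可能患有原发性心动过速,需要进一步检查以明确诊断。治疗方案为苯妥英钠、地西泮、阿托品等。'
(English: "Xiao Li may have primary tachycardia; further tests are needed to confirm the diagnosis. The treatment plan is phenytoin sodium, diazepam, atropine, and so on.")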

In [6]:

# English: "What effects does drug use have on the body?"
medica_answer("吸毒会给身体带来什么影响")

Out[6]:

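'吸毒会对身体健康造成严重影响,可能导致营养不良、体重减轻、精神异常、营养不良、感染等。同时,吸毒对心理健康也有一定影响,可能导致焦虑、抑郁、精神分裂等问题。建议避免使用毒品,保持身体健康。'
(English: "Drug use seriously harms physical health and may lead to malnutrition, weight loss, mental abnormalities, malnutrition, infection, and so on. It also affects mental health and may cause anxiety, depression, schizophrenia, and other problems. It is advisable to avoid drugs and stay healthy.")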

In [7]:

# English: "How can I reduce leg soreness and swelling after running?"
medica_answer("怎么减少跑步后的腿部酸胀")

Out[7]:

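'建议进行适当的运动和休息,并注意加强营养和饮食调理。'
(English: "Appropriate exercise and rest are recommended, along with attention to better nutrition and diet.")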

In [8]:

# English: "Can gamma-glutamyl transpeptidase levels rise because of drug use or drinking?"
medica_answer("谷氨酰转肽酶水平会因吸毒或饮酒而升高吗?")

Out[8]:

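'可能会。吸毒和饮酒可能导致肝损伤、肝损伤等并发症,因此可能会影响谷氨酰转肽酶水平。'
(English: "Possibly. Drug use and drinking may cause liver damage, liver damage, and other complications, so they may affect gamma-glutamyl transpeptidase levels.")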

Additional notes

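Besides the dependency installation issues described earlier, there is one more thing to watch for.
Because of version differences, running the medica_answer code above on a fresh git clone of Med-ChatGLM may raise ValueError: 130001 is not in list. In that case, roll the repository back to commit cb9d827, available at https://github.com/SCIR-HI/Med-ChatGLM/tree/cb9d82738021ec6f82b307d6031e8595a49dcb00. Download that version, upload the folder, delete the original folder, and rename the new folder to Med-ChatGLM. After renaming, restart the kernel, enter the folder again, and the medica_answer code above should run without this error.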
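
If the repository was obtained with git clone and its history includes that commit, the rollback can also be done in place rather than by downloading and re-uploading a folder. The following is a minimal sketch under that assumption. checkout_command and pin_repo are hypothetical helper names, and the commit hash is the one from the link above.

```python
import subprocess

# Full hash of commit cb9d827, taken from the repository link in the note.
PINNED_COMMIT = "cb9d82738021ec6f82b307d6031e8595a49dcb00"


def checkout_command(repo_dir: str, commit: str = PINNED_COMMIT) -> list:
    """Build the git command that checks out the given commit inside repo_dir."""
    return ["git", "-C", repo_dir, "checkout", commit]


def pin_repo(repo_dir: str, commit: str = PINNED_COMMIT) -> None:
    """Run the checkout; raises CalledProcessError if git reports a failure."""
    subprocess.run(checkout_command(repo_dir, commit), check=True)
```

After pin_repo("/home/mw/project/Med-ChatGLM") succeeds, the kernel still needs to be restarted so that the older modeling_chatglm.py is the one that gets imported.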