
MindSpore / mindformers


llama3-8b LoRA fine-tuning fails with json.decoder.JSONDecodeError

TODO
Bug-Report
Created on 2024-05-22 17:34

Image

mindformers_v1.1.0-mindspore_2.3.0rc2-cann_8.0rc1-py_3.9-ubuntu_18.04-aarch64-d910

Run
cd mindformers/research

Default quick start on a single node with 8 cards

bash ../scripts/msrun_launcher.sh \
"llama3/run_finetune.py \
--config llama3/run_llama3_8b_8k_800T_A2_64G.yaml \
--load_checkpoint /home/ma-user/Models/Llama3-8B-Chinese-Chat.ckpt \
--auto_trans_ckpt True \
--use_parallel True \
--run_mode finetune \
--train_data /home/ma-user/Projects/llama3/data"

The error output is as follows:
No parameter is entered. Notice that the program will run on default 8 cards.
../scripts/msrun_launcher.sh: line 119: ulimit: max user processes: cannot modify limit: Operation not permitted
Running Command: msrun --worker_num=8 --local_worker_num=8 --master_port=8118 --log_dir=output/msrun_log --join=False --cluster_time_out=600 llama3/run_finetune.py --config llama3/run_llama3_8b_8k_800T_A2_64G.yaml --load_checkpoint /home/ma-user/Models/Llama3-8B-Chinese-Chat.ckpt --auto_trans_ckpt True --use_parallel True --run_mode finetune --train_data /home/ma-user/Projects/llama3/data

Please check log files in output/msrun_log

/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
return self._float_to_str(self.smallest_subnormal)
/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
setattr(self, word, getattr(machar, word).flat[0])
/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
return self._float_to_str(self.smallest_subnormal)
/bin/sh: 1: ip: not found

Traceback (most recent call last):
File "/home/ma-user/miniconda3/envs/llama/bin/msrun", line 8, in <module>
sys.exit(main())
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/mindspore/parallel/cluster/run.py", line 136, in main
run(args)
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/mindspore/parallel/cluster/run.py", line 129, in run
process_manager = _ProcessManager(args)
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/mindspore/parallel/cluster/process_entity/_api.py", line 104, in __init__
self.is_master = _is_local_ip(args.master_addr)
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/site-packages/mindspore/parallel/cluster/process_entity/_utils.py", line 78, in _is_local_ip
addr_infos = json.loads(addr_info_str)
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/ma-user/miniconda3/envs/llama/lib/python3.9/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
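
The final error message can be reproduced in isolation. The log prints `/bin/sh: 1: ip: not found` just before the traceback, so a plausible reading (an assumption, not confirmed from the MindSpore source here) is that `_is_local_ip` parses the output of an `ip` command, and that output was empty because the command is missing. The minimal sketch below only shows that `json.loads` on an empty string produces exactly this message; `describe_failure` is a hypothetical helper name used for illustration.

```python
import json


def describe_failure(addr_info_str: str) -> str:
    """Return the JSON error message for the given string, or "ok" if it parses."""
    try:
        # Same kind of call as the failing frame in the traceback.
        json.loads(addr_info_str)
    except json.JSONDecodeError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    # Empty output, as would be captured when the shell cannot find the command.
    print(describe_failure(""))
    # Valid JSON parses without error.
    print(describe_failure("[]"))
```

An empty input yields `Expecting value: line 1 column 1 (char 0)`, matching the last line of the traceback, which suggests the JSON error is a symptom of the missing command rather than a problem with the model config or checkpoint.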

Comments (4)

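Do you have an image for this environment? On the ModelArts cloud platform I started from the base image mindspore_2.1.0-cann_6.3.2-py_3.7-euler_2.8.3-aarch64-d910 and installed ascend_toolkit following the MindSpore tutorial, but MindSpore keeps reporting errors.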

Install the iproute package.
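
To confirm that the suggestion above applies before launching msrun, a quick pre-flight check can look for the `ip` command and verify that it emits parseable JSON. This is a rough sketch under assumptions: `check_ip_command` is a hypothetical helper, the `ip -j addr` invocation is assumed to resemble what msrun relies on, and on Ubuntu the package that provides `ip` is usually named `iproute2`.

```python
import json
import shutil
import subprocess


def check_ip_command(command: str = "ip"):
    """Return (ok, message) describing whether `command -j addr` yields JSON."""
    path = shutil.which(command)
    if path is None:
        # This is the situation reported in the log: "/bin/sh: 1: ip: not found".
        return False, f"'{command}' not found on PATH; install iproute (iproute2 on Ubuntu)"
    # -j asks iproute2 for JSON output; assumed to match what msrun expects.
    result = subprocess.run([path, "-j", "addr"],
                            capture_output=True, text=True, check=False)
    try:
        json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        return False, f"'{command} -j addr' did not return valid JSON: {exc}"
    return True, f"'{command}' found at {path} and returned valid JSON"


if __name__ == "__main__":
    ok, message = check_ip_command()
    print("OK" if ok else "FAIL", "-", message)
```

If the check reports that `ip` is missing, installing the package (which needs root, as noted in the reply below) should let msrun get past `_is_local_ip`.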

Is there another way to solve this if I don't have sudo privileges?

Not at the moment; installing it requires root privileges.
