add export model check #5488

Open · wants to merge 9 commits into base: release/2.3
tutorials/tipc/export_model_infer/README.md (125 additions, 0 deletions)
# Linux GPU/CPU PYTHON Dynamic-to-Static Precision Test

The main program for the dynamic-to-static precision test is `test_export_shell.sh`, which tests both the model's dynamic-to-static export and the correctness of the inference results.

<a name="1"></a>
## 1. Development Notes

This document describes the development workflow for dynamic-to-static precision verification. Copy the script into the target repo; once adapted, simply run `test_export_shell.sh`.

Running the script performs the following three steps:

1. Create a check_inference.py file under the model repo path, used to compare dygraph prediction accuracy against paddle.inference accuracy.
2. Rewrite export_model.py in the model repo so that, after the dynamic-to-static export, the model path, input data shape, and model object are passed to check_inference.py. **Note:** the names involved differ between suites and may need to be adapted to the suite's code.
3. In the train_infer pipeline, automatically upgrade the export_model command to check inference consistency (default comparison threshold: 1e-4).
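The consistency check in step 3 boils down to an `np.allclose` comparison at the 1e-4 threshold; a minimal standalone sketch (the arrays are made up and independent of any Paddle model):

```python
import numpy as np

def outputs_consistent(dygraph_out, infer_out, tol=1e-4):
    # mirror the default 1e-4 comparison threshold used by the generated check
    return bool(np.allclose(dygraph_out, infer_out, rtol=tol, atol=tol))

dy = np.array([0.10, 0.20, 0.30], dtype=np.float32)
assert outputs_consistent(dy, dy + 5e-5)      # within the 1e-4 tolerance
assert not outputs_consistent(dy, dy + 1e-2)  # clearly diverged
```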

<details>
<summary><b> Script implementation details (click to expand or collapse)</b></summary>

1. Create the check_inference.py file

The template below is written into the script and tests the precision error whenever the model's dynamic-to-static export is launched:

```
def verify_paddle_inference_correctness(layer, path):
    from paddle import inference
    import numpy as np
    model_file_path = path + ".pdmodel"
    params_file_path = path + ".pdiparams"
    config = inference.Config(model_file_path, params_file_path)
    predictor = inference.create_predictor(config)
    input_names = predictor.get_input_names()
    # get_input_shape is defined in the generated check_inference.py;
    # "xxxxxx" is replaced with the input_shape string from the TIPC config.
    input_data = get_input_shape("xxxxxx")
    dygraph_input = {}
    if input_names == ["im_shape", "image", "scale_factor"]:
        input_names = ["image", "im_shape", "scale_factor"]
    for i, name in enumerate(input_names):
        input_tensor = predictor.get_input_handle(name)
        fake_input = input_data[i]
        input_tensor.copy_from_cpu(fake_input)
        dygraph_input[name] = paddle.to_tensor(fake_input)
    predictor.run()
    output_names = predictor.get_output_names()
    output_tensors = []
    for output_name in output_names:
        output_tensor = predictor.get_output_handle(output_name)
        output_tensors.append(output_tensor)
    prob_out = output_tensors[0].copy_to_cpu()
    layer.eval()
    pred = layer(dygraph_input)
    pred = list(pred.values())[0] if isinstance(pred, dict) else pred
    correct = np.allclose(pred.numpy(), prob_out, rtol=1e-4, atol=1e-4)
    absolute_diff = np.abs(pred.numpy() - prob_out)
    max_absolute_diff = np.max(absolute_diff)
    # print("max_absolute_diff:", max_absolute_diff)
    assert correct, "Result diff when load and inference:\nlayer max_absolute_diff:{}".format(max_absolute_diff)
    print("Successful, dygraph and inference predictions are consistent.")
```
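The `get_input_shape("xxxxxx")` call relies on the script substituting the placeholder with the input_shape string parsed from the TIPC config. The entry format the generated parser expects can be illustrated in isolation (a sketch mirroring the template's string handling; the `{float32,[3,224,224]}` entry is a made-up example):

```python
def parse_shape_entry(item):
    # one '{dtype,[d1,d2,...]}' entry, as handled by get_input_shape
    body = item.strip("][").strip("{}")         # 'float32,[3,224,224]'
    dtype, dims = body.strip("][").split(",[")  # 'float32', '3,224,224'
    shape = [-1] + [int(d) for d in dims.split(",")]  # -1 marks the batch dim
    return dtype, shape

dtype, shape = parse_shape_entry("{float32,[3,224,224]}")
assert dtype == "float32" and shape == [-1, 3, 224, 224]
```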

2. Locate the file that performs the dynamic-to-static export

In lines 133-175 of the script, find the branch for the corresponding suite and set:
- export_file: the real path of the file that performs the model-save operation
- layer: the model variable name
- path: the model save path

Taking the PaddleClas repo as an example, the model export command is `python3.7 tools/export_model.py -c xxx`. Starting from tools/export_model.py, the file that actually executes `jit.save` is `ppcls/engine/engine.py`, with the relevant code at line 410:

```
model = paddle.jit.to_static(
    model,
    input_spec=[
        paddle.static.InputSpec(
            shape=[None] + self.config["Global"]["image_shape"],
            dtype='float32')
    ])
paddle.jit.save(model, save_path)
```

The `paddle.jit.save` call saves the model object `model` under the path `save_path`, with inference input shape `[None] + self.config["Global"]["image_shape"]`. From this, assign the following variables in `test_export_shell.sh`:

```
export_file=${root_path}/ppcls/engine/engine.py
# define layer path and img_shape
layer="model"
path="save_path"
```

After running the `test_export_shell.sh` script, the following lines are appended in `ppcls/engine/engine.py`, so the precision is tested automatically on every dynamic-to-static export:

```
model = paddle.jit.to_static(
    model,
    input_spec=[
        paddle.static.InputSpec(
            shape=[None] + self.config["Global"]["image_shape"],
            dtype='float32')
    ])
paddle.jit.save(model, save_path)
# the following lines are added automatically
from check_inference import verify_paddle_inference_correctness
layer = model
path = save_path
verify_paddle_inference_correctness(layer, path)
```

</details>
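The inserted lines must match the indentation at the `jit.save` call site; the shell script builds a space prefix from a tab count before writing tmp_file.txt. The same prefixing can be sketched in Python (the function name and two-space width are illustrative assumptions):

```python
def build_check_lines(layer_expr, path_expr, indent_levels, width=2):
    # hypothetical helper: prefix each injected line with the call-site indent
    tab = " " * (indent_levels * width)
    return [
        tab + "from check_inference import verify_paddle_inference_correctness",
        tab + f"layer = {layer_expr}",
        tab + f"path = {path_expr}",
        tab + "verify_paddle_inference_correctness(layer, path)",
    ]

lines = build_check_lines("model", "save_path", indent_levels=2)
assert all(line.startswith("    ") for line in lines)  # 2 levels * 2 spaces
```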



## 2. Testing Notes

Testing requires completing two tasks:

1. Generate the check_inference.py file from the config, taking MobileNetV3 as an example:

```
bash test_export_shell.sh test_tipc/config/MobileNetV3/MobileNetV3_large_x0_5_train_infer_python.txt
```

2. Run the "lite_train_lite_infer" mode of the basic TIPC pipeline as usual:

```
bash test_train_inference_python.sh test_tipc/config/MobileNetV3/MobileNetV3_large_x0_5_train_infer_python.txt "lite_train_lite_infer"
```
If the check passes, the log prints "Successful, dygraph and inference predictions are consistent."; if not, verify_paddle_inference_correctness raises an AssertionError reporting the maximum absolute difference between the dygraph and inference outputs.

tutorials/tipc/export_model_infer/test_export_shell.sh (198 additions, 0 deletions)
source test_tipc/common_func.sh
source test_tipc/utils_func.sh

echo "-------- Test inference result --------- "

FILENAME=$1
dataline=$(awk 'NR==52, NR==53{print}' $FILENAME)
IFS=$'\n'
lines=(${dataline})
input_shape=$(func_parser_value "${lines[1]}")

function add_check_function(){
jit_replace='import numpy as np
import paddle

def getdtype(dtype="float32"):
    if dtype == "float32" or dtype == "float":
        return np.float32
    if dtype == "float64":
        return np.float64
    if dtype == "int32":
        return np.int32
    if dtype == "int64":
        return np.int64

def randtool(dtype, low, high, shape):
    """
    np random tools
    """
    data = None
    if dtype.count("int"):
        data = np.random.randint(low, high, shape)
    elif dtype.count("float"):
        data = low + (high - low) * np.random.random(shape)
    elif dtype.count("bool"):
        data = np.random.randint(low, high, shape)
    return data.astype(getdtype(dtype))


def get_input_shape(data):
    config = {}
    data = data.split(";")
    shape_dict = {}
    for item in data:
        shape_list = item.strip("][").strip("{}").split("},{")
        for i, shape in enumerate(shape_list):
            if str(i) not in shape_dict:
                shape_dict[str(i)] = {}
                shape_dict[str(i)]["dtype"] = []
                shape_dict[str(i)]["shape"] = []
            arr = shape.strip("][").split(",[")
            dtype, shape = arr[0], list(map(int, arr[1].split(",")))
            shape.insert(0, -1)
            shape_dict[str(i)]["dtype"].append(dtype)
            shape_dict[str(i)]["shape"].append(shape)
    config["input_shape"] = shape_dict

    input_data = []
    for i, val in enumerate(config["input_shape"]):
        input_shape = config["input_shape"][val]
        shape = [1] + input_shape["shape"][0][1:]
        dtype = input_shape["dtype"][0]
        data = randtool(dtype, -1, 1, shape)
        input_data.append(data)
    return input_data

def verify_paddle_inference_correctness(layer, path):
    from paddle import inference
    import numpy as np

    model_file_path = path + ".pdmodel"
    params_file_path = path + ".pdiparams"
    config = inference.Config(model_file_path, params_file_path)
    predictor = inference.create_predictor(config)
    input_names = predictor.get_input_names()
    input_data = get_input_shape("xxxxxx")
    dygraph_input = {}
    if input_names == ["im_shape", "image", "scale_factor"]:
        input_names = ["image", "im_shape", "scale_factor"]
    for i, name in enumerate(input_names):
        input_tensor = predictor.get_input_handle(name)
        fake_input = input_data[i]
        input_tensor.copy_from_cpu(fake_input)
        dygraph_input[name] = paddle.to_tensor(fake_input)
    predictor.run()
    output_names = predictor.get_output_names()
    output_tensors = []
    for output_name in output_names:
        output_tensor = predictor.get_output_handle(output_name)
        output_tensors.append(output_tensor)
    prob_out = output_tensors[0].copy_to_cpu()

    layer.eval()
    pred = layer(dygraph_input)
    pred = list(pred.values())[0] if isinstance(pred, dict) else pred
    correct = np.allclose(pred.numpy(), prob_out, rtol=1e-4, atol=1e-4)
    absolute_diff = np.abs(pred.numpy() - prob_out)
    max_absolute_diff = np.max(absolute_diff)
    # print("max_absolute_diff:", max_absolute_diff)
    assert correct, "Result diff when load and inference:\nlayer max_absolute_diff:{}".format(max_absolute_diff)
    print("Successful, dygraph and inference predictions are consistent.")'

echo "${jit_replace//xxxxxx/$input_shape}" > check_inference.py
}

add_check_function $FILENAME

function fun_run_check(){
tab_num=$1
dy_model=$2
model_path=$3

space=`expr $tab_num \* 2`
tab=$(printf "%-${space}s" " ")
line0="${tab}from check_inference import verify_paddle_inference_correctness"
line1="${tab}layer = ${dy_model}"
line2="${tab}path = ${model_path}"
line4="${tab}verify_paddle_inference_correctness(layer, path)"
echo "${line0}
${line1}
${line2}
${line4}" > tmp_file.txt
}

root_path=.
model_type=$PWD

echo $model_type

echo "-------- Add check infer code in jit.save ------"

if [[ $model_type =~ "PaddleClas" ]]; then
echo "PaddleClas"
# get export file
export_file=${root_path}/ppcls/engine/engine.py
# define layer path and img_shape
layer="model"
path="save_path"
elif [[ $model_type =~ "PaddleOCR" ]]; then
echo "PaddleOCR"
# get export file
export_file=${root_path}/tools/export_model.py
# define layer path and img_shape
layer="model"
path="save_path"
elif [[ $model_type =~ "PaddleDetection" ]]; then
echo "PaddleDetection"
# get export file
export_file=${root_path}/ppdet/engine/trainer.py
# define layer path and img_shape
layer="static_model"
path="os.path.join(save_dir, 'model')"
elif [[ $model_type =~ "PaddleGAN" ]]; then
echo "PaddleGAN"
# get export file
export_file=${root_path}/ppgan/models/base_model.py
# define layer path and img_shape
layer="static_model"
path="os.path.join(output_dir, model_name)"
elif [[ $model_type =~ "PaddleSeg" ]]; then
echo "PaddleSeg"
# get export file
export_file=${root_path}/export.py
# define layer path and img_shape
layer="new_net"
path="save_path"
elif [[ $model_type =~ "PaddleVideo" ]]; then
echo "PaddleVideo"
# get export file
export_file=${root_path}/export.py
# define layer path and img_shape
layer="model"
path="args.output_path"
fi

echo $export_file

if [[ $model_type =~ "PaddleDetection" ]]; then
tab_num="6"
# get insert line
line_number="782"
elif [[ $model_type =~ "PaddleGAN" ]]; then
tab_num="3"
line_number="205"
else
tab_num=`cat ${export_file} | grep "jit.save" | awk -F" " '{print NF-1}'`
# get insert line
line_number=`cat ${export_file} | grep -n "jit.save" | awk -F ":" '{print $1}'`
fi

# quote the arguments: $path may contain spaces, e.g. "os.path.join(save_dir, 'model')"
fun_run_check "$tab_num" "$layer" "$path"

if [ `grep -c "check_inference" $export_file` -ne '0' ];then
    echo "The check code has already been added to the export file!"
else
sed -i "${line_number} r tmp_file.txt" ${export_file}
fi