This article illustrates how to convert a training program into an EDL distillation training program, and how to run the student with either a fixed teacher or a dynamic teacher.
- Define an input to represent the variable obtained from the teacher.
- Use DistillReader to declare the original reader inputs and the variables to fetch from the teacher, then set train_reader as the data source of the DistillReader.
- Define the loss function from the student's prediction and the teacher's prediction. Taking the mnist_distill demo as an example, the code is as follows (a fuller end-to-end sketch follows the snippet):
```python
# 1. Define an input to represent the teacher prediction.
soft_label = fluid.data(name='soft_label', shape=[None, 10], dtype='float32')
inputs.append(soft_label)

# 2. Define the DistillReader.
dr = DistillReader(ins=['img', 'label'], predicts=['fc_0.tmp_2'])
train_reader = dr.set_sample_list_generator(train_reader)

# 3. Define the distill loss.
distill_loss = fluid.layers.cross_entropy(
    input=prediction, label=soft_label, soft_label=True)
distill_loss = fluid.layers.mean(distill_loss)
loss = distill_loss

# Start distill training.
# Each batch contains the original reader inputs plus the prediction
# obtained from the teacher, i.e. (img, label, soft_label).
for data in train_reader():
    metrics = exe.run(main_program, feed=data, fetch_list=[loss, acc])
```
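For orientation, below is a minimal end-to-end sketch of a student program built around the snippet above. It is a sketch, not the demo itself: the import path, network definition, teacher endpoint, and DataLoader wiring are assumptions based on Paddle 1.x fluid APIs; the real code is in example/distill/mnist_distill/train_with_fleet.py.

```python
import paddle
import paddle.fluid as fluid
# Import path is an assumption; adjust it to your paddle_edl version.
from paddle_edl.distill.distill_reader import DistillReader

# Student network (same as a plain, non-distill mnist program).
img = fluid.data(name='img', shape=[None, 784], dtype='float32')
label = fluid.data(name='label', shape=[None, 1], dtype='int64')
prediction = fluid.layers.fc(input=img, size=10, act='softmax')
acc = fluid.layers.accuracy(input=prediction, label=label)

# Distill additions (steps 1-3 above).
soft_label = fluid.data(name='soft_label', shape=[None, 10], dtype='float32')
inputs = [img, label, soft_label]
distill_loss = fluid.layers.cross_entropy(
    input=prediction, label=soft_label, soft_label=True)
loss = fluid.layers.mean(distill_loss)
fluid.optimizer.Adam(learning_rate=1e-3).minimize(loss)

# Wrap the original batched reader with DistillReader.
batched_reader = paddle.batch(paddle.dataset.mnist.train(), batch_size=64)
dr = DistillReader(ins=['img', 'label'], predicts=['fc_0.tmp_2'])
dr.set_fixed_teacher('127.0.0.1:9292')  # illustrative endpoint, see below
train_reader = dr.set_sample_list_generator(batched_reader)

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

# A DataLoader turns the sample lists into feedable batches.
loader = fluid.io.DataLoader.from_generator(
    feed_list=inputs, capacity=4, iterable=True)
loader.set_sample_list_generator(train_reader, places=place)

main_program = fluid.default_main_program()
for data in loader():
    loss_v, acc_v = exe.run(main_program, feed=data, fetch_list=[loss, acc])
```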
- First, you need to deploy the teacher. We use Paddle Serving to deploy the teacher; see here for how to save your own teacher model (a hedged export sketch follows the command below).
```bash
python -m paddle_serving_server_gpu.serve \
    --model TEACHER_MODEL \
    --port TEACHER_PORT \
    --gpu_ids 0
```
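If you need to export a servable teacher model yourself, a minimal sketch using Paddle Serving's save_model utility looks like the following. The folder names and variable dicts are illustrative assumptions; run it inside the teacher program where `img` and `prediction` are defined.

```python
from paddle_serving_client.io import save_model

# Exports a server-side model dir (pass it as TEACHER_MODEL above)
# and a matching client config dir.
save_model(
    "teacher_serving_model",     # server model folder (assumed name)
    "teacher_client_conf",       # client config folder (assumed name)
    {"img": img},                # feed vars: names must match DistillReader's ins
    {"fc_0.tmp_2": prediction},  # fetch vars: names must match DistillReader's predicts
    fluid.default_main_program())
```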
- Prepare the student code. Use `set_fixed_teacher` to set a fixed teacher.
```python
# See example/distill/mnist_distill/train_with_fleet.py
dr = DistillReader(ins=reader_ins, predicts=teacher_predicts)
dr.set_fixed_teacher(args.distill_teachers)
train_reader = dr.set_sample_list_generator(train_reader)
```
- Run the student.
```bash
python train_with_fleet.py \
    --use_distill_service True \
    --distill_teachers TEACHER_IP:TEACHER_PORT
```
In addition to the teacher and the student, a discovery service and a database are required.
Once the database and discovery service are deployed, they can be reused permanently across different students and teachers.
- Install & deploy redis as a database. The teacher service will be registered in the database, and discovery service query teacher from database.
```bash
redis-server
```
- Deploy the distill discovery service. The service also provides load balancing across teachers.
```bash
python -m paddle_edl.distill.redis.balance_server \
    --db_endpoints REDIS_HOST:REDIS_PORT \
    --server DISCOVERY_IP:DISCOVERY_PORT
```
- Register the teacher in the database. You can register or stop a teacher at any time.
```bash
python -m paddle_edl.distill.redis.server_register \
    --db_endpoints REDIS_HOST:REDIS_PORT \
    --service_name TEACHER_SERVICE_NAME \
    --server TEACHER_IP:TEACHER_PORT
```
- Use `set_dynamic_teacher` to obtain dynamic teachers from the discovery service.
```python
dr = DistillReader(ins=reader_ins, predicts=teacher_predicts)
dr.set_dynamic_teacher('DISCOVERY_IP:DISCOVERY_PORT', 'TEACHER_SERVICE_NAME')
train_reader = dr.set_sample_list_generator(train_reader)
```
- Then run the student code.
```bash
python train_with_fleet.py --use_distill_service True
```
We have built Docker images so you can start a demo on Kubernetes immediately: TBD