This article illustrates how to convert a training program into an EDL distillation training program, and how to run the student with either a fixed teacher or a dynamic teacher.
- Define an input to represent the variable obtained from the teacher.
- Use DistillReader to declare the original reader inputs and the variables to fetch from the teacher, then set train_reader as the data source of the DistillReader.
- Define the loss function from the student's prediction and the teacher's prediction. Taking the mnist_distill demo as an example, the code is as follows (a fuller end-to-end sketch follows the snippet):
```python
# 1. Define an input to represent the teacher prediction.
soft_label = fluid.data(name='soft_label', shape=[None, 10], dtype='float32')
inputs.append(soft_label)

# 2. Define the DistillReader.
dr = DistillReader(ins=['img', 'label'], predicts=['fc_0.tmp_2'])
train_reader = dr.set_sample_list_generator(train_reader)

# 3. Define the distill loss.
distill_loss = fluid.layers.cross_entropy(
    input=prediction, label=soft_label, soft_label=True)
distill_loss = fluid.layers.mean(distill_loss)
loss = distill_loss

# Start distill training.
# Each batch contains the original reader inputs plus the prediction
# obtained from the teacher, i.e. (img, label, soft_label).
for data in train_reader():
    metrics = exe.run(main_program, feed=data, fetch_list=[loss, acc])
```
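For orientation, below is a minimal end-to-end sketch of a student program built around the snippet above. It is a sketch, not the demo itself: the import path, network definition, teacher endpoint, and DataLoader wiring are assumptions based on Paddle 1.x fluid APIs; the real code is in example/distill/mnist_distill/train_with_fleet.py.

```python
import paddle
import paddle.fluid as fluid
# Import path is an assumption; adjust it to your paddle_edl version.
from paddle_edl.distill.distill_reader import DistillReader

# Student network (same as a plain, non-distill mnist program).
img = fluid.data(name='img', shape=[None, 784], dtype='float32')
label = fluid.data(name='label', shape=[None, 1], dtype='int64')
prediction = fluid.layers.fc(input=img, size=10, act='softmax')
acc = fluid.layers.accuracy(input=prediction, label=label)

# Distill additions (steps 1-3 above).
soft_label = fluid.data(name='soft_label', shape=[None, 10], dtype='float32')
inputs = [img, label, soft_label]
distill_loss = fluid.layers.cross_entropy(
    input=prediction, label=soft_label, soft_label=True)
loss = fluid.layers.mean(distill_loss)
fluid.optimizer.Adam(learning_rate=1e-3).minimize(loss)

# Wrap the original batched reader with DistillReader.
batched_reader = paddle.batch(paddle.dataset.mnist.train(), batch_size=64)
dr = DistillReader(ins=['img', 'label'], predicts=['fc_0.tmp_2'])
dr.set_fixed_teacher('127.0.0.1:9292')  # illustrative endpoint, see below
train_reader = dr.set_sample_list_generator(batched_reader)

place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

# A DataLoader turns the sample lists into feedable batches.
loader = fluid.io.DataLoader.from_generator(
    feed_list=inputs, capacity=4, iterable=True)
loader.set_sample_list_generator(train_reader, places=place)

main_program = fluid.default_main_program()
for data in loader():
    loss_v, acc_v = exe.run(main_program, feed=data, fetch_list=[loss, acc])
```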
- First, you need to deploy the teacher. We use Paddle Serving to deploy the teacher; see here for how to save your own teacher model (a hedged export sketch follows the command below).
```bash
python -m paddle_serving_server_gpu.serve \
    --model TEACHER_MODEL \
    --port TEACHER_PORT \
    --gpu_ids 0
```
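If you need to export a servable teacher model yourself, a minimal sketch using Paddle Serving's save_model utility looks like the following. The folder names and variable dicts are illustrative assumptions; run it inside the teacher program where `img` and `prediction` are defined.

```python
from paddle_serving_client.io import save_model

# Exports a server-side model dir (pass it as TEACHER_MODEL above)
# and a matching client config dir.
save_model(
    "teacher_serving_model",     # server model folder (assumed name)
    "teacher_client_conf",       # client config folder (assumed name)
    {"img": img},                # feed vars: names must match DistillReader's ins
    {"fc_0.tmp_2": prediction},  # fetch vars: names must match DistillReader's predicts
    fluid.default_main_program())
```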
- Prepare the student code. Use `set_fixed_teacher` to set a fixed teacher.
```python
# See example/distill/mnist_distill/train_with_fleet.py
dr = DistillReader(ins=reader_ins, predicts=teacher_predicts)
dr.set_fixed_teacher(args.distill_teachers)
train_reader = dr.set_sample_list_generator(train_reader)
```
- Run the student.
```bash
python train_with_fleet.py \
    --use_distill_service True \
    --distill_teachers TEACHER_IP:TEACHER_PORT
```
In addition to the teacher and the student, a discovery service and a database are required.
Once the database and discovery service are deployed, they can be reused permanently across different students and teachers.
- Install & deploy redis as a database. The teacher service will be registered in the database, and discovery service query teacher from database.
```bash
redis-server
```
- Deploy the distill discovery service. The service also provides load balancing across teachers.
```bash
python -m paddle_edl.distill.redis.balance_server \
    --db_endpoints REDIS_HOST:REDIS_PORT \
    --server DISCOVERY_IP:DISCOVERY_PORT
```
- Register the teacher in the database. You can register or stop a teacher at any time.
```bash
python -m paddle_edl.distill.redis.server_register \
    --db_endpoints REDIS_HOST:REDIS_PORT \
    --service_name TEACHER_SERVICE_NAME \
    --server TEACHER_IP:TEACHER_PORT
```
- Use `set_dynamic_teacher` to obtain dynamic teachers from the discovery service.
```python
dr = DistillReader(ins=reader_ins, predicts=teacher_predicts)
dr.set_dynamic_teacher('DISCOVERY_IP:DISCOVERY_PORT', 'TEACHER_SERVICE_NAME')
train_reader = dr.set_sample_list_generator(train_reader)
```
- Then run the student code.
```bash
python train_with_fleet.py --use_distill_service True
```
We have built Docker images so you can start a demo on Kubernetes immediately: TBD