Imitation Learning with Knowledge Sharing

Introduction

In a classification task, a new classifier may be trained to imitate an arbitrarily large single network or ensemble. In imitation learning, there may be an unlimited amount of training data because the imitator system's task is simply to match the output of the imitated system, regardless of whether the imitated system is correct, and even regardless of whether the correct answer is known. The imitation training data may even include data generated at random or data produced by a synthesizer or a generator network. Imitation learning may use unlabeled data, but the training of the new classifier is nonetheless supervised learning, because there is a specified target output for each input datum.
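
As a concrete illustration, here is a minimal PyTorch sketch of output-level imitation training. The toy teacher and student architectures, the random inputs (standing in for synthesized or unlabeled data), and the KL divergence loss are illustrative assumptions, not a prescribed implementation.

    import torch
    import torch.nn as nn

    # Hypothetical imitated system ("teacher") and imitator ("student");
    # any classifiers with matching input/output shapes would do.
    teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
    student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    teacher.eval()  # the imitated system is fixed during imitation training

    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    loss_fn = nn.KLDivLoss(reduction="batchmean")

    for step in range(10_000):
        # Inputs may be random, synthesized, or unlabeled real data;
        # no labels are needed because the teacher supplies the target.
        x = torch.randn(128, 32)
        with torch.no_grad():
            target = torch.softmax(teacher(x), dim=1)
        pred = torch.log_softmax(student(x), dim=1)
        loss = loss_fn(pred, target)
        opt.zero_grad()
        loss.backward()
        opt.step()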

If the reference system and the new system are neural networks, knowledge sharing may be used to extend the principle of imitation to nodes in the inner layers of the reference network and the new network. If the new network is larger than the reference network, each node in the reference system may be a knowledge providing node to zero or more nodes in the new network. If the new network is smaller than the reference network, each node in the new network may be a knowledge receiving node from zero or more nodes in the reference network. Knowledge sharing applies regularization to encourage a relationship, typically equality or some form of inequality, between the activations of the knowledge providing node and the knowledge receiving node when their networks are activated with the same input. Knowledge sharing therefore encourages imitation in the inner nodes of the networks, not just the output nodes.
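
The following sketch shows one way such a regularization term might be implemented in PyTorch, assuming an equality relationship and a hypothetical one-to-one pairing of linked nodes; the choice of linked layers and the penalty weight are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Toy reference and new networks with different architectures; linking
    # the first layer of each is an illustrative choice.
    reference = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    new_net = nn.Sequential(nn.Linear(32, 48), nn.ReLU(), nn.Linear(48, 10))

    acts = {}
    def save_to(key):
        def hook(module, inputs, output):
            acts[key] = output
        return hook

    # Record the activations of the linked inner layers on each forward pass.
    reference[0].register_forward_hook(save_to("provider"))
    new_net[0].register_forward_hook(save_to("receiver"))

    # links[i] = index of the knowledge providing node for receiving node i
    # (a hypothetical one-to-one pairing; zero or many links are also possible).
    links = torch.randint(0, 64, (48,))

    x = torch.randn(128, 32)          # the same input activates both networks
    with torch.no_grad():
        ref_out = reference(x)        # provider activations serve as fixed targets
    out = new_net(x)

    # Equality-type knowledge sharing: a regularization penalty on the squared
    # difference between each receiving node and its providing node.
    ks_penalty = F.mse_loss(acts["receiver"], acts["provider"][:, links])

    # Total loss: imitate the reference output, plus the inner-node penalty.
    loss = F.mse_loss(out, ref_out) + 0.1 * ks_penalty
    loss.backward()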

Unlimited Amount of Data, No Problem of Overfitting, No Need for Early Stopping

In imitation learning, any problem of overfitting may be addressed simply by generating more data. Thus, if the imitator system is not fixed in size but may be grown to an arbitrary size, it may achieve arbitrarily accurate performance on the imitation task. On the other hand, the size of the new system may be controlled to reduce computation cost. Under the control of a human + AI learning management system, the imitator system may be trained to convergence at an optimum cost/performance trade-off. No early stopping is needed; instead, more training data is generated. If there are errors, the imitator network may be grown, unless limited by the cost/performance trade-off.
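
Continuing the earlier teacher/student sketch (same teacher, student, and optimizer), a training loop along these lines generates a fresh batch at every step and runs to convergence rather than stopping early. The patience-based convergence test is a crude illustrative stand-in for whatever criterion a learning management system might apply.

    import torch
    import torch.nn.functional as F

    # Every batch is freshly generated, so the training loss is also an
    # unbiased estimate of generalization error: there is no fixed training
    # set to overfit and no need for a held-out set or early stopping.
    best, patience = float("inf"), 0
    while patience < 100:                 # stop only at convergence
        x = torch.randn(128, 32)          # fresh data, never reused
        with torch.no_grad():
            target = torch.softmax(teacher(x), dim=1)
        loss = F.kl_div(F.log_softmax(student(x), dim=1), target,
                        reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
        if loss.item() < best - 1e-4:     # crude convergence test
            best, patience = loss.item(), 0
        else:
            patience += 1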

Imitation Learning, Transfer Learning and Knowledge Sharing

Imitation learning and transfer learning both train a new network with the aid of a reference network that has already been trained on a similar task, so the two methods are superficially similar. In practice, however, their characteristics are quite different, and they apply in different situations. In any application of transfer learning, imitation learning may be used in addition, especially if there is new data or adaptive training. In the remaining applications of imitation learning, however, transfer learning either cannot be used at all or works very poorly.

Transfer learning requires that the architectures of the reference system and the new system be the same or very similar. If the reference system and the new system are trained on the same task with the same data, the new system is simply a clone and the transfer is trivial; neither transfer learning nor imitation learning would be needed in such a case. If the reference system and the new system are trained on different tasks with different data, transfer learning might be used to initialize the new system, but imitation learning would not apply, because imitation learning and knowledge sharing compare the outputs or the activations of linked nodes only when both networks are activated with the same datum. Imitation learning may be used in addition to transfer learning if there is new data beyond the data on which the reference network was trained. For example, if there is unlabeled data, the new network might be trained with semi-supervised learning, under which it may drift away from the reference network. Because imitation learning does not require the data to be labeled, it may be used to maintain some degree of relationship between the reference system and the new system, as in the sketch below.
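
A sketch of one way to combine the two objectives, assuming hypothetical labeled and unlabeled batches; the cross-entropy task loss, the KL imitation term, and the weighting are illustrative choices.

    import torch
    import torch.nn.functional as F

    # Keep a semi-supervised student tethered to the reference network.
    # `student` and `reference` are assumed models; `labeled_x`, `labeled_y`,
    # and `unlabeled_x` stand in for the new data.
    def combined_loss(student, reference, labeled_x, labeled_y, unlabeled_x,
                      imitation_weight=0.5):
        # Supervised term on the new labeled data.
        task = F.cross_entropy(student(labeled_x), labeled_y)
        # Imitation term on the unlabeled data: no labels are needed because
        # the reference network's outputs are the targets.
        with torch.no_grad():
            ref = F.softmax(reference(unlabeled_x), dim=1)
        imit = F.kl_div(F.log_softmax(student(unlabeled_x), dim=1),
                        ref, reduction="batchmean")
        return task + imitation_weight * imit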

Imitation learning and knowledge sharing can be used in situations in which transfer learning cannot. Transfer learning requires that the architecture of the new system be the same as, or very similar to, the architecture of the reference system; imitation learning and knowledge sharing allow the architectures of the reference system and the new system to differ. Standard knowledge sharing requires that both the reference system and the new system be neural networks, but some specialized versions require only that the knowledge receiving node be in a neural network; the knowledge providing source may instead be an external knowledge base of some other type, as in the sketch below. Imitation learning by itself does not require either system to be a neural network.
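
As a sketch of one such specialized form, the knowledge providing source below is a hand-written rule rather than a node in a neural network; the rule itself and the choice of receiving node are purely illustrative assumptions.

    import torch
    import torch.nn.functional as F

    # External "knowledge base": a rule stating that the feature detected by
    # the chosen receiving node should fire whenever the first input
    # component is positive (a hypothetical rule, for illustration only).
    def rule_based_target(x):
        return (x[:, 0] > 0).float()

    # Regularize the receiving node's activation toward the rule's value;
    # `receiving_activation` is the activation of the chosen node in the
    # new network for the batch of inputs `x`.
    def knowledge_penalty(receiving_activation, x, weight=0.1):
        return weight * F.mse_loss(receiving_activation, rule_based_target(x))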

Imitation learning and knowledge sharing continue to transfer knowledge from the reference system to the new system throughout the training process; transfer learning, by contrast, is used only for initialization.

Uses of Imitation Learning

For experimenters: Note that an imitation learning task has a potentially unlimited amount of data. If you have a novel idea, first try it out on an imitation learning task, where you can control the amount of data in order to separate issues caused by insufficient data from other issues. Go to the Experimenters Page for more suggestions.


by James K Baker and Bradley J Baker

© D5AI LLC, 2020

The text in this work is licensed under a Creative Commons Attribution 4.0 International License.
Some of the ideas presented here are covered by issued or pending patents. No license to such patents is created or implied by publication of, or reference to, this work.