## Issue

I am using transfer learning with MobileNetV2, and I would like to quantize the model and compare its accuracy against the non-quantized transfer-learning model. However, the TensorFlow Model Optimization Toolkit does not fully support recursive quantization, although according to this comment the following approach should quantize my model: https://github.com/tensorflow/model-optimization/issues/377#issuecomment-820948555

What I tried doing was:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

pretrained_model = tf.keras.applications.MobileNetV2(include_top=False)
pretrained_model.trainable = True
for layer in pretrained_model.layers[:-1]:
    layer.trainable = False

quantize_model = tfmot.quantization.keras.quantize_model
q_pretrained_model = quantize_model(pretrained_model)

original_inputs = tf.keras.layers.Input(shape=(224, 224, 3))
y = tf.keras.layers.experimental.preprocessing.Rescaling(1. / 255)(original_inputs)
y = q_pretrained_model(y)
y = tf.keras.layers.GlobalAveragePooling2D()(y)
original_outputs = tf.keras.layers.Dense(5, activation="softmax")(y)
model_1 = tf.keras.Model(original_inputs, original_outputs)

q_aware_model = quantize_model(model_1)
```

It is still giving me the following error:

```
ValueError: Quantizing a tf.keras Model inside another tf.keras Model is not supported.
```

What is the correct way to perform quantization-aware training in this case?

## Solution

According to the issue you linked, you should quantize each sub-model separately and then combine them afterwards, something like this:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

pretrained_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False)
pretrained_model.trainable = True
for layer in pretrained_model.layers[:-1]:
    layer.trainable = False

# Quantize the pretrained base and the classification head separately.
q_pretrained_model = tfmot.quantization.keras.quantize_model(pretrained_model)
q_base_model = tfmot.quantization.keras.quantize_model(
    tf.keras.Sequential([
        tf.keras.layers.GlobalAveragePooling2D(input_shape=(7, 7, 1280)),
        tf.keras.layers.Dense(5, activation="softmax"),
    ]))

# Combine the two already-quantized models; do not call quantize_model again.
original_inputs = tf.keras.layers.Input(shape=(224, 224, 3))
y = tf.keras.layers.experimental.preprocessing.Rescaling(1. / 255)(original_inputs)
y = q_pretrained_model(y)
original_outputs = q_base_model(y)
model = tf.keras.Model(original_inputs, original_outputs)
```

Quantizing nested models does not appear to be supported out of the box, despite what the linked issue suggests.

Answered By – AloneTogether

**This Answer collected from stackoverflow, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0**