Image classification - CIFAR-100 experimental research

The low val_acc on CIFAR-100 is essentially an overfitting problem. To tackle it, I first went to the Papers with Code website to see how far the CIFAR-100 benchmark has been pushed. As shown in the figure below, the top entry reaches val_acc = 0.96, which is pretty impressive. So the next step is to study SAM (Sharpness-Aware Minimization), which is mainly used to improve the generalization of a model.
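To make the idea behind SAM concrete, here is a minimal sketch of a single update step in plain Python. It is not the PyTorch code used in this post; `sam_step`, `grad_fn`, and the values of `lr` and `rho` are names and defaults chosen purely for illustration. SAM first moves the weights a distance `rho` along the normalized gradient, toward a nearby point with higher loss, and then uses the gradient at that point to update the original weights.

```python
import math


def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on a flat list of weights (illustrative sketch).

    grad_fn(weights) must return the loss gradient as a list of the same length.
    """
    # 1) Gradient at the current weights.
    grad = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grad)) + 1e-12  # avoid division by zero

    # 2) Step of length rho along the gradient direction: an approximation of
    #    the worst-case point in an L2 ball of radius rho around the weights.
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]

    # 3) The gradient at the perturbed point updates the ORIGINAL weights.
    sam_grad = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, sam_grad)]


def quadratic_grad(weights):
    """Gradient of the toy loss sum(w**2), used only for the demo below."""
    return [2.0 * w for w in weights]


new_weights = sam_step([1.0, 0.0], quadratic_grad)
print(new_weights)  # roughly [0.79, 0.0]; plain gradient descent would give 0.8
```

In a real training loop the same two-gradient pattern is applied to all model parameters per batch, which is why SAM costs roughly two forward/backward passes per step.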

I ran the code here first. It uses CIFAR-10 as the dataset and reaches val_acc = 0.97, which looks quite stable. I am currently running it on CIFAR-100. Note that the code is the PyTorch version and will need to be migrated to TensorFlow later. The screenshot of the CIFAR-10 training run is shown below. Code address:

Update: I ran CIFAR-100, but val_acc is not what I expected. It is still higher than the previous 0.8: at present val_acc = 0.83. The training screenshot is shown below.

Special note on the names in the model training logs, such as DenseNet121_RandomFlip_…/: the network used here is DenseNet121, and RandAugment is used for data augmentation. The RandomFlip part of the name can be ignored; it is left over from the earlier flower_photos experiments.

Dataset: visual_domain_decathlon/cifar100

Config description: Data based on "CIFAR-100", with images resized isotropically to have a shorter size of 72 pixels

train: 40000 pictures

test: 10000 pictures

validation: 10000 pictures

Number of classes: 100

Images are resized to 180 x 180 x 3 for training.

NASNetMobile is a special case: its inputs need to be resized to 224 x 224 x 3.
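One simple way to keep these input sizes straight when switching between backbones is a small lookup, sketched below. The names `DEFAULT_INPUT_SIZE`, `INPUT_SIZE_OVERRIDES`, `input_size_for`, and `resize_target` are hypothetical and not taken from the original code; only the two sizes themselves come from the text above.

```python
# Input size used for most models in these experiments.
DEFAULT_INPUT_SIZE = (180, 180, 3)

# Per-model exceptions; NASNetMobile is the only one mentioned in the text.
INPUT_SIZE_OVERRIDES = {"NASNetMobile": (224, 224, 3)}


def input_size_for(model_name):
    """Return the (height, width, channels) input shape for a backbone name."""
    return INPUT_SIZE_OVERRIDES.get(model_name, DEFAULT_INPUT_SIZE)


def resize_target(model_name):
    """Return (height, width), the form an image-resize call typically expects."""
    height, width, _ = input_size_for(model_name)
    return (height, width)


print(input_size_for("ResNet50"))     # (180, 180, 3)
print(resize_target("NASNetMobile"))  # (224, 224)
```

The `(height, width)` pair can then be passed to a resize step such as `tf.image.resize(image, resize_target(name))` inside the `tf.data` pipeline.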

In the first stage, we use models pre-trained on ImageNet for feature extraction: we freeze the convolutional part of the pre-trained model and train only the newly added top classifier. The training results are shown in the figure below.

Here we can see that ResNet50 reaches a val_acc of 0.7421. Strictly speaking, ResNet101 scores the highest, but considering the amount of computation, we go with ResNet50. What is surprising is that ResNet50's val_acc is among the highest of the models tried. My guess is that this is due to the resolution of the dataset: the original images in our task are only 72 x 72 x 3.

Here is the code for the first stage:

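import imgaug.augmenters as iaa
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

# RandAugment policy: n=3 random operations per image at magnitude m=7.
rand_aug = iaa.RandAugment(n=3, m=7)

def augment(images):
    # Input to `augment()` is a TensorFlow tensor which
    # is not supported by `imgaug`. This is why we first
    # convert it to its `numpy` variant.
    images = tf.cast(images, tf.uint8)
    return rand_aug(images=images.numpy())


# `train_ds`, `val_ds` and `batch_size` are assumed to be defined earlier (not shown here).
# cache() comes before shuffle() so that every epoch is reshuffled instead of
# replaying the order that was cached during the first pass.
train_ds = train_ds.cache().shuffle(buffer_size=len(train_ds)).batch(batch_size).map(
    lambda x, y: (tf.py_function(augment, [x], [tf.float32])[0], y),
    num_parallel_calls=AUTOTUNE).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().batch(batch_size).prefetch(buffer_size=AUTOTUNE)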

preprocess_input = tf.keras.applications.resnet.preprocess_input
base_model = tf.keras.applications.ResNet101(input_shape=img_size,
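The snippet above stops at the `ResNet101` constructor. As a rough sketch of how the feature-extraction model described earlier could be put together, the block below freezes the pretrained base and adds a new classifier head. The `include_top=False` and `weights="imagenet"` arguments, the helper name `build_feature_extractor`, and the head layers (borrowed from the fine-tuning code later in this post) are assumptions, not necessarily the author's exact code.

```python
import tensorflow as tf


def build_feature_extractor(num_classes, img_size=(180, 180, 3),
                            backbone=tf.keras.applications.ResNet101,
                            weights="imagenet"):
    """Frozen pretrained backbone plus a new, trainable classification head."""
    # include_top=False drops the ImageNet classifier (assumed; the original
    # constructor call is cut off after `input_shape=img_size,`).
    base_model = backbone(input_shape=img_size, include_top=False, weights=weights)
    base_model.trainable = False  # freeze the whole convolutional base

    inputs = tf.keras.Input(shape=img_size)
    x = tf.keras.applications.resnet.preprocess_input(inputs)
    # training=False keeps BatchNormalization layers in inference mode.
    x = base_model(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes)(x)  # logits, no softmax
    return tf.keras.Model(inputs, outputs)
```

For example, `model = build_feature_extractor(num_classes=100)` builds the 100-class model; with the base frozen, only the final `Dense` layer has trainable weights. Because that layer outputs logits, the loss should be configured accordingly, e.g. `tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)`.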

I have not pasted all of the code here. If you want to see the full source, please go here:

As shown in the figure above, you need to check out the corresponding branch.

Building on this, we fine-tuned ResNet50 and InceptionResNetV2 separately. The results are as follows:

Not every model from the first stage was fine-tuned. The figure above shows that ResNet50's val_acc is still slightly higher. So far, though, our val_acc on visual_domain_decathlon/cifar100 is still low, only 0.8041, and needs to be improved.

preprocess_input = tf.keras.applications.resnet50.preprocess_input
base_model = tf.keras.applications.ResNet50(input_shape=img_size,
base_model.trainable = True
# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))

# Fine-tune from this layer onwards
fine_tune_at = 120

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

inputs = tf.keras.Input(shape=img_size)
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(num_classes)(x)
model = tf.keras.Model(inputs, outputs)

That was the code for the fine-tuning stage. Note that different models have different numbers of layers, so the fine_tune_at parameter has to be chosen per model, and make sure the path the model is loaded from is correct.
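Because `fine_tune_at` has to change with the model, it can help to wrap the freezing loop in a small helper that checks the index against the actual number of layers. The sketch below is one way to do that; `freeze_up_to` is a hypothetical helper, not part of Keras, and the stand-in layer objects exist only so the example runs without TensorFlow.

```python
from types import SimpleNamespace


def freeze_up_to(layers, fine_tune_at):
    """Freeze layers[:fine_tune_at], keep the rest trainable.

    Returns the number of layers left trainable.
    """
    if not 0 <= fine_tune_at <= len(layers):
        raise ValueError(
            f"fine_tune_at={fine_tune_at} is outside 0..{len(layers)} for this model"
        )
    for index, layer in enumerate(layers):
        layer.trainable = index >= fine_tune_at
    return sum(1 for layer in layers if layer.trainable)


# Stand-in objects with a `trainable` attribute, used only for this demo.
dummy_layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(10)]
n_trainable = freeze_up_to(dummy_layers, fine_tune_at=7)
print(n_trainable)  # 3
```

With a real Keras model, set `base_model.trainable = True` first and then call `freeze_up_to(base_model.layers, fine_tune_at=120)`. The explicit range check matters because a plain slice such as `layers[:120]` silently freezes every layer when the model has fewer than 120 of them.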

Tags: AI Deep Learning TensorFlow Pytorch

Posted by ThaSpY on Wed, 22 Sep 2021 06:40:08 +0530