Python --- smile detection + mask face recognition

1. Split the smiling-face dataset (genki4k) into positive and negative samples, train and test models (at least SVM and CNN), and report training and test accuracy (F1-score and ROC)

1. Dataset download

Smiley dataset download link: https://pan.baidu.com/s/1ldA8CF6pH4q0Bheik_Tttg
Extraction code: fui1

2. Training dataset

1) Import Keras

import keras
keras.__version__

2) Create the directory structure for the training, validation and test sets

import os, shutil
# The path to the directory where the original
# dataset was uncompressed
original_dataset_dir = 'smileface/genki4k'

# The directory where we will
# store our smaller dataset
base_dir = 'smileface/smile_and_nosmile'
os.mkdir(base_dir)

# Directories for our training,
# validation and test splits
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)

# Directory with our training smile pictures
train_smile_dir = os.path.join(train_dir, 'smiles')
os.mkdir(train_smile_dir)

# Directory with our training nosmile pictures
train_nosmile_dir = os.path.join(train_dir, 'no_smiles')
os.mkdir(train_nosmile_dir)

# Directory with our validation smile pictures
validation_smile_dir = os.path.join(validation_dir, 'smiles')
os.mkdir(validation_smile_dir)

# Directory with our validation nosmile pictures
validation_nosmile_dir = os.path.join(validation_dir, 'no_smiles')
os.mkdir(validation_nosmile_dir)

# Directory with our test smile pictures
test_smile_dir = os.path.join(test_dir, 'smiles')
os.mkdir(test_smile_dir)

# Directory with our test nosmile pictures
test_nosmile_dir = os.path.join(test_dir, 'no_smiles')
os.mkdir(test_nosmile_dir)

3) Put smiley pictures and non-smiley pictures into corresponding folders

Copy the smile and no-smile images from the genki4k dataset into the corresponding train, validation and test folders created above, as sketched below.
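One possible way to do this, assuming the unpacked genki4k archive contains a files/ folder of images and a labels.txt whose first column is the smile label (1 = smile, 0 = no smile); the layout of your download may differ, so adjust the parsing and split sizes accordingly:

# Hypothetical layout: original_dataset_dir/files/*.jpg plus labels.txt,
# one line per image, first column = smile label. Adjust if yours differs.
files_dir = os.path.join(original_dataset_dir, 'files')
fnames = sorted(os.listdir(files_dir))
with open(os.path.join(original_dataset_dir, 'labels.txt')) as f:
    labels = [int(line.split()[0]) for line in f if line.strip()]

smiles = [name for name, label in zip(fnames, labels) if label == 1]
no_smiles = [name for name, label in zip(fnames, labels) if label == 0]

# Small split sizes, as in this post; increase them for a more accurate model
n_train, n_val, n_test = 30, 10, 10
splits = [
    (train_smile_dir, smiles[:n_train]),
    (validation_smile_dir, smiles[n_train:n_train + n_val]),
    (test_smile_dir, smiles[n_train + n_val:n_train + n_val + n_test]),
    (train_nosmile_dir, no_smiles[:n_train]),
    (validation_nosmile_dir, no_smiles[n_train:n_train + n_val]),
    (test_nosmile_dir, no_smiles[n_train + n_val:n_train + n_val + n_test]),
]
for dst_dir, names in splits:
    for name in names:
        shutil.copyfile(os.path.join(files_dir, name), os.path.join(dst_dir, name))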

4) Print the number of smiley and non-smiley pictures in the dataset

print('total training smile images:', len(os.listdir(train_smile_dir)))
print('total training nosmile images:', len(os.listdir(train_nosmile_dir)))
print('total validation smile images:', len(os.listdir(validation_smile_dir)))
print('total validation nosmile images:', len(os.listdir(validation_nosmile_dir)))
print('total test smile images:', len(os.listdir(test_smile_dir)))
print('total test nosmile images:', len(os.listdir(test_nosmile_dir)))


Here only a few dozen images are used to build the model, so the final accuracy will naturally be much lower. More images will produce a more accurate model, depending on what your computer's configuration can handle.

5) Build a small convolutional neural network

from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

See how the size of the feature map changes with each layer

model.summary()

from keras import optimizers
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

6) Data preprocessing

from keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')


Look at the output of one of these generators

for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break

7) Train the model

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50)



Training is done with the fit_generator method, which is the equivalent of fit for data generators like ours.
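Note that in newer versions of Keras/TensorFlow, fit_generator is deprecated and model.fit accepts generators directly; if you are on such a version, the roughly equivalent call is:

# Equivalent call on newer Keras/TensorFlow, where fit accepts generators
history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50)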

Save the model

model.save('smile_and_nosmile.h5')

After saving the model, it can already be used for smile recognition, but the accuracy will not be very high, since only a small number of images were used for training.
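As a quick sanity check, a minimal sketch of loading the saved model and classifying a single image (the file path here is hypothetical; the preprocessing mirrors the generators above):

from keras.models import load_model
from keras.preprocessing import image
import numpy as np

model = load_model('smile_and_nosmile.h5')
# Hypothetical test image; use any face image you have on disk
img = image.load_img('some_face.jpg', target_size=(150, 150))
x = image.img_to_array(img) / 255.
x = x.reshape((1,) + x.shape)
# Class folders are sorted alphabetically, so class 1 is 'smiles'
print('smile probability:', model.predict(x)[0][0])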

Plot the loss and accuracy of the model

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

8) Data Augmentation

Data augmentation generates more training data from existing samples by applying a series of random transformations that produce believable-looking images. The goal is that during training the model never sees the exact same image twice; this exposes it to more aspects of the data and helps it generalize better.

datagen = ImageDataGenerator(
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')

View a few augmented versions of one image

# This is module with image preprocessing utilities
from keras.preprocessing import image

fnames = [os.path.join(train_smile_dir, fname) for fname in os.listdir(train_smile_dir)]

# We pick one image to "augment"
img_path = fnames[3]

# Read the image and resize it
img = image.load_img(img_path, target_size=(150, 150))

# Convert it to a Numpy array with shape (150, 150, 3)
x = image.img_to_array(img)

# Reshape it to (1, 150, 150, 3)
x = x.reshape((1,) + x.shape)

# The .flow() command below generates batches of randomly transformed images.
# It will loop indefinitely, so we need to `break` the loop at some point!
i = 0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break

plt.show()

If we train a new network with this data augmentation configuration, our network will never see the same input twice. However, the inputs it sees are still highly correlated because they come from a small number of original images - we cannot generate new information, we can only mix existing information. So this may not be enough to completely eliminate overfitting.

To combat overfitting further, we will also add a Dropout layer to our model, just before the densely connected classifier:

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

Train the network with data augmentation and dropout

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=32,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)


Save the new model

model.save('smile_and_nosmile_1.h5')

Look at the loss and accuracy of the model again

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
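The task also asks for the F1-score and ROC on the test set. A minimal sketch of one way to compute them with scikit-learn (assuming scikit-learn is installed; this reuses test_datagen and the test_dir created earlier):

from sklearn.metrics import f1_score, roc_auc_score, roc_curve

test_generator = test_datagen.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary',
        shuffle=False)

# Predicted smile probabilities (class 1 = 'smiles') and true labels
probs = model.predict_generator(test_generator, steps=len(test_generator)).ravel()
y_true = test_generator.classes

print('F1-score:', f1_score(y_true, (probs > 0.5).astype(int)))
print('ROC AUC :', roc_auc_score(y_true, probs))

fpr, tpr, _ = roc_curve(y_true, probs)
plt.plot(fpr, tpr)
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve on the test set')
plt.show()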

3. Camera real-time smile recognition

import cv2
from keras.preprocessing import image
from keras.models import load_model
import numpy as np
import dlib
from PIL import Image
model = load_model('smile_and_nosmile_1.h5')
detector = dlib.get_frontal_face_detector()
video=cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX
def rec(img):
    # Detect faces on the grayscale frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dets = detector(gray, 1)
    for face in dets:
        left, top = face.left(), face.top()
        right, bottom = face.right(), face.bottom()
        # Clamp the box to the frame so the crop is never empty
        top, left = max(top, 0), max(left, 0)
        cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
        # Preprocess the face crop the same way as the training images
        img1 = cv2.resize(img[top:bottom, left:right], dsize=(150, 150))
        img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
        img1 = np.array(img1) / 255.
        img_tensor = img1.reshape(-1, 150, 150, 3)
        prediction = model.predict(img_tensor)
        print(prediction)
        # Class 1 ('smiles') corresponds to a sigmoid output above 0.5
        if prediction[0][0] < 0.5:
            result = 'no_smiles'
        else:
            result = 'smiles'
        cv2.putText(img, result, (left, top), font, 2, (0, 255, 0), 2, cv2.LINE_AA)
    cv2.imshow('smile detector', img)
while video.isOpened():
    res, img_rd = video.read()
    if not res:
        break
    rec(img_rd)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video.release()
cv2.destroyAllWindows()


2. Switch to the mask dataset and train a model to recognize faces wearing masks

1. Download the mask dataset

Face mask dataset download link: https://pan.baidu.com/s/11PBCmDDx7Dtx_ckjwZR2uw
Extraction code: n2um

Training on the face mask dataset follows exactly the same procedure as the smile dataset; only the dataset paths and the corresponding variable names change, as sketched below.
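For illustration, a minimal sketch of the parts that change (the maskface/mask_and_nomask path and the mask/nomask class folders are assumptions; use whatever layout your mask dataset provides):

# Rebuild and compile the same CNN as above, then point the generators at the mask set
base_dir = 'maskface/mask_and_nomask'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

train_generator = train_datagen.flow_from_directory(
        train_dir, target_size=(150, 150), batch_size=32, class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
        validation_dir, target_size=(150, 150), batch_size=32, class_mode='binary')

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)
model.save('mask_and_nomask.h5')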

2. Camera real-time mask recognition

import cv2
from keras.preprocessing import image
from keras.models import load_model
import numpy as np
import dlib
from PIL import Image
model = load_model('mask_and_nomask.h5')
detector = dlib.get_frontal_face_detector()
video=cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX
def rec(img):
    # Detect faces on the grayscale frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dets = detector(gray, 1)
    for face in dets:
        left, top = face.left(), face.top()
        right, bottom = face.right(), face.bottom()
        # Clamp the box to the frame so the crop is never empty
        top, left = max(top, 0), max(left, 0)
        cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
        # Preprocess the face crop the same way as the training images
        img1 = cv2.resize(img[top:bottom, left:right], dsize=(150, 150))
        img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
        img1 = np.array(img1) / 255.
        img_tensor = img1.reshape(-1, 150, 150, 3)
        prediction = model.predict(img_tensor)
        print(prediction)
        # With mask/nomask class folders, class 1 ('nomask') has output above 0.5
        if prediction[0][0] > 0.5:
            result = 'nomask'
        else:
            result = 'mask'
        cv2.putText(img, result, (left, top), font, 2, (0, 255, 0), 2, cv2.LINE_AA)
    cv2.imshow('mask detector', img)
while video.isOpened():
    res, img_rd = video.read()
    if not res:
        break
    rec(img_rd)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video.release()
cv2.destroyAllWindows()


