1. Methods of feature extraction from face images
1. HOG features
The histogram of oriented gradients (HOG) is a feature descriptor used for object detection in computer vision and image processing. It forms a feature by computing and counting histograms of gradient orientations over local regions of the image. HOG features combined with an SVM classifier have been widely used in image recognition, especially in pedestrian detection. The HOG+SVM approach to pedestrian detection was proposed by the French researcher Dalal at CVPR 2005, and although many pedestrian detection algorithms have been proposed since, they are basically based on the HOG+SVM idea.
(1) Main idea: in an image, the appearance and shape of local targets can be described well by the distribution of gradients or edge directions (in essence, this is gradient statistics, and gradients mainly exist at edges).
(2) Implementation: first, the image is divided into small connected regions, called cell units. Then the gradient or edge orientation histogram of the pixels in each cell is collected. Finally, these histograms are combined to form the feature descriptor.
(3) Improving performance: contrast-normalize these local histograms over a larger region of the image (called an interval or block). The method is to first compute the density of the histograms in this block, and then normalize each cell in the block according to this density. This normalization gives better robustness to illumination changes and shadows (a common form of this normalization is shown after this list).
(4) Advantages: compared with other feature descriptors, HOG has many advantages. First, because HOG operates on local cells of the image, it remains largely invariant to geometric and photometric deformations of the image; such deformations only appear over larger spatial regions. Second, under coarse spatial sampling, fine orientation sampling, and strong local photometric normalization, pedestrians can be allowed some subtle body movements as long as they roughly keep an upright posture, without affecting the detection result. The HOG feature is therefore particularly suitable for human detection in images.
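One common way to write the block normalization from point (3), the L2-norm variant used by Dalal and Triggs, is

$$ v \leftarrow \frac{v}{\sqrt{\lVert v \rVert_2^2 + \varepsilon^2}} $$

where $v$ is the concatenated histogram vector of one block and $\varepsilon$ is a small constant that avoids division by zero.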
HOG feature extraction proceeds as follows on an image (the detection target or scanning window):
1) Convert to grayscale (the image is regarded as a three-dimensional function of x, y and the grayscale value z);
2) Normalize the color space of the input image with gamma correction; the purpose is to adjust the contrast of the image, reduce the influence of local shadows and illumination changes, and suppress noise;
3) Calculate the gradient (magnitude and direction) of each pixel of the image, mainly to capture contour information and further weaken the interference of illumination;
4) Divide the image into small cells (e.g., 6×6 pixels per cell);
5) Build the gradient orientation histogram of each cell (counting how often each orientation occurs) to form the descriptor of each cell;
6) Group several cells into a block (e.g., 3×3 cells per block). The HOG descriptor of a block is obtained by concatenating the descriptors of all cells in that block.
7) Concatenate the HOG descriptors of all blocks in the image (the detection target) to get the HOG descriptor of the image. This is the final feature vector used for classification. A short code sketch of these steps follows.
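A minimal sketch of these steps using scikit-image's `hog` function (the image path is a placeholder, and the parameter values are chosen only to mirror the 6×6-pixel cells and 3×3-cell blocks mentioned above):

```python
# Minimal HOG sketch, assuming scikit-image is installed and "face.jpg" is a placeholder image path
from skimage import io, color
from skimage.feature import hog

img = color.rgb2gray(io.imread("face.jpg"))   # step 1: grayscale

features = hog(img,
               orientations=9,                # number of orientation bins per histogram (step 5)
               pixels_per_cell=(6, 6),        # step 4: 6x6-pixel cells
               cells_per_block=(3, 3),        # step 6: 3x3-cell blocks
               block_norm='L2-Hys')           # step 3: per-block contrast normalization

print(features.shape)                         # step 7: one long feature vector for a classifier (e.g., SVM)
```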
2. Dlib
Dlib is an open-source C++ toolkit containing machine learning algorithms. It helps you build complex machine learning software to solve practical problems. Dlib is widely used in industry and academia, including in robotics, embedded devices, mobile phones and large-scale high-performance computing environments. Main features of Dlib:
- Unlike many other open-source libraries, Dlib provides complete documentation for every class and function. It also provides a debug mode; with debug mode turned on, users can step through the code, inspect the values of variables and objects, and quickly locate errors. In addition, Dlib provides a large number of examples.
- Dlib is high-quality, portable code that does not rely on third-party libraries and needs no installation or configuration (see the compilation guide in the tree directory on the left of the official website). Dlib can be used on Windows, macOS and Linux.
- It provides a large number of machine learning / image processing algorithms, including:
  - Deep learning
  - SVM-based classification and regression algorithms for large-scale problems, and dimensionality reduction methods
  - The relevance vector machine (RVM): a sparse probabilistic model with the same functional form as the support vector machine, used to classify or regress unknown functions. Its training is carried out in a Bayesian framework. Compared with the SVM, it does not require the regularization parameter to be estimated, its kernel function does not need to satisfy the Mercer condition, and it uses fewer relevance vectors; training takes longer but testing is faster.
  - Clustering: linear or kernel k-means, Chinese whispers, and Newman clustering
  - Radial basis function networks
  - Multilayer perceptrons
3. Convolutional neural networks
A convolutional neural network (CNN) is a deep learning model, similar to a multilayer perceptron, that is often used to analyze visual images. The pioneer of convolutional neural networks is Yann LeCun, a well-known computer scientist working at Facebook, who was the first to solve handwritten digit recognition on the MNIST dataset with a convolutional neural network.

CNNs are mainly used to recognize two-dimensional patterns that are invariant to displacement, scaling and other forms of distortion. Because the feature detection layers of a CNN learn from the training data, explicit feature extraction is avoided; features are learned implicitly from the data. Moreover, because the neurons on the same feature map share the same weights, the network can learn in parallel, which is a major advantage of convolutional networks over fully connected networks. With its special structure of locally shared weights, the CNN has unique advantages in speech recognition and image processing, and its layout is closer to an actual biological neural network. Weight sharing reduces the complexity of the network; in particular, multi-dimensional image input can be fed directly into the network, which avoids the complexity of data reconstruction during feature extraction and classification.

In a convolutional neural network, the first step is to extract features with convolution kernels. These initialized kernels are updated again and again during back-propagation and gradually approximate the true solution. In effect, instead of solving for the image matrix directly, the network initializes a set of feature vectors that follow some distribution and keeps updating this set during back-propagation so that it approximates the conceptual feature vectors; the features of the matrix can then be extracted using the mathematical machinery of feature vectors.
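To make the idea of learned convolution kernels concrete, here is a minimal sketch using the same Keras API as the code later in this post; the layer's 3×3 kernels start from random values and are refined by back-propagation during training:

```python
# Minimal sketch: a single convolution layer whose 3x3 kernels are learned during training
from keras import layers, models

model = models.Sequential()
# 32 kernels of size 3x3 slide over a 150x150 RGB image and produce 32 feature maps
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))

# The kernel weights are randomly initialized here and only become useful after fit() updates them
weights, biases = model.layers[0].get_weights()
print(weights.shape)   # (3, 3, 3, 32): kernel height x width x input channels x number of kernels
```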
Smile detection with Dlib
Use Dlib to extract mouth features and write the data to a CSV file
Filter data
```python
# Filter out images from which feature points cannot be extracted
def select():
    for i in imgs_smiles:
        img_rd = path_images_with_smiles + i
        # read img file
        img = cv2.imread(img_rd)
        # Take grayscale
        img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector(img_gray, 0)
        try:
            np.matrix([[p.x, p.y] for p in predictor(img, faces[0]).parts()])
        except:
            shutil.move(img_rd, select_smiles)
    for i in imgs_no_smiles:
        img_rd = path_images_no_smiles + i
        # read img file
        img = cv2.imread(img_rd)
        # Take grayscale
        img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector(img_gray, 0)
        try:
            np.matrix([[p.x, p.y] for p in predictor(img, faces[0]).parts()])
        except:
            shutil.move(img_rd, select_no_smiles)
```
Extracting feature points
```python
# Input the path of an image file and return the 40-dimensional lip feature
# (the 1-dimensional output label is appended later, before writing to CSV)
def get_features(img_rd):
    # Input:  img_rd: path of the image file
    # Output: positions_lip_arr: feature points 49 to 68 (20 points, 40 values in all)
    # read img file
    img = cv2.imread(img_rd)
    # Take grayscale
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Calculate the 68 landmark coordinates
    positions_68_arr = []
    faces = detector(img_gray, 0)
    landmarks = np.matrix([[p.x, p.y] for p in predictor(img, faces[0]).parts()])
    for idx, point in enumerate(landmarks):
        # Coordinates of the 68 points
        pos = (point[0, 0], point[0, 1])
        positions_68_arr.append(pos)
    positions_lip_arr = []
    # Points 49-68 (indices 48-67) are the lip landmarks
    for i in range(48, 68):
        positions_lip_arr.append(positions_68_arr[i][0])
        positions_lip_arr.append(positions_68_arr[i][1])
    return positions_lip_arr
```
Full code
```python
import dlib                 # Face processing library Dlib
import numpy as np          # Data processing library numpy
from cv2 import cv2         # Image processing library OpenCV
import os                   # read files
import csv                  # CSV operations
import shutil

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('D:\\Desktop\\Face_recognition\\shape_predictor_68_face_landmarks.dat')

# Input the path of an image file and return the 40-dimensional lip feature
# (the 1-dimensional output label is appended before writing to CSV)
def get_features(img_rd):
    # Input:  img_rd: path of the image file
    # Output: positions_lip_arr: feature points 49 to 68 (20 points, 40 values in all)
    # read img file
    img = cv2.imread(img_rd)
    # Take grayscale
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Calculate the 68 landmark coordinates
    positions_68_arr = []
    faces = detector(img_gray, 0)
    landmarks = np.matrix([[p.x, p.y] for p in predictor(img, faces[0]).parts()])
    for idx, point in enumerate(landmarks):
        # Coordinates of the 68 points
        pos = (point[0, 0], point[0, 1])
        positions_68_arr.append(pos)
    positions_lip_arr = []
    # Points 49-68 (indices 48-67) are the lip landmarks
    for i in range(48, 68):
        positions_lip_arr.append(positions_68_arr[i][0])
        positions_lip_arr.append(positions_68_arr[i][1])
    return positions_lip_arr

# Paths where the images are stored
path_images_with_smiles = "D:\\Desktop\\Face_recognition\\data\\data_imgs\\database\\smiles\\"
path_images_no_smiles = "D:\\Desktop\\Face_recognition\\data\\data_imgs\\database\\no_smiles\\"
# Get the image files under each path
imgs_smiles = os.listdir(path_images_with_smiles)
imgs_no_smiles = os.listdir(path_images_no_smiles)
select_smiles = "D:\\Desktop\\Face_recognition\\data\\data_imgs\\database\\select\\1\\"
select_no_smiles = "D:\\Desktop\\Face_recognition\\data\\data_imgs\\database\\select\\2\\"
# Path to the CSV where the extracted feature data is stored
path_csv = "data\\data_csvs\\"

# Filter out images from which feature points cannot be extracted
def select():
    for i in imgs_smiles:
        img_rd = path_images_with_smiles + i
        img = cv2.imread(img_rd)
        img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector(img_gray, 0)
        try:
            np.matrix([[p.x, p.y] for p in predictor(img, faces[0]).parts()])
        except:
            shutil.move(img_rd, select_smiles)
    for i in imgs_no_smiles:
        img_rd = path_images_no_smiles + i
        img = cv2.imread(img_rd)
        img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector(img_gray, 0)
        try:
            np.matrix([[p.x, p.y] for p in predictor(img, faces[0]).parts()])
        except:
            shutil.move(img_rd, select_no_smiles)

# Write the features into the CSV
def write_into_CSV():
    with open(path_csv + "data.csv", "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        # Process images with smiling faces
        print("######## with smiles #########")
        for i in imgs_smiles:
            print(path_images_with_smiles + i)
            features_csv_smiles = get_features(path_images_with_smiles + i)
            features_csv_smiles.append(1)
            print("positions of lips:", features_csv_smiles, "\n")
            # Write CSV
            writer.writerow(features_csv_smiles)
        # Process images without smiling faces
        print("######## no smiles #########")
        for i in imgs_no_smiles:
            print(path_images_no_smiles + i)
            features_csv_no_smiles = get_features(path_images_no_smiles + i)
            features_csv_no_smiles.append(0)
            print("positions of lips:", features_csv_no_smiles, "\n")
            # Write CSV
            writer.writerow(features_csv_no_smiles)

# Data filtering
# select()
# Write CSV
write_into_CSV()
```
Extracting the training set X_train and test set X_test with sklearn
Full code
```python
# pandas reads the CSV
import pandas as pd
# Split data
from sklearn.model_selection import train_test_split
# For data pre-processing (standardization)
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression   # Logistic regression (linear model)
from sklearn.neural_network import MLPClassifier      # Multilayer perceptron (neural network model)
from sklearn.svm import LinearSVC                     # Linear SVC model (SVM)
from sklearn.linear_model import SGDClassifier        # Stochastic gradient descent (linear model)
import joblib

# Read data from the CSV
def pre_data():
    # 41-dimensional header
    column_names = []
    for i in range(0, 40):
        column_names.append("feature_" + str(i + 1))
    column_names.append("output")
    # read csv
    rd_csv = pd.read_csv("data/data_csvs/data.csv", names=column_names)
    # Dimensions of the CSV file
    # print("shape:", rd_csv.shape)
    X_train, X_test, y_train, y_test = train_test_split(
        # input: columns 0-39, output: column 40
        rd_csv[column_names[0:40]],
        rd_csv[column_names[40]],
        # 25% for testing, 75% for training
        test_size=0.25,
        random_state=33)
    return X_train, X_test, y_train, y_test

path_models = "data/data_models/"

# LR, logistic regression classification (linear model)
def model_LR():
    # get data
    X_train_LR, X_test_LR, y_train_LR, y_test_LR = pre_data()
    # Data preprocessing: standardize so that each feature dimension has variance 1 and mean 0,
    # so the prediction is not dominated by features with overly large values
    ss_LR = StandardScaler()
    X_train_LR = ss_LR.fit_transform(X_train_LR)
    X_test_LR = ss_LR.transform(X_test_LR)
    # Initialize LogisticRegression
    LR = LogisticRegression()
    # Call fit() of LogisticRegression to train the model parameters
    LR.fit(X_train_LR, y_train_LR)
    # save LR model
    joblib.dump(LR, path_models + "model_LR.m")
    # Scoring function
    score_LR = LR.score(X_test_LR, y_test_LR)
    print("The accuracy of LR:", score_LR)
    # print(type(ss_LR))
    return ss_LR

# model_LR()

# MLPC, multi-layer perceptron classifier (neural network)
def model_MLPC():
    # get data
    X_train_MLPC, X_test_MLPC, y_train_MLPC, y_test_MLPC = pre_data()
    # Data preprocessing
    ss_MLPC = StandardScaler()
    X_train_MLPC = ss_MLPC.fit_transform(X_train_MLPC)
    X_test_MLPC = ss_MLPC.transform(X_test_MLPC)
    # Initialize MLPC
    MLPC = MLPClassifier(hidden_layer_sizes=(13, 13, 13), max_iter=500)
    # Call fit() of MLPC to train the model parameters
    MLPC.fit(X_train_MLPC, y_train_MLPC)
    # save MLPC model
    joblib.dump(MLPC, path_models + "model_MLPC.m")
    # Scoring function
    score_MLPC = MLPC.score(X_test_MLPC, y_test_MLPC)
    print("The accuracy of MLPC:", score_MLPC)
    return ss_MLPC

# model_MLPC()

# LSVC, linear support vector classifier
def model_LSVC():
    # get data
    X_train_LSVC, X_test_LSVC, y_train_LSVC, y_test_LSVC = pre_data()
    # Data preprocessing
    ss_LSVC = StandardScaler()
    X_train_LSVC = ss_LSVC.fit_transform(X_train_LSVC)
    X_test_LSVC = ss_LSVC.transform(X_test_LSVC)
    # Initialize LSVC
    LSVC = LinearSVC()
    # Call fit() of LSVC to train the model parameters
    LSVC.fit(X_train_LSVC, y_train_LSVC)
    # save LSVC model
    joblib.dump(LSVC, path_models + "model_LSVC.m")
    # Scoring function
    score_LSVC = LSVC.score(X_test_LSVC, y_test_LSVC)
    print("The accuracy of LSVC:", score_LSVC)
    return ss_LSVC

# model_LSVC()

# SGDC, stochastic gradient descent classifier (linear model)
def model_SGDC():
    # get data
    X_train_SGDC, X_test_SGDC, y_train_SGDC, y_test_SGDC = pre_data()
    # Data preprocessing
    ss_SGDC = StandardScaler()
    X_train_SGDC = ss_SGDC.fit_transform(X_train_SGDC)
    X_test_SGDC = ss_SGDC.transform(X_test_SGDC)
    # Initialize SGDC
    SGDC = SGDClassifier(max_iter=5)
    # Call fit() of SGDC to train the model parameters
    SGDC.fit(X_train_SGDC, y_train_SGDC)
    # save SGDC model
    joblib.dump(SGDC, path_models + "model_SGDC.m")
    # Scoring function
    score_SGDC = SGDC.score(X_test_SGDC, y_test_SGDC)
    print("The accuracy of SGDC:", score_SGDC)
    return ss_SGDC

# model_SGDC()
```
Model test
Full code
```python
# use the saved models
import joblib   # (sklearn.externals.joblib has been removed in newer scikit-learn versions)
from get_features import get_features
import ML_ways_sklearn
from cv2 import cv2

# path of test img
path_test_img = "data/data_imgs/test_imgs/sunxiaoc.jpg"

# Extract the 40-dimensional feature of a single image
positions_lip_test = get_features(path_test_img)

# path of models
path_models = "data/data_models/"

print("The result of " + path_test_img + ":")
print('\n')

# ######### LR ###########
LR = joblib.load(path_models + "model_LR.m")
ss_LR = ML_ways_sklearn.model_LR()
X_test_LR = ss_LR.transform([positions_lip_test])
y_predict_LR = str(LR.predict(X_test_LR)[0]).replace('0', "no smile").replace('1', "with smile")
print("LR:", y_predict_LR)

# ######### LSVC ###########
LSVC = joblib.load(path_models + "model_LSVC.m")
ss_LSVC = ML_ways_sklearn.model_LSVC()
X_test_LSVC = ss_LSVC.transform([positions_lip_test])
y_predict_LSVC = str(LSVC.predict(X_test_LSVC)[0]).replace('0', "no smile").replace('1', "with smile")
print("LSVC:", y_predict_LSVC)

# ######### MLPC ###########
MLPC = joblib.load(path_models + "model_MLPC.m")
ss_MLPC = ML_ways_sklearn.model_MLPC()
X_test_MLPC = ss_MLPC.transform([positions_lip_test])
y_predict_MLPC = str(MLPC.predict(X_test_MLPC)[0]).replace('0', "no smile").replace('1', "with smile")
print("MLPC:", y_predict_MLPC)

# ######### SGDC ###########
SGDC = joblib.load(path_models + "model_SGDC.m")
ss_SGDC = ML_ways_sklearn.model_SGDC()
X_test_SGDC = ss_SGDC.transform([positions_lip_test])
y_predict_SGDC = str(SGDC.predict(X_test_SGDC)[0]).replace('0', "no smile").replace('1', "with smile")
print("SGDC:", y_predict_SGDC)

img_test = cv2.imread(path_test_img)
img_height = int(img_test.shape[0])
img_width = int(img_test.shape[1])

# show the results on the image
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img_test, "LR: " + y_predict_LR, (int(img_height/10), int(img_width/10)), font, 0.5, (84, 255, 159), 1, cv2.LINE_AA)
cv2.putText(img_test, "LSVC: " + y_predict_LSVC, (int(img_height/10), int(img_width/10*2)), font, 0.5, (84, 255, 159), 1, cv2.LINE_AA)
cv2.putText(img_test, "MLPC: " + y_predict_MLPC, (int(img_height/10), int(img_width/10)*3), font, 0.5, (84, 255, 159), 1, cv2.LINE_AA)
cv2.putText(img_test, "SGDC: " + y_predict_SGDC, (int(img_height/10), int(img_width/10)*4), font, 0.5, (84, 255, 159), 1, cv2.LINE_AA)

cv2.namedWindow("img", 2)
cv2.imshow("img", img_test)
cv2.waitKey(0)
```
Show mouth feature points
Full code
```python
# Show mouth feature points
# Draw the lip landmark positions on a test image
import dlib                 # Face recognition library Dlib
from cv2 import cv2         # Image processing library OpenCV
from get_features import get_features   # returns the positions of the feature points

path_test_img = "data/data_imgs/test_imgs/test1.jpg"

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

# Get the positions of the lip feature points
positions_lip = get_features(path_test_img)

img_rd = cv2.imread(path_test_img)

# Draw the lip points
for i in range(0, len(positions_lip), 2):
    print(positions_lip[i], positions_lip[i+1])
    cv2.circle(img_rd, tuple([positions_lip[i], positions_lip[i+1]]), radius=1, color=(0, 255, 0))

cv2.namedWindow("img_read", 2)
cv2.imshow("img_read", img_rd)
cv2.waitKey(0)
```
Video detection
Full code
```python
# use the saved models
import joblib   # (sklearn.externals.joblib has been removed in newer scikit-learn versions)
import ML_ways_sklearn
import dlib                 # Face processing library Dlib
import numpy as np          # Data processing library numpy
from cv2 import cv2         # Image processing library OpenCV

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

# OpenCV calls the camera
cap = cv2.VideoCapture(0)
# Set video parameters
cap.set(3, 480)

def get_features(img_rd):
    # Input:  img_rd: camera frame
    # Output: positions_lip_arr: feature points 49 to 68 (20 points, 40 values in all)
    # Take grayscale
    img_gray = cv2.cvtColor(img_rd, cv2.COLOR_RGB2GRAY)
    # Calculate the 68 landmark coordinates
    positions_68_arr = []
    faces = detector(img_gray, 0)
    landmarks = np.matrix([[p.x, p.y] for p in predictor(img_rd, faces[0]).parts()])
    for idx, point in enumerate(landmarks):
        # Coordinates of the 68 points
        pos = (point[0, 0], point[0, 1])
        positions_68_arr.append(pos)
    positions_lip_arr = []
    # Points 49-68 (indices 48-67) are the lip landmarks
    for i in range(48, 68):
        positions_lip_arr.append(positions_68_arr[i][0])
        positions_lip_arr.append(positions_68_arr[i][1])
    return positions_lip_arr

while cap.isOpened():
    # 480 height * 640 width
    flag, img_rd = cap.read()
    kk = cv2.waitKey(1)
    img_gray = cv2.cvtColor(img_rd, cv2.COLOR_RGB2GRAY)
    # faces
    faces = detector(img_gray, 0)
    # Face detected
    if len(faces) != 0:
        # Extract the 40-dimensional feature of the current frame
        positions_lip_test = get_features(img_rd)
        # path of models
        path_models = "data/data_models/"
        # ######### LR ###########
        LR = joblib.load(path_models + "model_LR.m")
        ss_LR = ML_ways_sklearn.model_LR()
        X_test_LR = ss_LR.transform([positions_lip_test])
        y_predict_LR = str(LR.predict(X_test_LR)[0]).replace('0', "no smile").replace('1', "with smile")
        print("LR:", y_predict_LR)
        # ######### LSVC ###########
        LSVC = joblib.load(path_models + "model_LSVC.m")
        ss_LSVC = ML_ways_sklearn.model_LSVC()
        X_test_LSVC = ss_LSVC.transform([positions_lip_test])
        y_predict_LSVC = str(LSVC.predict(X_test_LSVC)[0]).replace('0', "no smile").replace('1', "with smile")
        print("LSVC:", y_predict_LSVC)
        # ######### MLPC ###########
        MLPC = joblib.load(path_models + "model_MLPC.m")
        ss_MLPC = ML_ways_sklearn.model_MLPC()
        X_test_MLPC = ss_MLPC.transform([positions_lip_test])
        y_predict_MLPC = str(MLPC.predict(X_test_MLPC)[0]).replace('0', "no smile").replace('1', "with smile")
        print("MLPC:", y_predict_MLPC)
        # ######### SGDC ###########
        SGDC = joblib.load(path_models + "model_SGDC.m")
        ss_SGDC = ML_ways_sklearn.model_SGDC()
        X_test_SGDC = ss_SGDC.transform([positions_lip_test])
        y_predict_SGDC = str(SGDC.predict(X_test_SGDC)[0]).replace('0', "no smile").replace('1', "with smile")
        print("SGDC:", y_predict_SGDC)
        print('\n')
    # Press the 'q' key to exit
    if kk == ord('q'):
        break
    # Window display
    # cv2.namedWindow("camera", 0)   # Adjustable camera window size if required
    cv2.imshow("camera", img_rd)

# Release the camera
cap.release()
# Delete the established windows
cv2.destroyAllWindows()
```
Install TensorFlow (python3.7)
## Creating an Anaconda virtual environment (Python 3.6)
conda create -n `Your name` python=3.6
Enter the virtual environment
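Assuming the environment name chosen above, the environment is typically activated with:

conda activate `Your name`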
Install TensorFlow in the virtual environment
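Inside the activated environment, TensorFlow can be installed with pip; the original does not state a version, so pin one compatible with the standalone Keras code used later if needed:

pip install tensorflow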
Install Keras
pip install keras
Partition dataset
mkdir.py
```python
import tensorflow as tf
import os, shutil

# The path to the directory where the original dataset was uncompressed
original_dataset_dir = 'D:\\Desktop\\Face_recognition\\tensorflow\\genki4k'
# The directory where we will store our smaller dataset
base_dir = 'D:\\Desktop\\Face_recognition\\tensorflow\\smile_data'
os.mkdir(base_dir)

# Directories for our training, validation and test splits
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)

# Directory with our training smile pictures
train_smile_dir = os.path.join(train_dir, 'smile')
os.mkdir(train_smile_dir)
# Directory with our training unsmile pictures
train_unsmile_dir = os.path.join(train_dir, 'unsmile')
os.mkdir(train_unsmile_dir)

# Directory with our validation smile pictures
validation_smile_dir = os.path.join(validation_dir, 'smile')
os.mkdir(validation_smile_dir)
# Directory with our validation unsmile pictures
validation_unsmile_dir = os.path.join(validation_dir, 'unsmile')
os.mkdir(validation_unsmile_dir)

# Directory with our test smile pictures
test_smile_dir = os.path.join(test_dir, 'smile')
os.mkdir(test_smile_dir)
# Directory with our test unsmile pictures
test_unsmile_dir = os.path.join(test_dir, 'unsmile')
os.mkdir(test_unsmile_dir)
```
Operation results:
Put the smiling and non-smiling face images into the corresponding folders (one possible way to do this is sketched below).
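The original does not show this step as code; the sketch below is one way to do it, assuming the GENKI-4K images sit in a `files` folder and that `labels.txt` lists one smile label (1 or 0) per image in the same order. Adjust the names and paths to your copy of the dataset.

```python
import os, shutil

# Hypothetical paths/names: adjust to where GENKI-4K was actually unpacked
original_dataset_dir = 'D:\\Desktop\\Face_recognition\\tensorflow\\genki4k'
images_dir = os.path.join(original_dataset_dir, 'files')
labels_path = os.path.join(original_dataset_dir, 'labels.txt')
base_dir = 'D:\\Desktop\\Face_recognition\\tensorflow\\smile_data'

# Read one label per line; the first field is assumed to be the smile label (1/0)
with open(labels_path) as f:
    labels = [int(line.split()[0]) for line in f if line.strip()]

images = sorted(os.listdir(images_dir))

# Roughly 60/20/20 split into train/validation/test, keeping smile/unsmile classes separate
for idx, (name, label) in enumerate(zip(images, labels)):
    subset = 'train' if idx % 5 < 3 else ('validation' if idx % 5 == 3 else 'test')
    cls = 'smile' if label == 1 else 'unsmile'
    dst_dir = os.path.join(base_dir, subset, cls)
    shutil.copyfile(os.path.join(images_dir, name), os.path.join(dst_dir, name))
```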
Check the number of pictures:
lenpic.py
```python
import keras
import os, shutil

train_smile_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\train\\smile\\"
train_umsmile_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\train\\unsmile\\"
test_smile_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\test\\smile\\"
test_umsmile_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\test\\unsmile\\"
validation_smile_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\validation\\smile\\"
validation_unsmile_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\validation\\unsmile\\"
train_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\train\\"
test_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\test\\"
validation_dir = "D:\\Desktop\\Face_recognition\\tensorflow\\smile_data\\validation\\"

print('total training smile images:', len(os.listdir(train_smile_dir)))
print('total training unsmile images:', len(os.listdir(train_umsmile_dir)))
print('total testing smile images:', len(os.listdir(test_smile_dir)))
print('total testing unsmile images:', len(os.listdir(test_umsmile_dir)))
print('total validation smile images:', len(os.listdir(validation_smile_dir)))
print('total validation unsmile images:', len(os.listdir(validation_unsmile_dir)))
```
Create model
Cmodel.py
```python
'''
Create model
'''
from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

print(model.summary())
```
Normalize the pictures:
normalize.py
```python
'''
Normalize the pictures
'''
from keras import optimizers
from Cmodel import model
from lenpic import train_dir, validation_dir, test_dir

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

from keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # Target directory
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=20,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch)
    break

print(train_generator.class_indices)
```
Model training
Tmodle.py
```python
'''
Model training
'''
from Cmodel import model
from normalize import train_generator, validation_generator

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=50)

# Save the trained model
model.save('D:\\Desktop\\Face_recognition\\tensorflow\\smileAndUnsmile_1.h5')

'''
Draw the accuracy and loss curves of the training and validation sets
'''
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
```
Data augmentation
endata.py
```python
'''
Data augmentation
'''
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

'''
View how a picture changes after data augmentation
'''
import matplotlib.pyplot as plt
from lenpic import train_smile_dir
import os
# This module contains image preprocessing utilities
from keras.preprocessing import image

fnames = [os.path.join(train_smile_dir, fname) for fname in os.listdir(train_smile_dir)]
img_path = fnames[8]
img = image.load_img(img_path, target_size=(150, 150))

x = image.img_to_array(img)
x = x.reshape((1,) + x.shape)

i = 0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break
plt.show()
```
Create network
Cnet.py
```python
'''
Create network
'''
from keras import models
from keras import layers
from keras import optimizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
```
Train the model again
T2modle.py
```python
'''
Retrain the model (with data augmentation)
'''
# Normalization
from keras.preprocessing.image import ImageDataGenerator
from lenpic import train_dir, validation_dir
from Cnet import model

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=32,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,            # Number of training epochs
    validation_data=validation_generator,
    validation_steps=50)

# Save the trained model
model.save('D:\\Desktop\\Face_recognition\\tensorflow\\smileAndUnsmile_2.h5')
```
Results after training for 50 epochs:
Results after training for 100 epochs:
Real-time video detection using the model
```python
'''
Detect faces in a video or from the camera
'''
from cv2 import cv2
from keras.preprocessing import image
from keras.models import load_model
import numpy as np
import dlib
from PIL import Image

model = load_model('D:\\Desktop\\Face_recognition\\tensorflow\\smileAndUnsmile_2.h5')
detector = dlib.get_frontal_face_detector()
video = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX

def rec(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dets = detector(gray, 1)
    if dets is not None:
        for face in dets:
            left = face.left()
            top = face.top()
            right = face.right()
            bottom = face.bottom()
            cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
            img1 = cv2.resize(img[top:bottom, left:right], dsize=(150, 150))
            img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
            img1 = np.array(img1) / 255.
            img_tensor = img1.reshape(-1, 150, 150, 3)
            prediction = model.predict(img_tensor)
            if prediction[0][0] > 0.5:
                result = 'unsmile'
            else:
                result = 'smile'
            cv2.putText(img, result, (left, top), font, 2, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow('Video', img)

while video.isOpened():
    res, img_rd = video.read()
    if not res:
        break
    rec(img_rd)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video.release()
cv2.destroyAllWindows()
```
Mask detection
Partition dataset
mkdir.py
```python
import os

# The path to the directory where the original dataset was uncompressed
# The directory where we will store our smaller dataset
base_dir = 'D:\\Desktop\\Face_recognition\\Mask\\mask_data'
os.mkdir(base_dir)

# Directories for our training, validation and test splits
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)

# Directory with our training mask pictures
train_mask_dir = os.path.join(train_dir, 'have_mask')
os.mkdir(train_mask_dir)
# Directory with our training no_mask pictures
train_no_mask_dir = os.path.join(train_dir, 'no_mask')
os.mkdir(train_no_mask_dir)

# Directory with our validation mask pictures
validation_mask_dir = os.path.join(validation_dir, 'have_mask')
os.mkdir(validation_mask_dir)
# Directory with our validation no_mask pictures
validation_no_mask_dir = os.path.join(validation_dir, 'no_mask')
os.mkdir(validation_no_mask_dir)

# Directory with our test mask pictures
test_mask_dir = os.path.join(test_dir, 'have_mask')
os.mkdir(test_mask_dir)
# Directory with our test no_mask pictures
test_no_mask_dir = os.path.join(test_dir, 'no_mask')
os.mkdir(test_no_mask_dir)
```
Add mask pictures to the corresponding folder
Download mask dataset
Download address
Since the dataset is split across multiple subfolders, the pictures need to be merged into one folder:
move.py
```python
import os

outer_path = 'D:\\Desktop\\Face_recognition\\Mask\\have_mask\\RWMFD_part_1'
folderlist = os.listdir(outer_path)   # Enumerate the subfolders
print(folderlist)

i = 0
for folder in folderlist:
    inner_path = os.path.join(outer_path, folder)
    total_num_folder = len(folderlist)          # Total number of folders
    print('total have %d folders' % (total_num_folder))
    filelist = os.listdir(inner_path)           # List the pictures in this folder
    for item in filelist:
        total_num_file = len(filelist)          # Number of pictures in this folder
        if item.endswith('.jpg'):
            # Path of the original picture
            src = os.path.join(os.path.abspath(inner_path), item)
            # Path of the renamed picture (change the prefix / name pattern as needed)
            dst = os.path.join(os.path.abspath(inner_path),
                               str('D:\\Desktop\\Face_recognition\\Mask\\have_mask\\RWMFD_part_1\\MASK') + '_' + str(i) + '.jpg')
            try:
                os.rename(src, dst)
                print('converting %s to %s ...' % (src, dst))
                i += 1
            except:
                continue

print('total %d to rename & converted %d jpgs' % (total_num_file, i))
```
View number of pictures
lenpic.py
```python
import keras
import os, shutil

train_mask_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\train\\have_mask\\"
train_ummask_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\train\\no_mask\\"
test_mask_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\test\\have_mask\\"
test_ummask_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\test\\no_mask\\"
validation_mask_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\validation\\have_mask\\"
validation_unmask_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\validation\\no_mask\\"
train_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\train\\"
test_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\test\\"
validation_dir = "D:\\Desktop\\Face_recognition\\Mask\\mask_data\\validation\\"

print('total training mask images:', len(os.listdir(train_mask_dir)))
print('total training unmask images:', len(os.listdir(train_ummask_dir)))
print('total testing mask images:', len(os.listdir(test_mask_dir)))
print('total testing unmask images:', len(os.listdir(test_ummask_dir)))
print('total validation mask images:', len(os.listdir(validation_mask_dir)))
print('total validation unmask images:', len(os.listdir(validation_unmask_dir)))
```
Data augmentation
endata.py
```python
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

'''
View how a picture changes after data augmentation
'''
import matplotlib.pyplot as plt
from lenpic import train_mask_dir
import os
# This module contains image preprocessing utilities
from keras.preprocessing import image

fnames = [os.path.join(train_mask_dir, fname) for fname in os.listdir(train_mask_dir)]
img_path = fnames[8]
img = image.load_img(img_path, target_size=(150, 150))

x = image.img_to_array(img)
x = x.reshape((1,) + x.shape)

i = 0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break
plt.show()
```
Create network
Cnet.py
```python
'''
Create network
'''
from keras import models
from keras import layers
from keras import optimizers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
```
Model training, plus plots of the accuracy and loss of the training and validation sets
Tmodle.py
```python
# Normalization
from keras.preprocessing.image import ImageDataGenerator
from lenpic import train_dir, validation_dir
from Cnet import model

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=32,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,            # Number of training epochs
    validation_data=validation_generator,
    validation_steps=50)

# Save the trained model
model.save('D:\\Desktop\\Face_recognition\\Mask\\maskAndUnmask_1.h5')

'''
Draw the accuracy and loss curves of the training and validation sets
'''
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
```
Camera detection
camera.py
```python
'''
Detect faces in a video or from the camera
'''
from cv2 import cv2
from keras.preprocessing import image
from keras.models import load_model
import numpy as np
import dlib
from PIL import Image

model = load_model('D:\\Desktop\\Face_recognition\\Mask\\maskAndUnmask_1.h5')
detector = dlib.get_frontal_face_detector()
video = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_SIMPLEX

def rec(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dets = detector(gray, 1)
    if dets is not None:
        for face in dets:
            left = face.left()
            top = face.top()
            right = face.right()
            bottom = face.bottom()
            cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
            img1 = cv2.resize(img[top:bottom, left:right], dsize=(150, 150))
            img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
            img1 = np.array(img1) / 255.
            img_tensor = img1.reshape(-1, 150, 150, 3)
            prediction = model.predict(img_tensor)
            if prediction[0][0] > 0.5:
                result = 'unmask'
            else:
                result = 'mask'
            cv2.putText(img, result, (left, top), font, 2, (0, 255, 0), 2, cv2.LINE_AA)
        cv2.imshow('mask', img)

while video.isOpened():
    res, img_rd = video.read()
    if not res:
        break
    rec(img_rd)
    if cv2.waitKey(5) & 0xFF == ord('q'):
        break

video.release()
cv2.destroyAllWindows()
```
References
Artificial intelligence and machine learning
Dlib model face feature detection principle and demo
cungudafa's blog