Facial Beauty Evaluation with Computer Vision: Chinese Golden Ratio Face

auroraAA Apr 09, 2024


Introduction

In today's era of social media, the golden ratio face has become a focal point of public attention. To satisfy people's curiosity about their own facial attractiveness, we built a facial beauty evaluation project based on computer vision. By analyzing facial features and proportions, it assesses how closely a face matches a traditional aesthetic standard and gives users an intuitive beauty score.

Source: https://www.sciencedirect.com/science/article/pii/S1808869418303161

HARDWARE LIST

1 UNIHIKER - IoT Python Single Board Computer with Touchscreen


1 USB Camera

Three-Proportions Five-Eyes

Our project is based on the principle of Three-Proportions Five-Eyes, a traditional Chinese standard of facial aesthetics.

- Three-Proportions: Refers to the proportion of facial length, dividing the face's length into three equal parts, from the hairline to the brow, from the brow to the base of the nose, and from the base of the nose to the chin, each accounting for 1/3 of the face's length.

- Five-Eyes: Refers to the proportion of facial width. The width of the face, from the left hairline to the right hairline, is divided into five equal parts, using the width of one eye as the unit. The two eyes account for two parts, the space between the eyes for one, and the distance from the outer corner of each eye to the hairline on that side for one each, so every part is 1/5 of the total width.

The "Three-Proportions Five-Eyes" rule serves as a general standard for the proportions of facial length and width. The further a face deviates from these proportions, the further it is from the ideal face shape under this standard.

We calculate these proportions and combine them with the concept of norms to evaluate the overall aesthetic appeal of the face.
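As a rough illustration of how deviation from the ideal could be quantified, the sketch below takes the boundaries of consecutive facial sections along one axis and reports how far each section's share departs from an equal split. The function name `deviation_from_equal_split` and the sample coordinates are hypothetical, chosen only for illustration; they are not taken from the project code.

```python
def deviation_from_equal_split(boundaries):
    """Return each section's share of the total span minus the ideal equal share.

    boundaries: ascending coordinates that delimit consecutive sections,
    e.g. hairline, brow, nose base, chin for the Three-Proportions rule.
    """
    total = boundaries[-1] - boundaries[0]
    if total <= 0:
        raise ValueError("boundaries must span a positive length")
    n_sections = len(boundaries) - 1
    ideal = 1.0 / n_sections
    return [(b - a) / total - ideal for a, b in zip(boundaries, boundaries[1:])]


# Hypothetical y-coordinates in pixels: hairline, brow, nose base, chin
thirds = deviation_from_equal_split([100, 200, 290, 400])
print([round(d, 3) for d in thirds])
```

A value of zero means the section takes exactly its ideal share; negative values mark sections that are too short, positive values sections that are too long. The same function works for the Five-Eyes rule when given six x-coordinates.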

Application of Norms

Using MediaPipe to obtain the coordinates of facial keypoints, we calculate the five horizontal distances of the Five-Eyes rule and use a norm to condense their deviations into a single comparison value. Norms measure the size of vectors and matrices, which lets us compare quantities that have more than one component on a common scale.

- For example, while it's easy to see that 1 is smaller than 2, comparing (3,5,3) and (6,1,2) isn't as straightforward. Under the 2-norm, (3,5,3) has norm √43 and (6,1,2) has norm √41, so (3,5,3) is larger. Under the infinity-norm, the values are 5 and 6, so (6,1,2) is larger.

- Matrix norms describe how strongly a matrix can change what it acts on. In the equation AX = B, the matrix A transforms X into B, and the norm of A bounds how much larger B can be than X.

- Vector norms describe the magnitude of vectors in space. More generally, norms can describe the distance relationship between two quantities.
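The comparison of (3,5,3) and (6,1,2) above can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.norm`; the variable names are chosen here for illustration.

```python
import numpy as np

a = np.array([3, 5, 3])
b = np.array([6, 1, 2])

# 2-norm: square root of the sum of squared elements
norm2_a = np.linalg.norm(a)  # sqrt(9 + 25 + 9) = sqrt(43)
norm2_b = np.linalg.norm(b)  # sqrt(36 + 1 + 4) = sqrt(41)

# Infinity-norm: largest absolute element
norm_inf_a = np.linalg.norm(a, ord=np.inf)  # 5
norm_inf_b = np.linalg.norm(b, ord=np.inf)  # 6

print(norm2_a > norm2_b)        # True
print(norm_inf_a < norm_inf_b)  # True
```

The two norms rank the vectors in opposite order, which is why the choice of norm matters when it is used as a score.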

The general formula for vector norms is the L-p norm.
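For a vector $x = (x_1, \dots, x_n)$ and $p \ge 1$, the standard textbook definition reads as follows. The symbols $x_i$, $n$ and $p$ are notation assumed here for illustration.

```latex
\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}
```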

The norms below are special or limiting cases of this general formula.

L-0 norm: Counts the number of non-zero elements in a vector. Strictly speaking it is not a true norm, but the name is widely used.

L-1 norm: The sum of the absolute values of all elements in the vector. In optimization it is used as a penalty that drives uninformative coefficients to zero, which is why it is also known as the sparse rule operator.

L-2 norm: The square root of the sum of squared elements, i.e. the Euclidean length. In optimization it is used as a regularization term to reduce overfitting.

L-∞ norm: The maximum absolute value among the elements of a vector.
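The four norms listed above can be computed with NumPy's `np.linalg.norm` by changing its `ord` argument. This is a small sketch on an arbitrary example vector chosen for illustration.

```python
import numpy as np

x = np.array([3.0, 0.0, -4.0])

l0 = np.linalg.norm(x, ord=0)         # number of non-zero elements
l1 = np.linalg.norm(x, ord=1)         # sum of absolute values
l2 = np.linalg.norm(x)                # Euclidean length (default for vectors)
linf = np.linalg.norm(x, ord=np.inf)  # largest absolute value

print(l0, l1, l2, linf)  # 2.0 7.0 5.0 4.0
```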

Image Beauty Test

We first run the beauty test on a static image. The MediaPipe Face Mesh model detects the facial keypoints, and from their coordinates we calculate the beauty metric and draw it onto the image.

CODE


import cv2 as cv
import  mediapipe as mp
import numpy as np

import time
import  matplotlib.pyplot as plt

def look_img(img):
    img_RGB=cv.cvtColor(img,cv.COLOR_BGR2RGB)
    plt.imshow(img_RGB)
    plt.show()

mp_face_mesh=mp.solutions.face_mesh
# help(mp_face_mesh.FaceMesh)

model=mp_face_mesh.FaceMesh(
    static_image_mode=True,

    max_num_faces=40,
    min_detection_confidence=0.5, 
    min_tracking_confidence=0.5,
)


mp_drawing=mp.solutions.drawing_utils
# mp_drawing_styles=mp.solutions.drawing_styles
draw_spec=mp_drawing.DrawingSpec(thickness=2,circle_radius=1,color=[66,77,229])

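img=cv.imread("face.jpg")
# cv.imread returns None instead of raising when the file cannot be read
if img is None:
    raise FileNotFoundError("face.jpg could not be read")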



img_RGB=cv.cvtColor(img,cv.COLOR_BGR2RGB)
scaler=1
h,w=img.shape[0],img.shape[1]
r=10
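results=model.process(img_RGB)
# multi_face_landmarks is None when no face is found
if not results.multi_face_landmarks:
    raise RuntimeError("No face detected in face.jpg")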


FL=results.multi_face_landmarks[0].landmark[234];
FL_X,FL_Y=int(FL.x*w),int(FL.y*h);FL_Color=(234,0,255)
img=cv.circle(img,(FL_X,FL_Y),r,FL_Color,-1)



FT=results.multi_face_landmarks[0].landmark[10];
FT_X,FT_Y=int(FT.x*w),int(FT.y*h);FT_Color=(231,141,181)
img=cv.circle(img,(FT_X,FT_Y),r,FT_Color,-1)



FB=results.multi_face_landmarks[0].landmark[152];
FB_X,FB_Y=int(FB.x*w),int(FB.y*h);FB_Color=(231,141,181)
img=cv.circle(img,(FB_X,FB_Y),r,FB_Color,-1)


FR=results.multi_face_landmarks[0].landmark[454];
FR_X,FR_Y=int(FR.x*w),int(FR.y*h);FR_Color=(0,255,0)
img=cv.circle(img,(FR_X,FR_Y),r,FR_Color,-1)


ELL=results.multi_face_landmarks[0].landmark[33];
ELL_X,ELL_Y=int(ELL.x*w),int(ELL.y*h);ELL_Color=(0,255,0)
img=cv.circle(img,(ELL_X,ELL_Y),r,ELL_Color,-1)


ELR=results.multi_face_landmarks[0].landmark[133];
ELR_X,ELR_Y=int(ELR.x*w),int(ELR.y*h);ELR_Color=(0,255,0)
img=cv.circle(img,(ELR_X,ELR_Y),r,ELR_Color,-1)


ERL=results.multi_face_landmarks[0].landmark[362];
ERL_X,ERL_Y=int(ERL.x*w),int(ERL.y*h);ERL_Color=(233,255,128)
img=cv.circle(img,(ERL_X,ERL_Y),r,ERL_Color,-1)


ERR=results.multi_face_landmarks[0].landmark[263];
ERR_X,ERR_Y=int(ERR.x*w),int(ERR.y*h);ERR_Color=(23,255,128)
img=cv.circle(img,(ERR_X,ERR_Y),r,ERR_Color,-1)



Six_X=np.array([FL_X,ELL_X,ELR_X,ERL_X,ERR_X,FR_X])

Left_Right=FR_X-FL_X
Five_Distance=100*np.diff(Six_X)/Left_Right

Eye_Width_Mean=np.mean((Five_Distance[1],Five_Distance[3]))

Five_Eye_Diff=Five_Distance-Eye_Width_Mean

Five_Eye_Metrics=np.linalg.norm(Five_Eye_Diff)

cv.line(img,(FL_X,FT_Y),(FL_X,FB_Y),FL_Color,3)
cv.line(img,(ELL_X,FT_Y),(ELL_X,FB_Y),ELL_Color,3)
cv.line(img,(ELR_X,FT_Y),(ELR_X,FB_Y),ELR_Color,3)
cv.line(img,(ERL_X,FT_Y),(ERL_X,FB_Y),ERL_Color,3)
cv.line(img,(ERR_X,FT_Y),(ERR_X,FB_Y),ERR_Color,3)
cv.line(img,(FR_X,FT_Y),(FR_X,FB_Y),FR_Color,3)
cv.line(img,(FL_X,FT_Y),(FR_X,FT_Y),FT_Color,3)
cv.line(img,(FL_X,FB_Y),(FR_X,FB_Y),FB_Color,3)

scaler=1
img = cv.putText(img, 'Five Eye Metrics: {:.2f}'.format(Five_Eye_Metrics), (25, 50), cv.FONT_HERSHEY_SIMPLEX, 1,(218, 112, 214), 4, 6)
#img = cv.putText(img, 'Distance 1: {:.2f}'.format(Five_Eye_Diff[0]), (25, 100), cv.FONT_HERSHEY_SIMPLEX, 1,(218, 112, 214), 5, 5)
#img = cv.putText(img, 'Distance 3: {:.2f}'.format(Five_Eye_Diff[2]), (25, 150), cv.FONT_HERSHEY_SIMPLEX, 1,(218, 112, 214), 4, 4)
#img = cv.putText(img, 'Distance 5: {:.2f}'.format(Five_Eye_Diff[4]), (25, 200), cv.FONT_HERSHEY_SIMPLEX,1,(218, 112, 214), 3, 4)
look_img(img)
cv.imwrite("yanzhi.jpg",img)
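The Five-Eyes computation in the script above can be isolated into a small function, which makes it easier to check against known inputs. The sketch below mirrors the script's arithmetic; the function name `five_eye_metric` and the sample coordinates are hypothetical.

```python
import numpy as np


def five_eye_metric(six_x):
    """Mirror the Five-Eyes computation from the script above.

    six_x: x-coordinates of the left face edge, the outer and inner corners
    of the first eye, the inner and outer corners of the second eye, and
    the right face edge, in left-to-right order.
    """
    six_x = np.asarray(six_x, dtype=float)
    face_width = six_x[-1] - six_x[0]
    if face_width <= 0:
        raise ValueError("face width must be positive")
    # Five section widths as percentages of the face width
    five_distance = 100 * np.diff(six_x) / face_width
    # Reference unit: the mean width of the two eyes (sections 1 and 3)
    eye_width_mean = np.mean(five_distance[[1, 3]])
    diff = five_distance - eye_width_mean
    # The 2-norm condenses the five deviations into one score
    return float(np.linalg.norm(diff)), diff


# Hypothetical coordinates: five equal sections give a score of 0
ideal_score, _ = five_eye_metric([0, 20, 40, 60, 80, 100])
# Hypothetical coordinates with narrower outer and middle sections
skewed_score, skewed_diff = five_eye_metric([0, 15, 40, 60, 85, 100])
print(ideal_score, skewed_score)
```

A lower score means the five sections are closer to the mean eye width, so under this metric smaller is better.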

Display on Unihiker

Real-time Camera Test

We capture a live video stream from the camera and apply the same beauty evaluation algorithm to every frame, so the beauty metric is updated in real time.

CODE


import cv2 as cv
import  mediapipe as mp
import numpy as np

import time
import  matplotlib.pyplot as plt

def look_img(img):
    img_RGB=cv.cvtColor(img,cv.COLOR_BGR2RGB)
    plt.imshow(img_RGB)
    plt.show()

mp_face_mesh=mp.solutions.face_mesh
# help(mp_face_mesh.FaceMesh)

model=mp_face_mesh.FaceMesh(
    static_image_mode=False,
    
    max_num_faces=5,
    min_detection_confidence=0.5, 
    min_tracking_confidence=0.5,
)



mp_drawing=mp.solutions.drawing_utils
# mp_drawing_styles=mp.solutions.drawing_styles
draw_spec=mp_drawing.DrawingSpec(thickness=2,circle_radius=1,color=[66,77,229])
landmark_drawing_spec=mp_drawing.DrawingSpec(thickness=1,circle_radius=2,color=[66,77,229])
connection_drawing_spec=mp_drawing.DrawingSpec(thickness=2,circle_radius=1,color=[233,155,6])



def process_frame(img):
    start_time = time.time()
    scaler = 1
    h, w = img.shape[0], img.shape[1]
    img_RGB = cv.cvtColor(img, cv.COLOR_BGR2RGB)
    results = model.process(img_RGB)
    if results.multi_face_landmarks:
        # for face_landmarks in results.multi_face_landmarks: 
            FL = results.multi_face_landmarks[0].landmark[234];
            FL_X, FL_Y = int(FL.x * w), int(FL.y * h);
            FL_Color = (234, 0, 255)
            img = cv.circle(img, (FL_X, FL_Y), 5, FL_Color, -1)
    
            ELL = results.multi_face_landmarks[0].landmark[33];  
            ELL_X, ELL_Y = int(ELL.x * w), int(ELL.y * h);
            ELL_Color = (0, 255, 0)
            img = cv.circle(img, (ELL_X, ELL_Y), 5, ELL_Color, -1)
           
          
            ELR = results.multi_face_landmarks[0].landmark[133];  
            ELR_X, ELR_Y = int(ELR.x * w), int(ELR.y * h);
            ELR_Color = (0, 255, 0)
            img = cv.circle(img, (ELR_X, ELR_Y), 5, ELR_Color, -1)
          
        
            ERL = results.multi_face_landmarks[0].landmark[362];  
            ERL_X, ERL_Y = int(ERL.x * w), int(ERL.y * h);
            ERL_Color = (233, 255, 128)
            img = cv.circle(img, (ERL_X, ERL_Y), 5, ERL_Color, -1)
           
            
            ERR = results.multi_face_landmarks[0].landmark[263];  
            ERR_X, ERR_Y = int(ERR.x * w), int(ERR.y * h);
            ERR_Color = (23, 255, 128)
            img = cv.circle(img, (ERR_X, ERR_Y), 5, ERR_Color, -1)
                      
            FR = results.multi_face_landmarks[0].landmark[454];  
            FR_X, FR_Y = int(FR.x * w), int(FR.y * h);
            FR_Color = (0, 255, 0)
            img = cv.circle(img, (FR_X, FR_Y), 5, FR_Color, -1)
                                
            FT = results.multi_face_landmarks[0].landmark[10]; 
            FT_X, FT_Y = int(FT.x * w), int(FT.y * h);
            FT_Color = (231, 141, 181)
            img = cv.circle(img, (FT_X, FT_Y), 5, FT_Color, -1)
             
            FB = results.multi_face_landmarks[0].landmark[152];  
            FB_X, FB_Y = int(FB.x * w), int(FB.y * h);
            FB_Color = (231, 141, 181)
            img = cv.circle(img, (FB_X, FB_Y), 5, FB_Color, -1)
                        
            Six_X = np.array([FL_X, ELL_X, ELR_X, ERL_X, ERR_X, FR_X])
          
            Left_Right = FR_X - FL_X
            Five_Distance = 100 * np.diff(Six_X) / Left_Right
            
            Eye_Width_Mean = np.mean((Five_Distance[1], Five_Distance[3]))
            
            Five_Eye_Diff = Five_Distance - Eye_Width_Mean
            
            Five_Eye_Metrics = np.linalg.norm(Five_Eye_Diff)

            cv.line(img, (FL_X, FT_Y), (FL_X, FB_Y), FL_Color, 3)
            cv.line(img, (ELL_X, FT_Y), (ELL_X, FB_Y), ELL_Color, 3)
            cv.line(img, (ELR_X, FT_Y), (ELR_X, FB_Y), ELR_Color, 3)
            cv.line(img, (ERL_X, FT_Y), (ERL_X, FB_Y), ERL_Color, 3)
            cv.line(img, (ERR_X, FT_Y), (ERR_X, FB_Y), ERR_Color, 3)
            cv.line(img, (FR_X, FT_Y), (FR_X, FB_Y), FR_Color, 3)
            cv.line(img, (FL_X, FT_Y), (FR_X, FT_Y), FT_Color, 3)
            cv.line(img, (FL_X, FB_Y), (FR_X, FB_Y), FB_Color, 3)

            scaler = 1
            
            img = cv.putText(img, 'Five Eye Metrics: {:.2f}'.format(Five_Eye_Metrics), (25, 50), cv.FONT_HERSHEY_SIMPLEX,
                             1,
                             (218, 112, 214), 2, 6)
            # Deviation of the outer-left, middle and outer-right sections from the mean eye width
            img = cv.putText(img, 'Distance 1: {:.2f}'.format(Five_Eye_Diff[0]), (25, 100), cv.FONT_HERSHEY_SIMPLEX, 1,
                             (218, 112, 214), 2, 5)
            img = cv.putText(img, 'Distance 3: {:.2f}'.format(Five_Eye_Diff[2]), (25, 150), cv.FONT_HERSHEY_SIMPLEX, 1,
                             (218, 112, 214), 2, 4)
            img = cv.putText(img, 'Distance 5: {:.2f}'.format(Five_Eye_Diff[4]), (25, 200), cv.FONT_HERSHEY_SIMPLEX, 1,
                             (218, 112, 214), 2, 4)


    else:
        img = cv.putText(img, 'NO FACE DETECTED', (25, 50), cv.FONT_HERSHEY_SIMPLEX, 1.25,
                         (218, 112, 214), 1, 8)
    
    end_time = time.time()   
    FPS = 1 / (end_time - start_time)
    scaler = 1
    img = cv.putText(img, 'FPS' + str(int(FPS)), (25 * scaler, 300 * scaler), cv.FONT_HERSHEY_SIMPLEX,
                         1.25 * scaler, (0, 0, 255), 1, 8)
    return img

cap=cv.VideoCapture(0)


cap.set(cv.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 240)
cap.set(cv.CAP_PROP_BUFFERSIZE, 1)
cv.namedWindow('my_window',cv.WND_PROP_FULLSCREEN)    #Set the windows to be full screen.
cv.setWindowProperty('my_window', cv.WND_PROP_FULLSCREEN, cv.WINDOW_FULLSCREEN)    #Set the windows to be full screen.

cap.open(0)
while cap.isOpened():
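    success,frame=cap.read()
    # Stop if the camera did not deliver a frame; process_frame cannot handle None
    if not success:
        print('ERROR: failed to read a frame from the camera')
        break
    frame=process_frame(frame)    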
    cv.imshow('my_window',frame)
    if cv.waitKey(1) &0xff==ord('q'):
        break

cap.release()
cv.destroyAllWindows()
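Both scripts repeat the same conversion from normalized landmark coordinates to pixel positions for each keypoint. One way to reduce that repetition is sketched below. The names `KEYPOINTS` and `landmarks_to_pixels` are hypothetical, and the stand-in landmark objects only imitate the `.x` and `.y` attributes that the scripts read from `results.multi_face_landmarks[0].landmark`.

```python
from types import SimpleNamespace

# Landmark indices used in the scripts above, keyed by the same short names
KEYPOINTS = {
    "FL": 234, "FR": 454,    # left / right face edge
    "FT": 10, "FB": 152,     # top / bottom of the face
    "ELL": 33, "ELR": 133,   # eye on the image left: outer / inner corner
    "ERL": 362, "ERR": 263,  # eye on the image right: inner / outer corner
}


def landmarks_to_pixels(landmarks, width, height, keypoints=KEYPOINTS):
    """Convert normalized landmark coordinates to integer pixel positions."""
    return {
        name: (int(landmarks[idx].x * width), int(landmarks[idx].y * height))
        for name, idx in keypoints.items()
    }


# Stand-in landmarks for illustration only
fake = [SimpleNamespace(x=0.5, y=0.25) for _ in range(468)]
points = landmarks_to_pixels(fake, width=320, height=240)
print(points["FL"])  # (160, 60)
```

With a dictionary like `points`, the six x-coordinates for the Five-Eyes metric can be read out by name instead of through separate variables.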

Conclusion

This project uses computer vision, based on the "Three-Proportions Five-Eyes" principle, to calculate a beauty metric with norms, providing a simple method for facial beauty evaluation. Whether the input is a static image or a real-time video stream, the metric can be computed quickly from the detected facial keypoints.