Facial Beauty Evaluation with Computer Vision: Chinese Golden Ratio Face
Introduction
In the era of social media, the "golden ratio face" has become a popular topic. To satisfy people's curiosity about their own facial attractiveness, we started a facial beauty evaluation project based on computer vision. By analyzing facial features and proportions, we estimate how closely a face matches a traditional aesthetic standard and give users an intuitive beauty score.
Three-Proportions Five-Eyes
Our project is based on the principle of Three-Proportions Five-Eyes, a traditional Chinese standard of facial aesthetics.
- Three-Proportions: Describes the proportions of facial length. The face is divided vertically into three equal parts: from the hairline to the brow, from the brow to the base of the nose, and from the base of the nose to the chin. Each part accounts for 1/3 of the face's length.
- Five-Eyes: Describes the proportions of facial width. Using the width of one eye as the unit, the face is divided horizontally into five equal parts from the left hairline to the right hairline. The two eyes are two of these parts, the space between the eyes is a third, and the distance from the outer corner of each eye to the hairline on that side makes up the remaining two. Each part accounts for 1/5 of the total width.
The "Three-Proportions Five-Eyes" rule serves as a general standard for the proportions of facial length and width. The further a face deviates from these proportions, the further it is from this ideal face shape.
We measure these proportions and use a norm to summarize their deviation from the ideal as a single score for the overall aesthetic appeal of the face.
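As a rough illustration of the Three-Proportions idea, the sketch below scores how evenly a face is split into thirds. The function name `three_proportions_metric` and its four inputs are hypothetical: they stand for pixel y-coordinates of the hairline, brow, nose base, and chin, however those are obtained. This is not the scoring used in the scripts in this article, which only evaluate the Five-Eyes rule.

```python
import numpy as np

def three_proportions_metric(hairline_y, brow_y, nose_base_y, chin_y):
    """Return (segments_percent, score) for the three vertical face sections.

    All inputs are hypothetical pixel y-coordinates (y grows downward).
    A score of 0 means the three sections are exactly equal.
    """
    ys = np.array([hairline_y, brow_y, nose_base_y, chin_y], dtype=float)
    face_length = ys[-1] - ys[0]
    if face_length <= 0:
        raise ValueError("chin must lie below the hairline in image coordinates")
    # Length of each section as a percentage of the total face length.
    segments = 100.0 * np.diff(ys) / face_length
    # L2 norm of the deviation from the ideal of one third each.
    score = float(np.linalg.norm(segments - 100.0 / 3.0))
    return segments, score

even_segments, even_score = three_proportions_metric(100, 200, 300, 400)
uneven_segments, uneven_score = three_proportions_metric(100, 220, 300, 400)
print(even_segments, even_score)
print(uneven_segments, uneven_score)
```

The larger the score, the further the face is from equal thirds.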
Application of Norms
Using MediaPipe to obtain the coordinates of the facial keypoints, we calculate the five horizontal distances of the Five-Eyes rule and use a norm to compare them with the ideal. A norm assigns a non-negative size to a vector or matrix, which gives a uniform way to compare quantities that have several components.
- For example, while it's easy to see that 1 is smaller than 2, comparing (3,5,3) and (6,1,2) isn't as straightforward. In terms of the 2-norm comparison: the square root of 43 is greater than the square root of 41, thus in the 2-norm comparison, (3,5,3) is larger. Regarding the infinity-norm comparison: 5 is smaller than 6, so in the infinity-norm comparison, (6,1,2) is larger.
- Matrix norms describe how strongly a matrix can change a vector. In the equation AX=B, the matrix A transforms X into B, and the norm of A bounds how much the size of X can grow under that transformation.
- Vector norms describe the magnitude of vectors in space. More generally, norms can describe the distance relationship between two quantities.
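The comparisons above can be checked with NumPy. This is a small sketch; the 2x2 matrix `A` and the vector `x` are arbitrary values chosen only to illustrate the bound that a matrix norm provides.

```python
import numpy as np

a = np.array([3, 5, 3])
b = np.array([6, 1, 2])

# 2-norm: sqrt(43) for a versus sqrt(41) for b, so a is larger.
a_l2, b_l2 = np.linalg.norm(a, 2), np.linalg.norm(b, 2)
# Infinity-norm: 5 for a versus 6 for b, so b is larger.
a_inf, b_inf = np.linalg.norm(a, np.inf), np.linalg.norm(b, np.inf)
print(a_l2 > b_l2, a_inf < b_inf)

# Matrix 2-norm (largest singular value) bounds how much A can stretch x.
A = np.array([[2.0, 0.0], [0.0, 0.5]])
x = np.array([1.0, 1.0])
stretch_bound = np.linalg.norm(A, 2) * np.linalg.norm(x)
stretched = np.linalg.norm(A @ x)
print(stretched <= stretch_bound)
```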
The general formula for vector norms is the L-p norm:
$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$$
The norms below are special or limiting cases of this formula.
L-0 norm: Counts the number of non-zero elements in a vector. Strictly speaking it is not a true norm, but the name is widely used.
L-1 norm: The sum of the absolute values of all elements in the vector. Used as a regularizer in optimization, it pushes uninformative coefficients to zero, which is why it is also known as the sparse rule operator.
L-2 norm: The square root of the sum of the squared elements, i.e. the Euclidean length. It is commonly used as a regularizer in optimization to reduce overfitting.
L-∞ norm: The maximum absolute value among the elements of a vector.
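To make the four cases concrete, here is a minimal sketch of the L-p definition, (sum of |x_i|^p)^(1/p), with the L-0 and L-∞ cases handled separately. The helper name `lp_norm` is an assumption for illustration; in practice `np.linalg.norm` already covers all four cases for vectors.

```python
import numpy as np

def lp_norm(x, p):
    """Hypothetical helper: L-p norm of a 1-D vector for p = 0, p >= 1, or p = inf."""
    x = np.abs(np.asarray(x, dtype=float))
    if p == 0:
        # Number of non-zero elements.
        return float(np.count_nonzero(x))
    if np.isinf(p):
        # Largest absolute value.
        return float(x.max())
    return float((x ** p).sum() ** (1.0 / p))

v = [3, 0, -4]
for p in (0, 1, 2, np.inf):
    # The hand-written value and NumPy's value should agree.
    print(p, lp_norm(v, p), np.linalg.norm(v, p))
```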
Image Beauty Test
We run the beauty test on a static image, using the MediaPipe library's face mesh model to obtain keypoint coordinates, calculate the beauty metric, and visualize the result.
import cv2 as cv
import mediapipe as mp
import numpy as np
import time
import matplotlib.pyplot as plt
def look_img(img):
    # OpenCV stores images as BGR; matplotlib expects RGB.
    img_RGB=cv.cvtColor(img,cv.COLOR_BGR2RGB)
    plt.imshow(img_RGB)
    plt.show()
mp_face_mesh=mp.solutions.face_mesh
# help(mp_face_mesh.FaceMesh)
model=mp_face_mesh.FaceMesh(
    static_image_mode=True,
    max_num_faces=40,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)
mp_drawing=mp.solutions.drawing_utils
# mp_drawing_styles=mp.solutions.drawing_styles
draw_spec=mp_drawing.DrawingSpec(thickness=2,circle_radius=1,color=[66,77,229])
img=cv.imread("face.jpg")
img_RGB=cv.cvtColor(img,cv.COLOR_BGR2RGB)
scaler=1
h,w=img.shape[0],img.shape[1]
r=10
results=model.process(img_RGB)
FL=results.multi_face_landmarks[0].landmark[234];
FL_X,FL_Y=int(FL.x*w),int(FL.y*h);FL_Color=(234,0,255)
img=cv.circle(img,(FL_X,FL_Y),r,FL_Color,-1)
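FT=results.multi_face_landmarks[0].landmark[10];
# Convert the normalized landmark to pixel coordinates before drawing.
FT_X,FT_Y=int(FT.x*w),int(FT.y*h);FT_Color=(231,141,181)
img=cv.circle(img,(FT_X,FT_Y),r,FT_Color,-1)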
FB=results.multi_face_landmarks[0].landmark[152];
FB_X,FB_Y=int(FB.x*w),int(FB.y*h);FB_Color=(231,141,181)
img=cv.circle(img,(FB_X,FB_Y),r,FB_Color,-1)
FR=results.multi_face_landmarks[0].landmark[454];
FR_X,FR_Y=int(FR.x*w),int(FR.y*h);FR_Color=(0,255,0)
img=cv.circle(img,(FR_X,FR_Y),r,FR_Color,-1)
ELL=results.multi_face_landmarks[0].landmark[33];
ELL_X,ELL_Y=int(ELL.x*w),int(ELL.y*h);ELL_Color=(0,255,0)
img=cv.circle(img,(ELL_X,ELL_Y),r,ELL_Color,-1)
ELR=results.multi_face_landmarks[0].landmark[133];
ELR_X,ELR_Y=int(ELR.x*w),int(ELR.y*h);ELR_Color=(0,255,0)
img=cv.circle(img,(ELR_X,ELR_Y),r,ELR_Color,-1)
ERL=results.multi_face_landmarks[0].landmark[362];
ERL_X,ERL_Y=int(ERL.x*w),int(ERL.y*h);ERL_Color=(233,255,128)
img=cv.circle(img,(ERL_X,ERL_Y),r,ERL_Color,-1)
ERR=results.multi_face_landmarks[0].landmark[263];
ERR_X,ERR_Y=int(ERR.x*w),int(ERR.y*h);ERR_Color=(23,255,128)
img=cv.circle(img,(ERR_X,ERR_Y),r,ERR_Color,-1)
Six_X=np.array([FL_X,ELL_X,ELR_X,ERL_X,ERR_X,FR_X])
Left_Right=FR_X-FL_X
Five_Distance=100*np.diff(Six_X)/Left_Right
Eye_Width_Mean=np.mean((Five_Distance[1],Five_Distance[3]))
Five_Eye_Diff=Five_Distance-Eye_Width_Mean
Five_Eye_Metrics=np.linalg.norm(Five_Eye_Diff)
cv.line(img,(FL_X,FT_Y),(FL_X,FB_Y),FL_Color,3)
cv.line(img,(ELL_X,FT_Y),(ELL_X,FB_Y),ELL_Color,3)
cv.line(img,(ELR_X,FT_Y),(ELR_X,FB_Y),ELR_Color,3)
cv.line(img,(ERL_X,FT_Y),(ERL_X,FB_Y),ERL_Color,3)
cv.line(img,(ERR_X,FT_Y),(ERR_X,FB_Y),ERR_Color,3)
cv.line(img,(FR_X,FT_Y),(FR_X,FB_Y),FR_Color,3)
cv.line(img,(FL_X,FT_Y),(FR_X,FT_Y),FT_Color,3)
cv.line(img,(FL_X,FB_Y),(FR_X,FB_Y),FB_Color,3)
img = cv.putText(img, 'Five Eye Metrics: {:.2f}'.format(Five_Eye_Metrics), (25, 50), cv.FONT_HERSHEY_SIMPLEX, 1,(218, 112, 214), 4, 6)
#img = cv.putText(img, 'Distance 1{:.2f}'.format(Five_Eye_Diff[0]), (25, 100), cv.FONT_HERSHEY_SIMPLEX, 1,(218, 112, 214), 5, 5)
#img = cv.putText(img, 'Distance 1{:.2f}'.format(Five_Eye_Diff[2]), (25, 150), cv.FONT_HERSHEY_SIMPLEX, 1,(218, 112, 214), 4, 4)
#img = cv.putText(img, 'Distance 1{:.2f}'.format(Five_Eye_Diff[4]), (25, 200), cv.FONT_HERSHEY_SIMPLEX,1,(218, 112, 214), 3, 4)
look_img(img)
cv.imwrite("yanzhi.jpg",img)
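To see what the Five-Eyes metric measures, the calculation can be isolated from the drawing code and run on plain numbers. The sketch below follows the same steps as the script (percent widths, mean of the two eye widths, L2 norm of the differences); the function name `five_eye_metric` and the sample coordinates are made up for illustration.

```python
import numpy as np

def five_eye_metric(six_x):
    """six_x: x-coordinates of left face edge, the four eye corners, right face edge."""
    six_x = np.asarray(six_x, dtype=float)
    face_width = six_x[-1] - six_x[0]
    if face_width <= 0:
        raise ValueError("x-coordinates must run from left to right")
    # Each of the five sections as a percentage of the face width.
    widths = 100.0 * np.diff(six_x) / face_width
    # The two eye widths (sections 2 and 4) define the reference unit.
    eye_mean = np.mean((widths[1], widths[3]))
    diff = widths - eye_mean
    # L2 norm of the deviations: 0 means five equal sections.
    return widths, diff, float(np.linalg.norm(diff))

ideal_widths, ideal_diff, ideal_score = five_eye_metric([0, 20, 40, 60, 80, 100])
wide_widths, wide_diff, wide_score = five_eye_metric([0, 30, 45, 65, 80, 100])
print(ideal_score)
print(wide_widths, wide_diff, wide_score)
```

A lower value means the five sections are closer to equal, so a smaller metric corresponds to a face nearer the Five-Eyes ideal.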
Display on Unihiker
Real-time Camera Test
We capture a real-time video stream from a camera and apply the same beauty evaluation algorithm to each frame, so the metric is updated live.
import cv2 as cv
import mediapipe as mp
import numpy as np
import time
import matplotlib.pyplot as plt
def look_img(img):
    # OpenCV stores images as BGR; matplotlib expects RGB.
    img_RGB=cv.cvtColor(img,cv.COLOR_BGR2RGB)
    plt.imshow(img_RGB)
    plt.show()
mp_face_mesh=mp.solutions.face_mesh
# help(mp_face_mesh.FaceMesh)
model=mp_face_mesh.FaceMesh(
    static_image_mode=False,
    max_num_faces=5,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)
mp_drawing=mp.solutions.drawing_utils
# mp_drawing_styles=mp.solutions.drawing_styles
draw_spec=mp_drawing.DrawingSpec(thickness=2,circle_radius=1,color=[66,77,229])
landmark_drawing_spec=mp_drawing.DrawingSpec(thickness=1,circle_radius=2,color=[66,77,229])
connection_drawing_spec=mp_drawing.DrawingSpec(thickness=2,circle_radius=1,color=[233,155,6])
def process_frame(img):
    start_time = time.time()
    scaler = 1
    h, w = img.shape[0], img.shape[1]
    img_RGB = cv.cvtColor(img, cv.COLOR_BGR2RGB)
    results = model.process(img_RGB)
    if results.multi_face_landmarks:
        # Only the first detected face is evaluated.
        # Left edge of the face
        FL = results.multi_face_landmarks[0].landmark[234]
        FL_X, FL_Y = int(FL.x * w), int(FL.y * h)
        FL_Color = (234, 0, 255)
        img = cv.circle(img, (FL_X, FL_Y), 5, FL_Color, -1)
        # Left eye, outer corner
        ELL = results.multi_face_landmarks[0].landmark[33]
        ELL_X, ELL_Y = int(ELL.x * w), int(ELL.y * h)
        ELL_Color = (0, 255, 0)
        img = cv.circle(img, (ELL_X, ELL_Y), 5, ELL_Color, -1)
        # Left eye, inner corner
        ELR = results.multi_face_landmarks[0].landmark[133]
        ELR_X, ELR_Y = int(ELR.x * w), int(ELR.y * h)
        ELR_Color = (0, 255, 0)
        img = cv.circle(img, (ELR_X, ELR_Y), 5, ELR_Color, -1)
        # Right eye, inner corner
        ERL = results.multi_face_landmarks[0].landmark[362]
        ERL_X, ERL_Y = int(ERL.x * w), int(ERL.y * h)
        ERL_Color = (233, 255, 128)
        img = cv.circle(img, (ERL_X, ERL_Y), 5, ERL_Color, -1)
        # Right eye, outer corner
        ERR = results.multi_face_landmarks[0].landmark[263]
        ERR_X, ERR_Y = int(ERR.x * w), int(ERR.y * h)
        ERR_Color = (23, 255, 128)
        img = cv.circle(img, (ERR_X, ERR_Y), 5, ERR_Color, -1)
        # Right edge of the face
        FR = results.multi_face_landmarks[0].landmark[454]
        FR_X, FR_Y = int(FR.x * w), int(FR.y * h)
        FR_Color = (0, 255, 0)
        img = cv.circle(img, (FR_X, FR_Y), 5, FR_Color, -1)
        # Top of the face
        FT = results.multi_face_landmarks[0].landmark[10]
        FT_X, FT_Y = int(FT.x * w), int(FT.y * h)
        FT_Color = (231, 141, 181)
        img = cv.circle(img, (FT_X, FT_Y), 5, FT_Color, -1)
        # Bottom of the face (chin)
        FB = results.multi_face_landmarks[0].landmark[152]
        FB_X, FB_Y = int(FB.x * w), int(FB.y * h)
        FB_Color = (231, 141, 181)
        img = cv.circle(img, (FB_X, FB_Y), 5, FB_Color, -1)
        # Five section widths as a percentage of the face width
        Six_X = np.array([FL_X, ELL_X, ELR_X, ERL_X, ERR_X, FR_X])
        Left_Right = FR_X - FL_X
        Five_Distance = 100 * np.diff(Six_X) / Left_Right
        # Deviation of each section from the mean eye width, summarized by the L2 norm
        Eye_Width_Mean = np.mean((Five_Distance[1], Five_Distance[3]))
        Five_Eye_Diff = Five_Distance - Eye_Width_Mean
        Five_Eye_Metrics = np.linalg.norm(Five_Eye_Diff)
        # Vertical lines at the six x positions, horizontal lines at top and bottom
        cv.line(img, (FL_X, FT_Y), (FL_X, FB_Y), FL_Color, 3)
        cv.line(img, (ELL_X, FT_Y), (ELL_X, FB_Y), ELL_Color, 3)
        cv.line(img, (ELR_X, FT_Y), (ELR_X, FB_Y), ELR_Color, 3)
        cv.line(img, (ERL_X, FT_Y), (ERL_X, FB_Y), ERL_Color, 3)
        cv.line(img, (ERR_X, FT_Y), (ERR_X, FB_Y), ERR_Color, 3)
        cv.line(img, (FR_X, FT_Y), (FR_X, FB_Y), FR_Color, 3)
        cv.line(img, (FL_X, FT_Y), (FR_X, FT_Y), FT_Color, 3)
        cv.line(img, (FL_X, FB_Y), (FR_X, FB_Y), FB_Color, 3)
        img = cv.putText(img, 'Five Eye Metrics: {:.2f}'.format(Five_Eye_Metrics), (25, 50), cv.FONT_HERSHEY_SIMPLEX,
                         1,
                         (218, 112, 214), 2, 6)
        # Deviations of the two outer sections and the section between the eyes
        img = cv.putText(img, 'Distance 1: {:.2f}'.format(Five_Eye_Diff[0]), (25, 100), cv.FONT_HERSHEY_SIMPLEX, 1,
                         (218, 112, 214), 2, 5)
        img = cv.putText(img, 'Distance 3: {:.2f}'.format(Five_Eye_Diff[2]), (25, 150), cv.FONT_HERSHEY_SIMPLEX, 1,
                         (218, 112, 214), 2, 4)
        img = cv.putText(img, 'Distance 5: {:.2f}'.format(Five_Eye_Diff[4]), (25, 200), cv.FONT_HERSHEY_SIMPLEX, 1,
                         (218, 112, 214), 2, 4)
    else:
        img = cv.putText(img, 'NO FACE DETECTED', (25, 50), cv.FONT_HERSHEY_SIMPLEX, 1.25,
                         (218, 112, 214), 1, 8)
    end_time = time.time()
    # Guard against a zero interval on very fast frames.
    FPS = 1 / max(end_time - start_time, 1e-6)
    img = cv.putText(img, 'FPS' + str(int(FPS)), (25 * scaler, 300 * scaler), cv.FONT_HERSHEY_SIMPLEX,
                     1.25 * scaler, (0, 0, 255), 1, 8)
    return img
cap=cv.VideoCapture(0) # Opens the default camera; a separate cap.open(0) call is not needed.
cap.set(cv.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 240)
cap.set(cv.CAP_PROP_BUFFERSIZE, 1)
cv.namedWindow('my_window',cv.WINDOW_NORMAL) # Create a resizable window first.
cv.setWindowProperty('my_window', cv.WND_PROP_FULLSCREEN, cv.WINDOW_FULLSCREEN) # Then switch the window to full screen.
while cap.isOpened():
    success,frame=cap.read()
    # Stop if no frame could be read; process_frame would fail on an empty frame.
    if not success:
        print('ERROR')
        break
    frame=process_frame(frame)
    cv.imshow('my_window',frame)
    # Press q to quit.
    if cv.waitKey(1) &0xff==ord('q'):
        break
cap.release()
cv.destroyAllWindows()
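Both scripts repeat the same three lines for each keypoint and only look at the first detected face. One possible refactoring, shown here as a sketch and not as the author's code, keeps the landmark indices used above in a dictionary and converts them to pixel coordinates in a loop. The names `KEYPOINTS`, `keypoints_to_pixels`, and `all_faces_keypoints` are hypothetical, and the fake `results` object only mimics the attributes the scripts read from MediaPipe (`multi_face_landmarks`, `landmark`, `x`, `y`) so the example runs without a camera or model.

```python
from types import SimpleNamespace

# Face mesh landmark indices taken from the scripts in this article.
KEYPOINTS = {"FL": 234, "ELL": 33, "ELR": 133, "ERL": 362,
             "ERR": 263, "FR": 454, "FT": 10, "FB": 152}

def keypoints_to_pixels(face_landmarks, w, h, indices=KEYPOINTS):
    """Convert normalized landmarks of one face to integer pixel coordinates."""
    points = {}
    for name, idx in indices.items():
        lm = face_landmarks.landmark[idx]
        points[name] = (int(lm.x * w), int(lm.y * h))
    return points

def all_faces_keypoints(results, w, h):
    """Return one keypoint dictionary per detected face (empty list if none)."""
    faces = results.multi_face_landmarks or []
    return [keypoints_to_pixels(face, w, h) for face in faces]

# Stand-in for MediaPipe output: 468 landmarks with x = index / 512, y = 0.5.
fake_face = SimpleNamespace(
    landmark=[SimpleNamespace(x=i / 512, y=0.5) for i in range(468)])
fake_results = SimpleNamespace(multi_face_landmarks=[fake_face])
no_face_results = SimpleNamespace(multi_face_landmarks=None)

detected = all_faces_keypoints(fake_results, w=512, h=200)
missing = all_faces_keypoints(no_face_results, w=512, h=200)
print(detected[0]["FL"], detected[0]["FR"], missing)
```

Returning an empty list when `multi_face_landmarks` is `None` lets the caller handle the no-face case without indexing into a missing result.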
Conclusion
This project uses computer vision and the "Three-Proportions Five-Eyes" principle to calculate a beauty metric with norms, offering a novel approach to facial beauty evaluation. Whether the input is a static image or real-time video, the metric can be computed quickly from the detected facial keypoints.
Demo
This article was first published on https://mc.dfrobot.com.cn/thread-313268-1-1.html on May 21, 2022
Author: 云天