Models

Facial Recognition

Facial recognition is based on the DeepFace library.

class src.models.face_recognition.FaceRecognition(thumbnail_list: Optional[list] = None, thumbnails_path: str = 'data/thumbnails/thumbnails', img_width: int = 500, encoder_name: str = 'Dlib', labels_path: str = 'data/embeddings/labels.pickle', embeddings_path: str = 'data/embeddings/embeddings.pickle')

Bases: object

Recognizes faces in images and videos

batch_recognize_images(unknown_imgs: list, recognizer_model=None, distance_threshold=0.6)

Recognize entities in batches of embeddings

Parameters

unknown_imgs (list) – List of embeddings.
recognizer_model (any model) – Model trained with embeddings to predict entities.
distance_threshold (float) – The distance threshold above which a recognition is marked as unknown.

Returns

List of detected entities.

Return type

detected_faces (list)

batch_represent(imgs: list)

Creates embeddings from images in batches

Parameters: imgs (list) – List of frames.
Returns: List of face embeddings.
Return type: embeddings

create_embeddings()

Creates and saves face embeddings and entity labels

Returns:
embeddings (list) – List of face embeddings.
labels (list) – List of entity names.

load_embeddings()

Loads already existing embeddings

Returns:
labels (list) – List of entity names.
embeddings (list) – List of face embeddings.
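The default paths end in .pickle, which suggests that embeddings and labels are stored with Python's pickle module. The following is a minimal sketch of such a save and load round trip under that assumption. The helper names save_pair and load_pair and the toy data are hypothetical and are not part of this module.

```python
import pickle
import tempfile
from pathlib import Path


def save_pair(embeddings, labels, embeddings_path, labels_path):
    """Hypothetical helper: write embeddings and labels to two pickle files."""
    with open(embeddings_path, "wb") as f:
        pickle.dump(embeddings, f)
    with open(labels_path, "wb") as f:
        pickle.dump(labels, f)


def load_pair(embeddings_path, labels_path):
    """Hypothetical helper: read both files back, labels first."""
    with open(labels_path, "rb") as f:
        labels = pickle.load(f)
    with open(embeddings_path, "rb") as f:
        embeddings = pickle.load(f)
    return labels, embeddings


# Round trip in a temporary directory so no real data files are touched.
with tempfile.TemporaryDirectory() as tmp:
    emb_path = Path(tmp) / "embeddings.pickle"
    lab_path = Path(tmp) / "labels.pickle"
    save_pair([[0.1, 0.2], [0.3, 0.4]], ["alice", "bob"], emb_path, lab_path)
    loaded_labels, loaded_embeddings = load_pair(emb_path, lab_path)
```

Keeping the two lists in the same order matters, because the entity name at position i is assumed to belong to the embedding at position i.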

recognize_image(unknown_img, recognizer_model=None, distance_threshold=0.6)

Recognize entities in an image

Parameters

unknown_img (image_path or image object) – The image to detect entities in.
recognizer_model (any model) – Model trained with embeddings to predict entities.
distance_threshold (float) – The distance threshold above which a recognition is marked as unknown.

Returns

List of detected entities.

Return type

detected_faces (list)
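To illustrate the role of distance_threshold, the following minimal sketch matches an unknown embedding against known ones and reports "unknown" when the closest match is too far away. It is not this class's implementation: the helper name nearest_entity, the Euclidean metric, the "unknown" label, and the toy 2-D vectors are assumptions made for illustration.

```python
import math


def nearest_entity(unknown, known_embeddings, labels, distance_threshold=0.6):
    """Hypothetical helper: label of the closest known embedding, or 'unknown'.

    Euclidean distance is assumed here; the real encoder may use another metric.
    """
    best_label, best_distance = "unknown", math.inf
    for embedding, label in zip(known_embeddings, labels):
        distance = math.dist(unknown, embedding)
        if distance < best_distance:
            best_label, best_distance = label, distance
    # A match that is farther away than the threshold is not trusted.
    if best_distance > distance_threshold:
        return "unknown"
    return best_label


known = [[0.0, 0.0], [1.0, 1.0]]
names = ["alice", "bob"]
near_result = nearest_entity([0.1, 0.0], known, names)
far_result = nearest_entity([5.0, 5.0], known, names)
```

A lower threshold makes the matcher stricter (more faces end up as unknown), while a higher threshold accepts more distant matches at the risk of false identifications.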

recognize_video(video_path: str, recognizer_model=None, distance_threshold=0.6, by='second')

Recognizes faces in a video at frame or second granularity

Parameters

video_path (str) – Path to the video.
recognizer_model (any model) – Model trained with embeddings to predict entities.
distance_threshold (float) – The distance threshold above which a recognition is marked as unknown.
by (str) – Recognize by ‘second’ or ‘frame’.

Returns

frame_faces_list (list) – List of recognized entities per frame/second.
detected_faces (list) – List of identical entities.
timestamps (float) – The timestamps corresponding to the detections.
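How per-frame detections are combined when by='second' is not described here. One plausible approach, sketched below purely as an assumption, is to merge the entities of all frames that fall into the same second. The function group_by_second and its fps argument are hypothetical and not part of this class.

```python
def group_by_second(frame_faces_list, fps):
    """Hypothetical helper: merge per-frame detections into one list per second."""
    per_second = {}
    for frame_index, faces in enumerate(frame_faces_list):
        second = int(frame_index // fps)
        # Collect every entity seen in any frame of this second.
        per_second.setdefault(second, set()).update(faces)
    seconds = sorted(per_second)
    merged = [sorted(per_second[s]) for s in seconds]
    timestamps = [float(s) for s in seconds]
    return merged, timestamps


# Four frames at 2 frames per second cover two seconds of video.
frames = [["alice"], ["alice", "bob"], [], ["bob"]]
faces_per_second, timestamps = group_by_second(frames, fps=2)
```

Grouping by second reduces the output size for long videos, while frame-level output keeps the exact frame in which each entity appears.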

represent(img, one_face=False, return_face_number=False)

Creates an embedding from an image

Parameters

img (img object | img_path) – The image to create the embedding for.
one_face (bool) – Whether only the largest face should be considered.
return_face_number (bool) – Whether the number of faces should be returned for distance tuning.

Returns

embeddings (list) – List of face embeddings, or
face_number (int) – The number of faces, returned if return_face_number is True and the number of faces is greater than 1.

Distance Tuning

src.models.distance_tuning.tune_distance_threshold(video_path='data/datasets/youtube_faces', sample_per_person=5, model='Dlib') → float

Finds the optimal distance threshold

Parameters

video_path (str) – Path to the dataset on which the threshold should be investigated
sample_per_person (int) – Number of frames to compare in the data set
model (str) – The face recognition model to tune

Returns

The optimal threshold

Return type

distance_threshold (float)
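The search strategy used by tune_distance_threshold is not documented here. As a rough illustration of what "optimal" can mean, the sketch below picks the threshold that best separates distances between frames of the same person from distances between frames of different people. The function best_threshold, the accuracy criterion, and the sample distances are assumptions, not the module's actual procedure.

```python
def best_threshold(same_distances, different_distances):
    """Hypothetical helper: threshold with the highest pair-classification accuracy.

    A pair counts as correct if a same-person distance is <= threshold
    or a different-person distance is > threshold.
    """
    candidates = sorted(set(same_distances) | set(different_distances))
    total = len(same_distances) + len(different_distances)
    best, best_accuracy = None, -1.0
    for threshold in candidates:
        correct = sum(d <= threshold for d in same_distances)
        correct += sum(d > threshold for d in different_distances)
        accuracy = correct / total
        # Strict comparison keeps the smallest threshold on ties.
        if accuracy > best_accuracy:
            best, best_accuracy = threshold, accuracy
    return best, best_accuracy


# Made-up distances for illustration only.
same = [0.30, 0.40, 0.45, 0.70]
different = [0.55, 0.80, 0.90, 1.10]
threshold, accuracy = best_threshold(same, different)
```

With these sample values, the thresholds 0.45 and 0.70 both classify 7 of 8 pairs correctly, and the sketch returns the smaller one.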

Evaluation

src.models.evaluation.evaluate_on_dataset(path: str = 'data/datasets/ytcelebrity', path_thumbnails: str = 'data/thumbnails', ratio: float = 1.0, seed: int = 42, single_true: bool = False, scene_extraction: int = 0)

Detects entities in a dataset and calculates evaluation metrics

Parameters

path (str) – The location of the dataset.
path_thumbnails (str) – The location of the thumbnails.
ratio (float) – Ratio between thumbnails contained and not contained in the dataset.
seed (int) – Parameter to control randomness for repeatable experiments.
single_true (bool) – Whether the evaluation dataset only gives a single label for images with multiple entities.
scene_extraction (int) – Whether to postprocess detections using the scene extraction algorithm. Disabled with 0.

Returns

scores (list) – The evaluation scores: [accuracy, precision, recall, f1].
files (list) – The files that were involved in the evaluation.
per_file_results (list) – The evaluation metrics per file.

src.models.evaluation.get_evaluation_metrics(y_pred: Optional[list] = None, y_true: Optional[list] = None, missing_entities: Optional[set] = None, single_true: bool = False)

Calculates the accuracy, recall, precision and f1-score for predictions. Details: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.364.5612&rep=rep1&type=pdf

Parameters

y_pred (list) – The list of lists with predicted entities (Entities per frame).
y_true (list) – The list of lists with true entities (Entities per frame).
missing_entities (set) – Set of entities to be handled as unknown.
single_true (bool) – Whether the evaluation dataset only gives single labels for images with multiple entities.

Returns

[accuracy, precision, recall, f1]

Return type

scores (np.array)
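Because each frame can contain several entities, the four scores have to be defined over sets of labels. The sketch below uses the example-based definitions that are common for multi-label classification, computing each metric per frame and then averaging. Whether get_evaluation_metrics uses exactly these formulas is an assumption, and the function example_based_scores and its handling of empty frames are hypothetical.

```python
def example_based_scores(y_pred, y_true):
    """Hypothetical helper: [accuracy, precision, recall, f1] averaged over frames."""
    totals = [0.0, 0.0, 0.0, 0.0]
    for predicted, actual in zip(y_pred, y_true):
        p, t = set(predicted), set(actual)
        overlap = len(p & t)
        union = len(p | t)
        # Assumed convention: a frame with no predicted and no true
        # entities counts as a perfect match.
        totals[0] += overlap / union if union else 1.0
        totals[1] += overlap / len(p) if p else float(not t)
        totals[2] += overlap / len(t) if t else float(not p)
        totals[3] += 2 * overlap / (len(p) + len(t)) if union else 1.0
    n = len(y_true)
    return [total / n for total in totals]


# Frame 1 has one spurious prediction; frame 2 is predicted exactly.
y_pred = [["alice", "bob"], ["carol"]]
y_true = [["alice"], ["carol"]]
scores = example_based_scores(y_pred, y_true)
```

In this toy input, the spurious "bob" lowers accuracy and precision for the first frame but leaves recall untouched, since every true entity was still found.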

Approximate k-Nearest Neighbors

This module is built on the NMSLIB library.

class src.models.approximate_k_nearest_neighbors.ApproximateKNearestNeighbors(method='hnsw', space='cosinesimil', distance_threshold=0.4, index_path='data/embeddings/index.bin', k=1)

Bases: object

Provides fast k-nearest-neighbor search to predict entities in videos or images

Parameters

method (str) – The method that NMSLIB uses for the k-Nearest Neighbor search. Details can be found here: https://github.com/nmslib/nmslib/blob/master/manual/methods.md.
space (str) – The vector space NMSLIB uses for comparing data points. Details can be found here: https://github.com/nmslib/nmslib/blob/master/manual/spaces.md.
distance_threshold (float) – The maximum distance between two face embeddings for them to be considered a match.
index_path (str) – Optional path to an existing model to load.
k (int) – The number of nearest neighbors to consider.

fit(embeddings, labels)

Uses embeddings to train the algorithm.

Parameters

embeddings (list) – The ordered embeddings of all face images.
labels (list) – The ordered list of entity labels, in the same order as the embeddings.

Returns

self

predict(embedding) → str

Predict the entity of an embedding

Parameters: embedding – The embedding to analyze
Returns: The entity with maximum probability to match the embedding
Return type: entity (str)
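The fit and predict interface can be illustrated with an exact, brute-force nearest-neighbor search in plain Python. This is only a stand-in for the approximate NMSLIB index used by the real class: the class name BruteForceNeighbors, the majority vote over the k neighbors, and the "unknown" fallback are assumptions. Cosine distance is taken here as one minus cosine similarity.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Cosine distance, taken as one minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class BruteForceNeighbors:
    """Hypothetical exact stand-in for an approximate nearest-neighbor index."""

    def __init__(self, distance_threshold=0.4, k=1):
        self.distance_threshold = distance_threshold
        self.k = k

    def fit(self, embeddings, labels):
        # Store the training data; position i of both lists belongs together.
        self.embeddings, self.labels = list(embeddings), list(labels)
        return self

    def predict(self, embedding):
        scored = sorted(
            (cosine_distance(embedding, known), label)
            for known, label in zip(self.embeddings, self.labels)
        )
        # Keep only those of the k nearest neighbors that are close enough.
        votes = [label for distance, label in scored[: self.k]
                 if distance <= self.distance_threshold]
        if not votes:
            return "unknown"
        return Counter(votes).most_common(1)[0][0]


model = BruteForceNeighbors(k=1).fit([[1.0, 0.0], [0.0, 1.0]], ["alice", "bob"])
close_match = model.predict([0.9, 0.1])
no_match = model.predict([-1.0, -1.0])
```

A brute-force search compares the query with every stored embedding, which is why an approximate index such as HNSW becomes attractive once the number of embeddings grows large.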