Python Interface

You can import the Hunter class into your own projects using the following lines of code:

>>> from src.hunter import Hunter
>>> hunter = Hunter("https://www.youtube.com/watch?v=DbpdIEs2Xig").fit()

Afterwards, you can use the instance to get a list of entities in the video:

>>> hunter.recognize()

to link the entities to a knowledge graph:

>>> hunter.link()

or to search for scenes of entities in an existing database:

>>> hunter.search("Adam Sandler")

class src.hunter.Hunter(url: Optional[str] = None)

Class for using the entity linking in other projects and on the website.

fit(thumbnail_list=None, thumbnails_path='data/thumbnails/thumbnails', img_width=500, encoder_name: str = 'Dlib', labels_path='data/embeddings/labels.pickle', embeddings_path='data/embeddings/embeddings.pickle')

Creates the embeddings for a dictionary of thumbnails.

Parameters
  • thumbnail_list (list) – list of thumbnails to load.

  • thumbnails_path (str) – Path to the directory containing the thumbnails.

  • img_width (int) – Width to which the thumbnails should be resized.

  • encoder_name (str) – Specifies the method to create embeddings of faces in an image.

  • labels_path (str) – Path where the label-information should be saved.

  • embeddings_path (str) – Path where the embeddings should be saved.

Returns

self
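
As a minimal sketch of calling fit with explicit arguments, the helper below passes only keyword arguments that appear in the signature above. The helper name build_hunter and its embeddings_dir parameter are assumptions made for illustration and are not part of the project. The import is deferred so that the function can be defined even when the project's src package is not on the import path.

```python
from pathlib import PurePosixPath


def build_hunter(url, embeddings_dir="data/embeddings", img_width=500):
    """Create a Hunter for `url` and fit it, storing outputs under `embeddings_dir`.

    Hypothetical convenience wrapper; not part of the Hunter project.
    """
    # Deferred import: requires the project's `src` package on the import path.
    from src.hunter import Hunter

    base = PurePosixPath(embeddings_dir)
    # Only keyword arguments listed in the documented fit() signature are used.
    return Hunter(url).fit(
        img_width=img_width,
        encoder_name="Dlib",
        labels_path=str(base / "labels.pickle"),
        embeddings_path=str(base / "embeddings.pickle"),
    )
```

Because fit returns self, the value returned by the helper can be used directly for the later calls such as recognize().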

link(...)

Recognizes entities in a video and adds the corresponding links to the knowledge graph.

Parameters
  • algorithm (str) – Algorithm to use for the similarity calculation. Should be ‘1nn’ for 1-Nearest Neighbors with Euclidean distance or ‘appr’ for approximate k-Nearest Neighbors.

  • distance_threshold (float) – The threshold above which faces are recognized as being similar.

  • method (str) – Type of graph to use for the k-nearest neighbor approximation. See https://github.com/nmslib/nmslib/blob/master/manual/methods.md for available options. Only necessary if algorithm = ‘appr’.

  • space (str) – Similarity measure to use in the space. Only necessary if algorithm = ‘appr’.

  • index_path (str) – Path to an existing nmslib-index. Only necessary if algorithm = ‘appr’.

  • k (int) – The number of k-nearest neighbors to consider for the detection. Only necessary if algorithm = ‘appr’.

  • storage_type (str) – Whether to save links to a local RDF file or a Virtuoso database. Should be ‘memory’ for a local file or ‘virtuoso’ for Virtuoso.

  • memory_path (str) – Path to which the links should be written. Only necessary if storage_type = ‘memory’.

  • virtuoso_url (str) – URL of the Virtuoso SPARQL instance. Only necessary if storage_type = ‘virtuoso’.

  • virtuoso_graph (str) – URL of the Virtuoso graph in which the links should be saved. Only necessary if storage_type = ‘virtuoso’.

  • virtuoso_username (str) – Username to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.

  • virtuoso_password (str) – Password to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.

  • dbpedia_csv (str) – Path of the normalized DBpedia-thumbnail-information.

  • wikidata_csv (str) – Path of the normalized Wikidata-thumbnail-information.
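
The storage parameters above fall into two groups, depending on storage_type. The following sketch shows one way to collect and check them before a call. The helper storage_kwargs is hypothetical, and the Virtuoso endpoint, graph URL, and credentials are placeholder values, not settings taken from the project.

```python
def storage_kwargs(storage_type, **options):
    """Return keyword arguments for the documented storage parameters.

    Hypothetical helper: checks that the options required by the chosen
    storage_type are present before they are passed on.
    """
    required = {
        "memory": ("memory_path",),
        "virtuoso": (
            "virtuoso_url",
            "virtuoso_graph",
            "virtuoso_username",
            "virtuoso_password",
        ),
    }
    if storage_type not in required:
        raise ValueError(f"unknown storage_type: {storage_type!r}")
    missing = [name for name in required[storage_type] if name not in options]
    if missing:
        raise ValueError(f"missing options for {storage_type!r}: {missing}")
    # Keep only the options that belong to the chosen storage type.
    selected = {name: options[name] for name in required[storage_type]}
    return {"storage_type": storage_type, **selected}


local = storage_kwargs("memory", memory_path="models/store")

# Placeholder endpoint, graph, and credentials; replace with real values.
remote = storage_kwargs(
    "virtuoso",
    virtuoso_url="http://localhost:8890/sparql",
    virtuoso_graph="http://example.org/graph",
    virtuoso_username="user",
    virtuoso_password="secret",
)
```

The resulting dictionary could then be unpacked into a call, for example hunter.link(**local).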

recognize(algorithm='appr', method='hnsw', space='cosinesimil', distance_threshold=0.4, index_path='data/embeddings/index.bin', k=1) → list

Get a list of entities that could be recognized in the video.

Parameters
  • algorithm (str) – Algorithm to use for the similarity calculation. Should be ‘1nn’ for 1-Nearest Neighbors with Euclidean distance or ‘appr’ for approximate k-Nearest Neighbors.

  • distance_threshold (float) – The threshold above which faces are recognized as being similar.

  • method (str) – Type of graph to use for the k-nearest neighbor approximation. See https://github.com/nmslib/nmslib/blob/master/manual/methods.md for available options. Only necessary if algorithm = ‘appr’.

  • space (str) – Similarity measure to use in the space. Only necessary if algorithm = ‘appr’.

  • index_path (str) – Path to an existing nmslib-index. Only necessary if algorithm = ‘appr’.

  • k (int) – The number of k-nearest neighbors to consider for the detection. Only necessary if algorithm = ‘appr’.

Returns

Entities found in the video.

Return type

entities (list)
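
To illustrate the idea behind the ‘1nn’ option, the sketch below performs a 1-Nearest-Neighbor lookup with Euclidean distance in plain Python. It is not the project's implementation: the function name nearest_entity, the toy embeddings, and the rule that a match is accepted when the distance is at or below distance_threshold are all assumptions made for this example, and the comparison used by Hunter may differ.

```python
import math


def nearest_entity(query, embeddings, labels, distance_threshold=0.4):
    """Return the label of the closest known embedding, or None if too far.

    Sketch only: a match is accepted when the Euclidean distance is at or
    below distance_threshold, which is an assumption about the comparison.
    """
    best_label, best_distance = None, math.inf
    for embedding, label in zip(embeddings, labels):
        distance = math.dist(query, embedding)  # Euclidean distance
        if distance < best_distance:
            best_label, best_distance = label, distance
    if best_distance <= distance_threshold:
        return best_label
    return None


# Toy 2-dimensional embeddings; real face embeddings have many more dimensions.
known = [(0.0, 0.0), (1.0, 1.0)]
names = ["entity_a", "entity_b"]

match = nearest_entity((0.1, 0.0), known, names)     # close to the first embedding
no_match = nearest_entity((5.0, 5.0), known, names)  # far from both embeddings
```

The ‘appr’ option replaces the exhaustive loop with an approximate index, which trades some accuracy for speed on large collections of embeddings.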

static search(entity: Optional[str] = None, storage_type: str = 'memory', memory_path: str = 'models/store', virtuoso_url: Optional[str] = None, virtuoso_graph: Optional[str] = None, virtuoso_username: Optional[str] = None, virtuoso_password: Optional[str] = None, dbpedia_csv: Optional[str] = None, wikidata_csv: Optional[str] = None)

Searches for scenes in a knowledge graph by entity name.

Parameters
  • entity (str) – Name of the entity to search for. Should be the label, the DBpedia URI, or the Wikidata URI.

  • storage_type (str) – Whether the links are stored in a local RDF file or a Virtuoso database. Should be ‘memory’ for a local file or ‘virtuoso’ for Virtuoso.

  • memory_path (str) – Path of the local file in which the links are stored. Only necessary if storage_type = ‘memory’.

  • virtuoso_url (str) – URL of the Virtuoso SPARQL instance. Only necessary if storage_type = ‘virtuoso’.

  • virtuoso_graph (str) – URL of the Virtuoso graph in which the links are stored. Only necessary if storage_type = ‘virtuoso’.

  • virtuoso_username (str) – Username to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.

  • virtuoso_password (str) – Password to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.

  • dbpedia_csv (str) – Path of the normalized DBpedia-thumbnail-information.

  • wikidata_csv (str) – Path of the normalized Wikidata-thumbnail-information.

Returns

The scenes in which the entity occurs.

Return type

scenes (list)
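
Since search is documented as a static method, it can presumably be called on the class itself, without creating or fitting a Hunter instance. The wrapper below is a hedged sketch of such a call against a local store. The name find_scenes is hypothetical, and the import is deferred so that the function can be defined without the project's src package being importable.

```python
def find_scenes(entity, memory_path="models/store"):
    """Look up the scenes for `entity` in a local store.

    Hypothetical wrapper; not part of the Hunter project.
    """
    # Deferred import: requires the project's `src` package on the import path.
    from src.hunter import Hunter

    # Called on the class because search is documented as static.
    return Hunter.search(
        entity=entity,
        storage_type="memory",
        memory_path=memory_path,
    )
```

A call such as find_scenes("Adam Sandler") would then mirror the hunter.search("Adam Sandler") example at the top of this page.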