Python Interface
You can import the Hunter class into your own projects using the following lines of code:
>>> from src.hunter import Hunter
>>> hunter = Hunter("https://www.youtube.com/watch?v=DbpdIEs2Xig").fit()
Afterwards, you can use the class to get a list of entities in the video:
>>> hunter.recognize()
link the entities to a knowledge graph:
>>> hunter.link()
or search for scenes of entities in an existing database:
>>> hunter.search("Adam Sandler")
- class src.hunter.Hunter(url: Optional[str] = None)
Class to use the entity linking in other projects and on the website.
- fit(thumbnail_list=None, thumbnails_path='data/thumbnails/thumbnails', img_width=500, encoder_name: str = 'Dlib', labels_path='data/embeddings/labels.pickle', embeddings_path='data/embeddings/embeddings.pickle')
Creates the embeddings for a dictionary of thumbnails.
- Parameters
thumbnail_list (list) – list of thumbnails to load.
thumbnails_path (str) – Path to the directory containing the thumbnails.
img_width (int) – Size to which the thumbnails should be resized.
encoder_name (str) – Specifies the method to create embeddings of faces in an image.
labels_path (str) – Path where the label-information should be saved.
embeddings_path (str) – Path where the embeddings should be saved.
- Returns
self
- link(algorithm='appr', method='hnsw', space='cosinesimil', distance_threshold=0.4, index_path='data/embeddings/index.bin', k=1, storage_type: str = 'memory', memory_path: str = 'models/store', virtuoso_url: Optional[str] = None, virtuoso_graph: Optional[str] = None, virtuoso_username: Optional[str] = None, virtuoso_password: Optional[str] = None, dbpedia_csv: str = 'data/thumbnails/dbpedia_thumbnails/Thumbnails_links.csv', wikidata_csv: str = 'data/thumbnails/wikidata_thumbnails/Thumbnails_links.csv')
Recognize entities in a video and add corresponding links to the knowledge graph.
- Parameters
algorithm (str) – Algorithm to use for the similarity calculation. Should be ‘1nn’ for 1-nearest-neighbor search with Euclidean distance or ‘appr’ for approximate k-nearest-neighbor search.
distance_threshold (float) – Distance threshold used to decide whether two faces are recognized as similar.
method (str) – Type of graph to use for the k-nearest neighbor approximation. See https://github.com/nmslib/nmslib/blob/master/manual/methods.md for available options. Only necessary if algorithm = ‘appr’.
space (str) – Similarity measure to use in the space. Only necessary if algorithm = ‘appr’.
index_path (str) – Path to an existing nmslib-index. Only necessary if algorithm = ‘appr’.
k (int) – The number of k-nearest neighbors to consider for the detection. Only necessary if algorithm = ‘appr’.
storage_type (str) – Whether to save links to a local RDF file or a Virtuoso database. Should be ‘memory’ for a local file or ‘virtuoso’ for Virtuoso.
memory_path (str) – Path to which the links should be written. Only necessary if storage_type = ‘memory’.
virtuoso_url (str) – URL of the Virtuoso SPARQL instance. Only necessary if storage_type = ‘virtuoso’.
virtuoso_graph (str) – URL of the Virtuoso graph in which the links should be saved. Only necessary if storage_type = ‘virtuoso’.
virtuoso_username (str) – Username to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.
virtuoso_password (str) – Password to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.
dbpedia_csv (str) – Path of the normalized DBpedia-thumbnail-information.
wikidata_csv (str) – Path of the normalized Wikidata-thumbnail-information.
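The storage parameters above depend on storage_type, so it can help to assemble them in one place before calling link(). The following is a minimal sketch only: `link_kwargs` is a hypothetical helper that is not part of src.hunter, and it assumes that all four Virtuoso settings must be supplied when storage_type is ‘virtuoso’.

```python
from typing import Optional


def link_kwargs(storage_type: str = "memory",
                memory_path: str = "models/store",
                virtuoso_url: Optional[str] = None,
                virtuoso_graph: Optional[str] = None,
                virtuoso_username: Optional[str] = None,
                virtuoso_password: Optional[str] = None) -> dict:
    """Hypothetical helper: keep only the storage arguments that apply."""
    if storage_type == "memory":
        # A local store only needs the path it is written to.
        return {"storage_type": "memory", "memory_path": memory_path}
    if storage_type == "virtuoso":
        settings = {
            "virtuoso_url": virtuoso_url,
            "virtuoso_graph": virtuoso_graph,
            "virtuoso_username": virtuoso_username,
            "virtuoso_password": virtuoso_password,
        }
        # Assumption: every Virtuoso setting is required in this mode.
        missing = [name for name, value in settings.items() if value is None]
        if missing:
            raise ValueError("missing Virtuoso settings: " + ", ".join(missing))
        return {"storage_type": "virtuoso", **settings}
    raise ValueError(f"unknown storage_type: {storage_type!r}")
```

The resulting dictionary could then be passed on as `hunter.link(**link_kwargs(...))`.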
- recognize(algorithm='appr', method='hnsw', space='cosinesimil', distance_threshold=0.4, index_path='data/embeddings/index.bin', k=1) -> list
Get a list of entities that could be recognized in the video.
- Parameters
algorithm (str) – Algorithm to use for the similarity calculation. Should be ‘1nn’ for 1-nearest-neighbor search with Euclidean distance or ‘appr’ for approximate k-nearest-neighbor search.
distance_threshold (float) – Distance threshold used to decide whether two faces are recognized as similar.
method (str) – Type of graph to use for the k-nearest neighbor approximation. See https://github.com/nmslib/nmslib/blob/master/manual/methods.md for available options. Only necessary if algorithm = ‘appr’.
space (str) – Similarity measure to use in the space. Only necessary if algorithm = ‘appr’.
index_path (str) – Path to an existing nmslib-index. Only necessary if algorithm = ‘appr’.
k (int) – The number of k-nearest neighbors to consider for the detection. Only necessary if algorithm = ‘appr’.
- Returns
Entities found in the video.
- Return type
entities (list)
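To illustrate how algorithm = ‘1nn’ and distance_threshold could interact, here is a minimal pure-Python sketch of a 1-nearest-neighbor lookup with Euclidean distance. It is not the implementation used by Hunter: `nearest_label` is a hypothetical function, the embeddings are made-up toy vectors, and the sketch assumes that a match is accepted only when its distance is at most the threshold.

```python
import math
from typing import Optional, Sequence


def nearest_label(query: Sequence[float],
                  embeddings: Sequence[Sequence[float]],
                  labels: Sequence[str],
                  distance_threshold: float = 0.4) -> Optional[str]:
    """Return the label of the closest embedding, or None if it is too far."""
    best_label, best_distance = None, math.inf
    for vector, label in zip(embeddings, labels):
        # Euclidean distance between the query and a known embedding.
        distance = math.dist(query, vector)
        if distance < best_distance:
            best_label, best_distance = label, distance
    # Assumption: the threshold is an upper bound on the accepted distance.
    return best_label if best_distance <= distance_threshold else None
```

For example, with embeddings `[[0.0, 0.0], [1.0, 1.0]]` and labels `["A", "B"]`, the query `[0.1, 0.0]` yields `"A"`, while a query far from both vectors yields `None`.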
- static search(entity: Optional[str] = None, storage_type: str = 'memory', memory_path: str = 'models/store', virtuoso_url: Optional[str] = None, virtuoso_graph: Optional[str] = None, virtuoso_username: Optional[str] = None, virtuoso_password: Optional[str] = None, dbpedia_csv: Optional[str] = None, wikidata_csv: Optional[str] = None)
Searches a knowledge graph for scenes by entity name.
- Parameters
entity (str) – Name of the entity to search for. Should be the label, the DBpedia URI, or the Wikidata URI.
storage_type (str) – Whether the links are stored in a local RDF file or a Virtuoso database. Should be ‘memory’ for a local file or ‘virtuoso’ for Virtuoso.
memory_path (str) – Path of the local store that holds the links. Only necessary if storage_type = ‘memory’.
virtuoso_url (str) – URL of the Virtuoso SPARQL instance. Only necessary if storage_type = ‘virtuoso’.
virtuoso_graph (str) – URL of the Virtuoso graph in which the links are stored. Only necessary if storage_type = ‘virtuoso’.
virtuoso_username (str) – Username to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.
virtuoso_password (str) – Password to access the Virtuoso instance. Only necessary if storage_type = ‘virtuoso’.
dbpedia_csv (str) – Path of the normalized DBpedia-thumbnail-information.
wikidata_csv (str) – Path of the normalized Wikidata-thumbnail-information.
- Returns
The scenes in which the entity occurs.
- Return type
scenes (list)
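Because entity may be a label, a DBpedia URI, or a Wikidata URI, a caller may want to know which form it is passing. The sketch below shows one way to tell them apart using the standard DBpedia and Wikidata namespace prefixes. `entity_kind` is a hypothetical helper; how search() itself distinguishes the three forms is not documented here.

```python
DBPEDIA_PREFIXES = ("http://dbpedia.org/resource/",
                    "https://dbpedia.org/resource/")
WIKIDATA_PREFIXES = ("http://www.wikidata.org/entity/",
                     "https://www.wikidata.org/entity/")


def entity_kind(entity: str) -> str:
    """Classify a search term as 'dbpedia', 'wikidata', or plain 'label'."""
    if entity.startswith(DBPEDIA_PREFIXES):
        return "dbpedia"
    if entity.startswith(WIKIDATA_PREFIXES):
        return "wikidata"
    # Anything that is not a recognized URI is treated as a label.
    return "label"
```

With this helper, `entity_kind("Adam Sandler")` is classified as a label, whereas a string starting with `http://dbpedia.org/resource/` is classified as a DBpedia URI.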