Comparisons

Statement comparison

ChatterBot uses Statement objects to hold information about things that can be said. An important part of how a chat bot selects a response is its ability to compare two statements to each other. There are a number of ways to do this, and ChatterBot comes with a handful of built-in methods for you to use.

This module contains various text-comparison algorithms designed to compare one statement to another.

class chatterbot.comparisons.Comparator(language)

Base class establishing the interface that all comparators should implement.

compare(statement_a, statement_b)

Implemented in subclasses: compare statement_a to statement_b.

Returns:

The percent of similarity between the statements based on the implemented algorithm.

Return type:

float

class chatterbot.comparisons.JaccardSimilarity(language)

Calculates the similarity of two statements based on the Jaccard index.

The Jaccard index is composed of a numerator and a denominator. In the numerator, we count the number of items that are shared between the sets (their intersection). In the denominator, we count the number of distinct items across both sets (their union). Let’s say we define sentences to be equivalent if 50% or more of their tokens are equivalent. Here are two sample sentences:

The young cat is hungry.
The cat is very hungry.

When we parse these sentences to remove stopwords, we end up with the following two sets:

{young, cat, hungry}
{cat, very, hungry}

In our example above, the intersection is {cat, hungry}, which has a count of two. The union of the sets is {young, cat, very, hungry}, which has a count of four. Therefore, our Jaccard similarity index is two divided by four, or 50%. Given our similarity threshold above, we would consider this to be a match.
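The arithmetic in this example can be reproduced with Python's built-in set operations. The following is a minimal sketch, not ChatterBot's own implementation: the function name jaccard_index is a hypothetical helper, the token sets are typed in by hand rather than produced by a tokenizer, and the 0.5 threshold is the one assumed in the example.

```python
# Token sets from the example, after stopword removal (entered by hand).
tokens_a = {"young", "cat", "hungry"}
tokens_b = {"cat", "very", "hungry"}


def jaccard_index(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty.

    Hypothetical helper for illustration; not part of ChatterBot.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


similarity = jaccard_index(tokens_a, tokens_b)  # 2 shared / 4 distinct
is_match = similarity >= 0.5                    # threshold from the example

print(similarity)  # 0.5
print(is_match)    # True
```

Returning 0.0 for two empty sets is a design choice made here to avoid dividing by zero; other conventions are possible.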

compare(statement_a, statement_b)

Return the calculated similarity of two statements based on the Jaccard index.

class chatterbot.comparisons.LevenshteinDistance(language)

Compare two statements based on the Levenshtein distance of each statement’s text.

For example, there is a 65% similarity between the statements “where is the post office?” and “looking for the post office” based on the Levenshtein distance algorithm.
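To illustrate how an edit distance can be turned into a similarity score, here is a rough sketch of the classic Levenshtein distance, normalized by the length of the longer string. This is an assumption-laden illustration rather than ChatterBot's implementation: the names levenshtein_distance and levenshtein_similarity are hypothetical, and the scores it produces may differ from the percentage quoted above, since the library may normalize or preprocess text differently.

```python
def levenshtein_distance(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Map the distance to a score between 0 and 1 (1 means identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein_distance(a, b) / longest


print(levenshtein_distance("kitten", "sitting"))  # 3
```

Dividing by the longer length is only one possible normalization; it keeps the result in the 0 to 1 range that a comparison function is expected to return.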

compare(statement_a, statement_b)

Compare the two input statements.

Returns:

The percent of similarity between the text of the statements.

Return type:

float

class chatterbot.comparisons.SpacySimilarity(language)

Calculate the similarity of two statements using spaCy models.

NOTE:

You will also need to download a spaCy model to use for tagging. Internally, these models are used to determine the parts of speech of words.

The easiest way to do this is to use the spaCy download command directly:

python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm

Alternatively, the spaCy models can be installed as Python packages. The following lines could be included in a requirements.txt or pyproject.toml file if you need to pin specific versions:

https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.0/en_core_web_sm-2.3.0.tar.gz#egg=en_core_web_sm
https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-2.3.0/de_core_news_sm-2.3.0.tar.gz#egg=de_core_news_sm

compare(statement_a, statement_b)

Compare the two input statements.

Returns:

The percent of similarity between the statements, as computed by the spaCy model.

Return type:

float

Use your own comparison function

You can create your own comparison function and use it as long as the function takes two statements as parameters and returns a numeric value between 0 and 1. A 0 should represent the lowest possible similarity and a 1 should represent the highest possible similarity.

def comparison_function(statement, other_statement):

    # Your comparison logic

    # Return your calculated value here
    return 0.0
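As one possible way to fill in the template above, the sketch below scores two statements with the standard library's difflib.SequenceMatcher, whose ratio() method returns a value between 0 and 1. It assumes each statement exposes its content through a text attribute; FakeStatement is a hypothetical stand-in used only so the example runs without ChatterBot installed, and lowercasing the text is an arbitrary choice made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class FakeStatement:
    """Hypothetical stand-in for a Statement; only .text is used here."""
    text: str


def comparison_function(statement, other_statement):
    """Return a similarity score between 0 (lowest) and 1 (highest)."""
    # Compare case-insensitively (an illustrative choice, not a requirement).
    text_a = statement.text.lower()
    text_b = other_statement.text.lower()

    # Treat an empty statement as having no similarity to anything.
    if not text_a or not text_b:
        return 0.0

    return SequenceMatcher(None, text_a, text_b).ratio()


score = comparison_function(
    FakeStatement("Where is the post office?"),
    FakeStatement("where is the post office?"),
)
print(score)  # 1.0, because the texts match after lowercasing
```

Any function with this shape works in the same way; the important part is that the return value stays within the 0 to 1 range.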

Setting the comparison method

To set the statement comparison method for your chat bot, you will need to pass the statement_comparison_function parameter to your chat bot when you initialize it. An example of this is shown below.

from chatterbot import ChatBot
from chatterbot.comparisons import LevenshteinDistance

chatbot = ChatBot(
    # ...
    statement_comparison_function=LevenshteinDistance
)