Making a simple and fast chatbot in 10 minutes

Response time matters a great deal for a real-world chatbot. Whether in the travel industry, banking, or healthcare, if you really want to help your customers, the response time should be short, comparable to what they experience when talking to a customer care representative.

Besides speed, it is also important to understand the main purpose of the chatbot. Different industries cannot use the same chatbot, because they have different goals and a different corpus to reply from.

While transformers are good at producing a suitable reply, they may take time to respond. Where response time is a concern, other methods can be applied, including rule-based systems, to find a reply that fits the question asked.

How many times have you contacted a travel agency about a refund for tickets booked last year during the lockdown? I am sure getting an apt reply was far from reality.

Now let’s make a simple chatbot. First, install these packages:

pip install nltk
pip install newspaper3k

The imports below also need scikit-learn and NumPy (pip install scikit-learn numpy).

The newspaper3k package offers several features:

  1. Multi-threaded article download framework
  2. News URL can be identified
  3. Text extraction can be done from HTML
  4. Top image extraction from HTML
  5. All image extraction can be done from HTML
  6. Keyword extraction can be done from the text
  7. Summary extraction can be done from the text
  8. Author extraction can be done from the text
  9. Google trending terms extraction
  10. Works in 10+ languages (English, German, Arabic, Chinese, …)

Import the libraries as follows:

#import libraries
from newspaper import Article
import random
import nltk
import string
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

I have already covered CountVectorizer in earlier posts.

Cosine similarity, or the cosine kernel, computes similarity as the normalized dot product of X and Y:

sklearn.metrics.pairwise.cosine_similarity(X, Y=None, dense_output=True)


Parameters:

  X: {ndarray, sparse matrix} of shape (n_samples_X, n_features). Input data.

  Y: {ndarray, sparse matrix} of shape (n_samples_Y, n_features), default=None. Input data. If None, the output will be the pairwise similarities between all samples in X.

  dense_output: bool, default=True. Whether to return dense output even when the input is sparse. If False, the output is sparse if both input arrays are sparse.

Returns:

  kernel matrix: ndarray of shape (n_samples_X, n_samples_Y)
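
To make the "normalized dot product" concrete, here is a minimal sketch that uses only the standard library. The helper names bag_of_words and cosine and the two-sentence corpus are made up for illustration, and the whitespace tokenizer is only a crude stand-in for CountVectorizer, which tokenizes differently. For a pair of count vectors, sklearn's cosine_similarity computes the same quantity.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Count lowercase, whitespace-separated tokens (crude stand-in for CountVectorizer)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Normalized dot product of two sparse count vectors."""
    # Counter returns 0 for tokens it has not seen, so missing words add nothing.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Made-up corpus and question, for illustration only.
corpus = [
    "kidney disease can cause fatigue",
    "drink water every day",
]
query = "what can cause kidney disease"

scores = [cosine(bag_of_words(query), bag_of_words(s)) for s in corpus]
best = scores.index(max(scores))
print(corpus[best])  # the sentence sharing the most words with the question
```

The first sentence shares four of its five words with the question, so its score is 4 / (√5 · √5) = 0.8, while the second shares none and scores 0. Picking the highest-scoring sentence is the idea the chatbot relies on later.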

import numpy as np
import warnings

Tokenization is covered in an earlier post. Here we take data from a healthcare website; the variable text holds the text extracted from the article.
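
The loading step is not shown here, so the following is only a sketch of how it is commonly done with newspaper3k, not necessarily the exact code used for this post. The function name load_article_sentences and the URL are hypothetical placeholders. The calls Article(url), download(), parse(), and the text attribute are part of newspaper3k; the function needs network access when called.

```python
def load_article_sentences(url):
    """Download one article and split its body text into sentences.

    Needs network access plus the newspaper3k and nltk packages.
    """
    # Imported inside the function so that defining it has no side effects.
    from newspaper import Article
    import nltk

    # sent_tokenize needs NLTK's Punkt data. Recent NLTK releases look for
    # "punkt_tab", older ones for "punkt", so both are requested here.
    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)

    article = Article(url)
    article.download()  # fetch the HTML
    article.parse()     # extract the body text from the HTML
    text = article.text
    return nltk.sent_tokenize(text)


# Hypothetical placeholder: substitute the page the bot should answer from.
ARTICLE_URL = "https://example.com/health/some-condition"
# sentence_list = load_article_sentences(ARTICLE_URL)
```

The call is left commented out because it reaches the network; uncomment it once ARTICLE_URL points at a real page.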


sentence_list = nltk.sent_tokenize(text)  # A list of sentences

# Print the list of sentences
print(sentence_list)

Once you have the corpus ready, you need to think about things a user or customer may ask or say that have no relation to the content we have.

It can be a greeting, a message of gratitude, or a goodbye. The team needs to brainstorm such messages and their responses.

I tried to cover a few here.

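Greeting bot response

# Random response to a greeting
def greeting_response(text):
    # Bot's greetings (illustrative values; the original list is not shown)
    bot_greetings = ["hi", "hello", "hey"]

    # User greetings (illustrative values; the original list is not shown)
    user_greetings = ["hi", "hello", "hey", "greetings"]

    # Reply as soon as any word of the input is a known greeting
    for word in text.split():
        if word in user_greetings:
            return random.choice(bot_greetings)

Gratitude bot response

# Random response to gratitude
def gratitude_response(text):
    # Bot's gratitude replies
    bot_gratitude = ["Glad to help", "You are most welcome", "Pleasure to be of help"]

    # User gratitude
    # Note: multi-word entries never match, because text is compared word by word
    user_gratitude = ["Thankyou so much", "grateful", "Thankyou", "thankyou", "thank you"]

    for word in text.split():
        if word in user_gratitude:
            return random.choice(bot_gratitude)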
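
One limitation of comparing word by word is that multi-word entries such as "thank you" can never equal a single word, and capitalized input needs its own list entry. A possible variant, sketched here under the assumption that substring matching is acceptable, lowercases the input and looks for each phrase inside it. The function name gratitude_response_phrases is hypothetical; the replies and trigger phrases are the ones listed above, lowercased.

```python
import random

# Replies and trigger phrases taken from the lists above (triggers lowercased).
BOT_GRATITUDE = ["Glad to help", "You are most welcome", "Pleasure to be of help"]
USER_GRATITUDE = ["thankyou so much", "grateful", "thankyou", "thank you"]


def gratitude_response_phrases(text):
    """Return a random reply if any trigger phrase occurs in the text, else None."""
    lowered = text.lower()
    for phrase in USER_GRATITUDE:
        if phrase in lowered:  # substring test, so multi-word phrases can match
            return random.choice(BOT_GRATITUDE)
    return None


print(gratitude_response_phrases("Thank you for the help"))
```

The trade-off is that substring matching is looser: "grateful" also matches inside "ungrateful", so a production bot would likely want word-boundary checks.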

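Sorting list

# Return the indices of list_var, ordered from highest to lowest value
def index_sort(list_var):
    length = len(list_var)
    list_index = list(range(length))
    x = list_var
    for i in range(length):
        for j in range(length):
            if x[list_index[i]] > x[list_index[j]]:
                # Swap the two indices
                list_index[i], list_index[j] = list_index[j], list_index[i]

    return list_index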
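
The nested loops in index_sort appear intended to order the indices from the highest score to the lowest. As a point of comparison, the same ordering can be sketched with Python's built-in sorted; the name index_sort_builtin and the sample scores are made up for illustration.

```python
def index_sort_builtin(scores):
    """Return the indices of `scores`, ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Made-up similarity scores for four sentences.
scores = [0.1, 0.8, 0.0, 0.5]
order = index_sort_builtin(scores)
print(order)  # [1, 3, 0, 2]
```

Index 1 comes first because it holds the highest score, 0.8. When several scores are equal the two approaches may order the tied indices differently, which does not matter if only the top few matches are used.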

The chatbot response function uses cosine similarity against the predefined text to pick a reply.

# Create the bot's response
def bot_response(user_input):
 for i in range(len(index)):
 if similarity_scores_list[index[i]]>