ML-powered Business Assistant Chatbot

ML-powered business assistant implemented as a chatbot in a Streamlit app.

At its core, the system uses NLP techniques with a fine-tuned DistilBERT classifier trained on a custom dataset to recognize six business-related intents, such as generating LinkedIn notes, researching B2B accounts, and extracting company value propositions.
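
As a rough illustration of the intent-recognition step, the sketch below shows how a fine-tuned classifier could be loaded with the Hugging Face `pipeline` API and how the highest-scoring intent could be selected. The model path follows the folder layout given in this README; the function names, the confidence threshold, and the intent labels are assumptions made for illustration, not the repository's actual code.

```python
from typing import Optional

# Assumed location of the fine-tuned DistilBERT classifier (see the models/ layout).
MODEL_DIR = "models/chatbot_model/fine_tuned_model"


def load_classifier(model_dir: str = MODEL_DIR):
    """Load the classifier lazily so importing this module stays cheap."""
    from transformers import pipeline  # requires transformers and PyTorch

    return pipeline("text-classification", model=model_dir)


def pick_intent(scores: list, threshold: float = 0.5) -> Optional[str]:
    """Return the best label, or None if the model is not confident enough."""
    best = max(scores, key=lambda item: item["score"])
    return best["label"] if best["score"] >= threshold else None


# Hypothetical per-label scores for one message (labels are illustrative).
example_scores = [
    {"label": "linkedin_note", "score": 0.91},
    {"label": "b2b_research", "score": 0.06},
    {"label": "value_proposition", "score": 0.03},
]
print(pick_intent(example_scores))  # linkedin_note
```

With a real model, per-label scores can be requested with `load_classifier()(message, top_k=None)`. The nesting of the returned list can differ between transformers versions, so it is worth checking the shape before passing it to `pick_intent`. Returning `None` below the threshold lets the chatbot fall back to a clarifying question instead of guessing an intent.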

spaCy was integrated for preprocessing and keyword extraction, while pre-trained transformer models were used for sentiment analysis and question answering.
A T5-small model was fine-tuned to generate personalized LinkedIn connection notes. Real-time company research was enabled via APIs and web scraping (LinkedIn, News APIs, Google Search, BeautifulSoup, Selenium).

For a deeper explanation, see the project documentation.

Python

Streamlit

spaCy

Transformers

BeautifulSoup

Selenium

Natural Language Processing

Web Scraping

API Integration


README.md

ML chatbot for business operations

A Streamlit ML application.

For a comprehensive and in-depth explanation of the tool, please refer to the ML_Tool_Explanation.pdf document.

Preparing the environment

Create venv

I suggest creating a virtual environment first to avoid version conflicts:

python -m venv venv

Then activate it:

venv\Scripts\activate # for Windows

source venv/bin/activate # for Linux/macOS
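
To confirm that the activation worked before installing anything, a quick check like the following can help. It is a minimal sketch that relies only on the standard library; the function name is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-created environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```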

Setup of the project

To run this project you have to:

Install the right version of PyTorch for your machine (https://pytorch.org/get-started/locally/)
Install the spaCy model
```
python -m spacy download en_core_web_lg
```
For faster execution at the cost of some accuracy, you can install en_core_web_sm or en_core_web_md instead.

If you change the model, also update the model name passed to spaCy's load function in utility/nlp.py.
Set up the Chrome driver (ensure chromedriver is installed and accessible)
Then install the requirements with
```
pip install -r requirements.txt
```
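
The setup steps above can be verified with a small preflight check. The sketch below is not part of the repository: it assumes that PyTorch is importable as `torch`, that the spaCy model installed with `python -m spacy download` is importable as a regular package, and that `chromedriver` should be found on the PATH. Adjust the names if you chose a smaller spaCy model.

```python
import importlib.util
import shutil

# Assumed names; change "en_core_web_lg" if you installed another spaCy model.
REQUIRED_PACKAGES = ("torch", "spacy", "en_core_web_lg")
REQUIRED_EXECUTABLES = ("chromedriver",)


def missing_requirements(packages=REQUIRED_PACKAGES, executables=REQUIRED_EXECUTABLES):
    """Return human-readable names of everything that could not be found."""
    missing = []
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(f"python package: {name}")
    for exe in executables:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(exe) is None:
            missing.append(f"executable on PATH: {exe}")
    return missing


for item in missing_requirements():
    print("missing ->", item)
```

The check only looks for the packages and the driver; it does not import them, so it runs quickly even when PyTorch is installed.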

Prepare the models

You can train the models by running the Jupyter notebook train_models.ipynb.

Training can take a long time. Alternatively, you can download the trained models from this link.

Extract the folder and put it in the root of the project.

AI-agent-for-Business/
- models/
  - chatbot_model/
    - fine_tuned_model/
  - note_model/
    - fine_tuned_model/
- custom_dataset/
- tasks/
- utility/
- ...
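
To check that the extracted folders ended up in the right place, a short script along these lines can be run from the project root. It is a sketch based only on the layout shown above; the helper name is hypothetical, and it does not inspect the files inside each folder.

```python
from pathlib import Path

# Folders this README expects under the project root.
EXPECTED_MODEL_DIRS = (
    "models/chatbot_model/fine_tuned_model",
    "models/note_model/fine_tuned_model",
)


def missing_model_dirs(project_root="."):
    """List expected model folders that are absent under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_MODEL_DIRS if not (root / rel).is_dir()]


missing = missing_model_dirs()
if missing:
    print("Missing model folders:", ", ".join(missing))
else:
    print("All model folders found.")
```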

Environment variables

This project uses several APIs on their free plans:

RapidAPI Real-Time Linkedin Scraper API (https://rapidapi.com/rockapis-rockapis-default/api/linkedin-api8) (https://rapidapi.com/rockapis-rockapis-default/api/linkedin-data-api)

It is possible to use just one of them but, because of the free plan limits, one is used for profile data and the other for company searches.
NewsAPI (https://newsapi.org/docs)
Google Cloud (https://console.cloud.google.com/) with Google Search API (https://programmablesearchengine.google.com/)
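
For orientation, the sketch below shows how request URLs for two of these services could be built: the NewsAPI `everything` endpoint and the Google Custom Search JSON API. How the repository actually calls the services is not shown in this README, so the function names and the example queries are assumptions; the key values are placeholders.

```python
from urllib.parse import urlencode


def newsapi_url(query: str, api_key: str) -> str:
    """Build a NewsAPI 'everything' search URL for a company name."""
    params = {"q": query, "apiKey": api_key}
    return "https://newsapi.org/v2/everything?" + urlencode(params)


def google_search_url(query: str, api_key: str, cx: str) -> str:
    """Build a Custom Search JSON API URL (cx is the search engine ID)."""
    params = {"key": api_key, "cx": cx, "q": query}
    return "https://www.googleapis.com/customsearch/v1?" + urlencode(params)


# Placeholder values only; real keys should never be hard-coded.
print(newsapi_url("Acme Corp", "API_KEY"))
print(google_search_url("Acme Corp value proposition", "API_KEY", "CX"))
```

Building the URL separately from sending the request makes it easy to log or test the query without spending free-plan quota.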

Create a .env file with your API keys in the project root, using this structure:

NEWSAPI_API_KEY=API_KEY
LINKEDIN_RAPIDAPI_API_KEY=API_KEY
LINKEDIN_RAPIDAPI_FOR_COMPANIES_API_KEY=API_KEY
GOOGLE_CLOUD_API_KEY=API_KEY
GOOGLE_SEARCH_CX=CX
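
A missing or empty key usually surfaces only when the corresponding API call fails, so it can help to validate the file up front. The following is a minimal standard-library sketch that parses simple KEY=VALUE lines and reports which of the five keys above are absent; the function names are hypothetical, and the README does not say how the project itself loads the file.

```python
REQUIRED_KEYS = (
    "NEWSAPI_API_KEY",
    "LINKEDIN_RAPIDAPI_API_KEY",
    "LINKEDIN_RAPIDAPI_FOR_COMPANIES_API_KEY",
    "GOOGLE_CLOUD_API_KEY",
    "GOOGLE_SEARCH_CX",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Sample content with one key set and one left empty.
sample = "NEWSAPI_API_KEY=abc123\nGOOGLE_SEARCH_CX=\n"
print(missing_keys(parse_env(sample)))
```

To check a real file, pass its content instead of the sample, for example `parse_env(open(".env").read())`.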

Starting the Tool

Once everything is set up, there are two ways to run the tool: from the command line or as a Streamlit app.

To run the chatbot from the command line, run
```
python chatbot.py
```
To run the chatbot as a Streamlit app, run
```
streamlit run streamlit.py
```
This automatically opens a browser tab with the application; you can also open it manually at http://localhost:8501/

The Streamlit app is the preferred option because it has an intuitive UI, whereas the command-line version may also print some log output.

