
KONNECTWAY

Posted on Oct 30, 2023

Enhancing Chatbots with OpenAI and Laravel

#PHP #AI

 15 mins of reading

Discover how to boost chatbots with OpenAI’s GPT models and Laravel. Learn about vector embeddings and how giving a URL to your chatbot lets you ask questions about the page, making interactions smarter and easier to handle.

OpenAI’s GPT model is super smart, but its last update? September 2021. That might not sound like a long time ago, but in the tech world, it’s an eternity! Just think about it: if you were to chat with ChatGPT and ask about the latest OpenAI package for Laravel, it’d give you a puzzled digital look. It simply doesn’t know about it yet. It’s a good reminder that even the most advanced tools have their limits and need a bit of help catching up sometimes.

In this post you will:

  1. Learn about Embeddings and Vector Similarity: We’ll explore embeddings and vector similarity, which help improve our chatbot’s understanding.
  2. Implement a real chatbot use case: You’ll be implementing a feature in a Laravel application where users can submit URLs. The chatbot will then process the content of these URLs using NLP to understand context and content, and respond appropriately.

Embeddings and Vector Similarity

Embedding, in the context of machine learning and natural language processing, refers to the conversion of data (usually words or phrases) into vectors of real numbers. These vectors represent the data in a way that a machine can understand, process, and utilize.

  1. Tokenization: The first step in the embedding process often involves breaking down a piece of text into smaller units, called tokens. These tokens can be as short as a single character or as long as a word.
  2. Vector Representation: Each token is then mapped to a vector in a predefined vector space. This mapping is done using algorithms that ensure semantically similar words are placed closer together in the vector space.
  3. Dimensionality: The vector space can be of any dimension, but for practical purposes and computational efficiency, we often reduce the number of dimensions using techniques like Principal Component Analysis (PCA) or t-SNE. Despite the reduction in dimensions, the relative distances (or similarities) between vectors are preserved.

What are Vector Embeddings | Pinecone

 

Vector Similarity

Vector similarity is a measure of the closeness or similarity between two vectors. It helps in determining how alike two pieces of data are. The more similar the vectors, the more similar the data they represent.

  1. Cosine Similarity: One of the most common ways to measure vector similarity. It calculates the cosine of the angle between two vectors. A value of 1 means the vectors point in exactly the same direction, while a value of 0 means they’re orthogonal (not similar at all).
  2. Euclidean Distance: Another method where the similarity is determined based on the “distance” between two vectors. The closer they are, the more similar they’re considered to be.
  3. Dot Product: If the vectors are normalized, the dot product of two vectors gives a value between -1 and 1, which is exactly their cosine similarity.

 

Imagine we’re using a 3-dimensional space to represent our words. Here’s a hypothetical representation:

Word Embeddings:

  • Cat: [0.9, 0.8, 0.1]
  • Dog: [0.8, 0.9, 0.05]
  • Computer: [0.2, 0.1, 0.95]

In this representation:

  • The vectors for “cat” and “dog” are close to each other, indicating they are semantically similar. This is because they are both animals and share certain characteristics.
  • The vector for “computer”, on the other hand, is farther from the vectors of “cat” and “dog”, indicating it is semantically different from them.

If we were to visualize this in a 3D space:

  • “Cat” and “dog” might be near each other in one corner, while “computer” would be on the opposite side or in a different corner of the space.

Understanding Similarity:

Using cosine similarity:

  • The similarity between “cat” and “dog” would be high (close to 1) because their vectors are close.
  • The similarity between “cat” and “computer” or “dog” and “computer” would be much lower (closer to 0) because their vectors are farther apart.
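To make this concrete, here is a small plain-PHP sketch (no framework needed) that computes cosine similarity for the hypothetical vectors above:

```php
<?php

// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|)
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

$cat = [0.9, 0.8, 0.1];
$dog = [0.8, 0.9, 0.05];
$computer = [0.2, 0.1, 0.95];

echo cosineSimilarity($cat, $dog) . PHP_EOL;      // ≈ 0.99 — very similar
echo cosineSimilarity($cat, $computer) . PHP_EOL; // ≈ 0.30 — not very similar
```

As expected, “cat” and “dog” score close to 1, while “cat” and “computer” score much lower.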

Remember, this is a very simplified representation. In real-world applications, the dimensions are much higher (often in the hundreds or thousands), and the vectors are derived from vast amounts of data to capture intricate semantic relationships.

Implement a real use case

Our primary tool for storing vectors will be Pinecone (you could also use pgvector). The underlying mechanics of vector similarity are essential to grasp the full potential of our chatbot. And to further elevate its capabilities, we’ll introduce web scraping. This ensures our bot is aware of the information on a webpage, making it capable of answering queries related to that page.

Here you can see what we will accomplish with our implementation:

The first step is:

  1. User submits a web link.
  2. Backend Service receives the link.
  3. Crawler visits the link.
  4. Data Processing occurs:
    • Converts content using a Markdown Converter.
    • Tokenizes content.
  5. Store the processed vector to the database.

Then, once we have crawled the web page, we will see a chat where you can ask questions about that page.

  1. The question from the user is vectorized with OpenAI’s Embedding API.
  2. We search for similarities in the vector database (Pinecone).
  3. The most relevant results are sent to OpenAI as context.
  4. OpenAI’s Chat API processes the conversation.
  5. The AI responds to the user.

Here is a video of the end result.

 

Let’s create a new Laravel project; we will name it aichat.
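Assuming you have the Laravel installer available (composer create-project laravel/laravel aichat works just as well):

```shell
laravel new aichat
```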

Select Laravel Breeze with Livewire and Alpine so we have Livewire to build our chat, and Tailwind CSS installed to make it easy to create our chat UI.

Install the OpenAI package for Laravel.
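The package is openai-php/laravel:

```shell
composer require openai-php/laravel
```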

 

Publish the OpenAI config.
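Per the openai-php/laravel README, the config file is published with:

```shell
php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"
```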

 

Add this environment variable to your .env file.

OPENAI_API_KEY=sk-...

 

Setting up your Pinecone account

Install a Pinecone client for PHP.
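Pinecone doesn’t ship an official PHP SDK; in this post we assume the community client probots-io/pinecone-php (any client that wraps Pinecone’s REST API will do):

```shell
composer require probots-io/pinecone-php
```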

Setting up a Pinecone account and obtaining the necessary credentials for your Laravel project works as follows:

1. Create a Free Pinecone Account:
– Visit the Pinecone Website
– Click on “Get Started” or “Sign Up” to create a free account.
– Follow the on-screen instructions to complete the registration.

– When creating your index, use a vector dimension of 1536 (the size of OpenAI’s text-embedding-ada-002 vectors); the rest of the settings can be left at their defaults.

2. Obtain Your Pinecone API Key and Environment Variable:
– Once logged in, navigate to your account settings or dashboard.
– Look for a section titled “API Keys” or “Credentials”.
– Generate a new API key and note down the environment variable associated with your account (usually it’s a string like production or development).

3. Setup Pinecone Variables in Your Laravel Project:
– In your Laravel project, open or create a .env file in the root directory.
– Add the following lines to your .env file, replacing YOUR_API_KEY and YOUR_ENVIRONMENT with the values obtained from your Pinecone account:
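The variable names below are this project’s convention, not something Pinecone prescribes:

```
PINECONE_API_KEY=YOUR_API_KEY
PINECONE_ENVIRONMENT=YOUR_ENVIRONMENT
```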

 

4. Add a pinecone.php in the config directory:
– Now in your Laravel PHP code, you can access these variables using the env() function as shown below:
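A minimal config/pinecone.php that exposes those variables through Laravel’s config system might look like this:

```php
<?php

// config/pinecone.php — reads the Pinecone credentials from .env
return [
    'api_key' => env('PINECONE_API_KEY'),
    'environment' => env('PINECONE_ENVIRONMENT'),
];
```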

 

5. Initialize Pinecone:
– You can now initialize Pinecone using the obtained credentials. While Pinecone’s documentation primarily shows initialization in Python or JavaScript, you would need to look for a PHP library or create a wrapper around Pinecone’s API to interact with it in PHP.
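Assuming the probots-io/pinecone-php client mentioned earlier, initialization is a one-liner (the constructor signature may differ between versions of the client, so check its README):

```php
use Probots\Pinecone\Client as Pinecone;

// Credentials come from config/pinecone.php
$pinecone = new Pinecone(config('pinecone.api_key'), config('pinecone.environment'));
```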

 

Install the Readability package for PHP; this will help us extract sanitized HTML.
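We assume the FiveFilters port of Readability here:

```shell
composer require fivefilters/readability.php
```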

 

 

To begin developing the UI, simply run npm run dev. Once you have completed the development process, be sure to execute npm run build in order to generate all the necessary CSS and JS files.

 

Collecting Data

Now that our project is ready, we can start creating a helper class to collect the data that we will embed. Before we start, create an account on Browserless so we can fetch a webpage and get its HTML. You could do this with the Laravel HTTP client alone, but some pages are not server-side rendered; if that’s not a concern for you, the Laravel HTTP client is a fine replacement.

Setting Up Your Browserless Account

Browserless is a service that allows for browser automation and web scraping in a serverless environment. To use Browserless, you’ll need to set up an account and obtain a unique BROWSERLESS_KEY. Here’s how to do it:

1. Create a Free Browserless Account:
– Visit the Browserless Website
– Click on “Start for Free” or “Sign Up” to create a free account.
– Follow the on-screen instructions to complete the registration.

2. Obtain Your Browserless API Key:
– Once logged in, navigate to your account settings or dashboard.
– Look for a section titled “API Keys” or “Credentials”.
– Generate a new API key, which will be your BROWSERLESS_KEY.
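Add it to your .env file (the variable name is this project’s convention):

```
BROWSERLESS_KEY=YOUR_API_KEY
```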

 

 

Create a Class EmbedWeb.php

Here’s a simplified breakdown of what this class does:

  1.  Fetching Web Content:
    – The handle  method is triggered with a URL as its argument.
    – It sends a HTTP POST request to a web browsing automation service (browserless) to load the specified web page. Alternatively, it can send a plain HTTP GET request if web browsing automation is not needed.
  2. Processing Web Content:
    – Utilizes the Readability library to parse the fetched web page, isolating the main content and stripping away html elements.
  3. Preparing Content:
    – The script cleans up the text by removing HTML tags and splits it into chunks of up to 1000 characters each, ensuring the last chunk is at least 500 characters long by merging it with the previous chunk if necessary.
  4. Text Embedding:
    – Sends the processed text chunks to OpenAI’s service to generate text embeddings, which are compact numerical representations of the text in vectors. Just like we saw earlier.
  5. Indexing Embeddings:
    – Clears any previous embeddings indexed under the ‘web’ namespace for the ‘chatbot’ index in Pinecone, a vector database.
    – Then, it indexes the new embeddings in Pinecone, associating each embedding with a unique ID based on the URL and chunk index, and storing the original text and URL as metadata.

This way, the script facilitates the automated retrieval, processing, and indexing of web content, making it searchable and usable for a chatbot.
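The original class isn’t reproduced here, but following the steps above it could be sketched roughly like this. Note the assumptions: the browserless /content endpoint, the fivefilters/readability.php API, and the probots-io/pinecone-php client methods — check each library’s documentation, as exact method names may differ.

```php
<?php

namespace App\Services;

use fivefilters\Readability\Configuration;
use fivefilters\Readability\Readability;
use Illuminate\Support\Facades\Http;
use OpenAI\Laravel\Facades\OpenAI;
use Probots\Pinecone\Client as Pinecone;

class EmbedWeb
{
    public function handle(string $url): void
    {
        // 1. Fetch the rendered page through browserless (handles non-SSR pages).
        $html = Http::post(
            'https://chrome.browserless.io/content?token=' . env('BROWSERLESS_KEY'),
            ['url' => $url]
        )->body();

        // 2. Isolate the main content with Readability.
        $readability = new Readability(new Configuration());
        $readability->parse($html);

        // 3. Strip tags and split into ~1000-character chunks; if the last
        //    chunk is shorter than 500 characters, merge it into the previous one.
        $text = strip_tags($readability->getContent());
        $chunks = str_split($text, 1000);
        if (count($chunks) > 1 && strlen(end($chunks)) < 500) {
            $chunks[count($chunks) - 2] .= array_pop($chunks);
        }

        // 4. Embed every chunk in a single OpenAI request.
        $response = OpenAI::embeddings()->create([
            'model' => 'text-embedding-ada-002',
            'input' => $chunks,
        ]);

        // 5. Clear the previous 'web' namespace of the 'chatbot' index,
        //    then upsert the new vectors with the text and URL as metadata.
        $pinecone = new Pinecone(config('pinecone.api_key'), config('pinecone.environment'));
        $vectorsApi = $pinecone->index('chatbot')->vectors();
        $vectorsApi->delete(namespace: 'web', deleteAll: true);

        $vectors = [];
        foreach ($response->embeddings as $i => $embedding) {
            $vectors[] = [
                'id' => md5($url) . '-' . $i,
                'values' => $embedding->embedding,
                'metadata' => ['text' => $chunks[$i], 'url' => $url],
            ];
        }
        $vectorsApi->upsert(vectors: $vectors, namespace: 'web');
    }
}
```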

 

Creating the chatbot

Now let’s create the livewire component.
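With Livewire installed through Breeze, the component is generated with artisan:

```shell
php artisan make:livewire chat-bot
```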

 

This will create two files: the component class ChatBot.php and its view chat-bot.blade.php.

Create a blade file components/layouts/app.blade.php
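Livewire 3 looks for a default layout at this path; a minimal version (the Vite entries follow the default Breeze scaffolding) could be:

```blade
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>AI Chat</title>
    @vite(['resources/css/app.css', 'resources/js/app.js'])
</head>
<body class="antialiased">
    {{ $slot }}
</body>
</html>
```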

 

And in your app.css file, add the following. This way, if the chatbot returns code, it shows the code block in black.
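The exact styles are up to you; a minimal version might be:

```css
/* Render code blocks inside chatbot answers on a black background */
pre {
    background-color: black;
    color: white;
    padding: 1rem;
    border-radius: 0.5rem;
    overflow-x: auto;
}
```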

 

Creating a class for managing a conversation with OpenAI

We are going to create a class that will manage chat messages and call the OpenAI API. We will use the streamed response because we want the same behaviour we have today with ChatGPT: we don’t want to wait until the whole message is finished.

The class ChatMessages will handle chat interactions with OpenAI. The handle method is the heart of this class, taking in chat messages and two handlers for processing the chat as it streams from OpenAI’s GPT-3.5 Turbo model.

Upon calling handle, a streamed chat with OpenAI is initiated using the provided messages. As responses come in from OpenAI, they are looped through and the new content is appended to a content string. If a streamHandler is provided, it’s called with the updated content rendered to HTML, allowing for real-time updates.

Once all messages have been processed, the finishedHandler is called with the full content, also rendered to HTML, signaling the end of the chat processing. This setup allows for both real-time processing of chat messages as they come in and a final handling step once all messages have been processed.
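A sketch of such a class, using the openai-php/laravel facade and Laravel’s Str::markdown helper for the HTML rendering (the handler signatures are this post’s convention, not a library API):

```php
<?php

namespace App\Services;

use Closure;
use Illuminate\Support\Str;
use OpenAI\Laravel\Facades\OpenAI;

class ChatMessages
{
    public function handle(
        array $messages,
        ?Closure $streamHandler = null,
        ?Closure $finishedHandler = null,
    ): void {
        $stream = OpenAI::chat()->createStreamed([
            'model' => 'gpt-3.5-turbo',
            'messages' => $messages,
        ]);

        $content = '';
        foreach ($stream as $response) {
            // Each streamed chunk carries a small delta of new text.
            $content .= $response->choices[0]->delta->content ?? '';
            if ($streamHandler) {
                // Hand the partial answer to the caller, rendered as HTML.
                $streamHandler(Str::markdown($content));
            }
        }

        if ($finishedHandler) {
            $finishedHandler(Str::markdown($content));
        }
    }
}
```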

This will be used in our ChatBot.php class

 

Now let’s go through the ChatBot.php class in Livewire.

Let’s add some properties.

The prompt property holds what the user types when asking questions.

The answer property holds the current answer the chatbot is streaming once we call OpenAI. We will be using wire:stream to stream the response to the front end.

pending is a boolean so we know the AI is streaming a response and we are waiting for it to finish.

The conversation array will be used to save our chat messages.

And the step property lets us show a step for entering the URL; after the URL is entered, we show the chat UI.
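Together with the url property used in the next step, the declarations might look like this (the 'url' default for step is an assumption):

```php
public string $url = '';
public string $prompt = '';
public string $answer = '';
public bool $pending = false;
public array $conversation = [];
public string $step = 'url';
```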

 

Let’s add a submitUrl method

 

The url property holds the URL we want to scrape. It is validated to be required and to be a valid URL.

The submitUrl method validates the url property, processes it using the EmbedWeb class explained earlier, and transitions to the chat step by updating the step property to 'chat'.
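A sketch of the method, resolving the EmbedWeb class out of the container:

```php
public function submitUrl(): void
{
    $this->validate(['url' => 'required|url']);

    // Crawl, chunk, embed, and index the page.
    app(EmbedWeb::class)->handle($this->url);

    $this->step = 'chat';
}
```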

In our chat-bot.blade.php file

We bind the url property with wire:model, and once the button is clicked we submit the URL and call the method we just made.
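A minimal version of that markup (Tailwind classes omitted for brevity) might be:

```blade
<form wire:submit="submitUrl">
    <input type="url" wire:model="url" placeholder="https://example.com" />
    <button type="submit">Start chatting</button>
    @error('url') <span>{{ $message }}</span> @enderror
</form>
```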

 

Now let’s create a submitPrompt method so we can ask a question about the URL we have given.

 

The submitPrompt method is designed to process a user’s prompt, find relevant information from previously indexed web content, and prepare for a chat interaction based on the information retrieved.

  1. An instance of the Pinecone vector database client is created.
  2. The user’s prompt is sent to OpenAI to obtain a text embedding.
  3. A query is made to Pinecone to find the top 4 most relevant snippets of web content based on the text embedding.
  4. A system message is prepared with these snippets, instructing to base the answer on the given info.
  5. The conversation array is updated with the system message and the user’s prompt.
  6. The user prompt input field is cleared ($this->prompt = '').
  7. A flag ($this->pending) is set to true, indicating a pending action so we can show the user some indication that the chatbot is responding.
  8. A JavaScript call to ask is triggered via the Livewire $wire object. This lets the UI refresh first with the new messages and the current state; the front end then goes back to the server to start sending everything to OpenAI.
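The steps above could be sketched as follows. The Pinecone query call follows the probots-io/pinecone-php client and Pinecone’s REST response shape (a matches array with metadata), so verify the exact method and response accessors against the client you installed:

```php
public function submitPrompt(): void
{
    $this->validate(['prompt' => 'required']);

    // 1-2. Embed the user's question.
    $embedding = OpenAI::embeddings()->create([
        'model' => 'text-embedding-ada-002',
        'input' => $this->prompt,
    ])->embeddings[0]->embedding;

    // 3. Fetch the 4 most relevant chunks from Pinecone.
    $pinecone = new Pinecone(config('pinecone.api_key'), config('pinecone.environment'));
    $response = $pinecone->index('chatbot')->vectors()->query(
        vector: $embedding,
        namespace: 'web',
        topK: 4,
        includeMetadata: true,
    );

    // 4-5. Build a system message from the snippets and record the exchange.
    $context = collect($response->json('matches') ?? [])
        ->map(fn ($match) => $match['metadata']['text'])
        ->implode("\n---\n");

    $this->conversation[] = [
        'role' => 'system',
        'content' => "Answer the question using only this information:\n" . $context,
    ];
    $this->conversation[] = ['role' => 'user', 'content' => $this->prompt];

    // 6-7. Reset the input and flag that an answer is on its way.
    $this->prompt = '';
    $this->pending = true;

    // 8. Re-render first, then call ask() from the browser.
    $this->js('$wire.ask()');
}
```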

Let’s create the ask method in livewire:

The ask method orchestrates a chat interaction by handling incoming messages, generating responses, and managing real-time updates to the UI.

  1. Creating Chat Handler:
    • An instance of ChatMessages is created and its handle method is called with the current conversation as an argument.
  2. Handling Finished Responses:
    • When the handle method finishes processing, it triggers the finishedHandler callback function.
    • A new message from the ‘assistant’ is appended to the conversation array, containing the generated content.
    • The answer property is cleared, and pending is set to false, indicating that processing is complete.
  3. Handling Streamed Responses:
    • If there are streamed updates during processing (like real-time typing indicators or partial responses), the streamHandler callback is triggered.
    • These updates are streamed to the ‘answer’ component on the frontend, replacing any previous content, providing a dynamic, real-time interaction experience.
    • We use the wire:stream functionality Livewire gives us. This makes our lives so much easier: before Livewire 3.0 we would actually use websockets to have the message update in real time. The stream functionality removes the need to install websockets altogether.

The method facilitates a structured chat interaction, handling real-time updates, and appending final responses to the conversation, readying the system for the next user input.
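Putting those pieces together, a sketch of the method (the to: target must match the wire:stream attribute in the view; ChatMessages is the streaming class described above):

```php
public function ask(): void
{
    app(ChatMessages::class)->handle(
        messages: $this->conversation,
        streamHandler: function (string $html) {
            // Push each partial answer straight to the wire:stream target.
            $this->stream(to: 'answer', content: $html, replace: true);
        },
        finishedHandler: function (string $html) {
            // Append the final answer and reset the streaming state.
            $this->conversation[] = ['role' => 'assistant', 'content' => $html];
            $this->answer = '';
            $this->pending = false;
        },
    );
}
```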

Now let’s go over the UI to make this work.

The interface is split into two main sections: the chat display area and the message input area.

  • In the chat display area:
    • Messages from the ‘assistant’ and ‘user’ are iterated over and displayed with different stylings for easy differentiation.
    • If there’s a pending response (indicated by the $pending variable), a placeholder is displayed until the actual response ($answer) is received and displayed with the wire:stream functionality.
    • Livewire’s wire:stream directive is used to update the answer area in real-time as new content is streamed from the server.
  • In the message input area:
    • Users can type their message into a text input field.
    • Pressing the Enter key or clicking the “Send” button triggers the submitPrompt method, sending the user’s message for processing.
    • Any validation errors for the prompt input are displayed just below the input field.

Now you can ask questions about the webpage you gave, like we showed in the video earlier on.

 

 

Conclusion

In this journey, we’ve creatively integrated OpenAI, Laravel, and Pinecone to give our chatbot a significant boost and extra knowledge. It all started with our EmbedWeb class, a tool that scrapes a web page for content, embeds it, and saves it in Pinecone, our chosen vector database. This step automated the work of data gathering and set the stage for the chatbot to work its magic.

Then came the ChatMessages class, which is in charge of handling the conversation flow. It streams the response so we can show the answer as it is being generated instead of waiting for the full reply.

And then, we rolled up our sleeves for the heart of our project – the chatbot code. With a blend of structured logic and innovative coding, we crafted a setup that takes user prompts, sifts through the indexed web content, and engages in a meaningful back-and-forth with the user. The cherry on top? Real-time updates and a sleek UI, courtesy of Laravel’s Livewire and Tailwind CSS, which made the chat interface not only functional but a delight to interact with.

What we have now is a testament to the magic that happens when OpenAI’s text embeddings, Laravel’s robust framework, and Pinecone’s vector database come together. This fusion not only amped up our chatbot’s understanding but also its ability to dish out relevant, timely responses. As we wrap up, we’re not just left with a solid piece of work, but a stepping stone towards more intuitive and engaging chatbot interactions. The road ahead is exciting, and the possibilities, endless.