Samuel Iheadindu

Software Engineer. Web Developer. Systems Analyst.

AI/ML, Software Development, Data Privacy, Business Solutions

Private LLMs & Function Calling: Get LLMs to Answer Customer Queries with Your Organization Data

Learn how to leverage private LLMs with function calling to create AI-powered customer service solutions that respect data privacy and compliance regulations. A practical guide for businesses.

AI and large language models (LLMs) are all the rage at the moment. The buzz has made a number of LLMs popular, among them ChatGPT, Claude, and Gemini.

These platforms provide cloud services that require some form of payment, usually billed per token. Beyond cost, however, there are concerns about data privacy and about compliance with the data protection regulations of different countries. Companies are looking to leverage the power of AI for services like chatbots without flouting the data protection regulations in their business domain.

In light of the foregoing, there are several ways companies can use their organization's data with AI models, including:

  1. Retrieval Augmented Generation (RAG)
  2. Fine-tuning
  3. Prompt engineering with context injection
  4. Hybrid approach
  5. Private or On-premises LLMs with Function or Tool Calling

This article focuses on item 5.

How is this achieved? What do you need?

i. Ollama

This is a free and open-source tool that lets you run large language models on your local computer. It provides a simple command-line interface for downloading and managing various open-source LLMs, making it easy to run AI models offline and with greater control over your data.

Ollama is available for macOS, Windows, and Linux operating systems. Once you have downloaded and installed Ollama, you can download and run open-source models of different sizes depending on the storage and RAM capacity of your computer.
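Once a model has been pulled (for example with `ollama pull`), Ollama exposes a local REST API that any backend can call. Below is a minimal sketch in plain Java of posting a prompt to Ollama's `/api/generate` endpoint; it assumes Ollama is running locally on its default port 11434, and the model name `llama3.1` is just an example of a model you might have pulled.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OllamaClient {

    // Builds the JSON body for Ollama's /api/generate endpoint.
    // "stream": false asks for a single complete response instead of chunks.
    static String buildRequest(String model, String prompt) {
        return "{\"model\":\"" + model + "\","
             + "\"prompt\":\"" + prompt.replace("\"", "\\\"") + "\","
             + "\"stream\":false}";
    }

    public static void main(String[] args) throws Exception {
        String body = buildRequest("llama3.1", "Why is the sky blue?");
        // Requires a local Ollama instance listening on the default port 11434.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Because the model runs entirely on your machine, the prompt and the response never leave your infrastructure.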

ii. Your app backend

Your backend integrates with the AI model of your choice and exposes the functions (tools) the model can call to fetch your organization's data. Depending on your programming language, there are different ways to implement this interaction.
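At its core, tool calling works like this: the model emits the name of a tool plus its arguments, your backend executes the matching function against your own data, and the result is handed back to the model to phrase the final answer. Here is a minimal, framework-free sketch of that dispatch step; the tool name `getOrderStatus` and its behavior are hypothetical stand-ins for functions backed by your real data sources.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ToolRegistry {

    // Maps tool names (as advertised to the model) to backend functions.
    private final Map<String, Function<Map<String, String>, String>> tools = new HashMap<>();

    public void register(String name, Function<Map<String, String>, String> tool) {
        tools.put(name, tool);
    }

    // Executes a tool call emitted by the model: look up the named tool,
    // run it with the model-supplied arguments, and return the result,
    // which the backend then feeds back to the model.
    public String dispatch(String toolName, Map<String, String> arguments) {
        Function<Map<String, String>, String> tool = tools.get(toolName);
        if (tool == null) {
            return "Unknown tool: " + toolName;
        }
        return tool.apply(arguments);
    }

    public static void main(String[] args) {
        ToolRegistry registry = new ToolRegistry();
        // Hypothetical tool backed by your own order database.
        registry.register("getOrderStatus",
                callArgs -> "Order " + callArgs.get("orderId") + " has shipped.");

        String result = registry.dispatch("getOrderStatus", Map.of("orderId", "1042"));
        System.out.println(result); // Order 1042 has shipped.
    }
}
```

Frameworks such as Spring AI generate this plumbing for you, but the underlying flow is the same.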

iii. App Frontend

This is the interface that guides end users in interacting with your backend and, through it, the AI model.

Spring AI and Spring Boot

If you are a Java engineer, the Spring AI framework bundles many of the features you need to get started on your journey toward leveraging AI in your software applications. It works seamlessly with Spring Boot for creating backend APIs that your frontends interact with.
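With Spring AI's Ollama support on the classpath, pointing your Spring Boot backend at a local model is largely a matter of configuration. The sketch below shows the relevant `application.properties` entries; the base URL is Ollama's default, and the model name is an assumption you should replace with whichever model you have pulled.

```properties
# Point Spring AI at your local Ollama instance (default port shown).
spring.ai.ollama.base-url=http://localhost:11434
# Model name is an assumption; use any model you have pulled with Ollama.
spring.ai.ollama.chat.options.model=llama3.1
```

From there, Spring AI handles the request/response plumbing, and you register your data-access methods as tools for the model to call.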

On my GitHub profile https://github.com/sylnit, I have a sample full-stack application that demonstrates this possibility. If you are concerned about the security of your data, the demo application also shows how you can host a reasonably sized model within your own infrastructure using Ollama and leverage tool calling in your backend application to provide basic chatbot and chat-like query functionality to customers.

Below are the links:

Feel free to explore and provide your feedback, suggestions, and inputs.

If you need help with providing this kind of functionality for your business, get in touch.