A series of C# functions wrapping the ollama APIs, mainly for UnityEngine
The user's system needs to have a working ollama setup already:
- Download and Install ollama
- Pull a model of choice from the Library
- Recommend `llama3.1` for general conversation: `ollama pull llama3.1`
- Recommend `gemma2:2b` for devices with limited memory: `ollama pull gemma2:2b`
- Recommend `llava` for image captioning: `ollama pull llava`
In Unity, you need the Newtonsoft.Json package:
- Unity Editor
- Window
- Package Manager
- Add package by name
- Name: `com.unity.nuget.newtonsoft-json`
- Add
The following functions are available under the `Ollama` class
All functions are asynchronous
- List()
- Returns an array of `Model`, representing all locally available models
- The `Model` class follows the official specs
Tip
You can use the families attribute to determine if a model is multimodal (see #2608)
- Generate()
- The most basic function that returns a response when given a model and a prompt
- GenerateStream()
- The streaming variant that returns each word as soon as it's ready
- Requires a `callback` to handle the chunks
- GenerateJson()
- Returns the response in the specified `class` / `struct` format
Important
You need to manually tell the model to use a JSON format in the prompt
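Since the JSON instruction has to live in the prompt itself, one way to keep it in sync with your target type is to build it from the type's fields. The sketch below is only an illustration: `CityInfo` and `JsonPromptSketch.WithJsonInstruction` are hypothetical names that are not part of this library, and it assumes your target type exposes its data as public fields.

```csharp
using System;
using System.Linq;

// Hypothetical target type for illustration; not part of the library.
public struct CityInfo
{
    public string name;
    public string country;
    public int population;
}

public static class JsonPromptSketch
{
    // Appends an explicit JSON instruction that lists the public fields of T,
    // so the model is told which keys the response should contain.
    public static string WithJsonInstruction<T>(string prompt)
    {
        var keys = typeof(T).GetFields().Select(f => "\"" + f.Name + "\"");
        return prompt
            + "\nRespond only with a JSON object containing the keys: "
            + string.Join(", ", keys) + ".";
    }
}
```

The resulting string would then be passed as the prompt to `GenerateJson()` together with the same type; check the class itself for the exact parameters.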
- Chat()
- Same as `Generate()`, but with the memory of prior chat history, allowing you to ask follow-up questions about earlier messages
- Requires either `InitChat()` or `LoadChatHistory()` to be called first
- Example: `>> Tell me a joke` "..." `>> Explain the joke` "..."
- ChatStream()
- The streaming variant of `Chat()`
- InitChat()
- Initialize / Reset the chat history
- `historyLimit`: The number of messages to keep in memory
- SaveChatHistory()
- Save the current chat history to the specified path
- LoadChatHistory()
- Load the chat history from the specified path
- Calls `InitChat()` automatically instead if the file does not exist
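To picture what a `historyLimit` does, here is a minimal sketch of a bounded message history. It is not the library's implementation: `ChatMessageSketch` and `BoundedHistorySketch` are hypothetical names, and dropping the oldest message first is an assumption about how the limit is enforced.

```csharp
using System.Collections.Generic;

// Hypothetical message type; the library's own message class may differ.
public class ChatMessageSketch
{
    public string role;
    public string content;

    public ChatMessageSketch(string role, string content)
    {
        this.role = role;
        this.content = content;
    }
}

public class BoundedHistorySketch
{
    private readonly Queue<ChatMessageSketch> messages = new Queue<ChatMessageSketch>();
    private readonly int historyLimit;

    public BoundedHistorySketch(int historyLimit)
    {
        this.historyLimit = historyLimit;
    }

    // Adds a message and discards the oldest ones once the limit is exceeded
    // (assumed policy: first in, first out).
    public void Add(string role, string content)
    {
        messages.Enqueue(new ChatMessageSketch(role, content));
        while (messages.Count > historyLimit)
        {
            messages.Dequeue();
        }
    }

    // Snapshot of the retained messages, oldest first.
    public ChatMessageSketch[] ToArray()
    {
        return messages.ToArray();
    }
}
```

Under this policy a smaller limit keeps each request short, but the model can no longer refer back to messages that have been dropped.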
Retrieval Augmented Generation
- Ask()
- Ask a question based on given context
- Requires both `InitRAG()` and `AppendData()` to be called first
- InitRAG()
- Initialize the database
- Requires a model to generate embeddings
- Can use a different model from the one used in `Ask()`
- Can use a regular LLM or a dedicated embedding model, such as `nomic-embed-text`
- AppendData()
- Add a context (e.g. a document) to retrieve from
Note
How well the RAG performs is dependent on several factors...
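For intuition about the retrieval step, the sketch below ranks stored embeddings against a query embedding by cosine similarity and returns the index of the closest one. This is an assumption for illustration only: the similarity measure and chunking actually used by `Ask()` are not stated here, and `RetrievalSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class RetrievalSketch
{
    // Cosine similarity between two vectors of equal length.
    // Returns 0 when either vector has zero magnitude.
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            return 0f;
        }
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }

    // Index of the stored embedding most similar to the query, or -1 if none are stored.
    public static int MostSimilar(float[] query, IReadOnlyList<float[]> stored)
    {
        int best = -1;
        float bestScore = float.NegativeInfinity;
        for (int i = 0; i < stored.Count; i++)
        {
            float score = CosineSimilarity(query, stored[i]);
            if (score > bestScore)
            {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }
}
```

In a setup like this, the text belonging to the best-matching embedding would be placed in the prompt as context before the question is sent to the model.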
A demo scene containing 3 demo scripts showcasing various features is included:
- Generate Demo: `List()`, `Generate()`, `GenerateJson()`, `KeepAlive.unload_immediately`
- Chat Demo: `InitChat()`, `ChatStream()`
- RAG Demo: `InitRAG()`, `AppendData()`, `Ask()`
Note
Recommended to not enable multiple demos at the same time...