The Ollama server provides a REST API for interacting with LLMs.
Base URL: `http://localhost:11434`
**List available models** — `GET /api/tags`

Response:

```json
{
  "models": [
    {
      "name": "qwen2.5-coder:7b-instruct",
      "size": "4.7GB",
      "modified_at": "2024-01-01T00:00:00Z"
    }
  ]
}
```
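Given a response body shaped like the one above, the model names can be extracted with the standard library alone. A minimal sketch (the `parse_models` helper name is ours, not part of any API):

```python
import json

def parse_models(payload: str) -> list:
    """Extract model names from a /api/tags response body."""
    data = json.loads(payload)
    return [m["name"] for m in data.get("models", [])]

# Example using the response shown above:
sample = (
    '{"models": [{"name": "qwen2.5-coder:7b-instruct", '
    '"size": "4.7GB", "modified_at": "2024-01-01T00:00:00Z"}]}'
)
print(parse_models(sample))  # ['qwen2.5-coder:7b-instruct']
```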
**Generate a response from a prompt** — `POST /api/generate`

Request:

```json
{
  "model": "qwen2.5-coder:7b-instruct",
  "prompt": "Write a hello world program in Python",
  "stream": true
}
```

Response (streaming, one JSON object per line):

```json
{"response": "Here", "done": false}
{"response": " is", "done": false}
{"response": " a", "done": false}
...
{"response": "!", "done": true}
```
**Chat with a model (conversational)** — `POST /api/chat`

Request:

```json
{
  "model": "qwen2.5-coder:7b-instruct",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "stream": true
}
```
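The chat endpoint holds no server-side conversation state, so the client must resend the full `messages` list on every turn. A sketch of the history bookkeeping (the `add_turn` helper is hypothetical):

```python
def add_turn(messages: list, role: str, content: str) -> list:
    """Append one turn to the conversation history; the full list is
    sent in the "messages" field of every /api/chat request."""
    messages.append({"role": role, "content": content})
    return messages

history = []
add_turn(history, "user", "Hello")
# ...POST {"model": ..., "messages": history, "stream": true},
#    read the assistant's reply, then record it before the next turn...
add_turn(history, "assistant", "Hi! How can I help?")
add_turn(history, "user", "What is Ollama?")
print(len(history))  # 3
```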
The Lorapok Dynamic Ollama LLM Chat Interface includes `src/ollama_client.py`, a Python wrapper for this API:
```python
from src.ollama_client import OllamaClient

client = OllamaClient("192.168.1.100")  # server IP

# List models
models = client.list_models()
print(models)

# Generate a response
response = client.generate("Hello, how are you?")
print(response)

# Start an interactive chat session
client.chat()
```
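For orientation, a wrapper with this interface could look roughly like the following. This is a sketch under our own assumptions, not the shipped `src/ollama_client.py`; `urllib` is used to stay dependency-free:

```python
import json
import urllib.request

class OllamaClientSketch:
    """Illustrative stand-in for the bundled client; the real file may differ."""

    def __init__(self, server_ip: str, port: int = 11434):
        self.base_url = f"http://{server_ip}:{port}"

    def list_models(self) -> list:
        """GET /api/tags and return the model names."""
        with urllib.request.urlopen(f"{self.base_url}/api/tags") as resp:
            return [m["name"] for m in json.load(resp)["models"]]

    def generate(self, prompt: str, model: str = "qwen2.5-coder:7b-instruct") -> str:
        """POST /api/generate without streaming and return the full response."""
        body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
        req = urllib.request.Request(
            f"{self.base_url}/api/generate",
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"]

client = OllamaClientSketch("192.168.1.100")
print(client.base_url)  # http://192.168.1.100:11434
```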
The client exposes the following methods:

- `__init__(server_ip)`: Initialize client with server IP
- `list_models()`: Return list of available models
- `generate(prompt, model, stream)`: Generate response from prompt
- `chat()`: Start interactive chat session

A VS Code Ollama extension is also available.
Configure the extension with your server URL for remote access.