
Ollama Chat Interface

A feature-rich web interface for interacting with AI models via a local Ollama server. Built with a professional UI/UX design, comprehensive error handling, and advanced conversation management.

Chat Interface Screenshot

Overview

This application provides a robust, user-friendly chat interface for communicating with Ollama models (DeepSeek-R1, Llama, Mistral, etc.). Built with production-grade code patterns and modern UI design, it demonstrates best practices for Python AI application development.

Key Highlights

  • Professional 2-Column UI: Optimized layout with conversation display and organized control panel
  • Interactive Parameter Controls: Real-time adjustment of temperature, top-p, top-k, and max tokens
  • Conversation Export: Export chat history to JSON (with metadata) or Markdown (human-readable)
  • System Prompt Integration: Configurable AI personality and behavior
  • Response Status Indicators: Visual progress feedback during generation
  • Example Questions: Click-to-populate interactive prompts for quick testing
  • Conversation Memory: Context-aware responses with full history management
  • Production-Ready Architecture: Comprehensive logging, error handling, and retry logic

Features

🎨 User Interface

  • Professional 2-Column Layout:
    • Left panel (70%): Spacious conversation window with 630px height
    • Right panel (30%): Organized controls (input, buttons, export)
  • Advanced Settings Accordion:
    • 2-column parameter layout for better visibility
    • Real-time parameter adjustments (temperature, top-p, top-k, max tokens)
    • Parameters apply per-session without modifying config
  • Example Questions:
    • 3 topics with 2 questions each (AI Product Management, Environment, Technology Trends)
    • Click-to-populate functionality for instant testing
    • Interactive chips with hover effects
  • Enhanced Typography:
    • Larger font sizes for better readability (16px base, 14px for sliders)
    • Properly sized value input boxes (48px height with optimized padding)
    • Dark borders and visual polish throughout

🚀 Conversation Management

  • System Prompt Support: Configurable AI personality prepended to all prompts
  • Conversation Memory: Maintains up to 20 messages (configurable) with full context
  • Parameter Tracking: Each message stores its generation parameters
  • Timestamp Tracking: ISO 8601 timestamps for all exchanges
  • Clear Function: Reset conversation and start fresh
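
To make the memory model concrete, here is a minimal sketch of how one exchange and the trimmed history might be represented in Python; the field names are illustrative assumptions rather than the app's exact schema:

from datetime import datetime

# Illustrative shape of one exchange (assumed field names, not the app's exact schema)
exchange = {
    "role": "user",                              # or "assistant"
    "content": "How do I prioritize AI features in a product roadmap?",
    "timestamp": datetime.now().isoformat(),     # ISO 8601, as noted above
    "parameters": {                              # generation parameters stored per message
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "num_predict": 2048,
    },
}

MAX_MESSAGES = 20                 # mirrors conversation.history.max_messages in config.yaml
history = []
history.append(exchange)
history = history[-MAX_MESSAGES:]  # keep only the most recent messages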

📤 Export Capabilities

JSON Export (with metadata):

{
  "export_metadata": {
    "timestamp": "2025-12-03T11:07:06",
    "model": "deepseek-r1:8b",
    "system_prompt": "You are a helpful AI assistant...",
    "total_messages": 1,
    "export_version": "1.0"
  },
  "conversation": [...]
}

Markdown Export (human-readable): See example export for complete format.

🔧 Technical Features

  • Configuration Management: YAML-based configuration for easy customization
  • Structured Logging: File and console logging with configurable levels
  • Retry Logic: Automatic retry with exponential backoff for network resilience
  • Error Handling: Comprehensive exception handling with user-friendly messages
  • Type Safety: Full type hints for better IDE support and code quality
  • Modular Architecture: Reusable common utilities (logging, config, retry, export)
  • Response Status: Visual progress indicator with descriptive messages
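
As an example of the logging setup described above, a console-plus-file logger can be configured with the standard library alone; the function below is a sketch, not the exact contents of common/logging_config.py:

import logging

def setup_logger(name: str = "ollama_chat",
                 level: str = "INFO",
                 log_file: str = "ollama_chat.log") -> logging.Logger:
    """Configure a logger with console and file handlers (illustrative sketch)."""
    logger = logging.getLogger(name)
    logger.setLevel(level.upper())

    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        formatter = logging.Formatter(
            "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
        )

        console = logging.StreamHandler()
        console.setFormatter(formatter)
        logger.addHandler(console)

        file_handler = logging.FileHandler(log_file, encoding="utf-8")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger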

Requirements

  • Python 3.8+
  • Ollama running locally (default: http://localhost:11434)
  • Any Ollama model installed (default config: deepseek-r1:8b)

Installation

1. Clone the Repository

git clone https://github.com/shrimpy8/ollama-chat-interface.git
cd ollama-chat-interface

2. Create a Virtual Environment (Recommended)

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Ollama

Install Ollama

Download: Get Ollama from https://ollama.com/download

  • macOS: Download and install the .dmg file
  • Linux: curl -fsSL https://ollama.com/install.sh | sh
  • Windows: Download and run the installer

Official Documentation: https://docs.ollama.com/

Start Ollama and Pull a Model

# Start Ollama server (in terminal 1)
ollama serve

# Pull the DeepSeek-R1 model (in terminal 2)
ollama pull deepseek-r1:8b

# Or try other popular models:
# ollama pull llama3.2          # Meta's Llama 3.2
# ollama pull mistral           # Mistral AI
# ollama pull qwen2.5           # Alibaba's Qwen

# Verify the model is available
ollama list

Verify Ollama is Running

# Test the API
curl http://localhost:11434/api/version

# Expected output: {"version":"0.x.x"}

5. (Optional) Customize Configuration

The application works out-of-the-box with sensible defaults. To customize settings, edit config.yaml:

ollama:
  model_name: "deepseek-r1:8b"  # Change to your preferred model
  parameters:
    temperature: 0.7              # Adjust creativity (0.0-1.0)
    top_p: 0.9                    # Nucleus sampling
    top_k: 40                     # Vocabulary limit
    num_predict: 2048             # Max response length

Usage

Starting the Application

python app.py

The application will start on http://127.0.0.1:7860 (configurable in config.yaml).

Using the Chat Interface

  1. Open your browser: Navigate to http://127.0.0.1:7860
  2. Try example questions: Click any of the pre-populated questions
  3. Or type your message: Enter custom prompts in the input box
  4. Adjust parameters (optional): Open "Advanced Settings" to fine-tune:
    • Temperature: 0.0 (deterministic) to 1.0 (creative)
    • Top-P: Nucleus sampling threshold
    • Top-K: Vocabulary size limit
    • Max Tokens: Response length limit
  5. Send message: Click "Send" or press Enter
  6. View response: Watch the status indicator during generation
  7. Export conversation: Choose JSON or Markdown format and download

Example Workflows

Quick Test with Example Questions:

  1. Open the interface
  2. Click "How do I prioritize AI features in a product roadmap?"
  3. Observe response generation with status indicator
  4. Continue conversation with follow-up questions

Custom Research Session:

  1. Type your research question
  2. Adjust temperature to 0.8 for more creative responses
  3. Increase max tokens to 4096 for longer answers
  4. Export the conversation as Markdown for documentation

Parameter Experimentation:

  1. Ask the same question with different temperatures
  2. Compare response creativity and determinism
  3. Export both conversations and analyze differences

Project Structure

ollama-chat-interface/
├── app.py                      # Main application with UI and chat logic
├── config.yaml                 # Centralized configuration
├── common/                     # Shared utilities module
│   ├── __init__.py            # Module exports
│   ├── config_loader.py       # Configuration management
│   ├── logging_config.py      # Structured logging setup
│   ├── retry_utils.py         # Retry decorators and utilities
│   └── export_utils.py        # Conversation export functionality
├── tests/                      # Test suite (63 tests, 76% coverage)
│   ├── test_config_loader.py  # Configuration tests
│   ├── test_retry_utils.py    # Retry logic tests
│   ├── test_logging_config.py # Logging tests
│   └── test_export_utils.py   # Export functionality tests
├── screenshots/                # Documentation screenshots
│   ├── Complete_Chat_Interface.png
│   └── ollama_conversation_*.md
├── requirements.txt            # Python dependencies
├── requirements-dev.txt        # Development dependencies
├── .env.example               # Environment variable template
├── .gitignore                 # Git ignore patterns
├── pytest.ini                 # Pytest configuration
└── README.md                  # This file

Configuration

All settings are managed via config.yaml for easy customization without code changes:

# Ollama Server Configuration
ollama:
  base_url: "http://localhost:11434"
  api_endpoint: "/api/generate"
  model_name: "deepseek-r1:8b"

  # Model Parameters (overridable via UI)
  parameters:
    temperature: 0.7        # Creativity level (0.0 = deterministic, 1.0 = creative)
    top_p: 0.9             # Nucleus sampling
    top_k: 40              # Top-k sampling
    num_predict: 2048      # Maximum tokens to generate

# Request Configuration
request:
  timeout: 120             # Timeout in seconds for API requests
  retry:
    max_attempts: 3        # Maximum retry attempts for failed requests
    min_wait: 2            # Minimum wait time between retries (seconds)
    max_wait: 10           # Maximum wait time between retries (seconds)
    multiplier: 2          # Exponential backoff multiplier

# Logging Configuration
logging:
  level: "INFO"            # Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL
  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  file: "ollama_chat.log"  # Log file name
  console: true            # Enable console logging
  file_logging: true       # Enable file logging

# Gradio UI Configuration
ui:
  title: "DeepSeek-R1 AI Chat Interface"
  description: "Chat with DeepSeek-R1 model via local Ollama server"
  theme: "default"         # Gradio theme: default, soft, monochrome
  share: false             # Enable public sharing via gradio.live link
  server:
    port: 7860             # Port to run the server on
    host: "127.0.0.1"      # Host address (127.0.0.1 for local only)

# Conversation Settings
conversation:
  system_prompt: "You are a helpful AI assistant powered by DeepSeek-R1."
  memory_enabled: true     # Enable conversation memory
  context_window: 4096     # Maximum context window in tokens
  history:
    max_messages: 20       # Maximum number of messages to keep in history
    show_timestamps: true  # Show timestamps in chat history

Customize settings by editing config.yaml - no code changes needed!
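
For illustration, a singleton-style loader over this file might look like the sketch below; the class and method names are assumptions and may differ from common/config_loader.py:

import yaml  # PyYAML

class ConfigLoader:
    """Illustrative singleton-style YAML configuration loader."""
    _instance = None

    def __new__(cls, path: str = "config.yaml"):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            with open(path, "r", encoding="utf-8") as f:
                cls._instance._config = yaml.safe_load(f)
        return cls._instance

    def get(self, *keys, default=None):
        """Walk nested keys, e.g. get('ollama', 'parameters', 'temperature')."""
        node = self._config
        for key in keys:
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node

# Example usage:
# config = ConfigLoader()
# model = config.get("ollama", "model_name", default="deepseek-r1:8b")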

Testing

The project includes a comprehensive test suite with 63 tests covering core functionality, achieving 76% code coverage.

Running Tests

# Install development dependencies
pip install -r requirements-dev.txt

# Run all tests
pytest

# Run with verbose output
pytest -v

# Run with coverage report
pytest --cov=. --cov-report=html --cov-report=term

# Run specific test file
pytest tests/test_export_utils.py

# Run tests by marker
pytest -m unit              # Unit tests only
pytest -m integration       # Integration tests only

Test Coverage

Configuration Management (26 tests):

  • YAML configuration loading and validation
  • Default value handling
  • Error handling for missing/invalid files
  • All configuration getter methods
  • Singleton pattern implementation

Retry Logic (15 tests):

  • Retry decorator creation and configuration
  • Exponential backoff timing
  • Connection error handling
  • HTTP error handling (404, 500)
  • Timeout and request exception handling
  • Safe API call wrapper with fallback values

Logging (14 tests):

  • Logger setup and configuration
  • Console and file handlers
  • Log level configuration
  • Log message formatting
  • File creation and writing
  • Error handling for invalid paths

Export Utilities (8 tests):

  • JSON export with metadata
  • Markdown export formatting
  • Empty conversation handling
  • Unicode support (Chinese, Arabic, emoji)
  • Parameter preservation
  • Structure validation

Coverage reports: htmlcov/index.html (detailed interactive report)
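
To give a flavor of the test style, here is an illustrative unit test in the spirit of the export-utility tests; the export_to_json stand-in is defined inline only to keep the snippet self-contained, and the real function in common/export_utils.py may have a different signature:

import json
import pytest

def export_to_json(conversation, model):
    """Stand-in exporter used only to make this example runnable."""
    return json.dumps(
        {"export_metadata": {"model": model, "total_messages": len(conversation)},
         "conversation": conversation},
        ensure_ascii=False,
    )

@pytest.mark.unit
def test_json_export_preserves_unicode():
    conversation = [{"role": "user", "content": "你好 👋 مرحبا"}]
    exported = export_to_json(conversation, model="deepseek-r1:8b")
    data = json.loads(exported)
    assert data["export_metadata"]["total_messages"] == 1
    assert data["conversation"][0]["content"] == "你好 👋 مرحبا"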

Advanced Features

System Prompt Integration

The system prompt is automatically prepended to all prompts, defining the AI's personality and behavior:

# Configured in config.yaml
conversation:
  system_prompt: "You are a helpful AI assistant powered by DeepSeek-R1."

# Automatically included in every API call
# User sees normal conversation, but model receives context
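
One plausible way this prepending could work is sketched below; the function name and prompt template are assumptions for illustration, not necessarily what app.py does:

from typing import Dict, List

def build_prompt(system_prompt: str, history: List[Dict[str, str]], user_message: str) -> str:
    """Prepend the system prompt and prior exchanges to the new user message."""
    lines = [system_prompt, ""]
    for message in history:
        speaker = "User" if message["role"] == "user" else "Assistant"
        lines.append(f"{speaker}: {message['content']}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)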

Interactive Parameter Controls

Adjust model behavior in real-time via the Advanced Settings accordion:

| Parameter   | Range    | Description      | Impact                              |
|-------------|----------|------------------|-------------------------------------|
| Temperature | 0.0-1.0  | Creativity level | 0.0 = deterministic, 1.0 = creative |
| Top-P       | 0.0-1.0  | Nucleus sampling | Higher = more diverse vocabulary    |
| Top-K       | 1-100    | Vocabulary limit | Higher = more variety in responses  |
| Max Tokens  | 256-4096 | Response length  | Higher = longer responses           |

Note: UI parameters override config defaults for the current session only. Changes are not persisted.
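
For context, a single non-streaming call to Ollama's /api/generate endpoint with these options looks roughly like this; the exact request-building code in app.py may differ:

import requests

payload = {
    "model": "deepseek-r1:8b",
    "prompt": "Explain nucleus sampling in one paragraph.",
    "stream": False,
    "options": {                 # UI values override config defaults per session
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "num_predict": 2048,
    },
}

response = requests.post("http://localhost:11434/api/generate",
                         json=payload, timeout=120)
response.raise_for_status()
print(response.json()["response"])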

Response Status Indicators

Visual feedback during response generation:

  1. 🤔 Thinking... - Processing user input
  2. Generating response... - Model is generating
  3. Complete! - Response delivered

Conversation Export

Export your conversations with full metadata:

JSON Format Features:

  • Complete metadata (timestamp, model, system prompt, version)
  • All conversation exchanges with parameters
  • Structured for programmatic access
  • Unicode support (Chinese, Arabic, emoji)

Markdown Format Features:

  • Human-readable format for documentation
  • Includes metadata header
  • System prompt display
  • Timestamped exchanges
  • Parameter annotations for each message

File Naming: ollama_conversation_YYYY-MM-DD_HHMMSS.{json|md}
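
A small sketch of how such a timestamped file name can be generated (illustrative, not necessarily the exact helper in common/export_utils.py):

from datetime import datetime

def export_filename(fmt: str = "json") -> str:
    """Build a name like ollama_conversation_2025-12-03_110706.json."""
    stamp = datetime.now().strftime("%Y-%m-%d_%H%M%S")
    return f"ollama_conversation_{stamp}.{fmt}"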

Retry Logic

Network requests automatically retry on failure with exponential backoff:

  • Attempt 1: Immediate
  • Attempt 2: Wait 2 seconds
  • Attempt 3: Wait 4 seconds
  • Maximum wait: 10 seconds
  • Handles: Connection errors, timeouts, HTTP errors, request exceptions
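
Given that Tenacity is listed under Acknowledgments and the backoff settings live in config.yaml, the retry decorator plausibly resembles this sketch; the function name and exception choice are assumptions:

import requests
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

@retry(
    stop=stop_after_attempt(3),                          # max_attempts
    wait=wait_exponential(multiplier=2, min=2, max=10),  # multiplier, min_wait, max_wait
    retry=retry_if_exception_type(requests.exceptions.RequestException),
    reraise=True,
)
def call_ollama(url: str, payload: dict) -> dict:
    """POST to the Ollama API, retrying transient failures with exponential backoff."""
    response = requests.post(url, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()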

Error Handling

Comprehensive error handling for all scenarios:

| Error Type       | User Message                           | Recovery Action                          |
|------------------|----------------------------------------|------------------------------------------|
| Connection Error | "Cannot connect to Ollama server"      | Check if Ollama is running               |
| Timeout          | "Request timed out after 120 seconds"  | Try shorter prompt or increase timeout   |
| Model Not Found  | "404 Client Error: model not found"    | Check model name in config.yaml          |
| Empty Response   | "Received empty response"              | Retry request                            |
| Unexpected Error | "An unexpected error occurred"         | Check logs for details                   |
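
A hedged sketch of how that mapping could be implemented (not necessarily the exact code in app.py):

import requests

def friendly_error(exc: Exception) -> str:
    """Translate low-level exceptions into the user-facing messages listed above."""
    if isinstance(exc, requests.exceptions.ConnectionError):
        return "Cannot connect to Ollama server"
    if isinstance(exc, requests.exceptions.Timeout):
        return "Request timed out after 120 seconds"
    if isinstance(exc, requests.exceptions.HTTPError):
        if exc.response is not None and exc.response.status_code == 404:
            return "404 Client Error: model not found"
    return "An unexpected error occurred"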

Troubleshooting

Common Issues

Application Won't Start

  • Verify all dependencies installed: pip install -r requirements.txt
  • Check Python version: python --version (3.8+ required)
  • Ensure config.yaml exists and is valid YAML

Cannot Connect to Ollama

# Check if Ollama is running
ps aux | grep ollama

# Start Ollama if not running
ollama serve

# Verify API is accessible
curl http://localhost:11434/api/version

Model Not Found (404 Error)

# List available models
ollama list

# Pull the required model (match config.yaml model_name)
ollama pull deepseek-r1:8b

Slow Responses

  • DeepSeek-R1 is a reasoning model and may take longer
  • Check your system resources (CPU/RAM usage)
  • Try a smaller model: ollama pull llama3.2 (faster)
  • Reduce num_predict in config.yaml or via UI

Values Cut Off in Slider Inputs

  • This issue was resolved by adjusting the input padding (1px top, 22-23px bottom)
  • If still experiencing issues, check your browser zoom level (should be 100%)

Port Already in Use

# Change port in config.yaml
ui:
  server:
    port: 8080  # Use different port

Debugging

All operations are logged to ollama_chat.log. For detailed debugging:

  1. Enable debug logging:

    # config.yaml
    logging:
      level: "DEBUG"
  2. Check the log file:

    tail -f ollama_chat.log
  3. Test API directly:

    curl -X POST http://localhost:11434/api/generate \
      -H "Content-Type: application/json" \
      -d '{"model":"deepseek-r1:8b","prompt":"Hello","stream":false}'

Performance Considerations

  • Conversation Memory: Automatically trims to max_messages to stay within token limits
  • Request Timeout: Default 120 seconds (configurable)
  • Retry Logic: Maximum 3 attempts with exponential backoff
  • Token Usage: Monitor num_predict to control response length and cost
  • Model Selection: Smaller models (7B-8B) respond faster than larger models (70B+)
  • UI Responsiveness: Status indicators provide real-time feedback during generation

Security Notes

  • Local Deployment: Runs on 127.0.0.1 by default (localhost only)
  • No API Keys: Uses local Ollama - no external API credentials needed
  • Public Sharing: Disabled by default (share: false in config.yaml)
  • Input Validation: All user inputs validated before processing
  • Error Sanitization: Error messages sanitized to avoid information leakage
  • No Data Collection: All conversations stay local

Customization

Changing the Model

Edit config.yaml:

ollama:
  model_name: "llama3.2"  # Change to your preferred model

Ensure the model is installed:

ollama pull llama3.2

Popular Model Options

| Model          | Size    | Best For                | Speed       |
|----------------|---------|-------------------------|-------------|
| deepseek-r1:8b | 8B      | Reasoning, analysis     | Medium      |
| llama3.2       | 3B-70B  | General chat, versatile | Fast-Slow   |
| mistral        | 7B      | Efficient, balanced     | Fast        |
| qwen2.5        | 7B-72B  | Multilingual, coding    | Fast-Medium |
| gemma2         | 2B-27B  | Lightweight, efficient  | Very Fast   |

Customizing the UI

ui:
  title: "My Custom AI Chat"
  description: "Powered by Ollama and Python"
  theme: "soft"              # Options: default, soft, monochrome
  share: true                # Enable public sharing (use with caution)
  server:
    port: 8080               # Custom port

Adjusting System Prompt

conversation:
  system_prompt: "You are an expert Python developer and educator. Provide clear, detailed explanations with code examples."

Disabling Conversation Memory

For stateless interactions:

conversation:
  memory_enabled: false

Development

Project Philosophy

This project demonstrates production-ready Python development:

  • Configuration over code: Settings externalized to YAML
  • Observability: Comprehensive logging for debugging
  • Resilience: Retry logic for network failures
  • Modularity: Reusable utilities for common operations
  • Type safety: Full type hints for better IDE support
  • User Experience: Professional UI with real-time feedback
  • Documentation: Clear docstrings and comprehensive README

Code Quality Standards

  • Type hints on all functions and methods
  • Google-style docstrings
  • Specific exception handling (no bare except)
  • Structured logging throughout
  • Configuration-driven behavior
  • No hardcoded values
  • 76% test coverage

Extending the Application

The modular architecture makes it easy to extend:

Adding a new feature:

  1. Add configuration to config.yaml
  2. Update ConfigLoader with getter methods
  3. Implement feature in app.py
  4. Add logging and error handling
  5. Write tests in tests/
  6. Update README documentation

Creating custom utilities:

  1. Add new module to common/
  2. Export in common/__init__.py
  3. Import in app.py as needed
  4. Write comprehensive tests
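
As a concrete, hypothetical illustration of those steps, a made-up common/token_utils.py module could be wired in as follows; the module name and function are invented for the example:

# common/token_utils.py (hypothetical module used only for illustration)
def rough_token_count(text: str) -> int:
    """Very rough token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)


# common/__init__.py -- re-export it alongside the existing utilities:
#   from .token_utils import rough_token_count

# app.py -- import and use it where needed:
#   from common import rough_token_count
#   tokens = rough_token_count(user_message)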

License

MIT License

Acknowledgments

This application utilizes the following technologies:

  • Ollama - Local LLM runtime and API
  • Gradio - Web UI framework
  • DeepSeek-R1 - Reasoning-focused language model
  • Tenacity - Retry library with exponential backoff
  • PyYAML - YAML configuration parser
  • Pytest - Testing framework

Made with ❤️ using Ollama, Gradio, and Python

For issues, feature requests, or contributions, visit the GitHub repository: https://github.com/shrimpy8/ollama-chat-interface
