Ollama – Run Large Language Models Locally on Linux

Ollama is a powerful open-source tool that enables you to run large language models (LLMs) locally on your Linux machine. Unlike cloud-based AI services, Ollama gives you complete control over your AI interactions, ensuring privacy and eliminating API costs. Whether you’re a developer building AI applications, a researcher experimenting with different models, or simply curious about local AI, Ollama provides an accessible entry point into the world of self-hosted language models.

Key Features

  • Local Execution – Run LLMs entirely on your hardware without internet dependency
  • Model Library – Access popular models like Llama 2, Mistral, CodeLlama, and Vicuna
  • Simple CLI – Intuitive command-line interface for model management
  • REST API – Built-in API server for application integration
  • GPU Acceleration – Supports NVIDIA CUDA and AMD ROCm for faster inference
  • Model Customization – Create custom models with Modelfiles
  • Lightweight – Ships as a single binary with minimal resource overhead
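
The Model Customization feature works through Modelfiles, plain-text recipes that layer parameters and a system prompt on top of a base model. A minimal sketch (the custom model name and system prompt below are illustrative):

# Write a Modelfile that wraps llama2 with a custom system prompt
cat > Modelfile <<'EOF'
FROM llama2
PARAMETER temperature 0.7
SYSTEM "You are a concise Linux troubleshooting assistant."
EOF

# Build the custom model and chat with it
ollama create linux-helper -f Modelfile
ollama run linux-helper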

System Requirements

Ollama runs on most modern Linux systems. For optimal performance, consider the following specifications:

  • CPU – Modern multi-core processor (AMD64 or ARM64)
  • RAM – Minimum 8GB, recommended 16GB+ for larger models
  • Storage – 10GB+ free space for models
  • GPU – Optional but recommended: NVIDIA GPU with CUDA support or AMD GPU with ROCm
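
If you plan on GPU acceleration, it is worth confirming that the drivers are visible before installing (this assumes the vendor's standard tooling is already present):

# NVIDIA: should list your GPU and driver/CUDA version
nvidia-smi

# AMD: should list your GPU if ROCm is set up
rocm-smi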

Installation on Linux

Installing Ollama on Linux is straightforward with the official installation script:

curl -fsSL https://ollama.com/install.sh | sh

For manual installation or package manager options:

# Download the binary directly
curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama
chmod +x ollama
sudo mv ollama /usr/local/bin/

# Start the Ollama service
ollama serve
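
On systemd-based distributions the official script typically registers Ollama as a background service, so running ollama serve by hand is only needed for manual installs. A quick sanity check after installing:

# Confirm the binary is on your PATH
ollama --version

# If installed via the script, check the systemd unit
systemctl status ollama

# The API server answers on port 11434 by default
curl http://localhost:11434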

Basic Usage

Once installed, you can start using Ollama immediately:

# Pull and run a model
ollama run llama2

# List available models
ollama list

# Pull a specific model
ollama pull mistral

# Remove a model
ollama rm llama2

# Show model information
ollama show llama2
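
Note that ollama run starts an interactive chat session by default; it also accepts a prompt as an argument for one-shot, scriptable use:

# One-shot prompt: print the response and exit
ollama run llama2 "Summarize the ext4 filesystem in two sentences."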

Supported Models

Ollama supports a wide range of open-source models:

  • Llama 2 – Meta’s powerful open-source LLM (7B, 13B, 70B parameters)
  • Mistral – Efficient 7B model with excellent performance
  • CodeLlama – Specialized for code generation and understanding
  • Vicuna – Fine-tuned for conversations
  • Orca Mini – Smaller model optimized for limited hardware
  • Neural Chat – Intel’s optimized conversational model
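
Most models are published in several parameter sizes, selected with a tag after the model name; omitting the tag pulls the default variant:

# Pull a specific Llama 2 size
ollama pull llama2:13b

# Larger variants need considerably more RAM
ollama run llama2:70b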

API Integration

Ollama exposes a built-in REST API on port 11434 (an OpenAI-compatible endpoint is also available under /v1), making it easy to integrate with existing applications:

# Generate a completion
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Explain Linux file permissions"
}'

# Chat completion
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}'
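
Both endpoints stream the response as a sequence of JSON objects by default. For scripts that are easier to write against a single reply, the API accepts a stream flag:

# Request a single, non-streaming JSON response
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Explain Linux file permissions",
  "stream": false
}'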

Use Cases

  • Private AI Assistant – Chat with AI without sending data to external servers
  • Code Generation – Use CodeLlama for programming assistance
  • Document Analysis – Process sensitive documents locally
  • Application Development – Build AI-powered applications with the REST API
  • Learning & Research – Experiment with different LLM architectures

Download Ollama: https://ollama.com
