
LocalAI – Self-Hosted OpenAI Alternative



LocalAI is a free, open-source alternative to OpenAI that allows you to run AI models locally without requiring a GPU. It provides a drop-in replacement REST API compatible with OpenAI’s specifications, enabling you to use existing applications and libraries that integrate with OpenAI while maintaining complete data privacy and eliminating API costs.

Key Features

  • OpenAI API Compatible – Drop-in replacement for OpenAI API endpoints
  • No GPU Required – Runs efficiently on CPU-only systems
  • Multiple Model Support – LLMs, image generation, audio transcription, embeddings
  • Docker Ready – Easy deployment with official Docker images
  • Function Calling – Supports OpenAI-style function calling
  • Grammar Constraints – JSON mode and structured output support
  • Text-to-Speech – Built-in TTS capabilities
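
Function calling uses the same request shape as OpenAI's API. A minimal sketch of building such a request body in Python (the `get_weather` tool is a hypothetical example for illustration, not part of LocalAI; the configured model must support function calling):

```python
import json

def build_function_call_request(model: str, user_message: str) -> dict:
    # OpenAI-style chat completion body with a "tools" array, as accepted
    # by LocalAI's /v1/chat/completions endpoint.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

body = build_function_call_request("gpt-3.5-turbo", "Weather in Oslo?")
print(json.dumps(body, indent=2))
```

If the model decides to call the tool, the response's `choices[0].message` carries a `tool_calls` entry instead of plain text, following OpenAI's schema.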

Supported Model Types

  • Text Generation – LLaMA, GPT-J, Falcon, MPT, and more
  • Image Generation – Stable Diffusion models
  • Audio – Whisper for transcription, various TTS models
  • Embeddings – BERT, sentence-transformers
  • Vision – LLaVA, BakLLaVA for image understanding

Installation

The easiest way to run LocalAI is with Docker:

# Basic installation with Docker
docker run -p 8080:8080 --name local-ai \
  -v $PWD/models:/models \
  localai/localai:latest

# With GPU support (NVIDIA)
docker run -p 8080:8080 --gpus all --name local-ai \
  -v $PWD/models:/models \
  localai/localai:latest-gpu-nvidia-cuda-12
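
For a longer-lived setup, the same container can be described declaratively. A minimal Docker Compose sketch, assuming the same image, port, and models directory as the commands above:

```yaml
# Minimal Docker Compose sketch for LocalAI (CPU image)
services:
  local-ai:
    image: localai/localai:latest
    ports:
      - "8080:8080"
    volumes:
      - ./models:/models
```

Run it with `docker compose up -d` from the directory containing this file.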

For native installation:

# Download the latest release
curl -Lo local-ai https://github.com/mudler/LocalAI/releases/latest/download/local-ai-Linux-x86_64
chmod +x local-ai

# Run LocalAI
./local-ai --models-path ./models

Basic Usage

LocalAI exposes OpenAI-compatible endpoints. The model name in each request refers to a model configured in your models directory; you can name a local model gpt-3.5-turbo so existing clients keep working unchanged:

# Text completion
curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gpt-3.5-turbo",
  "prompt": "What is Linux?",
  "temperature": 0.7
}'

# Chat completion
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": "Hello!"}]
}'

# Generate embeddings
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
  "model": "text-embedding-ada-002",
  "input": "Linux is an open-source operating system"
}'
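
The embeddings endpoint returns vectors you can compare for semantic similarity. A minimal cosine-similarity helper in pure Python (the example vectors below are made up stand-ins, not real model output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for two /v1/embeddings responses
v1 = [0.1, 0.9, 0.3]
v2 = [0.1, 0.8, 0.4]
print(round(cosine_similarity(v1, v2), 3))
```

A score near 1.0 means the two inputs are semantically close; real embedding vectors have hundreds or thousands of dimensions, but the formula is the same.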

Integration with Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain containers"}]
)
print(response.choices[0].message.content)
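
Because the response follows OpenAI's JSON schema, the SDK is optional: the standard library is enough. A sketch using only urllib and json (the `chat` call assumes LocalAI is listening on localhost:8080, so it is left commented out; `extract_reply` is demonstrated on an abbreviated sample response):

```python
import json
import urllib.request

def extract_reply(response_body: dict) -> str:
    # Chat completions place the reply at choices[0].message.content
    return response_body["choices"][0]["message"]["content"]

def chat(prompt: str, base_url: str = "http://localhost:8080") -> str:
    payload = json.dumps({
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_reply(json.load(resp))

# chat("Explain containers")  # requires a running LocalAI instance

# Abbreviated sample response in OpenAI's schema
sample = {"choices": [{"message": {"role": "assistant", "content": "Containers are..."}}]}
print(extract_reply(sample))
```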

Use Cases

  • Private ChatGPT – Run a private AI assistant without cloud dependencies
  • Development Testing – Test OpenAI integrations without API costs
  • Enterprise Deployment – Self-hosted AI for sensitive data environments
  • Offline AI – AI capabilities without internet connectivity
  • Multi-Modal Applications – Combine text, image, and audio AI in one service

Download LocalAI
