pi-skills/ccc/references/settings.md
alex wiesner 5d5d0e2d26 updates
2026-04-12 06:47:14 +01:00

ccc Settings

Configuration lives in two YAML files, both created automatically by ccc init.

User-Level Settings (~/.cocoindex_code/global_settings.yml)

Shared across all projects. Controls the embedding model and extra environment variables for the daemon.

embedding:
  provider: sentence-transformers   # or "litellm" (default when provider is omitted)
  model: sentence-transformers/all-MiniLM-L6-v2
  device: mps                       # optional: cpu, cuda, mps (auto-detected if omitted)
  min_interval_ms: 300              # optional, LiteLLM only (ignored by sentence-transformers): minimum gap between embedding requests, to reduce 429s; defaults to 5

envs:                               # extra environment variables for the daemon
  OPENAI_API_KEY: your-key          # only needed if not already in the shell environment

Fields

| Field | Description |
| --- | --- |
| embedding.provider | sentence-transformers for local models, litellm (or omit) for cloud/remote models |
| embedding.model | Model identifier; format depends on provider (see examples below) |
| embedding.device | Optional. cpu, cuda, or mps. Auto-detected if omitted. Only relevant for sentence-transformers. |
| embedding.min_interval_ms | Optional. Minimum delay between LiteLLM embedding requests in milliseconds. Defaults to 5 for LiteLLM and is ignored by sentence-transformers. Set explicitly to override the default. |
| envs | Key-value map of environment variables injected into the daemon. Use for API keys not already in the shell environment. |
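
The pacing that embedding.min_interval_ms describes can be pictured as a minimum gap enforced between consecutive requests. The sketch below is illustrative only: MinIntervalPacer and its wait method are hypothetical names, not part of ccc or LiteLLM, and the daemon's actual implementation may differ.

```python
import time


class MinIntervalPacer:
    """Hypothetical helper: enforce a minimum gap between successive requests."""

    def __init__(self, min_interval_ms, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval_ms / 1000.0  # convert ms to seconds
        self._clock = clock   # injectable so the pacer can be tested without real waiting
        self._sleep = sleep
        self._last = None     # time of the previous request; None before the first one

    def wait(self):
        """Sleep until the minimum interval has elapsed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            # Time still missing to reach the minimum gap (never negative).
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Under these assumptions, calling wait() before each embedding request caps the rate at roughly 1000 / min_interval_ms requests per second, which is why a larger value can help when a provider answers with HTTP 429.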

Embedding Model Examples

Local (sentence-transformers, no API key needed):

embedding:
  provider: sentence-transformers
  model: sentence-transformers/all-MiniLM-L6-v2    # default, lightweight

embedding:
  provider: sentence-transformers
  model: nomic-ai/CodeRankEmbed                     # better code retrieval, needs GPU (~1 GB VRAM)

Ollama (local):

embedding:
  model: ollama/nomic-embed-text

OpenAI:

embedding:
  model: text-embedding-3-small
  min_interval_ms: 300
envs:
  OPENAI_API_KEY: your-api-key

Gemini:

embedding:
  model: gemini/gemini-embedding-001
envs:
  GEMINI_API_KEY: your-api-key

Voyage (code-optimized):

embedding:
  model: voyage/voyage-code-3
envs:
  VOYAGE_API_KEY: your-api-key

For the full list of supported cloud providers and model identifiers, see LiteLLM Embedding Models.

Important

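Switching embedding models usually changes the vector dimensions, and vectors produced by different models are not comparable even when the dimensions happen to match. You must re-index after changing the model: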

ccc reset && ccc index
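
One way to see why vectors from the previous model cannot be reused: a similarity score between a query vector and a stored vector is only defined when both have the same length. The sketch below assumes cosine similarity as the comparison (an assumption; this document does not say which metric ccc uses) and relies on the published output sizes of all-MiniLM-L6-v2 (384) and text-embedding-3-small (1536 by default). The vectors are placeholders, not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for real embeddings.
stored_vector = [0.1] * 384    # indexed earlier with all-MiniLM-L6-v2
query_vector = [0.1] * 1536    # embedded now with text-embedding-3-small

try:
    cosine_similarity(stored_vector, query_vector)
    comparable = True
except ValueError:
    comparable = False  # the old index cannot serve queries from the new model
```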

Project-Level Settings (<project>/.cocoindex_code/settings.yml)

Per-project. Controls which files to index. Created by ccc init and automatically added to .gitignore.

include_patterns:
  - "**/*.py"
  - "**/*.js"
  - "**/*.ts"
  # ... (sensible defaults for 28+ file types)

exclude_patterns:
  - "**/.*"              # hidden files and directories
  - "**/__pycache__"
  - "**/node_modules"
  - "**/dist"
  # ...

language_overrides:
  - ext: inc             # treat .inc files as PHP
    lang: php

Fields

| Field | Description |
| --- | --- |
| include_patterns | Glob patterns for files to index. Defaults cover common languages (Python, JS/TS, Rust, Go, Java, C/C++, C#, SQL, Shell, Markdown, PHP, Lua, etc.). |
| exclude_patterns | Glob patterns for files/directories to skip. Defaults exclude hidden dirs, node_modules, dist, __pycache__, vendor, etc. |
| language_overrides | List of {ext, lang} pairs to override language detection for specific file extensions. |
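
The effect of language_overrides can be pictured as a lookup that is consulted before the built-in extension mapping. This is a minimal sketch, not ccc's code: detect_language and DEFAULT_LANG_BY_EXT are hypothetical, and the default table here is a tiny stand-in for whatever ccc ships with.

```python
from pathlib import PurePosixPath

# Hypothetical, heavily abbreviated stand-in for the built-in extension mapping.
DEFAULT_LANG_BY_EXT = {"py": "python", "js": "javascript", "php": "php"}


def detect_language(filename, language_overrides):
    """Return the language for a file, letting {ext, lang} overrides win over defaults."""
    ext = PurePosixPath(filename).suffix.lstrip(".").lower()
    # Build a lookup from the list-of-pairs form used in settings.yml.
    overrides = {o["ext"].lstrip(".").lower(): o["lang"] for o in language_overrides}
    return overrides.get(ext, DEFAULT_LANG_BY_EXT.get(ext))


overrides = [{"ext": "inc", "lang": "php"}]   # same shape as the YAML example
language = detect_language("lib/helpers.inc", overrides)
```

With the override in place, a .inc file resolves to php in this sketch; without it, the extension is unknown to the stand-in table and the function returns None.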

Editing Tips

  • To index additional file types, append glob patterns to include_patterns (e.g. "**/*.proto").
  • To exclude a directory, append to exclude_patterns (e.g. "**/generated").
  • After editing, run ccc index to re-index with the new settings.
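
To preview how an added pattern interacts with the existing lists before re-indexing, the matching can be approximated in a few lines. This is a sketch under stated assumptions, not ccc's matcher: glob_to_regex and should_index are hypothetical helpers, "**/" is assumed to match zero or more directories, "*" is assumed not to cross a "/", and an exclude pattern that matches a parent directory is assumed to exclude everything beneath it.

```python
import re
from pathlib import PurePosixPath


def glob_to_regex(pattern):
    """Translate a simple glob (**, *, ?) into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")   # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")         # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")      # anything within one path component
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")       # one character within a component
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def should_index(path, include_patterns, exclude_patterns):
    """True if a relative POSIX path is included and neither it nor a parent is excluded."""
    p = PurePosixPath(path)
    # The file itself plus every ancestor directory (skipping the root ".").
    candidates = [str(p)] + [str(a) for a in p.parents if str(a) != "."]
    excludes = [glob_to_regex(g) for g in exclude_patterns]
    if any(rx.fullmatch(c) for rx in excludes for c in candidates):
        return False
    return any(glob_to_regex(g).fullmatch(str(p)) for g in include_patterns)
```

For example, under these assumptions should_index("api/service.proto", ...) only becomes true once "**/*.proto" is appended to include_patterns, and appending "**/generated" to exclude_patterns drops every file below any generated directory.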