APFEL(1)

NAME

apfelApple Intelligence from the command line. On-device LLM via FoundationModels framework. No API keys, no cloud, no…

SYNOPSIS

$brew install apfel

INFO

3.4k stars
124 forks
0 views

DESCRIPTION

Apple Intelligence from the command line. On-device LLM via FoundationModels framework. No API keys, no cloud, no dependencies.

README

apfel

Version 0.9.0 Swift 6.3+ macOS 26+ No Xcode Required License: MIT 100% On-Device Website

Use the FREE local Apple Intelligence LLM on your Mac - your model, your machine, your way.

No API keys. No cloud. No subscriptions. No per-token billing. The AI is already on your computer - apfel lets you use it.

What is this

Every Mac with Apple Silicon has a built-in LLM - Apple's on-device foundation model, shipped as part of Apple Intelligence. Apple provides the FoundationModels framework (macOS 26+) to access it, but only exposes it through Siri and system features. apfel wraps it in a CLI and an HTTP server - so you can actually use it. All inference runs on-device, no network calls.

  • UNIX tool - echo "summarize this" | apfel - pipe-friendly, file attachments, JSON output, exit codes
  • OpenAI-compatible server - apfel --serve - drop-in replacement at localhost:11434, works with any OpenAI SDK
  • Tool calling - function calling with schema conversion, full round-trip support
  • Zero cost - no API keys, no cloud, no subscriptions, 4096-token context window

apfel CLI

Requirements & Install

  • Apple Silicon Mac, macOS 26 Tahoe or newer, Apple Intelligence enabled
  • Building from source requires Command Line Tools with macOS 26.4 SDK (ships Swift 6.3). No Xcode required.

Homebrew (recommended):

brew tap Arthur-Ficial/tap
brew install apfel
brew upgrade apfel

Update:

brew upgrade apfel

Build from source:

git clone https://github.com/Arthur-Ficial/apfel.git
cd apfel
make install

Troubleshooting: docs/install.md

Quick Start

UNIX tool

Shell note: if your prompt contains !, prefer single quotes in zsh/bash so history expansion does not break copy-paste. Example: apfel 'Hello, Mac!'

# Single prompt
apfel "What is the capital of Austria?"

Stream output

apfel --stream "Write a haiku about code"

Pipe input

echo "Summarize: $(cat README.md)" | apfel

Attach file content to prompt

apfel -f README.md "Summarize this project"

Attach multiple files

apfel -f old.swift -f new.swift "What changed between these two files?"

Combine files with piped input

git diff HEAD~1 | apfel -f CONVENTIONS.md "Review this diff against our conventions"

JSON output for scripting

apfel -o json "Translate to German: hello" | jq .content

System prompt

apfel -s "You are a pirate" "What is recursion?"

System prompt from file

apfel --system-file persona.txt "Explain TCP/IP"

Quiet mode for shell scripts

result=$(apfel -q "Capital of France? One word.")

OpenAI-compatible server

Start the server:

apfel --serve

Then in another terminal:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","messages":[{"role":"user","content":"Hello"}]}'

Works with the official Python client:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused") resp = client.chat.completions.create( model="apple-foundationmodel", messages=[{"role": "user", "content": "What is 1+1?"}], ) print(resp.choices[0].message.content)

Interactive chat

apfel --chat
apfel --chat -s "You are a helpful coding assistant"
apfel --chat --mcp ./mcp/calculator/server.py      # chat with MCP tools
apfel --chat --debug                                # debug output to stderr

Ctrl-C exits cleanly. Context window is managed automatically with configurable strategies:

apfel --chat --context-strategy newest-first     # default: keep recent turns
apfel --chat --context-strategy oldest-first     # keep earliest turns
apfel --chat --context-strategy sliding-window --context-max-turns 6
apfel --chat --context-strategy summarize        # compress old turns via on-device model
apfel --chat --context-strategy strict           # error on overflow, no trimming
apfel --chat --context-output-reserve 256        # custom output token reserve

Updating

apfel --update

Or directly via Homebrew: brew upgrade apfel

Benchmarking

Measure internal overhead on the installed release binary:

apfel --benchmark -o json
make benchmark

File attachments (-f/--file)

Attach one or more files to any prompt with -f. File contents are prepended to your prompt text. The flag is repeatable - use it as many times as you need.

# Explain a file
apfel -f main.swift "Explain what this code does"

Compare two files

apfel -f before.txt -f after.txt "What are the differences?"

Code review a git diff

jj diff | apfel -f CONVENTIONS.md "Review this diff against our coding conventions" git diff HEAD~1 | apfel -f style-guide.md "Any style violations in this diff?"

Summarize a commit

git show HEAD | apfel -f CHANGELOG.md "Write a changelog entry for this commit"

Combine with other flags

apfel -f data.csv -o json "Extract the top 5 rows" | jq .content apfel -f code.py -s "You are a senior code reviewer" "Find bugs" apfel -f spec.md --stream "Generate test cases for this spec"

Files, stdin, and prompt arguments all compose naturally:

# File only (file content becomes the prompt)
apfel -f poem.txt

File + prompt argument

apfel -f poem.txt "Translate this to German"

Stdin + prompt argument

echo "some text" | apfel "Summarize this"

File + stdin + prompt argument (all three combined)

echo "extra context" | apfel -f code.swift "Explain this code with the context above"

MCP Tool Support

Attach MCP tool servers with --mcp. apfel discovers tools, executes them automatically, and returns the final answer. No glue code needed.

apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
mcp: ./mcp/calculator/server.py - add, subtract, multiply, divide, sqrt, power    ← stderr
tool: multiply({"a": 15, "b": 27}) = 405                                          ← stderr
15 times 27 is 405.                                                                ← stdout

Tool discovery and call info goes to stderr (visible in terminal, invisible when piped). Only the answer goes to stdout. Use -q to suppress tool info entirely.

# Multiple MCP servers (bring your own)
apfel --mcp ./server_a.py --mcp ./server_b.py "Use both tools"

Server mode - tools auto-available to all clients

apfel --serve --mcp ./mcp/calculator/server.py

No --mcp = exactly as before. Zero overhead.

apfel "What is 2+2?"

Ships with a calculator MCP server at mcp/calculator/. See MCP docs for details and how to build your own.

Demos

See demo/ for real-world shell scripts powered by apfel.

cmd - natural language to shell command:

demo/cmd "find all .log files modified today"
# $ find . -name "*.log" -type f -mtime -1

demo/cmd -x "show disk usage sorted by size" # -x = execute after confirm demo/cmd -c "list open ports" # -c = copy to clipboard

Shell function version - add to your .zshrc and use cmd from anywhere:

# cmd - natural language to shell command (apfel). Add to .zshrc:
cmd(){ local x c r a; while [[ $1 == -* ]]; do case $1 in -x)x=1;shift;; -c)c=1;shift;; *)break;; esac; done; r=$(apfel -q -s 'Output only a shell command.' "$*" | sed '/^```/d;/^#/d;s/\x1b\[[0-9;]*[a-zA-Z]//g;s/^[[:space:]]*//;/^$/d' | head -1); [[ $r ]] || { echo "no command generated"; return 1; }; printf '\e[32m$\e[0m %s\n' "$r"; [[ $c ]] && printf %s "$r" | pbcopy && echo "(copied)"; [[ $x ]] && { printf 'Run? [y/N] '; read -r a; [[ $a == y ]] && eval "$r"; }; return 0; }
cmd find all swift files larger than 1MB     # shows: $ find . -name "*.swift" -size +1M
cmd -c show disk usage sorted by size        # shows command + copies to clipboard
cmd -x what process is using port 3000       # shows command + asks to run it
cmd list all git branches merged into main
cmd count lines of code by language

oneliner - complex pipe chains from plain English:

demo/oneliner "sum the third column of a CSV"
# $ awk -F',' '{sum += $3} END {print sum}' file.csv

demo/oneliner "count unique IPs in access.log"

$ awk '{print $1}' access.log | sort | uniq -c | sort -rn

mac-narrator - your Mac's inner monologue:

demo/mac-narrator              # one-shot: what's happening right now?
demo/mac-narrator --watch      # continuous narration every 60s

Also in demo/:

  • wtd - "what's this directory?" - instant project orientation
  • explain - explain a command, error, or code snippet
  • naming - naming suggestions for functions, variables, files
  • port - what's using this port?
  • gitsum - summarize recent git activity

Debug GUI

brew install Arthur-Ficial/tap/apfel-gui
apfel-gui

apfel GUI

Native macOS SwiftUI app for debugging and chatting with Apple Intelligence. Debug inspector with full request/response timeline, MCP JSON-RPC protocol viewer, live request logs, model settings, speech-to-text, text-to-speech - all on-device.

Full documentation and features in the apfel-gui repo ->

OpenAI API Compatibility

Base URL: http://localhost:11434/v1

FeatureStatusNotes
POST /v1/chat/completionsSupportedStreaming + non-streaming
GET /v1/modelsSupportedReturns apple-foundationmodel
GET /healthSupportedModel availability, context window, languages
GET /v1/logsDebug onlyAvailable only when server starts with --debug
GET /v1/logs/statsDebug onlyAvailable only when server starts with --debug
Tool callingSupportedNative Transcript.ToolDefinition + JSON detection. See Tool Calling Guide
response_format: json_objectSupportedVia system prompt injection
temperature, max_tokens, seedSupportedMapped to GenerationOptions
stream: trueSupportedSSE with usage stats in final chunk
finish_reasonSupportedstop, tool_calls, length
Context strategiesSupportedx_context_strategy, x_context_max_turns, x_context_output_reserve extension fields
CORSSupportedEnable with --cors
POST /v1/completions501Legacy text completions not supported
POST /v1/embeddings501Embeddings not available on-device
logprobs=true, n>1, stop, presence_penalty, frequency_penalty400Rejected explicitly. n=1 and logprobs=false are accepted as no-ops
Multi-modal (images)400Rejected with clear error
Authorization headerSupportedRequired when --token is set. See Server Security

Full API spec: openai/openai-openapi

Limitations

ConstraintDetail
Context window4096 tokens (input + output combined). ~3000 English words.
PlatformmacOS 26+, Apple Silicon only
ModelOne model (apple-foundationmodel), not configurable
GuardrailsApple's safety system may block benign prompts (false positives exist)
SpeedOn-device inference, not cloud-scale - expect a few seconds per response
No embeddingsApple's model doesn't support vector embeddings
No visionImage/multi-modal input not supported

CLI Reference

MODES
  apfel <prompt>                          Single prompt (default)
  apfel --stream <prompt>                 Stream response tokens
  apfel --chat                            Interactive conversation
  apfel --serve                           Start OpenAI-compatible server
  apfel --benchmark                       Run internal performance benchmarks

INPUT apfel -f, --file <path> <prompt> Attach file content (repeatable) apfel -s, --system <text> <prompt> Set system prompt apfel --system-file <path> <prompt> Read system prompt from file apfel --mcp <server.py> <prompt> Attach MCP tool server (repeatable)

OUTPUT -o, --output <fmt> Output format: plain, json -q, --quiet Suppress non-essential output --no-color Disable ANSI colors

MODEL --temperature <n> Sampling temperature (e.g., 0.7) --seed <n> Random seed for reproducibility --max-tokens <n> Maximum response tokens --permissive Use permissive content guardrails --retry [n] Retry transient errors with backoff (default: 3) --debug Enable debug logging to stderr (all modes)

CONTEXT (--chat) --context-strategy <s> newest-first, oldest-first, sliding-window, summarize, strict --context-max-turns <n> Max history turns (sliding-window only) --context-output-reserve <n> Tokens reserved for output (default: 512)

SERVER (--serve) --port <n> Server port (default: 11434) --host <addr> Bind address (default: 127.0.0.1) --cors Enable CORS headers --allowed-origins <origins> Comma-separated allowed origins --no-origin-check Disable origin checking --token <secret> Require Bearer token auth --token-auto Generate random Bearer token --public-health Keep /health unauthenticated --footgun Disable all protections --max-concurrent <n> Max concurrent requests (default: 5) --debug Verbose logging + /v1/logs endpoints

META -v, --version Print version -h, --help Show help --release Detailed build info --model-info Print model capabilities --update Check for updates via Homebrew

Examples by flag:

# -f, --file — attach file content to prompt (repeatable)
apfel -f main.swift "Explain this code"
apfel -f before.txt -f after.txt "What changed?"

-s, --system — set a system prompt

apfel -s "You are a pirate" "What is recursion?" apfel -s "Reply in JSON only" "List 3 colors"

--system-file — read system prompt from a file

apfel --system-file persona.txt "Introduce yourself"

--mcp — attach MCP tool servers (repeatable)

apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?" apfel --mcp ./calc.py --mcp ./weather.py "Use both tools"

-o, --output — output format: plain (default) or json

apfel -o json "Translate to German: hello" | jq .content apfel -o json "List 3 facts" | jq -r .content

-q, --quiet — suppress headers and chrome, output only

apfel -q "Give me a UUID" apfel -q -o json "Summarize" | jq .content

--no-color — disable ANSI color codes

apfel --no-color --help NO_COLOR=1 apfel "Hello"

--temperature — sampling temperature (higher = more creative)

apfel --temperature 0.0 "What is 2+2?" apfel --temperature 1.5 "Write a wild poem"

--seed — random seed for reproducible output

apfel --seed 42 "Tell me a joke" apfel --seed 42 "Tell me a joke" # same output

--max-tokens — limit response length

apfel --max-tokens 50 "Explain quantum computing" apfel --max-tokens 10 "Say something short"

--permissive — relax content guardrails

apfel --permissive "Write a villain monologue"

--retry — retry transient errors with exponential backoff

apfel --retry "What is 2+2?"

--debug — verbose debug logging to stderr

apfel --debug "Hello world"

--stream — stream response tokens as they arrive

apfel --stream "Write a haiku about code" apfel --stream -s "Be verbose" "Explain TCP"

--chat — interactive multi-turn conversation

apfel --chat apfel --chat -s "You are a helpful coding assistant"

--context-strategy — how to handle context overflow in chat

apfel --chat --context-strategy newest-first # default: keep recent turns apfel --chat --context-strategy oldest-first # keep earliest turns apfel --chat --context-strategy sliding-window --context-max-turns 6 apfel --chat --context-strategy summarize # compress old turns apfel --chat --context-strategy strict # error on overflow

--context-max-turns — max turns to keep (sliding-window only)

apfel --chat --context-strategy sliding-window --context-max-turns 4

--context-output-reserve — tokens reserved for model response

apfel --chat --context-output-reserve 256

--serve — start OpenAI-compatible HTTP server

apfel --serve apfel --serve --port 3000 --host 0.0.0.0

--port, --host — server bind address

apfel --serve --port 8080 apfel --serve --host 0.0.0.0 --port 443

--cors — enable CORS for browser clients

apfel --serve --cors

--allowed-origins — restrict to specific origins

apfel --serve --allowed-origins "https://myapp.com,https://staging.myapp.com"

--no-origin-check — allow requests from any origin

apfel --serve --no-origin-check

--token — require Bearer token authentication

apfel --serve --token "my-secret-token" curl localhost:11434/v1/models -H "Authorization: Bearer my-secret-token"

--token-auto — generate a random token and print it

apfel --serve --token-auto

--public-health — keep /health open even with token auth

apfel --serve --token "secret" --host 0.0.0.0 --public-health

--footgun — disable all protections (CORS + no origin check)

apfel --serve --footgun # only for local development!

--max-concurrent — limit parallel model requests

apfel --serve --max-concurrent 2

--debug — verbose logging with /v1/logs endpoints

apfel --serve --debug

--benchmark — run internal performance benchmarks

apfel --benchmark apfel --benchmark -o json | jq '.benchmarks[] | {name, speedup_ratio}'

--model-info — print model capabilities

apfel --model-info

--update — check for updates and upgrade via Homebrew

apfel --update

--release — detailed version and build info

apfel --release

--version

apfel --version

--help

apfel --help

See Server Security for detailed documentation on security options.

Exit Codes

CodeMeaning
0Success
1Runtime error
2Usage error (bad flags)
3Guardrail blocked
4Context overflow
5Model unavailable
6Rate limited

Environment Variables

VariableDescription
APFEL_SYSTEM_PROMPTDefault system prompt
APFEL_HOSTServer bind address
APFEL_PORTServer port
APFEL_TOKENBearer token for server authentication
APFEL_TEMPERATUREDefault temperature
APFEL_MAX_TOKENSDefault max tokens
APFEL_CONTEXT_STRATEGYDefault context strategy
APFEL_CONTEXT_MAX_TURNSMax turns for sliding-window
APFEL_CONTEXT_OUTPUT_RESERVETokens reserved for output
NO_COLORDisable colors (no-color.org)

Architecture

CLI (single/stream/chat) ──┐
                           ├─→ FoundationModels.SystemLanguageModel
HTTP Server (/v1/*) ───────┘   (100% on-device, zero network)
                                ContextManager → Transcript API
                                SchemaConverter → native ToolDefinitions
                                TokenCounter → real token counts (SDK 26.4)

Built with Swift 6.3 strict concurrency. Single Package.swift, three targets:

  • ApfelCore - pure logic library (no FoundationModels dependency, unit-testable)
  • apfel - executable (CLI + server)
  • apfel-tests - unit tests (pure Swift runner, no XCTest)

No Xcode required. Builds and tests with Command Line Tools only.

Build & Test

# Build + install (auto-bumps patch version each time)
make install                             # build release + install to /usr/local/bin
make build                               # build release only (no install)

Version management (zero manual editing)

make version # print current version make release-minor # bump minor: 0.6.x -> 0.7.0 make release-major # bump major: 0.x.y -> 1.0.0

Debug build (no version bump, uses swift directly)

swift build # quick debug build

Tests

swift run apfel-tests # pure Swift unit tests (no XCTest needed) python3 -m pytest Tests/integration/ -v # integration tests (auto-starts servers) apfel --benchmark -o json # performance report from the installed binary make benchmark # same benchmark via Makefile

Every make build/make install automatically:

  • Bumps the patch version (.version file is the single source of truth)
  • Updates the README version badge
  • Generates build metadata (commit, date, Swift version) viewable via apfel --release

Related Projects

  • apfel-clip - AI-powered clipboard actions from the menu bar (fix grammar, translate, explain code, and more)
  • apfel-gui - Native macOS SwiftUI debug GUI for apfel (debug inspector, MCP protocol viewer, request logs, chat, TTS/STT)

Examples

See docs/EXAMPLES.md for 50 real prompts and unedited model outputs.

License

MIT

SEE ALSO

clihub4/7/2026APFEL(1)