This page lists what you can do with Local (on-device) through the harness, grouped by feature family. Each family shows the capability rows you can rely on (each linked to vendor docs) and the parameters you can set. Configure parameters through the workflow surfaces described in the Workflow Schema: model, turn, and budget fields on agent; provider knobs under harness_config.sdk_settings.local; and tool and sandbox policy at the top level.

Core generation

Capabilities (4/4 rows usable):
  • local.llamacpp.openai_compat (llm.complete) — vendor docs
  • local.llamacpp.server (llm.complete) — vendor docs
  • local.ollama.generate (llm.complete) — vendor docs
  • local.ollama.openai_compat (llm.complete) — vendor docs
Parameters (7):
Parameter | Type | Default | Allowed | Risk | Notes
base_url | string | "http://127.0.0.1:8080/v1" | | medium | docs
model | string | "gemma-4-e2b-it" | | low | docs
base_url | string | "http://127.0.0.1:8080" | | low | docs
base_url | string | "http://127.0.0.1:11434" | | medium | docs
model | string | | | low | Discovered via local.ollama.tags.
options.num_ctx | number | 2048 | | low | Context window length; the maximum is model-dependent.
base_url | string | "http://127.0.0.1:11434/v1" | | medium | docs
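
The four capability rows above each talk to a different completion endpoint. The sketch below shows one way to resolve the URL for a row. The pairing of each row with a default base_url is inferred from the order of the table, and the paths are the ones the llama.cpp server and Ollama document for plain completions; the helper name completion_url is hypothetical, not part of the harness.

```python
# Assumed pairing of capability row -> (default base_url, completion path).
# The base URLs come from the parameter table; the pairing itself is an
# inference from row order, not something the harness guarantees.
ENDPOINTS = {
    "local.llamacpp.openai_compat": ("http://127.0.0.1:8080/v1", "/completions"),
    "local.llamacpp.server": ("http://127.0.0.1:8080", "/completion"),
    "local.ollama.generate": ("http://127.0.0.1:11434", "/api/generate"),
    "local.ollama.openai_compat": ("http://127.0.0.1:11434/v1", "/completions"),
}


def completion_url(capability, base_url=None):
    """Return the completion endpoint for a capability row.

    An explicit base_url overrides the table default; trailing slashes
    are trimmed so the joined URL has exactly one separator.
    """
    default_base, path = ENDPOINTS[capability]
    base = (base_url or default_base).rstrip("/")
    return base + path
```

Keeping the /v1 suffix inside base_url, as the table does, means the OpenAI-compatible rows and the native rows differ only in their path.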

llama-cpp-python

Capabilities (13/13 rows usable):
  • local.llamacpp.python.create_chat_completion (llm.chat) — vendor docs
  • local.llamacpp.python.create_completion (llm.complete) — vendor docs
  • local.llamacpp.python.create_embedding (llm.embed) — vendor docs
  • local.llamacpp.python.eval_sample_generate (llm.streaming) — vendor docs
  • local.llamacpp.python.from_pretrained (provider.models) — vendor docs
  • local.llamacpp.python.llama_cache_state (provider.slots) — vendor docs
  • local.llamacpp.python.llama_class (provider.models) — vendor docs
  • local.llamacpp.python.llama_grammar (llm.structured_output) — vendor docs
  • local.llamacpp.python.logits_processor (llm.sampling) — vendor docs
  • local.llamacpp.python.save_load_state (provider.slots) — vendor docs
  • local.llamacpp.python.server_module (provider.lifecycle) — vendor docs
  • local.llamacpp.python.stopping_criteria (llm.sampling) — vendor docs
  • local.llamacpp.python.tokenize_detokenize (llm.tokenize) — vendor docs
Parameters (13):
Parameter | Type | Default | Allowed | Risk | Notes
create_chat_completion.messages | array | | | low | docs
create_completion.prompt | string | | | low | docs
create_embedding.input | string | | | low | Requires an embedding GGUF (embedding=True).
eval.tokens | array | | | low | docs
from_pretrained.repo_id | string | | | medium | Live-probed with ggml-org/models tinyllamas/stories260K.gguf.
set_cache | object | | | low | docs
Llama.model_path | string | | | low | docs
create_completion.grammar | object | | | low | GBNF; live-probed constraining output to yes|no.
create_completion.logits_processor | array | | | medium | docs
save_state | object | | | low | docs
args.port | number | | | medium | python -m llama_cpp.server; OpenAI-compatible completions live-probed.
create_completion.stopping_criteria | array | | | low | docs
tokenize.text | string | | | low | docs
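
The create_completion.grammar row notes a GBNF grammar that constrains output to yes|no. A minimal sketch of that setup with llama-cpp-python follows. The grammar text, the prompt wording, and the helper name ask_yes_no are illustrative assumptions; the model path is whatever local GGUF file you supply.

```python
# GBNF grammar that only admits the two literal answers (illustrative).
YES_NO_GBNF = 'root ::= "yes" | "no"'


def ask_yes_no(model_path, question, max_tokens=4):
    """Run one grammar-constrained completion and return the raw text.

    llama_cpp is imported lazily so this module still loads on machines
    where llama-cpp-python is not installed.
    """
    from llama_cpp import Llama, LlamaGrammar

    llm = Llama(model_path=model_path, verbose=False)
    grammar = LlamaGrammar.from_string(YES_NO_GBNF)
    out = llm.create_completion(
        prompt=f"{question}\nAnswer:",
        grammar=grammar,
        max_tokens=max_tokens,
    )
    return out["choices"][0]["text"]
```

Because the grammar is applied during sampling, the model cannot emit anything outside the two alternatives, so no post-hoc parsing of the answer is needed.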

llama.cpp CLI

Capabilities (4/4 rows usable):
  • local.llamacpp.bench (eval.benchmark) — vendor docs
  • local.llamacpp.cli (provider.cli) — vendor docs
  • local.llamacpp.model_acquisition_cache (provider.models) — vendor docs
  • local.llamacpp.perplexity (eval.perplexity) — vendor docs
Parameters (4):
Parameter | Type | Default | Allowed | Risk | Notes
args.output_format | string | "json" | | low | docs
args.prompt | string | | | low | docs
args.hf_repo | string | | | medium | -hf fetch + --cache-list; live-probed with ggml-org/SmolVLM-256M-Instruct-GGUF.
args.file | string | | | low | docs

llama.cpp CLI Tools

Capabilities (15/20 rows usable):
  • local.llamacpp.cli.batched (eval.benchmark) — vendor docs
  • local.llamacpp.cli.batched_bench (eval.benchmark) — vendor docs
  • local.llamacpp.cli.embedding (llm.embed) — vendor docs
  • local.llamacpp.cli.gguf_inspect (provider.diagnostics) — vendor docs
  • local.llamacpp.cli.gguf_split (file.split_merge) — vendor docs
  • local.llamacpp.cli.imatrix (provider.quantization) — vendor docs
  • local.llamacpp.cli.lookahead (llm.speculative_decoding) — vendor docs
  • local.llamacpp.cli.mtmd_cli (media.multimodal) — vendor docs
  • local.llamacpp.cli.parallel (eval.benchmark) — vendor docs
  • local.llamacpp.cli.passkey (eval.benchmark) — vendor docs
  • local.llamacpp.cli.quantize (provider.quantization) — vendor docs
  • local.llamacpp.cli.retrieval (llm.rag) — vendor docs
  • local.llamacpp.cli.simple (llm.complete) — vendor docs
  • local.llamacpp.cli.simple_chat (llm.chat) — vendor docs
  • local.llamacpp.cli.tokenize (llm.tokenize) — vendor docs
Parameters (15):
Parameter | Type | Default | Allowed | Risk | Notes
args.npl | string | | | low | docs
args.n_parallel | number | | | low | docs
args.embd_output_format | string | "json" | | low | docs
args.mode | string | "r" | | low | docs
args.split_max_size | string | "50M" | | low | Split + merge round trip live-probed.
args.train_file | string | | | low | docs
args.n_predict | number | | | low | docs
args.image | string | | | low | Vision CLI with -hf SmolVLM; live image description.
args.n_sequences | number | | | low | docs
args.junk | number | | | low | docs
args.ftype | string | | | medium | docs
args.top_k | number | | | low | docs
args.ctx_size | number | | | low | docs
args.n_predict | number | | | low | docs
args.prompt | string | | | low | docs

llama.cpp Evaluation

Capabilities (1/1 rows usable):
  • local.llamacpp.eval.choice_logit_tasks (eval.benchmark) · model-dependent · vendor docs

llama.cpp Quant Types

Capabilities (7/7 rows usable):
  • local.llamacpp.quant.bf16_f16_f32 (provider.quantization) — vendor docs
  • local.llamacpp.quant.iq_extreme (provider.quantization) — vendor docs
  • local.llamacpp.quant.iq2 (provider.quantization) — vendor docs
  • local.llamacpp.quant.iq3 (provider.quantization) — vendor docs
  • local.llamacpp.quant.iq4 (provider.quantization) — vendor docs
  • local.llamacpp.quant.k_quants (provider.quantization) — vendor docs
  • local.llamacpp.quant.legacy_q (provider.quantization) — vendor docs
Parameters (7):
Parameter | Type | Default | Allowed | Risk | Notes
args.ftype | string | "f32" | | low | docs
args.ftype | string | "iq1_s" | | medium | Requires --imatrix; live-probed with a probe-built imatrix.
args.ftype | string | "iq2_xs" | | medium | Requires --imatrix; live-probed with a probe-built imatrix.
args.ftype | string | "iq3_xxs" | | medium | Requires --imatrix; live-probed with a probe-built imatrix.
args.ftype | string | "iq4_xs" | | low | docs
args.ftype | string | "q4_K_M" | | low | docs
args.ftype | string | "q4_0" | | low | docs
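
The table marks three ftype values as requiring --imatrix. The sketch below builds a llama-quantize command line and refuses those types when no importance matrix is given. The set of guarded types mirrors only the three rows flagged above, and the helper name and file names are hypothetical.

```python
# Types the table above flags as "Requires --imatrix".
NEEDS_IMATRIX = {"iq1_s", "iq2_xs", "iq3_xxs"}


def quantize_argv(src, dst, ftype, imatrix=None, binary="llama-quantize"):
    """Build an argv list for llama-quantize without running it.

    Raises ValueError when a type flagged in the table is requested
    without an importance matrix file.
    """
    argv = [binary]
    if imatrix is not None:
        argv += ["--imatrix", imatrix]
    elif ftype.lower() in NEEDS_IMATRIX:
        raise ValueError(f"{ftype} requires an importance matrix (--imatrix)")
    # Input, output, then the target type, upper-cased as in the tool's help text.
    argv += [src, dst, ftype.upper()]
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True); building it separately keeps the guard testable without a model on disk.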

llama.cpp Runtime

Capabilities (4/7 rows usable):
  • local.llamacpp.runtime.cpu_memory (provider.runtime) — vendor docs
  • local.llamacpp.runtime.kv_cache_context (provider.context) — vendor docs
  • local.llamacpp.runtime.threads_batch (provider.runtime) — vendor docs
  • local.llamacpp.sampling_controls (llm.sampling) — vendor docs
Parameters (4):
Parameter | Type | Default | Allowed | Risk | Notes
args.threads | number | | | low | -t/--no-mmap/--mlock live boot + completion.
args.ctx_size | number | 2048 | | low | -c with -ctk/-ctv q8_0; props echoes n_ctx.
args.threads_batch | number | | | low | system_info echoes n_threads_batch.
sampling | object | | | low | top_k/top_p/min_p/temperature/penalties echoed in generation_settings.

llama.cpp Server Anthropic Format

Capabilities (2/2 rows usable):
  • local.llamacpp.server.anthropic_count_tokens (anthropic.messages.count_tokens) — vendor docs
  • local.llamacpp.server.anthropic_messages (anthropic.messages) — vendor docs
Parameters (2):
Parameter | Type | Default | Allowed | Risk | Notes
messages | array | | | low | docs
max_tokens | number | 1024 | | low | docs

llama.cpp Server Native

Capabilities (21/26 rows usable):
  • local.llamacpp.server.apply_template (llm.chat_template) — vendor docs
  • local.llamacpp.server.auth_tls (provider.auth) — vendor docs
  • local.llamacpp.server.completion_native (llm.completion) — vendor docs
  • local.llamacpp.server.detokenize (llm.tokenize) — vendor docs
  • local.llamacpp.server.embeddings_native (llm.embedding) — vendor docs
  • local.llamacpp.server.gpu_backend (provider.gpu_offload) — vendor docs
  • local.llamacpp.server.grammar (llm.structured_output) — vendor docs
  • local.llamacpp.server.health (provider.health) — vendor docs
  • local.llamacpp.server.infill (llm.fim) · model-dependent · vendor docs
  • local.llamacpp.server.lora_adapters (tuning.lora_runtime) · model-dependent · vendor docs
  • local.llamacpp.server.metrics (provider.metrics) — vendor docs
  • local.llamacpp.server.parallel_batching (provider.parallel_decoding) — vendor docs
  • local.llamacpp.server.props (provider.runtime_config) — vendor docs
  • local.llamacpp.server.props_post (provider.runtime_config) — vendor docs
  • local.llamacpp.server.reasoning (llm.reasoning) · model-dependent · vendor docs
  • local.llamacpp.server.reranking (llm.rerank) — vendor docs
  • local.llamacpp.server.slots (provider.slots) — vendor docs
  • local.llamacpp.server.slots_save_restore (provider.slots) — vendor docs
  • local.llamacpp.server.speculative (llm.speculative_decoding) — vendor docs
  • local.llamacpp.server.tokenize (llm.tokenize) — vendor docs
  • local.llamacpp.server.webui (docs.webui) — vendor docs
Parameters (20):
Parameter | Type | Default | Allowed | Risk | Notes
messages | array | | | low | docs
args.api_key | string | | | high | 401 keyless / 200 bearer; HTTPS via --ssl-key-file/--ssl-cert-file live-probed.
base_url | string | "http://127.0.0.1:8080" | | medium | docs
n_predict | number | 128 | | low | docs
tokens | array | | | low | docs
content | string | | | low | Requires llama-server --embeddings.
args.n_gpu_layers | number | | | low | Metal backend on Apple Silicon; -ngl 99 live boot + completion.
json_schema | object | | | low | GBNF --grammar twin; schema-constrained decoding live-probed.
base_url | string | "http://127.0.0.1:8080" | | low | docs
base_url | string | "http://127.0.0.1:8080" | | low | Requires llama-server --metrics.
messages.image_url | object | | | low | Vision via mmproj; needs explicit --mmproj wiring on this build.
args.n_parallel | number | 1 | | low | props total_slots reflects -np; concurrent completions live-probed.
body | object | | | medium | Mutates global server properties; requires llama-server --props.
base_url | string | "http://127.0.0.1:8080" | | low | docs
query | string | | | low | Requires llama-server --reranking with an embedding/rerank model.
filename | string | | | medium | Requires llama-server --slot-save-path; writes slot KV cache to disk.
base_url | string | "http://127.0.0.1:8080" | | low | Requires llama-server --slots.
args.model_draft | string | | | medium | timings.draft_n present in completions.
content | string | | | low | docs
base_url | string | "http://127.0.0.1:8080" | | low | docs
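
Several rows above combine naturally in one request: base_url, n_predict, json_schema, and the bearer token implied by args.api_key. The following sketch builds (but does not send) a native /completion request with those fields. The helper name completion_request is hypothetical, and the schema you pass is your own.

```python
import json
import urllib.request


def completion_request(prompt, schema, base_url="http://127.0.0.1:8080",
                       api_key=None, n_predict=128):
    """Build a POST to llama-server's native /completion endpoint.

    json_schema asks the server for schema-constrained decoding; the
    Authorization header is only added when an API key is supplied.
    """
    body = {"prompt": prompt, "n_predict": n_predict, "json_schema": schema}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/completion",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

To send it, pass the request to urllib.request.urlopen and read the content field of the JSON response. Per the table, a server started with an API key answers 401 when the header is missing.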

llama.cpp Server OpenAI Format

Capabilities (2/2 rows usable):
  • local.llamacpp.server.responses (llm.responses) — vendor docs
  • local.llamacpp.server.tools (tool.function_calling) · model-dependent · vendor docs
Parameters (1):
Parameter | Type | Default | Allowed | Risk | Notes
input | string | | | low | docs

LocalAI Anthropic Format

Capabilities (1/1 rows usable):
  • local.localai.anthropic_messages (anthropic.messages) — vendor docs

LocalAI Backends

Capabilities (1/23 rows usable):
  • local.localai.backend.llamacpp (provider.backend) — vendor docs

LocalAI Galleries

Capabilities (2/6 rows usable):
  • local.localai.gallery_available (provider.gallery) — vendor docs
  • local.localai.gallery_jobs (provider.gallery) — vendor docs

LocalAI Ollama Format

Capabilities (1/1 rows usable):
  • local.localai.ollama_compat (llm.complete) — vendor docs

LocalAI OpenAI Format

Capabilities (3/16 rows usable):
  • local.localai.openai_chat (llm.chat) — vendor docs
  • local.localai.openai_completions (llm.completion) — vendor docs
  • local.localai.openai_models (provider.models) — vendor docs

MLX Apple Platform

Capabilities (2/2 rows usable):
  • local.mlx.platform.lazy_eval (provider.runtime) — vendor docs
  • local.mlx.platform.macos_unified_memory (provider.gpu_offload) — vendor docs

MLX Distributed

Capabilities (2/2 rows usable):
  • local.mlx.distributed.launch (provider.distributed_inference) — vendor docs
  • local.mlx.distributed.primitives (provider.distributed_inference) — vendor docs

MLX Examples

Capabilities (2/12 rows usable):
  • local.mlx.ex.bert (llm.embed) · model-dependent · vendor docs
  • local.mlx.ex.t5 (llm.complete) · model-dependent · vendor docs

MLX Fast Kernels

Capabilities (5/5 rows usable):
  • local.mlx.fast.metal_kernel (provider.runtime) — vendor docs
  • local.mlx.fast.quantized_matmul (provider.quantization) — vendor docs
  • local.mlx.fast.rms_norm (provider.runtime) — vendor docs
  • local.mlx.fast.rope (provider.runtime) — vendor docs
  • local.mlx.fast.scaled_dot_product_attention (provider.attention_backend) — vendor docs

MLX Optimizers

Capabilities (1/1 rows usable):
  • local.mlx.optimizers (tuning.training_reference) — vendor docs

MLX Profiling

Capabilities (3/3 rows usable):
  • local.mlx.metal.cache_limit (provider.gpu_offload) — vendor docs
  • local.mlx.metal.capture (provider.observability) — vendor docs
  • local.mlx.utils.tree (provider.runtime) — vendor docs

MLX-LM CLI

Capabilities (10/17 rows usable):
  • local.mlx.lm.cli_benchmark (eval.benchmark) — vendor docs
  • local.mlx.lm.cli_cache_prompt (provider.slots) — vendor docs
  • local.mlx.lm.cli_chat (llm.chat) — vendor docs
  • local.mlx.lm.cli_convert (provider.models) — vendor docs
  • local.mlx.lm.cli_fuse (tuning.lora_runtime) — vendor docs
  • local.mlx.lm.cli_generate (llm.complete) — vendor docs
  • local.mlx.lm.cli_lora (tuning.lora_runtime) — vendor docs
  • local.mlx.lm.cli_manage (provider.models) — vendor docs
  • local.mlx.lm.cli_perplexity (eval.perplexity) · model-dependent · vendor docs
  • local.mlx.lm.cli_server (llm.chat) — vendor docs

MLX-LM Python SDK

Capabilities (7/9 rows usable):
  • local.mlx.lm.kv_cache_quantized (provider.kv_cache) — vendor docs
  • local.mlx.lm.kv_cache_rotating (provider.kv_cache) — vendor docs
  • local.mlx.lm.py_generate (llm.complete) — vendor docs
  • local.mlx.lm.py_load (provider.models) — vendor docs
  • local.mlx.lm.py_prompt_cache (provider.slots) — vendor docs
  • local.mlx.lm.py_sample_utils (llm.sampling) — vendor docs
  • local.mlx.lm.py_stream_generate (llm.streaming) — vendor docs

MLX-LM Server

Capabilities (1/1 rows usable):
  • local.mlx.lm.speculative (llm.speculative_decoding) — vendor docs

Ollama Anthropic Format

Capabilities (1/3 rows usable):
  • local.ollama.anthropic_messages (anthropic.messages) — vendor docs
Parameters (1):
Parameter | Type | Default | Allowed | Risk | Notes
max_tokens | number | 1024 | | low | docs

Ollama Blobs

Capabilities (2/2 rows usable):
  • local.ollama.api_blobs_head (file.exists) — vendor docs
  • local.ollama.api_blobs_post (file.upload) — vendor docs
Parameters (2):
Parameter | Type | Default | Allowed | Risk | Notes
digest | string | | | low | docs
digest | string | | | medium | Uploads content-addressed blobs (GGUF/adapter) to the local daemon.

Ollama CLI

Capabilities (6/7 rows usable):
  • local.ollama.cli_cp (provider.models) — vendor docs
  • local.ollama.cli_pull_rm_ls (provider.models) — vendor docs
  • local.ollama.cli_run (llm.chat) — vendor docs
  • local.ollama.cli_serve (provider.lifecycle) — vendor docs
  • local.ollama.cli_show (provider.admin.read) — vendor docs
  • local.ollama.cli_stop (provider.lifecycle) — vendor docs
Parameters (6):
Parameter | Type | Default | Allowed | Risk | Notes
args.source_dest | string | | | low | docs
args.model | string | | | medium | pull/rm/ls/create/cp management set.
args.prompt | string | | | low | docs
env.OLLAMA_HOST | string | "127.0.0.1:11434" | | medium | docs
args.inspect_flag | string | | | low | --modelfile/--parameters/--template/--system/--license.
args.model | string | | | low | docs

Ollama Environment

Capabilities (13/15 rows usable):
  • local.ollama.env.context_length (provider.context) — vendor docs
  • local.ollama.env.debug (provider.observability) — vendor docs
  • local.ollama.env.flash_attention (provider.attention_backend) — vendor docs
  • local.ollama.env.gpu_overhead (provider.gpu_offload) — vendor docs
  • local.ollama.env.host (provider.connectivity) — vendor docs
  • local.ollama.env.keep_alive (provider.lifecycle) — vendor docs
  • local.ollama.env.kv_cache_type (provider.kv_cache) — vendor docs
  • local.ollama.env.max_loaded (provider.lifecycle) — vendor docs
  • local.ollama.env.max_queue (provider.parallel_decoding) — vendor docs
  • local.ollama.env.models_dir (file.manage) — vendor docs
  • local.ollama.env.num_parallel (provider.parallel_decoding) — vendor docs
  • local.ollama.env.origins (provider.connectivity) — vendor docs
  • local.ollama.env.sched_spread (provider.gpu_offload) — vendor docs
Parameters (13):
Parameter | Type | Default | Allowed | Risk | Notes
env.OLLAMA_CONTEXT_LENGTH | number | | | low | Live-probed: /api/ps reports the env-set context_length.
env.OLLAMA_DEBUG | boolean | | | low | Live-probed: DEBUG-level log lines under OLLAMA_DEBUG=1.
env.OLLAMA_FLASH_ATTENTION | boolean | | | low | Config echo + live generation under the flag.
env.OLLAMA_GPU_OVERHEAD | number | | | medium | Applied into scheduler config (startup config echo).
env.OLLAMA_HOST | string | "127.0.0.1:11434" | | medium | docs
env.OLLAMA_KEEP_ALIVE | string | "5m" | | low | Live-probed: model evicted from /api/ps after expiry.
env.OLLAMA_KV_CACHE_TYPE | string | | | medium | Config echo + live generation under q8_0.
env.OLLAMA_MAX_LOADED_MODELS | number | | | low | Applied into scheduler config (startup config echo).
env.OLLAMA_MAX_QUEUE | number | | | low | Applied into scheduler config (startup config echo).
env.OLLAMA_MODELS | string | | | medium | docs
env.OLLAMA_NUM_PARALLEL | number | | | low | Applied into scheduler config (startup config echo).
env.OLLAMA_ORIGINS | string | | | medium | CORS allowlist; live-probed 403 deny / 200 allow.
env.OLLAMA_SCHED_SPREAD | boolean | | | low | Applied into scheduler config (startup config echo).
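
These variables are read by the Ollama daemon at startup, so they have to be set in the environment of the process that runs ollama serve. A minimal sketch of assembling that environment is below. The helper name ollama_serve_env and its keyword arguments are hypothetical; only the OLLAMA_* variable names and the two defaults come from the table.

```python
import os


def ollama_serve_env(host="127.0.0.1:11434", keep_alive="5m", debug=False,
                     flash_attention=False, extra=None, base=None):
    """Return an environment mapping for launching `ollama serve`.

    base defaults to a copy of the current process environment; extra
    lets callers add any other OLLAMA_* variable from the table.
    """
    env = dict(os.environ if base is None else base)
    env["OLLAMA_HOST"] = host
    env["OLLAMA_KEEP_ALIVE"] = keep_alive
    if debug:
        env["OLLAMA_DEBUG"] = "1"
    if flash_attention:
        env["OLLAMA_FLASH_ATTENTION"] = "1"
    for name, value in (extra or {}).items():
        env[name] = str(value)  # environment values must be strings
    return env
```

A launcher could then call subprocess.Popen(["ollama", "serve"], env=ollama_serve_env(debug=True)). Changing a variable on an already running daemon has no effect until it is restarted.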

Ollama Generate

Capabilities (6/6 rows usable):
  • local.ollama.chat (llm.chat) — vendor docs
  • local.ollama.generate_context (llm.state.continue) — vendor docs
  • local.ollama.generate_keep_alive (provider.lifecycle) — vendor docs
  • local.ollama.generate_options_full (llm.sampling) — vendor docs
  • local.ollama.generate_raw (llm.complete) — vendor docs
  • local.ollama.generate_suffix (llm.fim) · model-dependent · vendor docs
Parameters (5):
Parameter | Type | Default | Allowed | Risk | Notes
messages | array | | | low | docs
context | array | | | low | docs
keep_alive | string | "5m" | | low | docs
options | object | | | low | Full sampler dict: temperature/top_p/top_k/repeat_penalty/seed/num_ctx/num_predict (live-probed).
raw | boolean | false | | medium | Bypasses the model prompt template.
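
The parameters above map onto the JSON body of an Ollama /api/generate call. The sketch below assembles such a body and, for illustration only, restricts options to the sampler keys named in the table; Ollama itself accepts more. The helper name generate_payload is hypothetical, and the model tag is whatever you have pulled locally.

```python
# Sampler keys named in the options row of the table above.
PROBED_OPTIONS = {
    "temperature", "top_p", "top_k", "repeat_penalty",
    "seed", "num_ctx", "num_predict",
}


def generate_payload(model, prompt, *, raw=False, keep_alive="5m",
                     context=None, **options):
    """Build the JSON body for POST /api/generate (non-streaming).

    Unknown option names raise ValueError so typos surface early; this
    restriction is a choice of the sketch, not a rule of the API.
    """
    unknown = set(options) - PROBED_OPTIONS
    if unknown:
        raise ValueError(f"unsupported options in this sketch: {sorted(unknown)}")
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "raw": raw,  # True bypasses the model prompt template
        "keep_alive": keep_alive,
        "options": options,
    }
    if context is not None:
        payload["context"] = context  # token context from a previous reply
    return payload
```

Serialize the result with json.dumps and POST it to the Ollama base URL plus /api/generate. Setting raw=True is marked medium risk in the table because the prompt then reaches the model without its template.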

Ollama Model Management

Capabilities (11/11 rows usable):
  • local.ollama.api_create_adapters (tuning.lora_runtime) · model-dependent · vendor docs
  • local.ollama.api_create_quantize (provider.quantization) — vendor docs
  • local.ollama.api_create_safetensors (provider.models) · model-dependent · vendor docs
  • local.ollama.api_show (provider.admin.read) — vendor docs
  • local.ollama.copy (provider.models) — vendor docs
  • local.ollama.create (provider.models) — vendor docs
  • local.ollama.delete (provider.models) — vendor docs
  • local.ollama.ps (provider.lifecycle) — vendor docs
  • local.ollama.pull (provider.models) — vendor docs
  • local.ollama.push (provider.models) · model-dependent · vendor docs
  • local.ollama.tags (provider.models) — vendor docs
Parameters (8):
Parameter | Type | Default | Allowed | Risk | Notes
quantize | string | "q4_K_M" | | medium | Requires an F16/F32 source tag; live-probed from smollm:135m-instruct-v0.2-fp16.
model | string | | | low | docs
destination | string | | | medium | docs
from | string | | | medium | Medium risk: writes a new model manifest to the local store.
model | string | | | high | High risk: destructive; removes a model from the local store.
base_url | string | "http://127.0.0.1:11434" | | low | docs
model | string | | | medium | Medium risk: downloads model layers to local disk.
base_url | string | "http://127.0.0.1:11434" | | low | docs

Ollama Modelfile

Capabilities (2/2 rows usable):
  • local.ollama.context_length (provider.context) — vendor docs
  • local.ollama.modelfile (provider.modelfile) — vendor docs
Parameters (2):
Parameter | Type | Default | Allowed | Risk | Notes
options.num_ctx | number | 4096 | | low | docs
modelfile | object | | | medium | from/system/parameters/template create fields; live-probed via /api/create + /api/show.

Ollama OpenAI Format

Capabilities (6/6 rows usable):
  • local.ollama.openai_chat (llm.chat) — vendor docs
  • local.ollama.openai_completions (llm.completions) — vendor docs
  • local.ollama.openai_embeddings (llm.embeddings) — vendor docs
  • local.ollama.openai_images (media.image_generation) · model-dependent · vendor docs
  • local.ollama.openai_models (provider.models) — vendor docs
  • local.ollama.openai_responses (llm.responses) — vendor docs
Parameters (5):
Parameter | Type | Default | Allowed | Risk | Notes
messages | array | | | low | docs
prompt | string | | | low | docs
input | string | | | low | Requires an embedding model (e.g. nomic-embed-text).
base_url | string | "http://127.0.0.1:11434/v1" | | low | docs
input | string | | | low | docs

Ollama Server

Capabilities (1/1 rows usable):
  • local.ollama.api_version (provider.health) — vendor docs
Parameters (1):
Parameter | Type | Default | Allowed | Risk | Notes
base_url | string | "http://127.0.0.1:11434" | | low | docs

Ollama Streaming

Capabilities (1/1 rows usable):
  • local.ollama.streaming (llm.streaming) — vendor docs
Parameters (1):
Parameter | Type | Default | Allowed | Risk | Notes
stream | boolean | true | | low | docs
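
With stream left at its default of true, Ollama replies with newline-delimited JSON objects, each carrying a text fragment and a done flag. The sketch below shows one way to consume such a stream. The helper name iter_text is hypothetical; the response and message.content fields are the ones used by /api/generate and /api/chat respectively.

```python
import json


def iter_text(lines):
    """Yield text fragments from an Ollama NDJSON stream.

    lines is any iterable of str or bytes lines, for example the object
    returned by urllib.request.urlopen. Iteration stops at the first
    chunk whose "done" field is true.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        chunk = json.loads(line)
        # /api/generate uses "response"; /api/chat nests text in "message".
        text = chunk.get("response") or chunk.get("message", {}).get("content", "")
        if text:
            yield text
        if chunk.get("done"):
            break
```

Joining the fragments with "".join(iter_text(resp)) reproduces the full reply, while iterating lets a caller print tokens as they arrive.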

Ollama Structured Output

Capabilities (1/1 rows usable):
  • local.ollama.structured_outputs (llm.structured_output) — vendor docs
Parameters (1):
Parameter | Type | Default | Allowed | Risk | Notes
format | object | | | low | JSON-schema constrained decoding; live-probed with a required-name object schema.

Ollama Thinking

Capabilities (1/1 rows usable):
  • local.ollama.thinking (llm.thinking) · model-dependent · vendor docs

Ollama Vision

Capabilities (2/2 rows usable):
  • local.ollama.generate_image_input (media.image_input) · model-dependent · vendor docs
  • local.ollama.vision (media.image_input) · model-dependent · vendor docs

ollama-js SDK

Capabilities (3/5 rows usable):
  • local.ollama.js_abort_method (llm.cancel) — vendor docs
  • local.ollama.js_async_iterator (llm.streaming) — vendor docs
  • local.ollama.js_client_class (provider.admin.read) — vendor docs
Parameters (3):
Parameter | Type | Default | Allowed | Risk | Notes
abort | object | | | low | AbortError interrupts in-flight streamed generation (live-probed).
chat.stream | boolean | true | | low | docs
Ollama.host | string | | | low | docs

ollama-python SDK

Capabilities (10/11 rows usable):
  • local.ollama.python_async_client (provider.admin.read) — vendor docs
  • local.ollama.python_chat_method (llm.chat) — vendor docs
  • local.ollama.python_client_class (provider.admin.read) — vendor docs
  • local.ollama.python_copy_delete (provider.models) — vendor docs
  • local.ollama.python_create_modelfile (provider.models) — vendor docs
  • local.ollama.python_embed_method (llm.embed) — vendor docs
  • local.ollama.python_generate_method (llm.complete) — vendor docs
  • local.ollama.python_list_method (provider.models) — vendor docs
  • local.ollama.python_ps_method (provider.lifecycle) — vendor docs
  • local.ollama.python_show_method (provider.admin.read) — vendor docs
Parameters (10):
Parameter | Type | Default | Allowed | Risk | Notes
AsyncClient.host | string | | | low | docs
chat.messages | array | | | low | docs
Client.host | string | | | low | docs
copy.source_dest | string | | | medium | docs
create.from_ | string | | | medium | docs
embed.input | string | | | low | docs
generate.prompt | string | | | low | docs
list | object | | | low | docs
ps | object | | | low | docs
show.model | string | | | low | docs

Retrieval/files/embeddings

Capabilities (4/4 rows usable):
  • local.ollama.embed_dimensions (llm.embed) — vendor docs
  • local.ollama.embed_truncate (llm.embed) — vendor docs
  • local.ollama.embeddings (llm.embed) — vendor docs
  • local.ollama.embeddings_legacy (llm.embed) — vendor docs
Parameters (4):
Parameter | Type | Default | Allowed | Risk | Notes
dimensions | number | | | low | Matryoshka truncation; honored by nomic-embed-text (live-probed at 64).
truncate | boolean | true | | low | docs
prompt | string | | | low | docs
input | array | | | low | Requires an embedding model (e.g. nomic-embed-text); generation runners refuse embedding requests.
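
The input, truncate, and dimensions parameters travel together in the body of an Ollama /api/embed request. The sketch below builds that body and checks a parsed response against the requested shape. Both helper names are hypothetical, and the check assumes the response carries one vector per input under the embeddings key.

```python
def embed_payload(model, texts, dimensions=None, truncate=True):
    """Build the JSON body for POST /api/embed.

    A single string is wrapped in a list so the response always holds
    one vector per input. dimensions is only sent when requested.
    """
    if isinstance(texts, str):
        texts = [texts]
    payload = {"model": model, "input": list(texts), "truncate": truncate}
    if dimensions is not None:
        payload["dimensions"] = dimensions
    return payload


def check_embeddings(response, expected_count, dimensions=None):
    """Validate a parsed /api/embed response and return its vectors."""
    vectors = response["embeddings"]
    if len(vectors) != expected_count:
        raise ValueError(f"expected {expected_count} vectors, got {len(vectors)}")
    if dimensions is not None and any(len(v) != dimensions for v in vectors):
        raise ValueError(f"model did not honor dimensions={dimensions}")
    return vectors
```

The dimension check matters because, as the table notes, truncation is honored by specific models such as nomic-embed-text; a model that ignores the field would otherwise return full-width vectors without any error.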

Tools

Capabilities (1/1 rows usable):
  • local.ollama.tools (tool.call) · model-dependent · vendor docs