Ollama

Overview

Ollama is a tool for running large language models (LLMs) locally. It supports multiple model formats and lets you manage models through the CLI or a built-in interactive interface.

Command | Name | Description
ollama | Interactive Menu | Opens the Ollama interactive interface in the terminal. From it you can launch models, select integrations, and open additional tools without typing commands by hand. Navigate with the ↑ and ↓ keys and confirm with Enter. Example: ollama
ollama serve | Start Server | Runs Ollama as an API service. Once started, the API is available at localhost:11434. Used to integrate with applications and libraries. Example: ollama serve. Debug output and a non-default port are configured through environment variables rather than command-line flags: OLLAMA_DEBUG=1 ollama serve enables debug logging, and OLLAMA_HOST=127.0.0.1:<port> ollama serve binds the server to the given port.
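
Because the server exposes its API over HTTP at localhost:11434, a script can check that it is reachable before sending work to it. The following is a minimal Python sketch using only the standard library. It assumes the server's /api/version endpoint; the helper name server_version is illustrative and not part of Ollama.

```python
import http.client
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address listed in the table above


def server_version(base_url=OLLAMA_URL, timeout=2.0):
    """Return the server's version string, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.load(resp).get("version")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply all count as "not available".
        return None


version = server_version()
if version:
    print("Ollama is running, version", version)
else:
    print("Ollama server not reachable; start it with: ollama serve")
```

Returning None instead of raising keeps the check usable as a simple guard at the top of a longer script.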

Running Models

Command | Name | Description
Loading and Running Models
ollama run <model> | Run Model | Loads the model and starts it in interactive mode. If the model is not available locally, Ollama downloads it automatically. Used for chat, model testing, and general local LLM work. Example: ollama run llama3.2
ollama pull <model> | Download Model | Downloads a model from the Ollama registry to local storage without starting it. Convenient for preloading. To confirm the download, run ollama list. Pulling a model that is already installed updates it to the latest version of that tag. Example: ollama pull gemma3
ollama pull <cloud-model> | Download from Cloud | Makes a model hosted on Ollama Cloud available locally. Cloud models are typically selected by a tag that ends in cloud (for example gpt-oss:120b-cloud) rather than by a separate flag, and they require signing in first with ollama signin. Progress is shown while the pull runs. Example: ollama pull gpt-oss:120b-cloud
ollama list | List Models | Shows all models installed locally, with their tags, sizes, and modification times. Example: ollama list
ollama ps | Active Processes | Shows the models currently loaded in memory, including each model's size and whether it runs on CPU or GPU. Useful for performance diagnostics. Example: ollama ps
ollama show <model> | View Model Information | Prints model details: parameters, template, system prompt, and other metadata. Useful for inspecting a configuration. Example: ollama show llama3.2
Running Integrations
ollama launch | Run Integrations | Opens a menu for configuring and launching external applications that work through Ollama. Used to connect IDEs and AI tools. Example: ollama launch
ollama launch <integration> | Run Specific Integration | Launches the named integration directly, skipping the selection menu. Suited to quickly starting a development environment or AI tool. Example: ollama launch codex
ollama launch <integration> --model <model> | Run Integration with Model | Specifies which model the external application should use. Useful when different models handle different tasks. Example: ollama launch codex --model llama3.2
Managing Models
ollama cp <source> <destination> | Copy Model | Creates a copy of an existing model under a new name. Convenient before changing a configuration or experimenting. Example: ollama cp llama3.2 llama3.2-custom
ollama rm <model> | Remove Model | Removes a model from local storage and frees the disk space it used. Example: ollama rm llama3.2
ollama create <name> -f <Modelfile> | Create Model | Creates a new model from a Modelfile. Lets you change system instructions, parameters, and templates, and attach adapters. Example: ollama create mario-ai -f ./Modelfile
ollama pull <model> | Update Model | Running pull again on an installed model updates it to the latest version in the registry; no separate flag is needed. Ollama checks for a newer version and downloads only the parts that changed. Useful for picking up fixes and improvements. Example: ollama pull llama3.2
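
The ollama create row above takes a Modelfile as input. As a rough illustration of what such a file contains, the sketch below assembles one from Python using the FROM, PARAMETER, and SYSTEM instructions. The helper build_modelfile, the system prompt, and the parameter values are hypothetical and chosen only for the example.

```python
def build_modelfile(base_model, system_prompt=None, parameters=None):
    """Return Modelfile text that derives a custom model from base_model."""
    lines = [f"FROM {base_model}"]
    for name, value in (parameters or {}).items():
        lines.append(f"PARAMETER {name} {value}")
    if system_prompt:
        # Triple quotes let the system prompt span several lines.
        lines.append(f'SYSTEM """{system_prompt}"""')
    return "\n".join(lines) + "\n"


modelfile = build_modelfile(
    "llama3.2",
    system_prompt="You are Mario from Super Mario Bros. Answer as Mario.",
    parameters={"temperature": 1, "num_ctx": 4096},
)
print(modelfile)
# Save the text as ./Modelfile, then run: ollama create mario-ai -f ./Modelfile
```

Generating the file from code is mainly useful when several variants of one base model differ only in a few parameters.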

Integrations

Ollama supports various IDEs and tools for working with models.

Command | Name | Description
ollama launch | Integrations Menu | Opens an interactive menu for selecting and launching external applications. Lets you connect IDEs (VS Code, Vim), editors, and AI tools without complex configuration. Typically the first command to run after installing Ollama. Example: ollama launch
ollama launch codex | Run Codex | Launches the Codex integration (OpenAI's coding agent) configured to use models served by Ollama. Once it starts, you can use those models for chat, code completion, and code analysis from within the tool. Example: ollama launch codex
ollama launch <integration> --model <model> | Run with Specific Model | Launches the integration with the specified model. Useful when different models handle different tasks (coding, chat, analysis). Example: ollama launch codex --model qwen2.5:7b
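
When an integration is started from a script rather than by hand, the command can be assembled programmatically. This is a small sketch built only from the command forms in the table above; the helper names launch_command and launch are illustrative, and the integration and model names are the ones used in the examples.

```python
import shutil
import subprocess


def launch_command(integration=None, model=None):
    """Build the argument list for `ollama launch` as described in the table above."""
    args = ["ollama", "launch"]
    if integration:
        args.append(integration)
        if model:
            args += ["--model", model]
    return args


def launch(integration=None, model=None):
    """Run the command if ollama is on PATH; return its exit code, else None."""
    if shutil.which("ollama") is None:
        return None
    return subprocess.run(launch_command(integration, model)).returncode


print(" ".join(launch_command("codex", "qwen2.5:7b")))
```

Passing the arguments as a list avoids shell quoting problems with model names that contain colons or slashes.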

Cloud Work

Command | Name | Description
ollama signin | Cloud Sign-In | Signs in to your Ollama account. Required for using cloud models and accessing premium content in the registry. After signing in you can publish your own models and use the API. Example: ollama signin
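ollama push <model> | Publish Model | Uploads a local model or custom build to the remote registry. Typically used to distribute your own models. Publishing requires being signed in, and the model name must include your account namespace. Example: ollama push myuser/my-assistant (myuser is a placeholder for your account name)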
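
Publishing usually takes two steps: copy the local model to a name that includes your account namespace, then push that name. The sketch below prints the two commands. The helper publish_commands and the account name myuser are hypothetical placeholders.

```python
def publish_commands(local_name, username):
    """Return the cp and push commands that publish local_name under username."""
    remote_name = f"{username}/{local_name}"
    return [
        ["ollama", "cp", local_name, remote_name],  # give the model a namespaced name
        ["ollama", "push", remote_name],            # upload it to the registry
    ]


for command in publish_commands("my-assistant", "myuser"):
    print(" ".join(command))
```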

Built-in Commands

Note: Some commands may depend on the Ollama version and the active integration. The /set think commands work only with models that support a reasoning (thinking) mode. Built-in commands are available only inside an ollama run session, not in the regular CLI.

Command | Name | Description
Help and Documentation
/? | Commands List | Lists the built-in commands available in the current interactive session. Serves as built-in help. Example: /?
/help | Help | Equivalent to /?. Displays information about the built-in commands and their parameters. Example: /help
Session Management
/bye | Exit Session | Ends the current chat with the model and exits interactive mode. Example: /bye
/clear | Clear Context | Resets the current dialog and its context history. Afterwards the model starts a new conversation with no memory of previous messages. Example: /clear
Model Information
/show info | Model Information | Displays model details: name, size, architecture, parameters, and active settings. Example: /show info
/show modelfile | Show Modelfile | Prints the full Modelfile of the active model. Useful for inspecting system instructions, templates, and model parameters. Example: /show modelfile
/show parameters | Model Parameters | Shows the current generation parameters and execution settings. Example: /show parameters
/show system | System Prompt | Shows the system prompt used by the current model. Example: /show system
/show template | Request Template | Displays the template used to format messages between the user and the model. Example: /show template
Response Formatting
/set format json | JSON Mode | Forces the model to respond in JSON. Useful for API work and automated data processing. Example: /set format json
/set noformat | Disable Formatting | Returns to normal text responses. Example: /set noformat
Stats and Modes
/set verbose | Detailed Mode | Displays technical statistics after each response: generation speed, token counts, and processing time. Example: /set verbose
/set quiet | Quiet Mode | Turns the extra statistics off and shows only the model's responses. Example: /set quiet
/set history | Enable History | Turns on saving of the commands and messages you enter. Example: /set history
/set nohistory | Disable History | Turns off saving of the commands entered in the current session. Example: /set nohistory
/set wordwrap | Line Wrapping | Wraps long lines automatically in the terminal. Example: /set wordwrap
/set nowordwrap | Disable Wrapping | Turns automatic wrapping off. Useful for logs and JSON. Example: /set nowordwrap
Reasoning Mode
/set think | Reasoning Mode | Enables the model's reasoning mode, if the model supports it. Models that accept a level can take low, medium, or high. Example: /set think high
/set nothink | Disable Reasoning | Turns reasoning mode off. Example: /set nothink
Generation Parameters
/set parameter temperature <value> | Generation Temperature | Changes the sampling temperature on the fly, without restarting the model. Lower values make responses more predictable; higher values make them more creative. Example: /set parameter temperature 0.7
/set parameter num_ctx <value> | Context Size | Changes the size of the model's context window. A larger window lets the model use more text but increases memory consumption. Example: /set parameter num_ctx 8192
Model Switching
/load <model> | Switch Model | Loads another model in the current session without leaving interactive mode. Example: /load qwen3:8b
Keyboard Shortcuts
Ctrl+C | Stop Generation | Immediately interrupts the response the model is generating. Example: Ctrl+C
Ctrl+D | Exit | Ends the current interactive session, like /bye. Multiline messages are entered by wrapping the text in triple quotes (""") rather than by pressing Ctrl+D. Example: Ctrl+D
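
The session settings above have counterparts in the HTTP API: generation parameters go in an options object and JSON mode is requested with a format field. The following sketch builds a request body for the /api/generate endpoint. The helper generate_payload and the prompt text are illustrative; the field names are the assumption to verify against your Ollama version.

```python
import json


def generate_payload(model, prompt, temperature=None, num_ctx=None, json_mode=False):
    """Build a request body for POST /api/generate mirroring the /set commands."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    options = {}
    if temperature is not None:
        options["temperature"] = temperature  # like /set parameter temperature
    if num_ctx is not None:
        options["num_ctx"] = num_ctx          # like /set parameter num_ctx
    if options:
        payload["options"] = options
    if json_mode:
        payload["format"] = "json"            # like /set format json
    return payload


body = json.dumps(
    generate_payload("llama3.2", "List three colors as JSON.",
                     temperature=0.7, num_ctx=8192, json_mode=True)
)
print(body)
# Send this body in a POST request to http://localhost:11434/api/generate
```

Unlike /set inside a session, these values apply only to the request that carries them.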

Interactive Mode

Command | Name | Description
Interactive Menu
ollama | Interactive Interface | Starts interactive mode with a text-based menu and a list of models. Lets you select and run models and use built-in commands without knowing the CLI. Navigate with ↑ and ↓ and confirm with Enter. Example: ollama
ollama <model> | Run Model via Menu | Starts the given model through the Ollama interactive menu, which lists all locally installed models. If your version does not accept this shorthand, use ollama run <model> instead. Example: ollama llama3.2
Tools
ollama launch | Integrations Menu | Opens a menu for launching external applications and IDEs. Example: ollama launch

CLI and Help

Command | Name | Description
ollama help | Help | Lists the available commands and their options. Works as general help or, followed by a command name, as help for that command. Example: ollama help run
ollama --help | General Help | Shows general reference information and the list of commands. Example: ollama --help
ollama --version | Version | Prints the installed Ollama version. Used for compatibility checks and diagnostics. Example: ollama --version
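
For compatibility checks in scripts, the output of ollama --version can be reduced to a tuple of integers that compares correctly. This is a minimal sketch: the sample string is an assumption about the output format, which is why the regular expression looks for the first dotted version number instead of matching a fixed phrase. The helper names are illustrative.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Extract the first X.Y.Z version in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def installed_version():
    """Return the installed Ollama version, or None if ollama is not on PATH."""
    if shutil.which("ollama") is None:
        return None
    result = subprocess.run(["ollama", "--version"], capture_output=True, text=True)
    return parse_version(result.stdout + result.stderr)


print(parse_version("ollama version is 0.5.7"))  # → (0, 5, 7)
```

Comparing tuples avoids the string-comparison trap where "0.12.0" sorts before "0.5.7".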

Server

Command | Name | Description
ollama serve | Start Server | Runs Ollama as an API service. Once started, the API is available at localhost:11434. Used to integrate with applications and libraries. Example: ollama serve
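OLLAMA_DEBUG=1 ollama serve | Start in Debug Mode | Runs the server with debug messages printed to the console. Debug logging is enabled through the OLLAMA_DEBUG environment variable rather than a command-line flag. Useful for diagnosing model and integration issues. Example: OLLAMA_DEBUG=1 ollama serve
OLLAMA_HOST=127.0.0.1:<port> ollama serve | Start on Specified Port | Runs the server on the given port instead of the default 11434. The address is set through the OLLAMA_HOST environment variable rather than a command-line flag. Useful when running multiple Ollama instances. Example: OLLAMA_HOST=127.0.0.1:11435 ollama serve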
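
When the server is started from a script, its address and debug logging are controlled through the OLLAMA_HOST and OLLAMA_DEBUG environment variables. The sketch below prepares such an environment for a second instance. The helper serve_environment is illustrative and the port is the one from the example above.

```python
import os


def serve_environment(port=11434, host="127.0.0.1", debug=False, base_env=None):
    """Return a copy of the environment configured for `ollama serve`."""
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_HOST"] = f"{host}:{port}"  # address the server binds to
    if debug:
        env["OLLAMA_DEBUG"] = "1"          # enable debug logging
    return env


env = serve_environment(port=11435, debug=True)
print(env["OLLAMA_HOST"], env["OLLAMA_DEBUG"])
# To start the server with these settings:
#   import subprocess
#   subprocess.Popen(["ollama", "serve"], env=env)
```

Clients that talk to this instance need the same address, for example http://127.0.0.1:11435 as the API base URL.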