Overview
Ollama is a tool for running large language models (LLMs) locally. It supports multiple model formats and can be managed from the CLI or from a built-in interactive interface.
| Command | Name | Description |
|---|---|---|
| `ollama` | Interactive Menu | Opens the Ollama interactive interface in the terminal. Lets you launch models, select integrations, and open additional tools without typing commands manually. Navigate with the ↑ and ↓ keys and confirm with Enter. Example: `ollama` |
| `ollama serve` | Start Server | Runs Ollama as an API service. Once started, the API is available at `localhost:11434`. Used to integrate with applications and libraries. Example: `ollama serve`. The server takes its settings from environment variables: `OLLAMA_DEBUG=1 ollama serve` enables debug logging, and `OLLAMA_HOST=127.0.0.1:<port> ollama serve` listens on the given port. |
Running Models
| Command | Name | Description |
|---|---|---|
| Loading and Running Models | | |
| `ollama run <model>` | Run Model | Loads the model and starts an interactive session. If the model is not available locally, Ollama downloads it first. Used for chat, model testing, and local LLM work. Example: `ollama run llama3.2` |
| `ollama pull <model>` | Download Model | Downloads a model from the Ollama registry to local storage without starting it. Convenient for pre-loading. Verify the download with `ollama list`. Running the command again for an installed model updates that model to the latest version in the registry. Example: `ollama pull gemma3` |
| `ollama pull <model>:<tag>-cloud` | Pull Cloud Model | Fetches a model hosted on Ollama Cloud. Cloud models are selected by a tag ending in `-cloud` rather than by a separate flag, and they require signing in first with `ollama signin`. Progress is shown while the model is fetched. Example: `ollama pull gpt-oss:120b-cloud` |
| `ollama list` | Models List | Shows all models installed locally. Convenient for checking which models and versions are available. Example: `ollama list` |
| `ollama ps` | Active Processes | Shows the models currently loaded in memory, along with their size and CPU/GPU usage. Useful for performance diagnostics. Example: `ollama ps` |
| `ollama show <model>` | View Model Information | Prints model details: parameters, template, system prompt, and other data. Useful for analyzing a model's configuration. Example: `ollama show llama3.2` |
| Running Integrations | | |
| `ollama launch` | Run Integrations | Opens a menu for configuring and launching external applications that work through Ollama. Used to connect IDEs and AI tools. Example: `ollama launch` |
| `ollama launch <integration>` | Run Specific Integration | Launches the named integration directly, skipping the selection menu. Suitable for quickly starting a development environment or AI tool. Example: `ollama launch codex` |
| `ollama launch <integration> --model <model>` | Run Integration with Model | Specifies which model the external application should use. Useful when different models handle different tasks. Example: `ollama launch codex --model llama3.2` |
| Managing Models | | |
| `ollama cp <source> <destination>` | Copy Model | Creates a copy of an existing model under a new name. Convenient before changing its configuration or experimenting. Example: `ollama cp llama3.2 llama3.2-custom` |
| `ollama rm <model>` | Remove Model | Removes the model from local storage and frees disk space. Example: `ollama rm llama3.2` |
| `ollama create <name> -f <Modelfile>` | Create Model | Creates a new model from a Modelfile. Lets you change system instructions, parameters, and templates, and attach adapters. Example: `ollama create mario-ai -f ./Modelfile` |
| `ollama pull <model>` | Update Model | Running `pull` again for an installed model updates it to the latest version in the registry; only the parts that changed are downloaded. Useful for picking up fixes and improvements. Example: `ollama pull llama3.2` |
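
The `list`, `pull`, and `create` commands above combine naturally in a setup script. The following is a minimal sketch rather than an official workflow: the helper names `model_installed`, `ensure_model`, and `create_mario`, and the Modelfile contents, are illustrative assumptions. It also assumes that `ollama list` prints a header line followed by one model per line, with the name and tag in the first column.

```shell
#!/bin/sh
# Sketch: make sure a base model is present, then build a custom model from a Modelfile.
# Helper names are hypothetical; only the `ollama` subcommands come from the tables above.

# Succeeds if the model appears in the first column of `ollama list`.
# Assumption: names are listed as <name>:<tag>; a bare name is treated as <name>:latest.
model_installed() {
    case "$1" in
        *:*) want="$1" ;;
        *)   want="$1:latest" ;;
    esac
    ollama list | awk -v want="$want" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Pulls the model only when it is not installed yet.
ensure_model() {
    if model_installed "$1"; then
        echo "already installed: $1"
    else
        ollama pull "$1"
    fi
}

# Writes a small example Modelfile and creates a model named mario-ai from it.
create_mario() {
    cat > ./Modelfile <<'EOF'
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM You are Mario from Super Mario Bros. Answer as Mario.
EOF
    ollama create mario-ai -f ./Modelfile
}
```

A typical call sequence would be `ensure_model llama3.2 && create_mario`, followed by `ollama run mario-ai` to try the result.
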
Integrations
Ollama supports various IDEs and tools for working with models.
| Command | Name | Description |
|---|---|---|
| `ollama launch` | Integrations Menu | Opens an interactive menu for selecting and launching external applications. Connects IDEs (VS Code, Vim), editors, and AI tools without complex configuration. A good first step after installing Ollama. Example: `ollama launch` |
| `ollama launch codex` | Run Codex | Launches the Codex coding tool configured to use models served by Ollama. Once started, you can chat with the model about your code and use it for code completion and analysis. Example: `ollama launch codex` |
| `ollama launch <integration> --model <model>` | Run with Specific Model | Launches the integration with the given model. Useful when different models are used for different tasks (coding, chat, analysis). Example: `ollama launch codex --model qwen2.5:7b` |
Cloud Work
| Command | Name | Description |
|---|---|---|
| `ollama signin` | Cloud Authorization | Signs in to your Ollama account. Required for using cloud models and for accessing premium content in the registry. After signing in you can upload your own models and use the API. Example: `ollama signin` |
| `ollama push <model>` | Publish Model | Uploads a local model or custom build to the remote registry. Typically used to distribute your own models. Requires signing in, and the model name is expected to include your account namespace. Example: `ollama push <username>/my-assistant` |
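
Publishing usually takes two steps: copying the model under a name that carries your account namespace, then pushing that name. The sketch below illustrates this under stated assumptions: `publish_model` is a hypothetical helper, the registry is assumed to expect names of the form `<username>/<model>`, and you are assumed to have already run `ollama signin`.

```shell
# Sketch: copy a local model into an account namespace and publish it.
# `publish_model` and its arguments are hypothetical; the assumption is that
# the registry expects names of the form <username>/<model>.
publish_model() {
    user="$1"
    model="$2"
    if [ -z "$user" ] || [ -z "$model" ]; then
        echo "usage: publish_model <username> <model>" >&2
        return 2
    fi
    target="$user/$model"
    # Push only if the copy succeeded.
    ollama cp "$model" "$target" && ollama push "$target"
}
```

For example, `publish_model <username> my-assistant` would run `ollama cp my-assistant <username>/my-assistant` and then `ollama push <username>/my-assistant`.
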
Built-in Commands
Note: Some commands may depend on the Ollama version and the active integration. The `/set think` commands work only with models that support a reasoning (thinking) mode. Built-in commands are available only inside an `ollama run` session, not in the regular CLI.
| Command | Name | Description |
|---|---|---|
| Help and Documentation | | |
| `/?` | Commands List | Shows the built-in commands available in the current interactive session. Works as built-in help. Example: `/?` |
| `/help` | Help | Equivalent to `/?`. Displays detailed information about built-in commands and their parameters. Example: `/help` |
| Session Management | | |
| `/bye` | Exit Session | Ends the current chat with the model and exits interactive mode. Example: `/bye` |
| `/clear` | Clear Context | Resets the current dialog and its context history. The model then starts a new conversation without the previous messages. Example: `/clear` |
| Model Information | | |
| `/show info` | Model Information | Displays model details: name, size, architecture, parameters, and active settings. Example: `/show info` |
| `/show modelfile` | Show Modelfile | Prints the full Modelfile of the active model. Useful for analyzing system instructions, templates, and model parameters. Example: `/show modelfile` |
| `/show parameters` | Model Parameters | Shows the current generation parameters and execution settings. Example: `/show parameters` |
| `/show system` | System Prompt | Shows the system prompt used by the current model. Example: `/show system` |
| `/show template` | Prompt Template | Displays the template used to format messages between the user and the model. Example: `/show template` |
| Response Formatting | | |
| `/set format json` | JSON Mode | Forces the model to respond in JSON. Useful for API-style use and automated data processing. Example: `/set format json` |
| `/set noformat` | Disable Formatting | Returns to normal text responses. Example: `/set noformat` |
| Stats and Modes | | |
| `/set verbose` | Verbose Mode | Displays technical information: generation speed, token counts, processing time, and other statistics about the model's work. Example: `/set verbose` |
| `/set quiet` | Quiet Mode | Turns off the extra statistics and shows only the model's responses. Example: `/set quiet` |
| `/set history` | Enable History | Turns on saving of entered commands and messages to the history. Example: `/set history` |
| `/set nohistory` | Disable History | Turns off saving of the current session's commands to the history. Example: `/set nohistory` |
| `/set wordwrap` | Line Wrapping | Enables automatic wrapping of long lines in the terminal. Example: `/set wordwrap` |
| `/set nowordwrap` | Disable Wrapping | Disables automatic text wrapping. Useful for logs and JSON. Example: `/set nowordwrap` |
| Reasoning Mode | | |
| `/set think` | Reasoning Mode | Enables the model's thinking (reasoning) mode, if the model supports it. Available levels: low, medium, high. Example: `/set think high` |
| `/set nothink` | Disable Reasoning | Turns thinking mode off. Example: `/set nothink` |
| Generation Parameters | | |
| `/set parameter temperature <value>` | Generation Temperature | Changes the sampling temperature during the session without restarting the model. Lower values make responses more predictable; higher values make them more creative. Example: `/set parameter temperature 0.7` |
| `/set parameter num_ctx <value>` | Context Size | Changes the size of the model's context window. A larger window lets the model use more text but increases memory consumption. Example: `/set parameter num_ctx 8192` |
| Model Switching | | |
| `/load <model>` | Switch Model | Loads another model within the current session without leaving interactive mode. Example: `/load qwen3:8b` |
| Keyboard Shortcuts | | |
| `Ctrl+C` | Stop Generation | Immediately interrupts the response the model is generating. Example: `Ctrl+C` |
| `Ctrl+D` | Finish Input/Exit | Finishes multiline message input or ends the current interactive session. Example: `Ctrl+D` |
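
The in-session settings `/set format json`, `/set parameter temperature`, and `/set parameter num_ctx` have counterparts in the request body of the HTTP API, which is convenient for scripts that cannot drive an interactive session. The sketch below shows one way to build such a request. It is illustrative only: `build_payload` and `generate_json` are hypothetical helper names, the server is assumed to be running on the default `localhost:11434`, and the helper does no JSON escaping, so the prompt must not contain double quotes or backslashes.

```shell
# Sketch: REST counterparts of /set format json and /set parameter.
# `build_payload` is a hypothetical helper. It performs no JSON escaping,
# so the prompt must not contain double quotes or backslashes.
build_payload() {
    # $1 = model, $2 = prompt, $3 = temperature, $4 = num_ctx
    printf '{"model":"%s","prompt":"%s","format":"json","stream":false,"options":{"temperature":%s,"num_ctx":%s}}' \
        "$1" "$2" "$3" "$4"
}

# Sends the payload to a local server on the default port (assumed to be running).
generate_json() {
    build_payload "$@" | curl -s http://localhost:11434/api/generate -d @-
}
```

A call such as `generate_json llama3.2 'List three colors as JSON' 0.7 8192` sends one non-streaming request with those settings applied only to that request.
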
Interactive Mode
| Command | Name | Description |
|---|---|---|
| Interactive Menu | | |
| `ollama` | Interactive Interface | Opens interactive mode with a menu and a list of models. Lets you select and run models and use built-in commands without knowing the CLI. Navigation: ↑, ↓, Enter. Example: `ollama` |
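| `ollama <model>` | Run Model via Menu | Runs a specific model through the Ollama interactive menu. The menu automatically shows all locally installed models. If your Ollama version does not accept this form, use `ollama run <model>` instead. Example: `ollama llama3.2` |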
| Tools | | |
| `ollama launch` | Integrations Menu | Opens a menu for launching external applications and IDEs. Example: `ollama launch` |
CLI and Help
| Command | Name | Description |
|---|---|---|
| `ollama help` | Help | Shows the list of available commands and their parameters. Works as general help or, followed by a command name, as help for that command. Example: `ollama help run` |
| `ollama --help` | General Help | Shows general reference information and the list of commands. Example: `ollama --help` |
| `ollama --version` | Version | Prints the installed Ollama version. Used for compatibility checks and diagnostics. Example: `ollama --version` |
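
Since some commands depend on the Ollama version, a script can check the version before relying on them. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the output of `ollama --version` is assumed to contain a dotted number such as `0.5.7`, and versions are assumed to have exactly three numeric components.

```shell
# Sketch: read the installed version and compare it with a required minimum.
# Helper names are hypothetical. Assumes `ollama --version` prints a line
# containing a three-part dotted version number.
ollama_version() {
    ollama --version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeeds when version $1 is greater than or equal to version $2.
version_at_least() {
    # Split both versions on dots into six positional parameters.
    old_ifs="$IFS"; IFS=.
    set -- $1 $2
    IFS="$old_ifs"
    # Compare major, then minor, then patch.
    [ "$1" -gt "$4" ] && return 0
    [ "$1" -lt "$4" ] && return 1
    [ "$2" -gt "$5" ] && return 0
    [ "$2" -lt "$5" ] && return 1
    [ "$3" -ge "$6" ]
}
```

For example, `version_at_least "$(ollama_version)" 0.5.0 || echo "please upgrade Ollama"` prints a hint when the installed version is older than the chosen minimum. The minimum shown here is a placeholder, not a documented requirement.
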
Server
| Command | Name | Description |
|---|---|---|
| `ollama serve` | Start Server | Runs Ollama as an API service. Once started, the API is available at `localhost:11434`. Used to integrate with applications and libraries. Example: `ollama serve` |
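| `OLLAMA_DEBUG=1 ollama serve` | Start in Debug Mode | Runs the server with debug messages in the console. Debug logging is switched on through the `OLLAMA_DEBUG` environment variable. Useful for diagnosing model and integration issues. Example: `OLLAMA_DEBUG=1 ollama serve` |
| `OLLAMA_HOST=127.0.0.1:<port> ollama serve` | Start on Specified Port | Runs the server on the given port instead of the default 11434. The address and port are set through the `OLLAMA_HOST` environment variable. Useful when running multiple Ollama instances. Example: `OLLAMA_HOST=127.0.0.1:11435 ollama serve` |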
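
When the server is started from a script, it helps to wait until the API answers before sending requests. The sketch below shows one possible approach and is not part of Ollama itself: `ollama_base_url` and `wait_for_server` are hypothetical helpers, `OLLAMA_HOST` is assumed to have the form `host:port` when set, and the server is assumed to answer a plain GET request on its root path.

```shell
# Sketch: derive the API base URL from OLLAMA_HOST and wait for the server.
# Helper names are hypothetical. Assumes OLLAMA_HOST, when set, has the form
# host:port, and that the server answers GET requests on its root path.
ollama_base_url() {
    echo "http://${OLLAMA_HOST:-127.0.0.1:11434}"
}

# Polls the server once per second until it responds or the attempts run out.
# $1 = maximum number of attempts (default 10).
wait_for_server() {
    attempts="${1:-10}"
    while [ "$attempts" -gt 0 ]; do
        if curl -sf "$(ollama_base_url)/" >/dev/null; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}
```

A script might run `export OLLAMA_HOST=127.0.0.1:11435`, start `ollama serve &`, and then call `wait_for_server 30 && curl -s "$(ollama_base_url)/api/tags"` to list the installed models once the server is up.
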