Core Environment Variables
- OLLAMA_MODELS - custom model directory; I generally move it off the OS drive.
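- OLLAMA_KEEP_ALIVE - how long a model stays loaded in memory after its last request, e.g. 10m or 24h. A negative value keeps it loaded indefinitely; 0 unloads it right away.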
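- OLLAMA_NUM_PARALLEL - maximum number of requests each loaded model processes in parallel. It does not load extra copies of the model: a single llama 2B with this set to 6 serves 6 requests at once, at the cost of extra memory for the larger combined context.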
- OLLAMA_MAX_LOADED_MODELS - maximum number of different models loaded at the same time, memory permitting, e.g. a llama 2B alongside a llama 7B.
- OLLAMA_FLASH_ATTENTION - set to 1 to enable Flash Attention, which can cut memory use as the context size grows.
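
As a rough sketch of how the variables above fit together, the Python snippet below builds an environment for the Ollama server and, optionally, launches it. The helper names `build_ollama_env` and `start_server`, the path `/data/ollama/models`, and the default values are all made up for illustration; only the variable names and the `ollama serve` command come from Ollama itself. The variables have to be set for the server process, not the client.

```python
import os
import subprocess


def build_ollama_env(models_dir, keep_alive="30m", num_parallel=4,
                     max_loaded_models=2, flash_attention=True, base_env=None):
    """Return a copy of the environment with the Ollama settings applied.

    All defaults here are illustrative assumptions, not Ollama's own defaults.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "OLLAMA_MODELS": str(models_dir),                    # where model files live
        "OLLAMA_KEEP_ALIVE": str(keep_alive),                # idle time before unload
        "OLLAMA_NUM_PARALLEL": str(num_parallel),            # parallel requests per model
        "OLLAMA_MAX_LOADED_MODELS": str(max_loaded_models),  # distinct models in memory
    })
    if flash_attention:
        env["OLLAMA_FLASH_ATTENTION"] = "1"
    else:
        # Leave the variable unset rather than guessing a "disabled" value.
        env.pop("OLLAMA_FLASH_ATTENTION", None)
    return env


def start_server(env):
    """Launch `ollama serve` with the given environment (needs ollama on PATH)."""
    return subprocess.Popen(["ollama", "serve"], env=env)


if __name__ == "__main__":
    # Hypothetical model directory on a non-OS drive.
    env = build_ollama_env("/data/ollama/models")
    for name in sorted(k for k in env if k.startswith("OLLAMA_")):
        print(f"{name}={env[name]}")
    # Uncomment to actually start the server:
    # start_server(env).wait()
```

If Ollama runs as a system service instead, the same values would go into the service's environment configuration rather than a launcher script like this one.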