main script #
The main way most folks install and run Ollama is to start with the installer you can find on ollama.com. You can be up and running super quickly with that. But some like to use Docker. Docker is nice because it is fully self-contained: for programs that are messy with a lot of dependencies, it can be nice because when you delete the container and image, there is nothing left. Ollama is a very simple program with a single executable, but it's still convenient for some to have everything running in Docker. You will take a little performance hit, though. On Linux it's pretty minimal, but on Mac and Windows it will be a little more significant due to the virtual machine that runs underneath Docker. Also, on Mac there is no GPU pass-through at all, because the Docker platform doesn't support it there. Just on Mac.
Once you know you want to run Ollama in Docker, you still need to put your models in a directory on your base system. Docker containers and images are typically designed to be really small, but the models you run with Ollama are enormous, so the two are kept separate. With Docker, though, it may be a touch easier to put your models anywhere you like.
I am going to assume that Docker is already set up on your machine and is ready to go. The command to run is docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
. Run it, and then we can take a look at the command in a bit more detail. docker run
is the command to start a new container from an image. If you don’t have the image, it will be pulled from Docker Hub. -d
says to run the container as a daemon, so it’s running in the background. --gpus=all
allows for GPU pass-through. BUT, this doesn't work on a Mac. And on Linux and Windows, you need to make sure the NVIDIA Container Toolkit is installed, assuming you have an NVIDIA card. I am running this on a brev.dev instance which already has that installed, but you can find the install instructions in the README on the Docker Hub page for the Ollama image. I'll talk more about Brev in another video coming very soon.
The next part of the command is the volume mount. This says: create a volume named ollama and mount it at /root/.ollama inside the container. This is a little confusing because there are different ways to use volume mounts, but basically the argument has two sides separated by a colon. The left side is what's on the host system and the right side is what's in the container. If the left side is a bare word and not a path, Docker creates a named volume in its default location. On this system that is /var/lib/docker/volumes, and there you can see an ollama directory. But if you want the models to go somewhere else, you need to specify the fully qualified path before the colon. I am currently in /home/ubuntu, and if I want models to go into the myollamamodels directory, which I have already created, I would use -v /home/ubuntu/myollamamodels:/root/.ollama.
Next comes ports. Any time there is a port inside the container that you want to reach from outside, you have to publish it. So we want port 11434 inside the container to be available outside the container as well. After that, we set a name. If you don't include a name, one will be generated, so it's usually best to assign one that you like. Finally we see the Docker image being used: ollama/ollama, which corresponds to its URL on Docker Hub, https://hub.docker.com/r/ollama/ollama.
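To recap, here is the same command with each flag spelled out, plus the variant that keeps models in a host directory of your choosing (the /home/ubuntu/myollamamodels path is just the example from above):

```shell
# -d: run in the background
# --gpus=all: GPU pass-through (Linux/Windows with the NVIDIA toolkit; no effect on Mac)
# -v ollama:/root/.ollama: store models in a named volume called "ollama"
# -p 11434:11434: publish the API port to the host
# --name ollama: a name we choose so later commands are easy
docker run -d --gpus=all -v ollama:/root/.ollama \
  -p 11434:11434 --name ollama ollama/ollama

# same thing, but with models in a host directory instead of a named volume
docker run -d --gpus=all -v /home/ubuntu/myollamamodels:/root/.ollama \
  -p 11434:11434 --name ollama ollama/ollama
```

You would run one or the other, not both, since the container name has to be unique.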
Let's say you actually want to call the container something else, or you want to move the container to a new location, or there is a new version and you need to update the Ollama container. Well, in all of those cases, if you just run the command again it will fail, because a container with that name already exists. So we need to stop it and remove it. Luckily we set a good name, so this is pretty easy. Just run docker stop ollama
and then docker rm ollama
. Now if you need a new ollama image, you can run docker pull ollama/ollama
to grab the latest. Then you can run that docker run
command again with the correct parameters. One thing that confuses new users is that they try to update the software inside the container instead of pulling a new image. Images and containers are designed to be immutable; you don't want to update the contents of a container. If there is updated software, you pull the new version of the image and start that up. Updating the software inside a container will probably just make it more brittle. Don't do it.
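So the full update cycle, step by step, using the name we set earlier:

```shell
docker stop ollama          # stop the running container
docker rm ollama            # remove it (the models in the ollama volume survive)
docker pull ollama/ollama   # grab the latest image from Docker Hub
docker run -d --gpus=all -v ollama:/root/.ollama \
  -p 11434:11434 --name ollama ollama/ollama   # start a fresh container
```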
Of course you didn’t run ollama just to run the server, you probably want to use the client to ask a question. But ollama probably isn’t installed on your host system. To run the ollama client that is inside the container, we need to use the docker exec
command. docker exec -it ollama ollama run llama2
. So that’s docker exec
which will run a command inside a container. -it
says to run it in an interactive terminal. If you just needed to run the command and exit, you could leave that off. The first ollama
tells docker exec which container to use. If you hadn't specified a name when running the image, then you would have to specify the container ID, which is a unique string each time it's run. Everything after the container name is the command to run inside the container. So ollama run llama2
.
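The -it flags are only needed when you want the interactive session; a one-shot command can skip them. For example (ollama list is just another client subcommand that prints your downloaded models):

```shell
# interactive: drops you into the llama2 REPL inside the container
docker exec -it ollama ollama run llama2

# non-interactive: run one command and return, no -it required
docker exec ollama ollama list
```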
Now I have used the terms container and image a lot, and if you aren’t familiar, you might think I am using them interchangeably. But they are different. All the things on DockerHub are called images. ollama/ollama is an image. It defines how ollama should run. A container is a lightweight method to run an image. You can think of the container as the instantiation of the image. And we can see this on our system. We stopped and removed the container just now, but the image still existed and we can quickly run it. If you run docker images
you will see a list of all the images you have downloaded to run as containers. If you want to remove an image, then the command is docker rmi
and the image name but it will only delete if there are no containers currently using the image.
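A quick way to see the distinction on your own system is to list both (docker ps -a includes stopped containers):

```shell
docker ps -a               # containers, running and stopped (the instantiations)
docker images              # images you have pulled (the blueprints)
docker rmi ollama/ollama   # remove the image; fails if any container still uses it
```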
OK, back to ollama run. We were looking at docker exec -it ollama ollama run llama2
. If you run that you will be dropped into the interactive REPL to work with ollama. If you are going to be doing this a lot, then it probably makes sense to make an alias. On Linux and Mac, assuming you are using the bash or zsh shell, the command is alias ollama="docker exec -it ollama ollama"
. If using fish, then just replace that equals sign with a space. If you are a Windows user, you probably know the right command to create an alias like that.
Now you can run ollama run llama2
just like you do on systems with ollama installed. If you want to use that alias after this terminal session is over, you will want to add the command to one of the rc files for your shell. That's going to be a file like .bashrc, .profile, .zshrc, or .config/fish/config.fish.
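As a concrete sketch (bash shown; zsh users would append to ~/.zshrc instead):

```shell
# define the alias for the current session
alias ollama="docker exec -it ollama ollama"

# persist it for future sessions by appending the same line to your rc file
echo 'alias ollama="docker exec -it ollama ollama"' >> ~/.bashrc
```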
There is another command you need to know about. Sometimes you have to look at the logs to see what ollama is doing. So docker logs ollama
is the command to run to see those.
OK, so now you are set up to use Ollama in Docker on a host. But let's say you have set up Ollama on a host, like I have on Brev, and you want to access it from a remote machine, something other than the host running Docker. Well, if the host is on the same network as your client machine, this is easy. You will have to stop and remove your Docker container for Ollama. Remember, that was docker stop
and docker rm
. Then run the same run command, but add -e OLLAMA_HOST=0.0.0.0:11434 right after docker run. (With Docker, the variable has to be passed into the container with the -e flag; setting it in your shell before the command won't reach the server process inside the container.) Then anything that connects to the host on that port will get your Docker container's version of Ollama.
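Put together, the restart looks like this, passing the variable into the container with -e so the server inside definitely sees it:

```shell
docker stop ollama && docker rm ollama
docker run -d -e OLLAMA_HOST=0.0.0.0:11434 --gpus=all \
  -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```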
But what happens if that host isn't on your home network and is instead somewhere else on the Internet? Then things get a bit tougher. Somehow you need to give your client access to the host. Some will solve this by opening ports, but that's an incredibly bad idea without some sort of authentication in front. Probably the easiest thing to do is to use Tailscale, which is free for most non-companies. For you at home, this is brain-dead easy. And if you are doing this for work, then your ops team probably has a solution they like, or Tailscale is so ridiculously cheap that you can probably expense it and they won't bat an eyelash over it.
Side note: I once bought some stuff for the office when I was at Datadog. It was the only time finance reached out, and they said I needed to describe what I bought. 'You really don't want to know,' I said. But I had to put it in; new people, new system. 'Don't worry,' they said. OK, so I put in what I bought: about $200 worth of NERF guns and those little yellow balls for ammo. I thought, oh well, guess I'll eat that expense. No problem; expense paid. Unfortunately the cleaning staff thought the balls were trash, and within a week or so all the ammo was gone.
Anyway, that’s what you need to know to understand how to work with Docker and Ollama. In most cases, you will want to run Ollama as a native install, but if you insist on using docker, now you are set. If you have any questions, let me know in the chat below. I love the questions that I see, especially when you give me suggestions for what to make a video about. So many of these videos, including this one, come from those suggestions. Keep them coming.
Thanks so much for being here, goodbye.
docker containers vs images #
A docker container is not the same as a docker image, but it's common to think they are. All the things on Docker Hub are called images; ollama/ollama, which you can find at hub.docker.com/r/ollama/ollama, is an image. It defines how Ollama should run. A container is a lightweight way to run an image; you can think of the container as the instantiation of the image. And we can see this on our system. We can stop a container with docker stop
and remove a container with docker rm
, but the image still exists and we can quickly run it to bring it back up. If you run docker images
you will see a list of all the images you have downloaded to run as containers. If you want to remove an image, then the command is docker rmi
and the image name, but it will only delete if there are no containers currently using the image. And now you know that a docker container is not the same as a docker image.
crazy expenses #
Don't be afraid of your expense reports; sometimes they go really well. I once bought some stuff for the office when I was at Datadog. I was the first Evangelist there and started the community team as well as Docs and Training. But this story is from when Datadog was a bit bigger. It was the only time finance EVER reached out, and they said I needed to describe what I bought. 'You really don't want to know,' I said. But I had to put it in; new people, new system. 'Don't worry,' they said. OK, so I put in what I bought: about $200 worth of NERF guns and those little yellow balls for ammo. I thought, oh well, guess I'll eat that expense. No problem; expense paid. Unfortunately the cleaning staff thought the balls were trash, and within a week or so all the ammo was gone.
how to update ollama when running docker #
Always update docker containers the right way, whether it's for Ollama or anything else. The upgrade process is always the same. Docker containers should be immutable, so you don't want to update Ollama inside the container. Instead, you need to pull the updated image from Docker Hub. The best way to do this is to stop the container with docker stop
and the container name or id. Then remove the container with docker rm
and the container name or id. Then pull the latest version of the image with docker pull
and the image name. Then finally run the original docker run
command to start up the container again. If you try to update the software running inside a container, you risk making it more brittle. So always update docker containers the right way.
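Using the container name from the standard run command, the four steps can be chained into one line (each step runs only if the previous one succeeded):

```shell
docker stop ollama && docker rm ollama && docker pull ollama/ollama && \
  docker run -d --gpus=all -v ollama:/root/.ollama \
    -p 11434:11434 --name ollama ollama/ollama
```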
how to make ollama in a docker container accessible to the outside world #
Often you run a product like Ollama in a container to act as a server. And sometimes you want that server to be accessible from another location. With Ollama we have the standard way of running it as a container, using the command docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
. This will start the Ollama service, publishing port 11434 to the host system. But to make sure Ollama responds when you connect from another machine, you can set the OLLAMA_HOST environment variable inside the container by adding an -e flag to the command: docker run -d -e OLLAMA_HOST=0.0.0.0:11434 --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
. Now as long as the host is accessible you will be able to access port 11434 on that host and it will respond from the container.
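A quick way to check reachability from the client machine is to hit the API with curl; HOSTNAME here is a placeholder for your Docker host's name or IP:

```shell
# list the models the server knows about
curl http://HOSTNAME:11434/api/tags

# ask for a completion (assumes you have already pulled llama2)
curl http://HOSTNAME:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Why is the sky blue?", "stream": false}'
```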
how to run the ollama client when ollama is installed with docker #
You installed Ollama with Docker and the server is up and running. How do you access it with the client? The easiest way is to run docker exec -it ollama ollama run llama2
. What this does is execute a command in an interactive terminal. The command is run inside the container named ollama; that's the first ollama in our command. Then the command to run is ollama run llama2
, or whatever model you want. But this isn't super convenient. So create an alias instead. On bash or zsh, try alias ollama="docker exec -it ollama ollama", and then you will be able to run ollama run llama2
. Now it feels like ollama is installed on the host even though…