Docker’s documentation states that volumes can be “migrated” – which I’m assuming means that I should be able to move a volume from one host to another host. (More than happy to be corrected on this point.) However, the same documentation page doesn’t provide information on how to do this.
Digging around on SO, I have found an older question (circa 2015-ish) that states that this is not possible, but given that it’s 2 years on, I thought I’d ask again.
In case it helps, I’m developing a Flask app that uses [TinyDB] + local disk as its data storage – I have determined that I didn’t need anything more fancy than that; this is a project done for learning at the moment, so I’ve decided to go extremely lightweight. The project is structured as such:
/project_directory
|- /app
|  |- __init__.py
|  |- ...
|- run.py        # assumes `data/databases/` and `data/files/` are present
|- Dockerfile
|- data/
|  |- databases/
|  |  |- db1.json
|  |  |- db2.json
|  |- files/
|     |- file1.pdf
|     |- file2.pdf
I have data/* in my .gitignore (and a matching entry in .dockerignore), so those files are not placed under version control and are excluded from the build context when building the images.
While developing the app, I am also trying to work with database entries and PDFs that are as close to real-world as possible, so I seeded the app with a very small subset of real data, stored on a volume that is mounted directly onto data/ when the Docker container is instantiated.
What I want to do is deploy the container on a remote host, but have the remote host seeded with the starter data (ideally, this would be the volume that I’ve been using locally, for maximal convenience); later on as more data are added on the remote host, I’d like to be able to pull that back down so that during development I’m working with up-to-date data that my end users have entered.
Looking around, the “hacky” way I’m thinking of doing this is simply using rsync, which might work out just fine. However, if there’s a solution I’m missing, I’d greatly appreciate guidance!
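For concreteness, the rsync approach mentioned above might be sketched like this. The remote host and paths (`user@remote.example.com`, `/srv/project/data/`) are placeholders, not anything from the question:

```shell
# Push the local seed data up to the remote host
# (host and path are hypothetical placeholders).
rsync -avz --delete ./data/ user@remote.example.com:/srv/project/data/

# Later, pull the remote data (including user-entered additions) back down.
rsync -avz user@remote.example.com:/srv/project/data/ ./data/
```

The `-a` flag preserves permissions and timestamps, and `--delete` on the push keeps the remote copy an exact mirror; omit it if the remote side may hold data not present locally.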
The way I would approach this is to generate a Docker container that stores a copy of the data you want to seed your development environment with. You can then expose the data in that container as a volume, and finally mount that volume into your development containers. I’ll demonstrate with an example:
Creating the Data Container
First, we’re just going to create a Docker container that contains your seed data and nothing else. I’d create a ~/data/Dockerfile and give it the following content:
FROM alpine:3.4
ADD . /data
VOLUME /data
CMD /bin/true
You could then build this, from within ~/data, with:
docker build -t myproject/my-seed-data .
This will create a Docker image tagged as myproject/my-seed-data:latest. The image simply contains all of the data you want to seed the environment with, stored at /data within the image. Whenever we create an instance of the image as a container, it will expose all of the files within /data as a volume.
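If you want to confirm that the VOLUME declaration behaves as described, you can create an instance of the image and inspect its mounts (the container name `seed-data` matches the example further down; the template format is just one way to read the output):

```shell
# Create an instance of the seed image; the VOLUME instruction makes
# Docker materialise /data as an anonymous volume on the host.
docker run --name seed-data myproject/my-seed-data

# Show where on the host that volume actually lives.
docker inspect -f '{{ range .Mounts }}{{ .Destination }} -> {{ .Source }}{{ end }}' seed-data
```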
Mounting the volume into another Docker container
I imagine you’re running your Docker container something like this:
docker run -d -v $(pwd)/data:/data your-container-image <start_up_command>
You could now extend that to do the following:
docker run -d --name seed-data myproject/my-seed-data
docker run -d --volumes-from seed-data your-container-image <start_up_command>
What we’re doing here is first creating an instance of your seed data container. We’re then creating an instance of the development container and mounting the volumes from the data container into it. This means that you’ll get the seed data at /data within your development container.
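As a quick sanity check that the mount worked, you could list the seeded files from inside the running development container. The container name `dev-app` is a hypothetical placeholder (add `--name dev-app` to the second `docker run` above, or substitute the ID that `docker ps` reports):

```shell
# List the seeded files from inside the development container
# ("dev-app" is a hypothetical container name).
docker exec dev-app ls -R /data
```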
It’s a little painful that you now need to run two commands, so we could go ahead and orchestrate it a bit better with something like Docker Compose.
Simple Orchestration with Docker Compose
Docker Compose is a way of running more than one container at the same time. You can declare what your environment needs to look like and do things like define:
“My development container depends on an instance of my seed data container”
You create a docker-compose.yml file to lay out what you need. It would look something like this:
version: '2'

services:
  seed-data:
    image: myproject/my-seed-data:latest

  my_app:
    build: .
    volumes_from:
      - seed-data
    depends_on:
      - seed-data
You can then start all containers at once using docker-compose up -d my_app. Docker Compose is smart enough to first start an instance of your data container, and then your app container.
Sharing the Data Container between hosts
The easiest way to do this is to push your data container as an image to Docker Hub. Once you have built the image, it can be pushed to Docker Hub as follows:
docker push myproject/my-seed-data:latest
It’s very similar in concept to pushing a Git commit to a remote repository, except in this case you’re pushing a Docker image. What this means, however, is that any environment can now pull this image and use the data contained within it. That means you can re-generate the data image when you have new seed data, push it to Docker Hub under the :latest tag, and when you restart your dev environment it will have the latest data.
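On the remote host, the corresponding steps might look like the following. The seed image name comes from the example above; `your-container-image` and the start-up command remain placeholders as in the answer:

```shell
# On the remote host: pull the latest seed data image from Docker Hub.
docker pull myproject/my-seed-data:latest

# Recreate the data container and run the app container against its volume.
docker run -d --name seed-data myproject/my-seed-data
docker run -d --volumes-from seed-data your-container-image <start_up_command>
```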
To me this is the “Docker” way of sharing data and it keeps things portable between Docker environments. You can also do things like have your data container generated on a regular basis by a job within a CI environment like Jenkins.
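A CI job that regenerates the image on a schedule could be as simple as a shell script along these lines. The data directory path and the assumption that credentials are already configured (e.g. via `docker login` on the build agent) are both assumptions, not something from the answer:

```shell
#!/bin/sh
set -e

# Rebuild the seed-data image from the current data snapshot
# (~/data is the hypothetical directory containing the Dockerfile).
cd ~/data
docker build -t myproject/my-seed-data:latest .

# Publish it so every environment can pull the refreshed data.
docker push myproject/my-seed-data:latest
```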
Answered By – Rob Blake
This answer, collected from Stack Overflow, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.