I am new to Docker but have had success in Dockerizing some existing Python code using Docker Toolbox for Windows 10.

Currently I have this setup:

picture of working python code in Docker container

This is done with the Dockerfile:

FROM python:2.7.13
WORKDIR /root
COPY ./requirements.txt /root/requirements.txt
RUN pip install -r requirements.txt
COPY . /root
CMD ["python", "main.py"]

All my code sits in the container along with a bunch of CSV and .pkl files. The CSV and .pkl files change daily, so after some reading I think I can split these files out into a volume, or maybe even a separate container, that I can modify and upload every day without changing the main Python script, as it's 1.4G in size and my upload speed is 40kbps (at best).

Picture of container setup that i would like

So I'm wondering: how would I reference the other container/volume so that I can access the CSV and .pkl files from my main Python code? At the moment everything sits in the same directory, so there is no problem; I just use the .csv/.pkl file name and it works:

import pickle
import pandas as pd

# open the local .csv file
data = pd.read_csv(csv_select)
# open the local .pkl file (the with block closes the file afterwards)
with open(can_cat + ".pkl", "rb") as f:
    pickled_list = pickle.load(f)

How would I change the above code to open a CSV/.pkl file from a separate container?

I have read heaps of Stack Overflow posts and the Docker documentation but can't seem to understand how to make it work. Any help would be appreciated.

  • So this pd.read_csv will run inside the container, and the volume would be mounted from the host? Commented Aug 24, 2017 at 13:22

1 Answer

Yeah, you're on the right track in thinking about using volumes. I would split it up into three bits:

  1. Your Python code running in one container
  2. A volume that is shared between your Python container and one or more other containers
  3. A "data copying" container that, on a daily basis, copies the latest data to the shared volume.

1. A shared volume

Creating volumes with Docker is easy. What is particularly good is that you can create a volume with a particular name:

docker volume create data-volume

Here we have created a named volume called data-volume. You can then mount this onto any container using a command like this:

docker run --rm -v data-volume:/data my-container-image

So here we're running a container from the my-container-image Docker image and mounting the data-volume volume at /data within that container.

Your Python code could easily read the files it needs from that directory, e.g. /data, or you could change the mount point as required.
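As a minimal sketch of what that could look like on the Python side: the only change is that file names are resolved against the mount point instead of the working directory. The `DATA_DIR` environment variable and the `data_path` and `load_pickle` helpers are hypothetical names introduced here for illustration; `/data` is the mount point used in the docker run command above.

```python
import os
import pickle

# Hypothetical setting: where the shared volume is mounted inside the container.
# Defaults to /data, the mount point used in the docker run example.
DATA_DIR = os.environ.get("DATA_DIR", "/data")


def data_path(filename, data_dir=None):
    """Return the full path of a data file inside the mounted volume."""
    return os.path.join(data_dir or DATA_DIR, filename)


def load_pickle(name, data_dir=None):
    """Load <name>.pkl from the mounted volume."""
    with open(data_path(name + ".pkl", data_dir), "rb") as f:
        return pickle.load(f)
```

With helpers like these, the calls from the question would presumably become `pd.read_csv(data_path(csv_select))` and `load_pickle(can_cat)`, with nothing else in the script changing.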

2. Copying changed data into the volume

The next step would be to create a simple app that can copy the latest changes into that directory. Again, let's say this app copies the latest data into /data on its own file system. Essentially we want an app that does:

cp $TODAYS_DATA.csv $TODAYS_DATA.pkl /data
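For illustration, the same copy step could also be written as a small Python script. This is only a sketch: the `copy_todays_data` function, and the idea that today's .csv and .pkl files share a common base name, are assumptions that mirror the `$TODAYS_DATA` placeholder above.

```python
import os
import shutil


def copy_todays_data(base_name, src_dir, dest_dir="/data"):
    """Copy <base_name>.csv and <base_name>.pkl from src_dir into dest_dir.

    Returns the list of destination paths that were written.
    """
    copied = []
    for ext in (".csv", ".pkl"):
        src = os.path.join(src_dir, base_name + ext)
        dest = os.path.join(dest_dir, base_name + ext)
        shutil.copy2(src, dest)  # copy2 also preserves modification times
        copied.append(dest)
    return copied
```

If either file is missing, `shutil.copy2` raises an error, which is probably what you want for a daily job: a failed copy is visible rather than silently leaving stale data in the volume.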

We could run this app within a container and also ensure that the container has the data-volume mounted at /data, e.g.:

docker run --rm -v data-volume:/data my-data-copying-app

This container could be really simple, something like:

FROM alpine:latest
COPY ./todaysdata /todaysdata

You could then run it using the following:

docker run --rm -v data-volume:/data my-data-copy-image /bin/sh -c "cp -r /todaysdata/* /data/"

So essentially you just run the container with a command to copy today's data into /data. Because /data is actually a volume, the latest data is then immediately shared with your Python app, which is exactly what you wanted.
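One caveat if the Python app is a long-running process rather than something started fresh each day: files it has already loaded into memory will not update by themselves. A rough sketch of one way to handle that is to reload a file whenever its modification time changes. The `ReloadingPickle` class is a hypothetical helper, not something from Docker or the original code.

```python
import os
import pickle


class ReloadingPickle(object):
    """Cache a pickle file and reload it only when its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._value = None

    def get(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:
            # The file was replaced (or never loaded), so read it again.
            with open(self.path, "rb") as f:
                self._value = pickle.load(f)
            self._mtime = mtime
        return self._value
```

If main.py simply runs once and exits, none of this is needed; each run will read whatever is currently in the volume.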

Hope that helps.

3 Comments

Thank you Rob! I'm going to try this out as soon as I get some time today and I'll get back to you with how it all went. Thanks for the detailed reply, I didn't think of copying it over into the container volume I was just focusing on how to access the data at another container location.
Hey Rob, I found a similar way of achieving what you describe for the copy container: docker run -v my-volume:/data --name helper busybox true, docker cp . helper:/data, docker rm helper stackoverflow.com/questions/37468788/…
@MichaelDalton Yep, an alternative to my solution is to use docker cp. You could, for example, create a script that copies your files for today onto the host running your container and then docker cp them into the container. Either way would work absolutely fine.
