
I am quite new to the whole world of Docker. I am trying to set up an environment for several different Python machine learning applications, each of which should run independently of the others in its own Docker container. Since I don't really understand how to use base images and extend them, I am using a separate Dockerfile for each new application, which defines the packages that application uses. They all have one thing in common: they use FROM python:3.6-slim as the base.

I am looking for a starting point, or a way to easily extend this base image into a new image that contains only the individual packages each application needs, in order to save disk space. Right now each of the images has a size of approx. 1 GB, and hopefully this could be a way to reduce that.

[Screenshot: file size of each Docker image]

  • I'm not sure about that, but Docker containers let you run different applications in isolation. When you need to give the same program to another person, you just create an image of your container that has all the needed applications, so you can bring the container up in the new environment without going crazy installing all the libraries, etc. It doesn't save space; it's just a good way to go if you need multiple libraries, applications, etc. running. Commented Jul 24, 2018 at 13:26

1 Answer


Without going into details about the different storage backend solutions for Docker (check Docker - About Storage Drivers for reference), Docker reuses all the shared intermediate layers between images.

Having said that, even though you see [1.17 GB, 1.17 GB, 1.17 GB, 138 MB, 918 MB] in the docker images output, it does not mean that the sum of those sizes is what is used on your storage. We can say the following:

sum(`docker images`) <= space-in-disk
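If you want to verify how much space Docker is actually using on disk, including how much is shared between images, you can ask Docker itself (a quick check; docker system df is a standard CLI command):

docker system df        # summary: images, containers, volumes, build cache
docker system df -v     # verbose: additionally shows a per-image SHARED SIZE column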

Each of the steps in the Dockerfile creates a layer.
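You can inspect those layers with docker history, which prints one row per build step together with the size that step added. For example, for the base image used below:

docker history python:3.6-slim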

Let's take the following project structure:

├── common-requirements.txt
├── Dockerfile.1
├── Dockerfile.2
├── project1
│   ├── requirements.txt
│   └── setup.py
└── project2
    ├── requirements.txt
    └── setup.py

With Dockerfile.1:

FROM python:3.6-slim
# - here we have a layer from python:3.6-slim -

# 1. Copy requirements and install dependencies
# we do this first because we assume that the requirements change
# less often than the code
COPY ./common-requirements.txt /requirements.txt
RUN pip install -r /requirements.txt
# - here we have a layer from python:3.6-slim + your own requirements-

# 2. Install your python package in project1
COPY ./project1 /code
RUN pip install -e /code
# - here we have a layer from python:3.6-slim + your own requirements
# + the code install

CMD ["my-app-exec-1"]

With Dockerfile.2:

FROM python:3.6-slim
# - here we have a layer from python:3.6-slim -

# 1. Copy requirements and install dependencies
# we do this first because we assume that the requirements change
# less often than the code
COPY ./common-requirements.txt /requirements.txt
RUN pip install -r /requirements.txt
# == here we have a layer from python:3.6-slim + your own requirements ==
# == both containers are going to share the layers until here ==
# 2. Install your python package in project2
COPY ./project2 /code
RUN pip install -e /code
# == here we have a layer from python:3.6-slim + your own requirements
# + the code install ==

CMD ["my-app-exec-2"]

The two Docker images are going to share the layers with Python and the common-requirements.txt. This is extremely useful when building applications with a lot of heavy libraries.

To build the images, I will do:

docker build -t app-1 -f Dockerfile.1 .
docker build -t app-2 -f Dockerfile.2 .
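After building both, you can confirm that the images really share layers by comparing their layer digests (a sanity check using the standard docker image inspect command on the two images built above; the leading digests, i.e. the base image plus the common requirements, should be identical):

docker image inspect -f '{{.RootFS.Layers}}' app-1
docker image inspect -f '{{.RootFS.Layers}}' app-2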

So keep in mind that the order in which you write the steps in the Dockerfile does matter.
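To illustrate why, here is a sketch of Dockerfile.1 with the steps reordered (hypothetical, for illustration only): because the code changes more often than the requirements, copying it first invalidates the cache for every step below it, so pip install runs again on every build and the requirements layer is no longer shared between the two images:

FROM python:3.6-slim

# BAD ORDER: the code changes often, so this COPY busts the cache...
COPY ./project1 /code

# ...and every step below is re-executed on each code change,
# including the expensive dependency install
COPY ./common-requirements.txt /requirements.txt
RUN pip install -r /requirements.txt
RUN pip install -e /code

CMD ["my-app-exec-1"]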


1 Comment

Thanks for your detailed answer! That clarified a lot of my questions.
