Docker series: understanding runC

binaryTree / 1252 reads

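Preface

To understand Docker, the main topics are namespaces, cgroups, union file systems, the runtime (runC), and networking. We will take some time to cover each of them in turn.

Docker series: understanding namespaces

Docker series: understanding cgroups

Docker series: understanding unionfs

Docker series: understanding runC

Docker series: understanding network modes

Namespaces mainly provide isolation, cgroups mainly enforce resource limits, and union file systems are mainly used for layered image storage and management. runC is the runtime; it follows the OCI specification and is generally based on libcontainer. Networking mainly covers Docker's single-host networking and multi-host communication modes.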

runC

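runC is a lightweight tool for running containers. It does only that one thing and aims to do it well. You can think of it as a small command-line tool that runs containers directly, without going through the Docker engine. In fact, runC is a product of standardization: it creates and runs containers according to the OCI specification. The OCI (Open Container Initiative) aims to establish an open industry standard around container formats and runtimes.
The OCI was founded in 2015 by Docker, CoreOS, and other container companies. It currently maintains two main specifications: the container runtime specification (runtime spec) and the container image specification (image spec).
runC is written in Go and is built on the libcontainer library. Since Docker 1.11, Docker has run containers through containerd and runC.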

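Building

runc currently supports Linux on various architectures. It must be built with Go 1.6 or later for some features to work properly.
To enable seccomp support, libseccomp must be installed on your platform.

e.g. libseccomp-devel for CentOS, or libseccomp-dev for Ubuntu

Otherwise, if you do not want to build runc with seccomp support, you can add BUILDTAGS="" when running make.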

# create a "github.com/opencontainers" in your GOPATH/src
cd github.com/opencontainers
git clone https://github.com/opencontainers/runc
cd runc

make
sudo make install

Build options

runc supports optional build tags for compiling in support for various features. To add build tags to the make invocation, set the BUILDTAGS variable.

make BUILDTAGS="seccomp apparmor"

Build Tag   Feature                              Dependency
seccomp     Syscall filtering                    libseccomp
selinux     selinux process and mount labeling   (none)
apparmor    apparmor profile support             (none)
ambient     ambient capability support           kernel 4.3
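
Creating an OCI bundle with runC

To use runc, your container must be in the format of an OCI bundle. If you have Docker installed, you can use its export command to obtain a root filesystem from an existing Docker container.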

# create the top most bundle directory
mkdir /mycontainer
cd /mycontainer

# create the rootfs directory
mkdir rootfs

# export busybox via Docker into the rootfs directory
docker export $(docker create busybox) | tar -C rootfs -xvf -

runc provides a spec command that generates a basic template specification, which you can then edit.

runc spec

Running a container

First prepare a working directory, for example mycontainer. All of the operations below are run inside this directory:

# mkdir mycontainer

Next, prepare the filesystem for the container image. Here we extract it from a Docker image:

# mkdir rootfs
# docker export $(docker create busybox) | tar -C rootfs -xvf -
# ls rootfs 
bin  dev  etc  home  proc  root  sys  tmp  usr  var

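With the rootfs in place, the OCI specification also requires a configuration file, config.json, that describes how to run the container, including the command to run, permissions, environment variables, and so on. runc provides a command that generates it for us: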

# runc spec
# ls
config.json  rootfs

Together these make up an OCI runtime bundle. The bundle is very simple and contains just those two things: the config.json file and the rootfs filesystem. The content of config.json is long, so it is not reproduced here; we will not modify it and simply use the generated default. With this information, runc knows how to run the container. Let's start with the simple way, runc run (this command requires root privileges). Like docker run, it creates and starts a container:

runc run simplebusybox
/ # ls
bin   dev   etc   home  proc  root  sys   tmp   usr   var
/ # hostname
runc
/ # whoami
root
/ # pwd
/
/ # ip addr
1: lo:  mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 sh
   11 root       0:00 ps aux

Now open another terminal to view information about the running container:

runc list
ID              PID         STATUS      BUNDLE                                    CREATED                          OWNER
simplebusybox   18073       running     /home/cizixs/Workspace/runc/mycontainer   2017-11-02T06:54:52.023379345Z   root

Reading the runC code

Overall, the runC code is fairly simple. It mainly uses the github.com/urfave/cli library to implement a set of commands:

    app.Commands = []cli.Command{
        checkpointCommand,
        createCommand,
        deleteCommand,
        eventsCommand,
        execCommand,
        initCommand,
        killCommand,
        listCommand,
        pauseCommand,
        psCommand,
        restoreCommand,
        resumeCommand,
        runCommand,
        specCommand,
        startCommand,
        stateCommand,
        updateCommand,
    }

Anyone familiar with the docker commands will recognize these.
Under the hood, these commands call the libcontainer library to do the actual work.
For example, the create command:

var createCommand = cli.Command{
    Name:  "create",
    Usage: "create a container",
    ArgsUsage: `<container-id>

Where "<container-id>" is your name for the instance of the container that you
are starting. The name you provide for the container instance must be unique on
your host.`,
    Description: `The create command creates an instance of a container for a bundle. The bundle
is a directory with a specification file named "` + specConfig + `" and a root
filesystem.

The specification file includes an args parameter. The args parameter is used
to specify command(s) that get run when the container is started. To change the
command(s) that get executed on start, edit the args parameter of the spec. See
"runc spec --help" for more explanation.`,
    Flags: []cli.Flag{
        cli.StringFlag{
            Name:  "bundle, b",
            Value: "",
            Usage: `path to the root of the bundle directory, defaults to the current directory`,
        },
        cli.StringFlag{
            Name:  "console-socket",
            Value: "",
            Usage: "path to an AF_UNIX socket which will receive a file descriptor referencing the master end of the console's pseudoterminal",
        },
        cli.StringFlag{
            Name:  "pid-file",
            Value: "",
            Usage: "specify the file to write the process id to",
        },
        cli.BoolFlag{
            Name:  "no-pivot",
            Usage: "do not use pivot root to jail process inside rootfs.  This should be used whenever the rootfs is on top of a ramdisk",
        },
        cli.BoolFlag{
            Name:  "no-new-keyring",
            Usage: "do not create a new session keyring for the container.  This will cause the container to inherit the calling processes session key",
        },
        cli.IntFlag{
            Name:  "preserve-fds",
            Usage: "Pass N additional file descriptors to the container (stdio + $LISTEN_FDS + N in total)",
        },
    },
    Action: func(context *cli.Context) error {
        if err := checkArgs(context, 1, exactArgs); err != nil {
            return err
        }
        if err := revisePidFile(context); err != nil {
            return err
        }
        spec, err := setupSpec(context)
        if err != nil {
            return err
        }
        status, err := startContainer(context, spec, CT_ACT_CREATE, nil)
        if err != nil {
            return err
        }
        // exit with the container's exit status so any external supervisor is
        // notified of the exit with the correct exit status.
        os.Exit(status)
        return nil
    },
}

Flags defines the command-line arguments that each command accepts.

Action contains the actual execution logic.
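
The same shape (a name, a set of flags, and an action) can be sketched with only the Go standard library. This is a minimal illustration under stated assumptions, not runc's code: the `command` type, `createAction`, and `dispatch` are hypothetical names, and runc itself relies on github.com/urfave/cli for all of this.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// command is a hypothetical, stripped-down analogue of cli.Command.
type command struct {
	name   string
	usage  string
	action func(args []string) (string, error)
}

// createAction parses a --bundle flag and requires exactly one container ID,
// loosely mirroring the checkArgs(context, 1, exactArgs) call shown above.
func createAction(args []string) (string, error) {
	fs := flag.NewFlagSet("create", flag.ContinueOnError)
	bundle := fs.String("bundle", ".", "path to the root of the bundle directory")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if fs.NArg() != 1 {
		return "", errors.New("create requires exactly 1 argument: <container-id>")
	}
	return fmt.Sprintf("create %s from bundle %s", fs.Arg(0), *bundle), nil
}

// commands plays the role of app.Commands.
var commands = []command{
	{name: "create", usage: "create a container", action: createAction},
}

// dispatch looks up a command by name and runs its action.
func dispatch(argv []string) (string, error) {
	if len(argv) == 0 {
		return "", errors.New("no command given")
	}
	for _, c := range commands {
		if c.name == argv[0] {
			return c.action(argv[1:])
		}
	}
	return "", fmt.Errorf("unknown command %q", argv[0])
}

func main() {
	out, err := dispatch([]string{"create", "--bundle", "/mycontainer", "simplebusybox"})
	fmt.Println(out, err)
}
```

Keeping flag parsing inside the action keeps each command self-describing, which is roughly what the Flags and Action fields achieve in the real command definitions.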

For a deeper understanding, you really need to understand libcontainer.
The following files are the most important ones to understand:

factory.go

container.go

process.go

init_linux.go

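Below, we walk through how a container is created in order to dissect and understand these files.

First, spec, err := setupSpec(context) is called to load the contents of the configuration file config.json. This ties in with the OCI bundle discussed earlier.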

        spec, err := setupSpec(context)
        if err != nil {
            return err
        }

This ultimately produces a Spec object, which is defined as follows:

// Spec is the base configuration for the container.
type Spec struct {
    // Version of the Open Container Runtime Specification with which the bundle complies.
    Version string `json:"ociVersion"`
    // Process configures the container process.
    Process *Process `json:"process,omitempty"`
    // Root configures the container's root filesystem.
    Root *Root `json:"root,omitempty"`
    // Hostname configures the container's hostname.
    Hostname string `json:"hostname,omitempty"`
    // Mounts configures additional mounts (on top of Root).
    Mounts []Mount `json:"mounts,omitempty"`
    // Hooks configures callbacks for container lifecycle events.
    Hooks *Hooks `json:"hooks,omitempty" platform:"linux,solaris"`
    // Annotations contains arbitrary metadata for the container.
    Annotations map[string]string `json:"annotations,omitempty"`

    // Linux is platform-specific configuration for Linux based containers.
    Linux *Linux `json:"linux,omitempty" platform:"linux"`
    // Solaris is platform-specific configuration for Solaris based containers.
    Solaris *Solaris `json:"solaris,omitempty" platform:"solaris"`
    // Windows is platform-specific configuration for Windows based containers.
    Windows *Windows `json:"windows,omitempty" platform:"windows"`
}
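
To make the link between config.json and Spec concrete, here is a minimal sketch of decoding a bundle configuration with encoding/json. The `miniSpec` type is a hypothetical subset of the struct above, `loadSpec` is an illustrative helper rather than runc's setupSpec, and `sampleConfig` is an invented, heavily trimmed stand-in for what runc spec generates.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// miniSpec is a hypothetical subset of specs.Spec; the JSON tags follow the
// field tags shown in the struct above.
type miniSpec struct {
	Version  string `json:"ociVersion"`
	Hostname string `json:"hostname,omitempty"`
	Process  *struct {
		Args []string `json:"args"`
		Cwd  string   `json:"cwd"`
	} `json:"process,omitempty"`
	Root *struct {
		Path string `json:"path"`
	} `json:"root,omitempty"`
}

// sampleConfig is an invented, trimmed-down config.json used only for this sketch.
const sampleConfig = `{
  "ociVersion": "1.0.0",
  "process": {"args": ["sh"], "cwd": "/"},
  "root": {"path": "rootfs"},
  "hostname": "runc"
}`

// loadSpec decodes a JSON document into a miniSpec.
func loadSpec(data string) (*miniSpec, error) {
	var s miniSpec
	if err := json.NewDecoder(strings.NewReader(data)).Decode(&s); err != nil {
		return nil, err
	}
	return &s, nil
}

func main() {
	s, err := loadSpec(sampleConfig)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println(s.Version, s.Hostname, s.Process.Args, s.Root.Path)
}
```

The pointer fields mirror the omitempty pointers in Spec: a section that is absent from the file stays nil instead of becoming an empty struct.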

Next, status, err := startContainer(context, spec, CT_ACT_CREATE, nil) is called to create the container. CT_ACT_CREATE indicates a create operation; it is one of the values of the CtAct enumeration type.

type CtAct uint8

const (
    CT_ACT_CREATE CtAct = iota + 1
    CT_ACT_RUN
    CT_ACT_RESTORE
)
        status, err := startContainer(context, spec, CT_ACT_CREATE, nil)
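
Go has no dedicated enum keyword, so the declaration above relies on iota. The sketch below repeats the constants and adds a String method to show the resulting values; the String method is an illustrative addition for this example and is not claimed to exist in runc.

```go
package main

import "fmt"

// CtAct and its constants are copied from the declaration above.
type CtAct uint8

const (
	CT_ACT_CREATE CtAct = iota + 1 // 1
	CT_ACT_RUN                     // 2
	CT_ACT_RESTORE                 // 3
)

// String is a hypothetical helper added for this sketch.
func (a CtAct) String() string {
	switch a {
	case CT_ACT_CREATE:
		return "create"
	case CT_ACT_RUN:
		return "run"
	case CT_ACT_RESTORE:
		return "restore"
	default:
		return fmt.Sprintf("unknown(%d)", uint8(a))
	}
}

func main() {
	var zero CtAct // the zero value is deliberately not a valid action
	fmt.Println(CT_ACT_CREATE, CT_ACT_RUN, CT_ACT_RESTORE, zero)
}
```

Starting at iota + 1 means an uninitialized CtAct (value 0) never matches a real action, so a forgotten assignment ends up in the default branch of any switch over the type.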

The code of startContainer:

func startContainer(context *cli.Context, spec *specs.Spec, action CtAct, criuOpts *libcontainer.CriuOpts) (int, error) {
    id := context.Args().First()
    if id == "" {
        return -1, errEmptyID
    }

    notifySocket := newNotifySocket(context, os.Getenv("NOTIFY_SOCKET"), id)
    if notifySocket != nil {
        notifySocket.setupSpec(context, spec)
    }

    container, err := createContainer(context, id, spec)
    if err != nil {
        return -1, err
    }

    if notifySocket != nil {
        err := notifySocket.setupSocket()
        if err != nil {
            return -1, err
        }
    }

    // Support on-demand socket activation by passing file descriptors into the container init process.
    listenFDs := []*os.File{}
    if os.Getenv("LISTEN_FDS") != "" {
        listenFDs = activation.Files(false)
    }
    r := &runner{
        enableSubreaper: !context.Bool("no-subreaper"),
        shouldDestroy:   true,
        container:       container,
        listenFDs:       listenFDs,
        notifySocket:    notifySocket,
        consoleSocket:   context.String("console-socket"),
        detach:          context.Bool("detach"),
        pidFile:         context.String("pid-file"),
        preserveFDs:     context.Int("preserve-fds"),
        action:          action,
        criuOpts:        criuOpts,
        init:            true,
    }
    return r.run(spec.Process)
}

It first calls container, err := createContainer(context, id, spec) to create the container, and then fills in the runner struct r.

func createContainer(context *cli.Context, id string, spec *specs.Spec) (libcontainer.Container, error) {
    rootless, err := isRootless(context)
    if err != nil {
        return nil, err
    }
    config, err := specconv.CreateLibcontainerConfig(&specconv.CreateOpts{
        CgroupName:       id,
        UseSystemdCgroup: context.GlobalBool("systemd-cgroup"),
        NoPivotRoot:      context.Bool("no-pivot"),
        NoNewKeyring:     context.Bool("no-new-keyring"),
        Spec:             spec,
        Rootless:         rootless,
    })
    if err != nil {
        return nil, err
    }

    factory, err := loadFactory(context)
    if err != nil {
        return nil, err
    }
    return factory.Create(id, config)
}

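Note factory, err := loadFactory(context) and factory.Create(id, config): these correspond to the factory.go file mentioned above. The factory creates the concrete container from the configuration config.

Finally, the run method is called. run is passed a process object that describes the process inside the container, that is, the contents of the process.go file mentioned above.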
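
The factory idea can be sketched in a few lines. Everything here is hypothetical and in-memory: `config`, `container`, `factory`, and `memFactory` are invented stand-ins for the libcontainer types, and the duplicate-ID check simply reflects the create help text's requirement that a container name be unique on the host.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a hypothetical stand-in for the libcontainer configuration.
type config struct {
	Rootfs string
}

// container is a hypothetical, minimal container interface.
type container interface {
	ID() string
}

// factory creates containers from an ID and a configuration.
type factory interface {
	Create(id string, cfg *config) (container, error)
}

type memContainer struct {
	id  string
	cfg *config
}

func (c *memContainer) ID() string { return c.id }

// memFactory keeps containers in a map instead of on disk.
type memFactory struct {
	containers map[string]*memContainer
}

func newMemFactory() *memFactory {
	return &memFactory{containers: make(map[string]*memContainer)}
}

// Create rejects empty and duplicate IDs, then registers a new container.
func (f *memFactory) Create(id string, cfg *config) (container, error) {
	if id == "" {
		return nil, errors.New("container id cannot be empty")
	}
	if _, exists := f.containers[id]; exists {
		return nil, fmt.Errorf("container %q already exists", id)
	}
	c := &memContainer{id: id, cfg: cfg}
	f.containers[id] = c
	return c, nil
}

func main() {
	var f factory = newMemFactory()
	c, err := f.Create("simplebusybox", &config{Rootfs: "rootfs"})
	fmt.Println(c.ID(), err)
	_, err = f.Create("simplebusybox", &config{Rootfs: "rootfs"})
	fmt.Println(err)
}
```

Callers depend only on the factory and container interfaces, which is what lets the command-line layer stay independent of how containers are actually built.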

// Process contains information to start a specific application inside the container.
type Process struct {
    // Terminal creates an interactive terminal for the container.
    Terminal bool `json:"terminal,omitempty"`
    // ConsoleSize specifies the size of the console.
    ConsoleSize *Box `json:"consoleSize,omitempty"`
    // User specifies user information for the process.
    User User `json:"user"`
    // Args specifies the binary and arguments for the application to execute.
    Args []string `json:"args"`
    // Env populates the process environment for the process.
    Env []string `json:"env,omitempty"`
    // Cwd is the current working directory for the process and must be
    // relative to the container's root.
    Cwd string `json:"cwd"`
    // Capabilities are Linux capabilities that are kept for the process.
    Capabilities *LinuxCapabilities `json:"capabilities,omitempty" platform:"linux"`
    // Rlimits specifies rlimit options to apply to the process.
    Rlimits []POSIXRlimit `json:"rlimits,omitempty" platform:"linux,solaris"`
    // NoNewPrivileges controls whether additional privileges could be gained by processes in the container.
    NoNewPrivileges bool `json:"noNewPrivileges,omitempty" platform:"linux"`
    // ApparmorProfile specifies the apparmor profile for the container.
    ApparmorProfile string `json:"apparmorProfile,omitempty" platform:"linux"`
    // Specify an oom_score_adj for the container.
    OOMScoreAdj *int `json:"oomScoreAdj,omitempty" platform:"linux"`
    // SelinuxLabel specifies the selinux context that the container process is run as.
    SelinuxLabel string `json:"selinuxLabel,omitempty" platform:"linux"`
}

The run method mainly relies on the newProcess function:

process, err := newProcess(*config, r.init)

newProcess mainly fills in the libcontainer.Process struct, including the arguments, environment variables, user, working directory, capabilities, resource limits, and so on.
The action is then dispatched as follows:

    switch r.action {
    case CT_ACT_CREATE:
        err = r.container.Start(process)
    case CT_ACT_RESTORE:
        err = r.container.Restore(process, r.criuOpts)
    case CT_ACT_RUN:
        err = r.container.Run(process)
    default:
        panic("Unknown action")
    }

The code that starts the container, reached through container.Start(process):

func (c *linuxContainer) start(process *Process) error {
    parent, err := c.newParentProcess(process)
    if err != nil {
        return newSystemErrorWithCause(err, "creating new parent process")
    }
    if err := parent.start(); err != nil {
        // terminate the process to ensure that it properly is reaped.
        if err := ignoreTerminateErrors(parent.terminate()); err != nil {
            logrus.Warn(err)
        }
        return newSystemErrorWithCause(err, "starting container process")
    }
    // generate a timestamp indicating when the container was started
    c.created = time.Now().UTC()
    if process.Init {
        c.state = &createdState{
            c: c,
        }
        state, err := c.updateState(parent)
        if err != nil {
            return err
        }
        c.initProcessStartTime = state.InitProcessStartTime

        if c.config.Hooks != nil {
            bundle, annotations := utils.Annotations(c.config.Labels)
            s := configs.HookState{
                Version:     c.config.Version,
                ID:          c.id,
                Pid:         parent.pid(),
                Bundle:      bundle,
                Annotations: annotations,
            }
            for i, hook := range c.config.Hooks.Poststart {
                if err := hook.Run(s); err != nil {
                    if err := ignoreTerminateErrors(parent.terminate()); err != nil {
                        logrus.Warn(err)
                    }
                    return newSystemErrorWithCausef(err, "running poststart hook %d", i)
                }
            }
        }
    }
    return nil
}

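newParentProcess

1. Creates a pair of pipes, parentPipe and childPipe, as the communication channel between the runc process that starts the container and the init process inside the container
2. Creates a command template that is used to launch the parent process
3. newInitProcess wraps an initProcess. Its main job is to add the environment variable that marks the init type and to pack the namespace and uid/gid mapping information into an io.Reader via bootstrapData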
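
The parent/child channel from step 1 can be illustrated with a plain os.Pipe. This is a sketch under stated assumptions: a goroutine stands in for the parent while the caller plays the child, JSON is assumed as the wire format, and `initConfig`, `sendConfig`, `receiveConfig`, and `roundTrip` are hypothetical names. The real code hands one end of the channel to a separate init process.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// initConfig is a hypothetical, tiny version of the data the parent sends.
type initConfig struct {
	Args     []string `json:"args"`
	Hostname string   `json:"hostname"`
}

// sendConfig plays the parent: it writes the config and closes its end.
func sendConfig(w *os.File, cfg initConfig) error {
	defer w.Close()
	return json.NewEncoder(w).Encode(cfg)
}

// receiveConfig plays the child init process: it reads the config back.
func receiveConfig(r *os.File) (initConfig, error) {
	defer r.Close()
	var cfg initConfig
	err := json.NewDecoder(r).Decode(&cfg)
	return cfg, err
}

// roundTrip wires both ends together inside a single process.
func roundTrip(cfg initConfig) (initConfig, error) {
	childPipe, parentPipe, err := os.Pipe() // read end, write end
	if err != nil {
		return initConfig{}, err
	}
	errc := make(chan error, 1)
	go func() { errc <- sendConfig(parentPipe, cfg) }()

	got, err := receiveConfig(childPipe)
	if err != nil {
		return initConfig{}, err
	}
	return got, <-errc
}

func main() {
	got, err := roundTrip(initConfig{Args: []string{"sh"}, Hostname: "runc"})
	fmt.Println(got.Args, got.Hostname, err)
}
```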

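newInitProcess

Adds the init-type environment variable and uses the bootstrapData function to pack the namespace and uid/gid mapping information into an io.Reader. The data is encoded in the netlink message format. The function returns an initProcess struct.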
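
As a rough picture of what packing settings into an io.Reader in netlink format can look like, the sketch below encodes two attributes using the netlink attribute layout: a 16-bit length that includes the 4-byte header, a 16-bit type, then the payload padded to a 4-byte boundary. It is a simplification under stated assumptions: the attribute type numbers and function names are hypothetical, little-endian byte order is assumed, and the netlink message header that would normally precede the attributes is left out.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const attrHeaderLen = 4 // 2 bytes length + 2 bytes type

// Hypothetical attribute type numbers, chosen only for this sketch.
const (
	attrCloneFlags uint16 = 1
	attrUIDMap     uint16 = 2
)

// appendAttr writes one attribute: header, payload, then zero padding so the
// next attribute starts on a 4-byte boundary.
func appendAttr(buf *bytes.Buffer, typ uint16, payload []byte) {
	var hdr [attrHeaderLen]byte
	binary.LittleEndian.PutUint16(hdr[0:2], uint16(attrHeaderLen+len(payload)))
	binary.LittleEndian.PutUint16(hdr[2:4], typ)
	buf.Write(hdr[:])
	buf.Write(payload)
	if rem := len(payload) % 4; rem != 0 {
		buf.Write(make([]byte, 4-rem))
	}
}

// bootstrapBytes encodes the clone flags and a uid mapping as two attributes.
func bootstrapBytes(cloneFlags uint32, uidMap string) []byte {
	var buf bytes.Buffer
	var flags [4]byte
	binary.LittleEndian.PutUint32(flags[:], cloneFlags)
	appendAttr(&buf, attrCloneFlags, flags[:])
	appendAttr(&buf, attrUIDMap, []byte(uidMap))
	return buf.Bytes()
}

// bootstrapReader exposes the encoded data as an io.Reader.
func bootstrapReader(cloneFlags uint32, uidMap string) io.Reader {
	return bytes.NewReader(bootstrapBytes(cloneFlags, uidMap))
}

func main() {
	// 0x00020000 is CLONE_NEWNS on Linux; the uid mapping is an invented example.
	data, err := io.ReadAll(bootstrapReader(0x00020000, "0 1000 1\n"))
	fmt.Printf("%d bytes: % x (err=%v)\n", len(data), data, err)
}
```

A length-prefixed layout like this lets a reader skip attributes it does not understand, which is convenient when the consumer is low-level setup code that runs before the Go runtime takes over.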

Finally, the func (l *linuxStandardInit) Init() error method is called. This is the init_linux.go file mentioned above.

func (l *linuxStandardInit) Init() error {
    if !l.config.Config.NoNewKeyring {
        ringname, keepperms, newperms := l.getSessionRingParams()

        // Do not inherit the parent's session keyring.
        sessKeyId, err := keys.JoinSessionKeyring(ringname)
        if err != nil {
            return errors.Wrap(err, "join session keyring")
        }
        // Make session keyring searcheable.
        if err := keys.ModKeyringPerm(sessKeyId, keepperms, newperms); err != nil {
            return errors.Wrap(err, "mod keyring permissions")
        }
    }

    if err := setupNetwork(l.config); err != nil {
        return err
    }
    if err := setupRoute(l.config.Config); err != nil {
        return err
    }

    label.Init()
    if err := prepareRootfs(l.pipe, l.config); err != nil {
        return err
    }
    // Set up the console. This has to be done *before* we finalize the rootfs,
    // but *after* we've given the user the chance to set up all of the mounts
    // they wanted.
    if l.config.CreateConsole {
        if err := setupConsole(l.consoleSocket, l.config, true); err != nil {
            return err
        }
        if err := system.Setctty(); err != nil {
            return errors.Wrap(err, "setctty")
        }
    }

    // Finish the rootfs setup.
    if l.config.Config.Namespaces.Contains(configs.NEWNS) {
        if err := finalizeRootfs(l.config.Config); err != nil {
            return err
        }
    }

    if hostname := l.config.Config.Hostname; hostname != "" {
        if err := unix.Sethostname([]byte(hostname)); err != nil {
            return errors.Wrap(err, "sethostname")
        }
    }
    if err := apparmor.ApplyProfile(l.config.AppArmorProfile); err != nil {
        return errors.Wrap(err, "apply apparmor profile")
    }
    if err := label.SetProcessLabel(l.config.ProcessLabel); err != nil {
        return errors.Wrap(err, "set process label")
    }

    for key, value := range l.config.Config.Sysctl {
        if err := writeSystemProperty(key, value); err != nil {
            return errors.Wrapf(err, "write sysctl key %s", key)
        }
    }
    for _, path := range l.config.Config.ReadonlyPaths {
        if err := readonlyPath(path); err != nil {
            return errors.Wrapf(err, "readonly path %s", path)
        }
    }
    for _, path := range l.config.Config.MaskPaths {
        if err := maskPath(path, l.config.Config.MountLabel); err != nil {
            return errors.Wrapf(err, "mask path %s", path)
        }
    }
    pdeath, err := system.GetParentDeathSignal()
    if err != nil {
        return errors.Wrap(err, "get pdeath signal")
    }
    if l.config.NoNewPrivileges {
        if err := unix.Prctl(unix.PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); err != nil {
            return errors.Wrap(err, "set nonewprivileges")
        }
    }
    // Tell our parent that we're ready to Execv. This must be done before the
    // Seccomp rules have been applied, because we need to be able to read and
    // write to a socket.
    if err := syncParentReady(l.pipe); err != nil {
        return errors.Wrap(err, "sync ready")
    }
    // Without NoNewPrivileges seccomp is a privileged operation, so we need to
    // do this before dropping capabilities; otherwise do it as late as possible
    // just before execve so as few syscalls take place after it as possible.
    if l.config.Config.Seccomp != nil && !l.config.NoNewPrivileges {
        if err := seccomp.InitSeccomp(l.config.Config.Seccomp); err != nil {
            return err
        }
    }
    if err := finalizeNamespace(l.config); err != nil {
        return err
    }
    // finalizeNamespace can change user/group which clears the parent death
    // signal, so we restore it here.
    if err := pdeath.Restore(); err != nil {
        return errors.Wrap(err, "restore pdeath signal")
    }
    // Compare the parent from the initial start of the init process and make
    // sure that it did not change.  if the parent changes that means it died
    // and we were reparented to something else so we should just kill ourself
    // and not cause problems for someone else.
    if unix.Getppid() != l.parentPid {
        return unix.Kill(unix.Getpid(), unix.SIGKILL)
    }
    // Check for the arg before waiting to make sure it exists and it is
    // returned as a create time error.
    name, err := exec.LookPath(l.config.Args[0])
    if err != nil {
        return err
    }
    // Close the pipe to signal that we have completed our init.
    l.pipe.Close()
    // Wait for the FIFO to be opened on the other side before exec-ing the
    // user process. We open it through /proc/self/fd/$fd, because the fd that
    // was given to us was an O_PATH fd to the fifo itself. Linux allows us to
    // re-open an O_PATH fd through /proc.
    fd, err := unix.Open(fmt.Sprintf("/proc/self/fd/%d", l.fifoFd), unix.O_WRONLY|unix.O_CLOEXEC, 0)
    if err != nil {
        return newSystemErrorWithCause(err, "open exec fifo")
    }
    if _, err := unix.Write(fd, []byte("0")); err != nil {
        return newSystemErrorWithCause(err, "write 0 exec fifo")
    }
    // Close the O_PATH fifofd fd before exec because the kernel resets
    // dumpable in the wrong order. This has been fixed in newer kernels, but
    // we keep this to ensure CVE-2016-9962 doesn't re-emerge on older kernels.
    // N.B. the core issue itself (passing dirfds to the host filesystem) has
    // since been resolved.
    // https://github.com/torvalds/linux/blob/v4.9/fs/exec.c#L1290-L1318
    unix.Close(l.fifoFd)
    // Set seccomp as close to execve as possible, so as few syscalls take
    // place afterward (reducing the amount of syscalls that users need to
    // enable in their seccomp profiles).
    if l.config.Config.Seccomp != nil && l.config.NoNewPrivileges {
        if err := seccomp.InitSeccomp(l.config.Config.Seccomp); err != nil {
            return newSystemErrorWithCause(err, "init seccomp")
        }
    }
    if err := syscall.Exec(name, l.config.Args[0:], os.Environ()); err != nil {
        return newSystemErrorWithCause(err, "exec user process")
    }
    return nil
}

(1) The function first handles l.config.Config.NoNewKeyring, then calls setupNetwork, setupRoute, label.Init(), and prepareRootfs, and sets up the console if l.config.CreateConsole is set.

(2) if l.config.Config.Namespaces.Contains(configs.NEWNS) -> finalizeRootfs(l.config.Config)

(3) Sets the hostname, then calls apparmor.ApplyProfile(...) and label.SetProcessLabel(...), and writes the entries of l.config.Config.Sysctl.

(4) Calls readonlyPath(path) to remount the ReadonlyPaths read-only. In the configuration file these are /proc/asound, /proc/bus, /proc/fs, and so on.

(5) Calls maskPath(path, ...) to set up the masked paths, saves the parent death signal with pdeath, err := system.GetParentDeathSignal(), and handles l.config.NoNewPrivileges.

(6) Calls syncParentReady(l.pipe), which tells the parent process that the container is ready to Execv. From the parent's point of view, create is now complete.

(7) Handles l.config.Config.Seccomp together with l.config.NoNewPrivileges, calls finalizeNamespace(l.config) and pdeath.Restore(), checks that unix.Getppid() still equals l.parentPid, resolves name, err := exec.LookPath(l.config.Args[0]), and finally calls l.pipe.Close(). Init is now finished, and create has completed in the child process as well.

(8) fd, err := unix.Open(fmt.Sprintf("/proc/self/fd/%d", l.fifoFd), unix.O_WRONLY|unix.O_CLOEXEC, 0) waits for the FIFO to be opened on the other side before exec'ing the user process. This is in fact where init waits for the start command. Afterwards, it writes one byte to fd for synchronization: unix.Write(fd, []byte("0"))

(9) Calls syscall.Exec(name, l.config.Args[0:], os.Environ()) to run the container command.
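
The FIFO handshake in step (8) is easy to try in isolation. The sketch below is an illustration under stated assumptions, not runc's code: it creates a FIFO in a temporary directory, a goroutine plays the init process and blocks while opening the FIFO for writing, and the caller plays the start command by opening it for reading. The file name and the functions `waitForStart`, `start`, and `demo` are hypothetical, and the example needs a Unix-like system because it calls syscall.Mkfifo.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"syscall"
)

// waitForStart plays the init process: opening a FIFO write-only blocks until
// something opens it for reading. It then writes a single byte.
func waitForStart(fifo string) error {
	f, err := os.OpenFile(fifo, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write([]byte("0"))
	return err
}

// start plays the start command: opening the FIFO for reading releases the
// blocked writer, and the byte it sends confirms the handshake.
func start(fifo string) (string, error) {
	f, err := os.OpenFile(fifo, os.O_RDONLY, 0)
	if err != nil {
		return "", err
	}
	defer f.Close()
	buf := make([]byte, 1)
	if _, err := io.ReadFull(f, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// demo runs both sides inside one process.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "execfifo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	fifo := filepath.Join(dir, "exec.fifo")
	if err := syscall.Mkfifo(fifo, 0o600); err != nil {
		return "", err
	}

	done := make(chan error, 1)
	go func() { done <- waitForStart(fifo) }()

	got, err := start(fifo)
	if err != nil {
		return "", err
	}
	if err := <-done; err != nil {
		return "", err
	}
	return got, nil
}

func main() {
	got, err := demo()
	fmt.Printf("received %q, err=%v\n", got, err)
}
```

Because the blocking happens inside open itself, the init side needs no polling loop: it sits in step (8) until the other end shows up, and only then proceeds to the exec in step (9).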
