Why is it recommended to run only one process in a container?



In many blog posts, and in general opinion, there is a saying that goes "one process per container".

Why does this rule exist? Why not run ntp, nginx, uwsgi and more processes in a single container that needs all of those processes to work?

Blog posts that mention this rule:


There are 5 answers in total:

Let's forget the high-level architectural and philosophical arguments for a moment. While there may be some edge cases where multiple functions in a single container can make sense, there are very practical reasons why you may want to consider following "one function per container" as a rule of thumb:

  • Scaling containers horizontally is much easier if the container is isolated to a single function. Need another apache container? Spin one up somewhere else. However if my apache container also has my DB, cron and other pieces shoehorned in, this complicates things.
  • Having a single function per container allows the container to be easily re-used for other projects or purposes.
  • It also makes it more portable and predictable for devs to pull down a component from production to troubleshoot locally rather than an entire application environment.
  • Patching/upgrades (both the OS and the application) can be done in a more isolated and controlled manner. Juggling multiple bits-and-bobs in your container not only makes for larger images, but also ties these components together. Why have to shut down application X and Y just to upgrade Z?
    • Above also holds true for code deployments and rollbacks.
  • Splitting functions out to multiple containers allows more flexibility from a security and isolation perspective. You may want (or require) services to be isolated on the network level -- either physically or within overlay networks -- to maintain a strong security posture or comply with things like PCI.
  • Other more minor factors such as dealing with stdout/stderr and sending logs to the container log, keeping containers as ephemeral as possible etc.

Note that I am talking about function, not process. That language is outdated. The official Docker documentation has moved away from saying "one process" per container and instead recommends "one concern" per container.
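
To make the scaling and upgrade points above concrete, here is a minimal sketch of the "one function per container" split using plain docker run commands; the image tags, container names and host ports are illustrative assumptions, not anything prescribed by the answer.

    # Need more web capacity? Just start another container of the same image.
    docker run -d --name web1 -p 8080:80 httpd:2.4
    docker run -d --name web2 -p 8081:80 httpd:2.4

    # Upgrading only the web tier: no database, cron or other component has to stop.
    docker stop web1 && docker rm web1
    docker run -d --name web1 -p 8080:80 httpd:2.4.59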

Having slain a "two processes" container a few days ago, there were some pain points that made me use two containers instead of a Python script that starts two processes:

  1. Docker is good at recognizing crashed containers. It cannot do that when the main process looks fine but some other process died a gruesome death. Sure, you can monitor your processes manually, but why reimplement that? (A small sketch of letting Docker do the watching follows this list.)
  2. docker logs becomes much less useful when multiple processes spew their logs to the console. Again, you can write the process name into the log lines, but Docker can do that for you as well.
  3. Testing and reasoning about a container becomes much harder.
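
As referenced in point 1, here is a minimal sketch of handing the process watching back to Docker with a restart policy and a health check; the image, the health-check command (which assumes curl exists inside the image) and the intervals are illustrative assumptions.

    # One process per container, so Docker itself can notice failures and restart.
    docker run -d --name web \
      --restart=on-failure \
      --health-cmd='curl -f http://localhost/ || exit 1' \
      --health-interval=30s --health-retries=3 \
      nginx:1.25

    docker ps         # the STATUS column now reports healthy/unhealthy
    docker logs web   # a single process means a single, unambiguous log stream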

The recommendation comes from the goal and design of operating-system-level virtualization

Containers were designed to isolate a process from the others by giving it its own user space and filesystem. This is the logical evolution of chroot, which provided an isolated filesystem; the next step was isolating processes from each other to avoid memory overwrites and to allow the same resource (e.g. TCP port 8080) to be used by multiple processes without conflict.

The main interest of a container is to package the libraries a process needs without having to worry about version conflicts. If you run multiple processes that need two different versions of the same library in the same user space and filesystem, you would have to tweak at least the library search path (e.g. LD_LIBRARY_PATH) for each process so that the proper library is found first, and some libraries cannot be adjusted this way because their path is hard-coded into the executable at compile time (see this SO question for more details). At the network level, you would also have to configure each process so that they do not use the same ports.
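
As a minimal sketch of the port point: each container gets its own network namespace, so two services can both listen on "their" 8080 and only the host-side mappings need to differ. The image names and tags here are illustrative assumptions.

    docker run -d --name svc-a -p 8081:8080 my-service:1   # inside the container: 8080
    docker run -d --name svc-b -p 8082:8080 my-service:2   # also 8080, no conflict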

Running multiple processes in the same container requires heavy tuning and, at the end of the day, defeats the purpose of isolation: if you are fine with running multiple processes within the same user space, sharing the same filesystem and network resources, then why not run them on the host itself?

Here is a non-exhaustive list of the heavy tuning and pitfalls I can think of:

  • Handling the logs

    Either with a mounted volume or interleaved on stdout, this brings some management overhead. If you use a mounted volume, your container should have its own "place" on the host, or two identical containers will fight over the same resource. When interleaving on stdout to take advantage of docker logs, it can become a nightmare to analyse if the sources cannot be identified easily.

  • Beware of zombie processes

    If one of the processes in your container crashes, supervisord may not be able to clean up the children left in a zombie state, and the host init will never inherit them. Once you have exhausted the number of available PIDs (2^22, so roughly 4 million), a bunch of things will start to fail. (A minimal mitigation sketch follows this list.)

  • Separation of concerns

    If you run two separate things, like an Apache server and Logstash, within the same container, that may ease the log handling, but you have to shut down Apache to update Logstash. (In reality, you should use Docker's logging driver.) Will it be a graceful stop that waits for the current sessions to end, or not? If it is a graceful stop, it may take some time and make rolling out the new version slow. If you do a kill, you will impact users for the sake of a log shipper, and that should be avoided IMHO.
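
As referenced above, here is a minimal sketch of two of these mitigations: --init makes Docker run a tiny init as PID 1 that reaps zombie children, and a logging driver ships logs without bundling a second process into the container. The image name and the driver options are illustrative assumptions.

    docker run -d --name app --init \
      --log-driver=json-file --log-opt max-size=10m \
      my-app:latest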

Finally, when you have multiple processes, you are reproducing an operating system, and in that case using hardware virtualization sounds more in line with that need.

In most cases, it is not all-or-nothing. The "one process per container" guidance stems from the idea that a container should serve a single, distinct purpose. For example, a container should not be both a web application and a Redis server.

There are cases where it makes sense to run multiple processes in a single container, as long as both processes support a single, modular function.
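
To illustrate the "web application and Redis server" example as separate containers, here is a minimal sketch; redis:7 is a real image, while the application image and its REDIS_HOST variable are illustrative assumptions. On a user-defined network, the app can reach Redis simply by its container name.

    docker network create shop-net
    docker run -d --name cache --network shop-net redis:7
    docker run -d --name shop  --network shop-net -p 8080:8080 \
      -e REDIS_HOST=cache my-web-app:latest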

I will call the process a service here: 1 container ~ 1 service. If any of my services fails, I just spin up the corresponding container, and within seconds everything is up again, so there are no dependencies between services. It is best practice to keep the container size below 200 MB and at most 500 MB (Windows native containers, which exceed 2 GB, are an exception); otherwise it becomes similar to a virtual machine, not exactly, but close enough in terms of performance. Also, take a few parameters into consideration, such as scaling: how to make my services resilient, automated deployment, and so on.
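
A minimal sketch of the "within seconds everything is up again" idea is a restart policy, which lets the Docker daemon bring a failed service container back automatically; the image and container names are illustrative assumptions.

    docker run -d --name orders --restart=unless-stopped orders-svc:latest
    docker update --restart=on-failure:3 orders   # the policy can also be changed later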

And it is purely your call how you use the container technology in the way that best suits your environment, building architectural patterns such as microservices in a polyglot environment, and letting it drive automation for you.