01. How to do graceful restarts with uwsgi and gunicorn

2023-04-10 15:51, posted by 画个一样的我 in #Backend Development

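1. Why graceful restarts are needed

In day-to-day development we keep iterating on the product, and after every iteration the code on the production servers has to be updated. Smaller companies usually do not have canary releases or Kubernetes in place for this. So how do we make sure that requests already received are processed to completion while the update is rolled out? That is what a graceful restart is for.

How do we get one? The servers we typically use to deploy Python web services, uwsgi and gunicorn, already support graceful restarts. The rest of this post shows how to use them.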

2. Graceful restart with uwsgi

The experiments below were run with the following versions.

python3.6.8

flask==2.0.3
uwsgi==2.0.21

2.1 Write the web service

main.py

import time

from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    time.sleep(10)
    return "hello eeee"


if __name__ == "__main__":
    app.run()

2.2 Write the uwsgi.ini config file

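[uwsgi]
# Address and port uwsgi listens on (HTTP protocol)
http=0.0.0.0:8000
# Project directory
chdir=./
# Python file that contains the application
wsgi-file=main.py
# Name of the application variable inside that file
callable=app
# Number of worker processes
processes=4
# Number of threads per worker
threads=2

##### Adding the two settings below is enough to enable graceful restarts #####
lazy-apps = true
# Watch test.txt and gracefully restart uwsgi when it changes. The file name is arbitrary.
touch-chain-reload = /Users/xx/work/test/py_test/sample_test/flask_graceful_restart/test.txt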
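
Before starting the server, it can help to confirm that both reload-related settings are actually present in the config. The following is only a minimal sketch using Python's standard configparser. The function name missing_reload_options is made up for this illustration, and the check covers just the two keys named above.

```python
import configparser

# The two options this post relies on for chain reloading.
REQUIRED = ("lazy-apps", "touch-chain-reload")


def missing_reload_options(ini_text):
    """Return the required options that are absent from the [uwsgi] section."""
    # strict=False tolerates repeated keys, which uwsgi configs may contain;
    # interpolation=None keeps "%" characters in paths from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    section = parser["uwsgi"]
    return [key for key in REQUIRED if key not in section]
```

Calling missing_reload_options(open("uwsgi.ini").read()) on the config above should return an empty list.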

2.3 Start the uwsgi service

uwsgi --ini uwsgi.ini

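2.4 Test the graceful restart

1. Request http://127.0.0.1:8000/ (the handler sleeps for 10 seconds, so the request stays in flight during the next steps).

2. Change the value returned in main.py to: return "hello xxxxx"

3. In the /Users/xx/work/test/py_test/sample_test/flask_graceful_restart directory, run touch test.txt. If the file already exists, changing its content also triggers a graceful restart of uwsgi.

4. The request from step 1 returns: "hello eeee"

5. Request http://127.0.0.1:8000/ again. This time it returns: "hello xxxxx"

This test shows that the graceful restart works: the in-flight request is answered by the old code, and later requests are served by the new code.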
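
The manual steps above can also be scripted. The sketch below is one possible way to do it and not part of the original setup: the function names are made up, the 1-second pause is an arbitrary guess at how long the request needs to reach a worker, and step 2 (editing main.py) still has to be done by hand beforehand.

```python
import threading
import time
import urllib.request
from pathlib import Path


def touch_reload_file(path):
    """Create the file if it is missing, otherwise bump its modification time."""
    p = Path(path)
    p.touch()
    return p.stat().st_mtime


def fetch(url, results, key):
    """Store the response body of url in results[key]."""
    with urllib.request.urlopen(url, timeout=120) as resp:
        results[key] = resp.read().decode()


def check_graceful_reload(url, reload_file):
    """Start a slow request, trigger the reload, then fetch again."""
    results = {}
    in_flight = threading.Thread(target=fetch, args=(url, results, "before"))
    in_flight.start()               # step 1: request is now in flight
    time.sleep(1)                   # assumed delay so the request reaches a worker
    touch_reload_file(reload_file)  # step 3: trigger the chain reload
    in_flight.join()                # step 4: wait for the old answer
    fetch(url, results, "after")    # step 5: request again
    return results
```

A call such as check_graceful_reload("http://127.0.0.1:8000/", "test.txt") returns a dict with the "before" and "after" bodies. Since workers are replaced one at a time, the "after" request may still be answered by a worker that has not been reloaded yet.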

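Log of the graceful restart:

The whole process takes quite a while, about 4 minutes. In the log below, the 4 workers are reloaded one after another, and each one is force-killed ("NO MERCY") about 60 seconds after it is asked to stop. This appears to match uwsgi's worker-reload-mercy setting, whose default is 60 seconds.

Start: 12:14:45.

End: 12:18:57.

1. First, check the uwsgi processes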

501  7758  4633   0 12:13PM ttys005    0:00.04 uwsgi --ini uwsgi.ini
501  7759  7758   0 12:13PM ttys005    0:00.27 uwsgi --ini uwsgi.ini
501  7760  7758   0 12:13PM ttys005    0:00.27 uwsgi --ini uwsgi.ini
501  7761  7758   0 12:13PM ttys005    0:00.27 uwsgi --ini uwsgi.ini
501  7762  7758   0 12:13PM ttys005    0:00.26 uwsgi --ini uwsgi.ini
501  7763  7758   0 12:13PM ttys005    0:00.00 uwsgi --ini uwsgi.ini
501  7789  6013   0 12:13PM ttys006    0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox uwsgi

2. What happens when test.txt is created: once "chain reloading complete" appears in the log, the graceful restart has finished.

Mon Apr 10 12:14:45 2023 - *** /Users/mashili/work/test/py_test/sample_test/flask_graceful_restart/test.txt has been touched... chain reload !!! ***
Mon Apr 10 12:14:45 2023 - chain next victim is worker 1
Gracefully killing worker 1 (pid: 7759)...
Mon Apr 10 12:15:46 2023 - worker 1 (pid: 7759) is taking too much time to die...NO MERCY !!!
worker 1 killed successfully (pid: 7759)
Respawned uWSGI worker 1 (new pid: 7847)
Mon Apr 10 12:15:47 2023 - chain is still waiting for worker 1...
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7fc33e00da00 pid: 7847 (default app)
Mon Apr 10 12:15:48 2023 - chain next victim is worker 2
Gracefully killing worker 2 (pid: 7760)...
Mon Apr 10 12:16:49 2023 - worker 2 (pid: 7760) is taking too much time to die...NO MERCY !!!
worker 2 killed successfully (pid: 7760)
Respawned uWSGI worker 2 (new pid: 7885)
Mon Apr 10 12:16:50 2023 - chain is still waiting for worker 2...
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7fc33e00da00 pid: 7885 (default app)
Mon Apr 10 12:16:51 2023 - chain next victim is worker 3
Gracefully killing worker 3 (pid: 7761)...
Mon Apr 10 12:17:52 2023 - worker 3 (pid: 7761) is taking too much time to die...NO MERCY !!!
worker 3 killed successfully (pid: 7761)
Respawned uWSGI worker 3 (new pid: 7905)
Mon Apr 10 12:17:53 2023 - chain is still waiting for worker 3...
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7fc33e00da00 pid: 7905 (default app)
Mon Apr 10 12:17:54 2023 - chain next victim is worker 4
Gracefully killing worker 4 (pid: 7762)...
Mon Apr 10 12:18:55 2023 - worker 4 (pid: 7762) is taking too much time to die...NO MERCY !!!
worker 4 killed successfully (pid: 7762)
Respawned uWSGI worker 4 (new pid: 7910)
Mon Apr 10 12:18:56 2023 - chain is still waiting for worker 4...
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7fc33e00da00 pid: 7910 (default app)
Mon Apr 10 12:18:57 2023 - chain reloading complete
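
To see where the time goes, the timestamps in this log can be compared per worker. The snippet below is a small illustrative sketch (the helper names are made up) that measures how long each worker took from "chain next victim" to the "NO MERCY" line, using the timestamped lines copied from the log above.

```python
import re
from datetime import datetime

LOG = """\
Mon Apr 10 12:14:45 2023 - chain next victim is worker 1
Mon Apr 10 12:15:46 2023 - worker 1 (pid: 7759) is taking too much time to die...NO MERCY !!!
Mon Apr 10 12:15:48 2023 - chain next victim is worker 2
Mon Apr 10 12:16:49 2023 - worker 2 (pid: 7760) is taking too much time to die...NO MERCY !!!
Mon Apr 10 12:16:51 2023 - chain next victim is worker 3
Mon Apr 10 12:17:52 2023 - worker 3 (pid: 7761) is taking too much time to die...NO MERCY !!!
Mon Apr 10 12:17:54 2023 - chain next victim is worker 4
Mon Apr 10 12:18:55 2023 - worker 4 (pid: 7762) is taking too much time to die...NO MERCY !!!
Mon Apr 10 12:18:57 2023 - chain reloading complete
"""

STAMP = r"(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) - "
VICTIM = re.compile(STAMP + r"chain next victim is worker (\d+)")
NO_MERCY = re.compile(STAMP + r"worker (\d+) \(pid: \d+\) is taking too much time to die")


def parse_ts(text):
    """Parse a uwsgi log timestamp such as 'Mon Apr 10 12:14:45 2023'."""
    return datetime.strptime(text, "%a %b %d %H:%M:%S %Y")


def forced_kill_delays(log_text):
    """Map worker number to seconds between being picked and being force-killed."""
    started, delays = {}, {}
    for line in log_text.splitlines():
        m = VICTIM.match(line)
        if m:
            started[int(m.group(2))] = parse_ts(m.group(1))
            continue
        m = NO_MERCY.match(line)
        if m and int(m.group(2)) in started:
            worker = int(m.group(2))
            delays[worker] = (parse_ts(m.group(1)) - started[worker]).total_seconds()
    return delays


print(forced_kill_delays(LOG))  # {1: 61.0, 2: 61.0, 3: 61.0, 4: 61.0}
```

Four workers at roughly one minute each account for the total of about 4 minutes.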

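3. Check the processes during the graceful restart with ps -ef|grep uwsgi

Both new and old processes exist at the same time. In my tests, requests that arrived during the graceful restart were not necessarily routed to the new processes.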

  501  7758  4633   0 12:13PM ttys005    0:00.08 uwsgi --ini uwsgi.ini
  501  7761  7758   0 12:13PM ttys005    0:00.27 uwsgi --ini uwsgi.ini
  501  7762  7758   0 12:13PM ttys005    0:00.26 uwsgi --ini uwsgi.ini
  501  7763  7758   0 12:13PM ttys005    0:00.00 uwsgi --ini uwsgi.ini
  501  7847  7758   0 12:15PM ttys005    0:00.26 uwsgi --ini uwsgi.ini
  501  7885  7758   0 12:16PM ttys005    0:00.26 uwsgi --ini uwsgi.ini
  501  7889  6013   0 12:17PM ttys006    0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox uwsgi

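2.5 How to update in production

1. Deploy the new code to the server.

2. Run ps -ef|grep uwsgi to note the current process IDs.

3. Check whether test.txt exists. If it does, update its content; if not, create it.

4. Watch the uwsgi log or the process list. Once all worker processes have been respawned, the graceful restart is complete.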
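
The last step can be automated by polling the log for the "chain reloading complete" line. This is only a sketch under one assumption: uwsgi's output is written to a file (for example through uwsgi's logto option), which the uwsgi.ini above does not configure. The function name and parameters are illustrative.

```python
import time
from pathlib import Path


def wait_for_marker(log_path, marker="chain reloading complete",
                    start_offset=0, timeout=600.0, poll=1.0):
    """Return True once marker appears in the log after start_offset, else False on timeout."""
    needle = marker.encode()
    deadline = time.monotonic() + timeout
    while True:
        # Read in binary mode so that seeking to a byte offset is well defined.
        with open(str(log_path), "rb") as fh:
            fh.seek(start_offset)
            if needle in fh.read():
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A typical use would be to record the log's size with Path(log_path).stat().st_size before touching test.txt and to pass that value as start_offset, so that a marker left over from an earlier reload is not counted.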

3. Graceful restart with gunicorn

3.1 Write the web service

main.py

import time

from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    # time.sleep(3)
    return "hello fdaf fdafd "


if __name__ == "__main__":
    app.run()

3.2 Write conf.py, the gunicorn config file

conf.py

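# Debug mode (note: newer gunicorn versions may no longer support this setting)
debug = True
# Address and port to bind
bind = "0.0.0.0:8888"
# Number of worker processes
workers = 2
# Number of threads per worker
threads = 2
# Worker timeout in seconds
timeout = 600
# Log level
loglevel = 'info'
# Path of the PID file
pidfile = "log/gunicorn.pid"
# Path of the access log
accesslog = "log/access.log"
# Path of the error log
errorlog = "log/debug.log"

###### Note: do NOT enable the setting below, or the graceful restart will not pick up the new code!
# It is often used with gunicorn + apscheduler to stop scheduled jobs from running once per worker.
# With preload_app the app is loaded in the master before forking, so HUP does not reload it.
# preload_app = True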

3.3 Start the gunicorn service

gunicorn -c conf.py main:app

3.4 Test the graceful restart

1. Use pstree -ap|grep gunicorn to find the master process

pstree -ap|grep gunicorn

2. Run kill -HUP masterpid

kill -HUP 1540847

3. Run pstree -ap|grep gunicorn again. Once the worker process IDs differ from before, the update is complete. Alternatively, check the log:

[2023-04-10 15:36:51 +0800] [11623] [INFO] Handling signal: hup
[2023-04-10 15:36:51 +0800] [11623] [INFO] Hang up: Master
[2023-04-10 15:36:51 +0800] [11681] [INFO] Booting worker with pid: 11681
[2023-04-10 15:36:51 +0800] [11682] [INFO] Booting worker with pid: 11682
[2023-04-10 15:36:51 +0800] [11644] [INFO] Worker exiting (pid: 11644)
[2023-04-10 15:36:51 +0800] [11645] [INFO] Worker exiting (pid: 11645)
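
The same check can be done on the log text instead of comparing pstree output by eye. The sketch below (names are illustrative) looks only at the lines after the most recent "Handling signal: hup" entry and lists which worker PIDs were booted and which exited. The sample text is copied from the log above.

```python
import re

SAMPLE = """\
[2023-04-10 15:36:51 +0800] [11623] [INFO] Handling signal: hup
[2023-04-10 15:36:51 +0800] [11623] [INFO] Hang up: Master
[2023-04-10 15:36:51 +0800] [11681] [INFO] Booting worker with pid: 11681
[2023-04-10 15:36:51 +0800] [11682] [INFO] Booting worker with pid: 11682
[2023-04-10 15:36:51 +0800] [11644] [INFO] Worker exiting (pid: 11644)
[2023-04-10 15:36:51 +0800] [11645] [INFO] Worker exiting (pid: 11645)
"""

BOOT = re.compile(r"Booting worker with pid: (\d+)")
EXIT = re.compile(r"Worker exiting \(pid: (\d+)\)")


def worker_turnover(log_text):
    """Return (booted, exited) worker PIDs logged after the last HUP signal."""
    tail = log_text.rsplit("Handling signal: hup", 1)[-1]
    booted = sorted(int(pid) for pid in BOOT.findall(tail))
    exited = sorted(int(pid) for pid in EXIT.findall(tail))
    return booted, exited


print(worker_turnover(SAMPLE))  # ([11681, 11682], [11644, 11645])
```

With workers = 2, seeing two new PIDs booted and the two old PIDs exiting suggests the reload went through.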

3.5 How to update in production

1. Use pstree -ap|grep gunicorn to find the master process ID

2. Run kill -HUP masterpid

3. Wait for gunicorn to finish the graceful restart
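
Since conf.py already sets pidfile = "log/gunicorn.pid", the master PID can also be read from that file instead of being looked up with pstree. The following is a minimal sketch of that idea for Unix-like systems. The function names are made up, and it assumes the script runs from the directory gunicorn was started in, so that the relative path resolves.

```python
import os
import signal
from pathlib import Path


def read_master_pid(pidfile="log/gunicorn.pid"):
    """Read the gunicorn master PID from the pidfile configured in conf.py."""
    return int(Path(pidfile).read_text().strip())


def reload_gunicorn(pidfile="log/gunicorn.pid"):
    """Send SIGHUP to the master, the same as running kill -HUP masterpid."""
    pid = read_master_pid(pidfile)
    os.kill(pid, signal.SIGHUP)
    return pid
```

Calling reload_gunicorn() after deploying the new code covers steps 1 and 2. Step 3, waiting for the workers to be replaced, still has to be checked in the log or the process list.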
