Introduction to Flask

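All the code in this article serves as a rough introduction to Armin Ronacher's projects. Don't hold it against Flask that it leans on other libraries: its dependencies were written by the same author. Flask wraps the harder-to-understand Werkzeug and Jinja to make day-to-day development easier.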

WSGI

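Flask can be described as a web framework built on the WSGI standard. WSGI stands for Web Server Gateway Interface, a standard for how a Server and an Application talk to each other. So what are the Server and the Application? According to the official Python documentation (PEP 3333), they are defined as follows: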

Application / Framework

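An Application is simply a callable object, which can be any one of the following:

  • a function or method
  • a class
  • an instance with a __call__ method

The callable accepts two parameters:

  • environ, a dictionary
  • start_response, a callback used to set the HTTP status and headers

It returns an iterable. The example from the official documentation is as follows: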

HELLO_WORLD = b"Hello world!\n"

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

class AppClass:
    """Produce the same output, but using a class

    (Note: 'AppClass' is the "application" here, so calling it
    returns an instance of 'AppClass', which is then the iterable
    return value of the "application callable" as required by
    the spec.

    If we wanted to use *instances* of 'AppClass' as application
    objects instead, we would have to implement a '__call__'
    method, which would be invoked to execute the application,
    and we would need to create an instance for use by the
    server or gateway.
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        self.start(status, response_headers)
        yield HELLO_WORLD
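
The official example covers the function and class forms. The third form from the list above, an instance with a __call__ method, might look like the sketch below. The names Greeter and greeting are made up for illustration, and the hand-written start_response stands in for a real server.

```python
from wsgiref.util import setup_testing_defaults


class Greeter:
    """A WSGI application implemented as an instance with __call__."""

    def __init__(self, greeting):
        # Per-instance configuration, set once when the object is created.
        self.greeting = greeting

    def __call__(self, environ, start_response):
        body = self.greeting.encode("utf-8")
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]  # any iterable of bytes is acceptable


# Call the application by hand, playing the role of the server.
environ = {}
setup_testing_defaults(environ)  # fills in a minimal, valid environ
captured = {}


def start_response(status, headers, exc_info=None):
    # Record what the application asked the server to send.
    captured["status"] = status
    captured["headers"] = headers


app = Greeter("Hello from an instance!")
body = b"".join(app(environ, start_response))
print(captured["status"], body)
```

Unlike AppClass, the object is created once and then called for every request, which is the shape a Flask application object has.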

Server / Gateway

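The Server handles every request it receives from clients. Its main responsibilities are:

  • Receive HTTP requests, without caring about details such as the request URL or method
  • Provide concurrency
  • Build environ, collecting the information it needs, and implement the start_response callback
  • Call the Application's callable object

The example from the official documentation is as follows; it gives a thorough picture of what a server is expected to do.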

import os, sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):
    # Convert an environment variable to a WSGI "bytes-as-unicode" string
    return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):
    return s.encode('iso-8859-1')

# Server entry point
def run_with_cgi(application):
    # Build environ
    environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}
    environ['wsgi.input'] = sys.stdin.buffer
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.version'] = (1, 0)
    environ['wsgi.multithread'] = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once'] = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    def write(data):
        out = sys.stdout.buffer

        if not headers_set:
            raise AssertionError("write() before start_response()")

        elif not headers_sent:
            # Before the first output, send the stored headers
            status, response_headers = headers_sent[:] = headers_set
            out.write(wsgi_to_bytes('Status: %s\r\n' % status))
            for header in response_headers:
                out.write(wsgi_to_bytes('%s: %s\r\n' % header))
            out.write(wsgi_to_bytes('\r\n'))

        out.write(data)
        out.flush()

    # Build start_response
    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]

        # Note: error checking on the headers should happen here,
        # *after* the headers are set. That way, if an error
        # occurs, start_response can only be re-called with
        # exc_info set.

        return write

    # Call the Application
    result = application(environ, start_response)
    try:
        for data in result:
            if data: # don't send headers until body appears
                write(data)
        if not headers_sent:
            # send headers now if body was empty
            # (bytes, because out is a binary stream)
            write(b'')
    finally:
        if hasattr(result, 'close'):
            result.close()
Middleware

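Middleware sits between the Application and the Server and plays both roles: to the Server it looks like an Application, and to the Application it looks like a Server. It can be placed behind a Server and in front of an Application, and middleware components can also be stacked on one another. Typical uses:

  • Routing a request to different Applications based on its URL
  • Allowing multiple Applications to run side by side, with control over request rates, allowlists, and so on
  • Load balancing and remote processing by forwarding requests
  • Post-processing content, such as applying XSL stylesheets (I don't fully understand this one)

The following is taken from a blog post that is short and easy to follow. Blog post: Python WSGI框架详解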

class IPBlacklistMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # REMOTE_ADDR holds the client's address; HTTP_HOST is only the Host header
        ip_addr = environ.get('REMOTE_ADDR', '')
        # One-element tuple (note the trailing comma). Despite the class name,
        # this acts as an allowlist: only local clients get through.
        if ip_addr not in ('127.0.0.1',):
            return forbidden(start_response)

        return self.app(environ, start_response)

def forbidden(start_response):
    start_response('403 Forbidden', [('Content-Type', 'text/plain')])
    return [b'Forbidden']  # WSGI response bodies must be bytes

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'This is a python application!']

if __name__ == '__main__':
    from wsgiref.simple_server import make_server
    # There is a neater alternative to this wrapping; see the wsgi_app function later on
    application = IPBlacklistMiddleware(application)
    server = make_server('0.0.0.0', 8080, application)
    server.serve_forever()
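
The first use in the list above, routing a request to different Applications, can be sketched in the same style. This is only an illustration: PathDispatcher, make_text_app and fetch are hypothetical names, and the prefix matching is deliberately naive.

```python
class PathDispatcher:
    """Middleware that forwards a request to the app mounted under a path prefix."""

    def __init__(self, default_app, mounts):
        self.default_app = default_app
        self.mounts = mounts  # e.g. {"/dog": dog_app}

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.mounts.items():
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME,
                # so the mounted app sees paths relative to its mount point.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        return self.default_app(environ, start_response)


def make_text_app(text):
    """Build a tiny application that always answers with `text` (bytes)."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [text]
    return app


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not Found"]


dispatcher = PathDispatcher(
    not_found,
    {"/dog": make_text_app(b"This is dog!"), "/cat": make_text_app(b"This is cat!")},
)


def fetch(app, path):
    """Call `app` by hand with a minimal environ; return (status, body)."""
    statuses = []
    body = b"".join(
        app({"PATH_INFO": path}, lambda status, headers, exc_info=None: statuses.append(status))
    )
    return statuses[0], body


print(fetch(dispatcher, "/dog"))
print(fetch(dispatcher, "/bird"))
```

Because the dispatcher is itself a WSGI application, it can be wrapped by further middleware such as the IP filter above.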

The example from the official documentation is as follows:

from piglatin import piglatin

class LatinIter:

    """Transform iterated output to piglatin, if it's okay to do so

    Note that the "okayness" can change until the application yields
    its first non-empty bytestring, so 'transform_ok' has to be a mutable
    truth value.
    """

    def __init__(self, result, transform_ok):
        if hasattr(result, 'close'):
            self.close = result.close
        self._next = iter(result).__next__
        self.transform_ok = transform_ok

    def __iter__(self):
        return self

    def __next__(self):
        if self.transform_ok:
            return piglatin(self._next()) # call must be byte-safe on Py3
        else:
            return self._next()

class Latinator:

    # by default, don't transform output
    transform = False

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):

        transform_ok = []

        def start_latin(status, response_headers, exc_info=None):

            # Reset ok flag, in case this is a repeat call
            del transform_ok[:]

            for name, value in response_headers:
                if name.lower() == 'content-type' and value == 'text/plain':
                    transform_ok.append(True)
                    # Strip content-length if present, else it'll be wrong
                    response_headers = [(name, value)
                        for name, value in response_headers
                            if name.lower() != 'content-length'
                    ]
                    break

            write = start_response(status, response_headers, exc_info)

            if transform_ok:
                def write_latin(data):
                    write(piglatin(data)) # call must be byte-safe on Py3
                return write_latin
            else:
                return write

        return LatinIter(self.application(environ, start_latin), transform_ok)


# Run foo_app under a Latinator's control, using the example CGI gateway
from foo_app import foo_app
run_with_cgi(Latinator(foo_app))
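
The piglatin and foo_app modules in the official example are placeholders, so it cannot be run as is. A smaller middleware of the same shape, which upper-cases text/plain bodies, might look like the sketch below. UpperCaseMiddleware and the two sample apps are hypothetical names, and unlike Latinator this version does not wrap the legacy write() callable.

```python
class UpperCaseMiddleware:
    """Upper-case the response body, but only for text/plain responses."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        transform_ok = []  # mutable flag, as in the official example

        def start_upper(status, response_headers, exc_info=None):
            del transform_ok[:]  # reset in case start_response is called again
            if any(
                name.lower() == "content-type" and value.startswith("text/plain")
                for name, value in response_headers
            ):
                transform_ok.append(True)
            # bytes.upper() keeps the length, so Content-Length stays valid
            return start_response(status, response_headers, exc_info)

        result = self.application(environ, start_upper)

        def body():
            try:
                for chunk in result:
                    yield chunk.upper() if transform_ok else chunk
            finally:
                if hasattr(result, "close"):
                    result.close()

        return body()


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


def binary_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [b"raw bytes"]


def ignore_start(status, headers, exc_info=None):
    """Stand-in for the server's start_response."""
    return None


text_out = b"".join(UpperCaseMiddleware(hello_app)({}, ignore_start))
binary_out = b"".join(UpperCaseMiddleware(binary_app)({}, ignore_start))
print(text_out, binary_out)
```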

Demo

The code comes from the reference link, Python WSGI框架详解.

# Middleware
class IPBlacklistMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # REMOTE_ADDR holds the client's address; HTTP_HOST is only the Host header
        ip_addr = environ.get('REMOTE_ADDR', '')
        # One-element tuple (note the trailing comma). Despite the class name,
        # this acts as an allowlist: only local clients get through.
        if ip_addr not in ('127.0.0.1',):
            return forbidden(start_response)

        return self.app(environ, start_response)

# Defined at module level so the middleware above can call it
def forbidden(start_response):
    start_response('403 Forbidden', [('Content-Type', 'text/plain')])
    return [b'Forbidden']

# Application
def application(environ, start_response):
    # The actual handler functions; WSGI response bodies must be bytes
    def dog(start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'This is dog!']

    def cat(start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'This is cat!']

    def not_found(start_response):
        start_response('404 NOT FOUND', [('Content-Type', 'text/plain')])
        return [b'Not Found']

    path = environ.get('PATH_INFO', '').lstrip('/')
    mapping = {'dog': dog, 'cat': cat}

    call_back = mapping[path] if path in mapping else not_found
    return call_back(start_response)

# Server
if __name__ == '__main__':
    from wsgiref.simple_server import make_server
    application = IPBlacklistMiddleware(application)
    server = make_server('0.0.0.0', 8080, application)
    server.serve_forever()

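Before Flask

With the WSGI interface understood, Flask's structure becomes much easier to follow. According to the Flask website, Flask is a lightweight web framework that conforms to the WSGI protocol. It was not written from scratch; it is a thin wrapper around Werkzeug and Jinja. So what are Werkzeug and Jinja?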

Werkzeug

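Werkzeug is a comprehensive WSGI web application library with the following features:

  • An interactive debugger: an in-browser debugger with an interactive, JavaScript-based console (I'm not familiar with it myself)
  • Full-featured objects for working with a request's headers, form data, files, cookies, and so on; the most visible result is the easy-to-use request and response objects
  • Interoperability with other WSGI applications
  • Routing, including capturing variables from the path
  • Various HTTP utilities, such as HTTP session and signed cookie support
  • And more...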
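
To make the request/response objects and the routing feature more concrete, here is a small sketch that uses Werkzeug without Flask. The URL rules, the views dictionary and the view functions are made up for illustration; Request, Response, Map, Rule, HTTPException and Client are Werkzeug names.

```python
from werkzeug.exceptions import HTTPException
from werkzeug.routing import Map, Rule
from werkzeug.test import Client
from werkzeug.wrappers import Request, Response

# Routing table; <name> captures a variable from the path.
url_map = Map([
    Rule("/", endpoint="index"),
    Rule("/hello/<name>", endpoint="hello"),
])


def index(request):
    return Response("index")


def hello(request, name):
    # request.args gives access to the parsed query string
    greeting = request.args.get("greeting", "Hello")
    return Response(f"{greeting}, {name}!")


views = {"index": index, "hello": hello}


def application(environ, start_response):
    """A plain WSGI application built from Werkzeug's building blocks."""
    request = Request(environ)
    adapter = url_map.bind_to_environ(environ)
    try:
        endpoint, values = adapter.match()
        response = views[endpoint](request, **values)
    except HTTPException as exc:
        response = exc  # HTTP exceptions are themselves WSGI applications
    return response(environ, start_response)


client = Client(application, Response)
resp = client.get("/hello/Flask?greeting=Hi")
missing = client.get("/nope")
print(resp.status_code, resp.get_data(as_text=True), missing.status_code)
```

Flask's routing decorator and request object can be read as conveniences layered over these same pieces.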

Jinja

Jinja is a template engine; it is not the focus here.

Flask

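Flask's design notes explain that Flask is meant as a bridge that makes Werkzeug easier to use; it does not ship complex components such as a database layer. Two Contexts are central to the framework (strictly speaking, the machinery underneath them is implemented in Werkzeug):

  • App Context: tied to the Flask application, it stores Flask's configuration and other application-level information. Because it is set per application, several Flask apps can run side by side (details in the Flask documentation).
  • Request Context: it follows the lifecycle of one request and stores the data related to that request, so the data does not have to be passed from function to function (details in the Flask documentation).

The Contexts are built on the idea of thread-local objects, which isolate data per thread. Werkzeug refines plain thread locals with Local Stack and Local Proxy, which form the basis of the Context implementation. More details will be added later.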

Tracing a Flask Request Through the Source

A simple Flask application

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    return "hello"


if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5005, debug=True)

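The core of the run method is shown below. It calls the server component provided by Werkzeug and passes in the instantiated Flask object, app, at which point the web service starts running. run_simple will be covered in more detail later; it plays the role of the Server/Gateway.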

def run(self, host=None, port=None, debug=None, load_dotenv=True, **options):
    ...
    from werkzeug.serving import run_simple

    try:
        run_simple(host, port, self, **options)
    finally:
        # reset the first request information if the development server
        # reset normally. This makes it possible to restart the server
        # without reloader and that stuff from an interactive shell.
        self._got_first_request = False

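How Flask's components handle a request

Once a request comes in, the Flask object's __call__ method is invoked. It satisfies the WSGI Application interface, right down to the familiar parameter names, and the wsgi_app method it delegates to is the real WSGI application callable.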

class Flask(Scaffold):
    ...
    def __call__(self, environ, start_response):
        """The WSGI server calls the Flask application object as the
        WSGI application. This calls :meth:`wsgi_app` which can be
        wrapped to applying middleware."""
        return self.wsgi_app(environ, start_response)

environ contains the information shown below; much of it has already been prepared by the server.

[Screenshot: contents of environ]
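
As a rough stand-in for a live request, the standard library can generate a minimal environ to inspect. This sketch uses wsgiref.util.setup_testing_defaults; a real server such as Werkzeug's adds more keys, for example the client address and one HTTP_* entry per request header.

```python
from wsgiref.util import setup_testing_defaults

environ = {}
setup_testing_defaults(environ)  # fill in a minimal, spec-compliant environ

# CGI-style request keys first, then the wsgi.* keys the server must provide
request_keys = sorted(k for k in environ if not k.startswith("wsgi."))
wsgi_keys = sorted(k for k in environ if k.startswith("wsgi."))

for key in request_keys + wsgi_keys:
    print(f"{key:22} {environ[key]!r}")
```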

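Following the call chain down from __call__, we arrive at wsgi_app. According to the comment in the source, the benefit of keeping the logic out of __call__ is that middleware becomes easier to stack.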

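# The simpler way: wrap only the inner WSGI callable
app.wsgi_app = MyMiddleware(app.wsgi_app)
# rather than the more cumbersome original way, which replaces the Flask object itself
app = MyMiddleware(app)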
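
A runnable sketch of the first form is shown below. HeaderMiddleware and the X-Wrapped header are invented for the example; the point is only that app remains a Flask object after wrapping, so route decorators and the test client keep working.

```python
from flask import Flask


class HeaderMiddleware:
    """Hypothetical middleware: adds one response header to every reply."""

    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            headers = list(headers) + [("X-Wrapped", "yes")]
            return start_response(status, headers, exc_info)

        return self.wsgi_app(environ, patched_start_response)


app = Flask(__name__)


@app.route("/")
def hello():
    return "hello"


# Wrap only the inner WSGI callable; `app` itself is still a Flask object.
app.wsgi_app = HeaderMiddleware(app.wsgi_app)

client = app.test_client()
response = client.get("/")
print(isinstance(app, Flask), response.headers.get("X-Wrapped"))
```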

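The source of wsgi_app is shown below. Its main jobs are:

  • Create a RequestContext and push it onto the stack.
  • Call full_dispatch_request, which runs the pre-processing functions, then finds and executes the handler for the request (if the path matches a user-written function, the user's code runs here).
  • Build a Response object from the result of the business-logic function and run the post-processing functions.
  • Pop the RequestContext off the stack.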

def wsgi_app(self, environ, start_response):
    ctx = self.request_context(environ)
    error = None
    try:
        try:
            ctx.push()
            response = self.full_dispatch_request()
        except Exception as e:
            error = e
            response = self.handle_exception(e)
        except: # noqa: B001
            error = sys.exc_info()[1]
            raise
        return response(environ, start_response)
    finally:
        if self.should_ignore_error(error):
            error = None
        ctx.auto_pop(error)

Below is where the user-written handler is actually called: the last line of the function.

def dispatch_request(self):
    """Does the request dispatching. Matches the URL and returns the
    return value of the view or error handler. This does not have to
    be a response object. In order to convert the return value to a
    proper response object, call :func:`make_response`.

    .. versionchanged:: 0.7
       This no longer does the exception handling, this code was
       moved to the new :meth:`full_dispatch_request`.
    """
    req = _request_ctx_stack.top.request
    if req.routing_exception is not None:
        self.raise_routing_exception(req)
    rule = req.url_rule
    # if we provide automatic options for this URL and the
    # request came with the OPTIONS method, reply automatically
    if (
        getattr(rule, "provide_automatic_options", False)
        and req.method == "OPTIONS"
    ):
        return self.make_default_options_response()
    # otherwise dispatch to the handler for that endpoint
    return self.view_functions[rule.endpoint](**req.view_args)
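
The last line is an ordinary dictionary lookup followed by a function call. Assuming a simple view that does not touch the request object, the two steps can be reproduced by hand, as in this sketch (the route and view are made up; url_map and view_functions are attributes of the Flask object):

```python
from flask import Flask

app = Flask(__name__)


@app.route("/hello/<name>")
def hello(name):
    return f"hello {name}"


# Step 1: URL matching, done by Werkzeug's routing. Binding to "localhost"
# is an arbitrary choice for this sketch.
adapter = app.url_map.bind("localhost")
endpoint, view_args = adapter.match("/hello/flask")

# Step 2: the lookup performed on the last line of dispatch_request
result = app.view_functions[endpoint](**view_args)
print(endpoint, view_args, result)
```

In a real request the match happens while the RequestContext is being set up, and the view runs with that context active.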

Once __call__ returns, the result is handed back to Werkzeug's server component, and the request is complete.

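Common Flask Components

Having traced a simple Flask request, I now have a first guess at an earlier question of mine: where is the session mechanism managed? It should be either where the application callable builds the RequestContext, or in the pre-processing functions that run before the business function. Reading further into RequestContext confirmed the guess: RequestContext manages the session, and we can plug in our own session handling, middleware-style, when the request is built. Flask's contexts also provide a mount point for global variables, g; Fission's Python Environment uses g to pass a function the variables it needs.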
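
A small sketch of how session and g fit together in application code is shown below. The routes, the load_user hook and the secret key are invented for illustration; the key is a placeholder and would need to be a real secret in practice.

```python
from flask import Flask, g, session

app = Flask(__name__)
app.secret_key = "dev-only-secret"  # placeholder; needed to sign the session cookie


@app.before_request
def load_user():
    # Pre-processing step: read the session and stash a value on g
    g.user = session.get("user", "anonymous")


@app.route("/login/<name>")
def login(name):
    session["user"] = name  # stored in a signed cookie by default
    return "ok"


@app.route("/whoami")
def whoami():
    return g.user  # set by load_user for this request only


client = app.test_client()  # the test client keeps cookies between requests
first = client.get("/whoami").get_data(as_text=True)
client.get("/login/alice")
second = client.get("/whoami").get_data(as_text=True)
print(first, second)
```

The session survives across requests through the cookie, while g is rebuilt for every request.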

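The Flask Web Framework

Flask's own source does its work to the right of the red dashed line in the figure below; the logic in the middle is mostly implemented by the Werkzeug library.

Flask's processing flow is shown in the figure below. Both figures are taken from a blog post; having read the source, I agree with both diagrams.

References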