1 Overview¶
The site api.lsst.codes
is intended to be a unified front-end for
API services designed for programmatic consumption to support LSST Data
Management. This technote will document how to use the tools that
SQuaRE has provided to more easily write and deploy these API
microservices.
In general, a microservice will be a way to extract data from an underlying back-end source. It may be used to extract from multiple sources and correlate and aggregate those sources. It also may be used to reformat or reduce that data in a way that is sensible to consume programatically.
You may ask, “why not just hit the backend? What value does the microservice provide?” There are two, somewhat complementary, answers. First is that if you put the service behind a unified framework and enforce certain standards for discoverability and output format, you can make writing tools to query your services a great deal easier. Second is that the first use case for this is in a chatops service, and we wanted to make the chatbot, in so far as possible, a presentation-layer front end and keep data manipulation and reduction logic out of it. This is particularly true since (if we stick with Hubot) the chat logic is written in CoffeeScript. Python is already part of the DM technical stack, which CoffeeScript is not, and therefore likely more familiar to DM developers.
This document assumes that you will write your microservice in Python 3 and Flask (assuming you’re working in Python, it’s nice if it works under Python 2 as well), and that you will use the apikit module (source at GitHub) to provide the required metadata routes.
However, there’s also a Quick Start Guide if you don’t really care how all this works, as long as it works.
2 Quick Start Guide (i.e. tl;dr)¶
Install
cookiecutter
:pip install cookiecutter
Ensure that your locale settings are correct and that you’re using UTF-8:
export LC_ALL=en_US.utf-8 export LANG=en_US.utf-8
Create a new microservice project:
cookiecutter https://github.com/lsst-sqre/uservice-bootstrap.git
Follow the directions on your screen.
3 How A Microservice Works¶
What’s actually going on in the project you just created with
cookiecutter
? The rest of this technote is an answer to that
question, and will show construction (from scratch) of a toy example of
a microservice that provides a more useful interface to an underlying
backend service.
We will begin with the metadata routes any api.lsst.codes application must provide.
3.1 Required Metadata¶
The metadata must be accessible at the routes /metadata
and
/{{api_version}}/metadata
from the root of the microservice. As of
the time of writing, the current (and only) API version is 1.0
.
The metadata is also documented at GitHub.
The metadata must be presented as a JSON object. All fields are
type str
.
{
name
version
repository
description
api_version
auth
}
The fields name
, description
, and version
are arbitrary.
Semantic versioning is strongly encouraged for version
.
api_version
should reflect the version of the API in use (at time of
writing, 1.0
). repository
is the URL for the source of your
project. If you want your microservice to be published on
api.lsst.codes
its source must be publicly available. We extremely
strongly recommend hosting it on GitHub.
The auth
field is constrained. It must be one of none
,
basic
, or bitly-proxy
. These represent the three choices
available to the microservice for authentication to GitHub (we have
standardized on GitHub as a canonical source of authentication data,
since LSST is fairly fundamentally coupled to GitHub and it functions as
a widely available OAuth2 provider):
none
: no authentication required.basic
: HTTP Basic Auth. Typically used with a GitHub username and token, although if you didn’t have two-factor authentication enabled at GitHub you could use a password here as well.bitly-proxy
: Authenticate through the Bitly OAuth2 proxy. Typically used with a GitHub username and password, and basically converts two-factor authentication back into username-and-password authentication.
The good news is, if you’re writing in Python and your application is a
Flask app, you don’t need to implement the
metadata route. Just use apikit
.
3.2 Using apikit¶
The apikit
module is documented at GitHub. apikit
has
two classes: apikit.APIFlask
and
apikit.BackendError
, and two functions:
set_flask_metadata()
and add_metadata_route()
.
The apikit.APIFlask
class is what you should generally use: it
is a subclass of a Flask application (flask.Flask
) that
already has metadata added and the route baked into it.
If you have an existing Flask application, you might want to use
apikit.set_flask_metadata()
on that application rather than the
apikit.APIFlask
class. You will find add_route_prefix()
useful to add additional routes to the metadata. That is helpful, for
instance, for Kubernetes Ingress resources, which provide routing but
not path rewriting, which makes it your responsibility to ensure the
metadata is available at /{{app_name}}/metadata
as well as
/metadata
.
The apikit.BackendError
class is useful with Flask decorators
to return diagnostic information when something goes wrong with your
application. You’ll see it in the example below.
3.2.1 Example apikit usage¶
The following describes how you would use apikit
and specifically
apikit.APIFlask
to create a service wrapper suitable for use
on api.lsst.codes
.
3.2.1.1 Microservice server overview¶
Let’s pretend that you have a service living at
https://myservice.lsst.codes, which you want to turn into a microservice
(that is, put an api.lsst.codes
-conformant API wrapper around) using
apikit. Your service uses the Bitly OAuth2 proxy to use GitHub as its
authentication source, so you need to leverage that.
We’ll say that this is going to go in a directory
uservice_mymicroservice
, and we will package it for installation via
setuptools
. The server itself will, imaginatively, be called
server.py
. (This mirrors the setup you would get if you used
cookiecutter
to create the service.)
3.2.1.2 Imports¶
We’ll cheat a little and start with all the imports we’re going to need; in real development, of course, you wouldn’t know this a priori but would build it up a bit at a time:
from flask import jsonify, request
from apikit import APIFlask, BackendError
from BitlyOAuth2ProxySession import Session
3.2.1.3 Flask application¶
Having done that, we need to create the microservice as an instance of
apikit.APIFlask
. This class takes the same arguments as the
object returned by metadata, with the following exception: auth
becomes an object with two fields, type
and data
, unless auth
is
one of None
, the empty string, or the string none
. The type
field must be one of the strings none
, basic
, or bitly-proxy
.
If auth
is an object whose type field is none
, auth.data
is
the empty object, or omitted completely. Otherwise auth.data
is an
object with two fields, username
and password
. If auth.type
is bitly-proxy
then auth.data
must have a third field,
endpoint
, which is the start
point of the OAuth2 proxy data flow
for the underlying service. Usually this is
https://service.host/oauth2/start
.
The api_version
field has a sane default (currently 1.0
) and can
normally be omitted.
Here’s what all that looks like:
backenduri = "https://myservice.lsst.codes"
app = APIFlask(name="uservice-mymicroservice",
version="0.0.1",
repository="https://github.com/sqre-lsst/" +
"uservice-mymicroservice",
description="My delightful microservice",
route=["/", "/mymicroservice"],
auth={"type": "bitly-proxy",
"data": { "username": "",
"password": "",
"endpoint": backenduri +
"/oauth2/start" } })
This creates a Flask application which presents the service metadata on
/metadata
, /v1.0/metadata
, /mymicroservice/metadata
, and
/mymicroservice/v1.0/metadata/
, as well as all of those with
.json
appended.
3.2.1.4 Session object¶
Now, in order to actually access your data, you’re going to need to make your requests within a session with the appropriate authentication. Let’s assume that your caller is going to send you HTTP Basic Authentication headers, and you’re going to use those as username and password to the proxy.
You’ll need a place to store the session. Fortunately, Flask provides a
mechanism for this: the app.config
dict.
So, after initialization, you probably want:
app.config["SESSION"] = None
3.2.1.5 Reauthorization¶
Next you need a _reauth()
function, so if an HTTP operation fails
with a 401 Unauthorized
or 403 Forbidden
, you can try to
regenerate a session with your authentication data:
def _reauth(app, username, password):
"""Get a session with authentication data"""
oaep = app.config["AUTH"]["data"]["endpoint"]
# Session here comes from BitlyOAuth2Proxy
session = Session.Session(oauth2_username=username,
oauth2_password=password,
authentication_session_url=None,
authentication_base_url=oaep)
session.authenticate()
app.config["SESSION"] = session
When we create the actual fetch of backend data, we’ll see how to pull the headers off the request we got and create an authorization object for the session.
3.2.1.6 Error handler¶
Next we’ll add a basic error handler:
@app.errorhandler(BackendError)
def handle_invalid_usage(error):
"""Custom error handler; bubble up status code, jsonify rest."""
response = jsonify(error.to_dict())
response.status_code = error.status_code
return response
Now, whenever you want to return an error based on something you got
from the service, create a new apikit.BackendError
.
3.2.1.7 Healthcheck¶
Since this application is eventually going to run under Google Container
Engine using an Ingress TLS terminator and router (well, this is our
current state, and it is our assumption that it will be that way
long-term, anyway), you want the actual application root to return a
200
very quickly, because the Ingress controller will be pinging it
often to determine service health (GCE’s Ingress defines a successful
healthcheck as getting 200
from an HTTP GET /
.
@app.route("/")
def healthcheck():
"""Default route to keep Ingress controller happy."""
return "OK"
3.2.1.8 Service logic¶
Finally, let’s add the actual service. In addition to the routing and fetching logic, you will need to peel the authentication headers out of the inbound request and create a session with them, if you don’t already have a session with the correct authentication information.
3.2.1.8.1 Interface¶
Let’s say you have decided that your microservice interface will respond
to GET /mymicroservice/jobname/metric
to retrieve the named metric about
jobname (for instance, GET /mymicroservice/buildmyapp/time
to get back data about how long a build took).
3.2.1.8.2 Backend¶
We’ll pretend that your backend service is ill-behaved, and does the following annoying things:
- It wants its arguments as parameters on the
HTTP GET
rather than as a request body or a path on theGET
URL. - It returns the requested metric as a plain text value, rather than wrapped in JSON or XML or anything sane.
Therefore, you call it with GET /api?metric=metric&job=jobname
and
what you get is what you get, which you hope is ASCII text, or maybe
UTF-8, but it’s not like the other side is going to guarantee that to you.
3.2.1.8.3 Return value¶
What you have decided to return to your caller is, of course, JSON, and you are going to return a structure that looks like:
{
jobname
metric
value
}
Where each of those fields are strings.
3.2.1.8.4 Service route and implementation¶
Flask provides a nice decorator service for pointing routes to
functions. You’ve seen it above with the healthcheck route: just put
@app.route
atop the function definition.
# Route it to the root too, in case we want to put it behind nginx
# or HAProxy or something that can do path rewriting.
@app.route("/<jobname>/<metric>")
@app.route("/mymicroservice/<jobname>/<metric>")
def get_metric_for_job(metric=None, jobname=None):
"""Retrieve the metric and format it with JSON for return."""
# Create a custom error if metric or jobname are not specified
if metric is None or not metric or jobname is None or not jobname:
raise BackendError(reason="Bad Request",
status_code=400,
content="Must specify metric and jobname.")
# If we have authorization on the request, try to use it
if request.authorization is not None:
inboundauth = request.authorization
currentuser = app.config["AUTH"]["data"]["username"]
currentpw = app.config["AUTH"]["data"]["password"]
# If we are already using this user/pw, don't bother.
if currentuser != inboundauth.username or \
currentpw != inboundauth.password:
_reauth(app, inboundauth.username, inboundauth.password)
else:
raise BackendError(reason="Unauthorized", status_code=401,
content="No authorization provided.")
session = app.config["SESSION"]
# This is going to end up in the same function where backenduri
# is defined. See below.
url = backenduri + "/api"
params = { "metric": metric,
"job": jobname }
resp = session.get(url, params=params)
if resp.status_code == 403 or resp.status_code == 401:
# Try to reauth
_reauth(app, inboundauth.username, inboundauth.password)
session = app.config["SESSION"]
resp = session.get(url, params=params)
if resp.status_code == 200:
# Success!
rdict = { "metric": metric,
"jobname": jobname,
"value": resp.text() }
return jsonify(rdict)
else:
raise BackendError(reason=resp.reason,
status_code=resp.status_code,
content=resp.text)
3.2.1.8.5 Implementation notes¶
jsonify()
not only returns the JSON representation of the dictionary passed to it, but wraps it in aResponse
object with a mimetype ofapplication/json
and allows you to set an HTTP status code.- We set a custom error if either metric or jobname are not specified.
A
400 Bad Request
seems appropriate. - Most of the rest of the function is concerned with making sure you
have a session object and attempting reauthorization if you get a
401 Unauthorized
or403 Forbidden
on the initial request.
And that’s pretty much it. You’d want to wrap all of the above in a
function; let’s call it server()
and give it a run_standalone
parameter.
3.2.1.9 Server function¶
def server(run_standalone=False):
# Refer to the earlier pieces of this document for the code
# fragments that need to be inserted in place of the comments.
#
# APIFlask instantiation to create the application goes here...
# ...then add SESSION to the config dict...
# ...next, add an error handler...
# ...then, your healthcheck...
# ...finally, your actual route.
#
# And now a bit of new code, to run the service if invoked standalone:
if run_standalone:
app.run(host='0.0.0.0', threaded=True)
The imports go at the top of server.py
, of course, and the
_reauth()
function stands on its own, not nested inside server()
.
3.2.1.10 Standalone invocation¶
The only other thing you really need is to add a Python shebang and
invoke server()
standalone if the script is run from the
command-line. Making standalone()
its own function makes
setup.py
a bit prettier.
#!/usr/bin/env python
"""My microservice wrapper."""
# Imports go here...
# ...server function goes here...
# ...reauth goes here.
def standalone():
"""Run standalone; makes setuptools invocation a little prettier."""
server(run_standalone=True)
if __name__ == "__main__":
standalone()
3.3 Using setuptools¶
You now want to make this server loadable as a module and then wrap it
all up with setuptools
. So, you’ll need an __init__.py
that
exports the server()
and standalone()
symbols:
#!/usr/bin/env python
"""My microservice wrapper's __init__."""
from .server import server, standalone
__all__ = [ "server", "standalone" ]
Then you need to go up a directory and create setup.py
. There’s good
boilerplate for this, e.g. in the metricdeviation microservice.
Make sure to set any package dependencies:
install_requires=[
'sqre-apikit==0.0.10'
],
and the entrypoint:
entry_points={
'console_scripts': [
'sqre-uservice-mymicroservice = uservice_mymicroservice:standalone'
]
}
3.4 Further Considerations¶
Your service will eventually be set up to run as a Docker container
under Google Container Engine. This will require population of a
Dockerfile
and deployment description files in kubernetes
.
However, those files are not in scope for this document, and, in
general, are expected to be added by the DM SQuaRE team. (If you use
cookiecutter
you will already have these files, and they will be
modified as needed by the SQuaRE team.)
If you, as a service author, want to stop after making the service pip-installable with setuptools, that’s perfectly fine. SQuaRE will take it from there.
That process will be detailed in a future tech note.