For the last few months my team has been working on an identity solution to support splitting our monolith apart into smaller services. Our new world is a medley of microservices and microsites (single page apps), among other things. Internally this means many identity clients and flows in play, with the primary objective being to support single sign on (SSO) for our customers.
OAuth is everywhere, but in my opinion, it is not easy to fully understand without a considerable amount of time and effort. For my own benefit, and hopefully that of others, I have summarised my notes from the last few months.
For a more gentle introduction to OAuth and OpenID, I recommend these resources:
Contents
- What is OAuth?
- What is the point of OAuth?
- What is OAuth 2.1?
- What is an OAuth flow, and why are there many?
- Is it a protocol or framework?
- What is OpenID?
- What is the point of OIDC - why can’t I just use OAuth?
- Why does it matter who they are and if they are present?
- Why can’t I just take the presence of an access token as proof the user has authenticated?
- Why couldn’t the access token be used to hit a user info endpoint, to figure out who the user is?
- Are there any additional reasons?
- How does OpenID Connect integrate with OAuth?
- Why use the /userinfo endpoint instead of the ID token to get claims?
- What is the difference between Authorization and Authentication?
- What are the key terms (and how do I pronounce them)?
- What are scopes, claims and audiences and how do they interact?
- What are the different claim types in OIDC?
- What is the userinfo endpoint?
- What are the various endpoints I should be aware of, and how do I discover them?
- If access tokens should be opaque to the client, why is mine a JWT?
- If OIDC returns an ID token that can be read by the client, does that mean the client can now read access tokens?
- How do refresh tokens work?
- How do identity tokens work?
- What is an authorization code?
- Why not return the access token straight away, instead of exchanging an authorization code?
- How does PKCE factor in?
- Where do state and nonce come in?
- Why must I validate URL redirects?
- Should TLS be used for all requests?
- Which RFCs and BCPs are in play here?
- What happened to OAuth 1.0?
What is OAuth?
Straight from the spec, we have this definition:
The OAuth 2.0 authorization framework enables a third-party
application to obtain limited access to an HTTP service, either on
behalf of a resource owner by orchestrating an approval interaction
between the resource owner and the HTTP service, or by allowing the
third-party application to obtain access on its own behalf.
Let’s break this down:
enables a third-party application to obtain limited access to an HTTP service
For example, the third party application might manage social media posts, and will have limited access to Facebook or Twitter APIs to read and write posts/tweets. OAuth 2.0 is only relevant over the HTTP protocol - other protocols are out of scope.
either on
behalf of a resource owner by orchestrating an approval interaction
between the resource owner and the HTTP service
The resource owner in this example would be you - the owner of the social media accounts. The approval interaction would involve you being redirected to the Facebook or Twitter login page, and ultimately an access token (to provide access to the HTTP service) being returned.
or by allowing the
third-party application to obtain access on its own behalf
The OAuth standard includes provision for machine-to-machine communication (see client credentials flow). There is also provision to allow a service to “refresh” their access without further user interaction (see refresh tokens).
What is the point of OAuth?
Understanding how things worked before OAuth 1.0 or 2.0 existed helped my understanding of this framework.
OAuth solves many problems, but the main benefit is people don’t have to give their password to third party sites (this is known as the password anti-pattern). For example, say you sign up for an email account and the provider offers to get your contacts from Facebook. With OAuth, the email provider doesn’t have to know your Facebook password. Without it, you are giving your password, and therefore unlimited access, to someone else.
Aside from that, it is convenient to have a standard for authorization. Tooling, libraries and communities are built around the standard, which ultimately makes it easier to implement.
What is OAuth 2.1?
OAuth 2.1 consolidates OAuth 2.0 and several RFCs that have extended or amended the original spec over time. This page summarises the differences between 2.0 and 2.1:
What is an OAuth flow, and why are there many?
An OAuth flow (also known as a grant type) is a way to get an access token; there are several because different scenarios call for different approaches.
Different authorization scenarios require different flows. For example, a user sitting at their desktop computer can interactively provide credentials (that is to say, type their username and password into a website). In this case you might use the authorization code flow. However, if you are writing a service that communicates with another service, with no user present to enter credentials, you might use the client credentials flow.
There are other factors that might influence your decision, too. For example, how much do you trust the client? An SPA or mobile app isn’t as trustworthy as a server-side app that you fully own and control.
These flows all have a few things in common:
- The aim is to get an access token
- They share standard endpoints of the OAuth protocol (https://tools.ietf.org/html/rfc6749#section-3)
- The terminology to describe the various actors in each flow is the same
| Flow | When to use |
|------|-------------|
| Authorization Code with PKCE | A human needs to provide credentials and their device is not input constrained and supports a browser. As of OAuth 2.1, all clients using this flow should implement PKCE (this is particularly important for SPAs). |
| Client credentials | Machine-to-machine communication - no human present. In this case the client is the resource owner. |
| Device code | Where a human needs to enter credentials on a device that is input constrained, or can’t support a browser. For example, smart TV apps. |
| Refresh Token | Used when exchanging an existing refresh token for a new access token (and maybe a new refresh token). Only used with the /token endpoint. |
| Implicit | Legacy - not included in OAuth 2.1. Was originally used to work around JavaScript apps not being able to make requests to a different domain - an issue now solved by CORS. There are other limitations, for example the spec makes no allowance for returning refresh tokens using this flow. (Okta blog post) |
| Resource Owner Password Credentials | Legacy - not included in OAuth 2.1. Your app would send the human’s username and password to the identity provider (so you must have a high level of trust in the app). This is impersonation - the identity provider has no way of knowing whether the human is really there. (Scott Brady blog post) |
| Authorization Code (no PKCE) | Legacy - not included in OAuth 2.1. All clients using authorization code should implement PKCE. |
This page also summarises the different flows that are available: https://oauth.net/2/grant-types/
Is it a protocol or framework?
The title of the OAuth 2.0 spec calls it a framework, however the term “protocol” is used in the spec, too.
OAuth 2.0 is very extensible, so there are more likely to be variations in implementations compared to other standards we might consider “protocols”. That is to say, some vendors are easier to integrate with than others, depending on their implementation of OAuth. Not all vendors will “fully” implement OAuth. For example, at the time of writing, [Twitter only supports the client credentials flow](https://developer.twitter.com/en/docs/authentication/oauth-2-0) for OAuth 2.0.
More information here: https://stackoverflow.com/questions/35070594/oauth-2-is-a-protocol-or-a-framework
What is OpenID?
OpenID Connect (OIDC) is a layer on top of OAuth. From the spec:
OpenID Connect 1.0 is a simple identity layer on top of the OAuth 2.0 protocol. It enables Clients to verify the identity of the End-User based on the authentication performed by an Authorization Server, as well as to obtain basic profile information about the End-User in an interoperable and REST-like manner.
Essentially - OAuth enables authorization, whereas OIDC additionally enables Authentication. OIDC brings a number of benefits. In particular:
- Security - you can verify that the user has authenticated (see below)
- Standardisation - before OIDC, vendors were creating custom authentication methods
What is the point of OIDC - why can’t I just use OAuth?
It’s a good question. If I want users to log in with some provider that offers OAuth, why not just initiate a flow, get an access token, and assume the user must have authenticated with the provider?
First, we should clear up exactly what authentication is. It gives us two things:
- Who the user is (their identity)
- Whether the user is present
Authorization does not give us these properties - it simply provides delegated access to a resource. We don’t know who the user is, or whether they are present.
The OIDC spec also guarantees uniqueness of the `sub` claim for each issuer (so technically the `sub` and `iss` claims together are guaranteed to be unique). See section 5.7 of the spec.
Why does it matter who they are and if they are present?
Because you probably want to manage the user’s session - your app will likely need some indication of who the user is (at least a unique identifier so you can tell the difference between users).
Presence matters because otherwise you don’t know it’s a legitimate user who has authenticated. The entity accessing your app might have an access token, but you don’t know whether they are the legitimate entity, i.e. the user (or resource owner in OAuth terms).
Why can’t I just take the presence of an access token as proof the user has authenticated?
Firstly, the client is not the intended audience of the access token. The client should consider the token to be opaque. To quote the OAuth 2.0 spec:
The string is usually opaque to the client.
https://tools.ietf.org/html/rfc6749#section-1.4
This means the client should not read it. However, as we said, your app probably needs some user information, at least a unique identifier for the user (for example, to associate the identity from the remote OpenID provider with an identity in the client app).
OpenID Connect provides such a token - an identity token - that can be read by the client app because it’s in a standardised format (JWT). This allows access tokens to remain opaque, as per the OAuth 2.0 spec, but provides the client app with the information it needs.
Why couldn’t the access token be used to hit a user info endpoint, to figure out who the user is?
You can do this, but it only provides half of our definition of authentication. What we are lacking is an indication of whether the user is present. This is because:
- The access token might well outlive the user’s session, or perhaps the access token has come from somewhere else (e.g. obtained by exchanging a refresh token). If this is the case, there is no user present. If you’re trying to verify whether a user’s session is still valid, exchanging an access token for details won’t help.
- A user info endpoint couldn’t provide info on the authentication (e.g. when it expires), because it’s accessed with an access token. The access token could have come from anywhere. Access tokens are not scoped to clients, so even if the OpenID provider could return information on recent authentications, how would the OpenID provider know which authentication you are after? The user may have logged into this OpenID provider via different clients. You couldn’t pass along a client id, because then any client could get authentication info about another client.
You need identity information to verify the user is still there. This can be done by looking at the expiry in the identity token, or by using `prompt=none` (https://stackoverflow.com/a/49048976/2547543).
Are there any additional reasons?
Another strong reason for OpenID Connect is standardisation. Without it, each OAuth implementation has a slightly different way of conveying identity information (e.g. it might return a `subject` claim, or a `user_id` claim, etc.). If your app integrates with multiple OpenID providers (e.g. login with Google and login with Facebook), then you have to handle each of these differently.
For more information on why OIDC is needed, look at https://oauth.net/articles/authentication/
How does OpenID Connect integrate with OAuth?
To be clear - OpenID Connect is an extension of OAuth. You can’t have OIDC without OAuth. To quote the spec:
OpenID Connect implements authentication as an extension to the OAuth 2.0 authorization process.
https://openid.net/specs/openid-connect-core-1_0.html#Introduction
Essentially, you request the `openid` scope, which indicates that an identity token should be returned. This token is a JWT and can be verified by the client. See “If access tokens should be opaque to the client, why is mine a JWT?” for more information.
OpenID Connect then provides some optional standard scopes that map to various claims about the user (`profile`, `email`, `address` and `phone`). The `profile` scope maps to many claims that are returned (`name`, `family_name`, `picture`, `gender`, etc.). You could implement your own scopes, too.
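To make that concrete, here is a rough sketch of an authorization request asking for the `openid` and `profile` scopes - the provider domain, client id and redirect URI are all placeholders I have made up:

```python
# A sketch of an OIDC authorization request URL, assuming a hypothetical
# provider at https://idp.example.com and placeholder client details.
from urllib.parse import urlencode

AUTHORIZE_ENDPOINT = "https://idp.example.com/authorize"  # hypothetical

params = {
    "response_type": "code",              # authorization code flow
    "client_id": "my-client-id",          # placeholder
    "redirect_uri": "https://app.example.com/callback",  # placeholder
    "scope": "openid profile email",      # openid triggers an ID token; profile/email map to claims
    "state": "random-state-value",        # CSRF protection (see the state/nonce section)
}

print(f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}")
```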
The cool thing about doing authorization first is the user can grant access to parts of their identity. For example, it’s feasible that the user (resource owner) would not want some client to have access to their phone number. The user could choose not to grant access to this information.
OpenID Connect also defines a standard `userinfo` endpoint, which can be accessed using the access token to retrieve further user information if it is not included in the initial identity token.
Why use the /userinfo endpoint instead of the ID token to get claims?
There are a few things to consider here:
- Calling the `/userinfo` endpoint is an additional call, compared to simply inspecting the ID token
- The ID token could have out-of-date information, whereas a call to `/userinfo` will (in theory) return up-to-date information
And if you are implementing the OpenID provider:
- Putting lots of claims into the ID token will increase its size. If the token will be stored client-side this could become a problem.
What is the difference between Authorization and Authentication?
Authorization is all about providing access to a resource. This is what OAuth is concerned with. If I have an access token to access your Facebook pictures, it doesn’t mean I am you. I just have access to a resource that you own.
Authentication is about verifying that a user is who they say they are. This could happen without granting any access to a resource - i.e. without authorization. This verification happens by someone providing something to prove who they are, e.g. a password or a fingerprint.
Typically, authentication happens before authorization. A simple example is logging onto your computer at work. This is authentication, but you might not be authorized to access certain files on a shared drive.
What are the key terms (and how do I pronounce them)?
The OAuth 2.0 spec defines four roles:
- Resource owner (`RO`) - “An entity capable of granting access to a protected resource”. This is often a human, for example when someone grants some service access to parts of their Facebook profile.
- Resource server (`RS`) - “The server hosting the protected resources, capable of accepting and responding to protected resource requests using access tokens”. For example, if you grant a service access to a file in your OneDrive account, the resource server would be OneDrive.
- Client - “An application making protected resource requests on behalf of the resource owner and with its authorization”. This would be the service that uses an access token to access protected resources on the resource server.
- Authorization server (`AS`) - “The server issuing access tokens to the client after successfully authenticating the resource owner and obtaining authorization”. This might be something you implement yourself, or a third party service (perhaps Okta, Auth0, or Facebook).
The OpenID Connect spec defines many terms. Some key ones:
- Relying party (`RP`) - “OAuth 2.0 Client application requiring End-User Authentication and Claims from an OpenID Provider”. This is the same as the “Client” defined above.
- OpenID Provider (`OP`) - “OAuth 2.0 Authorization Server that is capable of Authenticating the End-User and providing Claims to a Relying Party about the Authentication event and the End-User”. This is the same as the “Authorization server” defined above.
Note: the OAuth 2.1 spec also mentions relying parties, but doesn’t define it up-front. It should probably say “client(s)”.
These things have non-obvious pronunciations:
- `JWT` - pronounced “jot”
- `PKCE` - pronounced “pixie”
What are scopes, claims and audiences and how do they interact?
Some definitions from the specs:
- Claims - “Piece of information asserted about an Entity”. This could be anything - some claims are defined by the OAuth and OIDC specs, others can be completely custom. Examples are “email” and “birthdate”.
- Scopes - this refers to the “scope” of access of an access token. Essentially, what access an access token has. Clients pass along a scope parameter when making a request for a token (either via the token or authorize endpoints). The resulting access token will be “scoped” to the provided scopes. In practice, if the access token is a JWT, this will involve setting the scope and audience claims in the token. Additional claims may also be set (see below).
- Audience - if requesting an API scope, this could be returned as an audience claim, which can be validated by protected resources. In practice, if the token is a JWT, the protected resource can inspect the access token and validate the audience claim. If it’s not a JWT, the access token can be introspected (by making a call to the `AS`) to check the scope of the token. This is defined as optional in the OAuth 2.0 spec. Access tokens do not have to be scoped to an audience, but it is recommended. https://tools.ietf.org/html/draft-ietf-oauth-v2-1-00#section-7.4.3
For OAuth 2.0, the client uses the scope parameter when sending requests to the token or authorization endpoints. This parameter tells the authorization server what scope of access the client is requesting. If the access token is ultimately issued, it will have this scope of access. For example, a client might request a scope of `files:read`. The scope of the token is determined by the `RS` via token introspection (either by an introspect call to the `AS` or by inspecting the token directly, if it’s readable, e.g. a JWT). The token will have a scopes claim defined and possibly an audience claim.
The OAuth 2.1 spec makes a few assertions about scopes:
- They are case sensitive
- They are optional to request - the `AS` may return a default list if they are omitted (this should be documented by the server), or fail the request indicating an invalid scope
- If the requested scopes do not match the actual granted scopes (for example, if the `RO` refuses access or the client doesn’t have permission), then the `AS` “MUST include the “scope” response parameter to inform the client of the actual scope granted” (see OAuth 2.1 3.3)
To summarise so far: for OAuth 2.0, scopes are about limiting the scope of access for an access token. This is done by providing `scope` or `audience` claims in the access token, which are validated by the `RS`. Scopes tend to be things like `files:read`, and audiences tend to be `RS`s, for example `picture_server`.
When we consider OpenID Connect, scopes are extended to do a bit more. OIDC is concerned with identity, so we are talking about requesting information about the resource owner. Scopes in OIDC can be used to request sets of claims about a user: https://openid.net/specs/openid-connect-core-1_0.html#ScopeClaims
Depending on the `response_type` used (whether an access token is requested), the claims will either be made available via the userinfo endpoint (when an access token is requested) or the `id_token` (when an access token is not requested). This is because an access token is required for the userinfo endpoint. (See OIDC 5.5)
For example, I can request the `profile` scope, which is a standard scope defined by the OIDC spec. This scope maps to the following claims: `name`, `family_name`, `given_name`, `middle_name`, `nickname`, `preferred_username`, `profile`, `picture`, `website`, `gender`, `birthdate`, `zoneinfo`, `locale`, and `updated_at`. (See https://openid.net/specs/openid-connect-core-1_0.html#ScopeClaims).
OIDC also allows individual claims to be requested, without having to request them via a scope. This is done using the `claims` request parameter, which accepts a JSON document that lists the claims. The document defines whether a claim should be returned via the userinfo endpoint or in the ID token. See https://openid.net/specs/openid-connect-core-1_0.html#ClaimsParameter for more information.
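As a rough illustration (the claim names are standard OIDC claims, but the endpoint and client details are placeholders I have invented), a `claims` parameter might be built like this:

```python
# A sketch of the OIDC "claims" request parameter: a JSON document stating which
# claims are wanted and where they should be returned (userinfo vs ID token).
import json
from urllib.parse import urlencode

claims_request = {
    "userinfo": {
        "email": {"essential": True},      # return via the userinfo endpoint
        "picture": None,                   # voluntary claim
    },
    "id_token": {
        "auth_time": {"essential": True},  # return inside the ID token
    },
}

params = {
    "response_type": "code",
    "client_id": "my-client-id",                         # placeholder
    "redirect_uri": "https://app.example.com/callback",  # placeholder
    "scope": "openid",
    "claims": json.dumps(claims_request),
}

print(f"https://idp.example.com/authorize?{urlencode(params)}")  # hypothetical provider
```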
OIDC is concerned primarily with ID tokens. The audience claim (`aud`) is required in the ID token, and must at least include the client_id of the `RP`.
It’s worth noting that the OAuth 2.0 spec says very little about claims. This makes sense, as OAuth is about delegated authorization. Information about the resource owner isn’t particularly relevant. OIDC is all about claims, as claims represent information about the resource owner.
So, to summarise:
- OAuth 2.0 takes scopes and returns access tokens that are scoped to the provided scopes, via `audience` or `scopes` claims in the access token (these might be returned via an introspection call if the token is not readable or verifiable by `RS`s)
- OAuth 2.0 says that `AS`s can make providing a list of scopes optional, but must advertise the default list
- OAuth 2.0 says scopes are case sensitive
- OIDC takes a scope and maps it to a list of claims, that are returned either in the id_token or from a call to the userinfo endpoint (when an access token is also requested)
What are the different claim types in OIDC?
OIDC defines 3 types of claim:
- Normal
- Aggregated
- Distributed
Normal claims are simple. They are asserted directly by the OpenID provider and returned as JSON key/value pairs.
Aggregated claims are asserted by a claims provider other than the OpenID provider (that is to say, some other service is providing this claims information). In this case, the claims are represented as part of a JWT and passed back alongside the normal claims. The JWT is signed by this other service providing the claims.
Distributed claims are similar to aggregated claims, but instead of returning the claim value, a reference (URL) to the value is returned, along with an access token.
There are really good examples in the OIDC spec.
What is the userinfo endpoint?
The userinfo endpoint is defined in the OIDC spec.
The endpoint is accessed using an access token (as defined in the OAuth 2.0 spec). If an access token is not requested from the `AS`, claims are returned in the ID token instead of via the userinfo endpoint.
When requesting information from the userinfo endpoint, clients must verify that the `sub` claim returned matches that in the ID token. This is to guard against token substitution attacks. As clients should not inspect access tokens (they are intended to be opaque to clients), there’s no guarantee that the `sub` claim returned from the userinfo endpoint will match the `sub` claim in the ID token.
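A minimal sketch of what that check might look like, assuming a hypothetical provider and that the ID token has already been validated and decoded elsewhere:

```python
# Call the userinfo endpoint with the access token and check that the returned
# sub matches the sub from the (already validated) ID token.
import requests

USERINFO_ENDPOINT = "https://idp.example.com/userinfo"  # hypothetical

def fetch_userinfo(access_token: str, id_token_claims: dict) -> dict:
    response = requests.get(
        USERINFO_ENDPOINT,
        headers={"Authorization": f"Bearer {access_token}"},
    )
    response.raise_for_status()
    userinfo = response.json()

    # Guard against token substitution: the sub from userinfo must match the
    # sub in the ID token received during authentication.
    if userinfo.get("sub") != id_token_claims.get("sub"):
        raise ValueError("userinfo sub does not match ID token sub")
    return userinfo
```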
What are the various endpoints I should be aware of, and how do I discover them?
OAuth 2.0 defines a couple of key endpoints:
- authorize - used to interact with the resource owner in order to obtain an authorization grant.
- token - used to obtain access tokens by presenting an authorization grant or a refresh token. This is also the endpoint used to get an access token when using the resource owner password grant (deprecated in OAuth 2.1).
There is an extension for defining a token introspection endpoint (RFC 7662), which returns the active state and metadata about an access token as a JSON document.
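As a rough sketch, an introspection call from a resource server might look like this - the endpoint and the resource server credentials are placeholders:

```python
# A sketch of RFC 7662 token introspection from a resource server's point of view.
import requests

INTROSPECTION_ENDPOINT = "https://idp.example.com/introspect"  # hypothetical

def introspect(access_token: str) -> dict:
    response = requests.post(
        INTROSPECTION_ENDPOINT,
        data={"token": access_token},
        auth=("resource-server-id", "resource-server-secret"),  # placeholder credentials
    )
    response.raise_for_status()
    result = response.json()
    if not result.get("active"):
        raise ValueError("token is not active")
    return result  # contains metadata such as scope, exp and aud when active
```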
OIDC makes use of the authorize and token endpoints from OAuth 2.0, and also defines the userinfo endpoint (see “What is the userinfo endpoint?”).
In order to discover these endpoints, many implementations provide a JSON document at a well-known location that lists these endpoints, in addition to lots of other useful information about the authorization server. This information relates to both OAuth and OpenID, and is defined in RFC 8414.
The default endpoints are:
/.well-known/oauth-authorization-server
/.well-known/openid-configuration
Note that some implementations will only use the latter, but the document will include both OIDC and OAuth metadata.
These endpoints can be changed for your own implementation - they are not guaranteed.
OAuth/OIDC tooling and libraries may contain components that can read these discovery documents, so endpoints don’t have to be hardcoded. This is certainly the case in the .net world with the IdentityModel package.
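For illustration, fetching the discovery document and pulling out the endpoints might look something like this (the issuer URL is a placeholder):

```python
# A minimal sketch of endpoint discovery via the well-known metadata document.
import requests

ISSUER = "https://idp.example.com"  # hypothetical

metadata = requests.get(f"{ISSUER}/.well-known/openid-configuration").json()

authorize_endpoint = metadata["authorization_endpoint"]
token_endpoint = metadata["token_endpoint"]
userinfo_endpoint = metadata.get("userinfo_endpoint")  # OIDC-specific
jwks_uri = metadata["jwks_uri"]                        # keys for verifying token signatures
```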
If access tokens should be opaque to the client, why is mine a JWT?
As the OAuth 2 spec says regarding access tokens:
The string is usually opaque to the client.
https://tools.ietf.org/html/rfc6749#section-1.4
In some implementations of authorization servers, the access token returned to the client will be a JWT, which could be read by the client. This is not part of the spec - clients should not rely on being able to read the access token if the authorization server with which you are integrating only guarantees OAuth 2.0 compliance.
There is a proposed standard for “issuing OAuth 2.0 access tokens in JSON web token (JWT) format”. See Dominick Baier’s blog post for more information.
The main advantage of using JWTs for access tokens is that resource servers can validate the token cryptographically using the authorization server’s public key, without calling an introspection endpoint on the authorization server. This is useful if the servers aren’t co-located, and it also limits the amount of traffic going to the authorization server (and therefore the resources required).
Cryptographic validation of JWTs can be achieved by discovering the authorization server’s JWK (JSON Web Key) set through server metadata (RFC 8414) - see `jwks_uri` - which will allow you to verify the signature in the JWT. See RFC 7515 on JSON Web Signatures for more information.
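As a sketch of what that looks like in practice, here is how a resource server might validate a JWT access token using the published keys. I’m assuming the PyJWT library here, and the issuer, audience and jwks_uri values are placeholders:

```python
# A sketch of JWT access token validation at a resource server using the
# authorization server's published keys (jwks_uri from server metadata).
import jwt  # PyJWT

JWKS_URI = "https://idp.example.com/.well-known/jwks.json"  # hypothetical
jwk_client = jwt.PyJWKClient(JWKS_URI)

def validate_access_token(token: str) -> dict:
    signing_key = jwk_client.get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience="picture_server",         # the RS checks it is the intended audience
        issuer="https://idp.example.com",  # and that the token came from the expected AS
    )
```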
The downside of using JWTs for access tokens is that the resource server might accept a token that should no longer be valid (say the token has been revoked, or the resource owner is no longer valid - say their account has been disabled). It’s a trade-off - if access tokens are only valid for an hour, that might be an acceptable window during which a revoked token could still be accepted.
If this isn’t the case - if it’s important to have an up-to-date view of whether the granted authorization is still valid - then you need to introspect the token by calling back to the authorization server. If this is the case, JWTs might not be the best choice for access tokens. IdentityServer (a .NET implementation of an authorization server) has the concept of “reference” tokens - these are completely opaque and force protected resources to call back to introspect tokens. It’s important to consider the extra traffic this will generate, compared to protected resources being able to validate JWTs using public key information.
None of this changes the fact that clients should not read access tokens, even if the JWT format is guaranteed.
If OIDC returns an ID token that can be read by the client, does that mean the client can now read access tokens?
No. The fact that ID tokens can and should be read by clients, and that ID tokens are always JWTs, has no impact on the structure of access tokens, or on the fact that access tokens should be opaque to the client (as defined in the OAuth 2.0 spec).
How do refresh tokens work?
Refresh tokens allow clients to get new access tokens from the authorization server. As access tokens are typically short lived, refresh tokens are needed so that clients can get a new access token when the old one expires.
The advantage of refresh tokens is that further interaction from the resource owner (the user) is not required. If a user is present, a refresh token isn’t really required, as a new access token could be issued by the user being re-authorized on the authorization server (they may already have cookies set for the `AS`, so this would be non-interactive).
Refresh tokens are not sent to resource servers. They are handled by the client and sent directly to the authorization server when a new access token is needed.
To refresh an access token, the refresh token is sent to the standard OAuth 2.0 `token` endpoint, along with the scopes required and the `refresh_token` grant type. It is possible to change the scopes each time an access token is refreshed (although the requested scopes must not include any scope not originally granted by the `RO`). A new refresh token may be returned with the new access token, in which case the client should discard the old one and use the new one in future.
The `offline_access` scope may need to be requested in order to receive a refresh token. This is actually part of the OIDC spec, but is used by many providers to cover the OAuth 2.0 use case, too. The OIDC spec states that this scope requests a refresh token that can be used to obtain an access token granting access to the user’s userinfo endpoint, even when the user is not there. This is often extended so that the refresh token returned can obtain access tokens that grant access to all requested scopes. See the OIDC spec section 11 and this Auth0 blog post for more information.
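A rough sketch of the refresh exchange, assuming a confidential client (the client id/secret and token endpoint are placeholders):

```python
# Refresh an access token at the token endpoint using the refresh_token grant type.
import requests

TOKEN_ENDPOINT = "https://idp.example.com/token"  # hypothetical

def refresh_access_token(refresh_token: str) -> dict:
    response = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "refresh_token",
            "refresh_token": refresh_token,
            "scope": "files:read",  # optional; must not exceed what was originally granted
        },
        auth=("my-client-id", "my-client-secret"),  # confidential client authentication
    )
    response.raise_for_status()
    tokens = response.json()
    # If a new refresh token is returned, the old one should be discarded.
    return tokens  # access_token, maybe refresh_token, expires_in, etc.
```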
Confidential clients must authenticate with the token endpoint using their client ID and client secret. For public clients like SPAs, things are more difficult. This blog by Auth0 has a good overview.
Refresh tokens are opaque to the client and to resource servers. They are only intended for use by the `AS`. I am not aware of any specification for their structure.
Of course, refresh tokens are highly sensitive as they can be used to request new access tokens, so should be well protected. Leastprivilege has a good blog post on hardening refresh tokens.
For more information, see the OAuth 2.1 spec
How do identity tokens work?
To quote the OIDC spec:
The ID Token is a security token that contains Claims about the Authentication of an End-User by an Authorization Server when using a Client, and potentially other requested Claims.
It is the primary extension that the OIDC spec makes on top of OAuth 2.0. It is always in JWT format, and unlike OAuth 2.0 access tokens, it is fine for clients to read this token.
Clients can verify the ID token using cryptography - the `AS` should publish public keys. See RFC 7517. There are several other steps that clients must perform to validate ID tokens.
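A minimal sketch of that validation on the client side, again assuming the PyJWT library and placeholder issuer/client values (a real implementation must follow all the validation steps in OIDC Core 3.1.3.7):

```python
# Verify an ID token's signature and core claims, plus the nonce (see the
# state/nonce section below).
import jwt  # PyJWT

jwk_client = jwt.PyJWKClient("https://idp.example.com/.well-known/jwks.json")  # hypothetical

def validate_id_token(id_token: str, expected_nonce: str) -> dict:
    signing_key = jwk_client.get_signing_key_from_jwt(id_token)
    claims = jwt.decode(
        id_token,
        signing_key.key,
        algorithms=["RS256"],
        audience="my-client-id",           # aud must include this client's id
        issuer="https://idp.example.com",  # iss must match the expected OP
    )
    if claims.get("nonce") != expected_nonce:  # replay protection
        raise ValueError("nonce mismatch")
    return claims
```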
An ID token is returned when the `openid` scope is requested.
In addition to claims relating to the authentication of the user, the ID token may contain additional claims about the user’s identity, e.g. `name` and `gender` (or this information will be obtainable from the userinfo endpoint - see “What are scopes, claims and audiences and how do they interact?”).
What is an authorization code?
The OAuth 2 spec details this well.
Essentially it’s a code that is exchanged for an access token. The client receives the code after the `RO` authenticates on the `AS`. The code is then exchanged for an access token. See the next question for why this is the case.
Authorization codes are only used once and are short lived (the recommended lifetime is less than 10 minutes).
Authorization codes are bound to the client identifier and the redirect URI (this is enforced by the `AS` - when exchanging the authorization code for an access token, the client id and redirect URI must be sent, too).
The standard token endpoint is used for this exchange. The client must be authorized and the `authorization_code` grant type must be used. See the spec for details.
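A rough sketch of that exchange, assuming a confidential client and placeholder endpoint, client and redirect URI values:

```python
# Exchange an authorization code for tokens at the token endpoint. The client id
# and redirect_uri must match those used in the original authorize request.
import requests

TOKEN_ENDPOINT = "https://idp.example.com/token"  # hypothetical

def exchange_code(code: str) -> dict:
    response = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": "https://app.example.com/callback",  # placeholder
            "client_id": "my-client-id",                         # placeholder
        },
        auth=("my-client-id", "my-client-secret"),  # confidential client authentication
    )
    response.raise_for_status()
    return response.json()  # access_token, and possibly id_token and refresh_token
```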
If you’re using the implicit flow (which has been deprecated in OAuth 2.1, and is not recommended by the OAuth 2.0 Security BCP), then the access token is returned directly to the client and the authorization code is skipped. This is a good post on [why to avoid the implicit flow](https://developer.okta.com/blog/2019/05/01/is-the-oauth-implicit-flow-dead).
Why not return the access token straight away, instead of exchanging an authorization code?
This has a few big security benefits, according to the OAuth 2.0 spec:
- Ability to authenticate the client (because the token endpoint requires client authentication)
- The access token doesn’t have to be passed via the `RO`’s user agent, avoiding the token being saved in local proxies or browser history
How does PKCE factor in?
RFC 7636 is the specification for PKCE.
PKCE stands for Proof Key for Code Exchange. It is used when public clients (i.e. those that can’t store a secret, e.g. SPAs or mobile apps) need to exchange an authorization code. PKCE makes this exchange a lot more secure.
Although the spec is aimed at public clients, OAuth 2.1 actually states that all clients (including confidential clients) should use PKCE for authorization code exchange.
This post on the oauth.com website explains the process well. In summary:
- The client creates a cryptographically random `code_verifier` string
- This `code_verifier` string is used to create a `code_challenge` string, which is typically a SHA256 hash of the `code_verifier` string (or, if the client doesn’t support SHA256 hashing, it can skip the hashing and use the plain value)
- `code_challenge` and `code_challenge_method` query string parameters are added to the authorize request
- The `AS` will associate the `code_challenge` with the authorization code it issues
- On exchange, the `code_verifier` string is sent from the client to the `AS`, along with the authorization code. If the `code_verifier` matches the one submitted before (or its hash does), then an access token is returned
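Generating the `code_verifier` and `code_challenge` (S256 method) might look something like this - the helper name is mine, but the parameter names come from the spec:

```python
# A sketch of generating a PKCE code_verifier and code_challenge (S256 method),
# following RFC 7636.
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    # Cryptographically random verifier (must be 43-128 unreserved characters)
    code_verifier = secrets.token_urlsafe(64)
    # code_challenge = BASE64URL(SHA256(code_verifier)) with padding stripped
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    code_challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return code_verifier, code_challenge

verifier, challenge = make_pkce_pair()
# The authorize request includes code_challenge and code_challenge_method=S256;
# the later token request includes the original code_verifier.
```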
Where do state and nonce come in?
`state` is an optional parameter defined in the OAuth 2.0 spec and can be sent to the authorization endpoint from the client. The `AS` doesn’t do anything with the state parameter value, other than return it to the client in the response query string (along with the authorization code). The state is also returned for error responses. Clients must use the `state` parameter to implement CSRF protection, unless they are using the `code_challenge` parameter (see PKCE).
`nonce` is an optional parameter (OIDC 3.1.2.1), although mandatory for the implicit and hybrid flows (3.2.2.1), defined by the OIDC spec. It is passed from the client to the `AS` and put into the ID token. The `AS` shouldn’t perform any additional processing on this value. It is used to mitigate replay attacks (the client will check that the value in the ID token matches the one originally sent in the authentication request).
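As a sketch, generating and checking `state` and `nonce` might look like this - the session dictionary is just a stand-in for whatever session mechanism your framework provides:

```python
# Generate state (CSRF protection) and nonce (replay protection) before the
# authorize request, then verify them when the response comes back.
import secrets

session = {}  # placeholder for per-user session storage

def begin_login() -> dict:
    session["state"] = secrets.token_urlsafe(32)
    session["nonce"] = secrets.token_urlsafe(32)
    # Both values are added to the authorize request parameters.
    return {"state": session["state"], "nonce": session["nonce"]}

def handle_callback(returned_state: str, id_token_nonce: str) -> None:
    if returned_state != session.pop("state", None):
        raise ValueError("state mismatch - possible CSRF")
    if id_token_nonce != session.pop("nonce", None):
        raise ValueError("nonce mismatch - possible token replay")
```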
Daniel Fett has an excellent post on the relative merits of PKCE, state and nonce for mitigating various attack vectors. The conclusion is that PKCE or state should be used, in addition to nonce if using OIDC.
Why must I validate URL redirects?
This is to prevent an “open redirector” attack, which can result in authorization code or access token exfiltration, where an attacker can have tokens sent to a domain that they control.
The OAuth 2.0 spec says that redirect URIs should be registered and compared for confidential clients, and must be for public clients. The security BCP, and therefore the OAuth 2.1 spec, says that all clients must implement exact string matching when validating redirect URIs.
Redirect URIs must be HTTPS (although there is an exception in the original OAuth 2.0 spec for some scenarios when returning authorization codes - see below). This exception has been removed in OAuth 2.1. The Financial-grade API read-only profile (FAPI - built on top of OAuth and OIDC) does specify that redirect URIs must be over HTTPS, perhaps to remove the same exception.
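Exact string matching really does mean exact - as a sketch, an authorization server might do something like this (the registered URI is a placeholder):

```python
# Exact-string redirect URI validation, as required by the Security BCP / OAuth 2.1.
REGISTERED_REDIRECT_URIS = {
    "https://app.example.com/callback",  # placeholder registration
}

def is_valid_redirect_uri(requested_uri: str) -> bool:
    # Exact comparison only - no prefix, wildcard or substring matching, which is
    # the kind of loose matching that enables open-redirector style attacks.
    return requested_uri in REGISTERED_REDIRECT_URIS
```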
Should TLS be used for all requests?
Yes - TLS is vitally important to OAuth and OpenID. As far as the specs go:
OAuth 2.0 - 3.1 - TLS must be used when sending requests to the authorize endpoint
OAuth 2.0 - 3.1.2.1 - redirection endpoints should require the use of TLS. It’s not a must because (according to the OAuth 2.0 spec) “at the time of this writing, requiring clients to deploy TLS is a significant hurdle for many client developers”. It says the user should be warned about the insecure endpoint prior to redirection
OAuth 2.0 - 3.2 - TLS must be used when sending requests to the token endpoint
OAuth 2.0 - 10.3 - Access tokens must only be transmitted using TLS
OAuth 2.0 - 10.4 - Refresh tokens must only be transmitted using TLS
OAuth 2.0 - 10.5 - TLS should be used when returning authorization codes, and must be used if the client relies on the authorization code for its own resource owner authentication
OAuth 2.0 - 10.9 - All requests to the authorization and token endpoints must require TLS
OAuth 2.0 - 10.11 - TLS must be used on every endpoint used for end-user interaction
OIDC 1.0 - 3.1.2 - TLS must be used for the authorization endpoint
OIDC 1.0 - 3.1.3 - TLS must be used for the token endpoint
OIDC 1.0 - 5.3 - TLS must be used for the userinfo endpoint
OAuth 2.1 - 7.4.2 - “The authorization server MUST implement TLS”
OAuth 2.1 - 7.4.3.3 - “Clients MUST always use TLS (https) or equivalent transport security when making requests with bearer tokens”
OAuth 2.1 - 9.7.1 - Loopback interface redirect URIs (traffic to localhost) can use HTTP as the HTTP request never leaves the device
OAuth 2.1 - 9.8 - “The transmission of authorization codes MUST be made over a secure channel, and the client MUST require the use of TLS with its redirect URI if the URI identifies a network resource”
As you can see, TLS is everywhere. The exception for authorization codes in some scenarios (OAuth 2.0 section 10.5) has been removed from the OAuth 2.1 spec (see section 9.8).
Note that in FAPI, loopback interface redirection for native apps is not allowed for production systems (the one remaining exception where HTTP is acceptable in OAuth 2.1 - 9.7.1).
Which RFCs and BCPs are in play here?
There are many in play for OAuth 2.0. OAuth 2.1 “flattens” some of these into a single document.
Instead of listing them all here, I refer to Aaron Parecki’s excellent post that explains how they fit together.
What happened to OAuth 1.0?
In summary, aspects of OAuth 1.0 are challenging to implement, and the spec doesn’t cover some use cases, for example non-browser-based apps.
The spec isn’t dead - it’s still used by some sites, including Twitter.
For more information, see: