Aurora Scheduler lets you use an Apache Mesos cluster as a private cloud. It supports running long-running services, cron jobs, and ad-hoc jobs. Aurora aims to make it extremely quick and easy to take a built application and run it on machines in a cluster, with an emphasis on reliability. It provides basic operations to manage services running in a cluster, such as rolling upgrades.
To very concisely describe Aurora, it is like a distributed monit or distributed supervisord that you can instruct to do things like run 100 of these, somewhere, forever.
Aurora Scheduler is a reboot of Apache Aurora that seeks to continue its development after the latter entered the Apache Attic. That said, the project is largely in maintenance mode. We will continue to try to provide quality-of-life updates to the codebase, but we don't anticipate landing any large new features.
Furthermore, because fewer contributors are available, focus will be turned to the scheduler. Anyone who depends on tooling outside of the scheduler should look at taking up maintenance of those tools.
Changes made to the scheduler will always strive to be compatible with existing tools but compatibility is not guaranteed. More importantly, in many cases we will not be testing against such tools so it is up to users to report incompatible changes. Tools in this case also include the original Python2 client.
Aurora is built for users and operators.
User-facing Features:
- Management of long-running services
- Cron jobs
- Resource quotas: provide guaranteed resources for specific applications
- Rolling job updates, with automatic rollback
- Multi-user support
- Sophisticated DSL: supports templating, allowing you to establish common patterns and avoid redundant configurations
- Dedicated machines: for things like stateful services that must always run on the same machines
- Service registration: announce services in ZooKeeper for discovery by various clients
- Scheduling constraints to run on specific machines, or to mitigate impact of issues like machine and rack failure
Under the hood, to help you rest easy:
- Preemption: important services can 'steal' resources when they need them
- High-availability: resists machine failures and disk failures
- Scalable: proven to work in data center-sized clusters, with hundreds of users and thousands of jobs
- Instrumented: a wealth of information makes it easy to monitor and debug
Aurora can take over for most uses of software like monit and chef. Aurora can manage applications, while these tools are still useful to manage Aurora and Mesos themselves.
However, if you have very specific scheduling requirements, or are building a system that looks like a scheduler itself, you may want to explore developing your own framework.
Aurora supports multiple HTTP authentication mechanisms, controlled by the `-http_authentication_mechanism` flag.
The Web UI can be protected using OAuth2 Authorization Code Flow with any OIDC-compatible provider (e.g. Keycloak, Okta, Auth0).
How it works:
- Unauthenticated browser requests to the Web UI are redirected to the identity provider login page.
- After successful login the provider redirects back to `/oauth2/callback`.
- The scheduler exchanges the authorization code for tokens, fetches the user's `sub` and `email` from the userinfo endpoint, and issues a signed HMAC-SHA256 session cookie (`aurora_token` by default).
- Subsequent requests carry the session cookie and are admitted without another round-trip to the provider.
- Paths listed in `-oauth2_exclude_paths` (default: `/api,/vars,/health,/apiclient`) bypass OAuth2 entirely, so Thrift API clients and monitoring probes continue to work without browser credentials.
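The session-cookie step can be illustrated with a minimal Python sketch of HMAC-SHA256 signing and verification. It is not the scheduler's Java code: the claim layout (`sub`, `email`, `exp`), the `payload.signature` cookie encoding, and the function names `issue_cookie` and `verify_cookie` are assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64encode(data: bytes) -> str:
    """URL-safe base64 without padding, so the value is cookie-friendly."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64decode(text: str) -> bytes:
    """Inverse of _b64encode: restore the padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, payload: bytes) -> bytes:
    """HMAC-SHA256 over the payload, keyed with the shared secret."""
    return hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).digest()


def issue_cookie(secret, sub, email, ttl_secs=28800, now=None):
    """Build a signed cookie value (hypothetical layout: payload.signature)."""
    issued_at = int(time.time()) if now is None else int(now)
    claims = {"sub": sub, "email": email, "exp": issued_at + ttl_secs}
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return _b64encode(payload) + "." + _b64encode(_sign(secret, payload))


def verify_cookie(secret, cookie, now=None):
    """Return the claims if the signature is valid and unexpired, else None."""
    current = int(time.time()) if now is None else int(now)
    try:
        payload_b64, signature_b64 = cookie.split(".")
        payload = _b64decode(payload_b64)
        signature = _b64decode(signature_b64)
    except ValueError:
        return None  # malformed cookie
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature, _sign(secret, payload)):
        return None
    claims = json.loads(payload)
    if claims["exp"] <= current:
        return None  # session expired
    return claims
```

Because verification only needs the shared secret, a request carrying a valid cookie can be admitted locally, which is why no further round-trip to the provider is required.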
Required flags:
| Flag | Description |
|---|---|
| `-http_authentication_mechanism=OAUTH2` | Enable OAuth2 mode |
| `-oauth2_issuer_url` | OIDC issuer base URL, e.g. `https://keycloak.example.com/realms/myrealm` |
| `-oauth2_client_id` | Client ID registered in the identity provider |
| `-oauth2_client_secret` | Client secret |
| `-oauth2_redirect_uri` | Callback URL registered in the provider, e.g. `https://aurora.example.com/oauth2/callback` |
| `-oauth2_jwt_secret` | Random string (≥ 32 chars) used to sign session cookies with HMAC-SHA256 |
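
As a hedged illustration, the Python sketch below generates a value suitable for `-oauth2_jwt_secret` and runs a pre-flight check over a list of startup flags. The helper names `generate_jwt_secret` and `check_oauth2_flags` are hypothetical, and the check is an operator-side convenience, not a description of how the scheduler validates its own flags.

```python
import secrets

# Flags the table above lists as required (besides the mechanism itself).
REQUIRED_FLAGS = (
    "-oauth2_issuer_url",
    "-oauth2_client_id",
    "-oauth2_client_secret",
    "-oauth2_redirect_uri",
    "-oauth2_jwt_secret",
)
MIN_JWT_SECRET_CHARS = 32


def generate_jwt_secret(num_bytes=32):
    """Return a random URL-safe string; 32 random bytes encode to 43 chars."""
    return secrets.token_urlsafe(num_bytes)


def check_oauth2_flags(args):
    """Return a list of human-readable problems; empty means the flags look OK."""
    flags = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if sep:
            flags[name] = value

    problems = []
    if flags.get("-http_authentication_mechanism") != "OAUTH2":
        problems.append("-http_authentication_mechanism must be OAUTH2")
    for name in REQUIRED_FLAGS:
        if not flags.get(name):
            problems.append(f"{name} is missing or empty")
    jwt_secret = flags.get("-oauth2_jwt_secret", "")
    if jwt_secret and len(jwt_secret) < MIN_JWT_SECRET_CHARS:
        problems.append(
            f"-oauth2_jwt_secret must be at least {MIN_JWT_SECRET_CHARS} chars"
        )
    return problems
```

For example, `check_oauth2_flags(sys.argv[1:])` could be run in a wrapper script before launching the scheduler, printing any returned problems and aborting if the list is non-empty.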
Optional flags:
| Flag | Default | Description |
|---|---|---|
| `-oauth2_exclude_paths` | `/api,/vars,/health,/apiclient` | Comma-separated path prefixes that bypass OAuth2 |
| `-oauth2_cookie_name` | `aurora_token` | Name of the session cookie |
| `-oauth2_session_timeout_secs` | `28800` (8 hours) | Session cookie validity in seconds |
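
To make the prefix-based bypass concrete, here is a small Python sketch of how a comma-separated `-oauth2_exclude_paths` value might be parsed and applied. It assumes plain string-prefix matching, which the flag description suggests but does not spell out; the function names are hypothetical.

```python
DEFAULT_EXCLUDE_PATHS = "/api,/vars,/health,/apiclient"


def parse_exclude_paths(flag_value):
    """Split the comma-separated flag value, dropping blanks and whitespace."""
    return [p.strip() for p in flag_value.split(",") if p.strip()]


def bypasses_oauth2(request_path, prefixes):
    """True if the request path starts with any excluded prefix.

    Assumption: plain string-prefix matching. Under this rule a prefix
    such as "/api" also covers "/api/anything" and "/apiclient".
    """
    return any(request_path.startswith(prefix) for prefix in prefixes)
```

With the default value, `bypasses_oauth2("/vars", parse_exclude_paths(DEFAULT_EXCLUDE_PATHS))` is true while a Web UI path such as `/scheduler` is not excluded and would go through the login redirect.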
Example startup flags:
-http_authentication_mechanism=OAUTH2
-oauth2_issuer_url=https://keycloak.example.com/realms/myrealm
-oauth2_client_id=aurora-scheduler
-oauth2_client_secret=<secret>
-oauth2_redirect_uri=https://aurora.example.com/oauth2/callback
-oauth2_jwt_secret=<random-string-at-least-32-chars>
Keycloak client configuration checklist:
- Client protocol: `openid-connect`
- Access type: `confidential`
- Valid redirect URIs: must include your `-oauth2_redirect_uri` value
- Scopes: `openid`, `email`, `profile`
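
The checklist values end up in the authorization request that the browser is redirected to. The Python sketch below builds such a request using the standard OAuth2 Authorization Code parameters and reads the `code` back from the callback. The function names are hypothetical, and the use of a `state` parameter is an assumption based on common OAuth2 practice; this document does not say whether the scheduler sends one.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorization_endpoint, client_id, redirect_uri,
                            state, scopes=("openid", "email", "profile")):
    """Compose the login redirect for the Authorization Code Flow."""
    query = urlencode({
        "response_type": "code",    # ask the provider for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # scopes are space-separated in OAuth2
        "state": state,             # assumption: opaque anti-CSRF value
    })
    return f"{authorization_endpoint}?{query}"


def parse_callback(callback_url, expected_state):
    """Extract the authorization code from the provider's redirect."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    code = params.get("code", [None])[0]
    if code is None:
        raise ValueError("missing authorization code")
    return code
```

The `redirect_uri` sent here has to match one of the client's valid redirect URIs exactly, which is why the checklist asks you to register the `-oauth2_redirect_uri` value in the provider.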
Notes:
- No new external libraries are required. Token exchange, discovery, and userinfo calls use the Java 11 built-in `java.net.http.HttpClient`. Session cookies use `javax.crypto.Mac` (`HmacSHA256`).
- OIDC endpoints are resolved dynamically via `/.well-known/openid-configuration` (`authorization_endpoint`, `token_endpoint`, `userinfo_endpoint`), so non-Keycloak providers are supported without hardcoded paths.
- Production/remote deployments should use HTTPS for `-oauth2_issuer_url` and `-oauth2_redirect_uri`. Local development may use loopback HTTP (`localhost`, `127.0.0.1`, `::1`).
- OAuth2 cookies are emitted with `HttpOnly` and are marked `Secure` when the request is TLS (`request.isSecure`) or forwarded as HTTPS (`X-Forwarded-Proto=https`).
- When OAUTH2 is active, Shiro-based authentication (BASIC / NEGOTIATE) is not installed. The Thrift API paths are excluded from OAuth2 by default and rely on network-level security.
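
Several of these notes describe simple rules that can be sketched in a few lines. The Python below is an illustration only, with hypothetical function names; the scheduler itself implements these rules in Java, and its exact behavior (for example trailing-slash handling or header case) may differ from the assumptions made here.

```python
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}
ENDPOINT_KEYS = ("authorization_endpoint", "token_endpoint", "userinfo_endpoint")


def discovery_url(issuer_url):
    """Append the OIDC discovery path to the issuer base URL."""
    return issuer_url.rstrip("/") + "/.well-known/openid-configuration"


def extract_endpoints(discovery_doc):
    """Pick the three endpoints out of an already-parsed discovery document."""
    missing = [key for key in ENDPOINT_KEYS if key not in discovery_doc]
    if missing:
        raise ValueError("discovery document lacks: " + ", ".join(missing))
    return {key: discovery_doc[key] for key in ENDPOINT_KEYS}


def is_acceptable_url(url):
    """HTTPS is always fine; plain HTTP only for loopback hosts."""
    parsed = urlparse(url)
    if parsed.scheme == "https":
        return True
    return parsed.scheme == "http" and parsed.hostname in LOOPBACK_HOSTS


def should_mark_secure(request_is_secure, forwarded_proto=None):
    """Decide on the Secure cookie attribute.

    Assumption: the X-Forwarded-Proto value is compared case-insensitively.
    """
    return bool(request_is_secure) or (forwarded_proto or "").lower() == "https"
```

For the example issuer used earlier, `discovery_url("https://keycloak.example.com/realms/myrealm")` yields `https://keycloak.example.com/realms/myrealm/.well-known/openid-configuration`. Note that trusting `X-Forwarded-Proto` only makes sense when the scheduler sits behind a proxy that sets the header itself.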
If you have questions that aren't answered in our documentation, you can reach out to the maintainers via Slack: #aurora on mesos.slack.com. Invites to our Slack channel may be requested via mesos-slackin.herokuapp.com.
You can also file bugs/issues in our GitHub repo.
Except as otherwise noted this software is licensed under the Apache License, Version 2.0
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
