Introduction
The goals of this page are to:
- Remain brief and up-to-date: short enough to read in one sitting, updated and versioned with the code.
- Provide an overview and a snapshot of project status for team members and newcomers.
- Provide links to other project resources.
This page does not replace or duplicate information in JIRA, Bugzilla, enhancement proposals, or the enhancement proposal process.
Architecture Summary
Log categories
We define three logging categories:
- Application: Container logs from non-infrastructure containers.
- Infrastructure: Node logs and container logs from infrastructure containers.
- Audit: Node logs from the node audit system.
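As a minimal sketch of how a collected record might be routed into one of these categories. The namespace prefixes used to identify infrastructure containers, and the record field names, are assumptions for illustration only; they are not defined on this page.
```python
# Hypothetical classifier; the namespace prefixes and field names below
# are assumptions used only for illustration.
INFRA_NAMESPACE_PREFIXES = ("kube-", "openshift-")

def categorize(record: dict) -> str:
    """Return the logging category for a collected log record."""
    if record.get("source") == "audit":
        return "audit"
    namespace = record.get("namespace")
    if namespace is None:
        return "infrastructure"          # node logs (e.g. journald) have no namespace
    if namespace.startswith(INFRA_NAMESPACE_PREFIXES):
        return "infrastructure"
    return "application"

print(categorize({"namespace": "my-app"}))                 # application
print(categorize({"namespace": "openshift-monitoring"}))   # infrastructure
print(categorize({"source": "audit"}))                     # audit
```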
Components
The logging system breaks down into 4 logical components:
- collector: Reads container log data from each node.
- forwarder: Forwards log data to configured outputs.
- store: Stores log data for analysis. This is the default output for the forwarder.
- exploration: UI tools (GUI and command line) to search, query and view stored logs.
Operators and Custom Resources
The cluster logging operator (CLO) implements the following custom resources:
- ClusterLogging (CL): Deploys the collector and forwarder, which are currently both implemented by a daemonset running Fluentd on each node.
- ClusterLogForwarder (CLF): Generates the Fluentd configuration to forward logs per user configuration.
The elasticsearch logging operator (ELO) implements the following custom resources:
- ElasticSearch: Configures and deploys an Elasticsearch instance as the default log store.
- Kibana: Configures and deploys a Kibana instance to search, query and view logs.
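As a rough illustration of the forwarding configuration a user provides, the dictionary below sketches what a ClusterLogForwarder instance might contain, expressed as a Python literal. The field names (outputs, pipelines, inputRefs, outputRefs) and the API version are assumptions for illustration, not the authoritative schema.
```python
# Illustrative only: field names and the API version are assumptions,
# not the authoritative ClusterLogForwarder schema.
cluster_log_forwarder = {
    "apiVersion": "logging.openshift.io/v1",   # assumed API group/version
    "kind": "ClusterLogForwarder",
    "metadata": {"name": "instance", "namespace": "openshift-logging"},
    "spec": {
        # Named outputs the forwarder can send to (the default store needs no entry).
        "outputs": [
            {"name": "remote-syslog", "type": "syslog", "url": "tls://syslog.example.com:514"},
        ],
        # Pipelines connect log categories (inputs) to outputs.
        "pipelines": [
            {"inputRefs": ["application"], "outputRefs": ["remote-syslog"]},
            {"inputRefs": ["infrastructure", "audit"], "outputRefs": ["default"]},
        ],
    },
}
```
The CLO would turn a specification like this into the generated Fluentd forwarding configuration described above.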
Runtime behavior
The container runtime interface (CRI-O) on each node writes container logs to files. The file names include the container’s UID, namespace, name and other data. We also collect per-node logs from the Linux journald.
The CLO deploys a Fluentd daemon on each node, which acts both as a collector (reading log files) and as a forwarder (sending log data to configured outputs).
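As a rough sketch of how a collector can recover that metadata, the example below parses a container log file name of the form commonly found under /var/log/containers. The exact naming convention is an assumption here, not something this page specifies.
```python
import re

# Assumed file name convention: <pod>_<namespace>_<container>-<container-id>.log
# (illustrative; check the node's actual /var/log/containers layout).
LOG_NAME = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)

def parse_container_log_name(path: str) -> dict:
    """Extract pod, namespace, container name and container id from a log file name."""
    name = path.rsplit("/", 1)[-1]
    match = LOG_NAME.match(name)
    if not match:
        raise ValueError(f"unrecognized container log file name: {name}")
    return match.groupdict()

print(parse_container_log_name(
    "/var/log/containers/my-pod_my-namespace_my-container-" + "0" * 64 + ".log"
))
```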
Log Entries
The term log is overloaded so we’ll use these terms to clarify:
- Log: A stream of text containing a sequence of log entries.
- Log Entry: Usually a single line of text in a log, but see Multi-line Entries.
- Container Log: Produced by CRI-O, combining stdout/stderr output from processes running in a single container.
- Node Log: Produced by journald or another per-node agent from non-containerized processes on the node.
- Structured Log: A log where each entry is a JSON object (map), written as a single line of text.
Kubernetes does not enforce a uniform format for logs. Anything that a containerized process writes to stdout or stderr is considered a log. This "lowest common denominator" approach allows pre-existing applications to run on the cluster.
Traditional log formats write entries as ordered fields, but the order, field separator, format and meaning of those fields vary.
Structured logs write log entries as JSON objects on a single line, but the names, types and meaning of fields in the JSON object vary between applications.
The Kubernetes Structured Logging proposal will standardize the log format for some k8s components, but there will still be diverse log formats from non-k8s applications running on the cluster.
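Because of that diversity, a forwarder that wants to treat structured logs specially has to detect them per entry. Below is a minimal, hedged sketch of such a check, not how any particular collector implements it.
```python
import json

def parse_structured(entry: str):
    """Return the entry as a dict if it is a single-line JSON object, else None."""
    try:
        value = json.loads(entry)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None

print(parse_structured('{"level": "info", "msg": "started"}'))  # {'level': 'info', 'msg': 'started'}
print(parse_structured("plain text line"))                      # None
```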
Metadata, Envelopes and Forwarding
Metadata is additional data about a log entry (original host, container-id, namespace etc.) that we add as part of forwarding the logs. We use these terms for clarity:
- Message: The original, unmodified log entry.
- Envelope: Includes metadata fields and a message field containing the original entry.
We usually use JSON notation for the envelope since it’s the most widespread convention.
However, we do and will implement other output formats; for example, a syslog message with its MSG and STRUCTURED-DATA sections is a different way to encode the equivalent envelope data.
Depending on the output type, we may forward entries as the message only, the full envelope, or the user's choice.
The current metadata model is documented here. Model documentation is generated from a formal model.
Not all of the documented model is in active use. Review is needed.
The labels field is "flattened" before forwarding to Elasticsearch.
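As a minimal sketch of the envelope idea, assuming illustrative field names (message, hostname, kubernetes.namespace_name, and so on) rather than the documented metadata model:
```python
import json
import socket
from datetime import datetime, timezone

def make_envelope(message: str, namespace: str, container_id: str) -> str:
    """Wrap an original log entry in an envelope with forwarding metadata.

    Field names here are illustrative; the supported set is defined by the
    formal metadata model, not by this sketch.
    """
    envelope = {
        "message": message,                      # the original, unmodified entry
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "hostname": socket.gethostname(),        # original host
        "kubernetes": {
            "namespace_name": namespace,
            "container_id": container_id,
            "labels": {"app": "my-app"},         # kept as a map in the envelope
        },
    }
    return json.dumps(envelope)                  # JSON notation, one entry per line

print(make_envelope("user login failed", "my-namespace", "0" * 64))
```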
Multi-line Entries
Log entries are usually a single line of text, but they can consist of more than one line for several reasons:
- CRI-O: CRI-O reads chunks of text from applications, not single lines. If a line gets split between chunks, CRI-O writes each part as a separate line in the log file with a "partial" flag so they can be correctly re-assembled (see the sketch after this list).
- Stack traces: Programs in languages like Java, Ruby or Python often dump multi-line stack traces into the log. The entire stack trace needs to be kept together when forwarded to be useful.
- JSON objects: A JSON object can be written on multiple lines, although structured logging libraries typically don't do this.
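A minimal sketch of that reassembly, assuming a line layout of "<timestamp> <stream> <P|F> <content>", where P marks a partial fragment and F the full or final one. The exact layout is an assumption here.
```python
def reassemble(lines):
    """Yield complete log entries from CRI-O style partial/full lines.

    Assumes each line looks like: "<timestamp> <stdout|stderr> <P|F> <content>",
    where P means "partial, more to come" and F means "full/final".
    """
    pending = []
    first_timestamp = None
    for line in lines:
        timestamp, stream, flag, content = line.rstrip("\n").split(" ", 3)
        if not pending:
            first_timestamp = timestamp
        pending.append(content)
        if flag == "F":                      # final fragment: emit the whole entry
            yield first_timestamp, stream, "".join(pending)
            pending = []

sample = [
    "2024-01-01T00:00:00.0Z stdout P this line was split ",
    "2024-01-01T00:00:00.1Z stdout F across two chunks",
    "2024-01-01T00:00:01.0Z stderr F a complete line",
]
for entry in reassemble(sample):
    print(entry)
```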
Work in progress
Proposals under review
Flow Control/Back-pressure
Status: Needs write-up as enhancement proposal(s).
TODO: Updated diagram showing CRI-O to Fluentd end-to-end.
Goals:
- Sustain a predictable average load without log loss, up to retention limits.
- Handle temporary load spikes predictably: drop or apply back-pressure.
- Handle long-term overload with alerts and predictable log loss at the source.
Problems now:
- Uncontrolled (file-at-a-time) log loss from slow collection combined with node log rotation.
- Large back-up in file buffers under load: very high latencies, slow recovery.
We propose two qualities of service (see the sketch below):
- fast: Priority is low latency and high throughput, with no effect on applications. May drop data.
- reliable: Priority is to avoid data loss, though loss may still occur outside of defined limits. May slow application progress.
In traditional terminology, fast is at-most-once and reliable is at-least-once with limits. Even users who want reliable logging may have a breaking point where they'd rather let the application progress and lose logs. We may need configurable limits on how hard we try to be reliable.
Architecture:
- conmon writes container log files on the node; log rotation provides retention.
- Fluentd on the node: file vs. memory buffers.
- Forwarder target: throughput.
- Store target: retention.
- Future: separate normalizer/forwarder, Fluent Bit/Fluentd.
We must also consider data loss per forwarding protocol:
- store (Elasticsearch): review the options.
- fluent-forward: needs at-least-once acks enabled (we don't enable them today).
- others: review case by case whether it's possible.
Throughput and latency:
- Evaluate the throughput of each stage, from node log to store/target.
- Evaluate end-to-end latency and its expected/acceptable variation.
Buffer sizes:
- All components must maintain bounded buffers.
- Without end-to-end back-pressure we cannot guarantee no data loss.
- We should be able to give better sizing/capacity guidelines.
We need well-designed (accurate, no floods, no noise) alerts for log loss and back-pressure situations.
Configuration:
- Enable back-pressure by pod label and/or namespace. We probably can't impose back-pressure everywhere.
- Enable rate limiting in low-latency mode (back-pressure always limits rate).
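To make the fast vs. reliable distinction concrete, here is a minimal sketch of a bounded buffer with the two policies. It is a conceptual illustration only, not how Fluentd or any collector is actually configured.
```python
from collections import deque
import threading

class BoundedLogBuffer:
    """Bounded buffer illustrating the two qualities of service.

    - "fast": never block the producer; drop the oldest entry when full (at-most-once).
    - "reliable": block the producer (back-pressure) until space is available
      (at-least-once, at the cost of slowing the application).
    """

    def __init__(self, capacity: int, qos: str = "fast"):
        self.capacity = capacity
        self.qos = qos
        self.entries = deque()
        self.dropped = 0                      # would feed a "log loss" alert/metric
        self.not_full = threading.Condition()

    def put(self, entry: str) -> None:
        with self.not_full:
            if len(self.entries) >= self.capacity:
                if self.qos == "fast":
                    self.entries.popleft()    # drop oldest, keep the producer moving
                    self.dropped += 1
                else:                         # "reliable": apply back-pressure
                    self.not_full.wait_for(lambda: len(self.entries) < self.capacity)
            self.entries.append(entry)

    def get(self) -> str:
        with self.not_full:
            entry = self.entries.popleft()    # caller is assumed to check for entries first
            self.not_full.notify()            # wake a producer waiting on back-pressure
            return entry

buf = BoundedLogBuffer(capacity=2, qos="fast")
for i in range(5):
    buf.put(f"entry {i}")
print(buf.dropped)  # 3 entries dropped under the "fast" policy
```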
Error Reporting
The logging system itself can encounter errors that need to be diagnosed, for example:
- Invalid JSON received where structured logs are required.
- Hard (no retry possible) errors from the store or another target, causing unavoidable log loss.
Alerts are a key component, but alerts must be actionable; they can't be used to record ongoing activity that might or might not be reviewed later. For that we need logs.
The CLO and Fluentd collector logs can be captured just like any other infrastructure log. However, if the logging system itself is in trouble, users need a simple, direct path to diagnose the issue. This path might have a simpler implementation that is more likely to survive if logging is in trouble.
Proposal: add a 4th logging category [application, infrastructure, audit, logging]. This category collects logs related to errors in the logging system, including Fluentd error messages and errors logged by the CLO.
Document Metadata
Decide on the supported set of envelope metadata fields and document them.
Some of our format decisions are specific to Elasticsearch (e.g. flattening maps to lists). We need to separate the ES specifics; either:
- Include sufficient output format configuration to cover everything we need for ES (map flattening), OR
- Move the ES-specific formatting into the Elasticsearch output type.
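For illustration, here is one plausible way to flatten a labels map into a list of key=value strings; the actual ES output format is not specified on this page.
```python
def flatten_labels(labels: dict) -> list:
    """Flatten a labels map into a list of "key=value" strings.

    One plausible flattening, shown for illustration; the real ES output
    format is defined by the forwarder, not by this sketch.
    """
    return [f"{key}={value}" for key, value in sorted(labels.items())]

print(flatten_labels({"app": "my-app", "tier": "backend"}))
# ['app=my-app', 'tier=backend']
```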
Multi-line support
- Cover common stack trace types: Java, Ruby, Python, Go.
- Review the need for multi-line JSON.
Syslog metadata
Optionally copy metadata to syslog STRUCTURED-DATA.
Loki as store
- Benchmarking & stress testing in progress.
- Configuring Loki at scale.
- Test with back ends: S3, BoltDB.
Observability/Telemetry
TODO
Updating this page
The asciidoc source for this document is on GitHub. Create a GitHub Pull Request to request changes.
Resources
- The Enhancement Proposal Process is how we document & discuss designs.
- Cluster Logging Enhancement Proposals for CLO and ELO.
- JIRA project LOG tracks feature work.
- Bugzilla tracks bugs.
- Formal Model and documentation/code generators.