「Loki」- Grafana Loki OSS | Log Collection

These notes follow the official documentation. They record what we have learned about Grafana Loki: its concepts and terminology, deployment, usage, and maintenance, along with solutions to common problems.

Overview

Website: https://grafana.com/oss/loki/
Docs: https://grafana.com/docs/loki/latest/?pg=oss-loki&plcmt=quick-links
Repository: https://github.com/grafana/loki

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream. In short, Loki is a log collection (aggregation) tool.

Components

Design documents | https://grafana.com/docs/loki/latest/community/design-documents/

Server | Architecture and Components

https://grafana.com/docs/loki/latest/get-started/architecture/

Write Path -> Distributor -> Ingester -> Storage
Read Path <- Query Frontend <- Querier <- ( Storage + Ingester )
Ruler <- ( Storage + Ingester )

Distributor

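The first stop on the write path. It performs initial checks (the tenant ID, for example) to reduce the load on the Ingesters, then sends the traffic on to multiple Ingesters. The Distributor is stateless, so a load balancer in front of it can spread traffic evenly.
Validation: checks timestamps, labels, log line length, and so on;
Preprocessing: for example, sorts the labels so that the same label set always produces the same hash;
Rate limiting: limits the ingestion rate per tenant. The per-tenant limit is divided across the Distributors, so for a fixed tenant limit, adding Distributor nodes lowers the rate at which that tenant may write through each individual Distributor. Note that Distributors register with their peers in their own ring to learn how many of them exist; this ring is separate from the Ingester ring;
Forwarding: forwards the data to the Ingesters, which are responsible for the final write;
Replication factor: the Distributor writes each log stream to multiple Ingesters and requires floor(replication_factor / 2) + 1 of those writes to succeed. Loki additionally provides a WAL (write-ahead log) to guard against data loss;
Hashing: the Distributor combines hashing with the replication factor to decide which Ingesters should receive a stream. A stream is hashed from its label set and tenant ID to find the Ingesters that will store it. Each Ingester registers itself in the hash ring together with a set of tokens (random unsigned 32-bit integers). A lookup finds the smallest token greater than the stream's hash, then continues clockwise to the next token that belongs to another Ingester, and so on. Through its tokens, each Ingester is responsible for only part of the hash range;
Quorum consistency: all Distributors use the same hash ring, so a write request can be sent to any Distributor. Loki uses Dynamo-style quorum consistency to keep reads and writes consistent: the Distributor waits for one half plus one of the Ingesters it wrote to (floor(replication_factor / 2) + 1) to respond before it answers the client;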
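The hashing and quorum rules above can be illustrated with a small sketch. This is not Loki's implementation: the function names, the use of SHA-256, and the toy ring are all assumptions made for illustration. It only shows the idea of sorting labels before hashing, walking the ring clockwise, and computing floor(replication_factor / 2) + 1.

```python
import bisect
import hashlib


def quorum(replication_factor: int) -> int:
    """Minimum number of successful Ingester writes: floor(rf / 2) + 1."""
    return replication_factor // 2 + 1


def stream_hash(tenant_id: str, labels: dict[str, str]) -> int:
    """Hash the tenant ID plus the sorted label set into the 32-bit token space.

    Sorting the labels mirrors the preprocessing step: the same label set
    always yields the same hash. SHA-256 is an assumption of this sketch.
    """
    canonical = tenant_id + "|" + ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def pick_ingesters(ring: list[tuple[int, str]], key: int, replication_factor: int) -> list[str]:
    """Walk the ring clockwise from the first token greater than `key`,
    collecting distinct Ingesters until `replication_factor` are found."""
    ring = sorted(ring)
    tokens = [token for token, _ in ring]
    start = bisect.bisect_right(tokens, key)
    chosen: list[str] = []
    for i in range(len(ring)):
        owner = ring[(start + i) % len(ring)][1]
        if owner not in chosen:
            chosen.append(owner)
        if len(chosen) == replication_factor:
            break
    return chosen
```

With a hypothetical ring `[(100, "a"), (200, "b"), (300, "c"), (400, "a"), (500, "b")]`, a key of 150 and a replication factor of 3, the walk starts at token 200 and selects `b`, `c`, `a`; the write then needs `quorum(3)`, that is 2, acknowledgements.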

Ingester

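Writes log data to the backend storage, and returns the log data it holds in memory when queried;
A lifecycler manages the Ingester's lifecycle (although the WAL supersedes this feature). The states are: PENDING (waiting for a handoff from another Ingester); JOINING (inserting its tokens into the ring and initializing itself; it may already receive write requests for the tokens it owns); ACTIVE (fully initialized; it may receive both write and read requests for the tokens it owns); LEAVING (shutting down; it may still receive read requests for data it holds in memory); UNHEALTHY (it has failed to send a heartbeat to Consul; the Distributor sets this state when it periodically checks the ring)
Logs received by the Ingester are first written to memory and then flushed periodically to the backend storage. When a chunk reaches its capacity limit, has not been updated for a long time, or a flush is triggered, it is compressed and marked read-only, and a new writable chunk takes its place. If the Ingester fails before a flush, the data still held in memory is lost (which is why Ingesters support replication to mitigate the problem)
When data is flushed to the backend, the chunk is hashed by tenant, label set, and content, so Ingesters holding the same copy of the data do not write it twice. If a write to one of the replicas failed, however, multiple differing chunk objects are created in the backing store (see Querier for how the data is deduplicated)
Timestamp ordering: logs are subject to an ordering requirement. The Ingester checks the order and rejects, with an error, any entry that breaks time order; accepting out-of-order logs can also be configured. When ordering is enforced, an entry identical to the previous one (same timestamp and same text) is treated as a duplicate and dropped, whereas an entry with the same timestamp but different content is kept;
Handoff: when an Ingester leaves the hash ring or shuts down, it checks whether a new Ingester instance is joining and, if so, tries to hand over its tokens and chunk data. The PENDING state is exactly this wait for a handoff; if the wait times out, the new Ingester goes through the normal join process and inserts new tokens. The handoff exists to avoid flushing every chunk at shutdown, because flushing is slow;
Filesystem support: although Ingesters support writing to the filesystem through BoltDB, this only works in single-process mode, because the Querier needs access to the same backend store and BoltDB allows only one process to hold the lock;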
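The timestamp ordering rule can be written as a small decision function. This is a minimal sketch of the rule as described in these notes, assuming ordering is enforced and each entry is compared only with the previous entry of the same stream; `classify` and the `Entry` type are hypothetical names, not part of Loki.

```python
from typing import Optional, Tuple

Entry = Tuple[int, str]  # (timestamp in nanoseconds, log line)


def classify(last: Optional[Entry], entry: Entry) -> str:
    """Decide what happens to `entry`, given the previous entry of the stream."""
    if last is None:
        return "accept"  # first entry of the stream
    if entry[0] < last[0]:
        return "reject"  # older than the previous entry: breaks time order
    if entry == last:
        return "ignore"  # same timestamp and same text: exact duplicate
    return "accept"      # newer, or same timestamp with different content
```

For example, `classify((5, "a"), (5, "b"))` returns `"accept"` because only the content differs, while `classify((5, "a"), (4, "b"))` returns `"reject"`.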

Query Frontend

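An optional component that speeds up queries; the queries themselves are still executed by the Queriers behind it. The Frontend adjusts incoming queries and holds them in a queue; Queriers pull jobs from the queue, execute them, and return the results to the Frontend (so the Queriers need to know the Frontend's address). It is stateless; running a small number of replicas (2 is enough) provides fault tolerance;
Queueing: ensures that large queries are retried when they fail, which lets administrators under-provision memory or run more small queries; keeps large queries from piling up on the same Querier by distributing them through a FIFO queue; schedules queries fairly between tenants so that one tenant cannot deny service to the others;
Splitting: breaks a large query into multiple smaller queries to reduce load;
Caching metric queries: query results can be cached and reused by later queries. If a cached result is incomplete, the Frontend works out the missing part and queries for it. The Frontend can optionally align queries with their step parameter to make results more cacheable. Results can be cached in several backends (Redis, Memcached, in-memory cache)
Log queries: still under development;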
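Splitting and step alignment can be sketched as follows. This is an illustration only: the function names are hypothetical, the timestamps are plain integers, and cutting sub-ranges at multiples of the interval is an assumption of this sketch rather than a statement about how the Query Frontend does it.

```python
def split_by_interval(start: int, end: int, interval: int) -> list[tuple[int, int]]:
    """Split [start, end) into sub-ranges cut at multiples of `interval`."""
    ranges = []
    cur = start
    while cur < end:
        # Next boundary that is a multiple of `interval`, capped at `end`.
        nxt = min((cur // interval + 1) * interval, end)
        ranges.append((cur, nxt))
        cur = nxt
    return ranges


def align_to_step(start: int, end: int, step: int) -> tuple[int, int]:
    """Round start and end down to multiples of `step`, so that repeated
    queries hit the same boundaries and cached results can be reused."""
    return start - start % step, end - end % step
```

For instance, `split_by_interval(50, 250, 100)` yields `[(50, 100), (100, 200), (200, 250)]`, and `align_to_step(65, 310, 60)` yields `(60, 300)`.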

Querier

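Handles LogQL queries, fetching logs from both the Ingesters and long-term storage;
It first queries the Ingesters for in-memory data, then falls back to running the same query against the backend store;
Because of the replication factor the Querier may receive duplicate data, so it deduplicates entries that have the same nanosecond timestamp, label set, and log message;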

Index Gateway

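With BoltDB Shipper, the Index Gateway is responsible for accessing object storage and serves index queries to the Querier and Ruler over gRPC, which reduces both the load on the object store and its cost.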

Client | Responsible for sending logs to Grafana Loki

—- Promtail
—- Lambda Promtail
—- AWS
—- Docker driver
—- Fluent Bit
—- Fluentd
—- Logstash
—- k6 load testing

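Command-line Program and Configuration File

According to the official documentation, all of Loki's modules are built into a single binary, and a command-line option selects which microservice module to run.

Features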

Alerting and Recording Rules => Alerting and Recording
LogQL => LogQL
Operations (operations and administration topics, including authentication, multi-tenancy, storage, troubleshooting, and so on)
—- Authentication => 3.1 Functions and Features
—- Loki in Grafana
—- Observability
—- Overrides Exporter
—- Scalability
—- Storage => Storage
—- Multi-tenancy
—- Loki Canary
—- Recording Rules
Storage => Storage
Tools
~~Community~~ => Concepts and Fundamentals/Community
~~Maintaining~~ => Concepts and Fundamentals/Maintaining

Log Querying

Query | https://grafana.com/docs/loki/latest/query/
Data can be queried with query statements;
Metrics can be computed from log data;
…

https://grafana.com/docs/loki/latest/reference/loki-http-api/#query-logs-within-a-range-of-time
According to the documentation, start: Loki returns results with timestamp greater or equal to this value; end: Loki returns results with timestamp lower than this value. The time range is therefore half-open: start is inclusive and end is exclusive.
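A minimal sketch of building a range query URL with nanosecond timestamps follows. The base URL `http://localhost:3100`, the helper names, and the shortened selector are assumptions for illustration; the sketch only builds the URL and does not send a request.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_nanos(dt: datetime) -> int:
    """Convert a timezone-aware datetime to a Unix epoch in nanoseconds,
    using integer arithmetic to avoid float rounding."""
    return (dt - EPOCH) // timedelta(microseconds=1) * 1000


def build_query_range_url(base_url: str, query: str, start: datetime,
                          end: datetime, limit: int = 100) -> str:
    """Build a /loki/api/v1/query_range URL; start is inclusive, end is exclusive."""
    params = {
        "query": query,
        "start": to_unix_nanos(start),
        "end": to_unix_nanos(end),
        "limit": limit,
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


# Hypothetical usage: a five-minute window on an assumed local instance.
start = datetime(2025, 11, 20, 17, 0, tzinfo=timezone.utc)
url = build_query_range_url("http://localhost:3100",
                            '{namespace="ingress-nginx"}',
                            start, start + timedelta(minutes=5))
```

Because end is exclusive, two adjacent windows such as 17:00 to 17:05 and 17:05 to 17:10 can be queried back to back without returning the same entry twice.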

HTTP API

Grafana Loki documentation / HTTP API / https://grafana.com/docs/loki/latest/api/

Loki HTTP API | https://grafana.com/docs/loki/latest/reference/loki-http-api/

The HTTP endpoints that Loki provides, and so on;

Troubleshooting and Debugging

https://grafana.com/docs/loki/latest/operations/troubleshooting/

server.log_level:

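Build | Service Configuration

Choosing a Version

As of 05/20/2022 the latest version is 2.5. This is our first deployment, so we are not concerned with the release notes;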

https://grafana.com/docs/loki/latest/release-notes/

Deployment Mode

Installation | Grafana Loki documentation

Install using Tanka (recommended)

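As of 05/23/2022 we are not using this deployment method. Our work covers a large number of systems of many different types, and Tanka/Jsonnet is not a tool that all of them use, so we are not introducing it for now (we will start using it once it is widely adopted);

Simple scalable deployment with Helm

As of 05/23/2022 we are not using this deployment method, so we have not read the related documentation further;

Microservices deployment with Helm

For our environment, we deploy with this method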
Loki cluster
Grafana
Promtail to Loki
Promtail with syslog
Promtail with systemd-journal

Install through Docker or Docker Compose

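As of 05/23/2022 we are not using this deployment method, so we have not read the related documentation further;

Install from source

As of 05/23/2022 we are not using this deployment method, so we have not read the related documentation further;

Testing a binary deployment | Install and run locally

On 08/08/2024 we used this deployment method for testing only, so we do not record the details here;
Download the Loki and Promtail binaries;
Download the configuration files for both programs;
Start the Loki service;
Configure Promtail to scrape logs, then start the Promtail service;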

Monolithic Mode

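All components run in a single process, -target=all
Suitable for testing or small-scale use (<100G); performance depends on the node's capacity; traffic is distributed round-robin across the instances;
Configure memberlist_config so that multiple instances can share the same storage;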

Simple Scalable

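The microservices are grouped into Read, Write, and Backend categories, and scaling up or down happens per category: -target=read -target=write -target=backend
Separating reads from writes brings high availability and better performance;
A load balancer is needed in front to route requests to the matching component;
Supports log volumes at the TB scale, up to a few TB;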

Microservices Mode

所有组件独立运行：ingester distributor query-frontend query-scheduler querier index-gateway ruler compactor
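More complex (although relatively easy to set up on Kubernetes)
Offers better observability and is the most efficient way to run Loki;
Suited to large-scale scenarios and to cases that need a high degree of control; this mode fits Kubernetes deployments best. Jsonnet and distributed Helm chart installations are available.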

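Deployment

WIP

on Kubernetes

This part records how to deploy Grafana Loki in a Kubernetes cluster, along with solutions to related problems;

Upgrading

Upgrading | Grafana Loki documentation

We are not facing an upgrade at the moment, so we are not looking into upgrade-related content for now.
Judging from the official documentation, however, the upgrade guide is lengthy, so upgrading may not be easy;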

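Application | Using the Service

Grafana Loki appears frequently in our technology stack and covers most of our business scenarios, so we are learning it and putting it to use.

We use Promtail to collect logs and Grafana Loki to store them, so that logs are managed centrally (that is, a single Grafana Loki instance).

Scenario | Using TOS storage

We use VolcEngine's object storage service, TOS, to store logs. 2025-11-17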

    s3:
      s3: null
      endpoint: tos-s3-cn-beijing.ivolces.com
      region: tos-s3-cn-beijing
      secretAccessKey: XXXXXXX
      accessKeyId: XXXXXXX
      signatureVersion: null
      s3ForcePathStyle: false
      insecure: false
      http_config: {}
      backoff_config: {}
      disable_dualstack: false

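level=error ts=2025-11-17T05:37:09.897264401Z caller=flush.go:261 component=ingester loop=24 org_id=cosarts-plat-200-vke msg="failed to flush" retries=0 err="failed to flush chunks: store put chunk: BadRequest: Bad Request\n\tstatus code: 400, request id: , host id: , num_chunks: 1, labels: …

Checking the TOS documentation:

Searching for the keyword 400, we learned that the EC code should be used for self-service troubleshooting: https://www.volcengine.com/docs/6349/1188889
Through the TOS log management feature we found ECCode:0002-00000002; see the documentation: https://www.volcengine.com/docs/6349/1188900
We then looked into TOS compatibility with AWS and finally learned that a dedicated S3 endpoint must be used for access.

Scenario | Using COS storage

Tencent Cloud COS is S3-compatible, so it works once it is configured correctly as described in the official Tencent Cloud documentation. 2025-11-20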

    s3:
      s3: null
      endpoint: cos.ap-shanghai.myqcloud.com
      region: ap-shanghai
      secretAccessKey: XXXXXXX
      accessKeyId: XXXXXXX
      signatureVersion: null
      s3ForcePathStyle: false
      insecure: false
      http_config: {}
      backoff_config: {}
      disable_dualstack: false

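Scenario | Authenticating access through the Gateway

According to the Authentication documentation, Grafana Loki does not provide an authentication layer; the official documentation recommends handling authentication in a reverse proxy in front of it.

In addition, the reverse proxy is also responsible for adding the X-Scope-OrgID HTTP header used for multi-tenancy;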
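What the reverse proxy does with the tenant header can be sketched as a pure function. This is an assumption-laden illustration, not a proxy implementation: `inject_tenant` is a hypothetical name, and the tenant ID is assumed to have already been derived from the authenticated identity.

```python
TENANT_HEADER = "X-Scope-OrgID"


def inject_tenant(headers: dict, tenant_id: str) -> dict:
    """Return a copy of `headers` carrying the authenticated tenant ID.

    Any tenant header supplied by the client is dropped first (header names
    are case-insensitive), so a caller cannot pick its own tenant.
    """
    cleaned = {k: v for k, v in headers.items() if k.lower() != TENANT_HEADER.lower()}
    cleaned[TENANT_HEADER] = tenant_id
    return cleaned
```

Dropping the client-supplied value before setting the header is a design choice of this sketch: since Loki itself does not authenticate, the proxy is the only place where the tenant can be trusted.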

Scenario | Data visualization (Visualize)

Loki does not provide data visualization itself; Grafana is needed to display the data.

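Best Practices

Static labels are good
1) Labels such as host, application, and environment are good choices and can be used consistently;

Use dynamic labels sparingly
1) Do not overuse dynamic labels;
2) Prefer filter expressions for filtering logs;

Label values must always be bounded
1) This is again about dynamic labels: do not let a label take too many distinct values;

Be aware of dynamic labels applied by clients
1) Again about dynamic labels: use logcli or the Series API to check which labels exist and which values they take;
2) Keep labels that should not be there from being added to streams;

Configure caching
1) Enable caching to improve Loki's performance;

Time ordering of logs
1) Logs must be sent in time order, otherwise they are rejected (unless out-of-order logs are allowed);
2) Alternatively, add a label that identifies which system a log came from (useful when multiple hosts have skewed clocks);
3) Or let Promtail assign the timestamp;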

Use chunk_target_size
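1) chunk_target_size tells Loki to try to fill each chunk to a target compressed size (1.5M); chunks of this size are more efficient;
2) Other options also affect how full chunks get: max_chunk_age=1h, chunk_idle_period=30m
3) It takes 5-10x, or 7.5-10MB, of raw log data to fill a 1.5M chunk (depending on the compression algorithm);
4) Remember that chunks are per stream: the more streams there are, the more chunks sit in memory, and with too many streams a chunk is likely to be flushed before it fills up;
5) In short, make sure chunks get as full as possible; if an application produces too many logs, consider adding a label to split them into separate streams;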
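Whether a stream can fill a chunk before it is flushed for age can be estimated with simple arithmetic. The sketch below is a rough model under stated assumptions: a constant write rate per stream, a fixed compression ratio, and hypothetical function names; real behavior also depends on chunk_idle_period and on the compression algorithm.

```python
def seconds_to_fill(target_compressed_bytes: float, compression_ratio: float,
                    stream_bytes_per_second: float) -> float:
    """Time for one stream to produce enough raw data to fill one chunk."""
    raw_bytes_needed = target_compressed_bytes * compression_ratio
    return raw_bytes_needed / stream_bytes_per_second


def fills_before_max_age(target_compressed_bytes: float, compression_ratio: float,
                         stream_bytes_per_second: float,
                         max_chunk_age_seconds: float = 3600) -> bool:
    """True if the chunk reaches its target size before max_chunk_age (1h here)."""
    return seconds_to_fill(target_compressed_bytes, compression_ratio,
                           stream_bytes_per_second) <= max_chunk_age_seconds
```

With a 1,500,000-byte target and an assumed 5x compression ratio, a stream writing 1,000 bytes per second needs 7,500 seconds and would be flushed by max_chunk_age=1h first, while a stream writing 5,000 bytes per second fills the chunk in 1,500 seconds.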

-print-config-stderr / -log-config-reverse-order
1) Print the configuration at startup;

Scenario | Log queries

{type="kubernetes-container", cluster="d3rm-infr-130-tke", namespace="ingress-nginx", instance="nginx-public"}
from: 2025-11-20 17:00:00
to: 2025-11-20 17:05:00

The result reads "123 lines displayed", that is, 123 log lines are returned. Querying with logcli query --from="" --to="" also returns 123 log lines.

count_over_time({type="kubernetes-container", cluster="d3rm-infr-130-tke", namespace="ingress-nginx", instance="nginx-public"} | keep type, cluster, namespace, instance [5m])
from: 2025-11-20 17:00:00
to: 2025-11-20 17:05:00
step: 5m

Again, 123 log lines are counted.

With step: 1m the result becomes 5 points, showing how the log count changes over the period.
With step: 1m, from: 2025-11-20 17:00:00, to: 2025-11-20 17:01:00, the result shows the log count within the first minute.
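How step and the range interval interact can be modelled in a few lines. This is a simplified sketch, not Loki's evaluator: it assumes that a point is evaluated at start, start + step, and so on up to end, and that each point counts entries in the window (t - range, t]. How many points a real query returns also depends on how the client aligns the time range, so the count may differ from what Grafana shows.

```python
def count_over_time(timestamps: list[int], start: int, end: int,
                    step: int, range_: int) -> list[tuple[int, int]]:
    """Return (evaluation time, number of entries in (t - range_, t]) pairs."""
    points = []
    t = start
    while t <= end:
        count = sum(1 for ts in timestamps if t - range_ < ts <= t)
        points.append((t, count))
        t += step
    return points
```

With entries at seconds 10, 70, 130, 250, and 290, a single evaluation at t=300 with a 300-second range counts all five, while a 60-second step with a 60-second range breaks the same data into per-minute counts.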

count_over_time({type="kubernetes-container", cluster="d3rm-infr-130-tke", namespace="ingress-nginx", instance="nginx-public"} | keep type, cluster, namespace, instance [$__auto])
from: 2025-11-20 17:04:00
to: 2025-11-20 17:05:00
step: 1m

According to the documentation (https://grafana.com/docs/grafana/latest/datasources/loki/template-variables/#use-__auto-variable-for-loki-metric-queries), the $__auto variable automatically follows the value of step.

What this query does: starting from the current time and working backwards, it counts the logs in each minute.

rate({type="kubernetes-container", cluster="d3rm-infr-130-tke", namespace="ingress-nginx", instance="nginx-public"} | keep type, cluster, namespace, instance [$__auto])
step: 1m

Improvements

Community | https://grafana.com/docs/loki/latest/community/

… received message larger than max …

ResourceExhausted desc = grpc: received message larger than max · Issue #2271 · grafana/loki

Jul 08 17:01:37 node-1 grafana-agent-flow[19436]: ts=2024-07-08T09:01:37.432203303Z level=warn msg="error sending batch, will retry" component_path=/ component_id=loki.write.local component=client host=loki.devops.d3rm.com status=500 tenant="" error="server returned HTTP status 500 Internal Server Error (500): rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4283158 vs. 4194304)"

grpc_server_max_recv_msg_size
grpc_server_max_send_msg_size
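The two numbers in the error are the received message size and the configured maximum, in bytes; 4194304 is exactly 4 MiB. The sketch below extracts them from a log line and proposes a larger limit. Rounding up to the next power of two is only a heuristic chosen for this example, and both function names are hypothetical.

```python
import re
from typing import Optional, Tuple

SIZE_ERROR = re.compile(r"received message larger than max \((\d+) vs\. (\d+)\)")


def parse_grpc_size_error(message: str) -> Optional[Tuple[int, int]]:
    """Return (received_bytes, max_bytes) from the error text, or None if absent."""
    match = SIZE_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def suggest_limit(received_bytes: int) -> int:
    """Smallest power of two that is >= received_bytes (a heuristic, not a rule)."""
    return 1 << (received_bytes - 1).bit_length()
```

For the log line above this gives `(4283158, 4194304)` and a suggested value of 8388608 (8 MiB) for the two settings listed.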

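References

Grafana Loki | Grafana Loki documentation

The official documentation is where our learning starts; we learn and use Loki with the official documentation at the center.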
