「less」

Included executables
less: a file viewer or pager; it displays the contents of the given file, letting the user scroll, find strings, and jump to marks
lessecho: needed to expand meta-characters, such as * and ?, in filenames on Unix systems
lesskey: used to specify the key bindings for less

Chapter list
「less」-[……]

2023-04-22 | k4nz

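「Consensus Algorithm」- Consensus Algorithms

Consensus algorithms
Keeping multiple replicas introduces a data consistency problem. A consensus algorithm is needed to keep the data on all nodes consistent while tolerating a certain number of node failures;
Consensus algorithms were first proposed in the context of replicated state machines;
A replicated state machine consists of a consensus module, a log module, and a state machine: 1) the consensus module keeps the logs of all nodes consistent; 2) each node then executes the commands from the same log in the same order; 3) as a result, all replicated state machines arrive at the same result;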
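The replicated state machine idea can be sketched in a few lines of Python. This is only an illustration: the `Replica` class, the `replicate` helper, and the `set`/`delete` command format are hypothetical names chosen for this example, and the consensus module is assumed to have already agreed on the log.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One replicated state machine: a log plus a key-value state."""
    log: list = field(default_factory=list)
    state: dict = field(default_factory=dict)

    def apply(self, command):
        # Append the command to the log, then execute it against the state.
        self.log.append(command)
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "delete":
            self.state.pop(key, None)


def replicate(replicas, agreed_log):
    # Assumption: consensus has already produced `agreed_log`.
    # Every replica applies the same commands in the same order.
    for command in agreed_log:
        for replica in replicas:
            replica.apply(command)


replicas = [Replica() for _ in range(3)]
replicate(replicas, [("set", "x", 1), ("set", "y", 2), ("delete", "x", None)])
```

Because every replica executes an identical log in an identical order, all three end up with the same state.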

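Paxos
The progenitor of consensus algorithms is Paxos, but because it is overly complex, hard to understand, and difficult to implement in engineering practice, its adoption in industry was slow;
Paxos solves the distributed consensus problem: how the processes in a distributed system agree on a value (a decision). Paxos runs in systems where servers are allowed to crash; it does not require reliable message delivery and tolerates lost, delayed, reordered, and duplicated messages. It relies on a majority mechanism, so a system of 2N+1 nodes can tolerate up to N nodes failing at the same time.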
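The 2N+1 arithmetic can be checked with a short Python sketch. The function names `majority` and `tolerated_failures` are made up for this example; they only express the counting argument and are not part of any Paxos implementation.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Nodes that may fail while a majority can still be formed."""
    return cluster_size - majority(cluster_size)
```

For a cluster of 2N+1 nodes the majority is N+1, which leaves N nodes that may fail. Adding a node to reach an even size (for example 4 instead of 3) does not raise the number of tolerated failures.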
References
etcd 实战课 (etcd in Practice; Tang Cong, senior engineer at Tencent Cloud and active etcd contributor), Geek Time (极客时间)[……]

| k4nz

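「Docker」- Installation and Deployment (CentOS, Debian, Ubuntu)

Questions about version numbers
docker version number confusion: 17.06 vs 1.13? Time-based release schedule
Some older Docker documentation refers to Docker 1.x, while other documentation refers to Docker 17.x, yet there is no Docker 4.x, Docker 9.x, or Docker 16.x. Why is that?
This note records how Docker version numbers changed, and how to handle related issues.
Milestone: 17.03.0-ce (2017-03-01)
Starting with this release, Docker moved to a monthly release cycle and adopted a new YY.MM versioning scheme to reflect it. Two channels are offered: monthly and quarterly.
Any given monthly release receives security and bug fixes only until the next monthly release becomes available.
A quarterly release receives security and bug fixes for 4 months after its initial release.
In addition, Docker 17.03.0-ce includes the bug fixes from 1.13.1 but adds no major features, and the API version is unchanged, so upgrading from Docker 1.13.1 to 17.03.0 is expected to be simple and low-risk.
Choosing a version
Because quarterly releases are supported for longer, we use the quarterly releases. Quarterly releases ship in March, June, September, and December, so their version numbers take the form Docker x.03, Docker x.06, Docker x.09, and Docker x.12, where x is the two-digit year (the YY part of YY.MM).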
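The quarterly rule can be expressed as a small Python check. This is a sketch that models only the YY.MM scheme as described in this note; `parse_version` and `is_quarterly` are hypothetical helper names, and releases outside this scheme (such as 1.13.1) are rejected rather than interpreted.

```python
import re

# Months in which quarterly releases ship, as stated in this note.
QUARTERLY_MONTHS = {3, 6, 9, 12}


def parse_version(version: str):
    """Split a YY.MM version such as '17.03.0-ce' into (year, month)."""
    match = re.match(r"^(\d{2})\.(\d{2})\b", version)
    if match is None:
        raise ValueError(f"not a YY.MM version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_quarterly(version: str) -> bool:
    """True if the month part falls on a quarterly release month."""
    _, month = parse_version(version)
    return month in QUARTERLY_MONTHS
```

For example, `is_quarterly("17.06.0-ce")` is true, while `is_quarterly("17.04.0-ce")` is false.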
on Debian GNU/Linux 10 (buster)
Install Docker Engine on Debian | Docker Documentation

#### Remove old versions
systemctl stop docker.service # 18.09.1
apt-get remove docker.io runc

#### Add the repository
echo 'Acquire::ForceIPv4 "true";' | tee /etc/apt/apt.conf.d/99-force-ipv4 # force APT to use IPv4 instead of IPv6
apt-get update
apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker[……]

| k4nz

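「DOCKER」- Quick Start

#01 Check that Docker is running properly
Use the docker info command to view container information and check that Docker is running properly:

docker info

# The command returns a container summary, image summary, execution driver, storage driver, and basic configuration

#02 Run the first container
Use the docker run command to run a container. It actually performs both the creation and the start of the container: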

#!/bin/sh

docker run -i -t ubuntu /bin/bash

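#03 Use the first container
A container resembles an operating system, a "complete" operating system.
If you exit, the container stops running.
#04 Naming containers
When running a container, use the --name option to specify its name; otherwise a random name is generated.
Legal characters: [a-zA-Z0-9_.-]
A container name can be used in place of the container ID, which makes containers much easier to tell apart.
A container name must be unique. If the name already exists, use the docker rm command to remove the existing container first.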
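A name can be checked against the legal characters before it is passed to docker run. The Python sketch below is only an approximation: the pattern (an alphanumeric first character followed by at least one character from `[a-zA-Z0-9_.-]`) is an assumption based on the wording of Docker's error message, and `is_valid_container_name` is a hypothetical helper, not a Docker API.

```python
import re

# Assumed pattern: alphanumeric first character, then one or more of
# letters, digits, underscore, dot, or hyphen.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_.-]+")


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` matches the assumed container-name pattern."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Note that the letter ranges are written as `a-z` and `A-Z` separately. A single range from `A` to `z` would also admit the punctuation characters that sit between the two alphabets in ASCII, such as `^` and the backtick.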
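#05 Restart a stopped container
Use the docker start <container name/id> command to start the container.
Alternatively, use the docker restart <container name/id> command to restart it.
Use the docker ps command to list running containers.
You can use the docker create command to create a container without running it, which allows fine-grained control in a workflow.
#06 Attach to a container
Use the docker attach command to attach to the container's session.
However, if you exit (Ctrl+C, Ctrl+D), the process exits and the container stops running. To leave the container without stopping it, press Ctrl+P Ctrl+Q to detach; the container's process keeps running, and you can re-enter it later with docker attach.
#07 Create a daemonized container
If the process exits, the container exits as well. So to create a daemonized container, you need a process inside the container that keeps running.

docker run --name container_name -d ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done"

The -d option runs the container in the background, so the host console does not attach to a new shell session and only a container ID is returned. Use docker ps to view the currently running containers.
#10 View processes inside a container
Use the docker top command to view the processes inside a container.
#11 View statistics
Use the docker stats command to view container statistics.
It shows the container's CPU[……]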

| k4nz

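「Docker」- Common Errors

#6 When building an image, chown -R is very slow
Docker images and files chown Recursive chown is really slow #388
Problem: if the Dockerfile contains a chown -R /path/foo command, building the image takes a very long time.
Cause: because Docker uses a copy-on-write strategy, when chown runs, the affected files from the lower image layers are all copied into the current layer, and only then is the ownership changed and written to the file system.
Build two images, one with chown -R and one without, and inspect the image layers with docker history. You will find that the layer containing chown -R consumes a lot of space.
Solution: avoid commands such as chown that modify large numbers of files in bulk.
#3 exec user process caused "exec format error"
「exec user process caused "exec format error" when run container with CMD on RHEL」 This error occurs when starting the container.
The image's CMD is a shell script that has no shebang, so the script format cannot be recognized at run time.
Add a shebang line to the startup script, i.e. 「#!/bin/sh」
#2 x509: cannot validate certificate because of not containing any IP SANs
TODO How to log in to the Registry when using a self-signed SSL certificate
#3 Error response from daemon when pulling a specific image tag from DTR
-「Error response from daemon when pulling a specific image tag from DTR」 TODO Error response from daemon: manifest for xxx/xxx/xxx:xxx not found
#1 An unexpected error
On my Debian host, running: docker run --rm -t -i centos:6.10 /bin/bash
exits immediately with exit code 139; yet running other commands such as ls or cat works normally.
The same centos:6.10 image also runs normally on CentOS Linux release 7.4.1708 (Core);
Check the kernel output with dmesg: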

[2023147.282206] docker0: port 2(veth2016d06) entered[……]

| k4nz

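「OBSERVABILITY」- Concepts and Terminology

Observability is usually summarized in three categories: aggregated metrics (Metrics); event logs (Logging); distributed tracing (Tracing);

Additional notes
OBSERVING, Observing: the act of observing (观测)
Observability: 可观测性
observee (plural observees): One who is observed.
CNC/OBSERVABILITY
References
聊聊可观测性 (On Observability) – Zhihu; Peter Bourgon · Metrics, tracing, and logging[……]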

| k4nz

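「KUBERNETES-ADDONS」- Logging

Problem description
There are many kinds of logs: 1) applications have logs, used for troubleshooting; 2) the cluster has logs, used for troubleshooting; 3) the container environment has a logging mechanism, and containerized applications should write their logs to standard output and standard error;
But these logging mechanisms fall short: 1) the logging provided by the container engine is far from sufficient: when a container crashes, a Pod is evicted, or a node dies, we still want to access the logs. So logs need to be stored separately, with a lifecycle independent of nodes, containers, and so on;
This concept is called "cluster-level logging". Cluster-level logging needs separate storage, but Kubernetes does not provide a log storage backend, so we have to integrate one ourselves;
This note records content related to Kubernetes cluster logging, including log types and collection methods, and solutions to related problems. It is organized around the official Logging Architecture documentation;
Solution
In a Kubernetes cluster, the logs to handle fall into these kinds: 1) logs outside containers: kubelet, Docker; 2) logs inside containers (written to standard output and standard error): containerized applications, including cluster components (such as kube-proxy, etcd, and so on); 3) logs inside containers (not written to standard output or standard error): containerized applications that write to files local to the container
Run Pods with a DaemonSet to collect logs on each node;
Logs written to standard output
Use kubectl logs to view logs; add the --previous option to view the logs of a crashed container;
If a Pod has multiple containers, specify the container name to view the logs of that container;
Node-level log types
Application logs
Logs written by containerized applications to standard output and standard error are redirected by the container engine. In Docker, for example, they are handled by the logging driver (in Kubernetes, it is configured to write to JSON-format files);
Note that the Docker logging driver cannot handle multi-line logs; these must be handled in the log collection tool;
If a container restarts, the kubelet keeps one terminated container together with its logs. If a Pod is evicted, all of its containers are evicted too, including their logs;
Node logs also need rotation, to keep logs from consuming too much disk. But Kubernetes currently is not responsible for log rotation; this should be handled by the containerized application. Alternatively, the container environment can be configured to rotate logs, for example with Docker's --log-opt option;
When kubectl logs runs, the kubelet on the corresponding node responds by reading the log file directly. Note that if an external system performs the rotation and the log is split into multiple files, kubectl logs can only read the latest file;
System component logs
System components also have logs, in two categories: (1) components running inside containers; (2) components running outside containers;
Components running outside containers, such as the kubelet and Docker: if using syst[……]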

| k4nz

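「OBSERVABILITY」- Monitoring Solutions

What to monitor (targets)
In no particular order:
Part 1: infrastructure monitoring
Compute, storage, network
What monitoring covers (results)
Purposes of monitoring: (1) spot early signs; (2) notify on failure; (3) automatic remediation
In summary, monitoring depth covers four aspects: availability monitoring, performance monitoring, log monitoring, and custom monitoring;
Availability monitoring
This is the simplest level of monitoring, for example: checking whether a port is alive.
Performance monitoring
For example: the CPU is working normally, but is its utilization constantly at a high level?
Log monitoring
Mainly application log monitoring, followed by security audit logs, system logs, and so on. Log monitoring helps ensure that operators' actions are compliant. Application logs also provide the tracing basis for later end-to-end tracing and serve as a reference for fault localization.
Custom monitoring
Business needs bring many self-defined metrics, such as the transaction volume over the past ten minutes. Although these KPIs can be queried in other ways, integrating them directly into the monitoring system as additional custom monitoring better meets business monitoring needs and shows how the business is running from a business perspective.
Monitoring methods (tools)
MRTG – Multi Router Traffic Grapher
https://oss.oetiker.ch/mrtg/ https://github.com/oetiker/mrtg https://en.wikipedia.org/wiki/Multi_Router_Traffic_Grapher
Written in Perl; collects data via SNMP and generates PNG and other images, presented on the web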
Cacti
https://www.cacti.net/
PHP MySQL SNMP RRDtool LDAP
SmokePing
https://oss.oetiker.ch/smokeping PING WWW DNS SSH RRDtool
Graphite
https://github.com/graphite-project/graphite-web Wikipedia/Graphite
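Python, Django; stores data and renders graphs on demand
Nagios
https://github.com/NagiosEnterprises/nagioscore https://en.wikipedia.org/wiki/Nagios
Monitors systems, service availability, and network information; web UI; service availability alerts
Zenoss Core
https://www.zenoss.com/get-started
CMDB; discovers and manages all kinds of assets; monitors and reports the status and performance of resources in the IT architecture; event and error management tied to the CMDB; collects data over SMTP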
Ganglia
http://gangl[……]

| k4nz

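「OBSERVING-SYSTEM」- What Needs to Be Observed

Compute observation
WIP
Disk observation (Storage)
Processes (applications)
Per process, observe the process's current I/O load;
We need to see each process's disk IOPS, latency, bandwidth, and similar figures, like the metrics output by pidstat:
1) kubelet cAdvisor
Devices
Per device, observe the disk I/O load;
We want the overall read/write picture for block devices: Node Exporter
Network observation
Observe the network with eBPF: isovalent/cilium-grafana-observability-demo
Observe the relationships among Kubernetes Services: https://github.com/groundcover-com/caretta
NetWorth uses eBPF and XDP to observe inbound traffic (geolocation, protocol type, and so on) ShubhamPalriwala/networth: eBPF based Network Monitoring using Prometheus and Grafana[……]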

| k4nz

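「OBSERVABILITY」- Interaction Among Observing-System Modules

Notes from practice
Prometheus => Grafana
Through the Prometheus Remote Write feature, data is stored long-term in Thanos, so Grafana can query data directly from Thanos;
In practice, however, we found that open-source Grafana dashboards target a single cluster. Using the same dashboard across multiple clusters requires adjusting it;
To reduce the dashboard rework, we add each Prometheus as a data source in Grafana. Thanos serves only as long-term storage; only in special scenarios (such as analyzing historical trends) do we build dashboards against Thanos;[……]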

| k4nz

「OBSERVABILITY」- Grafana-based Observing Stack（for Cloud Native Computing）

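Problem description
We need to observe our containerized cloud-native environment, covering monitoring, logging, and tracing, to help with troubleshooting;
This note records the architecture of the observing system we use, the process of deploying it, and solutions to common problems;
Solution
Monitoring: Exporter + Prometheus + Grafana is already a widely used monitoring solution, so we did no further technology evaluation; Logging: we use Grafana Loki directly for log collection, to reduce the number of components (logs are displayed in Grafana directly); Tracing: at the request of our developers, we are trying Jaeger for tracing, and we will also try introducing other performance-tracing components to observe how programs run;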
LGTM (Loki, Grafana, Tempo, Mimir) Stack
Loki for Logs Grafana for Visualization Tempo for Traces Mimir, Prometheus, and Graphite for Metrics[……]

| k4nz

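「Linux」- Setting Up Prometheus Monitoring (single host, installation, deployment, quick start, lab environment)

Problem description
Prometheus is an open-source systems monitoring and alerting toolkit (the official description). But when we say "Prometheus", we usually mean the service that stores the monitoring data.
In the Prometheus monitoring system: Exporters collect metrics and expose the key metrics over HTTP; Prometheus periodically requests the Exporters to pull and store metrics, then evaluates whether alert thresholds have been reached; Alertmanager handles the alert requests coming from Prometheus and sends the notifications; Grafana provides a web UI that reads Prometheus data to display the metrics.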
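What an Exporter exposes over HTTP is plain text, one sample per line, optionally preceded by HELP and TYPE comments. The Python sketch below renders that text for a few gauges. It is illustrative only: `render_metrics` and the metric name `demo_queue_length` are made up for this example, and a real exporter would normally use an existing client library and serve the text on its /metrics endpoint.

```python
def render_metrics(metrics):
    """Render gauges in the Prometheus text exposition format.

    `metrics` maps a metric name to a (help_text, value) pair.
    """
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


text = render_metrics({"demo_queue_length": ("Items waiting in the queue.", 3)})
```

Prometheus pulls this text on each scrape, which is why an Exporter only needs to answer HTTP requests and never pushes data itself.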
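This note records how to deploy Prometheus monitoring on Linux for experimentation, and approaches to related problems.
Solution
Notes: 1) this note only records how to deploy the whole monitoring stack, not how to use or configure it, since that involves a great deal of content; 2) these are lab notes from when we first encountered cloud native. We now deploy monitoring in a Kubernetes environment;
The project offers several deployment methods (these are not the ones we normally use): 1) Using pre-compiled binaries 2) From source 3) Using Docker 4) Using configuration management systems 5) the built-in deployment provided by Minikube;
Environment
Operating system: Debian GNU/Linux 10 (buster) Software versions: Node Exporter 0.17.0; Alertmanager 0.15.3; Prometheus 2.7.1; Grafana 7.5.7;
Deploy the Node Exporter service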

apt-get install -y prometheus-node-exporter

systemctl enable prometheus-node-exporter.service
systemctl start prometheus-node-exporter.service

# Query the Node Exporter service to check that it is running properly
curl http://127.0.0.1:9100/metrics

Note: Node Exporter is just one of many exporters; others include the MySQL server exporter, Memcached exporter, JIRA exporter, and so on. See the Exporters and integrations page.
Deploy the Alertmanager service

apt-get install -y prometheus-alertmanager

systemctl en[……]

| k4nz

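「Prometheus」- Highly Available Monitoring Cluster, Cluster Monitoring (Monitoring)

Problem description
This note records how to deploy Prometheus Monitoring, and solutions to related problems.
Solution
Additional notes
Regarding the application environment, the Prometheus Monitoring discussed here revolves around Kubernetes. For non-containerized environments, traditional monitoring solutions have been proven in practice, so we believe they are the better choice there.
Single-cluster monitoring
Single-cluster monitoring means that one Prometheus Monitoring instance monitors only one Kubernetes cluster, i.e. multiple Kubernetes clusters require multiple Prometheus Monitoring instances;
Real environments usually have multiple clusters to monitor, and more will be added over time, so we skip the single-cluster discussion and focus on monitoring multiple Kubernetes clusters;
Multi-cluster monitoring (research, study)
Multiple Kubernetes cluster monitoring with Prometheus | Sysrant Monitoring a Multi-Cluster Environment Using Prometheus Federation and Grafana
Option 1: kubernetes_sd_config/api_server
Connect directly to the other clusters through the api_server option of Prometheus's kubernetes_sd_config 1) Pros: simple and easy to deploy, with little configuration needed in the other clusters; 2) Cons: cross-cluster authentication must be configured separately; some metrics require deploying components in the monitored cluster, which this option cannot satisfy;
Option 2: Prometheus Federation
Through Prometheus federation: deploy a Prometheus in each of the other clusters, then scrape them from a central Prometheus 1) Pros: easy to deploy; 2) Cons: large data volume, especially for the central Prometheus, which has more metrics to scrape and needs more time to do so (latency);
Option 3: Expose the /metrics Interface
Expose the /metrics endpoint of the monitored cluster for Prometheus to scrape; 1) Pros: very easy to deploy; 2) Cons: only usable at small scale; no service discovery; cannot be automated;
Our initial idea was: 1) deploy Prometheus Monitoring for each cluster; 2) use Grafana data sources to choose which data source a dashboard displays;
Option 4: Prometheus + Thanos Sidecar
Deploy Prometheus + Sidecar in multiple clusters, and query each of them through Query;

There is also an improved option, which is essentially similar.[……]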

| k4nz

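「WEB-SCRAPING」- Web Crawlers

Web Scraper
chrome web store/Web Scraper; webscraper中文网 (Web Scraper Chinese site): 有关webscraper的问题，看这个就够了 (everything you need to know about Web Scraper)
Information collection
# 万维易源 (ShowAPI) – One World, One API http://www.showapi.com
QQBot: a QQ bot framework implemented in Python and based on Tencent's SmartQQ protocol; it runs on Linux, Windows, and Mac OSX.[……]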

| k4nz

「Selenium」- A browser automation framework and ecosystem

https://www.selenium.dev/
https://github.com/seleniumhq/selenium
Open URL in Chrome & save its source code using Command prompt[……]

| k4nz

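「Selenium」- Basic Concepts and Quick Start

Selenium is an umbrella project for a range of tools and libraries that support browser automation. In other words, Selenium is the glue that binds together the projects and tools used for automated browser testing.
Basic concepts and common terms
WebDriver uses the browser's own interfaces to drive the browser and run tests, just as a real user would. Workflow: Python Lib => WebDriver => Web Browser
IDE (Integrated Development Environment) is a Firefox or Chrome extension provided by Selenium for developing test cases, simplifying the development process.
Selenium Grid is used to run test cases on different platforms. The control side is local; when a test case is triggered, it executes on a remote host.
References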
Quick tour :: Documentation for Selenium[……]

| k4nz

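「Selenium」- Setting Up a Local Environment (Linux)

Install the Selenium tools (quick start)
Selenium workflow: Python Lib => WebDriver => Web Browser
So we need the following setup: 1) install the Python Selenium library, the library used to drive the WebDriver program; 2) obtain the WebDriver program, which drives the browser;
Notes: 1) besides the Python Selenium library, there are libraries for Java, C#, Ruby, and others, all of which can drive the WebDriver program. 2) Since we use Python, only the Python library is covered here;
Step 1: install the Python Selenium library

// Mind which Python version you are using

# pip install selenium
# pip3.7 install selenium

// We use the Debian 10 distribution, where installing via APT is recommended

apt-get install python3-selenium

Step 2: install the WebDriver program
Two things are needed: 1) download the WebDriver program that matches the browser, e.g. Chrome needs the chromedriver binary and Firefox needs the geckodriver binary; 2) put the WebDriver program on the system path so it can be found, because the Python Selenium library needs to execute this binary to drive the browser;
Chromium
Here we use Chromium on Debian 10 as the example:

# chromium --version
Chromium 83.0.4103.116 built on Debian 10.4, running on Debian 10.5

// Method 1: download and install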

# wget -O /tmp/chromedriver_linux64.zip \
https://chromedriver.storage.googleapis.com/83.0.4103.39/chromedriver_linux64.zip

# unzip -x /tmp/chromedriver_linux64.zip -d /usr/local/bin

# chromedriver
Starting ChromeDriver 83.0.4103.39 (ccbf011cb2d2b19b506d844400483861342c20cd-refs/branch-heads/4103@{#416}) on port 9515
Only local connections are allowed.
Please see https://c[……]

| k4nz

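「Selenium Grid 3」- Running Test Tasks on Remote Hosts

Selenium Grid
Selenium is used for automated testing, but on its own it runs only in the local environment, i.e. it launches a local browser to run the tests. With a Selenium Grid cluster, automated tests can run simultaneously on multiple remote hosts.
As the figure shows, the Client connects to the Hub and submits tasks, and the Hub dispatches the test tasks to Nodes for execution (taken from the Selenium Grid 3 documentation):
In this note, we describe how to set up a Selenium Grid 3 environment and write the simplest possible demo program.
Note: Grid 3 is no longer officially supported, and Grid 4 is the recommended version. But at present (08/30/2020) the Grid 4 documentation is incomplete and there is more practical experience with Grid 3, so we use Selenium Grid 3.
Step 1: set up the Selenium Grid cluster
Hub: 172.31.253.49, CentOS Linux release 7.5.1804 (Core) Node: 172.31.253.103, Ubuntu 20.04 LTS
Run the Hub node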

yum install -y java-1.8.0-openjdk
mkdir -pv /srv/selenium-hub && cd /srv/selenium-hub
wget https://selenium-release.storage.googleapis.com/3.141/selenium-server-standalone-3.141.59.jar

# Configure the systemd service
cat > /etc/systemd/system/selenium-hub.service <<EOF
[Unit]
Description=Selenium Grid Hub

[Service]
Type=simple
User=root
ExecStart=/usr/bin/java -jar /srv/selenium-hub/selenium-server-standalone-3.141.59.jar -role hub -debug

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl start selenium-hub.service
systemctl enable selenium-hub.service

# View the service's runtime information
journalctl -f -u selenium-hub.service

Once the service is running, messages like the following appear in the log:

Started Selenium Grid Hu[……]

| k4nz

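「Selenium IDE」- A Record-and-Playback Automated Testing Tool for the Web

Problem description
When writing traditional Selenium test scripts, we need a programming language and the Selenium library;
This gets the testing done, but it takes real effort: setting up the environment, writing test cases, and debugging scripts;
Solution
With Selenium IDE we can quickly create web automation tests and reduce the coding workload;

Installing the extension
Installing Selenium IDE goes roughly as follows: 1) in Chrome / Firefox, install the Selenium plugin; 2) in the Selenium plugin, define automation tasks through the graphical interface;
Usage
Creating a test
See the Selenium IDE/Getting Started documentation for detailed usage;
First, in the UI, fill in the Playback Base URL parameter; element selection will also take place on that page;
Then, in the UI, add the relevant commands and tasks. A simple example: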

Command   | Target                                                       | Value
----------|--------------------------------------------------------------|---------------
open      | https://www.example.com                                      |
execute   | script return document.getElementById("inputBox") !== null  | inputBoxExists
if        | ${inputBoxExists} > 0                                        |
send keys | id=inputBox                                                  | 123456
end       |                                                              |
click     | css=.table .button                                           |
close     |                                                              |

Finally, in the UI, click the Run current test button (the play button) to start exec[……]

| k4nz

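「Selenium Grid 3」- Using Java / Groovy

We mostly use Selenium within Jenkins Pipeline, so we need a Groovy library. Since no corresponding Groovy library exists, we have to use the Java library.
There is another approach: implement the test in Python and invoke it from Groovy via the command line. But we cannot use it, because Selenium automation requires interaction and decisions during the run, and this approach cannot obtain intermediate state; it can only pass input, execute, and wait for output.
Related links
Maven Repository: org.seleniumhq.selenium » selenium-java (we use Selenium Grid 3); download page: Downloads; API documentation: https://www.selenium.dev/selenium/docs/api/java/index.html
Selenium with Java: Best Practices
Connecting to the Selenium Hub node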

import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.Platform;

import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.WebDriver;
import java.net.URL;

DesiredCapabilities desiredCapabilities = DesiredCapabilities.chrome();
desiredCapabilities.setBrowserName("chrome");
desiredCapabilities.setPlatform(Platform.LINUX);

String seleniumHubUrl = "http://ip-address:port-number/wd/hub";
WebDriver webDriver = new RemoteWebDriver(new URL(seleniumHubUrl), desiredCapabilities);

Setting the window size and position
python – How do I set browser width and height in Selenium WebDriver? – Stack Overflow Selenium Waits: Implicit, Explicit, Fluent And Sleep

import org.openqa.selenium.Dimension;
import org.openqa.selenium[……]

| k4nz

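「Selenium」- Clicking a Button (or Element) on a Page

Problem description
This note records how to click a button from code in Selenium, and how to handle common problems.
Solution
Clicking with click()
Usually the click() method is enough to click an element:

// Select the element and click it
webDriver.findElement(By.id("buttoncheck")).click()

// Wait until the element is clickable
new WebDriverWait(webDriver, 10).until(ExpectedConditions.elementToBeClickable(By.xpath("xpath-query"))).click()

Clicking with JavaScript
Alternatively, once the element is found, click it with JavaScript:

((JavascriptExecutor) webDriver).executeScript("arguments[0].click();", button);

This approach resolved the following problem that we ran into:

org.openqa.selenium.ElementClickInterceptedException: element click intercepted:
Element xxxxxxxx is not clickable at point (1154, 91). Other element would receive the click: xxxxxxx

More click actions
For complex click operations (long press, right click, double click), see the Test Automation With Selenium Click Button Method(Examples) page
Things to note about clicking
The real world is complicated; this section records the click-related problems we have encountered.
# 10/03/2020 Some elements only get their click handler bound once they are visible in the browser window (viewport). So unless the element is scrolled into view, the click has no effect. In Firefox 80.0.1 (64-bit), after scrolling the element into view, an event badge appears next to the element in the Inspector. Use the following code to scroll and to check that scrolling has finished:

// References for the scroll-into-view code below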
// https://www.guru99.com/scroll-up-down-selenium-webdriver.html
// https://stackoverflow.com/questions/42982950/how-to-scroll-down-the-page-till-bottomend-page-in-the-selenium-webdriver

/[……]

| k4nz

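「Selenium」- The Right Way to Sleep (Wait)

Problem description
After loading a page, we may need to wait for it to render, or for some HTML element to finish loading. We often wait with Thread.sleep(), which has these drawbacks: 1) if the wait is too long, time is wasted after the page has already loaded; if it is too short, the page has not finished loading; 2) we cannot know the exact time to wait. Checking in a while loop makes the program look "untidy"; 3) every element lookup needs its own wait; 4) execution continues only after the full time set in Thread.sleep() has elapsed;
Solution
We can use the waits that Selenium provides: 1) Implicit wait – Provided by Selenium WebDriver 2) Explicit wait (WebDriverWait & FluentWait) Provided by Selenium WebDriver 3) Fluent Wait
Implicit Wait
Below is demo code (key parts only), which we use for the explanation: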

WebDriver driver=new ChromeDriver();

driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);

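driver.get("https://www.easemytrip.com/");
driver.findElement(By.id("FromSector_show")).sendKeys("Delhi", Keys.ENTER);
driver.findElement(By.id("Editbox13_show")).sendKeys("Mumbai", Keys.ENTER);

As shown above, implicitlyWait with a maximum wait of 30s has these advantages: 1) findElement waits up to 30s and proceeds as soon as the element is found; 2) if the element is not found within 30s, a NoSuchElementException is thrown; 3) it is a global setting (set once, with no need to set it for each element lookup);
But there are other scenarios: the HTML element has been found, yet whether it is visible or clickable on the page still affects the automated test. For that, we can use an explicit wait.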
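The idea behind an explicit wait, polling a condition until it holds or a deadline passes, can be sketched without any browser. The Python function below is a library-independent illustration, not Selenium's WebDriverWait; the name `wait_until` and its parameters are assumptions made for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Returns the truthy value, or raises TimeoutError when the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

Unlike a fixed sleep, the call returns as soon as the condition is met, and the timeout only bounds the worst case. The condition can be any check, such as "the element is visible" or "the element is clickable".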
Explicit wait
Below is demo code (key parts only), which we use for the explanation:

WebDriver driver = new ChromeDriver();
driver.get("https://www.rentomojo.com/");

// Wait until the page element is visible
WebDriverWa[……]

| k4nz

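「Selenium」- File Upload

Problem description
We need to upload files with Selenium in order to complete functional testing.
But after trying several approaches, we kept hitting the following error:

org.openqa.selenium.InvalidArgumentException: invalid argument: File not found : xxxxxxx

This note records how to upload files in Selenium, and how to handle related problems.
Solution
Method 1: upload with FirefoxDriver
We did not use the FirefoxDriver approach ourselves; it is recorded here only for reference: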

import org.openqa.selenium.*;
import org.openqa.selenium.firefox.FirefoxDriver;

public class PG9 {
public static void main(String[] args) {
System.setProperty("webdriver.gecko.driver","C:\\geckodriver.exe");
String baseUrl = "http://demo.guru99.com/test/upload/";

WebDriver driver = new FirefoxDriver();
driver.get(baseUrl);

WebElement uploadElement = driver.findElement(By.id("uploadfile_0"));
uploadElement.sendKeys("C:\\newhtml.html");

driver.findElement(By.id("terms")).click();
driver.findElement(By.name("send")).click();
}
}

Method 2: upload with ChromeDriver
For brevity, only the two key lines of code are shown here:

public void uploadFile(){
…
webDriver.setFileDetector(new LocalFileDetector());
…
input.sendKeys(filePath);
…
}

Q: A File not found error occurs when using Chrome or Chromium. A: possibly because Chromium was installed via snap, whose file system isol[……]

| k4nz

「Selenium」- Using a Proxy Server

Related links
Multiple Proxy Servers in Selenium Web-driver Python | by SiDdhartha | ML Book | Medium
Http proxies :: Documentation for Selenium[……]

| k4nz

「Selenium」- Common Errors

[……]

| k4nz

「Selenium」- Can not connect to the Service /path/to/chromedriver

Problem description

# python3.7 /tmp/demo.py
Traceback (most recent call last):
File "/tmp/demo.py", line 4, in <module>
driver = webdriver.Chrome('/srv/sharing/packages/chromedriver_linux64/chromedriver') # Optional argument, if not specified will search path.
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/usr/local/lib/python3.7/dist-packages/selenium/webdriver/common/service.py", line 104, in start
raise WebDriverException("Can not connect to the Service %s" % self.path)
selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service /srv/sharing/packages/chromedriver_linux64/chromedriver

Cause
Selenium python: Can not connect to the Service %s" % self.path selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service geckodriver
Solution
Add 127.0.1.1 localhost to the /etc/hosts file.
References
Selenium python: Can not connect to the Service %s” % self.path selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service geckodriver[……]

| k4nz

「Selenium」- ChromeDriver only supports characters in the BMP

Problem description
In Selenium, running an automated test produced the following error:

Caught: org.openqa.selenium.WebDriverException: unknown error: ChromeDriver only supports characters in the BMP
(Session info: chrome=87.0.4280.66)
Build info: version: ‘3.141.59’, revision: ‘e82be7d358’, time: ‘2018-11-14T08:17:03’
System info: host: ‘laptop-k53sd’, ip: ‘127.0.1.1’, os.name: ‘Linux’, os.arch: ‘amd64’, os.version: ‘5.4.0-47-generic’, java.version: ‘1.8.0_265’
Driver info: org.openqa.selenium.remote.RemoteWebDriver
Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 87.0.4280.66, chrome: {chromedriverVersion: 87.0.4280.66 (fd98a29dd59b3…, userDataDir: /tmp/.org.chromium.Chromium…}, goog:chromeOptions: {debuggerAddress: localhost:42727}, javascriptEnabled: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, proxy: Proxy(), setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:virtualAuthenticators: true, webdriver.remote.sessionid: 7569ceb732eda9f22ba68994b0e…}
Session ID: 7569ceb732eda9f22ba68994b0ed1ee4
org.openqa.selenium.W[……]

| k4nz

「Selenium」- Element XXX is not clickable at point (672, 582)

Problem description
While running automated tests with Selenium, the following error occurred:

Caught: org.openqa.selenium.ElementClickInterceptedException: element click intercepted: Element <span style=”display: inline-block;”>…</span> is not clickable at point (672, 582). Other element would receive the click: <div class=”garr-footer-publish-content publish-footer-content”>…</div>
(Session info: chrome=87.0.4280.66)
Build info: version: ‘3.141.59’, revision: ‘e82be7d358’, time: ‘2018-11-14T08:17:03’
System info: host: ‘laptop-k53sd’, ip: ‘127.0.1.1’, os.name: ‘Linux’, os.arch: ‘amd64’, os.version: ‘5.4.0-47-generic’, java.version: ‘1.8.0_265’
Driver info: org.openqa.selenium.remote.RemoteWebDriver
Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 87.0.4280.66, chrome: {chromedriverVersion: 87.0.4280.66 (fd98a29dd59b3…, userDataDir: /tmp/.org.chromium.Chromium…}, goog:chromeOptions: {debuggerAddress: localhost:45673}, javascriptEnabled: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, proxy: Proxy(), setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000[……]

| k4nz

「Selenium Grid 3」- Node xxxxx has no free slots（close() vs quit()）

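Problem description
While running automated tests with Selenium Grid 3, a "stalled startup" problem appeared (only after a long wait could the Selenium Node launch a browser and start the test).
The Selenium Hub log showed the following: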

Sep 28 04:25:41 selenium-hub java[4609]: 04:25:41.020 DEBUG [ProxySet.getNewSession] – Available nodes: [http://172.31.253.104:24663]
Sep 28 04:25:41 selenium-hub java[4609]: 04:25:41.020 DEBUG [BaseRemoteProxy.getNewSession] – Trying to create a new session on node http://172.31.253.104:24663
Sep 28 04:25:41 selenium-hub java[4609]: 04:25:41.021 DEBUG [BaseRemoteProxy.getNewSession] – Node http://172.31.253.104:24663 has no free slots

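Cause
In our test code, we did not quit the browser (WebDriver) properly, which left Selenium Node resources occupied; over time the Selenium Node could no longer allocate resources for new tests.
Use webDriver.quit() to close the browser, not webDriver.close().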
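One way to make sure quit() always runs, even when a test fails partway, is to wrap the session in a construct that releases it on exit. The Python sketch below shows the pattern with a stand-in driver so that it runs without a browser; `FakeDriver` and `browser_session` are hypothetical names for this example, and with a real driver the same structure would call that driver's quit().

```python
from contextlib import contextmanager


class FakeDriver:
    """Stand-in for a WebDriver session; it only records whether quit() ran."""

    def __init__(self):
        self.quit_called = False

    def get(self, url):
        # Simulate a failing step for URLs the fake does not accept.
        if not url.startswith("http"):
            raise ValueError(f"unsupported URL: {url}")

    def quit(self):
        self.quit_called = True


@contextmanager
def browser_session(create_driver):
    """Create a driver and always quit it, even if the test body raises."""
    driver = create_driver()
    try:
        yield driver
    finally:
        driver.quit()
```

A test body then runs inside `with browser_session(FakeDriver) as driver:`, and the session is released on both the success path and the failure path, so the slot on the node is freed for the next test.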
webDriver.close()
close() is a WebDriver command that closes the browser window currently in focus. During the automation process, if more than one browser window is open, close() closes only the window that has focus at that time. The remaining browser windows are not closed.
webDriver.quit()
quit() is a webdriver command which calls the driver.dispose method, which in turn closes a[……]

| k4nz

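「STORAGE」- Object Storage

Object-based Storage
《Ceph企业级分布式存储: 原理与工程实践》 (Ceph Enterprise Distributed Storage: Principles and Engineering Practice)
Object storage is an approach to addressing and handling discrete units. The discretized data units are called objects, so data is broken up into many objects. Unlike files in a traditional file system, objects are not organized through a directory tree or subtree.
How it works, briefly
In a flat namespace (called a bucket), all the discretized data objects are retrieved by their Object ID (sometimes called an object key). Applications then access the objects through a Web API, which differs from how a file system is accessed.
An object (like a file) contains a binary data stream, with no size limit. An object also contains metadata that describes the data. Files have metadata too, including file permissions, modification time, and so on. Objects natively support extended metadata, usually managed in K/V form: information about the data in the object is stored in key-value pairs.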
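The model above (a flat namespace, an object ID, a data stream, and key-value metadata) can be sketched as an in-memory Python class. This is a toy illustration and not the API of S3, Swift, or Ceph; the `Bucket` class and its method names are assumptions made for this example.

```python
import uuid


class Bucket:
    """A flat namespace mapping object IDs to (data, metadata) pairs."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, data: bytes, metadata=None, object_id=None):
        # Generate an ID when the caller does not supply one.
        object_id = object_id or uuid.uuid4().hex
        self._objects[object_id] = (bytes(data), dict(metadata or {}))
        return object_id

    def get(self, object_id):
        # Retrieval needs only the ID, never a location or a directory path.
        return self._objects[object_id]

    def list_ids(self):
        return sorted(self._objects)


bucket = Bucket("demo")
oid = bucket.put(b"hello", {"content-type": "text/plain"},
                 object_id="reports/2023/q1.txt")
data, meta = bucket.get(oid)
```

The slashes in `reports/2023/q1.txt` are ordinary characters of the ID. The bucket holds no directories, which is what "flat namespace" means here.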
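Features
Permission isolation: one account can access multiple buckets on the same storage cluster. These buckets may have different access permissions and may be used for different object storage purposes.
Abstracted file retrieval: the advantages of object storage are that it is simple to use and easy to scale. Each object's unique ID allows it to be stored or retrieved without the end user knowing the exact location of the object. Object storage eliminates the directory hierarchy of traditional file systems, which simplifies the relationships among objects.
Web access: object storage cannot be accessed directly by the operating system like a file system on disk; instead, it can only be accessed at the application level through a Web API;
Use cases
Generally, there are two APIs for accessing objects: Amazon S3 and OpenStack Swift (OpenStack Object Storage). Amazon S3 calls the flat namespace of objects a bucket (Bucket); OpenStack Swift calls it a container (Container). Buckets cannot be nested.[……]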

| k4nz

NOTE

/ Recording problems / Solving problems / Tech blog / Work notes /
