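References:
1. ClickHouse vs StarRocks: a selection comparison
2. Three approaches to high availability for an OpenVPN service. The article describes three ways to make the OpenVPN server highly available: (1) give the VPN client several configuration files and let the user choose which one to dial; (2) configure load balancing in the client configuration file; (3) use a domain name with DNS round-robin, so that DNS assigns the VPN server automatically.
3. StarRocks cluster management. StarRocks can be upgraded smoothly through a rolling upgrade. The order is BE first, then FE; StarRocks guarantees that BE is backward compatible with FE. The process has three stages: verify that the upgrade is correct, perform the rolling upgrade, and observe the service. The BE-then-FE order must not be reversed: if the upgrade makes old and new FE and BE incompatible, commands sent by a new FE may crash an old BE. With BE upgraded first, the new BE files are already deployed, so once the daemon restarts a BE automatically it comes back as the new version.
4. No more worries about real-time data warehouses: StarRocks + Flink to the rescue!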
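(1) The number of Follower FEs (including the Master) must be odd. Deploying 3 is recommended, which is enough to form a high-availability (HA) setup.
(2) When FE is deployed in HA mode (1 Master, 2 Followers), add Observer FEs to scale the read capacity of the FE layer. You can also keep adding Follower FEs, but this is almost never necessary.
(3) One FE node can usually serve 10-20 BE nodes. Keep the total number of FE nodes below 10; 3 is enough for the vast majority of workloads.
(4) helper must not point to the FE itself. It must point to one or more Master/Follower FEs that already exist and are running normally.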
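The four rules above can be checked mechanically before starting any node. The following is only a sketch: `FeNode` and `check_fe_plan` are hypothetical names made up for this post, not part of StarRocks, and the host names used with it are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FeNode:
    """One planned FE node (hypothetical helper type, not a StarRocks API)."""
    host: str
    role: str                      # "FOLLOWER" (the Master counts as one) or "OBSERVER"
    helper: Optional[str] = None   # host used as helper on first start; None for the first FE


def check_fe_plan(nodes: List[FeNode]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the plan passes."""
    problems = []
    followers = [n for n in nodes if n.role == "FOLLOWER"]
    follower_hosts = {n.host for n in followers}

    # Rule (1): the number of Followers, Master included, must be odd.
    if len(followers) % 2 == 0:
        problems.append(f"{len(followers)} Follower FEs planned; the count must be odd")

    # Rule (3): keep the total number of FE nodes below 10.
    if len(nodes) >= 10:
        problems.append(f"{len(nodes)} FE nodes planned; fewer than 10 is recommended")

    # Rule (4): helper must be another Master/Follower FE, never the node itself.
    for n in nodes:
        if n.helper is None:
            continue
        if n.helper == n.host:
            problems.append(f"{n.host}: helper points to the node itself")
        elif n.helper not in follower_hosts:
            problems.append(f"{n.host}: helper {n.helper} is not a Master/Follower FE")
    return problems
```

Calling `check_fe_plan` on a plan with three Followers and one Observer, all using the first Follower as helper, returns an empty list. The check cannot tell whether the helper is actually running; that part of rule (4) still has to be verified on the machine.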
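I was stuck here for a long time. After starting the FE, the log kept showing: frontend 127.0.0.1:9010 is not added to cluster yet. In other words, the address the FE is using has not been added to the cluster. My machine is on Alibaba Cloud, and ifconfig shows both a loopback address and a NIC address. Following the references, I tried changing the configuration in fe.conf.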
(2) frontend 172.24.148.4:9010 is not added to cluster yet. The main cause was that priority_networks had not been configured in fe.conf.
frontend 172.24.148.4:9010 is not added to cluster yet. role UNKNOWN
2022-03-18 18:41:08,444 WARN (main|1) [Catalog.getClusterIdAndRole():1008] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes:
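priority_networks in fe.conf takes one or more CIDR blocks separated by semicolons, and the FE uses a local address that falls inside one of them. The sketch below only illustrates that CIDR matching with Python's standard `ipaddress` module; it is not StarRocks's own selection code, `pick_ip` is a made-up name, and the `/24` prefix length is my assumption about the subnet.

```python
import ipaddress
from typing import Iterable, Optional


def pick_ip(local_ips: Iterable[str], priority_networks: str) -> Optional[str]:
    """Return the first local IP that falls inside any configured CIDR block.

    priority_networks uses the same shape as the fe.conf value,
    e.g. "172.24.148.0/24" or "10.9.0.0/24;172.24.148.0/24".
    """
    networks = [
        ipaddress.ip_network(cidr.strip(), strict=False)
        for cidr in priority_networks.split(";")
        if cidr.strip()
    ]
    for ip in local_ips:
        addr = ipaddress.ip_address(ip)
        # "addr in net" is False when the IP versions differ, so mixing
        # IPv4 and IPv6 entries is harmless here.
        if any(addr in net for net in networks):
            return ip
    return None  # nothing matched: the configured blocks do not cover this host
```

With the two addresses that ifconfig reports on my machine, `pick_ip(["127.0.0.1", "172.24.148.4"], "172.24.148.0/24")` selects the NIC address rather than the loopback one, which is the effect the setting is meant to have.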
(3) frontend 127.0.0.1:9010 is not added to cluster yet
2022-03-23 10:06:44,662 INFO (nioEventLoopGroup-3-8|116) [BaseAction.handleRequest():88] receive http request. url=/role?host=10.9.0.4&port=9010
2022-03-23 10:06:44,662 WARN (nioEventLoopGroup-3-8|116) [MetaBaseAction.isFromValidFe():98] request is not from valid FE. client: 10.9.0.1
2022-03-23 10:06:44,976 INFO (tablet checker|32) [TabletChecker.checkTablets():312] finished to check tablets. unhealth/total/added/in_sched/not_ready: 0/0/0/0/0, cost: 0 ms
2022-03-23 10:06:44,976 INFO (tablet checker|32) [TabletChecker.runAfterCatalogReady():188] TStat :
TStat num of tablet check round: 30 (+1)
TStat cost of tablet check(ms): 0 (+0)
TStat num of tablet checked in tablet checker: 0 (+0)
TStat num of unhealthy tablet checked in tablet checker: 0 (+0)
TStat num of tablet being added to tablet scheduler: 0 (+0)
TStat num of tablet schedule round: 580 (+20)
TStat cost of tablet schedule(ms): 24 (+0)
TStat num of tablet being scheduled: 0 (+0)
TStat num of tablet being scheduled succeeded: 0 (+0)
TStat num of tablet being scheduled failed: 0 (+0)
TStat num of tablet being scheduled discard: 0 (+0)
TStat num of tablet priority upgraded: 0 (+0)
TStat num of tablet priority downgraded: 0 (+0)
TStat num of clone task: 0 (+0)
TStat num of clone task succeeded: 0 (+0)
TStat num of clone task failed: 0 (+0)
TStat num of clone task timeout: 0 (+0)
TStat num of replica missing error: 0 (+0)
TStat num of replica version missing error: 0 (+0)
TStat num of replica unavailable error: 0 (+0)
TStat num of replica redundant error: 0 (+0)
TStat num of replica missing in cluster error: 0 (+0)
TStat num of balance scheduled: 0 (+0)
TStat num of colocate replica mismatch: 0 (+0)
TStat num of colocate replica redundant: 0 (+0)
2022-03-23 10:06:44,978 INFO (Routine load scheduler|37) [RoutineLoadScheduler.process():76] there are 0 job need schedule
2022-03-23 10:06:44,978 WARN (Routine load task scheduler|38) [RoutineLoadTaskScheduler.process():103] no available be slot to scheduler tasks, wait for 10 seconds to scheduler again, you can set max_routine_load_task_num_per_be bigger in fe.conf, current value is 5
2022-03-23 10:06:45,071 INFO (tablet scheduler|31) [TabletScheduler.adjustPriorities():382] adjust priority for all tablets. changed: 0, total: 0
2022-03-23 10:06:46,091 WARN (heartbeat-mgr-pool-4|103) [Util.getResultForUrl():336] failed to get result from url: http://10.9.0.4:8030/api/bootstrap?cluster_id=1047167524&token=ff455fb4-2f55-4ef4-9fba-78ef83b82fe2. Connection refused
2022-03-23 10:06:46,092 WARN (heartbeat mgr|25) [HeartbeatMgr.runAfterCatalogReady():142] get bad heartbeat response: type: FRONTEND, status: BAD, msg: got exception, name: 10.9.0.4_9010_1648000684518, queryPort: 0, rpcPort: 0, replayedJournalId: 0, feStartTime: \N, feVersion: null
2022-03-23 10:07:35,689 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:07:40,786 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:07:40,787 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:07:45,884 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:07:45,885 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:07:50,982 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:07:50,982 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:07:56,080 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:07:56,081 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:01,178 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:01,179 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:06,276 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:06,276 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:11,334 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:11,334 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:16,435 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:16,436 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:21,533 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:21,533 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:26,631 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:26,631 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:31,729 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:31,729 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:36,827 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:36,827 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:41,925 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:41,926 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:47,023 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:47,024 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:52,121 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:52,122 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
2022-03-23 10:08:57,219 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1116] failed to get fe node type from helper node: 10.9.0.6:9010. response code: 400
2022-03-23 10:08:57,220 WARN (main|1) [Catalog.getClusterIdAndRole():996] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [10.9.0.6:9010]
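One possible reading of the helper's log is that the new FE announces itself as 10.9.0.4 in the /role request while the helper sees the connection coming from 10.9.0.1, and rejects it as "not from valid FE", which would explain the 400 responses on the new node. The log does not state this outright, so treat it as my interpretation. The sketch below just scans a helper log for such pairs; the function name is made up, and the two regular expressions are written against the exact log lines quoted in this post, so they may need adjusting for other versions.

```python
import re
from typing import List, Tuple

# Patterns based on the two helper-side log lines quoted in this post.
ROLE_REQUEST = re.compile(r"url=/role\?host=(?P<host>[\d.]+)&port=(?P<port>\d+)")
INVALID_FE = re.compile(r"request is not from valid FE\. client: (?P<client>[\d.]+)")


def find_address_mismatches(log_text: str) -> List[Tuple[str, str]]:
    """Return (announced_host, seen_client) pairs where the two addresses differ.

    A /role request is paired with the next "not from valid FE" warning that
    follows it in the log.
    """
    mismatches = []
    announced = None
    for line in log_text.splitlines():
        request = ROLE_REQUEST.search(line)
        if request:
            announced = request.group("host")
        rejected = INVALID_FE.search(line)
        if rejected and announced is not None:
            client = rejected.group("client")
            if client != announced:
                mismatches.append((announced, client))
            announced = None  # this request has been accounted for
    return mismatches
```

Run against the helper log above, this reports the pair 10.9.0.4 and 10.9.0.1. If the machines talk to each other through a VPN or NAT, that is the first place I would look, together with the priority_networks setting on the new FE.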