安装过程(以CentOS系统为例)

一. 环境要求

几乎所有传统X86设备都支持,只需要确认二点

  1. linux操作系统

  2. cpu支持SSE4.2 指令集

    确定cpu是否支持的方法:

    grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
    #得到以下输出结果
    SSE 4.2 supported

    即使CPU不支持SSE指令集,官方也提供了特定版本的源码包,不过需要自行编译安装,过程较为复杂。

二. 系统参数调整

  1. 时间修正
yes | cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
  1. limit调大
vi /etc/security/limits.conf    #结尾增加如下配置
* soft nproc 1048576
* hard nproc 1048576
* soft nofile 1048576
* hard nofile 1048576
  1. 防火墙关闭(可选,不关闭需要设置进出规则)
systemctl stop firewalld.service
  1. SELinux关闭(可选,不关闭需设置相应规则)
# vi /etc/selinux/config
SELINUX=disabled
  1. 集群设备免密登陆

    ssh-keygen生成密钥即可

三. 安装方式

  1. rpm包安装步骤

    • 安装包下载,地址2选1即可

      【首选,官方维护】https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/

      【次选,三方收集】https://packagecloud.io/Altinity/clickhouse

    • 包很多,选择需要的版本中的4个主要包即可

      wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-server-common-20.3.12.112-1.el7.x86_64.rpm/download.rpm
           
      wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-server-20.3.12.112-1.el7.x86_64.rpm/download.rpm
           
      wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-common-static-20.3.12.112-1.el7.x86_64.rpm/download.rpm
           
      wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-client-20.3.12.112-1.el7.x86_64.rpm/download.rpm
      
    • rpm -ivh ./*.rpm执行安装
  2. 编译安装(步骤较多较复杂,暂不展开)

    • 下载源码包
    • 编译环境准备
    • 执行编译

四. 重要文件结构

  1. 核心目录:

    • /etc/clickhouse-server:服务端的配置文件目录,包括全局配置config.xml和用户配置users.xml等
    • /var/lib/clickhouse:默认的数据存储目录(通常会修改默认路径配置,将数据保存到大容量磁盘挂载的路径)
    • /var/log/clickhouse-server:默认保存日志的目录(通常会修改路径配置,将日志保存到大容量磁盘挂载的路径)
  2. 程序引擎配置文件:

    • /etc/security/limits.d/clickhouse.conf:文件句柄数量的配置
    • /etc/cron.d/clickhouseserver:cron定时任务配置,用于恢复因异常原因中断的ClickHouse服务进程
  3. 程序执行文件:

    • /usr/bin/clickhouse:主程序的可执行文件
    • /usr/bin/clickhouse-client:一个指向ClickHouse可执行文件的软链接,供客户端连接使用
    • /usr/bin/clickhouse-server:一个指向ClickHouse可执行文件的软链接,供服务端启动使用
    • /usr/bin/clickhouse-compressor:内置提供的压缩工具,可用于数据的正压反解

五. 工互ClickHouse集群核心配置文件详解

  1. /etc/clickhouse-server/config.xml
<?xml version="1.0"?>
<yandex>
    <!--本机节点信息-->
    <listen_host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</listen_host>
    <listen_host>::1</listen_host>
    <listen_host>127.0.0.1</listen_host>
    <include_from>/etc/clickhouse-server/metrika.xml</include_from>
    <remote_url_allow_hosts></remote_url_allow_hosts>

    <!--日志-->
    <logger>
        <level>trace</level>
        <log>/data1/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/data1/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
     
    <!--端口-->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>

    <!--SSL-->
    <openSSL>
        <server>
            <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
            <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
            <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile>
            <verificationMode>none</verificationMode>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
        </server>
        <client>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
            <invalidCertificateHandler>
                <name>RejectCertificateHandler</name>
            </invalidCertificateHandler>
        </client>
    </openSSL>
    
    <!--端口-->
    <interserver_http_port>9009</interserver_http_port>

    <!--参数优化/存储路径-->   
    <max_connections>4096</max_connections>
    <keep_alive_timeout>600</keep_alive_timeout>
    <max_concurrent_queries>150</max_concurrent_queries>
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <path>/data1/clickhouse-server/lib/clickhouse/</path>
    <tmp_path>/data1/clickhouse-server/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/data1/clickhouse-server/lib/clickhouse/user_files/</user_files_path>
    <users_config>users.xml</users_config>
    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <mlock_executable>false</mlock_executable>
    <remote_servers incl="clickhouse_remote_servers" ></remote_servers>
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>60</default_session_timeout>

    <!--数据库名称--> 
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    <trace_log>
        <database>system</database>
        <table>trace_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </trace_log>
    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>
    <dictionaries_config>*_dictionary.xml</dictionaries_config>
    <compression incl="clickhouse_compression"></compression>
    <distributed_ddl>
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>
    <graphite_rollup_example>
        <pattern>
            <regexp>click_cost</regexp>
            <function>any</function>
            <retention>
                <age>0</age>
                <precision>3600</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>60</precision>
            </retention>
        </pattern>
        <default>
            <function>max</function>
            <retention>
                <age>0</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>3600</age>
                <precision>300</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>3600</precision>
            </retention>
        </default>
    </graphite_rollup_example>
    <format_schema_path>/data1/clickhouse-server/lib/clickhouse/format_schemas/</format_schema_path>
</yandex>
  1. /etc/clickhouse-server/users.xml
<?xml version="1.0"?>
<yandex>
    <!-- 读写用户设置  -->
    <profiles>
        <default>
            <max_bytes_before_external_group_by>220000000000</max_bytes_before_external_group_by>
            <max_memory_usage>220000000000</max_memory_usage>
            <max_memory_usage_for_all_queries>220000000000</max_memory_usage_for_all_queries>
            <max_execution_time>600</max_execution_time>
            <distributed_product_mode>local</distributed_product_mode>
            <background_pool_size>48</background_pool_size>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <log_queries>1</log_queries>
            <skip_unavailable_shards>1</skip_unavailable_shards>
        </default>

        <readonly>
            <readonly>1</readonly>
        </readonly>
    </profiles>

    <!-- 用户和访问控制 -->
    <users>
        <default>
            <password></password>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
    </users>

    <!-- 用户和配额 -->
    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
</yandex>
  1. /etc/clickhouse-server/metrika.xml
<yandex>
    <!-- 集群配置 -->
    <clickhouse_remote_servers>
        <!--eversec集群数据分片和副本;以下配置1个副本2个分片 -->
        <eversec_shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-02</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-03</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-04</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-05</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-06</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-07</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-08</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-09</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-10</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-11</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-12</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-13</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-14</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-15</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-16</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-17</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-18</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-19</host>
                    <port>9000</port>
                </replica>
            </shard>
        </eversec_shard>

       <!--eversec_shard_none_rep集群数据只分片;无副本  -->
        <eversec_shard_none_rep>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-02</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-03</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-04</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-05</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-06</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-07</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-08</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-09</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-10</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-11</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-12</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-13</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-14</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-15</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-16</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-17</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-18</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-19</host>
                    <port>9000</port>
                </replica>
            </shard>
       </eversec_shard_none_rep>

    </clickhouse_remote_servers>
    <!-- 本节点服务器的副本名称 -->
    <macros>
        <cluster>eversec_shard</cluster>
        <shard>1</shard>
        <replica>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</replica>
    </macros>

    <!-- ZK配置信息 -->
    <zookeeper-servers>
        <node index="1">
            <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-02</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-03</host>
            <port>2181</port>
        </node>
        <node index="4">
            <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-04</host>
            <port>2181</port>
        </node>
        <node index="5">
            <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-05</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>

    <!-- 数据压缩算法 -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>zstd</method>
        </case>
    </clickhouse_compression>
</yandex>

六. 客户端访问接口

1. CLI

即命令行接口,其底层是基于TCP接口进行通信的,是通过clickhouseclient脚本运行的。它拥有两种执行模式。

  1. 交互式执行

    交互式执行可以广泛用于调试、运维、开发和测试等场景,它的使用方法是直接运行clickhouse-client进行登录,然后便可以执行一问一答的交互查询,具体如下所示:

    clickhouse-client 
    ClickHouse client version 20.4.6.53 (official build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 20.4.6 revision 54434.
    
    localhost :) select 1
    
    SELECT 1
    
    ┌─1─┐
    │ 1 │
    └───┘
    
    1 rows in set. Elapsed: 0.040 sec.

    通过交互式执行的SQL语句,相关查询结果会统一被记录到~/.clickhouse-client-history文件,该记录可以作为审计之用。执行记录的顺序是由下至上,从新到旧。

  2. 非交互式执行

    非交互式模式主要用于批处理场景,诸如对数据的导入和导出等操作。在执行脚本命令时,需要追加query参数指定执行的SQL语句。在导入数据时,它可以接收操作系统的stdin标准输入作为写入的数据源。

    cat /chbase/test_fetch.tsv | clickhouse-client --query "INSERT INTO test_fetch FORM ATTSV"

    cat命令读取的文件流,将会作为INSERT查询的数据输入。

    而在数据导出时,则可以将输出流重定向到文件:

    clickhouse-client --query="SELECT * FROM test_fetch" > /chbase/test_fetch.tsv

    在默认的情况下,clickhouseclient一次只能运行一条SQL语句,如果需要执行多次查询,则需要在循环中重复执行,这显然不是一种高效的方式。此时可以追加multiquery参数,它可以支持一次运行多条SQL查询,多条查询语句之间使用分号间隔,例如:

    clickhouse-client -h 10.37.129.10 --multiquery --query="SELECT 1;SELECT 2;SELECT 3;"
  3. 一些重要参数
    • –host/-h:服务端的地址,默认值为localhost
    • –port:服务端的TCP端口,默认值为9000
    • –user/-u:登录的用户名,默认值为default
    • –password:登录的密码,默认值为空。如果在用户定义中未设置密码,则不需要填写(例如默认的default用户)
    • –database/-d:登录的数据库,默认值为default
    • –query/-q:只能在非交互式查询时使用,用于指定SQL语句
    • –multiquery/-n:在非交互式执行时,允许一次运行多条SQL语句,多条语句之间以分号间隔。
    • –time/-t:在非交互式执行时,会打印每条SQL的执行时间

2. JDBC

ClickHouse支持标准的JDBC协议,底层基于HTTP接口通信。使用下面的Maven依赖,即可为Java程序引入官方提供的数据库驱动包:

<dependency>
  <groupId>ru.yandex.clickhouse</groupId>
  <artifactId>clickhousejdbc</artifactId>
  <version>0.2.4</version>
</dependency>

该驱动有两种使用方式

  1. 标准形式:

    标准形式是我们常用的方式,通过JDK原生接口获取连接,其关键参数如下:

    #JDBC Driver Class为
     ru.yandex.clickhouse.ClickHouseDriver;
     
     #JDBC URL为
     jdbc:clickhouse://<host>:<port>[/<database>]

  2. 高可用模式:

    高可用模式允许设置多个host地址,每次会从可用的地址中随机选择一个进行连接,其URL声明格式如下:

    jdbc:clickhouse://<firsthost>:<port>,<secondhost>:<port>[,…]/<database>

3. 其它

除了上面介绍的CLI(基于TCP)和JDBC(基于HTTP)。ClickHouse的访问接口还拥有原生的C++、ODBC接口及众多第三方的集成接口(Python、NodeJS、Go、PHP等),不做依依介绍了,根据使用语言自行查阅。

七. 内置的实用工具

1. clickhouse-local

clickhouse-local可以独立运行大部分SQL查询,不需要依赖任何ClickHouse的服务端程序,它可以理解成是ClickHouse服务的单机版微内核,是一个轻量级的应用程序。clickhouse-local只能够使用File表引擎(关于表引擎的更多介绍在后续章节展开),它的数据与同机运行的ClickHouse服务也是完全隔离的,相互之间并不能访问。clickhouse-local是非交互式运行的,每次执行都需要指定数据来源,例如通过stdin标准输入,以echo打印作为数据来源。

用的比较少,不做展开,这里近展示一个实例,借助操作系统的命令,实现对系统用户内存用量的查询:

ps aux | tail -n +2 | awk '{ printf("%s\t%s\n",$1,$4) }' | clickhouse-local -S "user String, memory Float64" -q "SELECT user,round(sum(memory),2) as memoryTotal FROM table GROUP BY user ORDER BY memoryTotal DESC FORMAT Pretty"

2. clickhouse-benchmark

clickhousebenchmark是基准测试的小工具,它可以自动运行SQL查询,并生成相应的运行指标报告

echo "SELECT * FROM system.numbers LIMIT 100" | clickhouse-benchmark -i 5

查询执行5次,输出报告

八. 利用DBeaver工具连接工户ClickHouse集群


0 条评论

发表回复

Avatar placeholder

您的邮箱地址不会被公开。 必填项已用 * 标注