安装过程(以CentOS系统为例)
一. 环境要求
几乎所有传统X86设备都支持,只需要确认二点
linux操作系统
cpu支持SSE4.2 指令集
确定cpu是否支持的方法:
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported" #得到以下输出结果 SSE 4.2 supported
即使CPU不支持SSE指令集,官方也提供了特定版本的源码包,不过需要自行编译安装,过程较为复杂。
二. 系统参数调整
- 时间修正
yes | cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
- limit调大
vi /etc/security/limits.conf #结尾增加如下配置 * soft nproc 1048576 * hard nproc 1048576 * soft nofile 1048576 * hard nofile 1048576
- 防火墙关闭(可选,不关闭需要设置进出规则)
systemctl stop firewalld.service
- SELinux关闭(可选,不关闭需设置相应规则)
# vi /etc/selinux/config SELINUX=disabled
集群设备免密登陆
ssh-keygen生成密钥即可
三. 安装方式
rpm包安装步骤
安装包下载,地址2选1即可
【首选,官方维护】https://repo.yandex.ru/clickhouse/rpm/stable/x86_64/
包很多,选择需要的版本中的4个主要包即可
wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-server-common-20.3.12.112-1.el7.x86_64.rpm/download.rpm wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-server-20.3.12.112-1.el7.x86_64.rpm/download.rpm wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-common-static-20.3.12.112-1.el7.x86_64.rpm/download.rpm wget --content-disposition https://packagecloud.io/Altinity/clickhouse/packages/el/7/clickhouse-client-20.3.12.112-1.el7.x86_64.rpm/download.rpm
- rpm -ivh ./*.rpm执行安装
编译安装(步骤较多较复杂,暂不展开)
- 下载源码包
- 编译环境准备
- 执行编译
四. 重要文件结构
核心目录:
- /etc/clickhouse-server:服务端的配置文件目录,包括全局配置config.xml和用户配置users.xml等
- /var/lib/clickhouse:默认的数据存储目录(通常会修改默认路径配置,将数据保存到大容量磁盘挂载的路径)
- /var/log/clickhouse-server:默认保存日志的目录(通常会修改路径配置,将日志保存到大容量磁盘挂载的路径)
程序引擎配置文件:
- /etc/security/limits.d/clickhouse.conf:文件句柄数量的配置
- /etc/cron.d/clickhouseserver:cron定时任务配置,用于恢复因异常原因中断的ClickHouse服务进程
程序执行文件:
- /usr/bin/clickhouse:主程序的可执行文件
- /usr/bin/clickhouse-client:一个指向ClickHouse可执行文件的软链接,供客户端连接使用
- /usr/bin/clickhouse-server:一个指向ClickHouse可执行文件的软链接,供服务端启动使用
- /usr/bin/clickhouse-compressor:内置提供的压缩工具,可用于数据的正压反解
五. 工互ClickHouse集群核心配置文件详解
- /etc/clickhouse-server/config.xml
<?xml version="1.0"?> <yandex> <!--本机节点信息--> <listen_host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</listen_host> <listen_host>::1</listen_host> <listen_host>127.0.0.1</listen_host> <include_from>/etc/clickhouse-server/metrika.xml</include_from> <remote_url_allow_hosts></remote_url_allow_hosts> <!--日志--> <logger> <level>trace</level> <log>/data1/clickhouse-server/clickhouse-server.log</log> <errorlog>/data1/clickhouse-server/clickhouse-server.err.log</errorlog> <size>1000M</size> <count>10</count> </logger> <!--端口--> <http_port>8123</http_port> <tcp_port>9000</tcp_port> <!--SSL--> <openSSL> <server> <certificateFile>/etc/clickhouse-server/server.crt</certificateFile> <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile> <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile> <verificationMode>none</verificationMode> <loadDefaultCAFile>true</loadDefaultCAFile> <cacheSessions>true</cacheSessions> <disableProtocols>sslv2,sslv3</disableProtocols> <preferServerCiphers>true</preferServerCiphers> </server> <client> <loadDefaultCAFile>true</loadDefaultCAFile> <cacheSessions>true</cacheSessions> <disableProtocols>sslv2,sslv3</disableProtocols> <preferServerCiphers>true</preferServerCiphers> <invalidCertificateHandler> <name>RejectCertificateHandler</name> </invalidCertificateHandler> </client> </openSSL> <!--端口--> <interserver_http_port>9009</interserver_http_port> <!--参数优化/存储路径--> <max_connections>4096</max_connections> <keep_alive_timeout>600</keep_alive_timeout> <max_concurrent_queries>150</max_concurrent_queries> <uncompressed_cache_size>8589934592</uncompressed_cache_size> <mark_cache_size>5368709120</mark_cache_size> <path>/data1/clickhouse-server/lib/clickhouse/</path> <tmp_path>/data1/clickhouse-server/lib/clickhouse/tmp/</tmp_path> <user_files_path>/data1/clickhouse-server/lib/clickhouse/user_files/</user_files_path> <users_config>users.xml</users_config> <default_profile>default</default_profile> <default_database>default</default_database> <mlock_executable>false</mlock_executable> <remote_servers incl="clickhouse_remote_servers" ></remote_servers> <zookeeper incl="zookeeper-servers" optional="true" /> <macros incl="macros" optional="true" /> <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval> <max_session_timeout>3600</max_session_timeout> <default_session_timeout>60</default_session_timeout> <!--数据库名称--> <query_log> <database>system</database> <table>query_log</table> <partition_by>toYYYYMM(event_date)</partition_by> <flush_interval_milliseconds>7500</flush_interval_milliseconds> </query_log> <trace_log> <database>system</database> <table>trace_log</table> <partition_by>toYYYYMM(event_date)</partition_by> <flush_interval_milliseconds>7500</flush_interval_milliseconds> </trace_log> <query_thread_log> <database>system</database> <table>query_thread_log</table> <partition_by>toYYYYMM(event_date)</partition_by> <flush_interval_milliseconds>7500</flush_interval_milliseconds> </query_thread_log> <dictionaries_config>*_dictionary.xml</dictionaries_config> <compression incl="clickhouse_compression"></compression> <distributed_ddl> <path>/clickhouse/task_queue/ddl</path> </distributed_ddl> <graphite_rollup_example> <pattern> <regexp>click_cost</regexp> <function>any</function> <retention> <age>0</age> <precision>3600</precision> </retention> <retention> <age>86400</age> <precision>60</precision> </retention> </pattern> <default> <function>max</function> <retention> <age>0</age> <precision>60</precision> </retention> <retention> <age>3600</age> <precision>300</precision> </retention> <retention> <age>86400</age> <precision>3600</precision> </retention> </default> </graphite_rollup_example> <format_schema_path>/data1/clickhouse-server/lib/clickhouse/format_schemas/</format_schema_path> </yandex>
- /etc/clickhouse-server/users.xml
<?xml version="1.0"?> <yandex> <!-- 读写用户设置 --> <profiles> <default> <max_bytes_before_external_group_by>220000000000</max_bytes_before_external_group_by> <max_memory_usage>220000000000</max_memory_usage> <max_memory_usage_for_all_queries>220000000000</max_memory_usage_for_all_queries> <max_execution_time>600</max_execution_time> <distributed_product_mode>local</distributed_product_mode> <background_pool_size>48</background_pool_size> <use_uncompressed_cache>0</use_uncompressed_cache> <load_balancing>random</load_balancing> <log_queries>1</log_queries> <skip_unavailable_shards>1</skip_unavailable_shards> </default> <readonly> <readonly>1</readonly> </readonly> </profiles> <!-- 用户和访问控制 --> <users> <default> <password></password> <networks incl="networks" replace="replace"> <ip>::/0</ip> </networks> <profile>default</profile> <quota>default</quota> </default> </users> <!-- 用户和配额 --> <quotas> <default> <interval> <duration>3600</duration> <queries>0</queries> <errors>0</errors> <result_rows>0</result_rows> <read_rows>0</read_rows> <execution_time>0</execution_time> </interval> </default> </quotas> </yandex>
- /etc/clickhouse-server/metrika.xml
<yandex> <!-- 集群配置 --> <clickhouse_remote_servers> <!--eversec集群数据分片和副本;以下配置1个副本2个分片 --> <eversec_shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-02</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-03</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-04</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-05</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-06</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-07</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-08</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-09</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-10</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-11</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-12</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-13</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-14</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-15</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-16</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-17</host> <port>9000</port> </replica> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-18</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-19</host> <port>9000</port> </replica> </shard> </eversec_shard> <!--eversec_shard_none_rep集群数据只分片;无副本 --> <eversec_shard_none_rep> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-02</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-03</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-04</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-05</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-06</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-07</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-08</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-09</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-10</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-11</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-12</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-13</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-14</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-15</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-16</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-17</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-18</host> <port>9000</port> </replica> </shard> <shard> <internal_replication>true</internal_replication> <replica> <host>SDGJ-GYHLW-SLC17F-A08-clickhouse-19</host> <port>9000</port> </replica> </shard> </eversec_shard_none_rep> </clickhouse_remote_servers> <!-- 本节点服务器的副本名称 --> <macros> <cluster>eversec_shard</cluster> <shard>1</shard> <replica>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</replica> </macros> <!-- ZK配置信息 --> <zookeeper-servers> <node index="1"> <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-01</host> <port>2181</port> </node> <node index="2"> <host>SDGJ-GYHLW-SLC17F-B09-clickhouse-02</host> <port>2181</port> </node> <node index="3"> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-03</host> <port>2181</port> </node> <node index="4"> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-04</host> <port>2181</port> </node> <node index="5"> <host>SDGJ-GYHLW-SLC17F-A09-clickhouse-05</host> <port>2181</port> </node> </zookeeper-servers> <!-- 数据压缩算法 --> <clickhouse_compression> <case> <min_part_size>10000000000</min_part_size> <min_part_size_ratio>0.01</min_part_size_ratio> <method>zstd</method> </case> </clickhouse_compression> </yandex>
六. 客户端访问接口
1. CLI
即命令行接口,其底层是基于TCP接口进行通信的,是通过clickhouseclient脚本运行的。它拥有两种执行模式。
交互式执行
交互式执行可以广泛用于调试、运维、开发和测试等场景,它的使用方法是直接运行clickhouse-client进行登录,然后便可以执行一问一答的交互查询,具体如下所示:
clickhouse-client ClickHouse client version 20.4.6.53 (official build). Connecting to localhost:9000 as user default. Connected to ClickHouse server version 20.4.6 revision 54434. localhost :) select 1 SELECT 1 ┌─1─┐ │ 1 │ └───┘ 1 rows in set. Elapsed: 0.040 sec.
通过交互式执行的SQL语句,相关查询结果会统一被记录到~/.clickhouse-client-history文件,该记录可以作为审计之用。执行记录的顺序是由下至上,从新到旧。
非交互式执行
非交互式模式主要用于批处理场景,诸如对数据的导入和导出等操作。在执行脚本命令时,需要追加query参数指定执行的SQL语句。在导入数据时,它可以接收操作系统的stdin标准输入作为写入的数据源。
cat /chbase/test_fetch.tsv | clickhouse-client --query "INSERT INTO test_fetch FORM ATTSV"
cat命令读取的文件流,将会作为INSERT查询的数据输入。
而在数据导出时,则可以将输出流重定向到文件:
clickhouse-client --query="SELECT * FROM test_fetch" > /chbase/test_fetch.tsv
在默认的情况下,clickhouseclient一次只能运行一条SQL语句,如果需要执行多次查询,则需要在循环中重复执行,这显然不是一种高效的方式。此时可以追加multiquery参数,它可以支持一次运行多条SQL查询,多条查询语句之间使用分号间隔,例如:
clickhouse-client -h 10.37.129.10 --multiquery --query="SELECT 1;SELECT 2;SELECT 3;"
- 一些重要参数
- –host/-h:服务端的地址,默认值为localhost
- –port:服务端的TCP端口,默认值为9000
- –user/-u:登录的用户名,默认值为default
- –password:登录的密码,默认值为空。如果在用户定义中未设置密码,则不需要填写(例如默认的default用户)
- –database/-d:登录的数据库,默认值为default
- –query/-q:只能在非交互式查询时使用,用于指定SQL语句
- –multiquery/-n:在非交互式执行时,允许一次运行多条SQL语句,多条语句之间以分号间隔。
- –time/-t:在非交互式执行时,会打印每条SQL的执行时间
2. JDBC
ClickHouse支持标准的JDBC协议,底层基于HTTP接口通信。使用下面的Maven依赖,即可为Java程序引入官方提供的数据库驱动包:
<dependency> <groupId>ru.yandex.clickhouse</groupId> <artifactId>clickhousejdbc</artifactId> <version>0.2.4</version> </dependency>
该驱动有两种使用方式
标准形式:
标准形式是我们常用的方式,通过JDK原生接口获取连接,其关键参数如下:
#JDBC Driver Class为 ru.yandex.clickhouse.ClickHouseDriver; #JDBC URL为 jdbc:clickhouse://<host>:<port>[/<database>]
高可用模式:
高可用模式允许设置多个host地址,每次会从可用的地址中随机选择一个进行连接,其URL声明格式如下:
jdbc:clickhouse://<firsthost>:<port>,<secondhost>:<port>[,…]/<database>
3. 其它
除了上面介绍的CLI(基于TCP)和JDBC(基于HTTP)。ClickHouse的访问接口还拥有原生的C++、ODBC接口及众多第三方的集成接口(Python、NodeJS、Go、PHP等),不做依依介绍了,根据使用语言自行查阅。
七. 内置的实用工具
1. clickhouse-local
clickhouse-local可以独立运行大部分SQL查询,不需要依赖任何ClickHouse的服务端程序,它可以理解成是ClickHouse服务的单机版微内核,是一个轻量级的应用程序。clickhouse-local只能够使用File表引擎(关于表引擎的更多介绍在后续章节展开),它的数据与同机运行的ClickHouse服务也是完全隔离的,相互之间并不能访问。clickhouse-local是非交互式运行的,每次执行都需要指定数据来源,例如通过stdin标准输入,以echo打印作为数据来源。
用的比较少,不做展开,这里近展示一个实例,借助操作系统的命令,实现对系统用户内存用量的查询:
ps aux | tail -n +2 | awk '{ printf("%s\t%s\n",$1,$4) }' | clickhouse-local -S "user String, memory Float64" -q "SELECT user,round(sum(memory),2) as memoryTotal FROM table GROUP BY user ORDER BY memoryTotal DESC FORMAT Pretty"
2. clickhouse-benchmark
clickhousebenchmark是基准测试的小工具,它可以自动运行SQL查询,并生成相应的运行指标报告
echo "SELECT * FROM system.numbers LIMIT 100" | clickhouse-benchmark -i 5
查询执行5次,输出报告
0 条评论