Kafka Cluster Deployment


1. Cluster planning
Deploy across three machines: node03, node04, and node05.
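The node-to-address-to-id mapping used throughout this post (inferred from the configuration shown below):

node03  192.168.183.102  broker.id=0
node04  192.168.183.103  broker.id=1
node05  192.168.183.104  broker.id=2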
2. Download the Kafka package
Download from http://kafka.apache.org/downloads and pick a Kafka version.
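If you prefer to fetch the package from the command line, older releases live in the Apache archive (URL assumed from the standard archive layout; verify it for the version you picked):

wget https://archive.apache.org/dist/kafka/0.10.2.1/kafka_2.11-0.10.2.1.tgz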
3. As the hadoop user, upload the package to one of the machines (node03 here) and extract it to /home/hadoop/apps
tar -zxvf kafka_2.11-0.10.2.1.tgz -C /home/hadoop/apps
4. Edit the configuration file
cd /home/hadoop/apps/kafka_2.11-0.10.2.1/config
vim server.properties
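On node03, the main items to set (all taken from the full server.properties listed at the end of this post) are:

broker.id=0
host.name=192.168.183.102
delete.topic.enable=true
log.dirs=/home/hadoop/apps/kafka_2.11-0.10.2.1/kafka-logs
zookeeper.connect=192.168.183.100:2181,192.168.183.101:2181,192.168.183.102:2181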
5. Create a kafka-logs folder under /home/hadoop/apps/kafka_2.11-0.10.2.1 (this is the log.dirs data directory configured above)
mkdir /home/hadoop/apps/kafka_2.11-0.10.2.1/kafka-logs
6. Use scp to copy the configured Kafka installation to node04 and node05
scp -r /home/hadoop/apps/kafka_2.11-0.10.2.1 hadoop@node04:/home/hadoop/apps/
scp -r /home/hadoop/apps/kafka_2.11-0.10.2.1 hadoop@node05:/home/hadoop/apps/
7. Edit server.properties on node04 and node05
7.1 Changes to server.properties on node04
broker.id=1
host.name=192.168.183.103
7.2 Changes to server.properties on node05
broker.id=2
host.name=192.168.183.104
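If you would rather script these per-node edits than open vim on each host, a minimal sed sketch (shown for node04; swap in 2 and 192.168.183.104 on node05):

sed -i 's/^broker.id=.*/broker.id=1/' /home/hadoop/apps/kafka_2.11-0.10.2.1/config/server.properties
sed -i 's/^host.name=.*/host.name=192.168.183.103/' /home/hadoop/apps/kafka_2.11-0.10.2.1/config/server.properties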
8. Start Kafka on node03, node04, and node05
cd /home/hadoop/apps/kafka_2.11-0.10.2.1
With the -daemon option, Kafka starts as a daemon (background) process.
bin/kafka-server-start.sh -daemon config/server.properties
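To verify that a broker came up, check for the Kafka process with jps (bundled with the JDK):

jps | grep Kafka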
9. Log directory
By default, server logs go to the logs folder created under the Kafka installation directory.
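For example, to watch the broker's main log for startup errors:

tail -f /home/hadoop/apps/kafka_2.11-0.10.2.1/logs/server.log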

Configuration file: server.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# see kafka.server.KafkaConfig for additional details and defaults

############################# Server Basics #############################

# Each broker's id must be unique; give every broker in the cluster a different id
broker.id=0

# Port the broker listens on
port=9092

# Address the broker binds to
host.name=192.168.183.102

# Allow topics to be deleted
delete.topic.enable=true

# The number of threads handling network requests
num.network.threads=3

# The number of threads doing disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# Data directory; defaults to a path under /tmp and should be changed
log.dirs=/home/hadoop/apps/kafka_2.11-0.10.2.1/kafka-logs

# Default number of partitions for newly created topics
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# Data retention period in hours; the default is 7 days (168 hours)
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# ZooKeeper connection addresses; separate multiple addresses with commas
zookeeper.connect=192.168.183.100:2181,192.168.183.101:2181,192.168.183.102:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
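
A quick smoke test once all three brokers are up, run from the Kafka directory on any node (the topic name "test" is arbitrary; the flags below match the 0.10.x command-line tools):

bin/kafka-topics.sh --create --zookeeper 192.168.183.100:2181 --replication-factor 3 --partitions 3 --topic test
bin/kafka-topics.sh --describe --zookeeper 192.168.183.100:2181 --topic test
bin/kafka-console-producer.sh --broker-list 192.168.183.102:9092 --topic test
bin/kafka-console-consumer.sh --bootstrap-server 192.168.183.102:9092 --topic test --from-beginning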
