Installing Apache Flink 1.10.2 on CDH 6.3.2 with a Parcel Package
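- Reference materials
- Download locations
- Adjusting the Maven configuration file
- Compiling Flink
  - 1. Create the service directory
  - 2. Download the media
  - 3. Compile Flink Shaded
- Building the Parcel package
- Configuring the Flink Parcel
  - 1. Node configuration
  - 2. In the CM Web UI, open the Parcel configuration and add http://${httpd_server_ip}/flink1.10.2
  - 3. The Flink Parcel package is then detected on the Parcels page
  - 4. Download => Distribute => Activate the Parcel package
- Deploying the Flink service
  - 1. Restart the cloudera-scm-server service
  - 2. Copy Flink Shaded to the required path
  - 3. Complete the Flink deployment wizard (if Kerberos is not configured, clear the two Kerberos settings)
- Verifying the Flink service
  - 1. In the YARN application list, a long-running application named "Flink session cluster" is present
  - 2. From this application's details, jump to the Flink Dashboard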
Reference materials
- Official documentation 01: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/
- Official documentation 02: https://nightlies.apache.org/flink/flink-docs-release-1.12/deployment/resource-providers/yarn.html
- CSDN article: https://blog.csdn.net/qq_31454379/article/details/110440037
Download locations
- Flink Shaded 10.0 source package: https://archive.apache.org/dist/flink/flink-shaded-10.0/flink-shaded-10.0-src.tgz
- Flink 1.10.2 source package: https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-src.tgz
- Flink 1.10.2 binary package: https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz
- Flink Parcel GitHub project: https://github.com/pkeropen/flink-parcel.git
Adjusting the Maven configuration file
# Back up the original file
cp /data/maven/apache-maven-3.6.3/conf/settings.xml /data/maven/apache-maven-3.6.3/conf/settings.xml.orig

# Add the mirror entries
# Before the closing "</mirrors>" tag on line 159, add the following configuration.
# Maven does not aggregate mirrors: for a given repository it uses the first
# mirror whose <mirrorOf> matches, so the later entries only serve as
# alternatives to move to the top if the first one is unreachable.
<!-- Flink source compilation -->
<mirror>
  <id>alimaven</id>
  <mirrorOf>central</mirrorOf>
  <name>aliyun maven</name>
  <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
</mirror>
<mirror>
  <id>alimaven</id>
  <name>aliyun maven</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
  <id>central</id>
  <name>Maven Repository Switchboard</name>
  <url>http://repo1.maven.org/maven2/</url>
  <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
  <id>repo2</id>
  <mirrorOf>central</mirrorOf>
  <name>Human Readable Name for this Mirror.</name>
  <url>http://repo2.maven.org/maven2/</url>
</mirror>
<mirror>
  <id>ibiblio</id>
  <mirrorOf>central</mirrorOf>
  <name>Human Readable Name for this Mirror.</name>
  <url>http://mirrors.ibiblio.org/pub/mirrors/maven2/</url>
</mirror>
<mirror>
  <id>jboss-public-repository-group</id>
  <mirrorOf>central</mirrorOf>
  <name>JBoss Public Repository Group</name>
  <url>http://repository.jboss.org/nexus/content/groups/public</url>
</mirror>
<mirror>
  <id>google-maven-central</id>
  <name>Google Maven Central</name>
  <url>https://maven-central.storage.googleapis.com</url>
  <mirrorOf>central</mirrorOf>
</mirror>
<!-- Mirror of the central repository in China -->
<mirror>
  <id>maven.net.cn</id>
  <name>oneof the central mirrors in china</name>
  <url>http://maven.net.cn/content/groups/public/</url>
  <mirrorOf>central</mirrorOf>
</mirror>
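After editing settings.xml by hand, a quick sanity check can confirm that the mirror entries actually landed in the file. The sketch below is only an illustration: `count_mirrors` is a hypothetical helper name, and it counts `<mirror>` tags with grep rather than parsing the XML.

```shell
# Hypothetical helper: count <mirror> entries in a Maven settings file.
# grep -o prints one line per match, so several tags on one line are
# still counted individually; "<mirrors>" itself does not match.
count_mirrors() {
  settings_file="$1"
  grep -o '<mirror>' "$settings_file" | wc -l | tr -d ' '
}

# Example (path taken from the step above):
#   count_mirrors /data/maven/apache-maven-3.6.3/conf/settings.xml
#   count_mirrors /data/maven/apache-maven-3.6.3/conf/settings.xml.orig
```

Comparing the count for settings.xml with the count for settings.xml.orig shows how many entries were added. Running `mvn help:effective-settings` afterwards prints the settings Maven actually resolved.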
Compiling Flink
1. Create the service directory
mkdir -p /data/flink
2. Download the media
wget https://archive.apache.org/dist/flink/flink-shaded-10.0/flink-shaded-10.0-src.tgz -P /data/flink
wget https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz -P /data/flink
3. Compile Flink Shaded
# Extract the Flink Shaded archive
tar -xzf /data/flink/flink-shaded-10.0-src.tgz -C /data/flink

# Back up the original pom.xml
cp /data/flink/flink-shaded-10.0/pom.xml /data/flink/flink-shaded-10.0/pom.xml.orig

# Edit pom.xml
# Before the closing "</profiles>" tag on line 170, add the following profile
<profile>
  <id>vendor-repos</id>
  <activation>
    <property>
      <name>vendor-repos</name>
    </property>
  </activation>
  <!-- Add vendor maven repositories -->
  <repositories>
    <!-- Cloudera -->
    <repository>
      <id>cloudera-releases</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>
    <!-- Hortonworks -->
    <repository>
      <id>HDPReleases</id>
      <name>HDP Releases</name>
      <url>https://repo.hortonworks.com/content/repositories/releases/</url>
      <snapshots><enabled>false</enabled></snapshots>
      <releases><enabled>true</enabled></releases>
    </repository>
    <repository>
      <id>HortonworksJettyHadoop</id>
      <name>HDP Jetty</name>
      <url>https://repo.hortonworks.com/content/repositories/jetty-hadoop</url>
      <snapshots><enabled>false</enabled></snapshots>
      <releases><enabled>true</enabled></releases>
    </repository>
    <!-- MapR -->
    <repository>
      <id>mapr-releases</id>
      <url>https://repository.mapr.com/maven/</url>
      <snapshots><enabled>false</enabled></snapshots>
      <releases><enabled>true</enabled></releases>
    </repository>
  </repositories>
</profile>

# Compile Flink Shaded
# - clean: removes the files generated by earlier builds (deletes the target directories)
# - install: installs the built artifacts into the local Maven repository so that other projects can reference them
# - -DskipTests: skips running the tests during the build
# - -Pvendor-repos: activates the vendor-repos Maven profile
# - -Dhadoop.version=3.0.0-cdh6.3.2: sets the Hadoop version to 3.0.0-cdh6.3.2
# - -Dscala-2.12: sets the scala-2.12 property to select Scala 2.12
# - -Drat.skip=true: skips the Release Audit Tool (RAT) check, which verifies that the project meets the Apache license requirements
# - -T10C: enables a parallel build with 10 threads per CPU core (the C suffix multiplies the number by the core count)
cd /data/flink/flink-shaded-10.0/ && mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=3.0.0-cdh6.3.2 -Dscala-2.12 -Drat.skip=true -T10C
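Once the build finishes, it helps to confirm that the Hadoop uber jar needed in the deployment step was actually produced. The following is a minimal sketch under assumptions: the function names `shaded_uber_jar` and `check_shaded_build` are hypothetical, and the module path and file-name pattern are taken from the copy command used later in this guide rather than from the flink-shaded build itself.

```shell
# Hypothetical helper: build the expected uber jar file name from the
# Hadoop version and the flink-shaded version (pattern assumed from
# the jar name used in the deployment step of this guide).
shaded_uber_jar() {
  hadoop_version="$1"
  shaded_version="$2"
  echo "flink-shaded-hadoop-2-uber-${hadoop_version}-${shaded_version}.jar"
}

# Hypothetical helper: report whether the jar exists under the source tree.
# Returns 0 if found, 1 if missing.
check_shaded_build() {
  src_root="$1"
  jar_name="$(shaded_uber_jar "$2" "$3")"
  jar_path="${src_root}/flink-shaded-hadoop-2-parent/flink-shaded-hadoop-2-uber/target/${jar_name}"
  if [ -f "$jar_path" ]; then
    echo "found: $jar_path"
  else
    echo "missing: $jar_path" >&2
    return 1
  fi
}

# Example:
#   check_shaded_build /data/flink/flink-shaded-10.0 3.0.0-cdh6.3.2 10.0
```

A non-zero return code means the build should be inspected before moving on, since the deployment step copies this exact file.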
Building the Parcel package
# Download the flink-parcel project
cd /data/flink
git clone https://github.com/pkeropen/flink-parcel.git

# Copy flink-1.10.2-bin-scala_2.12.tgz into the project directory
cp /data/flink/flink-1.10.2-bin-scala_2.12.tgz /data/flink/flink-parcel/

# Back up the original configuration file
cp /data/flink/flink-parcel/flink-parcel.properties /data/flink/flink-parcel/flink-parcel.properties.orig

# Rewrite the configuration file
cat > /data/flink/flink-parcel/flink-parcel.properties << EOF
# Flink download URL
FLINK_URL=https://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz
# Flink version
FLINK_VERSION=1.10.2
# Extension version
EXTENS_VERSION=BIN-SCALA_2.12
# Operating system version, using CentOS as the example
OS_VERSION=7
# CDH minor version range
CDH_MIN_FULL=5.2
CDH_MAX_FULL=6.3.3
# CDH major version range
CDH_MIN=5
CDH_MAX=6
EOF

# Make build.sh executable
cd /data/flink/flink-parcel
chmod +x build.sh

# Build the Flink Parcel
sh build.sh parcel

# Generate the CSD files
# On YARN
sh build.sh csd_on_yarn
# Standalone
sh build.sh csd_standalone

# Check that the required files have been generated
ll /data/flink/flink-parcel
-rwxr-xr-x 1 root root 5863 Nov 27 14:50 build.sh
drwxr-xr-x 6 root root 142 Nov 27 15:03 cm_ext
drwxr-xr-x 4 root root 29 Nov 27 15:31 FLINK-1.10.2-BIN-SCALA_2.12
drwxr-xr-x 2 root root 123 Nov 27 15:31 FLINK-1.10.2-BIN-SCALA_2.12_build
-rw-r--r-- 1 root root 280626150 Nov 27 14:52 flink-1.10.2-bin-scala_2.12.tgz
-rw-r--r-- 1 root root 7737 Nov 27 15:40 FLINK-1.10.2.jar
drwxr-xr-x 5 root root 53 Nov 27 15:40 flink_csd_build
drwxr-xr-x 5 root root 53 Nov 27 14:50 flink-csd-on-yarn-src
drwxr-xr-x 5 root root 53 Nov 27 14:50 flink-csd-standalone-src
-rw-r--r-- 1 root root 8260 Nov 27 15:40 FLINK_ON_YARN-1.10.2.jar
-rw-r--r-- 1 root root 350 Nov 27 14:55 flink-parcel.properties
-rw-r--r-- 1 root root 346 Nov 27 14:53 flink-parcel.properties.orig
drwxr-xr-x 3 root root 85 Nov 27 14:50 flink-parcel-src
-rw-r--r-- 1 root root 11357 Nov 27 14:50 LICENSE
-rw-r--r-- 1 root root 4334 Nov 27 14:50 README.md
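The values in flink-parcel.properties determine the name of the generated parcel. The sketch below derives the expected file name so it can be compared with what build.sh produced. The naming pattern is an assumption inferred from the file listings in this guide (FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel); build.sh remains the authority, and `parcel_file_name` is a hypothetical helper.

```shell
# Hypothetical helper: derive the expected parcel file name from a
# flink-parcel.properties file. The properties file is plain KEY=VALUE
# with "#" comments, so it can be sourced directly.
parcel_file_name() {
  props="$1"
  # "." looks up bare names on PATH, so force a path with a slash in it.
  case "$props" in
    */*) ;;
    *) props="./$props" ;;
  esac
  (
    # The subshell keeps the sourced variables out of the caller's environment.
    . "$props"
    echo "FLINK-${FLINK_VERSION}-${EXTENS_VERSION}-el${OS_VERSION}.parcel"
  )
}

# Example:
#   parcel_file_name /data/flink/flink-parcel/flink-parcel.properties
```

If the printed name does not appear in the `_build` directory, the most likely cause is a mismatch between the properties file and the tarball that was copied in.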
Configuring the Flink Parcel
1. Node configuration
# Copy the CSD files to /opt/cloudera/csd on the cloudera-scm-server node
cd /data/flink/flink-parcel
scp FLINK-1.10.2.jar FLINK_ON_YARN-1.10.2.jar root@cloudera-scm-server:/opt/cloudera/csd

# Configure the httpd service to publish the Flink Parcel manifest and media
ln -s /data/flink/flink-parcel/FLINK-1.10.2-BIN-SCALA_2.12_build /var/www/html/flink1.10.2

# List the published Flink Parcel manifest and media
ll /var/www/html/flink1.10.2/
-rw-r--r-- 1 root root 280629521 Nov 27 15:47 FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel
-rw-r--r-- 1 root root 41 Nov 27 15:47 FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel.sha
-rw-r--r-- 1 root root 583 Nov 27 15:47 manifest.json

# Back up the original file
cp /etc/httpd/conf/httpd.conf /etc/httpd/conf/httpd.conf.orig

# Adjust the configuration file
# Change line 284 to the following
AddType application/x-gzip .gz .tgz .parcel

# Restart the service so the change takes effect
systemctl restart httpd

# Check that the files are being served
curl http://${httpd_server_ip}/flink1.10.
2. In the CM Web UI, open the Parcel configuration and add http://${httpd_server_ip}/flink1.10.2
3. The Flink Parcel package is then detected on the Parcels page
4. Download => Distribute => Activate the Parcel package
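If the download step in Cloudera Manager fails with a hash error, the parcel and its .sha file are probably out of sync. The following sketch checks them locally before publishing. It assumes the .sha file holds the SHA-1 hex digest of the parcel (consistent with the 41-byte file in the listing above: 40 hex characters plus a newline); `verify_parcel_sha` is a hypothetical helper, and `sha1sum` is assumed to be available as it is on CentOS.

```shell
# Hypothetical helper: compare a parcel's SHA-1 digest with the content of
# the sibling "<parcel>.sha" file.
# Returns 0 on match, 1 on mismatch, 2 if the .sha file cannot be read.
verify_parcel_sha() {
  parcel="$1"
  # Strip whitespace so a trailing newline does not break the comparison.
  expected="$(tr -d '[:space:]' < "${parcel}.sha")" || return 2
  actual="$(sha1sum "$parcel" | awk '{print $1}')"
  [ "$expected" = "$actual" ]
}

# Example:
#   verify_parcel_sha /var/www/html/flink1.10.2/FLINK-1.10.2-BIN-SCALA_2.12-el7.parcel \
#     && echo "hash OK" || echo "hash mismatch"
```

A mismatch usually means the parcel was rebuilt or re-copied without regenerating the .sha file; rerunning the parcel build step should bring them back in line.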
Deploying the Flink service
1. Restart the cloudera-scm-server service
systemctl restart cloudera-scm-server
2. Copy Flink Shaded to the required path
# Run the following on every cloudera-scm-agent node
cp /data/flink/flink-shaded-10.0/flink-shaded-hadoop-2-parent/flink-shaded-hadoop-2-uber/target/flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar /opt/cloudera/parcels/FLINK/lib/flink/lib/
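Because the jar has to be present on every agent node, copying it from the build host in a loop can save repeated logins. This is only a sketch under assumptions: it uses scp as root (as in the CSD copy step earlier), the host names in the example are placeholders, and `distribute_shaded_jar` is a hypothetical helper. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Hypothetical helper: copy the shaded jar to the Flink lib directory on
# each listed agent host. With DRY_RUN=1 the scp commands are only printed.
distribute_shaded_jar() {
  jar="$1"
  shift
  dest="/opt/cloudera/parcels/FLINK/lib/flink/lib/"
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp $jar root@${host}:${dest}"
    else
      # Stop at the first failure so a partially updated cluster is noticed.
      scp "$jar" "root@${host}:${dest}" || return 1
    fi
  done
}

# Example (agent01 and agent02 are placeholder host names):
#   DRY_RUN=1 distribute_shaded_jar \
#     /data/flink/flink-shaded-10.0/flink-shaded-hadoop-2-parent/flink-shaded-hadoop-2-uber/target/flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar \
#     agent01 agent02
```

Running with DRY_RUN=1 first makes it easy to review the target hosts and path before any file is transferred.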