Using the YARN REST API

Preface

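The responses shown in this article are XML, so the client has to parse XML. The YARN REST API can also return JSON when the request sends an Accept: application/json header.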
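The ResourceManager picks the response format from the request's Accept header. As a minimal sketch, the helper below builds a GET request that asks for XML or JSON explicitly. The names RM_BASE, build_request and fetch are made up for illustration, and hadoop02:8088 is simply the ResourceManager address used in this article.

```python
import urllib.request

# Assumption: the ResourceManager web address used throughout this article.
RM_BASE = "http://hadoop02:8088/ws/v1/cluster"


def build_request(path: str = "", fmt: str = "xml") -> urllib.request.Request:
    """Build a GET request for a ResourceManager REST path, asking for XML or JSON."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    return urllib.request.Request(
        RM_BASE + path,
        headers={"Accept": f"application/{fmt}"},
        method="GET",
    )


def fetch(path: str = "", fmt: str = "xml", timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(path, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, fetch("/apps") would request the application list as XML, and fetch("/metrics", "json") would request the cluster metrics as JSON.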

Querying applications

List all applications

http://hadoop02:8088/ws/v1/cluster/apps

Field descriptions

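Item DataType Description
id string The application ID
user string The user who submitted the application
name string The application name
queue string The scheduler queue the application was submitted to
state string The current application state
finalStatus string The final status reported by the application
progress double Application progress, as a percentage
trackingUI string Label of the tracking UI (for example ApplicationMaster or History)
trackingUrl string URL of the tracking UI
clusterId long The cluster ID
applicationType string The application type
priority int The application priority
startedTime long Application start time (ms since epoch)
launchTime long Application launch time (ms since epoch)
finishedTime long Application finish time (ms since epoch); 0 while the application is still running
elapsedTime long Elapsed time in ms since the application started (finishedTime - startedTime once it has finished)
amContainerLogs string URL of the ApplicationMaster container logs
amHostHttpAddress string HTTP address of the node running the ApplicationMaster
amRPCAddress string RPC address of the ApplicationMaster
allocatedMB int Memory in MB allocated to the application's running containers
allocatedVCores int Virtual cores allocated to the application's running containers
reservedMB int Memory in MB reserved for the application
reservedVCores int Virtual cores reserved for the application
runningContainers int Number of containers currently running for the application
memorySeconds long Aggregate memory the application has allocated, in MB-seconds
vcoreSeconds long Aggregate CPU the application has allocated, in vcore-seconds
queueUsagePercentage double Percentage of the queue's resources the application is using
clusterUsagePercentage double Percentage of the cluster's resources the application is using
logAggregationStatus string Log aggregation status
unmanagedApplication boolean Whether the application is unmanaged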
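To show how these fields can be read from the list response, here is a small sketch that turns the XML into one dictionary per application. It assumes the list endpoint wraps the `<app>` elements in an `<apps>` root. The function name parse_apps and the second application in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_apps(xml_text: str) -> list[dict[str, str]]:
    """Return one dict per <app>, keeping only its simple (leaf) child elements."""
    root = ET.fromstring(xml_text)
    apps = []
    for app in root.iter("app"):
        # Nested elements such as <resourceRequests> are skipped; empty tags map to "".
        apps.append({child.tag: (child.text or "") for child in app if len(child) == 0})
    return apps


# Illustrative sample only: the second application is hypothetical.
sample = """<apps>
  <app><id>application_1672710362889_0012</id><user>root</user><state>RUNNING</state></app>
  <app><id>application_1672710362889_0011</id><user>root</user><state>KILLED</state></app>
</apps>"""

running = [a["id"] for a in parse_apps(sample) if a["state"] == "RUNNING"]
print(running)
```

All values come back as strings, so numeric fields such as allocatedMB or elapsedTime still need an explicit int() conversion.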

Query a single application

http://hadoop02:8088/ws/v1/cluster/apps/application_1672710362889_0012

In the response, amHostHttpAddress is the HTTP address of the node where the application's ApplicationMaster is running.

Response

<app>
<id>application_1672710362889_0012</id>
<user>root</user>
<name>yarnforflink</name>
<queue>default</queue>
<state>RUNNING</state>
<finalStatus>UNDEFINED</finalStatus>
<progress>100.0</progress>
<trackingUI>ApplicationMaster</trackingUI>
<trackingUrl>http://hadoop02:8088/proxy/application_1672710362889_0012/</trackingUrl>
<diagnostics/>
<clusterId>1672710362889</clusterId>
<applicationType>Apache Flink</applicationType>
<applicationTags/>
<startedTime>1672799886183</startedTime>
<finishedTime>0</finishedTime>
<elapsedTime>3849093</elapsedTime>
<amContainerLogs>http://hadoop01:8042/node/containerlogs/container_e19_1672710362889_0012_01_000001/root</amContainerLogs>
<amHostHttpAddress>hadoop01:8042</amHostHttpAddress>
<allocatedMB>2048</allocatedMB>
<allocatedVCores>1</allocatedVCores>
<runningContainers>1</runningContainers>
<memorySeconds>8116067</memorySeconds>
<vcoreSeconds>3960</vcoreSeconds>
<preemptedResourceMB>0</preemptedResourceMB>
<preemptedResourceVCores>0</preemptedResourceVCores>
<numNonAMContainerPreempted>0</numNonAMContainerPreempted>
<numAMContainerPreempted>0</numAMContainerPreempted>
<resourceRequests>
<capability>
<memory>2048</memory>
<virtualCores>1</virtualCores>
</capability>
<nodeLabelExpression/>
<numContainers>0</numContainers>
<priority>
<priority>0</priority>
</priority>
<relaxLocality>true</relaxLocality>
<resourceName>*</resourceName>
</resourceRequests>
<resourceRequests>
<capability>
<memory>2048</memory>
<virtualCores>1</virtualCores>
</capability>
<nodeLabelExpression/>
<numContainers>0</numContainers>
<priority>
<priority>1</priority>
</priority>
<relaxLocality>true</relaxLocality>
<resourceName>*</resourceName>
</resourceRequests>
</app>
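The time fields in this response are in milliseconds. The sketch below, using a trimmed copy of the response above, converts startedTime to a UTC datetime and elapsedTime to a timedelta. The function name summarize_app and the keys of the returned dictionary are illustrative choices, not part of the API.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone


def summarize_app(xml_text: str) -> dict:
    """Extract a few fields from a single <app> response and convert the time values."""
    app = ET.fromstring(xml_text)
    started_ms = int(app.findtext("startedTime"))
    elapsed_ms = int(app.findtext("elapsedTime"))
    return {
        "id": app.findtext("id"),
        "state": app.findtext("state"),
        "am_host": app.findtext("amHostHttpAddress"),
        # startedTime is milliseconds since the epoch.
        "started": datetime.fromtimestamp(started_ms / 1000, tz=timezone.utc),
        # elapsedTime is a duration in milliseconds.
        "elapsed": timedelta(milliseconds=elapsed_ms),
    }


# Trimmed copy of the response shown above.
sample = """<app>
<id>application_1672710362889_0012</id>
<state>RUNNING</state>
<startedTime>1672799886183</startedTime>
<finishedTime>0</finishedTime>
<elapsedTime>3849093</elapsedTime>
<amHostHttpAddress>hadoop01:8042</amHostHttpAddress>
</app>"""

info = summarize_app(sample)
print(info["id"], info["state"], info["am_host"], info["started"], info["elapsed"])
```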

Query application state

http://hadoop02:8088/ws/v1/cluster/apps/application_1672710362889_0012/state

Response

<appstate>
<state>RUNNING</state>
</appstate>

State values

Item Data Type Description
state string The application state - can be one of “NEW”, “NEW_SAVING”, “SUBMITTED”, “ACCEPTED”, “RUNNING”, “FINISHED”, “FAILED”, “KILLED”
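A common use of the state endpoint is to poll until the application stops running. The sketch below treats FINISHED, FAILED and KILLED as terminal states. The function names, the polling interval and the fetch_state_xml callable (any function that returns the XML body of the state endpoint) are assumptions made for illustration.

```python
import time
import xml.etree.ElementTree as ET
from typing import Callable

# States after which the application will not change state again.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def parse_state(xml_text: str) -> str:
    """Read the <state> value from an <appstate> response."""
    return ET.fromstring(xml_text).findtext("state")


def wait_until_done(
    fetch_state_xml: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll the state endpoint until a terminal state is seen, then return it."""
    for _ in range(max_polls):
        state = parse_state(fetch_state_xml())
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError(f"application still not finished after {max_polls} polls")
```

Passing the fetch function in as a parameter keeps the polling logic independent of the HTTP client and makes it easy to exercise with canned responses.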

Cluster

Cluster information

http://hadoop02:8088/ws/v1/cluster

The response looks similar to:

<clusterInfo>
<id>1672710362889</id>
<startedOn>1672710362889</startedOn>
<state>STARTED</state>
<haState>ACTIVE</haState>
<rmStateStoreName>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</rmStateStoreName>
<resourceManagerVersion>2.7.7</resourceManagerVersion>
<resourceManagerBuildVersion>2.7.7 from c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac by stevel source checksum d0c780b3552e7bd9462fffca3f9fc51d</resourceManagerBuildVersion>
<resourceManagerVersionBuiltOn>2018-07-19T00:39Z</resourceManagerVersionBuiltOn>
<hadoopVersion>2.7.7</hadoopVersion>
<hadoopBuildVersion>2.7.7 from c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac by stevel source checksum 792e15d20b12c74bd6f19a1fb886490</hadoopBuildVersion>
<hadoopVersionBuiltOn>2018-07-18T22:47Z</hadoopVersionBuiltOn>
<haZooKeeperConnectionState>CONNECTED</haZooKeeperConnectionState>
</clusterInfo>

Response field descriptions

Item Data Type Description
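id long The cluster ID
startedOn long Time the cluster started (ms since epoch)
state string ResourceManager state; valid values: NOTINITED, INITED, STARTED, STOPPED
haState string ResourceManager HA state; valid values: INITIALIZING, ACTIVE, STANDBY, STOPPED
rmStateStoreName string Fully qualified name of the class that implements the ResourceManager state store
resourceManagerVersion string ResourceManager version
resourceManagerBuildVersion string ResourceManager build string: build version, user, and checksum
resourceManagerVersionBuiltOn string Time the ResourceManager was built (a date string such as 2018-07-19T00:39Z in the sample above, not epoch milliseconds)
hadoopVersion string Hadoop Common version
hadoopBuildVersion string Hadoop Common build string: build version, user, and checksum
hadoopVersionBuiltOn string Time Hadoop Common was built (a date string such as 2018-07-18T22:47Z in the sample above, not epoch milliseconds)
haZooKeeperConnectionState string State of the ZooKeeper connection used for ResourceManager high availability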
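In an HA setup, haState shows whether the ResourceManager that answered is the active one. The sketch below reads that flag and converts startedOn to a UTC datetime, using a trimmed copy of the response above. The function name parse_cluster_info and the keys of the returned dictionary are illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_cluster_info(xml_text: str) -> dict:
    """Pull the HA state, start time and version out of a <clusterInfo> response."""
    info = ET.fromstring(xml_text)
    started_ms = int(info.findtext("startedOn"))
    return {
        "active": info.findtext("haState") == "ACTIVE",
        # startedOn is milliseconds since the epoch.
        "started_on": datetime.fromtimestamp(started_ms / 1000, tz=timezone.utc),
        "version": info.findtext("hadoopVersion"),
    }


# Trimmed copy of the response shown above.
sample = """<clusterInfo>
<id>1672710362889</id>
<startedOn>1672710362889</startedOn>
<state>STARTED</state>
<haState>ACTIVE</haState>
<hadoopVersion>2.7.7</hadoopVersion>
</clusterInfo>"""

cluster = parse_cluster_info(sample)
print(cluster["active"], cluster["started_on"], cluster["version"])
```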

Cluster metrics

http://hadoop02:8088/ws/v1/cluster/metrics

Response

<clusterMetrics>
<appsSubmitted>13</appsSubmitted>
<appsCompleted>10</appsCompleted>
<appsPending>0</appsPending>
<appsRunning>1</appsRunning>
<appsFailed>0</appsFailed>
<appsKilled>2</appsKilled>
<reservedMB>0</reservedMB>
<availableMB>4096</availableMB>
<allocatedMB>2048</allocatedMB>
<reservedVirtualCores>0</reservedVirtualCores>
<availableVirtualCores>23</availableVirtualCores>
<allocatedVirtualCores>1</allocatedVirtualCores>
<containersAllocated>1</containersAllocated>
<containersReserved>0</containersReserved>
<containersPending>0</containersPending>
<totalMB>6144</totalMB>
<totalVirtualCores>24</totalVirtualCores>
<totalNodes>3</totalNodes>
<lostNodes>0</lostNodes>
<unhealthyNodes>0</unhealthyNodes>
<decommissionedNodes>0</decommissionedNodes>
<rebootedNodes>0</rebootedNodes>
<activeNodes>3</activeNodes>
</clusterMetrics>

Response field descriptions

Item Data Type Description
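appsSubmitted int Number of applications submitted
appsCompleted int Number of applications completed
appsPending int Number of applications pending
appsRunning int Number of applications running
appsFailed int Number of applications failed
appsKilled int Number of applications killed
reservedMB long Amount of memory reserved (MB)
availableMB long Amount of memory available (MB)
allocatedMB long Amount of memory allocated (MB)
totalMB long Total amount of memory (MB)
reservedVirtualCores long Number of virtual cores reserved
availableVirtualCores long Number of virtual cores available
allocatedVirtualCores long Number of virtual cores allocated
totalVirtualCores long Total number of virtual cores
containersAllocated int Number of containers allocated
containersReserved int Number of containers reserved
containersPending int Number of containers pending
totalNodes int Total number of nodes
activeNodes int Number of active nodes
lostNodes int Number of lost nodes
unhealthyNodes int Number of unhealthy nodes
decommissioningNodes int Number of nodes being decommissioned (not present in the 2.7.7 sample response above)
decommissionedNodes int Number of nodes decommissioned
rebootedNodes int Number of nodes rebooted
shutdownNodes int Number of nodes shut down (not present in the 2.7.7 sample response above)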
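These counters are enough to derive simple utilization figures. The sketch below computes the share of memory and virtual cores currently allocated and flags lost or unhealthy nodes, using a trimmed copy of the response above. The function name utilization and the keys of the returned dictionary are illustrative.

```python
import xml.etree.ElementTree as ET


def utilization(xml_text: str) -> dict:
    """Compute allocation percentages and a basic node health flag from <clusterMetrics>."""
    metrics = ET.fromstring(xml_text)

    def num(tag: str) -> int:
        return int(metrics.findtext(tag))

    def pct(used: int, total: int) -> float:
        # Guard against an empty cluster that reports zero total resources.
        return 100.0 * used / total if total else 0.0

    return {
        "memory_pct": pct(num("allocatedMB"), num("totalMB")),
        "vcore_pct": pct(num("allocatedVirtualCores"), num("totalVirtualCores")),
        "all_nodes_healthy": num("lostNodes") == 0 and num("unhealthyNodes") == 0,
    }


# Trimmed copy of the response shown above.
sample = """<clusterMetrics>
<allocatedMB>2048</allocatedMB>
<allocatedVirtualCores>1</allocatedVirtualCores>
<totalMB>6144</totalMB>
<totalVirtualCores>24</totalVirtualCores>
<lostNodes>0</lostNodes>
<unhealthyNodes>0</unhealthyNodes>
</clusterMetrics>"""

usage = utilization(sample)
print(f"memory {usage['memory_pct']:.1f}%, vcores {usage['vcore_pct']:.1f}%")
```

In the sample, 2048 of 6144 MB and 1 of 24 virtual cores are allocated, which is roughly a third of the memory and about 4% of the cores.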