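Preface
The examples in this article use the XML responses of the YARN REST API, so consuming them means parsing XML. The API can also return JSON when the request sends an `Accept: application/json` header.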
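As a minimal sketch of that parsing step, the snippet below uses Python's standard-library `xml.etree.ElementTree` to turn a response body into a flat dictionary. The helper name `xml_to_dict` and the shortened sample body are assumptions made for illustration; they are not part of the YARN API.

```python
import xml.etree.ElementTree as ET


def xml_to_dict(xml_text: str) -> dict:
    """Map each direct child tag of the root element to its text content.

    Children that contain further elements (nested blocks) are skipped, and
    empty elements such as <diagnostics/> map to an empty string.
    """
    root = ET.fromstring(xml_text)
    return {
        child.tag: (child.text or "").strip()
        for child in root
        if len(child) == 0  # keep only leaf elements
    }


# Hypothetical, shortened response body used only for illustration.
sample = "<appstate><state>RUNNING</state></appstate>"
print(xml_to_dict(sample))
```

A flat dictionary is enough for most of the responses shown here; nested blocks need their own handling.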
Querying applications
List all applications
http://hadoop02:8088/ws/v1/cluster/apps
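Field descriptions

| Item | Data Type | Description |
| --- | --- | --- |
| id | string | The application ID |
| user | string | The user who submitted the application |
| name | string | The application name |
| queue | string | The scheduler queue the application was submitted to |
| state | string | The current state of the application |
| finalStatus | string | The final status reported by the application |
| progress | double | The progress of the application, as a percentage |
| trackingUI | string | Where the tracking URL currently points, for example ApplicationMaster |
| trackingUrl | string | The URL of the tracking UI |
| clusterId | long | The cluster ID |
| applicationType | string | The application type |
| priority | int | The application priority |
| startedTime | long | The time the application started (ms since the epoch) |
| launchTime | long | The time the application was launched (ms since the epoch) |
| finishedTime | long | The time the application finished (ms since the epoch); 0 while it is still running |
| elapsedTime | long | The time elapsed since the application started, in ms (finish time minus start time once the application has finished) |
| amContainerLogs | string | The URL of the ApplicationMaster container logs |
| amHostHttpAddress | string | The HTTP address of the node running the ApplicationMaster |
| amRPCAddress | string | The RPC address of the ApplicationMaster |
| allocatedMB | int | The memory (MB) currently allocated to the application's running containers |
| allocatedVCores | int | The virtual cores currently allocated to the application's running containers |
| reservedMB | int | The reserved memory (MB) |
| reservedVCores | int | The reserved virtual cores |
| runningContainers | int | The number of containers currently running |
| memorySeconds | long | The aggregate memory allocated to all of the application's containers, in MB-seconds |
| vcoreSeconds | long | The aggregate virtual cores allocated to all of the application's containers, in vcore-seconds |
| queueUsagePercentage | double | The percentage of the queue's resources used by the application |
| clusterUsagePercentage | double | The percentage of the cluster's resources used by the application |
| logAggregationStatus | string | The log aggregation status |
| unmanagedApplication | boolean | Whether the application is unmanaged |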
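A short sketch of how the list response might be consumed: the apps endpoint wraps the individual `<app>` elements in an `<apps>` root, and the fields in the table arrive as text, so numeric ones need converting. The function name `parse_apps`, the subset of fields converted, and the shortened sample are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumption: only a few numeric fields are converted to keep the sketch short.
INT_FIELDS = {
    "startedTime", "finishedTime", "elapsedTime",
    "allocatedMB", "allocatedVCores", "runningContainers",
}


def parse_apps(xml_text: str) -> list:
    """Return one dict per <app> element, with selected fields cast to int."""
    root = ET.fromstring(xml_text)
    apps = []
    for app in root.iter("app"):
        record = {}
        for child in app:
            if len(child):  # skip nested blocks such as <resourceRequests>
                continue
            text = (child.text or "").strip()
            if child.tag in INT_FIELDS and text:
                record[child.tag] = int(text)
            else:
                record[child.tag] = text
        apps.append(record)
    return apps


# Hypothetical, heavily shortened response body.
sample = """
<apps>
  <app><id>application_1_0001</id><state>RUNNING</state><allocatedMB>2048</allocatedMB></app>
  <app><id>application_1_0002</id><state>FINISHED</state><allocatedMB>-1</allocatedMB></app>
</apps>
"""
running = [a["id"] for a in parse_apps(sample) if a["state"] == "RUNNING"]
print(running)
```

Because `root.iter("app")` also matches the root itself, the same helper works on a single-application response whose root is `<app>`.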
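Query a single application
http://hadoop02:8088/ws/v1/cluster/apps/application_1672710362889_0012
In the response, amHostHttpAddress is the HTTP address of the server on which the application's ApplicationMaster is running.
Response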

    <app>
      <id>application_1672710362889_0012</id>
      <user>root</user>
      <name>yarnforflink</name>
      <queue>default</queue>
      <state>RUNNING</state>
      <finalStatus>UNDEFINED</finalStatus>
      <progress>100.0</progress>
      <trackingUI>ApplicationMaster</trackingUI>
      <trackingUrl>http://hadoop02:8088/proxy/application_1672710362889_0012/</trackingUrl>
      <diagnostics/>
      <clusterId>1672710362889</clusterId>
      <applicationType>Apache Flink</applicationType>
      <applicationTags/>
      <startedTime>1672799886183</startedTime>
      <finishedTime>0</finishedTime>
      <elapsedTime>3849093</elapsedTime>
      <amContainerLogs>http://hadoop01:8042/node/containerlogs/container_e19_1672710362889_0012_01_000001/root</amContainerLogs>
      <amHostHttpAddress>hadoop01:8042</amHostHttpAddress>
      <allocatedMB>2048</allocatedMB>
      <allocatedVCores>1</allocatedVCores>
      <runningContainers>1</runningContainers>
      <memorySeconds>8116067</memorySeconds>
      <vcoreSeconds>3960</vcoreSeconds>
      <preemptedResourceMB>0</preemptedResourceMB>
      <preemptedResourceVCores>0</preemptedResourceVCores>
      <numNonAMContainerPreempted>0</numNonAMContainerPreempted>
      <numAMContainerPreempted>0</numAMContainerPreempted>
      <resourceRequests>
        <capability>
          <memory>2048</memory>
          <virtualCores>1</virtualCores>
        </capability>
        <nodeLabelExpression/>
        <numContainers>0</numContainers>
        <priority>
          <priority>0</priority>
        </priority>
        <relaxLocality>true</relaxLocality>
        <resourceName>*</resourceName>
      </resourceRequests>
      <resourceRequests>
        <capability>
          <memory>2048</memory>
          <virtualCores>1</virtualCores>
        </capability>
        <nodeLabelExpression/>
        <numContainers>0</numContainers>
        <priority>
          <priority>1</priority>
        </priority>
        <relaxLocality>true</relaxLocality>
        <resourceName>*</resourceName>
      </resourceRequests>
    </app>

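The request itself can be sketched with the standard library alone. The snippet below builds a GET request for one application, asks explicitly for XML through the `Accept` header, and reads `amHostHttpAddress` from the parsed result. The function names (`build_app_request`, `fetch_app`, `am_node`) and the timeout value are assumptions for illustration; `fetch_app` needs a reachable ResourceManager and is therefore only shown, not called.

```python
import urllib.request
import xml.etree.ElementTree as ET

RM_BASE = "http://hadoop02:8088"  # ResourceManager address used in this article


def build_app_request(app_id: str, rm_base: str = RM_BASE) -> urllib.request.Request:
    """Build a GET request for one application, asking explicitly for XML."""
    url = f"{rm_base}/ws/v1/cluster/apps/{app_id}"
    return urllib.request.Request(url, headers={"Accept": "application/xml"})


def fetch_app(app_id: str, rm_base: str = RM_BASE, timeout: float = 10.0) -> ET.Element:
    """Fetch the application and return the parsed <app> root element."""
    request = build_app_request(app_id, rm_base)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return ET.fromstring(response.read())


def am_node(app: ET.Element) -> str:
    """Return the host:port of the node running the ApplicationMaster."""
    return app.findtext("amHostHttpAddress", default="")


# Usage against a live cluster (not executed here):
# app = fetch_app("application_1672710362889_0012")
# print(am_node(app))
```

Separating request construction from the network call keeps the URL and header logic easy to check without a cluster.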
Check the application state
http://hadoop02:8088/ws/v1/cluster/apps/application_1672710362889_0012/state
Response

    <appstate>
      <state>RUNNING</state>
    </appstate>

State values

| Item | Data Type | Description |
| --- | --- | --- |
| state | string | The application state; one of `NEW`, `NEW_SAVING`, `SUBMITTED`, `ACCEPTED`, `RUNNING`, `FINISHED`, `FAILED`, `KILLED` |
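A common use of the state endpoint is to poll until the application stops changing. The sketch below treats `FINISHED`, `FAILED`, and `KILLED` as terminal, which is an interpretation of the list above rather than something the response states. The names `parse_state` and `wait_until_done`, the polling interval, and the injected `fetch_xml` callable are all assumptions for illustration.

```python
import time
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# Assumption: these three states are final; the others can still change.
TERMINAL_STATES = frozenset({"FINISHED", "FAILED", "KILLED"})


def parse_state(xml_text: str) -> str:
    """Extract the text of <state> from an <appstate> response."""
    state = ET.fromstring(xml_text).findtext("state")
    if state is None:
        raise ValueError("response has no <state> element")
    return state


def wait_until_done(
    fetch_xml: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 120,
) -> Optional[str]:
    """Poll until a terminal state appears.

    fetch_xml is any callable returning the body of the /state endpoint.
    Returns the terminal state, or None if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        state = parse_state(fetch_xml())
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    return None
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP client.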
Cluster
Cluster information
http://hadoop02:8088/ws/v1/cluster
The response looks similar to the following

    <clusterInfo>
      <id>1672710362889</id>
      <startedOn>1672710362889</startedOn>
      <state>STARTED</state>
      <haState>ACTIVE</haState>
      <rmStateStoreName>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</rmStateStoreName>
      <resourceManagerVersion>2.7.7</resourceManagerVersion>
      <resourceManagerBuildVersion>2.7.7 from c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac by stevel source checksum d0c780b3552e7bd9462fffca3f9fc51d</resourceManagerBuildVersion>
      <resourceManagerVersionBuiltOn>2018-07-19T00:39Z</resourceManagerVersionBuiltOn>
      <hadoopVersion>2.7.7</hadoopVersion>
      <hadoopBuildVersion>2.7.7 from c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac by stevel source checksum 792e15d20b12c74bd6f19a1fb886490</hadoopBuildVersion>
      <hadoopVersionBuiltOn>2018-07-18T22:47Z</hadoopVersionBuiltOn>
      <haZooKeeperConnectionState>CONNECTED</haZooKeeperConnectionState>
    </clusterInfo>

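Response field descriptions

| Item | Data Type | Description |
| --- | --- | --- |
| id | long | The cluster ID |
| startedOn | long | The time the cluster started (ms since the epoch) |
| state | string | The ResourceManager state; valid values are NOTINITED, INITED, STARTED, STOPPED |
| haState | string | The ResourceManager HA state; valid values are INITIALIZING, ACTIVE, STANDBY, STOPPED |
| rmStateStoreName | string | The fully qualified name of the class that implements the ResourceManager state store |
| resourceManagerVersion | string | The ResourceManager version |
| resourceManagerBuildVersion | string | The ResourceManager build string, containing the build version, user, and checksum |
| resourceManagerVersionBuiltOn | string | The time the ResourceManager was built (returned as a date string such as `2018-07-19T00:39Z` in the sample response) |
| hadoopVersion | string | The Hadoop Common version |
| hadoopBuildVersion | string | The Hadoop Common build string, containing the build version, user, and checksum |
| hadoopVersionBuiltOn | string | The time Hadoop Common was built (returned as a date string such as `2018-07-18T22:47Z` in the sample response) |
| haZooKeeperConnectionState | string | The connection state of the ZooKeeper service used for high availability |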
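As a small illustration of working with these fields, the sketch below converts `startedOn` from epoch milliseconds to a timezone-aware `datetime` and checks whether the queried ResourceManager is the started, active one. The function name `summarize_cluster` and the keys of the returned dictionary are hypothetical; the sketch also assumes `startedOn` is present and raises otherwise.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def summarize_cluster(xml_text: str) -> dict:
    """Pull a few fields out of a <clusterInfo> response."""
    root = ET.fromstring(xml_text)
    # startedOn is in milliseconds since the epoch; datetime expects seconds.
    started_ms = int(root.findtext("startedOn"))
    return {
        "started_at": datetime.fromtimestamp(started_ms / 1000, tz=timezone.utc),
        "is_active": (
            root.findtext("state") == "STARTED"
            and root.findtext("haState") == "ACTIVE"
        ),
        "hadoop_version": root.findtext("hadoopVersion"),
    }


# Shortened copy of the sample response shown in this section.
sample = (
    "<clusterInfo>"
    "<startedOn>1672710362889</startedOn>"
    "<state>STARTED</state>"
    "<haState>ACTIVE</haState>"
    "<hadoopVersion>2.7.7</hadoopVersion>"
    "</clusterInfo>"
)
info = summarize_cluster(sample)
print(info["started_at"].isoformat(), info["is_active"], info["hadoop_version"])
```

In an HA setup, checking `haState` before trusting other responses can help avoid reading from a standby ResourceManager.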
Cluster metrics
http://hadoop02:8088/ws/v1/cluster/metrics
Response

    <clusterMetrics>
      <appsSubmitted>13</appsSubmitted>
      <appsCompleted>10</appsCompleted>
      <appsPending>0</appsPending>
      <appsRunning>1</appsRunning>
      <appsFailed>0</appsFailed>
      <appsKilled>2</appsKilled>
      <reservedMB>0</reservedMB>
      <availableMB>4096</availableMB>
      <allocatedMB>2048</allocatedMB>
      <reservedVirtualCores>0</reservedVirtualCores>
      <availableVirtualCores>23</availableVirtualCores>
      <allocatedVirtualCores>1</allocatedVirtualCores>
      <containersAllocated>1</containersAllocated>
      <containersReserved>0</containersReserved>
      <containersPending>0</containersPending>
      <totalMB>6144</totalMB>
      <totalVirtualCores>24</totalVirtualCores>
      <totalNodes>3</totalNodes>
      <lostNodes>0</lostNodes>
      <unhealthyNodes>0</unhealthyNodes>
      <decommissionedNodes>0</decommissionedNodes>
      <rebootedNodes>0</rebootedNodes>
      <activeNodes>3</activeNodes>
    </clusterMetrics>

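Response field descriptions

| Item | Data Type | Description |
| --- | --- | --- |
| appsSubmitted | int | The number of applications submitted |
| appsCompleted | int | The number of applications completed |
| appsPending | int | The number of applications pending |
| appsRunning | int | The number of applications running |
| appsFailed | int | The number of applications that failed |
| appsKilled | int | The number of applications killed |
| reservedMB | long | The amount of memory reserved (MB) |
| availableMB | long | The amount of memory available (MB) |
| allocatedMB | long | The amount of memory allocated (MB) |
| totalMB | long | The total amount of memory (MB) |
| reservedVirtualCores | long | The number of virtual cores reserved |
| availableVirtualCores | long | The number of virtual cores available |
| allocatedVirtualCores | long | The number of virtual cores allocated |
| totalVirtualCores | long | The total number of virtual cores |
| containersAllocated | int | The number of containers allocated |
| containersReserved | int | The number of containers reserved |
| containersPending | int | The number of containers pending |
| totalNodes | int | The total number of nodes |
| activeNodes | int | The number of active nodes |
| lostNodes | int | The number of lost nodes |
| unhealthyNodes | int | The number of unhealthy nodes |
| decommissioningNodes | int | The number of nodes being decommissioned |
| decommissionedNodes | int | The number of nodes decommissioned |
| rebootedNodes | int | The number of nodes rebooted |
| shutdownNodes | int | The number of nodes shut down |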
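These counters are usually combined rather than read one by one. As a rough sketch, the snippet below derives memory and virtual-core utilization from the allocated and total values. The function name `utilization` and the returned keys are hypothetical, and missing fields are treated as 0, which is a simplification.

```python
import xml.etree.ElementTree as ET


def utilization(xml_text: str) -> dict:
    """Compute allocated/total percentages from a <clusterMetrics> response."""
    root = ET.fromstring(xml_text)

    def num(tag: str) -> int:
        # Assumption: a missing element counts as 0.
        return int(root.findtext(tag, default="0"))

    total_mb = num("totalMB")
    total_vcores = num("totalVirtualCores")
    return {
        "memory_pct": 100.0 * num("allocatedMB") / total_mb if total_mb else 0.0,
        "vcores_pct": (
            100.0 * num("allocatedVirtualCores") / total_vcores
            if total_vcores else 0.0
        ),
    }


# Values copied from the sample response in this section.
sample = (
    "<clusterMetrics>"
    "<allocatedMB>2048</allocatedMB><totalMB>6144</totalMB>"
    "<allocatedVirtualCores>1</allocatedVirtualCores>"
    "<totalVirtualCores>24</totalVirtualCores>"
    "</clusterMetrics>"
)
usage = utilization(sample)
print(f"memory {usage['memory_pct']:.1f}%, vcores {usage['vcores_pct']:.1f}%")
```

With the sample values, roughly a third of the memory and about 4% of the virtual cores are allocated.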