Collecting cluster statistics with the Hadoop RESTful APIs
Published: 2021-08-30 16:00:36    Category: Technical articles

(Applies to Hadoop 2.7 and later.)

RESTful APIs involved:

  • ResourceManager REST APIs
  • WebHDFS REST API
  • MapReduce History Server REST APIs
  • Spark Monitoring and Instrumentation

1. Real-time usage of the HDFS file system

  • URL
  • Response:
{
  "ContentSummary": {
    "directoryCount": 2,
    "fileCount": 1,
    "length": 24930,
    "quota": -1,
    "spaceConsumed": 24930,
    "spaceQuota": -1
  }
}
  • Description of the response fields (JSON schema):
{
  "name": "ContentSummary",
  "properties": {
    "ContentSummary": {
      "type": "object",
      "properties": {
        "directoryCount": {
          "description": "The number of directories.",
          "type": "integer",
          "required": true
        },
        "fileCount": {
          "description": "The number of files.",
          "type": "integer",
          "required": true
        },
        "length": {
          "description": "The number of bytes used by the content.",
          "type": "integer",
          "required": true
        },
        "quota": {
          "description": "The namespace quota of this directory.",
          "type": "integer",
          "required": true
        },
        "spaceConsumed": {
          "description": "The disk space consumed by the content.",
          "type": "integer",
          "required": true
        },
        "spaceQuota": {
          "description": "The disk space quota.",
          "type": "integer",
          "required": true
        }
      }
    }
  }
}
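  • Note the relationship between length and spaceConsumed, which depends on the HDFS replication factor: length is the logical size of the data, while spaceConsumed also counts the replicas. In the sample above the two values are equal, which is what a replication factor of 1 would give.
  • To collect usage for each group's working directory, use the following request: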
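
The exact request is missing from this copy of the article. The sketch below shows one way such a per-directory query might look, assuming the standard WebHDFS GETCONTENTSUMMARY operation. The NameNode address and the group directory paths are placeholders made up for illustration, not values from the original.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def content_summary_url(namenode, path):
    """Build the WebHDFS GETCONTENTSUMMARY URL for an absolute HDFS path."""
    return f"http://{namenode}/webhdfs/v1{quote(path)}?op=GETCONTENTSUMMARY"


def fetch_content_summary(namenode, path):
    """Call WebHDFS and return the inner ContentSummary object."""
    with urlopen(content_summary_url(namenode, path)) as resp:
        return json.load(resp)["ContentSummary"]


def replication_ratio(summary):
    """Return spaceConsumed / length, or None for an empty directory."""
    if summary["length"] == 0:
        return None
    return summary["spaceConsumed"] / summary["length"]


# Usage (hypothetical host and directories, requires a reachable NameNode):
#   for group_dir in ["/user/group_a", "/user/group_b"]:
#       s = fetch_content_summary("namenode.example.com:50070", group_dir)
#       print(group_dir, s["length"], s["spaceConsumed"], replication_ratio(s))
```

Looping over one directory per group keeps each request cheap and gives a ratio per group, which makes directories with an unusual replication setting easy to spot.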

2. Real-time cluster information and state

  • URL

  • Response
{
  "clusterInfo": {
    "id": 1495123166259,
    "startedOn": 1495123166259,
    "state": "STARTED",
    "haState": "ACTIVE",
    "rmStateStoreName": "org.apache.hadoop.yarn.server.resourcemanager.recovery.NullRMStateStore",
    "resourceManagerVersion": "2.7.2",
    "resourceManagerBuildVersion": "2.7.2 from 4bee04d3d1c27d7ef559365d3bdd2a8620807bfc by root source checksum c63f7cc71b8f63249e35126f0f7492d",
    "resourceManagerVersionBuiltOn": "2017-04-17T12:28Z",
    "hadoopVersion": "2.7.2",
    "hadoopBuildVersion": "2.7.2 from 4bee04d3d1c27d7ef559365d3bdd2a8620807bfc by root source checksum 3329b146070a2bc9e249fa9ba9fb55",
    "hadoopVersionBuiltOn": "2017-04-17T12:18Z",
    "haZooKeeperConnectionState": "ResourceManager HA is not enabled."
  }
}
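
A small Python sketch of how a response like this could be reduced to the fields a monitoring script usually cares about. It assumes the ResourceManager cluster info endpoint (/ws/v1/cluster/info); the host name in the usage comment is a placeholder, and the function names are invented for this example.

```python
import json
from datetime import datetime, timezone
from urllib.request import urlopen


def cluster_info_url(resource_manager):
    """URL of the ResourceManager cluster info endpoint."""
    return f"http://{resource_manager}/ws/v1/cluster/info"


def summarize_cluster_info(payload):
    """Pick out state, HA state, version and start time from the response."""
    info = payload["clusterInfo"]
    return {
        "state": info["state"],
        "haState": info["haState"],
        "hadoopVersion": info["hadoopVersion"],
        # startedOn is in milliseconds since the epoch.
        "startedOn": datetime.fromtimestamp(info["startedOn"] / 1000,
                                            tz=timezone.utc),
    }


def fetch_cluster_info(resource_manager):
    """Fetch and summarize the cluster info from a live ResourceManager."""
    with urlopen(cluster_info_url(resource_manager)) as resp:
        return summarize_cluster_info(json.load(resp))


# Usage (hypothetical host): fetch_cluster_info("rm.example.com:8088")
```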

3. Real-time information on resource queues, including queue quotas and current resource usage

  • URL

  • Response
{
  "scheduler": {
    "schedulerInfo": {
      "type": "capacityScheduler",
      "capacity": 100,
      "usedCapacity": 0,
      "maxCapacity": 100,
      "queueName": "root",
      "queues": {
        "queue": [
          {
            "type": "capacitySchedulerLeafQueueInfo",
            "capacity": 1,
            "usedCapacity": 0,
            "maxCapacity": 90,
            "absoluteCapacity": 1,
            "absoluteMaxCapacity": 90,
            "absoluteUsedCapacity": 0,
            "numApplications": 0,
            "queueName": "algorithm_aliyun",
            "state": "RUNNING",
            "resourcesUsed": { "memory": 0, "vCores": 0 },
            "hideReservationQueues": false,
            "nodeLabels": [ "*" ],
            "numActiveApplications": 0,
            "numPendingApplications": 0,
            "numContainers": 0,
            "maxApplications": 100,
            "maxApplicationsPerUser": 100,
            "userLimit": 100,
            "users": null,
            "userLimitFactor": 1,
            "AMResourceLimit": { "memory": 11776, "vCores": 7 },
            "usedAMResource": { "memory": 0, "vCores": 0 },
            "userAMResourceLimit": { "memory": 160, "vCores": 1 },
            "preemptionDisabled": true
          },
          {
            "type": "capacitySchedulerLeafQueueInfo",
            "capacity": 1,
            "usedCapacity": 0,
            "maxCapacity": 90,
            "absoluteCapacity": 1,
            "absoluteMaxCapacity": 90,
            "absoluteUsedCapacity": 0,
            "numApplications": 0,
            "queueName": "dcps_aliyun",
            "state": "RUNNING",
            "resourcesUsed": { "memory": 0, "vCores": 0 },
            "hideReservationQueues": false,
            "nodeLabels": [ "*" ],
            "numActiveApplications": 0,
            "numPendingApplications": 0,
            "numContainers": 0,
            "maxApplications": 100,
            "maxApplicationsPerUser": 100,
            "userLimit": 100,
            "users": null,
            "userLimitFactor": 1,
            "AMResourceLimit": { "memory": 11776, "vCores": 7 },
            "usedAMResource": { "memory": 0, "vCores": 0 },
            "userAMResourceLimit": { "memory": 160, "vCores": 1 },
            "preemptionDisabled": true
          },
          {
            "type": "capacitySchedulerLeafQueueInfo",
            "capacity": 31,
            "usedCapacity": 0,
            "maxCapacity": 100,
            "absoluteCapacity": 31,
            "absoluteMaxCapacity": 100,
            "absoluteUsedCapacity": 0,
            "numApplications": 0,
            "queueName": "default",
            "state": "RUNNING",
            "resourcesUsed": { "memory": 0, "vCores": 0 },
            "hideReservationQueues": false,
            "nodeLabels": [ "*" ],
            "numActiveApplications": 0,
            "numPendingApplications": 0,
            "numContainers": 0,
            "maxApplications": 3100,
            "maxApplicationsPerUser": 3100,
            "userLimit": 100,
            "users": null,
            "userLimitFactor": 1,
            "AMResourceLimit": { "memory": 13088, "vCores": 8 },
            "usedAMResource": { "memory": 0, "vCores": 0 },
            "userAMResourceLimit": { "memory": 4064, "vCores": 3 },
            "preemptionDisabled": true
          },
          {
            "type": "capacitySchedulerLeafQueueInfo",
            "capacity": 15.000001,
            "usedCapacity": 0,
            "maxCapacity": 100,
            "absoluteCapacity": 15.000001,
            "absoluteMaxCapacity": 100,
            "absoluteUsedCapacity": 0,
            "numApplications": 0,
            "queueName": "feed_aliyun",
            "state": "RUNNING",
            "resourcesUsed": { "memory": 0, "vCores": 0 },
            "hideReservationQueues": false,
            "nodeLabels": [ "*" ],
            "numActiveApplications": 0,
            "numPendingApplications": 0,
            "numContainers": 0,
            "maxApplications": 1500,
            "maxApplicationsPerUser": 7500,
            "userLimit": 100,
            "users": null,
            "userLimitFactor": 5,
            "AMResourceLimit": { "memory": 12320, "vCores": 8 },
            "usedAMResource": { "memory": 0, "vCores": 0 },
            "userAMResourceLimit": { "memory": 9856, "vCores": 7 },
            "preemptionDisabled": true
          },
          {
            "type": "capacitySchedulerLeafQueueInfo",
            "capacity": 51,
            "usedCapacity": 0,
            "maxCapacity": 90,
            "absoluteCapacity": 51,
            "absoluteMaxCapacity": 90,
            "absoluteUsedCapacity": 0,
            "numApplications": 0,
            "queueName": "hot_aliyun",
            "state": "RUNNING",
            "resourcesUsed": { "memory": 0, "vCores": 0 },
            "hideReservationQueues": false,
            "nodeLabels": [ "*" ],
            "numActiveApplications": 0,
            "numPendingApplications": 0,
            "numContainers": 0,
            "maxApplications": 5100,
            "maxApplicationsPerUser": 5100,
            "userLimit": 100,
            "users": null,
            "userLimitFactor": 1,
            "AMResourceLimit": { "memory": 11776, "vCores": 7 },
            "usedAMResource": { "memory": 0, "vCores": 0 },
            "userAMResourceLimit": { "memory": 6688, "vCores": 5 },
            "preemptionDisabled": true
          },
          {
            "type": "capacitySchedulerLeafQueueInfo",
            "capacity": 1,
            "usedCapacity": 0,
            "maxCapacity": 90,
            "absoluteCapacity": 1,
            "absoluteMaxCapacity": 90,
            "absoluteUsedCapacity": 0,
            "numApplications": 0,
            "queueName": "push_aliyun",
            "state": "RUNNING",
            "resourcesUsed": { "memory": 0, "vCores": 0 },
            "hideReservationQueues": false,
            "nodeLabels": [ "*" ],
            "numActiveApplications": 0,
            "numPendingApplications": 0,
            "numContainers": 0,
            "maxApplications": 100,
            "maxApplicationsPerUser": 100,
            "userLimit": 100,
            "users": null,
            "userLimitFactor": 1,
            "AMResourceLimit": { "memory": 11776, "vCores": 7 },
            "usedAMResource": { "memory": 0, "vCores": 0 },
            "userAMResourceLimit": { "memory": 160, "vCores": 1 },
            "preemptionDisabled": true
          }
        ]
      }
    }
  }
}
  • For a description of each field, see:
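
The scheduler response is deeply nested, so a short Python sketch of flattening it into one row per leaf queue may help. It assumes a Capacity Scheduler response shaped like the sample above (as returned by /ws/v1/cluster/scheduler) and recurses in case parent queues carry their own "queues" block. The function names and the output keys are invented for this example.

```python
def iter_leaf_queues(node):
    """Yield the leaf queues below a schedulerInfo node or a parent queue."""
    children = (node.get("queues") or {}).get("queue", [])
    if not children:
        yield node
        return
    for child in children:
        yield from iter_leaf_queues(child)


def queue_usage(payload):
    """Return one summary row per leaf queue: quota and current usage."""
    root = payload["scheduler"]["schedulerInfo"]
    rows = []
    for queue in iter_leaf_queues(root):
        used = queue.get("resourcesUsed") or {}
        rows.append({
            "queue": queue["queueName"],
            "absoluteCapacity": queue.get("absoluteCapacity"),
            "absoluteMaxCapacity": queue.get("absoluteMaxCapacity"),
            "absoluteUsedCapacity": queue.get("absoluteUsedCapacity"),
            # YARN reports memory in MB.
            "usedMemoryMB": used.get("memory", 0),
            "usedVCores": used.get("vCores", 0),
            "numApplications": queue.get("numApplications", 0),
        })
    return rows
```

Reading the absolute* fields rather than capacity and usedCapacity keeps the rows comparable across nesting levels, since the absolute values are expressed relative to the whole cluster.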

4. Real-time application list. Each entry also carries the job's details: name, id, running state, start and finish times, and resource usage.

  • URL

  • Response
{
  "apps" : {
    "app" : [
      {
        "finishedTime" : 1326815598530,
        "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326815542473_0001_01_000001",
        "trackingUI" : "History",
        "state" : "FINISHED",
        "user" : "user1",
        "id" : "application_1326815542473_0001",
        "clusterId" : 1326815542473,
        "finalStatus" : "SUCCEEDED",
        "amHostHttpAddress" : "host.domain.com:8042",
        "progress" : 100,
        "name" : "word count",
        "startedTime" : 1326815573334,
        "elapsedTime" : 25196,
        "diagnostics" : "",
        "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326815542473_0001/jobhistory/job/job_1326815542473_1_1",
        "queue" : "default",
        "allocatedMB" : 0,
        "allocatedVCores" : 0,
        "runningContainers" : 0,
        "memorySeconds" : 151730,
        "vcoreSeconds" : 103
      },
      {
        "finishedTime" : 1326815789546,
        "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326815542473_0002_01_000001",
        "trackingUI" : "History",
        "state" : "FINISHED",
        "user" : "user1",
        "id" : "application_1326815542473_0002",
        "clusterId" : 1326815542473,
        "finalStatus" : "SUCCEEDED",
        "amHostHttpAddress" : "host.domain.com:8042",
        "progress" : 100,
        "name" : "Sleep job",
        "startedTime" : 1326815641380,
        "elapsedTime" : 148166,
        "diagnostics" : "",
        "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326815542473_0002/jobhistory/job/job_1326815542473_2_2",
        "queue" : "default",
        "allocatedMB" : 0,
        "allocatedVCores" : 0,
        "runningContainers" : 1,
        "memorySeconds" : 640064,
        "vcoreSeconds" : 442
      }
    ]
  }
}
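  • To restrict the statistics to a fixed time window, append the parameters "?finishedTimeBegin={timestamp}&finishedTimeEnd={timestamp}" (timestamps in milliseconds since the epoch), for example: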
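
The example request itself is missing from this copy. As a rough sketch, the Python below builds a time-windowed URL against the ResourceManager applications endpoint (/ws/v1/cluster/apps) and totals memorySeconds and vcoreSeconds per queue from a response shaped like the one above. The host name in the test data and the function names are placeholders chosen for illustration.

```python
from collections import defaultdict
from urllib.parse import urlencode


def apps_url(resource_manager, finished_begin_ms=None, finished_end_ms=None):
    """Build the applications URL, optionally limited to a finish-time window."""
    params = {}
    if finished_begin_ms is not None:
        params["finishedTimeBegin"] = finished_begin_ms
    if finished_end_ms is not None:
        params["finishedTimeEnd"] = finished_end_ms
    base = f"http://{resource_manager}/ws/v1/cluster/apps"
    return f"{base}?{urlencode(params)}" if params else base


def usage_by_queue(payload):
    """Total application count, memorySeconds and vcoreSeconds per queue."""
    # "apps" may be null when no application matches the filter.
    apps = (payload.get("apps") or {}).get("app", [])
    totals = defaultdict(
        lambda: {"apps": 0, "memorySeconds": 0, "vcoreSeconds": 0})
    for app in apps:
        entry = totals[app["queue"]]
        entry["apps"] += 1
        entry["memorySeconds"] += app.get("memorySeconds", 0)
        entry["vcoreSeconds"] += app.get("vcoreSeconds", 0)
    return dict(totals)
```

Grouping by "user" instead of "queue" would work the same way if per-user accounting is what you need.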

5. Amount of data scanned by a job

The amount of data a job scans has to be queried through the History Server RESTful API, and MapReduce and Spark differ somewhat.

5.1 Data scanned by a MapReduce job

  • URL

  • Response
{
  "jobCounters" : {
    "id" : "job_1326381300833_2_2",
    "counterGroup" : [
      {
        "counterGroupName" : "Shuffle Errors",
        "counter" : [
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "BAD_ID" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "CONNECTION" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "IO_ERROR" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "WRONG_LENGTH" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "WRONG_MAP" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "WRONG_REDUCE" }
        ]
      },
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
        "counter" : [
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 2483, "name" : "FILE_BYTES_READ" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 108525, "name" : "FILE_BYTES_WRITTEN" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "FILE_READ_OPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "FILE_LARGE_READ_OPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "FILE_WRITE_OPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 48, "name" : "HDFS_BYTES_READ" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "HDFS_BYTES_WRITTEN" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1, "name" : "HDFS_READ_OPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "HDFS_LARGE_READ_OPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "HDFS_WRITE_OPS" }
        ]
      },
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
        "counter" : [
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1, "name" : "MAP_INPUT_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1200, "name" : "MAP_OUTPUT_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 4800, "name" : "MAP_OUTPUT_BYTES" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 2235, "name" : "MAP_OUTPUT_MATERIALIZED_BYTES" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 48, "name" : "SPLIT_RAW_BYTES" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "COMBINE_INPUT_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "COMBINE_OUTPUT_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1200, "name" : "REDUCE_INPUT_GROUPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 2235, "name" : "REDUCE_SHUFFLE_BYTES" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1200, "name" : "REDUCE_INPUT_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "REDUCE_OUTPUT_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 2400, "name" : "SPILLED_RECORDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1, "name" : "SHUFFLED_MAPS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "FAILED_SHUFFLE" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1, "name" : "MERGED_MAP_OUTPUTS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 113, "name" : "GC_TIME_MILLIS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 1830, "name" : "CPU_MILLISECONDS" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 478068736, "name" : "PHYSICAL_MEMORY_BYTES" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 2159284224, "name" : "VIRTUAL_MEMORY_BYTES" },
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 378863616, "name" : "COMMITTED_HEAP_BYTES" }
        ]
      },
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter",
        "counter" : [
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "BYTES_READ" }
        ]
      },
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
        "counter" : [
          { "reduceCounterValue" : 0, "mapCounterValue" : 0, "totalCounterValue" : 0, "name" : "BYTES_WRITTEN" }
        ]
      }
    ]
  }
}

The BYTES_READ counter in the org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter group is the amount of data scanned by the job.
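
A minimal Python sketch of pulling that counter out of a response shaped like the one above. The URL helper assumes the MapReduce History Server job counters endpoint (/ws/v1/history/mapreduce/jobs/{jobid}/counters); the helper names are made up for this example, and the host you pass in is up to your deployment.

```python
INPUT_GROUP = "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter"


def job_counters_url(history_server, job_id):
    """URL of the History Server counters resource for one job."""
    return (f"http://{history_server}/ws/v1/history/mapreduce/jobs/"
            f"{job_id}/counters")


def counter_total(payload, group_name, counter_name):
    """Return totalCounterValue for one counter, or None if it is absent."""
    for group in payload["jobCounters"].get("counterGroup", []):
        if group["counterGroupName"] != group_name:
            continue
        for counter in group["counter"]:
            if counter["name"] == counter_name:
                return counter["totalCounterValue"]
    return None


def scanned_bytes(payload):
    """Bytes read through FileInputFormat, i.e. the data scanned by the job."""
    return counter_total(payload, INPUT_GROUP, "BYTES_READ")
```

Returning None for a missing counter, rather than 0, keeps "the job read nothing" distinguishable from "the job did not use a FileInputFormat".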

Field details:

5.2 Data scanned by a Spark job

  • URL

The sum of totalInputBytes over all executors is the amount of data scanned by the whole job.

Further reference:
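
As a sketch of that summation in Python, assuming the executor list comes from the Spark monitoring REST API (/api/v1/applications/{app-id}/executors, which returns a JSON array of executor summaries). The function names and the host in the usage comment are placeholders. Applications with several attempts may need the attempt id in the path, which this sketch does not handle.

```python
import json
from urllib.request import urlopen


def executors_url(history_server, app_id):
    """URL of the executor list for one Spark application."""
    return f"http://{history_server}/api/v1/applications/{app_id}/executors"


def total_input_bytes(executors):
    """Sum totalInputBytes over a list of executor summaries."""
    return sum(executor.get("totalInputBytes", 0) for executor in executors)


def fetch_total_input_bytes(history_server, app_id):
    """Fetch the executor list from a live server and sum the input bytes."""
    with urlopen(executors_url(history_server, app_id)) as resp:
        return total_input_bytes(json.load(resp))


# Usage (hypothetical host and application id):
#   fetch_total_input_bytes("spark-history.example.com:18080",
#                           "application_1495123166259_0001")
```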

Reposted from: https://blog.csdn.net/weixin_34038652/article/details/90570583
