How Nova Collects Node Hardware Resource Statistics

derek_334892, published 2019-07-25 10:47 / 808 reads

Introduction

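When we use a cloud platform built on OpenStack, the overview page usually reserves a prominent spot for the cluster's current resource usage: the total, used, and remaining amounts of CPU, memory, disk, and so on. Whenever the cluster is scaled out, the totals on that page grow automatically. We all know that the Nova service in OpenStack manages these compute resources, but have you ever wondered how Nova obtains the numbers in the first place?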

How Nova Collects Resource Statistics

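Collecting resource statistics is an internal mechanism of the Nova service. Given how much later operations (such as creating a virtual machine or a disk) depend on the results, we can infer that this mechanism must run ahead of other services.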

From this simple analysis, plus some necessary debugging, we find that the mechanism is triggered in the nova.service.WSGIService.start method:

    def start(self):
        """Start serving this service using loaded configuration.

        Also, retrieve updated port number in case "0" was passed in, which
        indicates a random port should be used.

        :returns: None

        """
        if self.manager:
            self.manager.init_host()
            self.manager.pre_start_hook()
            if self.backdoor_port is not None:
                self.manager.backdoor_port = self.backdoor_port
        self.server.start()
        if self.manager:
            self.manager.post_start_hook()
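
To make the call order in start() concrete, here is a minimal sketch that records each call in a list. RecordingManager, FakeServer, and start_service are hypothetical names invented for this illustration and are not Nova classes; only the ordering mirrors the method above.

```python
class RecordingManager:
    """Hypothetical stand-in for a Nova manager; records hook calls."""

    def __init__(self, calls):
        self.calls = calls

    def init_host(self):
        self.calls.append("init_host")

    def pre_start_hook(self):
        # In the compute manager, this is the hook that refreshes resources.
        self.calls.append("pre_start_hook")

    def post_start_hook(self):
        self.calls.append("post_start_hook")


class FakeServer:
    """Hypothetical stand-in for the server object the service starts."""

    def __init__(self, calls):
        self.calls = calls

    def start(self):
        self.calls.append("server.start")


def start_service(manager, server):
    """Mirror the order used by start(): hooks wrap server.start()."""
    if manager:
        manager.init_host()
        manager.pre_start_hook()
    server.start()
    if manager:
        manager.post_start_hook()


calls = []
start_service(RecordingManager(calls), FakeServer(calls))
print(calls)
```

The point of the sketch is that pre_start_hook runs before the server starts, which is why resource data is already available when the first request arrives.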

Here, self.manager.pre_start_hook() is what fetches the resource information. It resolves directly to nova.compute.manager.pre_start_hook, shown below:

    def pre_start_hook(self):
        """After the service is initialized, but before we fully bring
        the service up by listening on RPC queues, make sure to update
        our available resources (and indirectly our available nodes).
        """
        self.update_available_resource(nova.context.get_admin_context())
    ...
    @periodic_task.periodic_task
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
        new_resource_tracker_dict = {}
        nodenames = set(self.driver.get_available_nodes())
        for nodename in nodenames:
            rt = self._get_resource_tracker(nodename)
            rt.update_available_resource(context)
            new_resource_tracker_dict[nodename] = rt

        # Delete orphan compute node not reported by driver but still in db
        compute_nodes_in_db = self._get_compute_nodes_in_db(context,
                                                            use_slave=True)

        for cn in compute_nodes_in_db:
            if cn.hypervisor_hostname not in nodenames:
                LOG.audit(_("Deleting orphan compute node %s") % cn.id)
                cn.destroy()

        self._resource_tracker_dict = new_resource_tracker_dict
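
The orphan cleanup at the end of the periodic task comes down to a membership test: any node still recorded in the database but no longer reported by the driver is stale. A minimal sketch of that comparison, using plain hostname strings instead of Nova's compute node objects (find_orphan_nodes and the sample hostnames are hypothetical):

```python
def find_orphan_nodes(reported_nodenames, db_hostnames):
    """Return hostnames recorded in the DB but not reported by the driver."""
    reported = set(reported_nodenames)
    return [name for name in db_hostnames if name not in reported]


reported = ["node-1", "node-2"]          # what the driver currently reports
in_db = ["node-1", "node-2", "node-3"]   # what the database still holds
orphans = find_orphan_nodes(reported, in_db)
print(orphans)
```

In the real method each orphan record is destroyed; the sketch only identifies them.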

In the code above, rt.update_available_resource() resolves directly to nova.compute.resource_tracker.update_available_resource(), shown below:

    def update_available_resource(self, context):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.
        """
        LOG.audit(_("Auditing locally available compute resources"))
        resources = self.driver.get_available_resource(self.nodename)

        if not resources:
            # The virt driver does not support this function
            LOG.audit(_("Virt driver does not support "
                 "'get_available_resource'  Compute tracking is disabled."))
            self.compute_node = None
            return
        resources["host_ip"] = CONF.my_ip

        # TODO(berrange): remove this once all virt drivers are updated
        # to report topology
        if "numa_topology" not in resources:
            resources["numa_topology"] = None

        self._verify_resources(resources)
        
        self._report_hypervisor_resource_view(resources)

        return self._update_available_resource(context, resources)
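
The preparation steps in this method (bail out on an empty report, attach the host IP, default the NUMA topology, verify the result) can be sketched on a plain dictionary. This is an illustration under assumptions: normalize_resources is a hypothetical helper, and the REQUIRED_KEYS tuple is chosen for the example rather than taken from Nova's _verify_resources.

```python
# Assumed for illustration; the real verification may check other keys.
REQUIRED_KEYS = ("vcpus", "memory_mb", "local_gb")


def normalize_resources(resources, host_ip):
    """Return a copy of the driver's report with defaults filled in.

    Returns None when the driver reports nothing, mirroring the early
    return in the method above.
    """
    if not resources:
        return None
    normalized = dict(resources)
    normalized["host_ip"] = host_ip
    # Drivers that do not report NUMA topology get an explicit None.
    normalized.setdefault("numa_topology", None)
    missing = [key for key in REQUIRED_KEYS if key not in normalized]
    if missing:
        raise ValueError("Missing keys: %s" % missing)
    return normalized


report = {"vcpus": 8, "memory_mb": 16384, "local_gb": 200}
result = normalize_resources(report, "192.0.2.10")
print(result["numa_topology"], result["host_ip"])
```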

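In the code above, self._update_available_resource synchronizes the database records with the actual resource usage on the compute node; we will not expand on it here. self.driver.get_available_resource() is what obtains the node's hardware resource information, and the call it actually resolves to is: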

class LibvirtDriver(driver.ComputeDriver):
    def get_available_resource(self, nodename):
        """Retrieve resource information.

        This method is called when nova-compute launches, and
        as part of a periodic task that records the results in the DB.

        :param nodename: will be put in PCI device
        :returns: dictionary containing resource info
        """

        # Temporary: convert supported_instances into a string, while keeping
        # the RPC version as JSON. Can be changed when RPC broadcast is removed
        stats = self.get_host_stats(refresh=True)
        stats["supported_instances"] = jsonutils.dumps(
                stats["supported_instances"])
        return stats
        
    def get_host_stats(self, refresh=False):
        """Return the current state of the host.

        If "refresh" is True, run update the stats first.
        """
        return self.host_state.get_host_stats(refresh=refresh)
        
    def _get_vcpu_total(self):
        """Get available vcpu number of physical computer.

        :returns: the number of cpu core instances can be used.

        """
        if self._vcpu_total != 0:
            return self._vcpu_total

        try:
            total_pcpus = self._conn.getInfo()[2]
        except libvirt.libvirtError:
            LOG.warn(_LW("Cannot get the number of cpu, because this "
                         "function is not implemented for this platform. "))
            return 0

        if CONF.vcpu_pin_set is None:
            self._vcpu_total = total_pcpus
            return self._vcpu_total

        available_ids = hardware.get_vcpu_pin_set()
        if sorted(available_ids)[-1] >= total_pcpus:
            raise exception.Invalid(_("Invalid vcpu_pin_set config, "
                                      "out of hypervisor cpu range."))
        self._vcpu_total = len(available_ids)
        return self._vcpu_total

.....
class HostState(object):
    """Manages information about the compute node through libvirt."""
    def __init__(self, driver):
        super(HostState, self).__init__()
        self._stats = {}
        self.driver = driver
        self.update_status()

    def get_host_stats(self, refresh=False):
        """Return the current state of the host.

        If "refresh" is True, run update the stats first.
        """
        if refresh or not self._stats:
            self.update_status()
        return self._stats
        
    def update_status(self):
        """Retrieve status info from libvirt."""
        ...
        data["vcpus"] = self.driver._get_vcpu_total()
        data["memory_mb"] = self.driver._get_memory_mb_total()
        data["local_gb"] = disk_info_dict["total"]
        data["vcpus_used"] = self.driver._get_vcpu_used()
        data["memory_mb_used"] = self.driver._get_memory_mb_used()
        data["local_gb_used"] = disk_info_dict["used"]
        data["hypervisor_type"] = self.driver._get_hypervisor_type()
        data["hypervisor_version"] = self.driver._get_hypervisor_version()
        data["hypervisor_hostname"] = self.driver._get_hypervisor_hostname()
        data["cpu_info"] = self.driver._get_cpu_info()
        data["disk_available_least"] = _get_disk_available_least()
        ...
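
HostState.get_host_stats follows a simple caching pattern: reuse the last collected stats unless the caller asks for a refresh or nothing has been collected yet. A minimal sketch of the same pattern, with a hypothetical CachedStats class and a made-up collect function standing in for the libvirt queries:

```python
class CachedStats:
    """Hypothetical cache following the get_host_stats(refresh) pattern."""

    def __init__(self, collect):
        self._collect = collect  # callable returning a fresh stats dict
        self._stats = {}
        self.update_status()

    def update_status(self):
        self._stats = self._collect()

    def get_host_stats(self, refresh=False):
        # Re-collect only on request or when the cache is empty.
        if refresh or not self._stats:
            self.update_status()
        return self._stats


counter = {"calls": 0}


def collect():
    """Made-up collector that counts how often it is invoked."""
    counter["calls"] += 1
    return {"vcpus": 8, "sample": counter["calls"]}


state = CachedStats(collect)
first = state.get_host_stats()               # served from the cache
second = state.get_host_stats(refresh=True)  # forces a new collection
print(first["sample"], second["sample"])
```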

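Note the docstring of get_available_resource: it matches our initial inference exactly. Taking vcpus alone as the example, let us keep tracing the statistics flow. self.driver._get_vcpu_total actually resolves to LibvirtDriver._get_vcpu_total (already shown in the code above). If the vcpu_pin_set option is not in effect, the resulting _vcpu_total is self._conn.getInfo()[2]. Here self._conn can be understood as the libvirt adapter, representing an abstract connection to underlying virtualization tools such as KVM and QEMU. getInfo() is a thin wrapper around libvirtmod.virNodeGetInfo, and it returns a list whose third element is the number of vcpus. This is about as far as we need to go; below this point lies libvirt's C code rather than Python.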
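
The index lookup can be tried without libvirt installed by faking the connection. In the sketch below, FakeConnection and total_pcpus are hypothetical names and the sample values are made up; the list layout follows what libvirt's Python binding returns for virNodeGetInfo.

```python
class FakeConnection:
    """Hypothetical stand-in for a libvirt connection (no libvirt needed)."""

    def getInfo(self):
        # Layout of virNodeGetInfo as exposed in Python:
        # [model, memory, cpus, mhz, nodes, sockets, cores, threads]
        return ["x86_64", 16384, 8, 2400, 1, 1, 4, 2]


def total_pcpus(conn):
    """Return the host CPU count, the third element of getInfo()."""
    return conn.getInfo()[2]


print(total_pcpus(FakeConnection()))
```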

On the other hand, if vcpu_pin_set is configured, hardware.get_vcpu_pin_set parses the option into a set of usable CPU indexes, and taking the length of that set gives the final number of vcpus.
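
As a rough illustration of that path, the sketch below parses a pin-set string into a set of CPU indexes, checks it against the host's CPU count, and returns its size. It is a simplified stand-in and not Nova's actual parser: parse_cpu_set and vcpu_total are hypothetical helpers, and the syntax handled is limited to single IDs, inclusive ranges such as "4-7", and exclusions such as "^5".

```python
def parse_cpu_set(spec):
    """Parse a spec such as "4-7,^5,10" into a set of CPU indexes."""
    included, excluded = set(), set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if part.startswith("^"):
            # "^n" removes CPU n from the final set.
            excluded.add(int(part[1:]))
        elif "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError("Invalid range: %s" % part)
            included.update(range(start, end + 1))
        else:
            included.add(int(part))
    return included - excluded


def vcpu_total(spec, total_pcpus):
    """Count usable CPUs, rejecting IDs beyond the host's CPU range."""
    available = parse_cpu_set(spec)
    if not available:
        raise ValueError("No CPUs available after parsing %r" % spec)
    if max(available) >= total_pcpus:
        raise ValueError("vcpu_pin_set is out of hypervisor CPU range")
    return len(available)


print(vcpu_total("4-7,^5,10", total_pcpus=16))
```

The range check mirrors the comparison in _get_vcpu_total: CPU indexes start at 0, so the largest index must be strictly less than the host's CPU count.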

That is the whole logic by which Nova collects node hardware resource statistics (using vcpus as the example).

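Copyright of this article belongs to the author. Do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite this article's address: http://systransis.cn/yun/38158.html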
