ITC561 Cloud Computing
Topic 4: Cloud Architecture
NOTICE: Slides are extracted from Cloud Computing: Concepts, Technology & Architecture by Thomas Erl; Ricardo Puttini; Zaigham Mahmood and other resources.
© PRENTICE HALL
understand and distinguish between Cloud architectures
understand and be able to prepare a design for a Cloud service that can use elements of:
load balancing
elastic disk provisioning
hypervisor clustering
dynamic failure detection
rapid provisioning
Prescribed text: Erl, Chapter 11: Fundamental Cloud Architectures
Learning Outcomes
The workload distribution architecture reduces both IT resource over-utilization and under-utilization, to an extent that depends on the sophistication of the load balancing algorithms and runtime logic.
Workload Distribution Architecture
Figure 11.1 A redundant copy of Cloud Service A is implemented on Virtual Server B. The load balancer intercepts cloud service consumer requests and directs them to both Virtual Servers A and B to ensure even workload distribution.
In addition to the load balancer, the following mechanisms can be part of this cloud architecture:
• Audit Monitor – When distributing runtime workloads, the type and geographical location of the IT resources that process the data can determine whether monitoring is necessary to fulfill legal and regulatory requirements.
• Cloud Usage Monitor – Various monitors can be involved to carry out runtime workload tracking and data processing.
• Hypervisor – Workloads between hypervisors and the virtual servers that they host may require distribution.
• Logical Network Perimeter – The logical network perimeter isolates cloud consumer network boundaries in relation to how and where workloads are distributed.
• Resource Cluster – Clustered IT resources in active/active mode are commonly used to support workload balancing between different cluster nodes.
• Resource Replication – This mechanism can generate new instances of virtualized IT resources in response to runtime workload distribution demands.
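As a minimal sketch of the even distribution shown in Figure 11.1, the Python below rotates requests across two virtual servers in round-robin order. The class name `RoundRobinBalancer`, its `route` method, and the request labels are illustrative assumptions; the source does not prescribe a particular balancing algorithm.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical load balancer that rotates requests across servers."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def route(self, request):
        # Pick the next server in the rotation; the request content is ignored.
        return next(self._rotation)


balancer = RoundRobinBalancer(["Virtual Server A", "Virtual Server B"])
assignments = [balancer.route(f"request-{i}") for i in range(6)]
load = Counter(assignments)
print(load)
```

Round-robin ignores how busy each server is, so it only evens out the workload when requests cost roughly the same. More sophisticated runtime logic would weigh current utilization before routing.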
A resource pooling architecture is based on the use of one or more resource pools, in which identical IT resources are grouped and maintained by a system that automatically ensures that they remain synchronized.
Pools of physical RAM can be used in newly provisioned physical servers or to vertically scale physical servers. Dedicated pools can be created for each type of IT resource, and individual pools can be grouped into a larger pool, in which case each individual pool becomes a sub-pool.
Resource Pooling Architecture
Figure 11.2 A sample resource pool that is comprised of four sub-pools of CPUs, memory, cloud storage devices, and virtual network devices.
In addition to commonly pooled mechanisms such as cloud storage devices and virtual servers, the following mechanisms can also be part of this cloud architecture:
Audit Monitor – This mechanism monitors resource pool usage to ensure compliance with privacy and regulation requirements, especially when pools contain cloud storage devices or data loaded into memory.
Cloud Usage Monitor – Various cloud usage monitors are involved in the runtime tracking and synchronization that are required by the pooled IT resources and any underlying management systems.
Hypervisor – The hypervisor mechanism is responsible for providing virtual servers with access to resource pools, in addition to hosting the virtual servers and sometimes the resource pools themselves.
Logical Network Perimeter – The logical network perimeter is used to logically organize and isolate resource pools.
Pay-Per-Use Monitor – The pay-per-use monitor collects usage and billing information on how individual cloud consumers are allocated and use IT resources from various pools.
Remote Administration System – This mechanism is commonly used to interface with backend systems and programs in order to provide resource pool administration features via a front-end portal.
Resource Management System – The resource management system mechanism supplies cloud consumers with the tools and permission management options for administering resource pools.
Resource Replication – This mechanism is used to generate new instances of IT resources for resource pools.
Figure 11.3 Pools B and C are sibling pools that are taken from the larger Pool A, which has been allocated to a cloud consumer. This is an alternative to taking the IT resources for Pool B and Pool C from a general reserve of IT resources that is shared throughout the cloud.
Figure 11.4 Nested Pools A.1 and Pool A.2 are comprised of the same IT resources as Pool A, but in different quantities. Nested pools are typically used to provision cloud services that need to be rapidly instantiated using the same type of IT resources with the same configuration settings.
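The sibling-pool arrangement of Figure 11.3 can be sketched as a parent pool that hands out part of its own capacity. This is an illustrative sketch only: the `ResourcePool` class, its `carve` method, and the unit counts are assumptions, and a single resource type (for example CPU cores) is modeled for simplicity.

```python
class ResourcePool:
    """Hypothetical pool holding a count of identical IT resources."""

    def __init__(self, name, units):
        self.name = name
        self.units = units

    def carve(self, name, units):
        """Take `units` out of this pool and return them as a new pool."""
        if units > self.units:
            raise ValueError(f"{self.name} has only {self.units} units left")
        self.units -= units
        return ResourcePool(name, units)


# Pool A has been allocated to one cloud consumer (unit count is assumed).
pool_a = ResourcePool("Pool A", 16)
pool_b = pool_a.carve("Pool B", 6)
pool_c = pool_a.carve("Pool C", 4)
print(pool_a.units, pool_b.units, pool_c.units)
```

Because Pools B and C draw from Pool A rather than from a cloud-wide reserve, a request that exceeds what remains in Pool A is refused instead of taking capacity from other consumers.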
The dynamic scalability architecture is an architectural model based on a system of predefined scaling conditions that trigger the dynamic allocation of IT resources from resource pools.
The following types of dynamic scaling are commonly used:
Dynamic Horizontal Scaling – IT resource instances are scaled out and in to handle fluctuating workloads. The automated scaling listener monitors requests and signals resource replication to initiate IT resource duplication, as per requirements and permissions.
Dynamic Vertical Scaling – IT resource instances are scaled up and down when there is a need to adjust the processing capacity of a single IT resource. For example, a virtual server that is being overloaded can have its memory dynamically increased or it may have a processing core added.
Dynamic Relocation – The IT resource is relocated to a host with more capacity. For example, a database may need to be moved from a tape-based SAN storage device with 4 GB per second I/O capacity to another disk-based SAN storage device with 8 GB per second I/O capacity.
Dynamic Scalability Architecture
In addition to the automated scaling listener and resource replication, the following mechanisms can be applied in this cloud architecture:
Cloud Usage Monitor – Specialized cloud usage monitors can track runtime usage in response to dynamic fluctuations caused by this architecture.
Hypervisor – The hypervisor is invoked by a dynamic scalability system to create or remove virtual server instances, or to be scaled itself.
Pay-Per-Use Monitor – The pay-per-use monitor is engaged to collect usage cost information in response to the scaling of IT resources.
Figure 11.5 Cloud service consumers are sending requests to a cloud service (1). The automated scaling listener monitors the cloud service to determine if predefined capacity thresholds are being exceeded (2).
Figure 11.6 The number of requests coming from cloud service consumers increases (3). The workload exceeds the performance thresholds. The automated scaling listener determines the next course of action based on a predefined scaling policy (4). If the cloud service implementation is deemed eligible for additional scaling, the automated scaling listener initiates the scaling process (5).
Continued from Figure 11.5
Figure 11.7 The automated scaling listener sends a signal to the resource replication mechanism (6), which creates more instances of the cloud service (7). Now that the increased workload has been accommodated, the automated scaling listener resumes monitoring, removing and adding IT resources as required (8).
Continued from Figure 11.6
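Steps (1) to (8) of Figures 11.5 to 11.7 can be sketched as a listener that compares the observed request rate with a per-instance capacity threshold and scales out or in accordingly. The class below is a hypothetical illustration: `capacity_per_instance` and `max_instances` are assumed parameters that stand in for the predefined thresholds and scaling policy, and the instance count stands in for the work of the resource replication mechanism.

```python
class AutomatedScalingListener:
    """Hypothetical listener: compares load with a threshold and scales out/in."""

    def __init__(self, capacity_per_instance, max_instances):
        self.capacity_per_instance = capacity_per_instance
        self.max_instances = max_instances  # stands in for the scaling policy
        self.instances = 1

    def observe(self, requests_per_second):
        # Ceiling division: instances needed to stay within per-instance capacity.
        needed = max(1, -(-requests_per_second // self.capacity_per_instance))
        # The policy caps how far the service is allowed to scale out.
        self.instances = min(needed, self.max_instances)
        return self.instances


listener = AutomatedScalingListener(capacity_per_instance=100, max_instances=5)
history = [listener.observe(load) for load in (80, 250, 900, 120)]
print(history)  # prints [1, 3, 5, 2]
```

The third reading (900 requests per second) would need nine instances, but the assumed policy limits the service to five, which mirrors the eligibility check in step (5).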
The elastic resource capacity architecture is primarily related to the dynamic provisioning of virtual servers, using a system that allocates and reclaims CPUs and RAM in immediate response to the fluctuating processing requirements of hosted IT resources.
The following mechanisms can be applied in the elastic resource capacity architecture:
Cloud Usage Monitor – Specialized cloud usage monitors collect resource usage information on IT resources before, during, and after scaling, to help define the future processing capacity thresholds of the virtual servers.
Pay-Per-Use Monitor – The pay-per-use monitor is responsible for collecting resource usage cost information as it fluctuates with the elastic provisioning.
Resource Replication – Resource replication is used by this architectural model to generate new instances of the scaled IT resources.
Elastic Resource Capacity Architecture
Figure 11.8 Cloud service consumers are actively sending requests to a cloud service (1), which are monitored by an automated scaling listener (2). An intelligent automation engine script is deployed with workflow logic (3) that is capable of notifying the resource pool using allocation requests (4).
Figure 11.9 Cloud service consumer requests increase (5), causing the automated scaling listener to signal the intelligent automation engine to execute the script (6). The script runs the workflow logic that signals the hypervisor to allocate more IT resources from the resource pools (7). The hypervisor allocates additional CPU and RAM to the virtual server, enabling the increased workload to be handled (8).
Continued from Figure 11.8
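The interaction in Figures 11.8 and 11.9 can be sketched as a script that asks a hypervisor to move capacity between a resource pool and a single virtual server. Everything named here is an assumption made for illustration: the `Hypervisor` class, the `scaling_script` function, the 80% and 30% utilization thresholds, and the choice to model CPU cores only (RAM would follow the same pattern).

```python
class Hypervisor:
    """Hypothetical hypervisor that moves CPU cores between a pool and one VM."""

    def __init__(self, pool_cores, vm_cores):
        self.pool_cores = pool_cores  # unallocated cores in the resource pool
        self.vm_cores = vm_cores      # cores currently assigned to the VM

    def resize_vm(self, target_cores):
        delta = target_cores - self.vm_cores
        if delta > self.pool_cores:
            raise RuntimeError("resource pool exhausted")
        # A positive delta allocates from the pool; a negative delta reclaims.
        self.pool_cores -= delta
        self.vm_cores = target_cores


def scaling_script(hypervisor, cpu_utilization, high=0.80, low=0.30):
    """Workflow logic run by the (hypothetical) intelligent automation engine."""
    if cpu_utilization > high:
        hypervisor.resize_vm(hypervisor.vm_cores + 1)
    elif cpu_utilization < low and hypervisor.vm_cores > 1:
        hypervisor.resize_vm(hypervisor.vm_cores - 1)
    return hypervisor.vm_cores


hv = Hypervisor(pool_cores=6, vm_cores=2)
after_peak = scaling_script(hv, 0.95)   # overloaded: one core is allocated
after_quiet = scaling_script(hv, 0.10)  # idle: the core is reclaimed
print(after_peak, after_quiet, hv.pool_cores)
```

Unlike horizontal scaling, no new instance is created here; the same virtual server grows and shrinks, and reclaimed capacity returns to the pool for other consumers.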
The service load balancing architecture can be considered a specialized variation of the workload distribution architecture that is geared specifically for scaling cloud service implementations.
In addition to the load balancer, the following mechanisms can be applied in the service load balancing architecture:
Cloud Usage Monitor – Cloud usage monitors may be involved with monitoring cloud service instances and their respective IT resource consumption levels, as well as various runtime monitoring and usage data collection tasks.
Resource Cluster – Active-active cluster groups are incorporated in this architecture to help balance workloads across different members of the cluster.
Resource Replication – The resource replication mechanism is utilized to generate cloud service implementations in support of load balancing requirements.
Service Load Balancing Architecture
Figure 11.10 The load balancer intercepts messages sent by cloud service consumers (1) and forwards them to the virtual servers so that the workload processing is horizontally scaled (2).
Figure 11.11 Cloud service consumer requests are sent to Cloud Service A on Virtual Server A (1). The cloud service implementation includes built-in load balancing logic that is capable of distributing requests to the neighboring Cloud Service A implementations on Virtual Servers B and C (2).
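One way the load balancing logic in Figures 11.10 and 11.11 might choose a target is to forward each request to the instance with the fewest requests in flight. The sketch below assumes that policy; the function names and the request counts are hypothetical, and the source does not state which algorithm the load balancer or the built-in logic uses.

```python
def pick_least_loaded(active_requests):
    """Return the instance currently handling the fewest requests."""
    return min(active_requests, key=active_requests.get)


def dispatch(active_requests):
    """Forward one request to the least-loaded instance and record it."""
    target = pick_least_loaded(active_requests)
    active_requests[target] += 1
    return target


# Hypothetical in-flight request counts for three Cloud Service A instances.
active = {"Virtual Server A": 4, "Virtual Server B": 1, "Virtual Server C": 3}
targets = [dispatch(active) for _ in range(4)]
print(targets)
print(active)
```

Starting from an uneven load, four dispatches leave every instance with four requests. The same selection function could sit in an external load balancer (Figure 11.10) or inside the service implementation itself (Figure 11.11).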
The cloud bursting architecture establishes a form of dynamic scaling that scales or “bursts out” on-premise IT resources into a cloud whenever predefined capacity thresholds have been reached.
Cloud Bursting Architecture
Figure 11.12 An automated scaling listener monitors the usage of on-premise Service A, and redirects Service Consumer C’s request to Service A’s redundant implementation in the cloud (Cloud Service A) once Service A’s usage threshold has been exceeded (1). A resource replication system is used to keep state management databases synchronized (2).
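The redirection in Figure 11.12 can be sketched as a router that keeps requests on-premise until a usage threshold is reached and then bursts out to the cloud. The `BurstRouter` class and the threshold of two concurrent consumers are assumptions chosen so that Service Consumer C is the one redirected; state synchronization by the resource replication system is not modeled.

```python
class BurstRouter:
    """Hypothetical listener that redirects overflow requests to the cloud."""

    def __init__(self, threshold):
        self.threshold = threshold   # on-premise usage threshold (assumed value)
        self.on_premise_active = 0   # consumers currently served on-premise

    def route(self, consumer):
        if self.on_premise_active < self.threshold:
            self.on_premise_active += 1
            return "on-premise Service A"
        # Threshold reached: burst out to the redundant cloud implementation.
        return "Cloud Service A"


router = BurstRouter(threshold=2)
placements = {c: router.route(c) for c in ("A", "B", "C")}
print(placements)
```

A fuller model would also release on-premise capacity when consumers finish, so that traffic bursts back in once usage falls below the threshold.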
The elastic disk provisioning architecture establishes a dynamic storage provisioning system that ensures that the cloud consumer is granularly billed for the exact amount of storage that it actually uses.
Cloud consumers are commonly charged for cloud-based storage space based on fixed-disk storage allocation, meaning the charges are predetermined by disk capacity and not aligned with actual data storage consumption.
The following mechanisms can be applied in the elastic disk provisioning architecture:
Cloud Usage Monitor – Specialized cloud usage monitors can be used to track and log storage usage fluctuations.
Resource Replication – Resource replication is part of an elastic disk provisioning system when conversion of dynamic thin-disk storage into static thick-disk storage is required.
Elastic Disk Provisioning Architecture
Figure 11.13 The cloud consumer requests a virtual server with three hard disks, each with a capacity of 150 GB (1). The virtual server is provisioned using fixed-disk storage allocation, with a total of 450 GB of disk space (2). The 450 GB is allocated to the virtual server by the cloud provider (3). The cloud consumer has not installed any software yet, meaning the actual used space is currently 0 GB (4). Because the 450 GB are already allocated and reserved for the cloud consumer, it will be charged for 450 GB of disk usage as of the point of allocation (5).
Figure 11.14 The cloud consumer requests a virtual server with three hard disks, each with a capacity of 150 GB (1). The virtual server is provisioned by this architecture with a total of 450 GB of disk space (2). The 450 GB are set as the maximum disk usage that is allowed for this virtual server, although no physical disk space has been reserved or allocated yet (3). The cloud consumer has not installed any software, meaning the actual used space is currently at 0 GB (4). Because the allocated disk space is equal to the actual used space (which is currently at zero), the cloud consumer is not charged for any disk space usage (5).
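The billing difference between Figures 11.13 and 11.14 reduces to simple arithmetic, sketched below. The function `billable_gb` and its `elastic` flag are hypothetical names, and the 120 GB reading is an assumed later usage figure that does not appear in the source.

```python
def billable_gb(disk_sizes_gb, used_gb, elastic):
    """Return the storage amount the cloud consumer is charged for, in GB."""
    allocated = sum(disk_sizes_gb)
    if used_gb > allocated:
        raise ValueError("usage cannot exceed the maximum allowed disk space")
    # Fixed-disk allocation bills the full capacity; elastic bills actual usage.
    return used_gb if elastic else allocated


disks = [150, 150, 150]                           # three 150 GB hard disks
fixed = billable_gb(disks, 0, elastic=False)      # Figure 11.13 case
thin = billable_gb(disks, 0, elastic=True)        # Figure 11.14 case
later = billable_gb(disks, 120, elastic=True)     # assumed usage after installs
print(fixed, thin, later)  # prints 450 0 120
```

In both cases 450 GB is the ceiling; only under elastic disk provisioning does the charge track what has actually been written.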