Syan

 

High Availability for pSeries and RS/6000

Complete Availability Line-up

HACMP: can keep mission-critical applications highly available within a location through application fallover and monitoring.

HAGEO: can quickly restore access to data following a location failure. HAGEO provides the same functionality as GeoRM, and, together with HACMP, it can automate failover, recovery and reintegration between geographically separate locations.

GeoRM: can provide an environment to mirror data to another location. It can provide remote data mirroring for backup of regional (up to seven) locations to a single centralized server or between just two geographically separate servers. GeoRM and HAGEO are mutually exclusive and cannot be used together.

 


HACMP

Highlights

  • Combines world-class, easy-to-use, 24x7 clustering technology with IBM advanced systems technologies
  • Significantly reduces planned and unplanned outages, allowing for cluster upgrades and system maintenance without interrupting operations
  • Offers multiple data backup and recovery methods to meet disaster management needs

Need for High Availability

During the business day, IT investments are hard at work: recording customer activities, tracking inventory, keeping company statistics, and providing employees with the computing power needed to generate business revenue. But what happens when those systems fail? The cost of computer downtime is widely documented; unplanned outages cost real money and increase the total cost of ownership (TCO) for IT. Planned outages for system maintenance can also impact business performance. Keeping systems highly available should be the top goal of every system administrator or corporate CIO. What every business needs is a high-availability (HA) solution that keeps the company's IT investment running 24x7, shields end users from system outages, and lets system maintenance occur without causing downtime.

IBM HA Clustering Solution

Better protect critical business applications from failures with the capabilities of IBM High Availability Cluster Multiprocessing for AIX 5L V5.1.0 (HACMP V5.1). For over 10 years, HACMP has been providing reliable high-availability services, monitoring capabilities and dependable detection of application failures. HACMP manages the fallover of business application environments to backup servers. And with the introduction of the new optional package, HACMP/XD (Extended Distance), HACMP will also manage fallover to backup servers at remote sites. HACMP/XD provides long distance remote fallover for ESS/PPRC peers, and unlimited distance fallover for IP connected peers using proven IBM HAGEO (High Availability Geographic Cluster) technology. Now there is a single, world-class source of protection for mission-critical applications.

HACMP makes use of redundant hardware configured in a cluster to keep an application running, restarting it on a backup server if necessary. This minimizes expensive downtime for both planned and unplanned outages and provides flexibility to accommodate changing business needs. Up to 32 servers can participate in an HACMP cluster - ideal for an environment requiring horizontal growth with rock-solid reliability. HACMP can also detect software problems that are not severe enough to interrupt proper operation of the system, such as process failure or exhaustion of system resources. HACMP monitors, detects and reacts to such failure events, allowing the system to stay available during random, unexpected software problems. HACMP can be configured to react to hundreds of system events.
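
The detect-and-restart behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HACMP's actual mechanism or interface: the `Node` and `Cluster` classes, the `max_missed` threshold and the "least loaded survivor" policy are all hypothetical names and choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    missed_heartbeats: int = 0
    applications: list = field(default_factory=list)


class Cluster:
    """Toy model of heartbeat-based failure detection and fallover."""

    def __init__(self, nodes, max_missed=3):
        self.nodes = {node.name: node for node in nodes}
        self.max_missed = max_missed  # hypothetical threshold, not an HACMP setting
        self.failed = set()

    def record_heartbeat(self, name, received):
        """Record one heartbeat interval for a node and fail it over if needed."""
        if name in self.failed:
            return
        node = self.nodes[name]
        node.missed_heartbeats = 0 if received else node.missed_heartbeats + 1
        if node.missed_heartbeats >= self.max_missed:
            self._fallover(node)

    def _fallover(self, failed_node):
        """Mark the node failed and move its applications to a surviving node."""
        self.failed.add(failed_node.name)
        survivors = [n for n in self.nodes.values() if n.name not in self.failed]
        if not survivors:
            return  # no backup server left to take over
        # Assumed policy: restart on the survivor running the fewest applications.
        backup = min(survivors, key=lambda n: len(n.applications))
        backup.applications.extend(failed_node.applications)
        failed_node.applications = []


primary = Node("primary", applications=["orders-db"])
standby = Node("standby")
cluster = Cluster([primary, standby], max_missed=3)
for _ in range(3):
    cluster.record_heartbeat("primary", received=False)
print(standby.applications)
```

In this model a single missed heartbeat does nothing; only a run of consecutive misses triggers the move, which is one simple way to avoid reacting to a transient network glitch.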

Using HACMP can virtually eliminate planned outages, since users, applications and data can be moved to backup systems during scheduled system maintenance. HACMP clusters can be configured to meet complex and varied application availability and recovery needs.

Benefits of HACMP

HACMP takes advantage of AIX 5L - the high-performance, scalable UNIX operating system from IBM - and exploits its systems and network management capabilities. AIX 5L is one of the world's most open UNIX operating systems and includes functions to improve usability, security, system availability, and performance. These include improved availability of mirrored data and enhancements to AIX Workload Manager that help solve problems of mixed workloads by quickly and dynamically providing resource availability to critical applications. Used across the IBM eServer pSeries line of on demand servers along with the Reliable Scalable Cluster Technology (RSCT) infrastructure technology layer in AIX 5L, HACMP can provide both horizontal and vertical scalability without downtime.

High Availability Cluster Enhancements

HACMP V5.1 requires AIX 5L and builds upon its features. New HACMP V5.1 functions include:

  • Reduced fallover time using fast disk takeover which happens within 10 seconds
  • Streamlined configuration interface which requires only six user inputs to build a simple HA cluster
  • New non-IP heartbeating protection over disks where no additional hardware is required
  • Enhanced security mechanism, removing the need for /.rhosts
  • Increased administration productivity through faster cluster verification and synchronization
  • Greater control over resources owning application startup and fallover behaviour
  • More cluster status information readily available in the cluster monitor
  • Addition of multiple disaster recovery technologies to keep the system accessible if disaster strikes

Business Continuity

The HACMP/XD (Extended Distance) optional feature is a must for customers with business-critical data who want to mirror data between separate sites to aid in disaster recovery. This applies to businesses of any size, with multiple sites or regional operations, or wherever decentralization of data is desired. HACMP/XD is an attractive and affordable high-availability solution for small- and medium-sized enterprises, and for small- and medium-sized business units of large enterprises. "High availability" should be a fundamental buying criterion for business-critical and e-business applications.

In a single package, HACMP/XD offers multiple technologies for achieving long distance data mirroring, fallover, and resynchronisation.

  • HACMP/XD supports IBM Enterprise Storage Server (ESS) Peer-to-Peer Remote Copy (PPRC). This allows HACMP clusters to support automatic fallover of disks that are PPRC pairs and creates a powerful solution for customers on ESS with PPRC. By automating the management of PPRC, recovery time is minimized after an outage, regardless of whether the clustered environment is local or geographically dispersed. HACMP/XD in combination with PPRC manages a clustered environment to ensure mirroring of critical data is maintained at all times.
  • HACMP/XD IP-based mirroring will provide the well-known unlimited distance data mirroring of the IBM High Availability Geographic Cluster (HAGEO) for AIX product. IP-based mirroring allows a cluster of pSeries servers to be placed in two widely separated geographic locations, each maintaining an exact replica of the application and data. Data synchronization during production, fallover, recovery, and restoration is provided. HACMP/XD is independent of the disk storage used. RAID or mirroring can be used for local protection. HACMP/XD IP-based mirroring is done at the logical volume layer.

Complementary Cluster Software

IBM also offers a broad range of additional tools to aid in efficiently building, managing and expanding HA clusters in AIX 5L environments. These include:

  • Integrated Cluster File System utilizing General Parallel File System (GPFS) for AIX V1.5. GPFS is a high-performance, shared-disk file system using standard UNIX file system interfaces and providing concurrent access to data from all nodes in a cluster.
  • Workload Manager for AIX, which provides resource balancing between applications
  • Geographic Remote Mirror (GeoRM) for AIX to provide unlimited distance data mirroring for backup/recovery
  • Tivoli for enterprise level systems management and monitoring

New Generation of On Demand Servers

HACMP runs on IBM eServer pSeries, the server platform of choice for UNIX-based on demand applications. This technology-driven line of servers offers the availability, scalability and range of performance demanded by today's growing on demand business environments. It combines the benefits of high-performance copper chip and RISC technology with AIX 5L for reliable handling of mission-critical applications.

pSeries is part of the IBM eServer product line, a generation of servers featuring innovative technology, logical partitioning, outstanding scalability and availability, broad support of open standards for application flexibility, and a full range of new tools to manage IT infrastructure in an on demand world.

Gaining the IBM Advantage

HA solutions are often single-sourced to reduce the risk of failure, since each element of the solution is designed and tested for reliability. This can be a critical decision factor for business environments, and IBM offers the advantage of supplying pSeries servers, the AIX 5L operating system, IBM TotalStorage offerings and HACMP solutions from a single source.

The IBM eServer product line is backed by comprehensive offerings and resources that provide value at every stage of IT implementation. These include High Availability Cluster Implementation Services, an offering which provides basic and customized assistance for installation of HACMP clusters. This service is customisable with the following elements:

  • High Availability Cluster Proof of Concept Review
  • Planning and design of a pSeries Availability Cluster
  • Installation and configuration of a pSeries Availability Cluster
  • Applications integration assistance
  • Development and execution of a Cluster Test Plan
  • Enhanced monitoring and reporting setup
  • Operations planning and operations documentation development
  • Migration/upgrades services

Based on an assessment of the complete system environment, IBM availability experts can design a customer solution to meet the target availability level for on demand business needs.

 


GeoRM for AIX and HAGEO for AIX

Highlights

  • Provides disaster recovery and resynchronisation capability for geographically separated sites
  • Protects data against total location failure by remote mirroring of data
  • Supports unlimited distance between participating sites
  • Performs automatic site takeover and recovery
  • Tight integration with IBM's High Availability Cluster Multiprocessing (HACMP) for AIX clustering software

GeoRM/HAGEO Key Features

  • Support for both UDP and TCP transport options.
  • 64-bit kernel support for the TCP protocol.
  • Choice of "Write Ordering by Volume Group" under the TCP transport option which can realize performance gains.
  • Tighter integration with HACMP simplifies configuration of both products.
  • Allows automatic detection and response to site and network failures in the geographic cluster without user intervention.
  • Provides load balancing across the links, enhanced by choosing the fastest path.
  • Removes the AIX limitation of three mirror copies of a disk and allows three copies at each geographic site.
  • Wider range of data transmission rates, allowing more efficient use of networks and better tuning of network utilization.
  • Support for maximum-sized logical volumes.

Disaster Recovery Excellence

Today, keeping a business operational increasingly means keeping critical data and information systems available around the clock. To compete successfully in the global marketplace, companies are striving to protect critical information systems to help minimize costly business impacts, such as lost sales, decreased customer satisfaction and reduced employee productivity.

One aspect of high availability is protection against location disasters, such as power outages, hardware or software failures, and natural disasters. This is accomplished by eliminating the system and the site as points of failure.

Two software products provide differing levels of disaster recovery features for IBM eServer pSeries and IBM RS/6000 UNIX systems. Geographic Remote Mirror (GeoRM) for AIX protects critical data by duplicating the most up-to-date data reliably and quickly at a remote location. High Availability Geographic Cluster (HAGEO) for AIX helps keep mission-critical systems and applications operational in the event of disasters.

HAGEO provides the geographic mirroring functions of GeoRM and adds automatic failover and recovery capabilities.

 


GeoRM

GeoRM is a data mirroring product that provides a point-to-point method of duplicating the customer data in real-time over unlimited geographic distances. Since GeoRM is both database and file system independent, there is no modification required of applications that utilize GeoRM's mirroring capabilities.

Businesses can be assured that GeoRM is designed to mirror any data destined for one server (the source server) across any IP-based network to another server (the target server). A total failure (e.g., CPU, disk, network, power) of the source server at the local site will not cause the loss of data on the target server at the remote site.

GeoRM has the ability to continue operations while recovering from a server failure. Since a target server can support up to seven source servers in GeoRM, the flexibility to design the correct backup configuration serves all types of business recovery needs and allows business applications to continue running on the takeover system while you recover from a disaster or planned outage. Each of these source and target servers can be as near (in the same room) or as far (halfway around the world) as required.
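
The seven-sources-per-target limit mentioned above lends itself to a simple planning check. The sketch below is illustrative only: `validate_mirror_pairs` and the server names are hypothetical, and the function is not part of GeoRM; it merely counts the distinct source servers planned for each target.

```python
MAX_SOURCES_PER_TARGET = 7  # limit stated in the text above


def validate_mirror_pairs(pairs):
    """Check a planned list of (source, target) pairs against the limit.

    Returns a list of human-readable problems; an empty list means the
    plan respects the seven-sources-per-target limit.
    """
    sources_per_target = {}
    for source, target in pairs:
        sources_per_target.setdefault(target, set()).add(source)

    problems = []
    for target, sources in sorted(sources_per_target.items()):
        if len(sources) > MAX_SOURCES_PER_TARGET:
            problems.append(
                f"{target}: {len(sources)} source servers planned, "
                f"limit is {MAX_SOURCES_PER_TARGET}"
            )
    return problems


# Hypothetical plan: seven regional servers mirrored to one central server.
plan = [(f"region{i}", "central") for i in range(1, 8)]
print(validate_mirror_pairs(plan))
```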

GeoRM offers a wide range of mirroring configurations, ranging from the most stringent data integrity mode to a higher performance mode. Data between the GeoRM sites can be mirrored in three modes:

  • Synchronous mode helps ensure that the same data exists on both sites at the completion of every write. This mode provides a high level of data integrity.
  • Synchronous with mirror write consistency helps ensure that both sites can be restored with identical data, even in the event of a site failure in mid-transaction. This mode provides data integrity and better performance results.
  • Asynchronous mode writes on the local disk without waiting for the remote write to complete. Not all data may be on the remote site when a site failure occurs. This mode is chosen when performance is the highest priority in disaster recovery.
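
The trade-off between the synchronous and asynchronous modes above can be made concrete with a small simulation. This is a sketch under stated assumptions rather than a description of how GeoRM is implemented: the `MirroredVolume` class and its methods are hypothetical, the network is replaced by an in-memory queue, and the mirror write consistency variant is left out for brevity.

```python
from collections import deque


class MirroredVolume:
    """Toy model of a volume mirrored to a remote site in one of two modes."""

    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError("mode must be 'sync' or 'async'")
        self.mode = mode
        self.local = {}
        self.remote = {}
        self.pending = deque()  # writes not yet applied remotely (async only)

    def write(self, block, data):
        """Write one block locally and mirror it according to the mode."""
        self.local[block] = data
        if self.mode == "sync":
            # The write is complete only once the remote copy is updated.
            self.remote[block] = data
        else:
            # The write returns immediately; the remote copy catches up later.
            self.pending.append((block, data))

    def drain(self, max_writes=None):
        """Apply queued writes to the remote copy in order; return the count."""
        applied = 0
        while self.pending and (max_writes is None or applied < max_writes):
            block, data = self.pending.popleft()
            self.remote[block] = data
            applied += 1
        return applied

    def blocks_at_risk(self):
        """Blocks that would be missing or stale remotely if the site failed now."""
        return {b for b, d in self.local.items() if self.remote.get(b) != d}


volume = MirroredVolume("async")
volume.write(0, "invoice-1")
volume.write(1, "invoice-2")
volume.drain(max_writes=1)
print(volume.blocks_at_risk())
```

In synchronous mode `blocks_at_risk()` is always empty after a write returns, at the cost of waiting for the remote site on every write; in asynchronous mode it reports whatever the remote site has not yet received.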

GeoRM is suitable for all customers, from small and medium-sized companies to large corporate enterprises. It is scalable and flexible across the entire range of IBM AIX servers.

 


HAGEO

HAGEO supports the same critical data mirroring functions as GeoRM, such as point-to-point mirroring, three mirroring modes, and backup configuration flexibility. Not only is data protected, but HAGEO also has built-in features to automatically respond to site and communication failures and provide for automatic site takeover.

An HAGEO cluster consists of two geographically separated sites, supporting a total of eight systems. There are three types of disaster protection: remote hot backup, remote mutual takeover and concurrent access.

Remote Hot Backup

A remote geographic site is designated as the hot backup site. This backup site includes hardware, system and application software, and application data and files. It is live and ready to take over the current workload. In the event of a failure, the failed site's application workload automatically transfers to the remote hot backup site.

Remote Mutual Takeover

Remote mutual takeover takes remote hot backup a little further and allows geographically separated system sites to be designated as hot backups for each other. Should either site experience a failure, the other acts as a hot backup and automatically takes over the designated application workload of the failed site. Two different workloads running at two different sites are protected!
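
As a rough illustration of mutual takeover, the sketch below computes which workloads each site runs after one of two sites fails. The function name, site names and workload names are hypothetical, and the code says nothing about how HAGEO itself is configured; it only models the outcome described above.

```python
def workloads_after_failure(site_workloads, failed_site):
    """Return the workloads each site runs after `failed_site` fails.

    `site_workloads` maps exactly two site names to lists of workloads,
    matching the two-site cluster described in the text.
    """
    if len(site_workloads) != 2:
        raise ValueError("expected exactly two sites")
    if failed_site not in site_workloads:
        raise KeyError(failed_site)

    (survivor,) = [site for site in site_workloads if site != failed_site]
    return {
        failed_site: [],
        # The survivor keeps its own workload and adds the failed site's.
        survivor: site_workloads[survivor] + site_workloads[failed_site],
    }


sites = {"london": ["orders"], "paris": ["billing"]}
print(workloads_after_failure(sites, "paris"))
```

Because either site name can be passed as `failed_site`, the same mapping covers both directions of takeover.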

Concurrent Access

Concurrent access configurations have systems at both sites concurrently updating the same database. Users run instances of the same application at both sites for increased system throughput and extremely fast failover. HAGEO is one of the few products to have this ability.

Remote System Recovery

With any of the above types of disaster protection, after a failed site has been restored to operation, HAGEO can resynchronise mission-critical data and reintegrate the failed system with the remote hot backup. HAGEO updates the failed system with a current mirror of the application data and files processed by the backup system after the failed system ceased operations. Once the data and file mirror is up to date, the HAGEO cluster resumes synchronized system operations, including the mirroring of real-time data and files between the system sites. This can occur while the remote backup is in user operation.
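
One common way to bring a restored site up to date is to track which blocks changed during the outage and copy only those back. The sketch below shows that idea; it is an assumption made for illustration, not a statement of how HAGEO tracks changes internally, and the `BackupSite` class and its methods are hypothetical.

```python
class BackupSite:
    """Toy model of a backup site that tracks changes made during an outage."""

    def __init__(self, data):
        self.data = dict(data)
        self.peer_down = False
        self.changed_since_failure = set()

    def peer_failed(self):
        """Start tracking changes once the other site stops responding."""
        self.peer_down = True

    def write(self, block, value):
        """Apply a write; remember the block if the peer cannot receive it."""
        self.data[block] = value
        if self.peer_down:
            self.changed_since_failure.add(block)

    def resynchronise(self, restored_data):
        """Copy blocks changed during the outage into the restored site's data.

        Returns the sorted list of blocks that were copied.
        """
        copied = sorted(self.changed_since_failure)
        for block in copied:
            restored_data[block] = self.data[block]
        self.changed_since_failure.clear()
        self.peer_down = False
        return copied


primary_data = {0: "a", 1: "b"}
backup = BackupSite(primary_data)
backup.peer_failed()
backup.write(1, "B")   # update made while the primary site is down
backup.write(2, "c")   # new block created while the primary site is down
print(backup.resynchronise(primary_data))
```

Copying only the changed blocks keeps the resynchronisation traffic proportional to the work done during the outage rather than to the total size of the data.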

Complete Availability Line-Up

HAGEO is complemented by High Availability Cluster Multiprocessing (HACMP) for AIX, which can be used for local or campus disaster survivability with real-time automated fallover and reintegration for up to 32 servers. HACMP can protect against local system and application failures, preserve data integrity and consistency, and maintain cluster operations during unplanned and planned downtime. This strong line-up provides IBM AIX system customers with a wide choice of high availability and disaster recovery technologies.

 

© Syan 2008