According to IDC, the market for highly available servers, and for high availability in general, is growing, reflecting enterprise demand for solutions that maintain business continuity. As a result, IT departments are under increasing pressure to preserve application availability. This article discusses strategies for high availability and how IT departments can determine the right solutions for their critical applications.
Key considerations for high availability
Companies today are implementing high-availability solutions for many reasons: to minimize planned downtime for maintenance and upgrades, to enable rapid recovery from unplanned downtime, and to meet compliance requirements that ensure continued end-user access to business-critical applications.
Companies need to consider two key questions when initiating high-availability planning: Which applications are mission-critical to business users? And what levels of downtime are acceptable for these applications?
Recent Symantec studies indicate that approximately one-half of a company’s applications are deemed mission-critical with an average recovery time objective (RTO) of 8 hours or less. Typical mission-critical applications today include a company’s email system, ERP and financial systems, transactional systems, and CRM tools. And the current trend shows more and more applications shifting to a mission-critical status.
Traditionally, restarting applications has largely been a manual process, a matter of reinstalling the software from disk. Fortunately, today’s provisioning tools enable critical applications to be back up and running on a more automated basis. A manual rebuild can take a day or so, but provisioning tools can reduce that startup time to 6 to 8 hours.
An even higher level of protection, the “platinum level,” involves moving applications to a clustered environment. This allows IT departments to monitor the health and status of applications and their relevant components (including the associated database, operating system, network, and storage resources) along with the ability to restart those applications on another server (or at another site) when needed. Therefore, if an issue were to occur, software automatically takes care of the process of restarting the application and all of its dependencies, making sure users are redirected to the new server. Typically, RTOs for these kinds of applications are measured in minutes rather than hours.
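As a rough illustration of how an RTO narrows the choice between these approaches, the following minimal sketch lists which approaches fit a given RTO. The recovery times for a manual rebuild (about 24 hours, standing in for "a day or so") and for provisioning tools (8 hours, the upper end of the range above) follow the figures in this article; the 15-minute value for clustering is an assumption standing in for "minutes," and the function name is hypothetical.

```python
# Typical recovery times in hours.
# 24.0 approximates "a day or so"; 8.0 is the upper end of the 6-8 hour range.
# 0.25 (15 minutes) is an ASSUMED stand-in for "measured in minutes".
RECOVERY_HOURS = [
    ("manual rebuild", 24.0),
    ("automated provisioning", 8.0),
    ("clustering with automated failover", 0.25),
]


def approaches_meeting_rto(rto_hours):
    """Return the approaches whose typical recovery time fits within the RTO."""
    return [name for name, hours in RECOVERY_HOURS if hours <= rto_hours]


# An application with an 8-hour RTO rules out a manual rebuild.
print(approaches_meeting_rto(8))
# An application with a 1-hour RTO leaves only clustering.
print(approaches_meeting_rto(1))
```

In practice the recovery times would come from an organization's own measurements rather than from fixed constants.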
Downtime and change
When IT departments put together their high-availability plans, they outline possible scenarios that will need to be addressed, such as a major hardware failure, a critical application shutting down, a storage or networking component becoming unavailable, or a natural disaster taking down an entire data center.
But according to surveys of Symantec customers, the leading cause of downtime is change, along with human error on the part of administrators themselves.
Given the heterogeneous nature of today’s data centers, companies frequently fail to recognize which systems or applications depend on one another. Change management tools enable IT departments to determine the impact of a change on multiple systems (not just on the system where the change was made). These tools can tell who made a change, when it was made, and what was actually changed. As a result, when downtime does occur, diagnosing the problem does not take nearly as long.
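The idea of determining the impact of a change on multiple systems can be sketched as a walk over a dependency map. The example below is a minimal illustration, not the behavior of any particular change management product; the system names and the impacted_by function are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: each system depends on the components listed.
DEPENDS_ON = {
    "crm": ["app-server", "customer-db"],
    "email": ["mail-store"],
    "app-server": ["san-volume-1"],
    "customer-db": ["san-volume-1"],
    "mail-store": ["san-volume-2"],
}


def impacted_by(changed, depends_on):
    """Return every system that directly or indirectly depends on `changed`."""
    # Invert the map so we can walk from a component to its dependents.
    dependents = {}
    for system, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(system)

    # Breadth-first walk outward from the changed component.
    impacted, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for system in dependents.get(current, ()):
            if system not in impacted:
                impacted.add(system)
                queue.append(system)
    return impacted


# A change to one storage volume reaches the CRM application through two paths.
print(sorted(impacted_by("san-volume-1", DEPENDS_ON)))
```

A change to san-volume-1 is reported as affecting the application server, the customer database, and the CRM system that sits on top of both, even though the change was made on the storage component alone.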
In a clustered environment, so-called configuration drift (changing one server but not making the change on a backup server) can cause these expensive systems to fail. Installing change management tools can prevent such drift and ensure that an organization gets the most out of its high-availability investments.
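Configuration drift can be detected, in its simplest form, by comparing the settings recorded for a primary server against those of its backup. The following is a minimal sketch under that assumption; the setting names, values, and the find_drift function are hypothetical, and real tools track far more than a flat list of settings.

```python
# Marker used when a setting exists on one server but not on the other.
MISSING = "<missing>"


def find_drift(primary, standby):
    """Return {setting: (primary_value, standby_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(primary) | set(standby)):
        p = primary.get(key, MISSING)
        s = standby.get(key, MISSING)
        if p != s:
            drift[key] = (p, s)
    return drift


# Hypothetical settings: the primary was upgraded and tuned, the standby was not.
primary = {"db_version": "2.1", "os_patch_level": "p12", "max_connections": 500}
standby = {"db_version": "2.0", "os_patch_level": "p12"}

for setting, (p, s) in find_drift(primary, standby).items():
    print(f"{setting}: primary={p} standby={s}")
```

Running a comparison like this after every change, rather than only before a failover, is what keeps the backup server usable when it is actually needed.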
High availability in the era of virtualization
Today, high-availability solutions are becoming even more important for operational viability as organizations increase their adoption of virtualization technologies. With virtualization, multiple virtual machines are commonly hosted on a single physical server, so a failure of that server can cause a loss of availability for several applications at once. As a result, the importance of providing business continuity rises as the use of virtualization increases.
A critical challenge for data centers is securing adequate IT skill sets in-house to deploy and maintain clustered and highly available software solutions. For enterprises large and small, increasing pressure on IT budgets means that simplified administration, so-called “single-pane-of-glass” management, and increased automation are becoming more important to next-generation high-availability solutions.
Ultimately, most organizations will need a unified management console that supports both high-availability software and virtualization software, and that helps them discover, manage, and visualize both physical and virtual servers.
Veritas Cluster Server
Veritas Cluster Server is Symantec’s clustering solution for reducing both planned and unplanned downtime. Veritas Cluster Server can detect faults in an application and all its dependent components, including the associated database, operating system, network, and storage resources. When a failure is detected, Veritas Cluster Server shuts down the application, restarts it on an available server, connects it to the appropriate storage device, and resumes normal operations.
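The failover sequence described above (shut down, pick an available server, connect storage, restart) can be sketched as a small simulation. This is an illustration of the general pattern only; it does not use or represent the Veritas Cluster Server interfaces, and every class, function, and name in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    healthy: bool = True


@dataclass
class Application:
    name: str
    storage: str
    running_on: Optional[str] = None
    log: List[str] = field(default_factory=list)


def fail_over(app, nodes):
    """Move `app` to a healthy node if its current node has faulted.

    Returns the name of the node the application ends up on, or None.
    """
    by_name = {n.name: n for n in nodes}
    current = by_name.get(app.running_on)
    if current is not None and current.healthy:
        return app.running_on  # no fault detected, nothing to do

    # 1. Shut the application down on the faulted node.
    if app.running_on is not None:
        app.log.append(f"offline {app.name} on {app.running_on}")

    # 2. Pick an available server other than the faulted one.
    target = next(
        (n for n in nodes if n.healthy and n.name != app.running_on), None
    )
    if target is None:
        app.running_on = None
        app.log.append("no healthy node available")
        return None

    # 3. Connect the storage, then 4. restart the application.
    app.log.append(f"attach {app.storage} to {target.name}")
    app.log.append(f"online {app.name} on {target.name}")
    app.running_on = target.name
    return target.name


demo_nodes = [Node("node-a"), Node("node-b")]
demo_app = Application("erp", "erp-volume", running_on="node-a")
demo_nodes[0].healthy = False  # simulate a fault on the active node
fail_over(demo_app, demo_nodes)
print("\n".join(demo_app.log))
```

The sketch keeps the steps in a fixed order because the storage has to be attached to the target server before the application is brought online there.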
Because data center servers and applications are constantly changing, regularly testing a disaster recovery strategy is critical to guarantee a successful recovery in the event of a system or site-wide outage. To better guarantee the success of a disaster recovery strategy, Veritas Cluster Server includes Fire Drill, a tool that automates disaster recovery testing, reducing the time and expense associated with DR strategies. Administrators can make frequent changes to the IT infrastructure and simultaneously reflect those changes at a remote site. Because Fire Drill does not disrupt production applications, it can be run as often as necessary.
Veritas Cluster Server also provides a single solution for clustering both physical and virtual systems. With Veritas Cluster Server, administrators can monitor an application running within a virtual machine and recover it in the event of a failure.
Symantec Global Services help maximize the value of an investment in Veritas Cluster Server. Symantec’s highly skilled consultants leverage best practices from thousands of engagements to accelerate rollouts, minimize deployment risks, and help achieve high levels of operational efficiency. To deliver an optimal high-availability and disaster recovery solution for your enterprise applications and resources, Symantec specialists can thoroughly assess your infrastructure, then design, deploy, and integrate Veritas Cluster Server into your environment.
Conclusion
High-availability solutions are essential to maintain business continuity in today’s 24x7 environment. But these solutions must be evaluated for their ability to address the business and technical needs of the organization in which they will be deployed.
Through central management tools, automated failover, features to test disaster recovery plans without disruption, and advanced failover management based on server capacity, Veritas Cluster Server allows IT managers to maximize resources by moving beyond reactive recovery to proactive management of application availability.
Additional information about Veritas Cluster Server
Veritas Cluster Server is the industry’s leading clustering solution for reducing both planned and unplanned downtime. By monitoring the status of applications and automatically moving them to another server in the event of a fault, Cluster Server can dramatically increase the availability of an application or database.
Veritas Cluster Server can detect faults in an application and all its dependent components, including the associated database, operating system, network, and storage resources. When a failure is detected, Cluster Server gracefully shuts down the application, restarts it on an available server, connects it to the appropriate storage device, and resumes normal operations.
Cluster Server can temporarily move applications to a standby server when routine maintenance such as upgrades or patches requires that the primary server be taken offline.
Key features and benefits
Snapshot and clustering integration
Automated DR failover and DR testing through integration with Veritas Cluster Server and Veritas Storage Foundation FlashSnap
Single solution across any distance
VCS allows you to link multiple independent high availability clusters at multiple sites into a single, highly available Disaster Recovery framework
Disaster Recovery testing with Fire Drill
Effectively creates a carbon copy of live production data on a designated Disaster Recovery host and automates complete application testing against the copy
Cluster Management Console
Cluster Management Console for VCS is a single graphical interface to manage, monitor and report on all your clusters
Heterogeneous platform and storage support
Supports Solaris, HP-UX, AIX, Linux, Windows, and VMware operating system platforms, as well as multiple storage vendors ranging from expensive, enterprise-level arrays to inexpensive, SMB/E-level JBOD
Cluster Server Simulator
Enables users to completely simulate a new or existing Cluster Server clustering environment from their laptop or desktop PC prior to actual implementation of clusters
Out-of-the-box support for multiple apps & replication
VCS provides off-the-shelf engineering-level support for a wide range of enterprise-class applications, databases, and replication technologies. Custom agents can be created to make custom or unsupported applications available
Cluster Server Versions
Veritas Cluster Server is available in two versions, enabling IT organizations to use the version that is right for their availability needs.
Veritas Cluster Server
Veritas Cluster Server is for local availability. It includes the Cluster Management Console and agents for all supported applications and databases.
Veritas Cluster Server HA/DR
Veritas Cluster Server HA/DR is for local and remote availability. On top of all the features of Veritas Cluster Server, it includes Fire Drill Automated Testing, all supported replication agents, and global clustering capabilities (formerly Global Cluster Option).
Supported operating systems
Solaris 8, 9, and 10 (SPARC)
AIX 5.2 and 5.3
HP-UX 11iv2 (PA-RISC and Itanium)
Red Hat Enterprise Linux 4 (64-bit)
Novell SUSE Linux Enterprise Server 9 (64-bit)
For additional information, please contact your HP sales representative.
For more information
Veritas Cluster Server: high-availability, clustering solution
White Paper: Using Availability and Clustering Software to Maintain Business Continuity in the Era of Virtualization
