Monday, December 10, 2007

The Software Spectrum

Across nearly all aspects of enterprise technology, purchasing processes have shifted from hardware-centric decision making to identifying the best platforms for software and applications. Servers have essentially become commodity-driven compute engines, amassed in such volume and density that they simply become processing power enclosures. Networking equipment can easily be strung together, but only the effective monitoring and management of that information highway can tap the potential of its capabilities. Storage is similar: while having ample space always makes solving storage problems easier, effective use of storage resources requires sophisticated organizational monitoring and management systems.

Chapter 2, "The Storage Architectural Landscape," covered the benefits of networked storage architectures from a hardware perspective and the means to deploy effective infrastructure to accommodate torrents of corporate data. This underlying equipment, however, also must be shaped with effective organizational systems that allow data managers to minimize their ongoing administration of the infrastructure and maximize the utilization and efficiency of the overall system. Storage management software helps IT professionals accomplish this task and is the focus of Chapter 3.

With ongoing administration and operation costs far exceeding acquisition costs for storage infrastructure, IT decision makers must account for this imbalance while charting their storage networking roadmap. Careful examination of storage software components can help identify crucial costs early in the deployment phases, resulting in long-term savings. While the concepts of storage management may seem foreign at first, the underlying mechanisms boil down to simplifying organizational and administrative tasks related to storage. Keep in mind that data storage space is just another corporate asset that requires efficient mechanisms to realize maximum returns. As the overall amount of data storage increases, and the ability to assign dedicated personnel decreases, effective software tools emerge as the only viable means for corporations to gain control of their information assets.

Framework for Storage Management Software
The end-to-end equation for storage management software involves numerous components, functions, and product categories. This seemingly complex spectrum of terminology and definitions often leads to unnecessary confusion about the role and objectives of storage management software. Business and technical professionals alike will be well served by a breakdown of storage management categories into a coherent framework, as presented in the following sections. By starting with the basic functional needs and translating those into required actions (as assisted by software applications), IT professionals can create effective storage software deployment strategies.

3.1.1 The Need for Storage Management Software
Businesses face an increasing array of challenges to manage storage infrastructures. First, IT professionals must manage a dazzling array of hardware and networking components that provide the underlying platform for storage. The sheer number of devices far outstrips the capacity of a single individual or group to keep track of them. Second, storage professionals must provide capacity and associated services across a variety of applications, from databases, to email, to transaction processing. Again, keeping track of these requirements and the business fluctuations requires some degree of automated assistance. Finally, all of this must take place with 24/7/365 availability and the capability for immediate recovery in the event of any failure.

These business requirements mandate the automation of the storage management process. Using software tools, storage professionals can effectively manage and monitor their infrastructure, provide capacity and services for application requirements, and guarantee uptime through high-availability configurations with built-in disaster recovery mechanisms. These are the core components of storage management software.

3.1.2 User View of Storage Management Software Components
To accomplish their objectives, storage professionals require assistance in three primary areas for effective storage management—infrastructure management, transaction management, and recovery management—outlined in Figure 3-1. Infrastructure management provides visibility to the entire architecture and the ability to make adjustments to the underlying platforms. As storage moves from direct-attached to networked models, an administrator's visibility extends far beyond the traditional purview. Software tools for SAN management and storage resource management provide visibility and reach across the entire infrastructure.

Figure 3-1. Core components of storage management.


Transaction management represents the application focus storage administrators must maintain to effectively serve the organization. Tools for data coordination, such as volume management or NAS file services, help implement storage resources for applications. Once in place, storage policy management helps ensure that resources are appropriately dedicated, monitored, and managed in a dynamic storage environment.

The availability focus comes in the form of disaster recovery management through data protection. Whether through backup applications or sophisticated real-time replication, storage professionals can guarantee uptime through the use of these software applications.

Virtualization touches several aspects of the storage management infrastructure. The core principle of virtualization separates physical and logical storage, allowing for a variety of infrastructure, transaction, and recovery functions. The physical placement of virtualization in Figure 3-2 symbolizes that virtualization in and of itself provides limited functionality. However, that functionality facilitates many higher level processes outlined in the model. These are more specifically detailed in Section 3.8, "Virtualization."

Figure 3-2. Software elements of storage management.


3.1.3 Piecing Together Storage Management Software
Figure 3-3 shows where certain storage software functions reside from the architectural view. The following sections cover each area in more detail.

Figure 3-3. Storage management software components in the enterprise.

Storage Network Management
Because networked storage infrastructures are required for optimized storage deployments, the corresponding storage network management, or SAN management, takes on critical importance. Today, customers have a choice of SAN management software that can come directly from SAN switch vendors, array vendors, server vendors, or third-party software companies that integrate directly with SAN hardware.

Storage networking implementers will benefit from careful attention to this decision. While SAN management software from equipment vendors typically offers greater functionality and features, a third-party SAN management solution can provide configuration tools and diagnostics across multivendor solutions.

Most SAN management applications operate with an independent set of interfaces that drive control through that application. Efforts are underway in industry organizations like the Storage Networking Industry Association (SNIA) to standardize on common methods for storage network management to facilitate greater interoperability between applications. Ultimately, this will provide user flexibility to use multiple applications for SAN management and the ability to more easily shift between applications.

These standardization efforts fall underneath the overall framework of Web-based Enterprise Management, or WBEM. Within this framework, the Common Information Model (CIM) establishes a standardized means to organize storage network–related information, such as product information characteristics of a SAN switch or disk array. Within these overarching framework models are a set of specific means to exchange information between devices and applications, such as Extensible Markup Language (XML), Hypertext Transfer Protocol (HTTP), and Simple Network Management Protocol (SNMP).

3.2.1 Discovery
The first step towards effective SAN management begins with device discovery. The basic process involves the identification of storage network devices within a storage fabric. For end devices such as HBAs or disk arrays and tape libraries, the initial connectivity and boot process establishes the login to the SAN fabric, typically through a SAN switch. The SAN switches become the initial repository of device information and can then share this data with SAN switch management applications or third-party applications.

Typically, all devices within a SAN have an Ethernet/IP interface dedicated to management. Each device has a specific IP address and communicates with other devices or with centralized management agents via SNMP, using Management Information Bases (MIBs). MIBs are basic frameworks that allow applications and devices to share device-specific information. There are standard MIBs, such as MIB-II, with information that pertains to all devices, such as interface status for a specific port. Hardware vendors also typically provide vendor-specific MIBs that cover unique product features.

Using a combination of information from SAN switches, which have device information through the login process, and direct access to devices through the Ethernet and IP interfaces using SNMP, management applications have a wide array of information to provide to administrators on the general status of devices within the storage network. From this initial discovery process, more sophisticated management, such as manipulation of the storage network, can occur.
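The combination of the two discovery sources can be sketched in a few lines of Python. This is only an illustration: the record fields, the `merge_discovery` function, and the idea of keying both sources by WWN are assumptions made for the example, not the behavior of any particular SAN management product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceRecord:
    """One inventory entry (hypothetical fields chosen for illustration)."""
    wwn: str
    switch_port: Optional[str] = None   # learned from the fabric login
    mgmt_ip: Optional[str] = None       # learned from the management interface
    status: str = "unknown"             # learned from an SNMP poll


def merge_discovery(fabric_logins, snmp_results):
    """Combine switch login records with SNMP poll results, keyed by WWN."""
    inventory = {}
    # Fabric logins tell us which devices exist and where they attach.
    for login in fabric_logins:
        inventory[login["wwn"]] = DeviceRecord(
            wwn=login["wwn"], switch_port=login["port"]
        )
    # SNMP polls add management address and status; a device that answered
    # SNMP but never logged in to the fabric still gets an entry.
    for poll in snmp_results:
        record = inventory.setdefault(poll["wwn"], DeviceRecord(wwn=poll["wwn"]))
        record.mgmt_ip = poll["ip"]
        record.status = poll["status"]
    return inventory
```

A device that appears in only one source stands out immediately, which is often the first hint of a cabling or configuration problem.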

3.2.2 Zoning and Configuration
Storage networks create connections between multiple servers and storage devices using a variety of interconnect mechanisms from a single switch or hub to a complex mesh of switches that provide redundancy for high availability. The universal connectivity of storage devices to a common playing field provides tremendous flexibility to architect storage solutions. However, having that connectivity doesn't necessarily mean that one would want every storage device to be able to see every other storage device.

Managed communication between storage devices helps administrators balance the agility of universal accessibility with the business needs of resource allocation, segmentation, security, and controlled access. This managed communication begins with a process of zoning and configuration.

Let's use a simple example of a Windows server and a UNIX server on the same storage network, with two separate disk units (JBOD-A and JBOD-B—each with two individual disks) and one tape library. A typical zoning and configuration application is shown in Figure 3-4. On the right side of the diagram is a list of devices, including the individual disks, tape library, and servers (identified by their HBA interfaces). On the left side of the diagram is a list of zones. Placing devices in a particular zone ensures that only other devices within that zone can "see" each other. SAN switches enforce the zoning through different mechanisms based on the worldwide name (WWN) of the storage device or on the port of the switch to which it is attached. For more detail on zoning, see Section 3.2.3, "SAN Management Guidance: Hard and Soft Zoning."

Figure 3-4. Typical storage networking zone configuration.


In this example, the administrator has separated the disk units for the Windows and UNIX hosts. This avoids any conflicts of one operating system trying to initialize all of the visible storage. However, the tape library has been placed in both zones. To avoid potential conflicts, this configuration would need to be accompanied by a backup software application that can manage the access to the tape library, ensuring access by only one host at a time. Sharing the tape library allows its cost to be distributed across a greater number of servers, thereby servicing more direct backups.
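The zoning example can be modeled as plain sets of device names. The sketch below is a simplified illustration; the zone and device names are invented to mirror the example, and real switches enforce zoning by WWN or port rather than by friendly names.

```python
# Hypothetical names mirroring the Windows/UNIX example in the text.
zones = {
    "windows_zone": {"windows_hba", "jbod_a_disk1", "jbod_a_disk2", "tape_library"},
    "unix_zone": {"unix_hba", "jbod_b_disk1", "jbod_b_disk2", "tape_library"},
}


def can_see(zones, a, b):
    """Two devices can see each other only if they share at least one zone."""
    return any(a in members and b in members for members in zones.values())


def shared_devices(zones):
    """Return devices that belong to more than one zone."""
    seen, shared = set(), set()
    for members in zones.values():
        shared |= seen & members   # anything already seen is in 2+ zones
        seen |= members
    return shared
```

Here `shared_devices(zones)` returns only the tape library, which is exactly the device that needs a backup application to arbitrate access between the two hosts.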

3.2.3 SAN Management Guidance: Hard and Soft Zoning
Zoning applications typically operate in two types of device modes: hard zoning and soft zoning. Hard zoning, also referred to as port-based zoning, means that all of the devices connected to a single port remain mapped to that port. For example, three drives on a Fibre Channel arbitrated loop may be allocated via a specific port to Zone A. If another drive is assigned to that loop, and it attaches through the same specified port, it would automatically appear in Zone A.

Another zoning mechanism, soft zoning, or WWN zoning, uses the unique Fibre Channel address of the device. In IP or iSCSI terms, the device has a Worldwide Unique Identifier (WWUI). With soft zoning, devices may be moved and interconnected through different SAN switches but will remain in the same zone.

Hard zoning offers more physically oriented security, while soft zoning offers more flexibility through software-enforced zoning. For very large configurations, customers will likely benefit from the intelligence of soft zoning coupled with the appropriate security and authentication mechanisms.
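The practical difference between the two modes is the lookup key: hard zoning resolves a zone from the switch port, soft zoning from the device's WWN. The following sketch assumes two simple lookup tables and a hypothetical `zone_of` function; it is not a switch vendor's API.

```python
def zone_of(device_wwn, attached_port, port_zones, wwn_zones, mode):
    """Resolve a device's zone under hard (port) or soft (WWN) zoning."""
    if mode == "hard":
        # Membership follows the physical port, whatever is plugged into it.
        return port_zones.get(attached_port)
    if mode == "soft":
        # Membership follows the device, wherever it is plugged in.
        return wwn_zones.get(device_wwn)
    raise ValueError(f"unknown zoning mode: {mode!r}")


# Illustrative tables (port names and WWNs are made up).
port_zones = {"sw1/port4": "Zone A"}
wwn_zones = {"21:00:00:e0:8b:05:05:04": "Zone A"}
```

Under hard zoning, a new drive attached through `sw1/port4` lands in Zone A automatically. Under soft zoning, the drive with the listed WWN stays in Zone A even after it moves to a different switch, while an unlisted drive on `sw1/port4` belongs to no zone.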

3.2.4 SAN Topologies
In addition to zoning devices in a SAN, some applications offer the ability to visualize the SAN in a topology view. A sample SAN topology is shown in Figure 3-5.

Figure 3-5. Topology view of a storage area network.


SAN topology views can serve as effective management utilities, allowing administrators to quickly see the entire storage network and to drill down to device levels. In some cases, it may be easier for administrators to assign and allocate storage capacity using topology managers.

SAN topology views also provide more visibility to the network connectivity of SANs. While a zoning and configuration tool helps clarify the communication relationship between storage devices, topology managers help clarify the communication means between storage devices. Specifically, topology managers can show redundant connections between switches, redundant switches, and the available paths that link one storage device to another.
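What a topology manager computes when it shows available paths can be approximated by enumerating the simple paths in a graph of links. The sketch below assumes a small, made-up fabric with two switches joined by an inter-switch link; the function name and data layout are illustrative only.

```python
def simple_paths(links, src, dst, path=None):
    """Enumerate every loop-free path from src to dst (depth-first search)."""
    path = (path or []) + [src]
    if src == dst:
        return [path]
    found = []
    for neighbor in sorted(links.get(src, ())):
        if neighbor not in path:          # never revisit a node
            found.extend(simple_paths(links, neighbor, dst, path))
    return found


# Hypothetical fabric: one server and one array, each dual-attached to
# two switches, with an inter-switch link between the switches.
links = {
    "server": {"sw1", "sw2"},
    "sw1": {"server", "sw2", "array"},
    "sw2": {"server", "sw1", "array"},
    "array": {"sw1", "sw2"},
}

paths = simple_paths(links, "server", "array")
```

With this fabric, four distinct paths link the server to the array, so the loss of either switch still leaves a working route. Exhaustive enumeration is fine for a sketch but grows quickly on large meshes.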

3.2.5 Monitoring
Monitoring allows storage administrators to keep the pulse of the storage network. This includes the general health of the SAN hardware, data activity levels, and configuration changes.

Event and error tracking is used to keep logs of the activity within a storage network. SAN management applications track events such as device additions, zone changes, and network connectivity changes. These logs keep detailed records of SAN activity and can serve as useful tools if problems occur. With hundreds or thousands of SAN configuration operations taking place on any given day, the ability to go back in time to analyze events is invaluable. Similarly, error logs help diagnose the root cause of potential failures within a SAN. Minor errors, such as an unsuccessful first login to a SAN switch, may not mean much as stand-alone events, but a pattern of such errors can help administrators rapidly analyze and repair potential problems in the SAN.
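Spotting a pattern of minor errors amounts to counting repeats per device and error type. The sketch below assumes log entries are dictionaries with `device`, `type`, and `severity` keys, and uses an arbitrary threshold of three; both are choices made for the example.

```python
from collections import Counter


def repeated_errors(events, threshold=3):
    """Return (device, error type) pairs seen at least `threshold` times."""
    counts = Counter(
        (event["device"], event["type"])
        for event in events
        if event["severity"] == "error"
    )
    return {key: n for key, n in counts.items() if n >= threshold}
```

A single failed login would not appear in the result, but three failures from the same HBA would, pointing the administrator at a specific device rather than at the log as a whole.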

Storage administrators can use alarms and traps to help effectively monitor the storage network. A trap is a set threshold for a certain variable that triggers notification when reached. For example, a trap can be set for a SAN switch to send an alarm when the temperature of the box reaches a "red zone." Such traps are helpful because a problem may not be directly related to failures within the equipment. For example, a broken fan would automatically send an alarm, but if someone placed a large box next to a data center rack, prohibiting airflow, a temperature gauge would be the only mechanism to ensure preemptive awareness of overheating.
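A threshold trap reduces to comparing a reading against a configured limit. The following is a minimal sketch; the variable names, the 45-degree limit, and the message format are assumptions for illustration and do not correspond to any specific MIB object.

```python
def check_traps(readings, traps):
    """Return an alarm message for every reading at or above its threshold."""
    alarms = []
    for variable, limit in traps.items():
        value = readings.get(variable)
        # Variables with no current reading are skipped rather than alarmed.
        if value is not None and value >= limit:
            alarms.append(f"{variable} reached {value} (threshold {limit})")
    return alarms


# Hypothetical "red zone" for a switch chassis temperature, in Celsius.
traps = {"temperature_c": 45}
```

The blocked-airflow case from the text is covered here: the fan keeps spinning, so no hardware fault is reported, yet the rising temperature reading still crosses the threshold and raises an alarm.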

SAN management traps typically integrate with large enterprise management systems via SNMP. By tying into these larger software systems, SAN alarms can be directed to the appropriate support staff through existing email, paging, or telephone tracking systems.

Perhaps the most important monitoring component for progressive deployment of storage networking infrastructures is performance. Performance monitoring can be tricky. Ultimately, organizations measure performance of applications, not storage throughput. However, the underlying performance of the SAN helps enable optimized application performance.

Since storage networks carry the storage traffic without much interpretation of the data, SAN performance metrics focus on link utilization, or more specifically, how much available bandwidth is being used for any given connection. An overengineered SAN with low utilization means excess infrastructure and high costs. An overworked SAN with high utilization means more potential for congestion and service outages for storage devices.
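Link utilization is the traffic carried over an interval divided by what the link could have carried. The sketch below derives it from two byte-counter samples; the 20 percent and 80 percent cutoffs in `classify` are illustrative assumptions, not recommended operating limits.

```python
def link_utilization(bytes_start, bytes_end, interval_s, link_gbps):
    """Fraction of link bandwidth used between two byte-counter samples."""
    bits_sent = (bytes_end - bytes_start) * 8
    capacity_bits = interval_s * link_gbps * 1e9
    return bits_sent / capacity_bits


def classify(utilization, low=0.20, high=0.80):
    """Label a link using hypothetical low/high utilization cutoffs."""
    if utilization < low:
        return "underutilized"
    if utilization > high:
        return "congestion risk"
    return "normal"
```

For example, 7.5 GB moved in 60 seconds over a 2 Gbps link works out to 50 percent utilization. Real counters wrap around, so a production tool would also need to handle a sample where the end value is smaller than the start value.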

3.2.6 SAN GUIDANCE: Protocol Conversion
As outlined in Chapter 2, three transport protocols exist for IP Storage: iSCSI, iFCP, and FCIP. For iSCSI to Fibre Channel conversion, a key part of storage network management, IT professionals can use the framework in Figure 3-6 to evaluate product capabilities. Not all IP storage switches and gateways provide full conversion capabilities. For example, some products may support iSCSI servers to Fibre Channel storage, but not vice versa.

Additionally, storage-specific features for applications like mirroring or advanced zoning capabilities between IP and FC SANs may affect protocol conversion capabilities. This type of connectivity, items 2 and 3 in Figure 3-6, should be specifically examined in multiprotocol environments.

Figure 3-6. Type of IP and FCP conversion.


For Fibre Channel to Fibre Channel interconnect across an IP network, such as item 4 illustrates, the iFCP and FCIP protocols are more suitable because of their ability to retain more of the FCP layer.

3.2.7 Hard Management
Hard management takes its name from how rarely it is carried out rather than from any implementation challenge: it is the physical documentation of SANs. No matter how much software is deployed or how detailed the visibility of the applications, nothing protects a business better than clear, easy-to-follow policies and procedures. Obviously, this is more easily said than done, and often the primary challenge in disaster recovery scenarios lies in understanding the design guidelines used by SAN architects. If these architects leave the company or are unavailable, the inherent recovery challenges increase dramatically.

Documenting SAN deployments, including a set of how-to instructions for amateur users, serves as both a self-check on implementation and an invaluable resource for those who may need to troubleshoot an installation sight unseen. At a minimum, items for this documentation would include applications used, vendor support contacts, passwords, authorized company personnel and contact information, backup and recovery procedures, and storage allocation mechanisms.

In conjunction with documenting storage administration policies and procedures, proper cabling and labeling of storage configurations can dramatically save time and effort in personnel costs—one of the largest components of the operational IT budget.

Storage Resource Management
Storage resource management (SRM) is the second component of infrastructure-focused SAN management. While storage network management (SNM) focuses primarily on the connectivity and combinations of storage devices on a network, SRM focuses on the individual devices and the ability to see and modify those devices from a central location. SRM may also be referred to as device management, an appropriate term, given the focus, and an easy way to distinguish from SNM. The primary benefit of SRM is the ability to make more effective use of the resources within an organization, specifically storage capacity and resource efficiency, as outlined at the end of this section.

For clarification purposes, we divide SRM categories into devices and subsystems. Devices are those elements within the overall storage infrastructure that provide access to storage, while subsystems house the actual disk or tape media.

3.3.1 Device Management
In the device category, SRM looks remarkably similar to SNM. Devices under an SRM application may include HBAs, switches, and routers. The SRM application provides a unified view of these devices and the ability to make some changes. In many cases, the SRM application will simply point to the device-specific management interface embedded within the product.

Device management features of SRM include discovery, topology mapping, and configuration planning. These tools provide SRM applications the information to help find potential problems and recommend solutions. Zoning is typically not included in SRM packages.

The primary difference between device management across SNM and SRM applications is the integration with additional multivendor infrastructure. For SRM applications, the visibility typically extends deeper into the subsystems category.

3.3.2 Subsystem Management
SRM features shine when used in large installations of disparate storage subsystems that require centralized administration. In these environments, an SRM application takes a logical view of the storage configuration and provides administrators a clearinghouse to receive information and implement changes.

SRM applications discover storage within the overall infrastructure and provide logical-to-physical maps of storage utilization, such as which servers are using which storage. Further, SRM applications identify operational file systems and file-level detail of such storage. For block-based storage, SRM locates volume groups and provides volume mapping visibility within and across subsystems.
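A logical-to-physical map can be thought of as two lookups joined together: which hosts mount each volume, and which array holds each volume. The sketch below inverts them to answer "which servers are using which storage"; the function and the data shapes are assumptions for illustration.

```python
from collections import defaultdict


def servers_by_array(volume_hosts, volume_array):
    """Map each physical array to the set of servers using its volumes."""
    usage = defaultdict(set)
    for volume, hosts in volume_hosts.items():
        usage[volume_array[volume]].update(hosts)
    return dict(usage)
```

Given volumes `vol1` and `vol2` on one array and `vol3` on another, the result groups every server under the array it actually depends on, regardless of how many volumes sit in between.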

This level of detail can be used by storage administrators to understand asset allocation and identify usage patterns across applications, departments, and individuals. From this viewpoint, administrators now have the required knowledge to drive decisions about optimized storage use within the organization. This leads to the establishment of policies such as quota management, covered in Section 3.5, "Storage Policy Management."

Even with storage resource management focused on the subsystems side, LUN and RAID creation and modification still remain within the subsystem. These features may be accessed through centralized interfaces; however, this basic level of disk aggregation can be effectively accomplished only by the subsystem vendor.

3.3.3 Resource Efficiency
SRM information allows administrators to see the efficiency of their storage resources, such as the amount of storage used compared to the total amount available in the array. This top-level view helps quickly pinpoint overutilized or underutilized storage. Additionally, SRM views of the data itself can provide efficiency clues. For example, closer examination of file data may reveal rampant file revision propagation or an absurdly large collection of .mp3 or video files. Once this information is collected and examined, administrators can respond appropriately.
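Both views described above, used versus available capacity and the makeup of the file data, reduce to simple aggregation. The sketch below is illustrative only: the 30 percent and 85 percent cutoffs and the input shapes are assumptions, not values from any SRM product.

```python
import os
from collections import Counter


def classify_arrays(arrays, low=0.30, high=0.85):
    """Label each array by its used/total ratio (cutoffs are hypothetical)."""
    report = {}
    for name, (used_tb, total_tb) in arrays.items():
        ratio = used_tb / total_tb
        if ratio >= high:
            label = "overutilized"
        elif ratio <= low:
            label = "underutilized"
        else:
            label = "ok"
        report[name] = (round(ratio, 2), label)
    return report


def bytes_by_extension(files):
    """Total size per file extension, largest first.

    `files` is an iterable of (path, size_in_bytes) pairs.
    """
    totals = Counter()
    for path, size in files:
        extension = os.path.splitext(path)[1].lower() or "(none)"
        totals[extension] += size
    return totals.most_common()
```

A report topped by `.mp3` would be the kind of efficiency clue the text mentions.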

Identifying which servers and applications address which storage helps match the logical and physical pictures. If the accounting department is using 75 percent of the capacity of an array in another building, it may make sense to have that array located within the accounting department yet still available through the storage network to the rest of the organization.

3.3.4 Capacity Planning
Using historical information from SRM applications, specifically trend analysis, administrators can plan for storage capacity more effectively. With visibility into the actual data, including the types of files and applications related to growing volumes, SRM applications can plot and chart areas of future storage growth. With such information, administrators can base their decisions on more than a simple look at the total number of terabytes within the organization.
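One simple form of trend analysis is a straight-line fit through monthly usage samples, extrapolated to the installed capacity. The sketch below uses ordinary least squares and assumes evenly spaced monthly samples and linear growth; both are simplifying assumptions, and real growth is often uneven.

```python
def linear_fit(samples):
    """Least-squares slope and intercept for evenly spaced samples.

    Requires at least two samples.
    """
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def months_until_full(used_tb_by_month, capacity_tb):
    """Months until the fitted trend reaches capacity, or None if not growing."""
    slope, intercept = linear_fit(used_tb_by_month)
    if slope <= 0:
        return None
    fitted_now = intercept + slope * (len(used_tb_by_month) - 1)
    return max(0.0, (capacity_tb - fitted_now) / slope)
```

With usage of 10, 12, 14, and 16 TB over four months and 26 TB installed, the fitted growth of 2 TB per month leaves about five months of headroom.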

In addition to helping predict future capacity needs, SRM can help administrators make use of existing capacity, thereby minimizing new storage purchases. For example, the duration for saved email archives can be weighed against the cost of new capacity.

3.3.5 Integration with Other Storage Management Applications
For SRM to be truly effective, the monitoring functions must be incorporated with other functions of storage management, such as data coordination, protection, and policy management. Administrators can begin with basic questions such as the type of storage data, primary users, usage habits, access times and durations, and storage location. From there, policies and actions can be put in place to maximize efficiency, cap reckless data accumulation, and automatically provision and protect storage as needed.
