Who’s minding the storage solutions?

Most business people are, or should be, aware of Parkinson’s Law, which states: ‘Work expands to fill the time available for its completion.’ There is a similar law which is demonstrated daily in IT: ‘Data expands to fill the available storage.’

Everyone knows the problem. Data is simply multiplying. The major research firms agree that business data is growing by an average of about 50 per cent a year across enterprises internationally. That growth is exponential, so the volume doubles roughly every 20 months.
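The arithmetic behind compound data growth is easy to check. The short Python sketch below is illustrative only: the function names and the 10TB starting volume are assumptions made for the example, not figures from the research firms.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for a volume to double at a constant compound annual rate."""
    return math.log(2) / math.log(1 + annual_growth)


def projected_volume(start_tb: float, annual_growth: float, years: float) -> float:
    """Volume after a number of years of compound growth."""
    return start_tb * (1 + annual_growth) ** years


if __name__ == "__main__":
    rate = 0.50  # 50 per cent a year, the consensus figure quoted above
    months = doubling_time_years(rate) * 12
    print(f"Doubling time at {rate:.0%} a year: about {months:.1f} months")
    # Hypothetical starting point of 10TB, projected over five years.
    for year in range(1, 6):
        print(f"Year {year}: {projected_volume(10, rate, year):.1f}TB")
```

At 50 per cent a year the doubling time comes out at a little over 20 months, and a store that starts at 10TB passes 75TB within five years.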

As it happens, the data storage capacity of those spinning magnetic disks we depend on so much is also growing at much the same rate as the engineers succeed in cramming more bits and bytes on the platters and rotating those disks even faster.

Even faster growth has been shown in the various kinds of management software we use to control the data storage. Information Life Cycle Management (ILM) has been around for more than a decade. An almost self-explanatory IT term (for once), ILM is based on the sensible observation that, the older the data is, the less likely it is to be looked at again.

Yesterday’s files, on the other hand, are still very much live. Very smart ILM software automatically redeploys data all the way from top priority (such as today’s transactions) to archive material that is retained only because the law so requires.

ILM recognises that the value of data generally reduces as it ages, so most of it does not need top performance storage systems.

The next logical step is tiered storage with an ILM storage strategy that migrates the less vital data from the top-performance storage down a tiered set of systems until eventually it is archived or deleted.
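A tiering policy of this kind can be pictured as a simple rule table keyed on the age of the data. The Python sketch below is a minimal illustration, not how any particular ILM product works: the tier names, the age thresholds and the seven-year retention period are all hypothetical, and real policies would also weigh factors such as access frequency and business rules.

```python
from datetime import date

# Hypothetical policy: (maximum age in days, destination), checked in order.
TIER_POLICY = [
    (7, "tier1"),          # recent, live data stays on top-performance storage
    (90, "tier2"),         # cooling data moves to cheaper disk
    (365 * 7, "archive"),  # kept only for an assumed seven-year retention rule
]


def assign_tier(last_accessed: date, today: date) -> str:
    """Return the destination for data last touched on the given date."""
    age_days = (today - last_accessed).days
    for max_age_days, tier in TIER_POLICY:
        if age_days <= max_age_days:
            return tier
    return "delete"  # older than every retention threshold


if __name__ == "__main__":
    today = date(2009, 9, 28)
    for touched in (date(2009, 9, 27), date(2009, 8, 1), date(2005, 1, 1), date(1999, 1, 1)):
        print(touched, "->", assign_tier(touched, today))
```

Run nightly over a file catalogue, a rule table like this would produce the list of data to be migrated down the tiers.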

The main result is reduced investment in top-end systems or, from another angle, investment in smaller capacity but top specification storage to optimise the performance of really critical applications.

The other key concepts in data storage are the storage area network (San) and network attached storage (Nas) – both of which separate the data storage from the traditional servers onto dedicated devices – and virtualisation.

Virtualisation has been a buzzword for most of this decade, and is a strategy or architecture that is impacting on all areas of enterprise IT. In essence, virtualisation means that a set of computing elements is pooled as a single resource. It is, in fact, particularly effective and proven in data storage, where the separate disks effectively become invisible below a control layer and are managed as a single resource.

The data management software implements the organisation’s ILM and other policies, and will automatically move data to lower-performance storage and ultimately to archive. Other data management elements include tamper-proofing security and an audit trail for all compliance and forensic purposes. But only the storage management system needs to track what’s on location X, device one or disk three.
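The idea of a control layer that hides the individual disks can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `StoragePool` class, the disk names and the block counts are invented for illustration, and real virtualisation layers add striping, redundancy and much more.

```python
class StoragePool:
    """Presents several physical disks as one logical block address space."""

    def __init__(self, disk_sizes: dict):
        # disk_sizes maps a disk name to its capacity in blocks (hypothetical units).
        self._extents = []
        start = 0
        for name, size in disk_sizes.items():
            self._extents.append((start, start + size, name))
            start += size
        self.capacity = start  # applications see only this single figure

    def locate(self, logical_block: int):
        """Translate a logical block number to (disk name, block on that disk)."""
        for start, end, name in self._extents:
            if start <= logical_block < end:
                return name, logical_block - start
        raise IndexError("logical block outside the pool")


if __name__ == "__main__":
    pool = StoragePool({"disk1": 100, "disk2": 200, "disk3": 50})
    print("Pool capacity:", pool.capacity, "blocks")
    print("Logical block 150 lives on:", pool.locate(150))
```

Applications address the pool by logical block number alone; only the `locate` method, standing in for the storage management system, knows which disk actually holds the data.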

Now we have a new and long-awaited technology in enterprise data storage. Solid state memory – similar to Ram or those ubiquitous USB memory sticks or tiny memory cards for devices like cameras – has arrived in mainstream storage. Solid State Disks (SSD) are not disks at all, but the word helps to differentiate their performance, specifications and roles in primary storage from the popular stuff.

With capacities from about 128 gigabytes up to 512 gigabytes, these ultra high-speed units are relatively expensive. Proven in portable devices like laptops, SSDs have rapidly found their place at the front end of primary storage arrays, and a new denomination as Tier 0 to differentiate them from the former top performance stuff in Tier 1. It is actually a cache or buffer, just like Ram in the PC.

Tom Keane, technology consultant with storage specialist CMS Peripherals, said this was because our Moore’s Law-driven, multicore processors were driving systems so fast that data storage had to catch up. To get the best performance from the overall set of computing resources, the input/output [I/O] rates of storage had to match the requirements of the processing.

“Our constant demands for better and better performance at the front end of 24×7 live systems, processing those escalating quantities of transactions and data, have reached a stage where the whole architecture of data storage is changing again,” Keane said.

For a start, he said that data storage used to be just that but, in recent years, more and more intelligence and specialist functionality had had to be built into the storage systems. “The storage system now manages complex tasks such as virtualisation, replication for failover and disaster recovery, deduplication, snapshots for roll-back of data states, ILM and thin provisioning, so that additional hard capacity is only added as required,” he said.

What we have been seeing, Keane said, was the disaggregation of data storage from the primary computing resource. “That began with Nas and Sans, of course, and now we are seeing newer architectures to cope with the limitations of the previous generation – notably that, while it is easy to scale up capacity, it is not at all simple to raise the level of I/O performance.”

The latest concept is clustered or grid San storage, where each shelf or array of disks has its own intelligence built in. “Add more capacity and you also add the performance to keep the same overall level by the system,” Keane said.

World-leading data storage specialist EMC clearly has a global view, and it is working to a figure of 60 per cent for annual data growth. “Even in pure business data in sectors affected by the downturn – like financial services, telcos and small businesses – we are seeing a minimum of 40 per cent growth,” said Gerry Boyce, chief technical officer of EMC’s Irish market operation.

Some of the obvious growth drivers, like video and social networking, obscure the fact that about 85 per cent of the world’s stored data is still managed by organisations rather than individuals. “This will continue to be the pattern as online back-up for business and consumers becomes more common,” Boyce said.

The real challenge to organisations and business continued to be the management of stored data, rather than simple capacity issues, Boyce said. “Today I will organise the tiering of my data in a more subtle and granular way, and constantly analyse the characteristics of my data to ensure best fit with the storage system performance required.”

Interestingly, he suggested that the advent of SSDs in Tier 0 meant that less Tier 1 gear may be needed in many organisations.

“Fibre Channel has been the top performing type of data storage for some years but, in the typical workload, a high proportion is not being utilised,” he said.

“If you have, say, about 5 per cent of your data capacity that is being stretched, the most effective answer may be to put in SSDs to deal with it. We then often see that the new generation of Sata drives and iSCSI protocols is more than adequate for the other 95 per cent of the corporate workload.”

The relatively high cost of SSDs is then offset by the much cheaper Sata disks, particularly in the new clustered Sans that ensure high I/O levels. “Smart virtualisation like VMware moves and allocates data automatically to ensure that workloads are dealt with as efficiently as possible, and according to the business rules and priorities in place,” Boyce said.

All in all, what we are seeing in the market is just the natural progression of technology, according to Karl Jordan, HP Ireland’s enterprise storage manager.

“Faster processors and continued rapid data growth mean we need to push more inputs and outputs through the systems. So plugging SSDs into the front end of broadly standard data storage systems will help to do that. But it is a niche technology really, that probably nine out of ten Irish organisations certainly do not need right now.”

On the other hand, he and other data storage experts have said that solid state memory costs are coming down all the time.

The obvious value of SSD in portable devices, for example, is driving that end of the market strongly. It follows that, when the economics are favourable, the performance appeal may give SSDs a greater share of tiered primary storage.

In the meantime, clustered storage is the new default option for primary data storage with serious business workloads. “Up to now you had dual Raid controllers in a San and added disk for capacity as required. But that means you can add capacity but not I/O performance. In fact, it may degrade.

“Now each element in a storage cluster has its own intelligent controller. With HP LeftHand technology, the I/O bottlenecks are taken out and the organisation can grow its data capacity linearly without performance penalty,” said Jordan.
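The contrast between the two architectures can be shown with a deliberately crude model. In the Python sketch below every figure is an invented placeholder rather than an HP specification, and the functions are hypothetical: the point is only that a fixed pair of controllers caps total I/O however many shelves are added, while a cluster adds a controller with every node.

```python
def dual_controller_iops(shelves: int, iops_per_shelf: int, controller_limit: int) -> int:
    """Traditional San: the shared controllers cap the total I/O rate."""
    return min(shelves * iops_per_shelf, controller_limit)


def clustered_iops(nodes: int, iops_per_node: int) -> int:
    """Clustered San: each node brings its own controller, so I/O scales with nodes."""
    return nodes * iops_per_node


if __name__ == "__main__":
    # Placeholder numbers chosen only to make the shape of the curves visible.
    per_unit, ceiling = 2000, 10000
    for units in (2, 4, 8, 16):
        print(
            f"{units:2d} units: dual-controller "
            f"{dual_controller_iops(units, per_unit, ceiling):6d} IOPS, "
            f"clustered {clustered_iops(units, per_unit):6d} IOPS"
        )
```

In the capped case the I/O available per terabyte falls as capacity grows, which is one way of reading the remark that performance may degrade.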

He was keen to point out that these new storage technologies (other than SSDs) were not expensive.

Entry level for about five terabytes of capacity in a San based on Sata and iSCSI would be less than €10,000, he said, and every step up continued to be affordable – while there were no particular upper limits to scalability.

The theme of matching technology performance to real business needs at an economical capex was taken up by Justin Connolly, IBM storage brand manager.

“We talk about business needing to be agile and dynamic to survive and compete, but that needs to be matched by its IT systems,” he said. “In data storage, virtualisation, ILM and hardware tiering are all aimed at cutting costs while retaining performance levels.

“Every single business today plans to be successful and respond as the economy improves. With the continuing objectives of costs down and performance up, investment in the next generation of systems will be part of that.”

According to Connolly, IBM’s solution was the new XIV series for enterprise data storage. “We are promising Tier 1 performance and functionality from Tier 3 hardware and cost levels.”

In the market, that means an entry level of €170,000 to €200,000 for 20TB to 27TB of high performance storage.

Connolly spoke of adding more intelligence in the storage stack. “The storage cluster is automated in its own right and offering smarter control of ILM and replication, for example, more closely and dynamically matching specific applications. All of this contributes to lowering the costs of managing storage, which cannot be entirely automated. Once that really was a black box magic art in the IT world. Now it can be as simple and intuitive as using an iPhone. Define the business needs and policies and all of the smart stuff will follow through in the background,” he said.

Acquired just over a year ago by storage giant EMC, Iomega is a brand that most people will associate with its removable Zip drives (now discontinued) and higher capacity Rev drives, used mainly for offsite back-up. But Iomega today is firmly in the Nas market, nicely under its big brother EMC’s range of solutions, with desktop products offering smart performance and capacity up to six terabytes.

“These StorCenter products are particularly suitable for SMEs,” said Ashley Winfield, Iomega’s manager for the Irish market. “Each unit brings EMC’s LifeLine management software, VMware virtualisation and a fast internal processor and Ram to ensure the performance of the Sata drives. Perhaps the real kicker is four-click installation, as near to plug-and-play as smart data storage can get.”

The smaller 2TB and 4TB units are desktop models, while the top-end StorCenter Nas is rack mounted. Entry level is less than €400 and these neat desktop units are primary storage for lower-end requirements (SMEs, advanced home networks, Cad professionals).

Although not aimed at competing with standard external drives, they are quite portable and would certainly suit micro business and project teams with frequent changes of location. In applications such as shared security video recording, they can be used without a PC.

The fact that even these desktop units contain cache Ram illustrates the need for all types of data storage to be ‘specced’ to keep up with the multiprocessor computing environments in today’s organisations.

“That new Tier 0 is already critical to high performance systems, say, with great volumes of online transaction processing, but it is still a minority need,” said Greg Moore, solutions architect in Dell’s infrastructure consulting team.

“So much so that, for those organisations which need it, the cost is simply not an issue, certainly compared to the return in I/O performance.”

In fact, Moore said that, in TCO [total cost of ownership] terms, the flash storage in SSDs as part of a balanced system probably worked out at much the same as the traditional Fibre Channel solutions. There are lower operating expense (opex) costs in various areas, including much lower energy requirements, tiny space footprint and very low failure rates.

This general cost/performance balance also applies to Dell’s offerings in the clustered San market space.

“Each tray of disks has its own EqualLogic virtualised architecture with its own controller. Each set of 16 Sata drives is like a mini-San, running as fast as possible. They can work together in clusters with no drop in performance.”

The picture that emerges of data storage today is ever so slightly paradoxical. It moved out of the servers and onto the network to reduce the burden on the prime processing.

Now it is being beefed up to keep pace with ever-faster multi-core processing, and that means giving the storage its own processors and Ram so that its performance matches that of the processing servers.

It certainly goes to show that today’s computer is more like an organism than a specific device or box. Every part of a system contributes to making up an effective central computing resource to serve the needs of the applications and their users. Miniaturise it all into one chassis and we can start all over again, connecting them up for higher levels of performance.

Online back-up beckons

In many respects, back-up should be separated from primary data storage and even the subsidiary tiers like archive material. Of course, it involves data and, in fact, is specifically concerned with the mission-critical data.

But as the primary virtualised systems contain several copies of that key data, and can fail over to another system in milliseconds or even microseconds, the purpose of back-up is solely to enable business recovery in the event of a disaster.

So in that context it belongs with the business disciplines of business continuity and disaster recovery, which should design the appropriate technical solution for the SLAs.

That is part of the reasoning behind the growing trend towards online back-up. It copies the key data over to a third-party service provider, which usually offers data centre security and back-up of its own to a level beyond the in-house aspirations of all but the largest organisations.

The other huge attraction is that online back-up can be fully automated and is already offsite, unlike the back-up tape solution, which has a high failure rate because of the human and management factors.

“We’ve been selling the concept since 2003,” said Eoin Blacklock, managing director of KeepItSafe, Ireland’s largest provider of managed back-up services online. “It has followed the growth of broadband, although the general understanding of the bandwidth required for online back-up is exaggerated.

“The market today has become very feature specific and competitive, I’m glad to say, because it shows clearly how mainstream online back-up has now become.”

KeepItSafe has more than 1,200 business clients, Blacklock said, with about 128 terabytes of data managed for them by the company. The average data volumes being backed up are about 50GB per client, which is, in fact, a ten-fold increase from the 2005 level of 5GB.

What non-users should understand about online back-up is that it is incremental, so after the initial transfer of a complete data set on external media, only new data and changed files are copied over. There is also, in general, very little performance requirement, so a modest broadband link or bandwidth share is sufficient. In fact, the task can, as it were, shrink to fit the channel available down to about 256k.
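A rough calculation shows why the first copy travels on external media while the daily increments fit a modest link. The Python sketch below assumes that “256k” means 256 kilobits per second and takes a hypothetical nightly change of 100MB; it also ignores protocol overhead and compression, so the results are indicative only.

```python
def transfer_hours(megabytes: float, link_kbit_per_s: float) -> float:
    """Hours needed to send a given volume over a link of the given speed."""
    bits = megabytes * 8_000_000  # 1MB taken as 10**6 bytes
    seconds = bits / (link_kbit_per_s * 1000)
    return seconds / 3600


if __name__ == "__main__":
    link = 256  # kbit/s, an assumed reading of "256k"
    nightly_mb = 100  # hypothetical volume of changed files per day
    full_set_mb = 50_000  # 50GB, the average client volume quoted above
    print(f"Nightly increment: {transfer_hours(nightly_mb, link):.1f} hours")
    print(f"Full data set: {transfer_hours(full_set_mb, link) / 24:.0f} days")
```

On those assumptions the nightly increment takes under an hour, while a complete 50GB set would tie up the same link for about 18 days.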

In a managed service such as KeepItSafe, the process is managed by smart systems tracking file by file and keeping to strict SLAs based on the client’s business rules.

“Encryption is standard, the accuracy of everything is checked file by file and when the volume of altered files reaches an agreed proportion of the total, say 50 per cent, a fresh back-up will be initiated,” said Blacklock.
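The file-by-file logic can be sketched as a comparison of content fingerprints. The Python below is a minimal illustration and not KeepItSafe’s method: the function names are hypothetical, the 50 per cent trigger is measured by file count rather than data volume for simplicity, and deleted files, encryption and the transfer itself are left out.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each file under root to a SHA-256 fingerprint of its contents."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            result[str(path.relative_to(root))] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result


def changed_files(current: dict, reference: dict) -> list:
    """Files that are new, or whose contents differ from the reference snapshot."""
    return sorted(name for name, digest in current.items() if reference.get(name) != digest)


def plan_backup(current: dict, previous: dict, baseline: dict, rebaseline_ratio: float = 0.5):
    """Decide between an incremental run and a fresh full back-up.

    previous is the snapshot from the last run of any kind; baseline is the
    snapshot from the last full back-up.
    """
    drift = changed_files(current, baseline)
    if current and len(drift) / len(current) >= rebaseline_ratio:
        return "full", sorted(current)
    return "incremental", changed_files(current, previous)


if __name__ == "__main__":
    baseline = {"accounts.xls": "aa", "customers.db": "bb", "menu.doc": "cc", "suppliers.csv": "dd"}
    today = dict(baseline, **{"menu.doc": "c2"})
    print(plan_backup(today, previous=baseline, baseline=baseline))
```

With one file in four altered, the plan is an incremental copy of that file alone; once half the files differ from the last full back-up, the whole set is sent again.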

“I’m always asked how often the restore services are invoked and actually the answer is every day. The most common reason is to restore files that have become corrupted or been accidentally deleted on the client’s own system.” Complete or near disasters averaged a surprising level of about three a week, Blacklock said.

“By and large, they are malicious,” he said. “Not to malign two professions, but the sacked and disgruntled sales rep or chef can very easily remove or trash key information like customer lists and accounts, menus and suppliers and their accounts.”

Another leading service provider is C Infinity, which offers hosted infrastructure online including servers and data storage. A sub-set of those services, clearly, is remote server back-up.

“This is, in essence, cloud computing and what is beginning to be called infrastructure on demand,” said C Infinity managing director Caitriona Lynch.

“Storage on demand and back-up are managed in much the same way and, for the customer, it is a pay-as-you-grow service, very attractive when you have tight budgets.”

C Infinity has close to 1,000 back-up clients and is expanding through resellers. Lynch echoed the point about broadband bandwidth, pointing out that, for back-up, the requirement was modest, but for hosted primary storage the bandwidth had to be higher because there were performance issues.

“Security and smart management to ensure data integrity are hugely important to our clients but relatively easy for us to guarantee with world class systems to look after the set of clients,” said Lynch. “In fact, we have deliberately targeted SMEs and professional firms with tougher compliance and regulatory requirements because we can offer guaranteed solutions.”

C Infinity clients include a high proportion of medical practices and healthcare firms, accountants and tax advisers, legal practices and other professionals.

“Total safety and security of their own and client data is what they must have. The online managed service is a solution that can deliver that at an economical cost,” Lynch said. “When something does go wrong, immediate online restoration with a few clicks is a very tangible proof of concept.”


This entry was posted on Monday, September 28th, 2009 at 20:21 and is filed under News.