Accelerate AI workloads with IBM Storage

18 February, 2020
Ashutosh Pathak
IBM

Artificial intelligence is on the rise.

Gartner’s 2019 CIO Agenda survey shows that between 2018 and 2019, the share of organizations that have deployed AI grew from 4 percent to 14 percent. This increasing adoption demonstrates the value of the data insights, and the resulting business gains, that AI can offer.

AI has come a long way since its earlier applications in nuclear and genomic sciences. Today, it can be used in numerous industry use cases, including:

  • Marketing: Retargeting a client or providing product recommendations to an existing client
  • Sales: Predictive sales patterns, lead generation, chatbots and mail bots acting as sales representatives
  • Operations: Predictive maintenance, manufacturing analytics
  • Customer service: Call analytics, response suggestions, automated claims processing
  • Data security and personal security: Cybersecurity and fraud detection, AI-powered autonomous security systems
  • Healthcare: Early diagnosis, image insights
  • Automotive: IoT and GPU advancements that make self-driving cars possible

The earlier you jump on the AI strategy train, the more competitive and innovative your business can be.

As you’re planning for AI adoption, it’s important to note that infrastructure, specifically storage, can be a key accelerator in any AI setup. Every AI or machine learning use case depends on data and the inferences we draw from that data. This data, as part of the AI process, goes through various stages in the AI workflow, and each stage has its own distinct I/O characteristics. Having a storage solution that not only supports these diverse I/O requirements but also provides easy data management is essential.

IBM Storage solutions stand out in addressing the kinds of challenges that can surface in an AI workflow, helping you balance cost and performance as you introduce more AI capabilities to your business.

Storage for every stage of the AI workflow

The typical AI workflow has four major stages:

  • Ingest
  • Classify/prepare/transform
  • Analyze/train
  • Inference/insights

Each stage requires different storage capabilities to achieve the desired results. Essential capabilities for a successful AI implementation include scalability, speed, cost efficiency and flexibility.

Ingest: Data ingestion is the process of obtaining data from any source and importing it for immediate use. It can be challenging for businesses to ingest data at a reasonable speed. For successful data ingest, you need a storage solution that provides the required scalability and throughput and can ingest data from heterogeneous sources, all at an affordable cost.

Classify/prepare/transform: After ingest, automated classification, preparation and (where needed) transformation can reduce the complexity of managing data received from heterogeneous sources. One way to do this is with a policy-based solution that makes the same data usable for multiple AI use cases.
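As a rough illustration of the policy-based idea, the sketch below tags each ingested file with the AI use cases it could serve, based on simple suffix rules. Everything in it is hypothetical: the `Policy` and `Record` classes, the suffix rules, the paths and the use-case names are assumptions made for this example and do not represent any IBM product interface.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Policy:
    """A hypothetical rule: files with a matching suffix get these use-case tags."""
    name: str
    suffixes: tuple
    use_cases: tuple


@dataclass
class Record:
    """One ingested file plus the use-case tags assigned to it."""
    path: str
    source: str
    tags: set = field(default_factory=set)


# Illustrative policies only; real rules depend on your data and use cases.
POLICIES = [
    Policy("images", (".jpg", ".png", ".dcm"), ("image-insights",)),
    Policy("call-audio", (".wav", ".mp3"), ("call-analytics",)),
    Policy("logs", (".log", ".csv"), ("predictive-maintenance", "fraud-detection")),
]


def classify(record, policies=POLICIES):
    """Tag a record with every use case whose policy matches its file suffix."""
    suffix = PurePosixPath(record.path).suffix.lower()
    for policy in policies:
        if suffix in policy.suffixes:
            record.tags.update(policy.use_cases)
    return record


if __name__ == "__main__":
    ingested = [
        Record("/ingest/plant7/sensor-0412.csv", source="iot-gateway"),
        Record("/ingest/clinic/scan-0001.dcm", source="pacs"),
        Record("/ingest/misc/readme.txt", source="upload"),
    ]
    for rec in ingested:
        classify(rec)
        print(rec.path, sorted(rec.tags))
```

Because a single policy can attach several tags, one copy of a file can feed more than one use case without being duplicated per project.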

Analyze/train: Data analysis and training is a highly iterative process, and each iteration, depending on the data size, may take a long time to complete. A storage solution with low latency can help you complete an AI training cycle in less time.

Inference/insights: The final step is extracting valuable insights from the data, which can help a business achieve the competitive advantage it is looking for. Speed is therefore once again an important aspect of the storage solution.
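The four stages described above can be summarized as a small lookup table, which may help when comparing storage options stage by stage. This is only a sketch: the `Stage` enum, the `STORAGE_PRIORITIES` mapping and the `stages_needing` helper are names invented for this example, and the priorities are paraphrased from the stage descriptions rather than taken from any specification.

```python
from enum import Enum


class Stage(Enum):
    INGEST = "ingest"
    PREPARE = "classify/prepare/transform"
    TRAIN = "analyze/train"
    INFERENCE = "inference/insights"


# Priorities paraphrased from the stage descriptions in this article.
STORAGE_PRIORITIES = {
    Stage.INGEST: ["scalability", "throughput", "heterogeneous sources", "affordable cost"],
    Stage.PREPARE: ["automated classification", "policy-based data management"],
    Stage.TRAIN: ["low latency"],
    Stage.INFERENCE: ["speed"],
}


def stages_needing(keyword):
    """Return the stages whose priority list mentions the given keyword."""
    return [
        stage
        for stage, needs in STORAGE_PRIORITIES.items()
        if any(keyword in need for need in needs)
    ]


if __name__ == "__main__":
    for stage, needs in STORAGE_PRIORITIES.items():
        print(f"{stage.value}: {', '.join(needs)}")
```

For example, `stages_needing("latency")` returns only the training stage, which is where the article singles out low latency as the deciding factor.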

How does IBM Storage for AI stand apart?

IBM offers a wide portfolio of software-defined storage solutions to address requirements at each stage of the AI workflow. These solutions not only provide an intelligent information architecture but also enable efficient data management, accelerating the data pipeline from ingest to insight.

Figure 1: IBM Storage portfolio, designed and optimized to serve the unique requirements of different stages, from ingest to insights and beyond.

IBM Spectrum Scale

IBM Spectrum Scale is a proven software-defined offering that’s POSIX compliant, provides parallel access to data and helps minimize data movement across the different stages of AI. Spectrum Scale provides multiple ways to access data within a single global namespace, which makes it useful across different data cycles, where data is ingested from heterogeneous sources around the globe. Spectrum Scale moves data across tiers quickly and with minimal effort, with extremely low latency and high throughput. It is also available as an appliance, IBM Elastic Storage Server, which provides high throughput from a single unit.

IBM Cloud Object Storage

Large data sets for AI require storing massive amounts of data for long periods. IBM Cloud Object Storage provides a cost-effective, scalable, secure, highly available and easily manageable solution. It is available both as a software solution and as appliances. The appliance is available in small configurations starting at terabytes and scales to exabytes.

IBM Spectrum Discover

IBM Spectrum Discover is modern metadata management software that helps bring structure to unstructured data. Spectrum Discover helps speed up the data preparation stage by cataloging and classifying unstructured data and by removing duplicate data during the classification and transformation process of the AI workflow.
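To make the cataloging and duplicate-removal idea concrete, here is a minimal sketch that indexes files by a hash of their content and reports any file whose content has already been seen. It is a generic illustration of the technique and assumes nothing about how Spectrum Discover works internally; the function names and the in-memory sample data are invented for this example.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies the content, not the file name."""
    return hashlib.sha256(data).hexdigest()


def build_catalog(files):
    """Catalog (path, bytes) pairs by content digest.

    Returns a tuple (catalog, duplicates):
      catalog    maps digest -> metadata of the first file seen with that content
      duplicates lists (duplicate_path, original_path) pairs
    """
    catalog = {}
    duplicates = []
    for path, data in files:
        digest = content_digest(data)
        if digest in catalog:
            duplicates.append((path, catalog[digest]["path"]))
            continue
        catalog[digest] = {"path": path, "size_bytes": len(data)}
    return catalog, duplicates


if __name__ == "__main__":
    # Hypothetical in-memory sample; a real run would read file contents from storage.
    sample = [
        ("/data/a.txt", b"sensor reading 1"),
        ("/data/b.txt", b"sensor reading 2"),
        ("/data/copy-of-a.txt", b"sensor reading 1"),
    ]
    catalog, duplicates = build_catalog(sample)
    print(f"{len(catalog)} unique files, {len(duplicates)} duplicates")
    for dup, original in duplicates:
        print(f"{dup} duplicates {original}")
```

Hashing the content rather than comparing names means copies that were renamed at the source are still caught before they reach the training stage.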

IBM FlashSystem

Data analysis and training of AI systems is a highly iterative process, and at this stage, faster storage is the key to quick results. IBM FlashSystem is built on IBM’s patented FlashCore technology to provide excellent performance and a highly competitive total cost of ownership.

IBM Storage solutions for AI support AI frameworks such as TensorFlow, Caffe, Spark and CNTK. Organizations interested in adopting AI need to work with providers that offer a wide range of storage solutions to support every stage of the AI process, and IBM Storage can do that.

Where you can find support

If you’re looking for support on IBM software-defined storage and the IBM Spectrum Storage Suite, contact IBM Systems Lab Services. The Lab Services Storage and Software Defined Infrastructure team has helped clients around the world efficiently capture, deliver, manage and protect data with superior performance.

The post Accelerate AI workloads with IBM Storage appeared first on IBM IT Infrastructure Blog.