Data, structured and unstructured alike, is about to come avalanching down upon everyone. Not that growing data volumes haven’t already been a problem: many businesses are trying to cope by expanding their storage capacity with a mix of all-flash, cloud, and hybrid systems.
Now, however, IBM may be bringing another option into consideration. Some experts speculate that once the Internet of Things flourishes, even more data will arrive and overwhelm current storage capacities. Even with all the available analytical programs and software, making sense of all that data is becoming more and more difficult.
In response to that possibility, IBM researchers are proposing and exploring an intelligent storage system that works something like the human brain. Cognitive Storage is based upon the idea that it’s easier to remember important details, like the face of your child, than unimportant ones, like the appearance of the check-out cashier.
For IBM, already a world leader in Artificial Intelligence, as proven by Watson’s success, applying intelligence to the storage problem is a natural approach.
In Cognitive Storage for Big Data (paywall), Giovanni Cherubini, Jens Jelitto and Vinodh Venkatesan of IBM Research - Zurich described their prototype system. Cherubini explained that cognitive storage aims to assess the value of data and use that value to decide how the data is stored: the media class, the level of data protection and more are chosen based on the importance of the data.
Cognitive storage would cut costs by helping to decide automatically which class of media data should reside on, what levels of data protection should apply and what policies should be set for the retention and lifecycle of different classes of data.
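To make the idea concrete, here is a minimal sketch of how an assessed data value might be mapped to a storage policy. It is not taken from the IBM prototype: the `StoragePolicy` fields, the tier names, the thresholds and the retention periods are all hypothetical, and the value is assumed to be a score between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    media_class: str      # where the data lives (hypothetical tier names)
    replicas: int         # level of data protection
    retention_days: int   # how long the data is kept


# Hypothetical tiers, ordered from the highest value threshold to the lowest.
POLICY_TIERS = [
    (0.8, StoragePolicy("flash", replicas=3, retention_days=3650)),
    (0.4, StoragePolicy("disk", replicas=2, retention_days=1095)),
    (0.0, StoragePolicy("tape", replicas=1, retention_days=365)),
]


def policy_for(value: float) -> StoragePolicy:
    """Return the first tier whose threshold the value score meets."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0 and 1")
    for threshold, policy in POLICY_TIERS:
        if value >= threshold:
            return policy
    raise AssertionError("unreachable: the last tier has threshold 0.0")


if __name__ == "__main__":
    for score in (0.95, 0.5, 0.1):
        print(score, policy_for(score))
```

The cost saving in this sketch comes from the lowest tier: data judged to be of little value gets the cheapest media, the fewest copies and the shortest retention.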
As IoT data sets are analyzed, the machine intelligence would be able to apply what it has learned from previously evaluated data, using attributes such as access frequency, time value and protection, to new data sets. The AI could learn by observing how human users determine what is significant, then categorize and store data sets according to enterprise needs.
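The learning step can be illustrated with a toy example. The sketch below uses a nearest-centroid classifier as a deliberately simple stand-in; it is not the algorithm used by the IBM researchers. The two features (access frequency and age, both scaled to the range 0 to 1) and the "high"/"low" labels are assumptions made purely for illustration.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples):
    """Average the feature vectors of each label.

    samples: list of (features, label) pairs, where the label is the
    value class a human user assigned to that data set.
    """
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Assign the label whose centroid is closest to the new data set."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training data: (access frequency, age), both scaled to 0..1.
labeled = [
    ((0.9, 0.1), "high"),
    ((0.8, 0.2), "high"),
    ((0.1, 0.9), "low"),
    ((0.2, 0.8), "low"),
]
centroids = train_centroids(labeled)

if __name__ == "__main__":
    print(classify(centroids, (0.7, 0.3)))
    print(classify(centroids, (0.05, 0.95)))
```

A predicted class like this could then feed whatever rule decides media class, protection level and retention.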
Within their report, the researchers explain their use of a learning algorithm known as the “Information Bottleneck” (IB). It is “a supervised learning technique that has been used in the closely related context of document classification, where it has been shown to have lower complexity and higher robustness than other learning methods.”
Following the integration of the IB algorithm, the researchers have been pleased with its success rate. Within relatively small data sets, the team found that as the number of training sets increased, the accuracy improved, “reaching nearly 100 percent accuracy at around 30 percent of the training data included.”
Such results show promise even as the project continues to develop. Cognitive storage is currently being developed for the Square Kilometer Array, an astronomy project that will comprise one million square meters of radio telescopes and will evaluate radio waves from the Big Bang more than 13 billion years ago. Once it is active, it will generate up to 1 PB of data per day.
As the team set out to develop a cost-effective way to analyze and sort the data, it became clear that they needed to be able to tier and rank it properly. Thus, the idea of modeling the system after the human brain (where some information is stored and other information is disregarded) was born.
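To put the figure of 1 PB per day in perspective, the short calculation below converts it into a sustained data rate and a yearly volume. It assumes decimal (SI) units, so one petabyte is 10^15 bytes, and it assumes the peak rate is sustained every day of the year; both the function names and those assumptions are illustrative only.

```python
PB_IN_BYTES = 10**15          # decimal petabyte (assumption: SI units)
GB_IN_BYTES = 10**9           # decimal gigabyte
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_gb_per_s(pb_per_day: float) -> float:
    """Average throughput needed to absorb a given daily volume."""
    return pb_per_day * PB_IN_BYTES / SECONDS_PER_DAY / GB_IN_BYTES


def yearly_volume_pb(pb_per_day: float, days: int = 365) -> float:
    """Total volume if the daily rate were sustained for a full year."""
    return pb_per_day * days


if __name__ == "__main__":
    print(f"{sustained_rate_gb_per_s(1):.2f} GB/s")
    print(f"{yearly_volume_pb(1):.0f} PB per year")
```

Under these assumptions, 1 PB per day works out to roughly 11.6 GB every second and 365 PB per year, which is why storing everything at the same tier is unlikely to be cost-effective.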
To explore further applications of the system, IBM is also looking for business partners for beta testing in enterprise and other environments. According to the project developers, the functionality will ultimately be provided as a hardware/software product as well as via the cloud.
As the future of enterprise storage technology unfolds, the likely applications of IBM’s idea will become clearer. This isn’t the first attempt to integrate intelligence into storage systems, but in the past the cost of doing so has ultimately proven greater than the cost of simply adding more storage. Now, however, the needs may be more complex, and the solution may need to be just as sophisticated.
Cognitive storage has an unpredictable future, one that even Watson likely can’t project.