Building a Better IT Stack with AI Founded on High-Quality, Well-Structured Data
By Daniel Parshall, Ph.D., Principal Data Scientist, Lakeside Software
The mathematician and philosopher of science Henri Poincaré observed, “Science is built up with facts, as a house is with bricks. But a collection of facts is no more a science than a heap of bricks is a house.” The same sentiment holds for IT. We have heaps and heaps of data, but without an organized structure we are often reduced to rummaging through it almost at random. The ongoing implementation of AI in IT will change much of that. Indeed, AI purpose-built for IT is the foundation needed to simplify the field’s notorious complexity.
Let’s make sure we build a tidy house from the beginning, however, because “garbage in, garbage out” can compromise the foundation. AI depends on data, and poor data ruins the output just as shoddy materials make for an uninhabitable house. No one in IT will turn to AI whose outputs are not trustworthy and explainable, so it is essential to build and train the AI only on high-quality, well-structured data.
But what separates good data from bad data? Data can be poor for any number of reasons: it might contain nulls, negative counts, or impermissible values. The documentation around the data may have been lost, so that, to extend the analogy, a column named “bricks” exists but it is unclear whether those are clay bricks or cinder blocks. Or the data generation process may be subject to censoring or selection bias, in which case your model can be dangerously incomplete. Fortunately, outright data-quality problems are relatively easy to mitigate, and missing documentation can be overcome with sufficient effort.
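As a minimal sketch of what such quality checks might look like in practice, the following Python function flags nulls, negative counts, and out-of-range values in a telemetry table. The column names ("ticket_count", "cpu_pct") are hypothetical placeholders, not any particular product’s schema.

```python
import pandas as pd

def basic_quality_report(df: pd.DataFrame) -> dict:
    """Flag common data-quality problems before any model sees the data.

    Column names here are illustrative placeholders.
    """
    report = {}
    # Nulls: count missing values per column.
    report["null_counts"] = df.isna().sum().to_dict()
    # Negative counts: a count column should never go below zero.
    if "ticket_count" in df.columns:
        report["negative_ticket_counts"] = int((df["ticket_count"] < 0).sum())
    # Impermissible values: a percentage must stay within [0, 100].
    if "cpu_pct" in df.columns:
        in_range = df["cpu_pct"].between(0, 100)
        report["out_of_range_cpu_pct"] = int((~in_range & df["cpu_pct"].notna()).sum())
    return report

# Example usage with toy data:
df = pd.DataFrame({
    "ticket_count": [3, -1, 5],      # -1 is impossible for a count
    "cpu_pct": [42.0, None, 180.0],  # None is missing; 180% is out of range
})
print(basic_quality_report(df))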
Perhaps the greatest threat to a solid AI model for IT, however, is incomplete data. It is the most pernicious problem because you may not even be aware that the gaps exist. Think about this: if the number of support tickets has steadily declined, is that because there are fewer problems on your estate… or because the resolution process is so onerous that many users have quit reporting issues? For this reason, it is crucial to understand the provenance of your data, just as it is important to know the reputation of the contractor building your house.
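One way to guard against this particular trap is a simple heuristic: treat a falling ticket count as suspect when a proxy for reporting friction, such as time to resolution, is rising at the same time. The sketch below assumes hypothetical monthly columns and is illustrative only.

```python
import pandas as pd

def flag_possible_underreporting(monthly: pd.DataFrame) -> bool:
    """Heuristic: falling ticket volume plus rising resolution time suggests
    users may have stopped reporting, not that problems disappeared.

    Expects hypothetical columns "tickets" and "median_hours_to_resolve",
    one row per month in chronological order.
    """
    ticket_trend = monthly["tickets"].iloc[-1] - monthly["tickets"].iloc[0]
    friction_trend = (monthly["median_hours_to_resolve"].iloc[-1]
                      - monthly["median_hours_to_resolve"].iloc[0])
    return ticket_trend < 0 and friction_trend > 0

monthly = pd.DataFrame({
    "tickets": [120, 100, 80, 60],
    "median_hours_to_resolve": [4.0, 6.0, 9.0, 14.0],
})
# True here: investigate before celebrating the declining ticket count.
print(flag_possible_underreporting(monthly))
```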
One massive advantage we have in the IT world, and one that makes a significant difference as we incorporate AI, is that the vast bulk of our data is machine-generated telemetry, collected in standardized form from a handful of operating systems: Windows, macOS, and Linux. Going back to our construction analogy, this is akin to working with standardized prefabricated bricks rather than taking found stones and figuring out how to fit them in place. (User-generated data, on the other hand, is notoriously inconsistent, although LLMs are making it easier to use.) One of the great benefits of such comprehensive and clean data is that it greatly aids the search for complicated interaction effects.
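To make the “prefabricated bricks” point concrete, here is a minimal sketch of normalizing per-OS telemetry fields into one shared schema. Every field name on both sides of the mapping is hypothetical.

```python
# Hypothetical per-OS field names mapped to a common schema.
COMMON_SCHEMA_MAP = {
    "windows": {"ProcessorTimePct": "cpu_pct", "AvailableMBytes": "mem_free_mb"},
    "macos":   {"cpu_usage": "cpu_pct", "free_memory_mb": "mem_free_mb"},
    "linux":   {"cpu_util": "cpu_pct", "mem_available_mb": "mem_free_mb"},
}

def normalize(record: dict, os_name: str) -> dict:
    """Rename OS-specific telemetry fields into a shared schema so downstream
    models see one consistent format regardless of platform."""
    mapping = COMMON_SCHEMA_MAP[os_name]
    return {mapping.get(key, key): value for key, value in record.items()}

print(normalize({"cpu_util": 37.5, "mem_available_mb": 2048}, "linux"))
```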
How does this relate to IT use cases? Some of the most frustrating IT tickets arise when a particular chipset produces issues, but only with a particular driver, on a particular motherboard. Such tickets can burn days of resources and trigger cascades of calls to other teams before a technician notices which components produce the problem together. With N components, the number of possible pairwise interactions grows as O(N²), and three-way interactions as O(N³).
A key benefit of automated ML techniques is that they can scan every observed combination and bring suspicious ones to IT’s attention far faster than a human could (assuming the data collection process gathered the necessary information). Human expertise remains indispensable, however: these automated techniques may identify that a particular VPN server is causing issues, but they won’t know that it is due for an upgrade next month and can safely be ignored for the time being.
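A minimal sketch of such a scan, assuming hypothetical per-device telemetry with chipset, driver, and motherboard fields plus a boolean crash flag: group by each observed combination, compare its crash rate to the fleet-wide baseline, and surface the outliers.

```python
import pandas as pd

def scan_combinations(df: pd.DataFrame, min_devices: int = 20,
                      lift_threshold: float = 3.0) -> pd.DataFrame:
    """Scan every observed chipset/driver/motherboard combination and flag
    those whose crash rate far exceeds the fleet-wide baseline.

    Expects hypothetical columns: one row per device, with a boolean
    "crashed" field derived from telemetry.
    """
    baseline = df["crashed"].mean()
    grouped = (df.groupby(["chipset", "driver", "motherboard"])["crashed"]
                 .agg(devices="count", crash_rate="mean")
                 .reset_index())
    # Require enough devices so one unlucky machine doesn't trigger an alert.
    suspects = grouped[(grouped["devices"] >= min_devices) &
                       (grouped["crash_rate"] >= lift_threshold * baseline)]
    return suspects.sort_values("crash_rate", ascending=False)
```

Note that grouping enumerates only the combinations that actually occur in the fleet, which keeps the scan tractable even though the space of possible pairs and triples grows as O(N²) or O(N³).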
One important but under-discussed possibility is sharing insights without sharing data. Advances in algorithmic privacy make it possible to share the insights derived from data without sharing the data itself. Much as meta-analyses in scientific and medical research combine the results of multiple studies to estimate parameters very precisely, AI models can learn from one another without exchanging raw data. This process can yield models far more accurate than any single estate could produce on its own.
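The meta-analysis idea can be made concrete with inverse-variance weighting: each estate shares only a parameter estimate and its variance, never the underlying records, and the pooled estimate is more precise than any individual one. This is a toy illustration of the statistical principle, not a description of any particular product’s privacy mechanism.

```python
import numpy as np

def combine_estimates(estimates: np.ndarray, variances: np.ndarray):
    """Inverse-variance weighting, as used in meta-analysis: pool estimates
    so that more certain ones (smaller variance) carry more weight.

    Returns the pooled estimate and its (smaller) pooled variance.
    """
    weights = 1.0 / variances
    pooled = np.sum(weights * estimates) / np.sum(weights)
    pooled_var = 1.0 / np.sum(weights)
    return pooled, pooled_var

# Three hypothetical estates each report mean app-launch latency (ms)
# and the variance of that estimate; no raw data changes hands.
estimates = np.array([412.0, 398.0, 405.0])
variances = np.array([25.0, 16.0, 36.0])
pooled, pooled_var = combine_estimates(estimates, variances)
print(f"pooled estimate: {pooled:.1f} ms, variance: {pooled_var:.2f}")
```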
As you build and evolve your IT stack, AI and ML will likely play an increasingly vital role in improving efficiency and other key metrics. Understanding the relationship between AI and your data will help you succeed in achieving your business goals.