Most data analysts spend just 20% of their time doing actual data analysis; the other 80% is spent finding, cleaning and organizing the data. This is the 80/20 rule, also known as the Pareto principle. The problem grows worse as data volumes increase, and it is especially pronounced when it comes to security.
Today, organizations use all manner of security tools, each producing its own volume of data in its own unique format. Cross-correlating these disparate data sets can help reduce false positives and provide important context that increases security, but it can feel like a Sisyphean task.
Every minute spent pivoting between tools or trying to collect, clean and normalize data manually is time that isn’t spent on analyzing data to look for threats and vulnerabilities. That puts the organization at risk. Security (data) analysts need a better way to collect, streamline, centralize and correlate data to enable them to focus on doing what they do best: stopping threats and fortifying the organization.
Looking at the challenges
Data analysts have a lot to contend with: incomplete configuration management databases (CMDBs), isolated security controls, limited next-gen security information and event management (SIEM) platforms and more. All of these work from, or produce, disjointed and inconsistent raw data, filling analysts’ days with redundant data work. They end up doing more data wrangling than analysis.
Security analysts are struggling not only with a massive amount of data but also with alert fatigue: there are more alerts than time in a day to address them, and analyst morale and motivation wane as a result. If all the dials on the instrument panel are in the red, which one do you look at first?
These are highly paid, highly educated employees, and spending their time on tedious, repetitive data cleaning and normalization instead of actual analysis and the extraction of insights from data is a waste of money and time.
Taking back control
Organizations need to streamline and centralize their data collection to establish a better data foundation and eliminate these challenges. The path forward starts with solving for data quality and data completeness. Organizational challenges aside, by capturing data at or near the source, then normalizing and enriching it to create a common wellspring for every business unit to work from, a company can accelerate time-to-value and turn the burden of big data into benefits that help it truly differentiate.
How do security leaders do this? It starts with having good, clean, standardized data. Security leaders need to be able to collect their data from many different sources and then put it into a common schema or data structure. Once it’s in a common data structure, it will be much easier to cross-correlate.
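To make this concrete, here is a minimal sketch of what a common schema can look like in practice. The tool names, field names and raw formats are hypothetical, not any particular product’s output; the point is simply that once two different feeds are mapped into the same structure, cross-correlating them (for example, on source IP) becomes trivial.

```python
# Minimal sketch: map two hypothetical raw log formats into one common schema.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class CommonEvent:
    timestamp: datetime   # normalized to UTC
    source_ip: str
    action: str           # e.g. "deny", "blocked", "alert"
    tool: str             # which product produced the event

def from_firewall(raw: dict) -> CommonEvent:
    # Hypothetical firewall record: {"ts": 1700000000, "src": "10.0.0.5", "verdict": "DENY"}
    return CommonEvent(
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        source_ip=raw["src"],
        action=raw["verdict"].lower(),
        tool="firewall",
    )

def from_proxy(raw: dict) -> CommonEvent:
    # Hypothetical proxy record: {"time": "2023-11-14T22:13:20Z", "client": "10.0.0.5", "status": "blocked"}
    return CommonEvent(
        timestamp=datetime.fromisoformat(raw["time"].replace("Z", "+00:00")),
        source_ip=raw["client"],
        action=raw["status"],
        tool="proxy",
    )

events = [
    from_firewall({"ts": 1700000000, "src": "10.0.0.5", "verdict": "DENY"}),
    from_proxy({"time": "2023-11-14T22:13:20Z", "client": "10.0.0.5", "status": "blocked"}),
]

# With both feeds in one schema, correlation is a simple group-by on a shared field.
by_ip: dict[str, list[str]] = {}
for event in events:
    by_ip.setdefault(event.source_ip, []).append(event.tool)
print(by_ip)  # {'10.0.0.5': ['firewall', 'proxy']}
```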
Having a data quality game plan is key. Increasing data’s quality and usefulness doesn’t just happen: varied data sources arrive in distinct formats, syntaxes and volumes that must each be addressed, and much of this work is manual. A holistic plan that spans the enterprise will help to automate and accelerate this process, and it creates a foundation for ongoing data maturity innovations. Here are two factors to consider:
- Make sure the data sources and fields identified align with the current use cases and possible applications in the near term.
- Think about what the “data of interest” is for an organization. Determine the scope and develop use cases that help to visualize the ideal final goal. In other words, consider what you want the data to do, how employees will leverage it and which outcomes you want to drive.
To centralize the data and move forward, adopt a data lake approach. A security data lake provides a place to collect your downstream data, and the enterprise can draw from that lake as needed. An enterprise can have multiple data lakes, since data may come in various forms and from many sources. The first step in mastering data is to centralize it and grant as much access to it as possible within the confines of the company’s governance policies. The lake then becomes a “raw materials” source for the needs of security and operational staff.
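As a rough illustration, assuming a simple file-based layout (production deployments would more likely land data in object storage in a columnar format), the sketch below shows the basic idea: normalized events from every source are written to one central, partitioned location that any team can read within governance policy.

```python
# Minimal sketch: land normalized events in a central, partitioned "lake" directory.
import json
from datetime import datetime, timezone
from pathlib import Path

LAKE_ROOT = Path("security-data-lake")  # hypothetical root of the lake

def land_event(event: dict, source: str) -> Path:
    """Append one normalized event, partitioned by source and day."""
    day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    partition = LAKE_ROOT / f"source={source}" / f"date={day}"
    partition.mkdir(parents=True, exist_ok=True)
    out_file = partition / "events.jsonl"
    with out_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return out_file

# Any downstream team can now discover and read everything under LAKE_ROOT.
land_event({"source_ip": "10.0.0.5", "action": "deny", "tool": "firewall"}, source="firewall")
```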
Next, look to a data fabric to bring it all together. Combining a data lake and a data fabric is an ideal way to expedite this process. These two technologies, or architectural approaches, work together to help organizations understand and unlock what their data can tell them and how to incorporate the resulting insights into their operations.
Understanding the benefits of quality data
Better data can help you gain confidence in your ability to protect your organization. It enables improved security, business and operational outcomes and more focused employees. Security analysts and teams can find and fix vulnerabilities faster, ultimately giving them more time to work on other projects. Instead of spending 80% of their time wrangling and only 20% on analysis, analysts can spend closer to 100% of their time on analysis, which enables faster responses to threats. It can also improve employee morale by giving analytical people who enjoy problem-solving genuinely complex problems to work on.
Potentially avoidable problems that can be mitigated with this approach include:
- Adverse or delayed outcomes from stale or fragmented data
- Inability to answer security questions due to a lack of data
- False positives that waste resources
- Uninformed decisions that might negatively impact brand reputation, financials, security and more
From data wranglers to security data analysts
It’s time to re-examine how your data analysts are spending their time. With vast amounts of data coming from myriad sources and scattered across different places, many analysts find themselves becoming data wranglers. When it comes to security, this means your security teams can’t focus enough on mission-critical tasks. It’s time to try a new approach, the powerful combination of a security data lake and a security data fabric, which streamlines and centralizes data collection so that your security teams can do what they do best.