Many years ago, a security team participated in a compliance audit. To pass the audit, they demonstrated to the auditor that their team maintained a full year of security log data. As proof of compliance, they sent a screenshot from their Security Information and Event Management (SIEM) system showing it configured to store one year of logs. They passed the audit with flying colors, but in the best example I know of demonstrating that “compliance is not security,” they later discovered the SIEM only had enough disk space for thirty days.
Thus began a journey to upgrade or replace the SIEM, and the team quickly realized the existing design had fundamental architectural flaws. First, expanding storage would be expensive because the team relied on a single vendor using proprietary technology. Second, every device producing logs sent them directly to the SIEM, so choosing a new vendor would require a massive effort to reconfigure each one. Finally, the SIEM itself did not meet every need of a modern security team. While it alerted on real-time events relatively effectively, searching historical data for investigations or hunting could take hours to complete.
To solve those problems, the team started from a blank slate to create a new kind of logging architecture, one that would be fast, flexible, scalable, and cost-effective. This paper takes the reader from our original design with all its flaws to our modern implementation and its numerous benefits.