Since the end of January, I’ve been focused on a research project here at Tripwire. Within Tripwire R&D, we call these “labs”: a small team, including myself, explores strategic product solutions. The wonderful part about labs is the surprise realizations: the things you weren’t aiming for at the outset that emerge at the confluence of multiple experiences.
One of those surprise realizations happened just last night, after following up on a lab teammate’s recommendation to read Julia Evans’ Machine Learning Isn’t Kaggle Competitions, a well-written article that is more about solving problems than about machine learning.
After reading that article, I realized just how crucial it is to “understand the business problem” and to use the power of data as the starting point for designing a solution.
OK, you might be thinking, “That’s not a surprise, or at least it shouldn’t be. Just skim Eric Evans’ (any relation to Julia?) influential book Domain-Driven Design, and you’d know that.”
It’s true, but it’s one thing to read a book about a technique and another to experience it firsthand. And it was surprising how useful this technique proved to be as we crafted a system to optimize scaling Tripwire deployments.
How this surprise unfolded:
- We started with User Experience research into customer pain points.
- We identified a small set of use cases to solve, just as Ms. Evans’ article describes.
- Our designs focused on micro-services, using point-to-point RESTful communication.
- Questions during the design sessions immediately pointed to our lack of understanding of the domain, e.g.: What are the REST resources? What are the responsibilities of each micro-service? We didn’t have concise answers to these and other related questions.
- Here things started to get interesting. With no specific goal, we created concrete messages for our use cases – messages containing persona names, business-related attributes, etc. This led to thinking about the actual problem domain, the data we needed to solve the problems, and which micro-services would be responsible for each type of data.
- Then we started thinking of how data would flow from one micro-service to the next and posed the question, “What if every micro-service used a pub/sub model?”
- The responsibilities of each micro-service shifted. We split the data along problem-domain boundaries – ease of deployment, system health, troubleshooting, support, and so on – and used services to recombine that data into richer forms.
OK, let’s take a break… in reality it didn’t unfold in the clean, sequential, lock-step fashion I’m describing. The last three steps above repeated and commingled over a period of a week or so.
- Once we had a design for a system of micro-services and a data flow, we stopped designing and started creating data definitions, prototype services and tools to simulate system messages.
- We started prototyping a working demo.
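The pub/sub data flow we settled on can be sketched roughly like this – a minimal in-process sketch in Python rather than our actual Clojure services, and the topic names, service roles, and message fields here are illustrative assumptions, not our real design:

```python
from collections import defaultdict

class Broker:
    """Minimal in-process pub/sub broker: topics map to subscriber callbacks."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber of the topic.
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
ui_inbox = []  # stands in for the user-interface service

# A hypothetical enrichment service: it subscribes to raw endpoint facts
# and republishes a richer message instead of being called point-to-point.
def enrich(message):
    enriched = dict(message, health="ok")  # illustrative enrichment only
    broker.publish("endpoint.enriched", enriched)

broker.subscribe("endpoint.raw", enrich)
broker.subscribe("endpoint.enriched", ui_inbox.append)

# An originating endpoint publishes a fact about itself; it never needs to
# know which services consume or enrich that fact downstream.
broker.publish("endpoint.raw", {"endpoint": "web-01", "os": "linux"})
print(ui_inbox)  # [{'endpoint': 'web-01', 'os': 'linux', 'health': 'ok'}]
```

The point of the sketch is the decoupling: the endpoint publishes data, the enrichment service subscribes and republishes, and neither knows about the other, which is what let us shift micro-service responsibilities without rewiring point-to-point calls.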
We were very pleased with the result: an elegantly simple event-driven system in which data flowed from the originating endpoints and was enriched by services on its way to the user interface.
The resulting system was segmented in a way that distributed the construction tasks among lab members and was easy to test and extend. It was natural to treat the data flowing through the system as immutable facts, since we used Clojure, with its strong emphasis on identity and values – really just a reiteration of the value of data (or values).
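The “immutable facts” idea can be shown with a rough Python analogue of Clojure’s values – a frozen dataclass whose field names are purely illustrative. Enrichment produces a new value; the original fact is never mutated:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class EndpointFact:
    """An immutable fact about an endpoint; enrichment yields a new value."""
    endpoint: str
    os: str
    health: str = "unknown"

raw = EndpointFact(endpoint="web-01", os="linux")
enriched = replace(raw, health="ok")  # a new value; `raw` is untouched

print(raw.health)       # unknown
print(enriched.health)  # ok
```

Because every fact stays unchanged as it flows through the system, each service can be tested in isolation by checking the values it emits for the values it receives.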
This was the point where I read Julia Evans’ article, skimmed Eric Evans’ book, and thought about Clojure’s focus on data and immutability… and I had my “aha!” moment of surprise.
It was at step five, when we crafted concrete data, that the design really took off and the utility of this technique led to an elegant design.
At Tripwire, we pride ourselves on harvesting the highest-quality security data from customer assets (aka endpoints) – data that gives us an understanding of the security problem domain and allows customers to design solutions for their security challenges.
Title image courtesy of Shutterstock