Building the airplane as you fly it
Balance is hard.
In the data domain you typically have to balance between building the right thing and building the thing right.
You have to balance the speed of getting information into your stakeholders' hands, so that it supports their next action, the outcome of that action and the value delivered by that outcome, against the rigour required to make every step in the Information Value Stream robust, repeatable and ideally automated.
The days of being able to spend 6 months or a year on “Sprint Zero” creating your data platform have gone.
The days of spending months collecting all the data into the data platform before you combined it and made it consumable have gone.
The days of spending 6 months defining the perfect enterprise data model have gone.
Yes, the latest wave of cloud-enabled data technologies, and the advent of SaaS tools that you can lease, have made it faster than ever for data teams to deliver a data platform.
But if you go too fast, you end up with 28 point-solution technologies that you have integrated (some would say cobbled) together, and a pile of technical debt that leaves you reluctant to touch any of the chewing gum and string that holds them together.
Yes, the ability to quickly create a blob of SQL code and call it a “model” means that you can deliver quickly, bypassing the need to create any reusable data models. But at some stage you will have thousands of blobs of code running, and when the code, core business concepts and core business processes eventually overlap, your stakeholders will start getting different answers to the same question.
Building the thing right is just as important as building the right thing.
But balance is hard.
The good news is that there is a world of agile, product and data patterns your data teams can adopt that will both give them speed of delivery and start them on the journey of building a robust Information Value Stream and Information Factory.
A few patterns I have seen work:
1) Adopt an agile way of data modeling based on Scott Ambler's work (which he freely shares).
2) Always quickly create and share a conceptual model first, one that contains only Core Business Concepts (Customer, Product, Order, etc.).
3) Define either policies (guardrails) or patterns (helper kits) that indicate which data patterns should be used and where.
4) Have data teams regularly demo their patterns to other data teams.
5) Instead of basing your Data Governance process on committees, documents and rules nobody will read or follow, base it on collaboration and harvesting. Take the good work done by the data teams and make it easily accessible to everybody else in the organisation.
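To make pattern 2 a little more concrete, here is a minimal sketch of what a "concepts only" conceptual model could look like if you captured it in code rather than on a whiteboard. The class names (Concept, ConceptualModel), the methods and the verbs "places" and "contains" are all hypothetical, invented for this illustration; the point is only that the model names Core Business Concepts and how they relate, with no attributes, keys or tables yet.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A Core Business Concept: a name only, no attributes or keys yet."""
    name: str


@dataclass
class ConceptualModel:
    """Hypothetical container for concepts and the relationships between them."""
    concepts: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def add_concept(self, name):
        concept = Concept(name)
        self.concepts.add(concept)
        return concept

    def relate(self, subject, verb, obj):
        # Only concepts already agreed in the model may be related.
        for concept in (subject, obj):
            if concept not in self.concepts:
                raise ValueError(f"Unknown concept: {concept.name}")
        self.relationships.append((subject, verb, obj))

    def describe(self):
        # One plain-language sentence per relationship, easy to share.
        return [f"{s.name} {v} {o.name}" for s, v, o in self.relationships]


model = ConceptualModel()
customer = model.add_concept("Customer")
product = model.add_concept("Product")
order = model.add_concept("Order")

# Illustrative relationships (assumed, not taken from any real organisation).
model.relate(customer, "places", order)
model.relate(order, "contains", product)

for sentence in model.describe():
    print(sentence)
```

Something this small can be created in minutes and shared with stakeholders and other data teams, which is the whole idea: agree on the concepts first, then add detail only when a piece of delivery needs it.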
Making trade-off decisions between spending time on building the thing right and delivering value to our stakeholders as early as possible is a difficult balance.
One team I worked with called it “building the airplane as you fly it”.
You can listen to our podcast with the leader of that team here:
https://agiledata.io/podcast/flying-the-airplane-while-building-it/