The Soil Cultivation Metaphor

Hand 1.  Attributed to David Pacey.

One of the greatest pleasures of home ownership is the opportunity to work in the garden.  Gardening is fulfilling for several reasons.  The accomplishment is satisfying and tangible, unlike a lot of office work.  Gardening is great physical exercise, involving a range of low-impact and core-intensive body movements.  Gardeners get time outdoors, bolstering vitamin D intake and exposure to fresh air.  By handling soil, our bodies develop resistance to germs and bacteria, or so rumor has it.  The work is solitary and meditative, improving mindfulness.  The home-grown and fresh-picked produce tastes better and is higher in nutrients.  Gardening is a kind of cure-all for wellbeing.

Soil cultivation is extremely similar to the practice of cleaning a data set.

My first round of soil cultivation was a patch of land that previously had a garden shed sitting on top of it.  There was no organic matter in the soil at all; just a dense patch of dust.  The soil required a major intervention to become useful.  The same is true of a new data set.  It’s great to have new data, but I just know it’s going to require a lot of attention before I can use it properly.  The data will lack clear labels, there will be columns or fields that are useless in some way, and some data points will need to be converted.  Sometimes it simply arrives in the wrong format, such as on paper or in ASCII, or built around a different software environment.  The certainty that I must put work into it before I get anything back is what turns this into “real” work.

With data I typically find that some fields are all wrong, and I need to track down or create “lookup” tables that convert the raw data into something that I know is accurate.  I don’t like throwing out data.  I prefer to just keep the dirty data on the left-hand side of a spreadsheet, and to the right of a thick, vertical line, create a modified column or field.  The raw data is black, and the modified data has color, so I know what I’m working with.  I also give the modified data more explicit labels, in succinct but plain English.  Many dubious data fields are suddenly rendered “accurate” by a good label.
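For anyone who works in code rather than a spreadsheet, here is a minimal sketch of the same habit: keep the raw field, and add a cleaned, plainly labeled field beside it using a lookup table.  The field names, codes, and lookup values below are invented for illustration; they are assumptions, not a real payroll layout.

```python
# Hypothetical raw records; field names and codes are invented for illustration.
raw_rows = [
    {"dept_cd": "01", "sex": "F"},
    {"dept_cd": "1", "sex": "f"},
    {"dept_cd": "02", "sex": "M"},
    {"dept_cd": "XX", "sex": ""},
]

# Hypothetical lookup tables that map raw codes to values known to be accurate.
DEPARTMENT_LOOKUP = {"01": "Finance", "1": "Finance", "02": "Production"}
SEX_LOOKUP = {"F": "Female", "M": "Male"}


def clean_row(row):
    """Return a copy of the row with cleaned fields added beside the raw ones."""
    cleaned = dict(row)  # the raw values are kept, never overwritten
    # Codes missing from a lookup table become None so they can be reviewed later.
    cleaned["department_name"] = DEPARTMENT_LOOKUP.get(row["dept_cd"].strip())
    cleaned["sex_label"] = SEX_LOOKUP.get(row["sex"].strip().upper())
    return cleaned


cleaned_rows = [clean_row(row) for row in raw_rows]

for row in cleaned_rows:
    print(row)
```

Values that the lookup tables do not recognize come back as None rather than being guessed at, which plays the same role as leaving a cell uncolored until I know what it means.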

As with my fully-remediated soil, my fully-amended data set means that I am ready-to-roll.  I can work with a converted and color-coded batch of cultivated data, culled of garbage and meaningless fields, and turned into something useful.

I know that some people think of a garden as a place where plants grow.  And some people think of data as something that is capable of producing analytic insights.  In both cases, there is something deeply human about taking a mess – soil or data – and turning it into something more.  We advance civilization one step at a time, one cubic foot at a time, one data point at a time.  Sometimes we just need to break a sweat, get some sun, work our bodies, and build immunity.

The Mountain of Gold

Camp Millar Gold Mine Hut, photo by Smudge 900.

Many human resource people are bad at math. But, even if you regard yourself as one of these people, a great event is ahead of you.  It’s helpful to place HR in the context of other strategic business pillars – those major units inside large firms that have their own professions or their own Vice-Presidents, such as accounting, production, and marketing.  In the recent history of business, computers made it easy for each of the pillars to apply math to their data.

Finance and accounting got good at this in the 1970s and 80s.  Engineers changed production lines using data throughout the 90s.  Marketing and customer service saw big data happen from the millennium onward.  These fields achieved great business success applying new computer tools to fresh data.  In the meantime, human resources twiddled its collective thumbs for fear of the data itself.

I like to think of data as deposits of gold in a mine.   And, as data miners, it is up to us to use new tools to bring this treasure to the surface. Although many mountains of data have already been mined, and have produced lots of gold, data miners have started hitting a lot of rock. The yield just isn’t what it used to be. Beginning around 2010, someone found a large, new, and untouched mountain of gold.  To many this mountain appeared to have nothing inside it.  That is incorrect, though. This mountain is filled with gold; you just need to know how to drill for it.

This gold is human resources metrics.  Decades ago the finance function insisted that we automate payroll, create a line-of-sight into spending, spend the correct amount of money, and minimize risk by obeying all laws.  A fancy system of rules was created to regulate salaries including collective agreements, pay policies, rules-based pension plans, and human rights laws about fair pay.

Deep inside this alleged payroll data exists a large and accurate dataset about age, sex, length of service, union, rate of pay, and job code.  And the other pillars can’t get into this data because they don’t know enough about the people.  Some classic examples: there may appear to be a large number of women quitting, or a modest number of accident claims, or a growing bureaucracy.  By lining up the apples-to-apples comparisons and creating better ratios, things may look different.  You might find that women’s rate of turnover is average, that one unit’s accident rate is high, or that the size of the bureaucracy is modest.  Then, people can make better choices about where to devote their efforts.
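To make the apples-to-apples point concrete, here is a small sketch with made-up numbers.  The counts are hypothetical and chosen only to show how a raw count and a ratio can tell different stories; they are not drawn from any real workforce.

```python
from collections import Counter

# Hypothetical payroll extract: one (sex, quit_during_year) pair per employee.
# The counts are invented for illustration only.
employees = (
    [("F", True)] * 30 + [("F", False)] * 170
    + [("M", True)] * 15 + [("M", False)] * 85
)

# Raw counts: how many people are in each group, and how many of them quit.
headcount = Counter(sex for sex, _ in employees)
quits = Counter(sex for sex, quit_flag in employees if quit_flag)

# The apples-to-apples ratio: quits divided by the headcount of the same group.
turnover_rate = {sex: quits[sex] / headcount[sex] for sex in headcount}

for sex in sorted(headcount):
    print(f"{sex}: {quits[sex]} quits out of {headcount[sex]} "
          f"= {turnover_rate[sex]:.0%} turnover")
```

In this invented example, twice as many women quit as men, yet the turnover rate is identical for both groups because there are twice as many women on the payroll to begin with.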

By starting small, building basic skills, cleaning the data, and creating a steady growth of analysis tables, there are ways to make gold.  Strangely, you can only do this the human resources way, full of stories and feelings and a sense of fairness and the motivation to strive.  The data just takes you there, because this is still human resources.  So start digging.  There are treasures to be found.