The Probability of Happiness

So if I have conducted 81 interviews from a pool of 189 applicants, and I have an array of application review scores that ranked the applicants from 1 to 189 to determine which 81 I would interview, and I now also have an array of 81 interview scores ranking the interviews from 1 to 81 to determine which 26 I will select, and if I want to see how strong the correlation is between the application review scores and the interview scores, all I have to do is add up the first array of numbers, find the mean, subtract the mean from each number in the array, square each of those results, add up those results, divide by 81 (the number of numbers in the array) to find the variance, take the square root of the variance to find the standard deviation, repeat this process to find the standard deviation of the second array of numbers, convert each app review score in the first array to a standard unit by subtracting the app review score mean from the app review score and dividing that by the app review score standard deviation, convert each interview score in the second array to a standard unit by subtracting the interview score mean from the interview score and dividing that by the interview score standard deviation, multiply each candidate's app review score standard unit by their interview score standard unit, add up these results, and divide that total by 81 (the number of numbers in each array). Simple.
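The steps above can be sketched in a few lines of Python. This is a minimal illustration following the recipe literally (population, divide-by-n statistics); the function name and the short sample arrays are mine, not the actual 81 candidates' scores.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient, following the steps in the text."""
    n = len(xs)
    # Find the mean of each array.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviation: root of the mean squared deviation.
    sd_x = (sum((x - mean_x) ** 2 for x in xs) / n) ** 0.5
    sd_y = (sum((y - mean_y) ** 2 for y in ys) / n) ** 0.5
    # Convert each score to standard units (z-scores).
    zx = [(x - mean_x) / sd_x for x in xs]
    zy = [(y - mean_y) / sd_y for y in ys]
    # Multiply each candidate's pair of standard units, sum, divide by n.
    return sum(a * b for a, b in zip(zx, zy)) / n

# Invented example data, six candidates instead of 81:
app_review_ranks = [1, 2, 3, 4, 5, 6]
interview_ranks  = [2, 1, 4, 3, 6, 5]
print(pearson_r(app_review_ranks, interview_ranks))
```

On this toy data the two rankings mostly agree, so the result lands close to 1.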

The number I am left with is the correlation coefficient. It will be somewhere between -1 and 1. The larger the absolute value, the stronger the correlation. A correlation close to 0 indicates no meaningful association. A negative number indicates a negative correlation (i.e. change in the opposite direction: the higher the app review score, the lower the interview score). A positive number indicates a positive correlation (i.e. change in the same direction: the higher the app review score, the higher the interview score). Typically, an absolute value less than 0.3 indicates a weak correlation, an absolute value between 0.3 and 0.7 indicates a moderate correlation, and an absolute value greater than 0.7 indicates a strong correlation.
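Those rough conventions can be wrapped in a small helper; the thresholds are the ones stated above, and the function name is my own label, not a standard one.

```python
def describe_correlation(r):
    """Label a correlation coefficient using rough conventional thresholds."""
    if r == 0:
        return "no meaningful association"
    direction = "positive" if r > 0 else "negative"
    strength = abs(r)
    if strength < 0.3:
        return f"weak {direction}"
    elif strength <= 0.7:
        return f"moderate {direction}"
    return f"strong {direction}"

print(describe_correlation(0.85))   # strong positive
print(describe_correlation(-0.45))  # moderate negative
```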

In this particular circumstance I believe I would like to see a strong positive correlation between app review scores and interview scores because I would like to feel confident that I am not leaving strong applicants without an interview, thus with no chance of selection. But I also have to be true to our interview process, (knowing that it is a more thorough vetting), and refrain from artificially inflating correlation, (possibly by including the app review score as a larger weighted portion of the interview score), simply to justify the app review process. So I must work backwards and structure the app review process to more closely approximate the interview process. If I am unable to strengthen the correlation by making solid connections from one process to the next, (and back), then (after a given time that in this circumstance will be measured in years), I will look at simpler criteria that will provide similar correlation with less complexity and less work. Of course the ultimate measure will be the success of those selected; but in this circumstance it is too early to measure this correlation, hence the continued complexity and effort, and the necessity of years.

Interestingly, between these first two years, I see the strongest correlation (between app review scores and interview scores) in the first year. Obviously this indicates that the two selection processes were more closely aligned in the first year than in the second year. But proximity does not mean they were better. My interpretation is that in the first year we were more instinctive and subjective, (and less experienced), in both the app review process and the interview process, and in the second year we were more consistent and precise in the app review process, but maintained a more subjective (but also more experienced) interview process. As already mentioned, the success of those selected may help to clarify, but even that cannot tell us the appropriate mix of perspectives, as we cannot measure the performance of those we did not select; and because of the quantity of candidates it may simply be difficult to select poorly.

Though this is a good problem to have, it does not justify haphazard methods. If anything, it requires more focused diligence to take advantage of this opportunity to select the best of the best. Additionally, a selection process without consistency and measurement may very well produce positive results but if supply and demand change, how will we know what we have done right? To take advantage of the current circumstance we must continue to analyze the data as it comes, to determine if we have had the unfortunate good luck to have stumbled across two or more years of properly executed methods of selection, or the fortunate bad luck to have one process (statistically) distinguish itself.

Because these results are preliminary and premature, we will continue to make incremental improvements, and within three to five years I can start playing with linear regression to predict outcomes based on various explanatory variables. But to do this with confidence it is better to be consistent in our subjective terminology and definitions. The end-result app review scores and interview scores, though objectified, are a mix of both objective data and subjective judgement. What I have done, going into this, our third year of selection, is to design a rubric that defines the subjective performance measurements consistent with those subjective measurements utilized, (verbally and thoughtfully, if not expressly), and refined throughout our first two years of selection. I had mentioned that our interview process was a more thorough vetting, and it is largely from this process that I have created this rubric, originally intending it to serve as a measure of the performance of those already selected. After some epiphanous analysis, though, I have adapted this rubric's terminology and definitions to not only serve as a performance measurement, but also to reach back into the interview process and again back into the app review process, and reflect more precisely how our selection process has functioned and evolved. In a sense, I have used hindsight to structure foresight in order to measure the past, the present, and (ultimately) the future. What I must remember is that the present, (this moment), is fleeting; and to claim, (as I did at the beginning of this paragraph), that only "incremental" improvements are necessary from this moment forward, is to allege an arrogant certainty from a potentially devastating height. To prevent this fall, I must also remember that certainty kills effort, whereas uncertainty, as a byproduct of subjectivity, perpetuates effort.
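For when that time comes, the simplest form of the idea is ordinary least squares with a single explanatory variable. Here is a minimal sketch; the variable names and the sample numbers are hypothetical stand-ins, not our actual scores or outcomes.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical example: predicting an outcome score from an app review score.
app_scores = [60, 70, 80, 90]
outcomes   = [2.1, 2.8, 3.6, 4.1]
slope, intercept = fit_line(app_scores, outcomes)
predicted = slope * 75 + intercept  # prediction for a new candidate
```

With more than one explanatory variable this generalizes to multiple regression, which is where a library becomes worth the dependency.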

