Discussion on: Diversity Matters in The Workplace

 
Alvaro Montoro

I understand the AI example may not be the best, but it's a sign of something bigger. And it is not limited to poor test data. As I said in a different comment, development is not only coding; it involves all the steps of the SDLC, including data gathering.

The data gathering is definitely inadequate, but it's not an excuse either. Training data doesn't show up out of thin air; it is created and gathered by people (or by algorithms created by people), which can skew how well it represents the population and how neutral it is.

Even if the data is wrong and the training is wrong, the fact that nobody realized the accuracy was so lopsided is a sign that they were oblivious to a sex/skin-color issue. No one thought, "hey, we have 99% accuracy for white men, but 65% for black women"? And if they did, nobody did anything? That's not a data-gathering issue.
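
To be concrete, checking for that kind of gap takes a few lines once you think to look. A minimal sketch (numbers and column names invented) of the per-group check I mean; the overall number alone hides the gap:

```python
# Minimal sketch (invented numbers and column names): checking accuracy
# per demographic group instead of only the overall figure.
import pandas as pd

df = pd.DataFrame({
    "label":      [1, 0, 1, 1, 0, 1, 0, 1],
    "prediction": [1, 0, 1, 0, 0, 0, 1, 1],
    "group":      ["white_man", "white_man", "white_man", "black_woman",
                   "black_woman", "black_woman", "white_man", "black_woman"],
})

# Overall accuracy can look acceptable while one group's accuracy is terrible.
correct = df["label"] == df["prediction"]
print("overall:", correct.mean())
print(correct.groupby(df["group"]).mean())
```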

I agree that it is not always possible to get a good representation of the population. But in this day and age, with many free sources available for images and portraits, having bad data for a vision system is a poor excuse.

DrBearhands

I can't really see the point you're trying to make here. Nevertheless, I think there are a few problems with what you're saying.

> it's a sign of something bigger

Yes, it is a product of a divided society. The reasoning "biased AI → we need diversity in tech" does not hold, though.

> And it is not limited to poor test data [...] data gathering too.

If you know a good example of how diversity in the development team can benefit the company, use that rather than AI. Let's not dilute good arguments with bad ones.

You also appear to assume the entire team is responsible for the whole process, which is often not true. Essentially this issue only matters for QA.

> The data-gathering [...] sex/skin color issue.

I think you've missed my point here. There are countless biases your dataset might have. A good data-gathering process ensures samples are representative of the final use case. Skin-color issues are an indicator that the data-gathering process is poor and produces bad results; that is a problem in and of itself. Adding a black woman to the team might solve this particular issue, but the team is still going to produce dangerously biased models, with biases that are far harder to notice.
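
To illustrate the kind of representativeness check a good process includes, here is a sketch with invented categories and target shares:

```python
# Sketch (invented categories and shares): compare the demographic mix of
# a gathered dataset against the expected distribution of the use case.
from collections import Counter

dataset_groups = ["A"] * 800 + ["B"] * 150 + ["C"] * 50   # what was gathered
target_share = {"A": 0.40, "B": 0.35, "C": 0.25}          # who will use it

counts = Counter(dataset_groups)
total = sum(counts.values())
for group, share in target_share.items():
    actual = counts[group] / total
    flag = "  <-- under-represented" if actual < 0.5 * share else ""
    print(f"{group}: dataset {actual:.0%} vs target {share:.0%}{flag}")
```

And note that this only checks one attribute; the point is that there are many such axes, most of which nobody will think to eyeball.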

> and the training is wrong

This is unlikely to be the case. ML will just match the data, whatever it is. Beyond having a model that is too simple, which will result in low accuracy, the bias of a model after training is a reflection of the bias in the input data.
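
A toy demonstration of that (entirely synthetic data): a standard classifier trained on a skewed sample simply reproduces the majority group's pattern, with no broken training step involved:

```python
# Toy sketch (synthetic data): a model fits whatever the data says, so a
# skewed sample yields a skewed model even when training works perfectly.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Majority group: 950 samples, true label is (feature > 0).
x_maj = rng.normal(0.0, 1.0, (950, 1))
y_maj = (x_maj[:, 0] > 0).astype(int)

# Minority group: 50 samples with a different true pattern, (feature > 1).
x_min = rng.normal(1.0, 1.0, (50, 1))
y_min = (x_min[:, 0] > 1).astype(int)

model = LogisticRegression().fit(np.vstack([x_maj, x_min]),
                                 np.concatenate([y_maj, y_min]))

# The learned boundary sits near the majority's threshold, so the
# minority group gets systematically worse accuracy.
print("majority accuracy:", model.score(x_maj, y_maj))
print("minority accuracy:", model.score(x_min, y_min))
```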

> with many free sources available for images and portraits, having bad data for a vision system is a poor excuse.

This would cause exactly the bias problems I was talking about. Data gathering is hard. You can't just download some pictures and expect them to form an unbiased dataset.

I'd like to reiterate: I'm not making an argument against diversity. I've had rather good experiences pair-programming with women; men and women have different ways of tackling problems, and there's definitely a "stronger together" effect. I would, however, like to see the argument of biased AI go away.

If you add bad arguments to good ones, the good arguments lose credibility.