
From the TED Talk by Kasia Chmielinski: Why AI needs a "nutrition label"


Unscramble the Blue Letters


Now, I've been asking myself a lot of questions about how can we understand the data quality before we use it. And this emerges from two daedecs of bldnuiig these kinds of systems. The way I was tiearnd to build semstys is similar to how people do it today. You build for the middle of the distribution. That's your nomarl user. So for me, a lot of my training data sets would include iiaoontfrmn about people from the Western world who speak English, who have certain normative ctacstrihiercas. And it took me an elbangmsrrsaiy long amount of time to realize that I was not my own user. So I ifdentiy as non-binary, as mixed race, I wear a hearing aid and I just wasn't represented in the data sets that I was using. And so I was building systems that literally didn't work for me. And for example, I once built a system that repeatedly told me that I was a white Eastern-European lady. This did a real number on my identity.

Open Cloze


Now, I've been asking myself a lot of questions about how can we understand the data quality before we use it. And this emerges from two _______ of ________ these kinds of systems. The way I was _______ to build _______ is similar to how people do it today. You build for the middle of the distribution. That's your ______ user. So for me, a lot of my training data sets would include ___________ about people from the Western world who speak English, who have certain normative _______________. And it took me an ______________ long amount of time to realize that I was not my own user. So I ________ as non-binary, as mixed race, I wear a hearing aid and I just wasn't represented in the data sets that I was using. And so I was building systems that literally didn't work for me. And for example, I once built a system that repeatedly told me that I was a white Eastern-European lady. This did a real number on my identity.

Solution


  1. embarrassingly
  2. information
  3. systems
  4. characteristics
  5. normal
  6. trained
  7. building
  8. decades
  9. identify

Original Text


Now, I've been asking myself a lot of questions about how can we understand the data quality before we use it. And this emerges from two decades of building these kinds of systems. The way I was trained to build systems is similar to how people do it today. You build for the middle of the distribution. That's your normal user. So for me, a lot of my training data sets would include information about people from the Western world who speak English, who have certain normative characteristics. And it took me an embarrassingly long amount of time to realize that I was not my own user. So I identify as non-binary, as mixed race, I wear a hearing aid and I just wasn't represented in the data sets that I was using. And so I was building systems that literally didn't work for me. And for example, I once built a system that repeatedly told me that I was a white Eastern-European lady. This did a real number on my identity.

Frequently Occurring Word Combinations


ngrams of length 2

collocation frequency
training data 3
nutrition labels 3
dataset nutrition 3
generative ai 3
artificial intelligence 2
stop eating 2
data set 2
data quality 2
data sets 2
include information 2
data nutrition 2
food nutrition 2
building ai 2
building datasets 2
transparency labeling 2
food packaging 2
private actors 2
basic principles 2



Important Words


  1. aid
  2. amount
  3. build
  4. building
  5. built
  6. characteristics
  7. data
  8. decades
  9. distribution
  10. embarrassingly
  11. emerges
  12. english
  13. hearing
  14. identify
  15. identity
  16. include
  17. information
  18. kinds
  19. lady
  20. literally
  21. long
  22. lot
  23. middle
  24. mixed
  25. normal
  26. normative
  27. number
  28. people
  29. quality
  30. questions
  31. race
  32. real
  33. realize
  34. repeatedly
  35. represented
  36. sets
  37. similar
  38. speak
  39. system
  40. systems
  41. time
  42. today
  43. told
  44. trained
  45. training
  46. understand
  47. user
  48. wear
  49. western
  50. white
  51. work
  52. world