Technically responsible knowledge & a feminist data set

Caroline Sinders | Convocation Design + Research
Machine learning design researcher and artist
We talk about human rights centred design and Caroline's multi-year feminist data set project. Specifically, we discuss one of her latest initiatives, called technically responsible knowledge, a tool and advocacy initiative spotlighting unjust labour in the machine learning pipeline.

Nathalie Post
In today's episode, I am joined by the wonderful Caroline Sinders, machine learning design researcher, artist and online harassment expert. For the past few years, Caroline has been examining the intersections of technology's impact on society, interface design, artificial intelligence, abuse, and politics in digital conversational spaces. We are going to be covering a lot in this episode today: from human rights centred design to Caroline's multi-year feminist data set project, and also zooming into one of her latest initiatives, which is called technically responsible knowledge. This is a tool and advocacy initiative that is spotlighting unjust labour in the machine learning pipeline. And if you ask me, Caroline is one of the most inquisitive and thought-provoking people working in this space right now. So if you're not familiar with her work, then go check it out. Enjoy the episode.

Hey, Caroline, welcome to the human centred AI podcast.

Caroline Sinders
Hi, thanks for having me.

Nathalie Post  
So good to have you here today. I'm so excited to be talking to you, and to finally record this episode with you. I think you have had such an interesting career, and you've done so many things that, yeah, I cannot wait to ask you a tonne of questions about. But let me start by asking: what is your story, and how did you end up doing the things that you're doing today?

Caroline Sinders
Oh, wow, that's such a good question. I feel like it's gonna be such a long answer. So I'll try to summarise and give you what Americans know as the Cliff Notes version, which is just a book you can buy that contains just the things that happened in the bigger book you're reading. So the Cliff Notes version would be: I'm a photographer who then got really interested in technology and at the end of undergrad realised I should have perhaps majored in UX or computer science or something related, like media literacy. And so then I went and got a master's in design and tech, after working for a few years as a photographer and photojournalist. And I'm just deeply interested in how people understand technology and how they build communities inside of technology, like platforms, or how they use and misuse products in their daily lives. I graduated from the Interactive Telecommunications Program at NYU, briefly worked in advertising, and then went and worked at IBM Watson as a design researcher on natural language processing APIs and products. So I was just really interested in, again, how does this technology sort of understand the world around us, and in particular, what are the problems that arise when we're building systems to parse language? What are all of the complexities within that, even the intersectional or global complexities? But even still, what are some of the benefits, and what are some of the products you can actually make that people will use in their daily lives? From IBM, I went and had an arts residency with BuzzFeed and IBM at an arts and technology centre in New York, where I was originally going to look at different kinds of natural language processing systems you could insert into the commenting sections of newspapers to maybe understand the quality of language. And I got to briefly collaborate with the Coral Project, which was funded by the Mozilla Foundation and was looking at how do you create better conversations in the commenting section; that group was eventually acquired by Vox. From my fellowship at IBM and BuzzFeed, I then went and worked at the Wikimedia Foundation as a design researcher on the anti-harassment team. And then from there, I decided to start my own practice. Within that, I've held residencies and fellowships with the Harvard Kennedy School, looking at trust in the design of platforms and harassment against journalists. And I've worked with the Weizenbaum Institute, looking at the bigger systems around platforms, governance and artificial intelligence by looking at gig economy workers and content moderators. And generally, I've looked a lot at, again, the design of these systems and how you look at it through an equitable and human rights lens. The thing I should also mention as well: alongside these day jobs and these residencies, I had what I called my nine-to-five, which was going to work in tech, being a design researcher, and then I had my six-to-eleven, which was the job I would clock into afterwards, where I effectively was a critical designer and artist, and I was looking more at the systems from a hypercritical language angle. And the thing I always like to emphasise for people is that the reason I've had this, I think, very interesting and very complex and strange career is that, you know, I didn't really take vacation days until, like, 2019.
So I used all of my vacation time and my free time to study online harassment, write about online harassment, study the issues of bias in technology and artificial intelligence, and look at how design either mitigates or amplifies problems inside of these big systems: from platforms and governance structures, when we think of platforms like Facebook and social networks, to general products using artificial intelligence. I look at everything from the consumer lens, mainly because I'm interested in legibility and storytelling. So how do people understand the systems they're in? How do users build mental models of these really complex systems? Those mental models, I think, are much more important than what the systems actually allow you to do. Meaning, a system may allow a user to do a variety of things, but a user may only be aware of a few of those things. So I'm just really interested in how people understand and make sense of the world around them. That's what I was interested in when I was a photojournalist, and that's what I'm interested in now. But yeah, in the six-to-eleven, I was making art about the complexities of technologies, with an art project called Social Media Breakup Coordinator, where we would help people work through the anxieties they would have from being in these sort of algorithmic timelines and social networks, and also how they could feel more okay about how they were using those platforms, whether they wanted to use them less, use them more, or use them differently. And yeah, you know, that was an art piece, but it was coming out of real research. And feminist data set, another art piece, also comes out of really in-depth research: looking at artificial intelligence as a whole, and then breaking it into these different sections of the AI pipeline. So, how do you build a feminist protocol, or feminist processes, to collect data? What is a feminist way to train and clean a data set and restructure it? Is there a feminist way to do that? You know, so there's all these different things. And I would do all these things while I had a day job. And so I would go to conferences, also on my vacation time; I would go to a lot of human rights conferences, because I could see this sort of thread across for-profit, civil society, NGOs, arts and nonprofits. And, you know, I'm talking about, this is like 2014, 2015; I could see that thread, and that thread wasn't quite there yet. So I decided that the way for me to be in all of the conversations was to effectively have two jobs and, like, you know, just not have any free time. So that's kind of the background of how I got to where I am now, where I run my own small design firm and lab.

Nathalie Post  
Yeah, no, honestly, it is so fascinating to me. But I also love how you just emphasise what actually goes into it. It's not just having that nine-to-five, but the amount of time that you've spent after work, on a continuous basis, to, you know, really delve into those topics.

Caroline Sinders
Totally. I mean, there's just one thing I want to add, and I always want to emphasise it for anyone listening, especially for students, because I feel like sometimes what we see in a person's career from the outside is only the successes, and not necessarily all of the hard work or the losses. And so one thing I always want to emphasise for people is: I worked a lot, probably too much. And that's kind of what led to getting here; it wasn't just, oh, well, a bunch of stuff fell in my lap. For example, every year I try to apply for 100 things. That includes conferences, fellowships, grants, article writing and arts residencies. And I don't get all of those things. I probably get less than 20%. But that's still a lot of work. And so what people see is, oh, Caroline is doing a lot of stuff, and what they don't see is, oh, actually, I applied to all of these things. Like how we met: someone suggested we speak, but also it was through a blind email, and I send lots of emails like that to a lot of people.

Nathalie Post  
Yeah, no, and I think that is such an important message to share, because I do agree these things are often invisible. But it's so important to actually understand how someone is doing these things and getting all these residencies. It seems like it comes naturally, like maybe a combination of luck and, you know, having been at the right place at the right time. But it's so much more than that. I'm kind of curious, based on that: what are your, let's say, foundational values, or even drivers, that you bring into the work that you're doing?

Caroline Sinders
Oh, that's also such a good question. You have so many good questions, this is great. A lot of it is that I really try to make sure that the question is personally interesting to me, and then that I'm the person who should be answering it. So a lot of the things I tackle are often because it's something that in a previous project I had to confront or think about, and then it kind of cascades into "oh, this should be another project I look at". So for example, with feminist data set, creating this alternative, open source tool for data training, cleaning and structuring datasets: that came out of me just making AI art, and a collaborator being like, let's just get Mechanical Turkers to tag things within the data set to train the data model. And I was like, I don't know how I feel about that. And I kept sort of sitting on that. And then I went and worked at the Wikimedia Foundation and was really, really exposed to the amount of volunteer labour people do. And, you know, I attend a lot of open source meetings and meetups and convenings; there's also a lot of volunteer labour there. So I was already thinking a lot about that. So for me, the first thing is: is it personally interesting to me? Then: does it align with my values? And I then think a lot about how the other entity benefits from me engaging with them. Am I also providing something I call ethics washing? So, are they benefiting more from my involvement than I am? And is it something that actually engages in what's called a theory of change? Theory of change is often used in the human rights world, and it's kind of exactly what it sounds like: are there steps or processes you can take to push for change? What is the change you're looking for, and how do you get there? So for example, I ran design sprints at Facebook in 2018, and that was really to help train some of their designers and some of their policymakers and engineers to understand how design is really integral for harassment victims, in terms of creating safer spaces or mitigating harm that they're facing. And what I brought was just a lot of expertise: one, as a design researcher and designer, and two, as someone who's worked in the human rights field and has talked to victims outside of the global north. And so the big thing I was trying to do there was to teach them my processes and give them a fluency to sort of lean back on. To some people, I'm sure they'd be like, why Facebook? And does that fit your theory of change? And, like, yes and no. It doesn't fit the theory of change of totally trying to change Facebook; it was a really small team. But they were also very much not publicising working with me, so I was not, you know, presented publicly as a way to Band-Aid a series of problems they had. In my own personal theory of change now, I don't know if I would do that again, necessarily, because it's such a large company, and so even me going in and doing these two design sprints, one has to ask: was this meaningful change? And I don't necessarily think so. But I do think for that team it was meaningful, because it changed the way that they thought about their processes, and it changed the way they thought about and described harassment, even to their other team members. So it's hard.
There's a lot of things I consider, but a lot of it really also comes back to: am I the right person to do this? Am I inserting myself into a situation I shouldn't be inserting myself into? Do I find it personally interesting? And does it align with my brand, which for me, personally, is one of thinking about social justice and equity? And then, whoever I'm engaging with, are they benefiting from me more than I'm benefiting from them? Or are they using this, again, as a kind of ethics washing, where they're not actually interested in meaningful change and engaging in a theory of change?

Nathalie Post  
Yeah. Yeah. I mean, honestly, I think that is such an important part to emphasise, because, you know, for you being a critical designer, and I think you have many labels, so I'm just gonna frame it under "critical designer", I can only imagine that you do need to have this framework for yourself of: how do I approach projects, and what projects are right for me, and also for that organisation or community or whoever you're affecting with it, so that it is not just ethics washing. But I'm kind of wondering: in getting to those points that you outlined, were there any projects or initiatives that made such a profound impact on you that they made you consider these principles for yourself? Like, how did these take shape?

Caroline Sinders
Totally. Well, one thing I always try to highlight for people is, I see the world through this lens of legibility and design and mental models, but I also see the world through the lens of online harassment, because I've been studying online harassment and technology and feminist interventions in technology since 2013. So that's about eight years. And part of what I've seen in my online harassment research is how people purposefully weaponize and misconstrue the works or intentions of others. Yesterday, I was writing a piece, hopefully published soon, for a Dutch art-tech institution, where I talk about how I have this ambient, constant, subconscious algorithm running in the back of my head, which is always sort of weighing, weighing in the sense of calculating, how one action could manifest into harassment, and how to immediately mitigate it. And so I've kind of always had to be thoughtful, because I am an entity of one. But also, more importantly, I think, as someone who bridges both worlds: what we can think of as more classical or traditional design research, because that's the thing that I do, like, I do love designing dashboard interfaces for people, I worked at IBM, it's a thing I like to do, but I also like to do art. So in a practical sense, if I were looking for a day job, I'd have to work at a company that would understand that I'm an artist and that that's a thing I'm going to do. And on the other side of that, because I'm an entity of myself, right, of one, I'm Caroline, there is a lot of harassment and misconstruing that can come up just by the nature of the work that I do: because I'm working in online harassment, because I'm looking at responsible technology, because I'm looking at human rights inside of technology from a design perspective. So I've always had to think about what I add to something, versus how other people will look at even an event I'm attending and say, well, she's not really interested in equitable technology, right? She's not really doing this. So I've had to always think about that. And a lot of that was because early on in my career, you know, in 2014, it was GamerGate, and that was one of the big things I was looking at. And they really weaponized all of that. So even very, very early on as a young researcher, that was something I saw and lived through; I was attacked by GamerGate as well. And so I think it's just something that I have always been thinking about as I transitioned into my career as a critical designer.

Nathalie Post  
Yeah. Yeah. I mean, I'd love to ask you a million questions about GamerGate too, but maybe that is a conversation for another time. But yeah, I'm kind of curious about human rights centred design, because you've mentioned it a couple of times, and I feel like you've been quite actively advocating for it as well. Can you give a bit of an explanation of what it is and why we need it, especially, maybe, when working with AI?

Caroline Sinders
For sure. So human rights centred design is a growing movement in the human rights space, where there are a lot of designers, design researchers, user researchers and technologists. And for me, what's really important about it is, even for designers who don't work in civic technology, who don't work in internet freedom, you know, for designers who maybe just work in the for-profit design world, which is totally fine: we know a lot about human centred design, and I think even just the provocation of human rights centred design helps shift and reshape our narrative as to what design does and who design is supposed to work for. And in that sense, what it's talking about is perhaps an update to where we are now. All design systems, all methodologies, and even if it's not design, all methodologies of education and schools of thought, need to be updated at times; that's completely normal and natural, right? We created user centred design and human centred design in the late 80s, early 90s, and there was a design methodology before that. So it's totally natural to re-update it. In this case, human rights centred design, and this is my interpretation of it and the way that I talk about it, is asking us to actually look at universal human rights principles and ask if our products, or what we're making, actually support them on a global scale. And so what it's actually asking, and this does draw heavily from, I think, the ethos of design justice, is: are we designing for ourselves, or directly for our communities? And when I say that, I mean: are designers just re-perpetuating global north, white-centric views of technology? Because how we think of harm and what we think of as usefulness in that case is very specific, and it's very different, right? You'd have a whole different set of threat modelling and harm reduction if you designed while equally placing the thoughts and considerations of someone outside of that paradigm of global power. And so human rights centred design is actually asking: are we building products that ensure global digital human rights? Is it ensuring safety? Is it ensuring freedom of expression? And, for example, we can look at major platforms and say, no, they're not doing that. So what it's asking is that we centre that in the process and actually really think about the fact that we're building global products, even if it feels like we're building a product just for the one country we're in, right? That's more what it's asking. And, you know, a deeper analysis of human rights centred design, again, does very much align with design justice. How do you actually engage in a human rights centred design practice? It's not just having a diversity of user personas; it's actually engaging with communities, it's making sure you're doing that user research. You know, you can have a persona where it's like, "this person is based in this totally other country, and they're worried about X", but what I ask designers is: do you actually know enough about the reality of that persona, of that actual person's experience and existence? Do you really understand the harms and problems they will face? If you don't, then you're not designing from a human rights centred perspective.
And you need to understand the very real harm, the real violence, they could face just in existing and using digital tools, and how your tool could be complicit in that violence they face. That's more what human rights centred design is asking for. So even if we're building something as simple as a banking app, where we exchange money, it's asking: are you allowing for a variety of different use cases and people to use this? And how will one person be harmed in a way that another person may not be? For example, for designers at home thinking about this, one piece of advice I often give people is a prompt. The prompt is: imagine that you're a victim of domestic violence, and you can't use your real name or your actual location, and you have to worry about what would happen if your location or identity was shared. Then look at all the products you use in your day-to-day and think about: could you use that product? Is there a workaround for you to use that product? So, Amazon Prime: do you have to use your real name? Do you have to use your actual, legal location? Can you have it sent somewhere else? Could it go to a P.O. box? Could it go to a friend's house, for example? Another example would be Instagram. If I'm private, and you, Nathalie, are public, and you are commenting on my account, can other people see that activity, and vice versa? On your account, who can see my activity, right? Because then an abuser could sort of triangulate and try to figure out where I am, or realise who you are to me. So when you go through that mindset, it's a completely different way of designing, because you are actively pulling from this person's experiences to say: we're trying to design a space of safety for that person in their daily life. Because they should be able to use banking apps and entertainment apps; they should be allowed to use Netflix, right? They should be allowed to use Amazon, they should be allowed to transfer money to people, they should be allowed to be on Twitter. And then asking: at what point would someone choose to leave? People choose to leave when they can't have a guarantee of their safety, right? So that's more what human rights centred design is asking for.

Nathalie Post  
Yeah. And so I'm super curious to hear: when you're working with organisations, how do you embed that mindset within their projects so that it's not just an afterthought, but really something that is, you know, embedded from the start? What does that look like?

Caroline Sinders  
A lot of it, again, is building this fluency of immediately thinking things through. So, I occasionally work with domestic violence victims, and it's something I think about because it is an analogous experience to those facing networked harassment campaigns, right? In the sense that with victims of networked harassment campaigns like GamerGate, people were trying to find their location, they were trying to share their banking information, they were trying to get them fired; they were looking for every piece of information to triangulate where a person was, where they worked, who their family was, what they enjoyed. So people didn't have a sense of safety. I want designers and researchers to have that fluency even if you don't work in trust and safety. Even if you work for a company that's really B2B, and you're building dashboards to help a company using some kind of predictive analysis, maybe of shipping metrics, it's still important to think about spaces of weaponization, and to really think through the waterfall or cascading effects of harm. So if something goes wrong, what are all the places it can expand harm to? A lot of this isn't that different from how security engineers look at the robustness of security and privacy protocols in the products they're using, right? Because they're having to pull from a variety of well-known experiences; they have to engage in this thing that's called threat modelling. What I'm asking is that all designers do that, generally, regardless of the product they're working on. And one of the best ways to do that, I think, is to come from a series of perspectives and scenarios that you've researched enough that it's a fluency to you. With security engineers, it's often thinking a lot about malware, and major malware cases, and the follow-on events of those, and then trying to mitigate around that. It's similar in online harassment, where you're really thinking of a variety of different use cases, and those use cases are real use cases; they're real events, real things that happened to people. So the thing is really to, you know, read a lot and make it a fluency, the same fluency we already have in how to design certain products that we already know a lot about. That's more what it is. It's not just adding it to the user persona; it's actually really, really reading about those experiences, so that when someone asks a question, you can be like: actually, in this one case, this is something that happened; how are we going to weigh those considerations, or at least flag that as a potential problem? That's kind of what I'm pushing for people to think about.

Nathalie Post  
Super interesting. And I want to ask you something slightly related, about the other project that you've been working on, where I think you're also pushing people to critically think about and challenge existing tools and frameworks and perspectives: the feminist data set project. So I'm super curious, first of all, what even led you to investigating what an intersectional feminist machine learning process could be like? And yeah, what are you working on right now within that space?

Caroline Sinders  
Totally. That's such a good question. So feminist data set started coming into existence in 2017. I was in the middle of my IBM and BuzzFeed residency, and I was supposed to be looking at online harassment in the commenting section, but what I ended up actually looking at was the rise of the American alt-right and the far right in spaces like Reddit, 4chan and 8chan. So, the TL;DR of that was that I was looking at a lot of white supremacist and neo-Nazi data, a lot of it, and I was sitting there thinking about, you know, where does machine learning, and the effects of machine learning, fit into this space? The Perspective API was just about to come out, which is something Google Jigsaw had worked on. It wasn't very good, for lack of a better phrase.

And so I was just sitting there thinking: what are these conversations around trying to analyse abusive language? Why are artificial intelligence systems not working very well in this space? There's a variety of reasons for that, because language and intent and context are hard to read, and you can't really train a system to do that. But, you know, I was just thinking a lot about that. And then at a certain point I realised I wanted a breather in my work. I wanted something that wasn't about analysing harm, but that could actually be the opposite, which was building a system that I wanted to see. And so that's kind of where feminist data set came about, because I was like, what's the exact opposite, in my mind, at the time? And I was like, oh yeah, feminism, I'm a feminist. What if I built a feminist tech system? What an interesting provocation. You know, I don't have to look at as much neo-Nazi data anymore. This was an interesting idea. And I really was so naive in the beginning of this project, because I thought it was gonna be like a six-month project. I was like, I'm gonna collect some data, I'm gonna make a chatbot. And then I actually started the process, because I wanted to rethink, one, how do we gather data? Do I think of it as, like, artisanal digital data, like server-farm-to-table data? Do you know what I mean? But I really wanted to also think about why. Because a lot of times, in the beginning of the project, and people still say this, they're like, why don't you just scrape Twitter for the word feminist? Well, why is that a problem? Because the existence of the word feminism, or feminist, doesn't mean that the conversation itself has the ethos of feminism. And I wanted to actually really rethink this idea that is very persistent in the world: that data is big, that big data is oil, that it's everywhere, that it's worth a lot, but that you can also just take it. I think data is very human. For every data set really that's in existence, at some point a person was involved in the creation of that data set; even if it's two computers measuring each other's connectivity or interaction, someone decided that that should happen at some point, right? So my framing of data is that data is this precious material. So when I was thinking, okay, how do I build this new process, and really, what are the structures it would need, the protocols, what would the process need to be feminist, and, since feminism is just another word for equity, what would make it equitable, I realised that the project couldn't stop at the data part. It had to look at every process. Because once I had gathered the data, it was like, oh, I need to clean it and structure it and put it into a format that's then machine readable. What's the process of making that feminist, right? The next step was, oh, well, I need to look at all the tools that are involved in that space, because I maybe can't use any of those tools. I already kind of knew that at some point I was going to have to do an audit, you know, an algorithmic audit of some of the out-of-the-box NLP tools, because this data set is all text, and sort of see: are these the right tools? Can I remake them?
Or is it impossible to shift or change anything that's served and flattened into, like, a web API, right? And so I knew that that was the middle step. But I just realised that if I'm going to actually use intersectional feminism as an investigatory framework, I can't just stop at the data set and put it into something; I actually really need to heavily question and analyse every step of the process. Right now I'm between step two and step three. Step two is sort of asking: what are feminist protocols for cleaning and structuring a data set? And then the next step is: okay, well, how do you build a feminist algorithmic audit of something like TensorFlow, or IBM Watson's chatbot software? So I'm starting to think about that step three. Because then also, within that, there'll be a step two-B or three-A, which is: where do you go back and retrain things? How do you make a model accurate? Does it even matter in this case how accurate the model is, because it'll already be so misshapen, because I'm using all different kinds of text, you know? So those are a lot of the things I'm grappling with: what are all the steps, what are all the things you have to remake, and what are the concessions you make in trying to look at the entire pipeline and make it feminist? Yeah.
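[Editor's note: to make the "clean it, structure it, make it machine readable" step concrete, here is a minimal sketch of what keeping provenance attached to gathered text might look like. It is purely illustrative; the field names and cleaning rules are our assumptions, not the Feminist Data Set project's actual tooling.]

```python
# Hypothetical sketch: structuring gathered texts into a machine-readable
# dataset while keeping provenance attached, so the "who, where, and why"
# of each piece of data survives cleaning. Field names are illustrative.
import json
import re
from dataclasses import dataclass, asdict

@dataclass
class Record:
    text: str          # the cleaned text itself
    source: str        # where the text was gathered (URL, zine, talk, ...)
    collector: str     # who collected it, so the labour stays visible
    rationale: str     # why it was judged feminist in ethos, not just keyword-matched

def clean(raw: str) -> str:
    """Minimal cleaning: collapse whitespace and strip the edges."""
    return re.sub(r"\s+", " ", raw).strip()

def build_dataset(entries: list[dict]) -> str:
    """Serialise records to JSON Lines, one record per line."""
    records = [Record(clean(e["text"]), e["source"], e["collector"], e["rationale"])
               for e in entries]
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)

print(build_dataset([{
    "text": "  Data is never raw;\nit is always collected by someone. ",
    "source": "workshop notes, 2019",
    "collector": "C. Sinders",
    "rationale": "discusses data as situated, human-made material",
}]))
```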

Nathalie Post  
And so what was supposed to be a six-month project has turned into a multi-year project with no end date.

Caroline Sinders  
Before the pandemic, I was hoping, I think, for 2023. And now I'm like, Caroline, it's probably more like 2026 or 2027. A lot of it ends up getting self-funded. And one thing I'm trying to do, because I'm not a great programmer, is find someone who would be interested in helping me conduct some of these audits, for example. And maybe that could be the same person who helps me eventually build the system. So that's one thing where, you know, it's expensive and hard.

Nathalie Post  
Yeah, absolutely. And well, rather than talking about all the things that need to be done in the coming years, I'm super curious to hear more about the project that you have recently done: TRK, technically responsible knowledge. So yeah, can you explain a bit what it is, what you did there, how you went about it, and what you found out? I'm sorry, like, ten questions!

Caroline Sinders  
Okay, let's probably start with what it is. So TRK stands for technically responsible knowledge. It is an open source, browser-based tool that I built with Cade Diehm of New Design Congress; Ian Ardouin-Fumat, a fantastic data visualiser and engineer; and Rainbow Unicorn, a design firm based in Berlin, also fantastic. And it was funded as a part of my Mozilla Foundation fellowship, where I was looking at AI and labour and interventions in the AI space, and how do you explain data privacy to a general audience. So it is a browser-based tool, and there are two aspects to it. What people first see is an essay we wrote and a visualisation tool they can play with, which is trying to push people to understand pricing inside of gig-economy-style platforms like Mechanical Turk and CrowdFlower. Mechanical Turk and CrowdFlower are these gig economy micro-service platforms where people are paid to do small tasks, and these tasks are often related to datasets in machine learning, and things like surveys. So that can be: how many birds are in the picture, or label all the different things in the picture, right, and we'll pay you a few cents per picture; or, what's in this receipt? You know, so you can see how these are used for computer vision, for example. So TRK is a platform that anyone can start using, and they can also use our code, it's open source, if they want to build their own. It asks for a little bit more information, and it asks people to confirm that the data someone else will have to train or structure or clean is safe. And so, inspired by "Datasheets for Datasets", a really fantastic, canonical white paper, we're asking for information about who you are and what date it is; we're asking the trainers if they will share their names; we're asking what the data set is supposed to be about; and then we're just sort of noting how big it is. And then it injects that plain text into the data set at the very end. So the idea is, maybe ten years from now, someone, if they were using TRK, would have this information, so they would know the intentions of the data set. And the visualisation tool we have came directly out of my own research, which was recognising that even if companies or research labs or artists wanted to price equitably, the interface design of these micro-service platforms really worked against them, because time was never considered. So our tool is like a slider. It asks how many things are in your data set, maybe it's 3,000 images; it asks how long you think the tasks will take, and how much you will pay. And then the output, and this is what makes it a critical design tool, because it's more of a provocation, it's not a literal calculator, the output tells you if it's a living wage or not, and how many hours it would take someone to do. And also, if you're making it too short, like under a certain amount of time, it tells you that that's an impossible task. So it's trying to get people to think about, one, if you're saying someone can do this in under ten seconds, they probably can't, even if it's really, really easy. If you need them to train 1,000 images, at a certain point in the process it's going to take longer, because they're going to get tired, because people are not machines. And also, how long is 1,000 images? Like, how many days is that?
Can someone really do that in a day? It's also trying to get clients to think about the fact that these tasks aren't just done at random, that this is someone's entire day, and that if you're pricing, say, "label 1,000 images, I think it'll take you five seconds each, and I'm only going to pay you three cents", that is so, so low, and it takes a certain amount of time. I'm afraid to pull up TRK right now because we have some internet connectivity problems; usually what I do in interviews is say, let's take those numbers, and I'll tell you exactly how underpaid that is. The living wage calculation comes from some papers we read looking at the biggest demographics of Mechanical Turkers, and that's India and the US. We decided to go with United States minimum wage and living wage research, and what we found is that the highest minimum wage actually existed in Washington, which was a beautiful provocation, because what is headquartered in Washington? What city needed the highest living wage? Seattle. What's also in Seattle? Amazon. So it's this really poetic lining up. We realised that a living wage in Seattle, and I think this data came from 2017 or 2018, was about $16, just above $16 an hour, for someone to have a living wage. And I should highlight that our calculator doesn't take into account taxes. And this is why I'm saying it is a provocation, it's an art tool. You can also use the open source tool to train datasets, but you can't pay through our platform, right? Because we're not a startup. It's an art project, a critical design project, that I hope people use, and you can use it, but, you know, again, there's a cliff to it. But we also started looking at what are the things that would make a day job enjoyable beyond $16 an hour. To backtrack, the thing people have to keep in mind is that when you do micro-service-style tasks, you're only paid per task, right? Someone isn't making a salary. In the gig economy space, you're only paid for the time you work. So that puts the worker in a very precarious situation, because they have to think: "Do I have time to eat? Do I have time to go to the bathroom? Do I have time to take a break? And can I afford that? Can I afford to take time off for lunch?" And so we wanted to first calculate how much someone would have to earn in a day to hit the living wage, and then we went in to calculate giving them time off. So, I worked remotely at the Wikimedia Foundation, and our meetings were supposed to be 55-minute meetings, so everyone could have five minutes to just have a bio break, but we were obviously paid for those minutes. And we looked at other companies and other work situations where people are given paid breaks. A lot of the discourse we saw in videos and articles made about Mechanical Turkers is that Mechanical Turk is a better alternative than, say, working at McDonald's. But McDonald's gives you a paid 30-minute break. So we added in five minutes off for everyone at the top of every hour, and then we subtracted 45 minutes for a lunch break. And then what we did is increase what the hourly rate had to be, to account for this paid time off.
So suddenly, what people have to earn per hour is over $19 an hour, to accommodate these paid breaks people should have. And what that means is that the minimum pricing per task had to go up. So if a task is priced under a certain amount, we'll also say: this wouldn't allow someone to have a living wage. Yeah, this is a heavily math-based art project. I hate math. My mom constantly laughs at this, because I am not great at math. She's like, I can't believe this art project you're doing has a lot of math in it. But the big thing was, we needed to do all of this to create this provocation, these real nudges and pushes in the tool to say: you need to think about that. And even then, the price of the task is also correlated to the amount of time it takes, right? Because if a task takes a minute, you can't pay someone, like, eleven cents for that, right? So they're both in tandem together. What CrowdFlower and Mechanical Turk kind of lack is that asking of: how long do you think this task will take? Those platforms could ask the people who are building a request, who then have workers work on it, to just do, like, twenty of the tasks themselves, time how long it takes, and then add on a second or two more, maybe two seconds, so they've got some good boundary, right, and then say: hey, it's this many seconds, so the price needs to reflect that and go up to this. And so that was a lot of the stuff we saw that was missing in these cases. And, as I say, the reason we focused on money is because equitable pay is a feminist issue. But so is working a job where you are treated with respect, where you are given time off, and where you can also work eight or nine hours, you know what I mean? And so those are the bigger issues we saw, and that's how it relates to feminist data set, because it was really me thinking: how could I hire someone to help me train and structure this data set? What are the platforms I would use? Where would the data go? And how do I ensure that I'm treating them with respect? If I'm going to build this alternative, it has to also be situated in the real conversation of labour that is happening inside of artificial intelligence and micro-service platforms. So that's kind of where TRK came from.
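[Editor's note: the "inject the datasheet as plain text at the end of the data set" idea is concrete enough to sketch. Here is a minimal illustration, assuming a CSV-style dataset file; the field names and footer format are invented for this example and are not TRK's actual implementation.]

```python
# A minimal sketch: append a plain-text, datasheet-style footer recording
# who made the data set, when, what it is for, and how big it is, in the
# spirit of TRK's injection step and "Datasheets for Datasets".
from datetime import date

def datasheet_footer(creator: str, purpose: str, n_items: int) -> str:
    lines = [
        "# --- datasheet (plain text, travels with the data) ---",
        f"# created: {date.today().isoformat()}",
        f"# creator: {creator}",      # trainers can opt in to sharing names
        f"# purpose: {purpose}",      # what the data set is supposed to be about
        f"# size: {n_items} items",
    ]
    return "\n".join(lines)

# Append the footer so anyone opening the file years later sees its intentions.
with open("dataset.csv", "a", encoding="utf-8") as f:
    f.write("\n" + datasheet_footer(
        creator="anonymous annotator team",
        purpose="bird counts in images, for a computer-vision demo",
        n_items=3000,
    ) + "\n")
```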
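[Editor's note: the pricing provocation itself is simple arithmetic. Below is a back-of-the-envelope sketch using the figures from this conversation: a roughly $16-an-hour Seattle living wage, a paid five-minute break each hour, and a paid 45-minute lunch. The exact formula and thresholds are our reconstruction, not the tool's source code.]

```python
# Back-of-the-envelope version of TRK's provocation: workers are only paid
# per task, so the per-task price must cover breaks they should be paid for.
LIVING_WAGE = 16.00        # $/hour, Seattle living wage (~2017/18 data)
HOURS_PER_DAY = 8
BREAK_MIN_PER_HOUR = 5     # paid bio break at the top of every hour
LUNCH_MIN = 45             # paid lunch break
MIN_SECONDS_PER_TASK = 10  # below this, flag the task as implausible

def effective_hourly_rate() -> float:
    """Rate per hour of actual task work needed to reach the living wage
    while still being paid for breaks."""
    paid_minutes = HOURS_PER_DAY * 60
    working_minutes = paid_minutes - HOURS_PER_DAY * BREAK_MIN_PER_HOUR - LUNCH_MIN
    return LIVING_WAGE * paid_minutes / working_minutes   # ~= $19.44/hour

def minimum_price_per_task(seconds_per_task: float) -> float:
    if seconds_per_task < MIN_SECONDS_PER_TASK:
        raise ValueError("No one can really do a task this fast, repeatedly.")
    return effective_hourly_rate() * seconds_per_task / 3600

# Example: 3,000 images at 30 seconds each.
price = minimum_price_per_task(30)
print(f"minimum fair price per task: ${price:.3f}")    # ~= $0.162
print(f"total for 3,000 images: ${3000 * price:.2f}")  # ~= $486
print(f"hours of labour: {3000 * 30 / 3600:.1f}")      # 25.0 hours
```

Run against the three-cents-per-image example above, the gap is the point: 3,000 images at 30 seconds each is 25 hours of labour, and $0.03 a task pays roughly $3.60 an hour, far below the ~$19 the break-adjusted living wage requires.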

Nathalie Post  
Yeah, no. And I think it's fascinating how it provokes emotions, because obviously I played around with it on the website, and I would encourage everyone listening to do the same, because it just shifts your perspective from, "oh, this is a way to easily label my data for not so much money" to, "oh, actually, hey, this is not a fair value exchange". And it provokes that emotion, as a user, of: this is not fair for me to ask this price. Even though at first I wouldn't have thought of that; I would have just been like, oh yeah, it's cheap, I can get my data labelled, cool. And I think that is beautiful, how you make it confrontational, but also evoke emotions while interacting with it. And I'm kind of curious, regarding that: when you're designing, how do you shift from having identified a certain issue that you want to tackle into a provocation?

Caroline Sinders  
I think for me, it will always end up being a provocation, and then something like a research paper, because I am not a company. But I want to give provocations that designers would see and understand why they're there. So I think a lot of my work is band-aids; TRK is one small band-aid for one small cut, in regards to the thousands of cuts that machine learning creates, right? But what I want to do is at least provide a framework and a provocation for people to recognise how all of this kind of analysis, thought and care can go into any product you're making, regardless of what the output is. I am really interested in UX design and product design being more inserted into the art-tech space, where there's not a lot of that. We do see it more now with speculative design and critical design, right? But even more so: while my work is technically critical design, it's often seen as so useful that it could be a startup or a real product, as opposed to an art project. The thing I would like to remind people of is that some of the stuff I'm suggesting can never exist, because it would break at a certain scale, most likely. But it does exist at least to get us thinking about some of these changes, right? And so a lot of it, for me, is that whatever I'm doing, I'm thinking about how I can make it visual, or seen, or an interactive experience, as a way of advocating for what I'm talking about, and also illustrating what I'm talking about. I define my art practice as expanded documentary, which is sort of taking truths and expanding them through experience. And that does really pull from my photojournalism practice. So a lot of my work is really, you know, photojournalism, but interpreted through the lens of digital design or critical design, or tech-based art. And that's the space I do like to exist in.

Nathalie Post  
Yeah. Yeah. So we're almost through our time here, sadly, because I feel like I have a million more questions I could ask you. But maybe to round it up, I'm super curious: what research areas, beyond the feminist data set work we talked about, are you interested in and hoping to do more work in in the future?

Caroline Sinders  
Yeah, I mean, I think I would like to emphasise with people as well: I'm an artist, but I'm also a design researcher, and I really do love unpacking and solving problems in a space where I know people, like users, are going to be touching it. So I'm hoping to actually do more product work in the next few years. So anyone out there listening, running any kind of AI company: if you're looking for some design strategy that's focused on responsible tech, get at me. But that's something I would like to go back into, and I am thinking a lot about. If we're building dashboards to audit algorithms: one, how are we measuring them? What are the outputs? And how do you make those legible beyond the engineering team, to people on the product side, people on the sales team, and then even your consumers and your users? That's a lot of what I've been thinking about. Some of it is: how do we maybe start building protocols for understanding when a data model needs to be updated or recreated? When does data degrade? How can we think of data as a more organic material that has an expiration date, for example? Those kinds of provocations, while they seem, I think, maybe very lofty, I think of as very practical. Because if I were to build, for example, a language model on abusive language, and I was only using things from 2014, that would start to be out of date now, right? We use things in a different way. And so I think it's important to think about expiration dates and the kind of data you have, and also the problems you're trying to solve. But that's something I'm really, really interested in. And I think, because I have this background in online harassment, what this means is I have a very deep understanding of how to question, and almost QA and pen-test, products, even if it has nothing to do with harassment, because I'm constantly thinking about all these different kinds of scenarios. I'm like, oh, someone's gonna ask this weird question, because I've had to deal with trolls most of my life, and I've got, you know, a lot of things stored up. And other people are gonna wonder, why did X happen? And can you explain that in a very short sentence? Content strategy can be key; a lot of it doesn't have to be a graph. But if you do have a graph, is the graph readable? So these are all the things I'm interested in doing, and these are questions that are popping up now around explainability, legibility and transparency. Those are all things that are important in the AI field, and, especially if you're listening to this in the EU, most likely we're going to have regulation that demands those things. Those are things I'm hoping to be working on in the next few years: helping companies and products think through those considerations.
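[Editor's note: the "expiration date" provocation can be made concrete with a very small sketch. The shelf-life figure and field names here are illustrative assumptions, not a protocol Caroline has published.]

```python
# Hypothetical sketch of a data "expiration date" protocol: record when a
# data set was collected and flag it for review once it passes a shelf life.
from datetime import date
from typing import Optional

ABUSIVE_LANGUAGE_SHELF_LIFE_YEARS = 3   # slang and harassment tactics shift fast

def needs_retraining(collected: date, shelf_life_years: int,
                     today: Optional[date] = None) -> bool:
    """True if the data is older than its shelf life and the model should
    be revisited, retrained, or recreated."""
    today = today or date.today()
    age_years = (today - collected).days / 365.25
    return age_years > shelf_life_years

# A model trained only on 2014 abusive-language data would be flagged today:
print(needs_retraining(date(2014, 6, 1), ABUSIVE_LANGUAGE_SHELF_LIFE_YEARS))  # True
```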

Nathalie Post  
Yeah. This was absolutely wonderful. Thank you so much, Caroline. Maybe one final question: if people want to learn more about you and your work, where can they find you? Where do you want to send them?

Caroline Sinders  
I would suggest Twitter and then my TinyLetter. I do put stuff on my website, but, you know, hopefully in two months my design firm will be launching our website, where we're gonna have a library of things that we've written, and we'll be publishing stuff that's just our speculations and provocations. So that would be a good resource to check out. But until then, probably my TinyLetter and my Twitter handle. Everything is under my name, so it's not difficult to find: just Caroline Sinders. That's me.

Nathalie Post  
Great, great. Thank you so much once again, Caroline. Thank you.

Caroline Sinders  
Thank you. This was super fun.
