Q&A: Chris Marais and Frida Mjaria on Artificial Intelligence and real-time public transport data in emerging markets
WhereIsMyTransport’s Real-Time Alerts for incidents and disruptions reduce uncertainty, benefiting anyone living and working in emerging markets — including people using our public transport app Rumbo. We spoke with Chris Marais, Machine Learning Engineer, and Frida Mjaria, Software Engineer, about how Artificial Intelligence could transform real-time public transport data production in the Majority World.
What do you do at WhereIsMyTransport, and what’s your background?
Chris: I started working at WhereIsMyTransport as a software engineer. My background is in maths and physics, so I’ve always been interested in data science and machine learning. I’ve been in the process of moving into that space, and I’m now working on machine learning problems — ways that we can automate some of the workflows we do manually, and start applying that thinking to other parts of the business, including in our public transport app Rumbo.
Frida: I studied computer science and computer engineering. I worked for a few years in the financial sector, but I felt like I wanted to work on something that makes a difference. That’s when I found WhereIsMyTransport. I really liked how the focus was on improving the lives of people by solving the public transport data problem. Lack of good data is a fundamental issue in most emerging-market cities. I joined as a software engineer, and I am now working with Chris in the Artificial Intelligence space.
What is Artificial Intelligence and why does it matter?
Chris: Artificial Intelligence is basically the things you take for granted that humans do. Having conversations, understanding context, picking things up from an image. It’s trying to get machines to do portions of those cognitive tasks. The reason it’s so important for us is that writing code is generally quite a manual process. But increasingly we’re seeing that more and more of these tasks can be done by machines.
Frida: Artificial Intelligence means teaching a computer system or a machine to think and interpret information as a human would. It’s really important, especially in the case of speech recognition, being able to recognise images, or being able to understand what text means. As humans, we are only able to do so much. Computers can process so much more information than we can. If we can teach them how to process things the way humans do, we can pick up patterns and get more useful information.
Chris: The frustrating thing about programming in a traditional sense is that computers are really good at processing information, but they are also really dumb. You have to tell them every single detail. It is quite painful. The beauty of machine learning is that it’s this new paradigm where you can actually take a bunch of data and tell the computer to figure it out.
Why is real-time public transport data challenging in emerging markets?
Frida: Public transport networks in our markets are complex. This means that for emerging-market cities, other than WhereIsMyTransport, there basically aren’t any central sources of complete data, and that includes real-time data. It’s not visible to people, and the systems are not there to support making real-time data visible. We also face situations where there are gaps in network data, or routes change and the data isn’t updated.
Chris: It’s very complex. Most emerging-market cities lack the base layer of data covering the whole public transport network, so it’s hard to surface the right information at the right time. Real-time data is valuable because there’s so much going on that people benefit from knowing about.
One of Rumbo’s features is real-time alerts, so we’re already getting that information to people in our markets right now. We have local teams using our suite of tools to monitor what’s going on in the city, establish what is a real incident, what is abnormal, and turn that into information in our app. But going even further with the automatic monitoring of these kinds of events is really exciting — including using machine learning to establish whether something is a real incident based on past data.
Frida: AI provides different ways to crowdsource. I think of it as smart crowdsourcing. We’re building systems that allow users to not just share information with other users, but automatically moderate and verify the information they share. We can also automatically tell how far along its route a public transport vehicle is, thanks to commuter patterns. Machine learning is very good at picking out these patterns. And the good thing about it is that it requires less human intervention, which makes it an effective way of improving reach and impact.
One of the most important things is natural language processing, or NLP. NLP enables machines to learn the structure and meaning of text, which is language dependent. That means we can get to a situation where you’re able to pick out information in one language, interpret it, and port it into another language. Again, this expands reach and impact.
Chris: A lot of data that’s out there exists in the form of natural language. People are speaking to each other online, across various platforms. Often it’s just between people in groups, or it can be in public channels. Machine learning has got to a point where it’s pretty excellent at understanding the structure of natural language and what things actually mean. We can definitely use this to monitor what people are saying about public transport, trying to surface the most relevant things for people.
Frida: It’s very difficult to monitor entire public transport networks. Informal networks are flexible: they change a lot and have a higher risk of disruptions. But if you throw AI at it, and you have NLP, you’re able to get useful information to people that helps them with their journeys, like the people using our app Rumbo today. That’s why real-time data is so exciting. And because of the way NLP can interpret data, it’s also able to communicate with a user as a person would. If you’re trying to get someone to give you a piece of information (say there was an accident, for example), then you could say “Could you please tell me more about this?” With that, you can get a more holistic view of what’s actually happening and what caused it, which can be passed on to more people.
What is public transport like where you live?
Chris: We both live in Cape Town. One thing about here that is so different to other big cities — where you can kind of take for granted that map apps know the state of things — is that it’s totally word of mouth. I stick to what I know, which is the southern train lines, just a tiny sliver of Cape Town. I work from home, but if I do go to town, I like taking the train. During lockdown, it pretty much wasn’t running, but now it has started again. There’s this beautiful blue train which I’m obsessed with. Big windows, beautiful views on the trip. When I noticed the train was running again, the first thing I did was start texting the local groups to find out how consistent and reliable the train was.
Frida: I use public transport in Cape Town and I have found that I have to learn the patterns myself. It took a long time to figure things out. If it’s 8 AM and three buses go past me and they’re full, I know I need to switch to taxis or to trains. But I know that trains never work on Mondays. It’s a difficult and painful thing to learn and it can become extremely stressful, especially when you start running late for things that are very important, like work or appointments. If there were a way of getting real-time data for that in Cape Town, I think it would be the best thing ever. AI could support that. It would be much easier for a machine to learn these patterns, and much easier to have impact.